Abstract
In today’s world of multimedia communication, the use of multiple modes of discourse is prevalent in various fields. The international academic community has taken an interest in studying multimodality from different perspectives. This paper uses CiteSpace 6.1.R6 to visually analyze literature on multimodal discourse studies (MDS) in the Web of Science (WoS) Core Collection from 1997 to 2023. The results show a significant increase in English research papers on MDS over the past two decades, a development made possible through collaborative efforts and knowledge-sharing among scholars from different regions and institutions. MDS conducted by scholars from across the world reveals diverse research categories and highlights hot issues and social practices. Consequently, the discipline has produced significant social value. The analyses of keywords and literature further reveal that MDS mainly explores language education, politics, children’s education, identity construction, and media. Additionally, the most cited literature on MDS plays a fundamental role in guiding scholars across the world to understand the key concepts of multimodality and to conduct multimodal discourse analysis from different perspectives. This research provides valuable insights for scholars engaged in or interested in multimodal discourse analysis worldwide, helping them understand MDS’s evolution while offering significant implications for conducting research on multimodal discourse.
Keywords
Introduction
In his seminal paper “Discourse Analysis,” Harris (1952, p. 6) introduced a methodological approach to discourse analysis by “collect[ing] those elements which have like distributions into one class, and thereafter speak of the distribution of the class as a whole rather than of each element individually.” This innovative perspective challenged the traditional focus on the sentence as the primary unit of analysis, thus laying the groundwork for the evolution of multimodal discourse analysis.
Harris’s work encouraged scholars in the field to expand their analytical lens to encompass the interplay between sentence structure, textual composition, and societal practices, as evidenced by the contributions of Fairclough (1992, 2005, 2013) and Lim et al. (2012). As semiotics and systemic functional linguistics advanced, discourse analysts increasingly turned their attention to the rich tapestry of multimodal data and the discourses it shapes.
The works of Forceville (1994, 2011, 2024), Kress and Van Leeuwen (1996, 2001), Kress (2010, 2015), Van Leeuwen (1995, 2005, 2012), Norris (2004, 2006, 2011), Machin and Mayr (2012), Bateman et al. (2017), Ledin and Machin (2018), O’Halloran et al. (2018), Fernandez-Fontecha et al. (2020), Jewitt et al. (2021), and Kress and Bezemer (2023), among others, have significantly enriched the discourse on multimodality. These scholars have collectively pushed the boundaries of discourse analysis, emphasizing the importance of considering the full spectrum of communicative modes and their integration within social contexts.
The field of multimodal discourse analysis is deeply invested in understanding the significance of various symbols, encompassing both verbal and non-verbal cues, and their collective influence on the construction of discourse. This includes a broad spectrum of non-verbal signs that extend across visual, auditory, gustatory, and tactile modalities, as well as the diverse media through which information is conveyed and shared. These different modes and media are integral to the multimodal framework.
As Van Leeuwen elucidates, multimodality encompasses “the combination of different semiotic modes—for instance, language and music—in a communicative artifact or event” (Van Leeuwen, 2005, p. 281). This concept also involves “the diverse ways in which a number of distinct semiotic resource systems are both co-deployed and co-contextualized in the making of a text-specific meaning” (Baldry & Thibault, 2006, p. 21). The dynamic interplay among multiple symbols and modes within a specific communicative context results in the emergence of multimodal discourse.
An initial inquiry into the Web of Science (WoS) database, a globally recognized and authoritative repository of scholarly literature, reveals a consistent progression in multimodal discourse studies (hereafter abbreviated as MDS) on a global scale. Notably, the proliferation of MDS literature has been on an upward trajectory over the past two decades, signifying profound theoretical advancements, significant interdisciplinary collaboration, and innovative topic exploration. A bibliometric examination of the literature on MDS offers researchers and the broader academic community a comprehensive overview of the field’s trajectory and current standing within the international academic community, thereby enhancing the understanding of MDS’s worldwide trends and developmental patterns.
In the present study, “international academic community” refers to the academic community of scholars, researchers, educators, and research institutes that write and publish research papers and books in English and engage in research on multimodal discourse and related fields around the world. The community encompasses a diverse range of disciplines, including linguistics, communication, sociology, anthropology, education, and psychology. Its focus is on the theoretical development, practical application, and interdisciplinary research methods of multimodal discourse analysis.
This paper utilizes CiteSpace, a prominent bibliometric tool, to visually dissect the literature on multimodal discourse studies, with the goal of illustrating the field’s global progression. The study leverages CiteSpace to delineate the trajectory and impact of MDS over the past two decades, as well as to elucidate the interdisciplinary connections that have emerged within the last 20 years. The paper seeks to address two pivotal inquiries:
(1) What is the developmental trajectory of MDS in the international academic community over the past two decades?
(2) How has MDS been integrated with other academic disciplines in the last 20 years?
Literature Review
Major Methodologies for Analyzing Multimodal Discourse
The development of multimodal discourse analysis has been influenced by the semiotic theories of Peirce (Wisdom, 1934) and Saussure (1916), as well as Halliday’s (1978) Systemic Functional Linguistics. According to Halliday’s (1978) Systemic Functional Linguistics, language is expected to be explained in social contexts. The theory views language (both verbal and non-verbal) as a way of meaning making in a given context from a social semiotic perspective. That is to say, the study of language symbols should not be limited to the linguistic level, but should be extended to the social and cultural levels.
Multimodal discourse analysis absorbed the key concepts of Halliday’s (1973) Systemic Functional Grammar, including systemic grammar, functional grammar (metafunctions), register theory, and the notion that language is a social symbol (Zhu, 2007). By borrowing Halliday’s (1994) idea of three metafunctions, namely, ideational function, interpersonal function and textual function, Kress and Van Leeuwen (1996) put forward a visual grammar framework. In their seminal book, Reading Images: The Grammar of Visual Design, Kress and Van Leeuwen (1996) attempted to clarify the concept of “multimodal discourse,” and proposed to explore the semantic relations between different visual symbols, such as images, from three levels: representational meaning, interactive meaning and compositional meaning. Since then, multimodal discourse studies have flourished and attracted attention from scholars all over the world.
In a sense, the advancement of multimodal discourse studies is contingent upon the robust foundation of systemic functional grammar and semiotics. MDS not only absorbs the essence of linguistics and semiotics, but also integrates the theoretical frameworks of many disciplines, thereby providing a new perspective for the analysis of discourse composed of various semiotic resources, including words, images, sounds and other modes. As for the approaches to multimodal discourse studies, MDS has so far drawn largely on four major methodologies, namely conversation analysis, mediated discourse analysis, critical discourse analysis, and metaphor analysis.
Conversation analysis is the study of recorded, naturally occurring talk-in-interaction, aiming to discover how participants understand and respond to one another in their turns at talk, with a central focus on how sequences of action are generated (Hutchby & Wooffitt, 2008; Nordquist, 2023). Analyses of conversation examine both verbal and non-verbal aspects of social interaction in everyday situations, and pay attention to conversational features like turn-taking, repair, adjacency pairs, and so on.
As Halliday (1978) emphasized, the social dimensions are inherent in discourse analysis, which reveals how humans understand each other’s speech. Scollon (1998) proposed mediated discourse analysis in Mediated Discourse as Social Interaction: A Study of News Discourse. According to Scollon (1998), the connection between society and discourse is reflected through mediation. While conversation analysis delves into the nuances of face-to-face interaction, mediated discourse analysis extends its purview to encompass a diverse array of mediated communication modalities, including text, images, and multimedia. Crucially, mediated discourse analysis transcends mere linguistic analysis, directing attention toward broader social phenomena and behaviors (Scollon & Saint-Georges, 2012).
The recognition of discourse as inherently social lays the groundwork for critical inquiry into the power dynamics and ideological underpinnings embedded within communication. This critical orientation aligns with the principles of Critical Discourse Analysis (CDA), which interrogates the power structures and ideological hegemonies manifested through discourse. CDA reveals the ideologies, power relations and social structures that lie behind a text. By integrating multimodal perspectives into this critical framework, Multimodal Critical Discourse Analysis (MCDA) offers a nuanced understanding of how diverse semiotic resources interact to construct meaning and reinforce or challenge dominant societal narratives and power structures. As Machin (2013) illustrated, MCDA is an innovative approach that draws upon the core theoretical principles underlying critical discourse analysis. In MCDA, scholars tend to integrate qualitative and quantitative methods, using text, image, and discourse analysis techniques, to comprehensively analyze the symbolic system in multimodal texts, addressing issues related to politics (Milner, 2013; Ledin & Machin, 2018; Breazu, 2023), advertisements (Roderick, 2013; Chen & Eriksson, 2019; Kenalemang, 2022), news (Bednarek & Caple, 2014; Serafis et al., 2019), technoculture (Brock, 2018; Djonov & Van Leeuwen, 2018; Bouvier & Machin, 2023), education (Bower & Hedberg, 2010; Weninger, 2020; Luo et al., 2021), and so on.
Additionally, multimodal discourse analysis has a close connection with metaphor analysis in that metaphor, be it rhetorical or cognitive, widely exists in human communication and is consciously or subconsciously used by language users to construct and convey meaning. Metaphor, as a fundamental element of cognitive linguistics, enables the visualization of abstract concepts and complex emotions in a discourse. The introduction of metaphor to multimodal discourse allows for a comprehensive analysis of the articulation of modal density and media richness in meaning construction. Forceville (1994) initially proposed the concept of “pictorial metaphor,” extending the study of metaphor from the traditional linguistic domain to images. This study led to the expansion of metaphor research from words to the multimodal field, which includes images, sounds, and actions, among others. The interpretation of multimodal metaphor is frequently contingent upon the shared experience of the creator and the audience, and is closely related to idealized cognition. To interpret and hence fully understand the creativity and meaning of a multimodal metaphor, it is necessary to go beyond verbal metaphor and consider the non-verbal factors such as vision, hearing and the other multi-sensory modalities that generate the multimodal metaphor (Forceville, 2024).
For instance, Pérez-Sobrino (2016) examined a multimodal corpus containing 210 advertisements, analyzing the frequency of conceptual operations such as metaphor and metonymy in advertisements, the utilization of modal cues, as well as the impact of advertising patterns and marketing strategies on the complexity of concepts in advertisements. The study found that metaphor is the most commonly used conceptual operation in advertisements, and that advertising patterns significantly influence the conceptual complexity of advertisements. Xu et al. (2022) focused on the use of multimodal metaphors in school education. They investigated the metaphorical understanding of the nature of English learning among intermediate English learners in Iran, collecting data through both verbal and non-verbal (pictorial) tools. This study revealed that learners expressed positive attitudes towards English learning through linguistic and visual metaphors, perceiving learning as a joyful, dynamic, and individual discovery process. This research provides new insights into understanding the learning experiences of second language learners. Additionally, Abdel-Raheem (2021) suggested that images of Covid-19, often described as “war” or “waves,” have potential drawbacks and pointed out that few people had studied the origin of the metaphors used to describe this disease.
In summary, semiotics and systemic functional linguistics have laid a solid foundation for constructing and analyzing multimodal discourse, addressing diverse issues in different scenarios. Meanwhile, the methods frequently adopted in discourse analysis have also been widely employed in multimodal discourse research, including conversation analysis, mediated discourse analysis, CDA, MCDA, and multimodal metaphor analysis. The adoption of these methodologies has contributed substantially to the development of multimodal discourse studies across the world.
A Brief Review of Bibliometric Analyses of Multimodal Discourse Studies
Bibliometric analysis involves the quantitative analysis of comprehensive knowledge systems in specific fields using statistical and mathematical methods, including document retrieval, data cleaning, and data analysis. The past 20 years have seen a prosperous and sustainable development of multimodal discourse studies worldwide. Scholars have attempted to employ bibliometric methods to visually demonstrate and analyze the development of research on multimodal discourse over the past 10 to 20 years.
In the international academic community, a limited number of scholars have performed bibliometric analyses on multimodal discourse studies (MDS) literature. For instance, Caple et al. (2018) employed the visualization tool Kaleidographic to examine the multimodal construction of news value on Facebook, revealing the tool’s ability to uncover complex symbolic mode intersections often overlooked in static representations. Similarly, Huan and Guan (2020) utilized VOSviewer to analyze discourse analysis papers from 1978 to 2018 in Scopus, noting a shift from the United States’ dominance to a more global contribution, with countries like Britain, Australia, and China increasingly influencing the field. Sun et al. (2021) applied CiteSpace to assess linguistic studies on social media from the WoS Core Collection, providing insights into the field’s status, methods, and key topics over the past decade. These studies demonstrate the effectiveness of bibliometric tools in visually mapping and tracking the evolution of MDS. However, the scarcity of such reviews emphasizes the need for further exploration in MDS to enhance understanding of its development and impact in the academic community.
In addition to the bibliometric reviews of multimodal discourse studies among international scholars, review articles focusing on multimodal discourse analyses have also been published in China. For instance, Guo (2016) reviewed international literature on MDS up to 2016, using data from the WoS and focusing on articles indexed in the Social Science Citation Index (SSCI) from 2004 to 2015. This review traced the shift in international MDS research from descriptive to empirical studies and contrasted it with the early stage of MDS in China, where there was a stronger focus on textual and visual analysis in fields like education, drama, film, and literature, with less emphasis on quantitative and empirical research.
Furthermore, Chen and Francisco (2017) conducted a comprehensive bibliometric analysis of MDS literature, referencing both the China National Knowledge Infrastructure (CNKI) and WoS databases. They used CiteSpace to examine MDS-related papers published between 1970 and October 12, 2017, including those from SSCI journals in WoS, and national core journals and the Chinese Social Sciences Citation Index (CSSCI) journals in CNKI. Their findings indicated that while the international academic community prioritizes the development of multimodal discourse analysis frameworks, China’s research lags behind in theoretical exploration of MDS research areas, highlighting a significant difference in research focus between the two academic communities.
Bibliometric studies in China have extensively examined multimodal discourse studies, providing insights into the field’s global standing within a defined timeline. Cheng and Zhang (2017) evaluated MDS research in 11 CSSCI-indexed Chinese journals from 2003 to 2015, revealing swift growth in the field along with challenges in theoretical depth, interdisciplinary application, and methodological diversity. Xu (2019) concentrated on CNKI publications of MDS-related papers from 2009 to 2018, observing a predominance of theoretical discussions over empirical research. Moreover, Shi and Xu (2020) scrutinized 158 papers on online multimodal discourse from 2004 to 2018, showcasing a variety of research methods including discourse analysis, corpus methods, interviews, questionnaires, and eye-tracking experiments. Additionally, Shi (2021) reviewed Chinese MDS from 2000 to 2020, emphasizing the innovation in applied research, particularly in the applications of technology. Collectively, these studies reflect an expanding scope and a growing inclination towards empirical methodologies in Chinese MDS research.
Previous literature reviews have underscored the growing diversity in the form and meaning of multimodal discourse studies. China has notably taken the lead in bibliometric analyses, highlighting the necessity for a global overview of MDS, especially by scholars from around the world. Although an initial review of Web of Science articles indicates that MDS spans multiple disciplines, there is a scarcity of studies that explore the most recent advancements in MDS. Crucial inquiries about its evolving trends, accomplishments, interdisciplinary synthesis, and prospective impacts have yet to be thoroughly addressed.
In response to these gaps, the current study endeavors to utilize a bibliometric approach to conduct an extensive review of international MDS research from the past twenty years. This effort aims to provide a more holistic understanding of the field’s trajectory, achievements, and potential directions for future research.
Research Design
Working Definitions of Some Key Terms
To clearly profile the development and status quo of multimodal discourse studies in the international arena, this study provides working definitions of some key terms that will be frequently referred to. Firstly, the term “international academic community/circle” in this study, as noted in the Introduction, refers specifically to the academic community of scholars, researchers, educators, and research institutes that write and publish research papers and books in English and engage in research on multimodal discourse analysis and related fields around the world. The community covers a diverse range of disciplines, including linguistics, communication, sociology, anthropology, education, and psychology. In contrast, “the Chinese academic community” or “the academic community in China” refers to Chinese scholars who write research papers in Chinese and publish them in Chinese journals. Secondly, the modifier “English” in expressions like “English literature,” “English journals,” and “English international journals” only highlights the use of the English language.
Selection of the Bibliometric Tool and Literature Databases
A variety of bibliometric tools, such as COOC, CiteSpace, VOSviewer, Cite, and HistCite, are useful and effective for researchers to mine literature data and explore research innovation. Among them, CiteSpace is a tool for multi-dynamic citation analysis, which can help researchers draw various knowledge maps to explore the key paths and turning points in the development of a specific discipline or research field, and identify frontiers and development trends of the given research area. It can detect and visualize trends in disciplines over time, thereby revealing the dynamics of the profession from the cutting edge to the basics (Chen, 2006). CiteSpace’s visualization function enhances readers’ understanding of knowledge flow, evolving research priorities, and emerging trends of a specific research field, thus providing readers with a comprehensive view of the profession’s progression and forecasting its future paths. The studies reviewed above, such as Chen and Francisco (2017), Shi and Xu (2020), and Shi (2021), also exemplify the role of CiteSpace in demonstrating the development of a given academic area clearly and systematically.
In addition, CiteSpace offers various functions for literature analysis, such as portraying emerging trends and tracking topics, labeling co-citation clusters, and time-zone visualization. This can be a valuable aid in addressing research questions such as understanding the trajectory, research dynamics, and research themes of MDS. Therefore, this study attempts to use CiteSpace 6.1.R6 to carry out a bibliometric analysis of the selected citation data and generate knowledge maps to analyze multimodal discourse studies in the international academic community over the past 20 years.
Given that a large number of international journals are written in English, one of the widely-spoken international languages, this study confines the scope of research to papers in English and published in core international journals included in the multidisciplinary literature database, Web of Science (WoS). WoS contains an authoritative collection of over 20,000 research articles, proceeding papers, book reviews, book chapters, early access, and more. Analyzing papers on MDS collected from WoS can ensure the authority and comprehensiveness of the literature review of MDS in the international academic community in recent decades.
Research Procedures
On the Web of Science Core Collection’s basic retrieval page, we selected “English” as the language and utilized “multimodal discourse,” “multi-modal discourse,” and “multimodality discourse” as subject terms for our search. Each term was entered into the search box with “Topic” as the scope, yielding three distinct search results. Our initial review revealed that 1997 marked the first inclusion of multimodal discourse analysis papers in WoS. Consequently, this study’s bibliometric analysis spans from January 1st, 1997, to December 2nd, 2023, broadening the time frame beyond Guo’s (2016) review. This approach provides a more comprehensive perspective on the evolution of MDS research within the international academic sphere.
After configuring the search parameters, we initiated the literature search, yielding results categorized as “Article,” “Review Article,” “Proceedings Paper,” “Editorial Material,” “Early Access,” and “Book Review.” From WoS, we extracted a total of 1,693 MDS-related entries in plain text format. These were subsequently imported into CiteSpace for deduplication, focusing on “article,” “review article,” and “proceedings paper” (hereafter referred to as “research papers”). The deduplication process culminated in the selection of 1,314 unique entries for subsequent bibliometric analysis.
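The retrieval steps above (export in plain text, keep three document types, remove duplicates) can be sketched in code. The study itself relied on CiteSpace's built-in duplicate removal; the Python below is only an illustrative stand-in under stated assumptions: that the export uses the usual two-letter field tags (TI, DT, PY, UT), that continuation lines are indented, and that each record ends with an "ER" line. The function names and the choice of the UT accession number as the duplicate key are hypothetical and are not taken from the paper.

```python
from typing import Dict, List

# Document types retained as "research papers" in the procedure above.
KEPT_TYPES = {"article", "review article", "proceedings paper"}

Record = Dict[str, List[str]]


def parse_wos_plaintext(text: str) -> List[Record]:
    """Parse a WoS-style plain-text export into records.

    Each record maps a two-letter field tag to the lines stored under it.
    Assumes tagged lines look like 'TI Some title', continuation lines start
    with a space, 'ER' closes a record, and 'EF' closes the file.
    """
    records: List[Record] = []
    current: Record = {}
    tag = None
    for line in text.splitlines():
        if not line.strip():
            continue  # blank separator between records
        if line.startswith(" "):
            if tag is not None:  # continuation of the previous field
                current[tag].append(line.strip())
            continue
        if line.rstrip() == "ER":  # end of record
            if current:
                records.append(current)
            current, tag = {}, None
        elif line.rstrip() == "EF":  # end of file
            break
        else:
            tag = line[:2]
            current.setdefault(tag, []).append(line[3:].strip())
    return records


def dedupe_and_filter(records: List[Record]) -> List[Record]:
    """Keep one copy of each record whose document type is a research paper."""
    seen = set()
    kept: List[Record] = []
    for rec in records:
        # DT may hold several types, e.g. 'Article; Early Access'.
        types = {t.strip().lower() for t in ";".join(rec.get("DT", [])).split(";")}
        if not types & KEPT_TYPES:
            continue
        # Assumed duplicate key: accession number, else the lower-cased title.
        key = rec.get("UT", [""])[0] or " ".join(rec.get("TI", [])).lower()
        if key in seen:
            continue
        seen.add(key)
        kept.append(rec)
    return kept
```

Because the three search terms were run separately, the same paper can appear in more than one export; keying on a stable identifier is one simple way to collapse such overlaps, although CiteSpace's own matching rules may differ.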
Results and Discussion
In our study, we integrate quantitative and qualitative methods, employing CiteSpace to generate and analyze the knowledge maps of multimodal discourse studies (MDS) literature, encompassing the three aforementioned categories. These maps cover a comprehensive set of literature details, such as publication volumes, contributing authors, publishing countries and institutions, research keywords and categories, and influential works. Utilizing this extensive dataset from WoS, we aim to outline the overall development, key research areas, and distinctive features of MDS within the international academic community. Additionally, we highlight the latest emerging foci of MDS. Through this approach, we address the two main research questions.
Volumes of Publication
The number of MDS publications from the international academic community is a clear indicator of the field’s growth rate. Based on the results of the aforesaid searches on WoS and the duplicate removal function of CiteSpace, we drew a bar chart of the number of English-language publications on MDS from 1997 to 2023 (see Figure 1).

Figure 1. Number of MDS publications in English international journals from 1997 to 2023.
Figure 1 clearly presents the overall development trajectory of MDS in the past two decades. It is apparent that the initial stage of MDS development, from 1997 to 2011, saw significantly fewer publications, ranging from 1 to 5 per year. This indicates that multimodal discourse initially did not receive much attention from international scholars. In 2012, however, the number of MDS publications surged to 33. From 2012 to 2015, the number of publications fluctuated between 33 and 57. Since 2016, MDS in the international academic community has seen a generally upward trend, indicating growing international academic attention to multimodality. The number of studies on multimodal discourse reached 227 in 2021, reflecting heightened interest from the international academic community in multimodal discourse studies in the past 5 years.
Additionally, Figure 1 highlights a pivotal moment in 2012 for the growth of MDS publications. Utilizing the “Refine” function of the WoS database, we identified “Educational Studies” as the prevalent research theme, with 42% of the 33 papers focusing on multimodality in education. Notably, Hampel and Stickler’s (2012) study on videoconferencing in language classrooms garnered the most citations. Their work emphasized the importance of multimodal interaction, including written and spoken elements, and the adaptations made by both learners and educators to succeed in online environments. The study’s novelty lies in its exploration of diverse interaction modes in online courses, a concept that was gaining traction and influencing teaching and learning. Hampel and Stickler (2012) emphasized that video meetings enrich media and modal density, a concept previously defined by Norris (2004). The rapid evolution of videoconferencing and its integration into daily communication has spurred a notable increase in MDS research since 2012, focusing on meaning construction through multimodal resources.
A noticeable finding pertains to the year 2019, where our refined search results using the WoS database revealed a doubling in publications in both “Linguistics” and “Educational Studies” compared to 2018. Moreover, there was a remarkable surge in MDS publications within the realm of “Psychology.” Additionally, by 2021, MDS publications peaked, signaling a substantial expansion in research areas, including emerging topics like “Women’s Studies.” These shifts indicate the close integration of MDS with other academic domains, addressing human concerns such as education, mentality, and gender issues.
Overall, from 1997 to 2023, publications related to MDS exhibited a fluctuating growth trajectory, progressing in waves. This dynamic evolution reflects the diverse topics and perspectives explored within the field, highlighting the multifaceted impact of multimodal discourse across various disciplines.
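The yearly tallies behind a chart such as Figure 1 amount to counting records by publication year. The sketch below is a minimal illustration, not the procedure actually used to draw the figure; the function names are hypothetical, and the input is assumed to be a plain list of publication years, one per deduplicated record.

```python
from collections import Counter
from typing import Dict, Iterable


def publications_per_year(years: Iterable[int], start: int, end: int) -> Dict[int, int]:
    """Count records per year, keeping years with zero publications."""
    counts = Counter(years)
    return {year: counts.get(year, 0) for year in range(start, end + 1)}


def text_bar_chart(per_year: Dict[int, int], width: int = 30) -> str:
    """Render the yearly counts as a simple text bar chart."""
    peak = max(per_year.values(), default=0)
    lines = []
    for year, n in per_year.items():
        # Scale each bar relative to the busiest year.
        bar = "#" * (round(width * n / peak) if peak else 0)
        lines.append(f"{year} {bar} {n}")
    return "\n".join(lines)
```

Keeping zero-count years in the result matters for a series like this one, where the early period has very few papers; dropping empty years would visually compress the slow start that the figure is meant to show.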
Author Cooperation and Co-occurrence of Cited Author
The “Author” function of CiteSpace is instrumental in illustrating the collaboration network among scholars, drawing upon extensive literature to depict academic cooperation within a specific field. We set “NodeTypes” to “Author” in the CiteSpace operation interface and then ran the function on the aforesaid 1,314 research papers to obtain the map of author cooperation (see Table 1 and Figure 2).
Table 1. Top Ten Authors in Number of MDS Publications.

Figure 2. The knowledge map of author cooperation.
Table 1 lists the top ten authors ranked by their number of publications on MDS over roughly the past two decades. As presented in Figure 2, running the “Author” function in CiteSpace generated a total of 404 nodes and 318 connections. The ten most active scholars, with relatively large numbers of publications on MDS, are David Machin, Göran Eriksson, Ahmed Abdel-Raheem, Sabine Tan, John A. Bateman, Peter Wignell, Gwen Bouvier, Ariel Chen, Kay L. O’Halloran, and Per Ledin.
As can be observed from Figure 2, the collaborative network of MDS scholars displays moderate density and forms four relatively prominent collaborative clusters. The first, a team studying multimodal neurocognition, mainly comprises Adolfo M. García, Lucas Sedeño, Agustin Mariano Ibanez, Agustina Birba, and Kathy A. Mills. The second, a group focusing on multimodal linguistics and multimodality in education, consists of John A. Bateman, Kevin Chai, Kay L. O’Halloran, Peter Wignell, Sabine Tan, Fei Victor Lim, and others. The third, a team concentrating on multimodal corpora, is represented by Jaroslav Kadlec, Sebastien Bourban, Mael Guillemot, Mike Flynn, Simone Ashby, Jean Carletta, Iain McCowan, and Pierre Wellner. The fourth, a group investigating MDA under the framework of communication and linguistics, is primarily composed of David Machin, Per Ledin, Göran Eriksson, Gwen Bouvier, Ariel Chen, and Petre Breazu. Among these groups, the scholars in the second and fourth teams are more active and influential, given that more links connect their members. The author cooperation network offers a snapshot of the ecology of multimodal discourse studies in the international academic arena.
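The nodes and connections of an author cooperation map can be understood as a co-authorship graph: each author is a node, and two authors are linked when they share a paper. The sketch below illustrates that idea only. It is not CiteSpace's algorithm, which applies its own selection thresholds and pruning, so the counts it yields need not match the 404 nodes and 318 connections reported above. The function name and the placeholder author labels are hypothetical.

```python
from collections import Counter
from itertools import combinations
from typing import Iterable, List, Set, Tuple


def coauthor_network(author_lists: Iterable[List[str]]) -> Tuple[Set[str], Counter]:
    """Build a co-authorship graph from per-paper author lists.

    Returns the set of author nodes and a Counter mapping each unordered
    author pair to the number of papers the two authors share.
    """
    nodes: Set[str] = set()
    links: Counter = Counter()
    for authors in author_lists:
        unique = sorted(set(authors))  # ignore repeated names within a paper
        nodes.update(unique)
        for pair in combinations(unique, 2):
            links[pair] += 1  # one more shared paper for this pair
    return nodes, links


# Placeholder author labels, not data from the study.
papers = [["Author A", "Author B", "Author C"], ["Author A", "Author B"], ["Author D"]]
nodes, links = coauthor_network(papers)
print(len(nodes), len(links), links[("Author A", "Author B")])
```

In this representation, a cluster such as the ones described above corresponds to a group of nodes joined by many links, while a single-author paper contributes a node but no link.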
Additionally, a co-citation analysis of cited authors is another crucial means of identifying prominent authors and research within an academic field. When two or more authors are cited together in the same paper, they establish a co-citation relationship, which grows stronger the more papers cite them together (Chen, 2006). Utilizing the “Cited Author” function of CiteSpace, we generated Table 2 and Figure 3, which depict the frequently cited authors and the co-occurrence of cited authors in the field of MDS within the international academic community.
Table 2. Top Ten Frequently-Cited Authors in MDS Field.

Figure 3. The knowledge map of co-occurrence of cited authors.
Table 2 lists the top ten most frequently cited authors in the MDS field. “Count” indicates the number of times an author has been cited, and “Year” denotes the year the work was first cited by others. The leading authors in terms of citations are Gunther Kress, Theodoor Van Leeuwen, David Machin, Norman Fairclough, M. A. K. Halliday, Sigrid Norris, Charles Goodwin, Carey Jewitt, John A. Bateman and Kay L. O’Halloran. These scholars have significantly contributed to the advancement of MDS. Moreover, the connectivity of the individual author nodes in Figure 3 highlights that the frequently co-cited authors tend to foster stronger collaborative relationships with one another.
A look at Figures 2 and 3 helps observe that three authors, David Machin, John A. Bateman, and Kay L. O’Halloran, have engaged in closer collaborative research with their peers. Their scholarly endeavors have greatly propelled the development of multimodal discourse studies, serving as both inspiration and guidance for fellow researchers and shaping the trajectory of MDS. Consequently, their academic contributions are widely acknowledged as pivotal catalysts in the evolution of this field.
Countries and Institutions of Publications
To examine the geographic distribution of international scholars contributing to multimodal discourse studies, we utilized CiteSpace’s “Countries” function, which organizes citation data by country and publication count. This method enables us to assess and rank countries by their scholarly contributions to MDS literature over the past 20 years. The findings are detailed in Table 3 and illustrated in Figure 4.
Top Ten Countries in Number of MDS Publications.

The knowledge map of countries of MDS publications.
In Table 3, “Count” presents the total number of MDS publications from scholars in each country from 1997 to 2023 (up to 2 December 2023). “Year” denotes the year when scholars from the country published their inaugural research on multimodality. As Table 3 outlines, the United States, England (which CiteSpace identifies as the United Kingdom), China, Australia, and Spain are the top five countries with the highest volume of MDS publications, in that order.
To delve deeper into multimodal discourse research by scholars from the top ten countries in terms of MDS publications, we utilized the “Refine” function of WoS to identify prevalent research topics in this domain. Subsequently, we analyzed the abstracts of the articles authored by scholars from these countries. Linguistics, communication, education, and humanities were identified as common themes among scholars, albeit with varying emphases on specific topics.
To encapsulate the pivotal characteristics of multimodality studies by scholars from the top five countries, we highlighted the defining features of their research, offering a snapshot of the global academic discourse on MDS.
American scholars have explored a diverse array of topics within MDS, encompassing sociology, psychological experiments, and rehabilitation. For example, Choe (2019) combined interactional sociolinguistics with conversation analysis to study the multimodal construction of social eating in Korean livestream mukbang. Azevedo and Gašević (2019) investigated cognitive and social processes in self-regulated learning across various domains and age groups, demonstrating the value of interdisciplinary research in advancing learning technologies. Grosz et al. (2023) performed a semantic analysis of face emojis, revealing their distinctive role in contemporary written communication. These studies illustrate the American scholars’ focus on a broad spectrum of social and psychological aspects within MDS.
Scholars from the UK have broadened their scope to multimodal discourses across various fields, covering management, women’s studies, psychology, sociology, and environmental science. For instance, Brookes et al. (2016) scrutinized campaign texts from the Start4Life government initiative, focusing on the discourses surrounding infant feeding practices in UK health promotion campaigns. Doyle et al. (2020) delved into corporate climate branding, particularly Unilever’s campaigns, to reveal the company’s strategy of leveraging emotional appeals to frame consumption as a form of “climate care.” Stead et al. (2023) explored the evolving recognition and legitimization of female leaders in British media, highlighting the interplay of gender and class dynamics. These studies demonstrate the UK scholars’ commitment to employing multimodal resources to tackle significant social concerns such as health, climate, and gender.
In contrast, Chinese scholars have exhibited a pronounced inclination towards linguistics, communication, education, humanities, and computer science. Van Den Hoven and Yang (2013) introduced a method for analyzing multimodal public discourse, emphasizing the accountability of rhetors and the systematic reconstruction of their arguments, exemplified by a case study on Hu Jintao’s state visit to the United States. Wang and Feng (2021) investigated how the city of Xi’an used TikTok to blend local folk art with modern elements, crafting a digital image that merges its historical and modern identities. Machin and Liu (2023) conducted a critical discourse analysis of the UN’s 2030 Agenda for Sustainable Development document, emphasizing the role of semiotic resources in enhancing the document’s rhetorical impact. Critics, however, have pointed out that the 17 Sustainable Development Goals may obscure issues due to their vaguely defined, overlapping, and redundant nature. These studies indicate that Chinese scholars often concentrate on themes of international relations, social media, and societal advancement within the realm of MDS.
Australian scholars have concentrated on a range of subjects, particularly in sociology, psychology, management, sports tourism, and education. O’Halloran et al. (2016) introduced a novel method combining multimodal discourse analysis with data mining and visualization to scrutinize the language and imagery used in online propaganda by violent extremist groups, revealing their recruitment and legitimization strategies. Wrench and Garrett (2018) explored the role of sports media in shaping perceptions of sports culture, gender, race, and other social dynamics. Tang (2023) created frameworks for socio-semiotic multimodal analysis and multimodal interaction analysis to study the materiality of science learning, highlighting the importance of material objects in science education and teacher-student interaction. These studies illustrate the application of multimodal resources to understand and address a variety of social, sports, business, and educational issues.
Spanish scholars have mainly focused on sociology, communication, information science, and cultural studies. Morell (2015) analyzed the use of English as a Lingua Franca in international conferences, proposing a Systemic Functional Linguistic and multimodal framework to improve communication by incorporating diverse modes. Helm and Dooly (2017) discussed the transition in human communication from text-based to multimodal, emphasizing the importance of accurately transcribing and representing multimodal data for online interaction. In addition, Civila and Jaramillo-Dent (2022) studied the cultural hybridization of Spanish-Moroccan mixed couples through their TikTok videos, showing the influence of Moroccan culture on their identities. Overall, Spanish scholars frequently address issues of social communication.
The research output from the top five countries in MDS publications is significant, demonstrating the broad applicability of multimodal discourse across various disciplines. Scholars from these countries have notably contributed to the cross-disciplinary evolution of MDS, impacting fields such as linguistics, communication, sociology, education, cultural studies, psychology, computer science, and neuroscience. Scholars in different countries, influenced by their unique social contexts and academic environments, tend to prioritize issues relevant to their societies. As a result, MDS scholars exhibit distinct research focuses and interests, reflecting their diverse experiences and academic backgrounds.
Further analysis into the institutional affiliations of scholars active in multimodal discourse studies has revealed key institutions with significant contributions to the field over the past two decades. Figure 5 provides a visual representation of the geographical distribution of these leading MDS scholars, showcasing their affiliations across various countries. Complementing this, Table 4 offers a detailed breakdown of the publication output of select top-performing institutions, including the total number of MDS papers they have produced and the year in which they published their first MDS-related paper.

The knowledge map of major institutions of MDS publications.
Top 17 Institutions in Number of MDS Publications.
The data not only highlight the institutions that have been at the forefront of MDS research but also provide insights into the historical development of the field within these academic environments. Notably, institutional support and collaboration play a significant role in advancing scholarly work in multimodal discourse analysis and related interdisciplinary areas.
Figure 5 illustrates the dominance of universities in multimodality research, as shown by the institution distribution map. The top 17 contributors to MDS literature are diverse and international, including institutions from Sweden, Australia, Singapore, the UK, Germany, China, Canada, Denmark, Belgium, and Spain. These institutions have made significant contributions to the field, highlighting their crucial role in advancing MDS.
Specifically, Table 4 indicates that Sweden has distinguished itself with the highest volume of MDS publications, reflecting a strong research presence in the field. Australia and Singapore also show notable contributions to MDS, with significant academic outputs from institutions in the UK, Germany, Denmark, and Spain. Furthermore, Canada, Belgium, and China have emerged as key contributors to the development of MDS.
The global distribution of MDS research reflects the field’s extensive reach. The growth of MDS is likely driven by advancements in information technology and the widespread use of multimedia in communication. These developments encourage researchers to explore diversified channels for meaning expression, enhancing the discourse on multimodality and its impact across academic and practical fields.
Research Categories
We conducted a comprehensive analysis of MDS research categories from 1997 to 2023 using advanced Boolean searches on WoS. The search, refined with “Exact Search” in “Less Options,” included specific topics, document types, and language settings, spanning January 1st, 1997, to December 2nd, 2023. Employing established keywords for MDS literature, we targeted “Article,” “Review,” and “Proceedings Paper” document types. After compiling a list of matched papers, we utilized the “Web of Science Categories” to categorize the research within MDS.
Table 5 outlines the principal research categories of MDS during the specified period, illustrating robust interconnections between MDS and various academic disciplines such as linguistics, communication, education research, sociology, psychology, management, women’s studies, computer science, artificial intelligence, rehabilitation, audiology, speech-language pathology, business, neurosciences, cultural studies, and more. The array of research categories within MDS not only aids in comprehending how language shapes meaning and facilitates communication but also directs scholars’ focus towards exploring the role of abundant semiotic symbols such as images and sounds, in enhancing human interaction and constructing discourse systems.
The Categories of Frequently Discussed Topics in MDS Publications.
In the international academic community, scholars have delved into the use of multimodality in non-linguistic educational contexts. For instance, Chen and Herbst (2013) highlighted the role of gestures, linguistic resources and other interactive forms like diagrams in enhancing students’ understanding of geometric reasoning in classrooms. Similarly, Herakleioti and Panagiotis (2016) demonstrated the multimodal nature of meaning generation in kindergarten children’s engagement with scientific concepts.
Multimodal discourse analysis also significantly contributes to arts and humanities. Krisjanous (2016) analyzed how Dark Tourism websites use multimodal channels to engage prospective tourists, and O’Halloran et al. (2021) developed a platform integrating multimodal discourse analysis with computational models for visual analysis of texts, images, and videos.
In women’s studies, multimodal discourse analysis has been used to examine the portrayal of female images in various contexts, such as advertisements. Lazar (2014) critiqued and analyzed the evolution of female representation in Singaporean jewelry ads, showing how multimodal resources can influence societal perceptions of gender.
Applied studies in audiology and speech pathology have also benefited from MDS. Graven et al. (2011) emphasized the benefits of multimodal intervention for emotional and physical recovery, and Kong and Law (2019) used multimodal neuroimaging, based on their establishment of a Cantonese aphasia database, to study aphasia, offering new insights into patient rehabilitation.
Overall, MDS enriches theoretical understanding in social sciences and humanities, and serves as a vital tool for analyzing social issues with practical implications in various fields, including education, arts, women’s studies, and medicine.
Keywords and Influential Literature
Analyzing keywords and influential literature is crucial for understanding the focal points of multimodal discourse studies and identifying the seminal works that have propelled the field over the past two decades, thus addressing the second research question.
Keywords act as succinct summaries that help researchers quickly identify the main research themes. In bibliometric analysis, various knowledge maps (such as co-occurrence, clustering, and temporal trend maps) can be created to examine keywords. We employed CiteSpace’s keyword function to generate these maps.
CiteSpace’s keyword function builds a co-word network by correlating keywords from the citation dataset, which surfaces the research highlights within a specific timeframe (Chen, 2006). These knowledge maps are instrumental for literature reviewers in forecasting the trajectory of a research field. CiteSpace’s accuracy and comprehensiveness are ensured by using the “Author Keywords” and “Keywords Plus” tags from the WoS database, which are specifically curated for each research topic.
We extracted keywords from the WoS database using CiteSpace, focusing on “Author Keywords” (DE tags) and “Keywords Plus” (ID tags). The former are provided by authors to emphasize their paper’s focus, while the latter are identified through text mining technology, reflecting the core issues in the field. By analyzing citation data with CiteSpace, we created a keyword co-occurrence map, highlighting the top 5 most frequent keywords per year, as detailed in Table 6 and illustrated in Figure 6.
Keywords of MDS Publications.

The knowledge map of keyword co-occurrence.
The keyword co-occurrence knowledge map is instrumental in revealing the interconnections and significance of keywords within a research field. In Table 6, “Count” indicates the occurrences of keywords in articles, while centrality in Figure 6 denotes their prominence in the network. Core issues in the MDS field are represented by frequently co-occurring keywords such as “discourse,” “language,” “multimodal discourse analysis,” “social media,” “communication,” “critical discourse analysis,” “identity,” and “social semiotics.” These terms highlight the central themes, theories, methodologies, and applications within MDS.
“Discourse” is the central research object, with “language” as a key component of MDS. The application of semiotic resources across contexts like social media, identity, literacy, politics, education, and gender illustrates MDS’s broad scope. Keywords like “knowledge,” “construction,” and “science” point to MDS’s focus on knowledge construction and cognition, indicating interdisciplinary trends.
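As an illustration of how a keyword co-occurrence map can be assembled, the sketch below selects the most frequent keywords per year and then counts how often the selected keywords appear together in the same record. The toy records and the function names `top_keywords_per_year` and `cooccurrence` are hypothetical, and CiteSpace’s actual node-selection and linking procedures are more elaborate than this simplified version.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical toy records standing in for WoS entries (DE/ID keyword fields).
records = [
    {"year": 2019, "keywords": ["discourse", "social media", "identity"]},
    {"year": 2019, "keywords": ["discourse", "language"]},
    {"year": 2020, "keywords": ["discourse", "social media"]},
    {"year": 2020, "keywords": ["social media", "identity", "language"]},
]


def top_keywords_per_year(recs, k=5):
    """Return the k most frequent keywords for each year (ties broken alphabetically)."""
    per_year = defaultdict(Counter)
    for rec in recs:
        # Count each keyword at most once per record.
        per_year[rec["year"]].update(set(rec["keywords"]))
    return {
        year: [kw for kw, _ in sorted(c.items(), key=lambda kv: (-kv[1], kv[0]))[:k]]
        for year, c in per_year.items()
    }


def cooccurrence(recs, keep):
    """Count record-level co-occurrences among the retained keywords."""
    pairs = Counter()
    for rec in recs:
        kws = sorted(set(rec["keywords"]) & keep)
        pairs.update(combinations(kws, 2))
    return pairs


selected = top_keywords_per_year(records, k=2)  # k=5 would mirror a top-5 setting
keep = {kw for kws in selected.values() for kw in kws}
pairs = cooccurrence(records, keep)
print(selected)
print(pairs.most_common())
```

Each retained keyword becomes a node and each counted pair becomes a weighted link; frequency corresponds to “Count,” while centrality would additionally require a network measure computed over these links.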
Using CiteSpace, we identified keywords with the strongest citation bursts, which indicate sudden prominence or significance during specific times, revealing emerging research directions. Figure 7 showcases the top 25 keywords with the strongest citation bursts in MDS literature over the past two decades, providing a snapshot of significant shifts and trends in the field.

The top 25 keywords with the strongest citation bursts map of MDS.
From Figure 7, it is evident that the top 25 keywords with the strongest citation bursts in MDS literature include terms like “discourse,” “language,” and “multimodal discourse analysis.” From 2012 to 2016, the most cited keywords, reflecting major research emphases, were “critical discourse analysis,” “organization,” “talk,” “comprehension,” and “conversation.” This period saw significant cross-disciplinary engagement, with MDS scholars drawing on theories and methods from linguistics, sociology, and communication studies to explore topics like conversation and institutional discourse.
From 2017 to 2023, the academic focus shifted towards keywords such as “conversation,” “organization,” “context,” “life,” “behavior,” “power,” “student,” “learner,” “self,” “work,” “news,” “ethnography,” “teacher,” “technology,” “discourse,” and “hand gesture.” These terms suggest a growing interest in communication, education, news media, and social networking, indicating a broadening scope in MDS research.
The evolution of these keywords over time reflects the dynamic nature of MDS research. Initially, the focus was on dialogues, social interaction, and cognitive processes, but later, the field expanded to include social network analysis, cultural studies, education, and natural language processing. This shift reflects the growing complexity and diversity of questions addressed within the MDS community.
The progression of social media and changes in communication and social behaviors have spurred multimodal discourse studies to provide new perspectives on the effects of various communication channels and social interactions. Technological advancements have further enabled researchers to analyze multimodal data more deeply, leveraging cutting-edge technologies such as computer vision and natural language processing. In addition, the widespread application of multimodal resources in areas like news reporting, everyday communication, human-computer interaction, and education has significantly influenced the development of key terms in MDS over the past 20 years. These factors have collectively contributed to the enrichment and evolution of the field, reflecting its adaptability to the changing landscape of communication and technology.
Additionally, the keyword clustering map is an essential tool for analyzing sub-fields within a research domain. It groups similar keywords, providing a clear view of topic distribution, emerging trends, and disciplinary overlaps. The map’s effectiveness is gauged by Q and S values, accessible through CiteSpace. The Q value, on a scale of 0 to 1, indicates significant clustering when it exceeds 0.3. The S value, ranging from −1 to 1, measures cluster quality, with a value above 0.5 suggesting high homogeneity and closely linked nodes (Ma et al., 2020).
Using CiteSpace, we produced a keyword clustering knowledge map in MDS (as shown in Figure 8) and pinpointed key concepts across various clusters (refer to Table 7). The map’s Q value of 0.4136 and S value of 0.7146 signify a clear and high-quality clustering structure. This indicates that MDS keywords can be categorized into distinct thematic groups, with closely related terms clustering together.
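For readers unfamiliar with these indices, the sketch below shows one common way a Q value can be computed and how the thresholds cited above can be applied. It assumes that Q denotes Newman’s modularity on an undirected, unweighted network; the toy graph, the partition, and the function names `modularity` and `interpret` are hypothetical, and CiteSpace’s internal calculation (for example, on weighted links) may differ.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman modularity Q for an undirected, unweighted graph.

    edges: list of (u, v) pairs; community: dict mapping node -> cluster id.
    Q = sum over clusters of (intra-cluster edge share) - (degree share)^2.
    """
    m = len(edges)
    if m == 0:
        raise ValueError("graph has no edges")
    intra = defaultdict(int)       # edges with both ends in the same cluster
    degree_sum = defaultdict(int)  # total degree of the nodes in each cluster
    for u, v in edges:
        degree_sum[community[u]] += 1
        degree_sum[community[v]] += 1
        if community[u] == community[v]:
            intra[community[u]] += 1
    return sum(intra[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum)


def interpret(q, s, q_threshold=0.3, s_threshold=0.5):
    """Apply the rule-of-thumb thresholds for Q and S described in the text."""
    return {
        "significant_structure": q > q_threshold,
        "homogeneous_clusters": s > s_threshold,
    }


# Hypothetical toy network: two triangles joined by a single bridging edge.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]
community = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}

q_toy = modularity(edges, community)
print(round(q_toy, 4))
# Thresholds applied to the Q and S values reported for the MDS keyword map.
print(interpret(0.4136, 0.7146))
```

The toy network exceeds the 0.3 threshold because most links fall inside the two groups rather than between them, which is the same property the reported Q value indicates for the MDS keyword clusters.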

The knowledge map of keyword clustering.
Top Terms in Different Keyword Clusters.
The keyword co-occurrence network consists of 11 clusters, numbered in descending order of size: the smaller the number in a cluster label of the form “#n,” the more keywords the cluster contains and the more prominent it is in the field. Figure 8 suggests that the foremost clusters, “#0 multimodal critical discourse analysis,” “#2 conversation analysis,” “#5 social semiotics,” and “#6 critical discourse analysis,” comprise the primary frameworks for analyzing multimodal discourse on a global scale.
Table 7 offers a comprehensive overview of the key issues in each MDS sub-field. The “Cluster ID” and “Top terms” columns delineate specific research directions or sub-fields and their central concepts. An examination of the “Top Terms” column reveals the primary research themes, subjects, and methodologies within each cluster.
As Table 7 demonstrates, the top four clusters in the international academic community of multimodal discourse studies focus on deciphering the meanings conveyed by written, spoken, and symbolic discourses, with the goal of addressing societal, political, and ideological issues. Meanwhile, clusters “#1 science,” “#3 gesture,” and “#7 youth” are more prevalent in applied research, providing substantial insights for social practices in diverse settings.
Notably, the “#4 Hong Kong” cluster is unique, named after a geographic location, indicating that keyword clusters in MDS are not solely about research methods and applications but also encompass studies centered around a specific place. To delve deeper into this cluster, we conducted an advanced search in the WoS using “Hong Kong” as the keyword. The findings highlighted a concentration of studies on education (Pun & Tai, 2021; Williams, 2020), advertising (Huang, 2023; Tse et al., 2023), and politics and society (Lou, 2017; Dynel & Poppi, 2021; Lams & Zhou, 2022).
For example, Pun and Tai (2021) explored the implementation of science education through translanguaging in a Hong Kong secondary school’s English medium of instruction (EMI) setting. Their study examined how students in a secondary English laboratory used multilingual and multimodal resources to build scientific knowledge, revealing that despite an English-only policy, students effectively constructed knowledge and engaged in scientific practices by leveraging their linguistic and semiotic resources. Similarly, Williams (2020) studied how fifth-grade bilingual students in a Hong Kong school utilized semiotic repertoires in content-based science lessons, identifying four common strategies where students used gestures and models to aid their translanguaging in class.
In addition to education, international scholars have explored the use of multimodality in social media in Hong Kong. For instance, Lams and Zhou (2022) analyzed Chinese state-society interactions during the 2019 Hong Kong protests, with a focus on “fanquan girls,” who were pro-China fans of pop stars. They found that the girls’ personification of China through various visual elements was a potent communicative method that reinforced the state’s discursive co-optation of society, rather than challenging authoritarian rule. In general, research on multimodal discourse in the “Hong Kong” cluster shows a diversity of topics, discourse varieties, and analysis approaches, enhancing discursive practices in Hong Kong.
Figure 8 also illustrates the interconnectedness of the eight clusters labeled 0 to 7, indicating a significant degree of keyword co-occurrence across MDS sub-fields. These clusters, while distinct, all relate to language, communication, discourse analysis, and sociology. They also reflect the multifaceted nature of multimodal discourse studies, which are applied in various contexts with unique characteristics. The clustering suggests that future research in this area will likely involve interdisciplinary collaboration to tackle a range of social issues. Moreover, with the rise of extended reality technologies, such as virtual reality, augmented reality, and mixed reality, these tools are poised to offer new dimensions and insights for the advancement of multimodal discourse research.
Beyond the keyword co-occurrence network, CiteSpace provides additional analytical tools. Keyword timeline maps trace the progression of keywords, highlighting trends and significant developments. Keyword timezone maps, on the other hand, show the temporal distribution of keywords, indicating the field’s growth and key research directions. These visualizations deepen our understanding of the field’s evolution and contribute to a comprehensive view of the research landscape. Using CiteSpace, we created maps that represent the keyword timeline (Figure 9) and keyword timezone (Figure 10).

The knowledge map of keyword timeline.

The knowledge map of keyword timezone.
Figure 9 depicts the evolution and interconnections of keywords across 11 high-frequency clusters in multimodal discourse studies over the past 20 years. Yellow points mark the initial appearance of keywords, with lines showing their development and connections, especially between clusters. Keywords in clusters 0 to 7 exhibit a continuous progression from their emergence to 2023, evidenced by their dense interconnectivity.
Figure 10, however, shows a remarkable change in the density of connections between keywords around 2010. Prior to this year, connections were sparse, suggesting a less integrated and interdisciplinary field in MDS. Post-2010, the connections between keywords across clusters increased, reflecting a more dynamic, diverse, and inclusive research environment, especially noticeable after 2020.
Analyzing the keyword timeline and timezone maps, we identified three distinct phases in the development of high-frequency keywords in MDS from 1997 to 2023: the initial phase (1997–2001), the progressing phase (2002–2012), and the maturity phase (2013–2023). These phases highlight the field’s research milestones and trends.
Figure 10 highlights the initial phase from 1997 to 2001, with emerging buzzwords like “natural language processing,” “multimodal systems,” “discourse,” and “health.” This period focused on natural language processing, multimodal technologies, discourse analysis, and elderly health, likely influenced by the rise of corpus linguistics and the aging global population.
In the progressing stage, keywords such as “memory,” “social interface,” “gesture,” “language,” “communication,” “multimodal analysis,” “critical discourse analysis,” “social semiotics,” and “education” gained prominence. Scholars emphasized multimodal interaction, language user experience, information processing, language comprehension, human-computer interaction, behavioral understanding, and the integration of social and cognitive sciences. This focus was driven by technological progress, growing societal demands for multimodal resources, and interdisciplinary collaboration between discourse analysis and other fields.
In the maturity stage, the focus of multimodal discourse studies shifted towards prevalent keywords such as “media,” “conversation analysis,” “classroom,” “identity,” “gender,” “politics,” “Twitter,” “Internet memes,” “women,” “health communication,” “corpus,” and “Hong Kong.” Scholars prioritized areas like language learning, identity, gender, social media, corpus analysis, sociology, politics, health communication, and regional contextual analysis. This trend reflects an alignment with societal needs and an effort to address real-world challenges through multimodality, enriched by the development of related disciplines.
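The three-phase periodization described above can be expressed as a simple lookup, sketched below in Python. The phase boundaries follow the text, whereas the first-appearance years assigned to the example keywords are hypothetical values chosen only to fall within the phases in which those keywords are reported; the names `PHASES` and `phase_of` are likewise illustrative.

```python
# Phase boundaries as stated in the text (inclusive years).
PHASES = [
    ("initial", 1997, 2001),
    ("progressing", 2002, 2012),
    ("maturity", 2013, 2023),
]


def phase_of(year):
    """Return the name of the development phase that contains the given year."""
    for name, start, end in PHASES:
        if start <= year <= end:
            return name
    raise ValueError(f"year {year} lies outside the 1997-2023 study window")


# Hypothetical first-appearance years, for illustration only.
first_seen = {
    "natural language processing": 1998,
    "gesture": 2005,
    "Twitter": 2016,
}

by_phase = {}
for keyword, year in first_seen.items():
    by_phase.setdefault(phase_of(year), []).append(keyword)

print(by_phase)
```

Grouping keywords by the phase of their first appearance in this way mirrors how a timezone map is read, with each keyword placed in the period in which it first emerges.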
The analysis of extensive keyword data reveals that MDS has emphasized solving a range of global social issues, including elderly care, environmental protection, educational improvement, health for the vulnerable, and cognitive development. Recent MDS developments demonstrate the effective integration of discourse analysis with various disciplines and the broad application of multimodal resources in contexts like social media, language learning, and political engagement. These findings demonstrate MDS’s potential to address numerous real-world challenges.
Furthermore, frequently-cited literature offers insights into prevalent research topics and influential theories within the field. High citation rates signify significant influence within the scholarly community. Our CiteSpace analysis provided a comprehensive examination of literature with high citation frequencies, as shown in Table 8, highlighting the impact of these works on the field of MDS.
Top Ten Most Frequently Cited Literature on Multimodal Discourse Studies.
Table 8 clearly demonstrates that the most frequently-cited works in multimodal discourse studies are predominantly books, with two exceptions being research articles. These texts are foundational for discourse analysis, particularly for multimodal approaches. The top six books each provide a theoretical framework or methodological approach to multimodal discourse analysis.
The book Doing Visual Analysis: From Theory to Practice by Ledin and Machin (2018) is the most cited, offering a comprehensive introduction to visual analysis theories like semiotics, iconography, and visual psychology. It equips readers with practical methods for observing, describing, and interpreting images and visual texts, highlighting the approach’s versatility across fields such as art, media, and social sciences. The authors emphasize the role of multimodal discourse analysis in revealing social phenomena, power dynamics, and inequalities, thus providing valuable insights for future research in the field.
The second key text, Multimodality: A Social Semiotic Approach to Contemporary Communication by Kress (2010), applies social semiotics to understand multimodal communication in today’s society. Oleksiak (2012) emphasizes Kress’s advocacy for a communication model that does not favor any single mode, whether writing, speech, or image, over others. The book concentrates on visuals and language as primary modes of interest, as noted by Forceville (2011). Specifically, Kress (2010) introduces a social-semiotic framework to explore how meaning is constructed across different modes, demonstrating the comprehensive nature of multimodality. The book also showcases how this theory can be applied to educational contexts and the analysis of mobile devices, as illustrated by Yang (2012), further expanding the practical applications of multimodal discourse analysis.
The third influential book, How to Do Critical Discourse Analysis: A Multimodal Introduction by Machin and Mayr (2012), is dedicated to the practical application of multimodal discourse analysis. It guides readers through a step-by-step process, covering essential concepts like visual and auditory grammar, genre analysis, and the application of multimodal discourse in fields such as media, education, and sociology. This book extends the scope of both visual analysis and critical discourse analysis, offering methods for more precise analysis of visual communication, as recognized by Han (2015). It stands as a valuable resource for scholars interested in the critical examination of multimodal communication.
Moreover, the book Multimodality: Foundations, Research and Analysis - A Problem-Oriented Introduction by Bateman et al. (2017) offers a comprehensive guide to the principles, research methods, and analytical techniques of multimodal discourse. The book employs a problem-oriented approach to engage readers in solving real-world issues, thereby enhancing their understanding of multimodal communication. It covers a wide range of multimodal types and semiotic resources, from face-to-face interactions to digital media, and categorizes common features across various disciplinary approaches to multimodal studies, as noted by Featherman (2018). The book also equips readers with methodologies and technologies for discourse, image, and sound analysis.
While not specifically focused on multimodal discourse analysis, the book Language and Power by Fairclough (2001) is significant for its examination of the interplay between language and power, offering insights for multimodal critical discourse analysis. Fairclough elucidates that language is socially conditioned and shapes social relations and ideologies, as discussed by Bacchini (2018). This perspective accentuates the role of textual analysis in revealing and transforming ideologies and power structures. The book lays the methodological and ideological groundwork for critical discourse analysis, which examines discourse’s impact on social reality. Its exploration of language’s role in power dynamics, emphasis on discourse analysis as a means to change reality, and contributions to critical discourse analysis make it a highly cited work. The book’s holistic research perspective solidifies its value for scholars in the field of multimodal discourse analysis.
In addition, David Machin’s research articles are frequently cited in multimodal discourse studies. In 2013, Machin defined discourse as encompassing participants, behaviors, goals, values, and activities, realized through various communicative processes. He discussed how multimodal critical discourse studies can address key issues by examining how discourses and ideologies are transmitted across different modes and genres, emphasizing the role of symbolic resources and genres in power relations and ideologies.
In another article, “The Need for a Social and Affordance-Driven Multimodal Critical Discourse Studies” (2016), Machin further developed a multimodal approach to critical discourse studies. He highlighted the need to reveal hidden ideologies in texts and how power structures re-contextualize social practices to maintain control. Machin also emphasized the importance of visual analysis, suggesting that its integration can deepen the analysis of multimodal critical discourse. He argued that discourse is present in symbols at every level, and these symbols shape ideology.
In essence, Machin’s (2013, 2016) work explores the relationship between discourse, ideology, and power, and highlights the role of discourse in constructing social realities, relationships, and ideologies.
Overall, the frequently cited books and articles in MDS provide essential theoretical and practical insights for scholars engaged in multimodal discourse analysis. These works have significantly influenced the research perspectives and frameworks within the field.
Conclusion
The growing relevance of multimodal discourse studies (MDS) is evident in the digital age. Utilizing CiteSpace 6.1.R6, this study traced the development of MDS in the international academic sphere from 1997 to 2023. By analyzing knowledge maps, the study outlined the progression of MDS, highlighting key research areas, collaboration networks, and interdisciplinary connections. These findings offer strategic guidance for future multimodal discourse research globally.
The publication trend in MDS demonstrates a clear increase from 1997 to 2023 (Figure 1). Research output grew steadily until 2021, with a significant spike between 2012 and 2015. The subsequent maturity phase from 2016 to 2023 saw a consistent rise in MDS publications, reflecting ongoing scholarly interest.
Knowledge maps (Figures 2 and 3) also reveal that certain researchers have established strong collaborative relationships across disciplines like linguistics, communication, education, neurocognition, and corpus analysis. However, such collaborations are not universally prevalent. The prominence of cited authors such as David Machin, John A. Bateman, and Kay L. O’Halloran highlights their influential contributions to MDS.
In terms of geographical contributions, the United States, the United Kingdom, China, Australia, and Spain are the leading nations in MDS research from 1997 to 2023 (Figure 4). Notable institutions contributing to MDS include Örebro University (Sweden), Curtin University (Australia), Nanyang Technological University (Singapore), Lancaster University (UK), University of Bremen (Germany), The University of Hong Kong, and The Hong Kong Polytechnic University (China) (Figure 5 and Table 4). Scholars from these institutions have been instrumental in advancing MDS, tackling diverse issues from politics and education to social media, health, cross-cultural communication, and identity studies.
Notably, multimodal discourse studies has established a strong interdisciplinary connection with fields such as linguistics, communication, educational research, sociology, psychology, and management. This integration recognizes the dual importance of language in meaning-making and the role of non-verbal symbols in communication. The bibliometric review of MDS literature suggests a promising future for the field’s expansion into new and varied domains.
Specifically, knowledge maps of keywords (Figures 6–10) provide a panoramic view of trends in MDS and its interdisciplinary evolution. They allow scholars to pinpoint key themes, monitor the field’s growth, and detect patterns of change. The analysis indicates a shift from an initial focus on linguistics, sociology, psychology, and communication between 2012 and 2016 to a more recent emphasis on education, sociology, anthropology, computer science, and social networking from 2017 to 2023. The rise of social media and advancements in digital technology highlight the growing significance of digital communication in MDS.
Furthermore, the evolution of virtual and augmented reality technologies has propelled research into immersive digital environments for diverse applications. Technological progress aids in the representation, processing, and analysis of multimodal data, enhancing its accuracy and reliability. This, in turn, empowers researchers to propose practical solutions and advocate for the application of multimodal data in various social contexts.
In addition, the foundational role of classic literature in multimodal discourse studies is undeniable, as it has significantly influenced the field’s trajectory over the past two decades. These seminal works have not only provided essential theoretical frameworks but also expanded our understanding of how symbols interact with social practices, how discourse wields power, and how it shapes ideologies.
However, the current literature review acknowledges its limitations. By concentrating on English-language publications within the international academic community, it may have overlooked the broader global landscape of MDS. To address this, future research endeavors should aim to incorporate a wider range of literature, including works in other languages, to achieve a more holistic and inclusive representation of the field worldwide. Moreover, while this review has made comparisons based on research topics across different countries and institutions, it also recognizes the need for future studies to delve deeper into the distinctive perspectives and contributions of scholars from varied cultural and academic backgrounds. This approach would further enrich the multifaceted nature of MDS and its ongoing development.
In conclusion, this study provides an overview of the development of MDS in the international academic community over the past two decades, highlighting its interdisciplinary integration with various fields. The findings suggest that MDS scholars should adopt diverse perspectives in their research. Furthermore, technological advancements in multimodal data mining, including natural language processing, speech recognition, and gesture recognition, will enable scholars to further explore the impact of semiotic resources on human life and social practices across a range of areas, including education, cognition, information dissemination, identity, healthcare, and social behaviors.
Footnotes
Acknowledgements
The authors would like to express their gratitude to Liya Zhu for her valuable contributions to this research.
Author Contributions
Huidan LIU conceived and designed the study; Huidan LIU and Lihua LIU analyzed the data; Lihua LIU contributed analysis tools; Huidan LIU wrote the paper; and Huadong LI revised the paper and handled correspondence for its submission and resubmission.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by a government project and a university project. The government project, “A Study on China-Related Reporting in Mainstream English Media of South Pacific Island Countries,” was funded by the Shanghai Philosophy and Social Science Planning Projects-Youth Project (grant number 2024EYY010). The university project, “Building of a Multimodal Corpus of Chinese Diplomatic Discourse,” was funded under grant number H20220352.
The name of the government funder: Shanghai. The name of the university funder: Shanghai Maritime University.
Ethical Considerations (Institution Name and Study Reference Number)
No animal or human studies are involved in this study.
Informed Consent
Informed consent was obtained from all participants included in this study.
Data Availability Statement
Data sharing is not applicable to this article as no datasets were generated or analyzed during the current study.
