Abstract
Semantic modality, a core notion in linguistics, concerns how language expresses meanings associated with possibility, necessity, and capability. Over the last 20 years, research on semantic modality has expanded significantly, highlighting its interdisciplinary importance in linguistics, cognitive science, and computational technology. This study uses bibliometric analysis to examine trends, thematic foci, and significant contributions in semantic modality research from 2005 to 2024. A total of 3,025 articles retrieved from the Scopus database were examined using VOSviewer and Biblioshiny. The results indicate a significant rise in publications, especially after 2016, propelled by progress in natural language processing (NLP), deep learning, and multimodal technologies. Thematic mapping revealed “Human” and “Semantics” as essential themes, emphasizing the integration of language theories with cognitive and computational applications. Furthermore, terms like “multi-modal” and “cross-modal” highlight the increasing interest in multimodal interactions, which include text, visual, and auditory data. Although China and the United States dominate the research landscape, the study highlights the need for broader regional and linguistic representation to improve cultural and linguistic diversity. Significant problems were observed, including biases in datasets and deficiencies in the effective integration of cross-media modalities. This study offers an in-depth analysis of the progression of semantic modality research and presents recommendations for subsequent investigations. It emphasizes the necessity of diversifying datasets, enhancing interdisciplinary cooperation, and utilizing advanced AI models like GPT-4 and CLIP to enrich semantic comprehension across many cultural contexts.
Plain Language Summary
Semantic modality, a core notion in linguistics, concerns how language expresses meanings associated with possibility, necessity, and capability. Over the last twenty years, research on semantic modality has expanded significantly, highlighting its interdisciplinary importance in linguistics, cognitive science, and computational technology. This study uses bibliometric analysis to examine trends, thematic foci, and significant contributions in semantic modality research from 2005 to 2024. A total of 3,025 articles retrieved from the Scopus database were examined using VOSviewer and Biblioshiny. The results indicate a significant rise in publications, especially after 2016, propelled by progress in natural language processing (NLP), deep learning, and multimodal technologies. Thematic mapping revealed “Human” and “Semantics” as essential themes, emphasizing the integration of language theories with cognitive and computational applications.
Introduction
Semantic modality is a crucial concept in linguistics that describes how language conveys meanings associated with possibility, necessity, and capacity. It concerns how speakers use language to convey varying degrees of certainty, obligation, and intention, making it relevant to both theoretical and applied linguistics (Bodnaruk, 2019; Palmer, 2001; Zhang, 2019). Semantic modality is closely linked to grammatical and lexical categories such as tense and aspect, although it uniquely conveys the speaker’s perspective and attitude toward the stated proposition (Bybee, 1994; Costa & Pessotto, 2022). Thus, semantic modality is crucial not only in linguistic studies but also in disciplines such as logic, philosophy of language, sociolinguistics, and computational linguistics, with applications in sentiment analysis, machine translation, and natural language processing (NLP).
Over the past two decades, research on semantic modality has progressed markedly. Early studies focused on the identification and analysis of modality markers in specific languages, emphasizing grammaticalization processes and their semantic distinctions (Jucker & Taavitsainen, 2010; Narrog, 2005; Pfau & Steinbach, 2006). Recent research has concentrated on cross-linguistic comparisons, diachronic evolution, and the cognitive underpinnings of modality expressions, enabled by technological advancements and computational approaches (Bayoudh et al., 2022; Dullieva, 2017; Tahmasebi et al., 2021). Corpus linguistics and large-scale analytical techniques have made it possible to map the distribution of modality markers across various genres and contexts (Aarts & Meijs, 2022; Meyer, 2023; Perkins & Roe, 2024). These advancements offer new opportunities to understand broader and more dynamic patterns in modality use. Despite substantial research efforts, a comprehensive analysis of trends and patterns in this field is absent, leaving gaps in the understanding of the global landscape of semantic modality studies.
Research on semantic modality has also attracted interest from other areas. In philosophy, modality pertains to epistemic and metaphysical logic, examining possibility and necessity within the realms of reality and knowledge (Elgin, 2023; Williamson, 2023). Modality analysis in education facilitates comprehension of how language influences teaching-learning processes and meaning construction in educational interactions (Lavoie & Pellerin, 2018; Von, 2024; Wan et al., 2020). In computational linguistics, the significance of semantic modality has increased due to improvements in machine learning technologies, including applications such as semantic role labelling, discourse analysis, and automated machine translation systems (Manning, 2015; Mitkov, 2022; Tsujii, 2021). Thus, an enhanced understanding of semantic modality research can aid in the formulation of more effective techniques across these various domains.
Notwithstanding considerable research endeavors, a thorough examination of trends and patterns in semantic modality research from a bibliometric standpoint has yet to be undertaken. Existing reviews remain fragmented and largely descriptive, often limited to particular languages or contexts without providing a systematic global mapping. Furthermore, little attention has been paid to the interpretive dimension, that is, how thematic clusters, author networks, and geographical patterns reflect broader theoretical or practical dynamics in the field. This research seeks to address these gaps by applying bibliometric techniques to publications in this domain from 2005 to 2024. The analysis delineates research trends and advancements in semantic modality while identifying principal authors, prevailing topics, and emerging trends in the worldwide literature. By integrating descriptive bibliometric evidence with interpretive analysis, this study seeks to uncover not only what the trends are, but also what they mean for the development of semantic modality research across disciplines.
This study addresses several key questions:
What are the publication and citation trends in semantic modality research from 2005 to 2024?
Who are the most influential authors and countries in this field?
What are the thematic clusters and key keywords defining semantic modality research?
How does author co-citation reveal thematic trends?
By offering insights into these research dynamics, this study aims to benefit not only linguists but also scholars from other disciplines interested in exploring the role of modality in language and communication.
Literature Review
Semantics, the study of meaning in language, is a fundamental part of linguistics that examines how words, phrases, and sentences communicate meaning (Hallman, 2017; Jabbar Alsaedi, 2018; Saeed, 2015; Szabó, 2009). Semantics serves as a fundamental pillar of the discipline, complementing phonology, which analyses sound systems, and grammar, which explores the structure of words and sentences (Saeed, 2015). The principal objective of semantics is to examine the correlation between language statements and their generated meanings, revealing patterns that allow speakers to comprehend unexpected sentences by deconstructing them into known elements (Hallman, 2017; Szabó, 2009). Fundamental elements of semantics encompass linguistic meaning, which investigates the relationship between language forms and their meanings, elucidating the processes by which meaning is formed and comprehended (Slabakova, 2018; Szabó, 2009). A crucial aspect is the recognition of consistent patterns connecting forms such as words or phrases to their meanings, providing insights into how speakers comprehend new sentences (Hallman, 2017). Moreover, semantics prioritizes an objective examination of meaning, concentrating on the conventional definitions of words while diminishing subjective or context-dependent interpretations (Jabbar Alsaedi, 2018).
Semantic modality, a branch of semantics, examines how language expresses meanings associated with possibility, necessity, capacity, and intention (Bybee, 1994; Palmer, 2001). This encompasses diverse verbal phrases that convey a speaker’s attitudes, views, or assessments regarding a proposition (Kratzer, 2012). Semantic modality investigates the interplay of grammatical, lexical, and contextual components in forming intricate meanings, rendering it important in both theoretical and applied linguistics (von Fintel & Iatridou, 2008). Formal semantic methodologies facilitate an exact examination of modality through the use of formal logic frameworks to elucidate the connections between linguistic expressions and their meanings (Montague, 1970). Simultaneously, the Natural Semantic Metalanguage (NSM) approach enables cross-cultural examination of modality by recognizing universal semantic primes inherent in all languages (Wierzbicka, 1996). Utilizing these theories, semantic modality offers new perspectives on the relationship between language, cognition, and technology, enhancing our comprehension of how language reflects and influences human interaction (Nuyts, 2001). Although these approaches are well established, many studies remain language-specific or theory-bound, offering limited integration with broader global or interdisciplinary perspectives.
Language modality can be classified into two main types: epistemic and deontic modality. Epistemic modality pertains to a speaker’s knowledge and views, indicating their evaluation of the veracity or probability of a proposition depending on the facts at hand. Modal verbs such as “might” and “must” convey varying degrees of possibility or certainty based on the speaker’s understanding (Furey, 2017; Nugraha, 2019; Zhang, 2019). Conversely, deontic modality pertains to necessity and permission, often reflecting rules, obligations, or permissions, typically expressed through modal verbs like “should” and “must” (Nugraha, 2019; Zhang, 2019). Beyond these modal types, languages provide various expressions to articulate these concepts. Modal verbs such as “can,” “must,” “should,” and “might” are primary tools for expressing modality. Their flexibility allows them to convey different modal meanings depending on context. Additionally, adjectives and adverbs significantly contribute to expressing modality. Words like “necessary,” “possible,” “certainly,” and “probably” modify sentence meanings, reflecting varying degrees of necessity and possibility (Portner, 2023; Ramón, 2009; Traugott, 2011). Together, these linguistic tools enable nuanced communication of attitudes, beliefs, and obligations in language. However, previous literature reviews often stop at outlining these categories without critically examining how modality research evolves across disciplines, technologies, or cultural contexts.
This gap highlights the importance of moving beyond descriptive accounts toward integrative and interpretive perspectives. While corpus linguistics and computational approaches have begun to link modality with NLP and cognitive processing, there remains a lack of systematic mapping that situates these developments within the global research landscape. A bibliometric analysis, therefore, provides a means to connect fragmented findings and reveal how theoretical traditions, technological applications, and interdisciplinary studies intersect in the study of semantic modality.
Methods
This study utilized a bibliometric method to examine and delineate research trends in semantic modality from 2005 to 2024. Bibliometric analysis is a quantitative approach that systematically assesses academic literature, utilizing publication data to reveal insights into writing patterns, citation trends, author partnerships, and prevailing themes within a study domain (Hood & Wilson, 2001; Sun et al., 2021). This study employed bibliometric tools, specifically Biblioshiny and VOSviewer. VOSviewer is a specialized application intended for visualizing bibliographic networks, encompassing author collaborations, co-citations, and keyword linkages (McAllister et al., 2022; Van Eck & Waltman, 2010). Biblioshiny is particularly effective for analyzing research patterns and topics because of its dynamic grouping and data visualization features (Aria & Cuccurullo, 2017). These tools were selected over alternatives such as CiteSpace because they provide stronger visualization capabilities and are widely validated in bibliometric studies, thereby enhancing replicability and transparency of results.
Data Collection
The primary data source for this analysis was the Scopus database, chosen for its extensive coverage of high-quality peer-reviewed academic publications. Scopus is among the largest and most reliable databases, providing comprehensive metadata on scholarly articles. Scopus was preferred to Web of Science (WoS) because it offers broader coverage in the humanities and social sciences, which are central to semantic modality research, while still maintaining rigorous indexing standards. The search was conducted on December 9, 2024, using the following query:
(TITLE-ABS-KEY ((semantic OR semantics) AND (modality OR modalities) AND (language OR linguistics)) AND PUBYEAR > 2004 AND PUBYEAR < 2025)
This query targeted articles published between 2005 and 2024, yielding a total of 3,025 articles. The metadata collected included titles, abstracts, keywords, publication years, journal names, authors, and citation counts. Irrelevant or duplicate articles were further filtered during the preprocessing stage. To ensure reliability, two independent researchers cross-checked the dataset at the filtering stage, and inclusion criteria were explicitly documented to enhance consistency.
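As a minimal sketch, the search string above can be assembled from its three term groups and the study period. The function name and parameters are illustrative assumptions, not part of any Scopus client; the code only builds the string and does not contact the database. It makes explicit that the exclusive bounds PUBYEAR > 2004 and PUBYEAR < 2025 correspond to the inclusive period 2005 to 2024.

```python
def build_scopus_query(term_groups, first_year, last_year):
    """Compose a TITLE-ABS-KEY query string (hypothetical helper).

    term_groups: list of lists; terms inside a group are joined with OR,
    and the groups themselves are joined with AND.
    first_year, last_year: inclusive publication-year bounds.
    """
    groups = " AND ".join("(" + " OR ".join(group) + ")" for group in term_groups)
    # Scopus-style exclusive comparisons: widen each inclusive bound by one year.
    return (
        f"(TITLE-ABS-KEY ({groups}) "
        f"AND PUBYEAR > {first_year - 1} AND PUBYEAR < {last_year + 1})"
    )


query = build_scopus_query(
    [["semantic", "semantics"], ["modality", "modalities"], ["language", "linguistics"]],
    first_year=2005,
    last_year=2024,
)
print(query)
```

Keeping the term groups as data makes it easier to document and rerun the search with an adjusted period or additional synonyms.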
Data Preprocessing
After data collection, the records were preprocessed to ensure the quality and relevance of the information for analysis. Duplicate entries were eliminated, and irrelevant publications were excluded based on their titles and abstracts. Keywords were then standardized to reduce variability arising from synonyms or varying word forms. Finally, stringent inclusion criteria were applied: articles had to be written in English or have English titles and abstracts, and had to treat “semantic modality” as a principal theme; articles not centered on this subject were excluded.
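The deduplication and keyword standardization steps could be sketched as follows. This is an illustration under stated assumptions, not the procedure actually used in the study: the record fields (`title`, `doi`) and the synonym map are hypothetical, and the rule of matching on DOI first and normalized title otherwise is one common choice among several.

```python
import re

# Hypothetical synonym map; a real study would document its own list.
SYNONYMS = {
    "multimodal": "multi-modal",
    "nlp": "natural language processing",
}


def normalize_title(title):
    """Lowercase a title and collapse punctuation and whitespace."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def deduplicate(records):
    """Keep the first record for each DOI, or for each normalized title if no DOI."""
    seen = set()
    unique = []
    for record in records:
        key = (record.get("doi") or normalize_title(record["title"])).lower()
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


def standardize_keyword(keyword, synonyms=SYNONYMS):
    """Lowercase, trim, and map a keyword onto its preferred form."""
    cleaned = " ".join(keyword.lower().split())
    return synonyms.get(cleaned, cleaned)
```

A limitation of this sketch is that a record with a DOI and a copy of the same article without one are not matched; a fuller pipeline would compare both keys.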
Analysis Procedure
To enhance validity, the analysis was conducted in three phases: (1) descriptive mapping of general publication characteristics, (2) network analysis of authors, keywords, and countries, and (3) interpretive analysis linking bibliometric results to theoretical perspectives. Each phase was documented and archived to enable reproducibility.
General Characteristics Analysis
The initial phase of our investigation involved delineating overarching publication trends. This involved examining annual publication totals, prominent journals disseminating publications on semantic modality, and patterns of author collaboration. This sought to furnish a comprehensive summary of research dynamics in this domain over the last 20 years. Furthermore, the nations and organizations most engaged in semantic modality research were identified.
Co-Citation and Keyword Analysis
Using VOSviewer, co-citation networks were analyzed to identify the most influential authors in semantic modality research. This analysis enabled the mapping of relationships among authors contributing significantly to this field. Additionally, keyword association analysis was conducted to uncover recurring themes in semantic modality research. Keywords meeting specific frequency thresholds were retained, and this analysis aimed to reveal conceptual patterns defining this research area.
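The counting that underlies a keyword co-occurrence network can be illustrated with a short sketch. It is not VOSviewer's implementation; the function name and the default threshold are assumptions. Each document contributes one count per keyword and one count per unordered pair of retained keywords.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(keyword_lists, min_freq=2):
    """Count keyword frequencies and pairwise co-occurrences.

    keyword_lists: one list of keywords per document.
    min_freq: keywords occurring in fewer documents are dropped
              before pairs are counted (assumed threshold).
    """
    freq = Counter(kw for kws in keyword_lists for kw in set(kws))
    kept = {kw for kw, count in freq.items() if count >= min_freq}
    pairs = Counter()
    for kws in keyword_lists:
        present = sorted(set(kws) & kept)  # sorted so each pair has one canonical order
        pairs.update(combinations(present, 2))
    return freq, pairs
```

The same counting applies to author co-citation if each input list holds the authors cited together in one article's reference list instead of its keywords.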
Identifying Research Themes
To understand the main topics in semantic modality research, frequently appearing keywords were grouped into thematic clusters. These clusters were created based on semantic relationships among keywords and their relevance to semantic modality. Trend topics helped identify dominant themes in the literature and how these themes evolved over time.
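To make the idea of grouping keywords concrete, the sketch below forms clusters as connected components of a co-occurrence graph, linking two keywords only when they co-occur at least `min_link` times. This is a deliberately simple stand-in and an assumption of this example: VOSviewer and Biblioshiny use their own clustering methods, which this code does not reproduce.

```python
def cluster_keywords(pair_counts, min_link=2):
    """Group keywords into connected components of strong co-occurrence links.

    pair_counts: mapping of (keyword_a, keyword_b) -> co-occurrence count.
    min_link: minimum count for a link to join two keywords (assumed value).
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for (a, b), weight in pair_counts.items():
        root_a, root_b = find(a), find(b)  # registers both keywords
        if weight >= min_link and root_a != root_b:
            parent[root_b] = root_a

    clusters = {}
    for keyword in list(parent):
        clusters.setdefault(find(keyword), set()).add(keyword)
    # Largest clusters first, ties broken alphabetically.
    return sorted((sorted(c) for c in clusters.values()), key=lambda c: (-len(c), c))
```

With a threshold of 2, a single weak link between a neuroscience keyword and a linguistics keyword does not merge the two groups, which mirrors the intuition behind separating thematic clusters.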
Visualization and Interpretation
In the final step, bibliometric maps illustrating keyword associations and thematic clusters were generated using Biblioshiny and VOSviewer. These maps were used to investigate emerging trends and trace the development of the principal research areas in semantic modality. The visualizations provide insights into prospective research trajectories and avenues for further investigation. Interpretive notes were added to highlight how observed bibliometric patterns (e.g., keyword shifts, author clusters, and country distributions) could be connected with theoretical developments in linguistics and computational research.
This bibliometric analysis sought to elucidate the evolution of semantic modality research over the past 20 years. Its findings are intended to provide insights for linguists and scholars in other fields interested in investigating modality in language and communication.
Results
General Characteristics
This section delineates publication patterns, keywords, principal publication venues, languages used, and research subject areas. Table 1 presents the annual publication count of works on semantic modality. A linear regression model fitted to the annual counts revealed a significant rise in the number of articles published over the two decades analyzed, with the following statistics: F = 36.85, p = 9.76 × 10⁻⁶ (highly significant), coefficient of determination R² = .672, and adjusted R² = .654. These results indicate a substantial rise in yearly publications from 2005 to 2024, demonstrating heightened interest and progress in this domain.
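The reported regression statistics can be checked for internal consistency with a short sketch. It assumes a simple linear regression with one predictor (publication year) and 20 annual observations (2005 to 2024); the observation count is inferred from the study period rather than stated in the text. The helper names are illustrative, and the fitting function is shown only to indicate how R² would be obtained from the Table 1 counts, which are not reproduced here.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x; returns (slope, intercept, r2)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return slope, intercept, 1 - ss_res / ss_tot


def f_and_adjusted_r2(r2, n, k=1):
    """F-statistic and adjusted R^2 implied by R^2, n observations, and k predictors."""
    f_stat = (r2 / k) / ((1 - r2) / (n - k - 1))
    adjusted = 1 - (1 - r2) * (n - 1) / (n - k - 1)
    return f_stat, adjusted


# Consistency check using the reported R^2 = .672 and the assumed n = 20.
f_stat, adjusted = f_and_adjusted_r2(0.672, n=20)
print(round(f_stat, 2), round(adjusted, 3))
```

Under these assumptions the implied F-statistic is about 36.9 and the adjusted R² is .654, in line with the reported values; the small difference in F is attributable to R² being rounded to three decimals.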
Table 1. Number of Publications Per Year.
Table 1 depicts the annual publication count in the field of semantic modality from 2005 to 2024. Publications saw significant growth in 2016, with 129 papers published, in contrast to 57 in 2007. The significant increase persisted, reaching its peak in 2024 with a total of 511 published articles. This increase underscores a rise in interest in semantic modality during the past decade (Figure 1).

Figure 1. Publication trend over the two decades examined.
This steady increase, particularly after 2016, indicates not only a growing scholarly interest but also the influence of broader trends in computational linguistics and corpus-based methods. The timing aligns with the wider adoption of large-scale language corpora and NLP tools, which may have facilitated new approaches to semantic modality research.
Thematic Clusters and Keywords
Table 2 enumerates keywords with a frequency of 100 or more occurrences in publications relevant to semantic modality. These keywords serve as a foundation for delineating and examining prevailing topics in the study conducted from 2005 to 2024. The table delineates 40 principal keywords commonly employed in the literature about semantic modality. Prominent terms like “Semantics” (1,870 occurrences), “Human” (631 occurrences), and “Article” (517 occurrences) highlight the significance of essential language and psychological principles, together with human-centric research.
Table 2. Keyword List (Frequency ≥ 100).
Thematic Groupings
Semantics and Linguistics. The keyword “Semantics” (1,870 occurrences) is the most frequently mentioned, affirming its central role in this field of research. The relationship between semantic modality and linguistics is vital, with keywords such as “Linguistics” (169 occurrences) and “Language Model” (163 occurrences) indicating a strong focus on linguistic aspects, particularly language modeling and semantics. “Computational Linguistics” (257 occurrences) reflects a tendency to integrate semantic modality with computational techniques, including text data analysis and modeling.
Human-Centered Research. Keywords such as “Human” (631 occurrences), “Humans” (474 occurrences), “Male” (409 occurrences), “Female” (398 occurrences), and “Adult” (353 occurrences) suggest a focus on studies involving human subjects. This includes language and cognition analysis across demographic groups. Keywords like “Young Adult” (148 occurrences) and “Middle Aged” (110 occurrences) highlight the relevance of age and demographic factors in semantic modality research.
Technological and Cognitive Aspects. Keywords such as “Natural Language Processing Systems” (298 occurrences) and “Language Processing” (153 occurrences) demonstrate the significant role of technology in facilitating semantic analysis, particularly in NLP contexts. Terms like “Deep Learning” (118 occurrences), “Embeddings” (118 occurrences), and “Pre-training” (112 occurrences) reflect trends in machine learning techniques applied to semantic and language modeling. Cognitive terms such as “Cognition” (159 occurrences) and “Brain” (117 occurrences) underline the relationship between semantic modality and cognitive understanding, with applications in brain imaging technologies for studying language processing responses.
Multimodality and Cross-Modal Studies. Keywords “Multi-modal” (359 occurrences) and “Cross-modal” (205 occurrences) highlight an increasing trend to study semantic modality in broader contexts, incorporating multiple modalities such as text, images, and sound. This aligns with advancements in technologies like computer vision and speech recognition integrated with NLP. Keywords “Computer Vision” (195 occurrences) and “Speech” (123 occurrences) emphasize the role of image and audio recognition in understanding semantic modality.
Medical and Neuroscience Applications. Keywords such as “Brain Mapping” (133 occurrences), “Magnetic Resonance Imaging” (121 occurrences), and “Functional Magnetic Resonance Imaging” (115 occurrences) demonstrate the application of brain imaging technologies in mapping and understanding how the human brain processes semantic information. “Nuclear Magnetic Resonance Imaging” (111 occurrences) further underscores the use of medical imaging techniques in semantic research.
Learning and Task Performance. Keywords like “Learning Systems” (128 occurrences) and “Task Performance” (120 occurrences) reflect a focus on the application of semantic modality in educational and task performance contexts, including AI-based system development.
Key insights and trends in semantic modality research emphasize the dominance of linguistic and semantic aspects, focusing on a deep understanding of how language conveys meanings related to possibility, necessity, and ability. A notable trend is the increased use of technologies like NLP and deep learning, marking a shift toward data-driven and computational approaches in analyzing language modalities. Furthermore, there is a growing focus on multimodal studies that incorporate various inputs, such as visual and audio data, to explore how multiple modalities interact to convey meaning. This interdisciplinary research spans fields like linguistics, cognitive psychology, technology, and medicine, fostering a holistic framework for understanding language and meaning across diverse contexts.
The keyword data reflects a growing diversity and interdisciplinary focus in semantic modality research. There is an increasing integration between traditional linguistic techniques and advanced technology, alongside applications in various fields, ranging from natural language processing to medical and educational contexts. Future research is likely to deepen our understanding of how different modalities, whether human or machine, can collaborate to interpret and communicate meaning more effectively.
In the keyword co-occurrence analysis, “Human” appeared as a central keyword with high frequency and centrality. Other frequent keywords included “language,” “modality,” and “semantics,” reflecting the core focus of this research area. This prominence indicates that semantic modality research is strongly grounded in human-centered contexts, such as communication, cognition, and interaction, rather than being purely abstract or formal. It also suggests that the field consistently frames modality as a phenomenon inseparable from human meaning-making processes.
Table 3 enumerates journals and conferences with 16 or more publications pertinent to semantic modality. This data elucidates the principal platforms for research publication in this discipline, highlighting emerging patterns and trajectories. The table comprises 20 publication locations with a substantial number of contributions pertaining to semantic modality. These outlets include academic journals, international conferences, and proceedings across several disciplines, including computing, neurology, psychology, multimedia processing, and natural language processing. Prominent venues, such as Lecture Notes in Computer Science, exhibit markedly higher publishing frequencies than their counterparts, highlighting the preeminence of this subject within computational and technical domains.
Table 3. Publication Venues (Number of Publications ≥ 16).
Research trends in semantic modality reveal a significant prevalence of platforms centered on international conference proceedings, particularly those affiliated with IEEE and ACM. This indicates a predilection for presenting scientific findings at conferences, which enable more rapid dissemination of recent discoveries than conventional academic journals. These conferences frequently emphasize advanced technologies, including multimedia processing and machine learning, which are essential components of semantic modality research.
Moreover, there is a growing focus on computing and technology, as demonstrated by venues such as Lecture Notes in Computer Science and IEEE proceedings. These venues illustrate the increasing importance of artificial intelligence, natural language processing, and image processing in semantic modality research. This development indicates a deeper integration of semantic modality concepts with sophisticated computational methods, such as computer vision and deep learning. These advances continue to broaden the applicability of this research across several disciplines and technologies.
The prominence of journals in applied linguistics and computational linguistics reflects the dual orientation of semantic modality research, balancing theoretical concerns with increasing technological applications.
Table 4 shows that English dominates research publications in the field of semantic modality, with a significantly higher frequency (2,820) compared to other languages. This reflects the fact that English serves as the primary language in the international academic community, particularly in the fields of technology, linguistics, and computing. The dominance of English also highlights the tendency for research to be published in international journals and conferences, which typically use English as the medium of communication.
Table 4. Most Frequently Used Languages and Research Subject Areas.
In terms of subject areas, Computer Science leads with 1,521 publications, aligning closely with the trend of integrating semantic modality with advanced computational technologies such as artificial intelligence, natural language processing, and machine learning. The dominance of Computer Science underscores the widespread application of semantic modality concepts in technology-based systems, including text analysis, language models, and image processing. Arts and Humanities (997) and Social Sciences (933) rank second and third, showcasing the application of semantic modality in fields such as linguistics, philosophy, discourse analysis, and cultural studies. Research focusing on language within social and cultural contexts is often found in these areas, underscoring the importance of understanding modality in human communication and interaction.
Furthermore, fields such as Mathematics (439), Neuroscience (428), and Psychology (399) also contribute to semantic modality research, reflecting its relevance in understanding cognitive and neurological processes, as well as the relationship between language and perception. Other fields, including Engineering (366), Medicine (228), and Health Professions (100), provide additional contributions, albeit in smaller numbers, indicating the application of modality research in medical technology, treatments, and health-related data analysis.
Overall, this analysis highlights that semantic modality research is not confined to linguistics or philosophy but is integrated across various other disciplines, particularly computing and social sciences. This underscores the interdisciplinary and practical nature of the topic, with applications extending to a wide range of academic and technological fields.
Table 5 indicates that China (755) and the United States (665) are the leading countries in semantic modality research productivity. The dominant contribution of these two nations signifies their considerable impact on technology, computing, and linguistics research—domains closely associated with semantic modality. China and the United States possess robust research infrastructure, substantial financing, and broad international collaborations, which enhance their prolific publication production in this field.
Table 5. Most Productive Countries (Top 20) and Most Productive Authors (Top 20).
Several distinguished authors are notable for their substantial contributions to the study of semantic modality. Jefferies, E. (12), Ji, Z. (12), and Lambon Ralph, M.A. (12) are the most prolific authors, underscoring their participation in significant research pertaining to the linguistic and cognitive dimensions of semantic modality. These three individuals are esteemed authorities in cognitive linguistics and neuropsychology, underscoring the essential contribution of these fields to the enhancement of semantic modality comprehension. This data underscores a cohort of prolific authors who excel in semantic modality research, emphasizing the advancement of language theories, cognitive understanding, and progressively intricate computer applications.
Although China is the most productive country in terms of publication volume, the most prolific individual authors are not affiliated with Chinese institutions. This discrepancy suggests that China’s contribution may be driven by large-scale collaborative teams or institutional initiatives, while individual research leadership remains more dispersed internationally. Such a pattern highlights the importance of examining collaboration networks alongside raw publication counts.
Table 6 displays the most frequently cited works in the realm of semantic modality and associated fields, providing insights into the most significant and impactful research in this area. A handful of papers in the table have both a substantial citation count and considerable impact on the advancement of ideas and applications in semantic modality. These extensively referenced works illustrate significant contributions to the discipline, encompassing a wide array of topics, from technical elements like deep learning, transformers, and multimodal reasoning to neuropsychological and cognitive viewpoints. These works define the theoretical foundations of the area and facilitate practical applications in artificial intelligence, natural language processing, and image processing.
Most Cited Articles.
Based on Figure 2, the data illustrates the evolution of the frequency and relevance of various terms in semantic modality research, analyzed across three key time points: Year Q1, representing the initial period when the terms began to appear significantly; Year Median, marking the peak of the terms’ popularity; and Year Q3, indicating the final period when the terms were still actively used in research. This approach provides insights into the shifting focus and dynamics of terms in research over time.

Trend topics.
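The three time points used for Figure 2 can be read as quartiles of the publication years in which a term occurs. The following is a minimal sketch of that idea, not the procedure implemented in Biblioshiny: the function name `year_quartiles`, the rounding to whole years, and the toy list of years are all assumptions made for illustration.

```python
from statistics import quantiles


def year_quartiles(years):
    """Return (Q1, median, Q3) of the publication years in which a term occurs.

    `years` holds one entry per occurrence of the term, so a year in which
    the term appears often is listed several times.
    """
    # Inclusive quartiles treat the data as the full population of occurrences.
    q1, median, q3 = quantiles(sorted(years), n=4, method="inclusive")
    # Rounding to whole years is an assumption made for readability.
    return round(q1), round(median), round(q3)


# Hypothetical occurrences of a single term (not data from this study).
toy_years = [2008, 2010, 2012, 2012, 2014, 2016, 2016]
print(year_quartiles(toy_years))
```

Under this reading, a term with an early Q1 and a late Q3 corresponds to a long-term topic, while a term whose three values all fall within the last few years corresponds to a modern trend.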
Categories of Trends and Frequency:
1. Topics with long-term trends.
Topics with long-term trends demonstrate sustained relevance, with an earlier Year Q1 and a later Year Q3. For instance, Priming (2008–2016) and Springer-Verlag (2008–2016) were relevant in the early phases of research but declined after 2016, indicating a narrower focus during that period. On the other hand, terms like Logic (2010–2021), Auditory (2010–2020), and Memory (2011–2021) remained in use for over a decade, especially in neuroscience and cognitive linguistics contexts, emphasizing the relationships among logic, memory, and language.
2. Topics with medium-term trends (2013–2021).
Medium-term topics reached peak popularity during the intermediate period from 2013 to 2021. For example, Processing (2013–2022), with a frequency of 1,068, became a focal point in studies of natural language processing and data processing. Terms like Words (2012–2021) and Word (2012–2022), with high frequencies, underscore the importance of research on individual and collective words in linguistic and textual analysis. Furthermore, core terms such as Semantic (2016–2023) and Semantics (2014–2023), with frequencies of 3,962 and 1,371 respectively, demonstrate that semantic modality remained a primary focus throughout this period.
3. Topics with modern trends (2017–2024).
Modern trends emerged alongside advancements in technology and new applications, particularly from 2017 to 2024. Terms such as Visual (2018–2023), Modalities (2017–2023), and Information (2017–2024) reflect a growing focus on integrating multimodalities in technological applications, including computer vision and multimedia analysis. Significant increases in frequency are observed for terms like Image (2021–2024) and Text (2021–2024), with frequencies of 1,264 and 1,249 respectively, indicating intensive research on text- and image-based data. Models (2021–2024), with a frequency of 1,233, underscores the dominance of machine learning models in modern applications. Emerging terms like Challenges, Enhance, and Clip (2022–2024) reflect the latest trends in cutting-edge technology, especially in transformer-based models like CLIP.
4. Dominance of general topics.
General topics remain dominant, as seen with terms like Language (2015–2023), which has the highest frequency of 3,978, highlighting that language remains central to all semantic modality research. Commonly used terms in academic contexts, such as Study (2014–2022), Paper (2015–2023), and Results (2015–2023), also show high frequencies, reflecting their frequent use in reporting study findings and analyses.
The analysis reveals a dominance of technological and multimodal themes, with terms like Image, Text, Models, and Visual highlighting research trends shifting toward technology-driven, cross-media applications. Additionally, there is a clear transition in research focus, where classical terms such as Priming and Springer-Verlag lost relevance after 2016, while modern terms like CLIP and Challenges gained prominence, reflecting adaptation to cutting-edge technological advancements. However, core terms like Semantic, Language, and Processing have remained relevant throughout, demonstrating that semantic modality research remains deeply rooted in linguistic and semantic foundations, even as it evolves with technological progress.
This evolution from classical terms such as “Priming” toward modern concepts like “CLIP” illustrates how semantic modality research adapts to technological innovation while maintaining continuity with its linguistic origins. It indicates that the field is both resilient and responsive, capable of integrating new computational paradigms without losing sight of its foundational theoretical concerns.
Figure 3, the thematic map, illustrates the relationships among research themes along two primary dimensions, centrality and density, which indicate the relevance and the development of a theme within the field, respectively. Four main themes were identified in semantic modality research. The theme “Human” is the most relevant and mature in this research area, with high frequency, the highest centrality (.727), and the highest density (14.201), establishing it as the core and dominant theme. The theme “Multi-modal” is also significant, with a large cluster size and strong internal connections, reflecting a well-developed and important topic in research. The theme “Semantics” is highly relevant externally but remains in the stages of internal development, indicating that this is an emerging area with substantial potential. The theme “Computer Circuits” has low relevance to other themes but is well-developed within its own scope, indicating a very specific focus in certain research areas.

Thematic map.
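The reading of the thematic map in terms of centrality and density can be sketched as a simple quadrant rule. This is an illustrative sketch only: the function `classify_theme`, the quadrant labels (taken from the usual strategic-diagram terminology), and the cut points are assumptions, since the thresholds at which the axes cross in Figure 3 are not reported here.

```python
def classify_theme(centrality, density, centrality_cut, density_cut):
    """Place a theme in one of the four quadrants of a thematic map.

    centrality: strength of a theme's links to other themes (relevance).
    density: strength of the links inside the theme (development).
    The cut points mark where the two axes cross; they are inputs here
    because the values used for Figure 3 are not given in the text.
    """
    high_centrality = centrality >= centrality_cut
    high_density = density >= density_cut
    if high_centrality and high_density:
        return "motor theme"  # relevant and well developed
    if high_centrality:
        return "basic theme"  # relevant, still developing internally
    if high_density:
        return "niche theme"  # well developed, weakly connected
    return "emerging or declining theme"


# "Human" uses the centrality and density reported in the text;
# the cut points 0.5 and 10.0 are hypothetical.
print(classify_theme(0.727, 14.201, centrality_cut=0.5, density_cut=10.0))
```

With these assumed cut points, the reported values for “Human” fall in the high-centrality, high-density quadrant, which matches its description as the core theme; “Semantics” and “Computer Circuits” as described would correspond to the basic and niche quadrants, respectively.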
The thematic clustering reveals not only traditional concerns in linguistic theory (e.g., epistemic and deontic modality) but also the emergence of applied themes in computational linguistics. This transition suggests that semantic modality research is gradually broadening from purely theoretical frameworks toward data-driven and interdisciplinary applications, reflecting a dynamic evolution of the field.
Figure 4 illustrates the author co-citation map, which depicts how authors are connected through being cited together in scholarly works. Two clusters were identified. Cluster 1 includes authors such as Chen X., He K., and Wang X., who form a large and densely linked group. This cluster predominantly covers technological research subjects, including computer vision and machine learning. Wang X. has accumulated 23,575 citations and a total link strength of 962, signifying considerable impact within this network. Cluster 2 consists predominantly of authors such as Caramazza A., Lambon Ralph M. A., and Pulvermuller F. This cluster emphasizes linguistic and neuroscience research, specifically the connection between the brain and language. Lambon Ralph M. A. has a total link strength of 6,502 and 546 citations, underscoring this author's significant influence in this domain.

Author co-citation.
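The two quantities reported for the co-citation map, link strength between authors and total link strength per author, can be illustrated with a small sketch. It assumes the simplest counting scheme (one count per citing paper in which two authors appear together) and uses placeholder author names and reference lists; it is not a reproduction of the VOSviewer computation behind Figure 4.

```python
from collections import Counter
from itertools import combinations


def cocitation_links(reference_lists):
    """Count how often each pair of authors is cited by the same paper."""
    pair_counts = Counter()
    for cited_authors in reference_lists:
        # Each citing paper contributes at most one count per author pair.
        for a, b in combinations(sorted(set(cited_authors)), 2):
            pair_counts[(a, b)] += 1
    return pair_counts


def total_link_strength(pair_counts):
    """Sum, for each author, the strengths of all links involving that author."""
    strength = Counter()
    for (a, b), weight in pair_counts.items():
        strength[a] += weight
        strength[b] += weight
    return strength


# Hypothetical citing papers, each listed with the authors it cites.
toy_papers = [
    ["Author A", "Author B", "Author C"],
    ["Author B", "Author C"],
    ["Author D", "Author E"],
]
links = cocitation_links(toy_papers)
print(links[("Author B", "Author C")])
print(total_link_strength(links)["Author B"])
```

In this toy example, Authors A to C and Authors D and E never share a citing paper, so they would form two separate groups, which is the kind of structure that the clustering in the map makes visible.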
The collaboration maps indicate that semantic modality research is characterized by regional clustering, with limited cross-regional bridges. This suggests that while research communities are productive locally, global integration of findings remains relatively modest, pointing to opportunities for stronger international collaboration.
Discussion
The results of this bibliometric analysis reveal significant dynamics in semantic modality research over the past two decades. Trends indicate a surge in attention to this field, particularly after 2016, marked by advancements in computational technologies such as Natural Language Processing (NLP), deep learning, and multimodality applications. The peak in publications in 2024, totaling 511 articles, reflects the high relevance of this topic across disciplines. This pattern aligns with broader shifts in corpus-based linguistics and the proliferation of AI-driven methodologies, suggesting that growth in semantic modality research is not isolated but embedded within global transformations in linguistic inquiry.
Key Research Focus and Technological Integration
The findings affirm that semantic modality extends beyond conventional linguistic research, evolving into a multidisciplinary subject that includes technology, cognition, and practical applications. Terms like semantics (1,870 occurrences) and computational linguistics (257 occurrences) signify a transition from traditional language theories to data-centric methodologies. Theories such as distributional semantics (Lenci et al., 2022) and deep learning-based word embeddings (Lenci, 2020) underpin contemporary NLP tools, converting semantic representation into mathematical representations. These representations facilitate contextual modelling by approaches such as Transformers, including BERT, hence enhancing the significance of distributional semantics in NLP (Jiang et al., 2023).
The advent of sophisticated generative models such as GPT-4 has considerably improved multi-modal integration and semantic analysis. GPT-4 demonstrates proficiency in cross-domain modality understanding and multi-task learning, facilitating the effective management of intricate semantic inferences, including sentiment analysis and contextual dialogue systems (Chen et al., 2024). Furthermore, current assessments indicate its enhanced efficacy relative to earlier models, especially in specialized domains like medical and technical problem-solving (Antaki et al., 2024) and sentiment analysis (Shobayo et al., 2024).
In addition, models such as Contrastive Language–Image Pretraining (CLIP), which OpenAI released in 2021, and GPT-fusion illustrate the potential of cross-media modality integration (e.g., text and images) for applications like speech-to-image production and cross-modal analysis. This is evidenced by the prevalence of terms like multi-modal (359 occurrences) and cross-modal (205 occurrences), signifying the prominence of cross-media modalities in contemporary research. This line of research faces obstacles, notably the prevalence of English as the principal language of study, which may introduce linguistic bias and restrict generalizability to languages with distinct morphological or phonological features, such as agglutinative or tonal languages. Taken together, these developments indicate that semantic modality is no longer studied solely as a linguistic phenomenon but increasingly as a computational construct, bridging theoretical linguistics with applied AI.
Cognitive and Neurolinguistic Connections
The prevalence of the “Human” theme in the thematic map, exhibiting the highest centrality (.727), emphasizes the significance of human-centric comprehension in semantic modality. This bibliometric evidence reinforces theoretical claims (Kratzer, 2012; Palmer, 2001) that modality is inherently tied to human subjectivity, judgment, and interaction, rather than being reducible to abstract logical structures.
Theories of embodied cognition propose that human sensory and motor experiences directly shape the formation of language meaning. Dove (2023) asserted that language systems are embodied, serving not merely as communication tools but also as vital mechanisms for enabling semantic memory and abstract concepts via multimodal and multilevel processes. Moreover, Naro et al. (2022) emphasize the significance of embodied cognition in connecting motor functions to language, especially in the setting of neurological rehabilitation through mirror neuron systems. Terms like brain mapping (133 occurrences) and functional magnetic resonance imaging (115 occurrences) exemplify the significance of neuropsychological technologies in elucidating the relationship between language and cognition.
Recent studies in 2024 examined the inherent connection of semantic networks and their association with semantic processing. The research indicated that “areas including the left anterior temporal lobe, angular gyrus, and orbital regions of the left inferior frontal gyrus are implicated in this process” (Huang et al., 2024). A further study examined the connection between frontal and temporal areas during the construction of syntactic structures in both speaking and listening. The results indicated that “the connectivity between the pars triangularis and pars opercularis of the left inferior frontal gyrus and the left posterior temporal lobe intensifies with the complexity of syntactic structures, in both language production and comprehension” (Giglio et al., 2024). While these studies do not explicitly examine epistemic modes, they offer essential insights into the functions of the prefrontal cortex and temporal lobes in semantic and syntactic processing, potentially relevant to the comprehension of epistemic modality.
Neurolinguistic methodologies are augmented by AI technologies such as ChatGPT, which are engineered to emulate human communication patterns contextually. The capacity of these models to identify and produce language modalities from semantic inputs demonstrates the amalgamation of cognitive linguistics and technology. Thus, the bibliometric evidence of “Human” as a central node gains explanatory support: semantic modality research consistently orients itself around the human experience of meaning-making, both in theoretical linguistics and in technological emulation.
Geographical and Collaboration Patterns
An important discrepancy arises in the results: while China emerges as the most productive country in terms of publication volume, the most prolific individual authors are largely affiliated with Western institutions. This pattern suggests that Chinese research is driven by large-scale institutional projects and collaborative teams, whereas theoretical innovation and individual author visibility are more evenly distributed across the global research community. Moreover, collaboration maps reveal strong intra-regional clusters but relatively fewer cross-regional connections, indicating that semantic modality research remains fragmented along geographical lines. Future collaboration strategies could help integrate these regional hubs into a more globally connected knowledge network.
Multimodality and Cross-Disciplinary Applications
The transition to multimodality underscores the relevance of this research in contemporary technological environments. Research on semantic modality increasingly incorporates visual and auditory modalities, as indicated by the frequency of keywords like visual languages (286 occurrences) and computer vision (195 occurrences). This approach facilitates practical applications in education, healthcare technology, and social analysis.
GPT-Vision integrates linguistic processing with visual analysis, allowing systems to concurrently read and respond to textual and image data (OpenAI, 2023). This technology has been utilized in the investigation of human behavior, instructional simulations, and adaptive teaching systems. Recent studies indicate that the integration of language processing with visual analysis produces substantial results across multiple applications. Convolutional Neural Networks (CNN) and Transformer-based methodologies have been employed to automatically generate descriptions of visual content, thereby improving contextual comprehension in the Indonesian language (Mulyawan et al., 2022). In addition, multimodal Skip-gram models have been created to combine visual and linguistic data, yielding more comprehensive and precise word representations for semantic tasks such as image labelling and retrieval (Lazaridou et al., 2015). Finally, studies on MoAI indicate that these models enhance real-world scene comprehension by integrating data from computer vision systems and linguistic abilities (Lee et al., 2024).
These findings highlight the significant potential of technologies that integrate the two modalities for future applications. Nevertheless, multimodal integration faces methodological obstacles, including the absence of datasets that reflect cultural and linguistic variety. From a bibliometric perspective, the increasing centrality of multimodality-related keywords illustrates how the field is broadening its scope, creating opportunities for interdisciplinary applications in education, medicine, and social sciences.
Evolution and Future of Research
Trend analysis indicates a transition from traditional terminology such as priming (2008–2016) to contemporary terminology like CLIP (2022–2024). This shift signifies the adaptation of research to innovations in advanced technologies while preserving the traditional foundations of linguistic theory. The prevalence of concepts like language (3,978 occurrences) and processing (1,068 occurrences) suggests that linguistic principles remain crucial, despite the ongoing proliferation of data-driven applications. This confirms that while technology drives new directions, the theoretical foundations of linguistics continue to anchor the field, ensuring continuity in scholarly identity.
The topic benefits from strong research infrastructure and international collaboration, with China and the United States at the forefront of publications. Nevertheless, research in areas such as Southeast Asia and Africa possesses significant potential to enhance the range of cultural and linguistic viewpoints in this domain. Expanding the geographical reach of semantic modality research will not only diversify datasets but also enrich theoretical perspectives, counteracting biases introduced by predominantly English-language corpora.
Practical Implications and Recommendations
This study presents broad practical implications, spanning from NLP-based educational tools to medical applications such as cognitive analysis through brain imaging. To advance the field, future research should prioritize reducing dataset bias by including more non-English languages, developing more inclusive methodologies for multimodality integration, and leveraging advanced AI models like GPT-4 to enhance cross-modal understanding and applications. Equally important is the need for stronger theoretical integration: future studies should bridge bibliometric evidence with semantic theories to ensure that the rapid technological expansion of modality research remains anchored in linguistic and cognitive frameworks.
Conclusion
Research on semantic modality has significantly progressed over the last 20 years, resulting in a remarkable increase in publications peaking in 2024. This trend highlights the growing importance of semantic modality as a crucial interdisciplinary topic, driven by advancements in computing technologies such as deep learning, natural language processing, and multimodal integration. By mapping research output, author networks, and thematic clusters, this bibliometric study provides the first comprehensive overview of how the field has evolved globally, revealing both the continuity of linguistic foundations and the expansion into technological and cognitive domains.
The bibliometric analysis emphasizes terms like Semantics, Human, and Computational Linguistics, signifying a strong basis in language and cognitive principles. Thematic maps highlight the importance of themes such as “Human” and “Semantics” for research progress, while themes like “Computer Circuits,” which show lower relevance to other themes, suggest potential avenues for further exploration, particularly in advanced technologies like artificial intelligence and edge computing. The prominence of “Human” as a central theme underscores the inseparability of modality from human meaning-making, bridging theoretical linguistics with embodied cognition and applied AI.
This study also underscores the dominance of research in countries like China and the United States, while highlighting the discrepancy between national productivity and individual author influence. These findings point to the structural role of institutional collaboration and funding systems in shaping research visibility, while also identifying gaps in global integration. Expanding participation from regions such as Southeast Asia and Africa would not only diversify datasets but also enrich the conceptual landscape of modality studies.
This study acknowledges its limitations, including linguistic bias toward English, dataset bias, and inconsistencies between theoretical frameworks and actual applications. The challenges in model validation and multimodal integration highlight the need for further methodological breakthroughs to address the complexities of cross-modal data. Future research should therefore prioritize (1) enhancing cross-linguistic and cross-cultural representation, (2) strengthening theoretical integration between linguistic semantics and computational models, and (3) fostering international collaboration to connect currently fragmented regional hubs.
Semantic modality has become a field at the intersection of linguistics, cognition, and artificial intelligence. This study contributes by offering a global bibliometric perspective that not only documents trends but also interprets their significance. By doing so, it provides a foundation for future interdisciplinary research and practical applications that will deepen our understanding of meaning, language, and human communication in an increasingly digital world.
Footnotes
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data Availability Statement
The data that support the findings of this study are available from the corresponding author upon reasonable request.
