Abstract
The use of emoji and emoticons has reshaped modern practices of online language usage and communication. Despite the rapid growth of research in this area, there remains a lack of systematic, quantitative, and visualized analyses that comprehensively map its disciplinary structure, influential contributors, and thematic trends. In this review, 3,310 relevant publications collected from Scopus between 1996 and 2024 were subjected to a bibliometric analysis. The results reveal a strong concentration of emoji and emoticon research within computer science, communication, and psychology. Keyword co-occurrence and co-citation analyses further identify several major research clusters, which can be synthesized into four overarching thematic traditions: pragmatic and discourse-oriented studies, cognitive and affective processing, technological and computational approaches, and cross-cultural and sociolinguistic perspectives. Across the globe, the United States produced the most publications, followed by China and India. Citation analyses additionally highlight the most influential institutes, journals, articles, and authors shaping this field. Overall, these findings advance our understanding of the multidisciplinary nature and thematic evolution of emoji and emoticon research from a global perspective, while underscoring the need for greater cross-disciplinary integration. Building on the identified research patterns and gaps, we finally propose a future research agenda along three dimensions: physical implementation, cognitive investigation, and applied development.
Highlights
Integrated the disciplines of emoji and emoticon research through a macro-level lens
Quantitatively examined publications on emoji and emoticon research
Proposed a three-level agenda for future research
Introduction
Human communication involves the transmission of both abstract and concrete information through verbal and non-verbal symbols (Richmond & McCroskey, 2019). With rapid advancements in computing technology and a growing understanding of relevant socio-psychological effects, computer-mediated communication (CMC) has become a prominent focus of research. Studies have shown that CMC often suffers from coordination difficulties due to the absence of key elements essential for effective communication, such as informational feedback, social cues for regulating discussions, non-verbal involvement, and established conversational norms (Kiesler et al., 1984). Specifically, the lack of non-verbal cues, such as facial expressions, intonation, and gestures, can significantly affect the quality of information exchange in CMC (Archer & Akert, 1977). To compensate for these missing cues, representations like emoticons and emojis have emerged as important communicative tools in CMC, drawing increasing scholarly attention.
Emoticons are symbolic representations of facial expressions in the form of combinations of punctuation marks (e.g., “
In contrast to emoticons, which are composed of punctuation marks, emojis are pictorial symbols (e.g.,
) rather than typographic characters (Guibon et al., 2016). Since their introduction in 1997 as a standardized set of small images depicting a wide range of concepts from facial expressions to national flags, emojis have seen a dramatic rise in usage on social media platforms over the past decade (Eisner et al., 2016). Due to their visual nature, emojis are often compared to ancient systems of communication such as hieroglyphs and cuneiform, suggesting that they function more as visual carriers of information than as alphabetic language (Alshenqeeti, 2016). Recent research has begun to examine emojis in more nuanced terms, distinguishing between emotional emojis (used to convey feelings) and semantic emojis (used to express meanings), rather than treating them as a monolithic category (Wang et al., 2023). While some critics argue that emojis represent a regression in language, diminishing the intellectual quality of communication (Jones, 2015), others view them as evidence of language’s dynamic, expressive, and systematic complexity (Adler et al., 2006). In digital communication, emojis increasingly serve linguistic functions and continue to evolve in form, size, and coverage.
Specifically, emoticons emerged in early computer-mediated communication during the 1980s and 1990s as textual representations of facial expressions (e.g., “
Emoji and emoticon research is driven by overlapping questions and is increasingly relevant across various digital contexts. However, existing studies often approach the topic from isolated disciplinary perspectives, such as communication, computer science, or psychology, and thus fail to reflect its inherently interdisciplinary nature (Bai et al., 2019). This disciplinary fragmentation has limited the development of a comprehensive understanding, highlighting the need for integrative approaches that reflect the complex, cross-domain nature of emoji and emoticon use. So far, we have identified only one systematic review on emoji, conducted by Bai et al. (2019). That review analyzed 167 studies across eight disciplines, revealing that emoji research is rapidly expanding but remains fragmented across fields such as computer science, communication, linguistics, and psychology. It highlighted the emotional and semantic functions of emoji, as well as the influence of individual, cultural, and platform-specific factors on their use. The authors also pointed out persistent challenges such as ambiguity and cross-platform inconsistencies, and proposed future directions including multimodal analysis, emotion validation, and sociocultural perspectives.
However, traditional systematic reviews often rely on manual coding and qualitative synthesis, which may be limited by subjectivity and scope constraints (Ljubešić & Fišer, 2016). They typically focus on a relatively small set of studies and may overlook broader publication trends, emerging research fronts, or cross-disciplinary patterns. To address these limitations, bibliometric analysis offers a data-driven and scalable approach to mapping the structure, evolution, and knowledge networks of a research field.
Rooted in early 20th-century quantitative publication analysis and formalized by Pritchard (1969), bibliometric methods use citation data, co-authorship networks, and keyword co-occurrence to reveal the intellectual structure, research frontiers, and evolution of scientific fields. Tools such as the Science Citation Index (SCI), Social Science Citation Index (SSCI), and visualization techniques (e.g., co-citation and co-word mapping) have proven effective for identifying influential publications, high-impact authors, institutional productivity, and topic clusters (Ellegaard, 2018; Leydesdorff & Wagner, 2009). They also measure the impact of authors, publications, journals, and countries for comparative ranking and evaluation (Leydesdorff & Wagner, 2009; Moiwo & Tao, 2013). When applied to the field of emoji and emoticon studies, bibliometric methods would enable a systematic overview of research output, facilitate the detection of core literature and emerging themes, and provide empirical grounding for evaluating the development and interdisciplinarity of the field (Agarwal et al., 2016; Pei et al., 2022).
In particular, despite growing multidisciplinary interest, prior research on emojis and emoticons remains fragmented, with limited integration across fields from a macro-level perspective. Most existing studies rely on subjective or descriptive approaches, often centered on popular topics or highly cited publications. To date, no scientometric review has systematically mapped the intellectual structure, disciplinary divisions, and evolving trends of emoji and emoticon research from a multidisciplinary and global perspective. The present study thus addresses this gap through a large-scale bibliometric analysis.
We include both emoji and emoticon research, as these are the most widely used static visual symbols embedded in CMC, in contrast to stickers, which can only be sent independently and cannot be embedded within textual messages (Zhou et al., 2017).
Specifically, this study aims to answer the following research questions (RQs) based on the scholarly publications from 1996 to 2024:
In addition to addressing these questions, we aim to identify key issues and potential research directions that will guide future scholarly inquiry in this field. By synthesizing patterns across disciplines, geographic regions, and temporal trends, this study seeks to uncover not only what has been studied, but also what remains underexplored or overlooked. Through co-citation analysis, keyword co-occurrence mapping, and network visualization, we aim to highlight conceptual gaps, emerging subfields, and cross-disciplinary linkages that can inform future theoretical development and methodological innovation. Ultimately, this research will provide a foundation for a more coherent and integrative research agenda, supporting the continued evolution of emoji and emoticon studies as a truly interdisciplinary domain.
Data and Methods
Scopus was selected as the source for the literature search and data collection. It is one of the most comprehensive and trusted databases of scholarly literature, covering various disciplines (Mishra et al., 2021). Compared with the Web of Science Core Collection (WoS), a prevalent source of citation information for bibliometric investigations, Scopus contains more academic publications in the humanities and social sciences, together with more conference papers. Other databases such as PubMed, Derwent, China National Knowledge Infrastructure (CNKI), and China Social Science Citation Index (CSSCI) tend to be limited in scope due to their disciplinary or regional focus. In contrast, Scopus provides broad multidisciplinary coverage and offers built-in analytical tools that facilitate bibliometric analysis, which makes it well suited to cross-disciplinary bibliometric research.
Our initial search formula “‘emoji*’ OR ‘emoticon*’” returned 3,507 documents (as retrieved on March 12, 2024). Because the earliest relevant publication we detected was “The Effects of Emotional Icons on Remote Communication” (Rivera et al., 1996), we set the covered range to 1996–2024. After removing irrelevant records, 3,310 papers were exported as plain text (txt format), containing basic information on authors, institutions, countries/regions, references, etc. The file was named “download_scopus.txt” and placed in the folder “input” to meet the requirements of subsequent bibliometric analysis software, including CiteSpace (Pei et al., 2022). The procedure for data retrieval and collection is shown in Figure S1.
To analyze the landscape of emoji/emoticon research in 1996–2024, we divided the period into three chronological phases: the early years (1996–2004), the mid-period (2005–2014), and the recent years (2015–2024), with respective publication counts of 31, 364, and 2,915. First, to answer RQ1, we used a custom Python script to assign the source journals of the Scopus articles to disciplinary areas according to the subject categories defined by the Journal Citation Reports (JCR). The disciplinary categorization of the relevant journals was then verified by all contributing authors. Second, to address the changes in research streams posed by RQ2, we conducted clustering, centrality, and time-series analyses on the keywords of the retrieved papers. Third, to answer RQ3 and RQ4, we performed statistical analyses and visualizations of author publication trends, cross-citation relationships, significant publications, and journal popularity using CiteSpace, VOSviewer, the R package “Bibliometrix,” and a custom Python script (Aria & Cuccurullo, 2017; Chen, 2006; Derviş, 2019; Ding et al., 2014; van Eck & Waltman, 2010).
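The three-phase division can be sketched in a few lines of Python. This is only an illustration of the binning step, not the custom script used in the study; the phase labels and the function names `phase_of` and `count_by_phase` are hypothetical, while the year boundaries are those stated above.

```python
# Phase boundaries as stated in the text; the labels are illustrative names.
PHASES = [
    ("early", 1996, 2004),
    ("mid", 2005, 2014),
    ("recent", 2015, 2024),
]


def phase_of(year: int) -> str:
    """Return the label of the phase that contains `year`."""
    for label, start, end in PHASES:
        if start <= year <= end:
            return label
    raise ValueError(f"{year} is outside the 1996-2024 study window")


def count_by_phase(years):
    """Tally a sequence of publication years into the three phases."""
    counts = {label: 0 for label, _, _ in PHASES}
    for year in years:
        counts[phase_of(year)] += 1
    return counts
```

Applied to the publication-year column of the exported records, `count_by_phase` would yield one count per phase, which can then be compared with the reported totals.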
CiteSpace, a widely used citation visualization and analysis tool in bibliometrics, focuses on emerging trends in the scientific literature. Through visualization, it shows how knowledge is managed, integrated, and exploited (Chen, 2016). We used CiteSpace to analyze co-cited authors, references, and keywords. We also clustered the articles of co-cited authors to reveal recurring research areas and the research streams within them, and used cluster overlays to identify research trends over time.
VOSviewer is a widely used software tool for constructing and visualizing bibliometric networks. It provides a user-friendly interface and a range of functions for mapping relationships among authors, journals, and keywords (Wong, 2018). Unlike CiteSpace, VOSviewer focuses on the graphical depiction of bibliometric networks (e.g., by showing how closely nodes are related through edges of varying thickness). Each node on the VOSviewer map corresponds to a specific item, such as a country/region, journal, or keyword. The size of a node is determined by a weighting attribute such as the number of publications, frequency of occurrence, or number of citations. The color of nodes and lines indicates the cluster to which they belong. The strength of the connections is assessed using the index of total link strength (TLS), which sums the strengths of all links attached to a node and thus reflects the overall co-authorship or co-citation link strength across countries, institutes, or keywords.
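To make the TLS index concrete, the following minimal sketch computes it from a weighted edge list. It assumes an undirected network given as (node, node, strength) tuples; the function name `total_link_strength` and the toy data are hypothetical and do not come from the study's dataset.

```python
from collections import defaultdict


def total_link_strength(links):
    """Sum, for each node, the strengths of all links attached to it.

    `links` is an iterable of (node_a, node_b, strength) tuples describing
    an undirected weighted network (e.g., co-authorship between countries).
    """
    tls = defaultdict(float)
    for node_a, node_b, strength in links:
        tls[node_a] += strength
        tls[node_b] += strength
    return dict(tls)


# Hypothetical toy network, for illustration only.
toy_links = [("A", "B", 3), ("A", "C", 2), ("B", "C", 1)]
toy_tls = total_link_strength(toy_links)
```

In this toy network, node A has the largest TLS because it participates in the two strongest links.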
Additionally, the R package “Bibliometrix” provides an interactive, locally hosted web interface that supports multiple bibliometric analysis tasks (Aria & Cuccurullo, 2017). This tool is particularly useful for extracting key metrics such as the average number of citations and the H-index. By combining this functionality with custom Python scripts, we performed quantitative analyses of the relevant journals and prolific authors.
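For readers unfamiliar with these metrics, the sketch below shows the standard definition of the H-index (the largest h such that h publications each have at least h citations) alongside a simple citation average. It is an illustrative Python rendering, not the Bibliometrix implementation, and the function names are hypothetical.

```python
def h_index(citations):
    """Largest h such that h items have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


def mean_citations(citations):
    """Average citations per item; 0.0 for an empty list."""
    return sum(citations) / len(citations) if citations else 0.0
```

For example, an author whose five papers have 10, 8, 5, 4, and 3 citations has an H-index of 4 and an average of 6 citations per paper.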
Results and Discussion
RQ1: What Disciplinary Fields Have Contributed to Emoji and Emoticon Research?
We obtained 3,310 highly relevant articles on emoji and emoticon studies from Scopus. Figure S2 illustrates the global trend in the volume of publications from 1996 to 2024. Notably, the period from 2015 to 2024 accounted for over 80% of total publications. This surge can be attributed to the release of Unicode version 8.0, which introduced a significant innovation (i.e., the skin color modifier). This enhancement allowed users to select different skin tones for subsequent character-based emoji, greatly increasing the frequency and broadening the scope of emoji usage (Coats, 2018; Kejriwal et al., 2021). The yearly number of articles worldwide rose from 11 in 2005 to 476 in 2023, an annual growth rate of 23.28%. We based this calculation on 2005–2023 because fewer than 10 articles were published annually before 2005, and such small counts could have biased the growth rate estimate. After 2017, the annual volume of pertinent publications increased noticeably and has continued to rise steadily.
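The reported growth rate can be checked with a short calculation. The sketch below assumes a compound annual growth rate formula, (last / first)^(1 / years) − 1; the text does not state which formula was used, but this one reproduces the reported figure from the two stated endpoints. The function name `annual_growth_rate` is hypothetical.

```python
def annual_growth_rate(first_count, last_count, first_year, last_year):
    """Compound annual growth rate between two yearly publication counts."""
    periods = last_year - first_year
    if periods <= 0 or first_count <= 0:
        raise ValueError("need a positive starting count and last_year > first_year")
    return (last_count / first_count) ** (1 / periods) - 1


# Endpoints stated in the text: 11 articles in 2005, 476 articles in 2023.
rate = annual_growth_rate(11, 476, 2005, 2023)
print(f"{rate:.2%}")  # prints 23.28%
```

Because the formula depends only on the two endpoint counts, a very small starting count would dominate the result, which is consistent with the decision to exclude the years before 2005.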
More importantly, we are interested in the multidisciplinary nature and chronological distribution of emoji/emoticon research. We categorized the collected journals according to the JCR journal categories and plotted a bar chart of the total number of articles published in the journals of each category. Figure 1 illustrates the disciplines of the journals with three or more publications. We found that Computer Science (n = 279), Communication (n = 121), and Psychology (n = 104) rank highest among the journal categories, which is largely consistent with Bai et al.’s (2019) findings (#1: Computer Science; #2: Communication; #3: Marketing). To avoid the non-negligible distortion that could arise from excluding the large number of journals with low publication volumes, we also included JCR-indexed journals with fewer than three publications and generated bar graphs (Fig. S3). Together, the two datasets (the subset of journals with ≥3 publications and the complete journal collection) corroborated the results of our disciplinary categorization, which was also visualized in a word cloud diagram (Fig. S4a).

Figure 1. Analysis of disciplines and journals. Bar chart of the categories to which journals with three or more publications belong.
In particular, in addition to analyzing the sentiment and semantic meanings of emoji and emoticons, computer-science studies have increasingly established emoji and emoticon datasets (e.g., Godard & Holtzman, 2022; Rodrigues et al., 2018) as references for computational and cognitive investigations. Meanwhile, communication studies (as well as linguistics) have focused on the efficiency of emoji/emoticons in CMC across differing contexts. Interestingly, psychological studies also reveal a growing interest in the cognitive mechanisms of emoji/emoticon processing and their correlations with individual differences, such as gender and personality traits.
Among the 1,050 retrieved academic journals that published articles on emoji and emoticons, Communications in Computer and Information Science (n = 59), Computers in Human Behavior (n = 41), and Advances in Intelligent Systems and Computing (n = 37) are the top three. Figure S4b depicts the trend in the number of publications in the top 10 journals by year from 1995 to 2024. This should be distinguished from the “top cited journals with the strongest citation bursts” in Figure S4d, which represents the concentration of highly cited journals generated by CiteSpace. The top 10 active journals published 276 papers, or 14.64% of the journal articles. Among the top 10 journals by number of publications, Computers in Human Behavior had the highest impact factor (IF 2023 = 9.0), followed by Food Quality and Preference (IF 2023 = 4.9) and IEEE Access (IF 2023 = 3.4). In particular, a marked increase in publication output across journals was observed in 2018. This surge not only corresponds to the overall upward trajectory in scholarly production, but also aligns temporally with the peak popularity of emoji adoption in digital communication media. Interestingly, emojis and emoticons have been widely used and investigated in relation to food taste and experience, as shown by the publications in Food Quality and Preference.
As indicated by the colors distinguishing the journals, Figure S4b also shows that Communications in Computer and Information Science and Advances in Intelligent Systems and Computing have higher annual publication growth rates. We then examined the intervals during which individual journals were most prevalent in order to identify possible hotspots. Figure S4d uses bold red lines to show that these periods of popularity differ across journals, indicating a shift from traditional communication studies to computational approaches. Meanwhile, we analyzed academic journal publications separately from conference paper publications. Lecture Notes in Computer Science (LNCS) (n = 175), CEUR Workshop Proceedings (n = 88), and ACM International Conference Proceeding Series (n = 56) are the top three proceedings out of 734 academic conferences. Figure S4c depicts the trend in the number of publications in the top 10 proceedings by year from 1995 to 2024. The top 10 active proceedings published 463 papers, or 32.49% of the conference papers. Lecture Notes in Computer Science has the most consistent number of publications and the highest growth rate. It is worth noting that the number of emoji/emoticon papers published in this series has increased significantly in recent years, suggesting that the study of emoji and emoticons from the perspective of computer science and cognitive studies has become a major trend. Additionally, consistent with journal publications, conference papers also exhibited a significant increase in output in 2018 and maintained high publication volumes in subsequent years. This trend further substantiates the growing scholarly interest in the emoji/emoticon field in recent years.
RQ2: What Research Streams Have Emerged Over Time in Relevant Journals?
Keyword co-occurrence and clustering can indicate the areas of interest and prospective research directions in this multidisciplinary field. The keywords cover the main topics of the publications. We required each keyword to appear at least five times, without weighting, and 1,090 of the keywords retrieved from the abstracts and titles satisfied this threshold. The clustering of these keywords using VOSviewer is shown in Figure S5a, which identifies established themes in the research area and shows how the keywords can be grouped into five clusters distinguished by color (red, blue, green, yellow, and purple). “Social networking (online),” “human,” “emotion,” “semantics,” and “computer-mediated communication” are among the most frequently used keywords.
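The thresholding and co-occurrence counting described here can be illustrated with a small sketch. It is not VOSviewer's implementation: it assumes each keyword is counted once per document, and the function name `cooccurrence`, the parameter `min_occurrences`, and the example records are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(records, min_occurrences=5):
    """Filter keywords by document frequency, then count co-occurring pairs.

    `records` is a list of keyword lists, one list per publication.
    Returns (kept_keywords, pair_counts), where pair_counts maps an
    alphabetically ordered keyword pair to the number of shared documents.
    """
    doc_freq = Counter()
    for keywords in records:
        doc_freq.update(set(keywords))  # count each keyword once per document
    kept = {kw for kw, n in doc_freq.items() if n >= min_occurrences}

    pair_counts = Counter()
    for keywords in records:
        present = sorted(set(keywords) & kept)
        pair_counts.update(combinations(present, 2))
    return kept, pair_counts


# Hypothetical toy records with a lowered threshold, for illustration only.
toy_records = [
    ["emoji", "emotion"],
    ["emoji", "emotion", "semantics"],
    ["emoji", "sticker"],
]
toy_kept, toy_pairs = cooccurrence(toy_records, min_occurrences=2)
```

The resulting pair counts form the weighted edges of a co-occurrence network, which a clustering algorithm can then partition into thematic groups.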
The first cluster (red) mainly concerns social networking, semantics, and data mining, while the second relates to the interplay between emoji and linguistics (e.g., speech recognition). Specifically, the keywords “social networking (online)” and “computer-mediated communication” reflect the principal social functions of emoji and emoticons in communication. Emoji and similar symbols (emoticons included) can compensate for the lack of non-verbal cues in online communication, thus helping users to convey emotions and meanings and further facilitating social networking and bonding (Bai et al., 2019). In general, the first two clusters represent the semantic and social roles of emoji and emoticons.
The keyword “human” in the blue cluster points to the main research subject of emoji/emoticon research, in addition to emojis and emoticons themselves. Emoji/emoticons might exert differing influences on different individuals (mostly on “emotion” and cognition), depending on their ages, personalities, genders, social media usage habits, and cultural backgrounds, among others. This line of research shows a shift from the general population to individual characteristics when discussing the role of emoji/emoticons in CMC. Another noteworthy aspect of the keyword “human” is that it covers not only human-to-human communication but also human-computer interaction, especially in the new millennium. In addition to “emotion,” other cognitive domains including decision making, food preference, and consumer behavior were widely investigated, which reflects the interdisciplinary links among psychology, marketing, and behavioral sciences. As such, this cluster denotes the emotional and cognitive aspects of emoji/emoticon use during human communication.
Moreover, the last two clusters demonstrate the methodological advances and innovations in emoji and emoticon research. “Big data” and “computation theory” are the newest buzzwords in the purple cluster; together with “data mining” in the red cluster, they represent the computational approach. By using big data and machine learning techniques, the semantic and sentiment dimensions of emoji/emoticons can be measured and quantified more precisely and systematically, which in turn helps optimize computational algorithms and computer systems. Meanwhile, the yellow cluster highlights the physiological mechanisms of emoji/emoticon use and processing, studied with psychophysiological methods including brain potentials and evoked body responses (e.g., skin conductance). For instance, by monitoring the brain theta oscillations induced by emoji semantic anomalies in an electroencephalogram (EEG) experiment, Tang et al. (2021) shed light on the neuro-cognitive mechanism of semantic processing during emoji reading.
Another key indicator of research frontiers, hotspots, and upward trends is the intensity of keyword bursts (Fig. S5b). Most notably, “communication” (1996–2014), “internet” (2000–2019), and “world wide web” (2004–2016) are terms whose bursts span a remarkably long time, suggesting that research on the Internet and CMC has received sustained attention. In addition, the terms “language processing” (2022–2024), “natural languages” (2022–2024), “deep learning” (2022–2024), and “machine learning” (2022–2024) have become popular in recent years and remain influential today. As can be seen, computation and cognition are the two most recent research streams in emoji/emoticon research, which reflects the multidisciplinary and dynamic nature of this field. Nevertheless, the keyword analysis suggests that cognitive research on emoji/emoticons has received less attention than large-scale computational analysis. Future studies might need to examine the many aspects of emoji processing and application by combining computational and neuro-cognitive techniques, which calls for cross-disciplinary collaboration across psychology, linguistics, communication, computer science, and brain science.
While keyword co-occurrence and clustering analyses provide a structural overview of research streams, a meaningful understanding of the field also requires synthesizing these clusters into substantive thematic traditions. Drawing on the identified keyword clusters, the literature on emoji and emoticon research can be broadly organized into four major thematic directions. Table 1 summarizes the major thematic traditions distilled from the bibliometric clusters.
Table 1. Major Thematic Traditions in Emoji and Emoticon Research.
First, pragmatic and discourse-oriented studies constitute a well-established research tradition, particularly within linguistics and communication studies. Research in this stream examines how emojis operate as interactional resources in CMC, contributing to meaning-making beyond propositional content. Empirical and theoretical studies have shown that emojis and emoticons can modulate illocutionary force, disambiguate speaker intention, and guide recipients’ interpretation of stance and interpersonal orientation (Alshenqeeti, 2016; Dresner & Herring, 2010; Walther & D’Addario, 2001). Related work further demonstrates that emoji/emoticons are frequently deployed to soften directives, mitigate face-threatening acts, and manage relational dynamics in online interaction (Beißwenger & Pappert, 2019; L. Li & Yang, 2018; Lo, 2008; Riordan, 2017). From this perspective, emojis are not treated as mere emotional embellishments, but as pragmatically motivated semiotic cues embedded in discourse norms and communicative practices.
Second, cognitive and affective processing research focuses on how emojis and emoticons are perceived, interpreted, and integrated during language comprehension. Studies in this stream examine emotional responses, semantic integration, and attentional processes associated with emoji/emoticon use, often adopting experimental paradigms from cognitive psychology and neuroscience. Behavioral and psychophysiological evidence suggests that emojis can influence emotional responses and semantic integration processes, sometimes functioning in ways comparable to lexical items or pictorial stimuli, while in other cases eliciting distinct processing patterns (Filik et al., 2016; Paggio & Tse, 2022). Neurocognitive studies further indicate that emoji processing engages affective and attentional mechanisms, shedding light on how visual symbols interact with language processing systems (Y. Li et al., 2024; Tang et al., 2021). This body of work positions emojis as cognitively meaningful elements within multimodal communication rather than peripheral or decorative features.
Third, computational approaches represent the most rapidly expanding research stream. These studies primarily aim to model and quantify emoji meaning at scale, drawing on methods from natural language processing (NLP), machine learning, and sentiment analysis. Large-scale analyses of social media data have demonstrated that incorporating emoji/emoticons can substantially improve sentiment classification and emotion detection in informal texts (Kiritchenko et al., 2014; Kralj Novak et al., 2015). More recent work leverages deep learning techniques to capture nuanced semantic and affective patterns associated with emoji use, reflecting the growing integration of emoji research into affective computing and human–computer interaction (Eisner et al., 2016; Zhao et al., 2018). Within this tradition, emojis are conceptualized as quantifiable signals that enhance computational representations of human communication.
Finally, cross-cultural and sociolinguistic perspectives address variation in emoji interpretation and usage across languages, cultures, and platforms. Studies in this stream demonstrate that emoji meanings are shaped by cultural conventions, language-specific practices, and platform-dependent renderings, rather than being universally fixed (Kejriwal et al., 2021; Ljubešić & Fišer, 2016). Comparative analyses reveal systematic differences in emoji frequency, preference, and interpretation across communities, highlighting the role of sociocultural knowledge in emoji-based communication (Bai et al., 2019; Prada et al., 2018). This perspective underscores the importance of viewing emojis as culturally situated semiotic resources embedded in broader sociolinguistic systems.
Taken together, these thematic traditions illustrate how emoji and emoticon research has progressed from early descriptive and interaction-oriented studies toward a multidisciplinary domain integrating pragmatic theory, cognitive science, computational modeling, and sociocultural analysis. Interpreting bibliometric clusters through these thematic lenses helps clarify not only the structural organization of the literature, but also the substantive insights the field has generated regarding the communicative functions and processing mechanisms of emojis and emoticons.
RQ3: Which Countries, Institutions, and Authors Are the Most Productive in Emoji and Emoticon Research?
Research on emoji and emoticons has been conducted and published across 92 countries and regions, reflecting its broad international scope. According to the global productivity shown in Figure S6c, papers on emoji and emoticons were primarily published in North America, Asia, Europe, and South America. Table S1 lists the top 10 countries/regions for articles related to emoji and emoticons, and Figure S6a shows the annual publication counts in these countries/regions throughout the period 1996–2024. The figure also shows that India has had a high publication growth rate in recent years, while China and the United States have maintained a steady pace of production. The United States produced the most publications (18.49%, 612), followed by China (14.47%, 479) and India (12.05%, 399). The United States also received the most citations (13,380), the highest average citations per article (21.863), and the highest H-index (54), ahead of China, which ranked second in total citations (7,801).
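The shares and the citation average follow directly from the stated counts, as the short check below illustrates. It uses only figures given in the text (3,310 publications in total; 612, 479, and 399 by country; 13,380 citations for the United States); the variable names are hypothetical.

```python
TOTAL_PUBLICATIONS = 3310  # size of the final dataset, as stated in the text

publications = {"United States": 612, "China": 479, "India": 399}

# Share of the full dataset, in percent, rounded to two decimals.
shares = {
    country: round(100 * count / TOTAL_PUBLICATIONS, 2)
    for country, count in publications.items()
}

# Average citations per article for the United States.
us_avg_citations = round(13380 / publications["United States"], 3)
```

Both the percentage shares and the average of about 21.863 citations per article match the reported values, which suggests that the shares were computed against the full set of 3,310 publications.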
The data also illustrate the influence and importance of the United States in emoji and emoticon studies. In Table S1, Single Country Publication (SCP) denotes the number of publications whose co-authors are all from the same country, whereas Multiple Countries Publication (MCP) denotes the number of papers co-authored with authors from other nations (Pei et al., 2022). The share of MCP can be seen as an indicator of the frequency of inter-country cooperation. We found that the top 10 countries/regions in terms of the number of articles all show a tendency to cooperate closely with other countries. The Chinese mainland (labeled as China in the figure) cooperates with many countries/regions, among which the two most important are the United States and Hong Kong SAR of China (Fig. S6b). However, cooperation between other countries is not prominent in the map. Figure S6b shows that 145 nations were included in our analysis of global cooperation using VOSviewer. In this map, the thickness of the lines connecting nodes represents the strength of international co-authorship links, and Total Link Strength (TLS) sums these strengths for each node. The co-authorship visualization map shows that the top five countries by TLS were the USA (TLS = 197), the United Kingdom (TLS = 123), China (TLS = 105), Australia (TLS = 58), and Germany (TLS = 56).
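The SCP/MCP distinction can be expressed as a simple classification rule. The sketch below assumes each paper is represented by the list of its authors' countries; the function names `classify_collaboration` and `mcp_ratio` and the example papers are hypothetical, and the rule is a plain reading of the definitions above rather than the exact procedure behind Table S1.

```python
def classify_collaboration(author_countries):
    """Return 'SCP' if all authors share one country, otherwise 'MCP'."""
    unique = set(author_countries)
    if not unique:
        raise ValueError("a paper needs at least one author country")
    return "SCP" if len(unique) == 1 else "MCP"


def mcp_ratio(papers):
    """Share of multi-country papers among `papers` (lists of countries)."""
    labels = [classify_collaboration(countries) for countries in papers]
    return labels.count("MCP") / len(labels)


# Hypothetical toy papers, for illustration only.
toy_papers = [
    ["United States", "United States"],
    ["China", "United States"],
    ["India"],
    ["United Kingdom", "Germany", "United Kingdom"],
]
toy_ratio = mcp_ratio(toy_papers)
```

A higher MCP ratio for a country indicates that a larger part of its output involves international co-authorship.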
The papers and proceedings on emoji and emoticons included contributions from 5,643 institutions. Table S2 lists the top 10 most productive institutions, among which the University of California, Tsinghua University, and Peking University are the top three. The data show that the multidisciplinary nature of this field has led to a fragmented distribution of institutions rather than their concentration in one area (Fig. S7a). At the same time, inter-institutional cooperation is limited to certain small groups (Fig. S7b), and we do not observe large-scale or complex institutional cooperation in the data.
The author analysis includes 1,068 authors (nodes in CiteSpace) who met the thresholds we set in the software. Tables 2A and 2B present the 13 most productive authors and the 10 most co-cited authors who published papers on emoji and emoticons. Interestingly, authors with high citation counts do not exhibit high centrality in the software. The top 5 authors by centrality in CiteSpace are Christophel, D (Count = 1, Centrality = 1.04), Rezabek, LL (Count = 30, Centrality = 0.94), Hu, M (Count = 54, Centrality = 0.81), Gunawardena, CN (Count = 2, Centrality = 0.80), and Chuang, ZJ (Count = 2, Centrality = 0.76). Figure S8a shows a visualization of the author network and Figure 2(a) presents the clustering of co-cited authors’ articles. Each circle represents one author. Lines between the circles indicate collaborations between authors, with thicker lines indicating closer collaboration and different colors indicating different years. In the cluster view, different topics are distinguished by different colors. Collaboration between authors is evident, especially in the last decade. Figure S8b shows the publication associations and research tendencies of popular authors in the field through a Sankey diagram. Here, too, we can observe the research streams in computer science, communication, and cognition that we identified in answering RQ2. Figure 2(b) shows the top 25 authors with the strongest citation bursts.

Analysis of co-authorship and cited authors. (a) CiteSpace visualization map of topic cluster view of author co-citation. (b) CiteSpace visualization map of the top 25 authors with the strongest citation bursts from 1996 to 2024. The dark green line indicates the time at which it appeared above the threshold, and the thick red line suggests its centrality.
The H-index reflects both the productivity and impact of a researcher’s overall scholarly contributions. Table 2A shows that Jaeger, SR (19 papers), Ares, G (18 papers), and Barbieri, F (15 papers) published the most papers. Ares, G (H-index of 14), Jaeger, SR (H-index of 13), and Barbieri, F (H-index of 10) had the highest H-index values, which indicates the influence and academic standing of their work. Table 2B shows that each of the top 10 co-cited authors received more than 500 citations. The most notable nodes are associated with Yang, N (1,264 citations), Chew, C (1,177 citations), and Eysenbach, G (1,177 citations). Meanwhile, only three of the top 10 co-cited authors show centrality in CiteSpace: Walther, JB (Centrality = 0.19), Novak, PK (Centrality = 0.02), and Fischer, AH (Centrality = 0.01). The contributions of these co-cited authors have proved influential in emoji and emoticon research. It is worth noting that citations accruing to a single paper are shared by all of its authors, so multiple authors may show the same citation count. Table 2B is therefore skewed toward the authors of widely cited publications when gauging contributions to the field. Only by combining citation counts with centrality values can we get a full picture of how much individual authors contribute to the field.
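As a brief illustration of how the H-index balances productivity and impact, the Python sketch below computes it from a list of per-paper citation counts using the standard definition: the largest h such that h papers have at least h citations each. The two citation lists are hypothetical and are not taken from Table 2A or 2B.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


# Hypothetical citation counts, not taken from the reviewed corpus.
prolific = [9, 7, 6, 5, 4, 3, 2, 1, 1, 0]  # many papers, moderate citations
one_hit = [1177, 2, 1]                      # few papers, one highly cited

print(h_index(prolific))
print(h_index(one_hit))
```

The prolific profile yields an H-index of 4, whereas the profile with a single heavily cited paper yields only 2, which is why total citations and the H-index can rank authors differently.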
RQ4: What Are the Most Highly Cited Publications and Influential Works in Emoji and Emoticon Research?
As of the search date, the retrieved articles had been cited 44,360 times in total, with an average of 13.4 citations per document. In our investigation, 981 references (nodes in CiteSpace) met the thresholds we set in the software. The top 10 references in emoji and emoticon research are shown in Table S3. We first focused on the authors of highly cited literature. Chew, C has the most total citations (n = 1,177), followed by Kiritchenko, S with 648 citations. Here we compared the citation counts provided in CiteSpace, rather than centrality, and found strong agreement with Table S3, suggesting that citation counts can be used to explain the areas covered by the articles and their connection to emoji/emoticon research. Interestingly, the most cited work, “Pandemics in the Age of Twitter: Content Analysis of Tweets During the 2009 H1N1 Outbreak” (Chew & Eysenbach, 2010), examined emoticons as one component of social media communication and demonstrated their relevance during a specific public health crisis, despite not explicitly referencing emoticons in its title or abstract. It is also important to note that since the total number of citations usually increases over time, earlier publications tend to accumulate more citations than later ones. With this caveat in mind, we observed that highly cited papers were concentrated around 2010, suggesting that research developments during this period played an important role in advancing the field. PLoS ONE and Computers in Human Behavior, as two of the main outlets, publish insightful work on the psychology and human behavior associated with emojis and emoticons.
The top 25 references with the strongest citation bursts in emoji and emoticon research are shown in Figure S9. Interestingly, as far as centrality is concerned, these articles do not exhibit the same level of sustained influence as the keywords discussed earlier. Still, by examining the research areas and the journal or conference sources of the corresponding articles, we found that these results overlapped substantially with Figure S5b, showing a disciplinary shift from communication and linguistic approaches to computational and cognitive interpretations.
Figure S10 shows a graphical representation of the reference co-citation network, in which five clusters are clearly distinguished by color, showing a strong cluster effect and a homogeneous network. By comparing author sources, we categorized these five clusters as sentiment analysis (red), computer-mediated communication (green), retrospective review (yellow), semiotics (blue), and marketing (purple). CiteSpace used 282 co-citations (displayed after threshold filtering) and 1-year time slices from 1996 to 2024, with the most cited 10% of the literature as a selected group to show the co-citations (Figure 3).
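For readers less familiar with co-citation analysis, the underlying count can be sketched in a few lines of Python: two references are co-cited once for every citing paper whose reference list contains both, and weak pairs are dropped by a minimum-count threshold. The reference labels and the threshold below are hypothetical placeholders; this is a simplified illustration of the general idea, not a description of CiteSpace's internal procedure.

```python
from collections import Counter
from itertools import combinations

# Hypothetical reference lists of three citing papers (labels are placeholders).
reference_lists = [
    ["RefA", "RefB", "RefC"],
    ["RefA", "RefB"],
    ["RefB", "RefC", "RefD"],
]


def cocitation_counts(reference_lists):
    """Count, for each pair of references, how many papers cite both."""
    counts = Counter()
    for refs in reference_lists:
        for a, b in combinations(sorted(set(refs)), 2):
            counts[(a, b)] += 1
    return counts


def filter_by_threshold(counts, minimum):
    """Keep only pairs co-cited at least `minimum` times."""
    return {pair: n for pair, n in counts.items() if n >= minimum}


counts = cocitation_counts(reference_lists)
print(filter_by_threshold(counts, minimum=2))
```

With these toy lists, only the pairs (RefA, RefB) and (RefB, RefC) are co-cited twice and survive a threshold of 2; clustering and timeline layouts are then built on top of such weighted pairs.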

Analysis of co-cited references. Timeline visualization of co-cited reference clusters generated using CiteSpace. The time evolution is indicated with different colored lines, and the nodes on the lines indicate the references cited.
The co-cited references are also shown in the timeline graph in Figure 3, which depicts how research hotspots change over time. These results can be divided into eight clusters. “Sentiment analysis” (#0) is the largest cluster, which reflects the most pronounced role of emoji/emoticons in communication, along with their semantic (“distributional semantics,” #4) and pragmatic functions (“willingness to forgive,” #6). Interestingly, “hinglish” (#3) shows the cultural and contextual diversity of emoji/emoticons, which expands the research scope of linguistic studies. Meanwhile, “deep learning” (#1) and “text analysis” (#2) represent the mainstream methods for computing and digitalizing emoji and emoticons, in line with the keyword analysis results in section “RQ2: What Research Streams Have Emerged Over Time in Relevant Journals?”
However, the co-cited reference analysis did not reveal much influential work adopting physiological and neural techniques to examine the cognitive mechanisms of emoji and emoticon reading. It remains unclear to what extent emoji/emoticons resemble or differ from real emotions and their text counterparts. Neuro-imaging and psycho-physiological recordings could therefore be used to investigate the underlying correlates.
Conclusion and Research Agenda
Drawing on a systematic review combined with bibliometric techniques, this study aimed to depict the research landscape of emoji and emoticon studies. This study covered a total of 3,310 publications from 1996 to 2024. Specifically, this study highlights the multidisciplinary nature of emoji and emoticon research from a holistic, macro-level perspective by identifying key scientific fields, research themes, methods, countries/regions, and influential authors. More importantly, we attempted to indicate the key issues that could drive future investigations on emoji and emoticon from multidisciplinary and global perspectives.
Regarding RQ1, we identified that the most relevant fields of emoji/emoticon studies were computer science, communication, and psychology, revealing a significant interaction between humans and computers when addressing the many facets of emoji and emoticons. Meanwhile, the analysis of the field’s temporal development indicated a significant burst of publications in the last decade, which was primarily driven by computer science. In particular, computational methods mainly focused on the sentiment content of emoji/emoticons, while communication studies (as well as linguistics) concentrated on their pragmatic and social roles in CMC. Meanwhile, psychology and the social sciences provided important insights into how the use of emoji/emoticons could facilitate human behavior and social interaction, and into the cognitive correlates of their usage. In addition, the application of emoji and emoticons has been widely investigated in various fields and industries, including food science and technology, engineering, health care, information science, and public/environmental/occupational health, among others. However, as revealed by the data, the top fields of emoji and emoticon research focused on fundamental rather than practical studies. Applications of emoji and emoticons should thus be encouraged by applying them in various fields and seeking collaborations across disciplines.
The results of the keyword co-occurrence and clustering analysis were informative for understanding the research streams of the covered topics (RQ2). The first two clusters mainly concern the semantic and social roles of emojis and emoticons in computer-mediated communication, while the third one denotes the emotional and cognitive aspects of using emoji and emoticons. Meanwhile, the last two clusters represent the influential methodologies employed in this field, that is, big data and computation theory, respectively. Nevertheless, the cognitive correlates of emoji/emoticons were less adequately studied than their computational interpretation and modeling, a gap that should be addressed by future investigations. While bibliometric analyses necessarily emphasize publication patterns and structural relationships, we extend this approach by explicitly synthesizing the identified clusters into substantive thematic research traditions, namely pragmatic and discourse-oriented studies, cognitive and affective processing, technological and computational approaches, and cross-cultural and sociolinguistic perspectives. This additional layer of interpretation clarifies how quantitative knowledge structures correspond to established theoretical and empirical directions in emoji and emoticon research.
In terms of RQ3, the results of our analysis provided straightforward information on the regional and personal distributions of relevant publications. Overall, the output of emoji and emoticon research was unevenly distributed globally and was dominated by the United States, China, and India. The most productive institutes include the University of California, Tsinghua University, and Peking University, among others. The most prolific authors come from various countries and institutes, as well as differing disciplines including computer science, communication, linguistics, and psychology. Meanwhile, the co-authorship analysis showed cooperation across countries, institutes, and disciplines.
RQ4 was addressed through our co-citation analysis, which identified the most cited work. The top ten original articles with the most citations were mainly published in journals of computer science, psychology, and communication. The most cited work conducted a content analysis of tweets during the 2009 H1N1 outbreak, in which emoticons were an integral part. This work was published in PLoS ONE and has received 1,177 citations. Further, the co-cited reference analysis revealed the state-of-the-art techniques, including sentiment analysis, deep learning, and text analysis. Overall, semantics did not receive as much attention as emotions in the analysis of emoji and emoticons. Meanwhile, research adopting physiological and neural techniques was encountered less often in the co-cited reference analysis.
To sum up, our systematic review could make an important contribution to emoji/emoticon academia, as we summarized the most significant research issues, methodologies, countries, institutes, and authors, in light of a multidisciplinary and global perspective. This review would thus be of interest for researchers from various backgrounds.
Emoji and emoticons have become an integral component of contemporary digital communication, functioning as visual resources that enrich the expression of emotions, meanings, and social intentions. Prior research has conceptualized emojis not merely as communicative tools, but as historical, social, and cultural objects situated at the intersection of affective expression and socio-economic forces (Stark & Crawford, 2015). At the same time, the growing prominence of visual elements on social media platforms such as Instagram and Vine has underscored the need for systematic approaches to analyzing visual-mediated communication (Highfield & Leaver, 2016). Against this broader communicative backdrop, the present bibliometric analysis provides a comprehensive overview of how emoji and emoticon research has developed across disciplines. Our findings indicate that existing studies have primarily concentrated on two major dimensions (Figure 4(a)). The first is a content-oriented dimension, encompassing semantic and emotional aspects of emojis, which have been extensively investigated through computational approaches and, to a lesser extent, linguistic analyses (Wang et al., 2023). The second is a functional dimension, focusing on the roles of emojis and emoticons at both social and individual levels. Research in communication and applied fields (e.g., food science and technology) has examined how these visual symbols facilitate computer-mediated communication and support the development of emoji- or emoticon-based applications, such as product evaluation and educational feedback.

Conceptual map and future research framework for emoji and emoticon studies. (a) The content and functional dimensions of emoji/emoticon research. (b) An integrative three-level research framework.
At the same time, the bibliometric results reveal notable gaps in the current literature. In particular, comparatively less attention has been paid to how emoji and emoticon use interacts with individual differences, including race, gender, and socio-economic background, despite evidence that such factors systematically shape emoji usage and interpretation (Ljubešić & Fišer, 2016; Prada et al., 2018; Rodrigues et al., 2018; Sadiq & Shahida, 2019). Moreover, while emojis and emoticons have been proposed as potential indicators of emotional states and personality traits, this line of research remains underdeveloped and fragmented (Stark & Crawford, 2015).
Building directly on these empirical patterns and identified gaps, we therefore propose an integrative research map and agenda for emoji and emoticon studies across disciplines (Figure 4(b)). This framework aims to systematically connect foundational representations, cognitive mechanisms, and applied domains, thereby offering a theoretically grounded and empirically motivated structure for future research.
First, the physical level concerns the digitalization and establishment of semantic, emotional, and social norms of emoji and emoticons. While several emoji datasets already exist (e.g., Godard & Holtzman, 2022; Rodrigues et al., 2018; Scheffler & Nenchev, 2024) covering various sentiment dimensions (e.g., valence and arousal), they are relatively small (100–400 emojis). Meanwhile, most of the normed dimensions concern emotions, whereas semantic aspects are seldom recorded. Moreover, those datasets mainly focus on face emojis, while object and event emojis (e.g., party balloons and champagne) as well as emoticons are rarely covered. Object and event emojis carry an important aspect of visual symbols, namely social information, which is distinct from semantic and sentiment information. However, these features have not been systematically measured and quantified in the existing datasets. Addressing the social aspects of emoji/emoticons would enable them to become viable tools for probing social cognition (Y. Li et al., 2024). Meanwhile, the emotional and semantic norms of the existing emoji datasets were established mostly on the basis of human ratings, which might limit their expansion to larger scales. A recent line of research showed the feasibility of using AI-generated estimates to quantify the familiarity, concreteness, valence, and arousal of over 100,000 Spanish words (Brysbaert et al., 2024; Martínez et al., 2024). Future studies could use large language models to complement human ratings of the emotional, semantic, and social aspects of emoji and emoticons when developing large-scale emoji/emoticon lexicons.
The cognitive level concerns the cognitive and neuroscientific mechanisms by which humans perceive and use emoji and emoticons in various contexts. There are two crucial issues: (1) Are emoji/emoticons processed in a similar way to textual words during reading and communication? (2) Can emoji/emoticons elicit emotions as robustly as other affective stimuli? The existing findings regarding the first issue are mixed. For instance, Weissman (2019) found that emojis were processed similarly to words in sentential contexts, such that participants could integrate information from various modalities during sentence comprehension. In contrast, evidence from EEG oscillations and eye-tracking showed that emojis and words might involve differing semantic processing strategies (Paggio & Tse, 2022; Tang et al., 2021). The discrepancy could be further examined with more rigorous experimental paradigms and brain-imaging techniques with higher spatial resolution, such as functional Magnetic Resonance Imaging (fMRI) and magnetoencephalography (MEG). Concerning the second issue, Y. Li et al. (2024) identified similar socio-affective mechanisms for face emojis and actual human faces, supporting the notion that emojis are processed as social information. Other stimuli (e.g., audio, pictures, and videos) should also be examined to identify the extent to which they resemble or differ from emoji/emoticons in terms of mental representation and processing. Importantly, the semantic and sentiment meanings of both face and non-face emojis could be modulated by individual and social traits, such as working memory and theory of mind (Y. Li et al., 2024; Tang et al., 2021). The interplay between emoji/emoticon processing, cognitive functions, and social information should be further addressed. This calls for cross-disciplinary collaboration among psychology, linguistics, and neuroscience, among others, as well as technical integration and methodological innovation.
The application level concerns the effectiveness of emoji/emoticons in real-world use and problem-solving. Our bibliometric analysis identified a wide range of application areas, including food science and technology, healthcare, and information and library science. One limitation to overcome is their variation across cultures and platform renderings (e.g., iOS, Android, and Windows). Meanwhile, semantic ambiguity and inefficient use (Bai et al., 2019; Riordan, 2017) should be addressed to facilitate adaptation in applications and system optimization. Future research and applications should expand their range by not only advancing algorithms but also including more non-face emojis. Overall, the digitalization and dataset development of emoji and emoticons would provide valuable resources for cognitive investigations and applications. Meanwhile, deepening the cognitive understanding of emoji/emoticon processing would inform practical developments, which would in turn facilitate this understanding. Advances at the three levels of emoji/emoticon research call for thorough exchange and cooperation across different disciplines globally.
Supplemental Material
sj-docx-1-sgo-10.1177_21582440261456453 – Supplemental material for Multidisciplinary and Global Perspectives on Emoji and Emoticon Research: A Bibliometric Review and Research Agenda
Supplemental material, sj-docx-1-sgo-10.1177_21582440261456453 for Multidisciplinary and Global Perspectives on Emoji and Emoticon Research: A Bibliometric Review and Research Agenda by Runshan Gui, Zikai Lin, Ke Huang, Guandong Yue, Chengwen Wang and Fei Gao in SAGE Open
Footnotes
Acknowledgements
We thank members of the Gao laboratory for suggestions and comments.
Ethical Considerations
Ethical approval was not required because the information utilized in this research was obtained from open sources and excluded interaction with human subjects.
Consent to Participate
All authors have read and agreed to participate in the unpublished work.
Consent for Publication
All authors have read and agreed to the published version of the manuscript.
Author Contributions
All authors have read and agreed to the published version of the manuscript.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by Shanghai Pujiang Program (23PJC005), Central University of Finance and Economics New Sprout Scholar Support Program (XMXZ2413) and FDUROP (Fudan Undergraduate Research Opportunities Program) (23819).
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data Availability Statement
Declaration of Generative AI and AI-Assisted Technologies in the Writing Process
During the preparation of this work, the authors used DeepL and the Large Language Model GPT 4o in order to polish the language and optimize essay writing. After using these tools, the authors reviewed and edited the content as needed and take full responsibility for the content of the publication.
Supplemental Material
Supplemental material for this article is available online.
References
