When Texts Meet Emoji: A Multi-Stage Study of Tourism Brands

Abstract

Are social media posts with emoji more engaging? Guided by the media richness theory, this study explores the relationship between visual (i.e., emoji) and textual content, and how they collectively impact user engagement with peer-to-peer accommodation brands. A three-stage sequential design using naturalistic data was adopted, comprising text mining, frequent/rare itemset mining, and one-way ANOVA. This study revealed that the combination of {Travel Tips and Inspiration,} and {Interaction and Motivation,} tended to result in an increased number of likes and shares in social media posts. By theoretically revealing and empirically examining the complex relationship between verbal and visual content, this study enriches the theoretical understanding of media richness in tourism brands. Practically, this study provides actionable guidelines for tourism brands to increase user engagement by effectively using visual-verbal content.

Keywords

media richness, emoji, frequent/rare pattern, user engagement, social media

Introduction

Recent years have witnessed the rapid growth of peer-to-peer accommodation worldwide, disrupting traditional accommodation models (Mody et al., 2021). Peer-to-peer accommodation is part of the fast-growing fleet of sharing economy businesses that enable individuals to rent out their spare space for short periods for a fee (L. Zhu et al., 2019). By 2025, peer-to-peer accommodation is estimated to make up 17% of global accommodation (World Economic Forum, 2017). However, compared with traditional accommodation providers, peer-to-peer accommodation is less known to consumers. An effective way for peer-to-peer accommodation brands to penetrate the accommodation business is to strategically build their distinctive brand personalities (Wang et al., 2021) into their brand-generated content (Liang et al., 2020; Tao et al., 2022). To enhance brand awareness and reach, and to project their personalities, peer-to-peer accommodation brands need to create brand-generated content to maintain ongoing dialogs with consumers (Wang et al., 2021).

As such, brands, including peer-to-peer accommodation brands, need to include various elements in their brand-generated content, both verbal (e.g., text) and visual (e.g., emoji), to address “What to say?” and “How to say it?” (Luangrath et al., 2017). This practice aligns with the media richness theory, which holds that text-based content is not the richest channel (Walther & Parks, 2002). Visual content, such as emoji, can be considered “rich” media because it supports feedback, multiple cues, language diversity, and personal focus (Moussa, 2019). Thus, the communication effectiveness of brand-generated social media content from peer-to-peer accommodation brands depends on the combinations of emoji and text, rather than on text alone (Wang et al., 2023). In fact, peer-to-peer accommodation brands (as shown later in this study) often include a range of emoji in their social media posts as a means of engaging their followers in more innovative and entertaining ways (Luangrath et al., 2017). In a crowded marketplace, new brands can leverage emoji as a visual differentiator. Incorporating unique or customized emoji that align with their brand or products can help brands stand out and gain recognition in digital spaces, ultimately helping turn new consumers into brand advocates (Wilk et al., 2020). This strategy is critical to peer-to-peer accommodation brands, as they are relatively new compared with traditional accommodation providers. Such visual distinctiveness can contribute to brand recall and help new brands carve out their niche and develop new consumer-brand relationships in competitive markets (Alvarez et al., 2023). Emoji can add a level of emotional appeal to a message and help the message stand out in a sea of text (Bai et al., 2019). However, the co-occurrence patterns of text and emoji, and their performance, remain unclear to tourism researchers and practitioners, both theoretically and practically (Wang et al., 2023).
As visual paralanguage, emoji supplement the text and overcome the absence of nonverbal signals in online communication.

Therefore, guided by the media richness theory, this research aims to explore (1) the relationship between verbal (e.g., text) and visual content (e.g., emoji) in brand-generated social media content and (2) their joint effects on user engagement with peer-to-peer accommodation brands. A three-stage research design was adopted: Stage 1 incorporated text mining, to extract hidden topics that co-occurred with emoji; Stage 2 used Frequent/Rare Itemset Mining techniques to identify topic-emoji patterns; and Stage 3 examined the impacts of identified frequent/rare topic-emoji patterns on user engagement.

Theoretical Background and Literature Review

Media Richness Theory

Media richness theory was initially developed to model the comparative efficiency of various communication channels in decreasing equivocality in organizational decision-making (Daft & Lengel, 1986). One fundamental element of this theory is media richness, defined as “the degree to which cues are available in a specific communication context.” According to Daft et al. (1987, p. 358), four key factors determine how rich a medium is: (1) feedback, or how quickly responses can be provided; (2) multiple cues, or how many cues (such as physical presence, voice intonation, body language, words, numbers, and graphic symbols) can be used to convey a message; (3) language diversity, or the number of meanings that can be expressed through symbols; and (4) personal focus, or the ability to produce messages that are personalized for each person. The other fundamental element is the equivocality of a communication scenario, meaning the extent to which a decision-making scenario, and the facts associated with it, are open to various interpretations (Walther, 2011). As for the relationship between these two elements, media richness theory holds that there should be a perfect fit between the equivocality of a communication scenario and the richness of a medium. Greater equivocality necessitates richer media, while lesser equivocality requires leaner media for maximum efficiency. This theory was initially developed with the idea that a perfect fit (or a perfect misfit) impacts efficiency, and it is frequently discussed in the literature as having an impact on communication effectiveness (Walther, 2011).

As peer-to-peer accommodation brands are still relatively new to the tourism market compared with traditional accommodation providers, creating richer media to engage with the general public is critical. Personal, emotionally demanding tasks, such as those revealed in the stories of Airbnb hosts, are said to have a high degree of equivocality, making them more suitable for richer media (Daft & Lengel, 1986; Dennis & Kinney, 1998; Walther & Parks, 2002). Conversely, relatively lean media, such as text-based messages, are not well suited to efficiently conveying complex emotional issues (Walther & Parks, 2002). Furthermore, when nonverbal cues are absent, the ability to change the tone of a message, express one’s uniqueness, or demonstrate dominance or charisma is diminished (Kiesler, 1986; Walther & Parks, 2002). As such, communication effectiveness depends on the combination of media types, rather than on a single media type (Walther & Parks, 2002). Thus, combining emoji with text is expected to be more effective than using text alone.

The media richness theory is relevant to the study of emoji in this research (Moussa, 2019). First, emoji improve feedback by offering stronger nonverbal cues that can be understood more quickly and efficiently than text alone. Second, emoji increase the ability of a text message to transmit multiple cues, thanks to their 3,633 unique symbols (Unicode, 2021), making it easier to convey and decipher meanings. Third, emoji increase language diversity and enable the conveyance of broader thoughts and ideas, since they incorporate facial expressions, gestures, symbols, and physical items (Bai et al., 2019; Novak et al., 2015). Finally, emoji strengthen personal focus by enabling more nuanced expressions that facilitate a better understanding of the feelings and emotions of the senders (Moussa, 2019).

Emoji and Text

The rapid development of digital communication has seen the increasing use of nonverbal information about “the way something is said” and “what is being said” (Luangrath et al., 2017). Various nonverbal communication elements, such as symbols, images, demarcations, or any combination of these, have been conceptualized as textual paralanguage, that is, “written manifestations of nonverbal audible, tactile, and visual elements” (Luangrath et al., 2017, p. 98). Textual paralanguage is often embedded in the verbal message, and adds contextual information, laden with emotion and meaning (Luangrath et al., 2017). Among textual paralanguage, emoji is the most popular paralinguistic element (Tang & Hew, 2019). Researchers thus far have examined the development and categories (Novak et al., 2015), the emotional and semantic functions (Danesi, 2016), the drivers, motivations and diverse use (Prada et al., 2018), and the impacts of emoji on consumer responses (e.g., G. H. Huang et al., 2020; Valenzuela-Gálvez et al., 2023). These studies, however, have overlooked an important fact in brand digital communication—that emoji and texts often exist together (Wang et al., 2023).

While emoji can occasionally be used to merely replace words in text, they are more frequently used to add new information (Na’aman et al., 2017). Emoji can also play a supplementary role in clarifying the intended meaning of ambiguous content (Riordan, 2017), or in providing emotions to content (Shiha & Ayvaz, 2017). Emoji also represent a new modality, unique in their text’s emotional and semantic structure (Cappallo et al., 2019). Thus, it is crucial to examine the relationship between emoji and text (Cappallo et al., 2019; Wu et al., 2018), which would provide a basis for measuring the communication effectiveness of brand self-presentation on social media.

The limited research on text and emoji to date covers two main areas: (1) their impact on consumer behavior and (2) the prediction of emoji from text (see Table 1). The first stream identifies the interactive roles of text and emoji in generating user engagement (e.g., McShane et al., 2021), purchase intention (e.g., Manganari & Dimara, 2017), and review helpfulness (e.g., G. H. Huang et al., 2020). While the positive impact of emoji on consumer behavior is widely acknowledged, extant literature has largely focused on whether facial emoji are present in messages, overlooking the impacts of emoji type on digital branding (Cappallo et al., 2019), with the exception of Wang et al. (2023). While Wang et al. (2023) briefly touched on types of emoji, their study divided emoji into only two types, overlooking the diversity of emoji. More importantly, their analysis is limited to product-related textual content (esthetic experience and promotion), which cannot fully reveal the relationship between various emoji and brand-generated content. This study extends Wang et al. (2023) by empirically identifying the patterns of text-emoji combinations and investigating the effects of these patterns on user engagement using naturalistic data.

Table 1.

Representative Studies Combining Both Text and Emoji.

Areas	Studies	Topics	Context	Methods	Variables related to emoji	Variables related to text	Emoji used
	Current Study	Co-occurring patterns of text and emoji and their performance	Social media communication	Text mining, itemset mining, and ANOVA	Frequent/Rare topic-emoji patterns	Topics aggregated from text	Emoji used by peer-to-peer accommodation brands
Impacts on consumer behavior	Wang et al. (2023)	Impacts on user engagement	Social media communication	Empirical study and online experiment	Emoji type (emotional vs. semantic)	Content type (esthetic experience vs. promotion)	Emotional and semantic emoji
	McShane et al. (2021)	Impacts on user engagement	Social media communication	Lab experiment and empirical study	Emoji use (yes vs no)	Emoji–text interplay condition (yes vs. no); Emoji-text relatedness (high vs. low)	Emoji used by the celebrity and corporate brands
	G. H. Huang et al. (2020)	Impacts on review helpfulness	Online consumer reviews	Lab experiment and empirical study	Emoji use (yes vs. no)	Review valence (positive vs. negative)	Face-like emoji
	Manganari and Dimara (2017)	Impacts on attitude to the hotel and booking intention	Online consumer reviews	Experiment	Emoji use (yes vs. no)	Review valence (positive vs. negative)	Facial emoji
Relationship between emoji and text	Peng and Zhao (2021)	Multi-emoji prediction from text	Social media individual posts	Seq2Emoji model	Focus: The emotional correlation between emoji and text		32 emoji with emotional significance
	Liebeskind and Liebeskind (2019)	Single emoji prediction from text	Social media political comments	Character n-grams representations	Focus: Study emoji prediction based on different languages		The 20 most frequent emoji
	Wu et al. (2018)	Multi-emoji prediction from text	Social media individual posts	Hierarchical neural model	Focus: Build high quality sentence representations by highlighting important contexts		The top 30 frequent emoji
	Barbieri et al. (2016)	Single emoji prediction from text	Social media individual posts	Vector space skip-gram model	Focus: The semantic relation between words and emoji		The 100/300 most frequent emoji

The second area explores the relationship between emoji and text, focusing on predicting emoji from text using machine learning approaches (e.g., Barbieri et al., 2016). However, these approaches all computed embeddings of emoji and words in a semantic space to generate an accurate model, rather than to provide explanatory strategies for brands (Fournier-Viger et al., 2017). This stream of research overlooked the fact that emoji can also predict text. In fact, it is well documented that visual elements are superior to text alone, as they require less cognitive effort to understand and draw inferences from (Brubaker & Wilson, 2018). Thus, emoji, visual pictographs that are frequently displayed in colorful forms and used inline in text (Das et al., 2019; Ge & Gretzel, 2018; Rodríguez-Hidalgo et al., 2017), can be used to predict specific accompanying content, based on their original meanings and portrayed emotions.

In tourism, research on emoji is relatively recent, and the majority of studies (e.g., Basoda et al., 2022) have focused on whether emoji is present in social media posts, overlooking the relevance of the accompanying text. To fully examine the relationship between emoji and text and their joint effects, one promising approach would be to first discover meaningful and valuable frequent/rare patterns through co-occurrence patterns and itemset mining. Based on these patterns, the effectiveness of topic-emoji combinations on user engagement could then be assessed.

Social Media Engagement in Tourism

Social media engagement, often operationalized as users’ liking, commenting, and sharing behavior, has increasingly been used in tourism and hospitality (So et al., 2016), as engagement can lead to positive outcomes such as brand awareness, brand trust, and word-of-mouth (Cvijikj & Michahelles, 2013). As such, a growing body of tourism literature has started to examine what social media content drives user engagement, and how (Suh et al., 2021). Two main factors have been frequently identified in the extant tourism literature: (1) text characteristics, such as vividness and interactivity (e.g., de Vries et al., 2012); and (2) text topics (Yang et al., 2022). While these studies have provided insights into the factors driving social media engagement, they focus on “text” only, failing to consider the visual content (e.g., emoji) that is often embedded in social media content (Wong et al., 2023). Thus, opportunities arise to critically examine the joint effects of emoji and text on user engagement on social media, so that the social media performance of tourism brands can be measured more accurately.

Research Design

This research examined the relationship between emoji and text, and their joint effects on user engagement, in peer-to-peer accommodation brand-generated social media content. Three stages were involved: Stage 1 used text mining to extract topics that co-occurred with emoji, using word co-occurrence networks and text semantic cluster analysis; Stage 2 used itemset mining to identify frequent and rare topic-emoji patterns; and Stage 3 examined the effects of topic-emoji patterns on user engagement. Stages 1 and 2 were critical, as they addressed the first research aim, that is, to identify the co-occurrence patterns of text and emoji. In Stage 1, a co-occurrence semantic network allowed the authors to identify which words were the most commonly used, how they were connected, and how the different communities were clustered (Segev, 2022). These clusters could be analyzed further, using real Twitter data, to gain a deeper understanding of the underlying topics in the texts. Stage 2 constructed a transactional database to extract frequent and rare co-occurring patterns of topics and emoji through frequent/rare itemset mining, a data mining technique that seeks to derive itemsets that occur together frequently (or rarely) (Luna et al., 2019). Stage 3 addressed the second aim by examining the effects of topic-emoji patterns on user engagement, using one-way ANOVA. Figure 1 outlines the multi-stage research process. Twitter was chosen as the data source because it is one of the most widely used global social media marketing platforms (Jin & Cheng, 2020; Wang et al., 2021), and because it enables brands to be innovative with their generic strategies within tweets of limited length (Juntunen et al., 2020). Barnes et al. (2020) found that 96% of Fortune 500 companies had active corporate Twitter accounts in 2019.

Figure 1.

Multi-stage research process.

Stage 1: Topics Extraction From Text Mining

Data Selection

A total of 62 brands that had been extensively endorsed by the literature and industry reports as typical, global peer-to-peer accommodation brands were first selected (Chan, 2018; Keshen, 2019; Meeroona, 2019; World Bank Group, 2018). This list was then narrowed down according to the following criteria: (1) the provision of peer-to-peer accommodation was required to be the brand’s main business, not just one component; (2) the accommodation exchange could not be reciprocal or free; (3) the operation was required to have continued until the end of 2019; and (4) the brand was required to have a unique official Twitter account in English. To ensure an adequate sample size, only accounts in which tweets containing emoji accounted for at least 7.0% of all tweets posted from 2015 to 2019 were selected. As a result, seven brands were retained for further analysis. A brief description of each brand is outlined in Table 2.

Table 2.

Profile of the Peer-to-Peer Accommodation Brands.

Brand	Founded	Listings	Geographical spread	Products	Tweets	Tweets with emoji
Airbnb	2008	5,000,000	191 countries	Unique homes, vacation homes, bed and breakfasts, and boutique hotels	1,846	246
HomeAway	2005	2,000,000	190 countries	Cabins, condos, castles, villas, barns, and farmhouses	5,249	596
Vrbo	1995	2,000,000	190 countries	Vacation rental homes, condos, villas, apartments, and beach houses	905	305
Clickstay	2003	94,300	80 countries	Villas, apartments, cottages and other holiday lettings	5,440	1,584
Stayz	2011	53,428	Australia	Beach houses, cottages, and apartments	694	374
onefinestay	2009	5,000	Europe, North America, Caribbean, Central America, Australia	Homes and villas	1,409	164
GowithOh	1997	2,000	Europe	Tourist apartments for families and luxury apartments or the ideal ones for an escapade	3,276	231

For each of these brands, a Python-based web crawler was used to collect all the tweets posted from January 1, 2015, to December 31, 2019. A total of 18,819 tweets were collected in this stage, of which the 3,500 tweets containing at least one emoji were used for further analysis. Furthermore, the numbers of likes, shares, and comments for each tweet were gathered.
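
The selection threshold and the reported totals can be checked against Table 2 with a few lines of arithmetic. The following is a minimal sketch rather than the authors' code; the names `TABLE_2`, `MIN_EMOJI_SHARE`, and `emoji_share` are illustrative, and the counts are copied from Table 2.

```python
# Tweet counts per brand, copied from Table 2: (all tweets, tweets with emoji).
TABLE_2 = {
    "Airbnb": (1846, 246),
    "HomeAway": (5249, 596),
    "Vrbo": (905, 305),
    "Clickstay": (5440, 1584),
    "Stayz": (694, 374),
    "onefinestay": (1409, 164),
    "GowithOh": (3276, 231),
}

MIN_EMOJI_SHARE = 0.07  # selection threshold stated in the text (7.0%)


def emoji_share(total: int, with_emoji: int) -> float:
    """Proportion of a brand's tweets that contain at least one emoji."""
    return with_emoji / total


total_tweets = sum(total for total, _ in TABLE_2.values())
emoji_tweets = sum(with_emoji for _, with_emoji in TABLE_2.values())
retained = [
    brand
    for brand, (total, with_emoji) in TABLE_2.items()
    if emoji_share(total, with_emoji) >= MIN_EMOJI_SHARE
]

print(total_tweets, emoji_tweets)  # 18819 3500
print(len(retained))               # 7
```

The brand closest to the threshold is GowithOh, with 231 of 3,276 tweets (about 7.05%) containing emoji.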

Data Preprocessing

Each tweet that included emoji was pre-processed as follows: (1) the symbol “@,” URL links, and all non-letter characters (including numbers and punctuation) were excluded; (2) regular, negative, and other unique spoken contractions were expanded, for example, “isn’t” into “is not”; (3) texts were set to lowercase; (4) tokenization was undertaken (i.e., dividing each tweet into words); (5) part-of-speech tagging was carried out (i.e., assigning a part of speech to each word, based on its grammatical category in a given sentence); (6) lemmatization was undertaken (i.e., the conversion of words to their base form, such as “ate” and “eats” into “eat”); and (7) “stop” words, such as “a” and “the,” that carry little or no meaning, were removed.
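
The preprocessing steps can be illustrated with a short standard-library sketch. It is not the authors' pipeline: the contraction, stop-word, and lemma dictionaries below are toy stand-ins, the function name `preprocess` is hypothetical, and part-of-speech tagging (step 5) is omitted because it requires a trained tagger. The sketch also assumes that contractions are expanded before punctuation is stripped, since the apostrophe would otherwise be lost.

```python
import re

# Toy resources, assumed for illustration only; a real pipeline would use a
# full contraction list, a full stop-word list, and a POS-aware lemmatizer.
CONTRACTIONS = {"isn't": "is not", "don't": "do not", "it's": "it is"}
STOP_WORDS = {"a", "an", "the", "is", "it", "this", "in", "at", "and", "to"}
LEMMAS = {"ate": "eat", "eats": "eat", "villas": "villa", "views": "view"}


def preprocess(tweet: str) -> list:
    """Return the cleaned, lemmatized tokens of one tweet."""
    text = tweet.lower().replace("\u2019", "'")        # lowercase, normalize apostrophes
    text = re.sub(r"https?://\S+", " ", text)          # drop URL links
    for short, full in CONTRACTIONS.items():           # expand contractions
        text = text.replace(short, full)
    text = re.sub(r"[^a-z\s]", " ", text)              # drop "@", digits, punctuation, emoji
    tokens = text.split()                              # tokenization
    tokens = [LEMMAS.get(tok, tok) for tok in tokens]  # dictionary-based lemmatization
    return [tok for tok in tokens if tok not in STOP_WORDS]  # stop-word removal


example = "This villa isn't far: 3 views at https://t.co/abc @host"
print(preprocess(example))  # ['villa', 'not', 'far', 'view', 'host']
```

Because emoji are removed from the text in this sketch, they would need to be extracted from the raw tweet beforehand if they are to be analyzed alongside the topics.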

Word Co-occurrence Network

After preprocessing the data, a complete list of all the words was extracted from the text to develop a word co-occurrence network. This network highlights the most commonly used words, how they are connected, and the different clustered communities (Khokhar, 2015). In addition, a list of all bigrams (two words appearing together in a tweet) was generated, and their frequencies were calculated. Next, the network data was visualized using Gephi, a visualization and exploration software package. The word co-occurrence network consisted of a set of nodes and edges. “Nodes” refer to the words extracted from the text, with their size set by betweenness centrality scores (Brandes, 2001); “edges” represent connections between words, with their strength determined by the frequency of the corresponding bigrams. In addition, Force Atlas (a continuous graph layout algorithm) was run to spatialize the word co-occurrence network (Khokhar, 2015).
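
The construction of a weighted edge list from bigram frequencies can be sketched as follows. This is an illustration under stated assumptions: a bigram is taken here to be an unordered pair of adjacent tokens in a preprocessed tweet, and the function names `bigram_counts` and `to_edge_list` and the sample tweets are hypothetical. Node sizing by betweenness centrality and the layout step are left to the graph tool.

```python
from collections import Counter


def bigram_counts(tokenized_tweets):
    """Count unordered pairs of adjacent tokens across all tweets."""
    counts = Counter()
    for tokens in tokenized_tweets:
        for left, right in zip(tokens, tokens[1:]):
            if left != right:  # ignore self-loops
                counts[tuple(sorted((left, right)))] += 1
    return counts


def to_edge_list(counts, min_weight=1):
    """Turn bigram counts into (source, target, weight) rows, heaviest first."""
    edges = [(a, b, w) for (a, b), w in counts.items() if w >= min_weight]
    return sorted(edges, key=lambda edge: (-edge[2], edge[0], edge[1]))


# Hypothetical preprocessed tweets.
tweets = [
    ["private", "pool", "villa"],
    ["beach", "villa", "private", "pool"],
    ["hot", "tub", "villa"],
]
edges = to_edge_list(bigram_counts(tweets))
print(edges[0])    # ('pool', 'private', 2)
print(len(edges))  # 6
```

Each row of the resulting list corresponds to one edge of the network, with the bigram frequency as its weight.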

Text Semantic Cluster Analysis

After transforming the text into a word co-occurrence network, the Louvain method was applied to partition the network into different clusters (Blondel et al., 2008). The modularity of a network is an indicator of its community structure, describing how easily it can be decomposed into communities (Newman, 2006). Higher modularity means strong connections within the same community, whereas lower modularity means that connections are spread among different communities.

Modularity is described as the proportion of “edges that fall into a given community” to the “total number of edges that can exist in those communities.” Mathematically, modularity Q is given by Equation 1:

$$Q = \frac{1}{\sum_{i j} A_{i j}} \sum_{i j} \left[ A_{i j} - \frac{\sum_{j} A_{i j} \sum_{i} A_{i j}}{\sum_{i j} A_{i j}} \right] \delta(c_{i}, c_{j}) \qquad (1)$$

where $A_{ij}$ denotes the weight of the edge between vertices i and j, $\sum_{j} A_{ij}$ represents the total weight of the edges attached to vertex i (and $\sum_{i} A_{ij}$ likewise for vertex j), and $c_{i}$ is the community to which vertex i is assigned. The function δ(u, v) returns 1 if u = v, and 0 otherwise.
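
To make Equation 1 concrete, the sketch below evaluates Q term by term for a small toy network. It is an illustration only, not the Louvain method itself (which searches for the partition that maximizes Q); the function names and the toy graph of two triangles joined by one edge are assumptions introduced here.

```python
def modularity(adjacency, community):
    """Modularity Q of a partition, computed term by term from Equation 1.

    adjacency[i][j] is the weight A_ij of the edge between i and j
    (symmetric, so each undirected edge is stored in both directions);
    community[i] is the community label c_i of vertex i.
    """
    # Total weight of the edges attached to each vertex (sum_j A_ij).
    strength = {i: sum(nbrs.values()) for i, nbrs in adjacency.items()}
    total = sum(strength.values())  # sum_ij A_ij
    q = 0.0
    for i in adjacency:
        for j in adjacency:
            if community[i] != community[j]:  # delta(c_i, c_j) = 0
                continue
            a_ij = adjacency[i].get(j, 0.0)
            q += a_ij - strength[i] * strength[j] / total
    return q / total


def build_adjacency(edges):
    """Symmetric adjacency dict from (u, v, weight) rows."""
    adjacency = {}
    for u, v, w in edges:
        adjacency.setdefault(u, {})[v] = w
        adjacency.setdefault(v, {})[u] = w
    return adjacency


# Toy network (hypothetical): two triangles joined by a single bridge edge.
toy_edges = [("a", "b", 1), ("b", "c", 1), ("a", "c", 1),
             ("d", "e", 1), ("e", "f", 1), ("d", "f", 1),
             ("c", "d", 1)]
A = build_adjacency(toy_edges)
split = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
merged = {vertex: 0 for vertex in A}

print(round(modularity(A, split), 4))    # 0.3571
print(abs(modularity(A, merged)) < 1e-9)  # True
```

Splitting the toy network into its two triangles gives Q = 5/14, whereas placing every vertex in a single community gives Q = 0, which matches the interpretation that higher modularity reflects denser connections within communities.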

Text Analysis Results

Descriptive Statistics

A total of 5,395 types of words and 26,240 types of bigrams were identified for analysis. Figure 2 shows the frequency distribution of the number of occurrences of words. Around 93.79% of the words appeared fewer than 20 times, and the mean frequency of each word was 6.48, with a standard deviation of 21.13. The tweets contained 592 different types of emoji, with an average of 2.12 emoji per tweet (see Figure 3). The 15 most frequent words and bigrams were extracted from the dataset (see Table 3).

Figure 2.

Frequency distribution of the number of occurrences of words.

Figure 3.

Frequency distribution of the number of occurrences of emoji.

Table 3.

Word/Bigrams Frequency Lists (Top 15).

No.	Words	Freq.	No.	Bigrams	Freq.
1	Villa	595	1	£, pw	217
2	Holiday	413	2	Private, pool	99
3	£	379	3	Bed, villa	70
4	Beach	350	4	Beach, villa	56
5	Home	293	5	Click, link	49
6	Get	269	6	Chance, win	39
7	View	262	7	Holiday, home	37
8	Pool	256	8	Beach, house	32
9	Family	236	9	Half, term	31
10	Stay	228	10	Book, cost	31
11	pw	219	11	Link, profile	30
12	Look	191	12	Hot, tub	29
13	Summer	180	13	Near, beach	29
14	Take	178	14	Win, £	29
15	Private	169	15	Take, look	28

Topics Co-occurring With Emoji

After data cleaning, the list of words and bigrams was loaded into Gephi to visualize the word co-occurrence network. Words appearing fewer than 20 times were dropped, to allow the micro-topics of the text to be distinguished (see J. Huang, 2017; Y. Zhu et al., 2019). After this, the word co-occurrence network consisted of 335 nodes and 5,774 edges. As can be seen in Figure 4, four communities were identified, using the “Force Atlas” algorithm for layout and the Louvain method for community detection. Each community was composed of strongly related keywords, representing complex specific semantic concepts, distinct from other communities (Drieger, 2013; J. Huang, 2017).

Figure 4.

Word co-occurrence network.

Table 4 displays the four topics: Accommodation Tour (AT), Travel Tips and Inspiration (TI), Interaction and Motivation (IM), and Advertising and Promotion (AP). The first topic, “AT,” describes the physical property and facilities, views, location, space, and listing reviews. Much like a real host, the peer-to-peer accommodation brand proudly introduces its listings to its guests. Words commonly used to describe the properties include “beach,” “view,” “pool,” “bed,” and “beautiful.” The second topic, “TI,” is mainly related to travel tips and advice, especially destination recommendations. In addition, the content inspires travel plans and suggests activities. The value of this type of content lies in satisfying a consumer’s wish for escape from reality, hedonism, esthetic enjoyment, and emotional release (Creevey et al., 2019). The third topic, “IM,” consists mainly of words relating to social interaction and future rewards. Social interaction tends to involve relational conversations: posing questions, providing quizzes, eliciting votes, asking for choices and shares, and expressing gratitude, congratulations, and blessings. The promise of future rewards can be especially useful as a driver of engagement, with some interesting activities encouraged, such as looking for footprints in a given photo. The last topic, “AP,” mainly concerns advertising, prices, and exclusive deals, often containing indicative terms such as “£ pw” and “still available.”

Table 4.

Topics Extraction From Text Semantic Cluster.

No.	Topics	Keywords
1	Accommodation Tour (AT)	Villa, beach, view, pool, private, bed, best, house, beautiful, apartment, great, new, nsw, click, check, sleep, cabin, right, bedroom, gorgeous, link, enjoy, come, near, south, say, luxury, sea, lovely, tour, property, portugal, florida, island, cyprus, hot, beauty, offer, coast, city, boast, stunning, cottage, share, think, star, park, london, rest, mountain, country, retreat, modern, live, call, ocean, relax, garden, bay, lake, location, qld, amazing, tenerife, town, spacious, game, profile, tub, greece, favourite, large, la, even, back, area, sunny, thailand, wanderlust, people, locate, photo, bathroom, wale, fantastic, inspire, barcelona, traveltuesday, water, walk, luxurious, imagine, discover, welcome, sunshine, costa, terrace, hill, pick, golf, kitchen, swim, quiet, full, close, base, charm, soak, magical, paradise, majorca, review, outside, overlook, deck, rustic, spanish, wanderlustwednesday, charming, green, fabulous, eye, resort, wait, state
2	Travel Tips and Inspiration (TI)	Holiday, home, family, look, summer, take, see, perfect, travel, one, make, like, vacation, love, happy, homeaway, know, would, place, good, next, find, trip, want, year, rental, airbnb, sun, well, big, let, top, dream, destination, way, explore, spot, france, use, around, world, vic, winter, italy, friendly, room, thing, set, experience, via, life, kid, fun, adventure, whole, malta, ski, every, ever, celebrate, never, feel, reason, host, space, leave, favorite, vacay, list, pet, easy, group, choose, snow, season, tip, really, watch, road, nothing, keep, paris, feature, austin, budget, read, enough, hike, wish, style, bring, ultimate, much, design, round, without, wonderful, meet, sure, morning, mean, sxsw, magazine, tell, real, disney, guest, local, quiz, outdoor, guide
3	Interaction and Motivation (IM)	Stay, time, getaway, win, need, day, could, week, away, weekend, enter, friend, christmas, night, break, head, chance, spend, escape, rt, tag, spring, voucher, thanks, competition, fall, instagram, give, giveaway, vote, follow, everyone, vrbo, late, today, two, everything, long, first, include, romantic, heart, step, clickstayphotoes, grab, fam, forget, palm, peaceful, winner
4	Advertising and Promotion (AP)	£, get, pw, book, go, spain, plan, start, little, august, familytravel, visit, clickstay, minute, last, may, bargain, dealoftheday, algarve, turkey, ready, still, available, cost, half, croatia, july, help, term, price, discount, september, availability, fridayfeeling, snap, fancy, mondaymotivation, booking, october, st, open, fire, yes, try, please, must, california, cover

Stage 2: Frequent/Rare Topic-Emoji Pattern Mining

Construction of the Transaction Data

In this stage, a transaction database was constructed for itemset mining, based on a set of tweets, with each tweet being a set of topics and emoji. That is, brand-generated tweets served as the units for the subsequent analysis. One problem was that the topics extracted from the text semantic clusters were topics of the entire corpus, not the corresponding topics of each tweet. Thus, topics had to be assigned to each tweet. Based on the indicative words clustered under each topic (as identified in the previous steps), tweets were coded manually under the different topic categories by one of the authors and an independent coder, who had previously received a coding worksheet and examples of each topic. After the two coders had independently coded all the tweets, any coding differences were resolved through discussion with another author.

The next step was to convert each tweet, containing a mix of topics and emoji, into a format suitable for analysis as transaction data. Let T = {t₁, t₂, . . ., t_m₁} be a set of m₁ distinct items, each representing a topic. Similarly, let E = {e₁, e₂, . . ., e_m₂} be a set of m₂ distinct items, each representing an emoji. A transaction in this stage was defined as a list of items representing topics and emoji, $\{\hat{t}_{1}, \hat{t}_{2}, \dots, \hat{e}_{1}, \hat{e}_{2}, \dots\}$, where each $\hat{t}_{i}$ could be any item in $T$ and each $\hat{e}_{i}$ could be any item in $E$. It was important to note that an item representing an emoji was not allowed to appear more than once in a single transaction: when the same emoji appeared repeatedly in a tweet, it was coded only once in the transaction.
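
The conversion of a coded tweet into a transaction can be sketched as follows. This is a minimal illustration, assuming that the topic codes and the emoji of each tweet are already available as lists; the function name `to_transaction` and the sample tweet are hypothetical.

```python
TOPICS = {"AT", "TI", "IM", "AP"}  # topic codes from Table 4

PALM_TREE = "\U0001F334"  # palm tree emoji
SUN = "\u2600"            # sun emoji


def to_transaction(topics, emoji):
    """Build one transaction from a tweet's coded topics and its emoji.

    Using a set means an emoji repeated within a tweet is kept only once.
    """
    unknown = set(topics) - TOPICS
    if unknown:
        raise ValueError(f"unknown topic codes: {sorted(unknown)}")
    return frozenset(topics) | frozenset(emoji)


# Hypothetical coded tweet: two topics, with the palm tree emoji used twice.
transaction = to_transaction(["TI", "IM"], [PALM_TREE, PALM_TREE, SUN])
print(len(transaction))  # 4
```

The repeated palm tree emoji contributes a single item, so the transaction holds two topics and two emoji.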

Frequent/Rare Itemset Mining

The standard settings of itemset mining were defined as follows. Let I = {i₁, i₂, . . ., i_m} be a set of m distinct items (symbols). Let D = {TID₁, TID₂, . . ., TID_n} be a set of n transactions, called a transaction database, where each transaction TID_i ⊆ I (1 ≤ i ≤ n) is a set of distinct items. An itemset containing k items is called a k-itemset; thus, |TID_i| = k denotes that TID_i contains k items. Similarly, |D| denotes the number of transactions in the database. Taking the transaction database in Table 5 as an illustration, there were five transactions (TID₁, TID₂, . . ., TID₅) and five items (t₁, t₂, e₁, e₂, e₃). For example, the first transaction was a 3-itemset, consisting of the items t₁, t₂, and e₂.

Table 5.

A Transaction Database.

ID	Transactions
TID₁	{t₁, t₂, e₂}
TID₂	{t₁, e₁}
TID₃	{t₁, t₂, e₁, e₃}
TID₄	{t₂, e₁, e₃}
TID₅	{t₁, t₂, e₁, e₃}

In general, various indicators can be used to evaluate how interesting an itemset is (Fournier-Viger et al., 2017), but “support” is the basic measure of the “interestingness” of patterns (Szathmary et al., 2007). The “support” of an itemset is expressed as either “absolute support” or “relative support.” Let Sup(X) = |{T | X ⊆ T ∧ T ∈ D}| denote the absolute support of itemset X, that is, the number of transactions that contain X. Let relSup(X) = Sup(X)/|D| denote the relative support of X, that is, the proportion of transactions in the database that contain X. For example, the itemset {t₁, e₁} appears in three of the five transactions in Table 5 (TID₂, TID₃, and TID₅), so its “absolute support” is 3 and its “relative support” is 3/5 = 0.60.

The aim of itemset mining is to extract useful patterns from a database. A frequent itemset has a “support” equal to or greater than a minimum support specified by the user. Let minrelSup denote the given minimum “relative support” threshold. If relSup(X) ≥ minrelSup, then itemset X is frequent; otherwise, it is rare. A frequent itemset is usually regarded as a “regularity” in the data, whereas a rare itemset corresponds to an “exception”; a minimal rare itemset is a rare itemset whose proper subsets are all frequent (Szathmary et al., 2007). Intuitively, an itemset with higher “support” is better. However, a rare itemset can also convey information relevant to content marketing managers. Thus, both frequent and rare itemsets were relevant for this study. The authors adopted “negFIN,” an efficient algorithm for fast mining of frequent itemsets (Aryabarzan et al., 2018), and “AprioriRare,” an algorithm for mining minimal rare itemsets (Szathmary et al., 2007). Applying the two algorithms to the transaction database in Table 5 with minrelSup = 0.6 yields a range of patterns; those that contain both a topic and an emoji are shown in Table 6.
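The threshold rule can also be stated compactly. The set names $\mathcal{F}$, $\mathcal{R}$, and $\mathcal{M}$ are introduced here for illustration only and do not appear in the cited sources; the minimality condition restates the requirement that all subsets of a reported rare itemset be frequent.

```latex
\begin{align*}
\mathcal{F} &= \{\, X \subseteq I : \mathrm{relSup}(X) \ge \mathit{minrelSup} \,\}
  && \text{(frequent itemsets)} \\
\mathcal{R} &= \{\, X \subseteq I : \mathrm{relSup}(X) < \mathit{minrelSup} \,\}
  && \text{(rare itemsets)} \\
\mathcal{M} &= \{\, X \in \mathcal{R} : \forall\, Y \subsetneq X,\ Y \in \mathcal{F} \,\}
  && \text{(minimal rare itemsets)}
\end{align*}
```

For instance, with the Table 5 database and minrelSup = 0.6, the itemset {t₁, e₃} has a relative support of 0.4, while {t₁} and {e₃} have relative supports of 0.8 and 0.6, so {t₁, e₃} belongs to $\mathcal{M}$.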

Table 6.

Frequent/Rare Patterns.

Type	ID	Patterns	Sup	relSup
Frequent	f₁	{t₁, e₁}	3	0.6
Frequent	f₂	{t₂, e₁, e₃}	3	0.6
Frequent	f₃	{t₂, e₁}	3	0.6
Frequent	f₄	{t₂, e₃}	3	0.6
Rare	r₁	{t₁, e₃}	2	0.4
Rare	r₂	{t₁, t₂, e₁}	2	0.4
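
The patterns in Table 6 can be checked with a brute-force enumeration of the Table 5 database. The sketch below is not negFIN or AprioriRare; it simply tests every candidate itemset, which is feasible only for a toy example of this size. Items are written as plain strings (`t1`, `e1`, and so on), and the helper names `sup`, `mine`, and `is_mixed` are illustrative assumptions.

```python
from itertools import combinations

# Table 5 transaction database.
DATABASE = [
    frozenset({"t1", "t2", "e2"}),
    frozenset({"t1", "e1"}),
    frozenset({"t1", "t2", "e1", "e3"}),
    frozenset({"t2", "e1", "e3"}),
    frozenset({"t1", "t2", "e1", "e3"}),
]
TOPICS = {"t1", "t2"}
EMOJI = {"e1", "e2", "e3"}


def sup(itemset, database):
    """Absolute support: number of transactions containing the whole itemset."""
    return sum(1 for transaction in database if itemset <= transaction)


def mine(database, min_rel_sup):
    """Return (frequent, minimal_rare), each mapping itemset -> absolute support."""
    items = sorted(set().union(*database))
    n = len(database)
    frequent, minimal_rare = {}, {}
    # Sizes are visited in increasing order, so all smaller frequent
    # itemsets are already known when a candidate is examined.
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            itemset = frozenset(combo)
            support = sup(itemset, database)
            if support / n >= min_rel_sup:
                frequent[itemset] = support
            elif all(frozenset(sub) in frequent
                     for sub in combinations(combo, size - 1) if sub):
                # Rare, and every (size - 1)-subset is frequent: minimal rare.
                minimal_rare[itemset] = support
    return frequent, minimal_rare


def is_mixed(itemset):
    """True if the itemset contains at least one topic and one emoji."""
    return bool(itemset & TOPICS) and bool(itemset & EMOJI)


frequent, minimal_rare = mine(DATABASE, min_rel_sup=0.6)
for label, patterns in (("Frequent", frequent), ("Rare", minimal_rare)):
    for itemset, support in patterns.items():
        if is_mixed(itemset):
            print(label, sorted(itemset), support, support / len(DATABASE))
```

Under these assumptions, the sketch returns the four frequent patterns f₁ to f₄ and the two rare patterns r₁ and r₂ of Table 6. It also finds the singleton {e₂} as a minimal rare itemset, which is not a topic-emoji combination.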

Topic-Emoji Pattern Mining Results

Transaction Data Descriptive Statistics

Topics extracted from the text semantic clusters and emoji were matched to their corresponding individual tweets. Because identifying the emoji embedded in tweets required no coding decisions, and because the long-tailed distribution of emoji would have complicated later analysis, the sample of tweets that included emoji was further reduced from 3,500 to 1,924. These 1,924 tweets, which contained the 13 most frequently used emoji, were chosen because the 13th most frequent emoji marked the last inflection point in the emoji frequency curve. Following this, Cohen’s kappa was used to assess whether the two coders agreed on the topics of the 1,924 tweets. The Cohen’s kappa coefficient for this study was .702, indicating a substantial degree of agreement (Landis & Koch, 1977).
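For reference, Cohen's kappa compares the observed agreement between two coders with the agreement expected by chance. The sketch below assumes, for simplicity, that each coder assigns exactly one topic label to each tweet; the study allowed more than one topic per tweet, so this illustrates the statistic rather than reproducing the reported value. The labels are invented.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two coders who each assign one label per item."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both coders must label the same non-empty item list")
    n = len(labels_a)
    # Observed agreement: share of items given the same label by both coders.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the coders' marginal label proportions.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    # Undefined (division by zero) if both coders use one identical label throughout.
    return (observed - expected) / (1 - expected)


# Invented codes for eight tweets (not the study's data).
coder_1 = ["AT", "AT", "TI", "IM", "AP", "AT", "TI", "IM"]
coder_2 = ["AT", "AT", "TI", "IM", "AT", "AT", "IM", "IM"]
print(round(cohens_kappa(coder_1, coder_2), 3))
```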

A total of 1,924 transactions (containing four topics and 325 emoji) were constructed for all analyses. Table 7 lists the statistics of the four identified topics and the top 13 emoji. The 1,924 transactions contained 2,215 topic occurrences in total, that is, an average of 1.15 topics per tweet. Among these topics, the majority of transactions described “AT” (53.01%), followed by “TI” and “IM” (24.22% and 21.83%, respectively). Peer-to-peer accommodation brands were less likely to post “AP” (16.06%). In terms of emoji, those most frequently used by peer-to-peer accommodation brands were “smileys” and emoticons of people, including “”, “”, and “”. Other high-frequency emoji were “”, “”, “”, “”, “”, “”, “”, “”, “”, and “”. It is worth noting that, according to the Emoji Sentiment Ranking (Novak et al., 2015), these frequently used emoji were all positive, with sentiment scores greater than 0. In addition, an emoji was, on average, used more than once within a tweet, especially “”, which was embedded four or more times.

Table 7.

Identified Topics and Emojis (Top 13).

Item	Symbol	No. of items	No. of transactions	No. of items/transactions
Accommodation Tour	AT	1,020	-	0.5301
Travel Tips and Inspiration	TI	466	-	0.2422
Interaction and Motivation	IM	420	-	0.2183
Advertising and Promotion	AP	309	-	0.1606
Total		2,215	1,924	1.1512
Smiling_face_with_heart-eyes		861	686	1.2551
White_medium_star		496	110	4.5091
Sun_with_face		388	325	1.1938
Smiling_face_with_sunglasses		234	231	1.0130
Round_pushpin		194	194	1.0000
Red_heart		155	150	1.0333
Camera		139	135	1.0296
Sparkles		138	130	1.0615
House_with_garden		136	135	1.0074
Palm_tree		131	128	1.0234
Sun		128	110	1.1636
Water_wave		109	93	1.1720
House		88	87	1.0115

Frequent/Rare Topic-Emoji Patterns

The constructed transactions at the individual tweet level were analyzed using the negFIN and AprioriRare algorithms in SPMF (Fournier-Viger et al., 2016). The minimum relative support was set to minrelSup = 0.04. The negFIN algorithm scans the data set for frequent patterns and returns those with a relSup greater than or equal to 0.04, whereas AprioriRare mines minimal rare patterns and returns those with a relSup smaller than 0.04 whose subsets are all frequent itemsets. Given that this stage focused only on the co-occurrence of topics and emoji, and that the sample size of each pattern could not be too small, only patterns that contained both topics and emoji were reported and, among the rare patterns, only those with a relSup of 0.02 to 0.04.
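The reporting rule described above can be sketched as a simple filter over mined patterns. This is an illustration under stated assumptions: the function `report` and the emoji names `emoji_a` to `emoji_c` are hypothetical, the lower bound of 0.02 is treated as inclusive, and the input is assumed to come from the frequent and minimal rare miners (the filter does not itself check minimality).

```python
TOPICS = {"AT", "TI", "IM", "AP"}


def report(patterns, n_transactions, min_rel_sup=0.04, rare_floor=0.02):
    """Split mined patterns into reportable frequent and rare topic-emoji patterns.

    patterns maps a frozenset of items to its absolute support.
    Returns two dicts mapping itemset -> relative support.
    """
    frequent, rare = {}, {}
    for itemset, support in patterns.items():
        has_topic = bool(itemset & TOPICS)
        has_emoji = bool(itemset - TOPICS)  # any non-topic item is an emoji
        if not (has_topic and has_emoji):
            continue  # topic-only or emoji-only patterns are not reported
        rel = support / n_transactions
        if rel >= min_rel_sup:
            frequent[itemset] = rel
        elif rel >= rare_floor:
            rare[itemset] = rel  # rare, but not too small a sample
    return frequent, rare


# Supports 404 and 67 match f1 and r1 in Table 8; the other two rows are invented.
patterns = {
    frozenset({"AT", "emoji_a"}): 404,
    frozenset({"AP", "emoji_b"}): 67,
    frozenset({"IM", "emoji_c"}): 30,
    frozenset({"emoji_a", "emoji_b"}): 500,
}
frequent, rare = report(patterns, n_transactions=1924)
print({tuple(sorted(k)): round(v, 4) for k, v in frequent.items()})
print({tuple(sorted(k)): round(v, 4) for k, v in rare.items()})
```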

Fourteen frequent and 13 rare patterns were selected (see Table 8). Each pattern was annotated with its “absolute support” and “relative support.” For frequent patterns, the most regular co-occurrence was found between “AT” and “”, with an absolute support of 404 (f₁), indicating that peer-to-peer accommodation brands convey enthusiastic feelings of love, infatuation, and adoration for their own listings. “” also frequently appeared together with “TI,” “IM,” and “AP” (f₂–f₄). Patterns f₅–f₉ and f₁₂ indicated that “AT” was often presented with a group of emoji, including “”, “”, “”, “”, “”, and “”. Patterns f₁₀–f₁₁ indicated that “” was commonly used in combination with “TI” and “IM.” Pattern f₁₃ demonstrated that more than one emoji (e.g., “” and “”) co-occurred with “AT,” while f₁₄ revealed that more than one topic (e.g., “AT” and “AP”) frequently co-occurred with “” in the database.

Table 8.

Frequent/Rare Patterns.

No.	Frequent patterns	Sup	relSup	No.	Rare patterns	Sup	relSup
f₁	{AT,}	404	0.2100	r₁	{AP,}	67	0.0348
f₂	{TI,}	159	0.0826	r₂	{TI,}	55	0.0286
f₃	{IM,}	118	0.0613	r₃	{IM,}	72	0.0374
f₄	{AP,}	141	0.0733	r₄	{AP,}	49	0.0255
f₅	{AT,}	122	0.0634	r₅	{IM,}	43	0.0223
f₆	{AT,}	167	0.0868	r₆	{TI,}	65	0.0338
f₇	{AT,}	97	0.0504	r₇	{IM,}	52	0.0270
f₈	{AT,}	109	0.0567	r₈	{AT,}	55	0.0286
f₉	{AT,}	88	0.0457	r₉	{AT,}	45	0.0234
f₁₀	{TI,}	91	0.0473	r₁₀	{IM,}	55	0.0286
f₁₁	{IM,}	94	0.0489	r₁₁	{AT,}	43	0.0223
f₁₂	{AT,}	88	0.0457	r₁₂	{AT,}	46	0.0239
f₁₃	{AT,, }	97	0.0504	r₁₃	{AT,}	69	0.0359
f₁₄	{AT, AP,}	90	0.0468

For rare patterns, “AP” rarely co-occurred with “”, with a relative support of 0.0348 (r₁), and “” rarely appeared together with “TI,” “IM,” or “AP” (r₂–r₄). Another exception was “IM” and “”, which occurred only infrequently (r₅). Although “” tended to express love, its co-occurrences with “TI” and “IM” were also rare (r₆–r₇). Patterns r₈–r₉ and r₁₁–r₁₃ showed that “AT” was seldom presented with the following group of emoji: “”, “”, “”, “”, and “”. Another rare pattern was “IM” with “”, with a lower relative support of 0.0286 (r₁₀).

Stage 3: Effects of Topic-Emoji Patterns on User Engagement

The aim of Stage 3 was to examine the effects of frequent/rare topic-emoji patterns on user engagement. One-way ANOVA was first conducted to determine whether there was statistical evidence that the mean user engagement of two or more independent patterns differed overall. Post-hoc multiple comparisons were then used to indicate which specific means were significantly different. The dependent variable was “user engagement,” operationalized as the number of likes, shares (retweets), and comments that consumers made on each tweet (see, e.g., Tafesse & Wien, 2018). The independent variable was the categorical pattern identified by itemset mining.

User engagement is generally measured by the number of likes, shares, and comments. It was therefore necessary to match the 27 previously identified patterns with specific tweets, converting the transaction data into a sample set in which each record was uniquely identified by its pattern. The sample set consisted of 2,563 records in two subsets: the 14 frequent patterns (1,861; 72.61%) and the 13 rare patterns (702; 27.39%). The numbers of likes, shares, and comments, which are count data, were converted to z scores prior to data analysis, and a small number of outliers were removed based on the 3σ rule. The z scores of likes, shares, and comments for the entire sample ranged from −0.0881 to 2.5930, −0.1082 to 2.9867, and −0.0777 to 2.6244, respectively. The descriptive statistics of the sample by frequent/rare pattern were as follows: the mean z scores of likes, shares, and comments were −0.0509, −0.0746, and −0.0553, respectively, for frequent patterns, and −0.0264, −0.0328, and −0.0426, respectively, for rare patterns. Notably, user engagement differed between frequent and rare patterns, with rare patterns showing higher values. This conclusion can also be drawn from Figure 5, which shows the means of likes, shares, and comments for the 27 specific patterns.
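The preprocessing described above can be sketched as follows. Several details are assumptions rather than statements about the study: the population standard deviation is used, a record is dropped if any of its three z scores exceeds 3 in absolute value, and the function names and the engagement counts are invented for illustration.

```python
from statistics import mean, pstdev

METRICS = ("likes", "shares", "comments")


def z_scores(values):
    """Standardize values to mean 0 and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]  # constant column: nothing to standardize
    return [(value - mu) / sigma for value in values]


def standardize_and_filter(records, limit=3.0):
    """Add z scores for each metric and drop records with any |z| above limit."""
    z_columns = {m: z_scores([r[m] for r in records]) for m in METRICS}
    kept = []
    for i, record in enumerate(records):
        z = {f"z_{m}": z_columns[m][i] for m in METRICS}
        if all(abs(v) <= limit for v in z.values()):
            kept.append({**record, **z})
    return kept


# Invented engagement counts: 19 ordinary tweets and one viral tweet.
records = [{"likes": 10, "shares": i % 3, "comments": i % 2} for i in range(19)]
records.append({"likes": 1000, "shares": 1, "comments": 0})

kept = standardize_and_filter(records)
print(len(records), len(kept))  # 20 19
```

In this toy sample the viral tweet has a z score of about 4.36 on likes and is therefore removed, while the other 19 records are retained.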

Figure 5.

Means plot of 14 frequent patterns and 13 rare patterns.

The differences between the frequent and rare groups were tested with Welch ANOVA. This revealed statistical evidence that the mean numbers of likes, shares, and comments differed significantly between the two groups, F(1, 968.9912) = 6.9818, p < .001; F(1, 836.1813) = 17.3080, p < .001; and F(1, 1,000.2057) = 3.8556, p < .050, respectively.
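Welch ANOVA does not assume equal group variances, which explains the fractional denominator degrees of freedom reported above. The following sketch computes the Welch F statistic and its degrees of freedom from the standard formulas; it is a minimal illustration with invented z scores, not the software or data used in the study, and the function name `welch_anova` is an assumption.

```python
from statistics import mean, variance


def welch_anova(groups):
    """Return (F, df1, df2) for Welch's one-way ANOVA.

    groups is a list of samples, each with at least two values and
    non-zero variance.
    """
    k = len(groups)
    sizes = [len(g) for g in groups]
    means = [mean(g) for g in groups]
    # Each group is weighted by n_i / s_i^2, so noisier groups count less.
    weights = [n / variance(g) for n, g in zip(sizes, groups)]
    total = sum(weights)
    grand_mean = sum(w * m for w, m in zip(weights, means)) / total
    between = sum(w * (m - grand_mean) ** 2 for w, m in zip(weights, means)) / (k - 1)
    lam = sum((1 - w / total) ** 2 / (n - 1) for w, n in zip(weights, sizes))
    f_stat = between / (1 + 2 * (k - 2) * lam / (k ** 2 - 1))
    df2 = (k ** 2 - 1) / (3 * lam)
    return f_stat, k - 1, df2


# Invented z scores for a "frequent" and a "rare" group (not the study's data).
frequent_group = [-0.08, -0.05, -0.06, -0.07, -0.04, -0.06]
rare_group = [-0.03, 0.02, -0.05, 0.10, -0.01, 0.04, -0.02]

f_stat, df1, df2 = welch_anova([frequent_group, rare_group])
print(f"F({df1}, {df2:.2f}) = {f_stat:.3f}")
```

With two groups, this statistic equals the square of Welch's t statistic. A p-value would be obtained from the upper tail of the F distribution with (df1, df2) degrees of freedom, for example with `scipy.stats.f.sf(f_stat, df1, df2)` if SciPy is available.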

Next, differences in user engagement across the 27 specific patterns were examined. The Welch ANOVA results indicated statistically significant differences in the number of likes, F(26, 675.8615) = 6.9933, p < .001; shares, F(26, 673.8236) = 5.1045, p < .001; and comments, F(26, 661.6375) = 6.2728, p < .001. The Games-Howell post-hoc test was then conducted to determine which specific pairs of means were significantly different. Only the multiple comparisons with significant differences are listed in Table 9. As indicated by c₁–c₁₉, four conclusions can be drawn. First, the same emoji embedded in different text topics generates different levels of user engagement. With regard to the impact of “” on user engagement, Table 9 shows that “TI” outperforms “AT” on shares (c₁), “AT” outperforms “AP” on comments (c₂), “TI” outperforms “AP” on likes and comments (c₆), and “IM” outperforms “AP” on likes (c₁₅). Second, even when the same topic is matched with different emoji, user engagement differs. Compared with the combination of “AT” and “”, the number of likes associated with “AT” and “” was statistically significantly greater (c₃), indicating that emotional emoji are more effective at generating “liking” behavior among consumers than semantic emoji embedded in listing descriptions with esthetic experience. Third, combinations of different emoji with different topics also generate different user engagement, as in c₇–c₁₃ and c₁₆. Finally, both frequent and rare patterns can drive user engagement. In particular, as a rare pattern, the combination of “IM” with “” generates a higher number of “shares” than the combination of “AT” with “”, as demonstrated by c₁₉.

Table 9.

Games-Howell Post Hoc Test Results.

No.	(I)	(J)	No. of likes			No. of shares			No. of comments
No.	(I)	(J)	Mean (I)	Mean (J)	△Mean	Mean (I)	Mean (J)	△Mean	Mean (I)	Mean (J)	△Mean
c₁	f₁	f₂	−0.0632	−0.0407	−0.0225	−0.0915	−0.0625	−0.0290*	−0.0607	−0.0417	−0.0190
c₂	f₁	f₄	−0.0632	−0.0735	0.0103	−0.0915	−0.0929	0.0014	−0.0607	−0.0765	0.0158*
c₃	f₁	f₉	−0.0632	−0.0791	0.0159*	−0.0915	−0.1036	0.0121	−0.0607	−0.0704	0.0097
c₄	f₁	f₁₃	−0.0632	−0.0806	0.0174*	−0.0915	−0.0957	0.0042	−0.0607	−0.0760	0.0153*
c₅	f₁	f₁₄	−0.0632	−0.0773	0.0141*	−0.0915	−0.0962	0.0047	−0.0607	−0.0777	0.0170*
c₆	f₂	f₄	−0.0407	−0.0735	0.0328*	−0.0625	−0.0929	0.0304	−0.0417	−0.0765	0.0348*
c₇	f₂	f₅	−0.0407	−0.0713	0.0306*	−0.0625	−0.0938	0.0313	−0.0417	−0.0711	0.0294
c₈	f₂	f₈	−0.0407	−0.0733	0.0326*	−0.0625	−0.0895	0.0270	−0.0417	−0.0732	0.0315*
c₉	f₂	f₉	−0.0407	−0.0791	0.0384*	−0.0625	−0.1036	0.0411*	−0.0417	−0.0704	0.0287
c₁₀	f₂	f₁₂	−0.0407	−0.0685	0.0278*	−0.0625	−0.0883	0.0258	−0.0417	−0.0596	0.0179
c₁₁	f₂	f₁₃	−0.0407	−0.0806	0.0399*	−0.0625	−0.0957	0.0332*	−0.0417	−0.0760	0.0343*
c₁₂	f₂	f₁₄	−0.0407	−0.0773	0.0366*	−0.0625	−0.0962	0.0337*	−0.0417	−0.0777	0.0360*
c₁₃	f₂	r₁	−0.0407	−0.0723	0.0316*	−0.0625	−0.0981	0.0356*	−0.0417	−0.0729	0.0312*
c₁₄	f₂	r₄	−0.0407	−0.0641	0.0234	−0.0625	−0.0862	0.0237	−0.0417	−0.0744	0.0327*
c₁₅	f₃	f₄	−0.0502	−0.0735	0.0233*	−0.0610	−0.0929	0.0319	−0.0546	−0.0765	0.0219
c₁₆	f₃	f₉	−0.0502	−0.0791	0.0289*	−0.0610	−0.1036	0.0426*	−0.0546	−0.0704	0.0158
c₁₇	f₃	f₁₃	−0.0502	−0.0806	0.0304*	−0.0610	−0.0957	0.0347	−0.0546	−0.0760	0.0214
c₁₈	f₃	f₁₄	−0.0502	−0.0773	0.0271*	−0.0610	−0.0962	0.0352	−0.0546	−0.0777	0.0231
c₁₉	f₉	r₁₀	−0.0791	−0.0278	−0.0513	−0.1036	−0.0280	−0.0756*	−0.0704	−0.0196	−0.0508

Note. Likes, shares, and comments are measured with z scores.

*The mean difference is significant at the .05 level.

Conclusion and Discussion

Through the use of text mining and itemset mining, the current study has identified both the topics that co-occur with emoji and the frequent/rare patterns in which topics and emoji are embedded in peer-to-peer accommodation brand-generated social media content. This study also investigated the impacts of topic-emoji patterns on user engagement. Specifically, four content types, “Accommodation Tour,” “Travel Tips and Inspiration,” “Interaction and Motivation,” and “Advertising and Promotion,” were extracted as the textual context for emoji. Regarding co-occurrence patterns, frequent patterns, such as “Accommodation Tour” combined with “”, were revealed as “regularities,” whereas rare patterns, such as “Interaction and Motivation” combined with “”, were revealed to be “exceptions.” Moreover, the co-occurrence of topics and emoji supports the proposition of Cappallo et al. (2019) that emoji is distinct from text. For example, the probability of “” being used to represent love is almost 60% where “Accommodation Tour” and “Advertising and Promotion” appear, whereas the combination of “” and “” has an 88.18% tendency to be linked with “Accommodation Tour.” This proposition is further supported by the impacts of different patterns on user engagement. Patterns with different emoji and the same topic differ in effectiveness. For example, the number of likes for “Accommodation Tour” combined with “” is statistically significantly greater than that for “Accommodation Tour” combined with “”. Similarly, the number of comments on “Accommodation Tour” with “” is statistically significantly greater than that on “Advertising and Promotion” with “”, which demonstrates that the same emoji embedded in different topics can generate different results. Meanwhile, the extraction of rare patterns of topics and emoji has also proved useful. The rare pattern of “Interaction and Motivation” and “” generated a higher number of shares than a frequent pattern, for example, “Accommodation Tour” and “”. This supports the argument of Szathmary et al. (2007) for the recognition of rare itemsets, in contrast with the majority of existing studies, which focus only on the mining of frequent itemsets.

Theoretical Implications

By examining the effects of visual-verbal content in social media posts, this study enriches the theoretical understanding of the media complexities relating to tourism brands in general, and peer-to-peer accommodation brands, specifically. First, this study broadens the explanatory power of media richness theory, by investigating the relationship between visual and verbal content (i.e., text and emoji) in brand-generated social media posts. Previous literature has rarely provided a full theoretical and empirical explanation of how emoji and text can work together to enhance the media richness of brand-generated social media content. Compared with prior research, which has focused on the relationship between emoji and text from a semantic perspective (e.g., Barbieri et al., 2016; Wu et al., 2018), this study has theoretically and empirically shown that communication effectiveness depends on combinations of emoji and text. This opens a wide range of opportunities for further study on the relationship between emoji and text. As shown in this study, a text explains what brands wish to say, and an emoji expresses how this is said, suggesting that the combination of the two can significantly strengthen the carriers of brand-generated social media content in conveying complex information.

Second, this study provides theoretical insights into how new or novel tourism brands can effectively leverage brand-generated social media content in establishing their online presence. As a new alternative to traditional accommodation, peer-to-peer accommodation brands need to be recognized and to achieve a unique positioning. Although some research has explored peer-to-peer accommodation brand personality (Wang et al., 2021) and advertising appeal (Liu & Mattila, 2017), extant tourism literature has not fully addressed the nuances of the digital communication strategies of peer-to-peer accommodation brands. This study provides novel stimuli for the formation of new brand connections with consumers through the co-occurrence of emoji and text. This extends previous literature on the factors that influence new brands’ strategies for penetrating a relatively well-established market by going beyond the logo design of brands (Henderson et al., 2004), media channel choice (Anselmsson & Tunca, 2019), brand name (Yorkston & Menon, 2004), and type font characteristics (Grohmann et al., 2013). This study reveals that emoji co-occurring with text can be used as a significant and easy-to-implement approach for tourism brands to stimulate user engagement.

This study also contributes to the marketing communication literature by developing a typology of the relationship between texts and emoji in tourism brand digital communication. Previous studies have mostly considered emoji to be a binary variable (e.g., the presence of emoji, informational or emotional), and its impact on consumer reactions (Ko et al., 2022). This study has investigated the effect of emoji at a more in-depth level, by identifying and empirically testing the effect of the typology of emoji and texts on user engagement. The typology identified by this research offers a holistic view of the combination of emoji and texts in tourism brand digital communication.

Finally, this study also contributes to the user engagement literature in tourism by examining the effects of visual-verbal content on different engagement metrics (i.e., the number of likes, shares, and comments). Previous studies have suggested that “like,” “share,” and “comment” require different levels of cognitive effort (Muntinga et al., 2011). Like-clicking requires the lowest cognitive effort and is regarded as passive engagement, whereas sharing and commenting require a higher level of cognitive effort and are considered to be “contributing” and “co-creating” (Kim & Yang, 2017). Given that extant studies of engagement have predominantly adopted a single measurement (the number of likes, comments, or shares), this study is innovative in using the three together and in revealing significant differences among these engagement types. For instance, the results of the current research indicated that “Travel Tips and Inspiration” combined with “” led to a higher number of likes and shares, and a relatively small number of comments.

Methodological Contributions

This study contributes to the extant tourism methodological literature by providing a broader methodological toolbox for understanding the relationship between verbal and visual content. The use of a sequential research design, from text mining to itemset mining, to investigate the joint effect of emoji and text goes beyond the conventional approach, which has tended to examine emoji and text separately (Miller et al., 2017). The challenge in examining the relationship between emoji and text is that they mutually influence each other (Tang & Hew, 2019), and scholars have therefore called for more innovative and multiple methods to fully investigate this phenomenon (McShane et al., 2021). The methodological approach of the current study directly addresses this call, and the methods presented here demonstrate the possibilities for developing innovative methodological approaches in tourism.

Practical Implications

This study offers actionable guidelines for tourism brands to increase user engagement by leveraging the typology of the relationship between emoji and text. Brands can use the findings of this study to combine emoji and text effectively when crafting brand-generated social media content. For example, to achieve a larger number of likes and shares, brands are encouraged to pair “Travel Tips and Inspiration” with “”, or “Interaction and Motivation” with “”, rather than “Accommodation Tour” with “”, as the former two combinations have a greater impact on user engagement.

This study also offers important implications for developing algorithms that optimize the design of social media platforms. As shown in our study, certain combinations of text and emoji can induce higher levels of engagement. The findings suggest that when users type certain text, the platform could recommend matching emoji that stimulate engagement among users, which could enhance the user experience on social media platforms.

Limitations and Future Research

This study is not without limitations. First, although the co-occurring patterns of topics and emoji, and their effects on user engagement, have provided helpful insights, other elements of social media content, such as hashtags, links, images, and videos, have not been considered in this study. These can disproportionately influence the characteristics of content and further affect user engagement. Thus, one potential extension of this work would be to incorporate other factors that appear in the same social media content in addition to text and emoji. Second, due to the constraints of itemset mining algorithms, repeated emoji in a tweet were coded only once when constructing the transaction data, although emoji repetition can emphasize the corresponding emotions and semantics of social media content (Pereira & Pestana, 2022). Lastly, another extension to explore in future studies is a higher-level classification of emoji, such as emotion-related and entity-related emoji, to be converted into transaction data together with topics.

Footnotes

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by Zhejiang Provincial Philosophy and Social Sciences Planning Project (No. 24NDJC308YBMS), the Zhejiang Provincial Natural Science Foundation of China (No. LQ24G020002), the Fundamental Research Funds for the Provincial Universities of Zhejiang (No. XT202306), Zhejiang Gongshang University “Digital+” Disciplinary Construction Management Project (No. SZJ2022B006), Research Start-up Foundation for Introduced Talents of Zhejiang Gongshang University (No. 1040XJ2322029).

Author Biographies

Dr Xiaowei Wang is a Research Associate Professor in Tourism Management at Zhejiang Gongshang University, China. Her research interest is tourism marketing and big data analytics.

Dr Mingming Cheng is a Professor in Digital Marketing and Director of the Social Media Research Lab in the School of Management and Marketing at Curtin University, Australia.

Jingjie Zhu is a PhD candidate at the Social Media Research Lab, School of Management and Marketing, Curtin University. Her research interest focuses on video analytics and social media marketing.

Dr Ruochen Jiang is a Professor in Marketing in the Shanghai Development Institute at Shanghai University of Finance and Economics, China. Her research interest is relationship marketing.

References

Alvarez, David, M. E., & George (2023). Types of consumer-brand relationships: A systematic review and future research agenda. Journal of Business Research, 160, 113753.

Anselmsson & Tunca (2019). Exciting on Facebook or competent in the newspaper? Media effects on consumers’ perceptions of brands in the fashion category. Journal of Marketing Communications, 25(7), 720–737.

Aryabarzan, Minaei-Bidgoli, & Teshnehlab (2018). negFIN: An efficient algorithm for fast mining frequent itemsets. Expert Systems with Applications, 105, 129–143.

Bai, Dan, & Yang (2019). A systematic review of emoji: Current research and future perspectives. Frontiers in Psychology, 10, 2221.

Barbieri, Ronzano, & Saggion (2016). What does this emoji mean? A vector space skip-gram model for Twitter emojis [Conference session]. Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016) (pp. 3967–3972).

Barnes, N. G., Mazzola, & Killeen (2020). Oversaturation & disengagement: The 2019 Fortune 500 social media dance - The effects of high level social media interactions across media platforms. https://www.umassd.edu/cmr/research/2019-fortune-500.html

Basoda, Dogan, & Cobanoglu (2022). The role of emoji use in destination decision making. In Ozturk, A. B., & Hancer (Eds.), Digital marketing and social media strategies for tourism and hospitality organizations (pp. 103–122). Oxford: Goodfellow Publishers.

Blondel, V. D., Guillaume, J. L., Lambiotte, & Lefebvre (2008). Fast unfolding of communities in large networks. Journal of Statistical Mechanics: Theory and Experiment, 2008(10), P10008.

Brandes (2001). A faster algorithm for betweenness centrality. The Journal of Mathematical Sociology, 25(2), 163–177.

Brubaker, P. J., & Wilson (2018). Let’s give them something to talk about: Global brands’ use of visual content to drive engagement and build relationships. Public Relations Review, 44(3), 342–352.

Cappallo, Svetlichnaya, Garrigues, Mensink, & Snoek, C. G. M. (2019). New modality: Emoji challenges in prediction, anticipation, and retrieval. IEEE Transactions on Multimedia, 21(2), 402–415.

Chan (2018). 10 Airbnb competitors that you should know about. https://www.tripping.com/industry/rental-companies/9-airbnb-competitors-that-you-should-know-about

Creevey, Kidney, & Mehta (2019). From dreaming to believing: A review of consumer engagement behaviours with brands’ social media content across the holiday travel process. Journal of Travel & Tourism Marketing, 36(6), 679–691.

Cvijikj, I. P., & Michahelles (2013). Online engagement factors on Facebook brand pages. Social Network Analysis and Mining, 3(4), 843–861.

Daft, R. L., & Lengel, R. H. (1986). Organizational information requirements, media richness and structural design. Management Science, 32(5), 554–571.

Daft, R. L., Lengel, R. H., & Trevino, L. K. (1987). Message equivocality, media selection, and manager performance: Implications for information systems. MIS Quarterly, 11(3), 355–366.

Danesi (2016). The semiotics of emoji: The rise of visual language in the age of the Internet. Bloomsbury.

Das, Wiener, H. J. D., & Kareklas (2019). To emoji or not to emoji? Examining the influence of emoji on consumer reactions to advertising. Journal of Business Research, 96, 147–156.

Dennis, A. R., & Kinney, S. T. (1998). Testing media richness theory in the new media: The effects of cues, feedback, and task equivocality. Information Systems Research, 9(3), 256–274.

de Vries, Gensler, & Leeflang, P. S. H. (2012). Popularity of brand posts on brand fan pages: An investigation of the effects of social media marketing. Journal of Interactive Marketing, 26(2), 83–91.

Drieger (2013). Semantic network analysis as a method for visual text analytics. Procedia - Social and Behavioral Sciences, 79(6), 4–17.

Fournier-Viger, Lin, C. W., Gomariz, Gueniche, Soltani, Deng, & Lam, H. T. (2016). The SPMF open-source data mining library version 2 [Conference session]. Proc. 19th European Conference on Principles of Data Mining and Knowledge Discovery (PKDD 2016) Part III, Springer LNCS 9853 (pp. 36–40).

Fournier-Viger, Lin, J. C., Chi, T. T., & Zhang, H. B. (2017). A survey of itemset mining. Data Mining and Knowledge Discovery, 7(4), e1207.

Gretzel (2018). Emoji rhetoric: A social media influencer perspective. Journal of Marketing Management, 34(15–16), 1272–1295.

Grohmann, Giese, J. L., & Parkman, I. D. (2013). Using type font characteristics to communicate brand personality of new brands. Journal of Brand Management, 20(5), 389–403.

Henderson, P. W., Giese, J. L., & Cote, J. A. (2004). Impression management using typeface design. Journal of Marketing, 68(4), 60–72.

Huang, G. H., Chang, C. T., Bilgihan, & Okumus (2020). Helpful or harmful? A double-edged sword of emoticons in online review helpfulness. Tourism Management, 81, 104135.

Huang (2017). The dining experience of Beijing Roast Duck: A comparative study of the Chinese and English online consumer reviews. International Journal of Hospitality Management, 66, 117–129.

Jin & Cheng (2020). Communicating mega events on Twitter: Implications for destination marketing. Journal of Travel & Tourism Marketing, 37(6), 739–755.

Juntunen, Ismagilova, & Oikarinen, E. L. (2020). B2B brands on Twitter: Engaging users with a varying combination of social media content objectives, strategies, and tactics. Industrial Marketing Management, 89, 630–641.

Keshen (2019). 20 Airbnb alternatives for perfect holiday homes. https://www.finder.com.au/sites-like-airbnb

Khokhar (2015). Gephi cookbook. Packt Publishing Ltd.

Kiesler (1986). The hidden messages in computer networks. Harvard Business Review, 64(3), 46–54, 58–60.

Kim & Yang, S.-U. (2017). Like, comment, and share on Facebook: How each behavior differs from the other. Public Relations Review, 43(2), 441–449.

Ko, E., & Kim (2022). Influence of emojis on user engagement in brand-related user generated content. Computers in Human Behavior, 136, 107387.

36.

Landis

J. R.

Koch

G. G.

(1977). The measurement of observer agreement for categorical data. Biometrics, 33(1), 159–174.

37.

Liang

Schuckert

Law

Chen

C.-C.

(2020). The importance of marketer-generated content to peer-to-peer property rental platforms: Evidence from Airbnb. International Journal of Hospitality Management, 84, 102329.

38.

Liebeskind (2019). Emoji prediction for Hebrew political domain [Conference session]. Companion Proceedings of the 2019 World Wide Web Conference (pp. 468–477).

39.

Liu S. Q., Mattila A. S. (2017). Airbnb: Online targeted advertising, sense of power, and consumer decisions. International Journal of Hospitality Management, 60, 33–41.

40.

Luangrath A. W., Peck, Barger V. A. (2017). Textual paralanguage and its implications for marketing communications. Journal of Consumer Psychology, 27(1), 98–107.

41.

Luna J. M., Fournier-Viger, Ventura (2019). Frequent itemset mining: A 25 years review. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 9(6), e1329.

42.

Manganari E. E., Dimara (2017). Enhancing the impact of online hotel reviews through the use of emoticons. Behaviour and Information Technology, 36(7), 674–686.

43.

McShane, Pancer, Poole, Deng (2021). Emoji, playfulness, and brand engagement on Twitter. Journal of Interactive Marketing, 53(1), 96–110.

44.

Meeroona. (2019). Where to advertise your holiday home: 27 best vacation rental websites. https://travelaway.me/vacation-rental-sites/.

45.

Miller, Kluver, Thebault-Spieker, Terveen, Hecht (2017). Understanding emoji ambiguity in context: The role of text in emoji-related miscommunication [Conference session]. Proceedings of the Eleventh International AAAI Conference on Web and Social Media (pp. 152–161).

46.

Mody M. A., Hanks, Cheng (2021). Sharing economy research in hospitality and tourism: A critical review using bibliometric analysis, content analysis and a quantitative systematic literature review. International Journal of Contemporary Hospitality Management, 33(5), 1711–1745.

47.

Moussa (2019). An emoji-based metric for monitoring consumers’ emotions toward brands on social media. Marketing Intelligence & Planning, 37(2), 211–225.

48.

Muntinga D. G., Moorman, Smit E. G. (2011). Introducing COBRAs: Exploring motivations for brand-related social media use. International Journal of Advertising, 30(1), 13–46.

49.

Na’aman, Provenza, Montoya (2017). Varying linguistic purposes of emoji in (Twitter) context [Conference session]. Proceedings of ACL 2017, Student Research Workshop (pp. 136–141).

50.

Newman M. E. (2006). Modularity and community structure in networks. Proceedings of the National Academy of Sciences, 103(23), 8577–8582.

51.

Novak P. K., Smailović, Sluban, Mozetič (2015). Sentiment of emojis. PLoS One, 10(12), e0144296.

52.

Peng, Zhao (2021). Seq2emoji: A hybrid sequence generation model for short text emoji prediction. Knowledge-Based Systems, 214(2), 106727.

53.

Pereira, Pestana (2022). Is there meaning in the emoji sequences used on social media? [Conference session]. World Conference on Information Systems and Technologies (pp. 279–292). Springer.

54.

Prada, Rodrigues D. L., Garrido M. V., Lopes, Cavalheiro, Gaspar (2018). Motives, frequency and attitudes toward emoji and emoticon use. Telematics and Informatics, 35(7), 1925–1934.

55.

Riordan M. A. (2017). The communicative role of non-face emojis: Affect and disambiguation. Computers in Human Behavior, 76, 75–86.

56.

Rodríguez-Hidalgo, Tan E. S. H., Verlegh P. W. J. (2017). Expressing emotions in blogs: The role of textual paralinguistic cues in online venting and social sharing posts. Computers in Human Behavior, 73, 638–649.

57.

Segev (2022). Semantic network analysis in social sciences. Routledge.

58.

Shiha, Ayvaz (2017). The effects of emoji in sentiment analysis. International Journal of Computer and Electrical Engineering, 9(1), 360–369.

59.

K. K. F., King, Sparks B. A., Wang (2016). The role of customer engagement in building consumer loyalty to tourism brands. Journal of Travel Research, 55(1), 64–78.

60.

Suh, Eck (2021). A study of customer engagement, satisfaction and behavioral intentions among Airbnb users. International Journal of Tourism Sciences, 20(1), 26–39.

61.

Szathmary, Napoli, Valtchev (2007). Towards rare itemset mining [Conference session]. 19th IEEE International Conference on Tools with Artificial Intelligence (ICTAI 2007) (pp. 305–312).

62.

Tafesse, Wien (2018). Using message strategy to drive consumer behavioral engagement on social media. Journal of Consumer Marketing, 35(3), 241–253.

63.

Tang, Hew K. F. (2019). Emoticon, emoji, and sticker use in computer-mediated communication: A review of theories and research findings. International Journal of Communication, 13, 2457–2483.

64.

Tao, Fang, Luo, Wan (2022). Which marketer-generated-content is more effective? An experimental study in the context of a peer-to-peer accommodation platform. International Journal of Hospitality Management, 100, 103089.

65.

Unicode. (2021). Emoji counts, v14.0. https://www.unicode.org/emoji/charts/emoji-counts.html.

66.

Valenzuela-Gálvez E. S., Garrido-Morgado, González-Benito (2023). Boost your email marketing campaign! Emojis as visual stimuli to influence customer engagement. Journal of Research in Interactive Marketing, 17(3), 337–352.

67.

Walther J. B. (2011). Theories of computer-mediated communication and interpersonal relations. In Knapp M. L., Daly J. A. (Eds.), The handbook of interpersonal communication (pp. 443–479). Sage.

68.

Walther J. B., Parks M. R. (2002). Cues filtered out, cues filtered in: Computer-mediated communication and relationships. In Knapp M. L., Daly J. A. (Eds.), The handbook of interpersonal communication (pp. 529–563). Sage.

69.

Wang, Cheng, Jiang (2023). The interaction effect of emoji and social media content on consumer engagement: A mixed approach on peer-to-peer accommodation brands. Tourism Management, 96, 104696.

70.

Wang, Cheng, Wong I. A., Teah, Lee (2021). Big-five personality traits in P2P accommodation platforms: Similar or different to hotel brands? Current Issues in Tourism, 24(23), 3407–3419.

71.

Wilk, Soutar G. N., Harrigan (2020). Online brand advocacy (OBA): The development of a multiple item scale. Journal of Product & Brand Management, 29(4), 415–429.

72.

Wong I. A., M. V., Lin (2023). The transformative virtual experience paradigm: The case of Airbnb’s online experience. International Journal of Contemporary Hospitality Management, 35(4), 1398–1422.

73.

World Bank Group. (2018). Tourism and the sharing economy: Policy & potential of sustainable peer-to-peer accommodation. http://documents.worldbank.org/curated/en/161471537537641836/pdf/130054-REVISED-Tourism-and-the-Sharing-Economy-PDF.pdf.

74.

World Economic Forum. (2017). Digital transformation initiative: Aviation, travel and tourism industry. http://reports.weforum.org/digital-transformation/wp-content/blogs.dir/94/mp/files/pages/files/wefdtiaviation-travel-and-tourism-white-paper.pdf.

75.

Huang, Xie (2018). Tweet emoji prediction using hierarchical model with attention [Conference session]. Proceedings of the 2018 ACM International Joint Conference and 2018 International Symposium on Pervasive and Ubiquitous Computing and Wearable Computers (pp. 1337–1344).

76.

Yang, Lin, Jiang, Huo (2022). Fostering consumer engagement with marketer-generated content: The role of content-generating devices and content features. Internet Research, 32(7), 307–329.

77.

Yorkston, Menon (2004). A sound idea: Phonetic effects of brand names on consumer judgments. Journal of Consumer Research, 31(1), 43–51.

78.

Zhu, Cheng, Wong I. A. (2019). Determinants of peer-to-peer rental rating scores: The case of Airbnb. International Journal of Contemporary Hospitality Management, 31(9), 3702–3721.

79.

Zhu, Cheng, Wang, Jiang (2019). The construction of home feeling by Airbnb guests in the sharing economy: A semantics perspective. Annals of Tourism Research, 75, 308–321.