Abstract
The issue of language and culture is recognized as paramount for the service industry but has been poorly addressed by the academic literature. Framed within Social Representation Theory and operationalized with the concept of high- and low-context cultures conceptualized by Hall, this research explores differences in reviews written in different languages. Reviews written in French (high-context) and German (low-context) were analyzed in their original formats using descriptive statistics, automatic text analysis, network data visualization, and technical attribute assessment to capture the differences. The research found that, compared with reviews from the high-context culture, reviews from the low-context culture have higher polarity but lower subjectivity. Given the intrinsic differences in polarity and subjectivity between cultures, researchers and practitioners should monitor changes in sentiment by culture to reduce the bias resulting from these intrinsic differences. Furthermore, data visualizations show different topics between cultures and offer additional insights into the differences.
Keywords
Introduction
The issue of language and culture is recognized as paramount when dealing with services; however, the research related to these issues is to date scarce and scattered (Baker and Kim, 2019; Holmqvist et al., 2017). The rise of the internet added a new and more complex layer to this: in fact, the relationship between language, culture, and the internet is poorly addressed by academic literature (Hale, 2016), especially in tourism and hospitality research (Schuckert et al., 2015). This is surprising, given the importance of online reviews (Filieri and McLeay, 2014) written by customers from different languages and cultural backgrounds. Therefore, generating an understanding of core differences in online reviews written in different languages along with their technical attributes (i.e., subjectivity and polarity) can be meaningful (Tian et al., 2016).
Language related issues are critical to academia and practitioners for the following reasons. Firstly, there is an ongoing debate between academics concerning the importance of review languages: while some suggested (auto) translation will serve the purpose (Cenni and Goethals, 2017) of creating a holistic understanding of the service under examination, others supported the notion that review websites should allow users to read reviews in their mother tongue (Mariani et al., 2019); additionally, some researchers questioned the common practices of calculating the average ratings from all language groups (Hale, 2016) to determine the reputation of an organization. These conflicting suggestions may result from different research methods used, which is the second point.
Secondly, given the advance of big data analytics and visualization tools, research methods could be improved. Most research adopted a quantitative method such as regression modelling to understand the relationship between independent variables and the review ratings (e.g., Filieri and McLeay, 2014). Yet, researchers also question whether the ratings can summarize the guests’ experience (Antonio et al., 2018a, 2018b; Han et al., 2016). This is essential because quantitative ratings cannot capture the nuances and emotions expressed in language (Geetha et al., 2017). Certainly, some researchers also examined the review content. Initially, these researchers either translated the review content from different languages into English (Francesco and Roberta, 2019; Huang, 2017) or analysed reviews in their original languages but could only address a smaller set of content (Cenni and Goethals, 2017). In addition, these researchers tended to classify content into categories for comparison (instead of looking for what is unique in every language) and consequently traded off the unique features of each language. This may explain why research findings show similarities between different language speakers (Cenni and Goethals, 2017; Huang, 2017; Nakayama and Wan, 2018, 2019).
More recently, researchers have moved away from coding manually to automatic or semi-automatic text analysis, which enables researchers to deal with a larger amount of review content data (Chatterjee, 2020; Xiang et al., 2017; Zhao et al., 2019). Nevertheless, text analysis across multiple languages presents methodological difficulties (Antonio et al., 2018a, 2018b; Han et al., 2016); hence only a limited number of researchers addressed review content in different languages (Schuckert et al., 2015). To the best of our knowledge, only Antonio et al. (2018a) investigated reviews written in different languages in their original formats (English, Spanish, and Portuguese), adopting topic modelling and visualization tools.
Communication and language are the medium through which an individual makes sense of reality and links it to the values of their own culture (Hofstede, 2001, 2015). This research is framed within Social Representation Theory (Moscovici, 1961), which elaborates on the collective representation of a social object by a given community for the purpose of behaving and communicating. It is operationalized with the concept of high- and low-context cultures conceptualized by Hall (1976), which describes the importance of communication context and the use of explicit messages along a cultural continuum. On this basis, the research aims to explore differences in reviews written in different languages. In doing so, it draws on two corpora of hotel reviews written by reviewers from high-context (French) and low-context (German) cultures for two popular Italian cities (i.e., Florence and Milan). Although previous researchers identified that review ratings or review content differ between language groups (Francesco and Roberta, 2019; Huang, 2017; Liu et al., 2017; Mariani et al., 2019; Schuckert et al., 2015), they did not offer an explanation for the differences.
Reviews should reflect the different ways of understanding and representing reality and different values belonging to given cultures based on the authors’ origin and language. More specifically, as per Hall (1976), low-context languages, such as German and English, are more explicit and characterized by limited ambiguity and a high level of clarity (Hall, 1976; Jeong and Crompton, 2018; Mattila, 2019). Conversely, high-context languages, such as French and Italian, carry an additional layer of meaning not explicitly communicated. Within this context, this research aims to understand if reviews from high- and low-context cultures differ by adopting both quantitative and qualitative methods. Specifically, this research adopted a threefold approach: firstly, descriptive statistics were used to scout differences in ratings among different languages; secondly, automatic text analysis and network data visualization (Antonio et al., 2018a, 2018b) were adopted to shed light on the relationship among different words in different corpora. Finally, technical attributes such as subjectivity and polarity (Tian et al., 2016) were studied to generate an understanding of core differences between high- and low- context cultures and their language in the review corpora.
Besides attempting to make a clear theoretical contribution by advancing knowledge about the implications of high- and low-context cultures (Hall, 1976) in the internet arena, this work has been designed to inform practitioners' policies. In fact, marketers could benefit from a detailed understanding of the cultural differences represented in a given language to adapt their marketing mix to target customers from countries where languages associated with higher online ratings are spoken (Antonio et al., 2018a) and/or to fine-tune communication and content based on language and culture (Antonio et al., 2018a, 2018b).
Literature review
The importance of hotel reviews
Travelers increasingly use user-generated digital media to scout for information about their prospective holidays (Abubakar et al., 2017). This happens before the trip to inform decision-making (Sparks and Browning, 2011) but also, thanks to the ubiquity of the internet (Xiang et al., 2014), during the travel experience, when people can access thousands of recommendations, travel blogs, and tips about attractions to visit (Leung et al., 2013). Travelers are empowered by this ever-increasing amount of freely available information, accessible at the click of a mouse, and increasingly plan their holidays by themselves, from travel and accommodation to the activities undertaken during the trip (Zhang et al., 2017).
TripAdvisor has fostered user-generated content exchange between travellers in the form of electronic word of mouth (eWOM) (Gretzel and Yoo, 2008; Inversini and Masiero, 2014). Tourism and hospitality researchers are leading the discussion about the impact of eWOM on various aspects of travellers' decisions (e.g., Filieri and McLeay, 2014), on accommodation performance and sales, and on the characteristics of the most helpful reviews (e.g., Park and Nicolau, 2015). Generating a proper understanding of customer expectations through the curation of online reviews could shape the online reputation of the organization (Buhalis and Inversini, 2014); this, following Anderson (2012), could have a direct impact on the performance of hospitality organizations.
Another stream of literature discusses the challenges faced by the hospitality industry when customers have very specific expectations and/or needs (Liang et al., 2017). Following Tse and Ho (2009), these could be rooted in different cultural backgrounds: guests from different countries or cultures could value service attributes differently. Although this is crucial for delivering the expected service, research on cross-cultural tourist behaviour based on user-generated content remains scattered (Huang, 2017).
Social representation theory in travel research
The aggregation of eWOM not only shapes the product reputation (Buhalis and Inversini, 2014) but also informs a representation of the actual product/accommodation. Social Representation Theory (Moscovici, 1961) can support the generation of a better understanding of this phenomenon by shedding light on psycho-social phenomena in modern societies (Wagner et al., 1999). With his seminal work on the subject, Moscovici (1961) moved from Durkheim’s sociological notion of collective representations to define the concept of social representations as creation of meaning as a cognitive product of social interactions (Billig, 1996; Byford, 2002) supported by language and everyday communication (Wagner et al., 1999). Through the process of anchoring, new elements of reality are classified in pre-existing categories of common sense, such as stable categories of concepts and/or images (Moscovici, 1961, 1984), thus familiarizing the new elements and allowing social actors to classify and label the new object according to stable, shared, historical, cultural knowledge (Wagner et al., 1999).
Social Representation Theory has been used in travel and tourism to generate an understanding of attitudes and responses to tourism development (Andriotis and Vaughan, 2003; Dickinson and Dickinson, 2006; Dickinson and Robbins, 2008; Dickinson et al., 2009; Fredline et al., 2003; Yuksel et al., 1999). While most of the research using Social Representations Theory is based on quantitative data (e.g., Suess and Mody, 2016), recently, the theory has been applied to the study of online content (e.g., related to the concept of volunteering in the tourism field - Inversini et al., 2019) because online-hosted narratives (Kozinets et al., 2010) could give an indication about the interpretation and representation of reality by a given (cultural) group.
Language as a proxy of culture
Therefore, moving from the Social Representations Theory and focusing on the social understanding of ‘meaning’ (Billig, 1996) as it can be grasped by harvesting online narratives (Inversini et al., 2019), this research uses language as a proxy of a culture’s creation of meanings (Hofstede, 2015). Language is deeply connected with one’s identity (Hofstede, 2015) and shapes the way people of a given culture think and behave (Kim and Filimonau, 2017). The particular language spoken by an individual influences both (i) the way s/he understands and conceptualizes reality (Lucy, 1997) and (ii) the way in which s/he creates meaning as a cognitive product of social interactions (Byford, 2002).
Research using native languages as a proxy for cultural differences is quite popular in tourism. Compared to the official common language, the common unofficial language is a better predictor of international tourist flows in Europe (Okafor et al., 2018). In Switzerland, a country with four official languages, the language spoken can shape the destination choice, accommodation choice, trip duration, and travel experiences of Swiss nationals (Laesser et al., 2014). In addition to the language spoken, differences in “Future-Time-Reference” between languages affect behaviour. Even though both Chinese and Korean tourists understand the impact of tourism on the environment, Korean speakers (a strong Future-Time-Reference language), compared with Chinese speakers (a weak Future-Time-Reference language), do not translate their knowledge into strong pro-environmental attitudes (Kim and Filimonau, 2017).
Languages also affect perceptions. Bilinguals (English and Chinese) perceive noun-composed messages as more effective in influencing purchasing decisions in the Chinese context, while adjective-composed messages work better in the English context (Zhang et al., 2017). Bilinguals also perceive English advertisements as more informative than Chinese ones (Zhang et al., 2017). Hence, the interaction of word categories and language could affect positive word-of-mouth and satisfaction with the website user experience in the service industry (Zhang et al., 2017). Relatedly, differences in Need for Cognitive Closure influence preferences for menus written in the authentic language or in English (Choi et al., 2018). Overall, these studies provide evidence that language could affect attitudes, decisions, and behaviours.
The study of language in tourism research
Although scant research exists on the importance of languages and culture in the internet-mediated environment (Hale, 2016), researchers have discussed the relationship between language and the service encounter (e.g., Davras and Caber, 2019) as well as different writing styles in different cultures (e.g., Baker and Kim, 2019). A growing body of research (e.g., Schuckert et al., 2015) is looking into differences in ratings across languages.
Language experiences as part of the service encounter
Customer experiences in the service industry consist of a chain of interactions between customers and employees. When guests and service staff speak a common language, they can communicate more effectively, which results in more positive ratings (Mariani et al., 2019). Yet, there are cultural differences in how customers evaluate employees’ language ability. For example, French- and Spanish-speaking tourists refer more frequently to their language experiences (in their mother tongue) during their hotel stays than German-speaking customers do (Goethals, 2016). Among the three groups studied, Spanish-speaking tourists are the most positive, while French-speaking tourists are the most negative (Goethals, 2016). Turkish and Russian customers evaluate employees' foreign language ability as a “satisfier”, while German tourists see it as a “dissatisfier if absent” (Davras and Caber, 2019).
Writing styles of online reviews
The concept of “language” tends to be associated with different cultures; yet language can also be analysed through writing styles within the same language. Research found that figurative language does not offer significant advantages over literal language in persuasive power (Wu et al., 2017). Most negative reviews reflect a negative experience; however, in some cases, reviewers combine their negative evaluation with some positive assessments (Vasquez, 2011). Research showed that reviews written by Spanish-speaking tourists tend to be more positive even when they contain both positive and negative opinions, whereas reviews by French-speaking tourists are more negative and reinforce negative opinions (Goethals, 2016). Language complexity (detailed vs vague) and emotional expression (high vs low) in reviews can influence how trustworthy readers perceive them to be (Baker and Kim, 2019). For example, trustworthiness is low when positive reviews contain too much detailed information or when negative reviews contain vague language and high emotion (Baker and Kim, 2019).
Comparing writing styles between different languages, the frequencies of the main recurrent speech acts (e.g., retrospective speech acts such as description and evaluation of the experience, or future-oriented speech acts such as recommendations or future intentions) and topics discussed are similar between language speakers (Cenni and Goethals, 2017; Huang, 2017), but the focuses of the reviews (Nakayama and Wan, 2018, 2019), the expressiveness and the range of vocabulary used in the reviews (Nakayama and Wan, 2018, 2019), the emphasis on hotel attributes (Francesco and Roberta, 2019), the elaboration as well as sharing of personal credibility-enhancing information (Cenni and Goethals, 2017), and the perception of value (Huang, 2017) differ between different language speakers.
In fact, if the review ratings offer a summative perspective of the different languages spoken by different customers (Liu et al., 2017; Schuckert et al., 2015), it is possible to argue that the actual texts or narratives hosted online (Kozinets et al., 2010) offer a more granular view of the differences among languages and cultures (e.g., Huang, 2017). When customers are grouped by language, significant differences emerge in their preferences and in the hotel and room attributes they consider important (Liu et al., 2017; Schuckert et al., 2015). Correlations of review ratings between different language groups are generally high, but some language pairs are more correlated than others (Hale, 2016). Because English and Spanish are used globally and across different countries, reviews given by English- and Spanish-speaking customers vary most among all languages (Liu et al., 2017).
Therefore, language should be considered vital information for generating both (i) a stronger understanding of the customers (Francesco and Roberta, 2019; Huang, 2017; Liu et al., 2017; Schuckert et al., 2015) and (ii) an effective segmentation to inform specific cultural needs (Liu et al., 2017). However, previous empirical research has shown the impact of different languages spoken by customers from different cultural backgrounds (Cenni and Goethals, 2017; Goethals, 2016; Nakayama and Wan, 2018, 2019) but did not explain the cause of the differences.
High- and low- context cultures
To shed additional light on the causes of the differences in online reviews and the connection between language and culture, the conceptualization of high- and low-context cultures by Hall (1976) is introduced here. High-context languages (such as French and Italian) carry an additional layer of meaning not explicitly communicated (Meyer, 2014). This theory implies that people who communicate in a language belonging to a high-context culture do not always express themselves directly but with more nuance. Therefore, in high-context cultures, an additional effort should be made to ‘read between the lines’ to understand what is communicated whilst paying special attention to non-verbal cues such as body language (Jeong and Crompton, 2018; Mattila, 2019). In contrast, the languages spoken by consumers from a low-context culture (such as English and German) are more explicit and characterized by limited ambiguity and a high level of clarity (Hall, 1976; Jeong and Crompton, 2018; Mattila, 2019). In low-context cultures, meaning is attached to the messages themselves, and they are more likely to be taken at face value (Jeong and Crompton, 2018); communication tends to be direct and uses words to convey unambiguous meanings (Jeong and Crompton, 2018; Mattila, 2019). The most recent research examining the high- and low-context concept is related to pricing decisions and found no differences between American (representing low-context culture), Korean, and Chinese (both representing high-context culture) respondents (Jeong and Crompton, 2018). Surprisingly, the high- and low-context concept has not yet been applied in language-related review research.
Since language is the first characteristic of a person’s identity (Hofstede, 2015), and the linguistic relativity hypothesis proposes that our spoken languages shape the way we think about the world (Lucy, 1997), we propose high- and low- context cultures (Hall, 1976) as the explanatory theory for the causes of the differences between language groups, and develop the following hypotheses.
Hypothesis 1: There are differences between review ratings given by high-context culture customers and low-context culture customers.
Hypothesis 2: There are different topics in reviews written by high-context culture customers and low-context culture customers.
Polarity and subjectivity
Review polarity and subjectivity can be used to explain customer satisfaction and review usefulness, and to detect deceptive reviews (Chatterjee, 2020; Geetha et al., 2017; Martinez-Torres and Toral, 2019; Zhao et al., 2019). Sentiment polarity is defined as the direction of valence (either positive or negative) within the text (Chatterjee, 2020; Geetha et al., 2017). When reviews contain more positive words than negative ones, the result is high sentiment polarity and high customer ratings (Geetha et al., 2017; Zhao et al., 2019) but lower review helpfulness (Chatterjee, 2020). In terms of high- and low-context cultures, low-context culture is more explicit, with limited ambiguity and a high level of clarity, while high-context culture is the opposite. Given the nature of high- and low-context cultures, the polarity of reviews from the low-context culture should be higher than that of reviews from the high-context culture.
Hypothesis 3: The polarity of low-context culture is higher than the polarity of high-context culture.
Furthermore, Zhao et al. (2019) posit that language used to describe hospitality products and services can be considered objective, while other information can be considered subjective. Low subjectivity corresponds to a more cognitive, rational customer, while high subjectivity corresponds to a more affective customer; these are, respectively, less likely and more likely to complain. Given the nature of high- and low-context cultures (Hall, 1976), the subjectivity of reviews from the low-context culture should be lower than that of reviews from the high-context culture.
Hypothesis 4: The subjectivity of low-context culture customers is lower than the subjectivity of high-context culture customers.
Research design
This research aims to understand the language differences in online reviews through the concept of high- and low-context cultures as conceptualized by Hall (1976). To address the research aim and investigate the hypotheses, this study adopts a mixed-method approach, investigating reviews written in two languages (French representing high-context culture and German representing low-context culture) in two main Italian destinations: Florence and Milan. French and German have been used as examples of high-context and low-context cultures, respectively, in previous research, including Campbell et al. (1988); Hall and Hall (2001); and Nishimura et al. (2008). These two languages were chosen because the selected reviewers are close to their culture of origin, and the different nationalities are geographically concentrated within Europe (Ammon, 2014; Edmiston and Dumenil, 2015; Simons and Fennig, 2015). German native speakers are predominantly located in Germany, Austria, and Switzerland. French speakers are found mainly in France, Belgium, and Switzerland. Reviews written in English and Spanish could be geographically and culturally dispersed (Liu et al., 2017); therefore, these were not included in this study.
Furthermore, the use of domestic language exerts a positive impact on online ratings (Mariani et al., 2019) and may interfere with the high- and low- context cultures; hence, Italian reviews are excluded from this study. Florence and Milan were selected because they are primary destinations in Italy, with Milan being third and Florence fourth in terms of arrivals, after Rome and Venice (Statista, 2022); and second and fourth, respectively, in terms of GDP generated by tourism (Netti, 2022). These two destinations could increase the possibility of having hotels with reviews written in French and German.
Data collection
Data specifications.
Travel Appeal provided a file containing 3847 hotel reviews in French and 2059 in German. After data cleaning, 3730 French reviews and 1956 German reviews were usable. French reviews were from 380 hotels, while German reviews were from 199 hotels. A search for hotels in Milan and Florence on TripAdvisor shows 461 hotels in Milan and 431 hotels in Florence. Hence, the French research data cover 43% of all Milan and Florence hotels, while the German data cover 22%. Because the authors did not estimate the French and German review populations, the representativeness and generalizability of this research could be a limitation. Furthermore, the fact that French reviews came from 43% of all hotels while German reviews came from 22% may suggest that French tourists are spread across more hotels than German tourists, which may be due to cultural or commercial reasons such as hotel distribution practices.
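The coverage percentages reported above can be checked with a few lines of arithmetic. The sketch below assumes that the hotel universe is simply the sum of the two TripAdvisor counts; the variable and function names are illustrative and do not come from the study.

```python
# Hotel counts reported in the text (TripAdvisor search results).
hotels_milan = 461
hotels_florence = 431
total_hotels = hotels_milan + hotels_florence  # assumed hotel universe: 892

# Number of hotels represented in each review corpus.
hotels_french = 380
hotels_german = 199


def coverage(hotels_with_reviews: int, total: int) -> float:
    """Share of all hotels represented in a corpus, as a percentage."""
    return 100 * hotels_with_reviews / total


french_coverage = coverage(hotels_french, total_hotels)
german_coverage = coverage(hotels_german, total_hotels)
print(f"French: {french_coverage:.0f}%  German: {german_coverage:.0f}%")
```

Rounded to whole percentages, the two shares match the 43% and 22% stated in the text.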
Data processing and analysis
Quantitative analysis
To address Hypothesis 1, seven independent-samples t-tests were conducted, one for each of the following variables: general ratings, cleanliness ratings, service ratings, sleep quality ratings, room ratings, location ratings, and value ratings.
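As an illustration of this step, the sketch below computes an independent-samples t statistic for each rating variable. It assumes the unequal-variance (Welch) form of the test, which the text does not specify, and it uses invented toy ratings rather than the study data; the function and variable names are hypothetical. A p-value would normally be obtained from a statistics package (for example, SciPy's `ttest_ind` with `equal_var=False`).

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return the Welch t statistic and its approximate degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard errors of the two sample means (variance uses n - 1).
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t_stat = (mean(sample_a) - mean(sample_b)) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t_stat, df


# Invented toy ratings on a 1-5 scale, NOT the study data.
# Only two of the seven rating variables are shown.
toy_ratings = {
    "general": {"french": [4, 5, 3, 4, 4, 5], "german": [5, 4, 5, 4, 5, 5]},
    "cleanliness": {"french": [4, 4, 3, 5, 4, 4], "german": [5, 5, 4, 4, 5, 4]},
}

for attribute, groups in toy_ratings.items():
    t_stat, df = welch_t(groups["french"], groups["german"])
    print(f"{attribute}: t = {t_stat:.2f}, df = {df:.1f}")
```

Running one test per rating variable mirrors the seven separate comparisons described above; with the full data, the same loop would simply cover all seven attributes.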
Textual analysis
To address Hypothesis 2, automated text analysis was employed.
Automated text analysis was designed not to look into the main (i.e., popular) discussions; rather, it aims at generating a broad understanding of the richness of the topics discussed. To carry out the automated text analysis, two corpora were created based on language (French and German; specific built-in dictionaries featuring stop words and accents for the two languages were used). These were submitted to Iramuteq (http://www.iramuteq.org/), a free software package created by Pierre Ratinaud; until 2009 it was only available for the French language, but it currently has complete dictionaries in several languages (Souza et al., 2018). Iramuteq supports discursive textual analysis, offering agility and rigour to qualitative textual data analysis (Ramos et al., 2018). In this study, Iramuteq was used for lemmatization and co-occurrence analysis to perform a semantic similarity analysis (a technique belonging to Social Representation Theory; Levidow and Upham, 2017). This co-occurrence-based analysis was developed by Flament (1981) in order to investigate the proximity and relations among elements of a given cluster. A similarity index is developed by calculating a contingency coefficient between the elements of the cluster (Flament, 1981). The strength of this semi-supervised approach is that the authors could control and supervise the process throughout; this is not possible with topic model approaches, where the lemmatization and the connections among the lemmas are produced by the unsupervised machine. Results were then exported to Gephi to create a social network visualization (Williams et al., 2017). This methodology for studying online-hosted narratives has been proven effective in understanding how different groups interpret and represent a given aspect of the same reality (Inversini et al., 2020) but has never been employed to study cultural nuances in travel-related reviews.
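The co-occurrence step can be illustrated with a small sketch. It is not Iramuteq's implementation: it assumes reviews have already been lemmatized, uses the raw number of reviews in which two lemmas appear together as the edge weight (rather than the contingency coefficient mentioned above), and prints an edge list with `Source`, `Target`, and `Weight` columns, a layout that Gephi can import as a spreadsheet. The toy lemmas and all names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(documents, min_weight=1):
    """Count, for each pair of lemmas, the documents in which both occur."""
    pair_counts = Counter()
    for lemmas in documents:
        # De-duplicate and sort so each pair is counted once per document
        # and always appears in the same (alphabetical) order.
        unique_lemmas = sorted(set(lemmas))
        pair_counts.update(combinations(unique_lemmas, 2))
    return {pair: w for pair, w in pair_counts.items() if w >= min_weight}


# Toy, already-lemmatized reviews (invented for illustration).
toy_documents = [
    ["chambre", "propre", "dejeuner"],
    ["chambre", "dejeuner", "personnel"],
    ["chambre", "propre"],
]

edges = cooccurrence_edges(toy_documents, min_weight=2)

# Edge list in a Source,Target,Weight layout.
print("Source,Target,Weight")
for (source, target), weight in sorted(edges.items()):
    print(f"{source},{target},{weight}")
```

The `min_weight` threshold plays the role of a pruning step: weak links are dropped so that the resulting network shows only lemmas that recur together.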
Polarity and subjectivity
Finally, to address Hypotheses 3 and 4, the Python library TextBlob was used. TextBlob processes textual data and provides a simple API for Natural Language Processing (NLP) tasks (Diyasa et al., 2021) such as part-of-speech tagging, noun phrase extraction, classification, and sentiment analysis. The latter is one of the most used TextBlob functionalities (e.g., Sv et al., 2022) when a massive amount of textual data has to be analyzed (for example, Twitter conversations about the Covid-19 vaccine; Praveen et al., 2021). TextBlob uses a module containing a lexicon of words tagged with “polarity” and “subjectivity” scores (Tafesse, 2021). Users import TextBlob and calculate the subjectivity and polarity present in texts. Many researchers have used TextBlob to conduct subjectivity and polarity research, including Deng et al. (2019); Micu et al. (2017); Roy (2023); Saura et al. (2022a); Saura et al. (2022b); and Tafesse (2021).
Sublime Text and Grupo were used to convert the csv-format data to structured data and to debug the UTF-8 characters. Similar to the research of Zhao et al. (2019), the sentiment function of the Python library TextBlob was used to calculate polarity and subjectivity for French and German. As defined by Zhao et al. (2019), polarity is a float between -1 and 1, where 1 means a positive statement and -1 means a negative statement. Similarly, Zhao et al. (2019) explained that personal opinions, emotions, or judgments could be considered subjective, while factual information is objective. Subjectivity is also a float, between 0 and 1 (Zhao et al., 2019).
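To make the two scores concrete, the sketch below shows a deliberately simplified lexicon-based scorer: each matched word contributes a polarity in [-1, 1] and a subjectivity in [0, 1], and the review score is the average over matched words. This is not TextBlob's code, and it ignores negation and intensifiers. The mini-lexicon and its values are invented for illustration, and English words are used only for readability. In TextBlob itself, the two scores are exposed through the `sentiment` property of a `TextBlob` object.

```python
# Hypothetical mini-lexicon: word -> (polarity in [-1, 1], subjectivity in [0, 1]).
# The entries and scores are invented; a real lexicon is far larger.
TOY_LEXICON = {
    "excellent": (1.0, 1.0),
    "friendly": (0.5, 0.5),
    "clean": (0.25, 0.75),
    "noisy": (-0.5, 0.5),
    "terrible": (-1.0, 1.0),
}


def score_review(text):
    """Average the polarity and subjectivity of the lexicon words in a review."""
    words = [w.strip(".,;:!?") for w in text.lower().split()]
    hits = [TOY_LEXICON[w] for w in words if w in TOY_LEXICON]
    if not hits:
        # No lexicon word found: treat the text as neutral and objective.
        return 0.0, 0.0
    polarity = sum(p for p, _ in hits) / len(hits)
    subjectivity = sum(s for _, s in hits) / len(hits)
    return polarity, subjectivity


for review in ["Excellent, clean room.", "Friendly staff but noisy room."]:
    polarity, subjectivity = score_review(review)
    print(f"{review!r}: polarity={polarity:.3f}, subjectivity={subjectivity:.3f}")
```

Averaging these per-review scores within each corpus would give corpus-level polarity and subjectivity values that can be compared between the French and German reviews.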
Results
Descriptive data for hotel attribute ratings
Descriptive data for hotel ratings.
Automated text analysis and data visualization
The French corpus (Figure 1) includes Cluster 1 related to food service perceptions (breakfast, buffet); Cluster 2 related to room characteristics; and Cluster 3 related to hotel characteristics. In the French high-context language, the social component and quality of meals are important: they set the rhythm of French social habits on a daily basis (Cluster 1). During a hotel experience, the meals consumed at the property, at breakfast as well as in the restaurant, are important in the evaluation of the experience. Customers from a French culture are known to be critical and to set high expectations in hospitality, as many aspects of dining and service are rooted in French culture (Quellier, 2013). Quality is therefore prioritized over quantity, which is not the case with German speakers. Eating is also a social event, which explains why French reviewers appreciate quality and take the time to have meals in restaurants, in accordance with their collectivist cultural dimension (Guffey, 2009). French-speaking reviewers repeatedly indicated their preference for French-speaking staff, which confirms the finding of Goethals (2016).
Figure 1. High-context culture (French) review visualization.
Direct negative feedback is often given by French reviewers, which confirms the findings of Goethals (2016). This tendency is embedded in their high-context culture, which fosters debate and argumentation from a young age in schools and is also practiced in close-circle communication and business contexts (Meyer, 2014). The analysis of an experience is, therefore, often two-sided.
The German corpus (Figure 2) presents Cluster 1 related to perceptions of hotel service characteristics and amenities (room, bathroom, and breakfast) and transportation (e.g., train); Cluster 3 related to customer perceptions of staff (courteous, friendly, helpful); and Cluster 2 related to the physical organization and facilities of the room. The nodes in Cluster 1 and Cluster 2 reveal the satisfaction of German reviewers with the basics of their room and their breakfast service. Practicality and functionality are often highlighted first, which is common in low-context languages, as their experience descriptions are less emotionally weighted than in high-context languages (Meyer, 2014). Additionally, this cluster reflects the practicality of German travellers in transportation: many prefer covering distances on foot in the area of the hotel and visiting surrounding cities via public transportation. Cluster 3 shows appreciation of friendly and helpful service. The expectations of low-context language travellers are often lower in terms of service, and any surprisingly positive service will add great value to the experience (Meyer, 2014). Less importance is given to the staff (Cluster 3) in this corpus, with only a few adjectives (e.g., friendly) characterizing the front-line employees.
Figure 2. Low-context culture (German) review visualization.
Based on Figures 1 and 2, Hypothesis 2, that reviews written by high-context and low-context culture customers cover different topics, is confirmed. The characteristics that stand out in hotel reviews written in the low-context language are pragmatism and practicality. The aesthetic and emotional components of the experience do not affect the evaluation of the stay as strongly as they do for reviewers from the high-context language.
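The network visualizations discussed above rest on co-occurrence analysis of lemmatized review terms. As a minimal sketch of that general idea only, and not a reproduction of the tool or settings used in this study, the following Python snippet counts how often two terms appear in the same review. The function name `cooccurrence_counts` and the toy reviews are hypothetical and serve illustration only.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(reviews):
    """Count how often each unordered pair of terms appears in the same review."""
    pair_counts = Counter()
    for tokens in reviews:
        # Use the set of terms so a pair is counted at most once per review,
        # and sort so that (a, b) and (b, a) map to the same key.
        for pair in combinations(sorted(set(tokens)), 2):
            pair_counts[pair] += 1
    return pair_counts


# Hypothetical, already lemmatized toy reviews (not data from the study).
toy_reviews = [
    ["room", "clean", "breakfast"],
    ["breakfast", "buffet", "quality"],
    ["room", "breakfast", "train"],
]

edges = cooccurrence_counts(toy_reviews)
# Each (pair, count) entry can be read as a weighted edge between two nodes.
for pair, weight in edges.most_common(3):
    print(pair, weight)
```

The resulting weighted pairs could then be passed to a clustering or graph layout step. Because each pair is counted once per review regardless of how long the review is, this kind of count does not capture review length.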
Network analysis results.
Polarity and subjectivity
Polarity and subjectivity between corpora.
Discussions
Motivated by the lack of understanding of what explains the variance in reviews written by customers speaking different languages, we used Social Representation Theory (Moscovici, 1961) and the high- and low-context cultures conceptualized by Hall (1976) as theoretical underpinnings to investigate the phenomenon of reviews written in different languages; differences were highlighted using descriptive statistics, automated text analysis and text mining. Previous researchers have documented the differences between language speakers, but did not explain the cause of these differences (Liu et al., 2017).
The conceptualization of high- and low-context cultures from Hall (1976) has proven effective from a theoretical point of view in generating a better understanding of the language and cultural differences among customers reviewing hospitality organizations online; the results of the polarity and subjectivity analysis shown in Table 4 indicate that there is an intrinsic difference between high- and low-context cultures. Compared to high-context cultures, low-context cultures have high polarity but low subjectivity. Previous researchers have documented that German-speaking travellers reported the highest satisfaction, while French-speaking travellers were the most negative (Goethals, 2016; Liu et al., 2017). Given the intrinsic or built-in differences in polarity and subjectivity between high- and low-context cultures, researchers and practitioners should not assume that such reviews demonstrate that German travellers are more satisfied than French travellers. Monitoring the changes in sentiment within high- and low-context cultures separately could reduce the bias resulting from these intrinsic differences.
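One way to picture the recommendation to monitor sentiment by culture is to express each review's score relative to the baseline of its own language group rather than on a pooled scale. The following Python sketch is an illustration under stated assumptions, not the procedure used in this study: the function name `standardize_within_group` is hypothetical, the polarity values are invented toy numbers on an assumed scale of -1 to 1, and z-score standardization is one possible choice of adjustment.

```python
from statistics import mean, pstdev


def standardize_within_group(scores_by_group):
    """Convert raw scores to z-scores relative to each group's own baseline."""
    result = {}
    for group, scores in scores_by_group.items():
        baseline = mean(scores)
        spread = pstdev(scores)
        # Guard against a zero spread (all scores in the group identical).
        result[group] = [
            (s - baseline) / spread if spread > 0 else 0.0 for s in scores
        ]
    return result


# Hypothetical polarity scores (assumed range -1 to 1); not values from the study.
toy_polarity = {
    "german": [0.6, 0.7, 0.8],
    "french": [0.2, 0.3, 0.4],
}

z = standardize_within_group(toy_polarity)
for group, values in z.items():
    print(group, [round(v, 2) for v in values])
```

In this toy example the raw German scores are uniformly higher than the French ones, yet after within-group standardization the two groups show the same relative pattern, which is the sense in which a group-specific baseline removes an intrinsic offset before comparison.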
According to the linguistic relativity hypothesis, how people think about reality is influenced by the language they speak (Lucy, 1997). The data visualizations shown in Figures 1 and 2 offer additional insights into the differences between high- and low-context cultures from a Social Representation Theory perspective. At the topic level, high- and low-context culture customers have different focuses. Two of the three clusters shown in Figure 1 match the stereotypes of French culture, such as the focus on the social component and quality of meals and the use of two-sided arguments. In Figure 2, pragmatism and practicality are clearly shown.
From a methodological point of view, the research demonstrates that human languages can express concepts and emotions which quantitative ratings cannot capture (Geetha et al., 2017). The nuances provided by reviewers are evidence of their complex understanding of reality (Inversini et al., 2020). Data visualization can reveal patterns that quantitative data cannot. Surprisingly, only a few researchers have adopted data visualization for reviews (e.g., Antonio et al., 2018a; Geetha et al., 2017; Martinez-Torres and Toral, 2019).
Conclusion
Framed within Social Representation Theory, this research applies the concept of high- and low-context cultures (Hall, 1976) and uses a mixed-method approach to canvass the differences in reviews written in two different languages (French and German) about the same hotels in two destinations.
Contribution to theory
This research tested four hypotheses and found support for differences between high- and low-context cultures. Specifically, data visualization shows different clusters, and, compared to high-context culture reviews, low-context culture reviews have high polarity but low subjectivity. The key finding is that the concept of high- and low-context cultures (Hall, 1976) can explain these differences; this is one of the first attempts to explain the 'reason why' behind differences in reviews: the culture to which reviewers belong influences the way in which they review the experience. This research also contributes to theory by adding a layer to the research on internet-mediated reviews, showing that culture and language may affect meanings when it comes to reviews. Additionally, Social Representation Theory can support the explanation of the choice of topics when reviewing an experience: customers belonging to different cultures pay attention to different facets of the same experience, anchoring the meanings in their own cultural views (Moscovici, 1984), and this is reflected within the reviews. This research thus offers a plausible explanation of the cause of the differences in reviews. It also cautions researchers about the intrinsic differences in polarity and subjectivity between high- and low-context cultures, and their impact on review ratings. Through data visualization and text mining, the nuances in languages can be better revealed, providing more insights into customers' perceptions and experiences.
Implications for practitioners
Some researchers addressed multi-lingual research by translating all languages into English (Francesco and Roberta, 2019; Huang, 2017), while others kept the original languages but were limited by the amount of content (Cenni and Goethals, 2017). Progress in topic modelling, data analysis, and data visualization enables researchers to conduct research in languages other than English. By working on the original formats, this research can keep the nuances embedded in high- and low-context cultures. Hence, with the assistance of technology, practitioners should analyze customers' reviews in their native languages to understand nuances in relation to their (main) target markets, and thus be prepared to support product customization along the customer journey.
Furthermore, some researchers questioned the common practice of calculating the average ratings from all language groups (Hale, 2016) and suggested that websites should allow users to read reviews in their mother tongue (Mariani et al., 2019), while others suggested that (auto) translation will serve the purpose (Cenni and Goethals, 2017). Based on the research findings presented in this paper, users should be able to read reviews in their own language to support a mindful choice of hospitality service.
The research findings also have implications for website design and digital communication. Without changing the actual product or service, hoteliers can stress one or more characteristics of their properties and/or hospitality experience to different target audiences, thus creating communication that resonates better with different cultures (Tigre Moura et al., 2015).
These two actions would dramatically strengthen communication towards different cultures, reducing the bias resulting from the intrinsic differences among textual representations of cultures.
Limitations and future research
The current study found differences in data visualization, topic modelling, subjectivity and polarity between high- and low-context cultures. This research supports the call from Mattila (2019) for more research using high- and low-context cultures to understand travellers’ behaviour.
Research investigating high- and low-context cultures in the tourism and hospitality domain is scarce. This research explores only two examples of high- and low-context cultures; future research should explore more cultures to obtain statistically generalizable results. This research used high- and low-context cultures (Hall, 1976) to explain the differences in reviews. Future researchers could use Hofstede’s cultural dimensions (Hofstede, 2015) to explain further the cultural differences in reviews or, more generally, differences in social media posts. Furthermore, because of the lemmatization and co-occurrence analysis, this research did not consider the impact of the length of reviews. Yet, with the advancement in natural language processing, the relationship between the length of reviews and high- and low-context cultures should be further explored.
Footnotes
Acknowledgements
The authors thank Travel Appeal for providing the data for this research.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
