Abstract
Previous research on sentiment’s impact on perceived helpfulness shows mixed results: some studies highlight the benefits of positive valence, whereas others favour negativity or balanced (50/50) reviews. These inconsistencies may arise from sentiment polarity approaches that overlook emotional complexity. This study examines how sentiment and emotions expressed in online customer reviews on platforms such as TripAdvisor influence perceived helpfulness. We analysed the differences in three sentiments and eight emotions between helpful and unhelpful reviews (n = 2,785,999) using sentiment analysis (positive, neutral, and negative) and emotion analysis (anger, disgust, fear, joy, sadness, surprise, happiness, and love). To achieve this, we developed and trained an artificial intelligence emotion detection model using a transformer-based machine learning algorithm on a tweet emotion dataset (n = 2,774,566). Findings reveal that a slight increase in negative emotions (from 11% to 17%) significantly enhances perceived helpfulness, supporting negativity bias theory. These findings are further enriched by broader psychological theories such as emotional salience and diagnosticity, which help explain why certain emotional expressions in reviews may be more cognitively and behaviourally impactful. Reviews blending high positive and low negative emotions are most helpful, while extreme or balanced sentiments are less impactful. Additionally, negative emotions (notably sadness) are more prevalent in helpful reviews as price levels rise, suggesting an even stronger negativity bias. Logistic regression analysis further confirms that emotion-focused models, particularly those emphasising negative emotions, exhibit greater explanatory power than sentiment-based models, especially in the high-price context.
Keywords
Introduction
Ever hesitated because a one-star rant looked surprisingly “helpful”? Such moments reveal how the tone of a message and the emotions it conveys shape judgment, not just for restaurants, but across a wide range of experience-based services such as hotels, flights, or entertainment. While negativity bias highlights consumers’ tendency to weigh bad news more heavily than good (Rozin & Royzman, 2001), prior work also shows that people use their feelings as diagnostic cues when those feelings seem relevant and salient to the task (Pham, 2007; Rocklage & Fazio, 2020; Schwarz, 2012). Taken together, these perspectives help explain why some negative emotions increase perceived helpfulness whereas others do not, and why certain positive emotions can also enhance perceived helpfulness when they are judged as informative for the decision at hand.
Research on salience shows that attributes drawing disproportionate attention command greater decision weight (Bordalo et al., 2013). When applied to textual reviews, emotions that are both noticeable and perceived as task-relevant become powerful heuristics. Merging salience with diagnosticity predicts that strongly expressed, goal-congruent emotions (e.g., sadness about cleanliness failures) outperform weak or misplaced emotions (e.g., surprise about menu layout) in driving helpfulness. This perspective complements, rather than replaces, negativity bias, clarifying why some negative emotions (sadness, anger) are more impactful than others and why certain positive emotions (joy) can still enhance credibility under specific conditions (Qahri-Saremi & Montazemi, 2023).
To overcome prior limitations in sentiment polarity analysis, this study investigates the deeper influence of both sentiment and emotion on the perceived helpfulness of online reviews, using restaurant reviews as a representative and theoretically grounded context within the broader experience economy. Although prior work documents a general negativity bias, we focus on restaurants as a methodologically robust and conceptually relevant testbed for studying emotion dynamics in reviews of experience-based services. Restaurants offer high review volume, variation in price sensitivity, and immediate consumption feedback, making them well-suited for testing fine-grained emotional effects. These features are also characteristic of other service domains such as hotels and flights, suggesting the potential for broader generalisability (Hamilton & Thompson, 2007; Litvin et al., 2008).
This study investigates two key factors influencing the impact of online reviews on customer behaviour. The first factor is the helpfulness of online reviews, which has been shown to be a significant predictor of review impact (Banerjee et al., 2017; Huang et al., 2015). Helpful reviews are more likely to influence customer behaviour than unhelpful ones because other users perceive them as informative, credible, and decision-relevant (Banerjee et al., 2017; Huang et al., 2015). The present study therefore compares the content of restaurants’ helpful and unhelpful online reviews, applying both sentiment and emotion analyses to identify which patterns distinguish reviews that readers judge as useful from those they do not. Existing studies about the influence of sentiment in messages have produced mixed results: (1) Some show a positive relationship between positive review valence and perceived helpfulness of online reviews (Malik & Hussain, 2017). (2) Others have reported a mixed relationship between negative review valence and perceived helpfulness of online reviews (Ren & Hong, 2019; Yin et al., 2014). (3) Some studies indicate that a balanced (50/50) review is regarded by consumers as the most helpful (Chatterjee, 2020; Nakayama & Wan, 2019). This inconsistency is partially attributed to the use of sentiment polarity approaches, which fail to capture the complexity of human emotions. To overcome this limitation, this study examines the influence of both sentiment and discrete emotions on the perceived helpfulness of online reviews. Our first research question (RQ1) is therefore: RQ1: Are there differences in the sentiments/emotions expressed in (un/)helpful online reviews?
The second factor is restaurant price level, which can significantly impact customer behaviour (Kim et al., 2014; Konuk, 2019). Customers may hold different expectations and perceptions of restaurants depending on their price level. The present study therefore investigates whether the sentiments and emotions that distinguish helpful from unhelpful online restaurant reviews differ across price levels. Our second research question (RQ2) is therefore: RQ2: Are there differences in the sentiments/emotions expressed in (un/)helpful online reviews associated with different product/service price levels?
Recent research highlights the varying effects of online review valence (e.g., star ratings) on consumer purchasing decisions, with findings ranging from positive (Ho-Dac et al., 2013; Liu & Park, 2015) to negative correlations (Chevalier & Mayzlin, 2006; Papathanassis & Knolle, 2011). This inconsistency may stem from traditional sentiment analysis methods’ inability to fully capture complex human emotions. Given the vast quantity of review text available online, traditional methods such as hand coding are not feasible, and automation is required (Hatipoğlu et al., 2019). To resolve these challenges, a transdisciplinary approach is required. Recent developments in computer science suggest that a specific emotion detection model can be developed by utilising big data and state-of-the-art deep learning language algorithms (e.g., Devlin et al., 2018; Liu et al., 2019; Tenney et al., 2019), which can provide automated and accurate specific emotion analysis when used in tourism. We therefore introduce a specific emotion detection model using a transformer-based machine learning algorithm trained on big data consisting of tweets that include emotion hashtags (e.g., Lee et al., 2021; Lee et al., 2023). Our third research question (RQ3) is therefore: RQ3: Can emotion analysis offer more nuanced insights into (un/)helpful online reviews compared to traditional sentiment analysis?
This study addresses key gaps in the literature on how online reviews influence consumer behaviour by making three distinct contributions. First, it introduces a transformer-based emotion detection model trained on big data, enabling the analysis of eight discrete emotions, moving beyond traditional sentiment polarity approaches. Second, it incorporates broader psychological theories of emotional diagnosticity and salience to explain when and why certain emotions (e.g., sadness, joy) enhance perceived review helpfulness. Third, it examines how emotional impact varies across restaurant price levels, revealing how the role of emotions shifts with consumer expectations in different service tiers. By exploring the emotional content of helpful and unhelpful reviews across nearly 2.8 million observations, the study offers both theoretical advancement and actionable insights for businesses seeking to optimise their online reputation strategies. These findings underscore the importance of tailored review management approaches for building trust and engagement in the digital marketplace.
Literature Review
Helpfulness of Online Reviews
The rise of social media and review platforms has made online reviews a critical source of information for consumer purchase decisions, particularly in service-oriented industries such as restaurants (Chua et al., 2020; Zhang, 2019). However, the abundance of electronic word of mouth has led to information overload, potentially adversely affecting decision-making and increasing inconvenience, confusion, and stress among customers (Frías et al., 2008). The internet is exploding with information, with over 500,000 tweets, more than 5 million Google searches, and over 20 million emails sent per second, according to Internet Live Stats (2023). Despite the abundance of online reviews, only a small amount of information accessed by customers is deemed helpful. The helpfulness and usefulness of information play a mediating role between influence processes and information adoption (Sussman & Siegal, 2003), underscoring the need to identify factors that distinguish helpful reviews. For this reason, the helpfulness of online reviews can play a significant role in shaping consumer perceptions and, in turn, influence restaurants’ success. Hence, understanding the factors that contribute to helpful and unhelpful reviews is essential for restaurants seeking to manage their online reputation. Emotionality in reviews, for instance, has been shown to enhance their perceived helpfulness and influence decisions about hedonic products like leisure and entertainment more than utilitarian goods, such as household items and electronics (Rocklage & Fazio, 2020). These findings highlight the importance of emotional language in shaping consumer behaviour, and emotional language may be an important factor contributing to helpful and unhelpful reviews.
Negativity Bias & Negative and Positive Emotions
Our theoretical underpinning begins with the concept of negativity bias, a psychological phenomenon suggesting that negative events exert a more significant impact on an individual’s state than positive ones, closely tied to loss aversion—a principle suggesting that losses are perceived more intensely than equivalent gains (Kahneman et al., 1991). Together, negativity bias and loss aversion provide foundational theories for understanding consumer perception and behaviour, particularly in how online reviews are interpreted for their helpfulness.
Recent studies explore the interplay between emotions and online review helpfulness, revealing the distinct influence of negative and positive emotions. Ren and Hong (2019) found that anger in reviews diminishes perceived helpfulness, particularly for experiential goods, while fear can unexpectedly increase a review’s helpfulness by presenting persuasive narratives. Conversely, sadness tends to lower perceived helpfulness, illustrating the complex roles of specific negative emotions. Yin et al. (2014) highlight the distinct impacts of emotions such as anxiety and anger on review helpfulness. Anxiety in reviews often results in them being viewed as less helpful due to perceived uncertainty, whereas anger, despite being negative, may enhance a review’s perceived usefulness by indicating specific concerns and issues.
On the other hand, Malik and Hussain (2017) turn the focus toward positive emotions, identifying trust, joy, and anticipation as key drivers of review helpfulness, highlighting the influence of positive sentiments in shaping perceptions of helpfulness. Furthermore, Chatterjee (2020) and Nakayama and Wan (2019) emphasise the value of balanced (50/50) sentiments in terms of increasing perceived helpfulness. Chatterjee’s study on hotel reviews shows that moderate emotional expressions are more impactful than extreme ones, while Nakayama and Wan’s cross-cultural research highlights that evenly expressed sentiments across review aspects significantly enhance perceived helpfulness.
While the negativity bias remains central in understanding consumer response to reviews, it does not fully explain why some positive emotions, such as joy, contribute positively to helpfulness, particularly in upscale contexts. According to emotional diagnosticity theory, individuals treat emotions as heuristics when they believe those emotions convey task-relevant information (Pham, 2007). For example, sadness in a negative review may signal a severe service failure, while joy in a positive review may reflect an exceptional experience, both of which are perceived as diagnostic and therefore helpful. Furthermore, salience theory holds that emotionally vivid elements capture attention and carry more decision weight (Bordalo et al., 2013). Thus, emotions that are both salient and goal-congruent (e.g., anger about food quality or joy about ambience) are more likely to enhance perceived usefulness. These perspectives help clarify why not all negative emotions increase helpfulness equally, and why some positive emotions, when contextually aligned, can also signal trustworthiness and informativeness (Qahri-Saremi & Montazemi, 2023). Building on this, studies in cognitive neuroscience show that emotionally salient content, especially those with high arousal and vivid valence, is more likely to attract attention and be encoded deeply by readers (Kensinger & Schacter, 2006). At the same time, the theory of diagnosticity suggests that emotionally rich cues are judged as more informative and credible when aligned with the decision context (Feldman & Lynch, 1988). These mechanisms further explain why specific discrete emotions, such as sadness over service failure or joy over exceptional quality, carry greater persuasive weight than generic valence labels.
Beyond aggregate star ratings, recent studies show that the valence of a single user-generated review, especially when it is strongly negative, can heavily sway consumer purchase decisions (Varga & Albuquerque, 2024). In such cases, it is the emotional tone of the specific review, rather than the overall average rating, that drives perceived credibility and usefulness. In fact, just a single strongly negative review can outweigh a product’s star rating in influence; reading a few detailed reviews is often enough to overturn the effect of a high average rating (Lei et al., 2022). It is the emotive and concrete language in these reviews that tends to carry the greatest weight in shaping perceived credibility and usefulness (Lei et al., 2022). Consequently, identifying which specific emotions, beyond broad valence, make a review appear trustworthy and “helpful” has become a growing managerial priority (Felbermayr & Nanopoulos, 2016). These findings underscore the need to move beyond sentiment polarity and investigate the distinct role of discrete emotions in consumer judgment.
Price Level
Price level is a crucial factor in the restaurant industry as it has a significant impact on consumer behaviour and restaurant selection. Higher-priced restaurants are often perceived as offering superior quality (Kleinsasser & Wagner, 2011). Price perception is a significant variable in the decision-making process as it represents what is given up, including monetary and nonmonetary costs, to obtain a product or service (Fang et al., 2016). Research shows that reviews of high-priced restaurants are more likely to be perceived as helpful, as consumers seek detailed information before making purchase decisions (Zhu et al., 2014). However, limited research has explored how price level affects the emotional tone and sentiment of reviews. While prior studies have examined the influence of sentiment and emotion on review helpfulness, few have investigated how price level interacts with these factors. This study addresses this gap by analysing the impact of price level on the emotional tone, sentiment, and star ratings of restaurant reviews. The findings emphasise the complex role of emotions in shaping review helpfulness, highlighting how contextual factors, such as product type and price, mediate this relationship.
Sentiment & Emotion Analysis
Sentiment analysis is a widely used process for extracting subjective information from text, identifying positive, negative, or neutral sentiments (Shivaprasad & Shetty, 2017). Using natural language processing (NLP), it determines the polarity of words and phrases in text. This approach is essential for analysing online reviews, offering insights into the emotional tone of reviews and consumer perceptions (Micu et al., 2017). However, the results of sentiment analysis studies are inconsistent, with some showing a positive relationship (Liu & Park, 2015; Yin et al., 2014) between review valence and the perceived helpfulness of the reviews, whereas others report a negative relationship (Willemsen et al., 2011). This is because sentiment analysis alone has a limited ability to predict customer behaviour, as human emotions are not simply divided into positive and negative. Understanding the various emotions driving customer behaviour is critical to understanding and analysing the helpfulness of online reviews.
Specific emotion analysis is a more nuanced approach to sentiment analysis that seeks to understand the specific emotions driving customer behaviour. Emotion analysis uses NLP techniques based on the latest machine learning algorithm to identify emotions such as fear, anger, sadness, joy, surprise, and disgust, as well as track their impact on customer behaviour. Studies highlight the role of emotions in review helpfulness but show varied results. For example, Ren and Hong (2019) examined 11,522 online reviews and counted emotional words using a lexicon. They found that emotions of anger and sadness in a customer review have a negative impact on review helpfulness. On the other hand, Wang et al. (2019) analysed 265,205 online reviews and counted emotional words using a lexicon. They found that emotions, including anger, disgust, and fear (versus joy, sadness, and trust), positively impact review helpfulness. Similarly, Chatterjee (2020) analysed 942 online reviews and found that emotions, including disgust and sadness (versus anger and fear), positively impact review helpfulness. Li et al. (2016) analysed 600,686 online reviews and found that emotions of anger and anxiety in a customer review have a positive impact on review helpfulness.
These findings emphasise the significance of emotions in online reviews but reveal inconsistencies in their effects. Different emotional expressions may have different effects on the helpfulness of reviews. Therefore, more research using the latest machine learning algorithm is necessary to better understand the role of emotions in customer reviews and their impact on helpfulness.
Methodology
In this study, we aimed to investigate the impact of star ratings, price level, and specific emotions on the helpfulness of online restaurant reviews. We collected a large dataset of over 2.7 million online restaurant reviews, with the helpfulness vote as the dependent variable and star ratings and price level as independent variables. In addition, we utilised an emotion lexicon to extract sentiment variables (positive, negative, and neutral) from the review text. To further examine the impact of specific emotions on review helpfulness, we trained a deep language model using an emotion dataset of over 2.7 million tweets, which allowed us to extract specific emotions (e.g., anger, disgust, fear, joy, sadness, surprise, happy, and love).
Our initial analysis revealed differences between helpful and unhelpful reviews in terms of these variables. Additionally, we conducted a second analysis to highlight the impact of price level on the helpfulness of reviews. By examining these factors, we hope to provide insights into the determinants of helpfulness in online restaurant reviews and their influence on consumer decision-making.
Data Collection
Valuable insights into customer, prospect, and reviewer behaviour can be gained through the mining of large amounts of online data (Leeflang et al., 2017; Nie et al., 2011; Wedel & Kannan, 2016). We utilised a dataset suitable for a digital tourism context to compare and test the proposed hypotheses. The study collected over 2.7 million online restaurant reviews and information on 10,592 restaurants from TripAdvisor.com. Initially, we collected 2,785,999 reviews of 37,758 restaurants in nine major U.S. cities (Chicago, Houston, Las Vegas, Los Angeles, New York City, Pittsburgh, San Antonio, San Diego, and San Francisco) using web data scraping software. These online reviews were posted over a period of 17 years, from October 2002 to July 2019, and only reviews written in English were scraped because English was the primary language of the reviews. This dataset is referred to as the review dataset.
Descriptive Statistics of Key Variables
Detecting Sentiment in Text
We utilised the Valence Aware Dictionary and sEntiment Reasoner (VADER), a widely used Python package for automatic sentiment analysis (Alaei et al., 2019), to analyse the sentiment of online reviews related to restaurant experiences. Hutto and Gilbert (2014) report that VADER outperformed individual human raters in classifying the sentiment of tweets. One advantage of VADER is that it assesses sentiment by assigning positive and negative sentiment values to words in its lexicon, without requiring prior machine learning model training.
Descriptive Statistics of Sentiment Variables
In this study, we use VADER as a benchmark tool for sentiment polarity because it is a widely adopted, off-the-shelf solution in tourism and marketing research. By contrast, as detailed in the next section, our emotion model is a more recent transformer-based approach that provides fine-grained discrete emotions. Our aim is therefore not to conduct a controlled comparison of model architectures, but to offer a pragmatic contrast between what managers would learn from a standard polarity tool and what they can additionally learn from a state-of-the-art emotion model when analysing review data in practice.
Detecting Emotions in Text
Detecting emotions in text involves a three-step process. First, the specific emotions to be detected must be determined, given the diversity of human emotions. Second, a large training dataset expressing the chosen emotions is required to train the artificial intelligence model. Finally, the emotion detection model is trained on this dataset using a deep language transformer model, which achieves higher classification accuracy by comprehending the sentence’s context (Al-Omari et al., 2020; Chatterjee et al., 2019).
We adopted the tweet-collection procedure and the eight emotions of anger, disgust, fear, joy, sadness, surprise, happy, and love from a previous study (Lee et al., 2021), which suggested collecting two additional positive emotion hashtags (i.e., #happy and #love) in addition to Ekman’s (1971) six emotion hashtags (i.e., #anger, #disgust, #fear, #joy, #sadness, and #surprise). Positive emotions are heterogeneous. In affective-science research, joy and happiness are treated as distinct, discrete states rather than synonyms: joy is a short-lived, high-arousal response to goal attainment, whereas happiness reflects a broader cognitive appraisal of life satisfaction (Van Cappellen, 2020; Watkins et al., 2018). These distinctions matter because short-lived high-arousal states can trigger different attributional inferences than longer-term evaluative states. We collected the emotion-labelled dataset (n = 2,774,566; #anger = 229,344; #disgust = 47,968; #fear = 405,638; #joy = 507,757; #sadness = 319,359; #surprise = 419,724; #happy = 299,464; #love = 545,312) using the Twitter API by searching for the eight emotion hashtags in tweets posted from 2007 to 2023. The dataset was cleaned to retain only English text, excluding duplicate tweet IDs and tweets containing fewer than three English words. Emotion hashtags were removed from the tweet text and used as emotion labels. The processed dataset was then used to train the emotion detection model.
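The cleaning and labelling steps described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `label_tweets`, the regular expressions, the rule of skipping tweets that carry zero or several emotion hashtags, and the approximation of "English words" as alphabetic tokens are all assumptions made for illustration.

```python
import re

# The eight emotion hashtags named in the text.
EMOTIONS = ("anger", "disgust", "fear", "joy", "sadness", "surprise", "happy", "love")

# Matches an emotion hashtag as a whole tag (so "#happybirthday" is not matched).
HASHTAG_RE = re.compile(r"#(" + "|".join(EMOTIONS) + r")\b", re.IGNORECASE)

# Assumption: an "English word" is approximated as a run of ASCII letters.
WORD_RE = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)?")


def label_tweets(tweets):
    """Turn (tweet_id, text) pairs into (cleaned_text, emotion_label) pairs."""
    seen_ids = set()
    examples = []
    for tweet_id, text in tweets:
        # Exclude duplicate tweet IDs.
        if tweet_id in seen_ids:
            continue
        seen_ids.add(tweet_id)

        # Assumption: keep only tweets with exactly one distinct emotion hashtag.
        labels = {tag.lower() for tag in HASHTAG_RE.findall(text)}
        if len(labels) != 1:
            continue

        # Remove the emotion hashtag from the text and normalise whitespace.
        cleaned = " ".join(HASHTAG_RE.sub(" ", text).split())

        # Exclude tweets with fewer than three (approximate) English words.
        if len(WORD_RE.findall(cleaned)) < 3:
            continue

        examples.append((cleaned, labels.pop()))
    return examples
```

Using the hashtag as the label and deleting it from the text keeps the classifier from simply memorising the tag itself, which is the point of the hashtag-removal step described in the paragraph.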
We fine-tuned the deep language transformer model RoBERTa (Robustly Optimised BERT Pretraining Approach) on the tweet emotion dataset to achieve high performance through transfer learning, as previous studies have suggested (Lee et al., 2021, 2023). RoBERTa was pretrained on ten times more data than the original Bidirectional Encoder Representations from Transformers (BERT) model (Devlin et al., 2018) and outperformed BERT in various NLP tasks (Liu et al., 2019). We trained the RoBERTa-large model (learning rate of 0.00001 for 3 epochs) on the emotion dataset (80% train set and 20% test set) using the Hugging Face library (Jain, 2022), a popular open-source NLP library that provides tools for building and deploying models and is widely known for pretrained transformer models such as BERT, GPT, and RoBERTa, which can be fine-tuned for tasks including emotion analysis. The trained model achieved an overall accuracy of 0.75, well above the 12.5% expected from random classification across eight classes, with weighted-average precision, recall, and F1 all at 0.75, indicating balanced performance across the dataset. Emotion-specific F1 scores ranged from 0.53 (disgust) to 0.83 (fear), with joy (0.79), love (0.77), happy (0.74), surprise (0.72), and anger and sadness (both 0.70) in between. Although disgust yielded the lowest F1 score, reflecting its low prevalence, it still exceeded the commonly used 0.50 threshold for minority classes, supporting the robustness of the emotion-detection pipeline.
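For readers less familiar with the evaluation metrics reported above, the following sketch shows how accuracy, per-class F1, and support-weighted F1 are conventionally computed for a single-label classifier. The function names are hypothetical and the labels in the usage note are toy values, not the study's test set.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Share of items whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_f1(y_true, y_pred):
    """F1 score for each label, from true/false positives and false negatives."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted label was wrong
            fn[truth] += 1  # true label was missed
    scores = {}
    for label in set(y_true) | set(y_pred):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        scores[label] = 2 * precision * recall / denom if denom else 0.0
    return scores


def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with weights equal to each class's true count."""
    support = Counter(y_true)
    scores = per_class_f1(y_true, y_pred)
    return sum(scores[label] * n for label, n in support.items()) / len(y_true)


# Chance level for uniform guessing over the eight emotion classes.
RANDOM_BASELINE = 1 / 8
```

For example, with `y_true = ["joy", "joy", "fear", "fear", "fear", "anger"]` and `y_pred = ["joy", "fear", "fear", "fear", "joy", "anger"]`, `per_class_f1` gives 0.5 for joy, 2/3 for fear, and 1.0 for anger. Weighting by support means a rare class such as disgust contributes little to the weighted average, which is why per-class scores are worth reporting alongside it.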
Descriptive Statistics of Emotion Variables
Human Validation of the Emotion Detection Model
We first drew a stratified random sample of 400 TripAdvisor reviews, selecting 50 reviews for each of the eight emotions (anger, disgust, fear, happy, joy, love, sadness, surprise) as predicted by the emotion detection model. For each sampled review, the AI-assigned dominant emotion was compared against four independent human coders, who each selected a single dominant emotion from the same eight-category scheme. Inter-rater agreement was evaluated using simple percent agreement, Cohen’s kappa for pairwise coder–coder and AI–coder comparisons (Cohen, 1960), and Fleiss’ kappa for multi-rater reliability (Fleiss, 1971), with interpretive benchmarks following Landis and Koch (1977).
Simple agreement between the emotion detection model and individual coders ranged from 71.9% to 79.4%, with Cohen’s κ between 0.68 and 0.77, indicating substantial agreement. On average, the simple agreement between the emotion detection model and individual coders was 75.6% (mean κ = 0.72). Inter-rater reliability among the four human coders was also high. Pairwise Cohen’s κ ranged from 0.71 to 0.90, and Fleiss’ κ for all four coders across 398 reviews was 0.77, suggesting that the human labels constitute a reliable benchmark. Across all six coder pairs, the mean simple agreement was 80.6% (mean κ = 0.77), closely mirroring the multi-rater Fleiss’ κ.
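The agreement statistics reported above follow standard definitions, sketched below in plain Python. This is an illustrative implementation of percent agreement, Cohen's kappa, and Fleiss' kappa, not the authors' code; the function names are hypothetical, and the sketch assumes every item is rated by the same number of coders and that chance agreement is below 1.

```python
from collections import Counter


def percent_agreement(a, b):
    """Share of items on which two raters chose the same label."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohens_kappa(a, b):
    """Cohen's kappa for two raters labelling the same items."""
    n = len(a)
    p_observed = percent_agreement(a, b)
    counts_a, counts_b = Counter(a), Counter(b)
    # Chance agreement from each rater's marginal label frequencies.
    p_expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    return (p_observed - p_expected) / (1 - p_expected)


def fleiss_kappa(ratings):
    """Fleiss' kappa; `ratings` is a list of per-item lists of rater labels."""
    n_items = len(ratings)
    n_raters = len(ratings[0])
    label_totals = Counter()
    p_bar = 0.0
    for item in ratings:
        counts = Counter(item)
        label_totals.update(counts)
        # Proportion of agreeing rater pairs for this item.
        agreeing = sum(c * c for c in counts.values()) - n_raters
        p_bar += agreeing / (n_raters * (n_raters - 1))
    p_bar /= n_items
    # Chance agreement from the pooled label proportions.
    total = n_items * n_raters
    p_expected = sum((t / total) ** 2 for t in label_totals.values())
    return (p_bar - p_expected) / (1 - p_expected)
```

Kappa corrects raw agreement for the agreement expected by chance, which is why the reported kappa values sit a few points below the corresponding percent-agreement figures.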
When we compared the emotion detection model against the majority-vote label (excluding ties), agreement further increased to 84.7% (κ = 0.82). For example, if three of the four coders selected “joy” and one selected “happy,” the majority-vote label was “joy,” whereas cases with a 2–2 split were excluded from this analysis. At the emotion level, when the emotion detection model predicted a given emotion, it showed very high agreement with human consensus for happiness, joy, love, disgust, and sadness (accuracy ≈90–98%), moderate agreement for anger (80%), and lower agreement for fear (51%) and surprise (67%). Misclassifications were concentrated in theoretically adjacent categories, such as fear vs. sadness and surprise vs. high-arousal positive emotions (joy/happiness), which are often linguistically blurred in user-generated review texts. Taken together, these findings suggest that, despite being trained on hashtag-supervised tweets, the emotion detection model generalises reasonably well to long-form TripAdvisor reviews when evaluated against reliable human annotations.
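The majority-vote comparison can be sketched as follows. The function names are hypothetical, and one detail is an assumption: the text specifies only that 3–1 votes yield a label and 2–2 splits are excluded, so this sketch additionally treats a 2–1–1 vote as yielding the plurality label and drops any vote whose top two labels are tied.

```python
from collections import Counter


def majority_label(coder_labels):
    """Return the most frequent label, or None when the top labels are tied."""
    ranked = Counter(coder_labels).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # e.g. a 2-2 split among four coders
    return ranked[0][0]


def agreement_with_majority(model_labels, coder_labels_per_review):
    """Model-vs-consensus agreement over reviews that have a consensus label.

    Returns (agreement, number_of_reviews_kept).
    """
    kept = hits = 0
    for model_label, coder_labels in zip(model_labels, coder_labels_per_review):
        consensus = majority_label(coder_labels)
        if consensus is None:
            continue  # ties are excluded from the analysis
        kept += 1
        hits += model_label == consensus
    return (hits / kept if kept else float("nan")), kept
```

Returning the number of retained reviews alongside the agreement rate makes it visible how many cases were dropped as ties, which matters when comparing this figure with the per-coder agreement rates.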
Results
Comparison of (Un/) Helpful Reviews
Comparison of the Averages of Helpful and Unhelpful Reviews
Note. *p < 0.05; **p < 0.01; ***p < 0.001; NS = not significant (p > 0.05).
The effect sizes were measured using Cohen’s d, which indicates the standardised difference between two means. A value of 0.2 was interpreted as a small effect size, 0.5 as a medium effect size, and 0.8 or above as a large effect size (Thalheimer & Cook, 2002). For the star rating variable, lower-rated (negative) reviews were more helpful than higher-rated (positive) ones, with a large effect size of 1.030. In terms of sentiment variables, less positive and more neutral reviews were perceived as helpful, with very small effect sizes between 0.049 and 0.125. For the emotion variables, reviews with more sadness, less happiness, and less love were considered more helpful, with small effect sizes between 0.049 and 0.264.
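As an illustration of the effect-size measure used here, the sketch below computes Cohen's d with a pooled standard deviation and maps it onto the benchmarks cited in the text. The pooled-variance form is a common convention and an assumption on our part, since the exact variant used in the study is not stated; the function names and the "very small" label for values under 0.2 are likewise illustrative.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardised mean difference using the pooled sample variance."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


def effect_label(d):
    """Interpret |d| with the 0.2 / 0.5 / 0.8 benchmarks used in the text."""
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "very small"
```

With very large samples almost any difference reaches statistical significance, so the effect-size label, rather than the p value, is what indicates whether a difference between helpful and unhelpful reviews is practically meaningful.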
In summary, the results indicate that negative emotions are magnified in helpful reviews as compared to unhelpful ones. For instance, the mean values of anger, disgust, and fear emotions are higher in helpful reviews as compared to unhelpful reviews. However, it is important to note that positive emotions still dominate the reviews, even in the helpful reviews. Happy emotion has the highest mean value in both helpful and unhelpful reviews. Joy and love are also dominant emotions in the reviews. The findings suggest that a slight increase (from 11% to 17%) in negative emotions (anger, disgust, fear and sadness) significantly alters the perceived helpfulness of the reviews, which sentiment analysis may not fully capture. Emotion analysis captures a wide range of emotions beyond simple positive or negative sentiment, allowing for the detection of nuanced expressions that sentiment analysis might miss. For instance, sadness in a review might signal genuine disappointment or unmet expectations, making the review more helpful to others by providing cautionary insights. Similarly, anger and disgust can highlight serious issues or problems with a product or service that potential consumers would want to avoid.
The Effect of Price Level
Comparison of the Averages of (Un/) Helpful Reviews by Price Level
*p < 0.05.
**p < 0.01.
***p < 0.001.
p values that exceed the 0.05 threshold are considered not significant, abbreviated by NS in the table.
The t-values for all independent variables were significant (−91.591 ≤ t ≤ 118.343, p < 0.05), indicating that the differences between helpful and unhelpful reviews were unlikely to be due to chance. The effect sizes (d) were mostly moderate to large, ranging from 0.248 to 1.030, suggesting that the differences between the groups were practically significant as well. Taken together, these results suggest that the differences between helpful and unhelpful reviews vary with restaurant price level and that the emotional tone of reviews differs across price levels. Therefore, the findings from the previous analysis are reinforced at higher price levels.
Next, Figure 1 illustrates the distribution of sentiment (categorised into negative, neutral, and positive) in online reviews, stratified by the restaurant price level and review helpfulness. The x-axis segregates the data into six groups, representing unhelpful and helpful reviews across three price levels, indicated as Price level 1 ($), Price level 2 ($$–$$$), and Price level 3 ($$$$). The y-axis quantifies the proportion of each sentiment category. In unhelpful reviews, we observe a consistent proportion of positive sentiment (approximately 0.24 to 0.26) across all price levels, with neutral sentiment being predominant (over 0.70) and negative sentiment being the least represented (around 0.03). Conversely, helpful reviews exhibit a slightly lower proportion of positive sentiment (approximately 0.22) across all price levels, with the neutral sentiment remaining the most common.
Sentiment distribution in online reviews across different price levels
In Figure 2, the emotional profile of online restaurant reviews is dissected across three price levels, with a distinction between reviews deemed helpful and unhelpful. The chart classifies emotions into positive (happy, joy, love), neutral (surprise), and negative (anger, disgust, fear, sadness) categories. Each bar segment represents the proportion of the respective emotion within the reviews. For unhelpful reviews, positive emotions (happy, joy, love) are consistently the most represented across all price levels, constituting a majority with happiness being the most prominent. The neutral emotion of surprise is present but minimal, while negative emotions collectively form the smallest proportion, with anger and disgust being scarcely observed. Interestingly, fear and sadness are slightly more present but remain low. In helpful reviews, the positive emotions remain dominant but show a slight decrease in proportion as price levels increase, particularly with happiness and love. Surprise, as a neutral emotion, maintains a consistent but minor presence across all price levels. Negative emotions, while still less prevalent than positive ones, are slightly more represented in helpful reviews compared to unhelpful ones, with sadness showing a noticeable increase at higher price levels.
Emotion distribution in online reviews across different price levels
Overall, emotion analysis allows for the detection of nuanced emotional expressions that sentiment analysis might overlook. For example, the mean difference of sadness escalates more than 16 times from the low-price to the high-price level, which becomes more valuable for consumers seeking to avoid negative experiences. In contrast, sentiment analysis might categorise such a review simply as negative without capturing the depth of the reviewer’s emotional state.
Alternative Logistic Regression Analysis
This section of the study presents an alternative logistic regression analysis, comparing the impact of sentiment and specific emotions on the perceived helpfulness of online restaurant reviews across varying price levels: low ($), medium ($$ - $$$), and high ($$$$). The analysis aims to uncover how sentiment and emotions influence review helpfulness, with a particular focus on the role of negative emotions as price levels increase. Specifically, the independent variables (IVs) are the sentiment categories (negative, neutral, and positive) in the sentiment models and the specific emotions (anger, disgust, fear, sadness, surprise, happy, joy, and love) in the emotion models, all as expressed in online restaurant reviews. Note that the variable “love” was excluded from the logistic regression model because it was not statistically significant, indicating it did not meaningfully contribute to predicting the dependent variable (DV). The DV is the perceived helpfulness of online restaurant reviews, measured as a binary outcome (un/helpful reviews). Both sentiment and emotion models were developed separately for the three restaurant price levels. The logistic regression model fitness was assessed using Nagelkerke R², a pseudo-R² measure that approximates the proportion of variance in the DV explained by the IVs (Hemmert et al., 2018).
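For readers unfamiliar with the fit measure, the sketch below computes Nagelkerke R² from the log-likelihoods of an intercept-only model and a fitted model, using the standard definition (the Cox and Snell R² rescaled by its maximum attainable value). The function name and the toy log-likelihood values are illustrative assumptions; the study does not report its log-likelihoods.

```python
import math


def nagelkerke_r2(ll_null, ll_model, n):
    """Nagelkerke pseudo-R2 from null and fitted log-likelihoods for n cases."""
    # Cox and Snell R2: 1 - (L_null / L_model) ** (2 / n), written with logs.
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)
    # Largest value Cox and Snell R2 can reach for this null model.
    max_cox_snell = 1.0 - math.exp(2.0 * ll_null / n)
    return cox_snell / max_cox_snell


# Hypothetical toy values, not taken from the study.
n = 100
ll_null = n * math.log(0.5)  # intercept-only model for a 50/50 binary outcome
ll_model = -60.0             # assumed log-likelihood of a fitted model
print(round(nagelkerke_r2(ll_null, ll_model, n), 3))
```

The rescaling is what allows the measure to reach 1 for a perfectly fitting model, which the unscaled Cox and Snell version cannot do for a binary outcome.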
Comparative Analysis of Sentiment and Emotions on Review Helpfulness by Price Level
*p < 0.05.
**p < 0.01.
***p < 0.001.
p values that exceed the 0.05 threshold are considered not significant, abbreviated by NS in the table.
For medium-priced restaurants, the sentiment categories (negative, neutral, positive) showed a stronger, yet still statistically insignificant, effect on review helpfulness. The Nagelkerke R² increased to 0.015, reflecting a modest improvement in model fit over the low-price category. However, the specific emotion analysis revealed that negative emotions gained prominence as predictors of helpfulness in this price category. Anger (B = 1.288, Exp(B) = 3.626, p < 0.001) and disgust (B = 1.495, Exp(B) = 4.458, p < 0.001) were particularly strong positive predictors. Other negative emotions such as fear (B = 0.514, Exp(B) = 1.672, p < 0.001) and sadness (B = 0.192, Exp(B) = 1.212, p < 0.001) also emerged as significant predictors, suggesting that consumers may perceive reviews containing these negative emotions as more helpful. Conversely, positive emotions like happy (B = −0.590, Exp(B) = 0.555, p < 0.001) remained negatively associated with helpfulness. The Nagelkerke R² for the emotion model was 0.018, indicating that specific emotions provided a slightly more substantial explanation for helpfulness in this price range.
In the high-priced restaurant category, negative sentiment had a statistically significant and strong effect on review helpfulness (B = 2.985, Exp(B) = 19.785, p = 0.006). In contrast, neutral and positive sentiments did not significantly impact helpfulness. The Nagelkerke R² was 0.031, the highest among the sentiment models at any price level, indicating that sentiment, particularly negative sentiment, became critical in determining the perceived helpfulness of reviews in more expensive dining contexts. However, specific emotion analysis in this price segment showed that anger (B = 1.631, Exp(B) = 5.110, p < 0.001) and disgust (B = 2.039, Exp(B) = 7.680, p < 0.001) continued to strongly influence helpfulness positively. Other emotions like fear (B = 0.419, Exp(B) = 1.520, p < 0.001) and sadness (B = 0.611, Exp(B) = 1.843, p < 0.001) also contributed positively to perceived helpfulness. Notably, the emotion surprise (B = 0.220, Exp(B) = 1.246, p < 0.001) became a positive predictor in this category, suggesting that unexpected elements in reviews might capture attention and enhance their perceived value. Positive emotions such as happy (B = −0.592, Exp(B) = 0.553, p < 0.001) continued to reduce the likelihood of a review being deemed helpful. However, joy (B = 0.059, Exp(B) = 1.060, p = 0.010) provided a slight positive impact. The Nagelkerke R² for this emotion model was 0.033, suggesting that negative emotions had a greater impact on helpfulness as the restaurant’s price level increased.
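The Exp(B) values reported for the high-priced segment are the exponentiated logistic coefficients, that is, odds ratios. The short sketch below recomputes them from the reported B values and expresses each as a percentage change in the odds of a review being marked helpful. The dictionary and function names are illustrative; because the B values are themselves rounded, the recomputed figures can differ from the reported ones in the third decimal place.

```python
import math

# (B, reported Exp(B)) pairs copied from the high-price results in the text.
REPORTED = {
    "negative sentiment": (2.985, 19.785),
    "anger": (1.631, 5.110),
    "disgust": (2.039, 7.680),
    "fear": (0.419, 1.520),
    "sadness": (0.611, 1.843),
    "surprise": (0.220, 1.246),
    "happy": (-0.592, 0.553),
    "joy": (0.059, 1.060),
}


def odds_ratio(b):
    """Exp(B): multiplicative change in the odds per one-unit increase."""
    return math.exp(b)


def percent_change_in_odds(b):
    """Odds ratio expressed as a percentage change in the odds."""
    return (math.exp(b) - 1.0) * 100.0


for name, (b, reported) in REPORTED.items():
    print(
        f"{name:>18}: exp({b:+.3f}) = {odds_ratio(b):.3f} "
        f"(reported {reported:.3f}), {percent_change_in_odds(b):+.1f}% odds"
    )
```

A coefficient below zero, as for happy, yields an odds ratio below 1 and therefore a negative percentage change in the odds.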
The analysis reveals a clear trend: as the restaurant price level increases, both sentiment and specific emotions, particularly negative ones, play a more significant role in influencing the perceived helpfulness of reviews. While sentiment analysis shows a growing influence in higher-priced contexts, it is the specific emotions, such as anger and disgust, that emerge as consistently stronger predictors of review helpfulness across all price levels. The increase in Nagelkerke R² values from 0.004 in low-priced to 0.033 in high-priced restaurants indicates that as the dining context becomes more upscale, the emotional tone of reviews becomes increasingly important for determining their perceived value. In summary, the alternative logistic regression analysis underscores the growing significance of negative emotions as the restaurant’s price level increases. While sentiment analysis shows a notable impact in the high-priced restaurant category, the specific emotions, especially negative ones, become stronger and more reliable predictors of review helpfulness in higher-priced contexts.
While many of the effects reported are statistically significant due to the large sample size, we also assessed their practical significance. For example, sadness in high-priced restaurants shows a strong and statistically significant effect (B = 0.611, OR = 1.843, p < 0.001), indicating that each standard-deviation increase in expressed sadness raises the odds of a review being marked helpful by roughly 84%. To illustrate the real-world impact, moving from the 25th to the 75th percentile of sadness intensity (0.04 → 0.18 on a 0–1 scale) increases the predicted probability that a review is voted helpful from 0.21 to 0.29, an 8-percentage-point absolute gain, or a 38% relative lift, holding other covariates constant. This aligns with behavioural science benchmarks that suggest ORs between 1.10 and 2.00 can reflect practically meaningful changes in digital behaviour at scale (Chen et al., 2010; Lin et al., 2013). Thus, the sadness–helpfulness relationship is not only statistically significant but also managerially and psychologically impactful in high-price contexts.
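The arithmetic behind the absolute and relative figures can be checked with a few lines of code. The sketch below takes the two predicted probabilities quoted above (0.21 and 0.29) and derives the absolute gain, the relative lift, and the odds ratio those two probabilities imply. The function names are illustrative, and the implied odds ratio is computed here from the quoted probabilities rather than reported by the study.

```python
def odds(p):
    """Convert a probability into odds."""
    return p / (1.0 - p)


def lift_summary(p_low, p_high):
    """Absolute gain, relative lift, and implied odds ratio between two probabilities."""
    return {
        "absolute_gain": p_high - p_low,
        "relative_lift": (p_high - p_low) / p_low,
        "implied_odds_ratio": odds(p_high) / odds(p_low),
    }


# Predicted probabilities quoted in the text for the 25th and 75th
# percentiles of sadness intensity in high-priced restaurants.
summary = lift_summary(0.21, 0.29)
for key, value in summary.items():
    print(f"{key}: {value:.3f}")
```

Reporting the change on the probability scale alongside the odds ratio is useful because the same odds ratio corresponds to different absolute gains depending on the baseline probability.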
Conclusion
This study provides profound insights into how sentiments and emotions in online restaurant reviews influence their perceived helpfulness, addressing Research Questions (RQ1, RQ2, and RQ3). We first summarise the key empirical findings (RQ1–RQ3), then outline their theoretical implications, and finally discuss managerial implications and limitations.
For RQ1, the analysis demonstrates significant differences in the sentiments and emotions expressed in helpful versus unhelpful reviews. While positive emotions like happiness, joy, and love are prevalent across all reviews, helpful reviews distinctly feature higher mean scores for negative (e.g., anger, disgust, fear, sadness) and neutral emotions (e.g., surprise). Conversely, unhelpful reviews are predominantly characterised by positive emotions. This pattern reflects negativity bias, where a slight increase in negative emotions (from 11% to 17%) markedly enhances perceived helpfulness. Furthermore, reviews that combine a majority of positive emotions with a minority of negative ones are deemed more helpful than those with purely positive, negative, or balanced (50/50) sentiments.
For RQ2, the findings highlight how the relationship between review helpfulness and sentiment/emotion shifts with restaurant price levels. Negative emotions, particularly anger and fear, become increasingly important for high-priced restaurants, reflecting heightened consumer expectations. This observation indicates that consumers’ perception of a review’s helpfulness, particularly regarding negative sentiments and emotions, is significantly influenced by the restaurant’s price level, thus addressing RQ2. Logistic regression analysis demonstrates that negative emotions significantly enhance review helpfulness, with this effect intensifying as price levels rise. In low-priced restaurants, anger (B = 0.396, p = 0.006) and disgust (B = 0.661, p < 0.001) were found to significantly increase the likelihood of a review being perceived as helpful. However, as the price level rises, the impact of these negative emotions becomes even more pronounced. At medium-priced restaurants, fear (B = 0.514, Exp(B) = 1.672, p < 0.001) and sadness (B = 0.192, Exp(B) = 1.212, p < 0.001) emerge as key predictors. The trend culminates in high-priced restaurants, where negative sentiment overall becomes a statistically significant and strong predictor (B = 2.985, p = 0.006), and emotions such as fear (B = 0.419, Exp(B) = 1.520, p < 0.001) and sadness (B = 0.611, Exp(B) = 1.843, p < 0.001) contribute positively to perceived helpfulness. The increasing Nagelkerke R² values, from 0.004 in low-priced restaurants to 0.033 in high-priced restaurants, illustrate that as the restaurant price level increases, the emotional content of reviews becomes a more critical determinant of their perceived value.
From a theoretical perspective, delving deeper into RQ2 reveals a more nuanced understanding of human cognition and negativity bias. (1) At lower price points, the engagement with negative reviews is relatively superficial. Consumers, facing lower financial stakes, may note negative feedback without it substantially impacting their decision-making process. Negative reviews might cause a momentary pause, but are unlikely to significantly alter perceptions or choices due to the minimal risk involved. (2) Moving to mid-range price levels, consumers exhibit increased caution, weighing the cost against the perceived value more meticulously. Negative reviews receive closer attention as buyers seek to rationalise their spending, marking the point where negativity bias begins to significantly influence purchasing decisions. Buyers are more likely to consider the risks mentioned in negative reviews, striving for a balanced decision-making process. (3) At the upper end of the price spectrum, the stakes feel much higher, leading consumers to thoroughly vet their options before making a purchase. Negative reviews are scrutinised and heavily weighed against high expectations tied to premium pricing. Here, negativity bias is most acute, with consumers focusing on avoiding regret and loss, leading to a meticulous and critical review of feedback. This progression in consumer behaviour across different price levels highlights a core aspect of the human decision process: scrutiny and loss aversion (Kahneman et al., 1991) increase in line with higher financial and psychological stakes. Negative reviews thus become increasingly pivotal in shaping purchasing decisions as the price, and with it the potential for regret, rises.
At the same time, from a theoretical standpoint, our findings are not reducible to a simple rule that “more negativity always increases helpfulness.” Across all price tiers, positive emotions remain numerically dominant even in reviews that are judged helpful; what distinguishes helpful from unhelpful reviews is a modest shift in the emotional mix rather than a wholesale reversal of valence. Helpful reviews retain a strongly positive baseline but contain a small, additional amount of negative emotion that makes the experience feel more concrete and informational. Moreover, the effects of specific emotions change with the decision context. For example, sadness is negatively related to helpfulness in low-priced settings but becomes a positive predictor in mid- and high-priced restaurants, where it signals meaningful losses relative to higher expectations. Likewise, joy is largely uninformative at lower price levels but shows a weak positive association with helpfulness only in the high-price segment, where brief, high-arousal expressions of delight are more likely to be interpreted as credible evidence of truly exceptional service. These patterns indicate that the impact of discrete emotions depends on the informational demands of the decision context rather than on valence alone.
Although our findings broadly affirm the negativity bias, they are further illuminated through the lenses of emotional salience and diagnosticity. For example, joy appears to modestly enhance perceived helpfulness in high-end restaurants, where consumers may interpret emotionally rich, authentic positivity as a credible indicator of exceptional service. In contrast, vague or overly generic emotional expressions, whether positive or negative, may lack diagnostic value and thus fail to influence perceived helpfulness. These frameworks help clarify why emotions with clear, context-relevant meaning, such as sadness about unmet expectations or joy about service excellence, stand out, capture attention, and improve perceived informativeness.
For RQ3, the emotion analysis model developed in this study provides a powerful tool for businesses and marketers to understand the emotions that drive customer behaviour. By analysing customer feedback, businesses can identify areas for improvement and tailor their products and services to meet customer needs. Furthermore, the study highlights the importance of using advanced machine learning algorithms to analyse large volumes of customer data, which can provide more accurate and nuanced insights into customer behaviour. For instance, our emotion analysis adeptly identifies negativity bias, demonstrating how a slight increase (from 11% to 17%) in negative emotions within a review can significantly influence its perceived helpfulness, a nuance that sentiment analysis may overlook. Further investigation reveals that as the price level of a restaurant escalates, the difference between helpful and unhelpful reviews becomes more distinct, with negative emotions (notably sadness) playing a pivotal role.
The alternative logistic regression analysis highlights the greater explanatory power of emotion models compared to sentiment models in predicting review helpfulness, particularly as restaurant price levels increase. For low-priced restaurants, the sentiment model had a Nagelkerke R² of 0.004, while the emotion model slightly improved it to 0.005. In medium-priced restaurants, the Nagelkerke R² increased from 0.015 in the sentiment model to 0.018 in the emotion model. The same pattern held in high-priced restaurants, where the sentiment model’s Nagelkerke R² was 0.031 and the emotion model reached 0.033. These results underscore that emotions, especially negative ones, play a more significant and consistent role in determining review helpfulness as the price level rises.
This study provides critical insights into the impact of online reviews on consumer behaviour, particularly the role of sentiment and emotion in shaping perceived helpfulness. The investigation into the sentiments and emotions expressed within these reviews, addressing RQ1 and RQ2, underscores a significant trend: the presence of a negativity bias. Specifically, we found that even a modest increase in negative sentiments, particularly those expressing anger and fear, can significantly enhance a review’s perceived helpfulness. This effect is amplified in the context of higher-priced restaurants, where the stakes of consumer expectations and decision-making processes are inherently elevated. Our analysis not only corroborates the existence of negativity bias but also delineates its variable impact across different pricing levels of restaurants, deepening the understanding of how consumers engage with online reviews. This discovery is pivotal for understanding consumer engagement with online reviews and highlights the critical role of emotion in shaping consumer perceptions and decisions. Furthermore, by applying advanced machine learning algorithms to analyse large-scale datasets, this study advances methodological approaches in sentiment and emotion analysis, addressing RQ3 and offering a more refined perspective on consumer perceptions and behaviour.
From a managerial perspective, our findings have several practical implications for marketers, business operators, and reputation managers, particularly in the hospitality and restaurant industries. Although restaurants are frequently discussed under the hospitality umbrella, they represent a distinct and often underexamined segment within hospitality research (DiPietro, 2017; Sabah Al Kaabi et al., 2022). Our focus on restaurants helps address this gap and offers domain-specific insights. In this context, it is useful to distinguish between ‘happiness’ and ‘joy’ in review text. Happiness reflects a diffuse, ongoing sense of feeling good about a brand, whereas joy refers to brief, high-arousal bursts of excitement in response to specific goal attainment (Bruhn & Schnebelen, 2017; Watkins et al., 2018). In information-processing terms, diffuse happiness functions as a low-diagnostic background mood, while short-lived peaks of joy are encoded as stronger, more diagnostic cues that something unusually good has occurred. When price – and thus the perceived cost of making a bad decision – is high, consumers are more likely to rely on these highly diagnostic emotional signals when judging whether a review is informative and trustworthy. In our context, spikes of joy in otherwise positive reviews can therefore serve as credible indicators of standout service episodes in high-price settings, increasing the likelihood that such reviews are evaluated as helpful.
Our findings extend and deepen the knowledge on the role of discrete emotions in shaping the perceived helpfulness of online reviews and offer actionable implications for online retailers and review platforms. Specifically, they provide practical insights for businesses and third-party platforms to manage posted reviews more efficiently and to strategise in pre-empting negativity bias and readers’ responses. Earlier studies have typically focused either on a broad range of generalised emotions or on a wide array of product categories (e.g., the Yelp Review Study by Xu et al., 2023). By identifying which specific emotions contribute to perceived helpfulness, businesses can better analyse customer feedback to improve products and services. Understanding the impact of discrete emotions allows companies to manage and respond to reviews more effectively, prioritising responses to reviews that are likely to influence potential customers significantly.
First, the pronounced impact of negative emotions on review helpfulness, especially in higher-priced establishments, underscores the need for businesses to carefully monitor and address customer feedback that expresses dissatisfaction, frustration, or disappointment. This proactive engagement is vital for mitigating the potential damage of negative reviews and maintaining a positive brand image. In line with Mattila and Ro (2008), we also recommend that restaurant managers implement emotion-aware service recovery practices. Training frontline staff to recognise and respond to emotional cues, such as sadness or disappointment, during or immediately after the service encounter can prevent these emotions from escalating into negative online reviews. By addressing emotional discontent in the moment, businesses may reduce the likelihood of reputational damage before it occurs. Second, businesses should consider implementing targeted strategies for managing online reviews based on the price level of their services. For high-end restaurants, where negative emotions are more likely to influence perceived helpfulness, it is essential to prioritise swift and thoughtful responses to negative reviews. This can help not only resolve customer issues but also demonstrate a commitment to quality, which may positively influence other potential customers. We further recommend that social media or community managers adopt emotionally congruent response strategies. In line with Sparks and Bradley (2017), organisations should avoid overly neutral or generic replies, particularly in response to emotionally charged reviews. Instead, matching the emotional tone of the reviewer (e.g., expressing empathy in response to sadness or frustration) can enhance perceived authenticity, signal care, and strengthen reputational repair. Furthermore, the study’s findings suggest that fostering a mix of positive and negative emotions in reviews could be beneficial. 
Encouraging satisfied customers to leave detailed, emotionally rich feedback can help create a more balanced and credible online presence. This approach could enhance the perceived helpfulness of positive reviews by including elements that address potential concerns or challenges, thereby appealing to a broader range of potential customers. Most online reviews continue to be dominated by positive sentiment, often lacking emotional specificity. To enhance their perceived informativeness and usefulness, we recommend that businesses prompt happy customers to reflect on the emotional highlights of their experiences (e.g., “What made the experience joyful or memorable?”). Encouraging this kind of reflection can amplify the perceived authenticity and helpfulness of positive reviews, which is critical for balancing the impact of negative ones (Duan et al., 2008). From a managerial standpoint, this suggests that businesses should not only monitor the valence of emotional content in reviews but also assess their salience and diagnostic clarity. Emotionally vivid, contextually relevant reviews, whether positive or negative, are more likely to be judged as helpful, and thus have a disproportionate influence on consumer decision-making. Apart from the results, the study underscores the value of advanced machine learning models in extracting consumer insights from online reviews. Businesses can adopt artificial intelligence-driven sentiment analysis tools to better understand customer emotions, identify emerging trends, and refine service strategies accordingly. Additionally, high-end restaurants should consider differentiated review management approaches that reflect the emotional stakes of their clientele. In line with Kwortnik and Ross (2007), we recommend assigning senior managers to handle emotionally negative reviews and offering personalised post-resolution outreach, such as direct calls or in-person gestures. 
Publicly acknowledging emotionally charged feedback with gravitas can also foster reputational repair and signal empathy to prospective diners.
Previous studies often categorise emotions broadly as positive or negative. Our research delves deeper by examining specific negative emotions, such as anger, fear, and sadness, and their distinct effects on perceived review helpfulness. This granularity provides a more nuanced understanding of emotional influences on consumer perceptions. Building upon the concept of negativity bias, we incorporate the frameworks of emotional salience and diagnosticity to explain why certain emotional expressions in reviews are more impactful. This theoretical integration offers a comprehensive perspective on how consumers process emotional information in reviews. Finally, our study investigates how product price levels moderate the relationship between emotional content and perceived helpfulness. In doing so, it extends existing theories of negativity bias and price–quality inference by showing that the diagnostic impact of discrete emotions strengthens as perceived decision stakes rise with higher price levels.
Despite its contributions, this study has certain limitations. First, we only analysed the impact of helpful and unhelpful reviews on customer behaviour in the restaurant industry. Future studies can explore the impact of reviews in other industries to provide a more comprehensive understanding of the relationship between reviews and customer behaviour. Moreover, our data are drawn from English-language reviews in major U.S. cities, so cultural and linguistic variation in emotional expression is not captured and cross-cultural generalisation should be made with caution. Second, the study relies on a specific emotion detection model trained on a tweet emotion dataset, which may not fully represent the linguistic characteristics of online reviews. Although our large-scale dataset of over 2.7 million reviews mitigates this concern, future research could refine emotion classification models using industry-specific training datasets to enhance accuracy. While the model achieves satisfactory performance and a sample of its labels was manually checked against human coders, automated emotion detection remains a probabilistic approximation that may overlook subtle or context-specific nuances. Third, this study relied on publicly available review data, limiting our ability to account for reviewer-specific variables such as demographics, purchase history, or verified status. These unobserved factors may influence how emotional content is perceived. In addition, platform-level mechanisms such as ranking, visibility, and recommendation algorithms may shape which reviews are exposed widely enough to receive helpfulness votes, introducing potential exposure bias into our outcome measure. Finally, our models focus on isolating the incremental contribution of sentiment and discrete emotions and therefore do not incorporate additional covariates (e.g., review length, restaurant popularity) or multilevel structures that account for reviews nested within restaurants and cities. 
Future research could address these limitations by using enriched datasets, hierarchical modelling approaches, or mixed methods to examine how reviewer characteristics and platform algorithms interact with emotional content to shape perceived helpfulness.
Footnotes
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
