Abstract
Why do connected users in online social networks express similar emotions? Past approaches have suggested situational emotion transfers (i.e., contagion) and the phenomenon that emotionally similar users flock together (i.e., homophily). We analyze these mechanisms in unison by exploiting the hierarchical structure of YouTube through multilevel analyses, disaggregating the video- and channel-level effects of YouTuber emotions on audience comments. Dictionary analyses using the National Research Council emotion lexica were used to measure the emotions expressed in videos and user comments from 2,083 YouTube vlogs selected from 110 vloggers. We find that video- and channel-level emotions independently influence audience emotions, providing evidence for both contagion and homophily effects. Random slope models suggest that contagion strength varies between YouTube channels for some emotions. However, neither average channel-level emotions nor number of subscribers significantly moderate the strength of contagion effects. The present study highlights that multiple, independent mechanisms shape emotions in online social networks.
A large part of modern social life occurs online. Billions of people use the Internet to catch up with their friends, make dates, and maintain their hobbies. Accordingly, a substantial share of everyday emotions is elicited through social media, raising important questions about the psychological processes underlying the emotions people experience online. Different psychological mechanisms have been proposed to explain the phenomenon that emotions of connected social media users correlate (Alloway, Runac, Qureshi, & Kemp, 2014; Bazarova, Choi, Schwanda Sosik, Cosley, & Whitlock, 2015; Bollen, Gonçalves, Ruan, & Mao, 2011; Ferrara & Yang, 2015; Kramer, Guillory, & Hancock, 2014). Broadly speaking, the proposed mechanisms fall into two categories: situational emotion transfer from Person A to Person B (most frequently labeled “emotional contagion”) and general similarity between Person A and Person B (e.g., “flocking together” or homophily of emotionally similar people). The difficulty lies in disentangling the contribution of each hypothesized mechanism. Here, we utilize multilevel analyses, which can model the hierarchical structure of a major social media website (YouTube) to simultaneously estimate the effects of situational emotional contagion and homophily.
A number of studies tested for contagion effects in online social networks: Coviello and colleagues (2014), for instance, estimated the situational effect of Facebook user emotions on friend emotions. To avoid confounding contagion with other mechanisms, they restricted their analyses to user emotions predicted by rainfall. The high specificity of their model, which allowed for good statistical control, was also a weak point. Analyses focused exclusively on initial emotion expressions caused by rain and the downstream reactions of friends who lived far away from the original poster (and thus were not exposed to the initial rain). More importantly, they designed their method to test the presence of situational emotion transfers (interpreted as contagion), while not explicitly modeling the parallel mechanism of homophily. Similarly, Kramer, Guillory, and Hancock (2014) demonstrated a situational spread of emotions by experimentally manipulating people’s Facebook newsfeeds, with the finding that people express more positive emotions when they are presented with more positive emotions of other users. However, the authors did not investigate homophily as an additional mechanism that could also contribute to emotion clusters on social networks.
Conversely, a separate string of research has focused on the question of whether connected users share psychological dispositions. For instance, Youyou, Stillwell, Schwartz, and Kosinski (2017) found that Facebook friends tend to score similarly on measures of Big Five personality traits. Other studies reveal homophily on online networks between people who share social attributes (e.g., their ethnic background; Wimmer & Lewis, 2010). Regarding emotional flocking, past work found that people who express similarly valenced emotions on specific political topics were more frequently connected in network clusters on Twitter (Himelboim, Cameron, Sweetser, Danelo, & West, 2016; Yuan, Murukannaiah, Zhang, & Singh, 2014). More generally, online microblogging websites were argued to host emotion communities, which consist of interconnected users who are characterized by similar patterns of emotion expressions (Bollen et al., 2011; Zhu, Wang, Wu, & Zhang, 2017). The question of how far such communities are based on homophily versus emotional contagion (e.g., elicited by highly connected users) often remains unaddressed.
While both emotional contagion and emotional homophily have been investigated within online social networks, the vast majority of projects have focused on one of the two mechanisms in isolation. To date, there is very little work that tries to estimate both parallel mechanisms simultaneously and in mutual control for each other. Lewis, Gonzalez, and Kaufman (2012), however, investigated the spread of “tastes” (e.g., likes and dislikes of music) on social media and explicitly modeled both homophily (Person A befriends Person B because both like the Music Genre C) and taste diffusion (Person A likes Music Genre C because Friend B likes Music Genre C). The authors conclude that correlations of tastes between people are more commonly due to selection effects (cf. homophily or flocking) than to taste diffusion (cf. contagion).
Here, we attempt to estimate both effects, situational emotional contagion and emotional homophily, by concentrating on a relatively unexplored online environment: YouTube vlogs. People who own a YouTube channel occasionally upload vlogs (short for video blogs) in which they talk to the audience, presenting parts of their “thoughts, opinions, or experiences” (Cambridge Dictionary, 2018). YouTube has not attracted as much research attention in psychology as Twitter, Facebook, or Google. However, we propose that YouTube serves as a promising platform to study human emotions (Perez Rosas, Mihalcea, & Morency, 2013; Wollmer et al., 2013), as the users experience very vivid stimuli and often express their emotional reactions in the comment sections (Oksanen et al., 2015). Further, the sheer size of YouTube (2018; over 1 billion users) and the potential impact it has on people’s daily lives (1 billion hours watched daily) make it an environment worth studying for psychologists. An example of the dynamism of video-induced emotions is given by Guadagno, Rempala, Murphy, and Okdie (2013) who show that emotional reactions to videos lead to these videos going viral.
Our methodological approach is based on two pillars: First, the structure of YouTube allows us to distinguish the clustering of spectators on specific “channels” (video collections of a specific vlogger) from the emotion transfer that occurs between vlogger and audience for a specific video. Second, multilevel analysis maps onto the hierarchical structure of YouTube, in which individual videos (Level 1 or individual level) belong to a specific vlogger channel (Level 2 or group level).
With this hierarchical structure, the distinction between contagion versus homophily can be described as follows: Theories on situational emotion transfers (most prominently emotional contagion) hypothesize that there is an immediate Level 1 effect of vlogger emotion on audience emotion, and this effect should exist after controlling for the effects that channel-level emotion aggregates might have on the composition of a channel’s audience. Conversely, homophily theories propose that general/stable vlogger emotions (i.e., Level 2 aggregates of channel emotions) select audience emotions even after controlling for the emotions expressed in individual videos (because, for instance, positive audiences are drawn to channels that are generally positive). Importantly, we acknowledge that contagion and homophily are two specific labels for emotion transfers that are not without alternatives in psychological research. In fact, situational emotion transfers might also be the result of other psychological processes, such as empathy, sympathy, or selective responding. Similarly, channel-level effects are commonly described as evidence of homophily, but they might also be comprised of socialization of audiences (i.e., a sort of long-term contagion). In the current article, we investigate whether emotion transfers are related to at least two mechanisms (immediate and sustained effects) and label those mechanisms as contagion and homophily. A further subcategorization of the effects can be achieved through qualitative analyses (see Discussion section).
In line with prior research on emotional contagion and homophily, we hypothesize that vloggers’ situational (Level 1) and average (Level 2) expressions of Emotion A will both independently predict their audiences’ expressions of Emotion A.
In addition to these hypotheses, we explore whether the strength of contagion and homophily effects differs between channels. If so, we will investigate whether emotional contagion depends on the average emotions of the vlogger (i.e., cross-level interactions), and whether homophily effects differ between small and large channels. Are contagion effects for a specific emotion stronger (or weaker) on channels where that emotion is habitually expressed? And are homophily effects stronger or weaker on larger, more popular channels, given that there are more (but potentially more dissimilar) people flocking to these channels?
Method
We found the vloggers’ channels through several routes, including online vlogger lists, reports about vlogging, recommendations from colleagues, prior knowledge, and searches for the terms “vlog” and “vlogger” on YouTube and Google. Our primary concern when including additional channels in our sample was to ensure a broad coverage of different types and contents of channels and videos found on YouTube. Our final sample includes vloggers specialized in lifestyle, fashion, science, arts, traveling, makeup, gaming, cars, comedy, shopping, photography, sports, and collecting things, adding up to a final set of 2,083 YouTube vlogs from 110 vloggers. To address the possibility that this procedure affected our results, we conducted sensitivity analyses to show that our results were robust even when focusing on subsets of our data (see Supplemental Materials). The number of subscribers per vlogger varies between tens of thousands and tens of millions (M = 3,255,470, SD = 6,827,628).
Small channels (fewer than 10,000 subscribers) were not collected, despite being common on YouTube, because we are interested in audience emotions, which are simply too sparse on small channels. We excluded vlogs that did not feature English-speaking vloggers as well as very long vlogs (>15 min) that often document longer periods of a vlogger’s life (e.g., the last month/year) and that therefore include a wide range of emotions as well as large quantities of text.
There are no guidelines yet on how much text is needed to capture emotion expression in YouTube comments. We therefore used research on emotions on Twitter as a reference point. Many publications argue that 20 tweets is the minimum number of tweets required to make psychological inferences about the author (e.g., Ritter, Preston, & Hernandez, 2014; Sylwester & Purver, 2015). Therefore, we decided to scrape 20 vlogs per vlogger (or the maximum available number if fewer videos had been uploaded), because 20 vlogs usually contain substantially more text than 20 tweets and should therefore provide us with sufficient data for each vlogger. User comments are usually much shorter than spoken text in vlogs and often even shorter than individual tweets. It is therefore reasonable to assume that we need more than 20 comments to characterize a comment section. To ensure that we have an amount of text that is at least as large as 20 tweets, we sampled 120 comments per vlog. This cutoff is comparable to (or larger than) the cutoffs used in previous studies that examined YouTube comments (Oksanen et al., 2015).
We scraped the spoken text (subtitles) from the vlogs and the comments from the audience with an automated Python script. Most subtitles were machine-generated by YouTube (89 of 100 in a random sample of videos) and therefore occasionally contained errors. However, there is no large quality difference to the human-generated subtitles, and we do not assume that results of our analyses could be explained through random errors in the automatic transcriptions.
Measures
We obtained linguistic measures of positive emotion, negative emotion, and the specific emotions joy and anger for both the vlogger and audiences, by cross-referencing the words in the video captions (vlogger emotion) and comment sections (spectator emotion) with the National Research Council (NRC) emotion lexica, which provide rich collections of linguistic cues for all four constructs (e.g., “happy” indicates joy, “rage” indicates anger, “admire” indicates positive emotion but not joy specifically, and “lifeless” indicates negative emotion but not anger specifically; Kiritchenko, Zhu, & Mohammad, 2014). The emotion labels for each word in the lexica were generated by crowdsourcing on MTurk and they can be accessed via the tidytext R package (version 0.1.9; Silge et al., 2018). While dictionary-based approaches are not perfect in annotating emotions (e.g., negated adjectives like “not sad” are incorrectly classified as negative), the NRC emotion lexica can be utilized effectively to code emotions over large user-generated texts on the Internet (e.g., Korkontzelos et al., 2016). The measures represent relative frequencies (0–1) of emotion-indicative words in the analyzed texts.
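The relative-frequency measure described above can be illustrated with a minimal sketch. The lexicon here is a hypothetical four-word toy dictionary built from the example words in the text (the additional labels for “happy” and “rage” are assumptions), and the tokenizer is a simple regular expression; it is not the tidytext pipeline used in the study.

```python
import re

# Hypothetical toy lexicon for illustration only; the NRC lexica are far larger.
# Assumption: "happy" also counts as positive and "rage" also counts as negative.
TOY_LEXICON = {
    "happy": {"joy", "positive"},
    "admire": {"positive"},
    "rage": {"anger", "negative"},
    "lifeless": {"negative"},
}


def emotion_frequencies(text, lexicon=TOY_LEXICON):
    """Return the share of words in `text` that indicate each emotion (0-1)."""
    words = re.findall(r"[a-z']+", text.lower())
    emotions = sorted({label for labels in lexicon.values() for label in labels})
    if not words:
        return {emotion: 0.0 for emotion in emotions}
    counts = {emotion: 0 for emotion in emotions}
    for word in words:
        # Words missing from the lexicon contribute to the denominator only.
        for label in lexicon.get(word, ()):
            counts[label] += 1
    return {emotion: counts[emotion] / len(words) for emotion in emotions}


scores = emotion_frequencies("I admire how happy you look today")
```

In this toy sentence, two of seven words carry a positive label and one carries a joy label. A word-matching approach of this kind cannot see negation, which is the limitation noted in the text.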
Analyses
We employed a multilevel approach in which we model emotions expressed by the audience based on emotions expressed in vlogs and emotion expressions averaged per vlogger. Individual-level emotions were entered as grand mean-centered vlog emotions and group-level emotions were entered as the vlogger-average emotion. Disaggregating the effects of video versus vlogger emotion by entering the predictor variable once as a grand mean-centered variable and once as the vlogger averages is the easiest way to disentangle the Level 1 from the Level 2 effect, as significance tests for both effects are immediately provided in a multilevel model.
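As a rough sketch of how the two predictors could be constructed, the snippet below derives a grand mean-centered video score and a channel average from a list of videos. The data, the channel identifiers, and the field names `level1` and `level2` are hypothetical; whether the channel averages were additionally centered is not stated in the text, so they are left uncentered here.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy data: (channel id, vlogger emotion score of one video).
videos = [("a", 0.02), ("a", 0.04), ("b", 0.06), ("b", 0.08)]

# Grand mean across all videos, regardless of channel.
grand_mean = mean(score for _, score in videos)

# Average vlogger emotion per channel (the Level 2 predictor).
scores_by_channel = defaultdict(list)
for channel, score in videos:
    scores_by_channel[channel].append(score)
channel_mean = {c: mean(s) for c, s in scores_by_channel.items()}

# One row per video: centered video emotion (Level 1) plus channel average (Level 2).
rows = [
    {"channel": c, "level1": s - grand_mean, "level2": channel_mean[c]}
    for c, s in videos
]
```

Both columns would then enter the multilevel model as fixed effects alongside a random intercept per channel.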
Results
Descriptive results for all emotions expressed by the vloggers and the audiences can be found in Table 1. We started modeling audience emotions with the so-called empty models which only include a random intercept. Such models indicate whether a multilevel approach is necessary, by quantifying the amount of variance (here: variance in audience emotions) explained by between-group (here: between-channel) differences. A significant effect of the random intercept as well as an intraclass correlation of >.05 indicate the necessity of multilevel modeling. The empty models revealed significant effects of the random intercepts (p < .001) and the computed intraclass correlation ranged from .145 (negative emotion) to .421 (joy), indicating substantial between-channel differences in emotion expression.
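For readers unfamiliar with the intraclass correlation, the following sketch shows the standard computation from the two variance components of an empty model. The variance values are hypothetical and chosen only to make the arithmetic easy to follow; they are not estimates from the study.

```python
def intraclass_correlation(between_var, within_var):
    """Share of total variance attributable to between-group differences."""
    total = between_var + within_var
    if total <= 0:
        raise ValueError("variance components must sum to a positive value")
    return between_var / total


# Hypothetical variance components, chosen only for illustration.
icc = intraclass_correlation(between_var=1.0, within_var=3.0)

# Rule of thumb mentioned in the text: an ICC above .05 suggests multilevel modeling.
needs_multilevel = icc > 0.05
```

Here `between_var` stands for the random intercept variance (between channels) and `within_var` for the residual variance (between videos within a channel).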
Table 1. Descriptive Statistics for Vlogger and Audience Emotion.
Note. The table depicts Level 1 descriptive statistics of vlogger and audience emotions. The scale reflects relative frequency (0–1) of emotion-indicating words in all expressed words.
Next, we estimated two models predicting each emotion: Model 1 included the (grand mean centered) Level 1 emotion expressions of the vlogger; Model 2 also included the Level 2 averages of vlogger emotion. Table 2 shows the results of all sequences of models, which are described in the following section.
Table 2. Multilevel Models Predicting Audience Emotions From Vlog Emotions.
Note. ICC = intraclass correlation; empty model = random intercept only model; Level 1 effect = fixed effect of grand mean-centered video emotion; Level 2 effect = fixed effect of emotion averages for channel/vlogger.
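One way to write down the two models is sketched below. The notation is an assumption introduced here for illustration: y_{ij} is the audience emotion for video i on channel j, x_{ij} the vlogger emotion in that video, \bar{x}_{\cdot\cdot} the grand mean, \bar{x}_{\cdot j} the channel average, u_{0j} the random intercept, and e_{ij} the residual.

```latex
\begin{align}
\text{Model 1:}\quad y_{ij} &= \gamma_{00} + \gamma_{10}\,(x_{ij} - \bar{x}_{\cdot\cdot}) + u_{0j} + e_{ij} \\
\text{Model 2:}\quad y_{ij} &= \gamma_{00} + \gamma_{10}\,(x_{ij} - \bar{x}_{\cdot\cdot}) + \gamma_{01}\,\bar{x}_{\cdot j} + u_{0j} + e_{ij}
\end{align}
```

Under this notation, \gamma_{10} corresponds to the Level 1 (video) effect and \gamma_{01} to the Level 2 (channel) effect.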
Model 1 tested the effects of individual-level (i.e., video-specific) emotion expressions on audience reactions. In other words, we tested for the effects of video-level emotional contagion without controlling for channel-level homophily. There were significant positive effects of video emotions on audience emotions (positive emotion: b = .246, SE = .027, p < .001; negative emotion: b = .384, SE = .046, p < .001; joy: b = .303, SE = .028, p < .001; and anger: b = .42, SE = .032, p < .001).
As a next step, Model 2 additionally entered group-level (i.e., channel-averaged) emotion expressions as a fixed effect into the models. There were significant positive effects of group-level vlogger emotion on audience emotion (positive emotion: b = .531, SE = .188, p = .006; negative emotion: b = .596, SE = .123, p < .001; joy: b = .655, SE = .197, p = .001; and anger: b = .665, SE = .103, p < .001), providing evidence for the effect of user homophily. Importantly, the effects of vlog-specific emotions remained significant even when controlling for channel-level emotions (positive emotion: b = .235, SE = .028, p < .001; negative emotion: b = .301, SE = .05, p < .001; joy: b = .29, SE = .029, p < .001; and anger: b = .37, SE = .033, p < .001). However, the effects of video-specific emotion decreased (4% for positive emotion, 22% for negative emotion, 4% for joy, 12% for anger) when aggregated emotions were added to the models, indicating that there is some confounding between both effects if analyzed individually.
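The reported decreases can be reproduced from the Level 1 coefficients given in the text. The sketch below computes the relative drop from Model 1 to Model 2 for each emotion; the helper name `relative_decrease` is introduced here for illustration.

```python
# Level 1 coefficients as reported in the text (Model 1 vs. Model 2).
model1 = {"positive": 0.246, "negative": 0.384, "joy": 0.303, "anger": 0.42}
model2 = {"positive": 0.235, "negative": 0.301, "joy": 0.29, "anger": 0.37}


def relative_decrease(before, after):
    """Fraction by which a coefficient shrinks from `before` to `after`."""
    return (before - after) / before


# Percentage decrease per emotion, rounded to whole percent.
decrease_pct = {
    emotion: round(100 * relative_decrease(model1[emotion], model2[emotion]))
    for emotion in model1
}
```

Rounded to whole percent, these values match the percentages stated above.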
Exploring random slopes
Our primary analyses used random intercepts to control for variability between vlogger channels. We further estimated random slope models to examine the reliability of contagion effects across channels. We found that allowing the slopes to vary significantly improved model fit for positive emotions, χ²(2) = 11.362, p = .003; joy, χ²(2) = 23.395, p < .001; and anger, χ²(2) = 19.553, p < .001, while we did not find a significant improvement for negative emotions in general, χ²(2) = 4.474, p = .107. Thus, some vlogger characteristics appear to have affected the strength of emotion transfers between video and spectators, at least for some emotions. The model improvements were, however, generally not large, and model selection based on the Bayesian information criterion would favor the more parsimonious model for both negative and positive emotions (Akaike information criteria consistently favor the random slope model). Figure 1 illustrates how random slope models disaggregate video-level and channel-level effects.
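The p values of these likelihood ratio tests can be checked by hand, because a chi-square distribution with 2 degrees of freedom has the closed-form upper-tail probability exp(-x/2). The sketch below applies that formula to the test statistics reported in the text; the function name is introduced here for illustration, and the formula holds only for the 2-degree-of-freedom case.

```python
import math


def chi2_sf_df2(statistic):
    """Upper-tail probability of a chi-square distribution with 2 degrees of freedom."""
    if statistic < 0:
        raise ValueError("a chi-square statistic cannot be negative")
    return math.exp(-statistic / 2)


# Likelihood ratio statistics as reported in the text.
lr_statistics = {"positive": 11.362, "joy": 23.395, "anger": 19.553, "negative": 4.474}
p_values = {emotion: chi2_sf_df2(x) for emotion, x in lr_statistics.items()}
```

For other degrees of freedom, a general chi-square survival function from a statistics library would be needed instead.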

Figure 1. The video-level effects of vlogger emotions on spectator emotions (solid lines) are estimated within vlogger channels and under consideration of average vlogger emotions (dashed lines). Almost all video-level slopes (99.3%) remain positive while varying in size.
To explore which channel attributes could explain the conditional strength of emotion transfers, we added cross-level interaction terms between channel and video emotions to our models. No statistically significant interactions emerged, all |t(1,971)|s ≤ 1.064, all ps = ns. We also found no significant interactions between contagion effects and channel size, all |t(1,952)|s ≤ 1.241, all ps = ns, or homophily effects and channel size, all |t(105)|s ≤ 1.953, all ps = ns, for any of the four emotions. Therefore, the marginal conditionality of the strength of contagion and homophily effects remains to be explained.
Discussion
The present analyses show two independent ways that emotions spread in the YouTube community. The first is an immediate emotion transfer that occurs when audience members watch a vlogger express emotions in a video. The second path is between average vlogger emotions (i.e., emotion averages over vlogs) and audience emotions, which materializes beyond the effect of the emotions in the vlog that is currently being watched. The two most popular interpretations of these two effects are emotional contagion for the immediate effect and similarity-based flocking (or homophily) for the sustained effect. Our analyses show that both effects, which were proposed in past psychological research, contribute independently to the apparent spread of emotions over social media. However, only the emotional contagion effect can really be labeled a spreading effect, as emotions are actually transferred from user to user. Homophily works the other way around by bringing users with similar emotions closer together. Thus, our models reveal that there is a spread of emotions as well as a “despread” (inching together) of similar users that lead to the observed correlations between the emotions of different people online. The demonstrated confounding of both effects shows that neither should be interpreted without consideration of the other.
In line with Lewis and colleagues (2012), our analysis suggests that the channel effect contributes more to the explanation of audience emotions than video effects. In the presented models, an increase in average emotionality by 10% predicts an increase of roughly 5–6.5% in audience emotionality for all emotion variables, whereas equivalent video effects generally predicted about 2.5–3.5% increases in emotion expressions. This difference suggests that viewer emotions can be better predicted based on who rather than what they watch in any given moment. This reasoning is supported by the fact that the decision to watch a specific vlogger is usually a more informed decision than the choice to watch a specific vlog because users usually have less information about the contents of specific vlogs. Thus, dispositional emotionality of the viewer is more strongly linked to the overall channel than any individual video. However, individual videos provide very salient, in situ emotion expressions, which should have a strong effect on the viewers. We speculate that we found stronger effects for the channel level, as many vloggers express very characteristic (i.e., invariant) emotions, which leave little room for video-level effects.
Importantly, our study builds on prior research by demonstrating that contagion and homophily effects do not only occur for message-based social media websites like Twitter or Facebook but also on the video-based platform YouTube. As emotion expressions are very vivid in video format and given that many vloggers have millions of followers watching their frequent vlogs, we conclude that YouTube constitutes a highly impactful source of emotions as well as a meeting point for emotion communities. In a recent report (Royal Society for Public Health, 2017), YouTube was estimated to have the most positive impact on the well-being of young people in comparison to other big social network sites. Emotion transfers can certainly be expected to form part of this effect, albeit not always in a positive direction.
Our estimation of random slopes models shows that emotional contagion appears to be a reasonably stable effect, as it occurs for almost all investigated YouTube channels and emotions (99.3% of all coefficients were positive). Still, the strength of emotional contagion occurring for individual videos appears to be affected by vlogger characteristics. We started exploring which channel characteristics might be responsible for the differences in contagion strength. Our analyses of average vlogger emotions and channel size did, however, not lead to any significant results. We speculate that we did not have the right data to explain why emotion transfers partly depend on the YouTuber. Channel popularity and emotionality are salient attributes but need not necessarily moderate emotion transfers. For future research efforts, it might be more worthwhile to consider moderating attributes such as vloggers’ charisma (Cherulnik, Donley, Wiewel, & Miller, 2001), status (Delvaux, Meeussen, & Mesquita, 2016), and facial expressiveness (Wild, Erb, & Bartels, 2001), which have been shown to affect emotion transfers. Coding these YouTuber characteristics might enable us to better understand the conditional strength of emotion transfers on social media.
Beyond Contagion and Homophily
The presented analyses demonstrate that there are at least two reasons why emotions correlate on social media. While we and past research have labeled these two effects emotional contagion and homophily, we want to emphasize that the exact psychological explanations for the immediate and the sustained effect remain undetermined in computational social science (Salganik, 2017). In order to give a more realistic appreciation of computational research on emotion transfers, we go on to contemplate how the effects, observed here and in prior research, could be reinterpreted and broken down further into different submechanisms.
After emotional contagion, empathy appears to be the second most prominent mechanism explaining immediate transfers of emotions between individuals. While the exact distinction between both mechanisms is complicated (e.g., Wispé, 1987), empathy with a vlogger (especially “cognitive empathy”) implies putting yourself into the vlogger’s shoes, whereas emotional contagion does not require spectators to understand the vlogger’s situation (Preston & de Waal, 2002). Both processes are distinct (but overlapping), describe emotion transfers, and could therefore form part of our individual-level effect. Yet another form of individual-level emotion transfers is sympathy, which unlike empathy and emotional contagion does not necessarily imply an emotion matching between people (Preston & de Waal, 2002). An example would be to be happy or sad for a vlogger because something happened to the vlogger. A qualitative assessment of the user comments supports our hunch that the immediate Level 1 effect is again split into at least these three parallel effects. We find instances of apparent contagion (his laugh always makes me laugh), empathy (I HAVE […] TOO! I am constantly being asked if I am okay and it annoys me so much), and sympathy (happy to hear ur doing good). Yet another possibility that is rarely considered is selective responding. An example can illustrate this Level 1 effect. A YouTuber can be quite positive, which leads positive people to flock to the channel and some (but certainly not all) negative people to discard the channel (i.e., Level 2 homophily). The commenting behavior of the remaining negative people might be affected by the emotions of a specific video, with negative videos leading to increased commenting of this viewer group, thereby leading to a Level 1 effect of video emotion. 
Research on depression supports this potential mechanism by showing that depressed individuals show increased attention to negative emotions in other people (e.g., Joormann & Gotlib, 2007).
Similarly, the Level 2 effect could consist of distinct but parallel subeffects. The common interpretation of the channel effect is that there is homophily between vloggers and audiences. However, audience socialization is equally applicable to explain the observed Level 2 effect. This effect would be based on the gradual formation of norms (e.g., “being positive”) among people that regularly follow a vlogger. While both potential Level 2 effects lead to the development of emotion communities, one occurs through the selection of group members, while the other occurs through changes within group members (Anderson, Keltner, & John, 2003).
Our study demonstrates that the spread of emotions over social media splits into situational and sustained mechanisms. Still, there are many distinct effects, identified in basic psychological research, which can (jointly) explain both mechanisms. We hope that our discussion of some of these mechanisms raises researchers’ awareness of the frequent uncertainty of psychological labels in computational research.
Limitations
While controlling for channel-level effects makes the effect of immediate emotion transfers more interpretable, there might be flocking artifacts left over in the Level 1 effects. For instance, it is possible that regular followers of a YouTube channel skip a video if the title appears to be in dissonance with their own traits (e.g., positive people might be less inclined to watch a video of their favorite positive vlogger if the video title is: “Today was a sad day”). Still, this spontaneous de-/flocking should not be overestimated. Compare it to skipping an episode of your favorite TV show because the title of the episode does not fit your personality or skipping a book of your favorite author if the title is less aligned with your traits than the titles of prior books (which you loved). In fact, we assume that the opposite effect might be more reasonable with positive people being intrigued when their favorite positive vlogger suddenly posts a video with a sad title. While we estimate the effect of these artifacts to be small, their existence is still reasonable and could be targeted in future research efforts.
Generally, and related to the point above, research and analysis designs on YouTube are limited as it is not possible to assemble comments given by one person on different YouTube videos. An accumulation and analysis of such “commenter-level” texts would enable researchers to analyze network phenomena like homophily more closely on YouTube. It would, for instance, allow researchers to quantify the independent contributions of homophily and audience socialization. Homophily effects could be quantified as the change in the number of dispositionally positive or negative viewers, whereas socialization could be quantified as the change in the commenting behavior of recurring viewers. Importantly, such analyses would require strict ethics regulation as individual user data would be analyzed.
Conclusion
We demonstrate the existence of immediate and sustained mechanisms which help to explain the spread of emotions over social media. The emotions expressed in YouTube videos, as well as the dispositional emotionality of a vlogger, independently predict the emotions experienced by audience members. Commonly, these two effects are labeled emotional contagion and homophily. However, new data science techniques to collect and process data should not lead to theory tunnel vision in psychological research. We therefore discuss that the distribution of emotions over social networks is likely based on a host of additional mechanisms such as empathy, sympathy, and audience socialization, which, when taken together with contagion and homophily effects, explain why connected users express similar emotions.
Supplemental Material
SPPS820309_suppl_mat for Multilevel Emotion Transfer on YouTube: Disentangling the Effects of Emotional Contagion and Homophily on Video Audiences by Hannes Rosenbusch, Anthony M. Evans and Marcel Zeelenberg in Social Psychological and Personality Science
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
