Abstract
Recent research has explored the role of empathy in the context of music listening. Here, through an empathy priming paradigm, situational empathy was shown to act as a causal mechanism in inducing emotion, although the way empathy was primed had low levels of ecological validity. We therefore conducted an online experiment to explore the extent to which information about a composer’s expressive intentions when writing a piece of music would significantly affect the degree to which participants reportedly empathise with the composer and in turn influence emotional responses to expressive music. A total of 229 participants were randomly assigned to three groups. The experimental group read short texts describing the emotions felt by the composer during the process of composition. To control for the effect of text regardless of its content, one control group read texts describing the characteristics of the music they were to hear, and a second control group was not given any textual information. Participants listened to 30-second excerpts of four pieces of music, selected to express emotions from the four quadrants of the circumplex theory of emotion. Having heard each music excerpt, participants rated the valence and arousal they experienced and completed a measure of situational empathy. Results show that situational empathy in response to music is significantly associated with trait empathy. As opposed to those in the control conditions, participants in the experimental group responded with significantly higher levels of situational empathy. Receiving this text significantly moderated the effect of the expressiveness of stimuli on induced emotion, indicating that it induced empathy. We conclude that empathy can be induced during music listening through the provision of information about the specific emotions of a person relating to the music. These findings contribute to an understanding of the psychological mechanisms that underlie emotional responses to music.
Music is well documented as being a highly emotive art form. However, the mechanisms by which this occurs, while frequently researched, are still poorly defined due to the absence of agreement between the multiple existing models. An emotion is often conceptualised as a brief episode that is characterised by the synchronisation of expression, activation, feelings and arousal in response to a specific stimulus (Scherer, 2005). Furthermore, it is well established that, within the Western context, expressive characteristics of the music can convey emotion (see Juslin & Laukka, 2003). However, an emotion recognised as being expressed in a piece of music may not always equate to the emotion induced in and felt by a listener (Egermann & McAdams, 2013). For example, music may be recognised as expressing sadness without making the listener feel sad. There are two theoretical models that explain why emotional expressions in music might (under certain conditions) lead to felt and congruent emotional induction. Scherer and Zentner (2001) describe several emotion production rules, listing empathy as one of five basic psychological mechanisms that are involved in generating emotional responses to music. The second often-cited theoretical framework of emotional responses to music is the BRECVEM (Brainstem reflex, Rhythmic entrainment, Evaluative conditioning, Contagion, Visual imagery, Episodic memory, and Musical expectancy) model, proposed by Juslin and Västfjäll (2008). In addition to other mechanisms, they suggest emotional contagion as a possible source of emotion induction through music. These two theories of musically induced emotion have in common that they reference two related mechanisms: emotional contagion and empathy. Both have at their core the principle that emotional expressions are felt by and induced in listeners.
While these theoretical accounts might seem convincing, very few experimental investigations have tested the effect of these mechanisms on listeners’ responses. The study presented here therefore aims to fill this gap and test the causal role that empathy plays in moderating emotional responses to expressions in music.
Music, empathy, and emotion
While several studies have tested the relationship between music and empathy, they have often relied on correlational analyses of naturally occurring inter-individual differences in trait empathy and emotional responses to music. For example, Vuoskoski and Eerola (2011) analysed correlations between dispositional empathy and emotion ratings and found that scores on the fantasy subscale were associated with the intensity of emotional responses evoked by music. The same authors also found that trait empathy contributed to the susceptibility to sadness induced by unfamiliar music, once again based on correlations between naturally occurring inter-individual differences in trait empathy and the intensity of emotional responses to the music; they did not experimentally manipulate empathy as an independent variable (Vuoskoski & Eerola, 2012; see also Vuoskoski et al., 2012). In another example, Eerola et al. (2016) concluded from self-reports of felt emotion and a pictorial facial expression judgment task that being moved by sad music is associated with empathy, having correlated the self-report data with scores on a measure of general social trait empathy: the Interpersonal Reactivity Index (IRI; Davis, 1980). Wöllner (2012) correlated self-reported ratings of expressivity provided by string quartet performers with those provided by observers and found that empathy facilitates estimations of other individuals’ expressive intentions. Egermann and McAdams (2013) showed that self-rated empathy was responsible for reducing the difference between recognised and felt emotion ratings in response to musical stimuli. However, they also did not attempt to manipulate the extent to which their participants empathised with musicians.
Balteş and Mui (2014) correlated scores on the Toronto Empathy Questionnaire (TEQ; Spreng et al., 2009) with ratings on the Geneva Emotional Music Scale (GEMS; Zentner et al., 2008) and found that trait empathy was associated with increased sublimity and unease.
However, authors of previous studies have relied on naturally occurring differences between individual levels of trait empathy, which limits the internal validity of their studies. Accordingly, measured levels of trait empathy might correlate with other unknown, uncontrolled personal characteristics. The need to isolate and control for the separate processes contributing to empathy has also been noted by recent researchers (e.g., Carr & Mendez, 2018; Healy & Grosman, 2018; Lamm et al., 2007; Mui & Vuoskoski, 2016), specifically through the use of an “empathy priming paradigm” (Wallmark et al., 2018, p. 16).
To the best of our knowledge, the first experimental investigation to test the causal influence of situational empathy on emotional responses to music was carried out by Mui and Balteş (2012). Using a within-participants design and two musical stimuli, they instructed their participants to experience either high empathy by imagining “as vividly as possible how the performer feels, what is described in the music and [trying] to feel those emotions” or, conversely, low empathy by trying “to take an objective perspective toward what is described in the music and . . . not to get caught up in how the performer might feel” (p. 3). Participants’ subjective and physiological responses to the two stimuli differed according to whether they heard them in the high- or low-empathy condition. Thus, by inducing empathy in response to specific situations and finding that emotional responses to music altered, Mui and Balteş produced evidence to support the causal role of empathy in moderating emotional responses to music.
However, there are two issues with the instructions used that limit the validity of this investigation. First, instructing someone to empathise (or not) lacks ecological validity, since that will rarely occur in many music listeners’ lives. Second, we believe that instructing someone to “take an objective perspective” towards the music is almost equivalent to instructing them to feel no emotion at all. We therefore conclude that, in this investigation, the authors did not compare an empathetic response with an unempathetic one, but rather compared a generally emotional response with a response in which the participant had suppressed their emotions. It is therefore still unknown if empathy could cause an emotional response to music. We felt it necessary to conduct an experimental investigation in which empathy would be induced in a more ecologically valid way, and then tested to find out if it moderated induced emotional responses to emotional expression in music.
Models of empathy
The majority of models of empathy that have been published share a similar tripartite structure. According to Decety and Jackson (2004, 2006) these processes comprise an affective response that often involves sharing the emotional experience of another being, a cognitive ability to recognise and take the perspective of the other person, and a regulatory mechanism that keeps track of the source of feelings. One model refers to these three processes as mentalising, experience sharing and sympathy (Zaki & Ochsner, 2012). Singer and Lamm (2009) describe empathy as an affective state, isomorphic to the observed emotion and consciously attributed to an external source, and the subsequent reaction, thus obscuring the clarity of the distinction between the three processes but covering them, nonetheless. Other accounts describe empathy as having four parts: affective sharing, self-awareness, perspective taking and emotion regulation. Although self-awareness and perspective taking are grouped in many models, they are separate in this one (Gerdes et al., 2010). Social psychologists have labelled the parts of empathy more broadly as antecedents, processes, intrapersonal outcomes and interpersonal outcomes (Davis, 2018, pp. 13–21). In music psychology, a model of empathy describes constructs similar to those of the disciplines mentioned above. The Common Coding Model of Prosocial Behaviour Processing (Schubert, 2017) depicts a process involving the recognition of an emotion, mimicry of this emotion causing an embodied emotion, the cognitive act of perspective taking and prosocial behaviour. Taken together, these models reveal the consensus that empathy has three components. In this article we term them emotion recognition, emotion contagion and perspective taking.
Inducing and measuring situational empathy
To devise an empathy induction method that would be ecologically valid, we took inspiration from a study on the effect of programme notes on participants’ musical preferences (Margulis, 2010). In this study, participants were given different types of textual information relating to the music to which they were then exposed. The texts they received were either “structural” or “dramatic” in content, but they were matched for length. Margulis found no significant effect of the content of programme notes on music preference. While the purpose of her research differed from ours as it tested preference, not empathy, we adapted her strategy of giving participants textual information with different semantic content for our purposes. In another study in which text was used to influence emotional responses to music, participants were played two pieces of music and given two accompanying descriptions of visual imagery: a “sad narrative” and a “neutral narrative” (Vuoskoski & Eerola, 2015, p. 265). The authors found that the sad narrative intensified the sadness felt in response to the music that expressed sadness. This result suggests that providing contextual information is a useful method for moderating emotional responses to music. The impact of programme notes has also been tested qualitatively by Bennett and Ginsborg (2018), who found that 39% of their 29 participants reported that the information had a positive impact on their experience of the music. In particular, participants who themselves had greater experience of listening and performing were less likely to accept the information in the programme note.
To find out whether informational texts had indeed induced empathy, a tool had to be chosen that would adequately measure each participant’s level of empathic experience at the moment of testing (Zhou et al., 2003). Methods of measuring empathy have included measuring facial and verbal responses to emotionally evocative videotapes (Roberts & Strayer, 1996), and comparing concurrent empathic responses with repeated self-reports (Ickes et al., 1990). Other methods include the Multifaceted Empathy Test (Dziobek et al., 2008). Participants are shown photographs of individuals, infer their mental states and rate their emotional reactions to the photographs. Participants’ emotional empathy is assessed based on their ratings. Yet another option is to evaluate combined reports of concern and arousal in response to short movie excerpts (Kuypers, 2017). However, these measures all rely on non-musical social stimuli such as a picture, another participant or a video excerpt, which would not be applicable in a music listening context. The IRI (Davis, 1980) is a widely used measure of trait empathy that measures components of empathy, namely perspective taking, empathic concern, fantasy and personal distress. More recently, Kreutz et al. (2008) developed the Music-Empathizing-Systemizing (ME-MS) Inventory, which tests for general music empathy. Both the IRI and ME-MS inventories measure trait or dispositional empathy, however, rather than the situational response we sought to test. For these reasons we chose to develop our own measure of situational empathy. The items that we included were based on the three commonly accepted elements of empathy: emotion recognition, emotion contagion and perspective taking. The exact wording of the items was based on previous attempts to capture situational empathy (e.g., Shen, 2010).
Aims, research questions, and hypotheses
The aims of this study were, first, to find out if empathy could be confirmed as a mechanism underlying emotional induction in music listening; we induced empathy by providing listeners with textual information. Second, we aimed to measure situational empathy; we did this using a psychometric tool that we developed ourselves, the Situational Music Empathy Measure. Accordingly, informed by the theories and previous research presented above, we tested the following hypotheses in the current study (see Figure 1):

Figure 1. Illustration of theoretical music listening empathy model with hypotheses (H1–4) tested in this experiment.
Methods
Ethics compliance
All participants who took part in this study gave informed consent in keeping with the ethical guidelines from the University of York Arts and Humanities Ethics Committee, who formally approved this study. Each participant had the right to leave the study at any time. While the information participants were given at the outset concerned the procedure to be used in the study, its aims were not revealed until they received a full debriefing at the end.
Musical stimuli
The four pieces of music used in the present study were selected to represent each of the four quadrants of the circumplex theory of emotion (Russell, 1980), and were played to all the participants in a randomised order. Each music excerpt was from a film score and lasted 30 seconds:
Roll Tide (Zimmer, 1995) – high arousal, high valence;
Main Theme from Chocolat (Portman, 2000) – low arousal, high valence;
Main Theme from Halloween (Carpenter, 1979) – high arousal, low valence;
A Small Measure of Peace (Zimmer, 2003) – low arousal, low valence.
A pre-test study with 11 participants was conducted before the main study to ascertain whether they agreed that each music excerpt represented its intended quadrant. This was done by playing the music excerpts and asking the participants to rate arousal and valence for each one using two sliders with a range of −10 to +10. The results can be seen in Figure 2. The mean ratings for each of the four music excerpts can be seen in Table 1.

Figure 2. Arousal and valence ratings for the four music excerpts.
Table 1. Range, mean and standard deviation of the Arousal and Valence ratings for each of the music excerpts by participants in the pre-study (n = 11).
Our intention in this study was to select music that was unfamiliar to the majority of participants. We checked this by asking participants to rate their familiarity with each of the four music excerpts on a scale of 1 (not at all familiar) to 5 (extremely familiar). The overall mean familiarity rating was 1.30 (Roll Tide M = 1.25, SD = 1.00; Chocolat M = 1.28, SD = 1.24; Halloween M = 1.51, SD = 1.16; A Small Measure of Peace M = 1.15, SD = 0.88). Unfamiliar music was chosen because it has been found that musical stimuli with which the listener is familiar can induce emotion by triggering memories (Tahlier et al., 2013). In this situation, association and memory, rather than the emotional quality of the music, are the mechanisms by which emotion is induced. Nevertheless, there is also evidence to suggest that unfamiliar music can stimulate strong emotional responses (Gabrielsson, 2011).
Empathy Induction Manipulation
As discussed above, the study was conducted to find out how situational empathy affects emotional responses to music. It was therefore necessary to induce empathy. Participants were randomly assigned to one of three groups in which they received either an empathy-induction text (the experimental group) or a “structural” music text (control group 1) or no text (control group 2; the texts can be seen in Appendix A). The texts were modelled on those used by Margulis (2010) but aimed to induce empathy. It had to be decided with whom the researcher was attempting to get the participants to empathise; options included the composer, the performer or performers, the person about whom the piece had been written or any other fictional figure. In the present study participants were encouraged to empathise with the composer of the piece, so the contextual information received by the experimental group included the circumstances in which the music was conceived or composed. Control group 1 participants were given text of a similar length, but its content was about the music excerpts’ musical characteristics: the instruments used and their structure, dynamics and tempo. This was to test the effect of text, regardless of its semantic content, on empathy.
Self-Reports
All data were collected digitally, via an online Qualtrics questionnaire. Participants completed the study at a time and in a location of their choice. Having heard each music excerpt, participants were asked to rate their feelings of valence (the degree to which something is experienced as pleasant or positive; Posner et al., 2005) and arousal (the state of being physiologically alert, awake and attentive) on a scale of −10 to +10. Such emotion ratings can be mapped onto each of the four quadrants of the circumplex theory of emotion (Russell, 1980). This method of measuring emotion was chosen because it was expedient and efficient. It allowed us to compare the emotions felt by the participants with the emotions expressed by the music; it also had the advantage, unlike other measures of emotion that have considerably more parameters, of adding only two questions to a questionnaire that was already long and demanding. Participants in this study did, however, have to report their emotional response four times, because they listened to four music excerpts; in addition, they were asked how familiar they were with the music and how much they liked it.
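The mapping from a pair of ratings to a circumplex quadrant can be illustrated with a minimal Python sketch. The function name and the example ratings are hypothetical, and the treatment of a rating of exactly 0 as "low" is an assumption, since the handling of midpoint ratings is not described here.

```python
def circumplex_quadrant(valence: float, arousal: float) -> str:
    """Label a (valence, arousal) rating pair with its circumplex quadrant.

    Ratings are expected on the -10..+10 scale described in the text.
    Assumption: a rating of exactly 0 counts as "low".
    """
    if not (-10 <= valence <= 10 and -10 <= arousal <= 10):
        raise ValueError("ratings must lie between -10 and +10")
    arousal_label = "high arousal" if arousal > 0 else "low arousal"
    valence_label = "high valence" if valence > 0 else "low valence"
    return f"{arousal_label}, {valence_label}"


# Hypothetical (valence, arousal) pairs for illustration; not study data.
example_ratings = [(6, 7), (5, -4), (-7, 8), (-3, -6)]
labels = [circumplex_quadrant(v, a) for v, a in example_ratings]
```

The labels follow the "arousal, valence" order used in the list of musical stimuli, so a felt rating can be compared directly with the quadrant an excerpt was selected to express.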
An attention check in the form of a brief multiple-choice question about the text they had read before listening to the music followed, for participants in the experimental group and control group 1 only, to ensure that they had read it. Next, all participants completed the Situational Music Empathy Measure (see Appendix B). This consists of ten theoretically informed items, based on the three components of empathy identified in the literature: emotion recognition (two items), emotion contagion (four items) and perspective taking (four items).
Participant background characteristics
The IRI (Davis, 1980) was used as a measure of trait empathy, to find out if inter-individual trait differences between participants were associated with their situational empathy responses. Demographic questions were included at the end of the questionnaire to assess participants’ age, sex, level of education, level of musical training and, if it was novice or higher, what instrument they played; participants also completed the Ten Item Personality Inventory (Gosling et al., 2003) and rated their level of concentration during the study on a scale from 1 (low) to 10 (high). Those participants who reported a concentration level ⩽3 were removed from the analysis.
Participants
Participants were recruited in three main ways: via (1) social media across the UK (n = 86); (2) the use of free, online participant recruitment websites Survey Circle, Survey Tandem and Poll Pool (n = 25); and (3) emails to institutions such as universities, choirs and community projects (n = 118).
Accordingly, a total of 229 participants started filling in the questionnaire (female = 134, male = 56, other = 1, did not answer = 38). Data from 34 participants were excluded due to incomplete results (n = 21), incorrect answers to the attention check questions (n = 6), an incorrect response to the listening test (n = 3) or concentration ratings ⩽3 (n = 4). If participants completed all ratings following the musical examples their data were used, even if they did not complete the demographic questions at the end of the questionnaire. The participants had an age range of 14 to 82 years with a mean age of 36 years. A total of 43.5% of participants had a bachelor’s degree or equivalent and 42.9% had a postgraduate qualification; only 1% of participants had no formal qualifications. Participants who reported having no musical training comprised 36.6%; 21.5% described themselves as novice musicians, 31.4% as amateur musicians, and 10.5% of participants claimed to be professional musicians. In terms of listening to the music excerpts, 42.6% of participants used headphones, 27.1% used computer speakers, 27.3% used phone speakers and 6% reported listening using other means.
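The exclusion criteria listed above can be expressed as a simple filter. The following Python sketch is illustrative only: the record structure and field names are hypothetical, the order in which criteria are checked is an assumption, and a missing concentration rating is assumed not to exclude a participant.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Response:
    """One participant's record; all field names are hypothetical."""
    completed_all_ratings: bool
    passed_sound_test: bool
    attention_checks_correct: Optional[bool]  # None: no-text group, no check shown
    concentration: Optional[int]  # 1 (low) to 10 (high); None if unanswered


def exclusion_reason(r: Response) -> Optional[str]:
    """Return the first matching exclusion reason, or None to keep the record."""
    if not r.completed_all_ratings:
        return "incomplete"
    if r.attention_checks_correct is False:
        return "attention check"
    if not r.passed_sound_test:
        return "listening test"
    # Assumption: an unanswered concentration item does not trigger exclusion.
    if r.concentration is not None and r.concentration <= 3:
        return "low concentration"
    return None


def apply_exclusions(responses: List[Response]) -> List[Response]:
    """Keep only the records that match no exclusion criterion."""
    return [r for r in responses if exclusion_reason(r) is None]


# Made-up records for illustration; not study data.
sample = [
    Response(True, True, True, 8),
    Response(False, True, True, 9),
    Response(True, True, False, 7),
    Response(True, False, None, 6),
    Response(True, True, None, 3),
    Response(True, True, None, None),
]
kept = apply_exclusions(sample)
```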
Procedure
Having been provided with information about the procedure to be used in the study, participants signed a digital consent form. Next, they completed a sound test by listening to a short audio file and responding to the question “What was the first instrument you heard?” This test ensured that the volume was set to a comfortable level and that participants would be able to hear the music excerpts used as stimuli. It also served as a hurdle: participants could not proceed unless they responded, preventing them from taking part in the study by merely clicking through the questionnaire (Reips, 2002). They were then randomly allocated to one of the three conditions and listened to all four music excerpts in a randomised order, after which they responded to the self-report questions described above. They answered the individual difference measures and the demographic questions, and rated their level of concentration throughout the study. Finally, they were given a full debriefing, including the aims of the study and the details of the pieces they had listened to, and offered the opportunity to provide feedback on the questionnaire. The questionnaire took between 12 and 34 minutes to complete (M = 21 min).
Analysis
Because a repeated measures design was used, the data were restructured into the long format and subsequent analyses were conducted through hierarchical linear models using the MIXED function in IBM SPSS Statistics V25.0. The residual covariance structure of each model was selected based on the smallest Akaike Information Criterion (AIC) value. All metric predictors and outcome variables were z-standardised to allow for comparisons between estimated effect sizes.
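The two data preparation steps named above, restructuring into the long format and z-standardisation, can be sketched in Python as follows. This is not the SPSS procedure used in the study: the column naming scheme is hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption.

```python
from statistics import mean, stdev


def to_long(wide_rows, excerpts, measures=("valence", "arousal")):
    """Turn one row per participant into one row per participant and excerpt.

    Assumes hypothetical wide column names of the form "<measure>_<excerpt>".
    """
    long_rows = []
    for row in wide_rows:
        for excerpt in excerpts:
            record = {"participant": row["participant"], "excerpt": excerpt}
            for measure in measures:
                record[measure] = row[f"{measure}_{excerpt}"]
            long_rows.append(record)
    return long_rows


def z_standardise(values):
    """Centre values on their mean and scale by the sample standard deviation."""
    m = mean(values)
    s = stdev(values)
    return [(v - m) / s for v in values]


# Made-up wide-format data for two participants and two excerpts.
wide = [
    {"participant": 1, "valence_A": 2, "arousal_A": 5, "valence_B": -4, "arousal_B": 1},
    {"participant": 2, "valence_A": 6, "arousal_A": 3, "valence_B": 0, "arousal_B": -7},
]
long_rows = to_long(wide, excerpts=["A", "B"])
z_valence = z_standardise([r["valence"] for r in long_rows])
```

After standardisation each variable has a mean of 0 and a standard deviation of 1, so the model estimates are expressed in standard deviation units and can be compared across predictors.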
Results
Measuring situational empathy
Since the study required the creation of a new instrument to measure induced empathy, its internal reliability was tested using Cronbach’s alpha. The measure was further tested based on the three, theoretically informed, subscales; the results of this can be seen in Table 2 (with the relevant questions reverse-scored).
Table 2. Descriptive statistics for the items in the situational music empathy measure and the internal consistency of its theoretically informed subscales.
Note: Items 1, 3, 5, 7 and 10 have been reverse scored. *indicates an acceptable internal consistency score based on the commonly accepted values (George & Mallery, 2003).
A Spearman’s Rho correlation analysis showed, however, that all three subscales are highly correlated with each other, as shown in Table 3. It was therefore deemed more appropriate to conceive of situational empathy as a single parameter, calculated as the mean of all 10 items on the questionnaire, to produce a single measure of situational empathy for each participant. The reliability statistic for this mean score on the Situational Music Empathy Measure is .892, which is considered acceptable (George & Mallery, 2003).
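The scoring steps described in this section (reverse-scoring, internal consistency, and a mean score per response) can be sketched in Python. The set of reverse-scored items follows the note to Table 2; the 1 to 7 response scale and the example responses are assumptions made only for illustration, and the function names are hypothetical.

```python
from statistics import mean, variance

REVERSED_ITEMS = {1, 3, 5, 7, 10}  # item numbers taken from the note to Table 2
SCALE_MIN, SCALE_MAX = 1, 7        # assumption: the response scale is not given here


def reverse_score(responses):
    """Flip the reverse-keyed items so that high scores always mean more empathy."""
    return [
        SCALE_MIN + SCALE_MAX - score if item in REVERSED_ITEMS else score
        for item, score in enumerate(responses, start=1)
    ]


def cronbach_alpha(rows):
    """Cronbach's alpha for rows of item scores (one row per response)."""
    k = len(rows[0])
    item_variances = [variance([row[j] for row in rows]) for j in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def situational_empathy_score(responses):
    """Mean of all items after reverse-scoring."""
    return mean(reverse_score(responses))


# Made-up raw responses to the ten items; not study data.
raw = [
    [2, 6, 1, 7, 2, 6, 1, 6, 7, 2],
    [4, 4, 4, 3, 5, 4, 4, 4, 3, 4],
    [6, 2, 7, 1, 6, 2, 6, 2, 1, 7],
]
alpha = cronbach_alpha([reverse_score(r) for r in raw])
scores = [situational_empathy_score(r) for r in raw]
```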
Table 3. Spearman’s Rho correlations between the three subscales of the Situational Music Empathy Measure (Emotion Contagion, Perspective Taking and Emotion Recognition).
Note: **p < .01; n = 808.
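For readers unfamiliar with the statistic, Spearman's Rho is the Pearson correlation computed on ranks, with tied values sharing their average rank. The Python sketch below illustrates that definition with hypothetical function names and made-up subscale scores; it does not reproduce the SPSS output reported in Table 3.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1 upwards, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared_rank
        pos = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's Rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up subscale scores for illustration; not study data.
contagion = [2.0, 3.5, 3.5, 5.0, 6.0]
perspective = [1.5, 3.0, 4.0, 4.5, 6.5]
rho = spearman_rho(contagion, perspective)
```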
Relationship between general trait empathy and situational empathy in response to music (H1)
We subsequently tested the criterion validity of the new measure based on the assumption that level of trait empathy will affect the degree to which participants feel situational empathy. Table 4 shows means and standard deviations for each of the IRI items and factors.
Table 4. Mean and standard deviation of the IRI items and factors.
The four IRI subscales, representing trait empathy, were the predictor variables in a linear model, and situational empathy, measured using the Situational Music Empathy Measure, was the dependent variable.
The results shown in Table 5 indicate a highly significant positive relationship between trait fantasy and situational empathy. Furthermore, Perspective Taking shows a non-significant trend, also suggesting a positive relationship with situational empathy.
Table 5. Hierarchical Linear Modelling of the effect of levels of trait empathy, measured with the Interpersonal Reactivity Index (IRI), on situational empathy.
Note: This model used the compound symmetry covariance structure due to the lowest resultant AIC. *p < .05, **p < .01, n = 808, +a non-significant trend with p < .10; z-standardised predictor and outcome variables.
The effect of experimental empathy induction on situational music empathy (H3)
This analysis tested the effect of the between-groups variable (empathy-inducing text, structural music text or no text) on the experience of situational empathy in response to the music excerpts. Figure 3 shows that participants who received the empathy-inducing text experienced much higher situational empathy than participants in the other two conditions. The linear model presented in Table 6 shows that this difference is significant: the empathy-inducing text, but not the structural music text, significantly affected situational empathy.

Figure 3. Effect of the type of text the participants received on the extent to which they experienced situational empathy.
Table 6. Hierarchical Linear Modelling of the effect of the between-subjects factor type of text on experienced situational empathy.
Note: This model used the compound symmetry covariance structure due to the lowest resultant AIC. *p < .05, **p < .01, n = 808; z-standardised outcome variables.
Dummy variable 1 = empathy text group, 0 = control group (no text).
Dummy variable 1 = music text group, 0 = control group (no text).
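The dummy coding described in these notes, with the no-text group as the reference category, can be sketched in Python. The condition labels and the function name are hypothetical.

```python
# Each condition maps to (empathy_text_dummy, music_text_dummy).
# The no-text control group is the reference category and scores 0 on both.
TEXT_DUMMIES = {
    "no_text": (0, 0),
    "empathy_text": (1, 0),
    "music_text": (0, 1),
}


def text_dummies(condition: str):
    """Return the two dummy variables for a participant's text condition."""
    if condition not in TEXT_DUMMIES:
        raise ValueError(f"unknown condition: {condition!r}")
    return TEXT_DUMMIES[condition]


empathy_dummy, music_dummy = text_dummies("empathy_text")
```

With this coding, each dummy's coefficient estimates the difference between that text group and the no-text group, which is how the two text effects in Table 6 are to be read.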
The effect of the interaction between situational music empathy and expressed emotion in music on the emotion felt by participants (H2)
Figure 4 shows that higher levels of situational empathy caused the participants to feel higher levels of valence for music excerpts with high expressed valence, and lower levels of valence for music excerpts with low expressed valence. To analyse this, situational empathy was recoded into two categorical groups with an equal number of participants in each (median split). The results of this analysis indicated that situational empathy had moderated the effect of expressed valence on participants’ responses. Similarly, situational empathy had increased the level of arousal felt by participants in response to music excerpts with high expressed arousal and decreased it for music excerpts with low expressed arousal.

Figure 4. The effect on participants’ felt emotion ratings of the interaction between situational empathy, which has been recoded into two categorical groups with an equal number of participants in each (median split), and the emotions expressed in the music.
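A median split of the kind used for this figure can be sketched in Python. How scores exactly equal to the median were assigned is not stated; placing them in the lower group is an assumption, and with such ties the two groups may not be exactly equal in size.

```python
from statistics import median


def median_split(scores):
    """Label each score "high" if it lies above the median, otherwise "low".

    Assumption: scores equal to the median are placed in the "low" group.
    """
    cut = median(scores)
    return ["high" if score > cut else "low" for score in scores]


# Made-up situational empathy scores for illustration; not study data.
empathy_scores = [2.1, 3.4, 4.8, 5.0, 3.9, 4.2]
groups = median_split(empathy_scores)
```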
The linear modelling of these differences, shown in Table 7, shows that expressed arousal has a direct and significant effect on felt arousal. The interaction between expressed arousal and the extent to which participants experienced situational empathy resulted in a significant and positive effect on their ratings of arousal. Expressed valence had a significant effect on induced ratings of valence. The interaction between the expressed valence of the music excerpts and situational empathy was also significant and positive. This indicates that the more a participant was able to empathise with a composer, the more similar was the level of valence they felt to the level of valence expressed by the music.
Table 7. Hierarchical Linear Modelling of the effect on felt emotion ratings (Valence or Arousal) of experienced situational empathy and the expressed emotion (Valence or Arousal).
Note: This model used the compound symmetry covariance structure due to the lowest resultant AIC. *p < .05, **p < .01, z-standardised variables. n = 808.
Dummy variable 1 = High arousal, 0 = Low arousal.
Dummy variable 1 = High/Positive valence, 0 = Low/Negative valence.
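The moderation reported above can be read as follows: with a dummy-coded expressed emotion and z-standardised situational empathy, the effect of expression on felt emotion equals the expression coefficient plus the interaction coefficient multiplied by the empathy score. The Python sketch below shows the fixed-effects part only, ignoring the repeated-measures covariance structure. The function names are hypothetical and the coefficients are made-up placeholders, not the estimates in Table 7.

```python
def predicted_felt(expressed_high, empathy_z, b0, b_expr, b_emp, b_inter):
    """Fixed-effects prediction of a z-standardised felt emotion rating.

    expressed_high: 1 for high expressed valence (or arousal), 0 for low.
    empathy_z: z-standardised situational empathy score.
    """
    return (
        b0
        + b_expr * expressed_high
        + b_emp * empathy_z
        + b_inter * expressed_high * empathy_z
    )


def expression_effect(empathy_z, b_expr, b_inter):
    """Difference in felt emotion between high and low expression at a given empathy level."""
    return b_expr + b_inter * empathy_z


# Hypothetical coefficients for illustration only; NOT the values in Table 7.
coefs = {"b0": -0.5, "b_expr": 1.0, "b_emp": -0.2, "b_inter": 0.3}

effect_low_empathy = expression_effect(-1.0, coefs["b_expr"], coefs["b_inter"])
effect_high_empathy = expression_effect(1.0, coefs["b_expr"], coefs["b_inter"])
```

A positive interaction coefficient, as in this made-up example, means the gap in felt emotion between high- and low-expression excerpts widens as situational empathy increases.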
The effect of the interaction between empathy induction and the expressed emotion in the music on the emotion felt by participants (H4)
Having shown that listeners’ responses to emotions expressed by music are moderated by situational music empathy, we investigated whether this moderation effect could also be induced by providing different types of information about the music, to increase participants’ levels of cognitive empathy and the extent to which they were able to share the perspective of the composer.
Figure 5 shows the effect of expressed emotions on the emotions felt by participants according to experimental condition (empathy-inducing, structural music or no text). Each line represents one of the conditions. For the experimental group, empathy-inducing texts intensified the effect of expressed valence on felt emotion; felt valence ratings were higher than those of both control groups for high-valence music excerpts and lower for low-valence music excerpts. For control group 1, structural music texts did not have the same effect; the line representing these texts runs almost in parallel to that of control group 2. As for the effect of expressed arousal on the arousal experienced by participants, there was no interaction. For control group 1, the structural music text reduced the effect of expressed arousal but to a lesser extent than for participants in the experimental group and control group 2. Relative to the structural music text or no text, the effect of the empathy-inducing text was much stronger when expressed valence was low than when it was high. This could be because low-valence emotions are more likely to require action; for example, fear and anger often induce a fight-or-flight response (Lebel, 2017). It could also be the case, however, that the empathy-inducing texts for the two low-valence music excerpts were more successful than those written for the high-valence music excerpts and that negative emotions in response to the texts were therefore more salient.

Figure 5. The effect on participants’ felt emotion ratings of the interaction between the type of text and the emotions expressed in the music.
Table 8 shows the results of a hierarchical linear model testing the effect of the interaction between type of text (the between-participants condition) and expressed emotion (the within-participants condition) on felt emotion ratings (valence or arousal). Generally, the results confirmed those shown in Figure 5. For felt arousal, there was a significant effect of expressed arousal; there was also a significant and negative effect of structural music text, and an interaction between the two. This indicates that participants in control group 1, who received structural music texts, responded less strongly to expressed arousal. By contrast, felt valence was significantly influenced by expressed valence, and this effect was significantly moderated by receiving an empathy-inducing text but not a structural music text. Thus participants in the experimental group, who received empathy-inducing texts, responded significantly more strongly to expressed valence than participants in the other conditions.
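The moderation tested here can be written out as a single equation. This is a sketch only: it assumes a random intercept per participant, which the source does not state (only the covariance structure is reported), and all symbols are introduced for illustration.

```latex
\[
\mathrm{Felt}_{ij} = \beta_0 + \beta_1 E_{ij} + \beta_2 T^{\mathrm{emp}}_{j} + \beta_3 T^{\mathrm{mus}}_{j}
  + \beta_4 \, E_{ij} T^{\mathrm{emp}}_{j} + \beta_5 \, E_{ij} T^{\mathrm{mus}}_{j} + u_j + \varepsilon_{ij}
\]
```

Here $\mathrm{Felt}_{ij}$ is the felt valence (or arousal) rating of participant $j$ for excerpt $i$, $E_{ij}$ is the dummy for high expressed valence (or arousal), $T^{\mathrm{emp}}_{j}$ and $T^{\mathrm{mus}}_{j}$ are the dummies for the two text groups, $u_j$ is the assumed participant-level intercept and $\varepsilon_{ij}$ is the residual. Under this notation, a moderation effect of the empathy-inducing text corresponds to $\beta_4 \neq 0$, and one of the structural music text to $\beta_5 \neq 0$.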
Table 8. Hierarchical Linear Modelling of the effect on felt emotion ratings (Valence or Arousal) of the type of text and the expressed emotion (Valence or Arousal).
Note: This model used the diagonal covariance structure because it yielded the lowest AIC. n = 808. Results significant at the p = 0.05 level are marked with *, results significant at the p = 0.01 level are marked with **, and a non-significant trend is marked with +. Predictor variables are z-standardised unless they are indicated as dummy variables; outcome variables are z-standardised.
Dummy variable 1 = empathy text group, 0 = control group (no text).
Dummy variable 1 = music text group, 0 = control group (no text).
Dummy variable 1 = High arousal, 0 = Low arousal.
Dummy variable 1 = High/Positive valence, 0 = Low/Negative valence.
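The coding scheme in the table notes can be illustrated with a minimal sketch. It is not the authors' analysis code: the function names, condition labels and ratings below are hypothetical, and the published models were fitted as hierarchical linear models rather than assembled by hand.

```python
from statistics import mean, pstdev


def z_standardise(values):
    """Rescale values to mean 0 and standard deviation 1."""
    m = mean(values)
    sd = pstdev(values)  # population SD; the source does not say which SD was used
    return [(v - m) / sd for v in values]


def build_predictors(condition, expressed_high):
    """Dummy-code one observation.

    condition: "empathy", "music" or "none" (hypothetical labels).
    expressed_high: True for high expressed valence/arousal, False for low.
    """
    empathy_text = 1 if condition == "empathy" else 0
    music_text = 1 if condition == "music" else 0
    expressed = 1 if expressed_high else 0
    return {
        "empathy_text": empathy_text,
        "music_text": music_text,
        "expressed": expressed,
        # Interaction terms carry the moderation effect tested for H4.
        "empathy_x_expressed": empathy_text * expressed,
        "music_x_expressed": music_text * expressed,
    }


# Hypothetical felt-valence ratings, for illustration only.
felt = z_standardise([2.0, 4.0, 6.0, 8.0])
row = build_predictors("empathy", expressed_high=True)
```

With the no-text group as the reference category, both text dummies and both interaction terms are zero for control group 2, so each interaction coefficient compares one text group's response to expressed emotion against that baseline.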
Discussion
The aims of this study were to explore and test the role of empathy in moderating emotional responses to emotional expressions in music. We confirmed that situational music empathy is correlated with the fantasy dimension of trait empathy (H1). Situational music empathy was in turn shown to moderate the effects of emotional expression on induced emotion in music (H2). Finally, we showed that it is possible to induce situational music empathy by providing participants with specific background information (H3), which in turn also moderated the effects of emotional expression on induced emotion (H4).
In the introduction to this report the distinction between trait and situational empathy was discussed; however, while they are discrete paradigms, it was assumed that the two concepts are related (H1). That is, an individual with higher levels of trait empathy was expected to report higher levels of empathy in response to a specific stimulus. Trait empathy was measured using the four subscales of the IRI as predictor variables. Only the fantasy subscale of trait empathy was found to have a significant and positive relationship with situational empathy (as shown by Eerola et al., 2016). In the instructions for administering the IRI, Davis (1983) describes the fantasy subscale as a measure of participants’ proclivity to imagine themselves as experiencing the feelings and actions of fictitious characters. In our study, participants were encouraged to take the perspective of an unknown figure for whom there was no visual image nor, for two thirds of the participants, contextual information. It was therefore essential for them to be able to imagine the emotions of the composer figure if they were to take their perspective. Thus, music empathy can be seen to occur in a similar way to social empathy. Listening is often a social experience and emotions would seem to be shared, even between virtual personas.
To find out whether situational empathy is indeed a mechanism for inducing emotions in response to music, the mean scores on the Situational Music Empathy Measure were used in a hierarchical linear model to test the moderating effect of situational empathy on the emotions expressed in music, which in turn influence the emotions felt by participants (H2). The results show that higher levels of situational empathy moderated the participants’ valence responses so that, for low expressed valence, the responses were lower, and for high expressed valence, responses were higher. Similarly, situational empathy increased the level of arousal felt by participants in response to music excerpts with high levels of expressed arousal and decreased it for music excerpts with low levels of expressed arousal. These results indicate that situational empathy could be the mechanism through which the emotions felt by the participants were induced (Scherer & Zentner, 2001).
The third hypothesis was that it is possible to induce situational music empathy by providing participants with specific background information (H3). We tested the effect of the type of text received by participants on their mean situational empathy scores. In support of the hypothesis, participants who received the empathy-inducing texts experienced higher levels of situational empathy than the no-text control group. Conversely, those who received the structural music texts experienced a non-significant reduction in their levels of situational empathy. This directly supports the suggestion that it is possible to induce empathy. The mere presence of information alone is not sufficient, however. This is because the structural music texts, which were deliberately comparable in length to the empathy-inducing texts, did not induce empathy. The non-empathic texts may rather have drawn the attention of participants away from the expressivity of the music and towards its musical content. The implications of this are that it is possible to induce the desired emotional response in a listener by encouraging them to empathise with a figure connected to the music, in this case the composer.
The final analysis was conducted to test the effect of the content of the textual information on participants’ emotional experience (H4) in an attempt to induce empathy, or more specifically perspective taking, in a more ecologically valid way than previous studies (Miu & Balteş, 2012). The results indicate that participants who received the empathy-inducing texts experienced significantly stronger responses to expressed valence than those in the structural music and no-text conditions, but the structural music texts had no such moderation effect on felt arousal. They did, however, decrease the intensity of the arousal experienced by participants in response to expressed arousal. Interestingly, participants who were manipulated into taking the perspective of the composer responded more strongly to expressed valence than to expressed arousal. This could indicate that arousal is largely an evolutionary experience related to the fight-or-flight response: since listeners do not perceive themselves to be in danger, they do not experience the increase in arousal that would elicit such a response, and are therefore less likely to be influenced by background information on the composer. Or it may be that, generally, their arousal response is based more on the stimulus than its contextual characteristics (see also Egermann et al., 2015, who observed universal arousal responses to low-level stimulus characteristics in a cross-cultural listening experiment).
In summary, the results show that trait and situational empathy are related. Situational empathy was also found to moderate emotional responses to expressive music. Finally, we found that we were able to induce situational empathy, confirming that it could act as a mechanism for emotion induction.
Limitations
We have sought to address some of the past limitations of study design as well as to gain further insights into music-related empathy. As discussed in the introduction, this study was designed as an experiment to be more rigorous than some previous studies, in which conclusions were drawn from correlational analyses (e.g., Eerola et al., 2016; Egermann & McAdams, 2013). This was achieved by inducing empathy so that a causal link between music and empathy could be established, thus reducing the potential effects of confounding variables. Empathy was manipulated by giving participants different types of text, a more ecologically valid approach than in some previous research, in which participants were instructed to feel a certain way or imagine specific scenarios (e.g., Miu & Balteş, 2012).
It should be noted that the use of emotion descriptors in the empathy-inducing texts could have influenced participants’ emotion ratings through mechanisms such as affective priming (Murphy & Zajonc, 1993) rather than empathy. We needed, however, to describe the composers’ emotions, and indeed expressing emotion explicitly seems to be a natural requirement for inducing any kind of empathetic response. We also showed that measures of situational empathy were influenced by those texts, indicating that the mechanism we studied represents empathy rather than priming.
Conclusions
We conclude that the results presented here contribute to an understanding of the role of empathy as an emotion-induction mechanism in music. We showed that empathy can be induced by providing information on the specific emotions of a figure relating to the music. This has implications for composers, performers, and concert curators who may want to evoke particular responses in their audiences. Furthermore, while situational empathy moderated responses to both expressed arousal and valence, experimentally induced empathy moderated responses to expressed valence only, not to expressed arousal. The implications of this could be that programme notes or album sleeve notes, for example, can be used to influence the valence of listeners’ responses to emotionally expressive music. It is likely that the more emotional an individual finds a piece of music the more they will prefer it.
Appendix A – Empathy Induction Texts
The texts provided to participants for the purposes of manipulating the between-subjects independent variable.
| Structural music texts | Empathy-inducing texts |
|---|---|
| Roll Tide (Zimmer, 1995) | |
| This excerpt is of a musical score for a film and employs a blend of orchestra and synthesizer sounds. The piece gets progressively louder, as evidenced in this excerpt, as more instruments are added. The tempo is steady throughout and has a military constancy. Released in 1995. | The composer of this piece is a former military submarine officer who composed this music during his time on an American submarine. The composer felt a real excitement and describes a pride at serving his nation. In this piece he tries to capture the adrenaline and adventure he experienced while on the boat. |
| Main Theme from Chocolat (Portman, 2000) | |
| This excerpt is from a piece that relies on piano and strings, both natural harp and synthesised, to accompany a flute melody. The volume increases throughout the excerpt and the texture becomes thicker, employing a fluid tempo. The piece uses repetitive and exaggerated phrasing throughout and the ambiguous major/minor modality results in uncertainty. | This piece is written to depict a time when the composer and her child moved to a tranquil French town in the winter of 1959. Used to a nomadic lifestyle the composer is relaxed at the prospect of a fresh start and was content with the new town she arrived in on a windy autumnal afternoon. |
| Main Theme from Halloween (Carpenter, 1979) | |
| The agitated keyboard in this excerpt is juxtaposed with the synthesised brass bass line. The excerpt is repetitive in pitch and texture; however, the volume increases throughout the excerpt. The structure sees the keyboard and bass line coming in and out through the piece in a sequential manner. The percussive sounds are consistent throughout and add a constant accompaniment over the top of the rest of the music. | Written as a reaction to a visit to a mental hospital, the composer of this piece was moved to write this music as an outlet for the fear, anger and discomfort he experienced following his visit. In the piece his use of repetition and the percussive clock-like sound are his representation of the feeling he experienced of time being never ending and repetitive and uncomfortable in the hospital. |
| A Small Measure of Peace (Zimmer, 2003) | |
| This music is firmly in the minor mode with significant use of strings providing the melody and accompaniment for the piece from which this excerpt is derived. The swelling dynamic implies significant phrasing and is aided by an increase in instrumentation later in the piece creating a thicker texture. Later in the piece low strings take over the melody. | This excerpt is taken from a piece written following a significant fire in which several people died. The composer wrote this to express the despair, fatigue and false calm that she witnessed in the aftermath of the event. The exhaustion is represented by the end of every phrase getting quieter, as if lacking in energy. |
Appendix B. Situational Music Empathy Measure
All questions to be answered on a Likert scale of 1 (strongly disagree) – 5 (strongly agree)
*denotes reverse scored item.
Items testing Perspective Taking: 1, 3, 6, 9
Items testing Emotional Contagion/Embodied Emotion: 2, 4, 5, 10
Items testing Emotion Recognition: 7, 9
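Scoring a measure of this kind could be sketched as follows. The item-to-subscale mapping is copied from the listing above as printed; which items are reverse scored is not given in this section, so the sketch takes them as a parameter. The function names and example responses are hypothetical, and the use of means follows the main text's reference to mean scores on the measure.

```python
from statistics import mean

SCALE_MIN, SCALE_MAX = 1, 5

# Item-to-subscale mapping copied from the listing above as printed.
SUBSCALES = {
    "perspective_taking": [1, 3, 6, 9],
    "emotional_contagion": [2, 4, 5, 10],
    "emotion_recognition": [7, 9],
}


def reverse(score):
    """Reverse a Likert response: 1 <-> 5, 2 <-> 4, 3 stays 3."""
    return SCALE_MIN + SCALE_MAX - score


def score_measure(responses, reverse_items):
    """Return subscale means and the overall mean for one participant.

    responses: dict mapping item number -> raw response (1 to 5).
    reverse_items: set of item numbers marked with * (not listed in this
        section, so the caller must supply them).
    """
    scored = {
        item: reverse(raw) if item in reverse_items else raw
        for item, raw in responses.items()
    }
    result = {
        name: mean(scored[i] for i in items)
        for name, items in SUBSCALES.items()
    }
    result["overall"] = mean(scored.values())
    return result


# Hypothetical responses; item 2 is treated as reverse scored for illustration.
example = {i: 4 for i in range(1, 11)}
example[2] = 1
scores = score_measure(example, reverse_items={2})
```

Reversing before averaging means that a higher score always indicates more situational empathy, regardless of how an item was worded.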
Contributorship
KON and HE researched literature and conceived the study. KON and HE were involved in study design, gaining ethical approval, participant recruitment and data analysis. KON wrote the first draft of the manuscript. Both authors reviewed and edited the manuscript and approved the final version of the manuscript.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
