Does fantasy empathy predict the attribution of virtual agency to music?

Abstract

Social emotions have figured prominently in recent research pertaining to music-related emotions. If music is indeed able to evoke social emotions in listeners, the implication is that music may be perceived in some way as akin to a human agent. Yet music, especially instrumental music, is not obviously an agent capable of feeling. Following up on past research linking liking sad music to the fantasy facet of trait empathy, the results of three studies are reported. The first two were online surveys involving 112 and 137 participants, respectively, who rated sets of words in terms of their implied agency, synonymousness, or applicability for describing music. The third involved a listening task in which 299 participants listened to 24 short excerpts of instrumental music, and selected up to 3 words, from a list of 16, that best described each excerpt. The list of 16 words was compiled based on the results of the first two studies and comprised 8 pairs of words that differed in terms of their level of implied agency but were matched in terms of their meaning and applicability to music. Participants also completed the Fantasy subscale of the Interpersonal Reactivity Index. The results did not support the hypothesis that high-fantasy listeners would be more likely to impute (virtual) agency to music. Instead, the attribution of agency was significantly associated with enjoyment and musical arousal.

Keywords

trait empathy, fantasy, agency, liking, emotion, arousal, valence

How does music induce emotion in listeners? Research on emotion suggests a distinction between social and non-social emotions that hinges on awareness of an agent. Many emotions are linked exclusively to social experiences, including feelings of admiration, hate, suspicion, envy, embarrassment, pride, love, pity, and so forth. However, some induced emotions require no perception of agency. For example, although fear can be induced by encountering a threatening burglar, fear can also be induced by encountering a deep hole or chasm; joy might be induced by the return of an absent friend, but it can also be induced by discovering a source of much-needed water; sadness or anger might arise in response to unjustified criticism, or from seeing one’s home destroyed by fire.

The key to evoking social emotions, then, is the perception of agency. In literature, drama, and film, it is nearly always the case that persons or characters are represented explicitly. Even animated cartoons nearly always portray characters as sentient beings capable of displaying thoughts, intentions, and feelings. In contrast to inanimate objects (furniture, rocks, etc.), animate agents are able to evoke complex emotions in observers, such as feelings of affection, anger, or compassion.

In music involving one or more vocalists, the singer(s) similarly offers a human presence that ostensibly allows listeners to recognize the existence of a sentient being whose thoughts, intentions, and feelings might be expected to evoke appropriate responses in listeners. However, in the case of purely instrumental music, it is not obvious that the music conveys, portrays, or represents a conscious being or agent. While (instrumental) musical sounds can certainly be perceived and understood as resulting from the intentional motor actions produced by a human agent (e.g., Launay, 2015), such music may also be experienced as conveying imagined or virtual agency. Previous empirical work has shown that it is common for listeners to informally describe instrumental music using terms that imply a sentient actor. Watt and Ash (1998), for example, found that people are much more likely to ascribe agency-related descriptors (male, female, good, evil, angry, pleased, gentle, and violent) to music than to food. In addition, music scholars have drawn attention to the ways in which instrumental music exhibits animacy cues suggestive of agency (Broze, 2013; Hatten, 2018). Without some sense of agency, one would expect music-induced emotions to be limited to non-social emotions.

While some philosophers have remained skeptical about the capacity of music to convey imagined agency (e.g., Davies, 1997), many music theorists and philosophers (e.g., Hatten, 2018; Levinson, 1996, 2006; Robinson & Hatten, 2012) have argued the opposite: that the listener sometimes experiences music as conveying virtual agency, or as a narrative containing a fictional or virtual persona. Recent empirical evidence supports this notion, demonstrating that listeners are able to perceive (instrumental) musical improvisations in terms of social intentions (Aucouturier & Canonne, 2017) and often construct imagined narratives when listening to instrumental music (e.g., Margulis, 2017; Margulis et al., 2022). Furthermore, instrumental music has been shown to elicit strong experiences of feeling moved and touched (e.g., Vuoskoski et al., 2022), a state that has more recently been conceptualized as a social-relational emotion with social bonding functions (e.g., Fiske et al., 2019; Menninghaus et al., 2015).

In approaching questions related to music-induced emotion, it is prudent not to assume that all listeners are having the same experience. Indeed, existing research suggests that there is considerable variability in listeners’ experiences. This variability is most apparent in the diversity of musical tastes. For example, listeners can be divided roughly evenly between those who like and those who dislike nominally sad music (Garrido & Schubert, 2011; Huron & Vuoskoski, 2020; Taruffi & Koelsch, 2014). Variability is also evident in the types of feelings evoked. For example, fans of heavy metal experience feelings of power and joy when listening to metal music, whereas non-fans experience tension and anger (Thompson et al., 2019).

Some of these differences have been linked to personality traits. For example, several studies have implicated openness to experience as a factor influencing the intensity of music-induced emotions (Colver & El-Alayli, 2016; Dobrota & Reić Ercegovac, 2015; Maruskin et al., 2012; McCrae, 2007; Nusbaum & Silvia, 2011; Rentfrow & Gosling, 2003; Silvia et al., 2015; Silvia & Nusbaum, 2011). More recently, however, it has been shown that listeners who score high on openness to experience also tend to exhibit high levels of empathy (Costa et al., 2014; Melchers et al., 2016; Sattmann & Parncutt, 2018). In a study involving some 300 participants, Sattmann and Parncutt (2018) found that openness to experience fails to serve as a significant predictor of music-induced emotion when a measure of trait empathy is included, suggesting that the main personality trait predicting individual differences in music-induced emotion is empathy (see also Eerola et al., 2016; Vuoskoski et al., 2012). In many ways, high trait empathy makes more sense than high openness, since empathy more clearly presupposes that the listener is interpreting the music as conveying some degree of agency.

An influential model of empathy is Mark Davis’s 4-factor model, which forms the basis for the Interpersonal Reactivity Index (IRI) (Davis, 1980, 1983). The four factors in Davis’s model include empathic concern, personal distress, perspective taking, and fantasy. Empathic concern is the disposition to feel concern or compassion for another person experiencing some misfortune or stress. Personal distress assesses the disposition to mirror or echo feelings of anxiety or unease when witnessing stress or suffering in others. Perspective taking is the cognitive tendency to spontaneously adopt the psychological point of view of others. Finally, fantasy ostensibly measures the ability to imaginatively transpose oneself into fictional situations, such as those found in literature, drama, or personal daydreams.

In the case of sad music, several studies have shown that the enjoyment of sad music is positively associated with empathic concern but not with personal distress (Eerola et al., 2016; Kawakami & Katahira, 2015; Vuoskoski & Eerola, 2017; Vuoskoski et al., 2012). Huron and Vuoskoski (2020) reviewed behavioral and neuroscientific research identifying compassion as a positively valenced emotion. If empathic concern is interpreted as a measure of compassion and personal distress is interpreted as a measure of (contagious) commiseration, then trait-related studies of sad music offer a ready explanation for the enjoyment of sad music: compared with sad-music dislikers, sad-music likers experience a higher ratio of (positive) compassion to (negative) commiseration. That is, those listeners who most enjoy sad music experience more positive feelings of concern, pity, charity, or sympathy relative to negative feelings of distress, misery, suffering, or sorrow. This seemingly straightforward account of sad-music enjoyment is confounded, however, by the fact that the IRI fantasy factor is even more predictive of sad-music liking (e.g., Eerola et al., 2016; Vuoskoski et al., 2012).

In order for music to evoke a feeling of compassion, one would expect that a prior condition is that listeners perceive, sense, or infer some form of agency. Since many extant sad-music experiments have relied on non-vocal instrumental music, one possibility is that people who score high on trait fantasy are more likely to impute agency to instrumental sounds, and so are more likely to experience a socially oriented emotion, such as compassion. This conjecture suggests the hypothesis that high-fantasy listeners are more likely to impute agency to instrumental music. That is, listeners high in fantasy proneness are more likely to describe music as conveying a sense of animacy or agency.

Method

A simple approach to testing this hypothesis might be to have listeners describe their musical experiences and compare their use of agency-related words or metaphors to their fantasy-facet scores on the IRI. Such open-ended tasks require complex content analysis of participant descriptions, and the design lends itself to low power. A more powerful design might involve asking participants to select music-appropriate descriptive terms from a prior list of independently assessed words with contrasting agency connotations.

In brief, three online studies were conducted. The materials and procedures were reviewed and approved by the Internal Research Ethics Committee at the Department of Psychology, University of Oslo (reference number: 9553162). All participants provided written informed consent before participation. In Study 1a, 57 native-English speakers rated 140 descriptive terms according to their implied agency. In Study 1b, 55 native-English speakers rated the similarity (synonymousness) of pairs of descriptive terms drawn from Study 1a. The result of Studies 1a and 1b was the creation of a word list used in Studies 2 and 3. In Study 2, 137 participants rated the appropriateness of each word as a useful musical descriptor. In Study 3, 299 listeners heard a variety of instrumental musical passages and were asked to choose suitable music descriptors. With the exception of Study 1b, participants in all studies also completed the IRI. All data and code are provided in the Open Science Framework (OSF) repository: https://osf.io/z3xhs/.

Music descriptors

In much music-related emotion research, it is common to have participants provide verbal descriptions of the emotional content, either emotions thought to be represented or conveyed by the music, or emotions thought to be evoked or induced in the listener. As noted, rather than employing an open-ended descriptive task, we aimed to increase experimental power by providing participants with a list of terms from which they would select appropriate descriptors.

In the current study, two properties of descriptive terms were of interest. First, how much does the term connote agency? Second, how suitable is the term for describing music? In addition, in order to reduce the possibility of highly skewed data, it would be helpful to match each high-agency term with a nearly synonymous low-agency term. Many musical descriptors clearly imply some sort of agent. Examples include words such as joyful, elated, and proud. Other descriptors imply less agency, such as rough, tranquil, and simple. Apart from the issue of agency, some descriptive terms are more appropriate for music than others. For example, listeners are more likely to describe a musical passage as peaceful or perky than as pastel or plastic.

In describing a particular musical passage, one would expect broad agreement among listeners regarding general features. For example, a given passage might reasonably be described as fast, happy, thrilling, or energetic, as opposed to slow, sad, morose, or lethargic. Ideally, in providing a list of possible descriptors, it would be useful to offer terms that are synonyms or near synonyms with regard to musical character that nevertheless differ with regard to agency or animacy. For example, the term heroic might be expected to imply more agency than the nominal synonym grand. Similarly, the term triumphant would seem to imply more agency than the nominal synonym monumental.

We began by assembling a list of descriptive musical terms gleaned from the existing literature on music-related emotion. For example, Zentner et al. (2008) collected a large number of musical descriptors from French fans attending a music festival. From the collected descriptors, they produced a 9-factor model of music-related emotion. For the purposes of the present study, however, we used 38 unique descriptive words drawn from Study 2 of Zentner et al. (2008). This set included words (in English translation) such as angry, calm, nervous, and proud. Only terms that would function as descriptive adjectives were selected, and thus terms such as thrills and goose bumps were excluded. In addition to these descriptors, we supplemented our list with a further 48 music descriptive terms drawn from a number of other published and unpublished sources pertaining to the evaluation of music-related emotion, timbre, and other musical features. Our preliminary list of 86 words is shown in Table 1.

Table 1.

The preliminary list of 86 words drawn from previous studies on music and emotion.

Active	Colorless	Dull	Gentle	Loving	Rough	Spiritual
Aggressive	Compassionate	Ecstatic	Gloomy	Manic	Sad	Strong
Angry	Complex	Emotional	Glum	Melancholic	Sensual	Sympathetic
Anguished	Contented	Empty	Happy	Nervous	Serene	Tender
Animated	Cool	Energetic	Harsh	Peaceful	Sexy	Tense
Anxious	Dark	Fast	Heavy	Placid	Slow	Timid
Bright	Deep	Fat	Heroic	Pleasant	Smart	Tranquil
Calm	Defiant	Fearful	Hot	Poetic	Soft	Triumphant
Caring	Depressed	Forceful	Intimate	Powerful	Somber	Vivid
Cheerful	Dim	Friendly	Joyful	Proud	Soothing	Volatile
Close	Drab	Frightened	Lively	Radiant	Sophisticated	Warm
Cold	Dramatic	Fun	Loud	Reflective	Sorrowful	Wide
Colorful	Dreamy

Since our intention was to compare high-agency descriptors with low-agency descriptors, it was necessary to identify approximate synonyms for each of the 86 candidate terms. Hence, for each term in Table 1, the authors identified one or more companion terms that the authors regarded as plausible synonyms that also contrast with the agency implied by the original term. For example, the authors conjectured that depressed implies relatively high agency and that a plausible low-agency synonym for depressed might be dark. Similarly, we conjectured that dull implies rather low agency and that a plausible high-agency synonym would be boring.

For several especially challenging terms, the authors identified two or more possible synonyms. For example, the descriptor sad appears to exhibit very high agency. It is difficult to identify words that are good synonyms for sad that do not also exhibit high agency. Consequently, we identified three possible low-agency words (gloomy, pale, and sunless), one of which we hoped might be considered more synonymous than the others.

In matching pairs of possible synonyms, we began by looking for other terms in our preliminary list that might provide suitable partners. Hence, we were able to link lively (presumed higher agency) with fast (presumed lower agency) and glum (presumed higher agency) with drab (presumed lower agency). In other cases, it was necessary to select descriptive terms not included in the preliminary list. For example, we matched the listed word slow (presumed lower agency) with a more explicitly higher agency non-list word tired. Similarly, we matched the listed word defiant (presumed higher agency) with a more explicitly lower agency non-list word resistant. In total, the entire roster of descriptors includes some 140 terms representing 70 nominal descriptive pairs (identified in Table 2). Non-list words in Table 2 are shown in italics.

Table 2.

The 70 pairs of words used in Studies 1a and 1b, and the means and standard deviations representing implied agency, synonymousness, and difference in implied agency.

High agency		Low agency
Term	Agency score (SD)	Term	Agency score (SD)	Agency difference	Synonym ratings (SD)
Abandoned	46.0 (28.1)	Empty	35.5 (27.3)	10.4	76.8 (25.2)
Aggressive	76.2 (19.0)	Forceful	62.0 (25.2)	14.2	70.7 (16.5)
Agreeable	77.1 (20.3)	Warm	63.7 (25.9)	13.4	61.6 (23.8)
Angry	84.2 (15.7)	Hot	43.4 (27.8)	40.8	57.5 (27.0)
Anguished	72.2 (22.7)	Dramatic	75.6 (22.1)	3.3	43.8 (23.1)
Animated	67.5 (24.3)	Active	74.0 (20.0)	6.5	65.7 (22.3)
Animated	67.5 (24.3)	Energetic	77.0 (18.8)	9.4	71.1 (21.1)
Anxious	88.4 (12.2)	Tense	72.0 (20.5)	16.4	82.4 (14.5)
Austere	49.9 (28.1)	Cold	49.3 (27.9)	0.6	50.4 (24.4)
Boisterous	78.9 (16.7)	Loud	61.9 (25.7)	17.0	81.0 (15.0)
Boring	73.3 (22.5)	Dull	58.4 (28.6)	14.9	88.0 (15.7)
Calm	64.5 (23.5)	Cool	56.5 (26.4)	7.9	68.3 (23.3)
*Caring	85.4 (12.4)	*Soft	52.0 (27.8)	33.4	73.5 (18.6)
Caring	85.4 (12.4)	Warm	63.7 (25.9)	21.7	78.9 (17.7)
*Cheerful	85.6 (12.4)	*Bright	54.1 (28.1)	31.6	81.9 (16.9)
†Cheerful	85.6 (12.4)	†Radiant	63.1 (24.8)	22.5	70.4 (19.0)
Compassionate	87.7 (12.0)	Tranquil	39.4 (26.7)	48.4	34.8 (25.9)
Contented	78.8 (19.2)	Placid	53.0 (28.4)	25.8	56.0 (23.1)
*Defiant	79.1 (18.2)	*Resistant	51.9 (26.0)	27.2	72.8 (21.7)
*Depressed	87.5 (15.0)	*Dark	38.5 (25.2)	49.0	74.9 (21.8)
*Depressed	87.5 (15.0)	*Gloomy	50.3 (28.6)	37.2	81.1 (19.2)
*Drained	66.0 (26.4)	*Colorless	22.8 (23.4)	43.2	76.3 (18.8)
Dreamy	69.6 (23.1)	Vague	55.4 (29.1)	14.2	43.6 (26.1)
Ecstatic	82.5 (17.5)	Unusual	55.2 (23.2)	27.3	18.5 (19.3)
Emotional	88.7 (14.7)	Intense	69.4 (22.2)	19.4	57.1 (24.8)
*Fat	73.1 (22.2)	*Heavy	36.9 (24.7)	36.3	73.9 (24.6)
Fearful	77.3 (17.9)	Gloomy	50.3 (28.6)	27.0	43.9 (26.3)
*Flamboyant	76.7 (18.5)	*Colorful	40.0 (26.4)	36.7	78.5 (17.7)
Friendly	84.6 (15.6)	Pleasant	70.7 (20.4)	13.9	80.2 (15.8)
*Frightened	80.5 (16.4)	*Shaky	48.9 (27.0)	31.6	66.2 (22.3)
†Glum	71.4 (24.1)	†Drab	40.1 (28.0)	31.3	64.9 (24.6)
Fun	72.2 (21.6)	Vivid	35.3 (26.1)	36.9	49.1 (26.9)
Gentle	71.5 (20.3)	Soft	52.0 (27.8)	19.5	85.1 (14.0)
*Happy	84.4 (12.1)	*Bright	54.1 (28.1)	30.3	80.4 (19.5)
Heroic	82.6 (14.9)	Grand	36.2 (25.3)	46.4	47.8 (25.0)
Heroic	82.6 (14.9)	Powerful	66.4 (22.3)	16.2	64.3 (24.1)
Heroic	82.6 (14.9)	Strong	68.1 (24.2)	14.5	72.7 (23.6)
*Intimate	76.6 (22.2)	*Close	38.9 (29.4)	37.7	89.8 (12.0)
Joyful	81.0 (17.2)	Energetic	77.6 (16.9)	3.4	72.2 (23.6)
Joyful	81.0 (17.2)	Vivid	35.3 (26.1)	45.7	46.3 (27.3)
Lively	74.0 (20.2)	Fast	48.6 (26.4)	25.4	63.4 (22.3)
Loving	87.5 (12.7)	Warm	66.5 (24.7)	21.0	83.0 (14.5)
Manic	73.5 (21.3)	Deep	51.0 (26.0)	22.5	24.8 (22.2)
Manic	73.5 (21.3)	Intense	66.1 (24.0)	7.4	64.5 (23.6)
Melancholy	68.2 (26.2)	Overcast	15.8 (19.3)	52.4	47.7 (26.7)
Nervous	83.4 (15.3)	Volatile	63.0 (26.9)	20.4	29.4 (21.6)
*Open	56.1 (26.2)	*Wide	19.0 (19.6)	37.1	69.4 (24.2)
Pensive	68.0 (23.7)	Deep	51.0 (26.0)	17.0	59.5 (24.1)
Poetic	69.6 (28.2)	Literary	58.8 (30.6)	10.8	71.2 (21.7)
Proud	85.6 (13.5)	Notable	55.3 (28.4)	30.3	49.7 (26.4)
Reflective	57.2 (29.2)	Tranquil	39.4 (26.7)	17.8	60.8 (23.9)
*Sad	85.0 (14.2)	*Gloomy	50.3 (28.6)	34.7	86.3 (16.4)
Sad	85.0 (14.2)	Pale	60.1 (28.1)	25.0	35.4 (26.8)
Sad	85.0 (14.2)	Sunless	16.1 (20.4)	68.9	50.3 (26.9)
Savage	63.7 (25.3)	Harsh	62.5 (24.8)	1.2	78.9 (18.8)
Sensual	79.1 (22.8)	Luxurious	32.6 (30.5)	46.5	52.0 (27.6)
Serene	49.0 (28.7)	Still	37.3 (24.3)	11.7	76.0 (21.2)
Sexy	87.9 (13.9)	Attractive	75.9 (18.9)	12.1	82.3 (14.7)
Smart	82.5 (15.2)	Complex	64.2 (26.5)	18.3	42.4 (23.2)
Somber	65.3 (24.9)	Dim	48.2 (30.8)	17.1	46.8 (27.4)
Sophisticated	74.7 (22.2)	Complicated	64.5 (26.2)	10.2	33.8 (27.2)
Sorrowful	79.4 (18.6)	Dark	38.5 (25.2)	40.8	61.0 (26.3)
Spiritual	78.1 (20.6)	Metaphysical	29.5 (24.0)	48.6	55.5 (24.8)
*Sympathetic	86.1 (14.0)	*Soothing	52.2 (29.2)	33.9	65.6 (21.5)
Tender	69.3 (25.6)	Peaceful	55.5 (26.9)	13.8	54.6 (23.4)
Tender	69.3 (25.6)	Tranquil	39.4 (26.7)	29.9	48.7 (25.6)
Timid	76.4 (21.2)	Quiet	64.7 (26.3)	11.6	79.4 (16.0)
Tired	82.0 (20.2)	Slow	47.8 (26.7)	34.3	54.9 (26.9)
Triumphant	72.4 (24.9)	Monumental	25.3 (21.7)	47.1	62.8 (24.6)
*Violent	78.8 (19.2)	*Rough	49.0 (26.8)	29.8	80.1 (19.1)

Words that were not part of the preliminary list (see Table 1) are shown in italics. The pairs marked with an asterisk (*) satisfied the strict selection criteria, whereas the pairs marked with a dagger (†) satisfied the more relaxed criteria (see text). Pairs rendered in bold were selected for inclusion in Study 2.

The purpose of Study 1a was to rate the degree of agency implied by each of the 140 descriptors. The purpose of Study 1b was to rate the extent to which the 70 pairs of terms were synonymous.

Participants

Participants for all studies were recruited through the Prolific recruitment portal (Palan & Schitter, 2018; Peer et al., 2017). All the studies involved tasks in which participants make subtle distinctions between different words in terms of their connotations and denotations. Language competency was therefore deemed essential. Accordingly, only participants who reported native or first-language competency in English were recruited. In total, 603 participants were recruited. Of these, 576 participants completed the surveys, with 368 (63.89%) females, and with a mean age of 35.76 years. In order to avoid possible demand characteristics, recruitment was conducted in such a way that no one was able to participate in more than one study.

Study 1a: Procedure

Sixty-five participants were asked to characterize candidate descriptive terms according to their implied agency using a continuous scale. Participants received the following instructions:

For each of the following words, we want you to rate how much the word implies a human person. For example, words like “happy” and “shout” might be rated as having a human implication, whereas words like “green” and “flow” might be considered less human-like. To what extent would this word be more suitable for describing a person or human action rather than an object, event, or natural phenomenon?

Please aim to use the complete range of the scale.

Participants rated each word via a horizontal slider (range: 0–100) with the endpoints labeled very human-like and not at all human-like. The instructions above were visible throughout the task. The results are reported below in conjunction with the results for Study 1b.

Study 1b: Procedure

Sixty-one participants were presented with pairs of words representing our presumed descriptive synonyms and were asked to rate their similarity on a continuous scale (range: 0–100), with the endpoints labeled highly synonymous and not at all synonymous. Participants received the following instructions:

For each of the following pairs of words, rate how synonymous the words are. For example, the words “love” and “affection” might be rated as highly synonymous, whereas “climate” and “color” might be rated as not at all synonymous. Please try to use the full range of the scale.

Results (Studies 1a and 1b)

Prior to analysis, we inspected the data for unfinished responses and outliers. We obtained an intercorrelation between all participants, thereby creating correlation coefficients indicating responses for each participant compared to responses from all other participants. We excluded participant coefficients that fell more than two standard deviations below the mean for all participant coefficients. In Study 1a, we removed four participants who did not finish the survey and four as outliers. In Study 1b, we removed one participant for not completing the survey and five participants as outliers. Means and standard deviations for both agency (Study 1a) and synonym ratings (Study 1b) are shown in Table 2. The range for both rating scales was 0–100. Differences in mean agency scores are also tabulated. Each line in the table identifies a pair of proposed synonyms. As noted, we proposed more than one synonym in some cases (such as joyful and sad). In total, participants rated 70 pairs of words.
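
The exclusion rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: it assumes Pearson correlations and assumes that each participant's coefficient is the mean of their pairwise correlations with every other participant (the text specifies neither choice). The function names `pearson` and `flag_outliers` and all ratings are hypothetical.

```python
from statistics import mean, stdev


def pearson(x, y):
    """Pearson correlation between two equal-length, non-constant rating vectors."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def flag_outliers(ratings, n_sd=2.0):
    """Return IDs whose mean correlation with all other participants falls
    more than n_sd standard deviations below the mean of all coefficients."""
    coef = {
        pid: mean(pearson(r, other) for oid, other in ratings.items() if oid != pid)
        for pid, r in ratings.items()
    }
    cutoff = mean(coef.values()) - n_sd * stdev(coef.values())
    return sorted(pid for pid, c in coef.items() if c < cutoff)


# Made-up data: nine participants who broadly agree, plus one who reverses the scale.
base = [10, 30, 50, 70, 90, 20, 80, 40]
ratings = {
    f"p{k}": [v + (k * (i + 1)) % 11 for i, v in enumerate(base)]
    for k in range(1, 10)
}
ratings["outlier"] = [100 - v for v in base]

print(flag_outliers(ratings))  # ['outlier']
```

Because the cutoff is defined relative to the sample itself, a single deviant participant can only be flagged when the sample is reasonably large; with very few participants no coefficient can fall two standard deviations below the mean.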

Recall that the aim of Studies 1a and 1b was to assemble a set of descriptive pairs satisfying two criteria. The first was that pairs of words differ with regard to agency; that is, where one term connotes relatively lower agency than the other term in the pair. The second criterion was that the pairs were deemed relatively synonymous.

We had set the selection rule a priori: the final list would comprise those pairs whose synonym ratings and agency-difference ratings both fell above the median for the respective measure. Because this selection method might prove too strict, we decided to relax the criteria as needed in order to ensure a list of at least 15 paired terms (30 descriptors) for use in the subsequent studies. Moreover, in order to ensure a range of descriptive terms capable of characterizing a variety of music, we resolved to retain at least two descriptive pairs capable of characterizing music in each of the four valence/arousal quadrants in the common two-dimensional emotion model (nominally representing happy, sad, scary, and tender musical passages).

In the data we collected, we found that 15 pairs of words satisfied both criteria (i.e., for contrasting agency and high synonymousness). These pairs are marked with asterisks in Table 2. Unfortunately, three words (bright, depressed, and gloomy) were duplicated among these pairs. We added two further pairs of words (cheerful/radiant and glum/drab) by lowering the agency difference threshold from the median value (23.75) to 22.00 and the synonym rating threshold from the median value (65.28) to 64.00. By replacing cheerful/bright with cheerful/radiant and depressed/gloomy with glum/drab, we ended up with 15 unique pairs of synonyms employing 30 unique (i.e., unduplicated) words.

Conveniently, the resulting 15 pairs of synonyms also fulfilled the criterion we had set, a priori, of ensuring a minimum of two pairs to represent happy, sad, scary, and tender moods, respectively. These were as follows: happy (cheerful/radiant, happy/bright, and flamboyant/colorful); sad (sad/gloomy, depressed/dark, drained/colorless, and glum/drab); scary (defiant/resistant, violent/rough, and frightened/shaky); and tender (sympathetic/soothing and caring/soft). The three remaining pairs of synonyms exhibiting contrasting agency with high synonymousness included fat/heavy, intimate/close, and open/wide. These pairs are shown in bold font in Table 2.
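
The two-threshold selection described above can be illustrated with a short Python sketch. The rows are copied from a handful of pairs in Table 2 and the cut-offs are the values reported in the text; the function name `select_pairs` and the use of strict inequalities are assumptions made for illustration only.

```python
# (high-agency term, low-agency term, agency difference, synonym rating),
# copied from a few rows of Table 2.
PAIRS = [
    ("caring", "soft", 33.4, 73.5),
    ("caring", "warm", 21.7, 78.9),
    ("cheerful", "radiant", 22.5, 70.4),
    ("glum", "drab", 31.3, 64.9),
    ("heroic", "grand", 46.4, 47.8),
    ("intimate", "close", 37.7, 89.8),
    ("sad", "sunless", 68.9, 50.3),
]


def select_pairs(pairs, min_agency_diff, min_synonym):
    """Keep pairs that exceed BOTH thresholds (strict inequality is assumed)."""
    return [
        (high, low)
        for high, low, diff, synonym in pairs
        if diff > min_agency_diff and synonym > min_synonym
    ]


strict = select_pairs(PAIRS, 23.75, 65.28)   # median cut-offs reported in the text
relaxed = select_pairs(PAIRS, 22.00, 64.00)  # relaxed cut-offs reported in the text

print(strict)   # [('caring', 'soft'), ('intimate', 'close')]
print(relaxed)  # adds ('cheerful', 'radiant') and ('glum', 'drab')
```

In this subset, pairs with a large agency contrast but weak synonymy (e.g., sad/sunless) and pairs with strong synonymy but a small agency contrast (e.g., caring/warm) are rejected under both sets of cut-offs.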

Study 2

A total of 154 native-English speakers were recruited for this experiment. Participants were presented with the 30 descriptive terms selected from Studies 1a and 1b. They received the following instructions:

When people describe music they may make use of many different descriptive words. Some words seem better suited than others for describing a particular musical passage. For example, it might be appropriate to describe a particular song or musical passage as “bold” or “charming.” But words like “curly” or “round” might seem less suitable for describing a musical passage.

In this experiment, you can aid our research by helping us to identify those words that are most and least useful for describing music.

For each of the following words, rate how pertinent the word is for describing music. Please try to use the complete range.

Each participant was presented with the 30 words in unique random order. Each word appeared in isolation, and participants were able to move a vertical slider (range: 0–100) whose endpoints were labeled not at all appropriate word for describing music and highly appropriate word for describing music. Following the 30 target words, five randomly selected words (flamboyant, shaky, wide, sad, and cheerful) were repeated in order to collect test-retest reliability measures.

After completing the word-rating task, participants then completed the 28-question IRI empathy survey (Davis, 1980).

Results (Study 2)

Prior to analysis, we inspected the data for unreliable responses and outliers. First, as in Studies 1a and 1b, we computed intercorrelations between participants and excluded participant coefficients that fell more than two standard deviations below the mean for all participant coefficients (n = 10). Second, we correlated each participant’s test-retest data for the five randomly selected words (flamboyant, shaky, wide, sad, and cheerful). Again, participant coefficients that fell more than two standard deviations below the mean for all participant coefficients were removed (n = 7), leaving a final sample size of N = 137.

Linear mixed-effects model

To test our hypothesis that listeners higher in IRI trait fantasy would more often choose musical descriptors higher in word agency, we created a linear mixed-effects model using the lme4 R-package v. 1.1-23 (Bates et al., 2015). Specifically, we tested whether high-agency words would interact with participants’ fantasy scores in predicting music-applicability ratings. Word agency was treated as a dichotomous variable, and fantasy score was treated as a continuous variable. The model employed random intercepts for both participants and words, as well as random slopes for word agency (between-word variable) by participants, and fantasy scores (between-participant variable) by words. The model equation is presented below.

$$\gamma_{ij} = a + a_{p} + a_{s} + (b_{p1} + b_{1})\,x_{1i} + (b_{s2} + b_{2})\,x_{2j} + b_{3}\,x_{1i}\,x_{2j} + e_{ij}$$

Note. $\gamma_{ij}$ is the dependent variable (music-applicability descriptor); $a$ is the intercept; $a_{p}$ is the random intercept for the participant; $a_{s}$ is the random intercept for word; $b_{p1}$ is the slope of the random effect of Predictor 1 (agency) by the participant; $x_{1i}$ is the value of the effect of Predictor 1 by the participant; $b_{s2}$ is the slope of the random effect of Predictor 2 by word (fantasy scores); $x_{2j}$ is the value of the effect of Predictor 2 by word; $b_{1}$ is the regular slope for Predictor 1 and $b_{2}$ is the regular slope for Predictor 2; $b_{3}$ is the slope of the interaction; $e_{ij}$ is the remaining random variance.

The results for the fixed effects are presented in Table 3, and the random effects are provided in the Supplementary Tables on OSF. There were no significant main effects of word agency (β = .132, p = .345) or trait fantasy (β = .044, p = .062), and no significant interaction effect (β = .001, p = .668) in predicting music-applicability ratings.

Table 3.

The fixed and interaction effects of the linear mixed-effects model predicting music-applicability ratings.

	Beta	SE	df	t	p
(Constant)	.000	.139	29.3	.000	1.00
Word agency	.132	.138	28.1	.961	.345
Trait fantasy	.044	.024	11.8	1.88	.062
Interaction	.001	.012	37.2	−.432	.668
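
To make the role of the interaction term concrete, the following Python sketch evaluates the systematic part of the model equation using the fixed-effect estimates from Table 3. It is only an illustration: the coding of the predictors is not stated in the text, so the sketch assumes that word agency is coded 0 (low) or 1 (high) and that the fantasy score is standardized, and it sets all random effects to zero by default. The function name `predicted_rating` is hypothetical.

```python
def predicted_rating(x1, x2, a=0.0, b1=0.132, b2=0.044, b3=0.001,
                     a_p=0.0, a_s=0.0, b_p1=0.0, b_s2=0.0):
    """Model equation without the residual term e_ij.

    x1: word agency (ASSUMED coding: 0 = low, 1 = high)
    x2: fantasy score (ASSUMED to be standardized)
    a, b1, b2, b3: fixed effects (defaults taken from Table 3)
    a_p, a_s, b_p1, b_s2: random effects (zero for an 'average' participant and word)
    """
    return a + a_p + a_s + (b_p1 + b1) * x1 + (b_s2 + b2) * x2 + b3 * x1 * x2


# Effect of a one-unit increase in fantasy, separately for each word type.
slope_low = predicted_rating(0, 1.0) - predicted_rating(0, 0.0)   # equals b2
slope_high = predicted_rating(1, 1.0) - predicted_rating(1, 0.0)  # equals b2 + b3

print(round(slope_low, 3), round(slope_high, 3))
```

Under these assumptions, the hypothesis corresponds to the fantasy slope being steeper for high-agency words than for low-agency words, that is, to a positive interaction coefficient; the two slopes computed here differ only by the interaction estimate.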

Correlation analysis

To account for the variance in agency scores, which the dichotomous coding of word agency ignores, we performed a second analysis. First, for each participant, ratings of musical appropriateness for each of the 30 words were correlated with the mean agency scores from Study 1a using Spearman rank-order correlations. Second, these correlation coefficients were correlated with the participants’ Fantasy scores. There was no significant association between the participants’ correlation coefficients and Fantasy scores (ρ = .017, p = .846).
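
The two-step procedure can be sketched as follows. This is a minimal illustration rather than the authors' analysis code: the five agency scores are taken from Table 2, whereas the participants' ratings and Fantasy scores are made up, and the helper names (`ranks`, `pearson`, `spearman`) are hypothetical. The second-step correlation is assumed here to be a Spearman correlation as well.

```python
from statistics import mean


def ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def pearson(x, y):
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5


def spearman(x, y):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Mean agency scores from Table 2 for: sad, gloomy, happy, bright, wide.
word_agency = [85.0, 50.3, 84.4, 54.1, 19.0]

# Hypothetical music-applicability ratings and Fantasy scores.
applicability = {
    "p1": [90, 40, 80, 60, 10],
    "p2": [10, 60, 20, 40, 90],
    "p3": [50, 60, 70, 80, 90],
}
fantasy = {"p1": 15, "p2": 20, "p3": 10}

# Step 1: one agency-applicability coefficient per participant.
rho = {pid: spearman(r, word_agency) for pid, r in applicability.items()}

# Step 2: relate those coefficients to the Fantasy scores.
ids = sorted(rho)
association = spearman([rho[p] for p in ids], [fantasy[p] for p in ids])
```

A positive second-step coefficient would indicate that participants with higher Fantasy scores rate high-agency words as relatively more applicable to music; the toy values here carry no empirical meaning.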

Study 3: Procedure

In Study 2, we asked participants to rate the applicability of different descriptive terms to music; we did not provide any musical stimuli. That is, participants responded to an abstract question in the absence of any actual musical context. There remained, therefore, the question as to whether an association between fantasy and agency would be evident when participants were asked to listen to musical excerpts and characterize them. To that end, we recruited 323 native English speakers for Study 3, an online study in which they responded to 24 musical excerpts. For each excerpt, participants selected the most appropriate descriptive terms (up to a maximum of three) from a list of 16 terms varying in their degree of agency. Excerpts were 15 s in duration, and participants were required to listen to at least 10 s of each excerpt to be included in the analysis. According to this criterion, we excluded 2 participants; we also excluded a further 22 participants who did not complete the survey, resulting in 299 complete responses for analysis.

We obtained the musical stimuli from a film music stimulus set compiled by Eerola and Vuoskoski (2011). The stimuli were selected to represent the four quadrants of the arousal-valence space: high arousal and positive valence (nominally happy), low arousal and positive valence (tender), high arousal and negative valence (scary), and low arousal and negative valence (sad). In the present study, each quadrant was represented by six 15 s musical excerpts. The list of stimuli and their mean valence and arousal ratings are provided in Appendix 1.

So that the task would not take too long, we trimmed the list of 30 descriptive terms (15 synonym pairs) used in Study 2 to 16 descriptive terms (8 synonym pairs). In selecting our reduced subset of descriptive terms, our first criterion was to ensure the availability of suitable descriptors for each of the four quadrants in the arousal-valence space. For each quadrant, our aim was to provide four pertinent descriptors (two synonym pairs).

Each of the three authors of the study independently rated all 30 descriptive terms according to their arousal and valence. The inter-author average paired correlation was .93 for valence ratings and .89 for arousal ratings. More importantly, the authors independently agreed on the quadrant assignments for all 30 descriptive terms.
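An "average paired correlation" among three raters can be read as the mean of the three pairwise Pearson correlations. The sketch below illustrates that reading with invented valence ratings for five words; the rater labels, the ratings, and the `pearson` helper are assumptions made for illustration and do not reproduce the authors' data or procedure.

```python
from itertools import combinations
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical valence ratings (-5 to +5) given by three raters to the same five words.
rater_valence = {
    "rater_a": [3, 4, -3, -4, 2],
    "rater_b": [3, 3, -4, -4, 3],
    "rater_c": [2, 4, -3, -5, 2],
}

# One correlation per pair of raters (a-b, a-c, b-c), then their mean.
pairwise = [
    pearson(rater_valence[a], rater_valence[b])
    for a, b in combinations(sorted(rater_valence), 2)
]
average_pairwise_r = mean(pairwise)
```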

As documented above, in Study 2, participants deemed some terms more appropriate than others for describing music. For example, terms such as sad and cheerful were rated high as musical descriptors, whereas fat and close were rated low. Moreover, the differences in music applicability for some synonym pairs were greater than for others. Examples include sympathetic/soothing and caring/soft; that is, sympathetic and caring were rated as much less applicable to music than their synonymous partners soothing and soft.

In Study 3, our aim was to investigate inter-individual variability in choosing high-agency vs. low-agency descriptors for musical excerpts. If all participants regarded soothing as more applicable to music than sympathetic, then we would be unlikely to see any tendency for high-fantasy participants to favor high-agency descriptors. Consequently, in order to increase the sensitivity of the study, we should include synonym pairs rated least different for music applicability. Accordingly, for each of the four arousal-valence quadrants, we selected those synonym pairs with the smallest differences between their mean music-applicability ratings.

The synonym pairs are shown in Table 4, classified according to the four arousal-valence quadrants. Also shown are the mean arousal and valence values (ranging from −5 to +5) according to the three authors’ ratings. The differences between the mean music-applicability scores for each synonym pair are shown in the final column. (Positive difference scores are obtained when high-agency words are rated more applicable to music; negative difference scores are obtained when low-agency words are rated as more applicable to music.) In each arousal-valence quadrant category, synonym pairs are ordered according to the absolute difference in mean applicability scores. Once again, in order to enhance sensitivity, we aimed to select those synonym pairs whose descriptive words are similarly appropriate for describing music. That is, we chose those pairs with the smallest absolute difference scores for music applicability. For low arousal/positive valence, we selected intimate/close and caring/soft; for low arousal/negative valence, we selected drained/colorless and glum/drab; for high arousal/positive valence, we selected flamboyant/colorful and happy/bright; for high arousal/negative valence, we selected violent/rough and frightened/shaky. Notice that the absolute difference in mean applicability scores was considerably higher (absolute M = 50.6) for the selected low arousal/positive valence descriptive word pairs than the mean absolute difference for all other selected word pairs (M = 12.0).
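The selection rule can be checked against the mean music-applicability ratings reported in Table 4. The following sketch keeps, for each quadrant, the two pairs with the smallest absolute applicability difference and then recomputes the two summary means quoted above. The data values are taken from Table 4; the data layout and function names are assumptions introduced for illustration.

```python
# Each entry: (high-agency word, its mean applicability, low-agency word, its mean applicability).
pairs_by_quadrant = {
    "low arousal/positive valence": [
        ("intimate", 68.77, "close", 18.72),
        ("caring", 31.93, "soft", 83.08),
        ("sympathetic", 40.53, "soothing", 94.17),
    ],
    "low arousal/negative valence": [
        ("drained", 22.26, "colorless", 32.61),
        ("glum", 63.88, "drab", 53.12),
        ("sad", 90.55, "gloomy", 78.87),
        ("depressed", 52.66, "dark", 84.37),
    ],
    "high arousal/positive valence": [
        ("flamboyant", 69.40, "colorful", 62.21),
        ("happy", 91.85, "bright", 74.21),
        ("open", 36.14, "wide", 17.28),
        ("cheerful", 92.30, "radiant", 57.42),
    ],
    "high arousal/negative valence": [
        ("violent", 61.62, "rough", 48.62),
        ("frightened", 26.58, "shaky", 39.72),
        ("defiant", 54.11, "resistant", 24.69),
        ("fat", 9.85, "heavy", 74.96),
    ],
}


def abs_difference(pair):
    """Absolute difference between the two words' mean applicability ratings."""
    return abs(pair[1] - pair[3])


def select_pairs(pairs, n=2):
    """Return the n pairs with the smallest absolute applicability difference."""
    return sorted(pairs, key=abs_difference)[:n]


def mean_abs_difference(pairs):
    return sum(abs_difference(p) for p in pairs) / len(pairs)


selected = {q: select_pairs(pairs) for q, pairs in pairs_by_quadrant.items()}

low_positive = mean_abs_difference(selected["low arousal/positive valence"])
other_pairs = [
    p for q, pairs in selected.items()
    if q != "low arousal/positive valence"
    for p in pairs
]
other_quadrants = mean_abs_difference(other_pairs)

print(round(low_positive, 1), round(other_quadrants, 1))  # 50.6 12.0
```

Applied to the Table 4 values, this rule yields the eight pairs named in the text.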

Table 4.

The list of 15 synonym pairs used in Study 2, categorized into the four valence-arousal quadrants, with the authors’ mean ratings of valence and arousal (range: −5 to +5), as well as the mean music-applicability ratings from Study 2 and the difference in applicability for each pair (positive difference scores indicate that the high-agency word was rated more applicable to music; negative difference scores indicate that the low-agency word was rated more applicable to music).

Word	Mean arousal	Mean valence	Mean music-applicability	Applicability difference
Low arousal/positive valence
Intimate	+ 0.3	+ 3.0	68.77	+ 50.05
Close	−0.3	+ 2.0	18.72
Caring	−2.3	+ 3.7	31.93	−51.15
Soft	−1.7	+ 2.7	83.08
Sympathetic	−2.3	+ 3.3	40.53	−53.64
Soothing	−2.3	+ 3.3	94.17
Low arousal/negative valence
Drained	−5.0	−4.0	22.26	−10.35
Colorless	−3.3	−3.3	32.61
Glum	−3.3	−3.3	63.88	+ 10.76
Drab	−3.0	−3.3	53.12
Sad	−3.3	−3.7	90.55	+ 11.68
Gloomy	−2.7	−3.3	78.87
Depressed	−4.3	−4.7	52.66	−31.71
Dark	−1.7	−3.7	84.37
High arousal/positive valence
Flamboyant	+ 3.7	+ 3.0	69.40	+ 7.19
Colorful	+ 3.3	+ 3.7	62.21
Happy	+ 4.0	+ 4.7	91.85	+ 17.64
Bright	+ 3.3	+ 3.7	74.21
Open	+ 1.3	+ 1.0	36.14	+ 18.86
Wide	+ 0.3	+ 0.3	17.28
Cheerful	+ 4.0	+ 4.3	92.30	+ 34.88
Radiant	+ 3.3	+ 3.7	57.42
High arousal/negative valence
Violent	+ 4.7	−4.3	61.62	+ 13.00
Rough	+ 3.3	−3.0	48.62
Frightened	+ 3.7	−4.7	26.58	−13.14
Shaky	+ 3.3	−3.3	39.72
Defiant	+ 3.7	−3.3	54.11	+ 29.42
Resistant	+ 3.3	−3.0	24.69
Fat	−1.3	−2.0	9.85	−65.11
Heavy	−1.0	−1.7	74.96

Participants received the following instructions:

In this experiment, you will hear a series of 24 musical excerpts. You will also see displayed a number of descriptive words. After the completion of the music, we want you to identify up to three words that you feel best describe the music. You can select fewer than three words, although you must select at least one word for each excerpt. When you have finished selecting the most pertinent descriptive words, press on the “next excerpt” button to continue with the next musical example.

The 16 displayed descriptors were bright, caring, close, colorful, colorless, drab, drained, flamboyant, frightened, glum, happy, intimate, rough, shaky, soft, and violent. The order of the displayed descriptors was randomized for each participant, although the order of descriptors for any given participant was retained for successive musical excerpts. Descriptive terms were displayed throughout the sounded musical excerpt. The response page included the instruction reminder: “Select up to three words.”
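One way to obtain a descriptor order that differs between participants but stays fixed across excerpts for a given participant is to seed a random generator with the participant identifier. The sketch below illustrates this idea only; the seeding scheme, the function name, and the identifier format are assumptions, and the source does not describe how the survey software implemented the randomization.

```python
import random

DESCRIPTORS = [
    "bright", "caring", "close", "colorful", "colorless", "drab", "drained",
    "flamboyant", "frightened", "glum", "happy", "intimate", "rough", "shaky",
    "soft", "violent",
]


def descriptor_order(participant_id, study_seed=0):
    """Return a shuffled copy of the descriptors that is stable for a participant."""
    # Seeding with the participant identifier makes the shuffle repeatable,
    # so every excerpt page shows this participant the same order.
    rng = random.Random(f"{study_seed}-{participant_id}")
    order = DESCRIPTORS[:]
    rng.shuffle(order)
    return order


order_for_excerpt_1 = descriptor_order("participant-001")
order_for_excerpt_2 = descriptor_order("participant-001")
```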

Following the end of each musical stimulus and having selected the most appropriate descriptive term(s), participants subsequently rated their enjoyment of the musical excerpt. The screen displayed the question, “How much did you enjoy the musical excerpt?” A horizontal slider was provided with the left and right ends labeled Did not enjoy at all and Enjoyed very much, respectively. The default slider position was set in the middle.

After completing the music descriptor selection and enjoyment rating tasks for all 24 excerpts, participants then completed the 28-question IRI empathy survey.

Results (Study 3)

For each participant, we obtained an aggregate musical-descriptor agency score for each excerpt by summing the agency values of all words chosen and dividing by the number of words chosen (i.e., the mean agency of the chosen words). This aggregate score was used as our dependent variable.
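As a minimal sketch of this aggregation, the function below averages the agency values of the one to three words a participant selected for an excerpt. The agency values shown are invented placeholders, not the ratings obtained in the earlier studies, and the function name is hypothetical.

```python
from statistics import mean

# Placeholder agency values (assumed for illustration; not the study's ratings).
agency_values = {"happy": 80.0, "bright": 30.0, "violent": 75.0, "rough": 40.0}


def aggregate_agency(chosen_words, agency):
    """Mean agency value of the words chosen for one excerpt (1 to 3 words)."""
    if not 1 <= len(chosen_words) <= 3:
        raise ValueError("participants select between one and three words")
    return mean(agency[word] for word in chosen_words)


two_word_score = aggregate_agency(["happy", "bright"], agency_values)
one_word_score = aggregate_agency(["violent"], agency_values)
```

Dividing by the number of chosen words keeps the score comparable between responses with one, two, or three selections.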

A linear mixed-effects model analysis was conducted using the lme4 R-package v. 1.1-23 (Bates et al., 2015) to investigate whether people scoring higher in trait fantasy would be more likely to choose musical descriptors that were high in agency. Further, we included enjoyment ratings in the model to investigate whether this factor had a main effect on, or interacted with IRI fantasy scores in predicting, musical-descriptor agency scores. In addition, we included valence and arousal as factors to investigate whether the potential main effects of IRI fantasy or enjoyment ratings would differ between low- and high-arousal excerpts and between low- and high-valence excerpts. In order to avoid overfitting, we created three models: the first tested the interaction between fantasy and enjoyment, the second tested the interaction between enjoyment and valence, and the third tested the interaction between enjoyment and arousal. The models employed random intercepts for both participants and excerpts, as well as random slopes for fantasy (a between-participants variable) by excerpts; for valence and arousal (between-excerpts variables) by participants; and for enjoyment (which varies both between participants and between excerpts) by participants and by excerpts. The model equations are given below:

Model 1

$$\gamma_{ij} = a + a_{p} + a_{s} + (b_{s1} + b_{1})\,x_{1i} + (b_{p2} + b_{2})\,x_{2i} + (b_{s2} + b_{2})\,x_{2j} + b_{3}\,x_{1i}\,x_{2ij} + e_{ij}$$

Note. $\gamma_{ij}$ is the dependent variable (aggregate descriptor-agency score); $a$ is the intercept; $a_{p}$ is the random intercept for participant; $a_{s}$ is the random intercept for excerpt; $b_{s1}$ is the slope of the random effect of Predictor 1 (fantasy scores) by excerpt; $x_{1i}$ is the value of the effect of Predictor 1 by excerpt; $b_{p2}$ is the slope of the random effect of Predictor 2 by participant (enjoyment scores); $b_{s2}$ is the slope of the random effect of Predictor 2 by excerpt (enjoyment scores); $x_{2i}$ is the value of the effect of Predictor 2 by participant; $x_{2j}$ is the value of the effect of Predictor 2 by excerpt; $b_{1}$ is the regular slope for Predictor 1 and $b_{2}$ is the regular slope for Predictor 2; $b_{3}$ is the slope of the interaction; $e_{ij}$ is the remaining random variance.

Models 2 and 3

$$\gamma_{ij} = a + a_{p} + a_{s} + (b_{p1} + b_{1})\,x_{1i} + (b_{p2} + b_{2})\,x_{2i} + (b_{s2} + b_{2})\,x_{2j} + b_{3}\,x_{1i}\,x_{2ij} + e_{ij}$$

Note. $\gamma_{ij}$ is the dependent variable (aggregate descriptor-agency score); $a$ is the intercept; $a_{p}$ is the random intercept for participant; $a_{s}$ is the random intercept for excerpt; $b_{p1}$ is the slope of the random effect of Predictor 1 (valence-arousal) by participant; $x_{1i}$ is the value of the effect of Predictor 1 by participant; $b_{p2}$ is the slope of the random effect of Predictor 2 by participant (enjoyment scores); $b_{s2}$ is the slope of the random effect of Predictor 2 by excerpt (enjoyment scores); $x_{2i}$ is the value of the effect of Predictor 2 by participant; $x_{2j}$ is the value of the effect of Predictor 2 by excerpt; $b_{1}$ is the regular slope for Predictor 1 and $b_{2}$ is the regular slope for Predictor 2; $b_{3}$ is the slope of the interaction; $e_{ij}$ is the remaining random variance.

The fixed effects of the linear mixed-effects models are presented in Table 5, and the random effects are provided in the Supplementary Tables on OSF. Measures of model fit (Akaike information criterion; AIC) indicated that Model 3 was the best fit to the data. There were no significant main or interaction effects related to trait fantasy. However, there were significant main effects of arousal (B = 8.13, p < .001) and enjoyment (B = 0.11, p < .001), as well as a significant interaction between arousal and enjoyment (B = −0.07, p = .002). The effect of enjoyment on the aggregate descriptor-agency scores was stronger for low-arousal excerpts compared to high-arousal excerpts.

Table 5.

The fixed and interaction effects of the three linear mixed-effects models predicting the aggregate musical-descriptor agency scores for each excerpt.

Model 1
	B	SE	df	t	p
Constant	58.81	2.24	18.95	26.20	< .001
Fantasy	−0.14	0.71	21.39	−0.20	.84
Enjoyment	0.05	0.02	24.98	2.23	.027
Fantasy: Enjoyment	0.02	0.01	220.55	0.41	.69
Model 2
	B	SE	df	t	p
Constant	60.68	2.08	11.40	29.24	< .001
Enjoyment	0.04	0.03	12.16	2.07	.060
Valence	−5.32	2.94	11.47	−1.81	.096
Enjoyment: Valence	0.05	0.03	12.07	1.65	.12
Model 3
	B	SE	df	t	p
Constant	53.89	1.51	23.53	35.58	< .001
Enjoyment	0.11	0.02	24.39	6.65	< .001
Arousal	8.13	2.11	22.13	3.85	< .001
Enjoyment: Arousal	−0.07	0.02	22.09	−3.47	.002
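One way to read the Model 3 interaction is to compute the slope of enjoyment at each arousal level. The sketch below uses the fixed effects reported for Model 3 in Table 5 and assumes, purely for illustration, that arousal was dummy-coded (0 = low, 1 = high) and that the predictors were not centered. The source does not state the coding, so the coding and the function names are assumptions rather than the authors' procedure, and random effects are ignored.

```python
# Fixed effects of Model 3 as reported in Table 5.
INTERCEPT = 53.89
B_ENJOYMENT = 0.11
B_AROUSAL = 8.13
B_INTERACTION = -0.07


def predicted_agency(enjoyment, arousal):
    """Fixed-effects prediction of the aggregate descriptor-agency score.

    `arousal` is assumed to be coded 0 (low) or 1 (high); this coding is an
    illustrative assumption, not something the source confirms.
    """
    return (
        INTERCEPT
        + B_ENJOYMENT * enjoyment
        + B_AROUSAL * arousal
        + B_INTERACTION * enjoyment * arousal
    )


def enjoyment_slope(arousal):
    """Change in predicted agency score per unit of enjoyment at a given arousal level."""
    return B_ENJOYMENT + B_INTERACTION * arousal


low_arousal_slope = round(enjoyment_slope(0), 2)   # 0.11
high_arousal_slope = round(enjoyment_slope(1), 2)  # 0.04
```

Under this assumed coding, the enjoyment slope is smaller for high-arousal excerpts than for low-arousal excerpts, which matches the direction of the negative interaction reported in the text.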

Discussion (Study 3)

This study was motivated by an effort to explain why high trait fantasy might be associated with the enjoyment of sad music. We proposed that those individuals who score high on fantasy might be more likely to perceive music as an agent or actor, and therefore sad music might be more likely to evoke a (positive) empathetic response.

In Study 2 we tested whether high-agency words would interact with participants’ fantasy scores in predicting ratings of applicability for describing music. We found no evidence consistent with such an interaction effect.

In Study 3, we had participants judge the applicability of different words in describing actual music examples. Once again, our aim was to test whether those individuals who score high on fantasy might be more likely to characterize music using high-agency descriptors. Consistent with the results of Study 2, we found no relationship between IRI trait fantasy and a tendency to favor high-agency descriptors for music, contrary to our motivating hypothesis.

Interestingly, Study 3 revealed an unexpected relationship between the tendency to favor high-agency descriptors and both musical arousal and participant enjoyment. Specifically, the more a participant enjoyed the music, the greater the likelihood of favoring high-agency descriptors of the music. In addition, the greater the musical arousal, the greater the likelihood of a participant favoring high-agency descriptors. At the same time, the results indicated a negative interaction between these two factors: for high-arousal excerpts, enjoyment contributed less to the tendency to favor high-agency descriptors.

In the case of enjoyment and agency, there is an issue of causality: does greater enjoyment dispose listeners to favor higher-agency descriptors? Or does the perception of higher agency result in greater enjoyment? In the case of arousal, the causality is clearer since arousal levels were not judged by the participants. A preference for high-agency descriptors cannot cause the musical stimuli to be more arousing. Instead, higher-arousal music must be the reason why participants prefer higher-agency descriptors.

Interpreting the overall results, high-arousal music appears to encourage listeners to perceive the music as exhibiting more agency. It may also be the case that greater enjoyment disposes listeners to characterize the music as exhibiting greater agency, but it is also possible that music perceived as exhibiting greater agency is enjoyed more.

Post hoc analyses

Recall that the motivation for the current study was to establish a better understanding of sad-music enjoyment. The musical stimuli used in Study 3 were drawn from a range of emotions, including both sad and non-sad passages. Perhaps the relationship proposed in our main hypothesis is pertinent only to nominally sad music. The stimuli for Study 3 consisted of 24 musical passages explicitly drawn from the four quadrants of the common arousal-valence model. Nominally melancholic music is typically found in the low-arousal/negative-valence quadrant. Accordingly, we conducted a post hoc analysis in which we used only data from the low-arousal/negative-valence quadrant. Once again, the aim of our analysis was to determine whether IRI fantasy scores play a statistically significant role in predicting agency scores. Our analysis showed a non-significant relationship (p = .59).

Since fantasy is implicated in the enjoyment of sad music, another post hoc approach might focus on the degree to which participants enjoy sad music. For this analysis, we created a new variable (sad-music liking). This was operationalized as the difference score between a participant’s mean enjoyment rating for low-arousal/negative-valence stimuli and the participant’s mean enjoyment rating for all other stimuli. Here, we made two predictions: those participants who most enjoy low-arousal/negative-valence stimuli would be more likely to score high on IRI fantasy and would also be more likely to choose high-agency words as appropriate music descriptors. Restricting our analysis to the low-arousal/negative-valence quadrant, we found a non-significant correlation (p = .60) between low-arousal/negative-valence music liking and IRI fantasy. In addition, we found a non-significant correlation (p = .54) between low-arousal/negative-valence music liking and word agency score.
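The sad-music liking variable can be sketched as a simple difference of means. The enjoyment ratings below are invented for a single hypothetical participant (three excerpts per quadrant rather than six, for brevity), and the function name and data layout are assumptions made for illustration.

```python
from statistics import mean

SAD_QUADRANT = "low arousal/negative valence"

# Hypothetical enjoyment ratings for one participant, grouped by quadrant.
enjoyment_by_quadrant = {
    "low arousal/negative valence": [70, 80, 60],
    "low arousal/positive valence": [50, 40],
    "high arousal/positive valence": [60, 50],
    "high arousal/negative valence": [30, 70],
}


def sad_music_liking(ratings_by_quadrant, sad_quadrant=SAD_QUADRANT):
    """Mean enjoyment of sad-quadrant excerpts minus mean enjoyment of all other excerpts."""
    sad = ratings_by_quadrant[sad_quadrant]
    others = [
        rating
        for quadrant, ratings in ratings_by_quadrant.items()
        if quadrant != sad_quadrant
        for rating in ratings
    ]
    return mean(sad) - mean(others)


liking_score = sad_music_liking(enjoyment_by_quadrant)  # 70 - 50 = 20
```

A positive score indicates that the participant enjoyed the low-arousal/negative-valence excerpts more than the remaining excerpts.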

Recall that apart from fantasy, the IRI assesses three other facets of empathy: empathic concern, personal distress, and perspective taking. Although our original research plan focused exclusively on fantasy, the enjoyment of sad music is known to also be related to empathic concern. Notably, sad-music likers tend to score higher on empathic concern. Consequently, we tested whether those participants who scored higher on empathic concern were more likely to favor high-agency descriptors for low-arousal/negative-valence music. Once again, we found a non-significant correlation.

We also decided to test whether the current results replicated those of earlier studies regarding the relationship between trait fantasy, empathic concern, and sad-music enjoyment. Several studies have found that listeners who most enjoy nominally sad music tend to score high on trait fantasy and empathic concern in the IRI empathy scale (Eerola et al., 2016; Kawakami & Katahira, 2015; Sattmann & Parncutt, 2018; Vuoskoski & Eerola, 2017). Consistent with this research, we found a significant positive association between low-arousal/negative-valence music liking and IRI fantasy (r = .12, p = .045). We also found a significant positive relationship between liking for low-arousal/negative-valence music and empathic concern (r = .16, p = .005).

In our final post hoc analysis, we examined the possibility that the significant relationship between high-arousal music and the selection of more high-agency descriptors might be due to some inherent bias toward high agency among the high-arousal descriptors. We did this by calculating the Spearman rank-order correlation between the mean arousal ratings given by the three authors for the descriptive terms (obtained after Study 2) and the mean agency ratings obtained in Study 1A. There was no significant correlation (r = .11, p = .53), suggesting that the observed association between agency and musical arousal is unlikely to be due to any systematic bias in the study design. However, since the present study utilized only a limited number of musical excerpts and descriptive terms, it is nevertheless possible that the design may have caused certain high-arousal, high-agency words to be favored over their low-agency alternatives.

Study 3: Conclusion

The results of this study did not reveal any positive association between trait fantasy and high-agency descriptors in a music-listening task. However, we found that when asked to select words that best describe a particular musical passage, participants were more likely to select words associated with high agency for high-arousal music as well as for music that they enjoyed.

General discussion

In the first instance, our results were not consistent with the motivating hypothesis that high-fantasy listeners are more likely to impute agency to instrumental music. We failed to find a relationship between trait fantasy and the tendency to favor high-agency descriptors for music both in a word-rating task and in a listening experiment.

While the participants in our studies found many of the high-agency descriptors highly appropriate for describing music, our findings suggest that trait fantasy does not seem to facilitate the detection or perception of agency in instrumental music. Davis defined “fantasy” as “the tendency to imaginatively transpose oneself into fictional situations (e.g., books, movies, daydreams)” (Davis, 1980, p. 11).¹ As defined, the fantasy facet aims to assess the degree to which a person is more or less transported, absorbed, or gets into some fantasized or imagined experience. Thus, it seems plausible that fantasy contributes to the degree to which listeners empathically identify with the detected agency cues in music rather than to the detection of these cues. However, further experimental research is required to investigate this possibility more thoroughly.

Although the present series of studies did not reveal any significant relationship between trait fantasy and agent-sensitivity in music listening, the results nevertheless suggest that the detection of agency may be important in music-related emotion. Specifically, we found that both enjoyment and musical arousal were positively associated with the tendency to select musical descriptors implying high agency. The fact that enjoyment was positively associated with the tendency to favor high-agency descriptors supports the view that social cognition and social emotions play an important role in our experience of music (cf. Clarke et al., 2015; Huron & Vuoskoski, 2020). Specifically, it is possible that the positive relationship between enjoyment and attribution of agency is related to a process of identification with the music. The experience of identification is closely related to empathy (e.g., Davis, 1980; Egermann & McAdams, 2012), and appears to involve heightened enjoyment and increased similarity between the listener’s perceived and felt emotions (see, for example, Egermann & McAdams, 2013; Schubert, 2007, 2013). Sloboda (2000) suggests that music may create an environment where the attribution of the detected emotion, either to oneself or to an external agent, may be particularly fluid. The experience of identification may even involve sharing the emotions of an imagined, indefinite agent or persona conveyed by the music (Levinson, 2006), a process in which trait fantasy may potentially play a modulating role. However, the question of causality remains open: does the perception of agency contribute to increased identification and enjoyment, or do enjoyment and/or the experience of identification make listeners more prone to selecting descriptors implying high agency? Both options appear plausible and should be subject to further experimental investigation.

The arousal dimension is positively associated with energy and activity (cf. Schimmack & Grob, 2000), and thus it is possible that the perception of high activity contributes to an increased sense of agency. It may also be that high-arousal music evokes more intense emotions in listeners (cf. Dibben, 2004) or gives rise to more salient motion imagery (cf. Eitan & Granot, 2006), which in turn could contribute to increased attribution of agency to music. However, it should be noted that this conjecture remains speculative, since the current experiments did not investigate induced emotional responses or imagery. It is furthermore possible that some aspect of the experimental design may have favored high-arousal, high-agency descriptors over their low-agency counterparts, although we attempted to mitigate this possibility with our post hoc analyses. Nevertheless, future studies should investigate how the detection of agency contributes to music-induced emotions (or vice versa) since music enjoyment has been shown to be positively associated with emotional intensity (e.g., Ladinig & Schellenberg, 2012).

By way of summary, the research presented here suggests that listeners find both high- and low-agency descriptors appropriate for describing music. The tendency to favor high-agency descriptors was associated with increased enjoyment and musical arousal. Trait fantasy did not appear to facilitate the attribution of agency to music.

Appendix

Appendix 1

List of stimuli used in Study 3, with mean valence and arousal ratings obtained from Eerola and Vuoskoski (2011).

Category	Number^a	Soundtrack name	Valence (range: 1–9)	Arousal (range: 1–9)
Positive valence and low arousal	44	Pride and Prejudice	7.38	3.21
	48	Dracula	5.85	3.56
	83	Big Fish	6.40	4.49
	98	Naked Lunch	5.46	4.78
	102	Shakespeare In Love	6.01	4.96
	103	The Fifth Element	5.87	4.54
Positive valence and high arousal	23	Shallow Grave	8.27	8.54
	53	Gladiator	7.07	6.76
	72	Man of Galilee CD1	7.45	8.39
	75	Batman	7.31	8.04
	77	Lethal Weapon 3	6.27	6.34
	78	Crouching Tiger	5.30	5.87
Negative valence and low arousal	32	Running Scared	4.04	3.67
	33	The Portrait of a Lady	4.38	2.48
	38	Dracula	4.73	2.79
	63	Batman	4.76	3.96
	89	Blanc	4.33	3.21
	90	Batman Returns	4.43	3.55
Negative valence and high arousal	2	The Rainmaker	2.50	8.21
	64	The Fifth Element	3.03	5.12
	91	The Alien Trilogy	3.69	7.09
	92	The Fifth Element	2.99	6.99
	93	Babylon 5	2.46	7.25
	97	Shallow Grave	2.88	6.51

Stimuli were obtained from the database published by Eerola and Vuoskoski (2011).

^a Stimulus number in the set of 110 film music excerpts published by Eerola and Vuoskoski (2011).

Funding

The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was partially supported by the Research Council of Norway through its Centres of Excellence scheme, project number 262762.

ORCID iD

Jonna K. Vuoskoski

References

Aucouturier

J.-J.

Canonne

(2017). Musical friends and foes: The social cognition of affiliation and control in improvised interactions. Cognition, 161, 94–108. https://doi.org/10.1016/j.cognition.2017.01.019

Bates

Mächler

Bolker

Walker

. (2015). Fitting linear mixed-effects models using lme4. Journal of Statistical Software, 67(1), 1–48. https://doi.org/10.18637/jss.v067.i01

Broze

G. J.

(2013). Animacy, Anthropomimesis, and musical line [PhD dissertation]. School of Music, Ohio State University.

Clarke

DeNora

Vuoskoski

(2015). Music, empathy and cultural understanding. Physics of Life Reviews, 15, 61–88. https://doi.org/10.1016/j.plrev.2015.09.001

Colver

M. C.

El-Alayli

(2016). Getting aesthetic chills from music: The connection between openness to experience and frisson. Psychology of Music, 44(3), 413–427. https://doi.org/10.1177/0305735615572358

Costa

Alves

Neto

Marvão

Portela

Costa

M. J.

(2014). Associations between medical student empathy and personality: A multi-institutional study. PLOS ONE, 9(3), Article e89254. https://doi.org/10.1371/journal.pone.0089254

Davies

(1997). Contra the hypothetical persona in music. In Hjort

Laver

(Eds.), Emotion and the arts (pp. 95–109). Oxford University Press.

Davis

M. H.

(1980). A multidimensional approach to individual differences in empathy. JSAS Catalog of Selected Documents in Psychology, 10, Article 85.

Davis

M. H.

(1983). Measuring individual differences in empathy: Evidence for a multidimensional approach. Journal of Personality and Social Psychology, 44(1), 113–126. https://doi.org/10.1037/0022-3514.44.1.113

10.

Dibben

(2004). The role of peripheral feedback in emotional experience with music. Music Perception, 22(1), 79–115. https://doi.org/10.1525/mp.2004.22.1.79

11.

Dobrota

Reić Ercegovac

(2015). The relationship between music preferences of different mode and tempo and personality traits–implications for music pedagogy. Music Education Research, 17(2), 234–247. https://doi.org/10.1080/14613808.2014.933790

12.

Eerola

Vuoskoski

J. K.

(2011). A comparison of the discrete and dimensional models of emotion in music. Psychology of Music, 39(1), 18–49. https://doi.org/10.1177/0305735610362821

13.

Eerola

Vuoskoski

J. K.

Kautiainen

(2016). Being moved by unfamiliar sad music is associated with high empathy. Frontiers in Psychology, 7, Article 1176. https://doi.org/10.3389/fpsyg.2016.01176

14.

Egermann

McAdams

(2013). Empathy and emotional contagion as a link between recognized and felt emotions in music listening. Music Perception, 31(2), 139–156. https://doi.org/10.1525/mp.2013.31.2.139

15.

Eitan

Granot

R. Y.

(2006). How music moves: Musical parameters and listeners images of motion. Music Perception, 23(3), 221–248. https://doi.org/10.1525/mp.2006.23.3.221

16.

Fiske

A. P.

Seibt

Schubert

(2019). The sudden devotion emotion: Kama muta and the cultural practices whose function is to evoke it. Emotion Review, 11(1), 74–86. https://doi.org/10.1177/1754073917723167

17.

Garrido

Schubert

(2011). Individual differences in the enjoyment of negative emotion in music: A literature review and experiment. Music Perception, 28(3), 279–296. https://doi.org/10.1525/mp.2011.28.3.279

18.

Hatten

R. S.

(2018). A theory of virtual agency for western art music. Indiana University Press.

19.

Huron

Vuoskoski

J. K.

(2020). On the enjoyment of sad music: Pleasurable compassion theory and the role of trait empathy. Frontiers in Psychology, 11, Article 1060. https://doi.org/10.3389/fpsyg.2020.01060

20.

Kawakami

Katahira

(2015). Influence of trait empathy on the emotion evoked by sad music and on the preference for it. Frontiers in Psychology, 6, Article 1541. https://doi.org/10.3389/fpsyg.2015.01541

21.

Ladinig

Schellenberg

E. G.

(2012). Liking unfamiliar music: Effects of felt emotion and individual differences. Psychology of Aesthetics, Creativity, and the Arts, 6(2), 146–154. https://doi.org/10.1037/a0024671

22.

Launay

(2015). Musical sounds, motor resonance, and detectable agency. Empirical Musicology Review, 10(1–2), 30–40. https://doi.org/10.18061/emr.v10i1-2.4579

23.

Levinson

(1996). The pleasures of aesthetics: Philosophical essays. Cornell University Press.

24.

Levinson

(2006). Musical expressiveness as hearability-as-expression. In Kieran

(Ed.), Contemporary debates in aesthetics and the philosophy of art (pp. 192–206). Blackwell Publishing.

25.

Margulis

E. H.

(2017). An exploratory study of narrative experiences of music. Music Perception, 35(2), 235–248. https://doi.org/10.1525/mp.2017.35.2.235

26.

Margulis

E. H.

Wong

P. C. M.

Turnbull

Kubit

B. M.

McAuley

J. D.

(2022). Narratives imagined in response to instrumental music reveal culture-bounded intersubjectivity. Proceedings of the National Academy of Sciences of the United States of America, 119(4), Article e2110406119. https://doi.org/10.1073/pnas.2110406119

27.

Maruskin

L. A.

Thrash

T. M.

Elliot

A. J.

(2012). The chills as a psychological construct: Content universe, factor structure, affective composition, elicitors, trait antecedents, and consequences. Journal of Personality and Social Psychology, 103(1), 135–157. https://doi.org/10.1037/a0028117

28.

McCrae

R. R.

(2007). Aesthetic chills as a universal marker of openness to experience. Motivation and Emotion, 31(1), 5–11. https://doi.org/10.1007/s11031-007-9053-1

29.

Melchers

M. C.

Haas

B. W.

Reuter

Bischoff

Montag

(2016). Similar personality patterns are associated with empathy in four different countries. Frontiers in Psychology, 7, Article 290. https://doi.org/10.3389/fpsyg.2016.00290

30. Menninghaus, Wagner, Hanich, Wassiliwizky, Kuehnast, Jacobsen (2015). Towards a psychological construct of being moved. PLOS ONE, 10(6), Article e0128451. https://doi.org/10.1371/journal.pone.0128451

31. Nusbaum E. C., Silvia P. J. (2011). Shivers and timbres: Personality and the experience of chills from music. Social Psychological and Personality Science, 2(2), 199–204. https://doi.org/10.1177/1948550610386810

32. Palan, Schitter (2018). Prolific.ac—A subject pool for online experiments. Journal of Behavioral and Experimental Finance, 17, 22–27. https://doi.org/10.1016/j.jbef.2017.12.004

33. Peer, Brandimarte, Samat, Acquisti (2017). Beyond the Turk: Alternative platforms for crowdsourcing behavioral research. Journal of Experimental Social Psychology, 70, 153–163. https://doi.org/10.1016/j.jesp.2017.01.006

34. Rentfrow P. J., Gosling S. D. (2003). The do re mi’s of everyday life: The structure and personality correlates of music preferences. Journal of Personality and Social Psychology, 84(6), 1236–1256. https://doi.org/10.1037/0022-3514.84.6.1236

35. Robinson, Hatten R. S. (2012). Emotions in music. Music Theory Spectrum, 34(2), 71–106. https://doi.org/10.1525/mts.2012.34.2.71

36. Sattmann, Parncutt (2018, July 23–28). The role of empathy in musical chills [Paper presentation]. International Conference on Music Perception and Cognition, Graz.

37. Schimmack, Grob (2000). Dimensional models of core affect: A quantitative comparison by means of structural equation modeling. European Journal of Personality, 14(4), 325–345. https://doi.org/10.1002/1099-0984(200007/08)14:4<325::AID-PER380>3.0.CO;2-I

38. Schubert (2007). The influence of emotion, locus of emotion and familiarity upon preference in music. Psychology of Music, 35(3), 499–515. https://doi.org/10.1177/0305735607072657

39. Schubert (2013). Emotion felt by the listener and expressed by the music: Literature review and theoretical perspectives. Frontiers in Psychology, 4, Article 837. https://doi.org/10.3389/fpsyg.2013.00837

40. Silvia P. J., Fayn, Nusbaum E. C., Beaty R. E. (2015). Openness to experience and awe in response to nature and music: Personality and profound aesthetic experiences. Psychology of Aesthetics, Creativity, and the Arts, 9(4), 376–384. https://doi.org/10.1037/aca0000028

41. Silvia P. J., Nusbaum E. C. (2011). On personality and piloerection: Individual differences in aesthetic chills and other unusual aesthetic experiences. Psychology of Aesthetics, Creativity, and the Arts, 5(3), 208–214. https://doi.org/10.1037/a0021914

42. Sloboda J. A. (2000). Musical performance and emotion: Issues and developments. In Yi S. W. (Ed.), Music, mind, and science (pp. 220–238). Western Music Research Institute.

43. Taruffi, Koelsch (2014). The paradox of music-evoked sadness: An online survey. PLOS ONE, 9(10), Article e110490. https://doi.org/10.1371/journal.pone.0110490

44. Thompson W. F., Geeves A. M., Olsen K. N. (2019). Who enjoys listening to violent music and why? Psychology of Popular Media Culture, 8(3), 218–232. https://doi.org/10.1037/ppm0000184

45. Vuoskoski J. K., Eerola (2017). The pleasure evoked by sad music is mediated by feelings of being moved. Frontiers in Psychology, 8, Article 439. https://doi.org/10.3389/fpsyg.2017.00439

46. Vuoskoski J. K., Thompson W. F., McIlwain, Eerola (2012). Who enjoys listening to sad music and why? Music Perception, 29(3), 311–317. https://doi.org/10.1525/mp.2012.29.3.311

47. Vuoskoski J. K., Zickfeld J. H., Alluri, Moorthigari, Seibt (2022). Feeling moved by music: Investigating continuous ratings and acoustic correlates. PLOS ONE, 17(1), Article e0261151. https://doi.org/10.1371/journal.pone.0261151

48. Watt R. J., Ash R. L. (1998). A psychological investigation of meaning in music. Musicae Scientiae, 2(1), 33–53.

49. Zentner, Grandjean, Scherer K. R. (2008). Emotions evoked by the sound of music: Characterization, classification, and measurement. Emotion, 8(4), 494–521. https://doi.org/10.1037/1528-3542.8.4.494