Abstract
When people make cross-modal matches from classical music to colors, they choose colors whose emotional associations fit the emotional associations of the music, supporting the emotional mediation hypothesis. We further explored this result with a large, diverse sample of 34 musical excerpts from different genres, including Blues, Salsa, Heavy metal, and many others, a broad sample of 10 emotion-related rating scales, and a large range of 15 rated music–perceptual features. We found systematic music-to-color associations between perceptual features of the music and perceptual dimensions of the colors chosen as going best/worst with the music (e.g., loud, punchy, distorted music was generally associated with darker, redder, more saturated colors). However, these associations were also consistent with emotional mediation (e.g., agitated-sounding music was associated with agitated-looking colors). Indeed, partialling out the variance due to emotional content eliminated all significant cross-modal correlations between lower level perceptual features. Parallel factor analysis (Parafac, a type of factor analysis that encompasses individual differences) revealed two latent affective factors—
Introduction
Music–color synesthesia is a rare and interesting neurological phenomenon in which listening to music automatically and involuntarily leads to the conscious experience of color (Ward, 2013). Although only a small proportion of people have such synesthesia, recent evidence suggests that self-reported nonsynesthetes exhibit robust and systematic music-to-color associations (e.g., Isbilen & Krumhansl, 2016; Lindborg & Friberg, 2015; Palmer, Schloss, Xu, & Prado-Leon, 2013; Palmer, Langlois, & Schloss, 2016). For example, when asked to choose a color that “goes best” with a classical musical selection, both U.S. and Mexican participants chose lighter, more saturated, yellower colors as going better with faster music in the major mode (Palmer et al., 2013).
Two general hypotheses have been proposed to explain such music-to-color associations in both synesthetes and nonsynesthetes: direct links and emotional mediation. The direct link hypothesis asserts that musical sounds and visual colors are related via direct correspondences between the perceived properties of the two types of stimuli (e.g., Caivano, 1994; Pridmore, 1992; Wells, 1980). For example, Caivano (1994) proposed that the octave-based musical scale maps to the hue circle, luminosity to loudness, saturation to timbre, and size to duration. Pridmore (1992) suggested that hue maps onto tone via wavelength and that loudness maps to brightness via amplitude. Lipscomb and Kim (2004) found that pitch mapped to vertical location, timbre with shape, and loudness with size. A well-documented example is that higher pitched tones are associated with lighter, brighter colors (e.g., Collier & Hubbard, 2004; Marks, 1987; Ward, Huckstep, & Tsakanikos, 2006).
In contrast, the emotional mediation hypothesis suggests that color and music are related indirectly through common, higher level, emotional associations (Barbiere, Vidal, & Zellner, 2007; Bresin, 2005; Lindborg & Friberg, 2015; Palmer et al., 2013, 2016; Sebba, 1991). In this view, colors are associated with music based on shared emotional content. For example, happy-sounding music would be associated with happy-looking colors and sad-sounding music with sad-looking colors. Most evidence for emotional mediation comes from studies in which nonsynesthetes were asked to choose colors according to how well they went with different selections of classical music (Isbilen & Krumhansl, 2016; Palmer et al., 2013, 2016). Participants subsequently rated each musical selection and each color on several emotion-related scales (e.g., happy/sad, angry/calm, strong/weak), so the role of emotion in their color–music associations could be assessed. Results showed strong correlations between the emotion-related ratings of the musical excerpts and the emotion-related ratings of the colors people chose as going best/worst with the music (e.g., +.97 for happy/sad and +.96 for strong/weak; Palmer et al., 2013). Similarly, strong correlations were found for classical music even when using highly controlled, single-line piano melodies by Mozart (e.g., +.92 for happy/sad and +.85 for strong/weak; Palmer et al., 2016). These results were corroborated for Bach preludes (Isbilen & Krumhansl, 2016) and film music (Lindborg & Friberg, 2015) using different methods and analytic techniques.
Note that emotional mediation does not imply that the music–perceptual features are irrelevant to people's color associations. Indeed, combinations of the music's perceptual features (e.g., subjective qualities such as loudness and rhythmic complexity) carry information that helps determine the emotional character of the music (Friberg, Schoonderwaldt, Hedblad, Fabiani, & Elowsson, 2014), but this emotional interpretation may be influenced by experience, such as cultural differences (e.g., Gregory & Varney, 1996). For example, major/minor mode is important in conveying happy/sad emotions in music for Western listeners (e.g., Parncutt, 2014), but this does not necessarily mean that music in the major mode maps directly to colors that are bright and saturated in the absence of happy emotions.
The 34 Musical Excerpts.
Note. All numbers correspond to the weighted average color–appearance ratings of the colors picked to go with the music (PMCA scores), averaged across subjects (details described in the subsection “Results of lower-level perceptual correlations in music-to-color associations” of the “Results and Discussion” section and Appendix A). Entries in boldface indicate selections for which first-choice colors are presented in Figure 2. Sat. = saturation; L/D = light(+)/dark(−); Y/B = yellow(+)/blue(−); R/G = red(+)/green(−).
The Music–Perceptual Features (a) and Emotion-Related Scales (b).
Note. See Supplementary Data1-4.csv for across-subject averages.
Methods
Participants
Three independent groups of participants all gave written informed consent and received either course credit or monetary compensation for their time. The experimental protocol was approved by the Committee for the Protection of Human Subjects at the University of California, Berkeley (approval #2010-07-1813).
Group A
Thirty participants (18 females) performed Tasks A1 to A4 described later. All had normal color vision, as screened with the Dvorine Pseudo-Isochromatic Plates, and none had any form of synesthesia, as assessed by the initial Synesthesia Battery questionnaire (Eagleman, Kagan, Nelson, Sagaram, & Sarma, 2007).
Group B
Fifteen musicians (four females), who were members of either the University of California Marching Band or the University of California Wind Ensemble, provided ratings for the musical excerpts in terms of 15 global musical features, as described in Task B1.
Group C
The color–perceptual ratings data from 48 participants from Palmer et al. (2013) were used in evaluating the present results (see Task C1).
Design and Stimuli
Colors
The colors were the Berkeley Color Project 37 (BCP-37) colors studied by Palmer et al. (2013) (Figure 1; Table S1 in Supplementary Materials for CIE 1931 xyY and Munsell coordinates). The colors included eight hues (red (R), orange (O), yellow (Y), chartreuse (H), green (G), cyan (C), blue (B), and purple (P)) sampled at four “cuts” (saturation/lightness levels): saturated (S), light (L), muted (M), and dark (D). The colors were initially sampled from Munsell space (Munsell, 1966), with the goal of obtaining highly saturated colors (S) within each hue, and then sampling less-saturated versions of those hues at varying lightness levels—light (L), medium (M), and dark (D). The Munsell coordinates were translated to CIE 1931 xyY coordinates using the Munsell Renotation Table (Wyszecki & Stiles, 1982). The S-colors for each hue were the colors with the highest saturation that could be displayed on the computer monitor used by Palmer et al. (2013) for each hue. The lightness (value) of the S-colors differed across hues because the most saturated versions of each hue occur at different lightness levels across hues, due to the nature of the visual system (Wyszecki & Stiles, 1982). In Munsell space, the L-colors were approximately halfway between S-colors and white, M-colors were approximately halfway between S-colors and neutral gray, and D-colors were approximately halfway between S-colors and black. Therefore, the L-, M-, and D-colors had the same Munsell chroma within each hue, and their chroma and value were scaled relative to the S-color of each hue.
The 37 colors used in the experiment (from Palmer et al., 2013). The top left and bottom right gray appeared twice for consistency with the stimulus design and with Palmer et al. (2013).
There were also five achromatic colors: white (WH), black (BK), light gray (AL), medium gray (AM), and dark gray (AD). In this notation, “A” stands for achromatic, and the subscript stands for light (L), medium (M), and dark (D). AL had approximately the average luminance of all eight L-colors, AM had approximately the average luminance of all eight M-colors (and of the S-colors), and AD had approximately the average luminance of all eight D-colors. The colors were always presented on a gray background (CIE x = 0.312, y = 0.318, Y = 19.26).
In the music-to-color association task, the 37 colors were displayed on the screen in the spatial array shown in Figure 1, with each color displayed as a 60 × 60 pixel square. All visual displays were presented on a 21.5 in. iMac computer monitor with a resolution of 1680 × 1050 pixels using Presentation software (www.neurobs.com). Tasks that displayed the 37 colored squares were completed in a dark room. The monitor was characterized using a Minolta CS100 Chromometer to ensure that the correct colors were presented. The deviation between each target color's CIE xyY coordinates (Table S1) and its measured CIE xyY coordinates was less than .01 for x and y and less than 5 cd/m² for Y.
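The calibration criterion stated above can be written as a simple tolerance check. The sketch below is illustrative only: the function name is hypothetical, the target is the background gray given in the text, and the "measured" reading is invented for the example rather than taken from the study.

```python
def within_tolerance(target, measured, xy_tol=0.01, lum_tol=5.0):
    """Return True if a measured CIE xyY triple meets the stated criterion.

    target, measured: (x, y, Y) tuples, with Y as luminance in cd/m^2.
    xy_tol and lum_tol mirror the thresholds quoted in the text.
    """
    tx, ty, t_lum = target
    mx, my, m_lum = measured
    return (abs(mx - tx) < xy_tol
            and abs(my - ty) < xy_tol
            and abs(m_lum - t_lum) < lum_tol)


# Background gray from the text; the reading below is made up for illustration.
background_target = (0.312, 0.318, 19.26)
hypothetical_reading = (0.315, 0.321, 20.10)
print(within_tolerance(background_target, hypothetical_reading))
```

In practice the same check would be applied to each of the 37 colors in Table S1.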
Music
The 34 musical stimuli were instrumental excerpts from 34 different genres (Table 1 and Table S2). The primary goal of the selection procedure was to use a more diverse sample of music than in previous studies (Isbilen & Krumhansl, 2016; Palmer et al., 2013, 2016). The first, second, and last authors chose excerpts that (a) contained no lyrics (to avoid contamination by the meaning of the words), (b) were unlikely to be familiar to our undergraduate participants, (c) conveyed a range of different emotions, and (d) were musically distinct, so that no two selections sounded too similar. None of these selections should be interpreted as standing for the entire genre used to label them, but only as single examples that were chosen to achieve a diversity of excerpts that come from different (author-identified) genres. Because musical genres are highly variable, they span wide ranges of variation that cannot be represented by any single example.
The same authors chose the names used in referring to the genres and the excerpts, which were never displayed or mentioned to the participants. In most instances, the genre name corresponded to the genre the artist affiliated with their music on their website or album or the genre label of the given musical selection on iTunes. The exceptions were two musical excerpts that were labeled by the dominant timbre (Gamelan and Piano), as well as the three classical pieces, which were labeled with the name of the composer—Mozart, Bach, and Stravinsky—to differentiate easily among them. We make no claim that the genre-excerpts studied were sampled systematically from among all forms of music, only that the present musical excerpts were more diverse than stimuli used in several previous studies examining music-to-color associations.
The musical excerpts were edited using Audacity software (audacity.sourceforge.net) by clipping a 15-s excerpt and adding a 2-s fade-in and fade-out. All musical excerpts were presented through closed-ear headphones (Sennheiser Model HD 270). The level was determined by having a different set of 19 participants listen to the 34 excerpts through the same headphones and adjust the volume “to the appropriate loudness level” for that musical selection. The level of each musical selection for the main experiment was determined by the average data from this task. This method was used to present the excerpts at more natural, ecologically valid listening levels than if the level had been constant across musical stimuli.
The emotion-related scales were selected from a larger set of 40 scales, with the purpose of retaining 10 scales that were distinct from one another and relevant for both colors and music, based on pilot data from n = 28 participants (see S1 Text). The 15 music–perceptual features were selected by examining the music cognition literature (e.g., Rentfrow et al., 2012) and choosing any feature that seemed potentially relevant to the 34 musical excerpts in the experimenters' collective judgment (Table 2(a)).
Experimental Tasks
Overview
Summary of the Six Tasks.
Task A1: Music-to-color associations
Participants heard the 34 musical excerpts in an individualized random order while viewing the 37-color array. After hearing each full, 15-s selection at least once, participants were asked to choose the three colors that were most consistent with each selection as it was playing. The cursor used to select the colors appeared on the screen after the 15-s excerpt played once, so that participants were required to listen to the entire selection before they could begin selecting colors. They chose the most, second-most, and third-most consistent colors in that order, with each color disappearing as it was selected. After all 37 colors reappeared on the screen, participants were asked to choose the three colors that were most inconsistent with the music, choosing the most, second-most, and third-most inconsistent colors in that order. The music looped continuously until all six color choices had been made for that selection, so that participants could listen to the music as many times as they wished during a given trial. The next musical selection and the full color array were then presented after a delay of 500 ms.
Task A2: Color–emotion ratings
Participants then rated each of the 37 colors on each of 10 bipolar emotion-related scales, including their personal preference (Table 2(b)). The preference ratings (like/dislike) of all 37 colors were always made first to avoid contamination by the other emotion-related ratings.
Color preferences were rated on a continuous line-mark scale from Not At All to Very Much. Each color was centered on the screen above the response scale. Participants slid the cursor along the response scale and clicked at the appropriate position to record their response. To anchor the scale prior to making their ratings, participants were shown the entire 37-color array, asked to point to the color they liked the most, and were told that they should rate that color at or near the Very Much end point of the scale. They were then asked to point to the color they liked the least and were told that they should rate that color at or near the Not At All end point of the scale.
Before performing the other nine emotion-related rating tasks, participants were anchored on each scale by analogous anchoring procedures with appropriate labels at the ends of the bipolar response scale. This anchoring procedure was completed verbally with the experimenter for all nine scales before the subject rated any color on any scale. All nine ratings for one color were made before the next color was presented, and the order of the scales was randomized for each color within participants. The order in which the colors were presented was randomized across participants.
Task A3: Music–emotion ratings
Participants rated the same 34 musical excerpts on all 10 emotion-related scales (Table 2(b)) in a manner analogous to that for the colors, with all 34 preference ratings being made before any of the other emotion-related ratings to avoid contamination. The primary difference was that for the anchoring procedures, participants were instructed to recall which previously heard excerpt they thought was, for example, the happiest (or the saddest), and to click at or near the happy (or sad) end of the response scale for that selection. The musical excerpts were played one at a time in a random order, and each had to be heard all the way through once before being rated on the scales. All nine scales were rated for one musical excerpt before the next excerpt was presented.
Task A4: Synesthesia questionnaire
In the final task, each participant took the Synesthesia Battery questionnaire (Eagleman et al., 2007). If they answered “yes” to any of the questions, they were asked to describe their synesthesia and estimate how frequently it occurs. No data from the experimental tasks were included in the analysis for any participant who answered “yes” to any question.
Task B1: Music–perceptual ratings
The 15 musicians rated each musical excerpt on their perception of each of the 15 bipolar, music–perceptual features listed in Table 2(a). Ratings were made in a manner analogous to the emotion-related rating task for music (Task A3). Because these participants had not previously heard the musical excerpts, they all listened to the same representative sample of five musical excerpts before beginning the experiment to exemplify extremes of salient musical features (e.g., the Heavy metal selection was included as an extreme example of electric, distorted, loud, heavy, low pitch, and punchy, whereas the Piano selection was included as an extreme example of clear, few instruments, soft, light, high pitch, and smooth). No mention was made of these or any other musical features either before or during the initial presentation of these five selections, however. After listening to all five selections, participants completed an anchoring procedure analogous to the anchoring procedure for the emotion-related rating task for music as described earlier. The 34 excerpts were then presented in a random order and looped continuously until participants completed the task for that excerpt.
Task C1: Color–perceptual ratings
The 48 participants described in Palmer et al. (2013) rated the appearance of each of the 37 colors on four color–appearance dimensions—saturated/desaturated, light/dark, red/green, and yellow/blue—using a 400 pixel line-mark rating scale analogous to those described earlier. The anchoring procedure for each dimension was also analogous to that described earlier for color–emotion ratings (Task A2). Trials were blocked by colors, and the order of the dimensions was randomized within color blocks. The order of colors was randomized across participants.
Statistical Analysis
All participants completed their given tasks, so there were no missing data points. Across-subject agreement for each rating scale was measured using Cronbach's alpha, and all indicated good-to-excellent consistency (Table S4). Examination of the Q-Q plots of all of the average rating scales suggested that the normality assumption did not hold for some of the scales. To be conservative, all reported correlations are Spearman's rho. Parafac does not make any assumptions about the distribution of the data. The only assumption is that the data display systematic variation across three or more modes, which fits well with our three-mode rating data sets, consisting of stimuli, rating scales, and subjects.
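As a rough illustration of the agreement measure, the sketch below computes Cronbach's alpha for one rating scale. The text does not specify the data layout, so this assumes a common convention in which each subject is treated as an "item" and each stimulus as a "case"; the function name and toy ratings are hypothetical.

```python
from statistics import variance


def cronbach_alpha(ratings):
    """Cronbach's alpha for one rating scale.

    ratings[s][i] is the rating given by subject s to stimulus i.
    Assumption: subjects play the role of 'items', stimuli the role of 'cases'.
    """
    k = len(ratings)                      # number of subjects
    n_stimuli = len(ratings[0])
    # Variance of each subject's ratings across stimuli.
    subject_vars = [variance(subject) for subject in ratings]
    # Variance of the summed ratings (one total per stimulus).
    totals = [sum(ratings[s][i] for s in range(k)) for i in range(n_stimuli)]
    return (k / (k - 1)) * (1 - sum(subject_vars) / variance(totals))


# Toy data: three hypothetical subjects rating four stimuli.
toy_ratings = [
    [10, 40, 60, 90],
    [20, 35, 70, 85],
    [15, 50, 55, 95],
]
print(round(cronbach_alpha(toy_ratings), 3))
```

Values near 1 indicate that subjects ordered the stimuli similarly on that scale.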
Results and Discussion
Figure 2 shows the first-choice colors for each of the 30 participants for eight of the musical excerpts. The musical excerpts in Figure 2 were ones that showed particularly strong contrasts along each of the four color–appearance dimensions (see Figure S1 for all 34 musical excerpts). These examples illustrate how participants chose different kinds of colors as going best with different musical excerpts. The colors chosen as going best with the Ska selection, for example, are noticeably more saturated (vivid) than those chosen as going best with the Indie selection, despite the very wide variations in hue for both excerpts. Next, we quantify these differences and address why they might have arisen.
Example of the “best fitting” colors from the cross-modal music-to-color association task. The examples shown represent relatively extreme values for each of the four color–appearance dimensions (red/green, yellow/blue, light/dark, and saturated/desaturated) with relatively similar values for the other three dimensions. Labels above the array identify the genre name of the musical excerpts represented.
Results of Lower Level Perceptual Correlations in Music-to-Color Associations
First, we computed 15 across-subject average music–perceptual feature scores (e.g., loud/soft, fast/slow) for each excerpt using the ratings from Task B1. These averages were combined with the data from the music-to-color associations (Task A1) and the color–perceptual ratings (Task C1) to compute four Perceptual Music–Color Association scores (henceforth, Perceptual-MCAs or PMCAs) for each of the 34 musical excerpts (for details, see Appendix A). The four Perceptual-MCA scores (four rightmost columns of Table 1) represent the weighted averages of how saturated/desaturated, light/dark, red/green, and yellow/blue the six colors chosen by each participant (the three best and three worst) were rated to be.
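To make the idea of a weighted-average association score concrete, here is a minimal sketch. The exact formula is defined in Appendix A, which is not reproduced here, so the rank weights (3, 2, 1) and the best-minus-worst difference are assumptions made for illustration, as are the function names, color labels, and rating values.

```python
def pmca_score(best, worst, appearance, weights=(3, 2, 1)):
    """One participant's score for one excerpt on one color-appearance dimension.

    best, worst: color labels in choice order (first choice first).
    appearance:  dict mapping color label -> average rating on the dimension.
    weights:     ASSUMED rank weights; see Appendix A for the actual definition.
    """
    total = sum(weights)
    consistent = sum(w * appearance[c] for w, c in zip(weights, best)) / total
    inconsistent = sum(w * appearance[c] for w, c in zip(weights, worst)) / total
    return consistent - inconsistent


def mean_pmca(choices, appearance):
    """Average the per-participant scores; choices is a list of (best, worst) pairs."""
    scores = [pmca_score(best, worst, appearance) for best, worst in choices]
    return sum(scores) / len(scores)


# Hypothetical saturation ratings for six colors (labels are illustrative).
saturation = {"SR": 80, "LR": 20, "DR": 10, "MR": -10, "BK": -60, "WH": -90}
one_participant = (["SR", "LR", "DR"], ["WH", "BK", "MR"])
print(pmca_score(*one_participant, saturation))
```

Under these assumptions, a positive score means the colors chosen as going best were more saturated than those chosen as going worst.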
Next, we computed the correlations between the average music–perceptual ratings of the 34 musical excerpts and the average Perceptual-MCA scores of the colors people associated with the same musical excerpts (Figure 3(a)). Holm's (1979) method was used to control the family-wise error rate, implemented in the “psych” R package (Revelle, 2016). There were statistically significant correlations between the music–perceptual features and the Perceptual-MCAs for 6 of the 15 music–perceptual features. For example, louder, punchier musical excerpts were significantly correlated with more saturated colors (loud: r_s(32) = .642; punchy: r_s(32) = .591), redder colors (loud: r_s(32) = .772; punchy: r_s(32) = .643), and darker colors (loud: r_s(32) = −.557; punchy: r_s(32) = −.517). Such correlations show that there are indeed strong perceptual-level music-to-color correspondences. We analyze and discuss the nature of these correlations in more detail in the subsection “Results of dimensional compression of emotion-related scales” later. Figure 3(b) and (c) will be discussed in the subsections “Results of higher level emotional correlations in music-to-color associations” and “Results of correlations between latent Parafac factors and associated colors,” respectively.
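For readers unfamiliar with Holm's (1979) step-down procedure, the following sketch shows how adjusted p values can be computed. It is a plain-Python illustration rather than the R implementation used in the study, and the p values in the example are invented.

```python
def holm_adjust(p_values):
    """Return Holm-adjusted p values in the original order.

    The smallest p is multiplied by m, the next by m - 1, and so on;
    a running maximum keeps the adjusted values monotone, and each
    value is capped at 1.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical raw p values for a small family of correlations.
raw = [0.001, 0.020, 0.030, 0.400]
for p, p_adj in zip(raw, holm_adjust(raw)):
    print(f"raw p = {p:.3f}  adjusted p = {p_adj:.3f}  significant: {p_adj < 0.05}")
```

A correlation is declared significant when its adjusted p value falls below the chosen family-wise alpha (here .05).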
(a) Correlations between the 15 music–perceptual features and the weighted average color–appearance values of the colors picked as going best/worst with the music (the Perceptual-MCAs). (b) Correlations between the 10 music–emotion ratings and the weighted average color–appearance values of the colors picked to go with the music. The grouping of emotion-related features into Sets 1A, 1B, and 2 is explained in the subsection “Results of Higher Level Emotional Correlations in Music-to-Color Associations.” (c) Same correlations as in (b) for the two latent emotion factors identified by Parafac (
Discussion
There were notable differences between the present results and the analogous analyses for the more restricted classical musical excerpts previously reported (Palmer et al., 2013, 2016). First, we found strong correlations between musical features and the red/green dimension of the associated colors in the present data, which were not previously found for classical orchestral music by Bach, Mozart, and Brahms (Palmer et al., 2013) or for classical piano melodies by Mozart (Palmer et al., 2016). Moreover, there were no significant music–perceptual correlations with the yellow/blue dimension, which previously yielded highly significant effects for the classical music. These differences in the nature of hue mappings show that music-to-color associations can be quite distinct with different samples of music.
It is possible that these differences arise because different emotions tend to be expressed in different kinds of music. In particular, Palmer et al.'s (2013, 2016) classical music sample varied more along a happy/sad scale, which tends to correlate with yellow/blue color variations, whereas the present sample varied more along an agitated/calm scale, which tends to correlate with red/green color variations. These issues will be assessed empirically in the subsections “Testing the emotional mediation hypothesis” and “Results of dimensional compression of emotion-related scales,” where we consider evidence for emotional mediation.
A second contrast with previous findings concerns differences in the color associations for different tempi. Similar to previous reports for classical music, we found that more saturated colors were selected for faster music. However, unlike previous reports for classical music in which faster tempi and greater note densities (which have similar effects on color associations despite being musically distinct) were associated with lighter, yellower colors (Palmer et al., 2013, 2016), here we found that faster-rated music was associated with darker, redder colors. (Note that some of the latter correlations did not reach significance after correcting for multiple comparisons.) These differences may be due to the fact that faster tempi are correlated with different patterns of other musical features in the different sets of musical samples. For instance, in the present sample, many fast-paced selections were also judged to be heavy (r_s = .455) and punchy (r_s = .665), whereas the corresponding relations were likely absent in the well-controlled, synthesized, single-line piano melodies by Mozart (Palmer et al., 2016). Different patterns of musical features may also differentially interact to produce different emotion-related experiences (Eerola, Friberg, & Bresin, 2013; Juslin & Lindström, 2010; Lindström, 2003, 2006; Schellenberg, Ania, Krysciak, & Campbell, 2000), hence modulating the types of colors chosen as going with the music. Understanding how musical features interact and map to emotion, and how this may affect color choices, is a complex question that could be addressed using structural equation modeling given a large enough (n > 100) sample size of musical excerpts. This is an open area for future research.
Results of Higher Level Emotional Correlations in Music-to-Color Associations
To examine whether the systematic associations between the music–perceptual and color–appearance dimensions could be mediated by emotion, we conducted corresponding analyses of higher level emotion-related aspects of music-to-color associations. First, we computed the across-subject averages of the 10 music–emotion ratings (from Task A3) for each of the 34 musical excerpts. Next, we correlated each of the average music–emotion ratings with the four Perceptual-MCAs of the colors chosen as going best/worst with the music, analogous to those defined in the previous section. These correlations reflect how the emotional properties of the music correspond to the properties of the colors chosen as going best/worst with them: for example, the extent to which people chose colors that were more saturated, darker, and redder when listening to more agitated-sounding music than when listening to calmer-sounding music. The results, plotted in Figure 3(b), show 18 significant correlations. Indeed, every one of the 10 emotion-related scales except for like/dislike shows at least one significant correlation at the .05 level using Holm's (1979) method.
Discussion
The significant correlations between the emotional content of the music and the perceptual dimensions of the colors picked to go with the music constitute initial evidence that the emotional mediation hypothesis is viable for the expanded sample of 34 diverse musical excerpts. It is also noteworthy that, although none of the 15 music–perceptual features produced a significant correlation with the yellow/blue color–appearance dimension (Figure 3(a)), two of the 10 emotion scales did: happy/sad (+.55) and warm/cool (+.72) (Figure 3(b)). This result shows that, at least in our particular sample of music and colors, emotional mediation accounts for more variability in the yellowness/blueness of people's color choices than any single music–perceptual feature we studied.
Further scrutiny of Figure 3(b) reveals different patterns of correlation between music emotions and color appearances. These patterns of correlation can be qualitatively clustered into two sets, with Set 1 split into two subsets. In Set 1A, spicy, loud, agitated, complex sounding music was consistently associated with more saturated, redder, darker colors. In Set 1B, appealing, harmonious, liked music was consistently paired with lighter colors that tended to be a bit greenish and somewhat desaturated. In Set 2, happy, whimsical, warm sounding music was consistently associated with more saturated, yellower colors. This is in contrast to Sets 1A and 1B, where the emotion-related scales were unrelated to the yellow/blueness of color choices.
It is important to note that the correlations for Set 1B are nearly opposite to those for Set 1A. If we had plotted the correlations for Set 1B using reversed polarities of the same features (i.e., disgusting/appealing, dissonant/harmonious, and disliked/liked), the pattern for Set 1B would look qualitatively similar to that for Set 1A. Thus, the patterns of correlations between music emotions and color–appearance dimensions represented in Figure 3(b) appear to reveal two qualitatively different patterns of color choices: one for Set 1 (including both Set 1A and Set 1B) in which more agitated, spicy, loud, complex, disgusting, dissonant, disliked music elicits more saturated, darker, redder color choices, and one for Set 2, in which more whimsical, happy, warm music elicits more saturated, yellower color choices.
Testing the Emotional Mediation Hypothesis
Results of correlating music–emotional content and associated colors
Next, we analyzed whether people picked colors to go with music based on shared emotional content. We did so by examining correlations between the across-subject average emotion ratings of each of the 34 musical excerpts and the weighted average emotion ratings of the colors chosen as going best/worst with the corresponding excerpts (Emotional-MCAs, or EMCAs; Appendix B). These correlations identify the degree to which people chose colors whose emotional associations matched the emotional associations of the music: for example, choosing happy-looking colors as going best/worst with happy-sounding music and agitated-looking colors as going best/worst with agitated-sounding music. Consistent with the emotional mediation hypothesis, 9 of the 10 correlations for the rated musical scales were strongly positive and highly significant after adjusting the alpha level using the Bonferroni correction (.05/10 = .005). As evident in Figure 4(a), these nine correlations ranged from a high of .928 (p < .0001, one-tailed) for spicy/bland to a low of .584 (p = .00018, one-tailed) for whimsical/serious and are thus consistent with emotional mediation of some sort. Although not quite as high as the corresponding correlations in the study using classical orchestral music (.89 < r < .99, Palmer et al., 2013), they are roughly comparable to those based on single-line piano melodies (.70 < r < .92, Palmer et al., 2016), despite the much wider musical variety in the present sample of music.
Correlations between the average emotion-related ratings of the 34 musical excerpts and the weighted average emotion-related ratings of the colors chosen as going best/worst with the musical excerpts (EMCAs). (a) Correlations for the 10 originally rated emotion scales. (b) Analogous correlations for the two latent factors in the emotion Parafac solution. The black dotted line corresponds to the uncorrected p value cut-off, whereas the red dotted line corresponds to the Bonferroni-corrected p values for conducting multiple comparisons.
The 10th comparison was between preferences for the music and preferences for the colors chosen: that is, the correlation between the like/dislike ratings for the musical excerpts and the like/dislike EMCA scores for the colors chosen for each musical excerpt (the rightmost bar in Figure 4(a)). This preference correlation, although positive, was not significant after correcting for multiple comparisons, r_s(32) = .406, p = .0086 > .005, one-tailed. The evidence that people chose colors they like/dislike as going better with music that they like/dislike is thus quite weak, at least for this sample of music, colors, and Western participants.
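The Bonferroni decision rule used for these 10 comparisons can be illustrated with a minimal sketch. The function names are hypothetical and introduced only for illustration. The p values are the two reported above that fall closest to the corrected threshold.

```python
def bonferroni_alpha(alpha, n_tests):
    """Per-comparison alpha level after Bonferroni correction."""
    return alpha / n_tests


def is_significant(p, alpha=0.05, n_tests=10):
    """True if p falls below the Bonferroni-corrected threshold."""
    return p < bonferroni_alpha(alpha, n_tests)


# p values reported in the text (one-tailed)
reported = {
    "whimsical/serious": 0.00018,  # weakest of the nine significant scales
    "like/dislike": 0.0086,        # preference correlation
}

for scale, p in reported.items():
    verdict = "significant" if is_significant(p) else "not significant"
    print(f"{scale}: p = {p} -> {verdict}")
```

Under this rule the whimsical/serious correlation clears the corrected threshold, whereas the preference correlation clears only the uncorrected .05 level.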
Results of partialling out emotion-related associations
We have now reported evidence for both direct perceptual associations and emotionally mediated associations, but it is unclear to what degree the higher level correlations from music emotions to colors (Figure 3(b)) can explain the lower level correlations from music perceptions to colors (Figure 3(a)) and to what degree the two are independent. A pure version of the direct perceptual link hypothesis (i.e., that all music-to-color associations are due to direct, low-level mappings) implies that the perceptual-feature correlations in Figure 3(a) will be unsystematically affected after partialling out the contribution of emotional associations (Emotional-MCAs, calculated earlier). A pure version of the emotional mediation hypothesis (i.e., that all music-to-color associations are due to higher level emotional associations) implies that all significant perceptual-feature correlations in Figure 3(a) will be eliminated after partialling out the contributions of all covarying emotional associations.
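The logic of partialling out can be sketched as follows. This is a generic residual-based partial correlation on synthetic data, not the analysis pipeline used in the study. The variable names, the Pearson-type coefficient, and the simulated values are all assumptions made for illustration; a rank-based variant would rank-transform the variables first.

```python
import numpy as np


def residualize(y, covariates):
    """Residuals of y after ordinary least squares on the covariates plus an intercept."""
    X = np.column_stack([np.ones(len(y)), covariates])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta


def partial_corr(x, y, covariates):
    """Correlation between x and y after removing variance shared with the covariates."""
    rx = residualize(x, covariates)
    ry = residualize(y, covariates)
    return float(np.corrcoef(rx, ry)[0, 1])


# Synthetic illustration: a hypothetical emotion rating drives both a
# music-perceptual feature and a color-appearance score for 34 excerpts.
rng = np.random.default_rng(0)
emotion = rng.normal(size=34)
music_feature = emotion + 0.5 * rng.normal(size=34)
color_score = emotion + 0.5 * rng.normal(size=34)

print("raw r:    ", np.corrcoef(music_feature, color_score)[0, 1])
print("partial r:", partial_corr(music_feature, color_score, emotion))
```

If the emotional variable fully mediates the association, as in this simulated case, the partial correlation should shrink toward zero, which is the pattern the pure emotional mediation hypothesis predicts.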
The partial correlation results support the strong form of the emotional mediation hypothesis. All of the music–perceptual correlations in Figure 3(a) were reduced to nonsignificance after removing the emotional effects in Figure 3(b) and controlling for family-wise error rate using Holm's (1979) method: −.568 ≤ r_s ≤ +.489, p > .05 (Figure S2(a)).
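For readers unfamiliar with Holm's step-down procedure, a minimal sketch follows. The function name and the example p values are hypothetical and are not taken from the study.

```python
def holm_reject(p_values, alpha=0.05):
    """Holm (1979) step-down procedure.

    Returns a list of booleans, in the original order, marking which
    null hypotheses are rejected while controlling family-wise error.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The smallest p value is compared with alpha/m, the next with alpha/(m-1), and so on.
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # once one test fails, all larger p values are retained
    return reject


# Hypothetical p values for four comparisons
print(holm_reject([0.01, 0.04, 0.03, 0.005]))
```

Because the threshold relaxes at each step, Holm's method is never less powerful than the Bonferroni correction while still controlling the family-wise error rate.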
Results of Dimensional Compression of Emotion-Related Scales
Latent emotion-related factors of the colors and music
To better understand the shared emotional content of the music–color associations, we used the Parafac model (Harshman, 1970) to discover the latent factors underlying the emotion ratings from Tasks A2 and A3. Parafac was performed jointly on both the color–emotion and the music–emotion ratings (for details, see Appendix C) because previous studies, in which dimensional reductions for music–emotion ratings and color–emotion ratings were conducted separately, found the emotion-related dimensions of the colors and of the music to be very similar (Palmer et al., 2013, 2016).
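As a rough illustration of what a Parafac decomposition does, the sketch below fits a three-way model (stimuli × scales × subjects) by alternating least squares on a synthetic tensor. It is a generic textbook-style implementation under stated assumptions, not the procedure described in Appendix C. The function names, the SVD-based initialization, the tensor sizes, and the simulated data are all assumptions.

```python
import numpy as np


def khatri_rao(U, V):
    """Column-wise Kronecker product; rows run over (row of U, row of V) pairs."""
    return np.einsum("ur,vr->uvr", U, V).reshape(-1, U.shape[1])


def unfold(X, mode):
    """Matricize a 3-way array so that `mode` indexes the rows."""
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)


def reconstruct(A, B, C):
    """Rebuild the 3-way array from its three factor matrices."""
    return np.einsum("ir,jr,kr->ijk", A, B, C)


def parafac_als(X, rank, n_iter=500):
    """Fit X[i,j,k] ~ sum_r A[i,r] B[j,r] C[k,r] by alternating least squares."""
    # Assumed initialization: leading left singular vectors of each unfolding.
    factors = [np.linalg.svd(unfold(X, m), full_matrices=False)[0][:, :rank]
               for m in range(3)]
    errors = []
    for _ in range(n_iter):
        for m in range(3):
            others = [factors[i] for i in range(3) if i != m]
            kr = khatri_rao(others[0], others[1])
            # Least-squares update of one factor matrix with the other two held fixed.
            factors[m] = np.linalg.lstsq(kr, unfold(X, m).T, rcond=None)[0].T
        rel_err = np.linalg.norm(X - reconstruct(*factors)) / np.linalg.norm(X)
        errors.append(rel_err)
    return factors, errors


# Synthetic noise-free two-factor data: 12 stimuli x 10 scales x 8 subjects
rng = np.random.default_rng(0)
true_factors = [rng.normal(size=(n, 2)) for n in (12, 10, 8)]
X = reconstruct(*true_factors)

factors, errors = parafac_als(X, rank=2)
print("relative reconstruction error:", errors[-1])
```

In this layout the first factor matrix plays the role of stimulus weights, the second of scale weights, and the third of subject weights, which is how individual differences enter the model.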
We examined the weights for Parafac factor solutions containing 2 to 10 factors. We chose the two-factor solution because of (a) its interpretability, (b) its consistency with previous results (Palmer et al., 2013, 2016), (c) its consistency with the clustering of the 10 emotion scales into just two qualitatively different groups (see the subsection “Results of Higher Level Emotional Correlations in Music-to-Color Associations” and Figure 3(b)), (d) the shape of the scree plot, (e) the results of the core consistency diagnostic, and (f) its consistency with the canonical dimensions of human affect:
The estimated Parafac weights are plotted in Figure 5. The emotion weights in Figure 5(c) show where the emotion-related rating scales are located relative to the axes of the two latent Parafac factors. They are useful for assigning meaning to the factors and labeling them. Factor 1 (along the x-axis) is most closely aligned with ratings of agitated, spicy, loud, and complex on the positive end versus calm, bland, quiet, and simple on the negative end. We refer to this latent dimension by its affective interpretation,
Figure 5. Two-factor emotion Parafac solutions are plotted for the stimulus features (
Figure 5(a) plots the weights for the 37 colors and Figure 5(b) for the 34 musical excerpts within the two-dimensional space defined by the latent factors of
Figure 5(d) plots the subject weights, which are useful for understanding individual differences in the saliences each subject assigned to each factor in choosing the best/worst colors for the music. Most participants weighted
Discussion
It is interesting for several reasons that the emotion-ratings data can be well accounted for by two factors,
Results of correlations between latent Parafac factors and associated colors
Next, we computed the same correlations as reported in the subsection “Results of Higher Level Emotional Correlations in Music-to-Color Associations” but replaced the across-subject average of the 10 music–emotion ratings for the 34 musical excerpts (Task A3) with the Parafac factor scores corresponding to the 34 music selections: that is, the estimated
We also conducted correlational analyses between the music–emotional scales and the color–emotion scales of the colors chosen as going best/worst with the music (i.e., EMCA scores) using the Parafac solutions. These correlations, shown in Figure 4(b), are analogous to those shown in Figure 4(a) but differ in that they replace the across-subject average ratings on the 10 original emotion-related scales for the 34 musical excerpts with those of the two Parafac-based Emotional-MCAs. The formula for calculating these new Parafac-based Emotional-MCAs is identical to that for calculating the Perceptual-MCAs in the subsection “Results of lower-level perceptual correlations in music-to-color associations,” except that the ratings on the four color–appearance dimensions (saturated/desaturated, light/dark, etc.) were replaced with the color–emotion Parafac weights representing the emotional associations of the colors that were chosen as going best with the musical selection. These Parafac-based Emotional-MCAs for the 34 musical excerpts were then correlated with the music–emotion Parafac weights for the 34 musical excerpts as represented in Figure 4(b). The results show strong correlations for both
Results of partialling out the emotional content of the Parafac factors
To better understand how the two latent factors identified by the Parafac analysis relate to the results of the Emotional-MCA partial correlation analysis (Figure 3(b)), we conducted a second partial-correlational analysis examining the correlations between the music–perceptual features and the Perceptual-MCAs after partialling out the factor scores for the 34 music selections: that is, the estimated
Results comparing music–perceptual versus affective predictions of color data
To compare direct, low-level music–perceptual associations and higher level affective mediation, we further analyzed the ability of each hypothesis to predict the chosen colors along the four color–appearance dimensions using multiple linear regression (MLR) analyses. The music–stimuli weights were entered into the regression and fitted using the ordinary least squares method, with factor 1 always entered into the model first. The results, shown in the red bars of Figure 6, indicate that the two affective factors (
Figure 6. MLR analyses predicting the four color–perceptual values of the colors associated with the musical excerpts (PMCAs) from the Parafac latent factors. The contributions of the first factor entered are represented by the lower segment of each bar, and the polarity of each contribution is indicated by the “+” or “−” sign in that segment.
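The sequential entry of predictors described above can be sketched with ordinary least squares as follows. This is an illustrative reconstruction under stated assumptions, not the analysis code behind Figure 6. The function names and the toy data are hypothetical.

```python
import numpy as np


def r_squared(X, y):
    """Proportion of variance in y explained by an OLS fit on X plus an intercept."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot


def sequential_r2(factor1, factor2, y):
    """R^2 for factor 1 entered first, then the increment gained by adding factor 2."""
    r2_first = r_squared(factor1, y)
    r2_both = r_squared(np.column_stack([factor1, factor2]), y)
    return r2_first, r2_both - r2_first


# Toy data: two orthogonal, centered factor-score vectors and one outcome
f1 = np.array([1.0, -1.0, 1.0, -1.0])
f2 = np.array([1.0, 1.0, -1.0, -1.0])
y = 2.0 * f1 + f2

first, increment = sequential_r2(f1, f2, y)
print("R^2 for factor 1 alone:", first)
print("added by factor 2:     ", increment)
```

The two returned values correspond to the lower and upper segments of a stacked bar of the kind described in the Figure 6 caption. When the two predictors are correlated, the split depends on the order of entry, which is why the order is fixed in advance.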
For direct comparison, we performed an analogous MLR on the stimuli weights from a corresponding two-dimensional Parafac analysis of the 15 music–perceptual features. The two latent music–perceptual factors were interpreted as
The Case for Emotional Mediation
The present research demonstrates that music-to-color associations are better characterized as emotionally mediated (music → emotion → color) rather than direct (music → color) using more conclusive methods than previously employed (Isbilen & Krumhansl, 2016; Lindborg & Friberg, 2015; Palmer et al., 2013, 2016). Although the color–appearance dimensions of the chosen colors were correlated with both music–perceptual features (Figure 3(a)) and emotion-related scales (Figure 3(b)), the lower level correlations were no longer significant after accounting for variance due to the 10 emotion-related scales or after accounting for variance from just the two latent affective dimensions of
The interpretation of the present findings converges with previous evidence that emotional/affective mediation also occurs in two additional sets of perceptual mappings: namely, from music to pictures of expressive faces and from pictures of expressive faces to colors (Palmer et al., 2013). In both cases, the emotional mediation correlations were nearly as strong as those found for music-to-color associations with the same emotion-related scales. It is difficult to discern any low-level perceptual features that are common to music, colors, and emotional faces that might reasonably account for the common associative patterns. It is easy to account for these results if they are all mediated by emotion, however.
We hasten to add that the present or previous findings do not rule out the possibility that other kinds of information might also influence music-to-color associations. Semantic effects could arise, for example, if musical excerpts trigger associations with objects or entities that have strong color associations (cf. Spence, 2011). For example, the colors associated most strongly with the Irish selection (Figure 2), which had a prominent bagpipe melodic line, were prominently greens, commensurate with the fact that Ireland is so closely associated with the color green. However, such semantic associations from music to objects/entities to colors may also have emotional components. For example, the blacks and reds chosen predominantly for the heavy metal excerpt (Figure S1) can plausibly be understood as derived from the angry, strong, agitated, and even dangerous sound of the music, given that blacks and dark reds are among the strongest, most angry, agitated, and dangerous looking colors (Palmer et al., 2013; Figure 5).
However, another possibility is consistent with direct associations, namely, through statistical covariation (cf. Spence, 2011). Here, the presumption would be that past experiences cause classical conditioning of direct, cross-modal associations, reflecting the fact that certain kinds of music tend to be heard in visual environments in which certain kinds of colors are predominately experienced. One might frequently hear Salsa music, for example, while seeing (and perhaps eating) tomato-based, Latin American cuisine, such as salsas and enchiladas. However, that explanation seems unlikely to hold for any but a few of the 34 present musical excerpts. Moreover, there is no reason why emotional associations cannot jointly determine music-to-color associations along with these other factors.
Open Questions
Cross-cultural influences
Considering the role of prior experience in music-to-color associations raises the interesting possibility of cultural influences. Would people from different cultures choose the same colors as going with the same music or would they be systematically different? The only existing evidence comes from prior research on music-to-color associations for classical orchestral selections in the United States and Mexico (Palmer et al., 2013). The Mexican data were virtually identical to the U.S. data in every respect. This finding shows that some degree of generalization across cultures is warranted. However, the strength of the generalization is unclear because people in Mexico still have extensive exposure to Western music.
At least two distinct issues underlie the cultural generality of emotional mediation in music-to-color associations. One is the generality of music-to-emotion associations (for a review, see Thompson & Balkwill, 2010). When forced-choice tasks are used with a small number of emotional categories (e.g., choosing among joy, anger, sadness, and peace while listening to music), the results tend to support generality across cultures (e.g., Balkwill & Thompson, 1999; Balkwill et al., 2004; Fritz et al., 2009). With larger numbers of emotional categories, the results are less clear-cut and depend more heavily on musical familiarity (e.g., Gregory & Varney, 1996). Given that the perception of music is likely influenced by an individual's cultural lens (e.g., Demorest, Morrison, Nguyen, & Bodnar, 2016; McDermott, Schultz, Undurraga, & Godoy, 2016), the true universality of musical emotions is still unclear. Moreover, the theoretical question of whether musical emotions, and human emotions more generally, are better described by discrete, basic categories or by a dimensional, constructionist approach is still debated (e.g., Barrett, Mesquita, & Gendron, 2011; Cespedes-Guevara & Eerola, 2018; Juslin, 2013; Ortony & Turner, 1990).
The other issue is the cultural generality of associations between emotions and colors. Most cross-cultural studies take the limited perspective of assessing color on the three dimensions of the semantic differential (Osgood et al., 1957)—namely, evaluation (akin to
Perceived versus experienced emotion
Another open question is whether the emotion-related correspondence of music and colors that we have identified occurs via perceived emotion (i.e., emotional cognition) or experienced emotion (i.e., emotional feelings or qualia). The distinction is illustrated by the fact that people can clearly perceive the intended emotionality of a piece of music or a combination of colors without actually experiencing that emotion (Gabrielsson, 2002). For instance, it is logically possible to perceive sadness in Hagood Hardy's “If I had Nothing but a Dream” without actually feeling noticeably sadder than before hearing it. We have discussed the 10 rated scales as being “emotion-related” largely to avoid having to commit to whether perceived emotion, felt emotion, or some combination of both is the basis of the emotion-related effects we have reported. It would therefore be desirable to dissociate the relative contributions of perceived versus felt emotion in future cross-modal music-to-color research (e.g., Juslin & Västfjäll, 2008), perhaps through taking relevant physiological measures while participants report on their perception versus experience of emotions (e.g., Krumhansl, 1997) or through studying patient populations, such as alexithymics, who have reduced ability to distinguish between or categorize emotional experiences.
Synesthesia
The consistent finding that nonsynesthetes show strong emotion-related effects in cross-modal music-to-color associations, both here and in prior studies (Isbilen & Krumhansl, 2016; Palmer et al., 2013, 2016), warrants further exploration as to whether timbre-to-color synesthetes show similar emotion-related effects. One hypothesis is that the mechanisms producing synesthetic experiences are essentially the same as (or similar to) the mechanisms producing cross-modal nonsynesthetic associations but at an intensity high enough to result in conscious experiences (e.g., Ward et al., 2006). If so, then one would expect that music-to-color synesthetes would also show substantial emotion-related effects in the colors they experience in response to the same musical excerpts. Indeed, some theorists claim that the primary basis of synesthesia is emotional (e.g., Cytowic, 1989), in which case synesthetes might be expected to show even stronger emotion-related effects than nonsynesthetes. Isbilen and Krumhansl (2016) found that when synesthetes were asked to pick one of eight colors that went best with excerpts of 24 Preludes from Bach's Well Tempered Clavier, they chose colors that were consistent with nonsynesthetic color choices. However, because synesthetes were not asked to pick the colors that were most similar to their synesthetic experiences (rather, they picked the colors that “best fit the music”), it is unclear whether synesthetic experiences are emotionally mediated, or whether synesthetes are simply capable of making best fitting cross-modal music-to-color choices that are similar to those of nonsynesthetes.
A competing claim is that synesthetic experiences occur via hyperconnectivity from one area of sensory cortex to another (e.g., Ramachandran & Hubbard, 2001; Zamm, Schlaug, Eagleman, & Loui, 2013). By this account, music–color associations in timbre-to-color synesthetes more likely arise from the specific qualities of the sounds (i.e., their timbre, duration, loudness, and pitch) directly activating specific qualities of colors rather than from any higher level, emotion-related attributes. It would be surprising from this perspective if synesthetes showed any emotion-related effects that were not spurious by-products of direct auditory-to-visual mappings.
Conclusions
The results of this study contribute importantly to the understanding of music-to-color associations in at least four ways.
They present robust evidence of emotional mediation of cross-modal music-to-color mappings over a broad range of 34 musical excerpts and 10 emotion-related scales. The present experiment investigated a larger and more diverse set of 15 underlying musical features, many of which have not been previously studied for their color associations (e.g., loudness, harmony, distortion, beat strength, and complexity, among others). The pattern of results shows that essentially the same emotion-related effects are evident when using a wider range of linguistically labeled scales. Specifically, we found that the 10 emotion-related scales can largely be reduced to the same two latent factors (Figure 5) that are easily identifiable with the affective dimensions of MLR analyses showed that the two affective factors (
Overall, the present results have brought us closer to understanding the role of emotion in people's cross-modal associations to music.
Supplemental Material
Supplemental Data1 - Supplemental material for Color, Music, and Emotion: Bach to the Blues
Supplemental material, Supplemental Data1 for Color, Music, and Emotion: Bach to the Blues by Kelly L. Whiteford, Karen B. Schloss, Nathaniel E. Helwig and Stephen E. Palmer in i-Perception
Footnotes
Acknowledgments
The authors thank Daniel J. Levitin for his input in selecting and evaluating the representativeness of some of the 34 musical excerpts and Matthew C. Scauzillo for his assistance in finding YouTube links for Table S2.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported by NSF Grant BCS‐1059088 to S.E.P.
