Abstract
A growing body of research analyzing musical scores suggests mode’s relationship with other expressive cues has changed over time. However, to the best of our knowledge, the perceptual implications of these changes have not been formally assessed. Here, we explore how compositional choices of 18th- and 19th-century composers (J. S. Bach and F. Chopin, respectively) differentially affect emotional communication. This novel exploration builds on our team’s previous techniques using commonality analysis to decompose intercorrelated cues in unaltered excerpts of influential compositions. In doing so, we offer an important naturalistic complement to traditional experimental work, which often involves tightly controlled stimuli constructed to avoid the intercorrelations inherent to naturalistic music. Our data indicate intriguing changes in cues’ effects between Bach and Chopin, consistent with score-based research suggesting mode’s “meaning” changed across historical eras. For example, mode’s unique effect accounts for the most variance in valence ratings of Chopin’s preludes, whereas its shared use with attack rate plays a more prominent role in Bach’s. We discuss the implications of these findings as part of our field’s ongoing effort to understand the complexity of musical communication, addressing issues only visible when moving beyond stimuli created for scientific, rather than artistic, goals.
Music’s capacity to convey emotion has fascinated history’s great thinkers, garnering attention from observers ranging from Plato (Stamou, 2002) to Darwin (1872). Although music treatises have related musical expression to its affective outcomes since the 1600s, empirical studies did not appear until the early-20th century (Hevner, 1935, 1937). Mode—the cue describing sets of notes composers use to convey music’s emotional character—has been of particular interest. The two most common modes in western music—major and minor—are generally recognized to convey positive and negative emotion states, respectively (Crowder, 1984; Gagnon & Peretz, 2003). Mode’s associations with pitch and timing also influence emotion perception, with major music typically associated with higher, faster musical passages and minor with lower, slower ones (Huron, 2008; Rigg, 1964; Turner & Huron, 2008).
Although musical cultures throughout the world employ numerous modes and scales, listeners unacculturated to western music can still identify its conveyed emotions at above-chance levels (Balkwill & Thompson, 1999; Laukka et al., 2013). Listeners with minimal exposure to western music accurately decode happy or sad connotations in major and minor pieces, suggesting mode’s emotional associations are to some degree universal (Fritz et al., 2009). Despite this expressive salience, mode’s imbalanced use complicates efforts to explore its effects (Larue et al., 2015). Von Helmholtz (1877) first noticed this cultural preference for major pieces, arguing the major mode’s popularity stems from its simpler acoustic structure (see Parncutt, 2014, for summary). This preference for major pieces appears in both classical and rock music (Horn & Huron, 2015; Temperley & de Clercq, 2013), posing challenges for researchers seeking corpora with equal numbers of major and minor pieces.
A second challenge to understanding mode’s role in musical emotion is that it intercorrelates with pitch, timing, and many other cues in musical performances. To account for this multicollinearity, researchers often use controlled approaches, such as assessing the effect of adjusting excerpts to parallel major or minor keys (Dalla Bella et al., 2001; Hevner, 1935). The obvious challenges of this approach for the complex, polyphonic music heard in concert halls lead researchers to frequently choose single-line melodies as experimental stimuli. These controlled investigations are useful for understanding how specific cues can affect emotion perception; however, their relation to real-world listening is tenuous. Analysis of musical scores can offer greater understanding of nuanced cue relationships in more complex musical stimuli.
Historic differences in musical emotion
Addressing mode’s historic emotional context requires clarification of differing perspectives. Musicologists typically demarcate western common practice music into three or four distinct epochs between the 1600s and early 1900s (Post & Huron, 2009). In contrast, music cognition studies often use “classical music” as a catch-all term for Western tonal music from any of these epochs. This generalization is useful when distinguishing between western and non-western music but makes it easy to overlook historic changes in western music. For example, mode’s changing relationship with loudness and timing in nominally “classical” music suggests generalization may not hold for all eras. Cluster analyses of cues encoded in musical scores illustrate differences between music from the early Classical (~1750) and late Romantic (~1850) eras, with mode most clearly distinguishing dissimilar Classical music, and musically expressive cues such as loudness and timing distinguishing Romantic pieces (Horn & Huron, 2015). Additionally, examination of scores reveals changes in timing (Daniele & Patel, 2013; Hansen et al., 2016; Post & Huron, 2009), scale degree use (Perttu, 2007), and dynamics (Ladinig & Huron, 2010) between Baroque (~1600–1750) and Romantic (~1800–1910) era composition. Unfortunately, pinpointing the specific effects of any one cue is challenging, given the way composers tend to use them in intercorrelated ways (Schutz, 2017).
Intercorrelations in musical cue use
Although cue intercorrelations are a natural consequence of music’s complexity (and likely part of its appeal), they pose significant barriers to experimental approaches aimed at understanding cues’ individual contributions. To deal with this issue, psychologists have primarily adopted the following two approaches: (1) creating highly controlled musical stimuli to avoid multicollinearity and (2) using diagnostic techniques to assess multicollinearity. After summarizing these approaches, we discuss a third that has proven helpful in other fields dealing with problems concerning collinearity.
Accounting for intercorrelations
Exhaustively combining factorized cue levels to assess effects is a popular method for preventing unwanted intercorrelations. This approach overcomes challenges with discerning cue effects from intercorrelated music by focusing on discretized cue levels in controlled stimuli (Juslin & Lindström, 2010). These methods have empowered researchers to develop and validate melodic stimuli expressing particular emotion states (Paquette et al., 2013; Vieillard et al., 2008), enabling meaningful interpretations of cues’ effects by clarifying their relative importance for perceived emotion (Eerola et al., 2013; Juslin & Lindström, 2010). These controlled approaches reveal cues contribute additively in expressing specific emotion states. However, they implicitly treat music’s intercorrelated structure as “hopelessly confounded” (Juslin & Lindström, 2010, p. 337) instead of an important perceptual feature. Consequently, it remains unclear how these findings generalize to real-world listening.
As an alternative to removing multicollinearity, other statistical techniques aim at managing some of its most problematic consequences. This gives researchers freedom to use naturalistic stimuli while minimizing the risk of inaccurate conclusions. One study applied ridge regression analysis to improve estimated cue effects in intercorrelated stimuli to assess their emotional significance (Costa et al., 2004), replicating well-known associations between valence and mode. Diagnostic techniques such as the variance inflation factor can optimize selections of naturalistic musical stimuli, and have identified important emotional effects from pitch (Chordia & Rae, 2008) and timing information (Luck et al., 2008) in North Indian raag and piano improvisations.
One limitation to the abovementioned techniques is that they cannot assess precisely where, and to what extent, collinearities occur. Because music’s complex structure contributes to its affective meaning, understanding how cues elicit diverse emotional effects provides insight into emotional listening experiences. This is particularly important for understanding how multicollinearity differs between composers from distant historical periods.
Exploring an alternative to dimension reduction
To explore the nuanced cue relationships underlying emotional expression, we employ commonality analysis (CA) to decompose emotional cue effects into unique and combined contributions. Although used since the 1960s (Nimon & Oswald, 2013), to the best of our knowledge, CA had not been applied to music prior to our team’s recent explorations (Battcock & Schutz, in press, 2019, 2021). Nonetheless, its ability to clarify the relative importance of interrelated cues has proven powerful in disciplines ranging from education (Mood, 1969; Werner et al., 2019) and clinical psychology (Gustavson et al., 2018; Marchetti et al., 2016) to evolutionary biology (Cuevas et al., 2021). Its utility in clarifying how social, cognitive, and affective factors influence behaviors surrounding medical examinations (Seibold & Roper, 1979) illustrates this power. When applied to determine barriers preventing at-risk individuals from seeking timely treatment, regression analyses revealed all three factors contributed significantly. However, after accounting for unique and joint effects, CA revealed one set of factors to be most important—an insight crucial to obtaining the greatest benefit in public messaging campaigns. As other studies provide extensive historical and theoretical detail of CA (Ray-Mukherjee et al., 2014; Seibold & McPhee, 1979), we will now focus on its musical applications.
In this study, we perform CA to separate cues’ unique and combined effects on perceived emotion. This provides insight into (1) effects unique to each cue, (2) effects attributable to the “combination” of two cues, and (3) effects jointly attributable to all three cues. Figure 1 depicts a Venn diagram conceptually representing the relationship between cues’ unique and joint effects.

Commonality Analysis Cue Relationships. Note. Venn diagram depicting the various subsets of independent variables producing coefficients in a commonality analysis performed on predictor variables mode (green), attack rate (blue), and pitch height (purple). The overlapping portions of (1) mode and attack rate, (2) mode and pitch height, and (3) attack rate and pitch height, signify their joint effect after controlling for their unique effects. The area where all three circles overlap indicates their combined contribution after controlling for their individual contributions and lower-order joint effects (e.g., mode and attack rate, pitch height and mode, etc.). Please refer to the online version of the article to view the figures in colour.
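To make the partitioning concrete, the following is a minimal Python sketch of how commonality coefficients can be derived from the R² values of every predictor subset. It is an illustration under stated assumptions rather than the analysis code used in this study: the function name `commonality`, the cue labels, and all R² values are hypothetical and chosen only so the arithmetic is easy to follow.

```python
from itertools import combinations


def commonality(r2, predictors):
    """Split the full-model R^2 into unique and joint commonality coefficients.

    r2 maps a frozenset of predictor names to the R^2 of a regression that
    uses exactly those predictors. The empty model counts as R^2 = 0.
    """
    preds = list(predictors)

    def lookup(names):
        return r2[frozenset(names)] if names else 0.0

    parts = {}
    for size in range(1, len(preds) + 1):
        for subset in combinations(preds, size):
            rest = [p for p in preds if p not in subset]
            # Inclusion-exclusion over the non-empty sub-subsets of `subset`,
            # each combined with the predictors left outside `subset`.
            value = -lookup(rest)
            for k in range(1, size + 1):
                sign = 1.0 if k % 2 else -1.0
                for chosen in combinations(subset, k):
                    value += sign * lookup(list(chosen) + rest)
            parts[subset] = value
    return parts


# Hypothetical R^2 values (NOT taken from the study) for every cue subset.
r2 = {
    frozenset({"mode"}): 0.25,
    frozenset({"attack_rate"}): 0.28,
    frozenset({"pitch_height"}): 0.03,
    frozenset({"mode", "attack_rate"}): 0.35,
    frozenset({"mode", "pitch_height"}): 0.26,
    frozenset({"attack_rate", "pitch_height"}): 0.30,
    frozenset({"mode", "attack_rate", "pitch_height"}): 0.36,
}
cues = ("mode", "attack_rate", "pitch_height")
parts = commonality(r2, cues)
for subset, value in parts.items():
    print(" & ".join(subset), f"{100 * value:.2f}%")
```

With three predictors this yields seven partitions (three unique, three pairwise, one three-way), matching the regions of the Venn diagram, and the partitions always sum to the full-model R².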
To build on score-based analyses of how cue use shifted over music history, here we compare participants’ emotion ratings of music from Frederic Chopin’s Préludes with those from a previous study of J. S. Bach’s The Well-Tempered Clavier (WTC) by Battcock and Schutz (2019). We see three benefits to using these corpora as the basis for our exploration. First, both sets contain 24 preludes written in each major and minor key, allowing for balanced assessment of how each composer uses mode to convey emotion. Second, each set was composed for a keyboard instrument, avoiding problematic differences in musical texture and timbre that are difficult to codify in scores. Third, Bach and Chopin composed during periods of clear stylistic contrast, making their music well suited for comparisons requiring music of different historical eras.
Previous findings from empirical musicology indicate notable changes in musical timing during the Romantic era, including the higher prevalence of fast minor, and slow major, music (Horn & Huron, 2015; Kelly et al., under review; Post & Huron, 2009). Here, we complement and extend those findings from score-based analyses of cues, exploring differences in perceptual effects of cues (and cue combinations) in music from different eras. Although our techniques are novel within music research, our goal of relating musical structure to perceived emotion follows a long tradition within music psychology. Although this study is primarily exploratory, we hypothesized shifting patterns of cue weights with respect to mode between composers. This holds potential to complement and extend current knowledge of musically expressed emotion, a topic that has fascinated several generations of musicians, psychologists, and neuroscientists alike.
Methods
Cue extraction and excerpt preparation
To codify pitch and timing in these excerpts, a research assistant with advanced training in music analysis analyzed the first eight measures from the urtext edition of Frederic Chopin’s Préludes, as well as the 24 preludes from Bach’s WTC. This followed the method outlined in Poon and Schutz (2015) with the following four exceptions. First, here, we included pick-up measures (previously omitted) as excerpts used in the experiment needed them for continuity. Second, we factored the album details into our timing calculations by summing note attacks (from the score) and dividing by the duration of the particular excerpt used in the experiment. Third, as this analysis aims to map cue values onto perceptual ratings of each excerpt, we used only a single value for pitch and timing (rather than the measure-by-measure values used by Poon and Schutz). Finally, in addition to the Kalmus edition of Bach’s (1883 [originally published in 1722]) preludes, we analyzed the urtext edition of Chopin’s (2007 [originally published in 1839]), which we have since learned is considered more authoritative.
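The timing calculation described above (summing note attacks from the score and dividing by the duration of the recorded excerpt) can be sketched as follows. This is a minimal illustration, assuming duration is measured in seconds; the function name `attack_rate` and the example values are hypothetical and do not come from the analyzed scores or recordings.

```python
def attack_rate(attacks_per_measure, excerpt_duration_s):
    """Return note attacks per second for one excerpt."""
    if excerpt_duration_s <= 0:
        raise ValueError("excerpt duration must be positive")
    return sum(attacks_per_measure) / excerpt_duration_s


# Hypothetical eight-measure excerpt: attack counts per measure from the score
# and the excerpt's duration in the chosen recording (values are made up).
attacks = [12, 12, 16, 12, 12, 16, 8, 8]
duration_s = 24.0
rate = attack_rate(attacks, duration_s)
print(f"{rate:.2f} attacks per second")
```

Because the denominator comes from the recording rather than the score, the same notated passage yields a different attack rate under a faster or slower performance.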
For the Chopin experiment, we prepared eight-measure excerpts of each piece from recordings by Vladimir Ashkenazy (1993), appending a 2-second fade to each excerpt using Amadeus Lite (HairerSoft, 2019) and exporting the audio at a sampling rate of 44.1 kHz. The Chopin recordings took place between 24 and 25 June 1993 at St. Charles Hall, Switzerland, providing some consistency in recording conditions (Ashkenazy, 1993 [liner notes]). The Bach recordings took place in April 1972 in Villingen, Germany (Gulda, 1973 [liner notes]). Additional technical details on methods can be found in this article’s Supplemental material online.
Participants
To assess perceived emotion in Chopin’s Préludes, we recruited 35 non-musicians (defined as having less than one year of musical training) through our institution’s psychology participant pool. Participants (28 female, M = 18.39 years, SD = 1.47 years; 7 male, M = 17.86 years, SD = 0.38 years) reported normal hearing and corrected-to-normal vision. The experiment complied with our institution’s research ethics board’s ethics policy. The recruitment procedure of the comparison study (Battcock & Schutz, 2019) followed a similar procedure, comprising ratings of Bach’s WTC from 30 participants (18 females; M = 19.1 years, SD = 3.0; 12 males, M = 19.7, SD = 2.9).
Procedure
Participants completed a consent form along with a brief demographic survey, indicating the number of hours spent each week listening to music or practicing an instrument. During the experiment, participants sat in a noise-attenuating sound booth where an experimenter provided instructions. We played each of the 24 prepared excerpts in a randomized order and participants rated the music’s conveyed emotion along scales indicating (a) valence and (b) arousal of each excerpt (adapted from Russell, 1980). They listened to each excerpt through Sennheiser HDA-200 noise-canceling headphones, registering ratings using experimental software designed in PsychoPy (Peirce et al., 2019).
We defined valence and arousal prior to the experiment, reminding participants to rate the emotion conveyed by the music as opposed to how they felt while listening. Participants completed practice trials rating four randomly selected pieces (two major, two minor). They rated valence on a scale from 1 (negative) to 7 (positive) and arousal on a scale from 1 (low) to 100 (high). Prior to the full experiment, comprising 24 trials, participants had the opportunity to ask the experimenter additional questions. After participants completed the experiment, the experimenter debriefed them on the study.
Results
Replicating cue analyses
To assess consistency between the score-based cue quantification reported by Poon and Schutz (2015) and our approaches here (using both scores and information from audio files), we performed intraclass correlations. This revealed a high level of agreement. Additionally, we ran a series of assessments using only a single value for score-based calculations of pitch and timing (i.e., collapsing across the measure-by-measure values used in Poon & Schutz, 2015). We found strong agreement in all comparisons except one, which we attribute to statistical differences in power when collapsing across measures. Finally, we conducted Pearson correlations between mode, attack rate, and pitch height to gain insight into cue relationships in each set. This revealed differences between composers, including a stronger relationship between mode and attack rate for Bach (technical details are provided in the Supplemental materials).
Perceptual analysis
We visualized participant ratings of Chopin’s Préludes and the preludes in Battcock and Schutz’s (2019) analysis of Bach’s WTC using Russell’s circumplex model. See Figure 2 for the mean valence and arousal ratings for each piece.

Emotion Ratings of Valence and Arousal. Note. Mean valence and arousal ratings for each excerpt, visualized using Russell’s circumplex model for ratings of Bach’s WTC (left) and Chopin’s 24 preludes (right). Major and minor pieces are denoted with red and blue text, respectively. Solid black lines indicate the middle of the rating scales for the valence and arousal dimensions. Please refer to the online version of the article to view the figures in colour.
A median split on averaged ratings suggested differences between the composers regarding mode’s effect for one dimension. For valence, major pieces account for 11 of the 12 highest ratings for Bach, as well as 10 of the 12 highest for Chopin. However, the comparison for arousal ratings differed sharply, with major key pieces accounting for 8 of the 12 highest ratings for Bach, but only 4 of the 12 highest for Chopin.
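The median-split tally above can be sketched in a few lines of Python. The sketch is illustrative only: the function name `count_major_in_top_half` and the six toy pieces are hypothetical, and ties at the median are not handled.

```python
def count_major_in_top_half(pieces):
    """Count major-mode pieces among the higher-rated half of a set.

    pieces is a list of (label, mode, mean_rating) tuples.
    """
    ranked = sorted(pieces, key=lambda piece: piece[2], reverse=True)
    top_half = ranked[: len(ranked) // 2]
    return sum(1 for _, mode, _ in top_half if mode == "major")


# Hypothetical mean ratings for six pieces (made up for illustration).
toy_set = [
    ("A", "major", 5.9),
    ("B", "minor", 2.1),
    ("C", "major", 5.1),
    ("D", "minor", 4.8),
    ("E", "major", 3.0),
    ("F", "minor", 2.5),
]
major_in_top = count_major_in_top_half(toy_set)
print(major_in_top, "of", len(toy_set) // 2, "top-rated pieces are major")
```

Applied to a 24-piece set, the same count gives the "of the 12 highest" figures reported for each composer and rating dimension.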
Examining cue contributions to emotion
We performed commonality analyses using the yhat package in R (Nimon et al., 2013), using nonparametric bootstrapping (averaging arousal and valence ratings within each new sample) to approximate normality through 10,000 simulated replications with replacement. After bootstrapping, we estimated the 95% percentile interval (PI) for each commonality partition. Tables 1 and 2 list the commonality coefficients for each composer as percentages. Included are the commonality coefficients, 95% PI for each simulated commonality (column 4; n = 10,000), and differences between commonalities (columns 5–10; for each, n = 10,000). This offers a range of simulated effect sizes along with PI estimates of statistical significance. For example, in Table 1(a), the PI indicates the difference between attack rate’s and pitch height’s effects ranges from 4.63% to 13.51% for valence. As this interval does not contain 0%, we interpret it as indicating attack rate’s effect on valence is significantly greater than pitch height’s.
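The logic of the percentile-interval test can be sketched in Python as follows. This is a simplified illustration, not the yhat-based R pipeline used in this study: the helper names, the nearest-rank percentile rule, and the toy data are all assumptions, and `effect_difference` is only a stand-in for the difference between two commonality coefficients recomputed in each replication.

```python
import random
from statistics import mean


def percentile_interval(values, level=0.95):
    """Return the (lower, upper) percentile interval using nearest ranks."""
    ordered = sorted(values)
    last = len(ordered) - 1
    lower = ordered[int((1 - level) / 2 * last)]
    upper = ordered[int(round((1 + level) / 2 * last))]
    return lower, upper


def bootstrap(participants, statistic, n_boot=10_000, seed=0):
    """Resample participants with replacement and recompute the statistic."""
    rng = random.Random(seed)
    n = len(participants)
    return [
        statistic([participants[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    ]


def effect_difference(sample):
    """Toy stand-in for the difference between two commonality coefficients."""
    return mean(a for a, _ in sample) - mean(b for _, b in sample)


# Hypothetical per-participant scores (made up for illustration only).
participants = [(6, 3), (5, 4), (7, 3), (6, 2), (5, 3), (7, 4), (6, 4), (5, 2)]
replications = bootstrap(participants, effect_difference, n_boot=2_000)
lower, upper = percentile_interval(replications)
excludes_zero = lower > 0 or upper < 0
print(lower, upper, excludes_zero)
```

An interval that excludes zero is read as a significant difference at the corresponding level, which is the decision rule applied to the bootstrapped differences in the tables.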
Commonality Analysis of Battcock and Schutz (2019) Participant Ratings of Pieces in WTC, Showing Bootstrapped Differences Between Each Commonality Coefficient (CC).
AR: attack rate; CC: commonality coefficient; LL: lower limit; MO: mode; PH: pitch height; UL: upper limit; WTC: Well-Tempered Clavier.
Bold formatting indicates a significant difference at α = .05 level. Coefficients reported as percentages. Table formatting adapted from Marchetti et al. (2016).
Commonality Analysis of Results From This Study, Examining Ratings of Pieces in Préludes by Frederic Chopin, Including Bootstrapped Differences Between Each Commonality Coefficient (CC).
AR: attack rate; CC: commonality coefficient; LL: lower limit; MO: mode; PH: pitch height; UL: upper limit; WTC: Well-Tempered Clavier.
Bold formatting indicates a significant difference at α = .05 level. Coefficients reported as percentages.
To afford comparison with Chopin’s Préludes, we analyzed the 24 preludes from Bach’s WTC (book 1)—a subset of the full analysis of preludes and fugues in Battcock and Schutz (2019), modeling the original participant ratings using the newly encoded cue information from this study. We visualize these data first by showing a two-dimensional scatter plot illustrating both valence (x axis) and arousal (y axis) cue weights—derived from the 95% PIs of 10,000 sampled bootstrap replications (Figure 3). In addition to only analyzing preludes, this method differs from Battcock and Schutz’s (2019, 2021) analyses in the following two other ways: (1) our analyses comprise 10,000 rather than 1,000 replications and (2) we do not perform averaging on participant ratings before bootstrapping.

Scatter Plot of Bootstrapped Cue Effects on Emotion Ratings. Note. Scatter plot indicating cues’ unique and joint contributions to valence (x axis) and arousal ratings (y axis) using 95% percentile intervals from 10,000 bootstrap simulations. Cues: attack rate (blue), mode (green), and pitch height (PH; purple). Mixed colors indicate joint contributions. Text annotations indicate the three primary cues as well as the commonality between attack rate and mode. Figure axes vary between composers to clarify depictions of the varied cue effects within each set. Please refer to the online version of the article to view the figures in colour.
Comparison with Bach
For Bach’s set, attack rate uniquely plays a strong predictive role, accounting for 9.98% of variance explained in valence ratings and 29.15% of variance explained in arousal ratings. Mode accounts uniquely for 6.30% of the variance in valence ratings and only 0.79% in arousal ratings. Together, mode and attack rate jointly account for 18.4% of variance in valence ratings and 5.27% in arousal ratings. Pitch height plays a smaller role, both uniquely (1.2% valence, 0.9% arousal), and through its joint explanation with mode (2.10% valence, −0.33% arousal) and attack rate (–0.49% valence, 5.11% arousal). Jointly, all three cues play a minimal role (–1.74% valence, 0.85% arousal). The explained variance from the cumulative effect of all commonalities is 35.70% [28.89%, 42.63%] for valence and 41.72% [32.55%, 50.08%] for arousal ratings.
For Chopin’s set, attack rate accounts for 5.81% of the variance in valence and 40.5% for arousal. Mode’s unique effect contributes strongly to valence (23.2%); but only accounts for 1.18% of the variance in arousal. In contrast to Bach, mode and attack rate’s joint effect here contributes less prominently to valence (–4.09%) but similarly to arousal (6.26%). Pitch height contributes modestly (2.97% valence, 4.0% arousal), with its joint contributions with mode (12.31% valence, 2.88% arousal), and attack rate (3.07% valence, −3.89% arousal) explaining more variance in valence than arousal. Jointly, all three cues contribute minimally (valence: −0.46%, arousal: −1.29%). All commonalities cumulatively explain 42.84% [35.65%, 50.42%] of the variance in valence ratings and 49.60% [42.90%, 56.08%] in arousal ratings.
Comparing Bach and Chopin’s cue use
Comparing differences in how each commonality predicted participant ratings revealed that all unique and combined effects, except for attack rate’s unique effect, accounted for significant differences in valence ratings between the two composers. Similarly, all effects except mode’s unique effect and its joint effect with attack rate accounted for differences between composers in arousal ratings. Table 3 lists the commonalities and 95% PI estimates after subtracting the contributions of Bach from those of Chopin in accounting for emotion ratings. Although the composers differed for most commonalities for valence, the total variance that all cues and their combinations explained was not significantly different between composers (7.15% [−2.77%, 17.45%]). Similarly, the total variance explained for arousal ratings did not differ meaningfully between composers (7.88% [−2.75%, 19.10%]).
Mean Differences in Explained Variance After Subtracting the Distribution of Bootstrapped Commonality Coefficients of Bach’s WTC Ratings From Chopin’s 24 Preludes.
AR: attack rate; MO: Mode; PH: pitch height; WTC: Well-Tempered Clavier.
Bold formatting indicates significance at the α = .05 level.
Interpreting negative commonality values
Negative coefficients in commonality analysis are thought to indicate either (1) the presence of a suppressor variable removing irrelevant variance from another independent variable, or (2) a null effect equivalent to zero (Seibold & McPhee, 1979). A suppressor variable improves a predictor’s estimates by removing some of its irrelevant variance. It strongly correlates with the predictor while yielding a coefficient close to zero with the dependent variable (Ray-Mukherjee et al., 2014). We conducted several assessments to determine the best interpretation of negative commonalities for these data.
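A small worked example may help show how suppression produces a negative commonality. The Python sketch below uses the standard two-predictor identity for R² expressed through correlations; the function name and the correlation values are hypothetical and are not estimates from either set of preludes.

```python
def two_predictor_commonality(r_y1, r_y2, r_12):
    """Unique and common variance for two standardized predictors.

    Uses the two-predictor identity
    R^2 = (r_y1^2 + r_y2^2 - 2*r_y1*r_y2*r_12) / (1 - r_12^2).
    """
    r2_full = (r_y1**2 + r_y2**2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12**2)
    unique_1 = r2_full - r_y2**2
    unique_2 = r2_full - r_y1**2
    common = r_y1**2 + r_y2**2 - r2_full
    return unique_1, unique_2, common


# Hypothetical case: predictor 2 is unrelated to the outcome (r = 0) but
# correlates with predictor 1 (r = .5), so it acts as a suppressor.
u1, u2, common = two_predictor_commonality(r_y1=0.5, r_y2=0.0, r_12=0.5)
print(f"unique 1: {u1:.4f}, unique 2: {u2:.4f}, common: {common:.4f}")
```

In this toy case the full model explains more variance than the two predictors do separately, so the common partition must be negative for the partitions to sum to the full R².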
To investigate negative commonalities in Chopin related to pitch, we assessed its Pearson correlations with attack rate and the pieces’ average arousal ratings. For arousal ratings of Chopin’s pieces, pitch appeared in multiple commonalities yielding negative values, suggesting it may have suppressed some of the variance associated with attack rate. Pitch exhibited a nonsignificant positive correlation with attack rate, r(22) = .20, p = .70, and a weak negative correlation with arousal ratings, r(22) = −.17, p = .70. In contrast, attack rate and arousal ratings exhibited a strong positive correlation, r(22) = .84, p < .01. The weak positive correlation with attack rate and the lack of a significant correlation with arousal suggest pitch’s negative values reflect a negligible effect instead of suppression. For Bach, pitch height also appeared in negative commonalities involving mode and attack rate. It exhibited a strong negative correlation with attack rate, r(22) = −.50, p = .04, but significant correlations with neither mode, r(22) = .04, p = .98, nor valence, r(22) = −.15, p = .98. We interpret pitch’s weak associations with mode and valence to signify pitch explains minimal variance in valence ratings.
Assessing cues’ overall effects
To assess the cumulative effect of each cue, we performed a second bootstrap simulation, randomly sampling the unique and common effects of each cue 10,000 times, excluding commonalities with negative mean values. This allows comparing cue contributions accounting for unique and joint variance. Figure 4 reorganizes information from Figure 3 to convey each cue’s full role. Whereas attack rate (28.32% [22.81%, 34.36%]) and mode (26.74% [21.65%, 32.34%]) contribute roughly equally to valence ratings in Bach, pitch height plays a much smaller role (3.29% [2.06%, 4.73%]). For Chopin, mode (35.56% [28.76%, 42.63%]) contributes more than both attack rate (8.87% [4.79%, 14.04%]) and pitch height (18.36% [15.26%, 21.62%]). The combined cue contributions are more similar for arousal, with attack rate (and joint contributions) accounting for the most variance for both composers (Bach: 40.32% [32.80%, 47.33%]; Chopin: 46.73% [39.61%, 53.84%]). Similarly, for both composers, mode (Bach: 6.91% [3.23%, 10.47%]; Chopin: 10.33% [8.08%, 12.69%]) and pitch height (Bach: 6.86% [3.26%, 10.16%]; Chopin: 6.87% [5.05%, 8.88%]) explained small but significant proportions of variance in arousal.

Stacked Bar Charts of Bootstrapped Cue Effects on Emotion Ratings. Note. Bar charts indicating cues’ unique (solid color) and joint (mixed colors) contributions to valence (top) and arousal (bottom) ratings from 10,000 bootstrap simulations. Commonalities yielding a mean value of ≤ 0% are excluded. Simulated 95% percentile intervals from 10,000 summed bootstrapped resamples represent the total effect of each cue’s unique and joint contributions (gray bars). Cues: attack rate (blue), mode (green), and pitch height (purple). Please refer to the online version of the article to view the figures in colour.
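The way each cue’s total effect aggregates its unique and joint partitions can be approximated with the point estimates reported for Bach’s valence ratings. The Python sketch below is a simplification: it sums the reported coefficients directly rather than summing within each of the 10,000 bootstrap resamples, and it drops partitions whose point estimate (rather than bootstrapped mean) is negative. The function name `cue_total` is hypothetical.

```python
# Commonality coefficients (percent of variance in valence ratings) as
# reported in the text for Bach's preludes.
bach_valence = {
    ("attack_rate",): 9.98,
    ("mode",): 6.30,
    ("pitch_height",): 1.2,
    ("mode", "attack_rate"): 18.4,
    ("mode", "pitch_height"): 2.10,
    ("attack_rate", "pitch_height"): -0.49,
    ("mode", "attack_rate", "pitch_height"): -1.74,
}


def cue_total(coefficients, cue):
    """Sum every positive partition that involves the given cue."""
    return sum(
        value
        for cues, value in coefficients.items()
        if cue in cues and value > 0
    )


totals = {
    cue: cue_total(bach_valence, cue)
    for cue in ("attack_rate", "mode", "pitch_height")
}
for cue, total in totals.items():
    print(f"{cue}: {total:.2f}%")
```

Under these assumptions the sums fall close to the bootstrapped totals reported for Bach (28.32%, 26.74%, and 3.29%); the small gaps are expected because the published values are means over resamples rather than sums of point estimates.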
Discussion
Our data provide new insight into the unique and joint effects of pitch, timing, and mode in musical sets by Bach (1883 [originally published in 1722]) and Chopin (2007 [originally published in 1839]). We believe the most important outcomes of these extensive analyses are (a) the visualizations of each composer’s “emotional palette” shown in Figure 2 and (b) insight into each cue’s unique and joint contributions to musical communication shown in Figures 3 and 4. Together, these findings shed new light on how emotion is communicated in unaltered musical passages by two renowned composers. Crucially, the use of natural music also affords exploring changes over musical history. This complements and extends a literature often based on simplified stimuli to avoid problems with cue collinearity. Consequently, this approach allows for new insight regarding cue collinearity, showing not only that it plays a crucial role in conveying musical emotion but also that Bach and Chopin may have used it in different ways.
Circumplex analyses
Visualizing participant ratings on the circumplex (see Figure 2) illustrates each set’s emotional palette. Mode clearly distinguished pieces’ positive or negative connotations, with minor pieces rated lower in valence than major pieces. Although both composers’ major and minor pieces split relatively evenly along the valence dimension, Chopin’s exhibit more variability in arousal, with twice as many minor pieces receiving high arousal ratings. This difference in the use of the minor mode is compelling in light of the increased prevalence of fast minor pieces during the Romantic era (Horn & Huron, 2015; Post & Huron, 2009). Consequently, our findings provide a useful perceptual counterpart to musicological work tracking mode’s changing use and function (Pedneault-Deslauriers, 2017), contextualizing explorations of mode’s aesthetic and expressive meaning by music theorists (Hatten, 2004; Parncutt, 2014).
Although intriguing, differences in the circumplex visualizations must be interpreted carefully. Despite each set containing 12 pieces in each mode (major/minor), we cannot assume equivalence across sets—Bach’s Prelude in Bb is not “equivalent” to Chopin’s Prelude in Bb. Therefore, understanding differences between cues’ specific mappings to emotional responses is essential for understanding composer-related differences.
Investigating cues’ unique and joint contributions
Commonality analysis enables disentangling how mode, timing, and pitch influence valence and arousal ratings. For valence, mode’s joint effect with attack rate explains most of the variance in ratings of Bach’s pieces, whereas its unique effect more strongly influences ratings of Chopin’s pieces (along with its correlated use with pitch height). This stronger importance of collinearity for ratings of Bach’s music also affects arousal ratings: attack rate’s joint effects with pitch height and mode explain more variance for Bach than for Chopin.
Recomposing the CA variance through bootstrapping enabled deriving estimates of each cue’s total effect (inclusive of intercorrelations). For Bach, mode and attack rate similarly affected valence (each explaining over 20% of the variance in participants’ ratings). Figure 4 reveals this similarity stems from their joint effect influencing ratings more than either cue’s unique effect. In contrast, mode (35.6%) contributed more prominently than attack rate (8.9%) for Chopin, suggesting greater independence in conveying valence (although its joint contribution with pitch also affected ratings). For arousal, attack rate’s cumulative effect (Bach: 40.3%, Chopin: 46.7%) explained more variance than either mode (Bach: 6.9%, Chopin: 10.3%) or pitch height (Bach: 6.9%, Chopin: 6.9%) for both composers. Comparing unique and joint contributions reveals the unique effects of mode and pitch explain less than 1% of the variance in arousal for Bach; similarly, mode uniquely explains just over 1% of the variance in arousal ratings of Chopin’s pieces, suggesting its contribution to arousal largely stems from its relationship with attack rate. Consequently, deconstructing and reconstituting cues clarifies the importance of collinearity, revealing how renowned composers weave cues to convey complex emotional messages.
General discussion
Consistent with past research, we observe associations between timing and arousal (Carpentier & Potter, 2007; Husain et al., 2002) and between mode and valence (Dalla Bella et al., 2001; Gagnon & Peretz, 2003; Kastner & Crowder, 1990). However, unraveling the contributions of correlated cues offers novel insight. Most cue combinations affecting valence ratings differed between Bach and Chopin (except attack rate). Similarly, most cue combinations affected arousal ratings differently between composers (except for mode and its shared variance with attack rate). In the following sections, we summarize the importance of exploring music’s complex structure, suggesting steps for refining explorations of music’s changing emotional implications.
Limitations and future directions
Our study attempts to address a long-standing, fundamental challenge of using naturalistic music stimuli by combining rigorous musical analyses characteristic of empirical musicology studies with statistical analyses capable of disentangling nuanced relationships between analyzed cues and perceptual responses. Ironically, these complex methods enable clear insights into the emotional messages encoded in Bach’s and Chopin’s preludes. Although we believe this offers exciting new possibilities for broader musical inquiry, recognizing its limitations is crucial to both interpreting our findings and guiding future research.
First, our study focused on 24-piece prelude sets from two composers of historical renown. Although their prominence makes them valuable, using only two sets precludes clearly disambiguating between the composers and the eras in which they lived, and leaves open how representative these sets are of each composer’s complete oeuvre. Although our results are consistent with previous findings tracking changes in musical structure during the Romantic era, further research must explore whether this extension into perceptual experimentation holds for a broader range of pieces, composers, and performer interpretations. Second, emotion judgments of a convenience sample of university students may not reflect those of a demographic varied in age, education, or emotion processing abilities (see Henrich et al., 2010; Kret & Ploeger, 2015). To better understand how diverse participants perceive music’s conveyed emotion, we plan to conduct online studies recruiting from a larger pool of participants using extensive inclusion criteria. Third, as both sets fit squarely within the western musical canon, future research should explore how these findings compare across cultures. Investigating diverse music in emotion research will facilitate developing informed hypotheses and accurate conclusions about music’s emotional associations (see Ewell, 2020). Finally, we recognize the complexity in our combination of bootstrapping techniques with commonality analysis, acknowledging this reflects our team’s efforts to consolidate a diverse toolset exploring the historic changes in musical communication. We hope this study serves as a starting point for further inquiry, and recognize dialogue with statisticians, psychologists, and musicologists will drive further musical insights. Despite these limitations, this study provides a valuable step toward sharpening our understanding of the diverse emotions elicited by music’s historic shifts.
The case for exploring intercorrelations
We believe disentangling music’s intercorrelated structure is helpful for understanding its emotional impact. Whereas removing multicollinearity in stimuli created for experimental purposes lends insight into which cues can influence emotion perception, deconstructing multicollinearity reveals how composers actually use these cues in real musical compositions. This approach enables exploring questions beyond those available with stimuli constructed for psychological experiments, such as historical changes in musical cue use. Additionally, it offers the possibility of formally assessing how developments in instruments’ technology influence their emotional affordances. Studies exploring the relationship between instruments’ design and their emotional palette highlight the importance of these inquiries (de Souza, 2017; Huron et al., 2014; Schutz et al., 2008). Several aspects of our data complement more traditional experimental approaches, such as the strong contribution of attack rate to arousal ratings and mode’s influence on valence for both composers. However, they also offer novel insight unavailable with more controlled stimuli, namely that mode’s effects on valence stem largely from its covariation with timing in Bach’s set. Curiously, pitch’s unique effect explained little variance for either emotion dimension for either composer, contrasting with previous work finding strong relations to valence (Ilie & Thompson, 2006) and arousal (Jaquet et al., 2014). Instead, its joint contributions provided its strongest effects, suggesting pitch’s expressivity in musical practice stems from its relation to other cues.
Our findings provide a snapshot of how two renowned composers elicit emotional responses using different cue combinations. These differences have important implications for the perception of music’s meaning—and suggest it might ultimately prove beneficial to embrace music’s structural complexity as a rich source of information—rather than a problematic feature to be avoided. We hope these approaches applied to a variety of musically expressive styles and music from different historical periods will shed new light on the complexity of musical emotion—unveiling how composers’ musical choices continue to captivate audiences separated by centuries.
Supplemental Material
Supplemental material, sj-docx-1-pom-10.1177_03057356211046375 for Exploring historic changes in musical communication: Deconstructing emotional cues in preludes by Bach and Chopin by Cameron J. Anderson and Michael Schutz in Psychology of Music
Footnotes
Acknowledgements
The authors would like to thank Max Delle Grazie for his careful analyses of the scores studied as well as Benjamin Kelly for his suggestions for developing figures and continued engagement with this project.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Social Sciences and Humanities Research Council [435-2018-1448] and the Canada Foundation for Innovation [CFI-LOF 30101].
Supplemental material
Supplemental material for this article is available online.