Abstract
The ability of music to communicate emotions cross-culturally has been explored in many studies. The first purpose of this study was to examine whether the previously hypothesised in-group advantage of emotion recognition in music holds between the Chinese and Western contexts. The second purpose was to investigate the associations between psychoacoustic features and the recognition of musical emotions in Chinese and Western listeners. Chinese and Western participants were asked to listen to both Chinese and Western music intended to express happiness, sadness, peacefulness, anger, and fear, and to indicate the degree to which they thought the music expressed each of the five emotions on continuous scales ranging from 1 to 5. A series of mixed analyses of variance (ANOVAs) revealed that the in-group advantage was not well established in this study: Chinese listeners seemed to be more sensitive to happiness and sadness but less sensitive to fear than their Western counterparts. The psychoacoustic features were extracted with the MIR Toolbox 1.8.1 and reduced through principal component analysis. Subsequent hierarchical linear regressions, run for each type of rating and for Chinese and Western music separately, showed that the type, number, and strength of the psychoacoustic features associated with emotion recognition differed across cultures.
Keywords
According to Juslin and Laukka (2004), there exist various ways of defining emotions, yet most researchers in the field of emotions would likely concur that emotions can be characterised as relatively brief and intense responses to goal-oriented changes within the contextual surroundings. These responses comprise several elements: cognitive appraisal, subjective feeling, physiological arousal, emotional expression, action tendency, and emotion regulation. In the previous studies on music and emotion, the conceptual distinction between perception and induction of emotions is constantly highlighted. This is due to their different underlying mechanisms and measurements, and the fact that emotions perceived in music may not always be congruent with emotions induced by music (Gabrielsson, 2001; Juslin & Laukka, 2004; Juslin & Västfjäll, 2008). In this study, we focused on perceived emotions. There are two widely acknowledged theories in emotion research nowadays: discrete emotion theory and dimensional theory of emotion. The well-known discrete emotion theory – the basic emotion theory – proposes that there are a small number of core emotions that are biologically and psychologically basic and shared among human beings (Eerola & Vuoskoski, 2011; Gu et al., 2019). Psychologists spanning centuries (e.g., Darwin, 1872; Ekman, 1992a; Ekman & Cordaro, 2011; Izard, 1977, 2011; Plutchik, 1962; Tomkins, 1962) have engaged in an enduring discussion about the theory; however, there has not yet been an agreement on the precise number and types of basic emotions (Ekman, 1992b; Gu et al., 2019). Basic emotions such as happiness, sadness, and anger are considered to be easily judged from facial expressions and communicated in music (Ekman, 1992a; Ekman et al., 1969; Gabrielsson & Juslin, 1996). 
Another important theory about emotion is the dimensional theory of emotion (Schlosberg, 1954; Wundt, 1897), which proposes that emotions can be described by only two or three dimensions of affect. The most predominant and commonly used dimensional model nowadays is the circumplex model proposed by Russell (1980), which categorises an emotion based on two dimensions: valence (pleasure–displeasure) and arousal (degree of arousal). However, some have argued that the two-dimensional space has limitations in terms of fully representing all emotions and distinguishing those that are very similar (Juslin & Laukka, 2004). Given all the above, we chose to focus in this study on five commonly studied discrete emotions in the field of music and emotion: happiness, sadness, peacefulness, anger, and fear (Argstatter, 2016; Balkwill & Thompson, 1999; Hanigan et al., 2023; Juslin, 2013; Juslin & Laukka, 2003, 2004).
Music is ubiquitous across human cultures and is considered to have the capacity to convey emotions (Midya et al., 2019). Juslin and colleagues developed the lens model and the expanded lens model (Juslin, 2000; Juslin & Laukka, 2004; Juslin & Lindström, 2003), which help explain the musical communication of emotions through psychoacoustic features such as mode, pitch, tempo, sound level, articulation, and timbre. The similar uses of psychoacoustic cues to judge emotions in music and speech (Ilie & Thompson, 2006; Juslin & Laukka, 2003, 2004), and the shared structure involving universal emotional expression in music and movement (Sievers et al., 2013), both help to provide explanations for the expressive capacity of music. The emotional communication of music has been verified in some cross-cultural studies. For instance, in the study by Fritz and colleagues (2009), the Mafas were able to recognise happiness, sadness, and fear above chance level from Western music stimuli, even though they were naïve to Western culture and music. By contrast, some point out that cross-cultural decoding of affective states may not always be established (Davies, 2011); instead, cultural differences may be more or less apparent in emotional responses to music (Gregory & Varney, 1996).
Balkwill and Thompson (1999) proposed the cue-redundancy model (CRM) to characterise cross-cultural similarities and differences in the expression and recognition of emotion in music, with both universal cues (i.e., psychophysical cues) and cultural-specific cues. Psychophysical cues refer to those psychoacoustic features that are shared by all tonal systems (e.g., tempo and complexity), while cultural-specific cues are influenced by enculturation and conventions, which may be bound to a particular culture (e.g., harmonic progressions). The association between emotion recognition and the use of psychoacoustic cues has been substantiated by previous studies (e.g., Balkwill et al., 2004; Balkwill & Thompson, 1999). In the study by Balkwill and Thompson (1999), Western listeners’ perceptions of four target emotions (joy, sadness, anger, and peace) encoded in raga music were found to correlate with their perceptions of acoustical features, including tempo, rhythmic complexity, melodic complexity, pitch range, and timbre. Their further studies also revealed the different utilisations of psychoacoustic cues by listeners from different cultures (Balkwill, 2006; Thompson & Balkwill, 2010). In this study, one of the purposes was to expand research on this issue.
The dock-in model proposed by Fritz (2013, p. 514) aimed to conceptualise universals and cultures from a wider scope, which further includes ‘music universals that are not a shared feature of all music cultures or even any music culture’. In this sense, the farther away a listener is from another culture’s dock, the more likely they have difficulty in recognising the culture-specific cues (Argstatter, 2016). This model agrees with the CRM on an in-group advantage in recognising emotions for listeners within or close to the culture of the music, which has been substantiated in many studies. For instance, in the study by Argstatter (2016), the German group shared similar recognition patterns with the Norwegian group (both from West European culture), as was the case for the Korean group and the Indonesian group (both from Asian culture), and the European groups outperformed the Asian groups when listening to Western music stimuli. However, it is notable that the in-group advantage seems to vary across emotions (Laukka et al., 2013).
In this study, we decided to explore cross-cultural music emotion recognition between Chinese and Western contexts for the following reasons. On the one hand, as an important representative of the Oriental music cultures, Chinese traditional music is distinctly different from Western classical music in its musical systems and instrumentation. On the other hand, emotional expressivity, which is deemed a fundamental feature of music in Western culture (Fritz, 2013), is likewise emphasised in Chinese music, as an important representation of people’s inner thoughts and affective states (Zheng & Yang, 2020). Based on the previously discussed literature, there are two research questions for this study: (1) Is there an in-group advantage when listeners recognise emotions from the music of their own cultures? (2) How do psychoacoustic cues correlate with the recognition of emotions in music by listeners from different cultures?
Method
Ethics
Ethical approval was granted by the University of York Arts and Humanities Ethics Committee. Participants were required to read the participant information sheet and sign the consent form before starting the listening study. All data and personal information collected from participants were kept anonymous and were protected against unauthorised access.
Participants
All participants were approached via social media or email contact. Participants were informed of the opportunity to be notified about the results of the study and the chance to win a £10 Amazon voucher or equivalent cash prize. People who were born and raised in a Western cultural background or the Chinese cultural background were eligible to take part in the study. Musicians were identified as those who either undertook a music-related major or job or had received 10 years or more of professional musical training. Two hundred and seventy-eight Chinese (69 males, 98 musicians; M = 25.01 years old, SD = 6.44) and 136 Westerners (54 males, 68 musicians; M = 34.91 years old, SD = 14.95) participated in the study.
Music stimuli
Music stimuli were selected from two pilot studies, in which participants were professional Western classical musicians with a Western cultural background and Chinese traditional musicians from China. Similar to previous studies (e.g., Balkwill et al., 2004; Laukka et al., 2013), professional musicians were exclusively involved in the selection of music stimuli for the pilot studies. This is because musicians are not only influenced by cultural conventions but also by their specialisation in particular musical genres. Musicians’ expertise in expressing emotions through musical performance can result in a higher degree of consensus among their judgements, reflecting a professionally typical perspective in contrast to non-musicians. Twelve Westerners (five males; M = 45.58 years old, SD = 13.82) and nine Chinese (three males; M = 23.78 years old, SD = 2.95) participated in the first pilot study, and 12 Westerners (four males; M = 31.25 years old, SD = 9.52) and 15 Chinese (three males, one prefer not to say; M = 22.60 years old, SD = 1.72) in the second pilot study. The original music stimuli were recommended by Chinese traditional music and Western classical music experts, who were asked to recommend music that can express happiness, sadness, peacefulness, anger and fear. Each group of musicians listened to the music excerpts from their own culture and rated the extent to which they thought the music expressed each of the five intended emotions using scales numbered 1 to 5 (low to high), for each excerpt. Scores 1 and 2 were coded as Low Level, 3 as Medium Level, and 4 and 5 as High Level. Frequencies of these three levels of emotion ratings were calculated for each musical excerpt.
The valid excerpts were identified according to the following criteria: (1) the highest rating for High Level should lie on the intended emotion; (2) for the ratings of the intended emotion, the sum of the ratings for High Level and Medium Level should be higher than the rating for Low Level; (3) the sum of the ratings for High Level and Medium Level for the intended emotion should be higher than the sum of the ratings for High Level and Medium Level for the other emotions. It was also required that the mean duration of each type of emotional music in one culture be roughly equal to that of the other culture. Since no Chinese musical excerpts were perceived as angry and only one Western musical excerpt was perceived as sad in the first pilot study, experts in the second pilot study were asked to recommend specific music excerpts for these two emotions within their respective musical genres, instead of only providing the title of the music. Ten excerpts of Western classical orchestral music and eight excerpts of Chinese traditional ensemble music, with two musical excerpts for each emotion of each culture, were finally selected. Note that the Western music stimuli included all five types of emotions, while the Chinese music stimuli included only four (happiness, sadness, peacefulness, and anger), without fear. This was because Chinese experts were not able to recommend fearful music in either pilot study, and thus the ratings for fear were excluded from the selection criteria for Chinese angry music. The duration of the musical excerpts was 13 to 21 s. A full list of music stimuli selected for this study can be found in Appendix A in Supplementary materials online.
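The coding scheme and the three validity criteria described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the example ratings are hypothetical, "higher than" is read as a strict inequality (so ties fail), and criterion (3) is read as a comparison against each other emotion individually. The source does not specify how ties or that comparison were handled.

```python
from collections import Counter


def level(score):
    """Code a 1-5 rating as Low (1-2), Medium (3), or High (4-5)."""
    if score <= 2:
        return "Low"
    if score == 3:
        return "Medium"
    return "High"


def level_frequencies(ratings):
    """Count Low/Medium/High ratings per emotion.

    `ratings` maps each rated emotion to the list of 1-5 scores
    given by the pilot raters for one excerpt (hypothetical layout).
    """
    return {
        emotion: Counter(level(score) for score in scores)
        for emotion, scores in ratings.items()
    }


def is_valid_excerpt(ratings, intended):
    """Apply the three selection criteria to one excerpt."""
    freq = level_frequencies(ratings)
    high = {e: freq[e]["High"] for e in freq}
    high_medium = {e: freq[e]["High"] + freq[e]["Medium"] for e in freq}
    others = [e for e in freq if e != intended]

    # (1) The intended emotion has the most High-Level ratings
    #     (assumption: strictly more than every other emotion).
    c1 = all(high[intended] > high[e] for e in others)
    # (2) For the intended emotion, High + Medium exceeds Low.
    c2 = high_medium[intended] > freq[intended]["Low"]
    # (3) High + Medium for the intended emotion exceeds High + Medium
    #     for each other emotion (assumption: compared one by one).
    c3 = all(high_medium[intended] > high_medium[e] for e in others)
    return c1 and c2 and c3


# Hypothetical ratings from four raters, for illustration only.
clear_happy = {
    "happiness": [5, 4, 4, 3],
    "sadness": [1, 1, 2, 1],
    "peacefulness": [3, 2, 2, 4],
    "anger": [1, 1, 1, 2],
    "fear": [1, 1, 1, 1],
}
ambiguous = {
    "happiness": [4, 3, 2, 2],
    "sadness": [1, 1, 1, 1],
    "peacefulness": [4, 4, 3, 3],
    "anger": [1, 1, 1, 1],
    "fear": [1, 1, 1, 1],
}
print(is_valid_excerpt(clear_happy, "happiness"))
print(is_valid_excerpt(ambiguous, "happiness"))
```

Because the criteria only iterate over the emotions present in the dictionary, leaving out the `"fear"` entry mirrors the exclusion of fear ratings for the Chinese excerpts.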
Procedure
The main study was conducted using an online questionnaire hosted on Qualtrics. The terms used in the Chinese version of the questionnaire were translated with reference to the dictionary The Modern English-Chinese-English Psychological Vocabulary (Zhang et al., 2006) and to relevant Chinese peer-reviewed publications in which these terms were used. English and Chinese versions of the questionnaire were provided, and participants chose between them according to their needs. Participants were required to find a quiet place away from interfering noise or distractions. After completing the demographic questions, participants were given a sound test to set the volume to a comfortable level, which they were asked not to change afterward. Participants were then instructed to click on the play button and listen to the music excerpt only once before answering the subsequent questions. All 18 musical excerpts were played in a random order for each participant. Participants were required to rate, on scales numbered 1 to 5 (low to high), their familiarity with the music and their perception of each of the five intended emotions (happiness, sadness, peacefulness, anger, and fear) conveyed by the music. The instructions reminded participants to rate the extent to which they thought the music expressed the given emotions, rather than how the music made them feel.
Results
Familiarity
Figure 1 shows that the 18 musical excerpts were generally unfamiliar to both cultural groups. For the familiarity ratings, only Excerpts 1 and 5 were above the Medium Level (rating score ‘3’) for the Chinese group. For the Western group, all the music excerpts were below the Medium Level. Both cultural groups were more familiar with the music excerpts from their own culture than with those from the other culture, except for Excerpts 4, 7, and 13.

Figure 1. Estimated Marginal Means of Ratings of Stimulus Familiarity for Chinese and Western Participants (*p
Recognised emotions
After calculating the mean ratings per participant for each emotional category of the music, we conducted a mixed ANOVA for each perceived emotion rating, with the within-subjects factor, emotion of music (happy, sad, peaceful, and angry music for Chinese music, and happy, sad, peaceful, angry, and fearful music for Western music), and the between-subjects factor cultural background (Chinese versus Western). Figures 2 and 3 present the estimated marginal means of ratings for happiness, sadness, peacefulness, anger, and fear for different cultural groups, in Chinese music and Western music. Significant group differences are indicated by asterisks (see Appendix B in Supplementary materials online for statistical details). Through pairwise comparisons, a cultural difference in the recognition of an emotion was determined by an observed group difference in the ratings of the targeted musical emotion. The highest-rated emotions were congruent with the targeted musical emotions in both cultural groups and across all emotions, except for fear ratings in Western music.

Figure 2. Estimated Marginal Means of Ratings for Happy, Sad, Peaceful, and Angry Music in Chinese Music, Separated by Rating Types.

Figure 3. Estimated Marginal Means of Ratings for Happy, Sad, Peaceful, Angry, and Fearful Music in Western Music, Separated by Rating Types.
We also included musical background and gender as between-subjects factors, along with cultural background, to examine whether musicianship and gender influenced cultural differences in the emotion recognition of music. Only significant results from follow-up analyses are reported, specifically when cultural differences in responses to the targeted emotional music vary by musical background or gender, or when these differences diverge from those observed in the initial analysis (i.e., when significant cultural differences become non-significant, or vice versa). For the full results of the effects of all variables and interactions, as well as all the relevant pairwise comparisons, for the follow-up analyses, see Appendix C in Supplementary materials online.
Chinese music
In general, Chinese participants only showed an in-group advantage (i.e., ratings given by Chinese participants for the targeted musical emotion were higher than those given by Western participants) in recognising happiness and sadness. No group difference was found in peacefulness and anger, and Chinese participants were less sensitive to fear compared to Western participants.
Specifically, for the happiness ratings, the first analysis (with cultural background as the only between-subjects factor) revealed main effects of emotion of music (F(3, 410) = 960.842, p < .001,
For the sadness ratings, there was a main effect of emotion of music (F(3, 410) = 521.967, p < .001,
For the peacefulness ratings, there was an emotion of music main effect (F(3, 410) = 378.056, p < .001,
For the anger ratings, there was an emotion of music main effect (F(3, 410) = 385.120, p < .001,
For the fear ratings, there was an emotion of music main effect (F(3, 410) = 395.941, p < .001,
Western music
In general, Western participants only showed an in-group advantage in recognising fear. No group difference was found in peacefulness and anger, and Western participants were less sensitive to happiness and sadness compared to Chinese participants.
Specifically, for the happiness ratings, the first analysis revealed an emotion of music main effect (F(4, 409) = 501.593, p < .001,
For the sadness ratings, there was an emotion of music main effect (F(4, 409) = 496.366, p < .001,
For the peacefulness ratings, there was an emotion of music main effect (F(4, 409) = 433.056, p < .001,
For the anger ratings, there was an emotion of music main effect (F(4, 409) = 383.306, p < .001,
For the fear ratings, there was an emotion of music main effect (F(4, 409) = 366.336, p < .001,
Correlates of psychoacoustic features
We subsequently investigated whether psychoacoustic features of the music stimuli were associated with listeners’ emotion recognition in both Chinese and Western music (Egermann et al., 2015). First, we extracted seven psychoacoustic features, each represented by its mean over the music excerpt, using the MIR Toolbox 1.8.1 (Lartillot et al., 2008). These were: (1) pitch (computed from an autocorrelation function of the audio waveform), (2) event density (the estimated number of events detected per second), (3) roughness (the summed roughness between all pairs of sines obtained through spectral peak-picking; Sethares, 1998), (4) the centroid of the frequency spectrum, (5) root mean square (RMS) energy (the square root of the mean of the squared amplitude), (6) brightness (the amount of energy above the cut-off frequency; Juslin, 2000), and (7) mode (the key strength difference between the best major key and the best minor key). In addition, tempo was measured in beats per minute (BPM) with a web-based BPM-Tracker by manually tapping along with the dominant beat of the music excerpt. For the values of all psychoacoustic features for all 18 music excerpts, see Appendix A in Supplementary materials online. Second, we conducted a Principal Component Analysis on those mean audio features (see Table 1) to reduce the number of predictor variables and their collinearity.
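To make some of these feature definitions concrete, the following NumPy sketch computes RMS energy, the spectral centroid, brightness, and a tapped tempo for a mono signal. It is not the MIR Toolbox implementation: the function names are hypothetical, the features are computed over the whole signal rather than frame by frame, the centroid is weighted by spectral magnitude, and the 1500 Hz cut-off is an assumed illustrative value, since the source does not state which cut-off was used.

```python
import numpy as np


def rms_energy(signal):
    """Square root of the mean of the squared amplitude."""
    return float(np.sqrt(np.mean(np.square(signal))))


def magnitude_spectrum(signal, sample_rate):
    """Return bin frequencies (Hz) and magnitudes of the real FFT."""
    magnitudes = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    return freqs, magnitudes


def spectral_centroid(signal, sample_rate):
    """Magnitude-weighted mean frequency of the spectrum (assumption)."""
    freqs, mags = magnitude_spectrum(signal, sample_rate)
    return float(np.sum(freqs * mags) / np.sum(mags))


def brightness(signal, sample_rate, cutoff_hz=1500.0):
    """Share of spectral energy at or above the cut-off frequency.

    The default cut-off is an assumed value for illustration.
    """
    freqs, mags = magnitude_spectrum(signal, sample_rate)
    power = mags ** 2
    return float(np.sum(power[freqs >= cutoff_hz]) / np.sum(power))


def tempo_from_taps(tap_times_s):
    """Tempo in BPM from manually tapped beat times (in seconds)."""
    intervals = np.diff(tap_times_s)
    return float(60.0 / np.mean(intervals))


# Synthetic test tones, one second each at 8 kHz (illustration only).
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
low_tone = np.sin(2 * np.pi * 440 * t)
high_tone = np.sin(2 * np.pi * 3000 * t)

print(rms_energy(low_tone))
print(spectral_centroid(low_tone, sample_rate))
print(brightness(low_tone, sample_rate), brightness(high_tone, sample_rate))
print(tempo_from_taps([0.0, 0.5, 1.0, 1.5]))
```

A pure tone below the cut-off has brightness near zero and one above it has brightness near one, which is the sense in which brightness tracks high-frequency energy. The functions assume a non-silent signal, because an all-zero input would divide by zero.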
Table 1. Component Loadings From Principal Component Analyses of Psychoacoustic Features of Music Excerpts (n = 18).
Note. Rotation method: Varimax with Kaiser normalisation. RMS: root mean square. Factor loadings greater than .8 are shown in bold.
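As an illustration of the rotation named in the note, the sketch below extracts component loadings from a correlation matrix and applies a varimax rotation with Kaiser normalisation, using NumPy only. It is a generic textbook-style implementation, not the statistical package the authors used. The function names are hypothetical and the 18 × 8 input is random data standing in for the excerpt-by-feature matrix, so the resulting loadings carry no meaning.

```python
import numpy as np


def pca_loadings(features, n_components):
    """Unrotated loadings from the correlation matrix of the features."""
    corr = np.corrcoef(features, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)
    # eigh returns ascending eigenvalues; keep the largest ones.
    order = np.argsort(eigvals)[::-1][:n_components]
    return eigvecs[:, order] * np.sqrt(eigvals[order])


def varimax_kaiser(loadings, max_iter=100, tol=1e-8):
    """Varimax rotation with Kaiser (row) normalisation."""
    # Kaiser normalisation: scale each row to unit length first.
    h = np.sqrt(np.sum(loadings ** 2, axis=1, keepdims=True))
    normed = loadings / h
    n_components = normed.shape[1]
    rotation = np.eye(n_components)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = normed @ rotation
        # Gradient of the varimax criterion with respect to the rotation.
        gradient = normed.T @ (
            rotated ** 3 - rotated * np.mean(rotated ** 2, axis=0)
        )
        u, s, vt = np.linalg.svd(gradient)
        rotation = u @ vt  # nearest orthogonal matrix to the gradient
        new_criterion = np.sum(s)
        if criterion != 0.0 and new_criterion < criterion * (1 + tol):
            break
        criterion = new_criterion
    # Undo the row normalisation after rotating.
    return (normed @ rotation) * h


# Random stand-in for 18 excerpts x 8 features (illustration only).
rng = np.random.default_rng(0)
fake_features = rng.normal(size=(18, 8))
unrotated = pca_loadings(fake_features, n_components=6)
rotated = varimax_kaiser(unrotated)
print(np.round(rotated, 2))
```

Because the rotation is orthogonal, each feature's communality (the row sum of squared loadings) is unchanged; only the distribution of loadings across components shifts, which is what makes the rotated components easier to label.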
Third, the scores resulting from those six principal components [PC1–6] were used as predictor variables. Together with all the outcome variables (rating scores), they were then z-standardised and subsequently tested in a hierarchical linear regression for each type of rating for Chinese music and Western music separately. Then we ran a third regression model, which added a dummy variable that coded the Western participant group as the reference group. The group differences were estimated by the interaction effects between the cultural background and all the z-standardised predictors. Figure 4 displays the estimated fixed-effects coefficients, separated by cultural groups, rating types, and cultures of the music. A significant effect (greater or smaller than zero) of the predictors is indicated when the 95% confidence interval (error bar) does not cross through the zero line, and the significant group differences are indicated by asterisks (see Appendix D in Supplementary materials online for statistical details).
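The logic of the third model, in which group differences are estimated as interactions between cultural background and the standardised predictors, can be sketched with ordinary least squares. This is a simplified illustration, not the authors' analysis: it ignores the repeated-measures structure of the ratings, the function and variable names are hypothetical, and the data are synthetic. The dummy variable is 0 for Western listeners (the reference group) and 1 for Chinese listeners, so each interaction coefficient is the Chinese-minus-Western difference in slope.

```python
import numpy as np


def zscore(values):
    """Standardise each column to mean 0 and standard deviation 1."""
    return (values - values.mean(axis=0)) / values.std(axis=0, ddof=1)


def fit_group_interaction(pc_scores, ratings, is_chinese):
    """OLS fit of ratings on PCs, group dummy, and their interactions.

    pc_scores: (n_observations, n_pcs) principal component scores
    ratings:   (n_observations,) emotion ratings
    is_chinese: (n_observations,) 0 = Western (reference), 1 = Chinese
    """
    z_pcs = zscore(np.asarray(pc_scores, dtype=float))
    z_ratings = zscore(np.asarray(ratings, dtype=float))
    group = np.asarray(is_chinese, dtype=float)[:, None]
    design = np.hstack([np.ones_like(group), z_pcs, group, group * z_pcs])
    coef, *_ = np.linalg.lstsq(design, z_ratings, rcond=None)
    k = z_pcs.shape[1]
    return {
        "intercept": coef[0],
        "pc_slopes_western": coef[1:k + 1],    # slopes in the reference group
        "group_shift": coef[k + 1],            # difference in intercept
        "slope_differences": coef[k + 2:],     # Chinese minus Western slopes
    }


# Synthetic, noise-free data in which only the first PC slope differs
# between groups (illustration only).
rng = np.random.default_rng(1)
n = 200
fake_pcs = rng.normal(size=(n, 2))
fake_group = np.arange(n) % 2
fake_ratings = (
    fake_pcs @ np.array([0.5, -0.2])
    + fake_group * (fake_pcs @ np.array([0.3, 0.0]))
)

fit = fit_group_interaction(fake_pcs, fake_ratings, fake_group)
chinese_slopes = fit["pc_slopes_western"] + fit["slope_differences"]
print(fit["pc_slopes_western"], fit["slope_differences"], chinese_slopes)
```

Coding the Western group as the reference means the main-effect coefficients describe Western listeners directly, and the Chinese slopes are recovered by adding the interaction terms.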

Figure 4. Error Bar Graphs of Fixed Effect Coefficients Estimated for the Acoustical PCs, Separated by Cultural Group, Rating Types, and Culture of the Music.
It can be seen that there was a general difference across different PCs, between the Chinese and Western groups, and between Chinese and Western music, though some similarities were also shown. These analyses focused on the group difference in the PCs with particularly strong responses, the response/effect size, and the number of PCs associated with emotion recognition. Overall, in Chinese music, the Chinese group generally responded more strongly to PC2, while both cultural groups generally responded more strongly to PC4 in Western music. In addition, the Chinese group seemed to respond to PCs more strongly overall than the Western group. Furthermore, the Chinese group seemed to generally respond to more psychoacoustic PCs than the Western group.
Specifically, when rating for happiness, both cultural groups responded to all PCs in both Chinese and Western music, except for PC6 for the Western group. Both cultural groups responded more strongly to PC2 in Chinese music, and PC4 in Western music. Responses were generally stronger for the Chinese group, as shown in PC1 to PC4 in Chinese music, and PC3 to PC5 in Western music.
When rating for sadness, both cultural groups responded to all PCs in both Chinese and Western music, except for the Chinese group to PC6 in Chinese music, and both cultural groups to PC3 in Western music. Both cultural groups responded more strongly to PC2 in Chinese music and, in Western music, particularly strongly to PC6, with similarly moderate responses to the other PCs except PC3. The asterisks indicated that the responses were generally stronger for the Chinese group, as shown in PC1 to PC4 in Chinese music, and PC4 to PC5 in Western music, and there were more group differences in Chinese music (four) than in Western music (two).
When rating for peacefulness, both cultural groups responded to all PCs in both Chinese and Western music, except for PC2 and PC3 in Chinese music, and PC3 in Western music. In Chinese music, the Chinese group responded more strongly to PC2, while in Western music, both cultural groups responded more strongly to PC4. The asterisks indicated that the responses were generally stronger for the Chinese group, as shown in PC1 to PC3, and PC6 in Chinese music, and PC2 and PC4 in Western music, and there were more group differences in Chinese music (five) than in Western music (three).
When rating for anger, both cultural groups responded to all PCs in both Chinese and Western music, except for the Western group to PC1 and PC2 in Chinese music, and the Chinese group to PC6 in Western music. In Chinese music, the Chinese group responded more strongly to PC2, while in Western music, both cultural groups responded relatively more strongly to PC4. The asterisks indicated that the responses were generally stronger for the Chinese group, as shown in PC1 to PC4, and PC6 in Chinese music, and PC3 and PC4 in Western music, and there were more group differences in Chinese music (five) than in Western music (three).
When rating for fear, the Chinese group responded to all PCs in both Chinese and Western music, except for PC4 in Chinese music. By contrast, the Western group only responded to PC4 to PC6 in Chinese music, while in Western music, the Western group responded to all PCs, except for PC2. The Chinese group responded slightly more strongly to PC2 and PC3 in Chinese music, while in Western music, both cultural groups responded relatively more strongly to PC1 and PC4. The asterisks indicated that the responses were generally stronger for the Chinese group, as shown in PC1 to PC3, PC5 and PC6 in Chinese music, and there were more group differences in Chinese music (four) than in Western music (zero).
Discussion
The first question to be addressed in this study was whether the in-group advantage of cross-cultural music emotion recognition can be confirmed between Chinese and Western cultures. Through a series of mixed ANOVAs and pairwise comparisons, we found both similarities and differences in emotion recognition between the Chinese and Western participants. The results showed that the highest-rated emotions by both groups were all consistent with the emotion targeted in the music, though there might be some crossover between anger and fear. This may be because anger and fear both are considered to have negative valence and high arousal (Russell, 1980), both are expressed with very similar psychoacoustic features, such as roughness, which is related to ‘the perceptual quality of buzz, raspiness, or harshness’ and ‘dissonance’ (Coutinho & Dibben, 2013, p. 667), and both are often represented by minor harmonic progressions and varied rhythms (Hailstone et al., 2009). In general, both cultural groups were able to identify musical emotions within and across cultures, which was in line with previous studies (e.g., Balkwill et al., 2004; Balkwill & Thompson, 1999; Fritz et al., 2009). There were no significant group differences in the recognition of peacefulness and anger in either Chinese or Western music. In both Chinese and Western music, the Chinese participants seemed to be relatively more sensitive to the recognition of happiness and sadness, while the Western participants were more sensitive to the recognition of fear. Moreover, the Western group rated fear higher in the target music in both Chinese and Western music, whereas the Chinese group rated generally lower fear than the Western group across all the emotional music sets. This suggests that the Chinese may be more conservative in their judgements of fear compared to Westerners. This cultural difference reflected the statements made by the Chinese traditional music experts in the pilot studies. 
The experts stated that ‘there appears to be no fearful music in Chinese traditional music’ or that ‘it is difficult to categorise so-called fearful music’, which is why no fearful music from Chinese culture was included in this study. Overall, the above findings indicated that the in-group advantage found in previous studies (Argstatter, 2016; Zacharopoulou & Kyriakidou, 2009) was not well established between Chinese and Western contexts in this study. Instead, the findings suggested a cultural advantage only for particular emotions in music. Moreover, it is worth noting that this advantage does not appear to be driven by familiarity: the initial analysis showed that both cultural groups were generally more familiar with music excerpts from their own culture than with those from the other culture, yet the advantage did not follow the culture of the music. However, whether there is an in-group advantage or cultural advantage in recognising musically expressed emotions remains inconclusive, necessitating further studies across a wider range of cultures and emotions.
Musical background seems to influence the effect of cultural background on emotion recognition, but only for certain emotions. For instance, in both Chinese and Western music, the Chinese participants were less sensitive to the recognition of fear than the Western participants only among non-musicians, not among musicians. These results suggest that Western culture may have an advantage in the recognition of fear, but musicianship could counteract it. The overall findings suggest that musicianship may confer an advantage in recognising emotions that are often confused with others, or are not easily identifiable. There was little gender difference shown in this study, though cultural background seemed to influence the recognition of happiness differently between males and females. The findings of previous studies on the effects of musical training or gender on music emotion recognition remain mixed. Some indicated better performance for those with more years of musical training (Lima & Castro, 2011) and for female listeners (Gabrielsson & Juslin, 1996), while some reported no effects of musical training (Nineuil et al., 2021) or gender (Gregory & Varney, 1996; Shen et al., 2018) on emotional judgements in music. This study’s findings align more closely with Argstatter’s (2016) study, which also observed that musicians exhibited a slight but significant advantage in judging musical emotions compared to non-musicians. However, this advantage was limited to specific music excerpts and may not generalise to all cases.
For the overall cultural differences observed in recognised emotions, it is worth considering individual differences (such as personality traits) as one of the causes (Juslin et al., 2016). Researchers in the field of culture and personality have noted differences in fundamental values between individualist cultures, which emphasise self, personal goals, and achievements, and collectivist cultures, which prioritise social harmony and group interests over the individual (Hofstede & McCrae, 2004; Triandis, 2001; Triandis & Gelfand, 1998), which may be associated with variations in certain personality traits across cultures. Previous research on personality profiles across cultures showed that Europeans and Americans scored higher in Extraversion and Openness to Experience, and lower in Agreeableness compared to Asians and Africans (Allik & McCrae, 2004). It has been argued that individuals with a higher score in Openness to Experience were more sensitive and tended to experience strong feelings towards art and beauty (McCrae, 2007), which can be extended to the field of music, where listeners with higher scores in Openness to Experience were also found to experience emotions more intensely than those with lower scores (Liljeström et al., 2012). This seems to partly explain the higher sensitivity to the recognition of fear in the Western participants in this study. In research on the perception of musical emotions, Vuoskoski and Eerola (2011) indicated that personality traits were strongly linked to preferences for music expressing different emotions. For instance, congruent with the definition of a prosocial trait that reflects cooperation and social harmony (Graziano & Eisenberg, 1997), agreeableness was found closely related to liking for happy and tender music, while disliking for angry and fearful music. 
These trait-congruent associations seem to be in line with the findings in this study: Chinese listeners were more sensitive to the recognition of happiness and sadness (aesthetic enjoyment in music) but less sensitive to fear compared to Western listeners. However, the possible correlations between personality traits and emotion recognition, even from a cross-cultural perspective, need further investigation in future research.
The second question in this study was how psychoacoustic cues correlate with music emotion recognition among listeners from different cultures. Results of the hierarchical linear regressions indicated that psychoacoustic features were associated with musical emotion recognition, though the association varied across cultures and types of emotion ratings. In general, when listening to Chinese music, the Chinese group seemed to respond more strongly to timbre/loudness [PC2]. By contrast, when listening to Western music, both cultural groups showed relatively strong responses to mode [PC4]. This reflects differences in structural and psychoacoustic characteristics between Chinese and Western music: the former emphasises the psychoacoustic attributes of timbral roughness and sound intensity, while the latter emphasises the mode system. This, in turn, implies that psychoacoustic features differ in their importance for predicting emotion recognition in Chinese and Western music. In a recent cross-cultural study conducted by Wang et al. (2022), five musical elements (timbre, pitch, rhythm, loudness, and mel-frequency cepstral coefficients (MFCC)) were examined for emotion recognition in Western and Chinese classical music, employing the Valence-Arousal model. The study identified pitch as the predominant factor in emotion recognition for Western classical music. Conversely, in Chinese music, all musical elements exhibited relatively equal importance, with loudness and rhythm playing more significant roles than in Western classical music. The researchers hypothesised that this discrepancy could be attributed to the flexible nature of Chinese music, which lacks the rigid system of fixed rhythm and speed found in Western music. This flexibility, described as ‘dynamic fluctuation’ (Wang et al., 2022, p. 14), contributes to the heightened influence of rhythm and loudness on the emotional perception of Chinese classical music.
Cultural differences were also shown in the degree and number of psychoacoustic features correlated with emotion recognition. Chinese participants generally responded more strongly, and to more psychoacoustic features, than Western participants in both Chinese and Western music, as indicated by the greater number of significant fixed-effect coefficients, and their larger absolute values, for the Chinese group. For example, in the ratings for anger in Chinese music, all six psychoacoustic components were significant predictors for Chinese listeners, while for Western listeners, only four psychoacoustic components were significant, and their effects were relatively small. This is similar to findings in previous research: when rating anger, perceived complexity, tempo, and intensity significantly influenced the judgements of Japanese listeners, while for Canadian listeners, only perceived intensity had a significant association with their judgements (Balkwill, 2006; Thompson & Balkwill, 2010). This phenomenon was hypothesised to be related to the attentional focus or cognitive styles of listeners from different cultural backgrounds (Thompson & Balkwill, 2010), although further research is needed.
Results on psychoacoustic features also highlighted greater cultural differences in responses to Chinese music than to Western music. Western music, guided by a precise system, allows for clear intentions and direct expressions, whereas Chinese music, characterised by sensual expressions, prioritises symbolic abstractions (Lin, 2010). From musicologists’ perspective, this difference may be attributed to distinct national characters and compositional traditions. However, the heightened group differences in responses to Chinese traditional music may primarily arise from its less widespread exposure across cultures compared to Western classical music, posing challenges for non-Chinese listeners in grasping the emotional content of this unfamiliar genre.
Limitations
This study was conducted only within Chinese and Western cultural contexts; the in-group advantage and associated psychoacoustic features should be further investigated across a wider range of cultures, particularly those that are rarely studied in this field. This study was limited to five discrete emotions and six categories of psychoacoustic features. Further exploration is warranted for other emotions, particularly those deemed complex (e.g., nostalgia), and other psychoacoustic cues (e.g., melodic and rhythmic complexity). Furthermore, there were only 18 musical excerpts tested, selected by a small number of professionals, which might not be representative of all Chinese traditional and Western classical music.
Conclusion
The in-group advantage of cross-cultural emotion recognition in music could not be confirmed in this study. Instead, in both Chinese and Western music, cultural advantages were found in the recognition of happiness and sadness for the Chinese group and in the recognition of fear for the Western group, although more studies are needed. Musicianship may affect the influence of cultural background on recognising certain emotions, such as fear. Gender showed little effect on music emotion recognition in this study. Psychoacoustic features correlated with listeners’ emotion recognition in music differently across cultures. The in-group advantage in cross-cultural music emotion recognition and the varied associations of psychoacoustic cues across different emotions and cultural contexts need further investigation. Future research should also take into account individual differences, such as personality traits and cognitive styles, as well as historical, sociocultural, and ethnographic factors, which we believe can lead to more comprehensive insights into the intricate issue of cross-cultural emotion recognition in music.
Supplemental Material
Supplemental material, sj-pdf-1-pom-10.1177_03057356241303857 for A cross-cultural study in Chinese and Western music: Cultural advantages in recognition of emotions by Menglan Lyu and Hauke Egermann in Psychology of Music
Footnotes
Acknowledgements
The authors would like to thank the experts and participants who dedicated their time to this study, as well as the colleagues and friends who kindly assisted in distributing the recruitment information.
Author contributions
M.L. and H.E. contributed to the study design, obtained ethical approval, and recruited participants. M.L. conducted the initial analysis and wrote the first draft of the article, which was revised following the feedback by H.E. Both authors reviewed and approved the final version of the article.
Funding
This article was published as open access, with the cost covered by the University of York Library agreement.
Supplemental material
Supplemental material for this article is available online.