Abstract
Processing fluency has been shown to affect how people aesthetically evaluate stimuli. While this effect is well documented for visual stimuli, the evidence accumulated for auditory stimuli has not yet been integrated. Our aim was to examine the relevant research on how processing fluency affects the aesthetic appreciation of auditory stimuli and to identify the extant knowledge gaps in this body of evidence. This scoping review of 19 studies reported across 13 articles found that, similarly to visual stimuli, fluency has a positive effect on liking of auditory stimuli. Additionally, we identified certain elements that impede the generalizability of the current research on the relationship between fluency and aesthetic reactions to auditory stimuli, such as a lack of consistency in the number of repeated exposures, the tendency to omit the affective component and the failure to account for personal variables such as musical abilities developed through musical training or the participants' personality or preferences. These results offer a starting point in developing novel and proper processing fluency manipulations of auditory stimuli and suggest several avenues for future research aiming to clarify the impact and importance of processing fluency and disfluency in this domain.
Introduction
Processing fluency is a subjective feeling of ease or difficulty associated with any type of mental processing (Graf et al., 2018). Its effect on how individuals judge stimuli has been observed across a range of judgements, from judgements of truth (Reber & Schwarz, 1999; Reber & Unkelbach, 2010; Unkelbach, 2007) and judgements of learning (Jia et al., 2016; Undorf & Erdfelder, 2013) to judgements of familiarity (Kurilla & Westerman, 2008; D. L. Westerman et al., 2015). While its implications can be observed in the area of consumer psychology (Schwarz, 2004), where processing fluency affects brand evaluation (Lee & Labroo, 2004) and the preference for product packaging (S. J. Westerman et al., 2013), another area influenced by processing fluency is that of aesthetics. Seeking to understand the dynamics of the aesthetic experience, Reber et al. (2004) argue that increases in how fluently an individual processes an object result in a more positive appraisal of that object. This effect was observed for visual art (Belke et al., 2010; Mayer & Landwehr, 2018; Rui & Xiangping, 2011), musical lyrics (Melvill-Smith et al., 2023; Nunes et al., 2015) and artistic photographs (Vissers & Wagemans, 2021), to name a few.
The term “processing fluency” can be thought of as an umbrella term for all phenomena related to the subjective ease of information processing (Oppenheimer, 2008), with different sources producing different types of fluency, such as perceptual, conceptual or linguistic fluency, to name a few (Alter & Oppenheimer, 2009). Fluency can be achieved through multiple means, such as altering visual characteristics specific to the stimulus, e.g., symmetry, contrast, simplicity or self-similarity (Mayer & Landwehr, 2018). Furthermore, processing fluency can be achieved by exposing the stimuli multiple times, which creates the mere-exposure effect and thus fosters preferences (Zajonc, 1968).
Processing Fluency and Aesthetic Reactions of Auditory Stimuli
The effects of processing fluency on aesthetic evaluations have been predominantly studied using visual stimuli, ranging from patterns (Jacobsen & Höfel, 2002) and line shapes of objects (Forster et al., 2015) to representational, cubist and abstract paintings (Belke et al., 2010), computer generated artworks (Graf & Landwehr, 2017) and other instances of visual stimuli such as portraits (Belke et al., 2015) or photographs (Vissers & Wagemans, 2021). Previous research attests that increases in the processing fluency of visual stimuli lead to increases in liking (D. L. Westerman et al., 2015; Winkielman, Schwarz, Reber et al., 2003), an idea also supported by early meta-analyses of the mere-exposure effect (for example, Bornstein, 1989). However, this conclusion cannot be readily extended to auditory stimuli, due to important differences between visual and auditory stimuli that are relevant to the topic at hand, as well as to empirical findings suggesting differences in the effects of fluency on the aesthetic reactions to these different types of stimuli.
Firstly, many auditory materials consist of a mix of linguistic and musical components, with a lack of specificity for musical stimuli in particular (Nieminen et al., 2012). While many parallels can be drawn between language and music, it is important and necessary to study them independently of each other (Jackendoff, 2009). However, it is difficult to extract only musical stimuli from the pool of auditory stimuli given the difficulty in defining what music or art is or is not (Pelowski et al., 2017). As such, for simplicity’s sake, throughout this paper we will refer to all non-linguistic auditory stimuli that convey musical elements simply as “auditory stimuli”, regardless of whether they consist of musical fragments, tone sequences or simple melodic lines.
Also, there is a multitude of global characteristics of auditory stimuli that differ from those of visual stimuli, which significantly impact how individuals experience music (Jacobsen & Beudt, 2017). While aesthetic preferences for visual art can be affected by visual simplicity, symmetry, contrast and self-similarity (Mayer & Landwehr, 2018), aesthetic preferences for music might depend on characteristics such as the distribution of spectral energy (frequency, space and time), musical texture, expressivity, tempo and mode (P. Brattico et al., 2017). In addition, these can be affected by how the listener perceives aspects such as speed, rhythmic clarity, rhythmic complexity, articulation, dynamics, modality, overall pitch, harmonic complexity, brightness, energy or valence (Friberg et al., 2014) or consonance and dissonance (Di Stefano et al., 2022; Seror & Neill, 2015). All these specific features of auditory stimuli may further lead to different effects of fluency on aesthetic reactions in comparison to those documented using visual materials.
Secondly, previous findings suggest that while liking for auditory stimuli might increase when they are processed fluently due to increases in familiarity, these effects are not as impactful as for visual materials (Montoya et al., 2017). Past research also suggests that affective ratings get amplified with fluency, meaning that stimuli that are already disliked get to be disliked more (Albrecht & Carbon, 2014; Landwehr & Eckmann, 2020), contrary to early propositions that processing fluency is hedonically marked in a way that repeated exposures lead to increased liking regardless of valence (Winkielman et al., 2003a). However, this might not be true for music given how at times, listening to sad music, which should elicit a negative state, results in feelings of pleasure instead (Sachs et al., 2015; Vuoskoski et al., 2012). Also, the idea that stimuli low on complexity (and thus, high on fluency; Reber, Schwarz, et al., 2004) are preferred the most is supported only partially by research on auditory stimuli, as in some instances individuals tend to prefer stimuli that are complex and not simple. For example, Delplanque et al. (2019) found that their participants preferred tone sequences with intermediate complexity over those of high or low complexity. Ball et al. (2018) also found that higher complexity stimuli engage the perceiver in a more deliberate process than stimuli with lower complexity.
The Importance of Emotion
Music is closely related to emotion (Juslin, 2013; Juslin et al., 2008; Juslin & Sloboda, 2013), and people base their musical choices mostly on how the music affects them emotionally (Juslin & Isaksson, 2014). Some authors think of emotional states as intrinsically important to the formation of aesthetic judgement and not as a different process that runs in parallel (Egermann & Reuben, 2020; Xenakis et al., 2012), as it was proposed by earlier models of aesthetic judgement formation (Leder et al., 2004). However, some authors evaluate the causal relation between music and emotion as weaker than it is currently presented in the literature (Konečni, 2008). In addition, while music is able to induce basic emotions, more genuine forms of emotion such as being moved, aesthetic awe and thrills are more difficult to examine (Konečni, 2005). Lastly, the emotions people feel when listening to music differ in frequency compared to emotions felt when viewing a painting, even though they might all stem from the same everyday emotions. In this sense, while wonder is felt more when viewing a painting, listening to music makes listeners feel tenderness, nostalgia, peacefulness, or sadness (Miu et al., 2016). In this regard, the specific emotional value of stimuli used in studying processing fluency may cause differences in aesthetic judgment. For instance, in some cases, the fluent processing of negative or disliked stimuli leads to an amplification of the already existing feeling (Albrecht & Carbon, 2014; Meskin et al., 2013).
Individual Factors of Aesthetic Judgement
As argued previously, aesthetic preferences are dependent to some degree on stimuli characteristics. However, it is similarly important to consider how people subjectively perceive and interpret these characteristics. There are a multitude of individual factors at play that affect how people appreciate art, such as different thresholds in subjective complexity (Güçlütürk et al., 2016; Marin & Leder, 2013; North & Hargreaves, 1995), expertise (Orr & Ohlsson, 2005), musical training (Vuvan et al., 2020), aesthetic sensitivity (Clemente et al., 2020) or differences in perceived musical complexity based on musical style (Orr & Ohlsson, 2001). Reber, Schwarz, et al. (2004) also mention how art novices and experts differ in their aesthetic preferences. For music in particular, experts were found to prefer atypical musical progressions over prototypical ones, whereas novices and even music undergraduate students show the opposite effect (Smith & Melara, 1990). Also, experts present signs of mentally preparing themselves for an aesthetic judgement of chord progressions (Müller et al., 2010). People with different musical backgrounds, such as jazz, show different preference patterns for chord progressions that vary in expectancy, compared to those coming from a classical music background or non-musicians. Specifically, jazz improvisers tend to enjoy unexpected (and thus, disfluent) musical events more due to their increased perceptual sensitivity (Przysinda et al., 2017).
The Present Study
The present study seeks to scope the published scientific literature in order to examine how processing fluency affects the aesthetic appreciation of auditory stimuli and to identify the extant knowledge gaps in this body of evidence. As indicated above, whether the effect of increased liking due to increased processing fluency that was documented in the visual domain extends to auditory stimuli is still an open question, especially due to certain specificities of these stimuli. Therefore, our first aim is to review the existing empirical studies that have examined the effect of the processing fluency of auditory stimuli on aesthetic reactions to these stimuli, in order to synthesize the current body of evidence concerning this effect. Besides this general level of analysis, we also aim to develop a more in-depth understanding of this topic by systematizing the past research on a set of dimensions that may be, as suggested above, highly relevant for the effects of fluency on aesthetic reactions to auditory stimuli. To this aim, we scope the previous studies by examining the types of auditory stimuli employed, the operationalizations of processing fluency used, the specific aesthetic reactions investigated, the emotions experienced by the perceivers, and their individual characteristics. Specifically, our review also aims to answer the following secondary questions:
• What aesthetic reactions have been studied the most as effects of the processing fluency of auditory stimuli?
• How was the processing fluency of auditory stimuli manipulated (in experimental studies) or measured (in correlational studies)?
• Which types of auditory stimuli have been used to study these aesthetic reactions?
• To what degree was the emotional component of the auditory stimuli taken into account in the past relevant research?
• Which individual characteristics were found to affect the relation between processing fluency of auditory stimuli and the aesthetic reactions to them?
Method
We searched for relevant articles in the Web of Science, APA PsycNet and Scopus databases, using the following terms: (“fluency” OR “processing fluency” OR “perceptual fluency” OR “conceptual fluency”) AND (“music” OR “tune” OR “sound” OR “noise” OR “auditory”) AND (“aesthetic” OR “aesthetics” OR “liking” OR “appreciation” OR “pleasure” OR “preference” OR “enjoyment” OR “attitude” OR “emotion” OR “emotions” OR “reaction” OR “reactions”) AND (“experiment” OR “experimental” OR “correlation” OR “correlational” OR “association”). This search was done in accordance with the guidelines provided by PRISMA (Page et al., 2021). We were interested in English articles only, regardless of publication date. Figure 1 presents the process of determining which articles to include in this review. A total of 259 records were identified, out of which 49 were duplicates. After deduplication, we were left with 210 records, which we screened for eligibility by reading their titles and abstracts. Given that we were interested in articles that studied the relationship between processing fluency and aesthetic reactions to auditory stimuli, articles that were irrelevant to this relationship were discarded. We ended up with a total of 13 articles that satisfied our inclusion criteria. Some articles present multiple studies, only some of which are relevant to our review. The total number of relevant studies in the articles retained for review is 19.
Figure 1. PRISMA (Page et al., 2021) article selection flow diagram.
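The search string and record counts reported in this section can be reproduced with a short script. The following is a minimal sketch, not the procedure actually used for the review: the function names build_query and screening_counts are hypothetical, the quotation marks are plain ASCII rather than any database-specific syntax, and the number of excluded records is derived here by subtraction (it is not reported in the text and lumps together all exclusions after deduplication).

```python
# Term groups copied from the search description above.
# Terms are joined with OR inside a group and groups are joined with AND.
TERM_GROUPS = [
    ["fluency", "processing fluency", "perceptual fluency", "conceptual fluency"],
    ["music", "tune", "sound", "noise", "auditory"],
    ["aesthetic", "aesthetics", "liking", "appreciation", "pleasure",
     "preference", "enjoyment", "attitude", "emotion", "emotions",
     "reaction", "reactions"],
    ["experiment", "experimental", "correlation", "correlational",
     "association"],
]


def build_query(groups):
    """Return a Boolean search string (hypothetical helper)."""
    blocks = []
    for terms in groups:
        quoted = " OR ".join(f'"{term}"' for term in terms)
        blocks.append(f"({quoted})")
    return " AND ".join(blocks)


def screening_counts(identified, duplicates, included):
    """Derive the screening figures from the reported counts.

    'excluded' is an assumption: it is computed by subtraction and
    is not a number stated in the review.
    """
    screened = identified - duplicates
    excluded = screened - included
    return {"screened": screened, "excluded": excluded, "included": included}


query = build_query(TERM_GROUPS)
counts = screening_counts(identified=259, duplicates=49, included=13)

print(query)
print(counts)
```

Keeping the term groups in a list makes it straightforward to adapt the same query to each database's own field codes, which would need to be added separately.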
Results
Relevant Information Extracted From Each Article Included in the Review.
The Effect of Processing Fluency On Aesthetic Reactions to Auditory Stimuli
We begin by reviewing the aesthetic reactions that have been investigated in relation to processing fluency. As such, we observe that 17 out of the 19 studies studied liking as aesthetic reaction, while only the two studies reported by Nunes et al. (2015) examined novelty. In addition to liking, Omigie et al. (2021) studied additional variables which include tension, energy, pleasantness, interest, empathy and beauty, while Witvliet and Vrana (2007) studied pleasantness. Looking at liking in particular, we observe how in most cases, it was found to increase with fluency. For example, Anand and Sternthal (1991), Wang and Chang (2004), Peretz et al. (1998) and Mungan et al. (2019) showed that stimuli were liked more when they were more familiar and participants recognized the stimuli from previous exposures. The only instance in which liking did not increase with fluency is that presented by Felisberti (2021), where liking for music clips with vocals and without vocals did not significantly increase after three exposures, neither in a clinical sample of individuals suffering from dementia nor in a control sample of neurotypical individuals.
While this effect is observable from the first repetition of the auditory stimuli (Anand & Sternthal, 1991; Topolinski & Strack, 2009), it is not a linear increase in all cases. Liking seems to keep increasing for 6 or 8 repetitions (Mungan et al., 2019), then it falls off at higher repetitions (Schellenberg et al., 2008; Witvliet & Vrana, 2007). This is in accordance with the inverted U curve of aesthetic liking (Chmiel & Schubert, 2017; Güçlütürk & van Lier, 2019), where liking can be plotted as an inverted U curve as a function of a collative variable (Berlyne, 1970). A more nuanced approach comes from Schellenberg et al. (2008), who found that the liking ratings of those who listened to the musical excerpts in a focused manner formed an inverted U-curve as a function of exposure, while the liking ratings of those who listened to the musical fragments in an incidental manner increased monotonically. Similar results were reported by Szpunar et al. (2004), who found the same effect for orchestral music, and by Madison and Schiölde (2017), who found a steady increase in liking as a function of familiarity, regardless of complexity, in the case of musical stimuli listened to in an ecologically valid context. Given that, on a day-to-day basis, people listen to music to distract themselves while doing other activities such as driving, working and exercising (Mehl & Pennebaker, 2003; Sloboda et al., 2001; Volokhin & Agichtein, 2018), and that Madison and Schiölde (2017) sought to preserve ecological validity by having their participants listen to the stimuli at their own pace, we can infer that incidental listening is the prevalent way of engaging with auditory stimuli in an ecological context.
Lastly, while liking increased with fluency for all kinds of stimuli, some types of stimuli received higher liking ratings than others. For example, liking scores were greater for music clips with vocals compared to instrumental music clips (Felisberti, 2021), for musical excerpts that were happy-sounding compared to those that were sad-sounding (Schellenberg et al., 2008), for excerpts of a specific music genre such as classical music instead of Russian music (Wang & Chang, 2004), for melodies played on a certain instrument such as the flute instead of being simply hummed (Topolinski & Strack, 2009) and for melodies articulated in a specific manner, such as legato instead of staccato (Carr et al., 2023).
Looking at what other subjective dimensions correlate with liking and fluency, we note that liking is highly correlated with beauty, while both present a moderate positive association with energy (Omigie et al., 2021). Additionally, the same authors report that beauty and liking tend to be associated with positive valence more and that respiration rate and zygomaticus activity, which are used to measure fluency, were greater for low tension-low energy musical passages. While liking is positively associated with pleasantness, it is also negatively associated with tension (Day & Thompson, 2019). In this sense, positive music prompts more pleasantness than negative music (Witvliet & Vrana, 2007) and musical passages that were low on tension but high on energy prompted the greatest liking (Omigie et al., 2021).
Processing Fluency Type
Given that processing fluency presents itself in multiple forms (Alter & Oppenheimer, 2009), we examined the prevalence of the different types of fluency in relation to aesthetic reactions to auditory stimuli. If no specific type was mentioned in the articles, we considered it to be general fluency. In this sense, 6 articles reported studying processing fluency but did not mention whether it was perceptual fluency, conceptual fluency or any other kind of fluency. Perceptual fluency was present in 6 articles and a single article reported studying linguistic fluency. The inclusion of linguistic fluency in this review might seem beyond its scope, as we focus on non-linguistic auditory stimuli. We decided to include the article by Nunes et al. (2015) because music can present itself both in instrumental versions and in versions with lyrics. Their results show that processing fluency increases with lexical repetition and that ratings of novelty decrease as fluency increases. As such, music with lyrics might create multiple streams of fluency. For example, lyrics may offer an additional source of familiarity, which in turn affects judgments of knowing (Rabinovitz & Peynircioğlu, 2011). In an experimental context, auditory stimuli are stripped down to those characteristics that are the most important to the objective of the study in order to not contaminate the results. In this sense, it is easier to study the impact of instrumental music than that of music that contains lyrics, given that lyrics could impact the participants’ affective state by enhancing negative emotions such as sadness and anger (Ali & Peynircioğlu, 2006; Barradas & Sakka, 2022) or by making music with lyrics seem happier than instrumental music (E. Brattico et al., 2011). However, in doing so, we remove the stimuli from their real-life counterparts and we create a vacuum in which we lose ecological validity, given that music with lyrics is more popular than instrumental music (North et al., 2021).
Processing Fluency Manipulation
Most studies (9 out of 13) manipulated fluency by repeatedly exposing participants to the auditory stimuli. The number of exposures ranged from 2 exposures to as many as 32 exposures, with some studies using 3, 6, or 8 exposures in some of their experimental groups. For example, Anand and Sternthal (1991), as well as Topolinski and Strack (2009) and Peretz et al. (1998), presented their participants with the auditory stimuli only 2 times, once during an exposure phase where participants had to listen to or study the stimuli and once during an experimental phase. A similar approach was taken by Nunes et al. (2015), who presented their auditory stimuli in a low repetition form, where the chorus repeats once, and in a high repetition form, where the chorus repeats twice. The greatest number of exposures was used by Schellenberg et al. (2008), who presented happy or sad musical excerpts to their participants 2, 8 and 32 times. It is important to note that all studies except that of Nunes et al. (2015) used between-stimuli repetition, meaning that they presented the same stimuli multiple times. However, in the context of a song, repetition may be present internally as well, for example by repeating the same musical phrase at different times throughout the song. The spacing between repetitions is important as well, given that within-phrase repetitions are easier to notice and recognize than between-phrase repetitions (Margulis, 2013). Among the studies that did not use repeated exposures, one article manipulated complexity (Day & Thompson, 2019). However, it does not explicitly mention how this was achieved; instead, it suggests that a form of objective complexity derived from the songs themselves was used.
Lastly, other processing fluency manipulations include musical articulation (Carr et al., 2023) and efficiency of stream segregation (Kowalewski et al., 2019). Musical articulation can be thought of as an expressive way of playing the same note or the same musical phrase, which can be used, for example, to convey specific emotions during music performance (Juslin, 2000). Carr et al. (2023) manipulated processing fluency through musical articulation by using the legato (long and sustained notes) and staccato (short and accentuated notes) articulations. In this case, higher perceptual fluency is represented by legato articulation because legato notes are perceived to form a unitary melody, while staccato notes are perceived as distinct notes. Kowalewski et al. (2019) manipulated processing fluency on the basis of the efficiency of stream segregation. This refers to the way a song is segregated by the brain into different audio streams based on, for example, pitch or timbre, and to how efficient segregation can be attributed to increased fluency, based on the Source Dilemma Hypothesis (SDH) proposed by Bonin et al. (2016). For example, the authors used polyphonic melodies where the instrument playing the higher note was changed from a piano to either a trumpet or a xylophone. Additionally, the melodies were presented in either a low harmonicity form or a high harmonicity form. In this sense, pleasure and displeasure for the stimuli would be influenced by the degree of uncertainty (or perceptual clarity) created by how easy it is for the participants to perceptually separate the notes based on the instrument they are played on, as well as by how consonant or dissonant they sound (Bonin et al., 2016).
Auditory Stimuli
It is difficult to say for certain what sequence of sounds or notes can be considered music or not. What differentiates, let’s say, a melodic line in a song from a melodic line specifically created to be used in an experiment? Given this uncertainty, auditory stimuli used in the articles included in this review can be classified as either artificial (specifically created to be used in an experimental context) or naturalistic (already existing songs or melodies). Eight articles used only artificial stimuli and four articles used only naturalistic stimuli, while a single article (Nunes et al., 2015) used both types of stimuli for different experiments. In general, artificial stimuli consisted of note sequences or simple melodic lines, with the vast majority of investigations using this type of stimuli (Anand & Sternthal, 1991; Carr et al., 2023; Mungan et al., 2019; Peretz et al., 1998; Schellenberg et al., 2008; Topolinski & Strack, 2009; Wang & Chang, 2004). Other studies used polyphonic melodies instead of melodies consisting of one note (Kowalewski et al., 2019) or melodies that had their choruses altered (Nunes et al., 2015). On the other hand, naturalistic stimuli consisted of avant-garde audio clips (Felisberti, 2021), instrumental songs (Witvliet & Vrana, 2007), musical excerpts (Day & Thompson, 2019) and musical passages selected by the participants themselves from songs that they frequently listen to (Omigie et al., 2021). One distinction to note between naturalistic and artificial stimuli is that naturalistic stimuli were longer in duration, with some being as long as 30 or 40 seconds. In comparison, the duration of artificial stimuli ranged from 7 to about 10 seconds.
Affective Component
Our analysis revealed 5 articles that consider the emotional aspect, out of which 3 studies reported greater liking for the musical stimuli that induce a positive emotion, such as happiness (Schellenberg et al., 2008) or for musical stimuli that can be characterized by aspects specific to positive emotions, such as low tension and high energy (Omigie et al., 2021) or high arousal (Witvliet & Vrana, 2007). In addition, beauty and liking tend to be associated with positive valence more (Omigie et al., 2021) and liking is strongly correlated with pleasantness and processing fluency (Day & Thompson, 2019; Experiment 3). When it comes to perceived emotion, Carr et al. (2023) identified that musical articulation plays a role in helping differentiate between emotions. In this sense, legato melodies were perceived as calmer and sadder than staccato melodies. On the other hand, staccato melodies were perceived as more energetic and tense. Another study (Kowalewski et al., 2019) identified the affective connotations of specific instruments. For instance, some might carry a positive emotional connotation, such as the trumpet and the xylophone, making them sound particularly happy, at least in comparison to the piano.
Individual Characteristics
The studies reported in 5 articles included in our review omitted to mention whether or not their participants had any musical training. In those that did mention this characteristic, researchers collected this information through self-report from participants. Also, some papers report more detail than others regarding the type of musical training, its duration or the number of instruments participants practiced. For example, Schellenberg et al. (2008) reported that their participants had, on average, around 1.9 years of private music lessons (it is worth noting that the distribution was positively skewed), while Carr et al. (2023) reported that their participants had, on average, 3.53 years of musical training. In some instances, musicians and nonmusicians tend to be put in the same category based on their musical experience, with Mungan et al. (2019) reporting that their participants were all nonmusicians with less than 1 year of musical experience and Wang and Chang (2004) reporting that their participants had no or less than 3 years of learning an instrument. Specifics about training in musical theory or musical instruments were provided by Kowalewski et al. (2019), where 31% of their participants reported having at least some formal training in music theory while 67% reported at least some formal training on a musical instrument. Day and Thompson (2019), on the other hand, reported that out of 50 participants, 8 indicated more than 4 years of musical training on an instrument and 6 indicated having received musical training on two or more instruments. Lastly, Omigie et al. (2021) reported that 56% of their participants have regularly played an instrument for more than 4 years, 35% had undergone 6 or more years of lessons, 24% had undergone less than 1 year of music lessons and 26% reported never having played an instrument.
Discussion
Key Results on the Five Dimensions of the Studies Reviewed.
Generally, the studies reviewed indicate that processing fluency does indeed have an effect on the aesthetic reactions to auditory stimuli, paralleling the effect of fluency on the reactions to visual stimuli. Specifically, liking for auditory stimuli increases with fluency, and this effect generalizes over different manipulations of fluency, i.e., through familiarity and repeated exposures (Anand & Sternthal, 1991; Mungan et al., 2019; Peretz et al., 1998; Schellenberg et al., 2008; Topolinski & Strack, 2009; Wang & Chang, 2004; Witvliet & Vrana, 2007), and through articulation or arrangement of notes (Carr et al., 2023; Kowalewski et al., 2019). Additionally, liking correlates positively with beauty and pleasantness and negatively with tension (Day & Thompson, 2019; Omigie et al., 2021), with people preferring musical stimuli low on tension but high on energy.
The corpus of empirical results reviewed also suggests that liking increases monotonically for auditory stimuli that are listened to passively or with little to no engagement. On the other hand, liking for auditory stimuli that are listened to actively increases only up to a certain point. This dual path process parallels the one highlighted in the research on visual materials, i.e., in the evaluation of fluent and disfluent portraits (Belke et al., 2015) or abstract images (Graf & Landwehr, 2017), suggesting that similar mechanisms might be at play in the effects of fluency on the appreciation of auditory stimuli and music. As such, accounting only for an incidental processing route based on mere exposure might not explain the full range of results regarding liking (Montoya et al., 2017). The idea of a dual processing path involved in the modulation of liking is also reinforced by the separation between early emotional “liking” and conscious liking, which occurs after judgement and emotion (E. Brattico et al., 2013), and by how the aesthetic judgements that arise from each route can vary in intensity, with fluent objects promoting a sense of prettiness while disfluent objects promote a sense of beauty exactly because they resist fluent processing (Armstrong & Detweiler-Bedell, 2008).
Previous theoretical models sought to explain fluency-based aesthetic responses based on how people attribute the good feeling state resulting from fluent processing to the stimuli (Bornstein & D’Agostino, 1994; Winkielman et al., 2003a), on how these aesthetic responses can go beyond basic liking and into beauty judgements (Reber et al., 2004) and on how fluency can be considered alongside other cognitive mechanisms in order to result in aesthetic emotions (Juslin, 2013). One of the latest theoretical additions to fluency-based aesthetic reactions is the Pleasure-Interest Model of Aesthetic Liking (PIA Model), developed by Graf and Landwehr (2015), which suggests that aesthetic liking can be achieved by following two distinct processing routes. While uncontrolled processing generates pleasure or displeasure and is based on processing fluency, controlled processing is triggered by processing disfluency, which fosters interest. While the extant empirical results reviewed indicate that liking increases with processing fluency induced through repetition, the second processing route of this model has not been verified in the realm of auditory stimuli. Nevertheless, the PIA Model, so far tested in relation to visual stimuli, may provide a valid account of the routes involved in the processing of auditory stimuli as well.
Processing Fluency Specificity
Our findings show that the research on auditory stimuli has focused on perceptual fluency or a general form of processing fluency, while omitting conceptual fluency. The prevalence of perceptual fluency in the research on auditory stimuli was to be expected, given that its effect on affective judgements is well researched for visual stimuli such as drawings and shapes (Reber et al., 1998) or words (Van den Bergh & Vrana, 1998). What is surprising, however, is the lack of consideration for conceptual fluency, given that in some instances it is as prevalent as perceptual fluency (Gamblin, 2020, Chapter 2). Looking at visual art, it was previously shown that people tend to appreciate paintings more when their titles make them easier to understand, thereby promoting conceptual fluency (Belke et al., 2010; Leder et al., 2006). In addition, when paintings are easier to understand, they are liked more regardless of their complexity (Ball et al., 2018). On the other hand, when looking at music specifically, we can observe that complex music is used more for cognitive purposes, while simpler music is used for emotional purposes (Sallavanti et al., 2016).
It is also important to note that evaluative and perceptual judgements differ significantly in their neural architecture and in the time needed for them to unfold, with aesthetic evaluative judgements happening later than symmetry perceptual judgements (Jacobsen & Höfel, 2001). This means that merging perceptual and conceptual fluency and presenting them under the umbrella term of processing fluency might fail to highlight the specific effects of each type of fluency on judgement. Future studies addressing the effects of conceptual fluency might contribute to understanding the contexts in which individuals prefer more complex auditory stimuli over simpler ones (for example, when intentionally listening to them compared to incidental listening; Schellenberg et al., 2008). As such, while the effects of perceptual fluency on liking are in accordance with the PIA Model, the effects of conceptual fluency have yet to be verified, thus emphasizing a gap in the current knowledge on how processing fluency prompts interest that, in turn, results in liking (Adair, 2021).
Additionally, processing fluency of auditory stimuli has been predominantly manipulated through repeated exposures and less through other means such as complexity or repeating internal elements specific to songs (e.g., melodic lines). Moreover, Montoya et al. (2017) recommended that, in order to observe the nuanced relationship between exposure and response, participants should be exposed to at least three levels of exposure. Nevertheless, several studies conducted so far did not reach this threshold, stopping at only two exposure levels. While in some instances the use of only two exposure levels might be justified by the experimental design or the research objective, in other instances it might represent a simplified and overly controlled manipulation, given that people encounter artistic stimuli more than twice in their day-to-day life.
Auditory Stimuli Used
We classified the experimental auditory stimuli as being either artificial, i.e., created specifically to be used in an experimental context, or naturalistic, i.e., already existing songs and melodies or excerpts extracted from them in order to preserve their ecological validity. The duration of these stimuli varied, with naturalistic stimuli being longer overall (about 30–40 seconds) than artificial stimuli (about 7–10 seconds). While aesthetic judgements of images are independent of the exposure duration (Brielmann et al., 2017) and gut-level aesthetic judgements of music can be made in a matter of seconds (Belfi et al., 2018), there is a case to be made for using longer auditory stimuli, given that different parts or sections of a song are revealed as the song progresses, whereas an image is presented in its entirety and processed as such. Furthermore, longer auditory stimuli are more likely to generate aesthetic reactions different from liking, such as the “Aha moment”, which is a sudden shift in processing ease (Muth & Carbon, 2013; Topolinski & Reber, 2010). It is also worth noting that using even longer stimuli (e.g., above 90 seconds) might generate habituation, with reactions to stimuli becoming duller over time due to internal repetition (Huber et al., 2008; Huron, 2013; Leventhal et al., 2007), which would cancel the initial positive effect of fluency (Montoya et al., 2017), a hypothesis that we believe warrants examination in future research.
Additionally, the auditory stimuli used in past studies have varied widely, from simple melodic lines to classical pieces, instrumental songs or musical passages selected by participants. Given this diversity, it is difficult to pinpoint which set of stimuli is the most fitting for researching processing fluency effects. Efforts have been made to create databases of standardized stimuli that are varied along a specific set of characteristics. For example, Clemente et al. (2020) provided 200 musical stimuli varying in balance, contour, symmetry and complexity, which might prove useful in studying how processing fluency affects liking, specifically on the incidental processing route. However, auditory stimuli that are closer to real-life music may be better suited for research on the controlled processing route described by the PIA Model. In order to create a database that encapsulates already existing songs, techniques such as web scraping can be used to scan, identify and organize songs from streaming platforms based on certain characteristics such as valence, tempo or duration (Sciandra & Spera, 2022).
Importance of Emotions
The affective component is integral in understanding how people evaluate art and, more specifically, music. Emotions are not simply a byproduct of processing, as suggested by Leder et al. (2004), but instead they are strongly connected to processing, as outlined by the BRECVEMA framework (Juslin, 2013). Somewhat surprisingly, we found that emotion has been understudied in the extant research on how processing fluency affects the aesthetic reactions to auditory stimuli.
The findings of the research that did study emotion in addition to aesthetic reactions to auditory stimuli indicate that liking is, in general, greater for music that is happy (Schellenberg et al., 2008; Witvliet & Vrana, 2007). However, a lack of subjective measurement of the emotion experienced by participants makes it impossible to disentangle the effects on liking of this affective state from those of the emotion associated with the stimuli. Past results also show that positive valence is significantly associated with ratings of beauty and liking, that auditory stimuli low on tension and low on energy are liked the least and that stimuli low on tension but high on energy are liked the most (Omigie et al., 2021). Lastly, musical articulation plays a role in conveying emotion, with ratings of amusement, happiness, and surprise being significantly higher for staccato melodies, while ratings of sadness and scariness were significantly higher for legato melodies (Carr et al., 2023).
Generally, the emotional charge of music makes studying the effects of processing fluency more difficult, given that people tend to enjoy songs that present emotions contrasting with what they feel (Schellenberg et al., 2012) and that some musical instruments might appear to sound happier or sadder (Kowalewski et al., 2019), meaning that the affective aspect of the stimuli used is as important as the stimuli themselves. This is especially true given that the affective valence of the auditory stimuli could amplify the attitudes people have towards them (Meskin et al., 2013). Thus, controlling all these potential affective sources is important for an exact assessment of the role of fluency in the generation of emotion when listening to auditory material. Furthermore, the effects of the current processing style on aesthetic reactions to music also deserve to be explored by future studies, as past research using visual materials suggests that individuals tend to rely more on emotion when processing information globally compared to when they focus on details (Dijkstra et al., 2014).
Importance of Musical Abilities and Personality
While most studies reported the overall musical training of the participants, no specific ability developed through musical training (such as pitch perception) or other correlates such as personality were measured or manipulated. In terms of the influences of musical expertise on aesthetic evaluations, the studies reviewed highlight preference for musical complexity as an important factor. Musical training was found to be related to preference for more complex musical material (Vuvan et al., 2020). Moreover, while laypersons’ appreciation for music as a function of its complexity can be plotted as an inverted U curve (Güçlütürk & van Lier, 2019), experts’ appreciation for music as a function of complexity cannot be plotted as such (Orr & Ohlsson, 2005).
It is also important to note that even though a distinction between novices and individuals who do possess some musical abilities can be made based on how many years of musical training they have, in some instances individuals possess musical abilities despite never having taken music lessons (Law & Zentner, 2012), simply as a result of being exposed to music more (Bigand & Poulin-Charronnat, 2006). Thus, self-reported musical experience is inconsistent and insufficient in clarifying the individual differences in aesthetic engagement with auditory stimuli. In fact, the abilities developed through musical training and not the musical training in itself might be of greater importance in understanding these differences. This argument is supported by the idea that some individuals could possess musical abilities whilst not having any musical training (Bigand & Poulin-Charronnat, 2006; Law & Zentner, 2012; Rajan et al., 2019). In addition, those who were musically trained for more than two years show greater appreciation for complex musical pieces as a function of their sophistication, compared to those who were musically trained for less than two years (Burke & Gridley, 1990). Music students also form their preferences based on intrinsic characteristics of music such as expressivity or musical structure, whereas psychology students base their preferences more on extrinsic characteristics, such as moods or activities where music is used in the background (Juslin & Isaksson, 2014). This might suggest that novices or non-experts tend to evaluate music on a global level, which impacts their affective responsiveness such that they give more extreme affective ratings and are more responsive to processing fluency (Dijkstra et al., 2014). The implications of this might be more important for auditory stimuli than for visual stimuli given how emotionally charged music is (Juslin, 2013).
Phenomena such as hedonic amplification (Landwehr & Eckmann, 2020) or the reliance on emotion during evaluation might differ between experts and non-experts, which implies that these differences should be accounted for.
A second aspect worth discussing is the individual’s personality. Previous research has shown that openness to experience is the best predictor of aesthetic preferences (Cleridou & Furnham, 2014). People high on openness to experience tend to prefer complex and reflective music (Rentfrow & Gosling, 2003), while also enjoying sad music more than happy music (Vuoskoski et al., 2012). Additionally, openness to experience was found to moderate the impact of repeated exposures on liking, with liking ratings being higher for novel pieces but lower for the stimuli that were over-exposed (Hunter & Schellenberg, 2011). Personality is also linked to how people use music, with intellectually engaged individuals using music in a more rational or cognitive way, while neurotic, introverted and non-conscientious individuals tend to use music for emotional regulation (Chamorro-Premuzic & Furnham, 2007).
As such, more rigorous control is needed when measuring individual characteristics, given that measuring them in a broad sense, by way of musical training or years of playing an instrument, fails to reveal how more specific individual characteristics such as personality or musical preferences affect how individuals engage with auditory stimuli.
Limits of the Present Study
A potentially important limit of our study is that we explicitly avoided the inclusion of articles that might be considered part of the gray literature, which consists of conference proceedings, dissertations, presentations or pre-prints, among others (Adams et al., 2017). However, the inclusion of gray literature in systematic reviews might be beneficial, as it increases the likelihood of conducting a more comprehensive search (Benzies et al., 2006) and of uncovering results that, while not statistically significant, might prove useful in better understanding the topic at hand.
Additionally, articles included in this review were found by searching through Web of Science, APA PsycNet and Scopus. While only two comprehensive databases might be sufficient to identify most articles (Harari et al., 2020), we opted to include a third database to strengthen our search. While we could have opted to use Google Scholar given its large coverage, we decided against it because of its insufficiency and unreliability when conducting systematic searches, especially as the sole database: its constantly changing content, algorithms and database structure make it a poor choice for systematic reviews (Giustini & Boulos, 2013; Gusenbauer & Haddaway, 2020).
Lastly, the search terms we used might not have been adequate for the scope of this review. While we did iterate on the keyword string by adding Boolean operators, synonyms, plural forms or different labels for the same concept, and we followed as closely as possible the guidelines provided by the PRISMA framework (Page et al., 2021), we might have ended up with an excessively focused keyword string. In general, it is advised to conduct a broader search at first, in order to retrieve as many articles as possible, despite the fact that many of those articles might prove irrelevant to the research question at hand (Xiao & Watson, 2019). While we consider our search terms to be of medium specificity, the review might have benefited from a more general search.
Future Research Directions
As for future research directions, we advise researchers to focus more on how disfluency affects interest, aesthetic judgement and enjoyment of music. While multiple studies have shown that people enjoy stimuli more if they are processed more fluently (Forster et al., 2013; Reber et al., 2004; D. L. Westerman et al., 2015), aesthetic and emotional reactions to stimuli that are difficult to process at first remain an open issue (Güçlütürk & van Lier, 2019). Several authors point out that disfluency might play a bigger role in the way people engage with the stimuli surrounding them (Adair, 2021; Belke et al., 2015; Graf & Landwehr, 2017; Sung et al., 2022), and including disfluency in the research agenda might encourage a shift from using simple stimuli to using more complex or unusual stimuli (Chmiel & Schubert, 2019). Regarding complexity, it might be worth looking into how different forms of complexity affect fluency. For one, complexity can be looked at from two perspectives, one that is stimulus driven and one that is perceiver driven. Research paradigms such as Music Information Retrieval (MIR; Casey et al., 2008; Downie, 2004) derive complexity from features taken directly from the audio content, such as audio compression or event density per second (Marin & Leder, 2013). On the other hand, some authors argue in favor of using subjective complexity instead of objective complexity, in order to control for the variables that might affect how an individual perceives the complexity of a piece (Marin & Leder, 2013; North & Hargreaves, 1995). In addition, besides a general level of complexity, different aspects of a song could produce different types of complexity, such as harmonic or rhythmic complexity (Friberg et al., 2014).
Additionally, building on the previous discussion on the importance of the affective valence of the stimuli, future studies should explore whether conceptual fluency of music can be achieved by deriving meaning from affective states, given that feelings in themselves can be a source of information (Schwarz & Clore, 2007). In this sense, people might turn to their emotions in order to aid the processing of an artistic stimulus. This might explain why people tend to appreciate music that presents a different emotion than the one they are feeling, as is the case for sad music (Kawakami et al., 2013; Schellenberg et al., 2012; Vuoskoski et al., 2012). Contrasting emotions might result in greater aesthetic liking by sparking interest in the listeners, who evaluate how they feel and what the musical piece is expressing emotionally in order to make a judgement. It is important to note that, although correlated, ratings of felt emotion and perceived emotion differ, with ratings of perceived emotion tending to be higher than ratings of felt emotion (Hunter et al., 2010). As such, future research should focus more on how felt emotions and perceived emotions affect fluency-based judgements of liking and beauty. A different research avenue concerns the role that musical abilities developed through musical training play in the dynamic between processing fluency and aesthetic reactions to music. To this end, studies should extend their current focus on self-reported musical experience by including specific musical abilities such as pitch perception (Law & Zentner, 2012; Seror & Neill, 2015) and personality traits such as sophistication (Müllensiefen et al., 2014) or musical preferences (Nieminen et al., 2012; Rentfrow et al., 2011) in their designs.
Lastly, we propose that the processing fluency of auditory stimuli be manipulated differently from that of visual stimuli. A point of differentiation between visual stimuli and auditory stimuli is that auditory stimuli are able to present varying degrees of within-stimulus repetition in the form of repeating melodic phrases or repeating structural elements such as verses or choruses. While visual stimuli can be presented with high degrees of self-similarity and symmetry (Mayer & Landwehr, 2018) or fractality (Forsythe et al., 2011; Viengkham & Spehar, 2021) to induce fluency and consequent aesthetic appreciation, auditory stimuli such as songs can have repeating choruses (Nunes et al., 2015) or repeating melodic phrases (Margulis, 2012; Taher & Jin Hyun Kim, 2022), which seem to have similar effects on judgement. As such, we suggest that future research use within-stimulus repetition and further explore its effects, as past studies suggest that this manipulation of auditory stimuli affects felt time (Taher & Jin Hyun Kim, 2022) and boredom (Van den Bergh & Vrana, 1998).
Suggested Future Research Directions.
Conclusion
Past research using visual stimuli has indicated that processing fluency partly accounts for how people appreciate and evaluate art. Our aim was to examine this effect in the domain of auditory stimuli by reviewing extant empirical findings relevant for how people react aesthetically to fluently processed auditory stimuli. Overall, the studies included in this scoping review indicate that, similarly to visual stimuli, processing fluency has a positive effect on liking of auditory stimuli. Additionally, we identified some aspects that limit the generalizability of the current body of findings on the relationship between fluency and aesthetic reactions to auditory stimuli, such as a lack of consistency in the number of stimulus exposures in the manipulation of fluency, a tendency to omit the affective component and the failure to account for personal variables such as musical abilities developed through musical training or the participants’ personality and preferences. While these aspects could be considered problematic, they also highlight the ways in which auditory stimuli differ greatly from visual stimuli. As such, they should be considered in the future research agenda and in the design of studies on the effect of processing fluency on aesthetic reactions to auditory stimuli.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
