Sage Journals: Discover world-class research

Abstract

The examination of cross-modal correspondences between auditory and olfactory senses opens up an intriguing perspective into the study of extra-musical meaning. In a behavioral experiment, musically trained participants were presented with 26 complex synthetic tones and 12 aromatic stimuli. Their task was to report potential associations between the two. The data analysis revealed that the majority of scents featured at least one association with a sound that was above chance. The salient acoustical correlates of basic aromatic categories could be summarized as follows: both fruity (e.g., cherry, melon, and pomegranate) and sour aromas exhibited a positive correlation with pitch. Fruity scents, in addition, were more likely to be associated with sounds featuring pronounced low harmonic partials (1st–4th), low noise content, low roughness, and a greater number of distinct pitches. Conversely, sour aromas were linked with stronger energy in the higher frequencies. Sweet scents correlated with sounds characterized by a lower spectral centroid, whereas aromas in the spicy/other category were associated with weak lower partials (fundamental frequency in particular) and stronger noisy components.

Keywords

Acoustical correlates, cross-modal correspondences, essential oils, scents, sound synthesis, timbre

Introduction

It is generally accepted that sound and music are capable of conveying meaning that may take different forms. One common differentiation is made between intra-musical and extra-musical meaning (Patel, 2008). The former represents meaningful information that arises from statistical learning and the subsequent formation of expectations due to long-term musical exposure (e.g., Krumhansl, 2015). The latter concerns referential associations between musical entities and extra-musical concepts such as density (e.g., Noble et al., 2020), tension (e.g., Farbood, 2012; Farbood & Upham, 2013; Huron, 2006), intimacy (Huovinen & Kaila, 2015), or various kinds of visual imagery (e.g., Hashim et al., 2023). Koelsch (2011) further organizes extra-musical meaning into three subcategories: iconic meaning, where musical qualities are linked to objects or abstract concepts through similes and metaphors; indexical meaning, where musical patterns indicate a psychological state; and symbolic meaning, which is conveyed through cultural and social associations (e.g., Christmas carols or a national anthem).

The fundamental building block of music is the musical tone, which may or may not have pitch but is certainly associated with some loudness, perceived duration, and timbral qualities. There is now ample evidence, collected from both neuroimaging methods (e.g., Painter & Koelsch, 2011) and behavioral approaches (e.g., Zacharakis et al., 2014), that even out-of-context isolated musical tones can carry extra-musical semantic information. Arguably, the study of timbral semantics can be traced back to the pioneering work of von Helmholtz (1877), but interest in the subject has grown substantially over the past 20 years. This has led to an accumulation of knowledge around the extra-musical concepts that we tend to associate with timbre. The subject has been approached from several viewpoints, such as comparing data from different languages (Zacharakis et al., 2014), comparing perceptual with semantic ratings (Samoylenko et al., 1996; Zacharakis et al., 2015), analyzing descriptions from corpora on orchestration (Wallmark, 2019a), and creating semantic profiles of imagined instrumental timbres (Reymore & Huron, 2020; Reymore, 2022). The increased interest in the extra-musical meaning of timbre is reflected by the recent book chapter on the semantics of timbre by Saitis and Weinzierl (2019), which offers a comprehensive review of the current literature. In the closing remarks of this chapter, the study of cross-modal correspondences is proposed as a promising path for further investigation into the extra-musical meaning conveyed by timbre, potentially providing deeper insight into mechanisms of human semantic processing.

Associations between auditory attributes and other modalities have already been well documented. For example, pitch and loudness have both been related to spatial height (Ben-Artzi & Marks, 1995; Bernstein & Edelstein, 1971), visual brightness (Klapetek et al., 2012; Marks et al., 1987; Marks, 1987, 1989), size (Eitan et al., 2014; Gallace & Spence, 2006), tactile sensation (Eitan & Rothschild, 2011), and even gustatory (Mesz et al., 2011; Wang et al., 2015) or olfactory qualities (Belkin et al., 1997; Crisinel & Spence, 2012a). In addition, higher-level musical characteristics such as harmonic dissonance have been associated with visual roughness (Giannos et al., 2021).

At the same time, cross-modal metaphors for musical timbre description have long been utilized, as shown by the analysis of past orchestration treatises (Wallmark, 2019a). Behaviorally accumulated evidence has further supported the widespread existence of cross-modal adjectives in timbre semantics. In particular, the use of metaphors inspired by touch and vision is now strongly backed by a multitude of findings that originate from several different languages (Lichte, 1941; Pratt & Doak, 1976; Rosi et al., 2023; Štĕpánek, 2006; von Bismarck, 1974; Zacharakis et al., 2014; Zacharakis & Pastiadis, 2016). Besides these indirect indications, several more recent works have directly investigated cross-modal associations with respect to timbre. Adeli et al. (2014) extended the kiki-bouba paradigm (Köhler, 1929; Ramachandran & Hubbard, 2003; Ramachandran et al., 2020) to musical timbres and observed that softer timbres corresponded to rounded shapes with blue or green colors, while harsher timbres were linked to angular shapes with red or yellow colors. Wallmark and colleagues explored cross-modal associations of timbre in a series of recent studies. Combining neuroimaging and behavioral methods with acoustic analysis, Wallmark et al. (2018) initially showed that noisy timbres that are often described through tactile metaphors (e.g., rough, harsh, coarse) activated somatosensory regions responsible for tactile processing. Then, Wallmark (2019b) applied a Stroop-type speeded classification and reported interference between word–timbre presentations of roughness and brightness. Subsequently, Wallmark et al. (2021) expanded this finding further by identifying interference between visual and timbral brightness. In addition, strong haptic–auditory correspondences were demonstrated for preschool children, with visual–auditory associations proven to be less systematic and more dependent on the developmental stage (Wallmark & Allen, 2020).

The above works support the existence of correspondences between timbral characteristics and tactile or visual attributes. The mechanisms proposed to explain the origin of such effects include activation of common brain structures in response to properties of cross-modal stimuli, statistically learned environmental associations, affective similarities, and mediation through shared semantic descriptions (Spence, 2011, 2020a). Out of all the senses, vision seems to gather the highest number of descriptive terms (at least in the English language) (Majid & Kruspe, 2018). In contrast, Winter (2019) argues that olfactory experiences are the most difficult to express lexically (i.e., ineffable), followed closely by gustatory experiences, and then auditory experiences.

The apparent ineffability of smell, taste, and sound sets an intriguing context for the exploration of crosstalk between them. Indeed, over the past 15 years, there has been a growing interest in the correspondences and interactions between the chemical senses (i.e., gustation and olfaction) and musical parameters. Knöferle and Spence (2012) summarize the state of the art regarding identified mappings between musical parameters and basic tastes up until 2012 (some of the most notable studies being Bronner et al., 2012; Crisinel & Spence, 2009, 2010a, 2010b, 2012b; Mesz et al., 2011; Simner et al., 2010; Wang et al., 2021). Despite some expected inconsistencies between experimental data, there seems to be a consensus that sweetness is related to soft consonant sounds and chords, legato articulation, slow tempo, and low roughness. The pitches reported for sweetness vary from average to high. Sourness, in contrast, is associated with high pitch, staccato articulation, fast tempo, and dissonance. Bitterness is consistently associated with low pitch and high roughness. The latter is also positively linked with saltiness, which additionally correlates with sound discontinuities, long decay times, and regular rhythmic patterns. It must be pointed out that timbre is not explicitly mentioned in most of these studies and is often treated either as a source category (i.e., piano, brass, and woodwind) or as a semantic dimension (i.e., roughness and sharpness). The findings of a subsequent study by Guetta and Loui (2017) are in accordance with the above and confirm the existence of systematic associations between auditory and gustatory stimuli of varied complexity. 
The findings from these studies have significantly influenced subsequent research on sound–taste correspondences, exploring the potential impact of background music on taste perception (e.g., Carvalho et al., 2015; Carvalho et al., 2017; Crisinel et al., 2012; Spence, 2021c; Wang et al., 2015; Wang & Spence, 2016), what Charles Spence has coined as “sonic seasoning.”

The existing evidence on sound–taste mappings largely concerns the four (or five with occasional inclusions of umami) basic tastes and not specific flavors. When it comes to olfaction, however, such basic categories for odor classification are less prominent. Instead, odor naming is mostly based on resemblance to a certain source (e.g., smells like freshly cut grass), and the precision of odor identification is low (Agapakis & Tolaas, 2012; Majid & Burenhult, 2014; Speed et al., 2021). It has been suggested that the ability of humans to talk about odors in abstract terms may have been lost in urbanized societies, since communities of hunter-gatherers are more capable of it in comparison to English speakers (Majid & Burenhult, 2014; Majid & Kruspe, 2018). That said, there do exist general categories of scents that are based on properties of the source (e.g., fruity, floral, spicy, earthy, sweet, and green), analogous to how source characteristics largely dictate timbre perception. Efforts to organize the wealth of odor descriptions into more parsimonious models for odor classification are intended to facilitate communication and have been attempted in perfumery (e.g., Zarzo & Stanton, 2009), wine (Noble et al., 1984), beer (Meilgaard et al., 1982), whiskey (Piggott & Jardine, 1979), and even wastewater (Burlingame et al., 2004) among other disciplines. It is worth mentioning that odor perception employs similar methods to timbre perception, such as similarity ratings and semantic differential (Kaeppler & Mueller, 2013), and interestingly, odor semantics feature some overlap with timbre semantics (e.g., warm, rich, smooth, clear, in Zarzo & Stanton, 2009). In yet another analogy with timbre, odor perception is multidimensional (often represented through two to four-dimensional spaces), and existing classification systems are still imperfect (Kaeppler & Mueller, 2013).

The above commonalities between sound and odor perception raise the question of whether they are also reflected in some type of cross-modal associations. The 19th-century perfumer Piesse (1891) was probably the first to attempt a connection between essential oil scents and the pitch of musical notes. Some empirical backing came many decades later from the pioneer music scholar von Hornbostel (1931), who reported a match between odors and tuning fork tones. At the end of the 20th century, Belkin et al. (1997) pursued an empirical odor–pitch association for odor classification purposes. Their experimental data suggested that odor–pitch associations were systematic and could not be attributed to the olfactory dimensions of intensity or pleasantness. Instead, the authors speculated that the identified pitch–odor mappings could have been based on underlying semantic dimensions of olfaction (e.g., hard–soft, heavy–light, or bright–dark) and called for a subsequent investigation of potential correspondences between timbral and odor qualities. A more recent study by Crisinel and Spence (2012a) confirmed the existence of systematic associations between odors (associated with wine) and pitch while providing some evidence for a link with source-cause timbral properties. In particular, they reported a tendency of people to associate fruity aromas with higher-pitched sounds and confirmed that aromatic intensity did not seem to correspond to pitch but may instead correspond to timbre (e.g., a trend between higher aromatic intensities and brass instruments was observed).

Considering the semantic commonalities between timbral and aromatic qualities, and drawing on indications provided by the work of Crisinel and Spence (2012a), this study seeks to examine potential timbre–aroma correspondences more closely. The goal of the current approach is to minimize the coupling between timbre and source-cause, acknowledging that total decoupling may not be entirely feasible. To this end, I have created complex synthesized sound stimuli instead of familiar instrumental samples. I report here the results of an experiment whereby the task of the evaluation panel was to match (if possible) each sound stimulus with any of the 12 provided aromas in the form of essential oils. Through this experimental design, I aimed to address two fundamental questions: Is it possible to come up with systematic timbre–odor associations in accordance with previous evidence on pitch–odor connections? And if so, can an interpretation of cross-modal relationships in terms of acoustic properties (represented through audio descriptors) be achieved? That is, can acoustical correlates of scents be identified?

The motivation for these pursuits originates from one long-term goal: to gain a deeper understanding of the already documented influence of sounds and music on gustatory and olfactory experiences, such as wine tasting (Spence & Wang, 2015a, 2015b, 2015c; Spence, 2020b). The underlying premise is that multisensory congruence positively contributes to the brain's processing fluency, thereby enhancing overall pleasure (Spence, 2021b). Therefore, being able to define congruence or incongruence between the modalities in question is of paramount importance for such an investigation.

The following section elaborates on methodological decisions, including the selection of sound stimuli and aromatic variables, and outlines the experimental procedure. The results section introduces the assessment of statistical significance for sound–odor correspondences, underscores noteworthy occurrences, and adduces a few representative aromatic profiles of sounds. Additionally, this section presents the identified acoustical correlates of both specific aromas and aromatic families. The paper concludes by contextualizing the primary findings within the existing literature, discussing the limitations of this study, and suggesting potential future directions.

Method

The experiment described below aimed to explore potential associations between auditory and olfactory stimuli by asking participants to report potential correspondences between presented sound stimuli and a number of aromas.

Sound Stimuli and Apparatus

The sound stimulus set consisted of 26 complex synthetic tones that were created through various combinations of sound synthesis (frequency modulation, amplitude modulation, wavetable, additive, and granular synthesis) and/or sound processing (filtering, reverb, delay, phasing, etc.) implemented using Ableton Live. I created these sounds attempting to sonically represent a wide range of aromas that are typically found in wine. This approach was favored over the presentation of familiar instrumental timbres, firstly to minimize (as much as possible) source-cause category influences on the auditory end (Siedenburg, 2017) and secondly to maintain the freedom to create timbres with desired characteristics. During the sound synthesis process, I followed some of the basic guidelines offered by the literature (Crisinel & Spence, 2012a; Crisinel et al., 2013; Deroy et al., 2013; Spence, 2021a) concerning identified correspondences between timbral qualities and olfactory properties. However, as presented in the introduction, the majority of existing evidence concerns pitch–odor correspondences, while timbre–odor relationships are largely approached as associations between aromas and specific source-cause categories (i.e., musical instrument classes), with limited suggestions for acoustical correlates. As a consequence, numerous sound synthesis decisions were guided by personal impressions formed during exposure to the designated aromatic stimuli, introducing a considerable degree of subjectivity. The number of sound stimuli (26) was selected to be larger than the number of aromatic variables (12, see below) to avoid a one-to-one correspondence experimental setup. In addition, the higher the timbral diversity within the stimuli, the likelier the acquisition of unexpected auditory–olfactory associations should be. As a result, many of the sounds within the stimulus set were designed with scents in mind that were not represented by the aromatic variables.
At the same time, due to the exploratory nature of this study, two of the aromas (melon and pomegranate) had two sonic candidates. The stimulus duration ranged from 6 to 12 s, while the pitch also varied, ranging from G2 (98 Hz) to G5 (784 Hz). Several of the stimuli comprised tone combinations and complex temporal fluctuations.

The sound stimuli were delivered via a MacBook Pro laptop (Apple Computer, Inc., Cupertino, CA), utilizing a custom-built graphical user interface in Max/MSP for stimulus playback and data acquisition. Listeners were presented with the sound stimuli binaurally using Beyerdynamic DT-880 PRO headphones (250 Ohm). Loudness was equalized across all stimuli at a comfortable playback level through informal listening tests. This resulted in RMS levels between 65 and 75 dB SPL (A-weighted, slow response). The sound stimuli are available in the supplementary material. At this point, it should be noted that the sound labels employed throughout the manuscript derive directly from the names of their intended aromatic counterparts. Given that these sounds lack a physical source, this methodology was considered more advantageous than simply labelling the stimuli as S1–S26. This approach allows the reader to be informed about the intended aromatic target corresponding to each sound.

Aromatic Variables

Twelve aromatic variables were selected to reflect distinctive aromas present in the profiles of three different wines (one white, one rosé, and one red). This selection served the future objective of composing congruent music tailored to specific wine profiles as a means to investigate multisensory perception. The aromatic variables were introduced using small glass bottles (5 ml) sealed with a plastic screw cap, each containing a piece of cotton on the inside. The cotton in each bottle was moistened with 3 drops from a selection of 11 different essential oils, namely, vanilla, honey, caramel, cinnamon, coffee, (black) pepper, lemon, lemon blossom, pomegranate, melon, and cherry. No satisfactory essential oil representative was found for the 12th selected aromatic variable, tobacco; therefore, tobacco leaves enclosed in a small plastic container with a screw cap (8 ml) were provided as a stimulus. The presentation of real aromatic stimuli ensured that all participants shared the same olfactory references, as past research has shown that imagining a stimulus may result in different cross-modal associations compared to actually experiencing it (Bronner et al., 2012; Zarzo & Stanton, 2009). In contrast to the approach for the sound stimuli, I opted to facilitate the source identification of aromatic variables due to the large number of provided options. Thus, a label indicating the aroma contained in each glass bottle was provided on its placement base.

Participants

A convenience sample of 29 participants¹ with formal musical training (mean age: 22.5 years, age range: 19–41 years, 19 females) took part in a listening experiment. The majority were students at the Aristotle University of Thessaloniki and received course credit compensation for their participation.

Procedure

Each participant listened to the 26 stimuli in random order, and their task was to associate (if possible) each sound with any of the 12 provided aromas. The participants were completely naive regarding the intended aromatic associations for each sound stimulus. Each sound could be associated with as many aromas as desired by providing a strength-of-association value (hidden scale: 0–100). Participants were initially instructed to experience each scent by sequentially opening the corresponding glass bottles before listening to the sound stimuli. They were encouraged to revisit specific scents only if they deemed it necessary while forming associations between scents and sounds. This instruction aimed to protect their olfactory system from sensory overload by limiting the overall number of times they experienced each different scent. An additional instruction was to ignore possible influence stemming from conscious higher-level connections between sounds and concepts related to the source of the aromas (e.g., “this sounds like a buzzing bee therefore it has to be associated with the scent of honey”). Participants were instead encouraged to base their judgements on potential sensory connections as strictly as possible.

Results

The analysis of the data had two primary objectives. The first was to examine whether above-chance associations could be identified between certain sound stimuli and some of the aromatic variables. The second was to uncover acoustical correlates of general aromatic categories.

Analysis of Responses

To examine which of the observed effects were statistically significant, a bootstrapping approach that created random distributions through computational simulation of the experimental conditions was adopted. The simulation comprised 29 virtual raters evaluating 26 objects on 12 variables. The acquired behavioral data indicated that real participants associated each auditory stimulus (i.e., object) with 2.13 aromatic variables on average. Therefore, for each computational evaluation, the virtual raters were set to randomly select 2 out of the 12 aromatic variables and indicate the strength of association by randomly assigning a value drawn from a uniform distribution with a range between 0 and 100. This scenario was repeated 1,000 times, and the distributions of selection frequencies and descriptive statistics for a variable under these circumstances were obtained. The 95th percentile of the number of raters that associated a variable with a certain stimulus (even with a minimal value) through this simulation was 8 (out of 29). This translates into an above-chance effect ($p < .05$) when observing more than 8 instances of association between a given sound stimulus and one aromatic variable. Equivalently, when more than 27% of the raters registered an association between a sound and an aroma, this should be deemed statistically significant at the $p < .05$ level. Based on this, it seems that the distribution median would constitute an unnecessarily strict measure for assessing the effect size of each association. In addition, because the nature of the task yields sparse data, the median often turns out to be zero even though the number of raters that have identified at least some degree of association is significant according to the above analysis. Figure 1 presents the behavioral data acquired for two of the sound stimuli to exemplify the sparsity of responses for each sound–aroma relationship.
In the first case, the stimulus intended to sonically represent the caramel scent scores above chance in three out of the twelve variables (including caramel itself) and particularly strongly for vanilla. On the contrary, the intended stimulus for cherry scent fails to score above chance for any of the variables.
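The chance level for the number of raters can be illustrated with a short simulation. The following is only a minimal sketch under the assumptions stated in the text (29 virtual raters, each picking 2 of 12 aromas at random); it is not the code used in the study. The function name, the number of replications, and the nearest-rank percentile rule are all assumptions made for illustration.

```python
import math
import random


def chance_rater_threshold(n_raters=29, n_aromas=12, picks_per_rater=2,
                           n_sims=10000, percentile=95, seed=0):
    """Estimate how many raters select a given aroma for a given sound by chance.

    Hypothetical helper: each virtual rater picks `picks_per_rater` distinct
    aromas uniformly at random. By symmetry, tracking one target aroma suffices.
    """
    rng = random.Random(seed)
    target = 0  # index of the tracked aroma (arbitrary)
    counts = []
    for _ in range(n_sims):
        # Number of raters whose random picks include the target aroma.
        count = sum(
            target in rng.sample(range(n_aromas), picks_per_rater)
            for _ in range(n_raters)
        )
        counts.append(count)
    counts.sort()
    # Nearest-rank percentile (an assumption; other conventions interpolate).
    idx = math.ceil(percentile / 100 * n_sims) - 1
    return counts[idx]


threshold = chance_rater_threshold()
print(f"95th percentile of chance selections: {threshold} of 29 raters")
```

Under these assumptions each count follows a binomial distribution with 29 trials and success probability 2/12, so the simulated percentile should agree with the value of 8 reported above. More replications are used here than the 1,000 mentioned in the text simply to keep the estimate stable across seeds.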

Figure 1.

Example of the data gathered for two indicative sound stimuli. Horizontal axes depict the 29 participants, rank-ordered based on the magnitude of their responses as shown in the vertical axes (range 0–100). Three of the variables (vanilla, caramel, and melon) for the sound intended to resemble the caramel aroma (on the left) were selected above chance ($p < .05$). Note that more than half of the participants associated this stimulus with the vanilla scent. For the stimulus intended to resemble the cherry aroma (on the right), there were no identified associations above chance (i.e., > 8 participants).

Returning to the statistical significance of the effect size, in essence, every percentile above the 73rd (i.e., 100% minus 27%) would satisfy the prerequisite of being statistically significant based merely on the number of selections. However, since most of the values at the 73rd percentile were minimal within the 0–100 scale, a higher percentile (i.e., the 85th) was adopted to improve visualization and was tested for statistical significance. The computational simulation showed that the 85th percentiles become significant (at the $p < .05$ level) when they exceed 62/100 (i.e., the 95th percentile of the 85th percentile distributions never exceeds 62 for any of the 12 tested variables). This value results from assuming a uniform distribution of responses (as described above) and becomes more relaxed under the assumption of a normal distribution (i.e., 53/100). Given that the distribution of responses violates normality in many cases, I have adopted the stricter assumption of a uniform distribution and 62 as the threshold for statistical significance of the 85th percentile of responses. Of the two criteria (the number of raters and the magnitude of the 85th percentile), the latter is the stricter in the vast majority of cases. The only exceptions are a few instances where the number of raters selecting the aroma was less than 8, but the 5th largest rating was 62/100 or above.
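The logic of the 85th-percentile criterion can be sketched in the same way. The snippet below is an illustrative simulation under the uniform-rating assumption described above, not the study's code; the function names and the nearest-rank percentile rule are assumptions. With 29 raters, the nearest-rank 85th percentile is the 5th largest rating. Because percentile conventions differ between software packages, and because the reported threshold is taken across all 12 variables, this sketch should not be expected to reproduce the value of 62 exactly.

```python
import math
import random


def nearest_rank_percentile(values, pct):
    """Nearest-rank percentile (an assumed convention, no interpolation)."""
    ordered = sorted(values)
    idx = max(0, math.ceil(pct / 100 * len(ordered)) - 1)
    return ordered[idx]


def chance_p85_threshold(n_raters=29, n_aromas=12, picks_per_rater=2,
                         n_sims=5000, seed=0):
    """95th percentile of the chance distribution of 85th-percentile ratings.

    Hypothetical helper: raters who pick the target aroma give a rating drawn
    uniformly from 0-100; all other raters contribute a rating of 0.
    """
    rng = random.Random(seed)
    target = 0  # index of the tracked aroma (arbitrary by symmetry)
    p85_values = []
    for _ in range(n_sims):
        ratings = []
        for _ in range(n_raters):
            picked = rng.sample(range(n_aromas), picks_per_rater)
            ratings.append(rng.uniform(0, 100) if target in picked else 0.0)
        p85_values.append(nearest_rank_percentile(ratings, 85))
    return nearest_rank_percentile(p85_values, 95)


p85_threshold = chance_p85_threshold()
print(f"Chance-level 85th percentile (95% bound): {p85_threshold:.1f}")
```

The design mirrors the two-stage reasoning in the text: an inner percentile summarizes one simulated panel, and an outer percentile over many panels gives the chance bound against which observed values are compared.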

Figure 2 shows the 85th percentiles of the scores of the 26 stimuli on each of the 12 aromatic variables. With the exception of cinnamon and pomegranate, the remaining 10 aromatic variables featured at least one statistically significant association with one of the sound stimuli.

Figure 2.

Bar graphs depicting the 85th percentiles of the acquired associations (vertical axes) between the 12 aromas and each of the 26 sound stimuli (horizontal axes). The horizontal red line at 62 signifies the threshold for statistical significance ($p < .05$). Most of the aromas, except cinnamon and pomegranate, feature at least one statistically significant sonic correspondence.

Acoustical Correlates of Aromatic Categories

Since several statistically significant correspondences between the sound stimuli and the provided aromas were observed, I proceeded to identify acoustical correlates of specific scents and groupings of more general scent categories. A correlational analysis (Spearman's $\rho$) between audio features extracted from the sound stimuli using the Timbre and MIR Matlab Toolboxes (Kazazis et al., 2021; Lartillot & Toiviainen, 2007; Peeters et al., 2011) and ratings on each aromatic variable is shown in Table 1. The settings applied for the audio feature extraction based on the harmonic representation of the Timbre Toolbox were the following: window type: Blackman, window length: 4,096 samples ($f_{s} = 44.1$ kHz), hop size: 512 samples, magnitude threshold: 40 dBFS, minimum partial duration: 0.1 s, pitch range: 30–5,000 Hz, inharmonicity tolerance: 0.1. The default settings for the temporal energy representation were applied. The MIR Toolbox was used to extract the mean auditory roughness based on the model by Vassilakis (2001) (time window: 0.05 s, 25% overlap), an estimation of the number of distinct pitches for each sound identified through mirpitch (“Tolonen” model), and a metric of the key variability expressed as the standard deviation of mirkey output (time window: 0.25 s, 25% overlap).
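For readers less familiar with rank correlation, the following minimal sketch shows how Spearman's coefficient between one audio feature and one aromatic rating could be computed across stimuli. It is a plain-Python illustration rather than the Matlab workflow used in the study, and the feature values and ratings in the example are hypothetical.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns NaN if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    denom = math.sqrt(vx * vy)
    return cov / denom if denom > 0 else float("nan")


def spearman_rho(x, y):
    """Spearman's rho computed as the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: pitch (Hz) of six stimuli and their ratings on one aroma.
pitch_hz = [98, 131, 196, 262, 392, 784]
aroma_rating = [0, 5, 3, 20, 35, 60]
rho = spearman_rho(pitch_hz, aroma_rating)
print(f"Spearman's rho = {rho:.3f}")
```

Working on ranks makes the coefficient insensitive to the very different scales of the audio descriptors, which is one reason a rank-based measure suits a table that mixes features such as pitch, attack time, and roughness.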

Table 1.

Spearman's rank correlation coefficients between ratings of stimuli on aromatic variables and audio features extracted using the Timbre and MIR Toolboxes. The ratings were represented as the median value of the non-zero ratings for each sound–aroma pair weighted by the number of raters. The aromatic variables are grouped into more general categories to facilitate interpretation.

Aromatic category	Sweet			Spices/Other				Sour		Fruit
Audio feature	Vanilla	Honey	Caramel	Cinnamon	Tobacco	Coffee	Pepper	Lemon	Lemon blossom	Pomegranate	Melon	Cherry
Pitch					-.64**		-.45*	.48*	.55**	.52**		.55**
Spectral centroid	-.47*		-.45*					.53**	.45*
Spectral skewness		.43*	.43*
Tristimulus 1				-.41*	-.59**	-.60**	-.60**				.55**
Tristimulus 2		.39*			-.53**		-.44*			.43*	.56**	.47*
Tristimulus 3	-.40*			.46*			.46*					-.40*
Harmonic/noise energy					-.65**		-.54**			.44*	.54**	.43*
Odd/even ratio					-.45*	-.65**					-.41*
Spectral deviation					-.61**						-.43*
Attack time						.43*
Attack slope	-.49*						.43*
Temporal centroid						.49**	.50**				-.50**
Roughness							.45*			-.50*		-.40*
Key variability	.38*										.40*
No. of pitches											.42*	.47*

Significance levels: *$p < .05$, **$p < .01$.

The behavioral ratings were represented as the median value of the non-zero registrations for each sound–aroma pair weighted by the relative number of raters (i.e., no. of registrations $\div$ maximum number of registrations observed in the behavioral data). This approach also included the non-significant sound–aroma relationships in the correlation analysis, but it was nevertheless adopted to circumvent the problem of very sparse data resulting from taking into account merely statistically significant pairs.
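The weighting scheme described above can be made concrete with a small sketch. This is an illustrative reading of the description, not the study's code: the function name and the toy ratings are hypothetical, and assigning a score of zero to pairs without any registration is an assumption.

```python
from statistics import median


def weighted_median_ratings(ratings_by_pair):
    """Median of non-zero ratings, weighted by the relative number of raters.

    `ratings_by_pair` maps a (sound, aroma) pair to the list of ratings given
    by all raters, where 0 means that no association was registered.
    """
    nonzero = {pair: [r for r in ratings if r > 0]
               for pair, ratings in ratings_by_pair.items()}
    # Maximum number of registrations observed for any sound-aroma pair.
    max_registrations = max((len(v) for v in nonzero.values()), default=0)
    scores = {}
    for pair, values in nonzero.items():
        if not values:
            scores[pair] = 0.0  # assumption: no registrations yields zero
        else:
            scores[pair] = median(values) * len(values) / max_registrations
    return scores


# Hypothetical ratings from five raters for one sound and three aromas.
toy_ratings = {
    ("caramel sound", "vanilla"): [80, 60, 0, 70, 90],
    ("caramel sound", "pepper"): [0, 0, 10, 0, 0],
    ("caramel sound", "lemon"): [0, 0, 0, 0, 0],
}
scores = weighted_median_ratings(toy_ratings)
print(scores)
```

In this toy case the vanilla pair keeps its full median because it has the most registrations, whereas the single pepper registration is scaled down to a quarter of its value, which shows how the weighting tempers sparse responses without discarding them.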

Figure 3 presents four representative sound stimuli that received statistically significant ratings in each of the four general aromatic categories of Table 1. Broadly in accordance with Table 1, the stimulus that received a statistically significant rating in the sweet category (i.e., vanilla) features energy concentrated in the lower partials (i.e., lower spectral centroid). In contrast, the stimulus that was rated highly in the spices/other category (i.e., pepper) features stronger high-frequency energy and noisy content, lower pitch, and longer duration (i.e., higher temporal centroid). The representative of the sour category (i.e., lemon) is a sound with both high pitch and strong high-frequency content, some modulation, and the presence of non-harmonic partials. Finally, the representative of the fruit category (i.e., melon) is a sound with its highest energy concentrated in the low frequencies and in the 2nd harmonic in particular (i.e., Tristimulus 2). There is a complete lack of non-harmonic components (i.e., a high harmonic-to-noise energy ratio), but this specific sound also features a major third interval (note a distinct harmonic series initiating from the 3rd visible partial), illustrating the weak positive correlation with the number of distinct pitches listed in Table 1.

Figure 3.

Four examples of correspondences between the aromatic profiles and the spectrograms of highly rated sound stimuli in each of the four aromatic categories of Table 1. The radar plots display the 85th percentile of the response distributions. The red line corresponds to the 62/100 level of statistical significance, as detailed in the subsection Analysis of Responses. The labeling of the stimuli stems from the intended aroma that each synthetic sound was meant to resemble.

Figure 4 shows the dendrogram resulting from a hierarchical cluster analysis (Ward's method, distance metric: Spearman's $\rho$) on the weighted median values of the non-zero registrations of the aromatic variables. Notably, the observed groupings, derived solely from sonic correspondences of aromas, exhibit a structure aligning with some expected patterns of odor categorization. For example, the fruits (melon, cherry, pomegranate) form a wide cluster together with the sweet scents (vanilla, caramel, honey). Lemon and lemon blossom form a separate group, while spices and others (cinnamon, pepper, tobacco, coffee) are grouped together. This suggests that a sound-to-aroma association task has the potential to unveil underlying aspects of odor classification.

Figure 4.

Dendrogram from hierarchical cluster analysis (Ward's method, distance metric: Spearman's correlation) applied to the 12 aromatic variables. One major cluster encompasses the majority of fruity and sweet aromas. Spices and others form a distinct group, while lemon and lemon blossom form a two-member cluster that is loosely linked to the fruit/sweet one.
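A cluster analysis of this kind could be sketched with SciPy as follows. Several choices here are assumptions rather than details given in the text: the Spearman correlation is converted to a distance as $1 - \rho$, the rating matrix is synthetic (26 hypothetical sounds by 6 hypothetical aromas built from two latent profiles), and `cluster_aromas` is an illustrative name. Note also that SciPy documents Ward linkage as strictly defined only for Euclidean distances, so applying it to a correlation-based distance is a pragmatic approximation.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform
from scipy.stats import spearmanr


def cluster_aromas(ratings, n_clusters):
    """Ward clustering of aroma columns using 1 - Spearman's rho as distance.

    ratings: array of shape (n_sounds, n_aromas), at least 3 aromas.
    """
    rho, _ = spearmanr(ratings)          # columns are treated as variables
    dist = 1.0 - rho                     # assumed correlation-to-distance mapping
    dist = (dist + dist.T) / 2.0         # enforce exact symmetry
    np.fill_diagonal(dist, 0.0)
    condensed = squareform(dist, checks=False)
    tree = linkage(condensed, method="ward")
    labels = fcluster(tree, t=n_clusters, criterion="maxclust")
    return tree, labels


# Synthetic ratings: aromas 0-2 follow one latent profile, aromas 3-5 another.
rng = np.random.default_rng(0)
profile_a = rng.normal(size=26)
profile_b = rng.normal(size=26)
ratings = np.column_stack(
    [profile_a + 0.1 * rng.normal(size=26) for _ in range(3)]
    + [profile_b + 0.1 * rng.normal(size=26) for _ in range(3)]
)
tree, labels = cluster_aromas(ratings, n_clusters=2)
```

The returned `tree` can be passed to `scipy.cluster.hierarchy.dendrogram` to draw a plot in the style of Figure 4; with the synthetic data above, the two groups of columns are expected to fall into separate clusters.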

Discussion

The present study sought to enhance our understanding of correspondences between timbre and scent. This subject has not yet received extensive attention despite its potential applications in the sonic representation of odors for marketing, the well-being industry, and even artistic expression. The current findings suggest that it is possible to obtain reliable cross-modal associations between complex timbres and aromas. Ten out of the twelve aromas under study featured a statistically significant association with at least one sound stimulus (see Figure 2). This outcome is particularly impressive given the complexity of the task at hand. Not only were there numerous aromatic variables to choose from (12 in total), but some of them were also perceived as closely related by participants, such as lemon with lemon blossom or cherry with melon (see Figure 4). Taken together, these results provide several adequate sonic representations for many of the scents under study. This was one of the major objectives of this work for informing future cross-modal experimental designs. Obtaining validated sonic representations is crucial for distinguishing congruent from incongruent properties between audition and olfaction, an essential step in investigating how sound and music may influence the perception and appreciation of scents. This result supports the notion that iconic extra-musical meaning, as proposed by Koelsch (2011), may also encompass an olfaction-related component. Moreover, given that pleasantness is a prominent olfactory characteristic, iconic and indexical (i.e., signaling an affective state) forms of extra-musical meaning may become intertwined in the context of sound–aroma correspondences, in line with the proposed mechanism of affective similarity for cross-modal correspondences. Thus, a sound may also be associated with an aroma based on the emotion (e.g., pleasantness) it commonly evokes.

At the same time, the correlational analysis between aromatic variables and audio features extracted from the sound stimuli provided more general insight into possible acoustical correlates of olfactory properties. To facilitate a more general interpretation of findings, the aromatic variables were organized into four categories, i.e., sweet, spices/other, sour, and fruit. As may have been apparent from the Introduction, odor classification is a non-trivial task, and consequently, there is no universally accepted taxonomy (Kaeppler & Mueller, 2013). Therefore, the source-cause classification I opted for could be challenged. After all, lemon is a fruit, spices constitute a very broad category, tobacco and coffee are, strictly speaking, not spices, and lemon blossom is a flower. However, both the cluster analysis of the aromatic variables and the audio features correlated with members of each category portray a relatively cohesive picture (see Figure 4 and Table 1) and support the selection of this classification.

In agreement with existing evidence (Crisinel & Spence, 2012a; Ward et al., 2021), both sour and fruit categories were positively correlated with pitch. They were, however, differentiated acoustically through several features. Fruity aromas (e.g., cherry, melon, and pomegranate) were more likely to be associated with sounds featuring higher energy concentration between the 2nd and 4th harmonics (i.e., Tristimulus 2), stronger harmonic-to-noise energy ratios, lower roughness, and a stronger presence of distinct pitches. In contrast, sour aromas exhibited a positive correlation with the distribution of energy at the higher frequencies (i.e., higher spectral centroid values), aligning with previous findings in Knoeferle et al. (2015) and Simner et al. (2010). Sweet scents were associated with a stronger concentration of energy in the lower harmonic partials (i.e., lower spectral centroid and positive spectral skewness) and indications of smoother attacks (i.e., smaller attack slopes), supporting Mesz et al. (2011) and Bronner et al. (2012). The spices/other category was characterized by lower pitch, energy concentration toward the higher harmonic partials (i.e., negative correlations with Tristimulus 1 & 2 as opposed to positive correlations with Tristimulus 3), noisier timbres (i.e., lower harmonic-to-noise energy ratio), higher roughness, less smooth spectral envelopes (i.e., higher odd-to-even ratio and harmonic spectral deviation), slower attacks (i.e., positive correlations with attack time and slope), and longer-lasting sounds (i.e., higher temporal centroid). Many of these findings align with prior research (Knöferle & Spence, 2012; Ward et al., 2021), especially when considering the spices/other category as a potential equivalent to bitterness, given the inclusion of coffee and black pepper.
In particular, the combined observation of a positive correlation with lower pitch and weaker concentration of energy in the low partials may offer a partial explanation for certain discrepancies highlighted in the literature concerning the acoustic correlates of bitterness. Indeed, Knoeferle et al. (2015) reported a negative association between bitterness and spectral centroid, while more recent work by Wang et al. (2019) observed a positive correlation between perceived bitterness in wine tasting and the spectral centroid of background music. The current findings suggest that a combination of lower spectral centroid (due to low pitch) with a spectral balance skewing toward the higher partials may contribute to the correspondence between sound characteristics and bitterness perception. The preceding discussion interchangeably refers to studies that have examined either sound–taste or sound–odor associations. Combining evidence from both perspectives is warranted since taste and smell often share common properties due to associative learning formed by past experiences (Stevenson et al., 1995; Stevenson & Boakes, 2003). Thus, a lemon scent may partially evoke a sour sensation, a caramel scent may evoke sweetness, and so on. In addition, similarly to previous literature on sound–aroma correspondences, the majority of the aromatic stimuli used in this experiment had a straightforward link with taste. Therefore, the current data are not useful for either rejecting or confirming the “indirect hypothesis” proposed by Deroy et al. (2013), according to which the origin of observed olfactory–auditory mappings may lie in a common connection to gustatory properties.
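The interplay between pitch, spectral centroid, and spectral balance mentioned above can be illustrated with a small numerical sketch. It uses simplified, amplitude-weighted definitions (spectral centroid as the amplitude-weighted mean partial frequency; Tristimulus 1, 2, and 3 as the relative weight of the 1st, the 2nd to 4th, and the 5th and higher harmonics). These definitions may differ in detail from the descriptors extracted in the study, and the two tones, their amplitudes, and the function name `harmonic_features` are hypothetical.

```python
import numpy as np


def harmonic_features(f0, amplitudes):
    """Return (spectral centroid in Hz, T1, T2, T3) for a purely harmonic tone.

    f0: fundamental frequency in Hz.
    amplitudes: amplitudes of harmonics 1, 2, 3, ... (simplified, linear scale).
    """
    amps = np.asarray(amplitudes, dtype=float)
    harmonics = np.arange(1, amps.size + 1)
    total = amps.sum()
    centroid_hz = float((harmonics * f0 * amps).sum() / total)
    t1 = float(amps[0] / total)          # fundamental
    t2 = float(amps[1:4].sum() / total)  # harmonics 2-4
    t3 = float(amps[4:].sum() / total)   # harmonics 5 and above
    return centroid_hz, t1, t2, t3


# Hypothetical low-pitched tone with a weak fundamental and strong upper partials.
low = harmonic_features(110.0, [0.1, 0.2, 0.3, 0.4, 1.0, 1.0, 1.0, 1.0])
# Hypothetical high-pitched tone dominated by its lowest partials.
high = harmonic_features(880.0, [1.0, 0.5, 0.25, 0.125, 0.0, 0.0, 0.0, 0.0])
```

Under these assumptions, the low tone has the lower spectral centroid in absolute terms even though most of its energy sits in the upper partials (high Tristimulus 3), which is the kind of combination the paragraph above proposes for the spices/other category.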

Limitations and Future Work

While contributing to and augmenting the limited literature on sound–scent correspondences, the present study does have certain limitations. First, the auditory stimuli were presented in more controlled conditions and analyzed more thoroughly (i.e., extraction of audio features) compared to the olfactory ones. Indeed, the sound stimuli were equalized for loudness, but the same was not the case for the intensity of the aromatic variables. Although three drops were used from each essential oil, this did not guarantee equal perceived intensity (due to potentially different concentrations and/or sensitivities), not to mention that tobacco was presented in the form of leaves. Based on the findings by Crisinel & Spence (2012a), there seems to be no clear reflection of aromatic intensity in scent–pitch correspondences. However, a potential association with timbral qualities in the form of source-cause categories was implied. In general, the same scent may feature different perceived qualities because of differences in concentrations (Kaeppler & Mueller, 2013). Expanding on this, future work should not only adopt a more controlled approach toward equalized perceived aromatic intensity but should also seek to link acoustic characteristics of sound stimuli with chemical substances for each scent as opposed to the elementary aromatic categorization of the current approach. A similar approach has been adopted to offer chemical correlates for a range of cross-modal associations centered around olfaction through the use of an electronic nose (Ward et al., 2020, 2022). Hence, the source-cause category caveat may be mitigated not only for auditory objects but also for olfactory ones.

In the same vein, the current design introduces the possibility of semantic mediation effects, since aromatic variables were labelled to explicitly indicate their source. The degree of influence of source semantics on odor perceptual processing and classification is a subject of open debate (see Kaeppler & Mueller, 2013), while it has also been reported that knowledge of an odor's identity can affect its association with emotions and music genres (albeit not with shape angularity, pitch, smoothness of texture, or perceived pleasantness) (Ward et al., 2021). Thus, it is not unlikely that such effects may have somewhat affected the reported associations. However, the number of provided scents (12) was high enough, and some of them were similar enough, to warrant the inclusion of labels to reduce noise in the acquired behavioral data. In any case, participants were instructed to focus on the provided aroma and not on a prototypical representation of this category that they may have had in mind, even if these two were not entirely aligned. While acknowledging all these caveats, the presentation of actual olfactory stimuli as a common reference still leads to a more robust experimental design compared to the alternative of simply asking participants to imagine scent sensations based on memory, according to current evidence (Bronner et al., 2012; Zarzo & Stanton, 2009).

Overall, this study not only corroborated certain prior findings regarding sound–scent correspondences but also contributed a more nuanced perspective by examining timbre features. Nevertheless, there is substantial untapped potential for further exploration and application of cross-modal correspondences between audition and olfaction. For one thing, the cross-modal data reported here originated from an auditory-oriented population. It has been suggested that odor familiarity influences perceived pleasantness (Crisinel & Spence, 2012a), a salient dimension in odor perception. Furthermore, Wang et al. (2015) have found that taste perception associated with certain musical characteristics can vary between musically trained participants and musical novices. While the aromas examined in this context were relatively common, it would be intriguing to compare the perceptual aromatic organization (that resulted from sonic correspondences) identified by musicians with an equivalent organization established by a panel specializing in olfaction.

Several diverse experimental paradigms could also be considered. A slight variation would be to limit the aromatic variables to a significantly smaller number without source labelling to attenuate possible semantic mediation effects. Moreover, given the acquired sound–odor relationships, a comparison of perceptual spaces resulting from pairwise dissimilarity ratings for both sound and scent groups would offer a purely perceptual standpoint along with acoustical and chemical correlates. This approach could also be complemented by a semantic description of sounds and odors to facilitate interpretation and to explore possible common forms of iconic and indexical meaning. Alternatively, a different approach altogether would involve the priming experimental paradigm to assess potential effects on odor perception induced by listening to a prior sound and vice versa. Priming with sound has already been demonstrated to affect chocolate taste ratings, although concurrent presentation was found to exert even greater influence (Wang et al., 2020). Finally, an experimental set-up resembling the approach of musical improvisation in response to olfactory stimuli by Mesz et al. (2023) could also be adopted for timbre. In such a scenario, participants would be able to adjust the parameters of a sound synthesizer in real time to achieve the best possible matching of a timbre with a target scent. A combination of all the above research paths could lead to a more comprehensive understanding of the rules governing the crosstalk between audition and olfaction. This may, in turn, contribute to a stronger scientific foundation for the burgeoning real-world applications of auditory–olfactory interplay, already underway in domains such as marketing (Spence et al., 2021; Spence & Keller, 2024), rehabilitation, education, art, and entertainment (Spence, 2021a).

Supplemental Material

Supplemental material, sj-zip-1-mns-10.1177_20592043241274258 for Sonic Bouquet: Decoding Cross-Modal Correspondences Between Timbre and Scent by Asterios Zacharakis in Music & Science

Supplemental material, sj-zip-2-mns-10.1177_20592043241274258 for Sonic Bouquet: Decoding Cross-Modal Correspondences Between Timbre and Scent by Asterios Zacharakis in Music & Science

Footnotes

Acknowledgments

I am grateful to Ioulieta Michail for her assistance in conducting the experiments and for carrying out a thorough literature review on correspondences between taste/aromas and sound. I would also like to thank the participants involved in the experiments, as well as the Action Editor and both reviewers for their valuable assistance in improving this manuscript. Finally, I appreciate Vasilis Paras' help in creating the sound stimuli using Ableton Live.

Portions of this work were reported in .

Action Editor

Zachary Wallmark, School of Music and Dance, University of Oregon.

Peer Review

Charles Spence, Department of Experimental Psychology, Oxford University; Caroline Traube, Faculté de Musique, Université de Montréal.

Declaration of Conflicting Interests

The author declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Ethical approval

This experiment received ethical approval from the Aristotle University ethics board. Participants provided written informed consent acknowledging their voluntary participation in the study, understanding the purpose, procedures, potential risks and benefits, confidentiality measures, and their right to withdraw at any time. They also agreed to the storage, management, and, if applicable, anonymized sharing of their data for research purposes.

Funding

The author received no financial support for the research, authorship, and/or publication of this article.

ORCID iD

Asterios Zacharakis

Data Availability Statement

The raw data collected for this experiment are available in the supplementary material as 29 .xlsx files. Each file represents the associations reported by one of the 29 participants between each sound stimulus and the 12 aromatic variables. The supplementary materials also include the 26 sound stimuli as .wav files.

References

Adeli

Rouat

Molotchnikoff

(2014). Audiovisual correspondence between musical timbre and visual shapes. Frontiers in Human Neuroscience, 8, 1–11. https://doi.org/10.3389/fnhum.2014.00352

Agapakis

C. M.

Tolaas

(2012). Smelling in multiple dimensions. Current Opinion in Chemical Biology, 16(5–6), 569–575. https://doi.org/10.1016/j.cbpa.2012.10.035

Belkin

Martin

Kemp

S. E.

Gilbert

A. N.

(1997). Auditory pitch as a perceptual analogue to odor quality. Psychological Science, 8(4), 340–342. https://doi.org/10.1111/j.1467-9280.1997.tb00450.x

Ben-Artzi

Marks

L. E.

(1995). Visual-auditory interaction in speeded classification: Role of stimulus difference. Perception & Psychophysics, 57(8), 1151–1162. https://doi.org/10.3758/BF03208371

Bernstein

I. H.

Edelstein

B. A.

(1971). Effects of some variations in auditory input upon visual choice reaction time. Journal of Experimental Psychology, 87(2), 241–247. https://doi.org/10.1037/h0030524

Bronner

Frieler

Bruhn

Hirt

Piper

(2012). What is the sound of citrus? Research on the correspondences between the perception of sound and flavour. In Proceedings of the 12th International Conference of Music Perception and Cognition (ICMPC) and the 8th Triennial Conference of the European Society for the Cognitive Sciences of Music (ESCOM), 142–148.

Burlingame

Suffet

Khiari

Bruchet

(2004). Development of an odor wheel classification scheme for wastewater. Water Science and Technology, 49(9), 201–209. https://doi.org/10.2166/wst.2004.0571

Carvalho

F. R.

Van Ee

Rychtarikova

Touhafi

Steenhaut

Persoone

Spence

(2015). Using sound-taste correspondences to enhance the subjective value of tasting experiences. Frontiers in Psychology, 6, 1309. https://doi.org/10.3389/fpsyg.2015.01309

Carvalho

F. R.

Wang

Q. J.

Van Ee

Persoone

Spence

(2017). Smooth operator”: Music modulates the perceived creaminess, sweetness, and bitterness of chocolate. Appetite, 108, 383–390. https://doi.org/10.1016/j.appet.2016.10.026

10.

Crisinel

A.-S.

Cosser

King

Jones

Petrie

Spence

(2012). A bittersweet symphony: Systematically modulating the taste of food by changing the sonic properties of the soundtrack playing in the background. Food Quality and Preference, 24(1), 201–204. https://doi.org/10.1016/j.foodqual.2011.08.009

11.

Crisinel

A.-S.

Jacquier

Deroy

Spence

(2013). Composing with cross-modal correspondences: Music and odors in concert. Chemosensory Perception, 6(1), 45–52. https://doi.org/10.1007/s12078-012-9138-4

12.

Crisinel

A.-S.

Spence

(2009). Implicit association between basic tastes and pitch. Neuroscience Letters, 464(1), 39–42. https://doi.org/10.1016/j.neulet.2009.08.016

13.

Crisinel

A.-S.

Spence

(2010a). As bitter as a trombone: Synesthetic correspondences in nonsynesthetes between tastes/flavors and musical notes. Attention, Perception, & Psychophysics, 72(7), 1994–2002. https://doi.org/10.3758/APP.72.7.1994

14.

Crisinel

A.-S.

Spence

(2010b). A sweet sound? Food names reveal implicit associations between taste and pitch. Perception, 39(3), 417–425. https://doi.org/10.1068/p6574

15.

Crisinel

A.-S.

Spence

(2012a). A fruity note: Crossmodal associations between odors and musical notes. Chemical Senses, 37(2), 151–158. https://doi.org/10.1093/chemse/bjr085

16.

Crisinel

A.-S.

Spence

(2012b). The impact of pleasantness ratings on crossmodal associations between food samples and musical notes. Food Quality and Preference, 24(1), 136–140. https://doi.org/10.1016/j.foodqual.2011.10.007

17.

Deroy

Crisinel

A.-S.

Spence

(2013). Crossmodal correspondences between odors and contingent features: Odors, musical notes, and geometrical shapes. Psychonomic Bulletin & Review, 20(5), 878–896. https://doi.org/10.3758/s13423-013-0397-0

18.

Eitan

Rothschild

(2011). How music touches: Musical parameters and listeners’ audio-tactile metaphorical mappings. Psychology of Music, 39(4), 449–467. https://doi.org/10.1177/0305735610377592

19.

Eitan

Schupak

Gotler

Marks

L. E.

(2014). Lower pitch is larger, yet falling pitches shrink. Experimental Psychology, 61(4), 273–284. https://doi.org/10.1027/1618-3169/a000246

20.

Farbood

M. M.

(2012). A parametric, temporal model of musical tension. Music Perception, 29(4), 387–428. https://doi.org/10.1525/mp.2012.29.4.387

21.

Farbood

M. M.

Upham

(2013). Interpreting expressive performance through listener judgments of musical tension. Frontiers in Psychology, 4, 998. https://doi.org/10.3389/fpsyg.2013.00998

22.

Gallace

Spence

(2006). Multisensory synesthetic interactions in the speeded classification of visual size. Perception & Psychophysics, 68(7), 1191–1203. https://doi.org/10.3758/BF03193720

23.

Giannos

Athanasopoulos

Cambouropoulos

(2021). Cross-modal associations between harmonic dissonance and visual roughness. Music & Science, 4, 20592043211055484. https://doi.org/10.1177/20592043211055484

24.

Guetta

Loui

(2017). When music is salty: The crossmodal associations between sound and taste. PLOS ONE, 12(3), e0173366. https://doi.org/10.1371/journal.pone.0173366

25.

Hashim

Stewart

Küssner

M. B.

Omigie

(2023). Music listening evokes story-like visual imagery with both idiosyncratic and shared content. PLOS ONE, 18(10), e0293412. https://doi.org/10.1371/journal.pone.0293412

26.

Huovinen

Kaila

A.-K.

(2015). The semantics of musical topoi: An empirical approach. Music Perception, 33(2), 217–243. https://doi.org/10.1525/mp.2015.33.2.217

27.

Huron

D. B.

(2006). Sweet anticipation: Music and the psychology of expectation. MIT press.

28.

Kaeppler

Mueller

(2013). Odor classification: A review of factors influencing perception-based odor arrangements. Chemical Senses, 38(3), 189–209. https://doi.org/10.1093/chemse/bjs141

29.

Kazazis

Depalle

McAdams

(2021). The Timbre Toolbox Version R2021a, User’s Manual.

30.

Klapetek

Ngo

M. K.

Spence

(2012). Does crossmodal correspondence modulate the facilitatory effect of auditory cues on visual search? Attention, Perception, & Psychophysics, 74(6), 1154–1167. https://doi.org/10.3758/s13414-012-0317-9

31.

Knoeferle

K. M.

Woods

Käppler

Spence

(2015). That sounds sweet: Using cross-modal correspondences to communicate gustatory attributes. Psychology & Marketing, 32(1), 107–120. https://doi.org/10.1002/mar.20766

32.

Knöferle

Spence

(2012). Crossmodal correspondences between sounds and tastes. Psychonomic Bulletin & Review, 19, 992–1006. https://doi.org/10.3758/s13423-012-0321-z

33.

Koelsch

(2011). Towards a neural basis of processing musical semantics. Physics of Life Reviews, 8(2), 89–105. https://doi.org/10.1016/j.plrev.2011.04.004

34.

Köhler

(1929). Gestalt Psychology. Liveright.

35.

Krumhansl

C. L.

(2015). Statistics, structure, and style in music. Music Perception, 33(1), 20–31. https://doi.org/10.1525/mp.2015.33.1.20

36.

Lartillot

Toiviainen

(2007). A matlab toolbox for musical feature extraction from audio. In S. Marchand (Ed.), Proceedings of the 10th international conference on digital audio effects (DAFx-07) (pp. 237–244). LaBRI - University of Bordeaux.

37.

Lichte

(1941). Attributes of complex tones. Journal of Experimental Psychology, 28(6), 455–480. https://doi.org/10.1037/h0053526

38.

Majid

Burenhult

(2014). Odors are expressible in language, as long as you speak the right language. Cognition, 130(2), 266–270. https://doi.org/10.1016/j.cognition.2013.11.004

39.

Majid

Kruspe

(2018). Hunter-gatherer olfaction is special. Current Biology, 28(3), 409–413. https://doi.org/10.1016/j.cub.2017.12.014

40.

Marks

L. E.

(1987). On cross-modal similarity: Auditory–visual interactions in speeded discrimination. Journal of Experimental Psychology: Human Perception and Performance, 13(3), 384. https://doi.org/10.1037/0096-1523.13.3.384

41.

Marks

L. E.

(1989). On cross-modal similarity: The perceptual structure of pitch, loudness, and brightness. Journal of Experimental Psychology: Human Perception and Performance, 15(3), 586–602. https://doi.org/10.1037/0096-1523.15.3.586

42.

Marks

L. E.

Hammeal

R. J.

Bornstein

M. H.

Smith

L. B.

(1987). Perceiving similarity and comprehending metaphor. Monographs of the Society for Research in Child Development, 52(1), i –100. https://doi.org/10.2307/1166084

43.

Meilgaard

Reid

Wyborski

(1982). Reference standards for beer flavor terminology system. Journal of the American Society of Brewing Chemists, 40(4), 119–128. https://doi.org/10.1094/ASBCJ-40-0119

44.

Mesz

Gorla

Zarzo

(2023). The music of perfume: Crossmodal correspondences between musical features and olfactory perception. Music Perception, 41(2), 110–131. https://doi.org/10.1525/mp.2023.41.2.110

45.

Mesz

Trevisan

M. A.

Sigman

(2011). The taste of music. Perception, 40(2), 209–219. https://doi.org/10.1068/p6801

46.

Noble

A. C.

Arnold

Masuda

B. M.

Pecore

Schmidt

Stern

(1984). Progress towards a standardized system of wine aroma terminology. American Journal of Enology and Viticulture, 35(2), 107–109. https://doi.org/10.5344/ajev.1984.35.2.107

47.

Noble

Thoret

Henry

McAdams

(2020). Semantic dimensions of sound mass music: Mappings between perceptual and acoustic domains. Music Perception, 38(2), 214–242. https://doi.org/10.1525/mp.2020.38.2.214

48.

Painter

J. G.

Koelsch

(2011). Can out-of-context musical sounds convey meaning? An ERP study on the processing of meaning in music. Psychophysiology, 48(5), 645–655. https://doi.org/10.1111/j.1469-8986.2010.01134.x

49.

Patel

A. D.

(2008). Music, language, and the brain, chapter Meaning, (pp. 300–351). Oxford University Press.

50.

Peeters

Giordano

B. L.

Susini

Misdariis

McAdams

(2011). The timbre toolbox: Extracting audio descriptors from musical signals. The Journal of the Acoustical Society of America, 130(5), 2902–2916. https://doi.org/10.1121/1.3642604

51.

Piesse

G. W. S.

(1862/1891). Piesse’s art of perfumery. Piesse and Lubin, 5th edition.

52.

Piggott

Jardine

(1979). Descriptive sensory analysis of whisky flavour. Journal of the Institute of Brewing, 85(2), 82–85. https://doi.org/10.1002/j.2050-0416.1979.tb06830.x

53.

Pratt

Doak

(1976). A subjective rating scale for timbre. Journal of Sound and Vibration, 45(3), 317–328. https://doi.org/10.1016/0022-460X(76)90391-6

54.

Ramachandran

V. S.

Hubbard

E. M.

(2003). Hearing colors, tasting shapes. Scientific American, 288(5), 52–59. https://doi.org/10.1038/scientificamerican0503-52

55.

Ramachandran

V. S.

Marcus

Chunharas

(2020). Bouba-kiki: Cross-domain resonance and the origins of synesthesia, metaphor, and words in the human mind. In Sathian

Ramachandran

V. S.

(Eds.), Multisensory perception (pp. 3–40). Elsevier.

56.

Reymore

(2022). Characterizing prototypical musical instrument timbres with timbre trait profiles. Musicae Scientiae, 26(3), 648–674. https://doi.org/10.1177/10298649211001523

57.

Reymore

Huron

(2020). Using auditory imagery tasks to map the cognitive linguistic dimensions of musical instrument timbre qualia. Psychomusicology: Music, Mind, and Brain, 30(3), 124. https://doi.org/10.1037/pmu0000263

58.

Rosi

Arias Sarah

Houix

Misdariis

Susini

(2023). Shared mental representations underlie metaphorical sound concepts. Scientific Reports, 13(1), 5180. https://doi.org/10.1038/s41598-023-32214-2

59.

Saitis

Weinzierl

(2019). Timbre: Acoustics, perception, and cognition, chapter The semantics of timbre, (pp. 119–149). Springer.

60.

Samoylenko

McAdams

Nosulenko

(1996). Systematic analysis of verbalizations produced in comparing musical timbres. International Journal of Psychology, 31(6), 255–278. https://doi.org/10.1080/002075996401025

61.

Siedenburg

(2017). Musical instruments in the 21st century: Identities, configurations, practices, chapter Instruments Unheard of: On the Role of Familiarity and Sound Source Categories in Timbre Perception, (pp. 385–396). Springer.

62.

Simner

Cuskley

Kirby

(2010). What sound does that taste? Cross-modal mappings across gustation and audition. Perception, 39(4), 553–569. https://doi.org/10.1068/p6591

63.

Speed

L. J.

Croijmans

Dolscheid

Majid

(2021). Crossmodal associations with olfactory, auditory, and tactile stimuli in children and adults. i-Perception, 12(6), 20416695211048513. https://doi.org/10.1177/20416695211048513

64.

Spence

(2011). Crossmodal correspondences: A tutorial review. Attention, Perception, & Psychophysics, 73(4), 971–995. https://doi.org/10.3758/s13414-010-0073-7

65.

Spence

(2020a). Assessing the role of emotional mediation in explaining crossmodal correspondences involving musical stimuli. Multisensory Research, 33(1), 1–29. https://doi.org/10.1163/22134808-20191469

66.

Spence

(2020b). Wine psychology: Basic & applied. Cognitive Research: Principles and Implications, 5(1), 1–18. https://doi.org/10.1186/s41235-019-0201-4

67.

Spence

(2021a). Musical scents: On the surprising absence of scented musical/auditory events, entertainments, and experiences. i-Perception, 12(5), 20416695211038747. https://doi.org/10.1177/20416695211038747

68.

Spence

(2021b). Sensehacking: How to use the power of your senses for happier, healthier living. Penguin UK.

69.

Spence

(2021c). Sonic seasoning and other multisensory influences on the coffee drinking experience. Frontiers in Computer Science, 3, 644054. https://doi.org/10.3389/fcomp.2021.644054

70.

Spence

Keller

(2024). Sonic branding: A narrative review at the intersection of art and science. Psychology & Marketing, 41, 1530–1548. https://doi.org/10.1002/mar.21995

71.

Spence

Wang

(2015a). Wine and music (i): On the crossmodal matching of wine and music. Flavour, 4(34), 1–14. https://doi.org/10.1186/2044-7248-4-1

72.

Spence

Wang

(2015b). Wine and music (ii): Can you taste the music? Modulating the experience of wine through music and sound. Flavour, 4(33), 1–14. https://doi.org/10.1186/2044-7248-4-1

73.

Spence

Wang

Q. J.

(2015c). Wine and music (iii): So what if music influences the taste of the wine? Flavour, 4(1), 1–15. https://doi.org/10.1186/2044-7248-4-1

74.

Spence

Wang

Q. J.

Reinoso-Carvalho

Keller

(2021). Commercializing sonic seasoning in multisensory offline experiential events and online tasting experiences. Frontiers in Psychology, 12, 740354. https://doi.org/10.3389/fpsyg.2021.740354

75.

Štĕpánek

(2006). Musical sound timbre: Verbal descriptions and dimensions. In Proc. of the 9th Int. Conference on Digital Audio Effects (DAFx-06), Montreal, Canada.

76.

Stevenson

R. J.

Boakes

R. A.

(2003). A mnemonic theory of odor perception. Psychological Review, 110(2), 340. https://doi.org/10.1037/0033-295X.110.2.340

77.

Stevenson

R. J.

Prescott

Boakes

R. A.

(1995). The acquisition of taste properties by odors. Learning and Motivation, 26(4), 433–455. https://doi.org/10.1016/S0023-9690(05)80006-2

78.

Vassilakis

P. N.

(2001). Perceptual and physical properties of amplitude fluctuation and their musical significance. PhD thesis, University of California, Los Angeles.

79.

von Bismarck

(1974). Sharpness as an attribute of the timbre of steady sounds. Acustica, 30(3), 17–24.

80. von Helmholtz, H. L. F. (1877). On the Sensations of Tone as a Physiological Basis for the Theory of Music. Dover (1954), 4th edition. English translation by A. J. Ellis.

81. von Hornbostel (1931). Ueber geruchshelligkeit [on smell brightness]. Pflügers Archiv für die Gesamte Physiologie des Menschen und der Tiere, 227, 517–538. https://doi.org/10.1007/BF01755351

82. Wallmark (2019a). A corpus analysis of timbre semantics in orchestration treatises. Psychology of Music, 47(4), 585–605. https://doi.org/10.1177/0305735618768102

83. Wallmark (2019b). Semantic crosstalk in timbre perception. Music & Science, 2(5), 2059204319846617. https://doi.org/10.1177/2059204319846617

84. Wallmark & Allen, S. E. (2020). Preschoolers’ crossmodal mappings of timbre. Attention, Perception, & Psychophysics, 82(5), 2230–2236. https://doi.org/10.3758/s13414-020-02015-0

85. Wallmark, Iacoboni, Deblieck, & Kendall, R. A. (2018). Embodied listening and timbre: Perceptual, acoustical, and neural correlates. Music Perception, 35(3), 332–363. https://doi.org/10.1525/mp.2018.35.3.332

86. Wallmark, Nghiem, & Marks, L. E. (2021). Does timbre modulate visual perception? Exploring crossmodal interactions. Music Perception, 39(1), 1–20. https://doi.org/10.1525/mp.2021.39.1.1

87. Wang, Q. J., Keller, & Spence (2021). Metacognition and crossmodal correspondences between auditory attributes and saltiness in a large sample study. Multisensory Research, 34(8), 785–805. https://doi.org/10.1163/22134808-bja10055

88. Wang, Q. J., Mesz, Riera, Trevisan, Sigman, Guha, & Spence (2019). Analysing the impact of music on the perception of red wine via temporal dominance of sensations. Multisensory Research, 32(4-5), 455–472. https://doi.org/10.1163/22134808-20191401

89. Wang, Q. J., & Spence (2016). ‘Striking a sour note’: Assessing the influence of consonant and dissonant music on taste perception. Multisensory Research, 29(1–3), 195–208. https://doi.org/10.1163/22134808-00002505

90. Wang, Q. J., Spence, & Knoeferle (2020). Timing is everything: Onset timing moderates the crossmodal influence of background sound on taste perception. Journal of Experimental Psychology: Human Perception and Performance, 46(10), 1118. https://doi.org/10.1037/xhp0000820

91. Wang, Woods, A. T., & Spence (2015). “What’s your taste in music?” A comparison of the effectiveness of various soundscapes in evoking specific tastes. i-Perception, 6(6), 2041669515622001. https://doi.org/10.1177/2041669515622001

92. Ward, R. J., Jjunju, F. P. M., Griffith, E. J., Wuerger, S. M., & Marshall (2020). Artificial odour-vision syneasthesia via olfactory sensory argumentation. IEEE Sensors Journal, 21(5), 6784–6792. https://doi.org/10.1109/JSEN.2020.3040114

93. Ward, R. J., Rahman, Wuerger, & Marshall (2022). Predicting the crossmodal correspondences of odors using an electronic nose. Heliyon, 8(4), e09284. https://doi.org/10.1016/j.heliyon.2022.e09284

94. Ward, R. J., Wuerger, & Marshall (2021). Smelling sensations: Olfactory crossmodal correspondences. Journal of Perceptual Imaging, 4(2), 020402–1–020402–12. https://doi.org/10.1016/j.heliyon.2022.e09284

95. Winter (2019). Sensory Linguistics, chapter Ineffability. John Benjamins Publishing Company.

96. Zacharakis, Michail, & Pastiadis (2023). Scent of a timbre: Cross-modal correspondences between synthetic timbres and essential oil aromas. In Proceedings of the 3rd international conference on timbre (timbre 2023) (pp. 68–72). Aristotle University of Thessaloniki, Greece.

97. Zacharakis & Pastiadis (2016). Revisiting the luminance-texture-mass model for musical timbre semantics: A confirmatory approach and perspectives of extension. Journal of the Audio Engineering Society, 64(9), 636–645. https://doi.org/10.17743/jaes.2016.0032

98. Zacharakis, Pastiadis, & Reiss, J. D. (2014). An interlanguage study of musical timbre semantic dimensions and their acoustic correlates. Music Perception, 31(4), 339–358. https://doi.org/10.1525/mp.2014.31.4.339

99. Zacharakis, Pastiadis, & Reiss, J. D. (2015). An interlanguage unification of musical timbre: Bridging semantic, perceptual, and acoustic dimensions. Music Perception, 32(4), 394–412. https://doi.org/10.1525/mp.2015.32.4.394

100. Zarzo & Stanton, D. T. (2009). Understanding the underlying dimensions in perfumers’ odor perception space as a basis for developing meaningful odor maps. Attention, Perception, & Psychophysics, 71(2), 225–247. https://doi.org/10.3758/APP.71.2.225

Supplementary Material

Please find the following supplemental material available below.

