Abstract
In this narrative historical review, I want to take a closer look at the concept of perceptual similarity both as it applies within, and between, the chemical senses (specifically taste and smell). The discussion is linked to issues of affective similarity and connotative meaning. The relation between intramodal and crossmodal judgments of perceptual similarity, and the putatively special status of those odorants that happen to take on taste qualities will also be discussed. An important distinction is drawn between the interrelated, though sometimes distinct, notions of perceptual similarity and crossmodal congruency, specifically as they relate to the comparison of chemosensory stimuli. Such phenomena are often referred to as crossmodal correspondences, or by others (incorrectly in my view), as a kind of ubiquitous synesthesia.
Introduction
At the start of his now-classic article from almost half a century ago, Amos Tversky (1977, p. 327) wrote that: “Similarity plays a fundamental role in theories of knowledge and behavior. It serves as an organizing principle by which individuals explain and classify objects, form concepts, and make generalizations. Indeed, the concept of similarity is ubiquitous in psychological theory. It underlies the accounts of stimulus and response generalization in learning, it is employed to explain errors in memory and pattern recognition, and it is central to the analysis of connotative meaning.” While Tversky likely had the similarity between visual, and perhaps also between auditory, stimuli in mind (see also Mehrabi et al., 2019, on auditory perceptual similarity), those scientists interested in the topic of multisensory flavor perception (such as your author) are presumably entitled to ask whether the same claim also applies when it comes to thinking about similarity in (and between) the chemical senses (here focusing, in particular, on olfactory and gustatory stimuli). This is the question that I will address in this narrative historical review. The topic takes on added contemporary relevance given the growing interest in flavor pairing that has been seen in recent years (e.g., see Coucquyt et al., 2020; Spence, 2020b, 2022b, for reviews).
Assessing the Similarity of Culinary Spices and Other Seasonings
In one of the first empirical studies designed to address the question of similarity within the chemical senses, Blank and Mattes (1990) had two groups of participants (one group of White North American participants and the other group of non-White participants who mostly came from outside the United States; N = 70 in total) rate the pairwise similarity of 45 pairings of 10 spices. The stimuli included salt and sugar granules, vanilla and anise extract, dried bay leaf, dried crushed spearmint leaves, ground nutmeg, cloves, cinnamon, and ginger. The participants were given no specific guidance as to the basis on which they were supposed to rate the similarity of the stimuli that they first smelled and then tasted. The participants responded using a nine-point scale anchored with the terms “very similar” and “very different” (see Figure 1 for the results).

Figure 1. Multidimensional scaling (MDS) configuration of spices achieved by ALSCAL for all 70 participants. Figure reprinted from Blank and Mattes (1990). [Granularity present in original journal image.]
According to Blank and Mattes (1990), a majority (69%) of the variance in the data that was collected could be explained by just three dimensions. Discussion with the study participants led to the suggestion that Dimension 1 in the figure was best understood as coding stimulus intensity, that Dimension 2 coded the compatibility of the spice with sweetness, and that Dimension 3 was associated with the degree of bitterness of the stimulus. One question that immediately crops up here is why the participants did not rate salt and sugar as being similar, given that both stimuli are typically encountered as crystalline white substances (note that they were, in fact, presented in granular form) that lack any discernible odor. Perhaps, though, the fact that the participants were instructed to smell and then taste the stimuli was sufficient to direct their attention to the taste/flavor (rather than to the visual appearance of the stimuli).
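As a rough illustration of how a spatial configuration of this kind can be derived from pairwise ratings, the following Python sketch applies classical (Torgerson) MDS to a small dissimilarity matrix. This is not ALSCAL, the procedure used by Blank and Mattes (1990); classical MDS is substituted here only because it can be written in a few lines with NumPy. The four stimuli, the ratings, and the assumption that higher numbers mean "more different" are invented for illustration and are not the published data. The function name `classical_mds` is likewise hypothetical.

```python
import numpy as np

def classical_mds(dissim, n_dims=3):
    """Classical (Torgerson) MDS: place items in n_dims dimensions so that
    Euclidean distances approximate the given dissimilarities.
    Illustrative stand-in only; this is NOT the ALSCAL algorithm."""
    d = np.asarray(dissim, dtype=float)
    n = d.shape[0]
    # Double-centre the squared dissimilarities to obtain a Gram matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (d ** 2) @ centering
    # eigh returns eigenvalues in ascending order, so re-sort descending.
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Keep the leading dimensions; clip small negative eigenvalues to zero.
    kept = np.clip(eigvals[:n_dims], 0.0, None)
    coords = eigvecs[:, :n_dims] * np.sqrt(kept)
    # Share of the positive eigenvalue mass captured by the kept dimensions.
    positive_total = np.clip(eigvals, 0.0, None).sum()
    explained = kept.sum() / positive_total if positive_total > 0 else 0.0
    return coords, explained

# Hypothetical mean ratings (assumed: 1 = very similar, 9 = very different).
labels = ["sugar", "vanilla", "salt", "clove"]
dissim = np.array([
    [0.0, 2.0, 8.0, 7.0],
    [2.0, 0.0, 8.5, 6.0],
    [8.0, 8.5, 0.0, 7.5],
    [7.0, 6.0, 7.5, 0.0],
])

coords, explained = classical_mds(dissim, n_dims=2)
for name, point in zip(labels, coords):
    print(name, np.round(point, 2))
print("share of positive eigenvalue mass:", round(float(explained), 2))
```

The "explained" value printed here is a share of eigenvalue mass, which is only loosely comparable to the variance figure reported for the three-dimensional solution in the original study.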
While the results of Blank and Mattes’ (1990) study would appear to support the claim that people can make meaningful judgments of the similarity of pairs of chemosensory stimuli, the basis on which they made their similarity judgments remains unclear. A priori, one could imagine that the participants would have based their decisions on perceptual similarity, on affective similarity, on similar patterns of usage of the “spices” in recipes, on similar visual appearance cues (as was just mentioned), or perhaps even on the extent to which the stimuli happen to engage the olfactory versus gustatory systems.
Meanwhile, in an earlier study, Jones et al. (1978) had their participants (N = 22 students) assess the similarity/confusability of the 55 possible combinations of 11 common herbs or seasonings (comprising basil, bay, celery, marjoram, mint, oregano, parsley, rosemary, sage, tarragon, and thyme) on a nine-point similarity scale. In this case, the data supported a unidimensional solution for similarity judgments plotted against relative confusion in recognition memory. That said, one might want to consider the extent to which the context in which the stimuli are presented affects the similarity space that is obtained (cf. Cleland et al., 2002; Yearsley et al., 2021). Elsewhere, research from Dalton et al. (2008) revealed that three dimensions (corresponding to the evaluation, potency, and activity dimensions of semantic differential theory) were needed to account for the majority (53%) of the variance in semantic differential-type judgments of a range of 30 distinct olfactory stimuli.
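The pair counts reported for the two rating studies in this section follow from the number of unordered pairs of n stimuli, n(n − 1)/2. A minimal Python check, using the stimulus names listed in the text (the variable names are illustrative):

```python
from itertools import combinations
from math import comb

# Stimuli as listed for Blank and Mattes (1990) and Jones et al. (1978).
spices = ["salt", "sugar", "vanilla", "anise", "bay leaf",
          "spearmint", "nutmeg", "cloves", "cinnamon", "ginger"]
herbs = ["basil", "bay", "celery", "marjoram", "mint", "oregano",
         "parsley", "rosemary", "sage", "tarragon", "thyme"]

# Every unordered pair of distinct stimuli, i.e. one similarity rating each.
spice_pairs = list(combinations(spices, 2))
herb_pairs = list(combinations(herbs, 2))

print(len(spice_pairs), comb(len(spices), 2))  # 45 45
print(len(herb_pairs), comb(len(herbs), 2))    # 55 55
```

The number of judgments grows quadratically with the number of stimuli, which is one practical reason why such studies tend to use small stimulus sets.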
Distinguishing Congruency From Similarity in the Case of the Chemical Senses
Spence (2022a) raised the question of whether we should think of those olfactory stimuli that take on taste properties (and which could potentially be used as sweet replacers; Blank & Mattes, 1990; Fial, 1978) as an example of crossmodal/semantic congruency (Amsellem & Ohla, 2016), crossmodal correspondence, and/or perhaps perceptual similarity. This is more than merely a matter of semantics: understanding exactly what we are talking about is important, given the suggestion that it is the perceptual similarity between combinations of olfactory and gustatory stimuli that determines the magnitude of any oral referral (Schifferstein & Verlegh, 1996). In turn, the extent to which the odorant is referred to the oral cavity likely predicts the extent of any odor-induced taste enhancement (OITE; Schifferstein & Verlegh, 1996; see Spence, 2016, for a review of the phenomenon of oral referral).
In an intriguing early study, Schifferstein and Verlegh (1996) attempted to distinguish between the concepts of congruency and similarity in the context of the chemical senses. The aroma of vanilla and the taste of sugar, for example, are both congruent and also perceptually similar (i.e., many people describe both as “sweet”; see Blank & Mattes, 1990; Spence, submitted). By contrast, while acetic and citric acid are perceptually similar (in that both are acidic) they are
Putative combinations of olfactory and/or gustatory stimuli that vary in terms of their perceptual similarity and congruency—the latter defined by Schifferstein and Verlegh (1996) as stimulus pairings that co-occur in flavorful stimuli.
Does Perceptual Similarity Exist Across the Senses?
At this point, it is important to note that not everyone even believes that perceptual similarity judgments between the senses are necessarily possible. In particular, the early German psychophysicist, Hermann Ludwig Ferdinand von Helmholtz (1821–1894) once wrote that: “the distinctions among sensations which belong to different modalities, such as the differences among blue, warm, sweet, and high-pitched, are so fundamental as to exclude any possible transition from one modality to another and any relationship of greater or less similarity. For example, one cannot ask whether sweet is more like red or more like blue … Comparisons are possible only within each modality; we can cross over from blue through violet and carmine to scarlet, for example, and we can say that yellow is more like orange than like blue!” (Helmholtz, 1878/1971, p. 77). Given that Helmholtz (1878/1971) makes no explicit mention of the chemical senses, it is difficult, in hindsight at least, to know whether he had the same opinion regarding comparisons amongst the chemical senses (e.g., that judging the similarity of olfactory and gustatory stimuli is equally futile). Furthermore, given that one can talk about both perceptual and affective similarity (see Spence & Di Stefano, 2022), it would seem reasonable to assume that Helmholtz was referring to the question of perceptual similarity in the above-mentioned quote. Here, though, one might want to consider whether Helmholtz intended his claim to refer only to the proper sensible rather than to what Aristotle (1907) referred to as the common sensible (see Everson, 1995; Owens, 1982; Spence & Di Stefano, submitted).
That said, not everyone necessarily agrees with Helmholtz. So, for example, a seemingly contradictory position was forwarded by Lawrence Marks (2011, p. 52) when writing about: “perceptual similarities between and among sensory experiences in different modalities. Much as the color aqua is more similar phenomenologically to cerulean than to pink, the flavour of lime more similar to lemon than to banana, so too are low notes played on a bassoon or an organ more like dark colors such as brown or black than bright colors such as yellow or white, while the higher notes played on clavier or a flute resemble yellow or white more than brown or black.”
Evidence that might, at first, appear to contradict Helmholtz’s (1878/1971) assertion comes from those studies in which the participants were instructed to pick the color (typically from a restricted range of options) that is most strongly associated with each one of the basic tastes, or vice versa. The majority of people will, for example, tend to agree that a red drink, or a red or pink color patch, looks (or is associated with) sweet rather than, say, sour (see Spence et al., 2015, for a review of the literature on color-taste correspondences; Huisman et al., 2016; O’Mahony, 1983; Woods & Spence, 2016). It is, though, important to note that while consensual responses such as these are often obtained under such forced choice experimental conditions, this does not necessarily mean that the participants in the studies concerned thought that the stimuli that they paired together were perceptually similar to one another (see Spence & Levitan, 2022, on this point). For example, people’s responses might instead merely reflect an expected, or predictive, relationship (i.e., associative learning), such as, for example, that if I see a red drink then I expect it to taste sweet (cf. Baeyens et al., 1990). Multisensory source objects might also underpin a number of such crossmodal correspondences between color (e.g., red) and a specific smell (e.g., strawberry aroma; see also Spence, 2020c). This, one might wish to think of as the semantic account.
The associative learning of such color-taste mappings has been documented in infants of no more than a few months of age (Reardon & Bushnell, 1988; cf. Fernandez & Bahrick, 1994). It is, however, important to note that such associative learning does not necessitate that the component stimuli, the color “red” and the taste of sweetness, are perceived as being perceptually similar (nor that they start to become more perceptually similar), merely that they tend to co-occur in the environment (see Foroni et al., 2016), and that we internalize such environmental statistics (Barlow, 2001). To illustrate the point, consider only how people might well want to pair a barking sound with the picture of a dog (rather than a cat; Wegner-Clemens et al., 2022) without necessarily wanting to assert that the auditory and visual stimuli are themselves perceptually similar.
Crossmodal Correspondences Capture Multiple Kinds of Crossmodal Relationship
Crossmodal correspondences have been defined as a tendency for a feature, or attribute, in one sensory modality, either physically present, or else merely imagined, to be matched (or associated) with a sensory feature in another sensory modality (see Spence, 2011, for a review). More than two decades ago, Martino and Marks (2001, p. 61) suggested that: “Weak synaesthesia is characterized by cross-sensory correspondences expressed through language, perceptual similarity and perceptual interactions during information processing.” Consistent with this line of argument (that crossmodal correspondences can be considered as a “weak” form of synesthesia), Stevenson and Tomiczek (2007) argued that the phenomenon of certain food-related olfactory stimuli taking on taste properties should be regarded as a ubiquitous form of olfactory-induced synesthesia (see also Stevenson & Boakes, 2004). Notice here how such an account, should it be accepted, does not necessitate that the synesthetically related olfactory and gustatory stimuli are perceptually similar. After all, in synesthesia proper, there is no sense in which the inducer and concurrent are thought of as being perceptually similar.
Nevertheless, Stevenson and Tomiczek’s (2007) suggestion has been robustly questioned by Auvray and Spence (2008) and Deroy and Spence (2013). Two key issues to bear in mind here are first that the connection between olfactory stimuli and gustatory properties is not idiosyncratic between individuals, and second that people tend to experience a unitary flavor rather than a distinction between inducer and concurrent that is such a signature feature of synesthesia proper (see Spence, 2015). For these and several other reasons, the description of OITE as a ubiquitous form of synesthesia would seem misleading and hence should probably be abandoned. Nevertheless, the point remains that while crossmodal correspondences might potentially be based on the perceptual similarity of the corresponding stimuli, they need not be (see Spence & Di Stefano, 2022).
Over the years, commentators have proposed a number of more or less surprising crossmodal correspondences between olfactory and both auditory and visual stimuli (see Deroy et al., 2013, for a review). So, for example, the 19th century chemist/perfumer Septimus Piesse (1867, 1891) once famously made a connection between a range of 24 scents and different musical notes in his so-called “Gamut of odours” (see Figure 2). Meanwhile, Von Hornbostel (1931) put forward the suggestion that sensory brightness should be considered as a universal (or amodal) dimension of sensory experience (see also Hartshorne, 1934). That said, a few years later, the North American psychologist Cohen (1934) argued that all that was needed was in fact ratio properties amongst analogous unisensory dimensions (i.e., the data did not necessarily support the existence of a universal, or amodal, dimension of sensory brightness; see also Ellermeier et al., 2021, on the ratio-based crossmodal matching of visual brightness and sound intensity; cf. Heller, 2021; Luce et al., 2010).

Figure 2. Scale of crossmodal correspondences between sound and odors reproduced from Piesse (1867, pp. 42-43).
Aligning Perceptual Dimensions
Several researchers have attempted to document crossmodal correspondences between specific olfactory stimuli and the dimension of auditory pitch (Belkin et al., 1997; Crisinel & Spence, 2012), though intriguingly in Belkin et al.’s case, neither pleasantness, nor intensity, provided a satisfactory explanation of the data. Meanwhile Gilbert et al. (1996; see also Kemp & Gilbert, 1997) were able to demonstrate an inverse correlation between the dimensions of lightness and olfactory intensity (see Figure 3). More recently, Gilbert et al. (2016) suggested that color-olfactory crossmodal mappings could be explained on the basis of emotional mediation. Note, though, that in the latter case, the color stimuli that participants had to choose between varied in terms of a range of dimensions (i.e., hue, saturation, and lightness). Certainly, the notion of emotionally mediated crossmodal correspondences has become increasingly popular in recent years (see Cunningham & Weinel, 2016; Hauck et al., 2022; Spence, 2020a; Spence & Di Stefano, 2022).

Figure 3. Crossmodal matchings (or correspondences) between odors and pitch. Adapted from Belkin et al. (1997) (left) and Crisinel and Spence (2012) (right).
On the Similarities and Differences Between Orthonasal and Retronasal Olfaction
A number of researchers have wanted to distinguish between orthonasal and retronasal smell. Orthonasal sniffing occurs when we inhale odors from the external environment, while retronasal olfaction occurs when odors are pulsed out from the oral cavity to the back of the nose when chewing and swallowing (Fincks, 1886; Masaoka et al., 2010; Patrick, 1899; Rozin, 1982; Wilson, 2021). Rozin has written of the “duality” of the sense of smell. However, few other researchers have gone so far as to suggest that orthonasal and retronasal smell should be considered as distinct senses (or sensory systems). That said, a growing body of cognitive neuroscience research has, in recent years, demonstrated a somewhat distinct recruitment of neural areas in the case of the two senses of smell (e.g., Blankenship et al., 2019; Chapuis et al., 2009; Small et al., 2005; Veldhuizen et al., 2010; see also Gagnon et al., 2015). So, for example, Blankenship et al. have shown that retronasal olfaction requires taste cortex whereas orthonasal olfaction does not.
Small et al. (2005) conducted a neuroimaging study in which they compared the odor of chocolate (i.e., food-related) with three non-food odors. When the chocolate odor was compared to the other odorants, the orthonasal–retronasal comparison revealed enhanced thalamic, insular, orbitofrontal cortex (OFC), hippocampal, and amygdalar activity; all were more active when sniffing the chocolate odor. The retronasal–orthonasal comparison, meanwhile, revealed more definitive activity in OFC, along with a number of brain areas putatively involved in food reward (temporal gyrus, temporal operculum, perigenual cingulate, etc.). Small and her colleagues summarized their findings in terms of the operation of “wanting” (orthonasal) and “liking” (retronasal): “the amygdala representing anticipation of food reward and the temporal gyrus and others reflecting the receipt of food reward.”
Given the growing body of evidence suggesting that different neural circuits are involved in the orthonasal and retronasal experience of odors, and the suggestion that these might be considered as constituting two senses of smell, one might legitimately want to question the similarity between orthonasal smell, for example, of a strawberry and the retronasal smell, or better said, retronasal contribution to the flavor of strawberry (Spence et al., 2015). At the same time, however, it is striking how rarely we are aware of any perceptual difference between odors when experienced orthonasally versus retronasally. So similar are they, in fact, that people are rarely even aware that there might be a meaningful difference between them (Pierce & Halpern, 1996; Sun & Halpern, 2005; Voirol & Daget, 1986; though see Diaz, 2004, for occasional differences in perceived stimulus intensity between the orthonasal and retronasal routes). As Stevenson (2009, p. 231) puts it, while there are clear differences, the similarities “between orthonasal and retronasal perception are predominant.”
In fact, the exceptions, such as freshly ground coffee (which is often reported to smell great orthonasally, but to be disappointing when experienced retronasally when drinking the coffee) and certain cheeses (such as Époisses, which may smell absolutely terrible, but deliver a highly desirable retronasal flavor), are notable for their rarity. To this list one might also want to add the so-called stinking tofu, and the legendary durian fruit of South-East Asia (see Stevenson et al., 2007, on the latter). In such cases, it is unclear whether the multiple associations (or referents) that such odorants are linked to are the key difference (such as isovaleric acid being a distinctive component of both certain cheeses and sweaty trainers, etc.), or whether instead certain volatiles may be “stripped” from the retronasal aroma by saliva (see Bonnans & Noble, 1995; Ge, 2012), meaning that the composition of the orthonasal and retronasal olfactory stimuli is actually physically different, hence explaining the different affective associations (cf. Goldberg et al., 2018; Hannum, Stegman, Fryer, & Simons, 2018). Nevertheless, if one takes seriously the suggestion that can be seen as emerging from Rozin’s (1982) paper, namely that there are two senses of smell, then one might legitimately want to discuss the very high degree of perceptual similarity between the orthonasal and retronasal experience of the vast majority of odors. However, it is worth noting that in the four decades since Rozin published his paper, I am not aware of any other researchers who have wished to pursue the line that there are two distinct senses of smell (as might be needed to talk about the perceptual similarity of the orthonasal and retronasal senses of smell in the same way that we have been discussing the perceptual similarity of olfactory and gustatory stimuli).
Featural Versus Dimensional Accounts of Similarity
The question of whether a dimensional or feature-based account of similarity is more appropriate in the case of the chemical senses is by no means obvious. In his classic article, Tversky (1977, p. 328) notes that: “It has been argued by many authors that dimensional representations are appropriate for certain stimuli (e.g., colors, tones) but not for others. It seems more appropriate to represent faces, countries, or personalities in terms of many qualitative features than in terms of a few quantitative dimensions. The assessment of similarity between such stimuli, therefore, may be better described as a comparison of features rather than as the computation of metric distance between points.”
One theoretical issue that challenges any idea of perceptual similarity that might be based on shared phenomenological properties relates to the fact that a thing might be an “A-thing” with respect to “A-ness,” and at the same time might be a “B-thing” with respect to “B-ness” (see Rodriguez-Pereyra, 2002). Such critics point to the fact that any two objects might share at least one phenomenological property and thus, as Goodman (1972) has argued, similarity would simply be a universal relation—namely, everything would be similar to everything else—and therefore claims regarding similarity would become somehow meaningless. Moreover, different properties count differently as far as perceptual similarity is concerned. For example, a tomato is more similar to grapes than to blood, despite both tomatoes and blood being red. Thus, perceptual similarity would appear to depend on more than just a simple count of shared and unshared perceptual features or attributes. One might also struggle to define featural properties of the basic tastes.
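The point that shared features cannot simply be counted, and that different properties carry different weight, can be made concrete with Tversky's (1977) contrast model, in which the similarity of a to b is θ·f(A ∩ B) − α·f(A − B) − β·f(B − A), with f measuring the salience of a set of features. The following Python sketch is an illustration only: the formula is Tversky's, but it is not reproduced in this review, and the feature lists, salience weights, and parameter values are hypothetical choices made for the tomato, grapes, and blood example.

```python
def salience(features, weights):
    """Additive salience measure f: sum of per-feature weights (0 if unlisted)."""
    return sum(weights.get(feature, 0.0) for feature in features)

def contrast_similarity(a, b, weights, theta=1.0, alpha=0.5, beta=0.5):
    """Tversky (1977) contrast model:
    theta*f(A & B) - alpha*f(A - B) - beta*f(B - A)."""
    return (theta * salience(a & b, weights)
            - alpha * salience(a - b, weights)
            - beta * salience(b - a, weights))

# Hypothetical feature sets, invented for illustration.
tomato = {"red", "edible", "fruit", "juicy", "round"}
grapes = {"purple", "edible", "fruit", "juicy", "round"}
blood = {"red", "liquid", "bodily"}

# Weighting 1 (assumed): only colour is salient.
colour_only = {"red": 1.0, "purple": 1.0}
# Weighting 2 (assumed): food-related features dominate, colour counts little.
food_focused = {"red": 0.5, "purple": 0.5, "edible": 2.0, "fruit": 1.0,
                "juicy": 1.0, "round": 0.5, "liquid": 1.0, "bodily": 2.0}

for name, weights in [("colour only", colour_only), ("food focused", food_focused)]:
    to_grapes = contrast_similarity(tomato, grapes, weights)
    to_blood = contrast_similarity(tomato, blood, weights)
    print(f"{name}: tomato~grapes = {to_grapes}, tomato~blood = {to_blood}")
```

Under the colour-only weighting the tomato comes out closer to blood, whereas under the food-focused weighting it comes out closer to grapes. The ordering therefore depends on which features are treated as salient, which is the sense in which a bare count of shared properties underdetermines similarity.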
Meanwhile, according to Licon et al. (2018), pleasantness and the presence of trigeminal sensations, and particularly irritation, appear to act as salient dimensions in organizing the semantic and physiological spaces of odors, thus potentially supporting a dimensional account of olfactory similarity. However, given the continuing disagreement concerning the existence of fundamental dimensions, or categories, of olfactory experience, and whether the space can be captured by only a few dimensions or reflects a multi-dimensional space (e.g., see Berglund et al., 1973; Castro et al., 2013; Koulakov et al., 2011; Meister, 2015; Schiffman, 1974; Schiffman et al., 1977; Zarzo, 2008), one might assume that a dimensional representation would not necessarily provide the most appropriate way in which to conceptualize olfactory similarity. As such, the possibility should perhaps be considered that a feature-based matching account might turn out to provide a more parsimonious explanation of similarity data as far as the chemical senses are concerned (or at the very least in the cases of olfaction and flavor; though see Jones et al., 1978, for uncertainty over the appropriateness of such an approach).
Should “Sensory Intricacy” be Considered as Another Kind of Similarity?
One might also consider whether complexity, or the more recently introduced notion of sensory intricacy, might provide an alternative means of matching stimuli across the senses. In this regard, Snitz et al. (2016) conducted an intriguing study of sensory intricacy involving people assessing both olfactory and visual stimuli. In this case, “sensory intricacy” was measured in terms of the degree of disagreement between participants using a version of the Semantic Differential Technique (SDT; Osgood et al., 1957; Snider & Osgood, 1969). Here, though, it is worth noting that the latter approach delivers a judgment that is based on affective/connotative (rather than perceptual) meaning. What is more, judging stimuli presented in different sensory modalities as similar because they share a similar degree of intricacy (note here that Snitz and colleagues argue that “sensory intricacy” is distinct from stimulus complexity; see Spence & Wang, 2018) would be neither a perceptually, nor an affectively-based judgment. At the same time, however, it is unclear whether people necessarily have any kind of subjective awareness of the sensory intricacy of a given stimulus (i.e., this characterization may only emerge when data are analyzed at the group level). Nevertheless, this clearly leaves open the possibility that the perceived complexity of the component stimuli provides another basis for combining, or rather pairing, stimuli across the chemical senses (see Spence, 2020b).
Conclusions
Given the evidence that has been reviewed here, it is tempting to suggest that Helmholtz (1878/1971) may have been right after all: his claim that it is simply not possible to make judgments of (perceptual) similarity between the senses would appear to be broadly true, with the notable exception of food-related odors and tastes (i.e., perceptual similarity judgments within the chemical senses; and/or between orthonasal and retronasal smell should they be considered as distinct senses; see Hannum et al., 2018; Rozin, 1982). That said, other researchers have expressed no hesitation about asserting the possibility of crossmodal judgments of perceptual similarity (e.g., Marks, 2011). Certainly, the majority of crossmodal correspondences would appear to be based on association (i.e., the internalization of the statistical regularities of the environment) rather than necessarily on perceptual similarity. Emotional mediation, or common affective/connotative meaning as revealed by, for example, research involving the Semantic Differential technique (see Dalton et al., 2008), often appears to play a key role in determining people’s judgments of the similarity of stimuli from different categories (e.g., Gilbert et al., 2016; Pedović & Stosić, 2018). Perhaps, though, we need to accept the possibility that crossmodal correspondences between (at least food-related) olfactory stimuli and basic tastes (i.e., gustatory qualities) are somehow importantly different from the correspondences that many people experience between other pairings of modalities (such as between visual and auditory stimuli; see Spence, 2022b). They are different in the sense that judgments of perceptual similarity between certain pairs of olfactory and gustatory stimuli might make sense in a way that they do not when it comes to any other pairing of sensory modalities.
At the same time, however, there is no need to go so far as to suggest that the close connection between food-related odors and the dominant tastes that they so often co-occur with in foods should be considered as a ubiquitous form of olfactory-gustatory synesthesia, as suggested by Stevenson and Tomiczek (2007; and see Auvray & Spence, 2008; Deroy & Spence, 2013, for arguments against accepting this suggestion). Ultimately, therefore, the fact that food-related olfactory stimuli can sometimes take on the perceptual qualities of the basic taste with which they are so often associated in foods (such as vanilla becoming (associated with) sweetness, despite the fermented vanilla bean itself tasting bitter; see Spence, submitted) means that crossmodal judgments of perceptual similarity may be possible in the case of certain olfactory–gustatory combinations in a way that is simply not possible for any other combination of sensory inputs. That being said, it should also be remembered that affective, or emotionally-mediated, similarity likely operates across any pairing of sensory dimensions, and often represents the most plausible basis for establishing a connection, or correspondence, between unrelated sensations.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Arts and Humanities Research Council (grant no. AH/L007053/1).
