Abstract

Historically, debates over relationships between spoken lexical form and meaning have been dominated by views of arbitrariness. However, more recent research has revealed a different perspective, in which non-arbitrary mappings play an important role in the makeup of a lexicon. It is now clear that phoneme-sound symbolism - along with other types of form-to-meaning mappings - contributes to non-arbitrariness (iconicity) of spoken words, which is present in many forms and degrees in different languages. Attempts have been made to provide a mechanistic explanation of the phenomenon, and these theories largely centre around cross-modal correspondences. We build on these views to explore iconicity within the evolutionary context and the neurobiological framework for human language processing. We argue that the multimodal bihemispheric communicative system, to which iconicity is integral, has important phylogenetic and ontogenetic advantages, facilitating language learning, comprehension, and processing. Despite its numerous advantages, however, iconicity must compete with arbitrariness, forcing language systems to balance the competing needs of perceptual grounding of the linguistic form and ensuring an effective signal. We conclude that, on balance, iconicity should be viewed as integral to language, and not merely a marginal phenomenon.

Keywords

Iconicity sound symbolism cross-modal correspondence language evolution neurobiology of language

Introduction

The nature of the relationship between word forms and their meanings has long been a prominent question. Do words sound like what they describe, or is this relationship arbitrary? Modern linguistic theory has largely adopted the view that the phonological form of the lexical sign bears no resemblance to its meaning; the connection is arbitrary. This view is often attributed to Saussure (1959) who argued “Because the sign is arbitrary, it follows no law other than that of tradition, and because it is based on tradition, it is arbitrary”. However, the idea of arbitrariness was established much earlier by Locke (1690), who proposed that the existence of multiple languages is sufficient evidence to disprove a so-called ‘natural’ connection: if the properties of the concept or object that the word refers to are sufficient to determine a phonological form, then there should only be one language. The notion of arbitrariness provides a robust explanation for the origin of most words; for example, there is nothing about the phonological form of the word ‘pen’ to suggest that it should denote ‘an instrument for writing or drawing with ink’. It has simply been established that there is a conventionalised system of symbols shared by communities of users and passed from generation to generation - “there is no reason for you to call a dog ‘dog’ rather than ‘cat’ except for the fact that everyone else is doing it” (Pinker & Bloom, 1990, p. 728). Notable exceptions to the conventional approach, such as onomatopoeia, have been regarded as marginal phenomena within an arbitrary linguistic system, with Saussure (1959) arguing that “such words are never organic elements” of language. In addition, although onomatopoeic words seem to represent the sounds they symbolise, they too show evidence of arbitrariness. 
For example, the English expression for the sound of a rooster (cock-a-doodle-doo) differs from German and French expressions (kikeriki and cocorico, respectively), all of which are acoustically different from the actual sound made by a rooster (Perniss et al., 2010). There is therefore plenty of evidence to support the idea of conventionalised, wholly arbitrary mapping between phonological form and meaning in our communication system. Indeed, the idea of language as arbitrary dominates many of the prominent theories of language acquisition, production, and comprehension, in which relatively little attention has been paid to how phonological and semantic representations might overlap.

However, in looking for an accurate account of non-arbitrariness in language, a broad cross-linguistic perspective is required, and more recent work has looked beyond Indo-European languages as well as at both signed and spoken languages (Perlman et al., 2018). This research has suggested a more textured view: non-arbitrariness may have a more important role in language than previously thought, and may in fact complement arbitrariness by conveying different linguistic advantages.

Defining Forms of Non-Arbitrariness

When considering forms of non-arbitrariness in language, there has been much confusion over terminology despite attempts to clarify these uncertainties (e.g., Lockwood & Dingemanse, 2015; Sidhu, 2019). The term iconicity is usually taken to refer to a form-meaning resemblance that may apply to spoken, manual and written communication and is not limited to human language. In fact, there is evidence for iconicity in the communicative acts of non-human primates, which we will later explore in the context of human language evolution. Sound symbolism, which the current paper focuses on, is used to specifically refer to iconicity in spoken language, though these terms have previously been used interchangeably in much of the literature.

It has been argued that iconicity is not a binary property and may be present in different degrees, with a distinction between ‘absolute’ (imagic) and ‘relative’ (diagrammatic) iconicity (Dingemanse et al., 2015; Peirce, 1974); and ‘direct’ and ‘indirect’ iconicity (Masuda, 2007). Absolute iconicity is where there is a one-to-one mapping between phonological form and meaning. This may be ‘direct’ as in onomatopoeia, for example the sound of water leaking ‘drip drop’ (English) or ‘plitsch platsch’ (German), or ‘indirect’, such as the association between the phoneme /i/ and smallness in words like ‘petite’ and ‘tiny’. Absolute iconicity in spoken language has also been labelled sensory sound symbolism (Cuskley & Kirby, 2013), in other words the phenomenon where a word’s form imitates its referent (Lockwood & Dingemanse, 2015). Absolute iconicity is perhaps most prominently exemplified in spoken language by ideophones – vivid sensory words such as the Japanese ‘pika-pika’ (bright and shiny) or the Siwu ‘gidigidi’ (running energetically). Unlike English, which incorporates onomatopoeic words into conventional grammatical categories, many languages treat ideophones as a distinct word class with unique syntactic and phonological characteristics.

In relative iconicity (what Cuskley and Kirby (2013) refer to as conventional sound symbolism), the relationships between multiple forms are analogous to relations between different meanings. This can be thought of as a mapping between two or more words and two or more referents, such as phonesthemes like the English ‘str-’, which denotes something long and thin, as in ‘straight’, ‘stripe’, ‘stream’, and ‘string’ (though it is worth noting that it is debated whether phonesthemes have a direct sensory link to associated meaning, as such ‘pockets of consistency’ in the lexicon may arise simply by chance (Baayen et al., 2011)). However, some phonesthemes, for example ‘sn-’ (occurring in words relating to the nose – ‘sniff’, ‘snore’, ‘sneeze’), do elicit sensory imagery and could therefore be considered iconic (Schmidtke et al., 2014). The term conventional sound symbolism can also cover correlations between sounds and grammatical categories – which is a form of non-arbitrariness called systematicity (Dingemanse et al., 2015; Monaghan et al., 2014). A good example is nouns denoting abstract concepts being longer and more derivationally complex than concrete nouns (Reilly & Kean, 2007). This is a form of non-arbitrariness as it reveals connections between word forms and meanings; however, it primarily concerns broad statistical patterns between groups of words across the lexicon, rather than the mappings between specific words and their meanings (see Dingemanse et al. (2015) for a detailed review).

This paper will first briefly review evidence for some of the different forms of sound symbolism in human language, and build on existing work (Dingemanse et al., 2015; Sidhu, 2019; Sidhu & Pexman, 2018; Svantesson, 2017) to explore how these form-meaning relationships might arise. We will then turn our attention to the possible role of sound symbolism from an ontogenetic and evolutionary perspective. Finally, we incorporate evidence for iconicity in non-human communication within a neurobiological framework for human language processing to suggest that sound-symbolism, in addition to arbitrariness, should be viewed as integral to language, rather than a marginal linguistic phenomenon.

Sound Symbolism

Spoken languages consist of an inventory of different sounds, or phonemes, which can be combined in different ways to form words. These words are then attributed to meanings, leading to the fundamental questions about the nature of this form-meaning relationship raised above. Primarily, as a foundation for spoken language, do phonemes themselves convey inherent qualities which lead to association with certain meanings, or are they semantically neutral in nature? The notion that phonemes may carry such associations underlies the phenomenon of sound symbolism. There are different ways in which phonemes may evoke semantic properties, examples of which are presented below.

Sound-Size Symbolism

One of the best-known examples of sound symbolism is the idea that different vowels may communicate the semantic property of either small or large size. Almost a century ago it was first reported that the close front vowel /i/ may more naturally denote small size than the open front vowel /a/ (Sapir, 1929). Sapir presented participants with short nonwords such as ‘mil’ and ‘mal’ that were allocated to meanings such as ‘table’. Participants were then asked to distinguish these nonwords according to their perceived size and decide which nonword symbolised a larger object. Between 75% and 96% of responses were in favour of the nonword containing /a/ as referring to the larger referent, consistently across different ages and language backgrounds (although with a large variance in individual sensitivity to the ‘symbolic suggestiveness’ of phoneme features). This was later extended to include a more gradual scale, as well as vowels in the front-back dimension (ɔ = o > u = a > æ > ε > e > i; Newman, 1933). Outside of forced-choice experiments, the association between different phonemes and the depiction of size has since been corroborated by cross-linguistic evidence, with some languages containing very elaborate size scales. One example is Khmu, the language of the northern Laos region. According to Svantesson (2017), in Khmu ideophones (words that represent sensory imagery), vowel variation is used to indicate size differences, as illustrated by the expressions for ‘drink noisily’ (crúut-crúut > cróot-cróot > críit-críit > créet-créet). The expression chosen may vary according to the size of the drinking animal - for example, crúut-crúut could describe the sound of a buffalo drinking noisily, cróot-cróot the sound of a human, and críit-críit/créet-créet the sound of smaller animals (Figure 1(a)).

Figure 1.

(a) The use of vowel variation to indicate size differences in Khmu ideophones (Svantesson, 2017). (b) The takete/maluma or bouba/kiki effect - when prompted most subjects call the angular shape (left) takete/kiki and the rounded (right) shape maluma/bouba.

There is also evidence to suggest that it is the number of ‘small’ or ‘large’ vowel sounds in a word, and not vowel alone, that determines the conveyed magnitude (Thompson & Estes, 2011). In this study, English-speaking participants were presented with novel figures of varying size and asked to match them to nonwords containing either ‘large’ (u, o) or ‘small’ (i, e) vowels. The authors found a linear relationship between the size of an object and the number of ‘large’ phonemes in its preferred name, regardless of whether the nonwords were presented aurally or in writing.
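The counting logic behind this finding can be sketched in a few lines. The following is a minimal illustration and not the authors' procedure: it assumes nonwords are written in plain Latin letters, treats the letters u and o as ‘large’ vowels and i and e as ‘small’ vowels as in the study description above, and uses a hypothetical function name and hypothetical example nonwords.

```python
# Letter sets follow the 'large' (u, o) and 'small' (i, e) vowels named in the text.
LARGE_VOWELS = frozenset("uo")
SMALL_VOWELS = frozenset("ie")


def vowel_size_counts(nonword: str) -> tuple[int, int]:
    """Return (number of 'large' vowels, number of 'small' vowels) in a nonword."""
    letters = nonword.lower()
    large = sum(1 for ch in letters if ch in LARGE_VOWELS)
    small = sum(1 for ch in letters if ch in SMALL_VOWELS)
    return large, small


# Hypothetical nonwords, ordered by how many 'large' vowels they contain.
for word in ["wikete", "wukete", "wukoto"]:
    large, small = vowel_size_counts(word)
    print(word, large, small)
```

Under the linear relationship reported above, a nonword with a higher ‘large’ count would be expected to be preferred for a larger figure; the sketch only produces the counts and makes no claim about the size of that effect.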

Sound-Shape Symbolism

Another well-documented example of sound symbolism is the maluma/takete or bouba/kiki effect, which demonstrates a connection between specific phonemes and the perception of shapes like roundness or sharpness. Köhler (1929) was the first to observe that participants assigned non-words takete and maluma to spiky and rounded shapes, respectively (Figure 1(b)). Decades later, Ramachandran and Hubbard (2001) replaced the terms with bouba and kiki and reported that 95% of English-speaking adult participants matched bouba with the rounded shape and kiki with the spiky one. Although they did not provide data to support this prevalence estimate, numerous subsequent studies have confirmed the effect across age groups, including infants and young children (Maurer et al., 2006; Ozturk et al., 2013). A meta-analysis by Fort et al. (2018) found a modest but consistent bouba/kiki effect across diverse languages and paradigms, with greater sensitivity for rounded shapes than spiky ones.

While the bouba/kiki effect is robust and cross-culturally documented, its relevance to natural spoken language remains debated. Some studies have failed to replicate the effect (Rogers & Ross, 1975; Styles & Gawne, 2017), and their authors argue that the effect depends on whether test words conform to the sound structure of the target language. Sidhu and Pexman (2018) question whether results from forced-choice experiments used to test the bouba/kiki effect can be generalised to natural language, noting that experimental settings may heighten participants’ awareness of shared properties between stimuli. Another issue is the potential role of orthography, with some authors suggesting that the bouba/kiki effect may be partly driven by the visual shape of written symbols, given that most study participants are literate (Cuskley et al., 2017). For instance, Koriat and Levy (1977) proposed that letters representing rounded sounds (e.g., /u/, /b/) often have curved shapes, while those representing spiky sounds (e.g., /k/, /t/) are angular. However, studies like Bremner et al. (2013), which observed the effect in the non-literate Himba population of Northern Namibia (albeit at a reduced prevalence of 82%), suggest that it is not entirely dependent on orthography. A comprehensive analysis by Ćwiek et al. (2022), spanning 25 languages and 10 writing systems, also found no consistent relationship between orthographic shape and the bouba/kiki effect, providing strong evidence that the phenomenon arises independently of written language.

Debate also surrounds the question of which phonemes drive the effect. Some authors emphasise the role of vowels, particularly the close back rounded vowel /u/ associated with rounded shapes (Maurer et al., 2006; Ramachandran & Hubbard, 2001), while others highlight the contribution of consonants, noting that the harsh stop consonant /k/ contrasts with the softer bilabial /b/ (Nielsen & Rendall, 2011; Westbury, 2005). Evidence suggests that both play a role. For example, Nielsen and Rendall (2013) demonstrated that sound-shape symbolism is influenced by multiple phoneme categories, and Westbury et al. (2018) linked specific phonemes (e.g., /oʊ/ (as in soak), /u/, /b/, /m/ and /ɑ/ for roundness, and /t/, /k/, /z/, /i/, and /ɪ/ (as in sing) for spikiness) to subjective shape associations in a large dataset of 8000 randomly generated non-words. Extending this work to real words, Sidhu et al. (2021) found that English nouns describing round objects more frequently contained round-associated phonemes (/u/, /m/, /oʊ/, /b/) than words for spiky objects, which were more likely to include spiky-associated phonemes (/k/, /t/, /l/).
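To make the idea of phoneme-level shape associations concrete, the sketch below tallies round-associated and spiky-associated phonemes in a transcribed word. It is a toy illustration only and not the statistical model used by Westbury et al. (2018): the phoneme sets are simply those listed above, every phoneme is weighted equally, the input is assumed to be already segmented into phoneme strings, and the function name is hypothetical.

```python
# Phoneme sets as listed in the text for Westbury et al. (2018).
ROUND_PHONEMES = frozenset({"oʊ", "u", "b", "m", "ɑ"})
SPIKY_PHONEMES = frozenset({"t", "k", "z", "i", "ɪ"})


def shape_tally(phonemes: list[str]) -> int:
    """Count round-associated minus spiky-associated phonemes.

    Positive values lean 'round', negative values lean 'spiky'.
    This is an unweighted tally for illustration only.
    """
    round_count = sum(1 for p in phonemes if p in ROUND_PHONEMES)
    spiky_count = sum(1 for p in phonemes if p in SPIKY_PHONEMES)
    return round_count - spiky_count


# Hand-segmented transcriptions (an assumption of this sketch).
print(shape_tally(["b", "u", "b", "ɑ"]))  # a 'bouba'-like form
print(shape_tally(["k", "i", "k", "i"]))  # a 'kiki'-like form
```

A tally of this kind treats vowels and consonants alike, which is consistent with the suggestion above that both contribute, but it ignores phoneme position and any differences in the strength of individual associations.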

Other Phoneme-Feature Associations

Although much research on sensory sound symbolism has focused on the sound-size effect and sound-shape effect, several other phoneme-feature associations have been described in the literature (see Lockwood and Dingemanse (2015) for a detailed review). For instance, Vainio and Vainio (2021) describe the phenomenon of sound-action symbolism, in which vocal sounds are associated with a particular body action. An example is the sound-grip effect, in which a precision grip action is associated with front-close vowels and voiceless stop consonants, and a power grip is associated with low-back vowels. This is a form of sound symbolism in which certain vocal signs have connections to motor, perceptual and conceptual representations of particular hand actions. In another study, Hirata et al. (2011) observed that lightness influences sound sensitivity – participants were more successful at identifying consonants when they experienced congruent sound–light pairings (e.g., voiceless consonants with light visual stimuli and voiced consonants with dark visual stimuli) compared to incongruent pairings. A further example of phoneme-feature association comes from a study which investigated how front and back vowels affect conceptual precision (Maglio et al., 2014). Participants showed greater precision in geographic and action descriptions for labels containing front vowels compared to back vowels. Back vowels in product names, however, made participants focus on long-term benefits rather than immediate features. This evidence highlights how different vowel sounds can influence mental representation.

Sound symbolic relationships have also been noted for features such as taste (Gallace et al., 2011; Simner et al., 2010), colour (Johansson et al., 2020), and perception of dominant or submissive body postures (Auracher, 2017), suggesting that sound symbolism in language extends far beyond the takete/maluma effect described by Köhler. There is even evidence to suggest that people may draw upon knowledge of form-meaning resemblances in open-ended situations (Davis et al., 2019). When asked to ‘draw a creature’ described by nonce words, participants included similar elements in drawings, demonstrating that sound symbolism evokes expected properties of referents. These are only some of the examples of phoneme-feature associations described in the literature. Mechanisms of such associations and their consequences, both behavioural and evolutionary, are explored in subsequent sections.

Mechanisms of Sound Symbolism

The literature on the mechanisms of sound symbolism largely centres around cross-modal correspondences, defined as ‘a compatibility between attributes or dimensions of a stimulus (i.e., an object or event) in different sensory modalities’ (Spence, 2011, p. 3). The mechanisms of sound symbolism have been comprehensively reviewed by Sidhu and Pexman (2018), who evaluated five proposals, along with their supporting evidence. We summarise these briefly below and discuss additional evidence for each mechanism where appropriate.

According to Sidhu and Pexman (2018), one way that sound-symbolic association may be explained is through statistical regularities and co-occurrences between phonetic features and associated stimuli in the environment. This may be understood via Bayesian integration theory (Spence, 2011), whereby humans combine prior knowledge and sensory information to make cross-modal associations about stimuli. For example, pointed objects tend to produce less tonal sounds with more abrupt transitions whereas rounded forms produce more continuous, tonal sounds, and these statistical co-occurrences could be reflected in language (McCormick et al., 2015). Recent computational modelling by Fort and Schwartz (2022) indeed implies that the bouba/kiki effect might rely on acoustic cues of spectral balance and temporal continuity. They suggest that cognitively, a sound may be perceived as either spiky or round because it is likely to be produced by a spiky or round object hitting/rolling on a hard surface. Rounded objects, when compared with spiky objects, have lower frequency acoustic resonance modes, meaning that they produce sounds with more energy in the low-frequency part of the acoustic spectrum. They also have smoother trajectories when rolling on a hard surface and therefore produce more continuous acoustic envelopes. These properties, it is argued, may contribute to the formation of cross-modal correspondences responsible for human sensitivity to sound symbolism.
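The logic of Bayesian integration can be made concrete with a toy calculation. The sketch below is only an illustration under stated assumptions: the two shape hypotheses, the function name, and all probability values are hypothetical and are not taken from Spence (2011) or any other study cited here. It applies Bayes' rule to show how a prior belief about an unseen object's shape would be updated after hearing a continuous, tonal sound.

```python
def posterior_round(prior_round: float,
                    p_sound_given_round: float,
                    p_sound_given_spiky: float) -> float:
    """Bayes' rule for two hypotheses: the unseen object is round or spiky."""
    prior_spiky = 1.0 - prior_round
    # Total probability of hearing this sound under either hypothesis.
    evidence = (p_sound_given_round * prior_round
                + p_sound_given_spiky * prior_spiky)
    return p_sound_given_round * prior_round / evidence


# Hypothetical values: no prior preference, and a tonal sound assumed to be
# four times more likely from a round object than from a spiky one.
print(posterior_round(prior_round=0.5,
                      p_sound_given_round=0.8,
                      p_sound_given_spiky=0.2))
```

On this view, repeated exposure to environmental co-occurrences would shape the likelihood terms, so that hearing a given sound shifts the listener's expectation towards the shape that typically produces it.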

As noted by Sidhu and Pexman (2018), the statistical co-occurrences explanation relies upon experience, and therefore assumes that there is no innate human sensitivity to sound symbolism. Therefore, according to this explanation alone, it might be expected that sound-symbolic effects are not present at birth or in congenitally deaf individuals. Yet sound symbolism effects have been observed in 4-month-old infants (Ozturk et al., 2013; Peña et al., 2011), and although the effect is diminished in individuals with prelingual auditory deprivation, it is still present at above-chance levels (Gold & Segal, 2020). It is quite possible that statistical learning begins from birth; however, it seems plausible that humans may also have an innate sensitivity to sound symbolism. A related view therefore argues that statistical co-occurrences underpin sound symbolism, but that these associations are innate and not acquired (Ohala, 1994). According to this account, the association between high frequency and smallness, and low frequency and largeness, is innate and universal across species. It is, however, difficult to derive testable hypotheses from this proposal, and a distinction is yet to be made between potential innate statistical associations and cross-modal mappings acquired from birth.

Another potential mechanism that also relies on statistical co-occurrences - though this time occurring within language itself - is what Sidhu and Pexman (2018) label language patterns. Here, sound symbolic associations are argued to emerge from co-occurrences between phonological and semantic features, with repeated exposure to phoneme clusters with similar meanings (e.g., phonesthemes) leading to formation of iconic associations. Such patterns can then be used to generate original words, for instance participants using ‘gl-’ (as in ‘gleam’, ‘glisten’, ‘glow’) to create a novel word related to brightness (Magnus, 2001). However, as noted previously, it is unclear whether such language patterns can be considered to underpin or emerge from sound symbolic associations.

Sound-symbolic associations have also been explained via the broad mechanism of shared properties between phonemes and associated stimuli, which Sidhu and Pexman (2018) argue includes both perceptual links between phonemes and articulatory gestures, as well as higher level conceptual links shared across modalities, for example higher pitch having a connotative association with sharpness. As an example of the former, Sapir (1929) hypothesised that participants might associate high front vowels (e.g., /i/ in ‘mil’) with small shapes, as the articulation of such vowels requires a smaller oral cavity. In this way, the shape and phoneme share the property of smallness. Shared properties may also be used to explain the bouba/kiki effect, as the pointed lines in the ‘kiki’ shape mimic sharp phonemic inflections of the sound ‘kiki’, as well as sharp inflection of the tongue on the palate (Ramachandran & Hubbard, 2001). Similarly, the round shape may be labelled ‘bouba’ due to the corresponding roundness of mouth and lips when producing the phoneme /u/. In this way, there is a physical relationship between mouth shape and referent object shape, which may provide grounds for resemblance-based association. Ramachandran and Hubbard (2001) suggested that this may occur via cross-modal cortical connections between representations of mouth shapes in motor areas and phonemic representations in proximal auditory regions of the brain. These sensorimotor mappings are proposed to be similar to those in synaesthesia (where a stimulus in one modality triggers the expected sensory experience, as well as activation in at least one other modality) and facilitate non-arbitrary links between an object’s visual form and its neural representation. The authors argued that the angular gyrus is important for this cross-modal association, as it is well-positioned between temporal, parietal, and occipital lobes. 
They also provide neuropsychological evidence of a patient with left angular gyrus damage who did not exhibit the bouba/kiki effect. Ramachandran and Hubbard’s proposal was questioned by Ikegami and Zlatev (2007), who argued that since shape symbolism is not involuntary and often not consciously perceived, it is fundamentally distinct from synaesthesia. The shared properties mechanism is also explored in the Front Oral Cavity (FOC) theory (Masuda, 2007), which proposes that the front oral cavity resonance frequency (which is high for high front vowels) may be the acoustic basis for an association between sound and meaning, and that kinaesthetic factors, such as tongue movement, may contribute to the effect. The FOC theory encompasses acoustic and articulatory bases, providing a possible explanation as to why deaf subjects are still sensitive to sound symbolism (Eberhardt, 1940) – they may feel the size of the oral cavity, and use other visual factors such as the degree of lip rounding.

Sound Symbolism in Language Acquisition and Language Evolution

Having provided an overview of the key forms of sound symbolism, and the proposals of their mechanisms discussed in the literature, we now turn to exploring the possible roles of sound symbolism and iconicity in both language acquisition and in the context of language evolution. We also use this evidence to introduce a novel framework for understanding how iconicity might fit into a neurobiological model of human language processing.

Sound Symbolism in Language Acquisition

The advantages of sound symbolism in language acquisition have been well documented. It has been found that words learned earlier tend to be more iconic, suggesting that sound symbolism may support word learning (Perry et al., 2015). A study by Imai et al. (2008) suggested that iconicity facilitates early verb learning, and experiments in adults have shown that sound symbolism in natural language may aid word learning too (Lockwood et al., 2016). Sound symbolism may facilitate word learning via ‘grounding’, whereby resemblance-based form-meaning relationships allow for shared understanding and the establishment of communication (Cuskley & Sommer, forthcoming). For example, if iconic mappings are shared (e.g., /i/ and smallness), when combined with other multimodal inputs (e.g., pointing) this may facilitate shared attention and generation of a form-meaning relationship. A shared understanding of intuitive form-meaning associations may thus facilitate the formation of a lexicon – in other words sound symbolism may act as a scaffold for mapping semantic information, thus bootstrapping word learning (Imai & Kita, 2014; Monaghan et al., 2014). Thus, sound symbolism may aid bootstrapping via establishing ‘referentiality’ (an ability to map linguistic form to meaning) – a process fundamental to language learning, in combination with Hebbian learning and joint attentional processes (Perniss & Vigliocco, 2014). In a recent review, Nielsen and Dingemanse (2021) found strong evidence for the role of sound symbolism in ‘local learning enhancement’ - where resemblance-based associations of certain lexical items influence the learning of those items - but not for ‘general learning enhancement’, where they influence the later learning of arbitrary items. 
In addition, it has been suggested that iconicity aids the comprehension of communicative signs (Perniss & Vigliocco, 2014) and that the imitative, performative nature of iconic words makes communication more vivid (Lockwood & Dingemanse, 2015). It has however also been noted that more research is required on these effects of sound symbolism, as there may be cross-linguistic variation in learning benefits.

Iconicity in Language Evolution

Whilst the role of sound symbolism in language acquisition has been well established, its role in the evolution of language is a matter of continuing debate. The beginning of use of words as a communicative means was arguably a pivotal point in human evolution, with small articulations permitting rapid and efficient encoding of information, as well as wide and distant broadcast and communication without sight. However, given the relatively short period of recorded human history, a cross-species comparison is necessary to answer questions relating to language evolution, and the role of iconicity in this context.

The core platform for language is face-to-face communication, as this is how languages are learnt and most used. Manual gesture is seen as a likely evolutionary precursor to vocal communication (Rizzolatti & Arbib, 1998), and iconicity has been considered as a logical entry into the language system (Armstrong, 1983), perhaps acting as a bridge between manual (gesture) communicative systems and the verbal language we use today. For example, Levinson and Holler (2014) propose an evolutionary stratification of human communicative systems, whereby the different layers (e.g., joint attention, iconic gesture, turn-taking, and vocalisations) vary in antiquity. Declarative pointing is a form of signalling which facilitates mutual gaze to objects and thus allows for joint attention, while iconic gestures allow the effective depiction of motion, size and spatial relations between objects, such as an offering of something presented with the hand or other body parts (Liebal et al., 2006) or a request made with an open hand (Pollick & De Waal, 2007). The repeated use of iconic gestures would have arguably facilitated the grounding and memorisation of representations, leading to conventionalisation and hence increasing levels of abstraction (Garrod et al., 2007). In this way, iconicity may have been important for achieving displacement, in other words the ability to refer to things that are spatially and/or temporally remote (Perniss & Vigliocco, 2014). Displacement would be required to allow hominins to progress from a communication system based on functional reference and pointing to a system based on conceptual reference, and iconicity may therefore have contributed to the development of the cognitive ability required to use conceptually referential signals.

While the hypothesis above argues for gesture as a precursor to language in the vocal domain (e.g., Rizzolatti & Arbib, 1998), Perniss et al. (2010) reject this claim, proposing instead that language in manual and vocal modalities must have co-evolved, as linguistic and imagistic components are tightly integrated. This embodied link between linguistic form and sensorimotor experience is argued to reduce the cognitive ability needed to unite signs and referents, with linguistic form activating the same systems used in perception and action. The finding that ideophones are more common in narrative contexts and occur alongside iconic gesture (Dingemanse, 2013) is taken as evidence for embodiment in language, with iconicity and gesture contributing to a multimodal act of depiction. According to this account, the innateness of embodiment is shown by close connections between the hand and mouth in the somatotopic organisation of the human motor cortex (Meier et al., 2008) and by congenitally blind individuals who gesture while speaking despite never having observed this (Iverson & Goldin-Meadow, 2001).

There is also some evidence that non-human primates may utilise the processing advantages of embodiment through iconic gestural communication. Studies have demonstrated that orangutans and chimpanzees can use iconic gestures (pantomime) to represent objects and mimic actions related to their use. Notably, these apes often elaborate on gestures that fail to elicit the desired response from their recipient, showcasing flexibility in their communication. This behaviour has been observed in both great apes raised in captivity (e.g., Miles et al., 1996; Tanner et al., 2006), and in forest-living rehabilitant orangutans (Russon & Andrews, 2011). However, the evidence for the use of iconic gestures in apes remains contested. Some researchers argue that non-human primates do not produce truly iconic gestures, as there is no clear requirement for the recipient to infer a resemblance between the gesture and its intended meaning (see Byrne et al., 2017; Tomasello & Call, 2019). Adding nuance to this debate, Perlman et al. (2012) point to a continuity between gesture and instrumental action in apes, suggesting that gestures are influenced by immediate physical and social contexts and can be adapted into spontaneous iconic gestures on-the-spot.

Of course, the great difficulty when examining the relationship between form and meaning from an evolutionary perspective is that it is highly contested whether animal signals can be said to have meaning (i.e., whether non-humans cognitively represent referents; Moore, 2014; Rendall et al., 2009; Scott-Phillips, 2015). This has led some authors to develop alternative cross-species approaches to examining arbitrariness and iconicity, such as Watson et al.’s (2022) ‘optionality’ framework, which highlights the presence of linguistic features of arbitrariness in non-human communication. Others (Fischer & Price, 2017) argue that non-human primates do not express communicative or informative intent, and so non-human communication (vocalisations and manual gesture) should be conceived as goal-directed behaviour only, with nothing more than a probabilistic causal link between the signifier and signified. For example, a chimpanzee may wave an arm in a ‘beckoning’ gesture to another simply because this produces the desired response in the recipient. Warren and Call (2022), however, argue that non-human primates may be capable of applying social inferences to a communicative act, and that mentalistic processes may underlie the outcomes of such acts. Their model of ‘inferential communication’ argues that visual perspective taking and knowledge attribution are required for communicative exchange, helping to bridge the gap between animal and human communication.

Yet another useful approach to analysing the role of iconicity in language evolution is to consider behavioural imitation of others. The human ability to imitate the actions of others is fundamental to turn-taking, which is one of the critical foundations of language evolution (Levinson & Holler, 2014). Imitation is supported by mirror neurons, which code for manual goal-directed movement and fire both when an individual performs a manual task and when that individual watches another perform the same task. They allow recognition of another’s action, as the same neural activation necessary to produce an action is generated via observation. This is important for the development of mutual understanding and an ability to share meaning. Iconicity and the mirror neuron system may therefore contribute to the emergence of expressions, as is seen in signed languages where new signs emerge from iconic gesture (Ahlner & Zlatev, 2010). In non-human primates, mirror neurons have been found in the ventral premotor cortex (comparable to the location of the mirror neuron system in humans) and are argued to have provided a bridge between iconic gesture and the imitation of behaviour (Rizzolatti & Arbib, 1998; see also Ramachandran & Hubbard, 2001). There is also evidence that chimpanzees and other great apes show imitative abilities beyond those of other animals (Bates & Byrne, 2010), which challenges the long-held view that humans are ‘imitators’ but non-human primates are ‘emulators’ (e.g., Tennie et al., 2006; Tomasello, 1996; Tomasello et al., 1987). It is argued that similarities in cultural transmission across species indicate our common ancestor was capable of imitating actions with sufficient fidelity to transmit culturally variant behaviours within and between communities (Whiten et al., 2009).

Taken together, this evidence suggests that speech may have evolved from proto-dialogue between individuals based on iconic hand gestures and imitation, supported by mutual action recognition facilitated by the mirror-neuron system (Rizzolatti & Arbib, 1998). While sound-symbolism in contemporary language may be perceived as a vestige of the iconic protolanguage (Kita et al., 2010), its manifold roles and advantages in the context of language evolution arguably imply a more integral role within the human language system.

Iconicity Within the Neurobiological Framework for Human Language

According to one prominent model, the Dual Neurobiological Systems Hypothesis (DNS; Marslen-Wilson & Tyler, 2007), modern human communicative capabilities are controlled by joint activation of bihemispheric (BH) and left-lateralised neural systems, which interact but are functionally and evolutionarily distinguishable (Marslen-Wilson & Bozic, 2018). Broadly speaking, the left-lateralised system encompasses a network of left-hemisphere frontal and temporal regions surrounding the Sylvian fissure; it is unique to humans and supports complex syntactic functions. The bihemispheric system involves a broad network of fronto-temporal regions in both hemispheres. It underpins social communication through the processing of sound-to-meaning mapping, pragmatics, linear adjacency and multimodal interpretation (Bozic et al., 2010, 2015; Marslen-Wilson & Bozic, 2018), with its capacity for interpretation of multimodal social communication cues particularly relevant for the current context. The bihemispheric system is argued to be evolutionarily primary, with the evidence showing that this system and its functions are highly conserved in non-human primates (Ghazanfar et al., 2008; Seyfarth & Cheney, 2017; Wilson et al., 2015). As such, the bihemispheric system of non-human primates can be considered a promising analogue to the one present in humans at the early stages of language evolution. Given that most forms of iconicity are underpinned by ‘linguistic cross-modal correspondence’ (Cuskley & Sommer, forthcoming; Sidhu & Pexman, 2018), excluding only the most direct form-meaning associations such as onomatopoeia that remain within one sensory modality, and that cross-modal associations have also been noted in non-human primates (Ludwig et al., 2011), it is possible that iconicity played a role in bridging the gap between the gestural or imitative visual signs used in early communication and the lexicalised concepts that emerged later.
According to this view, cross-modal transfer may have developed into a more sophisticated cross-modal cognitive suite that provided humans with the neurological and behavioural architecture capable of storing and increasing cross-modal representations, thus affording us the ability to learn arbitrary symbols necessary for the modern linguistic capacity (Cuskley & Sommer, forthcoming). Therefore, iconicity may in part explain the evolutionary gap between the primate and the modern bihemispheric systems, ultimately setting a precedent for advancing communicative abilities that led to the evolution of the human linguistic capability. The bihemispheric system is likely to be heavily involved in the processing of iconic sounds, in line with evidence for activation in the bilateral superior temporal sulci (STS) in response to sound symbolic words (Kanero et al., 2014). Taking this evidence into account, we argue that placing iconicity within the neurobiological framework of the DNS hypothesis, and the bihemispheric system in particular, could help strengthen the argument that iconicity played a prominent role in early human language evolution.

Factors Limiting the Prevalence of Iconicity

If iconicity provides so many advantages, and is believed to be a stepping stone in the evolution of language, why is language still predominantly arbitrary? From an evolutionary perspective, features of language that improve processing and learnability should survive and become more common. Indeed, Jespersen (1922) found evidence of /i/ replacing other vowels in the historical development of words with meanings related to small size. This may have occurred because the sound symbolic associations of /i/ made words that contain this phoneme and describe small objects more favourable, and so more likely to survive in language evolution – leading to the claim that languages should become ‘richer and richer in symbolic words’.

There are, however, factors that limit the type of meanings that can be expressed iconically. Dingemanse et al. (2015) and Sidhu (2019) review several such factors. Firstly, iconic words must have a fairly distinctive meaning, as similar meanings call for similar forms, and so iconicity may lead to ambiguity unless the referent is very distinct. A wholly iconic language would be dominated by words with similar forms and meanings, leading to uncertainties and deficiencies in processing and learnability. Secondly, the amount of sensory information contained within a referent may limit whether or not it can be described via an iconic relationship. Iconic mappings require sensory features, which restricts their use in language systems. Lupyan and Winter (2018) expand on this further, arguing that iconicity is limited in the expression of abstract concepts in particular, as iconic expressions are too strongly linked to sensory features or certain contexts. For example, certain metaphorical extensions found in English and Hebrew are not possible in Israeli Sign Language (ISL) and American Sign Language as the signs are too iconic (Meir, 2010). One example given is that while English allows for the verb “eat” to extend metaphorically (e.g., “The acid ate the iron key”), ISL’s sign for “eat” is too specific, depicting a human eating action at the mouth, which restricts such extensions. Meir discusses how this limitation is also seen in spoken languages where iconic sound-related words are less likely to be used metaphorically across sensory dimensions in ‘synaesthetic metaphors’. These examples demonstrate how iconicity may limit the meaning of an expression to a particular context, and thus its potential to become more abstract.

Iconicity in language may also be influenced by linguistic laws such as Zipf’s law of abbreviation, which predicts a negative relationship between word length and frequency of use (i.e., words used more frequently tend to be shorter and vice versa). Zipf’s law, alongside Menzerath’s law (which states that longer communicative constructs are composed of shorter parts), captures the information theoretic principle of minimising code length (compression). Both laws are ubiquitous in communicative systems, having been observed in both manual gesture and vocal communications of non-human primates as well as humans (Heesen et al., 2019; Huang et al., 2020). These laws may decrease iconicity in language: as frequency of use increases, the need for efficiency prevails over iconicity, resulting in ‘lexical elaboration’ with increasing levels of arbitrariness (Haiman, 1985).
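The negative length-frequency relationship predicted by Zipf’s law of abbreviation can be illustrated with a minimal sketch. It assumes a Spearman rank correlation as the measure of association and uses an invented toy text; the names `average_ranks`, `spearman`, `length_frequency_correlation` and `TOY_TEXT` are hypothetical, and none of this reproduces the analyses or data of the studies cited above.

```python
from collections import Counter


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman rank correlation: the Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(xs)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def length_frequency_correlation(tokens):
    """Correlate each word type's length with its token frequency."""
    counts = Counter(tokens)
    words = list(counts)
    lengths = [len(word) for word in words]
    freqs = [counts[word] for word in words]
    return spearman(lengths, freqs)


# Invented toy text (an assumption for illustration, not corpus data).
TOY_TEXT = (
    "the cat sat on the mat and the dog sat by the door "
    "the elephant wandered and the cat watched the elephant "
    "a dog and a cat and a bird sat on the extraordinary mat"
)

rho = length_frequency_correlation(TOY_TEXT.split())
print(f"Spearman rho (word length vs frequency): {rho:.2f}")
```

A negative value of `rho` indicates that the more frequent word types in the sample tend to be shorter, which is the pattern the law describes; a real test would require a large corpus and appropriate significance testing.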

Iconic words, such as ideophones, also seem to resist deep integration into the grammatical apparatus of a language. According to Dingemanse (2017), this reflects conflict between grammatical integration and the strength of iconic form-meaning mappings – if fully integrated, iconic expressions are subjected to processes of reduction and regularisation, so iconicity is reduced. This syntactic conventionalisation diminishes the role of iconicity and could perhaps be further evidence for the processing of iconic signals within the bihemispheric system, as it is the specialised left-hemisphere frontotemporal system that provides the principal neurobiological substrate of the core, distinctly human-specific grammatical capacity (Bozic et al., 2010, 2015), and iconicity seems to resist grammatical integration.

The prevalence of iconicity in language is also limited by the advantages of arbitrariness. According to Hockett (1963), the flexibility of signalling afforded by arbitrariness is a fundamental property of language. Arbitrariness allows reference to any possible concept and, unlike iconicity, allows for easy discrimination between similar entries in a lexicon. This allows for larger lexica to develop (Gasser, 2004) and allows communication about concepts for which direct perceptual grounding is not possible. A highly iconic lexicon with similar phonological forms for similar meanings would lead to high confusability and decreased communicative effectiveness (Perniss et al., 2010).

Conclusions

In summary, the ancient debate over whether the linguistic sign is arbitrary has been clouded by unnecessary polarisation. A resemblance between form and meaning cannot be sufficient for understanding meaning, and conventionalisation of language throughout evolution is inevitable and necessary to maintain the efficiency and versatility of language. Hence, while much of the historic debate until recent years has been binary, words cannot be entirely arbitrary or iconic – they fall on a spectrum instead. Iconicity and arbitrariness each confer their own advantages in a linguistic system, representing the adaptation of languages to the constraints of needing to link linguistic form with human experience whilst ensuring an effective signal. It is these constraints that drive the evolution and development of linguistic systems, adding further weight to the hypothesis that theories of language (both phylogenetic and ontogenetic) must value iconicity in addition to arbitrariness, and that sound symbolism should be regarded as a core principle of language, and not merely a peripheral phenomenon.

Footnotes

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

ORCID iD

Harry Barker

Data Availability Statement

Data sharing not applicable to this article as no datasets were generated or analysed during the current study.

Author Biographies

Harry Barker BA (Hons) is a fifth-year medical student at the University of Cambridge, and a Senior Whitby Scholar of Downing College, Cambridge. He intercalated to receive a bachelor's degree in Psychology during his third year of undergraduate study. His research interests are iconicity and language evolution, as well as child and adolescent mental health and media portrayals of psychiatry.

Mirjana Bozic PhD is Associate Professor at the Department of Psychology, University of Cambridge, and Fellow and Director of Studies for Psychological and Behavioural Sciences at King's College, Cambridge. She is a cognitive neuroscientist who studies the cognitive and neural mechanisms that support language comprehension, and how these mechanisms may have evolved. She also researches how brains adapt to the requirements of learning and using multiple languages.

References

Ahlner & Zlatev (2010). Cross-modal iconicity: A cognitive semiotic approach to sound symbolism. Sign Systems Studies, 38(1/4), 298–348. https://doi.org/10.12697/SSS.2010.38.1-4.11

Armstrong, D. F. (1983). Iconicity, arbitrariness, and duality of patterning in signed and spoken language: Perspectives on language evolution. Sign Language Studies, 38(Spring), 51–69.

Auracher (2017). Sound iconicity of abstract concepts: Place of articulation is implicitly associated with abstract concepts of size and social dominance. PLoS One, 12(11), Article e0187196. https://doi.org/10.1371/JOURNAL.PONE.0187196

Baayen, R. H., Milin, Đurđević, D. F., Hendrix, & Marelli (2011). An amorphous model for morphological processing in visual comprehension based on naive discriminative learning. Psychological Review, 118(3), 438–481. https://doi.org/10.1037/a0023851

Bates, L. A., & Byrne, R. W. (2010). Imitation: What animal imitation tells us about animal cognition. Wiley Interdisciplinary Reviews: Cognitive Science, 1(5), 685–695. https://doi.org/10.1002/wcs.77

Bozic, Fonteneau, & Marslen-Wilson, W. D. (2015). Grammatical analysis as a distributed neurobiological function. Human Brain Mapping, 36(3), 1190–1201. https://doi.org/10.1002/HBM.22696

Bozic, Tyler, L. K., Ives, D. T., Randall, & Marslen-Wilson, W. D. (2010). Bihemispheric foundations for human speech comprehension. Proceedings of the National Academy of Sciences of the United States of America, 107(40), 17439–17444. https://doi.org/10.1073/pnas.1000531107

Bremner, A. J., Caparos, Davidoff, de Fockert, Linnell, K. J., & Spence (2013). “Bouba” and “kiki” in Namibia? A remote culture make similar shape-sound matches, but different shape-taste matches to westerners. Cognition, 126(2), 165–172. https://doi.org/10.1016/J.COGNITION.2012.09.007

Byrne, R. W., Cartmill, Genty, Graham, K. E., Hobaiter, & Tanner (2017). Great ape gestures: Intentional communication with a rich set of innate signals. Animal Cognition, 20(4), 755–769. https://doi.org/10.1007/s10071-017-1096-4

Cuskley & Kirby (2013). Synaesthesia, cross-modality and language evolution. In Simner & Hubbard, E. M. (Eds.), Oxford handbook of synesthesia (pp. 869–907). Oxford University Press. https://doi.org/10.1093/oxfordhb/9780199603329.013.0043

Cuskley, Simner, & Kirby (2017). Phonological and orthographic influences in the bouba-kiki effect. Psychological Research, 81(1), 119–130. https://doi.org/10.1007/s00426-015-0709-2

Cuskley & Sommer (forthcoming). The evolution of linguistic iconicity and the cross-modal cognitive suite. In Oxford Handbook of Iconicity in Language. Oxford University Press.

Ćwiek, Fuchs, Draxler, Asu, E. L., Dediu, Hiovain, Kawahara, Koutalidis, Krifka, Lippus, Lupyan, G. E., Paul, Petrone, Ridouane, Reiter, Schümchen, Szalontai, Á., Ünal-Logacev, Ö., & Winter (2022). The bouba/kiki effect is robust across cultures and writing systems. Philosophical Transactions of the Royal Society B: Biological Sciences, 377(1841), Article 20200390. https://doi.org/10.1098/RSTB.2020.0390

Davis, C. P., Morrow, H. M., & Lupyan (2019). What does a horgous look like? Nonsense words elicit meaningful drawings. Cognitive Science, 43(10), Article e12791. https://doi.org/10.1111/COGS.12791

Dingemanse (2013). Ideophones and gesture in everyday speech. Gesture, 13(2), 143–165. https://doi.org/10.1075/gest.13.2.02din

Dingemanse (2017). Expressiveness and system integration: On the typology of ideophones, with special reference to Siwu. STUF - Language Typology and Universals, 70(2), 363–385. https://doi.org/10.1515/stuf-2017-0018

Dingemanse, Blasi, D. E., Lupyan, Christiansen, M. H., & Monaghan (2015). Arbitrariness, iconicity, and systematicity in language. Trends in Cognitive Sciences, 19(10), 603–615. https://doi.org/10.1016/J.TICS.2015.07.013

Eberhardt (1940). I. A summary of some preliminary investigations of the deaf. Psychological Monographs, 52(1), 1–5. https://doi.org/10.1037/H0093460

Fischer & Price (2017). Meaning, intention, and inference in primate vocal communication. Neuroscience & Biobehavioral Reviews, 82, 22–31. https://doi.org/10.1016/J.NEUBIOREV.2016.10.014

Fort, Lammertink, Peperkamp, Guevara-Rukoz, Fikkert, & Tsuji (2018). Symbouki: A meta-analysis on the emergence of sound symbolism in early language acquisition. Developmental Science, 21(5), Article e12659. https://doi.org/10.1111/DESC.12659

Fort & Schwartz, J.-L. (2022). Resolving the bouba-kiki effect enigma by rooting iconic sound symbolism in physical properties of round and spiky objects. Scientific Reports, 12(1), Article 19172. https://doi.org/10.1038/s41598-022-23623-w

Gallace, Boschin, & Spence (2011). On the taste of “Bouba” and “Kiki”: An exploration of word-food associations in neurologically normal participants. Cognitive Neuroscience, 2(1), 34–46. https://doi.org/10.1080/17588928.2010.516820

Garrod, Fay, Lee, Oberlander, & MacLeod (2007). Foundations of representation: Where might graphical symbol systems come from? Cognitive Science, 31(6), 961–987. https://doi.org/10.1080/03640210701703659

Gasser (2004). The origins of arbitrariness in language A language design task. Proceedings of the Annual Meeting of the Cognitive Science Society, 26(26). https://escholarship.org/uc/item/34g8355v

Ghazanfar, A. A., Chandrasekaran, & Logothetis, N. K. (2008). Interactions between the superior temporal sulcus and auditory cortex mediate dynamic face/voice integration in rhesus monkeys. Journal of Neuroscience, 28(17), 4457–4469. https://doi.org/10.1523/JNEUROSCI.0541-08.2008

Gold & Segal (2020). The bouba–kiki effect in persons with prelingual auditory deprivation. Language Learning and Development, 16(1), 49–60. https://doi.org/10.1080/15475441.2019.1685386

Haiman (1985). Natural syntax. Iconicity and erosion. In Cambridge studies in linguistics (Vol. 44). Cambridge University Press. https://www.jstor.org/action/showPublisher?publisherCode=springer

Heesen, Hobaiter, Ferrer-I-Cancho, & Semple (2019). Linguistic laws in chimpanzee gestural communication. Proceedings of the Royal Society B, 286(1896), Article 20182900. https://doi.org/10.1098/RSPB.2018.2900

Hirata, Ukita, & Kita (2011). Implicit phonetic symbolism in voicing of consonants and visual lightness using Garner’s speeded classification task. Perceptual and Motor Skills, 113(3), 929–940. https://doi.org/10.2466/15.21.28.PMS.113.6.929-940

Hockett, C. F. (1963). The Problem of Universals in Language. In Greenberg, J. H. (Ed.), Universals of Language (pp. 1–22). MIT Press.

Huang, Garber, P. A., & Fan (2020). Male gibbon loud morning calls conform to Zipf’s law of brevity and Menzerath’s law: Insights into the origin of human language. Animal Behaviour, 160(2), 145–155. https://doi.org/10.1016/J.ANBEHAV.2019.11.017

Ikegami & Zlatev (2007). From non-representational cognition to language. In Ziemke, Zlatev, & Frank, R. M. (Eds.), Body, Language and Mind (Vol. 1, pp. 241–283). De Gruyter Mouton.

Imai & Kita (2014). The sound symbolism bootstrapping hypothesis for language acquisition and language evolution. Philosophical Transactions of the Royal Society B: Biological Sciences, 369(1651), Article 20130298. https://doi.org/10.1098/RSTB.2013.0298

Imai, Kita, Nagumo, & Okada (2008). Sound symbolism facilitates early verb learning. Cognition, 109(1), 54–65. https://doi.org/10.1016/j.cognition.2008.07.015

Iverson, J. M., & Goldin-Meadow (2001). The resilience of gesture in talk: Gesture in blind speakers and listeners. Developmental Science, 4(4), 416–422. https://doi.org/10.1111/1467-7687.00183

Jespersen (1922). Symbolic value of the vowel i. Phil J Comp Phil, 1, 15–33.

Johansson, Anikin, & Aseyev (2020). Color sound symbolism in natural languages. Language and Cognition, 12(1), 56–83. https://doi.org/10.1017/LANGCOG.2019.35

Kanero, Imai, Okuda, Okada, & Matsuda (2014). How sound symbolism is processed in the brain: A study on Japanese mimetic words. PLoS One, 9(5), Article e97905. https://doi.org/10.1371/JOURNAL.PONE.0097905

Kita, Kantartzis, & Imai (2010). Children learn sound symbolic words better: Evolutionary vestige of sound symbolic protolanguage. In Schouwstra, de Boer, & Smith, A. D. M. (Eds.), The Evolution of Language: Proceedings of the 8th International Conference (Evolang8) (pp. 206–213). World Scientific. https://doi.org/10.1142/9789814295222_0027

Köhler (1929). Gestalt psychology. Liveright.

Koriat & Levy (1977). The symbolic implications of vowels and of their orthographic representations in two natural languages. Journal of Psycholinguistic Research, 6(2), 93–103. https://doi.org/10.1007/bf01074374

Levinson, S. C., & Holler (2014). The origin of human multi-modal communication. Philosophical Transactions of the Royal Society B: Biological Sciences, 369(1651), Article 20130302. https://doi.org/10.1098/RSTB.2013.0302

Liebal, Pika, & Tomasello (2006). Gestural communication of orangutans (Pongo pygmaeus).

Locke (1690). An essay concerning human understanding (WLC, 2009).

Lockwood & Dingemanse (2015). Iconicity in the lab: A review of behavioral, developmental, and neuroimaging research into sound-symbolism. Frontiers in Psychology, 6, 1246. https://doi.org/10.3389/fpsyg.2015.01246

Lockwood, Dingemanse, & Hagoort (2016). Sound-symbolism boosts novel word learning. Journal of Experimental Psychology: Learning Memory and Cognition, 42(8), 1274–1281. https://doi.org/10.1037/XLM0000235

Ludwig, V. U., Adachi, & Matsuzawa (2011). Visuoauditory mappings between high luminance and high pitch are shared by chimpanzees (Pan troglodytes) and humans. Proceedings of the National Academy of Sciences of the United States of America, 108(51), 20661–20665. https://doi.org/10.1073/pnas.1112605108

Lupyan & Winter (2018). Language is more abstract than you think, or, why aren’t languages more iconic? Philosophical Transactions of the Royal Society B: Biological Sciences, 373(1752), Article 20170137. https://doi.org/10.1098/rstb.2017.0137

Maglio, S. J., Rabaglia, C. D., Feder, M. A., Krehm, & Trope (2014). Vowel sounds in words affect mental construal and shift preferences for targets. Journal of Experimental Psychology: General, 143(3), 1082–1096. https://doi.org/10.1037/a0035543

Magnus (2001). What’s in a word?: Studies in phonosemantics. 202. https://ntnuopen.ntnu.no/ntnu-xmlui/handle/11250/243679

Marslen-Wilson, W. D., & Bozic (2018). Dual neurobiological systems underlying language evolution: Inferring the ancestral state. Current Opinion in Behavioral Sciences, 21, 176–181. https://doi.org/10.1016/J.COBEHA.2018.05.004

Marslen-Wilson, W. D., & Tyler, L. K. (2007). Morphology, language and the brain: The decompositional substrate for language comprehension. Philosophical Transactions of the Royal Society B: Biological Sciences, 362(1481), 823–836. https://doi.org/10.1098/rstb.2007.2091

Masuda (2007). The physical basis for phonological iconicity. In Tabakowska, Ljungberg, & Fischer (Eds.), Insistent Images (pp. 57–71). John Benjamins. https://doi.org/10.1075/ill.5.08mas

Maurer, Pathman, & Mondloch, C. J. (2006). The shape of boubas: Sound–shape correspondences in toddlers and adults. Developmental Science, 9(3), 316–322. https://doi.org/10.1111/J.1467-7687.2006.00495.X

McCormick, Kim, J. Y., List, & Nygaard, L. C. (2015). Sound to meaning mappings in the Bouba-Kiki effect. CogSci, 2015, 1565–1570. https://www.researchgate.net/publication/309485703

Meier, J. D., Aflalo, T. N., Kastner, & Graziano, M. S. A. (2008). Complex organization of human primary motor cortex: A high-resolution fMRI study. Journal of Neurophysiology, 100(4), 1800–1812. https://doi.org/10.1152/JN.90531.2008

Meir (2010). Iconicity and metaphor: Constraints on metaphorical extension of iconic forms. Language, 86(4), 865–896. https://doi.org/10.1353/lan.2010.0044

Miles, H. L., Mitchell, R. W., & Harper, S. E. (1996). Simon says: The development of imitation in an enculturated orangutan. In Russon, A. E., Bard, K. A., & Parker, S. T. (Eds.), Reaching into thought: The minds of the great apes (pp. 278–299). Cambridge University Press.

Monaghan, Shillcock, R. C., Christiansen, M. H., & Kirby (2014). How arbitrary is language? Philosophical Transactions of the Royal Society B: Biological Sciences, 369(1651), Article 20130299. https://doi.org/10.1098/RSTB.2013.0299

Moore (2014). Ape gestures: Interpreting chimpanzee and bonobo minds. Current Biology, 24(14), R645–R647. https://doi.org/10.1016/j.cub.2014.05.072

Newman, S. S. (1933). Further experiments in phonetic symbolism. American Journal of Psychology, 45(1), 53. https://doi.org/10.2307/1414186

Nielsen & Rendall (2011). The sound of round: Evaluating the sound-symbolic role of consonants in the classic takete-maluma phenomenon. Canadian Journal of Experimental Psychology, 65(2), 115–124. https://doi.org/10.1037/A0022268

Nielsen, A. K. S., & Dingemanse (2021). Iconicity in word learning and beyond: A critical review. Language and Speech, 64(1), 52–72. https://doi.org/10.1177/0023830920914339

Nielsen, A. K. S., & Rendall (2013). Parsing the role of consonants versus vowels in the classic Takete-Maluma phenomenon. Canadian Journal of Experimental Psychology, 67(2), 153–163. https://doi.org/10.1037/a0030553

Ohala, J. J. (1994). The frequency code underlies the sound-symbolic use of voice pitch (pp. 325–347). Sound Symbolism.

Ozturk, Krehm, & Vouloumanos (2013). Sound symbolism in infancy: Evidence for sound-shape cross-modal correspondences in 4-month-olds. Journal of Experimental Child Psychology, 114(2), 173–186. https://doi.org/10.1016/J.JECP.2012.05.004

Peirce, C. S. (1974). Collected papers of Charles Sanders Peirce (6th ed.). Harvard University Press.

Peña, Mehler, & Nespor (2011). The role of audiovisual processing in early conceptual development. Psychological Science, 22(11), 1419–1421. https://doi.org/10.1177/0956797611421791

Perlman, Little, & Thompson, R. L. (2018). Iconicity in signed and spoken vocabulary: A comparison between American Sign Language, British Sign Language, English, and Spanish. Frontiers in Psychology, 9, 1433. https://doi.org/10.3389/FPSYG.2018.01433

Perlman, Tanner, J. E., & King, B. J. (2012). A mother gorilla’s variable use of touch to guide her infant. In Pika & Liebal (Eds.), Developments in primate gesture research (pp. 55–72). John Benjamins Publishing. https://doi.org/10.1075/gs.6.04per

Perniss, Thompson, R. L., & Vigliocco (2010). Iconicity as a general property of language: Evidence from spoken and signed languages. Frontiers in Psychology, 1, 227. https://doi.org/10.3389/FPSYG.2010.00227

Perniss & Vigliocco (2014). The bridge of iconicity: From a world of experience to the experience of language. Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences, 369(1651), Article 20130300. https://doi.org/10.1098/RSTB.2013.0300

Perry, L. K., Perlman, & Lupyan (2015). Iconicity in English and Spanish and its relation to lexical category and age of acquisition. PLoS One, 10(9), Article e0137147. https://doi.org/10.1371/journal.pone.0137147

Pinker & Bloom (1990). Natural language and natural selection. Behavioral and Brain Sciences, 13(4), 707–727. https://doi.org/10.1017/S0140525X00081061

Pollick, A. S., & De Waal, F. B. M. (2007). Ape gestures and language evolution. Proceedings of the National Academy of Sciences of the United States of America, 104(19), 8184–8189. https://doi.org/10.1073/PNAS.0702624104

Ramachandran, V. S., & Hubbard, E. M. (2001). Synaesthesia – A window into perception, thought and language. Journal of Consciousness Studies, 8(12), 3–34. https://www.researchgate.net/publication/318494178_Synaesthesia---AWindow_Into_Perception_Thought_and_Language

Reilly & Kean (2007). Formal distinctiveness of high- and low-imageability nouns: Analyses and theoretical implications. Cognitive Science, 31(1), 157–168. https://doi.org/10.1080/03640210709336988

Rendall, Owren, M. J., & Ryan, M. J. (2009). What do animal signals mean? Animal Behaviour, 78(2), 233–240. https://doi.org/10.1016/J.ANBEHAV.2009.06.007

Rizzolatti & Arbib, M. A. (1998). Language within our grasp. Trends in Neurosciences, 21(5), 188–194. https://doi.org/10.1016/S0166-2236(98)01260-0

Rogers, S. K., & Ross, A. S. (1975). A cross cultural test of the Maluma Takete phenomenon. Perception, 4(1), 105–106. https://doi.org/10.1068/P040105

Russon & Andrews (2011). Orangutan pantomime: Elaborating the message. Biology Letters, 7(4), 627–630. https://doi.org/10.1098/RSBL.2010.0564

Sapir (1929). A study in phonetic symbolism. Journal of Experimental Psychology, 12(3), 225–239. https://doi.org/10.1037/H0070931

Saussure, F. de. (1959). Course in general linguistics.

Schmidtke, D. S., Conrad, & Jacobs, A. M. (2014). Phonological iconicity. Frontiers in Psychology, 5, 80. https://doi.org/10.3389/fpsyg.2014.00080

Scott-Phillips, T. C. (2015). Meaning in animal and human communication. Animal Cognition, 18(3), 801–805. https://doi.org/10.1007/S10071-015-0845-5

Seyfarth, R. M., & Cheney, D. L. (2017). Precursors to language: Social cognition and pragmatic inference in primates. Psychonomic Bulletin and Review, 24(1), 79–84. https://doi.org/10.3758/s13423-016-1059-9

Sidhu, D. M. (2019). Explorations of sound symbolism and iconicity. Doctoral thesis. University of Calgary. https://prism.ucalgary.ca

88. Sidhu D. M., Pexman P. M. (2018). Five mechanisms of sound symbolic association. Psychonomic Bulletin and Review, 25(5), 1619–1643. https://doi.org/10.3758/s13423-017-1361-1

89. Sidhu D. M., Westbury, Hollis, Pexman P. M. (2021). Sound symbolism shapes the English language: The maluma/takete effect in English nouns. Psychonomic Bulletin and Review, 28(4), 1390–1398. https://doi.org/10.3758/S13423-021-01883-3

90. Simner, Cuskley, Kirby (2010). What sound does that taste? Cross-Modal mappings across gustation and audition. Perception, 39(4), 553–569. https://doi.org/10.1068/p6591

91. Spence (2011). Crossmodal correspondences: A tutorial review. Attention, Perception, & Psychophysics, 73(4), 971–995. https://doi.org/10.3758/S13414-010-0073-7

92. Styles S. J., Gawne (2017). When does maluma/takete fail? Two key failures and a meta-analysis suggest that phonology and phonotactics matter. I-Perception, 8(4), Article 2041669517724807. https://doi.org/10.1177/2041669517724807

93. Svantesson J. O. (2017). Sound symbolism: The role of word sound in meaning. Wiley Interdisciplinary Reviews: Cognitive Science, 8(5), Article e1441. https://doi.org/10.1002/WCS.1441

94. Tanner J. E., Patterson F. G., Byrne R. W. (2006). The development of spontaneous gestures in zoo-living gorillas and sign-taught gorillas: From action and location to object representation. Journal of Developmental Processes, 1, 69–103. https://www.researchgate.net/publication/237516373_The_Development_of_Spontaneous_Gestures_in_Zoo-Living_Gorillas_and_Sign-Taught_Gorillas_From_Action_and_Location_to_Object_Representation.

95. Tennie, Call, Tomasello (2006). Push or pull: Imitation vs. emulation in great apes and human children. Ethology, 112(12), 1159–1169. https://doi.org/10.1111/J.1439-0310.2006.01269.X

96. Thompson P. D., Estes (2011). Sound symbolic naming of novel objects is a graded function. Quarterly Journal of Experimental Psychology, 64(12), 2392–2404. https://doi.org/10.1080/17470218.2011.605898

97. Tomasello (1996). Do apes ape? In Heyes C. M., Galef B. G. Jr. (Eds.), Social learning in animals: The roots of culture (pp. 319–346). Academic Press. https://doi.org/10.1016/B978-012273965-1/50016-9

98. Tomasello, Call (2019). Thirty years of great ape gestures. Animal Cognition, 22(4), 461–469. https://doi.org/10.1007/s10071-018-1167-1

99. Tomasello, Davis-Dasilva, Camak, Bard (1987). Observational learning of tool-use by young chimpanzees. Human Evolution, 2(2), 175–183. https://doi.org/10.1007/BF02436405

100. Vainio (2021). Sound-action symbolism. Frontiers in Psychology, 12, 3848. https://doi.org/10.3389/fpsyg.2021.718700

101. Warren, Call (2022). Inferential communication: Bridging the gap between intentional and ostensive communication in non-human primates. Frontiers in Psychology, 12, Article 718251. https://doi.org/10.3389/FPSYG.2021.718251

102. Watson S. K., Filippi, Gasparri, Falk, Tamer, Widmer, Manser, Glock H. J. (2022). Optionality in animal communication: A novel framework for examining the evolution of arbitrariness. Biological Reviews of the Cambridge Philosophical Society, 97(6), 2057–2075. https://doi.org/10.1111/BRV.12882

103. Westbury (2005). Implicit sound symbolism in lexical access: Evidence from an interference task. Brain and Language, 93(1), 10–19. https://doi.org/10.1016/J.BANDL.2004.07.006

104. Westbury, Hollis, Sidhu D. M., Pexman P. M. (2018). Weighing up the evidence for sound symbolism: Distributional properties predict cue strength. Journal of Memory and Language, 99, 122–150. https://doi.org/10.1016/J.JML.2017.09.006

105. Whiten, McGuigan, Marshall-Pescini, Hopper L. M. (2009). Emulation, imitation, over-imitation and the scope of culture for child and chimpanzee. Philosophical Transactions of the Royal Society B: Biological Sciences, 364(1528), 2417–2428. https://doi.org/10.1098/RSTB.2009.0069

106. Wilson, Kikuchi, Sun, Hunter, Dick, Smith, Thiele, Griffiths T. D., Marslen-Wilson W. D., Petkov C. I. (2015). Auditory sequence processing reveals evolutionarily conserved regions of frontal cortex in macaques and humans. Nature Communications, 6, 8901. https://doi.org/10.1038/ncomms9901

Forms, Mechanisms, and Roles of Iconicity in Spoken Language: A Review
