Relationships Between Subjective Iconicity Ratings and Phonological Variables in English and Spanish

Abstract

Iconicity (the extent to which word forms resemble their meanings) is proposed to be based on universally accessible form mappings that depict/express sensory imagery. In the present study, we explored phonological structural features proposed to be characteristic of iconicity and subjective iconicity ratings in two large English and Spanish datasets. Restricting analyses to words with good rating agreement across participants, we show that the distributions of iconicity ratings differ considerably between the two languages, with far fewer Spanish words rated as iconic. Multiple regression analyses showed that structural markedness significantly predicted iconicity ratings in both languages, although the relationship was weaker in Spanish. Highly rated English forms included many phonaesthemes, that is, words with systematic sound-meaning mappings that can be iconic or non-iconic. Surprisingly, English and Spanish words rated higher in iconicity had larger phonological neighbourhoods despite comprising less frequently occurring phoneme sequences. In English, words rated as more iconic were also more likely to be polysemes (i.e. convey multiple, metaphorically-related meanings) than linked to a specific sensory meaning. Regression models revealed phonological/phonetic features, syllable structures and reduplications predicted significant proportions of variance in both English (33.3%) and Spanish iconicity ratings (50.8%), demonstrating both common and language-specific mappings. While our findings support the qualified use of subjective ratings for cross-linguistic comparisons of iconicity, we recommend researchers control for systematicity and polysemy and consider using additional/alternative measures to exclude non-iconic forms.

Keywords

iconicity; embodied cognition; phonology; word meaning

Introduction

Across spoken languages, there are some words whose forms sound like the meaning they convey, that is, they are iconic. Most are ideophones, a structurally marked, open class of words that depict imagery across sensory and motor domains (Thompson & Do, 2019). Some, such as onomatopoetic forms, directly imitate an auditory percept via articulatory gestures (e.g. in English the words splash and thump mimic the sounds of objects hitting water or a solid surface, respectively) while others indirectly depict motion (e.g. zigzag references alternating left and right turns in English and Spanish).¹ A different category of words references sensorimotor imagery via a systematic pairing of sub-morphemic form features, as in phonaesthemes (e.g. in English, the consonant cluster gl- is linked to luminance/vision as in glare, glimmer, gloom, etc.; while fl- is linked to movement as in fling, flip, flutter, etc.; Zingler, 2017). Iconicity is proposed to manifest in only small sets of words cross-linguistically, being based on universal form-resemblance mappings, while systematic pairings are represented more extensively within languages reflecting their basis in statistical regularities (Blasi et al., 2016; Dingemanse et al., 2015; Haslett & Cai, 2023; Perniss et al., 2010; Thompson et al., 2021).

Recent attempts to operationalise iconicity have used subjective ratings based on lay definitions supported by examples (Hinojosa et al., 2021; Perry et al., 2015; Thompson et al., 2020; Winter et al., 2024). An advantage of this approach is that it allows data to be collected from large portions of the vocabulary, enabling direct comparisons between various spoken languages (Motamedi et al., 2019; Winter & Perlman, 2021). However, the basis for participants’ intuiting that a word “sounds like what it means” is not entirely clear. As Winter et al. (2024) acknowledged, iconicity ratings “underspecify the particular form-meaning links that lead to a rater’s intuitions . . . they do not give clues to the nature of this correspondence” (p. 11). It has been suggested that participants might simply rate the iconicity of words based on their sensory meanings rather than on their form-meaning resemblances (Thompson et al., 2020). Given that iconicity ratings are increasingly employed in psycholinguistic studies to support inferences about form-meaning mappings and theories of language embodiment (Dove, 2022; Lupyan & Winter, 2018; Perniss & Vigliocco, 2014; Perry et al., 2015; Sidhu & Pexman, 2018b; Sidhu et al., 2020), it is essential to clarify the nature of this correspondence.

Several lines of evidence have been cited in support of iconicity ratings referencing form-meaning resemblances using the two largest normative datasets in English (Winter et al., 2024; 14,776 words) and Spanish (Hinojosa et al., 2021; 10,995 words). These studies have primarily focused on semantic relationships and have shown that words rated higher in iconicity are also rated higher in terms of sensory experience in both languages (sensory experience ratings (SER); Díez-Álamo et al., 2019; Juhasz & Yap, 2013). While significant, this relationship is also weak (e.g. r = .20 and r = .24 in English and Spanish, respectively; de Zubicaray et al., 2024; Hinojosa et al., 2021), suggesting iconicity ratings do not merely recapitulate sensory experience (cf. Thompson et al., 2020). English words rated high in iconicity also have sparser semantic neighbourhoods involving fewer concepts with similar meanings, as might be expected for non-arbitrary form-meaning mappings (Sidhu & Pexman, 2018a; Winter et al., 2024).

Relatively little research has explored the form mappings that contribute to a rater’s intuition of iconicity across languages, which is the focus of the present study. Here, we aimed to investigate how well iconicity ratings align with proposals concerning the phonological structure of iconic words. For example, as iconicity is proposed to be based in the use of universally accessible acoustic-phonetic features to depict/express sensory imagery (Blasi et al., 2016; Thompson & Do, 2019), it seems reasonable to assume these features should be shared among words with high iconicity ratings across languages. Most of this research has involved single phonemes. For example, Blasi et al. (2016) reported strong associations between the concepts of roundness and smallness and /r/ and /i/ sounds, respectively, across multiple languages. Ćwiek et al. (2024) reported that trilled /r/ and /l/ sounds are associated with roughness and smoothness, respectively, across 28 languages (see also Winter et al., 2022). Thompson et al. (2021) found that seven articulatory features were common to ideophones across 13 typologically diverse, non-Indo-European languages, a finding they considered consistent with the cross-linguistic use of sensorimotor analogies. Furthermore, these features should primarily be shared by root forms. This distinction is important because morphological derivation in most languages involves the systematic addition of redundant form-meaning mappings in affixes. For example, in English and Spanish, negation is typically prefixal (e.g. un- in “unhappy,” in- in “infeliz”; de Zubicaray & Hinojosa, 2024). In English, suffixes convey abstractness (e.g. the concrete word “friend” becomes the abstract word “friendliness” with the addition of -iness; Kearney et al., 2024; Reilly et al., 2017). In Spanish, evaluative suffixes for diminution (e.g. the word “pájaro,” bird, is perceived as more positive in the word “pajarito,” little bird, with the addition of -ito) and augmentation (e.g. the word “cabeza,” head, becomes negatively valenced in the word “cabezón,” big head with the meaning of stubborn, with the addition of -ón) not only express quantification but also melioration and pejoration (Hinojosa et al., 2022).

In order to facilitate cross-linguistic comparisons, Dingemanse (2019) proposed that ideophones are marked words, in that they have structural features that “make them stand out from other words” (p. 15) in their respective languages. Structural markedness is a linguistically relative concept (Waugh & Lafford, 2006). For example, iconic words in English have been proposed to be more structurally marked in terms of having phonologically complex consonant clusters/blends in their onsets and codas, and Dingemanse and Thompson (2020) observed a positive correlation between iconicity ratings and structural markedness in a sample of English words also rated for humour/funniness. However, phonaesthemes are also characterised by complex onsets and codas, and some researchers distinguish them from ideophones noting that the latter are not a traditional part of speech like nouns or verbs (Dingemanse, 2019; Kwon & Round, 2015; Zingler, 2017). Iconic forms also show a high rate of reduplication (the root is repeated exactly or with only slight modification, e.g. zigzag, frufrú), which is considered an example of markedness (Dingemanse, 2015; Punselie et al., 2024; Zingler, 2017).

Distinguishing between iconicity and systematicity in form-meaning mappings is not always straightforward. For example, while there is nothing about the phonoaesthetic gl- onset cluster that links it sensorily to luminance or vision, the sn- onset cluster associated with nose/oral functions in English (e.g. sneeze, sniffle, snore, snack, snarl etc.) could be considered both phonoaesthetic and iconic, because the initial fricative /s/ is systematic while the nasal /n/ appears imitative (Kwon & Round, 2015; Zingler, 2017). The former phonaesthemes are therefore often distinguished as learned or conventional rather than iconic forms (Kwon & Round, 2015; Perry et al., 2015; Zingler, 2017). English has far more phonaesthemes than other languages (Mompeán et al., 2020). Spanish has fewer onset consonant clusters than English and none in its codas (Carlo et al., 2020), making them a less prominent structural marker in that language, although there are some cross-linguistic examples of phonaesthemes with similar meanings (e.g. the /fl/ onset is linked to motion in fluid in both English and Spanish, as in float and flow, and flotar and fluir, respectively; see Mompeán et al., 2020). Using an inductive approach, Dingemanse and Thompson (2020) proposed that “phonological improbability” can be considered a proxy for structural markedness. In Spanish, plosives and fricatives other than /s/ occur infrequently in coda position, meeting this criterion (Lloyd & Schnitzer, 1967; Rodríguez, 2016). Voiced plosives (/b/, /d/, /g/) that follow vowels also undergo a systematic weakening (lenition) process known as spirantisation in Spanish, further differentiating them from their regular voicing in other positions (González, 2006; Piñeros, 2002).

Phonotactic probability can also be viewed as a cross-linguistic proxy for structural markedness (Dingemanse, 2019). For example, it has been proposed that ideophones might be less likely to follow the general phonotactic rules of their language and so comprise less probable sequences of phonemes than non-iconic forms due to their imitation of sensory percepts (Thompson & Do, 2019). Dingemanse (2019) also considers this to be a form of structural marking that is shared with phonaesthemes. Dingemanse and Thompson (2020) failed to observe a significant relationship between phonotactic probability and iconicity ratings in a set of 1,419 English words also rated for humour/funniness, although they did find a significant negative covariance (−16.3%) with a measure of log letter probability, which they suggested might be due to the ratings having been collected using written words. Winter et al. (2024) also observed a negative correlation with log letter frequency (r = −.15) in their larger set of English iconicity ratings, although they did not investigate phonotactic probability. However, this relationship might differ for Spanish as it is more phonotactically regular and constrained in terms of onset and coda clusters and has spelling-to-sound mappings that are more transparent (Carlo et al., 2020; Rodríguez-Ferreiro & Davies, 2019). Words with high probability segments also tend to occur in denser neighbourhoods involving more similar sounding words (Vitevitch & Luce, 2016). Hence, if words rated higher in iconicity are less phonotactically probable (Thompson & Do, 2019), then they should also have sparser phonological neighbourhoods.

The imitative aspects of iconic form-meaning mappings have been interpreted as supporting grounded or embodied cognition accounts of language that propose conceptual processing is situated in sensorimotor systems/experience (Murgiano et al., 2021; Perniss & Vigliocco, 2014; Sidhu & Pexman, 2021). Indeed, some authors have argued iconic forms “are too linked to specific referents and contexts, and so are less well suited for expressing abstractions” (Lupyan & Winter, 2018, p. 1). However, this focus ignores the reality of colexification across languages in which single word forms are frequently used to express multiple inter-related meanings, especially abstract ones (François, 2008; Rzymski et al., 2020). Lexical ambiguity significantly impacts how words are processed (for a review, see Eddington & Tokowicz, 2015). In English, most onomatopoetic forms are polysemous in that they convey multiple, metaphorically-related senses (Sasamoto, 2019). This is also the case for Spanish (Ibarretxe-Antuñano, 2019). For example, consider the following senses of the word “crunch”: We are currently experiencing a credit crunch. You have to crunch the numbers now. It’s crunch time for the team. This suggests iconicity ratings should show a positive rather than negative correlation with number of senses. However, the strength of this relationship might also differ between the two languages. Using sense annotations derived from Wikipedia corpora, Dandala et al. (2013) reported that the average number of senses per word in English and Spanish was 9.6 and 4.2, respectively.

In this paper, we use the large datasets of English and Spanish iconicity ratings collected by Winter et al. (2024) and Hinojosa et al. (2021), respectively, to identify form structural features that contribute to participants’ intuitions that a word “sounds like what it means” across the two languages. It should be noted that while both studies included words from prior normative studies of various semantic ratings (e.g. concreteness, emotional valence), including lists of onomatopoeias in their respective languages, Winter et al. (2024) also included a list of phonaesthemes taken from Hutchins (1998) but did not distinguish learned and iconic forms in their norms. Given relationships between form and meaning can be systematic and/or iconic, it is essential to distinguish them lest they be misattributed by researchers (Thompson & Do, 2019). Specifically, we tested the hypotheses that unaffixed word forms with high iconicity ratings in both languages would: (1) exhibit structural markedness; (2) comprise less phonotactically probable phoneme sequences; (3) have sparser phonological neighbourhoods; (4) be more likely to be polysemous; and (5) share “universal” form features.

Methods

Materials

Iconicity ratings for 14,776 English and 10,995 Spanish words were sourced from Winter et al. (2024) and Hinojosa et al. (2021), respectively. In both studies, participants rated individually presented written words on a scale of 1 to 7, according to whether they considered the sound of the word to be unrelated to its meaning (not iconic at all) to it being closely related to its meaning (very iconic). Information about affixation and lists of compound words in both languages were sourced from the Multilingual Database of Derivational and Inflectional Morphology (MorphyNet; Batsuren et al., 2021) and normative databases (Desrochers et al., 2010; Juhasz et al., 2015), respectively. Phonological neighbourhood sizes and mean phonotactic (biphone) probabilities for English were sourced from the Cross-Linguistic Easy-Access Resource for Phonological and Orthographic Neighbourhood Densities (CLEARPOND; Marian et al., 2012) and Irvine Phonotactic online Dictionary (IPHOD v2; Vaden et al., 2009), respectively, both of which are based on the SUBTLEXus database (Brysbaert & New, 2009). For Spanish, these values were sourced from the EsPal database (Duchon et al., 2013), which is also based on a movie subtitle corpus.² Information about colexification in terms of the number of inter-related senses for English and Spanish words was sourced from WordNet (3.0; Gao et al., 2022; Miller, 1995) and the Multilingual Central Repository (MCR 3.0; Gonzalez-Agirre et al., 2012), which is based on WordNet, respectively. We selected the above databases because they each used similar methods to derive their measures from comparable English and Spanish corpora.

We followed Dingemanse and Thompson’s (2020) inductive approach of cataloguing phonological complexity to identify instances of structural markedness in iconic words. In English, this involved identifying words with complex two- and three-letter consonant clusters/blends in their onsets (bl/cl/fl/gl/pl/br/cr/dr/fr/gr/pr/tr/sk/sl/sp/st/sw; spr/scr/str) and codas (mp/nk/rt/rr/sh/wk) in addition to the diminutive suffix -le. However, unlike Dingemanse and Thompson (2020), we excluded instances where the suffix was used productively according to MorphyNet to avoid morphophonological redundancy influencing the results (so “sniffle” was excluded as it is a diminutive of the root “sniff,” while “drizzle” was included as “drizz” is not an English root word). We also identified instances of reduplication (repetition or near-repetition of roots, e.g. goo-goo, zigzag) and vowel lengthening/multiple consecutive vowels (e.g. squeal, squeak; see Dingemanse, 2015; Perniss & Vigliocco, 2014; Punselie et al., 2024). Each instance was coded as “1” and a cumulative measure of structural markedness was derived via summation (e.g. “plow” was coded as 1, while “drizzle” and “sport” were each coded as 2).
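The cumulative coding described above can be illustrated with a minimal Python sketch (the analyses themselves were run in R; Python is used here only for illustration). The cluster inventories are copied from the text, but the function name `markedness_score` and the `productive_le` argument are hypothetical: the latter stands in for a MorphyNet lookup of words in which -le is a productive suffix. The sketch matches letters rather than phonemes, treats any word-final "le" as the suffix, and omits reduplication and vowel-lengthening coding, so it is a simplification rather than the procedure actually used.

```python
# Cluster inventories taken from the text; everything else is an illustrative assumption.
ONSETS_3 = {"spr", "scr", "str"}
ONSETS_2 = {"bl", "cl", "fl", "gl", "pl", "br", "cr", "dr", "fr", "gr",
            "pr", "tr", "sk", "sl", "sp", "st", "sw"}
CODAS = {"mp", "nk", "rt", "rr", "sh", "wk"}


def markedness_score(word, productive_le=frozenset()):
    """Sum one point per structural marker found in an orthographic word form.

    productive_le: hypothetical set of words whose final -le is a productive
    suffix (e.g. looked up in a morphological database); these get no -le point.
    """
    w = word.lower()
    score = 0
    # Complex onset: a three-letter cluster such as "str" earns one point,
    # not an extra point for its embedded two-letter cluster "st".
    if w[:3] in ONSETS_3 or w[:2] in ONSETS_2:
        score += 1
    # Complex coda: the word ends in one of the listed two-letter clusters.
    if w[-2:] in CODAS:
        score += 1
    # Non-productive -le ending.
    if w.endswith("le") and w not in productive_le:
        score += 1
    return score


for example in ("plow", "drizzle", "sport"):
    print(example, markedness_score(example))
```

Under these assumptions the sketch reproduces the worked examples in the text: “plow” scores 1, while “drizzle” and “sport” each score 2. Passing `productive_le={"sniffle"}` withholds the -le point for “sniffle”.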

We adopted a similar approach for Spanish words, including only complex onset consonant clusters (pl/pɾ/bl/bɾ/tr/dr/cl/cr/gl/gr/fl/fr) as none may exist in codas (Bradley, 2006), as well as instances of reduplications (e.g. frufrú; Urbaniak, 2019) and multiple consecutive vowels (e.g. buaaa, muuu). In addition, we coded instances of fricatives other than /s/ as well as plosives in coda position, given these features occur infrequently in Spanish (Lloyd & Schnitzer, 1967; Rodríguez, 2016), following Dingemanse and Thompson’s (2020) proposal that phonological improbability is a proxy for structural markedness.

To determine how many English phonaesthemes were rated as iconic, we adopted two approaches: First, we compiled a non-exhaustive list of words with semantic gloss annotations from several available sources (Bergen, 2004; Hutchins, 1998; Kwon & Round, 2015; Zingler, 2017). Second, we catalogued words that included the 47 candidate phonaesthemic clusters from Otis and Sagi’s (2008) corpus study. While the first approach is conservative, the second is more lenient in that it will possibly include words that do not share the phonaesthemic meaning associated with a given cluster.³

Phonemic transcriptions and stress category assignments for English and Spanish were retrieved from the Carnegie Mellon University (CMU) pronouncing dictionary (http://www.Speech.cs.cmu.edu/cgi-bin/cmudict; 39 phonemes) and EsPal database (Duchon et al., 2013; 31 phonemes), respectively. Each word was coded according to its whole-word properties (number of letters, syllables and phonemes), its initial and final phonemes (a number was assigned to each of the phonemes), its number of phonetic features and their initial and final positions (i.e. place and height for vowels; voicing; place and manner of articulation for consonants), and its syllabic stress position (initial, medial, final). To these we added orthographic length (i.e. number of letters) as a proxy for auditory duration.

Syllable structures were retrieved from the CMU dictionary and EsPal. Note that cross-linguistic coding of sub-syllabic onset, nucleus, and coda slots is not straightforward due to the differences in syllabic structure across Spanish and English and difficulties determining syllable boundaries in the latter. For example, Spanish syllable structure is more clearly defined than that of English, owing to a predominant open consonant-vowel (CV) syllable, and Spanish is a syllable-timed language (all syllables have approximately equal auditory duration; Bertrán, 1999; Gorman & Gillam, 2003). Conversely, the predominant syllable in English is the closed CVC combination, making the assignment of consonants to syllables relatively subjective (Duanmu, 2009; Gorman & Gillam, 2003), with syllable durations longer or shorter according to whether they are stressed or unstressed (Bertrán, 1999; Gorman & Gillam, 2003). Here, we adopted Bartlett et al.’s (2009) syllabification of the CMU dataset based on the sonority sequencing principle, in which sonority (relative loudness) rises in onsets, peaks at the nucleus, then falls in coda position.⁴
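As an illustration of how syllable-structure labels such as “ccvc” can be derived once a word has been syllabified, the Python sketch below maps CMU-style (ARPAbet) phonemes to a consonant/vowel skeleton. It relies on the CMU convention that vowel symbols carry a stress digit (0, 1 or 2). The function names are hypothetical, and the syllable boundaries are assumed to be supplied already: the sketch does not reproduce Bartlett et al.’s (2009) syllabification.

```python
def cv_skeleton(syllable_phonemes):
    """Map one syllable of ARPAbet phonemes to a c/v string such as 'ccvc'."""
    # In the CMU dictionary, vowels end in a stress digit; consonants do not.
    return "".join("v" if p[-1].isdigit() else "c" for p in syllable_phonemes)


def word_skeletons(syllables):
    """Return the c/v skeleton of each syllable in a pre-syllabified word."""
    return [cv_skeleton(s) for s in syllables]


# "splash" (S P L AE1 SH), treated as a single syllable.
print(word_skeletons([["S", "P", "L", "AE1", "SH"]]))  # ['cccvc']
```

The first and last elements of the returned list correspond to the initial and final syllable positions that were coded as predictors.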

Analyses

All analyses were performed using R (version 4.4.1; R Core Team, 2024). Averaging disparate ratings typically results in an unrepresentative value in the middle range of the scale with a large standard deviation, particularly for sensorimotor experience variables (Pollock, 2018). Winter et al. (2024) noted this issue affected their ratings of iconicity in English. We therefore applied a cut-off of 1.5 standard deviations to identify words with reasonable rating agreement in both datasets based on their use of 7-point scales.⁵ Next, we removed affixed forms (Batsuren et al., 2021) and compound nouns with iconicity ratings >= 5 from both datasets (Desrochers et al., 2010; Juhasz et al., 2015) as Dingemanse and Thompson (2020) observed that English compounds with transparent yet non-iconic structure (e.g. heartbeat) tend to be rated as highly iconic.
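The filtering steps above can be sketched in Python as follows. The field names (`word`, `mean`, `sd`), the function name, and the toy rows are all hypothetical and serve only to illustrate the logic. Two readings are assumed rather than confirmed by the text: that a word passes the agreement criterion when its rating standard deviation is at most 1.5, and that all affixed forms are removed whereas compounds are removed only when their mean rating is 5 or greater.

```python
def filter_norms(rows, affixed=frozenset(), compounds=frozenset(),
                 sd_cutoff=1.5, iconic_threshold=5.0):
    """Keep words with acceptable rating agreement, then drop affixed forms
    and highly rated compounds. Each row is a dict with 'word', 'mean', 'sd'."""
    kept = []
    for row in rows:
        if row["sd"] > sd_cutoff:          # poor agreement across raters
            continue
        if row["word"] in affixed:         # affixed form
            continue
        if row["word"] in compounds and row["mean"] >= iconic_threshold:
            continue                       # highly rated compound
        kept.append(row)
    return kept


# Toy rows with made-up values, for illustration only.
toy = [
    {"word": "aaa", "mean": 6.1, "sd": 1.0},
    {"word": "bbb", "mean": 4.0, "sd": 2.3},   # fails the SD cut-off
    {"word": "ccc", "mean": 5.5, "sd": 1.2},   # compound rated >= 5
    {"word": "ddd", "mean": 3.0, "sd": 0.9},   # affixed
]
print([r["word"] for r in filter_norms(toy, affixed={"ddd"}, compounds={"ccc"})])
```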

Distributions were plotted using ggplot2 and cowplot packages (Wickham, 2016; Wilke, 2024). Spearman correlations between the iconicity ratings and other variables were calculated using the Hmisc and corrplot packages (Harrell, 2024; Wei & Simko, 2024). Multiple regressions with robust standard errors were performed with iconicity ratings as dependent variable using the package estimatr (Blair et al., 2022), ensuring valid coefficient estimates even in the presence of skewness, outliers, multicollinearity and/or heteroscedasticity among predictor variables (Wilcox, 2019).
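For readers unfamiliar with the rank-based correlation used here, the following Python sketch computes a Spearman coefficient as the Pearson correlation of average ranks, with tied values sharing their mean rank. It is a from-scratch illustration with hypothetical function names, not the Hmisc implementation used for the reported analyses, and it does not compute p values.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, assigning tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for t in range(i, j + 1):
            ranks[order[t]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

Because only ranks enter the computation, the coefficient is unaffected by monotonic transformations of either variable, which suits skewed rating distributions.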

To determine the best subsets of phoneme, syllable and structural markedness (reduplications, multiple consecutive vowels) variables for predicting iconicity ratings across languages, we used the leaps package (Lumley, 2022) after first excluding those with linear dependencies (caret package – findLinearCombos; Kuhn, 2008). Next, we determined the best-fitting model in terms of predictive accuracy via a 10-fold cross-validation procedure (repeated 200 times with different randomised folds), selecting the model that minimised root mean square error to avoid overfitting (de Rooij & Weeda, 2020; Yarkoni & Westfall, 2017). The best-fit model was then entered into a linear regression with robust standard errors (Wilcox, 2019).
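The selection logic can be sketched in plain Python under simplifying assumptions. The sketch fits ordinary least squares to every candidate predictor subset, scores each by pooled out-of-fold root mean square error over repeated k-fold cross-validation, and returns the subset with the lowest score. All function names are hypothetical. Unlike the leaps-based procedure described above, it enumerates subsets exhaustively (feasible only for a handful of predictors), and it does not compute robust standard errors.

```python
import itertools
import math
import random


def ols_fit(X, y):
    """Least squares via the normal equations; returns [intercept, b1, ...]."""
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    A = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    c = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    for col in range(p):  # Gaussian elimination with partial pivoting
        piv = max(range(col, p), key=lambda k: abs(A[k][col]))
        A[col], A[piv] = A[piv], A[col]
        c[col], c[piv] = c[piv], c[col]
        for k in range(col + 1, p):
            f = A[k][col] / A[col][col]
            for j in range(col, p):
                A[k][j] -= f * A[col][j]
            c[k] -= f * c[col]
    b = [0.0] * p
    for i in reversed(range(p)):  # back substitution
        b[i] = (c[i] - sum(A[i][j] * b[j] for j in range(i + 1, p))) / A[i][i]
    return b


def cv_rmse(X, y, cols, k=10, repeats=200, seed=0):
    """Mean out-of-fold RMSE for predictors `cols` over repeated k-fold CV."""
    rng = random.Random(seed)  # same seed gives every subset the same folds
    n = len(y)
    sub = [[row[c] for c in cols] for row in X]
    scores = []
    for _ in range(repeats):
        idx = list(range(n))
        rng.shuffle(idx)
        sq_err = 0.0
        for fold in (idx[f::k] for f in range(k)):
            held = set(fold)
            train = [i for i in idx if i not in held]
            b = ols_fit([sub[i] for i in train], [y[i] for i in train])
            for i in fold:
                pred = b[0] + sum(bj * xj for bj, xj in zip(b[1:], sub[i]))
                sq_err += (y[i] - pred) ** 2
        scores.append(math.sqrt(sq_err / n))
    return sum(scores) / repeats


def best_subset(X, y, k=10, repeats=200, seed=0):
    """Return the tuple of column indices with the lowest cross-validated RMSE."""
    n_pred = len(X[0])
    candidates = [c for size in range(1, n_pred + 1)
                  for c in itertools.combinations(range(n_pred), size)]
    return min(candidates, key=lambda c: cv_rmse(X, y, c, k, repeats, seed))
```

Selecting on out-of-fold error rather than in-sample fit penalises predictors that only capture noise, which is the rationale for cross-validation given in the text.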

Results

The mean iconicity values and corresponding standard deviations for every English and Spanish word in the Winter et al. (2024) and Hinojosa et al. (2021) norms are plotted in Figure 1. Following application of the 1.5 standard deviation cut-off, 2,684 English and 1,900 Spanish words showed reasonable rating agreement, of which 629 and 99 were rated as iconic based on a rating of 5 or greater (as 4 is the neutral mid-point of a 7-point scale), respectively. Removal of affixed forms and etymologically transparent, non-iconic compounds resulted in sets of 1,523 and 1,121 English and Spanish words, respectively. Figure 2 shows the distributions of the latter words’ ratings in each language. They differed considerably in terms of the number of words rated as iconic, with 362 (23.7%) English words meeting this criterion compared to only 87 (7.8%) in Spanish. Of the English words rated as iconic, 76 (21%) were phonaesthemes according to the compiled list of semantic glosses, and 186 (51.4%) matched Otis and Sagi’s (2008) list of phonaesthemic clusters.

Figure 1.

Iconicity rating variability for English (n = 14,776) and Spanish words (n = 10,995).

Figure 2.

Iconicity rating distributions of unaffixed English (n = 1,523) and Spanish words (n = 1,121) with good agreement.

Relationships Between English Iconicity Ratings and Phonological Variables

A total of 426 (28%) of the unaffixed English words were structurally marked. Biphone probabilities, phonological neighbourhood sizes, and WordNet number of senses were available for 1,190 of these words. The zero-order correlations among the variables are shown in Figure 3. Iconicity ratings showed significant weak to moderate positive correlations with all variables except phonotactic probability, for which the correlation was negative. The results of the regression are summarised in Table 1. Overall, the variables explained 18.1% of the variance in iconicity ratings, with each being a significant predictor. All except phonotactic probability showed significant positive relationships with iconicity ratings (lower biphone probabilities predicted higher ratings).

Figure 3.

Correlations between variables for English words (n = 1,190).

Table 1.

Regression Analysis Results for Predicting Iconicity of English Words (n = 1,190).

Model	Estimate	Standard error	t
Intercept	3.508	0.082	42.688***
Markedness	0.850	0.062	13.709***
Phonological neighbours	0.025	0.004	6.393***
Biphone probability	−60.866	12.216	−4.982***
Number of senses	0.017	0.007	2.512*

*p < .05. ***p < .001.

Relationships Between Spanish Iconicity Ratings and Phonological Variables

Of the unaffixed Spanish words, 135 (12.5%) showed evidence of structural markedness. Biphone probabilities and phonological neighbourhood sizes were available for 1,078 words. As the number of MCR senses was available for only 638 of these words, we conducted separate analyses with this variable. The zero-order correlations among the phonological variables are shown in Figure 4. Iconicity ratings were significantly and weakly correlated with all variables, although the relationship was negative for mean biphone probability. In the smaller sample, iconicity ratings were significantly correlated with number of senses (r = .084, p = .036). However, this weak result is likely due to a restriction of range issue (floor effect) given only 14 words in this smaller sample had iconicity ratings >= 5. The results of the regression with the phonological variables excluding number of senses are summarised in Table 2. Overall, the variables explained 11.92% of the variance in iconicity ratings, with all predictors contributing significantly to the model. All except phonotactic probability showed significant positive relationships with iconicity ratings (lower biphone probabilities predicted higher ratings). In the smaller sample of 639 words with sense information, the combined variables predicted less variance (6.5%) in the iconicity ratings and number of senses was not a significant predictor (estimate = 0.015, SE = 0.011, t = 1.405, p = .16; see OSF repository). Again, this is likely to be due to a floor effect in the iconicity ratings for this subsample.

Figure 4.

Correlations between variables for Spanish words (n = 1,078).

Table 2.

Regression Analysis Results for Predicting Iconicity of Spanish Words (n = 1,078).

Model	Estimate	Standard error	t
Intercept	2.161	0.054	39.928***
Markedness	0.672	0.145	4.644***
Phonological neighbours	0.028	0.006	5.024***
Biphone probability	−2.038e-5	8.671e-6	−3.286**

*p < .05. **p < .01. ***p < .001.

Surface Form Variables Predicting Iconicity Ratings of English Words

Phonemic transcriptions, syllable and stress assignments, and structural markedness predictor variables were available for 1,313 unaffixed English words, for which the best-fit model comprised 29 variables and explained 33.6% variance (see Table 3).

Table 3.

Best Fit Model for Predicting Iconicity Ratings of Unaffixed English Words According to 10-Fold Cross Validation Repeated 200 Times (n = 1,313).

Model	Estimate	Standard error	t
Intercept	1.882	0.158	11.931***
Number front	−0.498	0.067	−7.413***
Number central	−0.277	0.054	−5.095***
Number back	−0.386	0.095	−4.069***
Number close	0.165	0.063	2.599**
Number bilabial	0.215	0.058	3.728***
Number palatal	−0.298	0.063	−4.713***
Number velar	0.220	0.063	3.478***
Number nasal	−0.147	0.047	−3.121**
Number liquid	0.140	0.060	2.326*
Final phoneme	−0.008	0.004	−2.116*
Initial phoneme bilabial	−0.158	0.099	−1.604
Initial phoneme palatal	0.574	0.154	3.718***
Initial phoneme glottal	0.305	0.223	1.369
Initial phoneme fricative	0.155	0.090	1.713^†
Initial phoneme liquid	−0.499	0.153	−3.263**
Initial phoneme voiced	0.371	0.081	4.582***
Final phoneme bilabial	0.291	0.128	2.280*
Final phoneme palatal	0.523	0.124	4.224***
Final phoneme velar	0.264	0.137	1.932^†
Final phoneme voiced	−0.251	0.068	−3.671***
Initial stress category	2.359	0.170	13.870***
Final stress category	0.497	0.247	2.017*
Medial stress category	2.508	0.115	21.830***
Number ccvc	−0.330	0.125	−2.633**
Initial syllable cv	−0.244	0.080	−3.052***
Initial syllable ccvc	1.026	0.157	6.516***
Initial syllable cccvc	0.996	0.239	4.168***
Initial syllable ccvcc	0.594	0.172	3.450***
Reduplication	2.041	0.281	7.273***

†p ≤ .1. *p < .05. **p < .01. ***p < .001.

Surface Form Variables Predicting Iconicity Ratings of Spanish Words

Phonemic transcriptions, syllable and stress assignments, and structural markedness (multiple consecutive vowels, reduplications) predictor variables were available for 1,075 unaffixed Spanish words (55 with iconicity ratings > 5), for which the best-fit model comprised 14 variables and explained 50.85% variance (see Table 4).

Table 4.

Best Fit Model for Predicting Iconicity Ratings of Unaffixed Spanish Words With Surface Form Variables According to 10-Fold Cross Validation Repeated 200 Times (n = 1,075).

Model	Estimate	Standard error	t
Intercept	2.29559	0.06219	36.9115***
Number plosive	−0.09585	0.02797	−3.427***
Number approximant	−0.0847	0.02804	−3.0202**
Initial phoneme affricate	1.4248	0.8828	1.614
Initial phoneme trill	0.34895	0.12753	2.7361**
Final phoneme bilabial	3.56006	0.54407	6.5434***
Final phoneme labiodental	2.42551	1.2376	1.9599^†
Final phoneme velar	3.64908	0.53073	6.8756***
Final phoneme voiceless	0.47112	0.309	1.5246
Final phoneme close	2.35627	0.64232	3.6684***
Medial stress category	−0.10826	0.04026	−2.6888**
Initial syllable ccvc	0.51911	0.38399	1.3519
Final syllable ccvcc	3.3693	6.00814	0.5608
Final syllable vc	1.45758	0.5736	2.5411*
Reduplication	2.28009	0.51977	4.3867***

†p ≤ .1. *p < .05. **p < .01. ***p < .001.

Several features significantly predicted iconicity ratings across both languages: bilabial and velar sounds in the final phoneme position, penultimate stress, reduplications and an initial syllable ccvc structure. However, the relationship with penultimate stress differed across languages, being positive in English and negative in Spanish. Other features also showed similar but not identical relationships across languages: Close vowel sounds were associated with higher iconicity ratings in both languages, although this was only in the word final position in Spanish. Words with ccvcc in their initial versus final syllable positions in English and Spanish, respectively, were also associated with higher iconicity ratings.

Discussion

Recent investigations of iconic form-meaning mappings in various languages have employed subjective ratings. The present study investigated which phonological/phonetic variables contribute to participants’ intuitions that a word “sounds like what it means” across two large datasets of iconicity ratings in English and Spanish. Overall, many more words were rated as iconic in English compared to Spanish. Across languages, highly iconic words shared several phonological/phonetic features, syllable structures and reduplications while also demonstrating language-specific mappings.

At first glance, the finding that Spanish and English differ substantially in the number of words rated as iconic seems at odds with research indicating that most Indo-European languages comprise similarly small inventories of iconic forms (Perniss et al., 2010). Before interpreting these differences in rating distributions as evidence for linguistic diversity in iconicity, it is worth considering an alternative explanation: While both Hinojosa et al. (2021) and Winter et al. (2024) utilised identical 7-point rating scales, the instructions to their respective participants differed in terms of the examples used. Specifically, Hinojosa et al. provided examples of iconic and non-iconic words, whereas Winter et al. provided examples of words that were rated as low, moderate or high in iconicity. This might explain the preponderance of English words rated around the mid-point of the scale. Other studies have used different scales ranging from −5 (anti-iconic) to 5 (iconic) with the zero mid-point reflecting arbitrariness or asked participants to rate how accurately a “space alien” could guess the meaning of a word based only on its sound using a 100-point scale (Perry et al., 2015). As Motamedi et al. (2019) advise, variations in instructions and rating scales potentially index different aspects of sound-meaning mappings, and research using subjective iconicity ratings should be sensitive to this. In our view, the clear differences observed in the rating distributions between English and Spanish words preclude applying the set of English ratings cross-linguistically to translated words in other languages (e.g., Blasi et al., 2022).

In Spanish, the majority of unaffixed forms were onomatopoeias and interjections, consistent with Hinojosa et al.’s (2021) findings for their full dataset. In English, between 21% and 51% of the forms rated high in iconicity were phonaesthemes that have systematic sound-meaning mappings, depending on the method used to identify them. For example, the words “glare” and “gloom” had mean iconicity ratings of 5.73 and 6.2, respectively. This suggests that when English speakers rated whether a word “sounds like what it means,” part of this intuition involved accessing knowledge about systematic sound-meaning mappings that might be learned/conventional rather than directly linked to sensory experiences. As we noted in the Introduction, Winter et al. (2024) included Hutchins’ (1998) list of phonaesthemes because they can also be iconic. However, it is important to distinguish between iconic and non-iconic (learned) phonaesthemes (Thompson & Do, 2019), particularly if researchers are interested in direct grounding of meaning in sensory experience (Murgiano et al., 2021). Bergen (2004) reported that non-iconic English phonaesthemes contribute to lexical priming in a manner similar to morphemes, concluding they possess a “psychological reality” for speakers (see also Hutchins, 1998; Kwon & Round, 2015; Zingler, 2017). Interestingly, Spanish phonaesthemes tended not to be rated as iconic and had large rating standard deviations (Pollock, 2018). For example, the mean ratings of the phonaesthemes with fl- and tr- onsets such as “flotar” (3.15), “fluir” (4), “traca” (2.96), and “trueno” (3.19) were all in the middle range of the iconicity scale. In addition, while a number of etymologically transparent compound words were rated as highly iconic in English (e.g. “hardcover,” 6.2), confirming Dingemanse and Thompson’s (2020) observation, this was not the case for Spanish (none were rated above 5). Again, it is unclear whether this tendency for English participants to rate non-iconic forms as iconic is due to linguistic diversity or different rating instructions.

We also confirmed and extended the observation that English words with higher iconicity ratings are more likely to be structurally marked (Dingemanse & Thompson, 2020). As we noted in the Introduction, structural markedness is a linguistically relative construct, and Spanish has far fewer words with complex onsets and none in its codas (Bradley, 2006; Carlo et al., 2020). We therefore investigated Spanish words’ markedness in terms of complex onsets, reduplications, long/multiple vowel sequences and infrequent phonemes in coda position (Lloyd & Schnitzer, 1967; Rodríguez, 2016), following Dingemanse and Thompson’s (2020) proposal that phonological improbability is a proxy for structural markedness. Here, the relationship was significant albeit much weaker than that in English (r = .07, p < .05 vs. r = .32, p < .001). It is possible that other examples of common markedness might be identified for Spanish and English using an inductive approach (Punselie et al., 2024). It is also important to reiterate that the relationship between markedness and English ratings is conflated to some extent with the systematic use of consonant clusters in phonaesthemes that can be iconic or non-iconic (Thompson & Do, 2019).

In both languages, words with less probable biphone sequences tended to be rated higher in iconicity, which might be due to their use of speech sounds to imitate sensory percepts (Thompson & Do, 2019), although it should be acknowledged that this relationship was quite weak in Spanish (r = −.08). This finding contrasts with that of Dingemanse and Thompson (2020) who failed to observe a similar relationship in a comparably sized set of English words rated for humour/funniness. This discrepancy is likely due to our inclusion of only unaffixed words with reasonable agreement in iconicity ratings across both languages. We also hypothesised that if forms rated high in iconicity were less phonotactically probable, then they would also have fewer lexical neighbours comprising similar sounds (Vitevitch & Luce, 2016). However, in both languages, words rated as iconic instead tended to have more phonological neighbours. Hence, they are less likely to “stand out” from other similar sounding words (cf. Dingemanse & Thompson, 2020), although again this relationship was much weaker in Spanish (r = .06) than English (r = .36). A possible explanation for this opposite relationship in English is suggested by the positive correlation we observed between phonological neighbourhood size and markedness. This differed from Spanish, where marked words tended to have fewer neighbours. This would also suggest that the relationship between iconicity rating and neighbourhood size in English is mediated to some extent by the systematic pairings of consonant clusters in phonaesthemes. In English, words with larger phonological neighbourhoods show an advantage in lexical decision (Vitevitch & Luce, 2016). In Spanish, they show a disadvantage (Vitevitch & Rodríguez, 2004).

In English, we also found that iconicity ratings were predicted by the number of senses expressed by a word. This is consistent with the use of most onomatopoetic/ideophonic forms to express multiple, metaphorical senses, that is, polysemy (Sasamoto, 2019), although some researchers have proposed iconic forms are linked to specific referents and contexts (Lupyan & Winter, 2018). We did not have a sufficient number of forms with sense annotations to confirm whether this was also the case for Spanish, although a similar relationship is likely (Ibarretxe-Antuñano, 2019). For both languages, we used Wordnet sense annotations (Gonzalez-Agirre et al., 2012; Miller, 1995), for which fine-grained annotations show lower inter-annotator reliability than coarser-grained ones (Navigli, 2009). Future work might consider using sense annotations derived from corpus distributional measures (Beekhuizen et al., 2021; Dandala et al., 2013).

One final question we investigated was whether iconicity ratings in both languages could be predicted by common form features, given that iconicity is proposed to be based on the cross-linguistic use of universally accessible acoustic-phonetic and structural features to depict sensory imagery (Blasi et al., 2016; Ćwiek et al., 2024; Dingemanse & Thompson, 2020; Punselie et al., 2024; Thompson & Do, 2019; Winter et al., 2022). Furthermore, we proposed that these features should be primarily shared with the root word, as iconicity is based in sensory depiction while affixation involves the systematic addition of redundant features (de Zubicaray & Hinojosa, 2024). Our analyses revealed that phonological/phonetic feature and higher level structure mappings significantly predicted iconicity ratings in English and Spanish, explaining 33% and 51% of variance, respectively.

In both languages, words rated higher in iconicity tended to be characterised by bilabial and velar consonant sounds in final position (Blasi et al., 2016), reduplications (Punselie et al., 2024) and an initial syllable structure with complex consonant onsets (ccvc; Dingemanse & Thompson, 2020). Penultimate stress also played a significant role across languages, although in opposite directions, being associated with higher iconicity in English and lower iconicity in Spanish. Close vowel sounds were also associated with higher iconicity ratings in both languages, although this was only in the word final position in Spanish, consistent with prior work linking them to sound symbolism (e.g. small and /i/; Blasi et al., 2016; Dingemanse, 2019; Thompson & Do, 2019). Words comprising syllables with complex consonant onsets and codas (ccvcc) in initial versus final positions in English and Spanish, respectively, were also associated with higher iconicity ratings, which is likely attributable to the inclusion of phonaesthemes. Interestingly, both languages showed a tendency for iconic words to have closed syllables, despite their relative frequencies differing across English and Spanish. Our analysis also revealed examples of features associated with iconicity specific to each language, e.g. /i/, /r/ and /l/ sounds in English (Blasi et al., 2016; Ćwiek et al., 2024) and the trilled /r/ in Spanish (Ćwiek et al., 2024; Winter et al., 2022). To determine the relative contributions of phonological/phonetic versus higher level structural features (syllables, reduplications) to predicting iconicity ratings, we reran the regression models excluding the latter. Phonological/phonetic features alone were able to predict 29% and 44% of the variance in English and Spanish ratings, respectively, demonstrating their importance.
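The contribution of the higher level structural features can be approximated as the difference in variance explained between the full and reduced models. The sketch below is only illustrative arithmetic on the rounded percentages reported in this Discussion. The names are hypothetical, and the result ignores rounding error and any adjustment for the number of predictors.

```python
# Variance explained (R^2, in %) as reported in the Discussion (rounded values).
full_model = {"English": 33, "Spanish": 51}     # phonological/phonetic + structural features
reduced_model = {"English": 29, "Spanish": 44}  # phonological/phonetic features only


def incremental_r2(full: float, reduced: float) -> float:
    """Percentage points of variance added by the predictors dropped from the reduced model."""
    return full - reduced


delta = {
    lang: incremental_r2(full_model[lang], reduced_model[lang])
    for lang in full_model
}

for lang, points in delta.items():
    print(f"{lang}: structural features add roughly {points} percentage points")
```

On these figures, syllable structures and reduplications add roughly 4 percentage points in English and 7 in Spanish beyond the phonological/phonetic features.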

Our novel findings have several implications for research on iconicity and theories of language embodiment. The fact that so many English phonaesthemes were rated as iconic indicates that systematic learned/conventional form-meaning mappings are likely to have contributed to past reports of subjective iconicity influencing lexical processing (Lupyan & Winter, 2018; Sidhu et al., 2020; de Zubicaray et al., 2024). This is consistent with Thompson and Do’s (2019) observation that systematicity has “sometimes been misappropriated as a form of iconicity” by researchers (p. 32), a perspective with which we agree. These findings also challenge embodied cognition accounts of language that use subjective iconicity ratings to support proposals that conceptual representations are grounded in sensory experience, as they assume that systematic relationships are a qualitatively different property of the language system (Dove, 2022; Murgiano et al., 2021; Perniss & Vigliocco, 2014). However, it is worth reiterating that Spanish words rated high in iconicity were less likely to be phonaesthemes than their English counterparts.

The fact that English words rated as iconic were also more likely to express multiple, metaphorical meanings indicates that past reports of a processing advantage for iconic words (Sidhu et al., 2020) might to some extent reflect the well-established advantage for polysemes (for a review, see Eddington & Tokowicz, 2015). The representations of polysemes have been proposed to occupy a complex, high-dimensional lexical-semantic space rather than simple grounding in sensory experience (Rodd, 2020). Embodied accounts have struggled to provide evidence for grounded representations of metaphors (Casasanto & Gijssels, 2015), while other researchers have considered iconicity inimical to abstraction (Lupyan & Winter, 2018). The positive correlation between iconicity ratings and polysemy might also explain both Hinojosa et al.’s (2021) and Winter et al.’s (2024) counterintuitive observations of negative correlations with concreteness ratings, indicating more iconic words were rated as more abstract, contradicting Lupyan and Winter’s (2018) proposal.⁶ We recommend that researchers interested in using iconicity ratings to investigate lexical processing ensure that they adequately control for variables such as systematicity and polysemy and consider using additional/alternative measures to exclude non-iconic forms (de Zubicaray, 2025; Dingemanse & Thompson, 2020; Motamedi et al., 2019; Punselie et al., 2024). Systematic form-meaning mappings in English sensory words have been shown to influence lexical processing to a greater extent than iconicity ratings (de Zubicaray et al., 2024).

More generally, the present research highlights the need for more research to investigate why people rate things the way they do in their respective languages. The finding that phonological variables significantly influence subjective iconicity ratings in both languages has implications for research that attempts to generate or extend human ratings using large language models (LLMs) such as ChatGPT (OpenAI, 2024). For example, Trott (2024) reported a correlation of r = .59 between Winter et al.’s (2024) subjective iconicity ratings and those generated by GPT-4, indicating the two cannot be considered equivalent. As current LLMs are trained solely on written text corpora, they are not “phonologically aware” in the way that humans are (de Zubicaray, 2025). Hence, using LLM-generated ratings as an extension or substitute for human judgements risks obscuring phonological relationships like the ones observed here. Finally, as Motamedi et al. (2019; see also Punselie et al., 2024) noted, iconicity is not a monolithic construct, so it is also important to understand the roles that variables such as task-type and instructional set play in influencing subjective ratings.
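One way to read the reported correlation between human and GPT-4 ratings is in terms of shared variance, that is, the squared correlation coefficient. The sketch below is a minimal illustration of that arithmetic. The function name is hypothetical, and only the value r = .59 comes from the text.

```python
def shared_variance(r: float) -> float:
    """Proportion of variance shared by two variables with Pearson correlation r."""
    return r ** 2


# Correlation between human and GPT-4 iconicity ratings, as cited in the text (Trott, 2024).
r_llm_human = 0.59

print(f"Shared variance: {shared_variance(r_llm_human):.1%}")
```

A correlation of this size corresponds to about a third of the variance being shared, which leaves most of the variance in human ratings unaccounted for by the generated ratings.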

Conclusion

We investigated relationships between subjective ratings of iconicity and phonological/phonetic variables in unaffixed English and Spanish words with good rating agreement. Our findings showed that many more English words were rated as iconic, although these included phonaesthemes that have systematic form-meaning mappings that can also be learned/conventional or iconic. Iconic words showed evidence of structural markedness in both languages, although the relationship was weaker in Spanish. Both English and Spanish words rated higher in iconicity had larger phonological neighbourhoods despite comprising less frequently occurring phoneme sequences. In English, words rated as more iconic were also more likely to be polysemous forms. Finally, iconicity ratings in both languages were able to be predicted by common phonological/phonetic features, syllable structures and reduplications, consistent with previous cross-linguistic research.

Footnotes

Declaration of Conflicting Interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported by an Australian Research Council Discovery Project Grant DP220101853 and by grant HORIZON-MSCA-2023-SE-01. Ref.101182959 from the Horizon Europe Framework Programme.

ORCID iD

Greig I. de Zubicaray

Ethical Considerations

This study was granted exemption from requiring ethics approval due to its use of publicly available datasets.

Data Availability Statement

The data and materials from the present experiment are publicly available at the Open Science Framework website: .

Open Practices

All data and analysis scripts are publicly available via the Open Science Framework at:

The study was not preregistered.

Notes

References

Bartlett, Kondrak, & Cherry (2009). On the syllabification of phonemes. In NAACL 09: Proceedings of human language technologies: The 2009 annual conference of the North American chapter of the association for computational linguistics (pp. 308–331). Association for Computational Linguistics.

Batsuren, Bella, & Giunchiglia (2021). MorphyNet: A large multilingual database of derivational and inflectional morphology. In Proceedings of the 18th SIGMORPHON workshop on computational research in phonetics, phonology, and morphology (pp. 39–48). Association for Computational Linguistics.

Beekhuizen, Armstrong, B. C., & Stevenson (2021). Probing lexical ambiguity: Word vectors encode number and relatedness of senses. Cognitive Science, 45(5), Article e12943.

Bergen (2004). The psychological reality of phonaesthemes. Language, 80(2), 290–311.

Bertrán, A. P. (1999). Prosodic typology: On the dichotomy between stress-timed and syllable-timed languages. Language Design, 2, 103–130.

Blair, Cooper, Coppock, Humphreys, & Sonnet (2022). Estimatr: Fast estimators for design-based inference. University of California. https://cran.r-project.org/package=estimatr

Blasi, D. E., Henrich, Adamou, Kemmerer, & Majid (2022). Over-reliance on English hinders cognitive science. Trends in Cognitive Sciences, 26(12), 1153–1170.

Blasi, D. E., Wichmann, Hammarström, Stadler, P. F., & Christiansen, M. H. (2016). Sound-meaning association biases evidenced across thousands of languages. Proceedings of the National Academy of Sciences, 113(39), 10818–10823.

Bradley, T. G. (2006). Spanish complex onsets and the phonetics-phonology interface. In Martínez-Gil & Colina (Eds.), Optimality-theoretic studies in Spanish phonology (pp. 15–38). John Benjamins.

Brysbaert & New (2009). Moving beyond Kučera and Francis: A critical evaluation of current word frequency norms and the introduction of a new and improved word frequency measure for American English. Behavior Research Methods, 41(4), 977–990.

Carlo, M. A., Wilson, R. H., & Villanueva-Reyes (2020). Psychometric characteristics of Spanish monosyllabic, bisyllabic, and trisyllabic words for use in word-recognition protocols. Journal of the American Academy of Audiology, 31(7), 531–546.

Casasanto & Gijssels (2015). What makes a metaphor an embodied metaphor? Linguistics Vanguard, 1(1), 327–337.

Cuetos, Bonin, Alameda, J. R., & Caramazza (2010). The specific-word frequency effect in speech production: evidence from Spanish and French. Quarterly Journal of Experimental Psychology, 63(4), 750–771.

Ćwiek, Anselme, Dediu, Fuchs, Kawahara, G. E., Paul, Perlman, Petrone, Reiter, Ridouane, Zeller, & Winter (2024). The alveolar trill is perceived as jagged/rough by speakers of different languages. Journal of the Acoustical Society of America, 156(5), 3468–3479.

Dandala, Mihalcea, & Bunescu (2013). Word sense disambiguation using Wikipedia. In Gurevych & Kim (Eds.), The people’s Web meets NLP, theory and applications of natural language processing (pp. 241–262). Springer.

de Rooij & Weeda (2020). Cross-validation: A method every psychologist should know. Advances in Methods and Practices in Psychological Science, 3(2), 248–263.

Desrochers, Liceras, J. M., Fernández-Fuertes, & Thompson, G. L. (2010). Subjective frequency norms for 330 Spanish simple and compound words. Behavior Research Methods, 42(1), 109–117.

de Zubicaray, G. I. (2025). Revisiting semantic ambiguity in English words: Nonarbitrary polysemy-form mappings influence lexical processing. Journal of Experimental Psychology. Learning, Memory, and Cognition. Advance online publication. https://doi.org/10.1037/xlm0001483

de Zubicaray, G. I., & Hinojosa, J. A. (2024). Statistical relationships between phonological form, emotional valence and arousal of Spanish words. Journal of Cognition, 7(1), Article 42.

de Zubicaray, G. I., Kearney, Guenther, F. H., McMahon, K. L., & Arciuli (2024). Statistical relationships between surface form and sensory meanings of English words influence lexical processing. Journal of Experimental Psychology: Human Perception and Performance, 50(7), 723–739.

Díez-Álamo, A. M., Díez, Wojcik, D. Z., Alonso, M. A., & Fernandez (2019). Sensory experience ratings for 5,500 Spanish words. Behavior Research Methods, 51(3), 1205–1215.

Dingemanse (2015). Ideophones and reduplication: Depiction, description, and the interpretation of repeated talk in discourse. Studies in Language, 39(4), 946–970.

Dingemanse (2019). “Ideophone” as a comparative concept. In Akita & Pardeshi (Eds.), Ideophones, mimetics, expressives (pp. 13–33). John Benjamins. https://doi.org/10.1075/ill.16.02din

Dingemanse, Blasi, D. E., Lupyan, Christiansen, M. H., & Monaghan (2015). Arbitrariness, iconicity, and systematicity in language. Trends in Cognitive Sciences, 19, 603–615.

Dingemanse & Thompson (2020). Playful iconicity: Structural markedness underlies the relation between funniness and iconicity. Language and Cognition, 12(1), 203–224.

Dove, G. O. (2022). Rethinking the role of language in embodied cognition. Philosophical Transactions of the Royal Society B: Biological Sciences, 378, Article 20210375.

Duanmu (2009). Syllable structure: The limits of variation. Oxford University Press.

Duchon, Perea, Sebastián-Gallés, Martí, & Carreiras (2013). EsPal: One-stop shopping for Spanish word properties. Behavior Research Methods, 45, 1246–1258.

Eddington, C. M., & Tokowicz (2015). How meaning similarity influences ambiguous word processing: the current state of the literature. Psychonomic Bulletin & Review, 22(1), 13–37.

François (2008). Semantic maps and the typology of colexification: Intertwining polysemous networks across languages. In Studies in language companion series (pp. 163–215). John Benjamins Publishing Company.

Gao, Shinkareva, S. V., & Desai, R. H. (2022). SCOPE: The South Carolina psycholinguistic metabase. Behavior Research Methods, 55, 2853–2884. https://doi.org/10.3758/s13428-022-01934-0

González (2006). The phonetics and phonology of spirantization in North-Central Peninsular Spanish. Anuario Del Seminario De Filología Vasca “Julio De Urquijo”, 40(1–2), 409–436. https://doi.org/10.1387/asju.4398

Gonzalez-Agirre, Laparra, & Rigau (2012). Multilingual Central Repository version 3.0: upgrading a very large lexical knowledge base. In Proceedings of the Sixth International Global WordNet Conference (GWC’12), Matsue, Japan.

Gorman, B. K., & Gillam, R. B. (2003). Phonological awareness in Spanish: A tutorial for speech-language pathologists. Communication Disorders Quarterly, 25(1), 13–22.

Harrell, Jr. (2024). Hmisc: Harrell miscellaneous (R package version 5.1-4). https://github.com/harrelfe/hmisc

Haslett, D. A., & Cai, Z. G. (2023). Systematic mappings of sound to meaning: A theoretical review. Psychonomic Bulletin & Review, 31(2), 627–648.

Hinojosa, J. A., Haro, Calvillo-Torres, González-Arias, Poch, & Ferré (2022). I want it small or, rather, give me a bunch: The role of evaluative morphology on the assessment of the emotional properties of words. Cognition and Emotion, 36(6), 1203–1210.

Hinojosa, J. A., Haro, Magallares, Duñabeitia, J. A., & Ferré (2021). Iconicity ratings for 10,995 Spanish words and their relationship with psycholinguistic variables. Behavior Research Methods, 53(3), 1262–1275.

Hutchins, S. S. (1998). The psychological reality, variability, and compositionality of English phonesthemes [Ph.D. dissertation]. Emory University.

Ibarretxe-Antuñano (2019). Perception metaphors in cognitive linguistics: Scope, motivation, and lexicalisation. In Speed, L. J., O’Meara, Roque, L. S., & Majid (Eds.), Perception metaphors (pp. 43–64). John Benjamins Publishing Company. https://doi.org/10.1075/celcr.19.03iba

Juhasz, B. J., Lai, Y. H., & Woodcock, M. L. (2015). A database of 629 English compound words: Ratings of familiarity, lexeme meaning dominance, semantic transparency, age of acquisition, imageability, and sensory experience. Behavior Research Methods, 47(4), 1004–1019.

Juhasz, B. J., & Yap, M. J. (2013). Sensory experience ratings for over 5,000 mono-and disyllabic words. Behavior Research Methods, 45(1), 160–168.

Kearney, McMahon, K. L., Guenther, Arciuli, & de Zubicaray, G. I. (2024). Revisiting the concreteness effect: Non-arbitrary mappings between form and concreteness of English words influence lexical processing. Cognition, 254, Article 105972. Advance online publication. https://doi.org/10.1016/j.cognition.2024.105972

Kuhn (2008). Building predictive models in R using the caret package. Journal of Statistical Software, 28, 1–26. https://doi.org/10.18637/jss.v028.i05

Kwon & Round, E. R. (2015). Phonaesthemes in morphological theory. Morphology, 25(1), 1–27.

Lloyd, P. M., & Schnitzer, R. D. (1967). A statistical study of the structure of the Spanish syllable. Linguistics, 5(37), 58–72.

Lumley (2022). Regression subset selection. https://CRAN.R-project.org/package=leaps

Lupyan & Winter (2018). Language is more abstract than you think, or, why aren’t languages more iconic? Philosophical Transactions of the Royal Society B: Biological Sciences, 373(1752), Article 20170137.

Marian, Bartolotti, Chabal, & Shook (2012). CLEARPOND: Cross-Linguistic Easy-Access Resource for Phonological and Orthographic Neighborhood Densities. PLoS One, 7(8), Article e43230.

Miller, G. A. (1995). WordNet: A lexical database for English. Communications of the ACM, 38, 39–41.

Mompeán, J. A., Fregier, & Valenzuela (2020). Iconicity and systematicity in phonaesthemes: A cross-linguistic study. Cognitive Linguistics, 31, 515–548.

Motamedi, Little, Nielsen, & Sulik (2019). The iconicity toolbox: Empirical approaches to measuring iconicity. Language and Cognition, 11(2), 188–207.

Murgiano, Motamedi, & Vigliocco (2021). Situating language in the real-world: The role of multimodal iconicity and indexicality. Journal of Cognition, 4(1), Article 38.

Navigli (2009). Word sense disambiguation: A survey. ACM Computing Surveys, 41(2), Article 10.

OpenAI. (2024). ChatGPT (version 4) [Large language model]. https://openai.com/index/gpt-4/

Otis & Sagi (2008). Phonaesthemes: A corpora-based analysis. In Love, B. C., McRae, & Sloutsky, V. M. (Eds.), Proceedings of the 30th annual meeting of the cognitive science society (pp. 65–70). Cognitive Science Society.

Oxford University Press. (2024). “zigzag”. In Oxford English dictionary. Retrieved December 9, 2024, from https://doi.org/10.1093/OED/5137221654

Perniss, Thompson, R. L., & Vigliocco (2010). Iconicity as a general property of language: Evidence from spoken and signed languages. Frontiers in Psychology, 1.

Perniss & Vigliocco (2014). The bridge of iconicity: From a world of experience to the experience of language. Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences, 369(1651), 1–13.

Perry, L. K., Perlman, & Lupyan (2015). Iconicity in English and Spanish and its relation to lexical category and age of acquisition. PLoS One, 10(9), Article e0137147.

Piñeros, C. E. (2002). Markedness and laziness in Spanish obstruents. Lingua, 112, 379–413.

Pollock (2018). Statistical and methodological problems with concreteness and other semantic variables: A list memory experiment case study. Behavior Research Methods, 50(3), 1198–1216.

Punselie, McLean, & Dingemanse (2024). The anatomy of iconicity: Cumulative structural analogies underlie objective and subjective measures of iconicity. Open Mind: Discoveries in Cognitive Science, 8, 1191–1212.

R Core Team. (2024). R: A language and environment for statistical computing. R Foundation for Statistical Computing. https://www.r-project.org/

Reilly, Hung, & Westbury (2017). Non-Arbitrariness in mapping word form to meaning: Cross-linguistic formal markers of word concreteness. Cognitive Science, 41(4), 1071–1089.

Rodd, J. M. (2020). Settling into semantic space: An ambiguity-focused account of word-meaning access. Perspectives on Psychological Science, 15(2), 411–427.

Rodríguez, I. R. (2016). Cálculo de frecuencias de aparición de fonemas y alófonos en Español actual utilizando un transcriptor automático [Calculation of frequencies of appearance of phonemes and allophones in current Spanish using an automatic transcriber]. Loquens, 3(1), Article e029.

Rodríguez-Ferreiro & Davies (2019). The graded effect of valence on word recognition in Spanish. Journal of Experimental Psychology. Learning, Memory, and Cognition, 45(5), 851–868.

Rzymski, Tresoldi, Greenhill, S. J., M. S., Schweikhard, N. E., Koptjevskaja-Tamm, Gast, Bodt, T. A., Hantgan, Kaiping, G. A., Chang, Lai, Morozova, Arjava, Hübler, Koile, Pepper, Proos, Van Epps, . . . List, J. M. (2020). The Database of Cross-Linguistic Colexifications, reproducible analysis of cross-linguistic polysemies. Scientific Data, 7(1), 13.

Sasamoto (2019). Onomatopoeia and relevance: Communication of impressions via sound. Palgrave Macmillan.

Sidhu, D. M., & Pexman, P. M. (2018a). Five mechanisms of sound symbolic association. Psychonomic Bulletin & Review, 25(5), 1619–1643.

Sidhu, D. M., & Pexman, P. M. (2018b). Lonely sensational icons: Semantic neighbourhood density, sensory experience and iconicity. Language, Cognition and Neuroscience, 33(1), 25–31.

Sidhu, D. M., & Pexman, P. M. (2021). Implications of the “language as situated” view for written iconicity. Journal of Cognition, 4(1), Article 40.

Sidhu, D. M., Vigliocco, & Pexman, P. M. (2020). Effects of iconicity in lexical decision. Language and Cognition, 12(1), 164–181. https://doi.org/10.1017/langcog.2019.36

Strik Lievers, Bolognesi, & Winter (2021). The linguistic dimensions of concrete and abstract concepts: Lexical category, morphological structure, countability, and etymology. Cognitive Linguistics, 32(4), 641–670.

Thompson, A. L., & Akita (2020). Iconicity ratings across the Japanese lexicon: A comparative study with English. Linguistics Vanguard, 6(1), Article 20190088.

Thompson, A. L., & Do (2019). Defining iconicity: An articulation-based methodology for explaining the phonological structure of ideophones. Glossa: A Journal of General Linguistics, 4(1), Article 72. https://doi.org/10.5334/gjgl.872

Thompson, A. L., & Van Hoey (2021). Articulatory features of phonemes pattern to iconic meanings: Evidence from cross-linguistic ideophones. Cognitive Linguistics, 32, 563–608.

Trott (2024). Can large language models help augment English psycholinguistic datasets? Behavior Research Methods, 56(6), 6082–6100.

Urbaniak (2019). La iconicidad de la reduplicación léxica en español [Iconicity of Lexical Reduplication in Spanish]. Estudios Interlingüísticos, 7, 186–201.

Vaden, K. I., Halpin, H. R., & Hickok, G. S. (2009). Irvine phonotactic online dictionary (Version 2.0) [Data file]. http://www.iphod.com

Vitevitch, M. S., & Luce, P. A. (2016). Phonological neighborhood effects in spoken word perception and production. Annual Review of Linguistics, 2(1), 75–94.

Vitevitch, M. S., & Rodríguez (2004). Neighborhood density effects in spoken word recognition in Spanish. Journal of Multilingual Communication Disorders, 3(1), 64–73.

Waugh, L. R., & Lafford, B. A. (2006). Markedness. In Brown (Ed.), Encyclopedia of language and linguistics (2nd ed., pp. 491–498). Elsevier.

Wei & Simko (2024). R package “corrplot”: Visualization of a correlation matrix (Version 0.94). https://github.com/taiyun/corrplot

Wickham (2016). ggplot2: Elegant graphics for data analysis. Springer-Verlag New York.

Wilcox, R. R. (2019). Robust regression: Testing global hypotheses about the slopes when there is multicollinearity or heteroscedasticity. British Journal of Mathematical and Statistical Psychology, 72(2), 355–369.

Wilke (2024). cowplot: Streamlined plot theme and plot annotations for “ggplot2” (R package version 1.1.3). https://CRAN.R-project.org/package=cowplot

Winter, Lupyan, Perry, L. K., Dingemanse, & Perlman (2024). Iconicity ratings for 14,000+ English words. Behavior Research Methods, 56(3), 1640–1655.

Winter & Perlman (2021). Iconicity ratings really do measure iconicity, and they open a new window onto the nature of language. Linguistics Vanguard, 7(1), Article 20200135.

Winter, Sóskuthy, Perlman, & Dingemanse (2022). Trilled /r/ Is Associated with Roughness, Linking Sound and Touch across Spoken Languages. Scientific Reports, 12(1), Article 1035.

Yarkoni & Westfall (2017). Choosing prediction over explanation in psychology: Lessons from machine learning. Perspectives on Psychological Science, 12(6), 1100–1122.

Zingler (2017). Evidence against the morpheme: The history of English phonaesthemes. Language Sciences, 62, 76–90. https://doi.org/10.1016/j.langsci.2017.03.005