Abstract
Background & aims
Throughout typical development, children prioritize different perceptual, social, and linguistic cues to learn words. The earliest acquired words are often those that are perceptually salient and highly imageable. Imageability, the ease with which a word evokes a mental image, is a strong predictor of word age of acquisition in typically developing (TD) children, independent of other lexicosemantic features such as word frequency. However, little is known about the effects of imageability in children with autism spectrum disorder (ASD), who tend to have differences in linguistic processing and delayed language acquisition compared to their TD peers. This study explores the extent to which imageability and word frequency are associated with early noun and verb acquisition in children with ASD.
Methods
Secondary analyses were conducted on previously collected data of 156 children (78 TD, 78 ASD) matched on sex and parent-reported language level. Total expressive vocabulary, as measured by the MacArthur-Bates Communicative Development Inventory (MB-CDI), included 123 words (78 nouns, 45 verbs) that overlapped with previously published imageability ratings and word input frequencies. A two-step hierarchical linear regression was used to examine the relationship between word input frequency, imageability, and total expressive vocabulary. An F-test was then used to assess the unique contribution of imageability to total expressive vocabulary when controlling for word input frequency.
Results
In both the TD and ASD groups, imageability uniquely explained a portion of the variance in total expressive vocabulary size, independent of word input frequency. Notably, imageability was significantly associated with noun vocabulary and verb vocabulary size alone, with imageability explaining a greater portion of the variance in total nouns produced than in total verbs produced.
Conclusions
Imageability was identified as a significant lexicosemantic feature for describing expressive vocabulary size in children with ASD. Consistent with literature on TD children, children with ASD who have small vocabularies primarily produce words that are highly imageable. Children who are more proficient word learners with larger vocabularies produce words that are less imageable, indicating a potential shift away from reliance on perceptual-based language processing. This was consistent across both noun and verb vocabularies.
Implications
Our findings contribute to a growing body of literature describing early word learning in children with ASD and provide a basis for exploring the use of multisensory language learning strategies.
Introduction
Most English-speaking children acquire nouns earlier than verbs (Gentner, 1982), and these nouns tend to dominate their early lexicons (Fenson et al., 1994). Gentner (1982) explained this early “noun bias” by suggesting that nouns are easy to learn because their referents are made obvious in the world, while verbs are harder to learn because their referents do not naturally emerge, and instead, need to be heard in context to be acquired. However, studies of children's early lexicons reveal a word class paradox, in which certain verbs are acquired earlier than certain nouns are (Fenson et al., 1994; Tardif, 1996). For example, verbs like eat and hug are present in young children's vocabularies before abstract nouns like idea. Hollich et al. (2000) expanded Gentner’s (1982) original proposal by suggesting that the saliency of words can be supported by a range of perceptual, linguistic, and social cues. In typical early language development, not only do children learn words that they hear more often, but under Hollich et al.'s (2000) Emergentist Coalition Model, children prioritize certain cues over others. By ten months of age, infants rely heavily on perceptually salient information to map words to referents (Pruden et al., 2006). Within the first year of life, young toddlers utilize speaker intent and attend to social cues, such as eye gaze, for word learning (Hollich et al., 2000). By two years of age, children can use linguistic context, including lexicosemantic knowledge and syntactic bootstrapping, to understand that cause is encoded within sentence structure (Arunachalam, 2013; Gleitman, 1990).
Based on this developmental trajectory, one would expect that regardless of word class, a toddler's earliest words would be those that are perceptually salient and experienced through sensory-motor modalities. Imageability has been proposed as an important metric in understanding early word learning, as it quantifies perceptibility as the ease with which a concept evokes a mental image (Paivio et al., 1968). One's utilization of imageability for word learning taps into higher-level lexical processing, organization, and access of the concept. Indeed, in typically developing (TD) children, the earliest acquired words are concrete and highly imageable (Bird et al., 2001; Gillette et al., 1999; McDonough et al., 2011). Moreover, imageability also facilitates word naming, form, and recall in both children and adults, beyond that of other known predictors, such as word input frequency (Masterson et al., 2008; Paivio et al., 1968; Ramey et al., 2013; Smolik & Kriz, 2015).
While many studies have explored the role of perceptual information on language acquisition in TD individuals, few have examined the effects of imageability in young children with autism spectrum disorder (ASD). Children with ASD present with restricted, repetitive behaviors and deficits in social communication (American Psychiatric Association, 2013). Although language impairment is not a diagnostic criterion for ASD, language ability in individuals with ASD is heterogeneous. Notably, some subgroups of children with autism experience delays in overall language acquisition compared to TD peers. Particular difficulties in mastering abstract or context-dependent words, such as pronouns, deictic terms (e.g., here, there), and mental state verbs (e.g., to think, to believe, etc.), may be observed (Charman et al., 2003; Rescorla & Safyer, 2013). Research has been divided on the primary cause of this language difference, attributing it to characteristic deficits of ASD in social cognition, executive function, and neurological processing (Eigsti et al., 2011).
In a review of lexical acquisition mechanisms, Arunachalam and Luyster (2015) suggest that children with ASD have certain intact lexical mechanisms (e.g., word-object mapping, mutual exclusivity, etc.), but they may process language differently and therefore benefit from different types of language input for word learning. For example, repetition or visual input (e.g., photographs, picture communication symbols, and video models) may support language development by making spoken language concepts less transient and easier to map to their referents (Bellini & Akullian, 2007; Schlosser et al., 2019). Studies examining underlying neurological processes related to linguistic tasks have shown that older individuals with ASD exhibit immature lexical processing, similar to that of young TD children. While TD individuals demonstrated a neurological shift from bottom-up to top-down lexical processing with age, this shift was absent in autistic individuals (Brown et al., 2005). Given sentences of various imageability ratings, children and adolescents with ASD consistently activated posterior, visual regions of the brain, demonstrating an atypical reliance on perceptual-based lexical processing, even for linguistic stimuli with low imageability (Cantiani et al., 2016; Gaffrey et al., 2007; Kana et al., 2006). Given that some individuals with ASD demonstrate an overreliance on perceptual information during linguistic tasks, it is possible that children with ASD rely more heavily on lexicosemantic features like imageability for word learning than their TD peers.
Understanding the mechanisms that underlie early word acquisition is essential for supporting language development and is especially valuable for informing intervention strategies for children with language disabilities. In typical development, early noun and verb production predicts later expressive vocabulary (Longobardi et al., 2017). More specifically, verb diversity at 24 months of age is especially pertinent for the development of grammatical complexity by 40 months of age (Hadley et al., 2016). Likewise, in children with ASD, verb diversity in early childhood predicts overall language outcomes in adulthood (LeGrand et al., 2021). These early lexical milestones have implications for functional communication, including the spontaneous production of syntax and generative phrase speech. Not only does early lexicon predict later language outcomes, but in individuals with ASD, functional speech before age five is also a strong predictor of long-term achievement and adaptive life outcomes (Mawhood et al., 2000; Venter et al., 1992). Together, these findings emphasize the necessity of early language research and intervention to support the long-term outcomes of individuals with developmental delays and disorders.
The current study addresses the role of imageability in early noun and verb vocabulary acquisition in children with ASD. Secondary analyses were conducted on previously collected data on children with and without autism, matched on sex and parent-reported language level. Our aims were 1) to explore the relationship of average word input frequency, average imageability, and total expressive vocabulary in children with and without ASD, and 2) to describe any word class differences in imageability related to noun and verb production.
Methods
Participants
Data for the ASD group were obtained from the NIH-supported National Database for Autism Research (NDAR) (National Institute of Mental Health Data Archive Collection ID #2368) and a previous longitudinal study (Lord et al., 2012). The ASD group consisted of 78 children (13–107 months of age), contributing 213 dependent datapoints across various dates. To ensure independent datapoints, one datapoint per child was randomly selected using the “stratified” function in R, yielding 78 independent ASD datapoints (Mahto, 2014). Data for the TD group were extracted from the open-source database, Wordbank (Frank et al., 2016), in October 2019. The TD children (16–30 months of age) contributed 1300 independent datapoints. To diminish the potential effects of differences in language level on our group analyses, a subset of 78 TD children were sex- and language-matched based on raw total reported vocabulary on the MacArthur-Bates Communicative Development Inventory (MB-CDI; Fenson et al., 2007) to 78 children from the ASD group using the “MatchIt” package in R (Ho et al., 2011). Descriptions of the matched data are summarized in Table 1.
Table 1. Summary of participants matched on sex and language level.
Values are median [interquartile range], unless otherwise noted.
p-values were determined by the Wilcoxon/Mann-Whitney test for non-normal data.
N/A: not enough reported data.
The matched TD sample consisted of 78 participants (88.5% male). The participants had a median age of 24 months (interquartile range of 21–27 months) and a median total reported vocabulary of 340 words (interquartile range of 151.8–526.8 words) out of a 680-word vocabulary checklist (described in Measures below). Information on maternal education was available for this sample, but race was not.
The matched ASD sample consisted of 78 participants (88.5% male). The participants had a median age of 59 months (interquartile range of 36.0–76.5 months) and a median total reported vocabulary of 339 words (interquartile range of 152.0–527.5 words) out of a 680-word vocabulary checklist. Chronological age of the ASD group was significantly higher than the TD group (p < .001), as expected when matching based on language ability. Information on maternal education was available for 24 participants, and race was available for all participants in the ASD sample.
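The two sampling steps described above (keeping one randomly chosen datapoint per child, then pairing each child with ASD to a TD child of the same sex and similar vocabulary size) can be illustrated with a minimal Python sketch. This is not the R code used in the study: the record fields (`child_id`, `sex`, `vocab`), the toy values, and the greedy nearest-neighbor rule are all assumptions made for illustration, and the matching method actually configured in MatchIt is not stated in the text.

```python
import random
from collections import defaultdict


def one_record_per_child(records, seed=0):
    """Randomly keep a single record for each child so datapoints are independent."""
    rng = random.Random(seed)
    by_child = defaultdict(list)
    for record in records:
        by_child[record["child_id"]].append(record)
    return [rng.choice(visits) for visits in by_child.values()]


def match_on_sex_and_vocab(asd, td_pool):
    """Greedy 1:1 matching without replacement: same sex, closest vocabulary total."""
    available = list(td_pool)
    pairs = []
    for child in asd:
        candidates = [t for t in available if t["sex"] == child["sex"]]
        if not candidates:
            continue  # no same-sex TD child left in the pool
        best = min(candidates, key=lambda t: abs(t["vocab"] - child["vocab"]))
        available.remove(best)
        pairs.append((child, best))
    return pairs


# Hypothetical records; two of the ASD children have repeated visits.
asd_visits = [
    {"child_id": "A", "sex": "M", "vocab": 120},
    {"child_id": "A", "sex": "M", "vocab": 310},
    {"child_id": "B", "sex": "F", "vocab": 402},
    {"child_id": "C", "sex": "M", "vocab": 15},
    {"child_id": "C", "sex": "M", "vocab": 88},
]
td_pool = [
    {"child_id": "T1", "sex": "M", "vocab": 100},
    {"child_id": "T2", "sex": "M", "vocab": 300},
    {"child_id": "T3", "sex": "F", "vocab": 400},
    {"child_id": "T4", "sex": "M", "vocab": 20},
    {"child_id": "T5", "sex": "F", "vocab": 50},
]

asd_sample = one_record_per_child(asd_visits)
pairs = match_on_sex_and_vocab(asd_sample, td_pool)
for asd_child, td_child in pairs:
    print(asd_child["child_id"], asd_child["vocab"], "->", td_child["child_id"], td_child["vocab"])
```

Greedy matching depends on the order in which children are processed, so a dedicated matching package is preferable for real analyses; the sketch only shows why the matched groups end up with nearly identical vocabulary distributions while their ages can differ.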
Measures
Word frequency data were derived by Li and Shirai (2000) using English-speaking caregiver input from 27 corpora in the Child Language Data Exchange System (CHILDES). The corpora consist of about 2.6 million word tokens and represent various naturalistic contexts in which children engage with their caregivers (e.g., mealtimes, free play, story time, etc.). Word frequency measures derived from caregiver speech were utilized in this study because of their stronger association with child language acquisition compared to other established frequency norms (e.g., Kucera-Francis, Thorndike-Lodge norms) based on adult-directed speech or written texts (Goodman et al., 2008).
Imageability ratings were obtained from Masterson and Druks (1998), following the procedure of Paivio et al. (1968). On a seven-point scale, 36 adults rated 164 nouns and 102 verbs for their ability to evoke a mental image, with 1 representing words that are difficult to image and 7 representing words that readily evoke images. This set of imageability ratings was selected because it is one of the few published rating sets that clearly differentiates between nouns and verbs. For example, noun and verb stimuli were presented to raters as dance and to dance, respectively.
Total expressive vocabularies were measured using the American English Words and Sentences form of the MB-CDI. The MB-CDI is a caregiver questionnaire that provides a reliable measure of young children's receptive and expressive vocabularies, which would otherwise be difficult to obtain through laboratory-based experimental or formal language measures alone (Charman et al., 2003; Fenson et al., 1994). The MB-CDI is a suitable measure for predicting language outcomes in both typical and language-impaired populations, including children with ASD (Luyster et al., 2007; Tager-Flusberg et al., 2009).
The MB-CDI consists of two forms: Words and Gestures, for pre-verbal and early verbal communication in children ages 8 to 18 months, and Words and Sentences, for early developing vocabulary and morphosyntax in children ages 16 to 30 months. Expressive vocabulary was of interest in this study because it is more reliably reported than receptive vocabulary in parent measures (Charman, 2004; Luyster et al., 2008). The Words and Sentences form was chosen to maximize the number of action verbs included in the study. Part I of the Words and Sentences form contains 680 vocabulary items in a checklist format. Part II of this form includes a variety of grammatical items.
A total of 123 items (78 nouns, 45 verbs) from Part I of the MB-CDI Words and Sentences form were included in our analysis based upon the overlap of MB-CDI words, published imageability ratings from Masterson and Druks (1998), and CHILDES word input frequency measures from Li and Shirai (2000) (Table 2). The 123 words belonged to the following semantic categories, as defined by the MB-CDI Part IA Vocabulary Checklist: Animals, Vehicles, Toys, Food and Drink, Clothing, Body Parts, Small Household Items, Furniture and Rooms, Outside Things, Places to Go, People, Games and Routines, and Action Words. All included participants were reported to produce at least one word from the total 123 words of interest. Word input frequency and imageability ratings for each of the 123 words that a participant was reported to produce were averaged so that each participant's expressive vocabulary could be characterized by its average word input frequency and average imageability. A summary of participant language measures is found in Table 3. When matched on sex and parent-reported language ability, there were no significant differences between diagnostic groups with regard to the number of words produced (p = .805), average imageability (p = .531), or average word input frequency (p = .727).
Table 2. Summary of the 123 words included in analysis.
Values are median [interquartile range], unless otherwise noted.
p-values were determined by the Wilcoxon/Mann-Whitney test for non-normal data.
Table 3. Summary of variables of interest in groups matched on sex and language level.
Values are median [interquartile range], unless otherwise noted.
p-values were determined by the Wilcoxon/Mann-Whitney test for non-normal data.
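The per-participant averaging described in this section can be sketched as follows. The word norms below are invented placeholder values, not the published Li and Shirai (2000) frequencies or Masterson and Druks (1998) ratings, and the function and field names are hypothetical.

```python
from statistics import mean

# Hypothetical word-level norms: word -> (caregiver input frequency, imageability on a 1-7 scale)
NORMS = {
    "dog": (1200, 6.5),
    "ball": (900, 6.3),
    "eat": (1500, 5.1),
    "hug": (200, 5.4),
}


def characterize_vocabulary(produced_words, norms=NORMS):
    """Summarize a child's reported vocabulary by the mean norms of the words produced."""
    # Words without norms (outside the analyzed set) are ignored.
    scored = [norms[word] for word in produced_words if word in norms]
    if not scored:
        raise ValueError("child produces none of the words of interest")
    return {
        "n_words": len(scored),
        "mean_frequency": mean(freq for freq, _ in scored),
        "mean_imageability": mean(img for _, img in scored),
    }


# "cookie" has no norms in this toy table, so only "dog" and "eat" are averaged.
profile = characterize_vocabulary(["dog", "eat", "cookie"])
print(profile)
```

Each child therefore contributes three numbers to the regression: how many of the analyzed words they produce, and the average frequency and average imageability of those words.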
Analysis
Preliminary analyses utilized histograms, scatter plots, and Pearson correlation coefficients to examine the distribution, strength, and nature of the relationships between study variables. All variables were unimodal. Total expressive vocabulary size, the dependent variable, was approximately normally distributed (skewness = −0.61). Word input frequency was right skewed (skewness = 2.26) and imageability was slightly left skewed (skewness = −0.12). Pearson correlation coefficients indicated that word input frequency and imageability had strong negative linear relationships with total expressive vocabulary size (r = −0.78, p < .001 and r = −0.53, p < .001, respectively). Word input frequency and imageability were moderately correlated (r = 0.28, p = .0005).
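For readers unfamiliar with the descriptive statistics reported above, the following sketch computes a moment-based skewness and a Pearson correlation coefficient on toy data. Which skewness estimator the original analysis used is not stated, so the formula here (third central moment divided by the second central moment raised to 1.5) is an assumption.

```python
from math import sqrt
from statistics import mean


def skewness(values):
    """Moment-based skewness: negative for a longer left tail, positive for a longer right tail."""
    m = mean(values)
    n = len(values)
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 2, 3, 4, 10]
print(skewness(symmetric), skewness(right_tailed))

# A perfectly decreasing toy relationship gives r = -1 (up to rounding).
print(pearson_r([1, 2, 3, 4], [8, 6, 4, 2]))
```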
A two-step hierarchical linear regression was used to examine the relationship between word input frequency, imageability, and total expressive vocabulary size. Residual plots and variance inflation factors were examined to ensure that the assumptions of linear regression were satisfied for each model. The residual plots showed that the model residuals were normally distributed, homoscedastic, and independent for each model. The variance inflation factors indicated that there was no multicollinearity, as all values were less than 1.2. Statistical significance was set at p < .05.
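As a rough check on the multicollinearity statement, note that in a model with exactly two predictors the variance inflation factor of each predictor reduces to 1 / (1 − r²), where r is the correlation between the two predictors. The sketch below plugs in the reported correlation between word input frequency and imageability for the full word set (r = 0.28); the noun-only and verb-only models would have their own predictor correlations, which are not reported here.

```python
def vif_two_predictors(r):
    """VIF shared by both predictors in a two-predictor linear model with correlation r."""
    return 1.0 / (1.0 - r ** 2)


vif = vif_two_predictors(0.28)
print(round(vif, 3))  # about 1.085, below the 1.2 bound stated in the text
```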
In the first step of the hierarchical linear regression, word input frequency served as the independent variable, with total expressive vocabulary as the dependent variable. In the second step, imageability was added as an independent variable to the step 1 regression model. In order to assess the unique contribution of imageability in explaining the variance in total expressive vocabulary size, an F-test was used to compare the step 2 regression model to the step 1 regression model. Interaction effects of diagnosis and imageability were also examined to assess diagnostic group differences. Analyses were then repeated on nouns alone and verbs alone to explore the effects of imageability on expressive vocabulary across word classes.
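The two-step procedure and the model-comparison F-test can be sketched in plain Python as below. This is an illustration under stated assumptions, not the study's analysis code: the data are invented, the helper names are hypothetical, R² is left unadjusted (the results table reports adjusted values), and the p-value is omitted because it would require the F distribution with 1 and n − 3 degrees of freedom from a statistics library.

```python
def ols_rss(columns, y):
    """Residual sum of squares from least squares of y on an intercept plus the given columns."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    # Augmented normal equations [X'X | X'y], solved by Gauss-Jordan elimination with pivoting.
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(k)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(k)]
    for p in range(k):
        pivot = max(range(p, k), key=lambda r: abs(A[r][p]))
        A[p], A[pivot] = A[pivot], A[p]
        for r in range(k):
            if r != p:
                factor = A[r][p] / A[p][p]
                A[r] = [a - factor * b for a, b in zip(A[r], A[p])]
    beta = [A[r][k] / A[r][r] for r in range(k)]
    return sum((y[i] - sum(b * x for b, x in zip(beta, X[i]))) ** 2 for i in range(n))


def hierarchical_f_test(freq, imag, vocab):
    """Step 1: vocab ~ freq. Step 2: vocab ~ freq + imag. Returns (R2 step 1, R2 step 2, F)."""
    n = len(vocab)
    grand_mean = sum(vocab) / n
    tss = sum((v - grand_mean) ** 2 for v in vocab)
    rss1 = ols_rss([freq], vocab)
    rss2 = ols_rss([freq, imag], vocab)
    df_resid = n - 3  # intercept plus two predictors in the step 2 model
    f_stat = (rss1 - rss2) / (rss2 / df_resid)  # one added predictor, so numerator df = 1
    return 1 - rss1 / tss, 1 - rss2 / tss, f_stat


# Invented per-child averages and vocabulary totals, for illustration only.
freq = [6.1, 5.5, 4.8, 4.0, 3.4, 3.1, 2.5, 2.0]
imag = [6.4, 6.0, 6.2, 5.7, 5.9, 5.2, 5.5, 5.0]
vocab = [20, 45, 60, 120, 150, 260, 300, 410]

r2_step1, r2_step2, f_stat = hierarchical_f_test(freq, imag, vocab)
print(f"R2 step 1 = {r2_step1:.3f}, R2 step 2 = {r2_step2:.3f}, F(1, {len(vocab) - 3}) = {f_stat:.2f}")
```

The difference between the two R² values is the share of variance uniquely attributable to imageability after word input frequency is accounted for, and the F statistic tests whether that increment is larger than expected by chance. The elimination step assumes the predictors are not perfectly collinear.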
Results
Two main questions were addressed in this study: 1) What is the relationship between word input frequency, imageability, and expressive vocabulary size in children with and without ASD? and 2) Are there word class differences in the relationship of imageability across noun and verb vocabulary size?
Word input frequency, a known predictor of the acquisition of open class words in TD children (Goodman et al., 2008), was first entered into a regression model with total words produced as the dependent variable. The model revealed that 60% of the variance in total words produced was explained by word input frequency (p < .001). Next, imageability was added to the model. Imageability explained an additional 10% of the variance in total words produced (p < .001). An F-test revealed that imageability significantly improved the fit of the model (F = 55.39, p < .001). Notably, diagnostic group did not show any significant interactions with imageability (p = .119).
Additionally, imageability was significantly associated with total nouns alone (F = 125.33, p < .001) and total verbs alone (F = 16.18, p < .001), with imageability explaining a greater portion of variance in total nouns produced (ΔR² = .27, p < .001) than in total verbs produced (ΔR² = .08, p < .001). No significant interactions between diagnostic group and noun imageability (p = .181) or verb imageability (p = .077) were observed. See Table 4 for a summary of analysis results.
Table 4. Hierarchical linear regression model outputs.
Values are adjusted R², unless otherwise noted.
*p < .001.
Discussion
In TD children, discrepancies between noun and verb production are thought to be partially facilitated by the lexicosemantic feature, imageability. Imageability is a known predictor for early word age of acquisition in TD children; however, little is known about imageability effects in children with ASD, who tend to have marked differences in language processing. This study contributes to the limited literature on the role of imageability in vocabulary acquisition in children with ASD. We maintain that children with ASD may utilize imageability for learning early object nouns and action verbs much like their TD peers, though the extent to which the average imageability of a child's expressive vocabulary is associated with their vocabulary size is variable across word class.
Using parent report measures of children's expressive vocabularies, we found that, as in their TD peers, word input frequency and imageability help to describe the early expressive vocabularies of children with ASD. Consistent with previous literature, word frequency was significantly associated with expressive vocabulary size in both children with and without ASD (Goodman et al., 2008; Kover & Weismer, 2013). Furthermore, imageability has a strong negative association with expressive vocabulary size and explains a unique portion of the variance in expressive vocabulary size, beyond that of word input frequency alone. Children with small vocabularies acquire words that are highly imageable, and this association was not different between TD and ASD groups. This supports the Emergentist Coalition Model of word learning, in which children prioritize different semantic and speaker cues to learn words throughout development (Brandone et al., 2007; Hollich et al., 2000). Our findings further expand this model to include children with ASD.
Despite previous studies suggesting that neurological differences between sex- and age-matched ASD and TD groups impact how imageability is utilized during linguistic tasks (Brown et al., 2005; Cantiani et al., 2016; Gaffrey et al., 2007; Kana et al., 2006), our study revealed that when matching groups by sex and parent-reported language level, the relationship between imageability and expressive vocabulary size was not significantly different. In fact, even though our ASD sample had a significantly older age range than our TD sample, the children with ASD acquired similar quantities of nouns and verbs as their TD peers. It is possible that the children with ASD utilized other strategies to learn words that were less imageable. A recent study demonstrated that late talkers use semantic and syntactic features to learn the same types of verbs as TD children (Horvath et al., 2019). This may be the case for children with ASD as well, as they benefit from similar cues for word learning (Gladfelter & Goffman, 2018; Horvath et al., 2018). Ultimately, our findings are consistent with literature that suggests overall language in children with ASD may be delayed but is not deviant compared to TD peers (Arunachalam & Luyster, 2015; Charman et al., 2003; Rescorla & Safyer, 2013).
Moreover, the association of imageability and expressive vocabulary size was consistent across both noun and verb vocabularies alone. This significant association across both word classes helps to explain why some verbs are learned before some nouns. Verbs that are acquired early tend to be those that are highly imageable. Likewise, nouns that are acquired later are those that are less imageable. As they do with nouns, infants prioritize perceptual cues to learn verbs. For example, in studies on early verb learning, infants have been shown to attend to the path of action verbs (e.g., an action's perceived relation to an object) before manner cues (e.g., how the action is carried out) (Konishi et al., 2016). Our findings suggest that it is not only word class but also the perceptual salience of words that impacts both noun and verb vocabulary acquisition in young children.
Interestingly, although imageability exhibited a statistically significant negative correlation with both noun and verb vocabulary, it accounted for a greater portion of the variance in nouns produced than in verbs produced. The variable extent to which imageability explains noun vocabulary compared to verb vocabulary size suggests that there are likely some differences between how object nouns and action verbs are acquired. While word frequency and imageability together account for a significant portion of the variance in children's expressive vocabularies, they do not account for all of it. Differences in word class could be attributed to the effect of other internal, external, and lexicosemantic factors on noun and verb learning, in addition to word input frequency and imageability. Internal, personal factors that predict vocabulary acquisition but were not available for our dataset include nonverbal IQ, cognitive ability (Luyster et al., 2008; McDuffie et al., 2005; Wodka et al., 2013), joint attention engagement states (Crandall et al., 2019a), and morphosyntactic skills (Rujas et al., 2019), amongst others. External, environmental factors, such as the type and quality of caregiver input, also facilitate verb learning. For example, increased caregiver use of diverse verbs in child-directed speech positively predicts the expressive verb vocabulary of children with ASD (Crandall et al., 2019b). Likewise, there are other known lexicosemantic features that predict verb acquisition, including word length and phonological neighborhood density (Kover & Weismer, 2013; Smolik, 2019). These additional variables may attenuate the effects of imageability when learning concepts that are harder to perceive. Thus, the interaction of imageability with other internal, external, and lexicosemantic features should be a focus of future studies on noun and verb learning.
Limitations & future directions
This study helps to describe the relationship between imageability and early expressive language in children with and without ASD; however, some limitations are noted.
As previously discussed, internal and external factors, like nonverbal IQ and quality of caregiver input, are known predictors of language outcomes that should be considered. More specifically, while this study examined word input frequency as one measure of the quantity of caregiver language input, indicators of the quality of caregiver language input, including the level of caregiver education, caregiver vocabulary size, and lexical diversity, are also important factors to include in models predicting child language outcomes. As our data were compiled from multiple previous datasets, the available participant information was not sufficient to consider additional variables in our analyses. The interaction of imageability with other internal, external, and lexicosemantic features (e.g., nonverbal IQ, verb diversity in child-directed speech, word length, etc.) should be better addressed in future studies on noun and verb learning.
Variability of imageability ratings should also be acknowledged when interpreting our findings. The current study used imageability ratings that were normed on a small sample of TD individuals (Masterson & Druks, 1998). Thus, these ratings may not be fully representative of the imageability ratings of our study's time period, age range, or diagnostic group. More robust imageability ratings for nouns and verbs could bolster our results.
Additionally, the 123 items selected for this study likely underrepresent our participants’ total expressive vocabularies. Repeating analyses on a larger set of words may increase the accuracy and generalizability of our findings. The current study was limited to children's expressive vocabularies, specifically action verbs and object nouns from the MB-CDI that overlapped with a set of words with existing imageability ratings. The reduced range of imageability ratings reflected in this early vocabulary measure may also overstate the contributions of imageability in our study. Including a greater variety of nouns and verbs (i.e., not only action verbs, but also path, manner, mental state verbs, etc.) could allow for further analysis of early vocabulary composition in relation to imageability.
Summary & conclusion
We identified imageability as a significant lexicosemantic feature related to early word learning in children with ASD. Similar to TD peers, children with ASD who have small vocabularies primarily produce words that are highly imageable. Children who are more proficient word learners with larger expressive vocabularies produce words that are less imageable, indicating a potential shift away from reliance on perceptual-based language processing. Imageability was significantly associated with noun production and verb production alone, with imageability explaining a greater portion of the variance in nouns produced than in verbs produced. Our findings contribute to a growing body of literature describing the early word learning mechanisms in children with ASD and provide a basis for exploring the use of multisensory instruction and decontextualized language learning strategies. Understanding how the perceptual salience of words affects vocabulary acquisition, particularly during early language development, warrants further study and could have lasting impacts on noun and verb acquisition in children with ASD.
Footnotes
Acknowledgments
We would like to thank the families who contributed to the studies and databases utilized in our analysis, as well as Dr. Janine Molino for her statistical consultation.
Declaration of conflicting interests
The author(s) declared that there are no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the National Institutes of Health [NIH R01DC017131]; the National Institute of Mental Health [R01 MH066496]; the Intramural Research Program of the National Institute of Mental Health [1ZICMH002961]; the Department of Education [H324 C030112]; and a gift from the Simons Foundation. This content is solely the responsibility of the authors and does not necessarily reflect the official view of the National Institutes of Health.
