Abstract
Caregiver language input is a key component of children's early developmental environment, and its nature likely varies with characteristics of the child and the interaction. This study explored how caregiver language input differed across engagement states for 12–24-month-old toddlers with autism spectrum disorder (ASD; n = 39) or typically developing (TD) toddlers (n = 31), and how it varied with toddlers' ages and spoken language during the interaction, in Chinese contexts. Caregiver language was analyzed during 10-min naturalistic dyadic play interactions, including the mean length of utterances, total frequency of input, functional features (including affect-salient speech, directives, questions, labeling, descriptions, and attention getters), and the proportion of utterances integrated with gestures. Toddlers’ spoken language during the interaction and ages were also assessed. Caregivers of toddlers with ASD employed fewer questions and more attention getters compared to those of TD toddlers. Caregivers adjusted language input based on toddlers’ engagement states and ages but not spoken language during the interaction. These findings indicate that Chinese caregiver language input is dynamic and context sensitive. The behavioral characteristics of Chinese toddlers with ASD might shape the unique characteristics of such input.
Keywords
Introduction
Autism spectrum disorder (ASD) is a developmental disorder caused by brain differences, and its core features include differences in social interaction and communication, as well as restricted and repetitive patterns of behavior, interests, or activities. The early interactions between caregivers and toddlers are crucial for toddlers’ early social environment. Notably, infants and toddlers later diagnosed with ASD often display a reduced tendency to seek, respond to, and initiate social experiences in early life. Such differences can disrupt their social interaction with caregivers (Dawson, 2008; Green et al., 2015; Hobson & Lee, 2010).
Caregiver Language Input
Caregiver language input not only acts as a key determinant of children's language acquisition but also structures and sustains interactions that shape multiple developmental milestones, including social communication (Adamson et al., 2015; Bottema-Beutel et al., 2014). Characterizing the features of caregiver input—and how these features may differ between caregivers of children with ASD and their typically developing (TD) counterparts—is equally relevant, as caregivers of children with ASD may adapt their speech in distinct ways to respond to their children's social and linguistic profiles (Nadig & Bang, 2016).
As a core component of children's early developmental environment, caregiver language input is typically characterized by two dimensions: structural and functional features. The structural aspects of caregiver input include complexity (e.g., mean length of utterance [MLU]), syntactic features such as wh-question constructions, and the frequency of different syntactic categories (Bottema-Beutel & Kim, 2021). The functional aspects, by contrast, encompass the intent behind caregiver language directed at their children.
Numerous studies have explored the structural features of caregiver language input for children with ASD in English-speaking contexts, yet inconsistencies in findings reflect the complexity of caregiver–child language dynamics, shaped by developmental stage, interaction context, and child characteristics. A longitudinal investigation revealed that caregivers of preschoolers with ASD produced utterances with a shorter MLU overall during naturalistic free-play interactions, a difference linked to children's nonverbal cognition and diagnostic status, which suggests that caregivers adjust their syntactic complexity in response to children's developmental profiles (Fusaroli et al., 2019). Critically, this study highlighted reciprocal effects, with child speech and caregiver speech predicting one another. Compared with TD children, children with ASD also produced speech with a shorter MLU. In contrast, Hutchins et al. (2017) found that during storytelling interactions with school-age children who already possessed language skills, caregivers of children with ASD produced longer utterances than those of TD children when controlling for total amount of talk. These studies illustrate that caregiver language input to children with ASD is not static but dynamically adjusts to children's age, language ability, and interaction goals.
Caregiver language serves as an important and widely employed medium for conveying emotions and information to children. Accordingly, utterances can be mainly categorized into two types based on their functional features: affect-salient speech and information-salient speech (Bloom et al., 1996; Locke, 1996; Luigia et al., 1998). Subsequent studies have included attention getters, which typically involve children's names or nicknames, into the study of language functional features (Venuti et al., 2012). From a developmental perspective, the functional features of caregiver language are of vital importance (Herrera et al., 2004; Luigia et al., 1998). Affect-salient speech, which aims to stimulate children's enthusiasm for communication and interaction, consists of expressive, usually nonpropositional or seemingly meaningless utterances such as encouragement, singing, and greetings (Locke, 1996; Luigia et al., 1998). It may create a secure emotional context that enhances language learning, aligning with attachment theory's emphasis on emotional security as a foundation for exploration (Ainsworth et al., 2014). Information-salient speech, on the other hand, focuses on conveying propositional content about the self, the child, or the environment, including directives, questions, labels, and descriptions (Zampini et al., 2020). Its role in language acquisition is well-documented. For example, between 6 and 18 months, caregivers’ contingent labeling of objects during joint pointing facilitates infants’ word learning (Rowe & Zuckerman, 2016). Among children with ASD, questions and descriptions emerge as critical subcategories. Notably, caregivers of toddlers with ASD use fewer questions but comparable levels of descriptions relative to those of TD children during naturalistic play interactions, suggesting descriptions may serve as a compensatory verbal scaffold to support exploration of internal and external worlds (Venuti et al., 2012). 
Attention getters represent another functional category, particularly prominent in interactions involving children with ASD (Venuti et al., 2012). This adaptation reflects bidirectional dynamics: children's reduced responsiveness elicits more attention-getting attempts, which may inadvertently reinforce suboptimal interaction patterns. Despite these insights, critical gaps remain. Relatively few studies on functional language in ASD focus on how affect-salient speech, the subcategories of information-salient speech, and attention getters interact to shape the language development environment.
When providing language input, caregivers often use gestures concurrently to emphasize, clarify, or supplement the content of their utterances. When interacting with toddlers of lower language proficiency during storytelling, caregivers increase their use of representational and supplementary gestures—those that add information not present in speech (Molnar et al., 2021). Effective use of gestures by caregivers, particularly when synchronized with speech, is associated with improved social and language outcomes in both neurotypical and neurodivergent children (Choi & Rowe, 2024; Lv et al., 2022). Overall, caregivers’ adaptive use of gestures provides a multimodal scaffold that is especially valuable for children with limited vocabulary or those in linguistically challenging environments. Understanding how caregivers integrate gestures with speech is essential to a comprehensive account of the linguistic environment in which toddlers develop during their early years.
Unlike the extensive research on caregiver language input in English contexts, only a few studies have examined such input in Chinese contexts. One such study demonstrated that 3- to 4-year-old Chinese children with ASD receive less overall language input during semistructured play sessions than their TD peers, and that the language they are exposed to is less complex (Xu et al., 2021). Research on caregiver language input in non-English contexts is nonetheless of great significance. A study investigating caregiver input to children with ASD in Bulgarian and English contexts revealed that parents in these two contexts differ in their language input, for example in the percentage of questions they use (Barokova & Tager-Flusberg, 2024). Conducting research only on English-speaking children may overlook the nuances of caregiver input arising from cross-linguistic variations.
Caregiver Language Input and Engagement States of Children
Early language exposure benefits both TD children and those with ASD. Joint engagement—the dynamic state in which a child and caregiver actively share attention to an object or event—serves as a foundational context for language and social communication development (Adamson et al., 2009). For children with ASD, disruptions in joint engagement are well-documented (Adamson et al., 2009; Bottema-Beutel et al., 2014). Early developmental divergences in both social engagement and language among children are thought to influence caregiver input (Kushner et al., 2023). Caregivers adjust their use of labeling based on the engagement state of toddlers with ASD, highlighting that the context of caregiver input (i.e., the engagement state in which it occurs) modulates its effectiveness for toddlers with ASD during free play interactions. Collectively, these findings underscore the necessity of examining caregiver language input within specific engagement states to understand how it supports development in ASD. Furthermore, investigations within the Chinese context can complement existing research findings and enhance understanding of the relationship between caregiver language input and engagement states among children with ASD across diverse contexts.
Caregiver Language Input and Child/Caregiver Characteristics
The more fine-grained aspects of input that matter depend on the child's language ability or age, particularly for those with language delays or ASD (Choi et al., 2020; Fusaroli et al., 2023; Rowe & Snow, 2020). Previous findings revealed that during semistructured play interactions, caregivers of higher-functioning verbal 2- to 9-year-old children with ASD asked more questions, whereas those supporting lower-functioning nonverbal children relied more on directives and used shorter MLU (Konstantareas et al., 1988). This adjustment reflects both responsiveness to the child's needs and potential constraints imposed by limited verbal engagement. Children's age is also a potential influencing factor on caregiver language input. Eighteen months emerges as a critical point in toddlers’ language development, as the quantity and diversity of input they receive at this stage exert a significant impact on their subsequent language development (Gámez et al., 2023; Rowe, 2012).
The caregiver–child interaction is a dyadic process involving both caregivers and children. Children's characteristics may influence caregiver input during the interaction, while caregivers’ own traits and emotions may also shape their interactive behaviors. Studies have found that caregiver stress can reduce the quantity of language input provided, which in turn is associated with poorer language outcomes in both toddlers with ASD and their siblings (Markfeld et al., 2023). Caregivers’ depressive symptoms also affect their perceptions and reports of autistic traits, highlighting the complex interplay between caregiver characteristics and child outcomes (Goh et al., 2018). Beyond emotions, caregivers’ autistic traits may influence their interaction style; however, direct research linking caregiver autistic traits to their language input remains limited. When investigating caregiver language input, considering potential influencing factors from both caregiver and child perspectives facilitates a deeper and more accurate understanding of such input.
Extensive research on the early language development environment of toddlers with ASD and its potential influencing factors has been conducted in English-speaking contexts, laying a solid foundation for this field. However, research in Chinese contexts remains relatively scarce, particularly regarding caregiver–toddler interaction across diverse engagement states. Additionally, caregivers’ integration of gestures during language input warrants further exploration. A comprehensive and systematic analysis of caregiver language input to Chinese-speaking toddlers with ASD, along with the relationships between such input and child characteristics, would not only deepen understanding of their early development environment but also provide a robust theoretical foundation for targeted interventions and educational practices.
Accordingly, this study addresses the following research questions:
Aim 1—In Chinese contexts, do caregivers of toddlers with ASD exhibit unique structural and functional characteristics in their language input during play interactions compared with those of TD toddlers? We hypothesized that caregivers of toddlers with ASD would exhibit shorter MLU, ask fewer questions, and use more attention getters (Fusaroli et al., 2019; Zanchi et al., 2024).
Aim 2—In Chinese contexts, do characteristics of caregiver language input vary with the engagement states of toddlers with ASD? We hypothesized that caregivers of toddlers with ASD would provide more labeling and descriptions during joint engagement (Kushner et al., 2023; Roemer et al., 2022).
Aim 3—In Chinese contexts, does caregiver language input (including gesture integration) vary with toddlers’ spoken language during interactions and their ages? We hypothesized that lack of spoken language during interactions and younger age would be associated with shorter MLU, fewer questions, and more attention getters (Zanchi et al., 2024).
Method
Participants
Participants were 70 caregivers and their toddlers with ASD (n = 39) or TD (n = 31). Families were recruited through three primary channels: the Department of Child Healthcare, community recruitment, and the Longitudinal Chinese Autism Study of Early Development (LCAS-ED) established by the Children Development and Behavior Center at the Third Affiliated Hospital of Sun Yat-Sen University.
In this study, inclusion criteria were as follows: (a) toddlers aged 12–24 months; (b) native Chinese speakers; and (c) caregivers agreed to have videos taken and to participate in the evaluation and follow-up. At a mean age of 18.7 months (standard deviation, SD = 3.7), toddlers completed the Mullen Scales of Early Learning (MSEL; Mullen, 1995) and participated in caregiver–toddler play interactions. Concurrently, caregivers who engaged in interactions were required to complete the Autism-spectrum Quotient (AQ) questionnaire and the Patient Health Questionnaire-4 (PHQ-4). At 20.4 months (SD = 2.5), a clinical best estimate diagnosis was made at a developmental evaluation using all available information by two experienced specialists in child development and behavior. All toddlers completed the MSEL and the Autism Diagnostic Observation Schedule, Toddler Module (ADOS-T; Luyster et al., 2009) as part of the developmental evaluation. Toddlers were identified as having ASD if clinicians diagnosed them as such according to the Diagnostic and Statistical Manual of Mental Disorders-5th Edition (DSM-5), and their ADOS-T Calibrated Severity Score (CSS) was equal to or greater than four points. Toddlers were identified as TD if they met the following criteria: (a) they scored within 1.25 standard deviations of the mean or higher on all MSEL scales; (b) their ADOS-T CSS was less than four points; and (c) there were no concerns regarding any developmental delay, such as suspected ASD or language developmental delay. Diagnostic outcomes and other developmental characteristics of the toddlers in this sample are reported in Table 1.
Participant Demographics and Descriptive Statistics.
Note. ASD = autism spectrum disorder; TD = typically developing; ADOS-T = Autism Diagnostic Observation Schedule-Toddler Module; SA = social affect; RRB = restricted, repetitive behaviors; CSS = Calibrated Severity Score; MSEL = Mullen Scales of Early Learning; T = T-value; AQ = Autism-spectrum Quotient; PHQ-4 = Patient Health Questionnaire-4.
Values are presented as mean (standard deviation) for t-tests and frequency (percentage) for chi-square tests.
**p < .01.
About 27% of interactive caregivers in this study had an education level below a bachelor's degree, 53% had a bachelor's degree, and 20% had an education level above a bachelor's degree. Additional demographic information is reported in Table 1. All families gave written informed consent. This study was approved by the Institutional Review Board at the Third Affiliated Hospital of Sun Yat-Sen University.
Measures
The Mullen Scales of Early Learning: AGS Edition
The MSEL is a standardized developmental assessment tool for children ranging from birth to 68 months old. It evaluates gross motor skills, visual reception, fine motor skills, receptive language, and expressive language. Upon completion of the assessment, the raw scores in each scale are converted into T scores (M = 50, SD = 10) according to the norm.
The Autism Diagnostic Observation Schedule, Toddler Module
The ADOS-T is used for assessing toddlers aged 12–30 months and consists of two domains: social affect and restricted, repetitive behaviors. The total score is calculated as well. According to the norm, the following ranges of concerns are identified: little to no concern, mild to moderate concern, and moderate to severe concern. The CSS of the ADOS-T is used to assess the severity of individual modules (Esler et al., 2016).
The Autism-Spectrum Quotient
The AQ is a self-report screening tool for the autistic traits of caregivers with normal intelligence. The AQ comprises 50 items, with each item answered on a scale from 1 to 4. Depending on the item, either responses 1 and 2 or responses 3 and 4 are scored as 1 point. The higher the score, the more obvious the autistic traits (Baron-Cohen et al., 2001).
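The binary scoring rule described above can be illustrated with a minimal sketch. The function name `score_aq` and the `low_keyed_items` parameter are hypothetical; which items are keyed in which direction is defined by the published instrument and is not reproduced here, so the item set must be supplied by the user.

```python
from typing import Sequence, Set


def score_aq(responses: Sequence[int], low_keyed_items: Set[int]) -> int:
    """Sum binary item scores for an AQ-style questionnaire.

    responses: one answer per item, coded 1-4 (item 1 first).
    low_keyed_items: 1-based item numbers (an assumption supplied by the
        caller) for which responses 1 or 2 earn the point; on every other
        item, responses 3 or 4 earn the point.
    """
    total = 0
    for item_number, answer in enumerate(responses, start=1):
        if answer not in (1, 2, 3, 4):
            raise ValueError(f"item {item_number}: response must be 1-4")
        if item_number in low_keyed_items:
            total += 1 if answer in (1, 2) else 0
        else:
            total += 1 if answer in (3, 4) else 0
    return total


# Toy example with five items; the real questionnaire has 50.
example_total = score_aq([1, 2, 3, 4, 1], low_keyed_items={1, 2})
```

Because each item contributes 0 or 1, the total for the full 50-item form ranges from 0 to 50.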
The Patient Health Questionnaire-4
The PHQ-4 is an ultra-brief self-report screening tool for depression and anxiety in caregivers. A score of 3 or higher on the depression subscale is a reasonable cutoff for identifying potential major depressive disorder or other depressive conditions (Kroenke et al., 2003; Löwe et al., 2005). Similarly, a score of 3 or higher on the anxiety subscale serves as a reasonable cutoff value for detecting generalized anxiety, panic, social anxiety, and posttraumatic stress disorder (Kroenke et al., 2007). In this study, we calculated the positive cases of the depression and anxiety subscales respectively.
Coding Scheme of Caregiver–Toddler Play Interaction in a Natural Context
Caregivers were free to choose whether to conduct interactions in their actual home environment or a simulated naturalistic setting, based on their own convenience and preferences. For both contexts, we provided an identical set of age-appropriate toys for caregivers to use as reference, including a toy car, a music box, a bouncing toy, a set of boxes, eight textured blocks, eight building blocks, and eight plastic snowflakes. Videos needed to clearly capture the upper body movements, including facial expressions, of both caregivers and toddlers. All naturalistic caregiver–toddler interactions were carried out by the parent who was the primary caregiver in the toddler's daily life. Caregivers were instructed to play in a natural, everyday manner. Prior to formal recording, caregivers played with toddlers for 2 min as a warm-up. Caregivers could decide freely whether to use toys during play. Face-to-face dyadic interaction was permissible; that is, caregivers and toddlers could interact without toys, engaging instead in games such as peekaboo and lifting the child. Examples of caregiver–toddler play interactions are shown in Figure 1.

Examples of Caregiver-Toddler Play Interaction in a Natural Context. (a) and (b) Respectively Represent Caregiver-Toddler Play Interaction in a Simulated Naturalistic Setting (a) and an Actual Home Environment (b).
Video coding started right after the warm-up ended and lasted continuously for 10 min. If the camera did not clearly capture the toddler's activities or the toddler interacted with someone other than the caregiver, that segment was considered uncodable and removed when calculating the total effective duration of the video (in minutes). Two well-trained researchers identified and transcribed caregiver's utterances and gestures, toddler's engagement states, and toddlers’ spoken language across all video recordings. ELAN 6.8 software was used for video coding.
Caregiver Language Input
All spoken language that the caregiver directed towards the toddler was segmented into single utterances, which served as the unit of analysis. The boundaries between utterances were defined by intonation changes, pauses longer than 1 s, and/or subject changes (Roemer et al., 2022), excluding semantically irrelevant parts or noises. Caregiver language input was analyzed in terms of functional features, morphosyntactic features, and gesture integration. Unintelligible or incomplete utterances were excluded from the analysis.
Regarding the functional features of the input, we adopted the same coding scheme used by Zampini et al. (2020). The functions are as follows:
Affect-salient speech: utterances for maintaining conversation without content information, such as encouragements (“Well done”), greetings (“Goodbye”), singing, and nonsense (e.g., repeating the toddlers’ words).
Information-salient speech: utterances for conveying content, with four subcategories: (a) Directives (e.g., “Put it here”); (b) Questions (e.g., “Is the cow white?”); (c) Labeling (e.g., “The girl”); (d) Descriptions (e.g., “The cow is on the truck”).
Attention getters: utterances for drawing the toddler's attention, like “Look” or calling the toddler's name.
The proportion of each category described above was calculated by dividing the number of utterances in each category by the total number of utterances by caregivers.
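The proportion calculation above can be sketched as follows. This is an illustration only: the category labels and the function name `category_proportions` are hypothetical and do not reflect the coding labels actually used in ELAN.

```python
from collections import Counter
from typing import Dict, Optional, Sequence

# Hypothetical labels for the six functional categories described above.
CATEGORIES = (
    "affect_salient",
    "directive",
    "question",
    "labeling",
    "description",
    "attention_getter",
)


def category_proportions(codes: Sequence[str]) -> Dict[str, Optional[float]]:
    """Return each category's share of all coded caregiver utterances."""
    unknown = set(codes) - set(CATEGORIES)
    if unknown:
        raise ValueError(f"unknown category labels: {sorted(unknown)}")
    total = len(codes)
    if total == 0:
        # No utterances: the proportion is undefined, so report it as missing.
        return {category: None for category in CATEGORIES}
    counts = Counter(codes)
    return {category: counts[category] / total for category in CATEGORIES}


example = category_proportions(
    ["question", "question", "directive", "attention_getter"]
)
```

In this toy example, questions account for half of the four utterances, and categories that never occur receive a proportion of 0.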
Regarding the morphosyntactic features of the input, we considered the following measure:
MLU: As an isolating language, Chinese has no morphological inflection in its words. Therefore, the MLU of Chinese can be measured in two ways: by characters or by words. In this study, to explore the MLU of adult caregivers, the calculation was based on characters, that is, counting the number of characters in each utterance. For instance, the length of “Ni Hao” is 2.
Regarding the integration of utterances and gestures, we categorized gestures based on the relationship between each gesture and the corresponding utterance (Özçalışkan & Goldin-Meadow, 2005). The classification is as follows:
Reinforcing/redundant gestures: gestures conveying the same information as the utterance. For example, “Dog” + pointing at the dog, “No” + shaking the head.
Disambiguating gestures: gestures clarifying the meaning of an utterance, typically used to clarify pronominal (e.g., “he,” “she”), demonstrative (e.g., “this,” “that”), or deictic (e.g., “here,” “there”) references in the utterance. For example, “There” + pointing at the table.
Supplementary gestures: gestures adding semantic information to the message conveyed in the utterance, including those seemingly conflicting with the utterance. For example, “Push” + pointing at the sofa, “Dog” + pointing at the cow.
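The character-based MLU described above can be sketched as follows. The sketch assumes that only Han characters in the basic CJK Unified Ideographs block (U+4E00 to U+9FFF) count toward utterance length, so punctuation, spaces, digits, and Latin letters are ignored; how such tokens were handled in the actual transcripts is not stated, and the function names are hypothetical.

```python
from typing import Optional, Sequence


def utterance_length(utterance: str) -> int:
    """Count Han characters in one transcribed utterance.

    Assumption: only code points U+4E00..U+9FFF are counted.
    """
    return sum(1 for ch in utterance if "\u4e00" <= ch <= "\u9fff")


def mean_length_of_utterance(utterances: Sequence[str]) -> Optional[float]:
    """Average character count per utterance; None if there are no utterances."""
    if not utterances:
        return None
    return sum(utterance_length(u) for u in utterances) / len(utterances)


# "你好" (Ni Hao) has length 2; "把它放在这里" ("Put it here") has length 6.
example_mlu = mean_length_of_utterance(["你好", "把它放在这里"])
```

Counting by characters avoids the word-segmentation decisions that a word-based MLU would require for Chinese.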
Toddlers’ Spoken Language During the Interaction
The toddler's spoken language during the caregiver–toddler play interaction is defined as any spoken expression that uses a phonetic symbol system to convey clear information (O’Hare, 2005). In this study, the toddler's spoken language during the interaction was coded into two categories:
With spoken language during the interaction: the toddler produces at least one spoken expression, like “Mama” or “Want,” that can convey clear information during the entire effective coding period.
Without spoken language during the interaction: the toddler produces no spoken expressions that can convey clear information throughout the entire effective coding period.
Toddlers’ Engagement States
We used the coding scheme for toddlers’ engagement states developed by Adamson et al. (2004, 2009). After excluding uncodable segments, the entire play video was coded into mutually exclusive engagement states, such that the end of one code marked the beginning of another. Engagement states included unengaged state, object engagement, joint engagement, and onlooking state. The specific definitions of each engagement state are shown in Table 2. To avoid micro-coding of brief fluctuations in attention, a 3-s principle was applied. That is, if the toddler's gaze briefly shifted away from the interaction for less than 3 s to focus on another object or because of an unexpected noise, the shift was not coded as a change in the engagement state.
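One way to picture the 3-s principle is as a smoothing pass over a provisional list of coded segments. The sketch below is a simplification under stated assumptions, not the coders' actual procedure: segments are hypothetical `(state, start, end)` tuples in seconds, ordered and contiguous, and any segment shorter than 3 s is absorbed into the state that preceded it.

```python
from typing import List, Tuple

Segment = Tuple[str, float, float]  # (state, start_s, end_s)


def apply_three_second_rule(
    segments: List[Segment], min_duration: float = 3.0
) -> List[Segment]:
    """Absorb brief segments into the preceding state, then merge neighbours."""
    smoothed: List[Segment] = []
    for state, start, end in segments:
        if smoothed and (end - start) < min_duration:
            # Brief fluctuation: extend the previous state over it.
            prev_state, prev_start, _ = smoothed[-1]
            smoothed[-1] = (prev_state, prev_start, end)
        elif smoothed and smoothed[-1][0] == state:
            # Same state as before (e.g., after an absorbed blip): merge.
            _, prev_start, _ = smoothed[-1]
            smoothed[-1] = (state, prev_start, end)
        else:
            smoothed.append((state, start, end))
    return smoothed


raw = [
    ("joint", 0, 10),
    ("unengaged", 10, 12),  # 2-s glance away, shorter than 3 s
    ("joint", 12, 20),
    ("object", 20, 30),
]
smoothed_example = apply_three_second_rule(raw)
```

In this toy example the 2-s glance away is not treated as a state change, so the first 20 s are coded as a single joint engagement segment. A brief segment at the very start of a recording has no preceding state and is kept as coded.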
Classification of Toddler Engagement States in Play Interactions.
Data Reduction
After coding, all the original data were processed. We calculated the total frequency of input, the proportion of utterances in each category, the proportion of utterances integrated with each gesture category, and the proportion of the engagement state in each category. Furthermore, within each of the toddlers’ engagement states, the proportion of each utterance category was calculated.
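The data reduction above can be sketched in a few lines. The sketch assumes each caregiver utterance has already been paired with the engagement state in which it occurred; the function names and labels are hypothetical. Input frequency is taken as utterances per effective minute of video, and a proportion whose denominator is 0 is returned as `None` so that it can be treated as missing.

```python
from typing import Dict, Iterable, Optional, Sequence, Tuple


def input_frequency(n_utterances: int, effective_minutes: float) -> Optional[float]:
    """Utterances per minute of codable video; None if no codable time."""
    if effective_minutes <= 0:
        return None
    return n_utterances / effective_minutes


def proportions_within_states(
    coded_utterances: Iterable[Tuple[str, str]],
    states: Sequence[str],
    categories: Sequence[str],
) -> Dict[str, Dict[str, Optional[float]]]:
    """For each engagement state, the share of utterances in each category."""
    counts = {s: {c: 0 for c in categories} for s in states}
    for state, category in coded_utterances:
        counts[state][category] += 1
    result: Dict[str, Dict[str, Optional[float]]] = {}
    for s in states:
        total = sum(counts[s].values())
        # A state with no utterances has a zero denominator: mark as missing.
        result[s] = {
            c: (counts[s][c] / total if total else None) for c in categories
        }
    return result


example = proportions_within_states(
    [("joint", "labeling"), ("joint", "description"), ("object", "directive")],
    states=("joint", "object", "onlooking"),
    categories=("labeling", "description", "directive"),
)
```

Here the onlooking state contains no utterances, so all of its proportions are missing rather than 0.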
Training and Coding Reliability
Two doctoral students with professional backgrounds in child developmental behavior participated in the coding process of this study. To ensure coding consistency, the two coders conducted multiple rounds of pilot coding prior to the official coding phase. In each round of pilot coding, they independently completed full coding of a randomly selected interaction video. Official coding commenced once intercoder reliability exceeded .90. For any discrepancies, the coders reviewed the relevant segments repeatedly and only included them in the analysis after reaching a consensus. After ensuring coding reliability with training videos, the two coders independently coded the videos, and neither of them knew the diagnosis and ability information of the toddlers. Due to the involvement of variables at multiple levels, each video was processed 3 times. First, we coded the toddlers’ engagement states and excluded uncodable segments. Second, we coded all utterances of caregivers within the valid coding segments, as well as gestures integrated with these utterances. Finally, we determined whether the toddlers produced spoken language throughout the valid interaction segments. Twenty percent of the caregiver–toddler play interaction videos (n = 14) were randomly selected for overlapping coding to calculate intercoder reliability, measured via the intraclass correlation coefficient (ICC). The ICC values for the six categories of functional utterances ranged from .976 to .995; those for the three categories of gestures ranged from .971 to .998; and those for the duration (in seconds) of the four categories of engagement states ranged from .993 to 1. For coding consistency in determining the presence or absence of toddlers’ spoken language during the interaction, intercoder reliability was assessed using Cohen's kappa coefficient, with a value of .923.
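For readers unfamiliar with Cohen's kappa, a minimal sketch of the standard formula is given below; it is not the SPSS routine used in this study, and the function name is hypothetical. Kappa compares the observed agreement between two coders with the agreement expected by chance from each coder's marginal label frequencies.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(coder_a: Sequence[Hashable], coder_b: Sequence[Hashable]) -> float:
    """Cohen's kappa for two coders who labelled the same items."""
    if len(coder_a) != len(coder_b) or len(coder_a) == 0:
        raise ValueError("both coders must label the same non-empty set of items")
    n = len(coder_a)
    # Observed agreement: share of items given identical labels.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement: product of the two coders' marginal proportions per label.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    labels = set(freq_a) | set(freq_b)
    expected = sum(freq_a[label] * freq_b[label] for label in labels) / (n * n)
    if expected == 1.0:
        # Both coders used one identical label throughout; kappa is undefined,
        # and this sketch reports perfect agreement.
        return 1.0
    return (observed - expected) / (1 - expected)


# 1 = spoken language present, 0 = absent (toy data, not study data).
kappa_example = cohens_kappa([1, 1, 0, 0], [1, 1, 0, 1])
```

In the toy data the coders agree on three of four videos (observed agreement .75) against a chance agreement of .50, which gives a kappa of .50.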
Data Analyses
We used IBM SPSS 26.0 for all statistical analyses and GraphPad Prism 10.4 to create most graphs. Alpha was set at .05 for statistical significance. Proportions with a denominator of 0 were treated as missing values in the analysis. Outliers exceeding three standard deviations were excluded from the analysis. Normality was assessed using the Shapiro–Wilk (S-W) test, and the corresponding between-group comparison method was selected based on variable distribution. To test for differences in sample variables between the two groups, we used the t-test for normally distributed continuous variables, the Mann–Whitney U nonparametric test for nonnormally distributed continuous variables, and the chi-square test for categorical variables. When conducting multiple independent tests, the probability of Type I errors accumulates and increases with the number of tests performed. To control for such inflated Type I errors, the Bonferroni correction was applied, resulting in a corrected significance level of α′ = α/N (N = number of comparisons). For comparisons of caregiver language input, a p-value < .0042 (.05/12) was deemed significant (see Tables 3–5). For the comparisons of multiple related groups, we used the Friedman test, followed by the Nemenyi test for post-hoc pairwise comparisons.
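The corrected threshold and the outlier screen described above can be illustrated with a short sketch. It is not the SPSS procedure used in the study: the function names are hypothetical, and the sketch assumes that outliers are defined relative to the sample mean and the sample standard deviation, which the text does not specify.

```python
import statistics
from typing import List, Optional, Sequence


def bonferroni_alpha(alpha: float, n_comparisons: int) -> float:
    """Corrected significance level: alpha' = alpha / N."""
    return alpha / n_comparisons


def exclude_outliers(
    values: Sequence[Optional[float]], n_sd: float = 3.0
) -> List[float]:
    """Drop missing values, then values more than n_sd SDs from the mean.

    Assumption: the sample SD (n - 1 denominator) is used.
    """
    present = [v for v in values if v is not None]
    if len(present) < 2:
        return present
    mean = statistics.mean(present)
    sd = statistics.stdev(present)
    if sd == 0:
        return present
    return [v for v in present if abs(v - mean) <= n_sd * sd]


threshold = bonferroni_alpha(0.05, 12)   # about .0042 for 12 comparisons
kept = exclude_outliers([10.0] * 20 + [100.0])
```

With 12 comparisons the threshold is .05/12, which rounds to .0042. In the toy data the value 100 lies more than three sample SDs above the mean and is dropped.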
Comparison of Caregiver Language Input Characteristics Between ASD and TD Groups.
Note. ASD = autism spectrum disorder; TD = typically developing; MLU = mean length of utterance.
Frequency is calculated by dividing the number of utterances by the total effective duration of the video (in minutes).
Proportion of utterances in each category of functional speech (calculated by dividing the number of utterances in each category by the total number of utterances produced by the caregivers).
*p < .0042.
Caregiver Language Input to Toddlers With ASD: Comparison Between Those With and Without Spoken Language During Interactions.
Note. MLU = mean length of utterance.
Values are presented as mean (standard deviation) for t-tests and median (lower quartile, upper quartile) for Mann–Whitney U nonparametric tests.
Proportion of utterances in each category of functional speech (calculated by dividing the number of utterances in each category by the total number of utterances produced by the caregivers).
Caregiver Language Input to Toddlers With ASD: Comparison Between 12–18-Month-Old and 19–24-Month-Old Toddlers.
Note. MLU = mean length of utterance.
Values are presented as mean (standard deviation) for t-tests and median (lower quartile, upper quartile) for Mann–Whitney U nonparametric tests.
Proportion of utterances in each category of functional speech (calculated by dividing the number of utterances in each category by the total number of utterances produced by the caregivers).
*p < .0042.
Results
Sample Demographics and Descriptive Statistics
Table 1 presents the demographic and descriptive data of the ASD and TD groups. In total, 40 males and 30 females were enrolled in the study, with 30 males and nine females in the ASD group. According to Mullen's definition of developmental delay (T score < 30), the ASD group showed significant language delays. Their average receptive language T score lagged more than 1.5 SD behind, and their average expressive language T score lagged more than 2 SD behind. The ASD group had a significantly higher ADOS-T total CSS than the TD group.
Aim 1—Caregiver Language Input Provided to Toddlers With ASD Versus TD Toddlers
Table 3 shows the characteristics of caregiver input in the ASD and TD groups. The proportion of questions and attention getters differed between the two groups. Specifically, the proportion of questions was significantly higher for TD toddlers than for toddlers with ASD (p = .002). In contrast, the proportion of attention getters was significantly higher for toddlers with ASD than for TD toddlers (p < .001). No significant differences were found in other characteristics of caregiver input for toddlers with ASD and TD toddlers.
Aim 2 – Caregiver Language Input Within Different Engagement States of Toddlers With ASD
We compared the proportions of utterances across functional categories within each engagement state. Within the unengaged state, affect-salient speech accounted for the lowest proportion, differing significantly from directives (p = .006), descriptions (p = .008), and attention getters (p = .027). Labeling was similarly low, differing significantly from directives (p = .001), questions (p = .003), descriptions (p = .001), and attention getters (p = .001). Within object engagement, affect-salient speech had the lowest proportion, differing significantly from directives and descriptions (p = .001). Directives had the highest proportion, differing significantly from labeling (p = .001) and attention getters (p = .023). Descriptions also had a significantly higher proportion than questions, labeling, and attention getters (p = .001). Within joint engagement, labeling had the lowest proportion, differing significantly from affect-salient speech, directives, and descriptions (p = .001). Directives had the highest proportion, differing significantly from questions (p = .010) and attention getters (p = .001). Descriptions were also high, differing significantly from affect-salient speech (p = .005), questions (p = .001), and attention getters (p = .001). Attention getters had a lower proportion than affect-salient speech (p = .020). Within the onlooking state, descriptions had a significantly higher proportion than the other five categories (p = .001). Table 6 presents the comparative analysis of the proportions of the six categories of caregivers’ utterances within the four engagement states of toddlers with ASD. The results of post-hoc pairwise comparisons using the Nemenyi test are available in the supplementary material (see Appendix S1).
Comparison of Proportions of Six Caregiver Functional Utterance Categories Within Four Engagement States of Toddlers With ASD.
Note. n = actual number of samples after excluding missing values and outliers; Mdn = median; Q1 = lower quartile; Q3 = upper quartile; MR = mean rank.
Kendall's W, a statistic ranging from 0 to 1, was used to quantify the degree of intercaregiver agreement in these proportional distributions. Specifically, values < .3 indicate low consistency, values between .3 and .7 (inclusive) reflect moderate consistency, and values > .7 denote high consistency in the relative distribution patterns of the six utterance categories.
**p < .01.
We also compared the proportions of utterances across functional categories within each engagement state of TD toddlers (see Appendices S2 and S3).
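The note to Table 6 describes Kendall's W as a 0 to 1 index of how consistently caregivers rank the six utterance categories. As a hedged illustration, the sketch below computes W from per-caregiver proportions using the textbook formula W = 12S / (m²(k³ − k)), where m is the number of caregivers, k the number of categories, and S the sum of squared deviations of the category rank sums from their mean. It omits the correction for tied ranks, and all function names are hypothetical; it is not the authors' analysis code. The cutoffs in `label_consistency` are those stated in the table note.

```python
def average_ranks(values):
    """Rank values from 1..k, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for t in range(i, j + 1):
            ranks[order[t]] = mean_rank
        i = j + 1
    return ranks


def kendalls_w(rows):
    """rows: one list per caregiver holding k category proportions (no tie correction)."""
    m = len(rows)
    k = len(rows[0])
    rank_sums = [0.0] * k
    for row in rows:
        for idx, rank in enumerate(average_ranks(row)):
            rank_sums[idx] += rank
    mean_sum = m * (k + 1) / 2
    s = sum((rs - mean_sum) ** 2 for rs in rank_sums)
    return 12 * s / (m ** 2 * (k ** 3 - k))


def label_consistency(w):
    """Apply the cutoffs from the table note: < .3 low, .3 to .7 moderate, > .7 high."""
    if w < 0.3:
        return "low"
    if w <= 0.7:
        return "moderate"
    return "high"
```

With two caregivers who rank three categories identically, `kendalls_w` returns 1; with exactly reversed rankings it returns 0.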
Aim 3—Caregiver Language Input and Toddlers’ Individual Characteristics
In the present study, AQ and PHQ-4 scores did not differ between caregivers of toddlers with ASD and caregivers of TD toddlers. Therefore, we focused on the relationship between toddlers’ individual characteristics and caregiver language input.
Toddler Spoken Language During the Interaction
As reported in Table 4, no significant differences were observed in the characteristics of caregiver input between toddlers with ASD who had spoken language during the interaction and those who did not.
Within the TD group, only six toddlers did not have spoken language during the interaction. Because such a small subgroup may not provide sufficient statistical power to yield reliable conclusions, we did not compare caregiver language input between TD toddlers who had spoken language during the interaction and those who did not.
Toddler Age
We examined differences in the characteristics of caregiver language input between 12–18-month-old and 19–24-month-old toddlers in the ASD and TD groups to determine whether toddlers’ age affected such input.
As shown in Table 5, in the ASD group, the proportion of disambiguating gestures differed between age groups: disambiguating gestures were more common in input to 19–24-month-olds (p = .003). No other significant differences were found in caregiver language input to toddlers with ASD of different ages. In the TD group, the proportion of disambiguating gestures likewise differed between age groups, with disambiguating gestures more common in input to 19–24-month-olds (p = .002). No other significant differences emerged in caregiver language input to TD toddlers of different ages.
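The note to Table 5 flags significance at p < .0042 rather than .05. This section does not state how that cutoff was derived. One reading, offered here only as an assumption, is a Bonferroni correction of α = .05 across 12 comparisons, since .05 / 12 ≈ .0042. The Python sketch below shows that arithmetic; the function names and the figure of 12 comparisons are hypothetical.

```python
def bonferroni_threshold(alpha, n_comparisons):
    """Per-comparison cutoff under a Bonferroni correction."""
    if n_comparisons < 1:
        raise ValueError("n_comparisons must be at least 1")
    return alpha / n_comparisons


def is_significant(p_value, threshold):
    """True when the p value falls strictly below the corrected cutoff."""
    return p_value < threshold


# ASSUMPTION: 12 comparisons. The section reports only the resulting cutoff (.0042).
threshold = bonferroni_threshold(0.05, 12)
```

Under this reading, the reported p values of .003 and .002 for disambiguating gestures both fall below the cutoff, which matches their being flagged as significant.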
Discussion
This study explored the unique characteristics of caregiver language input during 10-min naturalistic play interactions with toddlers with ASD in Chinese contexts. It also analyzed the functional features of such input within different engagement states of toddlers with ASD. Moreover, this study explored the association between caregiver language input and toddlers’ individual characteristics. Key findings are as follows: (a) Chinese-speaking caregivers of toddlers with ASD used a lower proportion of questions and a higher proportion of attention getters during play interactions than caregivers of TD toddlers; (b) caregivers adjusted language input based on the engagement states of toddlers with ASD; (c) caregiver language input did not differ significantly between toddlers with ASD who had spoken language during the interaction and those who did not; (d) caregivers used a lower proportion of disambiguating gestures with 12–18-month-old toddlers, whether with ASD or TD, than with 19–24-month-old toddlers.
Caregiver Language Input Provided to Toddlers With ASD Versus TD Toddlers
Previous research on language acquisition in atypical populations has found that mothers adjust their communication styles based on their children's language and cognitive capabilities (Boyce & Boyce, 2002; D’Odorico & Jacob, 2006; Venuti et al., 2012). This study expands the understanding of the characteristics of language input by caregivers of toddlers with ASD in Chinese contexts.
The primary aim of this study was to compare the language input of caregivers of toddlers with ASD with that of caregivers of TD toddlers. Consistent with our hypothesis, caregivers of toddlers with ASD posed fewer questions and used more attention getters. This finding aligns with the characteristics of language input by caregivers of children with ASD in English-speaking contexts (Venuti et al., 2012), suggesting that caregivers from different cultural backgrounds exhibit similar traits when interacting with children with ASD and that such adjustments in language input may be stable across languages. Questions typically initiate conversations and encourage language output. Venuti et al. (2012) found that when interacting with toddlers with language development delays, caregivers tend to decrease open-ended questions and rely more on direct directives to sustain interactions. The low responsiveness of toddlers with ASD to social interactions may prompt caregivers to adopt more “controlling” strategies. Such caregivers may seek to regulate the situation by excessively structuring their toddlers’ behaviors and providing simplistic, repetitive cues. The high proportion of attention getters may represent caregivers’ adaptive adjustment to the limited social interaction of toddlers with ASD.
Contrary to our expectations, caregivers of toddlers with ASD and those of TD toddlers exhibited similar MLU in our study. Although caregivers of toddlers with ASD did have a shorter MLU than caregivers of TD toddlers, this slight difference did not reach statistical significance. This result is inconsistent with previous findings (Britsch & Iverson, 2024). Given that toddlers with ASD often display delays in language comprehension, caregivers typically simplify syntactic structures to match these toddlers’ cognitive and language levels. One potential explanation for our results may lie in differences in MLU calculation methods between Chinese and English. Future research should further explore MLU among Chinese caregivers using a larger sample size.
Despite significant differences, this study also uncovered several similarities in specific utterance characteristics between Chinese-speaking caregivers of the ASD and TD groups. This finding aligns with numerous previous studies in English-speaking contexts (Bottema-Beutel & Kim, 2021). The cross-linguistic consistency of such similarities suggests they are not arbitrary but reflect evolutionarily rooted or developmentally tailored strategies. Caregivers, regardless of a toddler's diagnostic status or cultural context, intuitively deploy utterances that align with foundational learning needs. Such similarities may serve as a scaffold, facilitating interactions and language development. In studies on TD toddlers, caregivers’ use of labeling enables toddlers to associate words with objects, thereby supporting language acquisition (Yu et al., 2019). When toddlers focus on objects, caregivers’ use of labeling can capture their attention and promote caregiver–toddler interactions (Bottema-Beutel et al., 2018). This facilitation is particularly beneficial for toddlers with ASD, who struggle with engaging in interaction. In both the ASD and TD groups, caregivers’ utterances during the interactions are mainly information-salient speech, which is consistent with the language patterns expected of mothers of 2-year-old toddlers (Bornstein et al., 1992; D’Odorico et al., 1999). At this stage, the language communication between mothers and toddlers focuses more on sharing internal and external world meanings rather than emotional expression (Venuti et al., 1997).
During language input, caregivers often integrate gestures to help toddlers understand their utterances. Among the three types of gestures, reinforcing gestures were the most common in both groups, typically used to indicate the attention focus in utterances. Although toddlers with ASD exhibit fewer social responses than TD toddlers during caregiver–toddler interactions, their caregivers still use gestures effectively to enhance their social responses. A small-scale study even revealed that, compared to the TD group, caregivers of children with ASD produced more gestures and provided more scaffolding for their children's visual experiences (Yoshida et al., 2020). These findings collectively underscore caregivers’ ability to adapt their language to meet their toddlers’ unique needs (Longobardi & Caselli, 2007).
Caregiver Language Input Within Different Engagement States of Toddlers With ASD
Our second aim was to determine whether caregiver language input varies according to the engagement states of toddlers with ASD in Chinese contexts. The proportions of the six utterance categories showed different trends across engagement states. Affect-salient speech had a consistently low proportion in all engagement states of toddlers with ASD except joint engagement. This suggests that when toddlers are inattentive to caregivers, emotional expression is not a primary choice for interaction, and caregivers increase its use only in specific scenarios. Directives had a relatively high proportion in the unengaged, object engagement, and joint engagement states, reflecting caregivers’ need to guide toddlers’ behaviors in these three states. The directive style adopted by caregivers of toddlers with developmental disabilities has long been debated. Most studies suggest that when caregivers exhibit a more directive style during interactions, toddlers show lower engagement, such as shorter conversation turns (Smith et al., 2018). Nevertheless, some studies contend that caregivers’ use of follow-in directives benefits toddlers’ language outcomes (Delehanty et al., 2023). In this study, caregivers’ directive utterances encompassed both appropriate structured guidance (e.g., “Throw the ball like this”) and restrictions (e.g., “Don’t move around”). However, this study did not distinguish whether these utterances followed toddlers’ attention. Future research could refine the classification of utterances to clarify their relationship with toddlers’ attentional foci, which would more clearly define the role of directives in caregiver–toddler play interactions in Chinese contexts.
Labeling had a small proportion in all states. This was particularly evident in the joint engagement state, indicating that caregivers used this simple form of utterance relatively less when toddlers were actively engaged in the interaction. The proportion of questions remained stable across states, but its ranking relative to other utterance categories varied, suggesting that caregivers adjust question-asking according to the interaction context. Descriptions had a relatively high proportion when toddlers were engaged in interaction, which points to the importance of descriptive utterances throughout caregiver–toddler play. When caregivers use rich descriptive expressions, toddlers are exposed to a broader range of vocabulary and language constructs. This exposure, in turn, enhances toddlers’ language comprehension and expression (Christakis et al., 2019). The proportion of attention getters varied without a clear pattern; caregivers mainly used them to attract toddlers’ attention when toddlers were unengaged. Caregivers of TD toddlers tended to use less affect-salient speech, labeling, and attention getters, while employing more questions and descriptive utterances. These differences in the characteristics of language input between caregivers of toddlers with ASD and those of TD toddlers indicate that caregivers adapt their language input strategies according to children's performance in interactions. Overall, these findings illustrate the dynamic and context-dependent nature of caregiver language input in Chinese-speaking contexts, highlighting how Chinese caregivers adapt their utterance strategies based on the engagement states of toddlers with ASD. When toddlers with ASD show more complex and diverse engagement states, caregivers’ language input also shows more variation. Our results supplement the understanding of how toddlers’ engagement states affect caregiver language input in the Chinese context.
Caregiver Language Input and Toddlers’ Individual Characteristics
Existing studies have demonstrated a significant reciprocal relationship between caregiver language input and toddlers’ language skills during the first 2 years of life (Choi et al., 2020). This bidirectional relationship is marked by dynamic adaptation: toddlers’ early MLU shapes caregivers’ future linguistic complexity (Smith et al., 2023), while caregivers’ vocabulary and syntax align with those of preschoolers across extended periods, maintaining stable congruence over six assessments spanning two years (Fusaroli et al., 2019). Such synchrony suggests that caregivers and toddlers continuously adjust to each other's linguistic cues to sustain communicative flow. Nevertheless, in this study, no significant differences in language input characteristics were observed between caregivers of toddlers with ASD who produced spoken language during the interaction and caregivers of those who did not. One possible explanation is that, during the second year of life, even toddlers with ASD who produce spoken language typically reach only the single-word stage, with a limited vocabulary and low vocalization frequency. This subtle distinction may therefore be insufficient to exert a significant impact on caregiver language input. Alternatively, toddlers with ASD often display language delays of varying degrees and manifestations (Vogindroukas et al., 2022). Although caregivers generally prioritize toddlers’ language development, they may lack sensitivity to subtle changes in toddlers’ language abilities and thus fail to promptly adjust their language input strategies to align with the toddlers’ evolving needs. This finding underscores the importance of supporting Chinese caregivers of toddlers with ASD in daily communication and across the toddlers’ long-term development. It is critical to guide caregivers to focus on the skills toddlers have already demonstrated and make targeted strategic adjustments, rather than solely addressing foreseeable difficulties.
This study revealed that, regardless of toddlers’ diagnoses, caregivers showed a common age-related pattern in the gestures they integrated with their utterances. When interacting with toddlers over 18 months of age, caregivers used more disambiguating gestures. This may reflect an increased use of pronouns, which caregivers accompany with disambiguating gestures to ensure semantic clarity. The similar trend in gesture changes among caregivers of the ASD and TD groups indicates that caregivers adjust their interaction strategies as toddlers age, and that caregivers of both groups exhibit parallel adjustment patterns. This may reflect the shared expectations of Chinese caregivers regarding toddlers’ developmental abilities.
In summary, this study represents a pioneering effort to systematically examine the characteristics of caregiver language input during caregiver–toddler play interactions in natural Chinese-language settings. By concentrating on the functional dimension, it explored the association between toddlers’ diverse engagement states and the characteristics of caregiver language input. These results deepen the understanding of the early language development environment of Chinese toddlers with ASD and supplement previous studies in English-speaking contexts. Our findings indicate that the characteristics of caregiver language input are closely associated with toddlers’ performance during interactions, consistent with the bidirectional nature of caregiver–toddler interactions. When caregivers of toddlers with ASD adjust their language strategies during interactions, there are positive aspects (e.g., frequent use of descriptions and gestures) but also suboptimal adaptations (e.g., excessive reliance on attention-getting utterances to enhance toddlers’ engagement). Additionally, the potential implications of caregivers’ infrequent use of questions for toddlers with ASD warrant further investigation. These characteristics could serve as targets for early intervention. By focusing on naturalistic caregiver–toddler play interactions, we can guide Chinese caregivers to adopt more effective strategies, thereby facilitating toddlers’ long-term development. Specific interventions targeting caregiver–toddler dyadic interactions help caregivers identify which behaviors and language successfully regulate interactions with their toddlers, as well as those that yield little benefit. Such awareness may enable Chinese caregivers to persist during unsuccessful interactions and develop strategies better suited to their toddlers’ unique needs.
While this study yielded important findings on caregiver language input to toddlers in Chinese-speaking contexts, it had several limitations. First, a larger sample size would enhance the generalizability of the findings. Due to sample size constraints, the male-to-female ratios within the ASD and TD groups were not balanced. This imbalance somewhat undermined the comparability of results and the statistical power of the multiple group comparisons. Additionally, over half of the caregivers in this study held at least a bachelor's degree, potentially limiting the generalizability of the findings. Finally, the interaction context affects caregiver language input (Thompson et al., 2024), yet this study only focused on play contexts. To better understand caregiver language input characteristics, future research could explore language input patterns during various daily routines such as dining, dressing, playtime, and bath time.
Conclusion
This study explored the unique characteristics of caregiver language input during 10-min naturalistic play interactions with toddlers with ASD in Chinese-speaking contexts. Key findings include: (a) Chinese caregivers of toddlers with ASD used a lower proportion of questions and a higher proportion of attention getters during play interactions than caregivers of TD toddlers; (b) caregivers adjusted language input based on the engagement states of toddlers with ASD; (c) caregiver language input did not differ significantly between toddlers with ASD who had spoken language during the interaction and those who did not; (d) caregivers used a lower proportion of disambiguating gestures with 12–18-month-old toddlers, whether with ASD or TD, than with 19–24-month-old toddlers. These findings indicate that caregiver language input in Chinese contexts is dynamic and context sensitive. The behavioral characteristics of Chinese toddlers with ASD might shape the unique characteristics of caregiver language input.
Supplemental Material
Supplemental material, sj-docx-1-dli-10.1177_23969415251389128 for Caregiver Language Input in Different Engagement States During Play Interactions With Toddlers With Autism: An Observational Study by Yijie Li, Shaoli Lv, Linru Liu, Leran Xue, Huishi Huang, Yu Xing, Qianying Ye, Feixia Zhang and Hongzhu Deng in Autism & Developmental Language Impairments
Footnotes
Acknowledgments
This work was supported by the Science and Technology Program of Guangzhou, China, Key Area Research and Development Program. Thanks to all the participating families for their time and commitment. Thanks to the personnel of Longitudinal Chinese Autism Study of Early Development of the Child Development and Behavior Center, the Third Affiliated Hospital of Sun Yat-Sen University.
Ethical Approval and Informed Consent Statements
The ethics review committee of the Third Affiliated Hospital of Sun Yat-Sen University approved this study on January 27, 2021 (No. [2020]02-118-02). Written informed consent for inclusion in this research was obtained from the families prior to the study.
Consent for Publication
Not applicable.
Author Contributions
Yijie Li: conceptualization, methodology, formal analysis, investigation, resources, data curation, writing—original draft, and visualization; Shaoli Lv: conceptualization, investigation, resources, and data curation; Linru Liu: methodology and formal analysis; Leran Xue: validation; Huishi Huang: investigation and resources; Yu Xing: investigation; Qianying Ye: investigation; Feixia Zhang: investigation; Hongzhu Deng: conceptualization, writing—review and editing, supervision, project administration, and funding acquisition.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Declaration of Conflicting Interest
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data Availability Statement
The data supporting the findings of this study are stored on a secure designated hard drive belonging to the Child Development and Behavior Center, Third Affiliated Hospital of Sun Yat-Sen University. Due to the licensing agreements governing their use for this research, access to these data is restricted and they are not publicly available. However, upon submission of a reasonable request and with the approval of the corresponding author, the authors will provide access to the relevant data.
Supplemental Material
Supplemental material for this article is available online.