Abstract
For proper assessment, diagnosis, and treatment of language disorders in Mandarin-speaking children, it is important to have measures that closely track the course of normal development. The current study uses a large collection of spontaneous conversational language samples to track the developmental course of five language measures: mean length of utterance (MLU), the MLU of the five longest utterances (MLU5), vocD, number of repetitions, and number of retracings. We used cross-validation-based linear regression to estimate the relationship between age, gender, and each of the five variables derived from the conversational language samples in 101 typically developing Mandarin-speaking children aged 3 to 7. Each of the five measures showed significant age-related effects during the period from age 3 to age 7. As norm-referenced measures of language development in children speaking Mandarin, these developmental data could inform clinical therapists regarding assessments and interventions for children with language disorder or impairment.
Introduction
According to the latest survey in 2023, the proportion of children with developmental language disorders in China exceeded 8.5% (S. Wu et al., 2023). Beginning in the 1970s, Chinese scholars have conducted a series of studies of language disorders (see meta-analysis, Y. B. Zhang et al., 2016). However, these studies have not yet produced the type of norm-referenced data that is needed for accurate clinical diagnosis. Without a clear normal comparison database across the full course of language development in childhood, we cannot properly assess, diagnose, and treat language disorders.
Language sample analysis (LSA) is regarded by clinicians as the gold standard procedure for evaluating expressive language ability (Bloom & Lahey, 1978; Brown, 1973; Evans, 1996; Miller et al., 2016; Oh et al., 2020). Spontaneous speech is representative of child language development and can be an effective measure in screening for children at risk of language delay (Guo et al., 2018; MacWhinney & Ratner, 2016; Paul & Norbury, 2012; Yim et al., 2015). Language samples may be elicited from speakers of any age; they are sensitive to change; they minimize cultural bias; and they can be repeated frequently (Heilmann & Westerveld, 2013). In recent years, many indicators based on the analysis of language samples have been developed. For vocabulary analysis, the most common indicators are TTR (type/token ratio) (Templin, 1957), NDW (number of different words) (Klee, 1992; Watkins et al., 1995; Wong et al., 2010), and vocD (vocabulary diversity) (Malvern & Richards, 2002). For syntactic analysis, the most common measures are Mean Length of Utterance (MLU) (Blake et al., 1993; Brown, 1973; Rice et al., 2010; Rondal et al., 1987) and longest utterance (Devescovi et al., 2005; Eisenberg et al., 2001; Strömqvist & Verhoeven, 2004). Language samples have also been analyzed by profiling systems, such as LARSP (Crystal et al., 1989), DSS (Lee et al., 1974), and IPSyn (Scarborough, 1990).
As collecting and analyzing language samples takes a lot of time (Pavelko et al., 2016), computer programs have helped standardize the LSA process. Examples include SALT (Miller & Iglesias, 2015), CLAN (MacWhinney, 2000) and SUGAR (Pavelko & Owens, 2017). LSA programs constantly update standards and analysis methods. LSA with SUGAR (Pavelko & Owens, 2017) may be completed in about 20 min, which is even less than the average time for a standardized, normative language assessment, although there can be problems with this level of analysis (Guo et al., 2018). Although both SALT (Miller & Iglesias, 2015) and the Child Language Data Exchange System (CHILDES) (MacWhinney, 2000) are now providing reference corpora for English, there is not yet a systematically constructed normal comparison corpus for child Mandarin.
The current study uses spontaneous language samples from CHILDES to build a database specifically for Mandarin Chinese. CHILDES data from English-speaking children have been configured for rapid LSA (MacWhinney & Ratner, 2016) using the KidEval command within the larger set of CHILDES programs. The numerical results of KidEval analysis for a given child can point to specific weaknesses in vocabulary, fluency, or grammar within a few minutes, saving a lot of time for further clinical practice. For English-speaking children, the LSA measures computed by KidEval are effective for diagnosing language disorders. Looking at data for children aged 2 to 7 years, MacWhinney and Ratner (2016) found that both MLU and IPSyn (MacWhinney et al., 2020; Scarborough, 1990) were successful in diagnosing language disorders. The same methods that are being used for analyzing English-speaking children’s data with KidEval can also be used for Mandarin by comparing samples with the large body of Mandarin data in CHILDES. In this study, we analyzed several core indicators representing Mandarin-speaking children’s language development through LSA to explore the value of these indicators for the diagnosis and intervention of children with clinical language problems.
Standardized Measures in Mainland China
Standardized assessment of children’s language disorders refers to the establishment of reliability, validity, and normative data through formal testing procedures. Measures must be tested according to standard procedures and must demonstrate validity and reliability (S. L. Yang, 2015). For English, there are many standardized measures for clinical diagnosis of childhood language disorder (see review, Wallace et al., 2015). In contrast, valid and reliable methods for Mandarin, based on LSA, are completely lacking. According to Y. B. Zhang et al. (2016), therapists in China are using Gesell Developmental Schedules, Bayley Scales of Infant Development, 0∼6-year-old Pediatric Examination Table of Neuropsychological Development, Sign-Significant Relations-Chinese version, and WPPSI for cognitive assessment. For language disorder assessment, clinicians are using tests that emphasize lexical abilities such as PPVT-R (Chinese Peabody Picture Vocabulary Test-Revised, Lu & Liu, 1998) and EVT-R (Chinese Expressive Vocabulary Test, Williams, 1997). One newly developed measure, DREAM (X. L. Liu et al., 2017) examines receptive and expressive language ability. A second recently developed measure, MCELP-CS (H. Wu et al., 2020), covers vocabulary comprehension (VC), sentence comprehension (SC), vocabulary naming (VN), sentence structure imitation (SSI), and story narration (SN). Although these two measures have good reported validity, they are expensive, which seriously affects their accessibility. Most importantly, none of the above tools has established an effective and public norm database in China, and none are based on LSA.
A meta-analysis of language disorder studies for Chinese has shown that different measures yield very different intervention effects/outcomes (Y. B. Zhang et al., 2016). Furthermore, tests conducted based on PPVT-R and EVT-R have produced the controversial result that receptive language skills are weaker than expressive language skills for Mandarin-speaking children (Zhou et al., 2014, 2017). In addition, children may be unfamiliar with test formats and anxious about the testing situation in ways that will affect results. Moreover, because standardized test formats are often limited to tasks such as picture naming, sentence repetition, or sentence completion, they can fail to provide a full picture of children’s expressive language ability (Gai et al., 2009). As noted earlier, the high price of language assessment instruments can make them inaccessible to many sites in China.
Utility of LSA in Mainland China: A Short Review
When used in conjunction with these standardized tests, LSA can provide better ecological validity and a fuller measure of a child’s functional language usage. The growth of LSA in China has relied in large part on the introduction of CHILDES methods in the 1990s. The first study using CHILDES corpus methods (Zhou, 2001) focused on pragmatic development of children aged 0 to 3. This study reversed the earlier tradition of child language research which relied on speculative reasoning using single sample sentences. Since that time, investigators have contributed over a dozen new Mandarin child language corpora to CHILDES. Researchers are using CLAN, the free software that comes with the CHILDES system, to automatically calculate various measures of vocabulary and grammar for children with specific language impairment (F. Zhang, 2013), children with autism (X. Y. Li, 2008), children with hearing impairment (He & He, 2009), children with mental retardation (Chen, 2007), as well as typically developing children (H. M. Li, 2003; L. H. Li, 2008; S. Y. Liu, 2018; Ng, 2017; Niu, 2018; Ouyang, 2003; X. L. Yang, 2018; S. Y. Yu, 2017).
Although the CHILDES Mandarin database includes 19 corpora collected from hundreds of children, the studies did not implement a standard elicitation method or recording of demographic information. In order to establish a database for normative validation, it will be necessary to implement elicitation standards and systematic demographic recording.
Context and Measures Used in LSA for Mandarin-Speaking Children
When collecting language samples, it is important to realize that children’s productions may be markedly affected by the sampling context (Eisenberg et al., 2001, 2018; Klein et al., 2010; Nippold et al., 2014; Rice et al., 2010; Sealey & Gilmore, 2008; Southwood & Russell, 2004). This means that a language sample taken from a child in a given context should be compared to a collection of samples from other children taken in the same context. The most common contexts for sampling include toy play (or free play), narratives, story-telling, and dinner talk. At present, the majority of samples are taken from parent-child free play (Chen, 2007; He & He, 2009; Ng, 2017; Ouyang, 2003; X. Y. Li, 2008; X. L. Yang, 2018). Less common are samples from narrative (Niu, 2018; S. Y. Yu, 2017; F. Zhang, 2013), storytelling (L. H. Li, 2011; S. Y. Liu, 2018), and dinner talk (H. M. Li, 2003). Despite the preponderance of data from free play, publications of the utility of LSA have so far focused on samples generated from narration. For example, Hao et al. (2018) compared the narrative abilities of children with language disorder and their peers and found that children with language impairment had lower lexical diversity (NDW) and fewer complex sentences. The only study published with samples elicited in free play with Chinese children focused on Cantonese-speaking children. Wong et al. (2010) found that normally developing children produced more complete and intelligible utterances and higher lexical diversity (NDW) than SLI children.
Prior work has demonstrated that lexical diversity, MLU, and MLU5 (the MLU of the five longest utterances) are good metrics for distinguishing SLI from typically developing children (Cai, 2008; Cheung, 1998; Jin & Jin, 2008; Weng, 2016; X. Y. Li, 2008; L. Yu et al., 2017; L. Zhang & Zhou, 2009; Zhou & Zhang, 2020; Zhu, 1986). For lexical diversity, researchers have used NDW, vocD, and TTR. NDW can be used to differentiate developmental language disorders from typically-developing controls (Sheng et al., 2020). Jin and Jin (2008) used vocD to explore the relationship between lexical diversity and age. They found that children’s vocabulary diversity increased with age before 66 months and that the rate of increase slowed after 66 months, suggesting that language development after that point may be dominated by grammatical development. VocD can also be used to distinguish typically-developing children from children with developmental delays (Cai, 2008). However, Wong et al. (2010) found that the combination of vocD and MLU cannot distinguish Cantonese-speaking SLI from typically-developing children. Therefore, the use of vocD in the Mandarin context still needs further discussion. Weng (2016) also analyzed the TTR of children in small, middle, and large classes and found the same negative results.
Several studies have used CLAN to calculate MLU for Chinese children (Table 1). These studies show that the MLU of Chinese children increases gradually with age. In comparison with overall MLU, L. Zhang and Zhou (2009) and L. Li (2014) found that MLU5 better reflected the grammatical level of Chinese children after four or four and a half years old. Niu (2018) also found that compound sentences appeared more and more frequently in children’s five longest sentences, reflecting the developmental trend of children’s grammatical structure from simple sentences to compound sentences, especially at the age of 4 to 5. Although many studies have examined the development of MLU or MLU5 in Mandarin-speaking children, few have concentrated on children with language disorders, especially in a clinical context. Sheng et al. (2020) found no significant difference in narrative MLU between children with developmental language disorders and their matched typically developing controls. Therefore, the exploration of the development of MLU will provide important evidence to further clarify the clinical significance of this indicator.
Table 1. MLU and MLU5 Values of Different Studies on Mandarin-Speaking Children.
Disfluencies such as repetitions and retracings have also been considered as important indicators of developmental status for typical Mandarin-speaking children (Liang, 2018; Tseng, 2006). Often, when they detect a mismatch with a stored form, children will repair or retrace words or even whole phrases (MacWhinney & Osser, 1977). These can be calculated automatically by CLAN’s KidEval and FluCalc programs. Liang (2018) analyzed the narrative corpus of 40 children aged 6 to 7 years using The Frog Story. She found that pragmatics played an important role in promoting repairing. Through interaction with other people, children continuously modified and adjusted their speech to meet their own language needs and external communication needs. However, neither study offered quantitative data for further analysis.
This review has identified four gaps in previous research. First, different language sample analysis indicators have not been explored and analyzed in a larger and more integrated framework. Second, the LSA method has not been used for developing norms for Mandarin development (see Sheng et al., 2020). Third, research has not yet examined the applicability of language sample indicators, that is, whether they reflect the level or gradient of language development of Mandarin-speaking children; and although most of the collected corpora are openly shared, no data have been used to establish a normative reference database. Fourth, although many corpora covering more children have been contributed to CHILDES, information on socioeconomic status is hard to find. Although the approach may seem old, its application here is new: this study takes language sample analysis indicators commonly used in English, combines them with indicators used in exploring the language development of Mandarin-speaking children, and explores the value of these indicators for measuring the language development of Mandarin-speaking children and for constructing a diagnostic system with normative meaning. The exploration of the indicator system will provide important support for the subsequent construction of a norm reference dataset, which can be used for preliminary diagnosis of children with clinically delayed or impaired language development, thus providing a diagnostic path based on “language in action.”
Present Study
This study aims to explore the developmental trajectory of the LSA measures of MLU, MLU5, vocD, number of retracings, and number of repetitions. These data can provide important guidance for further measurement of children with possible language impairment by delineating the upper and lower limits of development for children across different ages. For example, examining MLU and lexical diversity obtained from conversational language samples can reveal the extent to which these measures represent the grammar and vocabulary level of Chinese children. From the perspective of developmental research, the above data will further enhance our understanding of LSA for Mandarin-speaking children.
The study’s second objective is to explore gender differences between different types of indicators. Few studies have examined gender differences in the language ability of Chinese children (Cao et al., 2018). We do not know whether there are gender differences in spoken language among children with/without language disorders. This exploration will further strengthen the feasibility of applying LSA in clinical practice for children of both genders.
Method
Participants
Participants were drawn from the control group used in a larger study of children with intellectual disabilities, hearing loss, autism, and specific language impairment. These data are available in the ZhouAssessment corpus at https://childes.talkbank.org. Public access to the anonymized data was approved by the Ethics Committee of University.
We recruited 112 normally-developing children (60 boys and 52 girls) from preschools in Zibo, Shandong Province, China. The final sample consisted of 101 typically developing children (54 boys and 47 girls), because 11 families did not finish the language background survey. The children were 34 to 72 months of age (M = 51.34, SD = 12.51), and the sample was 47% female and 53% male. All children spoke Putonghua (Mandarin Chinese) at preschool, and 99 spoke Putonghua at home, while only 2 spoke Shandong dialect at home. In regard to maternal education, 7% had completed less than high school, 10% had completed high school, 3% had completed some college, 66% had graduated from college, and 14% had obtained a graduate-level or professional degree, indicating a medium to high SES compared to other studies (e.g., Y. Zhang et al., 2021). We required teachers to confirm that the children’s performance was similar to that of their peers and that they had never received special services.
Material and Procedure
Different approaches for eliciting language samples have been used by researchers in China, including spontaneous free play, peer talk, and storytelling. When attempting to assess the language of very young children, particularly those with language disorders or delay under the age of 5 (Pezold et al., 2020), spontaneous free play has the advantage of making the child comfortable enough to maximize language production. In this study, we also used spontaneous free play to obtain language samples, because this allows teachers to elicit language samples most easily. The elicitation procedure followed a tightly scripted protocol. We limited the entire play session to 20 min. Although in most cases children were still immersed in the game, we stopped the session at the 20-min mark. In order to ensure the continuity and fun of the game, we chose the dollhouse and dolls that children prefer, and placed them on the table in the play center of the kindergarten. The teacher and the child would play kitchen games and games in which guests would come to visit. Materials for promoting and extending the conversation included a big toy house with toy furniture, such as desks, chairs, beds, and cooking devices, as well as toy pigs and cats frequently shown in cartoons.
The teacher would bring the child to a play center in the preschool that was familiar to the child and would start the procedure by asking, “Shall we start playing the game?” The teacher then started recording the conversation, but only if the child answered yes. After 20 min, the play session was ended by the teacher’s question, “Shall we take a break?” The play sessions were recorded via audio-recording devices, such as cellphones.
It is difficult to fully control what teachers say in dialog with different children. We decided that it was best to allow the game process to be maximally natural to increase ecological validity. Therefore, we did not script the questions that teachers used when talking to children. In order to make the data of this research more representative and reliable, we nevertheless collected basic information about the teachers. The 6 teachers participating in this study all had bachelor’s degrees and junior teacher qualification certificates.
Reliability
The language samples were transcribed by two graduate students and rechecked by the author. The two graduate students participated in a one-week training workshop on the word and utterance segmentation guidelines held by the second author, the coordinator of CHILDES. A graduate student listened to the audio recordings while reading the transcript and entering queries and corrections directly onto the transcripts. All discrepancies were discussed and resolved by the graduate students, and 10% of each set of transcripts was double-transcribed, yielding high reliability, Cohen’s Kappa = 0.85 for all transcripts. Where they could not agree, the words or utterances were indicated as unintelligible.
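As an illustration of the agreement statistic reported above, the following minimal sketch computes Cohen's kappa for two transcribers. The unit of comparison is an assumption made for the example (word-boundary decisions coded B or N), since the coding unit is not specified here, and the labels are hypothetical rather than study data.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty label sequences of equal length")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the two raters' marginal label proportions.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    p_chance = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if p_chance == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (p_observed - p_chance) / (1 - p_chance)

# Hypothetical boundary decisions (B = boundary, N = no boundary).
transcriber_1 = list("BNBBNNBNBB")
transcriber_2 = list("BNBNNNBNBB")
kappa = cohens_kappa(transcriber_1, transcriber_2)
print(round(kappa, 2))
```

In this toy example the transcribers agree on 9 of 10 decisions, and chance agreement is 0.5, so kappa is lower than the raw agreement rate.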
Transcription and LSA
During transcription, everything the child said was written down, including pauses, repetitions, and retracings. Conversational turns were divided into utterances based on intonation patterns and Chinese grammar. All transcripts were coded in CHAT (Codes for the Human Analysis of Transcripts) format, a standardized format required for further LSA with CLAN (MacWhinney, 2000). Unintelligible utterances and single-word responses to the teacher’s questions were excluded in this study. All children produced 50 utterances or more (see Guo & Eisenberg, 2015), and in total, 9,337 utterances were transcribed, stored, and coded digitally.
Segmentation into words was conducted using a Java program developed for the CHILDES system (https://talkbank.org/morgrams/zhoseg.zip) that relied on commonly accepted segmentations from the CEDICT dictionary and the morpheme Lexicon in CLAN. The Java program would first check the morpheme Lexicon in CLAN, where children’s speech morphemes were stored and segmented based on the Mandarin rules proposed by Cheung (1998) and Zhou (2001). For morphemes not found in the Lexicon, the Java program would automatically segment them by referring to the CEDICT dictionary. If morphemes could not be found in either the Lexicon or the CEDICT dictionary, a file would be output, and the authors would segment them and add them to the Lexicon by hand based on Cheung (1998) and Zhou (2001).
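The two-tier lookup described above can be sketched as follows. This is not the zhoseg program itself: the greedy longest-match strategy, the function names, the `max_len` parameter, and the tiny word lists are all assumptions made for illustration. Only the order of consultation (primary lexicon first, fallback dictionary second, unresolved items set aside for manual review) follows the description.

```python
def longest_match(text, start, vocabulary, max_len):
    """Return the longest entry of `vocabulary` that begins at `start`, or None."""
    for size in range(min(max_len, len(text) - start), 0, -1):
        candidate = text[start:start + size]
        if candidate in vocabulary:
            return candidate
    return None

def segment(text, lexicon, fallback, max_len=4):
    """Segment `text`, consulting `lexicon` before `fallback`.

    Returns (words, unknown); `unknown` lists characters found in neither
    word list so that they can be reviewed and added by hand.
    """
    words, unknown = [], []
    i = 0
    while i < len(text):
        match = (longest_match(text, i, lexicon, max_len)
                 or longest_match(text, i, fallback, max_len))
        if match is None:
            match = text[i]        # emit the single character
            unknown.append(match)  # and flag it for manual review
        words.append(match)
        i += len(match)
    return words, unknown

# Hypothetical miniature word lists, not the actual Lexicon or CEDICT.
lexicon = {"椅子", "这里"}
fallback = {"把", "放"}
words, unknown = segment("把椅子放这里呀", lexicon, fallback)
print(words, unknown)
```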
After segmentation, all CHAT files were supplemented with a %mor tier by running the MOR command in CLAN. The %mor tier is the basis for morpheme analysis, as it labels the part of speech for each morpheme. Then, all of the language indices were calculated automatically using CLAN. In this study, we focused analysis on the three dimensions of syntactic complexity, lexical diversity, and fluency.
Syntactic complexity. We used the mean length of utterance (MLU) as the index of syntactic complexity. MLU is sensitive to age level in younger children, but this sensitivity decreases with age. In addition, we utilized MLU5, an indicator verified in previous Mandarin child language research (e.g., L. Zhang & Zhou, 2009).
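The two syntactic measures can be illustrated with a short sketch. It assumes that each utterance has already been segmented into a list of units (words or morphemes, depending on the tier analyzed); the function names and the toy utterances are hypothetical, and CLAN's own exclusion rules are not reproduced.

```python
def mlu(utterances):
    """Mean number of units (words or morphemes) per utterance."""
    if not utterances:
        raise ValueError("at least one utterance is required")
    return sum(len(u) for u in utterances) / len(utterances)

def mlu5(utterances):
    """MLU computed over the five longest utterances only."""
    longest = sorted(utterances, key=len, reverse=True)[:5]
    return mlu(longest)

# Hypothetical sample: seven utterances of 1 to 7 units each.
sample = [["w"] * length for length in range(1, 8)]
print(mlu(sample), mlu5(sample))
```

Because MLU5 discards short utterances, it is less affected by the many brief responses that are typical of conversation.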
Lexical diversity. We used vocD as the measure of lexical diversity. VocD has been widely used in the study of Chinese typically-developing children and children with language disorders (Cai, 2008; Jin & Jin, 2008; Wong et al., 2010). Higher values of vocD reflect greater lexical diversity in children’s expressive language.
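To make the measure concrete, the sketch below follows the general logic of the D statistic of Malvern and Richards (2002): mean type/token ratios are obtained from random subsamples of 35 to 50 tokens, and D is the parameter of the curve TTR = (D/N)(sqrt(1 + 2N/D) - 1) that best fits them. The coarse grid search, the single pass (rather than an average over repeated runs), the function names, and the toy token lists are simplifying assumptions; this is not CLAN's implementation.

```python
import math
import random

def model_ttr(d, n):
    """TTR predicted for a sample of n tokens given diversity parameter d."""
    return (d / n) * (math.sqrt(1 + 2 * n / d) - 1)

def estimate_d(tokens, sizes=range(35, 51), trials=100, seed=0):
    """Estimate D by fitting the model curve to mean subsample TTRs."""
    if len(tokens) < max(sizes):
        raise ValueError("too few tokens for the requested sample sizes")
    rng = random.Random(seed)
    # Empirical curve: mean TTR of random subsamples at each sample size.
    curve = []
    for n in sizes:
        ttrs = [len(set(rng.sample(tokens, n))) / n for _ in range(trials)]
        curve.append((n, sum(ttrs) / trials))

    def squared_error(d):
        return sum((model_ttr(d, n) - ttr) ** 2 for n, ttr in curve)

    # Coarse grid search over d = 0.1, 0.2, ..., 300.0 (an assumed range).
    grid = [step / 10 for step in range(1, 3001)]
    return min(grid, key=squared_error)

# Hypothetical token lists: 5 word types versus 60 word types.
low_diversity = ["a", "b", "c", "d", "e"] * 20
high_diversity = [f"w{i}" for i in range(60)] * 2
print(estimate_d(low_diversity), estimate_d(high_diversity))
```

The more varied token list receives the larger D, which is the sense in which higher values indicate greater lexical diversity.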
Fluency. We used repetition and retracing as our two measures of disfluency. Repetition is a simple repetition of certain words, as in *CHI: <fried eggs> [/] fried eggs have already been done. Retracing is a lexical or syntactic modification that yields the intended meaning, as in *CHI: <put this side> [//] put the chair here.
Although there are studies that have systematically analyzed different types of disfluencies (e.g., Liang, 2018), none of these studies have explored these features developmentally.
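The two fluency measures correspond to the CHAT codes [/] (repetition) and [//] (retracing) shown in the examples above. A minimal sketch of counting them on the child's main tier follows; the function name, the *TEA speaker code for the teacher, and the sample lines are hypothetical, and in practice these counts are produced by CLAN.

```python
import re

REPETITION = re.compile(r"\[/\]")   # CHAT code for a simple repetition
RETRACING = re.compile(r"\[//\]")   # CHAT code for a retracing

def count_disfluencies(chat_lines, speaker="CHI"):
    """Count repetition and retracing codes on one speaker's main tier."""
    prefix = f"*{speaker}:"
    counts = {"repetition": 0, "retracing": 0}
    for line in chat_lines:
        if not line.startswith(prefix):
            continue  # skip other speakers and dependent tiers
        counts["repetition"] += len(REPETITION.findall(line))
        counts["retracing"] += len(RETRACING.findall(line))
    return counts

# Hypothetical transcript lines based on the examples in the text.
lines = [
    "*CHI:\t<fried eggs> [/] fried eggs have already been done .",
    "*CHI:\t<put this side> [//] put the chair here .",
    "*TEA:\t<ok> [/] ok .",
]
counts = count_disfluencies(lines)
print(counts)
```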
Data Analysis
In order to increase the reliability of data analysis, the 101 children were randomly divided into three groups using the Excel RANK function, with 34, 34, and 33 children in each group respectively, and a cross-validation-based linear regression was conducted for each group (Group 1, Group 2, and Group 3) independently. The independent variables were age and gender; the dependent variables were MLU, MLU5, vocD, repetition, and retracing. In total, 15 separate analyses (3 groups × 5 dependent variables) were conducted.
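The analysis design can be sketched as a random three-way split followed by a separate least-squares fit of each outcome on age and gender within each group. The sketch below uses only the Python standard library; the function names, the seed, the gender coding, and the noise-free synthetic data are assumptions for illustration, and the coefficients it returns are unstandardized, unlike the standardized β values reported in the Results.

```python
import random

def split_into_groups(ids, n_groups=3, seed=0):
    """Shuffle ids and deal them into n_groups of near-equal size."""
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[g::n_groups] for g in range(n_groups)]

def solve(a, b):
    """Solve the small linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_ols(rows):
    """rows: (age_months, gender_code, outcome) -> (coefficients, r_squared)."""
    x = [[1.0, age, gender] for age, gender, _ in rows]
    y = [outcome for _, _, outcome in rows]
    k = 3  # intercept, age, gender
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(x, y)) for i in range(k)]
    coef = solve(xtx, xty)
    fitted = [sum(c * v for c, v in zip(coef, r)) for r in x]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return coef, 1 - ss_res / ss_tot

# Synthetic, noise-free data (not the study's data): outcome = 1 + 0.05 * age.
children = {i: (34 + i % 39, i % 2, 1.0 + 0.05 * (34 + i % 39)) for i in range(101)}
groups = split_into_groups(children.keys())
results = [fit_ols([children[i] for i in g]) for g in groups]
print([len(g) for g in groups])
```

With 101 children the split yields groups of 34, 34, and 33, matching the group sizes described above.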
Results
We used cross-validation-based linear regression to estimate the relationship between age, gender and each of the five variables derived from the conversational free play samples. See Table 2 for the descriptive statistics for these variables and Table 3 for linear analysis on five variables.
Table 2. Language Variables.
Table 3. Linear analysis on five variables.
*p < 0.05, **p < 0.01, ***p < 0.001.
Syntactic Complexity
The linear regression showed that age significantly predicted the size of MLU for each group [Group 1: β = 0.668, t(33) = 5.102, p < .001; Group 2: β = 0.583, t(33) = 4.034, p < .001; Group 3: β = 0.654, t(32) = 4.791, p < .001] (Figure 1), indicating a linear trajectory with age. Age and gender explained 47.3%, 35.5%, and 44.8% of the variance in the size of MLU [Group 1: F(2,31) = 13.890, p < .001; Group 2: F(2,31) = 8.522, p = .001; Group 3: F(2,30) = 12.196, p < .001] in each group respectively, though gender was not a significant predictor.

Cross-sectional trajectory of syntactic complexity (mean length of utterance [MLU]) across age.
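As a consistency check on the MLU model statistics reported above, the overall F value of a regression can be recovered from the proportion of variance explained, F = (R²/k) / ((1 − R²)/(n − k − 1)), where k is the number of predictors and n the group size. The sketch below applies this standard identity to the reported percentages; the function name is hypothetical, and small discrepancies are expected because the percentages are rounded.

```python
def f_from_r2(r_squared, n_predictors, n_obs):
    """Overall F statistic and its degrees of freedom from R-squared."""
    df_model = n_predictors
    df_resid = n_obs - n_predictors - 1
    f_value = (r_squared / df_model) / ((1 - r_squared) / df_resid)
    return f_value, df_model, df_resid

# Reported variance explained for MLU, with group sizes 34, 34, and 33.
for label, r2, n in [("Group 1", 0.473, 34), ("Group 2", 0.355, 34), ("Group 3", 0.448, 33)]:
    f_value, df1, df2 = f_from_r2(r2, 2, n)
    print(f"{label}: F({df1},{df2}) = {f_value:.2f}")
```

The recomputed values fall within a few hundredths of the reported F statistics and reproduce the reported degrees of freedom.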
The linear regression showed that age significantly predicted the size of MLU5 for group one and group three [Group 1: β = 0.588, t(33) = 4.189, p < .001 and Group 3: β = 0.718, t(32) = 5.587, p < .001] (Figure 2), indicating a linear trajectory with age. Age and gender explained 39.5% and 51.2% of the variance in the size of MLU5 [F(2,31) = 10.108, p < .001 and F(2,30) = 15.735, p < .001, respectively]. Meanwhile, age and gender significantly predicted the size of MLU5 [Group 1: β = 0.451, t(33) = 3.106, p = .004; Group 3: β = 0.355, t(33) = 2.449, p = .02], and the model explained 34.9% of the variance in the size of MLU5 [F(2,31) = 8.319, p = .001]. However, age and gender did not contribute to the variance in the size of MLU5 in group two.

Cross-sectional trajectory of syntactic complexity (MLU of the five longest utterances [MLU5]) across age.
Lexical Diversity
The linear regression showed that age significantly predicted the size of vocD for group two and group three [β = 0.397, t(33) = 2.404, p = .022 and β = 0.366, t(32) = 2.207, p = .035, respectively] (Figure 3), indicating a linear trajectory with age. Age and gender explained 16% and 18.5% of the variance in the size of vocD [F(2,31) = 2.944, p = .068 and F(2,30) = 3.405, p = .046, respectively]. Meanwhile, age was not a significant predictor in group one, and gender was not a significant predictor in any group.

Cross-sectional trajectory of lexical diversity (vocD) across age.
Disfluencies
There was an inverse relationship between age and repetition whereas there was a positive relationship between age and retracing. In other words, children retrace more frequently as they get older, whereas they do less repetition.
The linear regression showed that age significantly predicted the size of retracing only for group one [β = 0.367, t(33) = 2.232, p = .033] (Figure 4), indicating a linear trajectory with age. Age and gender explained 17.1% of the variance in the size of retracing [F(2,31) = 3.199, p = .055]. Meanwhile, age and gender were not significant predictors in the other two groups.

Cross-sectional trajectory of disfluencies (Retracings) across age.
The linear regression showed that age significantly predicted the size of repetition for group one and group two [β = −0.468, t(33) = −2.949, p = .006 and β = −0.422, t(33) = −2.743, p = .010, respectively] (Figure 5). Age explained 22.5% and 19.8% of the variance in the size of repetition [F(2,31) = 4.495, p = .019 and F(2,31) = 3.847, p = .032, respectively]. Meanwhile, age and gender did not predict the size of repetition in group three, and gender was not a significant predictor in the other two groups either.

Cross-sectional trajectory of disfluencies (Repetitions) across age.
Discussion
This study examined the developmental patterns for measures based on spontaneous conversational language samples of Mandarin-speaking children. Based on existing language research on Mandarin-speaking children, we explored the developmental trends of five indicators: MLU, MLU5, vocD, retracing, and repetition.
Syntactic complexity, as measured by MLU and MLU5, increased significantly with age. Although the regression results for MLU5 were inconsistent, with only two of the three groups showing significant age effects, the linear fit still showed an increasing trend. This result differs from Cheung’s (1998) finding that MLU is valuable for children younger than 3.5 years but may not reflect the real language level of children older than 3.5. In contrast, our study found that increases in MLU continue for each age group up to age 6. The result of the regression analysis on MLU5 was consistent with L. Zhang and Zhou (2009), who found that both MLU5 and MLU could demonstrate a developmental trend for children above age 4. It is worth mentioning that the growth trend of MLU found in this study did not include a nonlinear component such as that found for English by Channell et al. (2018) or for Mandarin by Cheung (1998). Those studies found a nonlinear relationship between syntactic complexity and age based on narrative LSA.
Prior studies on vocD have failed to find age-related trends as strong as those found here. Research on children speaking different Chinese languages (Cantonese and Mandarin) has found that vocD has different sensitivity in distinguishing between SLI and normally developing children (Cai, 2008; Wong et al., 2010). However, the above research did not explore the developmental trends of children of different ages. Our study found that vocD was significantly related to age and that this relation was linear, at least for two subgroups. In addition, as shown in Figure 3, the values of vocD in all three groups increased continuously with age. This result supports the claim from Cai (2008) that lexical diversity can be used as an index for language disorder assessment. However, the linear correlation in this study was not stable across groups, and this unstable age effect may reflect a mutual constraint between children’s vocabulary and syntactic development. Jin and Jin (2008) found that children’s lexical diversity increased before 66 months and that the rate of increase slowed after that time. These differences in findings could be caused by the ways that different speakers interacted with the children. In their study, clinical doctors talked with the children, whereas teachers conversed with children in this study. Preschool teachers are quite familiar with every child and they know how to talk effectively with children using different strategies, but clinical doctors may not be as effective in this regard.
Interestingly, we found that the number of retracings used by children increased with age, indicating that children use language more precisely to serve specific interactional needs (Liang, 2018), as they need more and more time to make adjustments and corrections in order to interact most effectively with their teacher. In addition, repetition may arise from the interactive nature of free play. In free play, children needed to constantly adjust syntactic structures to make it easier and clearer for listeners to understand (Tao, 2019). However, only one of the three groups of children showed a significant age effect, so this evidence is not strong. In our opinion, this may be closely related to the relatively small number of retracings that the children produced. Nevertheless, this indicator still deserves further attention.
The type of disfluency that leads to repetitions is very common in young typically developing children (MacWhinney & Osser, 1977), in both narrative and conversational samples. Tseng (2006) found that of all repair types used by Mandarin-speaking children, repetition was the most frequent. In this study, quantitative analysis of language samples further showed that repetition in early childhood decreased significantly with age, which is consistent with prior work from narrative-based LSA (Channell et al., 2018; Liang, 2018). For example, Channell et al. (2018) found a linear decrease in disfluency across the full age range. The decrease in repetition may reflect the improvement of children’s pragmatic abilities, as children continuously adapted their speech to the partners they were talking with (Liang, 2018). Again, this is an important variable to consider when examining language samples from individuals of all ages with language disorders, given that repetition is common during conversation.
We did not find any significant gender effects. Cao et al. (2018) found that, before age 3, girls had a higher MLU and MLT (mean length of turns) than boys. However, for older children, this difference disappeared. Gender differences may not yet be clearly differentiated in the early stages. Research on English-speaking children has found that this difference does not appear until the age of 11, while research on Chinese-speaking children suggests that it appears around the age of 7 (Huang, 2014). Nonetheless, gender remains an important factor to consider in language analysis, especially for children under 3.
Limitations and Future Directions
There are certain limitations in this study. First, the research design is cross-sectional, making it impossible to track individual developmental characteristics. Although the five indicators changed systematically with age for each subgroup as a whole, we also saw differences between individuals. The linear correlation in this study is not fully stable, because it did not hold in all three groups for MLU5, vocD, repetition, and retracing, though most indicators showed strong age effects in at least two groups. A larger corpus should be used in the future to increase the reliability of these indicators so that they are robust enough to serve as good clinical measures. In addition, a longitudinal design could reveal how the five indicators change with age in individual children. Second, demographic indicators such as family cultural and socioeconomic status should be included in future analyses. Socioeconomic status was relatively high in this sample because the children were recruited from an experimental preschool, and their mothers’ education levels were higher than in other studies (e.g., Y. Zhang et al., 2021). To broaden the applicability of the norm-referenced data set, future work should include children from families with a wider range of socioeconomic status. Third, further research should examine dialect areas other than Shandong. Fourth, it would be useful to collect these five measures together with current standardized measures such as DREAM or MCELP-CS from the same group of children, in order to examine the mutual validity of the two modes of analysis.
Conclusion
In conclusion, by exploring cross-sectional data, this study identified five measures that reflect the developmental trends of typically developing children speaking Mandarin. These five LSA-based variables provide options beyond standardized tools for evaluating Chinese children with developmental language disorders. Although the current sample size is still small, the norm-referenced data established from the five variables can help speech-language pathologists (SLPs) and kindergarten teachers screen for children who may be at risk of language disorders and develop relevant intervention plans as early as possible to help children improve their language abilities. Because LSA based on CHILDES is freely available, it should further promote the rapid development of related evaluation work.
Footnotes
Acknowledgements
The authors would like to thank the teachers and children in Zibo, China, and graduate students at East China Normal University for their collaboration in the collection and transcription of language samples. The first author would also like to express his deepest gratitude to his wife, Qian, and his son, Muyang, for their unconditional love.
Credit Author Statement
Yibin Zhang: Conceptualization, Methodology, Resources, Writing – Original Draft. Brian MacWhinney: Methodology, Editing of grammatical and lexical errors. Jing Zhou: Writing – Review & Editing, Funding Acquisition, Project Administration.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported by the National Social Science Foundation (17BYY093), the Ministry of Finance Fundamental Research Funds for the Central Universities (2023ECNU-YY1046), and the Shanghai Philosophy and Social Sciences Fund (2023EYY004).
Ethics Statement
The study was approved by the University Committee on Human Research Protection (UCHRP) of East China Normal University, Shanghai, China. All children’s parents provided written informed consent.
Data Availability Statement
Data sharing not applicable to this article as no datasets were generated or analyzed during the current study.
