Abstract

Keywords

Achievement test, preschool, early childhood, China, validity, reliability

Introduction

Early childhood is a crucial period for human development that has long-term implications for one’s life trajectories. During the years before formal schooling, brain size and structures, as well as cognitive abilities, undergo rapid development (Lenroot and Giedd, 2006). Children’s cognitive abilities, including both verbal and non-verbal skills, develop by leaps and bounds and show great malleability. Development during early childhood exerts a long-lasting influence on children’s life chances in adulthood (Duncan et al., 1998; Piek et al., 2008; Yoshikawa, 1995). Extensive studies show that there are great individual differences in cognitive development among preschool children due to various social and family factors, such as socioeconomic status (SES), parenting styles, and learning environment (Duncan et al., 1998; Yeung et al., 2002). Studies also reveal that investments in enhancing the capabilities of young children have a greater impact on achievements in adulthood than comparable investments in older children or adults (Heckman, 2000). Therefore, a reliable and valid instrument for assessing young children’s cognitive development may help researchers to identify risk factors for poor cognitive performance and assist parents and educators in helping children at risk and adopting effective interventions as early as possible. Globally, several cognitive and achievement tests are used to evaluate young children’s cognitive development as children’s achievement is closely related to their cognitive ability. Some examples of these tests are the Woodcock-Johnson test (Schrank et al., 2014), the Peabody Picture Vocabulary Test (PPVT) (Dunn and Dunn, 2007), the Wechsler Intelligence Scale for Children (WISC) (Wechsler, 2003), and the Bayley Scales of Infant and Toddler Development (Bayley, 2006). 
However, since these existing instruments were developed in Western countries in English, they may not be linguistically, culturally, or contextually appropriate for Chinese children. Linguistically, these instruments tend to measure children’s language ability more than achievement, rendering the scores inaccurate, or at best, a crude approximation of their achievement. Phonetically, the pronunciations of ‘E’ and ‘R’ are similar to ‘1’ and ‘2’ in Mandarin, respectively (Zhang, 2009). Tests that use these letters and numbers (e.g. the letter-number sequence of the WISC) may confuse both examiners and examinees – especially with young children. Culturally, terms or units such as ‘dozen,’ ‘block’ (in a street), ‘quarter’ (in a year), and ‘penny’ (in currency) are rarely used in China. Contextually, some pictures may include objects that are specific to a country. As such, children from other countries may not be able to identify these objects (Grégoire et al., 2008). For instance, fire trucks in the United States look different from those in China. In addition, recognizing electronic devices (e.g. a freestanding cooker or coffee maker) or foods unfamiliar to Chinese children (e.g. tacos or strawberries) but common in developed countries could be difficult and inappropriate for Chinese children. Moreover, since these tests are normed mainly on Western children, scores obtained by Chinese children may be misinterpreted. As revealed in the literature, Asian children – not only school-aged children, but also preschoolers – tend to outperform children from other countries on mathematical tests (Organisation for Economic Co-operation and Development [OECD], 2016; Paik et al., 2011). Therefore, Western-based tests may show a ceiling effect among Chinese children.

To address these concerns, scholars have translated and adapted some of the tests into Mandarin Chinese, such as the Chinese version of the PPVT (Gong and Guo, 1984; Lu and Liu, 1998), the Chinese version of the Wechsler Intelligence Scale for Children, fourth edition (WISC-IV-Chinese) (Zhang, 2009), and the Chinese version of the Das-Naglieri Cognitive Assessment System (DN CAS) (Deng et al., 2011). One of the downsides of these Chinese adaptations is that they were based on tests published decades ago, such as the PPVT-R in 1981, DN CAS in 1997, and WISC-IV in 2003. Given the dramatic transformations in China in the last two decades, the normative sample, items, and illustrations may be outdated (Lu et al., 2013). Another limitation is that the majority of the translated tests were validated based on small selective samples collected in relatively developed areas in China, which casts doubt on their generalizability. Although the Chinese adaptation of WISC-IV overcomes this limitation, it is designed for children aged 6 to 16 and is not suitable for those under the age of 6. WISC-IV is an intelligence quotient (IQ) test, and DN CAS is a cognitive test; neither is an achievement test, and both are relatively long tests for young children. There is a need for a relatively short achievement test for Chinese preschoolers that can be used more widely to study Chinese children’s early development.

To fill this gap, an achievement test for Chinese children – the Zhang-Yeung Test of Achievement for Chinese Children 张杨中国儿童学习能力测验 (hereafter ZY-TACC) – was developed for children aged 3 to 15 by Zhang Houcan and Wei-Jun Jean Yeung in 2012. This is the first achievement test designed in Mandarin for Chinese children starting as young as preschoolers. The test has four sections, one for each of four age groups (3–6, 7–8, 9–12, and 13–15). This instrument was used to assess Chinese children’s achievement in a nationally representative survey, the Urbanization and Child Development Study (UCDS), which is the child component of the Urbanization and Labor Migration Survey conducted by Tsinghua University between 2012 and 2013. The achievement test data collected in this study were analyzed in recent publications (see Chen, 2019; Lu et al., 2020). This paper aims to document the psychometric attributes of the portion of the ZY-TACC used for the youngest children (aged three to six) based on data collected in the UCDS and to demonstrate the utility of this instrument for early childhood development research on Chinese children.

Methods

Participants

The UCDS adopted a multi-stage stratified probability sample and covered 28 of the 31 province-level administrative divisions in the Chinese mainland. The study included 6796 children aged 0 to 15 and their primary caregivers. The mode of the study was in-person paper-and-pencil interviews conducted at their homes. The sample reflects a good national representation of children aged 15 and below. As shown in Appendix A, the distribution of age, gender, and region of the sampled children is close to that from the 2010 census. Children aged 3 and above participated in the achievement assessment. Among them, those aged 3 to 6 (n = 1495) received the section of the ZY-TACC for preschoolers, which is the focus of the assessment in this paper.

Procedure

During the field period of UCDS, trained interviewers administered the ZY-TACC in person at the target child’s home. To maximize standardization in the administration procedures, the following measures were put in place before the field work began: (a) all interviewers had to undergo a five-day training course for the study; (b) all test booklets, answer sheets, and keys were prepared and printed uniformly by the central office; (c) detailed instructions for the procedures were printed in the interviewers’ manual; and (d) scoring of the test was done back in the office after the interview so as to minimize errors.

The interviewers were instructed to request a quiet area, for example, a room or a quieter corner in the house, in which to administer the test whenever possible. The interviewers explained the test to the parents or other adults present in the home and requested no interference or assistance from adults during the test. The interviewers were also instructed to spend some time building rapport with the child before administering the test. They then presented a colored paper booklet to the child and explained what the test entailed and what the rules were. Interviewers were instructed to start the test only when the child understood the rules. The child was asked to respond verbally to the questions and was reminded to ask the interviewer to repeat a question if he or she did not hear it clearly. If the child did not know the answer to a question, the interviewer was instructed to repeat the question once without further explanation. If the child still could not answer the question, the interviewer would skip that question and move on to the next question. The interviewer would record the child’s answer for each question on the answer sheet. To reduce potential interviewer errors, scoring was done later in the office after the interview.

The achievement test consists of two parts – a verbal and a numeracy component. In each component, there are two sub-sections. In the verbal component, there is a picture vocabulary and a passage comprehension section. In the numeracy test, there is a counting/calculation and an applied problem section. The test has a total time limit of 20 minutes, with 10 minutes each for the verbal and numeracy tests. The interviewers were instructed to keep breaks and interruptions to a minimum. However, since these children were young and sometimes home circumstances were not under interviewers’ control, a pause was allowed when necessary before the child could resume. Timing needed to be adjusted in such circumstances. Demographic information such as the child’s birth date and year, gender, hukou (household registration system) status, migration status, parental education, and region of domicile, and information about the child’s behavior was collected through an interview with the child’s primary caregiver.

Analysis

In this paper, we assessed the reliability and validity of the ZY-TACC preschool portion by examining four areas: first, the normality of the raw scores and the accuracy rate for each item (that is, the percentage of children who answered each question correctly); second, the internal consistency of the instrument and the correlations between its subscales; third, a comparison of the scores by children’s demographic characteristics and family SES, and the power of these characteristics to predict children’s test scores, determined through multiple regression analyses; and fourth, the predictive validity of children’s ZY-TACC scores for their prosocial behavior and behavioral problems. The third and fourth areas were examined because existing literature reveals a correlation between children’s demographic characteristics and their achievement, as well as an association between children’s achievement and their behavioral indicators (Arnold et al., 2012). Details for the various measures used in the analyses are described below.

Measurements

The ZY-TACC preschool test comprises two components – the verbal test and the numeracy test. The verbal test captures the child’s vocabulary through a picture vocabulary subscale (14 items) and reading skills through a passage comprehension subscale (14 items). The vocabulary and reading skills are indicators of both general cognitive skills and later mastery of more complex academic skills (Duncan et al., 2007; Son and Morrison, 2010). In the picture vocabulary subscale, children were asked to name objects shown in the pictures. In the passage comprehension subscale, they were required to answer literal or inferential questions based on the pictures. For instance, a picture of nine children was presented, and children were asked to point out the child in the middle and the color of his/her clothes.

The numeracy test comprises a calculation subscale (9 items) and an applied-problem subscale (15 items), which measure two crucial dimensions of mathematical skills. In the applied-problem subscale, children were asked to respond to real-life mathematics application questions, including basic counting, addition, and subtraction problems. For example, two pictures of apples were provided, and the children were asked to count the number of apples in each picture, then point out which picture contained more apples. The calculation subscale required the children to do mathematical calculations such as 1 + 3 = ? and 14 – 6 = ? (refer to Appendix B for sample items).

In both verbal and numeracy tests, children obtained between one and three marks for a correct answer based on the difficulty of that item, but no marks if they answered incorrectly. Raw scores on each test were calculated by taking the sum of the marks a child obtained on the test. The raw scores ranged from 0 to 50 for verbal and numeracy tests respectively, with a total score of 100.
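The scoring rule above can be illustrated with a minimal sketch. The item IDs follow Table 2, but the marks assigned to each item here are hypothetical, since the actual mark for each item is not reported in this paper.

```python
# Hypothetical marks per item: the paper states only that a correct answer
# earns 1 to 3 marks depending on difficulty, not which item earns what.
ITEM_MARKS = {"A01": 1, "A02": 1, "A09": 3}  # illustrative subset only


def raw_score(responses, item_marks=ITEM_MARKS):
    """Sum the marks of correctly answered items; incorrect items earn 0."""
    return sum(item_marks[item] for item, correct in responses.items() if correct)


# One child's answers: True = correct, False = incorrect or skipped.
child = {"A01": True, "A02": False, "A09": True}
print(raw_score(child))  # 1 + 3 = 4
```

With the full set of marks, the same sum would be taken separately over the verbal items and the numeracy items, each yielding a raw score between 0 and 50.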

The items in the tests were created after careful evaluation in multiple pilot tests and revisions. In creating the tests, Chinese textbooks and published tests in other languages were consulted. All items were newly written, following the principle that no existing question should be reused. Items included in the instrument were chosen after thorough scrutiny in each pilot test to ensure that they were culturally and contextually appropriate. For example, children from some regions are not familiar with strawberries. Therefore, rare fruits were replaced by more common fruits in the test. Likewise, fire trucks in China look different from those in the USA, so a picture of a Chinese fire truck was used in the test. Different items were developed to gauge the appropriate levels for each age. Subsequently, items were combined for multiple ages. Several rounds of pilot tests were conducted among kindergarteners aged 3 to 6 in multiple sites in Zhuhai, Xi’an, and Beijing with sample sizes ranging from 50 to 150 children. Revisions were made based on the results of the pretests. The instrument is proprietary. Contact the corresponding author of this paper for more information.

Region

To examine the regional differences, the 28 provincial units covered in the UCDS were categorized into three broad regions: East, Middle, and West.¹

Parental education

Parental education was categorized into three broad groups for ease of interpretation: Less educated (primary school or below); moderately educated (lower middle school); and highly educated (high school/technical school or above).

Children’s migration status

We divided children into four groups: urban children living with both parents (urban local children);² rural children living with both parents (rural local children); children of rural origin who migrated to the city with parents (migrant children);³ and children of rural origin who were left behind by parents who migrated to cities for work (left-behind children).⁴

Positive behavior

Primary caregivers rated their child’s behavior on 10 items, such as whether the child is careful and organized, and not impulsive, using a 5-point scale (1 = not at all like my child, 5 = totally like my child). These items are used in other national surveys such as the Panel Study of Income Dynamics in the USA. We took the mean of the items to construct a positive behavior index. The reliability of the index is good (Cronbach’s α = 0.80).

Behavior problems

A behavior problems index was constructed based on a list of 26 items from the behavior problems scale (Peterson and Zill, 1986). Primary caregivers rated their child’s frequency of externalizing and internalizing behavior problems, such as anxiety, depression, aggression, and hyperactivity on a 3-point scale (1 = not true, 3 = often true). The answers were averaged to create a behavior problems index. The reliability of the index is high (Cronbach’s α = 0.87).
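The Cronbach’s α values reported for the two behavior indices follow the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of the total score). The sketch below is a generic illustration on made-up ratings, not the UCDS data; the function name is our own.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """item_scores: one list per item, each holding one rating per child."""
    k = len(item_scores)
    # Total score per child = sum of that child's ratings across all items.
    totals = [sum(ratings) for ratings in zip(*item_scores)]
    item_var_sum = sum(variance(item) for item in item_scores)
    return k / (k - 1) * (1 - item_var_sum / variance(totals))


# Two made-up items rated for four children.
items = [[1, 2, 3, 4], [2, 1, 4, 3]]
print(round(cronbach_alpha(items), 2))  # 0.75
```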

Results

Assessing normality and accuracy rate of the items

Table 1 shows the verbal and numeracy raw scores. As Hemphill (2003) suggested, for large samples, an absolute skewness value greater than 2 and an absolute kurtosis value greater than 7 may indicate non-normality. In Table 1, all absolute values of skewness were smaller than 2, and all kurtosis values were smaller than 7, indicating that neither the verbal nor the numeracy scores departed substantially from a normal distribution.

Table 1.

Mean and standard deviation by age and test.

Age in years	n	M	SD	Skewness	Kurtosis
Verbal
3	397	18.31	8.98	0.42	3.11
4	390	22.66	9.08	0.35	3.10
5	411	26.26	9.20	−0.12	3.12
6	297	29.27	9.06	−0.17	3.34
Total	1495	23.81	9.91	0.10	2.77
Numeracy
3	397	13.97	10.72	0.93	3.65
4	390	20.58	10.98	0.34	2.83
5	411	26.89	10.77	−0.17	2.72
6	297	31.88	9.99	0.57	3.40
Total	1495	22.81	12.51	0.07	2.23
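The normality screen applied to Table 1 can be sketched as follows. This is an illustration under an assumption: the kurtosis values in Table 1 cluster near 3, so we assume the unadjusted moment-based statistic for which a normal distribution scores 3. The paper does not state which convention was used, and the function names are our own.

```python
def skewness_kurtosis(scores):
    """Moment-based skewness and (non-excess) kurtosis of a list of scores."""
    n = len(scores)
    mean = sum(scores) / n
    m2 = sum((x - mean) ** 2 for x in scores) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in scores) / n  # third central moment
    m4 = sum((x - mean) ** 4 for x in scores) / n  # fourth central moment
    return m3 / m2 ** 1.5, m4 / m2 ** 2


def passes_screen(scores, max_abs_skew=2.0, max_abs_kurtosis=7.0):
    """True when |skewness| < 2 and |kurtosis| < 7, the thresholds cited in the text."""
    skew, kurt = skewness_kurtosis(scores)
    return abs(skew) < max_abs_skew and abs(kurt) < max_abs_kurtosis


skew, kurt = skewness_kurtosis([1, 2, 3, 4, 5])
print(round(skew, 2), round(kurt, 2))  # 0.0 1.7
print(passes_screen([1, 2, 3, 4, 5]))  # True
```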

Accuracy rate, also called “pass rate” in the literature, refers to the percentage of children who answered each question correctly. The accuracy rate on each item was calculated and is displayed in descending order by subscale in Table 2. Overall, both the verbal and numeracy tests showed a satisfactory range of difficulty, with the accuracy rates of most items falling between 10% and 90%; only five items fell outside this range (A01 and A02 above 90%; A17, A27, and B24 below 10%). This suggests that the items can distinguish children with different competencies, since most are neither too easy nor too difficult.

Table 2.

Accuracy rate of each item by subscale in descending order.

Verbal				Numeracy
Picture vocabulary		Passage comprehension		Calculation		Applied problem
Item	Proportion	Item	Proportion	Item	Proportion	Item	Proportion
A02	97%	A20	86%	B12	64%	B01	88%
A01	95%	A22	76%	B15	56%	B02	81%
A03	90%	A14	75%	B13	53%	B05	79%
A08	89%	A24	52%	B19	46%	B07	78%
A05	86%	A19	51%	B17	32%	B04	74%
A13	86%	A21	47%	B14	29%	B08	71%
A04	77%	A18	44%	B16	27%	B10	66%
A11	63%	A23	37%	B20	20%	B03	59%
A12	60%	A25	36%	B18	14%	B09	59%
A06	56%	A28	30%			B11	52%
A10	54%	A26	26%			B06	41%
A07	51%	A16	17%			B22	19%
A15	48%	A27	9%			B21	18%
A09	23%	A17	7%			B23	15%
						B24	9%
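As a minimal sketch, the accuracy rate of an item is the share of children who answered it correctly, and Table 2 can be screened against a 10%–90% band. The function names are our own; the rates in `table2_subset` are copied from Table 2, while the two-child response data are invented for illustration.

```python
def accuracy_rates(responses):
    """responses: one dict per child, mapping item ID -> True/False (correct or not)."""
    items = responses[0].keys()
    return {item: sum(r[item] for r in responses) / len(responses) for item in items}


def outside_band(rates, low=0.10, high=0.90):
    """Item IDs whose accuracy rate is below `low` or above `high`."""
    return [item for item, p in rates.items() if p < low or p > high]


# Invented responses from two children on two placeholder items.
print(accuracy_rates([{"X": True, "Y": False}, {"X": True, "Y": True}]))
# {'X': 1.0, 'Y': 0.5}

# Selected accuracy rates copied from Table 2.
table2_subset = {"A02": 0.97, "A01": 0.95, "A03": 0.90, "B01": 0.88,
                 "A27": 0.09, "A17": 0.07, "B24": 0.09}
print(outside_band(table2_subset))  # ['A02', 'A01', 'A27', 'A17', 'B24']
```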

Internal consistency and subscale correlations

To evaluate the reliability of the instruments, the coefficient omega, subscale correlation, and split-half reliability were examined (see Table 3). The values for the coefficient omega were 0.83 for the picture vocabulary subscale and 0.82 for the passage comprehension subscale in the verbal test, and 0.89 for the calculation subscale and 0.92 for the applied problem subscale in the numeracy test. The relatively high omega values indicate that the individual items in the same subscale measure similar ability. The Pearson correlation coefficients showed a high correlation between the subscales of each instrument (0.64 for the two subscales of the verbal test, and 0.70 for the two subscales of the numeracy test), which suggests that the subscales of each instrument measure different yet related abilities (Urbina, 2014).

Table 3.

Internal consistency and subscale correlation.

Test	No. of items	Coefficient Omega	Split-half Reliability	Subscale Correlation
Verbal
Picture vocabulary	14	0.83	0.87**	0.64**
Passage comprehension	14	0.82	0.87**	0.64**
Numeracy
Calculation	9	0.89	0.92**	0.70**
Applied problem	15	0.92	0.92**	0.70**

**p <.01.

The split-half method was employed to compute the internal consistency of the tests. The verbal and numeracy tests were each divided into parallel halves using an odd-even split. Both halves were relatively balanced in terms of item difficulty and content. The split-half reliability estimates for the verbal and numeracy tests were 0.87 and 0.92, respectively (p < .01), indicating a high internal consistency of the tests.
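An odd-even split-half estimate of the kind described above can be sketched as follows. The Spearman-Brown step-up applied at the end is an assumption on our part: it is the conventional correction for halving the test length, but the paper does not state whether it was used. The data are invented.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def split_half_reliability(item_matrix):
    """item_matrix: one row per child, one column per item (marks earned)."""
    odd = [sum(row[0::2]) for row in item_matrix]   # items 1, 3, 5, ...
    even = [sum(row[1::2]) for row in item_matrix]  # items 2, 4, 6, ...
    r = pearson_r(odd, even)
    return 2 * r / (1 + r)  # Spearman-Brown step-up to full test length (assumed)


# Four invented children answering a four-item test (1 = correct, 0 = incorrect).
data = [[1, 1, 1, 1], [0, 0, 0, 0], [1, 1, 0, 0], [1, 0, 1, 0]]
print(round(split_half_reliability(data), 3))  # 0.625
```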

Assessing variation by demographic and socioeconomic characteristics

We further examined whether children with different characteristics, such as age, gender, hukou status, region, children’s migration status, and family SES measured by parental education, showed different levels of achievement on the ZY-TACC (refer to Appendix C for variable descriptions). To assess the age trend of children’s verbal and numeracy scores, a norm was constructed for each test using the bootstrap method (refer to Appendix D for the norming process). Norm curves of verbal and numeracy scores (Figure 1a and Figure 1b) showed that 3- to 6-year-old children’s verbal and numeracy abilities increased with age.

Figure 1.

(a) Fitted verbal norm curve by age in months. Points and fitted line missing for 82 months and 83 months because of small number of cases. (b) Fitted numeracy norm curve by age in months. Points and fitted line missing for 82 months and 83 months because of small number of cases.
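The norming procedure itself is described in Appendix D and is not reproduced here. As a generic illustration of the bootstrap idea only, and not the authors’ procedure, the sketch below resamples the scores observed at one age (in months) with replacement to obtain a percentile interval around the mean score. All names, defaults, and scores are our own assumptions.

```python
import random
from statistics import fmean


def bootstrap_mean_interval(scores, n_resamples=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for the mean of `scores`."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    means = sorted(
        fmean(rng.choices(scores, k=len(scores)))  # resample with replacement
        for _ in range(n_resamples)
    )
    lo_idx = int((1 - level) / 2 * n_resamples)
    hi_idx = int((1 + level) / 2 * n_resamples) - 1
    return means[lo_idx], means[hi_idx]


# Invented raw scores for children of one age in months.
scores_at_48_months = [12, 18, 20, 22, 25, 27, 30, 31, 35, 40]
low, high = bootstrap_mean_interval(scores_at_48_months)
print(low <= fmean(scores_at_48_months) <= high)
```

Repeating this for each age in months and smoothing the resulting points would give a curve of the kind shown in Figure 1, although the actual fitting method may differ.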

Next, we compared children’s test scores by gender. Existing literature shows mixed findings on the gender difference in children’s achievement. Some studies indicate that girls may outperform boys in verbal skills (Halpern and LaMay, 2000) or mathematical abilities (Kimball, 1989), while others suggest that gender differences are negligible (Hyde et al., 1990; Hyde and Linn, 1988). As shown in Figure 2, there were no significant gender differences in either the verbal or the numeracy scores, as the fitted lines for boys and girls nearly overlap. These results indicate that boys and girls performed equally well on both tests.

Figure 2.

The development of verbal and numeracy ability as a function of age and gender.

We further evaluated whether children’s scores vary across different family SES, which was captured by paternal and maternal education. There is a consistent finding across countries that parental education is a strong predictor of children’s achievement. Children whose parents are highly educated tend to reach higher achievement (Antonovics and Goldberger, 2005; Carneiro and Ginja, 2016; Holmlund et al., 2011). As seen in Figure 3, children’s performance was positively correlated with their fathers’ education. Children of highly educated fathers scored higher than their peers with moderately educated fathers on both tests, and both groups outperformed children of less-educated fathers. The gaps between the children whose fathers had the highest and those whose fathers had the lowest educational levels were large and significant for all ages. A similar pattern was found across the mothers’ educational levels (Figure 4); that is, the higher the mothers’ education, the higher the children’s test scores. As shown in these analyses, the disparity in children’s achievement across parental education, captured by the ZY-TACC, is consistent with findings in previous literature.

Figure 3.

The development of verbal and numeracy ability as a function of age and paternal education.

Figure 4.

The development of verbal and numeracy ability as a function of age and maternal education.

Next, we divided respondents according to their rural or urban hukou statuses. A vast body of literature shows that Chinese children with urban hukou outperform their rural hukou counterparts because they generally have parents with higher SES and live in an organized home with rich cognitive stimuli, attend a better school, and enjoy a superior community environment (Qian and Smyth, 2008; Xu and Xie, 2015; Yeung and Gu, 2015). Consistent with previous literature, we found a moderate and stable gap between the fitted lines for the urban sample and rural sample, with urban children having higher scores (Figure 5).

Figure 5.

The development of verbal and numeracy ability as a function of age and hukou status.

We further examined children’s test scores by region. The literature points to an evident regional disparity in educational attainment in China due to the highly uneven development, with children in western China lagging significantly behind those in eastern and central China (Qian and Smyth, 2008; Yue et al., 2016). As expected, children from the western region fared significantly worse than their counterparts in other regions on both tests (Figure 6).

Figure 6.

The development of verbal and numeracy ability as a function of age and region.

We conducted one more analysis to assess the predictive validity of the test scores here – by children’s and parents’ migration status. In the literature, researchers found that children’s achievement follows a descending order: urban children, migrant children, rural children, then left-behind children (Lee, 2011; Xu and Xie, 2015; Yeung and Gu, 2015). Figure 7 indicates notable achievement gaps between children with different migration statuses. Consistent with previous studies, on both verbal and numeracy tests, urban local children scored the highest, followed by rural-to-urban migrant children, then rural local children and then rural left-behind children. The gaps between urban local children and rural local children, and between migrant children and rural local children, were significant on both tests. The gaps between rural left-behind children and rural local children were significant on the verbal test but not on the numeracy test.

As shown in the analyses above, the children’s verbal and numeracy scores revealed significant distinctions across demographic characteristics and family SES. Multivariate regression was employed to evaluate the predictive power of these variables. We expected that demographic factors and family SES would significantly predict children’s cognitive skills. Hukou status and children’s migration status were added into the model separately, since they were highly correlated. Because the data were nested in 28 provincial units, we estimated the robust Eicker-Huber-White standard errors clustered by province. The verbal and numeracy scores were standardized by age before the analysis.
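One plausible reading of “standardized by age” is a z-score computed within each age group, so that the regression coefficients are expressed in standard-deviation units. The sketch below assumes grouping by age in years and the sample standard deviation; the paper does not specify either choice, and the data are invented.

```python
from collections import defaultdict
from statistics import mean, stdev


def standardize_by_age(records):
    """records: list of (age_in_years, raw_score); returns z-scores in the same order."""
    by_age = defaultdict(list)
    for age, score in records:
        by_age[age].append(score)
    # Mean and sample standard deviation of the raw scores within each age group.
    stats = {age: (mean(s), stdev(s)) for age, s in by_age.items()}
    return [(score - stats[age][0]) / stats[age][1] for age, score in records]


records = [(3, 10), (3, 20), (3, 30), (4, 40), (4, 60)]
print([round(z, 2) for z in standardize_by_age(records)])
# [-1.0, 0.0, 1.0, -0.71, 0.71]
```

The cluster-robust standard errors mentioned above are not sketched here; statistical packages typically provide them as an option when fitting the regression.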

Figure 7.

The development of verbal and numeracy ability as a function of age and children’s migration status.

Table 4 shows no significant gender difference in verbal and numeracy scores. Both fathers’ and mothers’ education were positively associated with children’s test scores. More specifically, children of less-educated parents significantly lagged behind their peers with moderately and highly educated parents on the verbal test. Likewise, fathers’ education significantly predicted their children’s numeracy scores, and children of less-educated mothers underperformed their peers with highly educated mothers, although this coefficient was only marginally significant (p = .075, not presented in Table 4). Furthermore, children with rural hukou scored lower than their urban counterparts, and children living in the eastern region outscored their counterparts in the west. In models 2 and 4, children’s migration status replaced hukou status in the analysis. For both tests, urban local children and migrant children outperformed rural local children. The left-behind children did not significantly lag behind their rural local counterparts on either test. The results of the multivariate analyses reveal that parental education, hukou status, children’s migration status, and region are significant predictors of children’s verbal and numeracy scores.

Table 4.

Multivariate regression with robust SE on standardized verbal and numeracy scores.

	Verbal		Numeracy
	Model 1	Model 2	Model 3	Model 4
Gender (boy)	0.0143	0.0249	−0.0184	−0.0245
	(0.0534)	(0.0484)	(0.0619)	(0.0613)
Father's education (less educated)
Moderately educated	0.142*	0.143*	0.254**	0.246**
	(0.0627)	(0.0682)	(0.0734)	(0.0817)
Highly educated	0.325**	0.275**	0.407**	0.402**
	(0.0916)	(0.0930)	(0.0791)	(0.0837)
Mother's education (less educated)
Moderately educated	0.137**	0.145**	0.0878	0.0881
	(0.0458)	(0.0398)	(0.0695)	(0.0708)
Highly educated	0.302**	0.272**	0.209	0.177
	(0.104)	(0.0980)	(0.113)	(0.108)
Hukou status (rural)	−0.209*		−0.207*
	(0.0926)		(0.0914)
Region (East)
Middle	−0.0694	0.00521	−0.00474	0.0509
	(0.0809)	(0.0825)	(0.0756)	(0.0749)
West	−0.262**	−0.152	−0.257*	−0.178
	(0.0687)	(0.0847)	(0.107)	(0.116)
Children's migration status (rural local children)
Urban local children		0.327**		0.239*
		(0.113)		(0.112)
Migrant children		0.282**		0.269**
		(0.101)		(0.0740)
Left-behind children		−0.157		−0.0528
		(0.0881)		(0.0673)
Constant	−0.0679	−0.367**	−0.0609	−0.349**
	(0.106)	(0.0970)	(0.134)	(0.109)
N	1451	1307	1451	1307
R-squared	0.088	0.101	0.084	0.089

Note: Robust standard errors in parentheses.

**p < .01, *p < .05.

Association with behavioral indicators

Our final assessment of the predictive validity of the ZY-TACC examined its predictive power for children’s behavioral outcomes. Extant literature suggests that the development of prosocial behavior may require a certain level of cognitive competence (Moore et al., 2001). Furthermore, extensive studies show a significant positive correlation between young children’s cognitive skills (e.g. reading ability, verbal fluency, and literacy skills) and their prosocial behavior, and a negative association with their behavior problems, such as anxiety and aggression (Arnold et al., 2012; Huang et al., 2015; Miles and Stipek, 2006).

We regressed children’s positive behavior and behavior problems on their standardized test scores. As seen in Table 5, after controlling for children’s demographic characteristics, their verbal scores were positively and significantly related to their positive behavior (model 1), but their numeracy scores were not (model 2). Similarly, the verbal score was negatively related to children’s behavior problems, although the association was only marginally significant at the .1 level (model 3, p = .088, not presented in Table 5), while the numeracy score was not associated with behavior problems (model 4). Overall, these findings are consistent with the existing literature (Arnold et al., 2012; Miles and Stipek, 2006).

Table 5.

Multivariate regression with robust SE on positive behavior and behavior problems.

	Positive behavior		Behavior problems
	Model 1	Model 2	Model 3	Model 4
Verbal standardized score	0.0531*		−0.0174
	(0.0214)		(0.00987)
Numeracy standardized score		−0.0115		−0.0131
		(0.0220)		(0.0111)
Age	0.0558**	0.0584**	−0.0217**	−0.0213**
	(0.0127)	(0.0141)	(0.00700)	(0.00704)
Gender (boy)	−0.0855*	−0.0860*	0.0146	0.0141
	(0.0343)	(0.0341)	(0.0134)	(0.0131)
Hukou (rural)	−0.0643	−0.0929	0.00743	0.00953
	(0.0490)	(0.0509)	(0.0231)	(0.0230)
Constant	3.842**	3.856**	1.520**	1.517**
	(0.0636)	(0.0646)	(0.0344)	(0.0345)
N	1426	1426	1425	1425
R-squared	0.031	0.022	0.012	0.010

Note: Robust standard errors in parentheses.

**p < .01, *p < .05.

Discussion

This paper validates the Zhang-Yeung Test of Achievement for Chinese Children (ZY-TACC) for children aged three to six in a multi-stage stratified probability sample of Chinese preschool children. The instrument comprises four different components of achievement skills. Content validity was supported by the extensive experience of Zhang and Yeung, who specialize in child development research and assessment tools, and by careful revisions based on results from multiple pilot tests in different sites. The instrument shows a satisfactory level of difficulty to distinguish among children with different levels of ability. The test also shows high internal consistency within each subscale and high inter-correlation between the subscales of both the verbal and numeracy tests. These findings suggest that the instrument is reliable.

This instrument also exhibits significant ability to distinguish among children of different ages and varying family backgrounds as previous literature has shown. The variations in children’s scores across age, gender, parental education, hukou status, region, and children’s migration status were all in the expected directions. The older children outperformed the younger children, implying that the ZY-TACC is a developmental instrument that captures the changes in children’s competencies as they age. This instrument showed negligible gender differences in both the descriptive analysis and the multivariate analysis. Consistent with the findings in the literature, achievement gaps were observed between children with different family SES. Using parental education as a proxy for family SES, we found an educational gradient on children’s performance – the higher the parental education, the higher the children’s scores. In addition, we observed distinctions among children living in different regions with different hukou statuses. Western rural children fared worse than their eastern urban counterparts. Regarding children’s migration status, a descending order of children’s performance was observed: urban local children; migrant children; rural local children; and, finally, rural left-behind children. Results of the multivariate regression also confirmed the predictive power of the above-mentioned factors for children’s achievement scores, as well as the predictive power of the verbal scores for children’s positive behavior. Evidence gathered in this study suggests that the ZY-TACC is a psychometrically robust, culturally and contextually appropriate measure of three- to six-year-olds’ achievement in China.

This study provides evidence for the validity and reliability of the ZY-TACC. The instrument can be used to identify risk factors for poor cognitive performance among preschool children; such identification would facilitate timely interventions to help children at risk. In evaluating the ZY-TACC, our study is among the first to assess preschool children’s cognitive development using a large-scale, nationally representative sample. It provides generalizable findings on how preschool children’s cognitive development varies across family backgrounds. The findings on the impacts of children’s demographic factors and family SES on their cognitive skills are consistent with the literature. These findings not only lend support to the face, content, and construct validity of the ZY-TACC but also add to the empirical evidence on the impact of family factors on early childhood development.

We have only limited evidence for the criterion validity of the ZY-TACC. As noted at the outset of this paper, most existing tests for young Chinese children are either dated or not comparable achievement tests for Chinese preschoolers that could serve as a reference (hence our motivation for developing this instrument). In a test on a small sample of children, ZY-TACC scores correlated highly with the Chinese norm of the Wechsler Preschool & Primary Scale of Intelligence (WPPSI-IV Chinese version), which was published after the ZY-TACC was developed (Wechsler, 2014), but more evidence is needed to provide strong support. Future studies can also evaluate the test’s validity in predicting children’s later academic performance.

Overall, we have demonstrated that the ZY-TACC is a useful tool for assessing Chinese preschoolers’ achievement in a relatively short time (less than 20 minutes). The brevity of the test means that it can be used more easily in studies than tests that take significantly longer to administer. This instrument thus fills a gap in the tools available for research on early childhood development in China. The instrument can be administered in schools, hospitals, or other settings, as well as at home. As with any other test, well-trained interviewers, standardized procedures, and a quiet location with little interference are key to its success. In studies conducted with a large number of interviewers, standardizing the administration procedures can be a significant challenge, and thus interviewer training and pre-field preparations become even more crucial. To achieve greater standardization, the instrument can be digitized to further minimize interviewer effects.
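The criterion-validity check discussed above, correlating ZY-TACC scores with scores on an external instrument, can be sketched as a Pearson correlation. This is a minimal illustration under stated assumptions, not the authors' procedure: the function name and both score lists are hypothetical, and the paper does not report the underlying data here.

```python
from math import sqrt


def pearson_r(x, y):
    """Return the Pearson correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least two scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products of deviations from the means.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each list.
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("scores must vary in both lists")
    return cross / sqrt(ss_x * ss_y)


# Hypothetical scores for the same five children on two instruments.
zy_tacc_scores = [1, 2, 3, 4, 5]
criterion_scores = [2, 4, 5, 4, 5]

r = pearson_r(zy_tacc_scores, criterion_scores)
print("Correlation with criterion measure:", round(r, 3))
```

A coefficient near 1 would indicate that children who score high on one instrument also tend to score high on the other; with a small sample, such an estimate is imprecise, which is why the text calls for more evidence.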

Footnotes

Acknowledgements

The authors are grateful to the Tsinghua University China Data Center for conducting the Urbanization and Labor Migration Survey and the Urbanization and Child Development Study and for making the data available for analysis. We acknowledge the contributions of Professors Liu Jingming, Lu Yao, Donald Treiman, and Zhang Houcan to the Urbanization and Child Development Study.

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The authors gratefully acknowledge support from the Lippo Group and the OUE Limited to the corresponding author.

ORCID iDs

Wei-Jun Jean Yeung

Xuejiao Chen

Notes

References

Antonovics and Goldberger (2005) Does increasing women's schooling raise the schooling of the next generation? Comment. American Economic Review 95(5): 1738–1744.

Arnold, Kupersmidt, Voegler-Lee, et al. (2012) The association between preschool children's social functioning and their emergent academic skills. Early Childhood Research Quarterly 27(3): 376–386.

Bayley (2006) Bayley Scales of Infant and Toddler Development: Bayley-III (Vol. 7). San Antonio: Harcourt Assessment.

Carneiro and Ginja (2016) Partial insurance and investments in children. The Economic Journal 126(596): F66–F95.

Chen (2019) How income shapes preschool children's development in China. PhD thesis, National University of Singapore, Singapore. Available at: https://scholarbank.nus.edu.sg/handle/10635/164173

Deng C-p, Liu, Wei, et al. (2011) Latent factor structure of the Das-Naglieri Cognitive Assessment System: A confirmatory factor analysis in a Chinese setting. Research in Developmental Disabilities 32(5): 1988–1997.

Duncan, Dowsett, Claessens, et al. (2007) School readiness and later achievement. Developmental Psychology 43(6): 1428–1446.

Duncan, Yeung, Brooks-Gunn, et al. (1998) How much does childhood poverty affect the life chances of children? American Sociological Review 63(3): 406–423.

Dunn and Dunn (2007) PPVT-4: Peabody Picture Vocabulary Test. Minneapolis: Pearson Assessments.

Gong and Guo (1984) An intelligence screening test for preschool and primary school children – picture vocabulary test. Acta Psychologica Sinica 16(4): 46–55.

Grégoire, Georgas, Saklofske, et al. (2008) Cultural issues in clinical use of the WISC-IV. In: Prifitera A, Saklofske DH and Weiss LG (eds) WISC-IV Clinical Assessment and Intervention, 2nd ed. San Diego: Elsevier Academic Press, 517–544.

Halpern and LaMay (2000) The smarter sex: A critical review of sex differences in intelligence. Educational Psychology Review 12(2): 229–246.

Heckman (2000) Invest in the very young. Chicago: Ounce of Prevention Fund. Available at: https://www.theounce.org/wp-content/uploads/2017/03/HeckmanInvestInVeryYoung.pdf (accessed 20 June 2018).

Hemphill (2003) Interpreting the magnitudes of correlation coefficients. American Psychologist 58(1): 78–79.

Holmlund, Lindahl and Plug (2011) The causal effect of parents' schooling on children's schooling: A comparison of estimation methods. Journal of Economic Literature 49(3): 615–651.

Huang and Xie (2015) Cognitive ability: Social correlates and consequences in contemporary China. Chinese Sociological Review 47(4): 287–313.

Hyde, Fennema and Lamon (1990) Gender differences in mathematics performance: A meta-analysis. Psychological Bulletin 107(2): 139.

Hyde and Linn (1988) Gender differences in verbal ability: A meta-analysis. Psychological Bulletin 104(1): 53–69.

Kimball (1989) A new perspective on women's math achievement. Psychological Bulletin 105(2): 198–214.

Lee M-H (2011) Migration and children's welfare in China: The schooling and health of children left behind. The Journal of Developing Areas 44(2): 165–182.

Lenroot and Giedd (2006) Brain development in children and adolescents: Insights from anatomical magnetic resonance imaging. Neuroscience & Biobehavioral Reviews 30(6): 718–729.

Liu (1998) Xiuding bibaode tuhua cihui ceyan (Peabody Picture Vocabulary Test-Revised). Taipei: Psychological Publishing (in Chinese).

Wong LLN, Wong AMY, et al. (2013) Development of a Mandarin expressive and receptive vocabulary test for children using cochlear implants. Research in Developmental Disabilities 34(10): 3526–3535.

Yeung WJJ and Treiman (2020, in press) Parental migration and children’s psychological and cognitive development in China: Differences and mediating mechanisms. Chinese Sociological Review.

Miles and Stipek (2006) Contemporaneous and longitudinal associations between social behavior and literacy achievement in a sample of low-income elementary school children. Child Development 77(1): 103–117.

Moore, Barresi and Thompson (2001) The cognitive basis of future-oriented prosocial behavior. Social Development 7(2): 198–218.

Organisation for Economic Co-operation and Development [OECD] (2016) PISA 2015 results in focus. Paris: OECD Publishing. Available at: https://www.oecd.org/pisa/pisa-2015-results-in-focus.pdf (accessed 3 June 2018).

Paik, van Gelderen, Gonzales, et al. (2011) Cultural differences in early math skills among U.S., Taiwanese, Dutch, and Peruvian preschoolers. International Journal of Early Years Education 19(2): 133–143.

Peterson and Zill (1986) Marital disruption, parent-child relationships, and behavior problems in children. Journal of Marriage and the Family 48(2): 295–307.

Piek, Dawson, Smith, et al. (2008) The role of early fine and gross motor development on later motor and cognitive ability. Human Movement Science 27(5): 668–681.

Qian and Smyth (2008) Measuring regional inequality of education in China: Widening coast–inland gap or widening rural–urban gap? Journal of International Development 20(2): 132–144.

Schrank, Mather and McGrew (2014) Woodcock-Johnson IV Tests of Achievement. Rolling Meadows: Riverside.

Son S-H and Morrison (2010) The nature and impact of changes in home learning environment on development of language and academic skills in preschool children. Developmental Psychology 46(5): 1103–1118.

Urbina (2014) Essentials of Psychological Testing. Hoboken: John Wiley & Sons.

Wechsler (2003) Wechsler Intelligence Scale for Children-WISC-IV. San Antonio: The Psychological Corporation.

Wechsler (2014) Wechsler Preschool and Primary Scale of Intelligence – Fourth CN Edition (WPPSI-IV CN) (Li Y and Zhu J trans, eds). King-May Company China.

Xie (2015) The causal effects of rural-to-urban migration on children's wellbeing in China. European Sociological Review 31(4): 502–519.

Yeung W-JJ (2015) Left behind by parents in China: Internal migration and adolescents’ well-being. Marriage and Family Review 52(1–2): 127–161.

Yeung W-JJ, Linver and Brooks-Gunn (2002) How money matters for young children's development: Parental investment and family processes. Child Development 73(6): 1861–1879.

Yoshikawa (1995) Long-term effects of early childhood programs on social outcomes and delinquency. The Future of Children 5(3): 51–75.

Yue, Sylvia, Bai, et al. (2016) The effect of maternal migration on early childhood development in rural China. Social Science Research Network (SSRN). Available at: https://ssrn.com/abstract=2890108 or http://dx.doi.org/10.2139/ssrn.2890108 (accessed 18 June 2018).

Zhang (2009) The revision of WISC-IV Chinese version. Psychological Science 32(5): 1177–1179.

An achievement test for Chinese preschool children: Validity and social correlates
