Abstract
This study examined the validity of the Korean version of the Child Behavior Checklist (K-CBCL) with 180 children with autism spectrum disorder (ASD) in South Korea. Rasch analysis was applied to examine item fit, item difficulty, suitability of the response scale, and person and item separation indices of the K-CBCL. The results indicated that, with the exception of six out of the 119 items, the K-CBCL had a good item fit. Suitability of the rating scale was supported. Both Attention Problems and Aggressive Behavior factors differentiated two strata of behavior problems of children with ASD, whereas six other factors only captured one stratum of behavior problems. The item separation index indicated that the items were distributed well with high reliability. We demonstrated that statistical item analysis with the Rasch model could provide valuable information related to psychometric properties.
A recent South Korean report on individuals with disabilities indicates that the estimated autism spectrum disorder (ASD) population among the age group of 10 to 19 years who are registered as an individual with a disability is 1.27% (S. H. Kim et al., 2017). However, the prevalence of ASD among South Korean children, aged 7 to 12 years, has been reported to be 2.64% when using a total population-based sample (Y. S. Kim et al., 2011). Instead of relying on existing databases, such as the Disability Registry, in estimating ASD prevalence, Y. S. Kim et al. (2011) targeted the entire elementary school population of a district within Goyang, a suburban city of Seoul in South Korea, including samples of both a high probability of ASD population receiving special education services or enrolled in Disability Registry, and a general population in regular schools. The result was surprising because the general population sample consisted of 1.89% (two-thirds) of the total prevalence estimate, whereas the high probability group consisted of 0.75%, which might indicate that South Korean children with ASD remain undiagnosed or are inaccurately diagnosed. The estimated prevalence was higher than those of the Centers for Disease Control and Prevention surveillance data on ASD (1.7%; Imm et al., 2019) and parent-reported ASD diagnosis (2.5%) among U.S. children (Kogan et al., 2018). Although no new Korean population-based ASD prevalence research has been reported in the literature, the results of the study by Y. S. Kim et al. (2011) may indicate the importance of using validated and reliable diagnostic methods and procedures to identify children with ASD in South Korea.
Children with ASD may present with both emotional and behavioral disorders (Dovgan et al., 2019; Gray et al., 2012). It has been reported that children with ASD show higher levels of behavioral problems than other children with disabilities (Barroso et al., 2018), and their depressive symptoms are higher than those of their typically developing peers (Schwartzman & Corbett, 2020). Salazar et al. (2015) found that 90% of preschool and elementary school-aged children with ASD from a community sample met the criteria for at least one Diagnostic and Statistical Manual of Mental Disorders (4th ed.; DSM-IV; American Psychiatric Association, 1994) psychiatric disorder, and anxiety disorders occurred in 70%. These findings suggest the importance of screening and intervention for emotional and behavioral problems in children with ASD during the early years of their development. However, emotional problems in children with ASD are difficult to diagnose and to distinguish from the specific characteristics that are related to ASD (Kanne et al., 2009), and little is known about the emotional regulation mechanisms in ASD (Mazefsky et al., 2013).
Multiple behavior rating scales or checklists are available to assess emotional and behavioral problems in children with ASD, including Behavioral Assessment System for Children (Reynolds & Kamphaus, 1992), Behavioral and Emotional Rating Scale (Epstein & Sharma, 1998), Burks’ Behavior Rating Scales (Burks, 1996), Child Behavior Checklist (CBCL; Achenbach & Rescorla, 2001), and Conners’ Rating Scales (Conners, 1997). Recently, the Emotion Dysregulation Inventory (EDI) was developed for assessing emotional problems in children with ASD, which is based on a caregiver report and is designed to identify emotional distress and problems with emotional regulation (Mazefsky et al., 2018). However, few instruments are available that are suitable to identify comorbid emotional and behavioral disorders in children with ASD even though the children have a wide and entangled range of support needs (Howlin & Magiati, 2017).
One of the instruments that have been used to assess emotional and behavioral problems in children with ASD is the CBCL (Achenbach, 1991a). The CBCL is designed to identify emotional and behavioral problems among children in both clinical and nonclinical practices (Lambert et al., 2007). The 1991 CBCL/4-18 edition (Achenbach, 1991a) expanded the upper limit of the participants’ age range from 16 to 18 years and included the statistical foundation and normative data that coordinate information from parent, self, and teacher reports. The most recently updated CBCL, which is now part of the Achenbach System of Empirically Based Assessment (ASEBA), is normed for ages 6 to 18 years and replaces six problematic items with new behavior profile items (Achenbach & Rescorla, 2001). Multiple studies have demonstrated strong psychometric properties of the CBCL (Achenbach, 1991a; Achenbach & Rescorla, 2001; H. N. Cho & Ha, 2019; Dutra et al., 2004).
The CBCL was not originally developed to identify emotional and behavioral problems of children with ASD. However, it has been used as a fundamental screening step to assess particular behaviors of individuals with ASD (Deckers et al., 2020; Duarte et al., 2003). Sikora et al. (2008) found that the scales of the CBCL had higher sensitivity and specificity in detecting autism-related disorders in children than those of the Gilliam Autism Rating Scale (Gilliam, 1995), providing evidence that CBCL is a useful tool for assessing the behavior of children with ASD. Pandolfi et al. (2009) conducted a confirmatory factor analysis using data from 128 children with ASD and provided further evidence that CBCL was beneficial in identifying emotional and behavioral problems of children with ASD.
The CBCL has been translated into 61 languages including Turkish, Italian, Dutch, and Chinese (Achenbach & Rescorla, 2001), and multiple studies have examined psychometric properties of CBCL around the globe (e.g., Dumenci et al., 2004; Schmeck et al., 2001). Studies have reported the translated versions of the CBCL to be reliable and valid as measured by indices of reliability and validity and to have psychometric properties similar to the English version. Aside from empirical studies that focused on psychometric qualities of the translated version of CBCL within each nation, previous studies have examined cross-national differences in CBCL scores (e.g., Carter et al., 1995; Crijnen et al., 1997) and conducted invariance tests across nations to examine whether study participants have similar levels of behavioral dysfunction when rated on the same metric (Lambert et al., 2007). However, only two studies evaluated the applicability of the translated versions of CBCL for children with ASD (Duarte et al., 2003; Park, 2011).
In South Korea, a few instruments are available to screen or diagnose children with ASD, including the Korean Autism Behavior Checklist (C. C. Cho & Shin, 1989), the Korean Childhood Autism Rating Scale (K-CARS; Soh & Joung, 1992), and the Korean Autism Diagnostic Interview–Revised (Yoo & Kwak, 2007). However, the limited number of validated instruments continues to be an obstacle to early identification and intervention for children with ASD in Korea (Yi & Cho, 2004). The Korean version of the CBCL (K-CBCL) is commonly used to assess emotional and behavioral problems of children with disabilities, which has been standardized with 1,644 girls and 1,964 boys in K–12 (Oh et al., 1997). Park (2011) validated K-CBCL utilizing a sample of children with ASD to examine whether the instrument was applicable to this population. The author used factor analysis to confirm that the two-factor structure of measurement (i.e., internalizing and externalizing factors) was appropriate for the ASD population. However, factor analysis does not deal with quality of measurement at the item level (Embretson & Hershberger, 1999).
To address this limitation of factor analysis, item response theory (IRT) models have been used to examine relations between items and the underlying latent construct. The Rasch model (Rasch, 1960) is one of the refined approaches within the IRT framework that “independently scales the severity of both items and persons along a theorized underlying latent continuum” (Kahler et al., 2004, p. 323). Multiple studies have extended construct validation of various psychological instruments using the Rasch model (e.g., Kahler et al., 2004; Sideridis & Padeliadu, 2011). Rapport et al. (2009), the only study that applied the Rasch model to the CBCL in clinically referred boys, found that the aggressive and delinquent scales of the CBCL tended to reflect an underlying unidimensional construct of aggression.
However, to date, no studies have examined the validity of CBCL to assess emotional and behavioral problems of children with ASD at the item level. Examining the validity of K-CBCL using advanced measurement methods such as IRT is an important preceding step to conduct cross-cultural research (Hambleton et al., 2004). The cross-cultural studies, which investigate the applicability of scales across multiple languages and cultures based on the concept of measurement equivalence, enable researchers to identify whether their theories are generalizable into other cultural contexts (McCrae & Terracciano, 2005). Thus, this study examined the validity of K-CBCL using the Rasch model within the IRT framework for assessing emotional and behavioral problems of elementary school students with ASD. Specifically, the study examined (a) differences in the degree of emotional and behavioral problems based on the severity of ASD, (b) item fit of the K-CBCL in assessing emotional and behavioral problems of this population, (c) item difficulty, (d) suitability of the rating scale, and (e) person and item separation indices of the K-CBCL.
Method
Participants
The present sample included 180 Korean elementary school students with ASD aged 6 to 11 years, who were diagnosed by psychiatrists and placed at public or private special schools that primarily served children with ASD. As shown in Table 1, boys constituted 81.1% (n = 146) of the sample, whereas girls constituted 18.9% (n = 34) of the sample. This distribution is consistent with the gender ratio of Korean people with ASD, as the percentage of females with ASD was reported as 18% according to 2015 data of registered people with disabilities (http://www.mohw.go.kr/). Participants were recruited from children in 15 public and private special schools serving children with ASD in South Korea. The children were diagnosed as having ASD by a psychiatrist before receiving special education services; in addition to the medical diagnosis, they were evaluated for special education eligibility by an evaluation and assessment team that comprised special educators and various related service professionals.
Table 1. Participant Demographic Information.
Note. The severity of autism was determined by K-CARS scores (scores between 30 and 36.5 indicate mild or moderate autism and scores between 37 and 60 indicate severe autism); children’s social skills were assessed using K-SSRS (K-SSRS has not been standardized yet and does not provide cutoff points to categorize scores into low, moderate, or high categories; however, according to the original SSRS, all participating children had low levels of social skills in all subscales and the total social skills; in the original SSRS, scores within 1 SD of the mean indicate the moderate level and scores below or above that range indicate low or high levels, respectively). K-CARS = Korean Childhood Autism Rating Scale; K-SSRS = Korean Social Skills Rating Scale.
According to the scores on the K-CARS (T. L. Kim & Park, 1996) completed by the individual child’s team, 111 children (61.7%) were identified as having mild to moderate autism, and 69 children (38.3%) were identified as having severe autism. There were no missing data. The team also assessed the children’s social skills using the Korean Social Skills Rating Scale (K-SSRS; Park, 2002), which is designed to be completed by teachers. The K-SSRS is a translated version of the SSRS developed by Gresham and Elliott (1990) and consists of three domains (social skills, problem behavior, and academic performance). The social skills domain is composed of three subscales: cooperation, assertion, and self-control. The K-SSRS scores showed that all children had low levels of social skills in all subscales of social skills and total social skills.
The sample size of 180 participants is not large but is sufficient for the analysis. To justify the sample size, we calculated the sample needed for 99% confidence that no item calibration is more than 1 logit away from its stable value. A two-tailed 99% confidence interval is ±2.6 standard errors wide, so for a ±1 logit interval the standard error must be no larger than 1/2.6 logits. This gives a minimum sample in the range 4 × (2.6)² < N < 9 × (2.6)², that is, roughly 27 to 61 participants. Thus, a sample of 50 well-targeted participants is conservative for obtaining useful, stable estimates (Wright & Panchapakesan, 1969).
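The arithmetic behind these bounds can be checked with a short sketch. It assumes the common Rasch rule of thumb that an item calibration's standard error lies between 2/√N and 3/√N logits for a reasonably targeted sample; that assumption, and the function name, are illustrative and not taken from the original analysis.

```python
def sample_size_bounds(z, half_width_logits):
    """Range of N for which z * SE stays within the target half width.

    Assumption (not stated in the text): the item calibration SE lies
    between 2/sqrt(N) and 3/sqrt(N) logits.
    """
    target_se = half_width_logits / z      # largest tolerable SE, here 1/2.6
    lower = (2.0 / target_se) ** 2         # best case:  SE = 2/sqrt(N)
    upper = (3.0 / target_se) ** 2         # worst case: SE = 3/sqrt(N)
    return lower, upper


lower, upper = sample_size_bounds(z=2.6, half_width_logits=1.0)
print(f"{lower:.2f} < N < {upper:.2f}")
```

Under these assumptions the bounds work out to about 27 and 61 participants, so the study's 180 participants exceed the upper bound by a wide margin.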
Instrument
The CBCL includes the Teacher’s Report Form (Achenbach, 1991b) and Youth Self-Report (Achenbach, 1991c) in addition to its original parents’ reports, and it was normed for children aged 4 to 18 years. Oh et al. (1997) included only children between the ages of 4 and 17 years when standardizing K-CBCL because 12th graders were difficult to include: in the Korean educational context, they spend long hours preparing for university entrance exams. Specifically, 744 girls and 791 boys between 4 and 11 years of age, and 900 girls and 1,173 boys between 12 and 17 years of age participated in the K-CBCL standardization study. Although a new version of the CBCL/6-18 (Achenbach & Rescorla, 2001) has been developed, the new version has not been validated in South Korea.
The K-CBCL consists of 13 subscales that measure internalizing and externalizing behaviors. There are 119 items that are measured on a 3-point scale. However, Item 2 (allergy) and Item 4 (asthma) were excluded when computing the total score because these items did not discriminate significantly between clinically referred and non-referred populations in the original norming sample. Thus, only 117 items were used for the total score, which ranged from 0 to 234. The Social Competence scale consists of 23 items that measure socialization and academic competence. Cronbach’s alpha of K-CBCL was reported to be between .62 and .86 across factors, and the average inter-rater consistency was reported to be .69. Convergent and discriminant validity were reported (Oh et al., 1997). In this study, Cronbach’s alpha of K-CBCL ranged from .14 (Delinquent Behavior) to .74 (Anxious/Depressed and Attention Problems). We used a T score of 63 as the cutoff because Achenbach (1991a) and the K-CBCL validation manual (Oh et al., 1997) suggested a T score of 63 as the cutoff for the clinical range.
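For readers who want to see how the internal-consistency coefficient reported above is obtained, the following is a minimal sketch of Cronbach's alpha for a set of items scored 0 to 2. The response matrix is made-up toy data, not data from this study, and the function name is hypothetical.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of per-person item-score lists.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    k = len(responses[0])
    item_vars = [variance([person[i] for person in responses]) for i in range(k)]
    total_var = variance([sum(person) for person in responses])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Toy data: 5 respondents x 3 items on the 0-2 scale (illustrative only).
toy = [
    [0, 0, 1],
    [1, 1, 1],
    [2, 1, 2],
    [0, 1, 0],
    [2, 2, 2],
]
alpha = cronbach_alpha(toy)
print(round(alpha, 3))
```

A factor with few items or with little variance in its total score, as may be the case for Delinquent Behavior in this sample, will tend to produce a low alpha under this formula.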
Procedures
Research staff initially contacted all 102 special schools serving children with ASD, Grades 1 through 6, to recruit schools with the target population and then identified teachers and parents who were willing to participate. A total of 15 schools agreed to recruit participants. To avoid sampling errors, the number of possible sampling participants was verified for schools that allowed the study, and the opportunity to participate in the study was provided to all possible sampling participants. Research staff visited the schools and explained the purpose of the study and administration procedures of the K-CBCL. Classroom teachers distributed 200 K-CBCLs to parents, and 190 parents responded (95% return rate). The parents were asked to return the completed K-CBCL within a week, but most required 2 weeks to complete the K-CBCL. Among the returned checklists, 10 were excluded due to missing responses, leaving a total of 180 children’s checklists being analyzed in this study. Teachers of the 180 children completed the K-SSRS.
Data Analysis
Differences in degree of emotional and behavioral problems
We used the independent t test to examine the differences of emotional and behavioral problems between the mild to moderate group and severe group in each factor of Internalizing and Externalizing Problems. We used SPSS statistics for Windows Version 25.0 (IBM Corp, 2017) and provided results from descriptive analyses (i.e., means, standard deviations, percentages of participants in the clinical range), which identified differences in emotional and behavioral problems among participants, depending on their severity of ASD.
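As an illustration of the group comparison, the sketch below computes an independent-samples t statistic from summary statistics. It assumes Student's pooled-variance form (equal variances); the text does not say which variant SPSS was asked to report. The inputs are the Aggressive Behavior means and standard deviations given in the Results, with group sizes of 111 and 69; the resulting t value is an illustration, not a result reported by the authors.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Student's t for two independent groups, assuming equal variances."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / se, df


# Mild-to-moderate group vs. severe group, Aggressive Behavior factor.
t, df = pooled_t(10.53, 5.29, 111, 9.94, 5.64, 69)
print(f"t({df}) = {t:.2f}")
```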
Item fit
To evaluate the item fit of the K-CBCL, we used the rating scale model of the Rasch analysis (Andrich, 1978; Battisti et al., 2010). To evaluate how well the observed data fit the Rasch unidimensional model and how well each item contributed to defining one common construct, we computed two item fit mean square (MNSQ) statistics for each item: (a) the Infit (weighted) MNSQ fit index and (b) the Outfit (unweighted) MNSQ fit index, each reported with a Z value. The ideal value of each Infit and Outfit MNSQ is 1.0, which means the data fit the Rasch model perfectly. We used an MNSQ range of 0.7 to 1.3, a relatively strict standard, to indicate good fit (Bond & Fox, 2007). Z values were also used to determine misfits; Z values that are less than −2.0 or greater than 2.0 indicate inadequate fit for each item. Outfit MNSQs are influenced by outliers and are easy to diagnose and remedy. However, Infit MNSQ is influenced by response patterns and is usually hard to diagnose and remedy, so Infit MNSQ is a greater threat to measurement (Bond & Fox, 2007). Therefore, in this study, items were judged unfit when the Infit MNSQ value was less than 0.7 or greater than 1.3, and the Z value was less than −2.0 or greater than 2.0.
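The fit statistics described above can be sketched as follows. This is a simplified illustration of the rating scale model and of the Infit and Outfit mean squares for a single item, assuming person measures, the item difficulty, and the thresholds are already estimated; it is not the Winsteps implementation, the function names are hypothetical, and the toy values are invented. The standardized Z value is taken as an input rather than computed.

```python
import math


def category_probs(theta, delta, taus):
    """Rating scale model: P(X = k) for k = 0..len(taus)."""
    exponents = [0.0]                      # category 0 has exponent 0
    for tau in taus:                       # cumulative sum of (theta - delta - tau_j)
        exponents.append(exponents[-1] + (theta - delta - tau))
    denom = sum(math.exp(e) for e in exponents)
    return [math.exp(e) / denom for e in exponents]


def item_fit(observed, thetas, delta, taus):
    """Return (infit MNSQ, outfit MNSQ) for one item."""
    sq_resid_sum, var_sum, z2_sum = 0.0, 0.0, 0.0
    for x, theta in zip(observed, thetas):
        p = category_probs(theta, delta, taus)
        expected = sum(k * pk for k, pk in enumerate(p))
        var = sum((k - expected) ** 2 * pk for k, pk in enumerate(p))
        sq_resid_sum += (x - expected) ** 2
        var_sum += var
        z2_sum += (x - expected) ** 2 / var
    infit = sq_resid_sum / var_sum         # information-weighted mean square
    outfit = z2_sum / len(observed)        # unweighted mean of squared z-residuals
    return infit, outfit


def is_misfit(infit_mnsq, infit_z, low=0.7, high=1.3, z_crit=2.0):
    """Decision rule used in this study: MNSQ outside 0.7-1.3 AND |Z| > 2.0."""
    return (infit_mnsq < low or infit_mnsq > high) and abs(infit_z) > z_crit


# Toy example: five persons responding to one item on the 0-2 scale.
thetas = [-1.0, -0.5, 0.0, 0.5, 1.0]
observed = [0, 0, 1, 1, 2]
infit, outfit = item_fit(observed, thetas, delta=0.0, taus=[-1.0, 1.0])
print(round(infit, 2), round(outfit, 2))
```

Because Infit weights each squared residual by its model variance, responses from persons located near the item dominate it, whereas Outfit gives every standardized residual equal weight and is therefore more sensitive to outliers.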
Item difficulty and suitability of the rating scale
We used a Wright map of Rasch analysis that allows graphical analysis of persons and items on a map showing the distribution of respondents to examine the item difficulty of the K-CBCL, an indication of construct validity. Persons at the top of the map indicate the highest level of the construct being measured, and the items at the top indicate the most difficult to endorse for the population. When the person mean is higher than the item mean, the items are relatively easy to endorse. On the contrary, when the person mean is lower than the item mean, the items are considered difficult to endorse.
To examine the suitability of the K-CBCL rating scale, we analyzed the category functioning within the Rasch model. Because respondents often fail to react to a rating scale in the manner that was intended and it is uncertain how they use the response options, we assessed the suitability of the rating scale by computing the average and structure measures, the Infit and Outfit MNSQ, and the thresholds between responses. In general, as the score of a rating scale category increases, the average and structure measures should also increase. The fit value for each category also indicates whether that category works well. Category fit values above 1.5, against an expected value of 1.0, suggest the rating scale is not functioning effectively (Andrich, 1996; Linacre, 2006). If the thresholds do not advance in order, the responses to items are judged to not correspond to the levels of the construct being measured. We used Winsteps Version 3.6 software (Linacre, 2006) to examine whether the items in each subscale of the K-CBCL satisfy the basic assumptions of Rasch measurements.
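The category-functioning checks described above can be expressed as a small set of rules. The sketch below is an illustration only: the function name and the example values are hypothetical, and the inputs are assumed to come from Rasch software output for one factor.

```python
def check_rating_scale(avg_measures, thresholds, category_mnsq, max_mnsq=1.5):
    """Return a list of problems found in a rating scale's category statistics.

    avg_measures:   average person measure observed in each category (0, 1, 2)
    thresholds:     step thresholds between adjacent categories
    category_mnsq:  fit mean square for each category
    """
    problems = []
    if any(b <= a for a, b in zip(avg_measures, avg_measures[1:])):
        problems.append("average measures do not increase with category")
    if any(b <= a for a, b in zip(thresholds, thresholds[1:])):
        problems.append("thresholds are disordered")
    for cat, mnsq in enumerate(category_mnsq):
        if mnsq > max_mnsq:
            problems.append(f"category {cat} MNSQ {mnsq:.2f} exceeds {max_mnsq}")
    return problems


# Hypothetical values for one factor on the 0-2 scale.
problems = check_rating_scale(
    avg_measures=[-1.2, -0.4, 0.3],
    thresholds=[-0.8, 0.8],
    category_mnsq=[0.98, 1.02, 1.60],
)
print(problems)
```

An empty list would mean that the scale passes all three checks for that factor.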
Person and item separation indices
The person separation index and the item separation index are used to evaluate the reliability of the test in Rasch analysis. A larger separation index means that the test can distinguish more distinct levels of functioning. A person separation index of 1.50 represents an acceptable level of separation; an index of 2.00 represents a good level of separation; and an index of 3.00 represents an excellent level of separation (Fisher, 1992). The item separation index is interpreted using the same criteria. Separation reliability was interpreted in the same way as Cronbach’s alpha (Andrich, 1982).
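The relation between a separation index and its reliability can be made concrete with a short sketch. It uses the standard Rasch conventions G = √(R / (1 − R)) and strata = (4G + 1) / 3, which are not spelled out in the text and are stated here as assumptions; the "low" label for indices below 1.50 is likewise an illustrative addition to the criteria quoted above.

```python
import math


def separation_from_reliability(r):
    """Separation index G implied by a separation reliability R."""
    return math.sqrt(r / (1.0 - r))


def reliability_from_separation(g):
    """Separation reliability R implied by a separation index G."""
    return g ** 2 / (1.0 + g ** 2)


def strata(g):
    """Number of statistically distinct levels: (4G + 1) / 3."""
    return (4.0 * g + 1.0) / 3.0


def separation_label(g):
    """Cutoffs of 1.50, 2.00, and 3.00 as quoted in the text (Fisher, 1992)."""
    if g >= 3.0:
        return "excellent"
    if g >= 2.0:
        return "good"
    if g >= 1.5:
        return "acceptable"
    return "low"


for g in (0.94, 1.64, 2.96, 8.45):
    print(g, round(reliability_from_separation(g), 2), round(strata(g), 2), separation_label(g))
```

Under these conventions, reliabilities of .5, .8, and .9 correspond to separation indices of 1, 2, and 3, respectively.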
Results
Differences in the Degree of Emotional and Behavioral Problems
As seen in Table 2, children with a mild or moderate level of ASD symptoms showed lower scores than those with severe levels of ASD symptoms in all K-CBCL scales except for Aggressive Behavior (mild or moderate group: M = 10.53, SD = 5.29; severe group: M = 9.94, SD = 5.64). Results of independent t test indicated a significant difference in the Social Problems factor between the two groups (t = 6.05, p < .05). Table 2 also shows the percentage of children with ASD scoring in the clinical range (T score greater than or equal to 63). Within the mild or moderate group of children with ASD, 14.4% had internalizing problem scores in the clinical range and 28.8% had externalizing problem scores in the clinical range (48.6% in the total problem). In the case of children with severe ASD, 20.3% scored in the clinical group in Internalizing Problems, 29.0% in Externalizing Problems, and 53.6% in total problems.
Table 2. Participants’ Degree of Emotional and Behavioral Problems Based on ASD Severity.
Note. ASD = autism spectrum disorder.
The clinical range is defined as a T score greater than or equal to 63.
p < .05.
Item Fit and Item Difficulty of the K-CBCL
Table 3 presents the item fit results for the K-CBCL. Six items were judged to be misfitting; for all other items, the Infit MNSQ values fell within the range of 0.7 to 1.3 and the Z values fell between −2.0 and +2.0. Misfit items included “Refuses to Talk” and “Secretive” in the Withdrawn factor. In the Attention Problems factor, Item 62 “Poorly coordinated” was misfit. In the Aggressive Behavior factor, Item 7 “Brags,” Item 86 “Stubborn,” and Item 104 “Loud” were misfit. Figure 1 presents the results of person ability/item difficulty analyses of K-CBCL in the Withdrawn factor and the Somatic Complaints factor (the results for other factors are available from the authors upon request). Item means were higher than the person means in six of the eight K-CBCL factors; that is, in all factors except Social Problems and Attention Problems, item intensities were beyond the ability levels of children with ASD.

Figure 1. Person ability/item difficulty map of the Withdrawn factor and the Somatic Complaints factor.
Table 3. Item Fit.
Note. Unfit items are presented in bold. SE = standard error; MNSQ = mean square; ADHD = attention deficit hyperactivity disorder.
Suitability of the Rating Scale
Table 4 presents the results of analyzing the 3-point rating scale system of K-CBCL, which was assessed using the item fit approach. The results indicated that the Anxious/Depressed factor was insufficiently measured by the 3-point rating scale. The Infit MNSQ value of 1.28 and Outfit MNSQ value of 1.60 indicated that the 3-point scale did not function effectively for the Anxious/Depressed factor. However, the category probability curves (results for each factor are available from the authors upon request) showed a discrete peak for each response option, indicating that the respondents distinguished between the response options. The other seven factors (i.e., Withdrawn, Somatic Complaints, Social Problems, Thought Problems, Attention Problems, Delinquent Behavior, and Aggressive Behavior) had fit statistics below 1.5.
Table 4. Rating Scale Analysis of K-CBCL Factors.
Note. K-CBCL = Korean-Child Behavior Checklist; MNSQ = mean square.
Person and Item Separation Indices
As shown in Table 5, the person separation index was 0.94 for Withdrawn, 0.21 for Somatic Complaints, 1.15 for Anxious/Depressed, 1.02 for Social Problems, 1.19 for Thought Problems, 1.64 for Attention Problems, 0.00 for Delinquent Behavior, and 1.70 for Aggressive Behavior. The item separation index for each of the eight factors was above 2.0. The highest item separation was 8.45 in Social Problems and the lowest was 2.96 in Delinquent Behavior, indicating that the items in every factor were well separated along the difficulty continuum.
Table 5. Each Factor’s Person and Item Separation Index of the K-CBCL.
Note. K-CBCL = Korean-Child Behavior Checklist.
Discussion
The first research question addressed the measurement results of K-CBCL. The results of this study support findings of previous studies indicating that many children with ASD exhibit emotional and behavioral problems, and that the CBCL may be a good instrument to assess the emotional and behavioral problems of these children. Havdahl and colleagues (2016) suggested the possibility of using the CBCL for differentiating children with ASD from those with other clinical disorders. Guerrera et al. (2019) reported that approximately 50% of all children and adolescents with ASD scored in the clinical or at-risk range on total problems and approximately 30% scored in the clinical or at-risk range on Internalizing Problems on the CBCL. Mazefsky et al. (2011), using data collected from children with high-functioning ASD, found that half of the children with ASD between the ages of 8 and 18 years scored in the borderline clinical range (T score greater than or equal to 67) on the Withdrawn/Depressed scale. Interestingly, the mean levels of CBCL subscales reported by Mazefsky et al. (2011) were higher than the scores found in this study. Because Mazefsky et al. (2011) included children with high-functioning ASD whereas this study included children with ASD in general, direct comparisons between scores may not be suitable.
Nonetheless, it is important to consider the possibility of interaction between ASD and psychiatric comorbidity or other conditions such as intellectual disability and language disorder when interpreting the CBCL scores. Although we were unable to gather information on comorbid conditions of the children included in this study, the literature indicates that low language skills are linked to high rates of externalizing problem behaviors in children with ASD (McClintock et al., 2003). In addition, the co-occurrence of intellectual disability has been associated with increased rates of comorbid conditions in adults with ASD, such as anxiety, bipolar disorder, depression, and schizophrenia (LoVullo & Matson, 2009).
Taking into account the high responsiveness of CBCL in identifying co-occurring emotional and behavioral disorders of children with ASD (Pandolfi et al., 2012; Sikora et al., 2008), K-CBCL may be used for (a) supplementing information on emotional and behavioral problems and (b) assisting the implementation and evaluation of the individualized interventions for children with ASD. Future studies are needed to examine K-CBCL using a longitudinal and repeated measures design to further assess the utility of K-CBCL for children with ASD in educational or clinic settings.
The second research question addressed whether K-CBCL items fit for unidimensionality. Results showed that all but six items generated a unidimensional trait across Withdrawn, Attention Problems, and Aggressive Behavior factors. These six items did not fit for unidimensionality; thus, clinicians or assessors should apply caution when interpreting the K-CBCL results involving these six items, or future researchers should consider removal or adaptations of the items for a future version of K-CBCL for children with ASD. As this study provides initial evidence on the item fit of K-CBCL with a sample of elementary school students with ASD who were placed in special schools, additional research findings from an expanded spectrum of samples need to be accumulated to provide cultural implications for the items that do not have good fit. Among the six misfit items, only “Stubborn” was overfit (MNSQ = 0.69). Although care should be taken when interpreting the results involving the “Stubborn” item, it may not be necessary to adapt or remove the item from the instrument because we employed a strict item fit analysis criterion (an MNSQ range of 0.7–1.3). For clinical observation, Infit and Outfit MNSQ values between 0.5 and 1.7 are considered acceptable (Bond & Fox, 2007).
The third research question addressed the item difficulty of K-CBCL. The results indicated that the items had higher difficulty levels than the ability levels of elementary school students with ASD. The person means of six of the eight factors were lower than the item means; thus, the items were relatively difficult to endorse for this population. The items were more difficult than the children with ASD’s ability level within all factors except Social Problems and Attention Problems. This implies that the assessment of emotional and behavioral problems in children with ASD could be improved through adjusting the items. Although the results of the study indicate that K-CBCL can be applicable to the population of children with ASD, future researchers who are interested in further development of K-CBCL might consider revising K-CBCL using items found to be less difficult in this study to improve the utility and validity of the instrument.
The fourth research question addressed the suitability of the rating scale of K-CBCL. The results supported the 0 to 2 rating scale for each item even though there were six misfit items and the rating scale was unsuitable in one factor, suggesting that K-CBCL is applicable to the population of children with ASD. The 0 to 2 response scale of K-CBCL was acceptable for all factors except Anxious/Depressed, which had a misfit value. Given that a major problem with rating scales is raters’ tendency to score in the middle range (Engelhard, 1994), which may result in a loss of discriminative power, as Achenbach (1991a) indicated, improving the discriminative power of items is imperative in developing item response scaling. Linacre (1994) suggested that for any rating scale to be considered high quality, the average measure of each rating scale category must increase monotonically as the response category moves up and the outfit MNSQ statistics must be less than 2.0. Using these criteria, we found the 3-point response scale (i.e., 0, 1, 2) of K-CBCL to be appropriate for capturing emotional and behavioral problems in children with ASD.
The final research question examined the person and item separation indices. A low person separation index was found for all factors except Attention Problems and Aggressive Behavior. Low person separation is an indication of the measure not being sensitive enough to separate the sample into sufficient levels (from low to high problems). Results indicate that the Attention Problems and Aggressive Behavior factors could differentiate two (high, low) strata of behavior problems of children with ASD, whereas the other six factors could only capture one stratum of behavioral problems. However, a high item separation index was found across all factors. In general, it is suggested that the person separation reliability (separation index) of .5 can capture 1 or 2 levels, .8 can capture 2 or 3 levels, and .9 could classify 3 or 4 levels (Fisher, 1992). Thus, the high reliability of items suggests that children with high levels of problems are more likely to receive high scores on the K-CBCL scale.
The results from this study have important implications for using K-CBCL. The findings of this study and of Park (2011) support the use of K-CBCL to assess emotional and behavioral problems of Korean children with ASD, particularly elementary school-aged children. However, CBCL is designed to function as one element of a multiaxial, empirically based assessment that includes parent reports, teacher reports, cognitive assessment, physical assessment, and direct assessment of the child (Achenbach, 1991a). Thus, professionals should not focus solely on scale scores but should obtain comprehensive information about the child from various sources to identify the child’s strengths and weaknesses for effective intervention. By consistently creating children’s behavioral profiles, practitioners can document changes in particular problems or syndrome scores and provide more systematic educational and related services. This is important because effective intervention strategies depend on accurate screening and diagnosis.
Several limitations must be considered in interpreting the findings of this study. First, it would have been preferable to have a larger sample size than the 180 children with ASD who participated in this study. Although the sample size for IRT analyses relies on multiple factors such as the IRT model chosen, the characteristics of the items, and the number of item parameters, researchers should start with 100 participants for the Rasch model and up to 500 participants for models with more parameters (Mokkink et al., 2012). Given this recommendation, we satisfied the minimum sample size requirement for the Rasch model. Second, this study used an older version of the CBCL for ages 4 to 18 years (Achenbach & Edelbrock, 1983) due to the unavailability of a Korean version of the 2001 CBCL. However, it is expected that the CBCL patterns found in this study would be nearly identical had the current version of the CBCL been used.
Third, as preliminary IRT-based validity research, this study included only children with ASD when addressing the first two research questions. Future research is needed to include comparison groups (e.g., children with emotional and behavioral disorders measured by the CBCL, typically developing children) to determine how well K-CBCL distinguishes autism symptoms from emotional and behavioral disorders. Last, we did not gather information on the participating children’s language and cognitive levels, and thus, we were not able to compare the groups according to level of language skills or cognitive function. As discussed earlier, it is likely that children with ASD who have lower language skills or cognitive levels might have higher levels of comorbid conditions, such as externalizing and internalizing behavior problems, than their counterparts (LoVullo & Matson, 2009; McClintock et al., 2003).
Despite these limitations, the examination of K-CBCL using a Rasch model provides additional information on the psychometric properties of K-CBCL. Specific psychometric evidence for K-CBCL needs to be accumulated to identify emotional and behavioral problems in children with ASD. Future research should examine relationships between K-CBCL and other measurements for children with ASD and evaluate differences in emotional and behavioral problems of children with ASD based on age and gender. From a cross-cultural perspective based on current research findings, future studies need to examine whether the observed scores and latent constructs of the CBCL are equivalent across countries to test the generalizability or boundary conditions of the theory underlying CBCL.
Footnotes
Informed Consent
Informed consent was obtained from all individual participants included in the study.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research and/or authorship of this article.
