Abstract
Intensive longitudinal designs have been used to examine the fluctuations of callous-unemotional (CU) traits and their dynamic links with daily correlates; however, scant research has explored how CU traits manifest in daily contexts at the within-person level. This study evaluated the multilevel factor structure and psychometric properties of a short version of the Inventory of CU Traits in daily contexts among adolescents (
Keywords
Callous-unemotional (CU) traits are characterized by shallow affect, lack of remorse and empathy, and lack of concern or interest in doing things well (Frick et al., 2014a, 2014b). A large body of research has shown that CU traits in childhood and adolescence are concurrently and prospectively associated with conduct problems (Colins et al., 2016), aggressive behaviors (Ciucci et al., 2014), antisocial behaviors (Cardinale & Marsh, 2020), risky sexual behaviors (Anderson et al., 2017), and substance use (Anderson et al., 2018), underscoring the significance of establishing valid and reliable tools to assess CU traits for both clinical and research purposes. CU traits are not immutable over time (Perlstein et al., 2023) and show substantial fluctuations (Goulter et al., 2024). Recent research has started to use intensive longitudinal designs (ILDs) to examine their fluctuations on a micro timescale and their dynamic links with daily antecedents and correlates (Goulter et al., 2024; Y. Zheng et al., 2025), offering unique insight into the within-person variability of CU traits.
Expanded from the original four items from the Antisocial Process Screening Device, the 24-item Inventory of Callous-Unemotional Traits (ICU; self-, parent-, and teacher-report versions; Frick, 2004) was developed and has since become among the most comprehensive and widely used measurement tools of CU traits among different age groups spanning from children to adults (e.g., Docherty et al., 2024; Kemp et al., 2024; Y. Zheng et al., 2021). Subsequent factor analyses primarily tested three structures/models: (a) a single-factor model, with all items loading on a single CU factor; (b) a correlated-factor model, with each item loaded on one of three intercorrelated factors:
Although the
During adolescence and young adulthood, psychopathology symptoms arise and manifest primarily in various situations, events, or relationships (Thunnissen et al., 2022; Walz et al., 2014). To investigate how these symptoms fluctuate in the daily contexts where they manifest while minimizing recall bias (Sliwinski, 2011; Y. Zheng & Goulter, 2024), a burgeoning number of studies have employed ILDs to understand day-to-day variability of psychopathology symptoms on a micro timescale as well as their associations with contextual factors and treatment outcomes. Although the term CU “traits” may seemingly suggest relatively stable individual characteristics among adolescents and young adults, research has indicated that these traits are not immutable and are sensitive to environmental factors including treatments (Docherty et al., 2024; Fleming et al., 2022; Frick et al., 2014a; Perlstein et al., 2023). This view aligns with contemporary personality theory, which recognizes that traits reflect stable patterns of thinking, feeling, and behaving, but may still fluctuate meaningfully across time and context (Fleeson, 2004; Soto & Tackett, 2015; Wright & Simms, 2016). In light of these views, some scholars have recommended alternative terms, such as CU
These diary studies measured CU traits by directly using ICU-11/12, as brief instruments are necessary for ILDs to minimize participant burden and accommodate time limits. However, the original ICU and its short versions have only been evaluated psychometrically with cross-sectional or conventional longitudinal (e.g., multi-year) designs (e.g., Byrd et al., 2013; Kemp et al., 2024). The identified (sub)factors are based on between-person level structures, which indicate whether people exhibit general or specific CU traits that are similar to or different from others. For example, a between-person factor implies that individuals who score higher than others on one item also tend to score higher than others on additional items within the same factor. In contrast, a within-person factor structure reflects how items co-fluctuate over time within the same individual (Li et al., 2025; H. Zheng & Zheng, 2025). For instance, a within-person factor suggests that when an individual scores higher than their own average on one item at one time, they are also likely to score higher than their average on other items within the same factor at the same time. Recent studies on positive and negative affect (Cooke et al., 2022) and externalizing and internalizing psychopathology (H. Zheng & Zheng, 2025) have shown that the between-person level structure does not always translate to the within-person level. Therefore, to accurately capture within-person fluctuations in CU traits, it is crucial to first establish the multilevel structure of CU traits measures while properly disentangling variance at the within- and between-person levels.
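To make the between- versus within-person distinction concrete, the following minimal sketch splits two items into person means (the between-person part) and daily deviations from those means (the within-person part), then correlates the items at each level. The data, item names, and helper functions are hypothetical illustrations, and this observed-score decomposition is only a simplified analog of the latent decomposition used in multilevel factor models.

```python
from math import sqrt

def mean(xs):
    return sum(xs) / len(xs)

def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def decompose(scores_by_person):
    """Return (person_means, within_deviations) for {person: [daily scores]}."""
    means = {p: mean(v) for p, v in scores_by_person.items()}
    devs = {p: [s - means[p] for s in v] for p, v in scores_by_person.items()}
    return means, devs

# Hypothetical daily ratings for three people over three days (not study data).
item_a = {"p1": [0, 1, 2], "p2": [1, 2, 3], "p3": [2, 3, 4]}
item_b = {"p1": [2, 1, 0], "p2": [3, 2, 1], "p3": [4, 3, 2]}

means_a, devs_a = decompose(item_a)
means_b, devs_b = decompose(item_b)
persons = sorted(item_a)

# Between-person level: correlate person means across individuals.
between_r = pearson([means_a[p] for p in persons],
                    [means_b[p] for p in persons])

# Within-person level: correlate pooled daily deviations from each person's mean.
within_r = pearson([d for p in persons for d in devs_a[p]],
                   [d for p in persons for d in devs_b[p]])

print(f"between-person r = {between_r:.2f}, within-person r = {within_r:.2f}")
```

In this contrived example, people who are higher on one item are also higher on the other (between-person), yet on days when a person is above their own average on one item they are below it on the other (within-person), which is the kind of cross-level divergence the paragraph above describes.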
The secondary CU traits framework posits that, in contrast to primary CU traits, which stem from temperamental or genetically based emotional deficits, secondary CU traits are primarily shaped by adverse contextual factors, such as interpersonal stress and trauma (Craig et al., 2021; Schuberth et al., 2019). During the transition from adolescence to young adulthood, individuals often face challenges such as leaving the family of origin, shifts in financial dependence, and the need to establish new social ties (H. Zheng & Zheng, 2024). This heightened stress may lead some young adults to develop a “mask” of emotional detachment and callousness as an adaptive strategy to cope with stress (Craig et al., 2021). As a result, the patterns of item endorsement and the latent structure of CU traits may shift during this transition period. Evaluating whether the ICU functions equivalently across developmental stages is important to ensure that the observed changes in CU traits reflect actual developmental processes rather than measurement non-invariance.
Previous findings on the links between CU traits and psychopathology outcomes further suggest the existence of a potential secondary variant of CU traits and highlight the importance of distinguishing the factor structures of CU traits across different levels. While the total CU scores and all subfactor scores have consistently exhibited positive associations with externalizing problems (e.g., conduct problems; Cardinale & Marsh, 2020), their associations with internalizing problems are less consistent. Some studies reported that total CU and
The Present Study
The psychometric properties of the ICU have typically been evaluated in cross-sectional or conventional longitudinal designs at the between-person level. Catering to the burgeoning research investigating the daily fluctuations in CU traits and more broadly psychopathology symptoms using ILDs, this study aims to explore the factor structures and psychometric properties of the ICU at both within- and between-person levels in daily diary studies. The present study used two independent samples (one among adolescents and one among young adults), each with a month-long daily diary design, to evaluate CU trait structures previously identified and supported in the literature. A comprehensive set of standards was used to compare these models, including model fit, factor reliability and properties, longitudinal measurement invariance, as well as within- and between-person level criterion validities. We also explored whether the multilevel structure of CU traits remains invariant between adolescents and young adults. Based on previous studies, we expected that the bifactor model would show the best performance (better model fit and higher factor reliability) at the between-person level. Given the scarce literature on the CU traits structure at the within-person level, especially using daily diary designs, an exploratory approach was taken for within-person structures and measurement invariance, and we opted not to make any specific hypothesis. Regarding criterion validities, it was expected that CU traits factor scores would be positively associated with externalizing problems at both within- and between-person levels. In contrast, the associations between CU traits scores and internalizing problems were expected to be negative at the within-person level, but positive at the between-person level.
Method
Participants and Procedures
This study used data from two independent community-based samples of adolescents and young adults. The research procedure and instruments for both datasets received approval from the research ethics committee at the University of Alberta. Survey instruments were developed and administered using REDCap (Harris et al., 2019). We report all data exclusions, all manipulations, and all measures used in the study. Since this study involved analyses of existing datasets rather than collecting new data, determining sample size was not applicable.
University Student Sample
An initial sample of 313 Canadian university freshmen (
Participants were recruited from a large Western Canadian university through online advertisements, on-campus posters, and short in-class presentations. For wave 1, all first-year undergraduate students were eligible for inclusion, while wave 2 was limited to those who had completed at least the baseline survey in wave 1. In both waves, participants first completed a baseline survey after providing informed consent online and then participated in daily surveys for 30 consecutive days. Daily surveys were sent by email at 7 pm each night, and participants were asked to complete the survey before going to sleep that night. Participants received a $60 e-gift card in wave 1 and a $75 e-gift card in wave 2 as compensation (see Cooke et al., 2022; H. Zheng & Zheng, 2025 for more information on recruitment procedures).
Adolescent Sample
A total of 99 Canadian adolescents (
Measures
Daily Callous-Unemotional Traits
Daily CU traits were assessed with an 11-item shortened version of the Inventory of Callous-Unemotional Traits (e.g., “I do not care if I get into trouble.” Colins et al., 2016; Frick, 2004; Wang et al., 2020) in the daily survey. One item (i.e., “I apologize to persons I hurt”) was excluded because it was deemed to occur infrequently in the daily lives of adolescents and young adults and to show low within-person fluctuation. Therefore, the final measure included 10 items comprising two subscales:
Criterion Validity Measures
Daily Emotional and Conduct Problems
In the daily survey, emotional problems were measured with five items from the emotional problem subscale (e.g., “I have many fears.” “I worry a lot.”) of the Strengths and Difficulties Questionnaire (SDQ; Goodman et al., 1998) validated in daily diary research (H. Zheng & Zheng, 2025). Participants indicated how each item applied to them on that day on a 3-point scale (0 =
Daily Anxiety Symptoms
University students reported their daily panic disorder (3 items; e.g., “When I got frightened, my heart beats fast.”), social (2 items; e.g., “I felt nervous with people I didn’t know well.”), and generalized anxiety (3 items; e.g., “I was nervous.”) symptoms modified from the Screen for Adult Anxiety Related Emotional Disorders (Angulo et al., 2017; Li et al., 2025) on a 3-point scale (0 =
Depressive Symptoms
Depressive symptoms were measured using 17 items adapted from the Center for Epidemiological Studies Depression scale (Radloff, 1977) in the two baseline surveys. University students indicated how often the statements described them in the past year on a 4-point scale (0 =
Analytic Strategy
Model Estimation
Multilevel confirmatory factor analyses (MLCFAs) using the weighted least squares with mean and variance corrected (WLSMV) estimator (default for ordinal items) were used to examine the structure of the 10-item ICU at the within- and between-person levels. All observed variables were treated as ordinal. First, ICCs for ICU items were calculated to determine the appropriateness of multilevel modeling. Next, models were estimated using data from the first wave of the university student sample. Three structures/models of ICU were tested: a single-factor model (i.e., a single factor at each level), a correlated-factor model with two factors, and a bifactor model with two specific factors. To investigate if ICU potentially exhibits distinct structures at different levels, models with varying structures across levels were also estimated. Certain subpar models were excluded based on their structural validity. The remaining models were retained for estimation in wave 2 of the university student sample and the adolescent sample to examine their replication across samples. All analyses were conducted in M
Structural Validity
Model fit was evaluated against traditional cutoffs (Hu & Bentler, 1999): root mean square error of approximation (RMSEA) < .05, standardized root mean square residual (SRMR) < .08, comparative fit index (CFI) > .90, and Tucker–Lewis Index (TLI) > .90. The CFI, TLI, RMSEA, and SRMR at the within-person level (SRMRw) are not sensitive to between-person level misspecification (Hsu et al., 2015). The SRMR at the between-person level (SRMRb) was therefore used specifically to assess model fit at the between-person level.
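The cutoffs above can be sketched as a simple screening function. This is an illustrative helper, not the authors' code: the function name, argument names, and example values are hypothetical, and applying the same .08 cutoff to SRMRb as to SRMRw is an assumption made here for illustration.

```python
def evaluate_fit(rmsea, srmr_within, srmr_between, cfi, tli):
    """Check fit indices against the conventional cutoffs cited in the text.

    RMSEA, CFI, TLI, and SRMRw mainly reflect within-person fit;
    SRMRb is the index used for between-person fit.
    """
    return {
        "RMSEA < .05": rmsea < 0.05,
        "SRMRw < .08": srmr_within < 0.08,
        "SRMRb < .08": srmr_between < 0.08,  # assumed: same cutoff as SRMRw
        "CFI > .90": cfi > 0.90,
        "TLI > .90": tli > 0.90,
    }

# Hypothetical fit values (not taken from the study's tables).
fit = evaluate_fit(rmsea=0.04, srmr_within=0.03, srmr_between=0.09,
                   cfi=0.95, tli=0.93)
for criterion, passed in fit.items():
    print(f"{criterion}: {'met' if passed else 'not met'}")
```

Keeping SRMRb as a separate entry mirrors the point that the remaining indices would not flag misspecification at the between-person level.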
Factor reliability was evaluated by the following indices: The index of construct replicability (H Index) assesses how well a latent factor can be replicated across studies, with H values > .80 for general factors and .70 for specific factors indicating optimal reliability (Rodriguez et al., 2016). For the correlated-factor model, omega subfactor (ωs) values > .75 are considered good (Revelle & Condon, 2019). In the bifactor model, omega hierarchical (ωh) and omega hierarchical specific (ωhs) indicate the proportion of total score variance specifically attributable to general and specific factors, respectively, with ωh/ωhs > .50 demonstrating acceptable reliability. Explained common variance (ECV) quantifies the percentage of common variance explained by each latent factor, which shows the relative strength of factors and the extent of unidimensionality (Rodriguez et al., 2016).
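As a rough illustration of how these bifactor indices are obtained from standardized loadings, the sketch below follows the standard formulas described by Rodriguez et al. (2016) for a single level, assuming orthogonal factors and each item loading on the general factor and at most one specific factor. The loadings and the factor names "A" and "B" are hypothetical; in a multilevel model the same computation would be repeated separately on the within- and between-person loadings.

```python
def construct_replicability(loadings):
    """H index: 1 / (1 + 1 / sum(l^2 / (1 - l^2)))."""
    ratio_sum = sum(l ** 2 / (1 - l ** 2) for l in loadings)
    return 1 / (1 + 1 / ratio_sum)

def bifactor_indices(general, specific):
    """general: list of general-factor loadings, one per item.
    specific: {factor_name: {item_index: loading}}."""
    spec_by_item = {i: l for items in specific.values() for i, l in items.items()}
    # Item uniqueness = 1 - communality under orthogonal factors.
    uniqueness = [1 - g ** 2 - spec_by_item.get(i, 0.0) ** 2
                  for i, g in enumerate(general)]

    gen_part = sum(general) ** 2
    spec_parts = {name: sum(items.values()) ** 2 for name, items in specific.items()}
    total_var = gen_part + sum(spec_parts.values()) + sum(uniqueness)

    # Omega hierarchical: share of total-score variance due to the general factor.
    omega_h = gen_part / total_var

    # Omega hierarchical specific: share of subscale-score variance due to
    # the specific factor, using only that subscale's items.
    omega_hs = {}
    for name, items in specific.items():
        g_sub = sum(general[i] for i in items) ** 2
        u_sub = sum(uniqueness[i] for i in items)
        omega_hs[name] = spec_parts[name] / (g_sub + spec_parts[name] + u_sub)

    # Explained common variance for each factor.
    common_g = sum(g ** 2 for g in general)
    common_s = {name: sum(l ** 2 for l in items.values())
                for name, items in specific.items()}
    common_total = common_g + sum(common_s.values())
    ecv = {"general": common_g / common_total}
    ecv.update({name: v / common_total for name, v in common_s.items()})

    h = {"general": construct_replicability(general)}
    h.update({name: construct_replicability(list(items.values()))
              for name, items in specific.items()})
    return {"omega_h": omega_h, "omega_hs": omega_hs, "ecv": ecv, "h": h}

# Hypothetical four-item example with two specific factors.
general = [0.5, 0.5, 0.5, 0.5]
specific = {"A": {0: 0.5, 1: 0.5}, "B": {2: 0.5, 3: 0.5}}
result = bifactor_indices(general, specific)
print(result)
```

With these made-up loadings, ωh sits exactly at the .50 benchmark while the specific factors fall below it, which shows how a general factor can fit well yet carry limited reliable variance.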
Longitudinal and Multigroup Measurement Invariance
Longitudinal measurement invariance (MI) was assessed in the university student sample. Unconstrained models were compared with models where factor loadings (metric MI), thresholds (scalar MI), and residual variances (strict MI) were constrained to be equivalent across two waves (Widaman et al., 2010). MI at the within-person level is indicated by a decrease in CFI (ΔCFI) ≤ .01 and a change in RMSEA (ΔRMSEA) ≤ .015 (Khojasteh & Lo, 2015). MI at the between-person level is indicated by a change in SRMRb (ΔSRMRb) ≤ .030 (Khojasteh & Lo, 2015).
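The decision rules for comparing nested invariance models can be sketched as follows. The function and dictionary keys are hypothetical, the fit values are invented, and the sketch assumes that each change is computed as the worsening in fit when moving from the less constrained to the more constrained model.

```python
def invariance_supported(less_constrained, more_constrained):
    """Compare two nested models given dicts with 'cfi', 'rmsea', and 'srmr_b'."""
    delta_cfi = less_constrained["cfi"] - more_constrained["cfi"]         # drop in CFI
    delta_rmsea = more_constrained["rmsea"] - less_constrained["rmsea"]   # rise in RMSEA
    delta_srmr_b = more_constrained["srmr_b"] - less_constrained["srmr_b"]  # rise in SRMRb
    return {
        "within": delta_cfi <= 0.01 and delta_rmsea <= 0.015,
        "between": delta_srmr_b <= 0.030,
    }

# Hypothetical configural vs. metric comparison (not study results).
configural = {"cfi": 0.960, "rmsea": 0.040, "srmr_b": 0.050}
metric = {"cfi": 0.955, "rmsea": 0.045, "srmr_b": 0.095}
decision = invariance_supported(configural, metric)
print(decision)
```

Evaluating the two levels separately allows for outcomes in which invariance is supported at one level but not the other, as in this invented example.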
Multigroup MI was tested between the adolescent sample and wave 1 of the university student sample using the Maximum Likelihood estimation with the robust standard error (MLR) estimator. Since M
Within- and Between-Person Criterion Validity
Factor scores estimated with the Bayesian estimator were used as observed variables to examine criterion validity. At the within-person level, concurrent predictive validity was evaluated by correlating each latent factor with same-day criterion measures. The between-person concurrent validity was examined by incorporating the correlations between latent factors at the between-person level and person-average levels of validity variables, reflecting the associations between the random intercepts of these components across individuals.
Transparency and Openness
This study was not preregistered. The code and output files for all the analyses are publicly available (https://osf.io/rh7mv). Data are not publicly available due to ethics agreements. However, the data required for the analyses performed in the study are available from the corresponding author upon reasonable request.
Results
Descriptive Statistics
All ICU items showed moderate to high ICCs in the university student (wave 1: .48–.64; wave 2: .46–.62) and adolescent (.56–.77) samples (Table 1). This indicates that approximately 46% to 77% of the variation in ICU items occurred at the between-person level, while the remaining variation can be attributed to within-person fluctuations over days.
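For readers less familiar with the ICC, a minimal sketch of the underlying variance partition is given below. It uses the one-way ANOVA estimator on observed scores with a balanced design, which is a simplification; the ICCs reported here for ordinal items may have been estimated differently by the modeling software. The function name and data are hypothetical.

```python
def icc_one_way(scores_by_person):
    """ICC(1) from a one-way ANOVA; requires the same number of days per person."""
    groups = list(scores_by_person.values())
    n = len(groups)          # number of persons
    k = len(groups[0])       # number of days per person
    if any(len(g) != k for g in groups):
        raise ValueError("This sketch assumes a balanced design.")

    grand_mean = sum(sum(g) for g in groups) / (n * k)
    person_means = [sum(g) / k for g in groups]

    # Mean squares between and within persons.
    ms_between = k * sum((m - grand_mean) ** 2 for m in person_means) / (n - 1)
    ms_within = sum((x - m) ** 2
                    for g, m in zip(groups, person_means)
                    for x in g) / (n * (k - 1))

    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

# Hypothetical ratings for two people over three days (not study data).
example = {"p1": [1, 2, 3], "p2": [4, 5, 6]}
icc = icc_one_way(example)
print(f"ICC = {icc:.2f}")
```

An ICC near .80, as in this toy example, would mean that most of the variation lies between people, leaving roughly 20% for day-to-day fluctuation within people.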
Intraclass Correlations and Frequencies of ICU Items.
Structural Validity
University Student Sample
All three structures/models were fully crossed to enumerate all possible combined structures across levels in wave 1 of the university student sample. Fit indices (Table 2) reveal that models with the single-factor model at either within- or between-person level showed unacceptable model fit. The other models, comprising bifactor and correlated-factor models, demonstrated acceptable-to-good fit. Generally, the bifactor models fit the data better than the correlated-factor model at both within- (higher CFI and TLI, and lower RMSEA and SRMRw) and between- (lower SRMRb) person levels. Based on these results, we proceeded with only the bifactor and correlated-factor models to examine replication in wave 2 of the university student sample and the adolescent sample. All four models estimated with wave 2 data demonstrated acceptable-to-good fit (Table 2). The bifactor models provided a better fit to the data than the correlated-factor models at both the within- and between-person levels.
Fit Indices for the Multilevel Confirmatory Factor Analyses for Daily CU Traits.
In the correlated-factor models (Table 3), all factor loadings were positive and significant, with all indicators showing loadings ≥ .35 at both levels. For bifactor models, the specific factors were strongly indicated by the items at both within- (wave 1 & 2 median λw = .61 and .57) and between- (wave 1 & 2 median λb = .83 and .79) person levels. Loadings on the general factors were relatively lower at both within- (wave 1 and 2 median λw = .26 and .34) and between- (wave 1 and 2 median λb = .45 and .48) person levels, with two items showing non-significant factor loadings at the between-person level in each wave.
Standardized Factor Loadings in the Models.
Overall, factor reliability was greater at the between-person level than at the within-person level (Table 4). The correlated-factor models were reliable and well-defined at both levels across waves, except for the
Factor Reliability Indices.
Adolescent Sample
Consistent with the university student sample, all four models among adolescents showed acceptable-to-good fit (Table 2). In the correlated-factor model, all factor loadings were positive and significant, with only one item in the
The reliability indices of the correlated-factor models in the adolescent sample showed a generally consistent pattern with those observed in the university student sample (Table 4). In the bifactor models, the specific factors generally showed unacceptable reliability (i.e., ωhs < .50), with the exception of the
Longitudinal and Multigroup Measurement Invariance
As shown in Table 5, both the correlated-factor and bifactor models achieved longitudinal metric MI at the within-person level (i.e., ∆CFI ≤ .01 and ∆RMSEA ≤ .015), as well as metric, scalar, and strict MI at the between-person level (i.e., ∆SRMRb ≤ .030). The correlated-factor model revealed no significant difference between the latent means of the two factors over time. The bifactor model indicated a slight increase in the general factor from wave 1 to wave 2 (
Longitudinal Measurement Invariance Tests Among the University Student Sample.
As shown in Table 6, the correlated-factor model demonstrated full MI across the adolescent sample and wave 1 of the university student sample. University students exhibited a higher latent mean of
Multigroup Measurement Invariance Tests Among the Adolescent Sample and the First Wave of the University Student Sample.
Within- and Between-Person Criterion Validity
University Student Sample
In both waves (Table 7), the
Criterion Validity Tests at the Within- and Between-Person Level.
In the bifactor model, the general factor was correlated with almost all validity variables at the within-person level (wave 1:
Regarding prospective correlations, after controlling for wave 1 corresponding validity variable, in the correlated-factor model, wave 1
Adolescent Sample
The
Discussion
This study examined the within- and between-person factor structure of CU traits in daily contexts using two independent samples with month-long daily diary designs. The results indicated that both the bifactor and correlated-factor models demonstrated acceptable fit at the within- and between-person levels, though the general factor in the bifactor model in the university student sample showed low reliability and replicability. Longitudinal MI was observed within the university sample over a 2.5-year span, while structural differences emerged between adolescents and university students. At both levels, the general factor and the
Corroborating previous studies conducted at the between-person level (Byrd et al., 2013; Ciucci et al., 2014; Wang et al., 2020; Y. Zheng et al., 2021), conventional fit indices suggest a preference for the bifactor model at both within- and between-person levels. These results warrant cautious interpretation, as these fit indices tend to favor models with greater flexibility (Reise et al., 2016). In the university student sample, the general factors explained < 35% of the total variance at both levels and showed low reliability (ωh < .50) and replicability (H index < .70), suggesting that this general factor may primarily reflect absorbed measurement error rather than a true latent factor (Rodriguez et al., 2016). These results indicate that there is no reliable general factor at either within- or between-person level in this sample. In contrast, the general factor showed acceptable reliability and replicability in the adolescent sample, particularly at the between-person level.
The measurement non-invariance across the two groups further emphasizes that the bifactor model, particularly at the between-person level, is not equivalent across different age groups. This discrepancy aligns with the abovementioned findings regarding divergent psychometric validity across the two samples and suggests that the factor structure of CU traits may change substantially from adolescence to young adulthood. Using a bifactor model may be more appropriate for adolescents. The unidimensionality observed at the between-person level suggests that future studies using ILDs in adolescents could reasonably rely on the general factor of CU traits (Ray & Frick, 2020; Reise et al., 2016; Rodriguez et al., 2016). This finding aligns with a previous meta-analysis, which demonstrated that the reliable variance in the total score of CU traits was largely determined by the general factor in the bifactor models, thereby recommending simply using the total score rather than subscale scores in future studies (Ray & Frick, 2020). Nonetheless, at the within-person level, both the general and specific factors should be considered for a more comprehensive understanding of CU traits in adolescents. If the research specifically focuses on the two subfactors, the correlated-factor model can also be applied to generate separate subfactor scores. In the young adult sample, the unreliable general factors indicate that items in the
Consistent with findings from between-person level studies (e.g., Hawes et al., 2014), CU traits were consistently associated with conduct problems at the within-person level in both samples. The negative associations between the
Taken together, these findings offer several important implications for clinical practice and future research. First, the substantial daily fluctuations observed in CU traits highlight their “state” feature. These results support the notion that, similar to personality traits (Fleeson, 2004; Soto & Tackett, 2015; Wright & Simms, 2016), CU traits are not “fixed” individual characteristics but exhibit meaningful dynamics and variability (Fleming et al., 2022; Goulter et al., 2024; Schuberth et al., 2019; Waller & Hyde, 2017). Incorporating intensive assessments within daily contexts may hence provide a more ecologically valid understanding of how CU traits manifest in everyday life. Second, the findings demonstrate that the latent structure of CU traits can diverge across levels. For instance, the unidimensional structure observed at the between- but not the within-person level among adolescents indicates that while CU traits items tend to co-occur across individuals (i.e., individuals who score higher than others on one item also tend to score higher on other items), they do not necessarily fluctuate together within individuals on a day-to-day basis (Li et al., 2025; H. Zheng & Zheng, 2025). These results highlight the importance of differentiating between within- and between-person structures in the assessments of CU traits. Within-person level structures can be used to track and monitor CU traits in daily contexts over time and understand their short-term antecedents and outcomes on a micro timescale. In contrast, between-person level structures are informative for ranking or comparing the level of CU traits across individuals. Third, these insights could inform interventions to treat CU traits as modifiable characteristics in adolescents and young adults. Adopting a micro timescale approach could enable context-sensitive intervention strategies to manage daily variations in CU traits (Y. Zheng & Goulter, 2024) and, by extension, decrease the likelihood of their progression to more severe externalizing problems. In addition, the measurement non-invariance in factor structures across adolescent and university student samples may suggest that CU traits reorganize during the transition from adolescence to young adulthood. In adolescents, targeting general CU traits through broad-based treatments may be effective, whereas in young adults, tailoring interventions to more specific components such as
Strengths, Limitations, and Future Directions
This study has several notable strengths. Previous research has examined CU traits structure primarily using cross-sectional or conventional longitudinal designs with long time intervals and focused on between-person analyses. This study addressed these limitations by conducting month-long daily diary designs to explore the CU traits structure at both within- and between-person levels. The findings confirm meaningful daily variations in CU traits, which align with emerging research that emphasizes a micro-level approach to better capture psychopathology symptom variability and dynamic links with contextual factors and behavioral outcomes (Thunnissen et al., 2022; Walz et al., 2014). Moreover, this study replicated findings across two independent samples and identified age-related factor structure differences, as well as stability within young adults over 2.5 years, which highlights both developmental continuity and discontinuity in the manifestation of CU traits.
Despite these strengths, several limitations warrant consideration. First, the current study used data from two community samples with relatively low endorsement of items, reflecting limited levels of severity. Replications with high-risk samples (e.g., incarcerated and clinical) could help validate and extend these findings to populations with elevated CU traits (Fontaine et al., 2023; Kemp et al., 2024; Y. Zheng et al., 2021). In addition, the sex distribution in the current study is unbalanced, with over 70% of the participants in the university student sample identifying as female. This may limit the generalizability of the findings, particularly regarding potential sex differences in the expression and fluctuation of CU traits. Future research should replicate these findings in more sex-balanced and diverse samples to enhance generalizability. Second, this study relied exclusively on self-reports. Different informants (e.g., self- vs. parent-reports) may influence psychometric properties of the ICU (Cardinale & Marsh, 2020; Deng et al., 2019; Wang et al., 2020). Future studies should incorporate multi-informants to examine the robustness of within- and between-person factor structures. In addition, although we confirmed sufficient within-person variability in these ICU items, the scale was originally developed for trait-level assessment. Some items may still reflect retrospective evaluations rather than context-sensitive behaviors. Future research could consider developing or validating CU trait measures specifically designed for ILDs. Third, it remains possible that the factor structure of CU traits is partly driven by method variance (Hawes et al., 2014; Ray & Frick, 2020), as all items in the
Conclusion
Daily CU traits exhibit meaningful within-person fluctuations in adolescents and young adults, highlighting their dynamic nature in daily contexts. The manifestation of CU traits appears to change across different developmental periods. In adolescents, both bifactor and correlated-factor models can be used at different levels depending on specific research purposes. Among young adults, the
Footnotes
Acknowledgements
The authors gratefully acknowledge all the participants, research assistants, Elk Island and St. Albert public schools, and the following organizations at the University of Alberta for their support: International Student Services, English for Academic Purposes program, New Chinese Generation, Chinese Students and Scholars Association, iGeek, Undergraduate Research Initiative, China Institute, East Asian Studies Undergraduate Students Association, and Taiwanese Student Association. Study data were collected and managed using REDCap electronic data capture tools hosted and supported by the Women and Children’s Health Research Institute at the University of Alberta.
Data Availability Statement
Research data are not publicly available due to ethics agreements. However, the data required for the analyses performed in the study are available from the corresponding author upon reasonable request. This study was not preregistered. To promote transparency and openness, the code for all the analyses is publicly available at https://osf.io/rh7mv.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported partly with funding from the China Institute at the University of Alberta, the Social Sciences and Humanities Research Council (IDG 430-2018-00317 and 409-2020-00080) and Natural Sciences and Engineering Research Council (RGPIN-2020-04458 and DGECR-2020-00077) of Canada, and a Killam Research Fund Cornerstone Grant. HZ was supported by a Mitacs Accelerate Grant (IT 18227) awarded to YZ, the Ivy A Thomson and William A Thomson Scholarship, and the Women and Children’s Health Research Institute Graduate Studentship.
