Abstract

Intensive longitudinal designs have been used to examine the fluctuations of callous-unemotional (CU) traits and their dynamic links with daily correlates; however, scant research has explored how CU traits manifest in daily contexts at the within-person level. This study evaluated the multilevel factor structure and psychometric properties of a short version of the Inventory of CU Traits in daily contexts among adolescents (n = 99, 2,132 daily reports) and young adults (n = 313, 6,431, and 4,018 daily reports at each wave). Both bifactor and correlated-factor models showed acceptable fit and reliability at the within- and between-person levels, though the general factor in the bifactor model demonstrated low reliability in university students. Longitudinal measurement invariance was supported among university students over a 2.5-year period, while structural differences emerged between the two samples. Findings highlight meaningful within-person fluctuations in daily CU traits. Future studies should evaluate the applicability of different factor models for a more accurate assessment across age groups.

Keywords

Callous-unemotional traits, daily diary, intensive longitudinal design, multilevel confirmatory factor analysis

Callous-unemotional (CU) traits are characterized by shallow affect, lack of remorse and empathy, and lack of concern or interest in doing things well (Frick et al., 2014a, 2014b). A large body of research has shown that CU traits in childhood and adolescence are concurrently and prospectively associated with conduct problems (Colins et al., 2016), aggressive behaviors (Ciucci et al., 2014), antisocial behaviors (Cardinale & Marsh, 2020), risky sexual behaviors (Anderson et al., 2017), and substance use (Anderson et al., 2018), underscoring the significance of establishing valid and reliable tools to assess CU traits for both clinical and research purposes. CU traits are not immutable over time (Perlstein et al., 2023) and show substantial fluctuations (Goulter et al., 2024). Recent research has started to use intensive longitudinal designs (ILDs) to examine their fluctuations on a micro timescale and their dynamic links with daily antecedents and correlates (Goulter et al., 2024; Y. Zheng et al., 2025), offering a unique insight into elucidating the within-person variability of CU traits.

Expanded from the original four items from the Antisocial Process Screening Device, the 24-item Inventory of Callous-Unemotional Traits (ICU; self-, parent-, and teacher-report versions; Frick, 2004) was developed and has since become one of the most comprehensive and widely used measurement tools of CU traits among different age groups spanning from children to adults (e.g., Docherty et al., 2024; Kemp et al., 2024; Y. Zheng et al., 2021). Subsequent factor analyses primarily tested three structures/models: (a) a single-factor model, with all items loading on a single CU factor; (b) a correlated-factor model, with each item loading on one of three intercorrelated factors: Callousness (reduced empathic responding; e.g., “I did not care who I hurt to get what I want”), Uncaring (lack of concern about performance and relationships; e.g., “I work hard on everything I do,” reverse scored), and Unemotional (impoverished emotional experience and expression; e.g., “I did not show my emotions to others”; Byrd et al., 2013; Ray & Frick, 2020); and (c) a bifactor model, with each item loading on both an overarching CU factor and one of the three aforementioned specific factors that are independent from each other. The bifactor model aligns with how CU traits are conceptualized in the Diagnostic and Statistical Manual of Mental Disorders (5th ed.; DSM-5; American Psychiatric Association [APA], 2013) as a specifier for Conduct Disorder (i.e., “with Limited Prosocial Emotions”). To qualify for this specifier, an individual must display at least two of four traits: lack of remorse or guilt, callousness/lack of empathy, unconcern about performance, and shallow or deficient affect. This framework implies a common underlying disposition while allowing for potential heterogeneity in its manifestation, thereby supporting the use of a bifactor model. In such a model, all items load onto a general CU traits factor, reflecting their shared variance, while also allowing specific factors to retain unique variance. That is, although these items are thought to reflect a unified construct with shared causal or etiological processes, they may not always co-occur perfectly. The bifactor model accommodates this by modeling both the shared and distinct components of CU trait dimensions (Watts et al., 2024). Most studies also revealed that the bifactor model provides the best model fit across different age groups (e.g., adolescents aged 10–16 years, Ciucci et al., 2014; young adults, Byrd et al., 2013), although recent psychometric literature has called for caution in the application and interpretation of the superior performance of bifactor models on fit indices, which may be partly attributed to their greater flexibility to accommodate complexity (Reise et al., 2016).

Although the Unemotional factor captures a core aspect of CU traits—namely, shallow or restricted emotional expression (Byrd et al., 2013; Ray & Frick, 2020)—research using the ICU has often reported weak-to-marginal internal consistency for this factor, along with its modest relations with the other two factors and criterion variables (e.g., antisocial behaviors; see Cardinale & Marsh, 2020; Deng et al., 2019 for systematic reviews). Many studies have therefore excluded some or all items from the Unemotional factor, leading to the development of shortened ICU versions. For instance, Hawes et al. (2014) developed a 12-item parent-report version (ICU-12) among boys aged 6–12 years, which removed all but one item from the Unemotional subscale based on item-total correlations and the Item Response Theory (IRT) analysis. Further studies have suggested that an 11-item version (ICU-11), which excluded the only remaining Unemotional item from ICU-12, achieved a better fit across self- (detained females aged 12–17 years, Colins et al., 2016; male adult offenders, Y. Zheng et al., 2021), parent-, and teacher-report (children in Grades 1 to 6, Wang et al., 2020) versions. This revised scale has demonstrated measurement invariance across informants, detained status (e.g., incarcerated vs. community samples), and cultures (e.g., Allen et al., 2021; Corbelli et al., 2024; Wang et al., 2020; Y. Zheng et al., 2021). Nonetheless, it is important to note that the Unemotional factor still carries important theoretical and empirical relevance to the construct of CU traits, especially in calculating an overall CU traits score (Byrd et al., 2013; Kimonis et al., 2008; Ray & Frick, 2020).

During adolescence and young adulthood, psychopathology symptoms arise and manifest primarily in various situations, events, or relationships (Thunnissen et al., 2022; Walz et al., 2014). To investigate how these symptoms fluctuate in the daily contexts where they manifest while minimizing recall bias (Sliwinski, 2011; Y. Zheng & Goulter, 2024), a burgeoning number of studies have employed ILDs to understand day-to-day variability of psychopathology symptoms on a micro timescale as well as their associations with contextual factors and treatment outcomes. Although the term CU “traits” may seemingly suggest relatively stable individual characteristics among adolescents and young adults, research has indicated that these traits are not immutable and are sensitive to environmental factors including treatments (Docherty et al., 2024; Fleming et al., 2022; Frick et al., 2014a; Perlstein et al., 2023). This view aligns with contemporary personality theory, which recognizes that traits reflect stable patterns of thinking, feeling, and behaving, but may still fluctuate meaningfully across time and context (Fleeson, 2004; Soto & Tackett, 2015; Wright & Simms, 2016). In light of these views, some scholars have recommended alternative terms, such as CU behaviors, features, or symptoms (Schuberth et al., 2019; Waller & Hyde, 2017), to emphasize that CU traits are not fixed and may exhibit within-person fluctuations. To the best of our knowledge, however, only two recent studies have examined CU traits in daily contexts using ILDs. Following 99 adolescents for 30 days, Goulter et al. (2024) found that CU traits demonstrated substantial within-person fluctuations at both the item level (intraclass correlation [ICC] = .56–.78) and the subscale level (ICC = .60–.66), as well as cross-day associations with positive affect and conduct problems at the within-person level. The other study revealed that daily parental inconsistent discipline was positively associated with adolescents’ CU traits on the next day (Y. Zheng et al., 2025).

These diary studies measured CU traits by directly using ICU-11/12, as brief instruments are necessary for ILDs to minimize participant burden and accommodate time limits. However, the original ICU and its short versions have only been evaluated psychometrically with cross-sectional or conventional longitudinal (e.g., multi-year) designs (e.g., Byrd et al., 2013; Kemp et al., 2024). The identified (sub)factors are based on between-person level structures, which indicate whether people exhibit general or specific CU traits that are similar to or different from others. For example, a between-person factor implies that individuals who score higher than others on one item also tend to score higher than others on additional items within the same factor. In contrast, a within-person factor structure reflects how items co-fluctuate over time within the same individual (Li et al., 2025; H. Zheng & Zheng, 2025). For instance, a within-person factor suggests that when an individual scores higher than their own average on one item at one time, they are also likely to score higher than their average on other items within the same factor at the same time. Recent studies on positive and negative affect (Cooke et al., 2022) and externalizing and internalizing psychopathology (H. Zheng & Zheng, 2025) have shown that the between-person level structure does not always translate to the within-person level. Therefore, to accurately capture within-person fluctuations in CU traits, it is crucial to first establish the multilevel structure of CU traits measures while properly disentangling variance at the within- and between-person levels.

The secondary CU traits framework posits that, in contrast to primary CU traits, which stem from temperamental or genetically based emotional deficits, secondary CU traits are primarily shaped by adverse contextual factors, such as interpersonal stress and trauma (Craig et al., 2021; Schuberth et al., 2019). During the transition from adolescence to young adulthood, individuals often face challenges such as leaving the family of origin, shifts in financial dependence, and the need to establish new social ties (H. Zheng & Zheng, 2024). This heightened stress may lead some young adults to develop a “mask” of emotional detachment and callousness as an adaptive strategy to cope with stress (Craig et al., 2021). As a result, the patterns of item endorsement and the latent structure of CU traits may shift during this transition period. Evaluating whether the ICU functions equivalently across developmental stages is important to ensure that the observed changes in CU traits reflect actual developmental processes rather than measurement non-invariance.

Previous findings on the links between CU traits and psychopathology outcomes further suggest the existence of a potential secondary variant of CU traits and highlight the importance of distinguishing the factor structures of CU traits across different levels. While the total CU scores and all subfactor scores have consistently exhibited positive associations with externalizing problems (e.g., conduct problems; Cardinale & Marsh, 2020), their associations with internalizing problems are less consistent. Some studies reported that total CU and Uncaring scores were negatively linked with internalizing symptoms (e.g., depressive and anxiety symptoms; Colins et al., 2016; Hawes et al., 2014; Waller et al., 2015), whereas a meta-analysis found a slightly positive association (Cardinale & Marsh, 2020). One potential explanation for these mixed findings is that CU traits may exhibit divergent associations with internalizing symptoms over short- versus long-term periods and/or at different levels. Internalizing problems demonstrate substantial and meaningful daily within-person fluctuations (Walz et al., 2014; H. Zheng & Zheng, 2025). In the short term, emotional numbing and a lack of empathy may serve as a coping mechanism to help people detach from traumatic or distressing situations, creating temporary adaptive effects on mitigating the manifestation of internalizing symptoms (Craig et al., 2021). These short-term adaptive effects, however, may cumulatively lead to adverse outcomes in the long term. It remains to be explored how daily fluctuations in CU traits, when disentangled from between-person differences, are associated with internalizing and externalizing problems at the within-person level. Such an investigation could also help determine whether daily fluctuations in CU traits reflect meaningful variations and clarify whether factor structures across levels represent similar or distinct constructs.

The Present Study

The psychometric properties of the ICU have typically been evaluated in cross-sectional or conventional longitudinal designs at the between-person level. Catering to the burgeoning research investigating the daily fluctuations in CU traits and more broadly psychopathology symptoms using ILDs, this study aims to explore the factor structures and psychometric properties of the ICU at both within- and between-person levels in daily diary studies. The present study used two independent samples (one among adolescents and one among young adults), each with a month-long daily diary design, to evaluate CU trait structures previously identified and supported in the literature. A comprehensive set of standards was used to compare these models, including model fit, factor reliability and properties, longitudinal measurement invariance, as well as within- and between-person level criterion validities. We also explored whether the multilevel structure of CU traits remains invariant between adolescents and young adults. Based on previous studies, we expected that the bifactor model would show the best performance (better model fit and higher factor reliability) at the between-person level. Given the scarce literature on the CU traits structure at the within-person level, especially using daily diary designs, an exploratory approach was taken for within-person structures and measurement invariance, and we opted not to make any specific hypothesis. Regarding criterion validities, it was expected that CU traits factor scores would be positively associated with externalizing problems at both within- and between-person levels. In contrast, the associations between CU traits scores and internalizing problems were expected to be negative at the within-person level, but positive at the between-person level.

Method

Participants and Procedures

This study used data from two independent community-based samples of adolescents and young adults. The research procedure and instruments for both datasets received approval from the research ethics committee at the University of Alberta. Survey instruments were developed and administered using REDCap (Harris et al., 2019). We report all data exclusions, all manipulations, and all measures used in the study. Since this study involved analyses of existing datasets rather than collecting new data, determining sample size was not applicable.

University Student Sample

An initial sample of 313 Canadian university freshmen (M_age = 18.1 years, SD = 1.31, range 17–29, 72% female) completed a baseline survey and at least one day of a 30-day daily diary study (6,431 total observations, M = 21.43 days, SD = 9.65, 72.5% ≥ 20 days) between September and December 2019 in wave 1. Freshmen self-identified as Asian (53%), White (30.3%), Black (5%), Multiracial (4.7%), Native (0.7%), Latino or Hispanic (1%), and Others (5.3%). Two and a half years later, 194 (64% retention rate) participants took part in the baseline survey of wave 2 and at least one day of the 30-day daily diary study (4,018 total observations, M = 21.0 days, SD = 8.84, 68.0% ≥ 20 days) between February and May 2022. Participants who were retained across waves tended to be slightly younger (18.0 ± 0.7 vs. 18.5 ± 1.9, t[310] = 3.58, p < .001), but did not differ in sex, ethnicity-race, or parental education. Participants who did not participate in wave 2 showed slightly higher person-average means of the total CU traits (0.79 ± 0.37 vs. 0.70 ± 0.34, t[297] = 2.24, p = .013) and the Callousness factor (0.40 ± 0.45 vs. 0.29 ± 0.30, t[297] = 2.59, p = .010) in wave 1 than those who remained in the study, but did not differ in the Uncaring factor.

Participants were recruited from a large Western Canadian university through online advertisements, on-campus posters, and short in-class presentations. For wave 1, all first-year undergraduate students were eligible for inclusion, while wave 2 was limited to those who had completed at least the baseline survey in wave 1. In both waves, participants first completed a baseline survey after providing informed consent online, and participated in daily surveys for 30 consecutive days. Daily surveys were sent by email at 7 pm each night and participants were asked to complete the survey before going to sleep that night. Participants received a $60 and $75 e-gift card as compensation in each wave, respectively (see Cooke et al., 2022; H. Zheng & Zheng, 2025 for more information on recruitment procedures).

Adolescent Sample

A total of 99 Canadian adolescents (M_age = 14.60 years, SD = 1.76, range 12–17, 55.8% female) participated in a 30-day daily diary study (2,132 total observations, M = 21.6 days, SD = 7.80, 77.8% ≥ 20 days) between April 2019 and September 2020. Participants self-identified as 51.5% White, 23.2% Asian, 8.1% multiracial, 4.0% Latinx or Hispanic, 2.0% Black, 6.1% other, and 5.1% missing. Participants were recruited through newsletters, social media, and flyers posted or circulated in a Western Canadian province. Participants received an online baseline survey after providing consent/assent online, and an online daily survey for 30 consecutive days starting 5 days after the completion of the baseline survey. The daily survey was sent out at 5 pm, and adolescents were asked to fill out the survey before bed that night. Adolescents received a $45 e-gift card as compensation for their participation. Parental consent and adolescent assent were obtained prior to commencing the study (see H. Zheng et al., 2023 for more information on recruitment procedures).

Measures

Daily Callous-Unemotional Traits

Daily CU traits were assessed with an 11-item shortened version of the Inventory of Callous-Unemotional Traits (e.g., “I do not care if I get into trouble.” Colins et al., 2016; Frick, 2004; Wang et al., 2020) in the daily survey. One item deemed as infrequently occurring in the daily lives of adolescents and young adults and with low within-person fluctuations (i.e., “I apologize to persons I hurt”) was excluded. Therefore, the final measure included 10 items comprising two subscales: Callousness (6 items) and Uncaring (4 items). Participants reported the extent to which they felt the way described by these items on that day on a 4-point scale from 0 (not at all true) to 3 (definitely true). Reverse-worded items were recoded before analyses. Higher scores indicate higher levels of CU traits.

Criterion Validity Measures

Daily Emotional and Conduct Problems

In the daily survey, emotional problems were measured with five items from the emotional problem subscale (e.g., “I have many fears.” “I worry a lot.”) of the Strengths and Difficulties Questionnaire (SDQ; Goodman et al., 1998) validated in daily diary research (H. Zheng & Zheng, 2025). Participants indicated how each item applied to them on that day on a 3-point scale (0 = not true, 1 = somewhat true, 2 = certainly true). Conduct problems were measured with five items from the conduct problem subscale for adolescents and three items (“I got very angry and lost my temper,” “I was generally willing to do what other people want,” and “I fight a lot.”) for university students. Scores were averaged with higher scores representing more emotional (university students wave 1: ordinal ω_w = .61, ω_b = .90; wave 2: ordinal ω_w = .63, ω_b = .86; adolescents: ordinal ω_w = .73, ω_b = .94) and conduct problems (university students wave 1: ordinal ω_w = .72, ω_b = .90; wave 2: ordinal ω_w = .70, ω_b = .87; adolescents: ordinal ω_w = .65, ω_b = .92), respectively.

Daily Anxiety Symptoms

University students reported their daily panic disorder (3 items; e.g., “When I got frightened, my heart beats fast.”), social (2 items; e.g., “I felt nervous with people I didn’t know well.”), and generalized anxiety (3 items; e.g., “I was nervous.”) symptoms modified from the Screen for Adult Anxiety Related Emotional Disorders (Angulo et al., 2017; Li et al., 2025) on a 3-point scale (0 = not true, 1 = somewhat true, 2 = very true) in wave 1. In wave 2, only panic disorder and social anxiety symptoms were assessed. Scores were averaged with higher scores reflecting more panic disorder (wave 1: ordinal ω_w = .56, ω_b = .87; wave 2: ordinal ω_w = .65, ω_b = .82), social (correlations between two items in wave 1: r_w = .48, r_b = .76; wave 2: r_w = .48, r_b = .80), and generalized anxiety (wave 1: ordinal ω_w = .57, ω_b = .92) symptoms. This measure was not implemented in the adolescent sample.

Depressive Symptoms

Depressive symptoms were measured using 17 items adapted from the Center for Epidemiological Studies Depression scale (Radloff, 1977) in the two baseline surveys. University students indicated how often the statements described them in the past year on a 4-point scale (0 = rarely or none, 1 = some or a little of the time, 2 = occasionally or a moderate amount of time, 3 = most or all of the time). Items were averaged with a higher score indicating higher levels of depressive symptoms (ordinal ωs = .90–.91). This measure was not implemented in the adolescent sample.

Analytic Strategy

Model Estimation

Multilevel confirmatory factor analyses (MLCFAs) using the weighted least squares with mean and variance corrected (WLSMV) estimator (default for ordinal items) were used to examine the structure of the 10-item ICU at the within- and between-person levels. All observed variables were treated as ordinal. First, ICCs for ICU items were calculated to determine the appropriateness of multilevel modeling. Next, models were estimated using data from the first wave of the university student sample. Three structures/models of ICU were tested: a single-factor model (i.e., a single factor at each level), a correlated-factor model with two factors, and a bifactor model with two specific factors. To investigate if ICU potentially exhibits distinct structures at different levels, models with varying structures across levels were also estimated. Certain subpar models were excluded based on their structural validity. The remaining models were retained for estimation in wave 2 of the university student sample and the adolescent sample to examine their replication across samples. All analyses were conducted in Mplus 8.3 (Muthén & Muthén, 1998–2017).

Structural Validity

Model fit was evaluated with traditional fit indices and cutoffs (Hu & Bentler, 1999): root mean square error of approximation (RMSEA) < .05, standardized root mean square residual (SRMR) < .08, comparative fit index (CFI) > .90, and Tucker–Lewis Index (TLI) > .90. Because the CFI, TLI, RMSEA, and SRMR at the within-person level (SRMR_w) are not sensitive to between-person level misspecification (Hsu et al., 2015), the SRMR at the between-person level (SRMR_b) was specifically used to assess model fit at the between-person level.
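As a rough illustration of how these cutoffs can be applied, the sketch below checks three wave 1 rows copied from Table 2 against them. The function name is hypothetical, and applying the general SRMR cutoff to SRMR_b is an assumption of this sketch; the overall fit judgments reported in the Results may weigh the indices jointly rather than as strict pass/fail rules.

```python
# Cutoffs as stated in the text (Hu & Bentler, 1999).
# Assumption: the same SRMR cutoff is applied to SRMR_w and SRMR_b.
CUTOFFS = {"rmsea": 0.05, "srmr": 0.08, "cfi": 0.90, "tli": 0.90}


def failed_criteria(rmsea, srmr_w, srmr_b, cfi, tli):
    """Return the names of the fit criteria a model fails (empty list = all met)."""
    failed = []
    if not rmsea < CUTOFFS["rmsea"]:
        failed.append("RMSEA")
    if not srmr_w < CUTOFFS["srmr"]:
        failed.append("SRMR_w")
    if not srmr_b < CUTOFFS["srmr"]:
        failed.append("SRMR_b")
    if not cfi > CUTOFFS["cfi"]:
        failed.append("CFI")
    if not tli > CUTOFFS["tli"]:
        failed.append("TLI")
    return failed


# Wave 1 rows from Table 2, named as (within structure)_(between structure).
cf_cf = failed_criteria(rmsea=.037, srmr_w=.051, srmr_b=.046, cfi=.949, tli=.933)
bifactor_single = failed_criteria(rmsea=.049, srmr_w=.035, srmr_b=.200, cfi=.922, tli=.883)
single_single = failed_criteria(rmsea=.116, srmr_w=.178, srmr_b=.200, cfi=.500, tli=.357)

print("CF/CF:", cf_cf)
print("Bifactor/Single:", bifactor_single)
print("Single/Single:", single_single)
```

Under these assumptions, the model with a single factor only at the between-person level is flagged by SRMR_b (and TLI) even though its RMSEA and CFI look acceptable, which is the reason a level-specific index is needed.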

Factor reliability was evaluated with the following indices. The index of construct replicability (H index) assesses how well a latent factor can be replicated across studies, with H values > .80 for general factors and > .70 for specific factors indicating optimal reliability (Rodriguez et al., 2016). For the correlated-factor model, omega subfactor (ω_s) values > .75 are considered good (Revelle & Condon, 2019). In the bifactor model, omega hierarchical (ω_h) and omega hierarchical specific (ω_hs) indicate the proportion of total score variance specifically attributable to general and specific factors, respectively, with ω_h/ω_hs > .50 demonstrating acceptable reliability. Explained common variance (ECV) quantifies the percentage of common variance explained by each latent factor, which shows the relative strength of factors and the extent of unidimensionality (Rodriguez et al., 2016).
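The bifactor indices described above can be sketched from standardized loadings as follows. This is a minimal illustration using the commonly cited formulas for an orthogonal bifactor model (as summarized in Rodriguez et al., 2016), not the authors' Mplus-based computation; the function names and the loadings are hypothetical, and the sketch ignores the multilevel and ordinal aspects of the actual models.

```python
def construct_replicability(loadings):
    """H index: 1 / (1 + 1 / sum(l^2 / (1 - l^2))) over a factor's loadings."""
    ratio = sum(l ** 2 / (1 - l ** 2) for l in loadings)
    return 1 / (1 + 1 / ratio)


def bifactor_indices(general, specific):
    """general: {item: loading}; specific: {factor: {item: loading}}.

    Assumes standardized loadings, orthogonal factors, and each item
    belonging to exactly one specific factor.
    """
    spec_of = {i: l for items in specific.values() for i, l in items.items()}
    # Residual variance of each standardized item.
    resid = {i: 1 - general[i] ** 2 - spec_of[i] ** 2 for i in general}

    gen_part = sum(general.values()) ** 2
    spec_parts = {f: sum(items.values()) ** 2 for f, items in specific.items()}
    total_var = gen_part + sum(spec_parts.values()) + sum(resid.values())

    omega_hs = {}
    for f, items in specific.items():
        sub_gen = sum(general[i] for i in items) ** 2
        sub_var = sub_gen + spec_parts[f] + sum(resid[i] for i in items)
        omega_hs[f] = spec_parts[f] / sub_var

    gen_ss = sum(l ** 2 for l in general.values())
    spec_ss = sum(l ** 2 for l in spec_of.values())
    return {
        "omega_h": gen_part / total_var,
        "omega_hs": omega_hs,
        "ecv_general": gen_ss / (gen_ss + spec_ss),
    }


# Hypothetical loadings for four items and two specific factors.
general = {"a": .5, "b": .5, "c": .5, "d": .5}
specific = {"S1": {"a": .5, "b": .5}, "S2": {"c": .5, "d": .5}}

indices = bifactor_indices(general, specific)
h_general = construct_replicability(general.values())
print(indices, round(h_general, 3))
```

With equal general and specific loadings, the general factor accounts for half of the common variance in this toy case, so ω_h sits exactly at the .50 threshold while each ω_hs falls below it.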

Longitudinal and Multigroup Measurement Invariance

Longitudinal measurement invariance (MI) was assessed in the university student sample. Unconstrained models were compared with models where factor loadings (metric MI), thresholds (scalar MI), and residual variances (strict MI) were constrained to be equivalent across the two waves (Widaman et al., 2010). MI at the within-person level is indicated by a decrease in CFI ≤ .01 and a change in RMSEA ≤ .015 (Khojasteh & Lo, 2015). MI at the between-person level is indicated by a change in SRMR_b ≤ .030 (Khojasteh & Lo, 2015).
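The decision rule for each invariance step can be written compactly. The sketch below is an illustration only: the function names and the fit values are hypothetical, and treating the RMSEA and SRMR_b criteria as absolute changes is an assumption of this sketch rather than something stated in the text.

```python
def within_level_invariant(cfi_free, cfi_constrained, rmsea_free, rmsea_constrained):
    """CFI may drop by at most .01 and RMSEA may change by at most .015."""
    # Round to three decimals so floating-point noise cannot flip a boundary case.
    cfi_drop = round(cfi_free - cfi_constrained, 3)
    rmsea_change = round(abs(rmsea_constrained - rmsea_free), 3)
    return cfi_drop <= 0.010 and rmsea_change <= 0.015


def between_level_invariant(srmr_b_free, srmr_b_constrained):
    """SRMR_b may change by at most .030."""
    return round(abs(srmr_b_constrained - srmr_b_free), 3) <= 0.030


# Hypothetical fit values for an unconstrained model and two constrained models.
free = {"cfi": .975, "rmsea": .031, "srmr_b": .024}
mild = {"cfi": .972, "rmsea": .030, "srmr_b": .030}
poor = {"cfi": .950, "rmsea": .040, "srmr_b": .070}

mild_within = within_level_invariant(free["cfi"], mild["cfi"], free["rmsea"], mild["rmsea"])
mild_between = between_level_invariant(free["srmr_b"], mild["srmr_b"])
poor_within = within_level_invariant(free["cfi"], poor["cfi"], free["rmsea"], poor["rmsea"])
poor_between = between_level_invariant(free["srmr_b"], poor["srmr_b"])
print(mild_within, mild_between, poor_within, poor_between)
```

In practice the comparison would be repeated for each successive step (unconstrained versus metric, metric versus scalar, scalar versus strict), with each constrained model compared against the preceding less constrained one.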

Multigroup MI was tested between the adolescent sample and wave 1 of the university student sample using the maximum likelihood estimator with robust standard errors (MLR). Since Mplus does not support multigroup analysis with multilevel models for categorical items, the ICU items were treated as continuous variables in this analysis only. The same criteria as those used for longitudinal MI were applied to indicate MI between the two groups.

Within- and Between-Person Criterion Validity

Factor scores estimated with the Bayesian estimator were used as observed variables to examine criterion validity. At the within-person level, concurrent predictive validity was evaluated by correlating each latent factor with same-day criterion measures. The between-person concurrent validity was examined by incorporating the correlations between latent factors at the between-person level and person-average levels of validity variables, reflecting the associations between the random intercepts of these components across individuals.
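The distinction between the two levels of criterion validity can be illustrated with observed scores. The sketch below is a simplified stand-in, not the analysis described above: it uses person-mean centering on manifest scores instead of Bayesian factor scores from a multilevel model, and the data and function names are hypothetical. The toy data are constructed so that the association is negative within persons but positive between persons.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def split_levels(scores_by_person):
    """Return (person means, pooled deviations from each person's own mean)."""
    means = [mean(days) for days in scores_by_person]
    deviations = [d - m for days, m in zip(scores_by_person, means) for d in days]
    return means, deviations


# Hypothetical daily scores: one inner list per person, one value per day.
cu = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
internalizing = [[2, 1, 0], [7, 6, 5], [12, 11, 10]]

cu_means, cu_dev = split_levels(cu)
int_means, int_dev = split_levels(internalizing)

r_between = pearson(cu_means, int_means)  # association of person averages
r_within = pearson(cu_dev, int_dev)       # same-day association of deviations
print(r_between, r_within)
```

Pooling all days without separating the levels would blend these two opposite associations into a single coefficient, which is why the levels are examined separately.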

Transparency and Openness

This study was not preregistered. The code and output files for all the analyses are publicly available (https://osf.io/rh7mv). Data are not publicly available due to ethics agreements. However, the data required for the analyses performed in the study are available from the corresponding author upon reasonable request.

Results

Descriptive Statistics

All ICU items showed moderate to high ICCs in the university student (wave 1: .48–.64; wave 2: .46–.62) and adolescent (.56–.77) samples (Table 1). This indicates that approximately 46% to 77% of the variation in ICU items occurred at the between-person level, while the remaining variation can be attributed to within-person fluctuations over days.
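To make the interpretation of an ICC concrete, the sketch below computes a one-way random-effects ICC(1) for a single item from toy diary data. It is an illustration under stated assumptions: the data and function name are hypothetical, the item is treated as continuous, and the design is balanced, so the values need not match ICCs obtained from a multilevel model with ordinal items.

```python
from statistics import mean


def icc1(scores_by_person):
    """One-way random-effects ICC(1) for a balanced design (same days per person)."""
    n = len(scores_by_person)
    k = len(scores_by_person[0])
    if any(len(days) != k for days in scores_by_person):
        raise ValueError("this sketch assumes the same number of reports per person")

    grand = mean(v for days in scores_by_person for v in days)
    person_means = [mean(days) for days in scores_by_person]

    # Mean squares from a one-way ANOVA with person as the grouping factor.
    ms_between = k * sum((m - grand) ** 2 for m in person_means) / (n - 1)
    ms_within = sum(
        (v - m) ** 2 for days, m in zip(scores_by_person, person_means) for v in days
    ) / (n * (k - 1))

    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical item responses: three persons, four daily reports each.
toy = [[0, 0, 1, 1], [1, 1, 2, 2], [2, 2, 3, 3]]
item_icc = icc1(toy)
within_share = 1 - item_icc  # share of variance due to day-to-day fluctuation
print(round(item_icc, 3), round(within_share, 3))
```

Read the same way, the reported ICC range of .46 to .77 implies that roughly 23% to 54% of item variance reflects within-person fluctuation.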

Table 1.

Intraclass Correlations and Frequencies of ICU Items.

	University Students (Wave 1)					University Students (Wave 2)					Adolescents
		Frequency (%) without Missing					Frequency (%) without Missing					Frequency (%) without Missing
Item	ICC	Not True	Somewhat True	Mostly True	Certainly True	ICC	Not True	Somewhat True	Mostly True	Certainly True	ICC	Not True	Somewhat True	Mostly True	Certainly True
A. Callousness
1. I did not care who I hurt to get what I want	.641	82.9	13.2	2.8	1.0	.577	83.9	12.9	3.0	0.1	.657	90.9	7.0	1.9	0.1
5. I did not care if I get into trouble	.624	76.0	16.5	4.4	1.4	.558	79.4	16.2	3.9	0.4	.718	78.4	16.5	4.0	1.1
6. I did not care about doing things well	.564	73.1	20.7	5.0	1.2	.557	71.1	22.9	5.1	0.9	.704	81.2	14.6	3.1	1.1
7. I seemed very cold and uncaring to others	.558	69.8	23.7	5.2	1.3	.466	72.3	21.9	5.3	0.6	.656	69.5	21.2	6.5	2.7
10. I did not feel remorseful/sorry when I did something wrong	.476	78.3	15.9	4.2	1.6	.460	82.1	13.6	3.7	0.6	.605	83.0	11.8	3.8	1.4
11. The feelings of others were unimportant to me	.537	79.3	16.2	3.9	0.7	.473	78.2	18.3	3.0	0.4	.558	88.2	8.5	2.3	1.0
B. Uncaring
2. I felt bad or guilty when I do something wrong	.586	17.3	22.9	30.8	27.2	.623	15.5	29.1	31.2	24.2	.769	18.6	10.6	21.6	46.1
4. I was concerned about the feelings of others	.546	14.6	28.0	35.5	21.9	.601	14.7	32.6	33.8	18.9	.603	1.9	9.1	28.1	60.9
9. I tried not to hurt others' feelings	.569	15.6	19.3	33.7	31.5	.598	12.3	23.6	35.9	28.3	.620	3.3	7.9	24.3	64.6
12. I did things to make others feel good	.548	16.8	32.6	33.8	16.8	.588	14.8	34.6	33.9	16.7	.676	2.9	13.3	30.0	53.8

Structural Validity

University Student Sample

All three structures/models were fully crossed to enumerate all possible combined structures across levels in wave 1 of the university student sample. Fit indices (Table 2) reveal that models with the single-factor model at either within- or between-person level showed unacceptable model fit. The other models, comprising bifactor and correlated-factor models, demonstrated acceptable-to-good fit. Generally, the bifactor models fit the data better than the correlated-factor model at both within- (higher CFI and TLI, and lower RMSEA and SRMR_w) and between- (lower SRMR_b) person levels. Based on these results, we proceeded with only the bifactor and correlated-factor models to examine replication in wave 2 of the university student sample and the adolescent sample. All four models estimated with wave 2 data demonstrated acceptable-to-good fit (Table 2). The bifactor models provided a better fit to the data than the correlated-factor models at both the within- and between-person levels.
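The set of candidate models can be enumerated mechanically. The short sketch below (variable names are hypothetical) crosses the three structures over the two levels and then keeps only the combinations without a single-factor structure at either level, mirroring the selection described above.

```python
from itertools import product

STRUCTURES = ("Single", "CF", "Bifactor")

# Each model is a (within-person structure, between-person structure) pair.
all_models = list(product(STRUCTURES, repeat=2))

# Models with a single factor at either level showed unacceptable fit in wave 1.
retained = [model for model in all_models if "Single" not in model]

print(len(all_models), retained)
```

The nine crossed models correspond to the wave 1 rows of Table 2, and the four retained combinations correspond to the rows estimated for wave 2 and for the adolescent sample.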

Table 2.

Fit Indices for the Multilevel Confirmatory Factor Analyses for Daily CU Traits.

Within	Between	k	χ²	RMSEA	SRMR within/between	CFI	TLI
University Students: Wave 1
Single	Single	60	6,003.50	.116	.178/.200	.500	.357
Single	CF	60	6,065.08	.116	.178/.046	.494	.350
Single	Bifactor	70	6,032.54	.126	.178/.024	.496	.244
CF	Single	61	1,233.90	.052	.051/.200	.902	.872
CF	CF	61	677.93	.037	.051/.046	.949	.933
CF	Bifactor	71	683.13	.041	.051/.023	.947	.920
Bifactor	Single	70	982.22	.049	.035/.200	.922	.883
Bifactor	CF	70	354.80	.028	.035/.046	.975	.963
Bifactor	Bifactor	80	345.84	.031	.035/.024	.975	.955
University Students: Wave 2
CF	CF	62	386.83	.035	.060/.048	.934	.913
CF	Bifactor	71	374.97	.037	.060/.025	.935	.901
Bifactor	CF	71	189.53	.024	.038/.048	.973	.959
Bifactor	Bifactor	80	172.92	.025	.039/.025	.975	.955
Adolescents
CF	CF	62	130.67	.021	.081/.040	.951	.935
CF	Bifactor	70	121.80	.022	.081/.029	.952	.928
Bifactor	CF	71	91.80	.016	.069/.040	.974	.961
Bifactor	Bifactor	79	83.28	.017	.069/.029	.975	.956

Note. CF = Correlated Factors, RMSEA = root mean square error of approximation, SRMR = standardized root mean square residual, CFI = comparative fit index, TLI = Tucker–Lewis index.

In the correlated-factor models (Table 3), all factor loadings were positive and significant, with all indicators showing loadings ≥ .35 at both levels. For bifactor models, the specific factors were strongly indicated by the items at both within- (wave 1 & 2 median λ_w = .61 and .57) and between- (wave 1 & 2 median λ_b = .83 and .79) person levels. Loadings on the general factors were relatively lower at both within- (wave 1 and 2 median λ_w = .26 and .34) and between- (wave 1 and 2 median λ_b = .45 and .48) person levels, with two items showing non-significant factor loadings at the between-person level in each wave.

Table 3.

Standardized Factor Loadings in the Models.

Item	University Students (Wave 1)						University Students (Wave 2)						Adolescents
	CF		Bifactor				CF		Bifactor				CF		Bifactor
	w	b	Within		Between		w	b	Within		Between		w	b	Within		Between
	w	b	Gen	Spec	Gen	Spec	w	b	Gen	Spec	Gen	Spec	w	b	Gen	Spec	Gen	Spec
ICU1	.69	.97	.55	.45	.47	.83	.65	.98	.39	.53	.72	.65	.54	.93	.51	.26	.77	.46
ICU5	.69	.90	.14	.78	.61	.85	.70	.91	.10	.75	.50	.79	.63	.87	.44	.40	.65	.61
ICU6	.63	.81	.16	.65	.17	.83	.58	.72	.09	.60	.26	.78	.78	.90	.49	.78	.68	.61
ICU7	.63	.79	.52	.40	.13	.83	.65	.76	.37	.52	.46	.60	.46	.67	.46	.25	.57	.30
ICU10	.66	.91	.38	.64	.26	.89	.61	.92	.12	.62	.48	.82	.49	.80	.48	.12	.60	.58
ICU11	.70	.97	.54	.47	.45	.82	.71	.89	.31	.63	.69	.54	.60	.87	.72	-.14	.64	.61
ICU2	.53	.83	.17	.50	.44	.69	.38	.80	.10	.72	.35	.70	.56	.29	.08	.65	.36	-.34
ICU4	.73	.99	.26	.68	.57	.82	.73	.99	.61	.37	.47	.86	.77	.96	.39	.66	.93	.38
ICU9	.66	.99	.25	.60	.68	.71	.58	.99	.53	.25	.49	.81	.73	.98	.35	.65	.96	.16
ICU12	.66	.80	.25	.61	.23	.88	.67	.76	.66	.22	.17	.87	.68	.86	.39	.54	.85	.32

Note. CF = Correlated Factors, Gen = General factor, Spec = Specific factor, w = within-person level, b = between-person level. All loadings are significant, ps < .05, except those in italic. Items 1, 5, 6, 7, 10, and 11 loaded onto the Callousness (specific) factor, and items 2, 4, 9, and 12 onto the Uncaring (specific) factor.

Overall, factor reliability was greater at the between-person level than at the within-person level (Table 4). The correlated-factor models were reliable and well-defined at both levels across waves, except for the Uncaring factor at the within-person level, which showed unsatisfactory reliability (i.e., ω_s < .75). In the bifactor models, the specific factors generally showed acceptable reliability (i.e., ω_hs > .50), except for the Uncaring specific factor at the within-person level in wave 2 (ω_hs = .30; Table 4). In both waves, the general factor did not reliably capture variance (i.e., ω_h < .50) at either level.

Table 4.

Factor Reliability Indices.

Indicator	Within			Between
Indicator	General	Callousness	Uncaring	General	Callousness	Uncaring
University Students: Wave 1
Correlated-factor
ω_s		.83	.74		.96	.95
H index		.83	.75		.98	.99
Bifactor
ω_h / ω_hs	.32	.58	.64	.27	.86	.69
ECV	.28	.41	.31	.20	.51	.29
H index	.62	.77	.70	.71	.94	.88
University Students: Wave 2
Correlated-factor
ω_s		.81	.69		.95	.94
H index		.82	.73		.97	.99
Bifactor
ω_h / ω_hs	.34	.74	.30	.41	.62	.78
ECV	.33	.50	.17	.30	.37	.33
H index	.69	.79	.58	.79	.87	.89
Adolescents
Correlated-factor
ω_s		.76	.78		.94	.88
H index		.79	.80		.96	.98
Bifactor
ω_h / ω_hs	.55	.23	.60	.78	.38	.12
ECV	.46	.20	.34	.71	.24	.05
H index	.75	.65	.72	.96	.73	.30

Note. All models were estimated using the WLSMV estimator. H index = index of construct replicability, ω_h = omega hierarchical, ω_hs = omega hierarchical specific, ω_s = omega specific, ECV = explained common variance.
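The bifactor indices in Table 4 can be approximated directly from standardized loadings. The sketch below assumes the standard formulas described by Rodriguez et al. (2016) and uses the rounded wave 1 within-person bifactor loadings from Table 3 as input; the function and variable names are illustrative and are not taken from the authors' analysis code.

```python
# Standardized within-person bifactor loadings, university students, wave 1,
# copied from Table 3 (two-decimal rounding makes the results approximate).
GENERAL = [.55, .14, .16, .52, .38, .54, .17, .26, .25, .25]
SPECIFIC = [.45, .78, .65, .40, .64, .47, .50, .68, .60, .61]
# Item positions per specific factor, following the Table 3 note.
GROUPS = {"Callousness": range(0, 6), "Uncaring": range(6, 10)}


def h_index(loadings):
    """Construct replicability H from one factor's standardized loadings."""
    ratio = sum(lam ** 2 / (1 - lam ** 2) for lam in loadings)
    return ratio / (1 + ratio)


def bifactor_indices(general, specific, groups):
    """Return omega_h / omega_hs, ECV, and H for each factor."""
    # Unique (residual) variance of each standardized item.
    residual = [1 - g ** 2 - s ** 2 for g, s in zip(general, specific)]
    # Total common variance across the general and all specific factors.
    common = sum(g ** 2 for g in general) + sum(s ** 2 for s in specific)
    # Model-implied variance of the unit-weighted total score.
    total_var = sum(general) ** 2 + sum(residual) + sum(
        sum(specific[i] for i in idx) ** 2 for idx in groups.values())
    out = {"General": {"omega": sum(general) ** 2 / total_var,
                       "ECV": sum(g ** 2 for g in general) / common,
                       "H": h_index(general)}}
    for name, idx in groups.items():
        g = [general[i] for i in idx]
        s = [specific[i] for i in idx]
        r = [residual[i] for i in idx]
        # Model-implied variance of the unit-weighted subscale score.
        subscale_var = sum(g) ** 2 + sum(s) ** 2 + sum(r)
        out[name] = {"omega": sum(s) ** 2 / subscale_var,
                     "ECV": sum(x ** 2 for x in s) / common,
                     "H": h_index(s)}
    return out


results = bifactor_indices(GENERAL, SPECIFIC, GROUPS)
for factor, values in results.items():
    print(factor, {k: round(v, 2) for k, v in values.items()})
```

With these rounded inputs, the sketch gives ω_h of about .32 for the general factor and ω_hs of about .58 and .64 for Callousness and Uncaring, in line with the wave 1 within-person values in Table 4. Small discrepancies in the other indices are expected because the inputs are rounded to two decimals.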

Adolescent Sample

Consistent with the university student sample, all four models among adolescents showed acceptable-to-good fit (Table 2). In the correlated-factor model, all factor loadings were positive and significant, with only one item in the Uncaring factor having a loading < .35 at the between-person level (Table 3). In the bifactor model, the factor loadings of the specific factors were not as strong as those observed among university students (median λ_w = .47; median λ_b = .42), whereas the loadings for the general factors were higher than those among university students (median λ_w = .45; median λ_b = .67). At the within-person level, one and two items had non-significant factor loadings on the general and specific factors, respectively. At the between-person level, three of the four loadings on the Uncaring specific factor were non-significant.

The reliability indices of the correlated-factor models in the adolescent sample showed a generally consistent pattern with those observed in the university student sample (Table 4). In the bifactor models, the specific factors generally showed unacceptable reliability (i.e., ω_hs < .50), with the exception of the Uncaring specific factor at the within-person level. Nonetheless, the general factor exhibited good reliability at both levels (i.e., ω_h > .50). At the between-person level, the models were unidimensional to some extent since the general factor accounted for the most variance (ECV_b of the general factor = .71) and showed superior reliability and replicability (between-person ω_h = .78; H_b = .96).

Longitudinal and Multigroup Measurement Invariance

As shown in Table 5, both the correlated-factor and bifactor models achieved longitudinal metric MI at the within-person level (i.e., ∆CFI ≤ .01 and ∆RMSEA ≤ .015), as well as metric, scalar, and strict MI at the between-person level (i.e., ∆SRMR_b ≤ .030). The correlated-factor model revealed no significant difference between the latent means of the two factors over time. The bifactor model indicated a slight increase in the general factor from wave 1 to wave 2 (Diff = .19, SE = .10, p = .048), while the two specific factors showed no significant change.

Table 5.

Longitudinal Measurement Invariance Tests Among the University Student Sample.

Model	χ²	df	CFI	TLI	RMSEA	SRMR within/between
Correlated-factor model
Unconstrained model	655.41	330	.958	.952	.014	.044/.060
+ Factor loadings constrained (b)	655.77	338	.959	.954	.014	.044/.064
+ Factor loadings constrained (w)	690.01	346	.956	.952	.014	.044/.065
+ Thresholds constrained (b)	749.58	374	.952	.951	.014	.044/.065
+ Residuals constrained (b)	758.58	382	.952	.952	.014	.044/.065
Bifactor model
Unconstrained model	412.90	297	.985	.981	.009	.033/.038
+ Factor loadings constrained (b)	417.83	314	.987	.984	.008	.033/.047
+ Factor loadings constrained (w)	453.87	331	.984	.982	.009	.036/.047
+ Thresholds constrained (b)	499.99	358	.982	.981	.009	.036/.047
+ Residuals constrained (b)	511.35	367	.982	.981	.009	.036/.047

Note. w = within-person level, b = between-person level. In the unconstrained models, the means of latent factor scores at the within-person level in both waves, and at the between-person level in wave 1, were fixed to 0, and the variances at both levels in wave 1 were fixed to 1. The first factor loading of each latent factor was freely estimated, while corresponding first loadings at the same level were constrained to be equal across waves 1 and 2. The first threshold of each latent factor was constrained to be invariant across time at each level (Widaman et al., 2010).

As shown in Table 6, the correlated-factor model demonstrated full MI across the adolescent sample and wave 1 of the university student sample. University students exhibited a higher latent mean of Uncaring (Diff = .59, SE = .06, p < .001) and a marginally higher latent mean of Callousness (Diff = .08, SE = .04, p = .051) than adolescents. Regarding the bifactor model, metric MI at the between-person level was not supported (∆SRMR_b > .030). Thus, further steps were not conducted at the between-person level. At the within-person level, metric MI was supported, but strict MI was violated (∆CFI > .01).

Table 6.

Multigroup Measurement Invariance Tests Among the Adolescent Sample and the First Wave of the University Student Sample.

Model	χ²	df	AIC	BIC	aBIC	CFI	RMSEA	SRMR within/between
Correlated-factor model
Unconstrained model	601.14	138	132,555.0	133,272.9	132,948.8	.936	.028	.020/.033
+ Factor loadings constrained (b)	617.97	146	132,583.0	133,244.6	132,945.9	.934	.028	.020/.035
+ Factor loadings constrained (w)	639.71	154	132,910.1	133,486.9	133,226.4	.932	.028	.021/.038
+ Intercepts constrained (b)	688.60	162	132,666.0	133,214.7	132,944.9	.927	.028	.021/.040
+ Residuals constrained (b)	766.16	171	132,798.6	133,284.1	133,064.7	.917	.029	.021/.050
+ Residuals constrained (w)	779.70	181	133,401.3	133,816.3	133,628.8	.917	.029	.025/.050
Bifactor model
Unconstrained model	310.44	108	132,179.3	133,107.9	132,688.4	.972	.021	.014/.023
+ Factor loadings constrained (b)	347.40	125	132,224.1	133,033.1	132,667.6	.969	.021	.014/.063
+ Factor loadings constrained (w)	388.33	142	132,280.6	132,969.9	132,658.6	.966	.020	.016/.063
+ Residuals constrained (w)	531.75	151	132,894.7	133,520.8	133,237.9	.947	.025	.022/.063

Note. w = within-person level, b = between-person level. All models were estimated with the MLR estimator.
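The invariance decisions for the bifactor model can be traced step by step from the rows of Table 6. The sketch below applies the cutoffs quoted in the text (∆CFI ≤ .01 and ∆RMSEA ≤ .015 for within-person constraints, ∆SRMR_b ≤ .030 for between-person constraints). It assumes that each model is compared with the one immediately preceding it; the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Fit:
    label: str
    level: str    # "w" or "b": the level at which constraints were added
    cfi: float
    rmsea: float
    srmr_w: float
    srmr_b: float


# Bifactor rows of Table 6, in the order they were tested.
BIFACTOR = [
    Fit("Unconstrained", "", .972, .021, .014, .023),
    Fit("Loadings (b)", "b", .969, .021, .014, .063),
    Fit("Loadings (w)", "w", .966, .020, .016, .063),
    Fit("Residuals (w)", "w", .947, .025, .022, .063),
]


def invariance_holds(prev, curr):
    """Apply the quoted cutoffs to two adjacent nested models."""
    if curr.level == "b":
        # Between-person steps are judged by the change in SRMR_b.
        return curr.srmr_b - prev.srmr_b <= .030
    # Within-person steps are judged by the changes in CFI and RMSEA.
    return (prev.cfi - curr.cfi <= .01) and (curr.rmsea - prev.rmsea <= .015)


decisions = {curr.label: invariance_holds(prev, curr)
             for prev, curr in zip(BIFACTOR, BIFACTOR[1:])}
print(decisions)
```

Under these assumptions, the between-person loading constraint fails (∆SRMR_b = .040), the within-person loading constraint holds, and the within-person residual constraint fails (∆CFI = .019), which matches the pattern described in the text.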

Within- and Between-Person Criterion Validity

University Student Sample

In both waves (Table 7), the Callousness factor in the correlated-factor model was positively correlated with all validity variables at the within- (wave 1: r_w = .09–.22; wave 2: r_w = .11–.23) and between- (wave 1: r_b = .26–.55; wave 2: r_b = .26–.49) person levels. The Uncaring factor was correlated with conduct problems in both waves at the within- (wave 1 & 2: r_w = .28 and .27) and between- (wave 1 & 2: r_b = .60 and .66) person levels. However, the Uncaring factor was negatively correlated with internalizing symptoms. Specifically, at the within-person level, Uncaring was negatively associated with same-day emotional problems (r_w = −.05, p = .021), social anxiety (r_w = −.06, p = .009), and generalized anxiety (r_w = −.12, p < .001) symptoms in wave 1, but not in wave 2. At the between-person level, Uncaring was negatively correlated with person-average levels of emotional problems (r_b = −.12, p < .001) and generalized anxiety symptoms (r_b = −.22, p < .001) in wave 1.

Table 7.

Criterion Validity Tests at the Within- and Between-Person Level.

	University Students Wave 1						University Students Wave 2					Adolescents
Factor	Emo	Con	Panic	Soc	Gen	Dep	Emo	Con	Panic	Soc	Dep	Emo	Con
Correlated-factor model
Callousness _within	.22***	.18***	.14***	.10***	.09***		.19***	.23***	.11***	.12***		.13***	.23***
Uncaring _within	−.05*	.28***	−.01	−.06**	−.12***		.03	.27***	−.02	.02		.05	.21***
Callousness _between	.54***	.28***	.55***	.51***	.41***	.26***	.49***	.26**	.46***	.36***	.30***	.31**	.74***
Uncaring _between	−.12*	.60***	−.06	−.04	−.22***	−.01	−.11	.66***	−.07	−.07	.08	.18	.64***
Bifactor model
Callousness _within	.09***	.04*	.07**	.01	.01		.17***	.11***	.11***	.11***		.12***	.14***
Uncaring _within	−.12***	.23***	−.06*	−.11***	−.16***		−.13***	.03	−.10***	−.06**		.03	.14***
General _within	.18***	.16***	.10***	.13***	.11***		.09***	.26***	.03	.05**		.02	.09**
Callousness _between	.37***	.13***	.34***	.16**	.28***	.26***	.29***	.06	.30***	.16*	.35***	.26**	.23**
Uncaring _between	−.23***	.56***	−.17***	−.19***	−.30***	−.08	−.29***	.56***	−.22***	−.22***	.05	.02	.19*
General _between	.34***	.22***	.37***	.46***	.25***	.15*	.34***	.34***	.28***	.29***	.16**	.23*	.67***

Note. Emo = Emotional Problems; Con = Conduct Problems; Panic = Panic Disorder Symptoms; Soc = Social Anxiety Symptoms; Gen = Generalized Anxiety Symptoms; Dep = Depressive Symptoms.

*p < .05. **p < .01. ***p < .001.
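The distinction between the within- and between-person correlations in Table 7 can be illustrated with observed scores. The sketch below is only an observed-score analogue, not the multilevel latent-variable approach used in this study: it person-mean centers daily scores to obtain a within-person correlation and correlates person averages to obtain a between-person correlation. The toy data and all names are hypothetical.

```python
from statistics import mean


def pearson(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def within_between(records):
    """records maps a person ID to a list of (x, y) daily scores."""
    mean_x, mean_y, dev_x, dev_y = [], [], [], []
    for days in records.values():
        xs, ys = zip(*days)
        mx, my = mean(xs), mean(ys)
        # Person averages carry the between-person information.
        mean_x.append(mx)
        mean_y.append(my)
        # Daily deviations from the person average carry the within-person information.
        dev_x.extend(x - mx for x in xs)
        dev_y.extend(y - my for y in ys)
    return pearson(dev_x, dev_y), pearson(mean_x, mean_y)


# Hypothetical toy data: three people, three days each.
TOY = {
    "p1": [(1, 3), (2, 2), (3, 1)],
    "p2": [(3, 5), (4, 4), (5, 3)],
    "p3": [(5, 7), (6, 6), (7, 5)],
}
r_w, r_b = within_between(TOY)
print(round(r_w, 2), round(r_b, 2))
```

In this toy example the two scores move in opposite directions from day to day within each person, yet people with higher averages on one score also have higher averages on the other. The two levels can therefore carry different information, which is why they are reported separately.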

In the bifactor model, the general factor was correlated with almost all validity variables at the within-person level (wave 1: r_w = .10–.18; wave 2: r_w = .05–.26), with the exception of panic disorder symptoms in wave 2. At the between-person level, all person-average validity variables were correlated with general factor scores in wave 1 (r_b = .15–.46) and wave 2 (r_b = .16–.34). The Callousness specific factor exhibited a generally consistent pattern with the Callousness factor in the correlated-factor model, showing positive correlations with almost all validity variables at both levels in both waves. The Uncaring specific factor showed positive correlations only with conduct problems at both levels and negative correlations with internalizing symptoms at both levels.

Regarding prospective correlations, after controlling for the corresponding wave 1 validity variable, the wave 1 Callousness factor in the correlated-factor model was positively correlated with emotional problems (r = .27, p < .001) and generalized anxiety (r = .15, p = .050) in wave 2. The wave 1 Uncaring factor was positively correlated only with conduct problems (r = .23, p = .001) and not with any internalizing symptoms in wave 2. In the bifactor model, the general factor (r = .17, p = .018) and the Callousness specific factor (r = .23, p = .002) in wave 1 were positively correlated with emotional problems in wave 2. The Uncaring specific factor was positively correlated with conduct problems (r = .24, p < .001).

Adolescent Sample

The Callousness factor in the correlated-factor model was positively correlated with emotional and conduct problems at both levels, while the Uncaring factor was only positively correlated with conduct problems (r_w = .13, p < .001; r_b = .31, p < .001) but not with emotional problems (r_w = .05, p = .087; r_b = .18, p = .162). In the bifactor model, the general factor was correlated with conduct problems at the within-person level (r_w = .09, p = .004), as well as both validity variables at the between-person level (r_b = .23–.67). The Callousness specific factor was positively correlated with both validity variables at both levels, while Uncaring was only positively correlated with conduct problems.

Discussion

This study examined the within- and between-person factor structure of CU traits in daily contexts using two independent samples with month-long daily diary designs. The results indicated that both the bifactor and correlated-factor models demonstrated acceptable fit at the within- and between-person levels, though the general factor in the bifactor model in the university student sample showed low reliability and replicability. Longitudinal MI was observed within the university sample over a 2.5-year span, while structural differences emerged between adolescents and university students. At both levels, the general factor and the Callousness (specific) factor were positively associated with internalizing and externalizing problems across both samples. In contrast, the Uncaring (specific) factor was positively associated with conduct problems in both samples but negatively associated with internalizing problems among university students.

Corroborating previous studies conducted at the between-person level (Byrd et al., 2013; Ciucci et al., 2014; Wang et al., 2020; Y. Zheng et al., 2021), conventional fit indices suggest a preference for the bifactor model at both within- and between-person levels. These results warrant cautious interpretation, as these fit indices tend to favor models with greater flexibility (Reise et al., 2016). In the university student sample, the general factors explained < 35% of the total variance at both levels and showed low reliability (ω_h < .50) and replicability (H index < .70), suggesting that this general factor may primarily reflect absorbed measurement error rather than a true latent factor (Rodriguez et al., 2016). These results indicate that there is no reliable general factor at either within- or between-person level in this sample. In contrast, the general factor showed acceptable reliability and replicability in the adolescent sample, particularly at the between-person level.

The measurement non-invariance across the two groups further emphasizes that the bifactor model, particularly at the between-person level, is not equivalent across different age groups. This discrepancy aligns with the aforementioned findings regarding divergent psychometric validity across the two samples and suggests that the factor structure of CU traits may change substantially from adolescence to young adulthood. Using a bifactor model may be more appropriate for adolescents. The unidimensionality observed at the between-person level suggests that future studies using ILDs in adolescents could reasonably rely on the general factor of CU traits (Ray & Frick, 2020; Reise et al., 2016; Rodriguez et al., 2016). This finding aligns with a previous meta-analysis, which demonstrated that the reliable variance in the total score of CU traits was largely determined by the general factor in the bifactor models, thereby recommending simply using the total score rather than subscale scores in future studies (Ray & Frick, 2020). Nonetheless, at the within-person level, both the general and specific factors should be considered for a more comprehensive understanding of CU traits in adolescents. If the research specifically focuses on the two subfactors, the correlated-factor model can also be applied to generate separate subfactor scores. In the young adult sample, the unreliable general factors indicate that items in the Callousness and Uncaring factors appear more heterogeneous and distinct and share fewer co-fluctuations. Studies in this population may benefit from assessing Callousness and Uncaring as separate factors to more accurately capture CU traits and to explore their potentially divergent associations with antecedents and behavioral outcomes.

Consistent with findings from between-person level studies (e.g., Hawes et al., 2014), CU traits were consistently associated with conduct problems at the within-person level in both samples. The negative associations between the Uncaring (specific) factor and emotional problems and anxiety symptoms in university students add to existing evidence (e.g., Fontaine et al., 2023; Waller et al., 2015) that CU traits may develop as a coping strategy that offers a short-term protective effect against internalizing symptoms for young adults. The challenges during the transition from adolescence to young adulthood may prompt emotional numbing and avoidance of stressful situations, such as neglecting their performance in critical contexts (Byrd et al., 2013; Craig et al., 2021), thereby reducing their exposure to stressors that could otherwise exacerbate internalizing symptoms. These short-term adaptive processes have been expected to cumulatively increase internalizing problems over the long term (Craig et al., 2021). However, we did not find evidence supporting this hypothesis in either the between-person associations, which were negative, or the prospective correlations, which showed no link between Uncaring and internalizing symptoms 2.5 years later. It should be noted that the between-person level results in this study do not fully represent long-term effects, as they only capture associations between person-average means over a relatively short 30-day period. These findings underscore the complexity of the link between CU traits and internalizing symptoms and highlight the need for further exploration across various timescales and developmental periods. Such approaches could clarify whether the short-term effects between CU traits and internalizing symptoms persist and how they relate to long-term effects (Y. Zheng & Goulter, 2024), offering a more nuanced perspective on the role of CU traits in the development of internalizing symptoms.

Taken together, these findings offer several important implications for clinical practice and future research. First, the substantial daily fluctuations observed in CU traits highlight their “state” feature. These results support the notion that, similar to personality traits (Fleeson, 2004; Soto & Tackett, 2015; Wright & Simms, 2016), CU traits are not “fixed” individual characteristics but exhibit meaningful dynamics and variability (Fleming et al., 2022; Goulter et al., 2024; Schuberth et al., 2019; Waller & Hyde, 2017). Incorporating intensive assessments within daily contexts may hence provide a more ecologically valid understanding of how CU traits manifest in everyday life. Second, the findings demonstrate that the latent structure of CU traits can diverge across levels. For instance, the unidimensional structure observed at the between- but not the within-person level among adolescents indicates that while CU trait items tend to co-occur across individuals (i.e., individuals who score higher than others on one item also tend to score higher on other items), they do not necessarily fluctuate together within individuals on a day-to-day basis (Li et al., 2025; H. Zheng & Zheng, 2025). These results highlight the importance of differentiating between within- and between-person structures in the assessment of CU traits. Within-person level structures can be used to track and monitor CU traits in daily contexts over time and understand their short-term antecedents and outcomes on a micro timescale. In contrast, between-person level structures are informative for ranking or comparing the level of CU traits across individuals. Third, these insights could inform interventions to treat CU traits as modifiable characteristics in adolescents and young adults. Adopting a micro timescale approach could enable context-sensitive intervention strategies to manage daily variations in CU traits (Y. Zheng & Goulter, 2024) and, by extension, decrease the likelihood of their progression to more severe externalizing problems. In addition, the measurement non-invariance in factor structures across adolescent and university student samples may suggest that CU traits reorganize during the transition from adolescence to young adulthood. In adolescents, targeting general CU traits through broad-based treatments may be effective, whereas in young adults, tailoring interventions to more specific components such as Uncaring or Callousness may be more beneficial.

Strengths, Limitations, and Future Directions

This study has several notable strengths. Previous research has examined the structure of CU traits primarily using cross-sectional or conventional longitudinal designs with long time intervals and has focused on between-person analyses. This study addressed these limitations by using month-long daily diary designs to explore the structure of CU traits at both within- and between-person levels. The findings confirm meaningful daily variations in CU traits, which align with emerging research that emphasizes a micro-level approach to better capture psychopathology symptom variability and dynamic links with contextual factors and behavioral outcomes (Thunnissen et al., 2022; Walz et al., 2014). Moreover, this study replicated findings across two independent samples and identified age-related factor structure differences, as well as stability within young adults over 2.5 years, which highlights both developmental continuity and discontinuity in the manifestation of CU traits.

Despite these strengths, several limitations warrant consideration. First, the current study used data from two community samples with relatively low endorsement of items, reflecting limited levels of severity. Replications with high-risk samples (e.g., incarcerated and clinical) could help validate and extend these findings to populations with elevated CU traits (Fontaine et al., 2023; Kemp et al., 2024; Y. Zheng et al., 2021). In addition, the sex distribution in the current study is unbalanced, with over 70% of the participants in the university student sample identifying as female. This may limit the generalizability of the findings, particularly regarding potential sex differences in the expression and fluctuation of CU traits. Future research should replicate these findings in more sex-balanced and diverse samples to enhance generalizability. Second, this study relied exclusively on self-reports. Different informants (e.g., self- vs. parent-reports) may influence psychometric properties of the ICU (Cardinale & Marsh, 2020; Deng et al., 2019; Wang et al., 2020). Future studies should incorporate multi-informants to examine the robustness of within- and between-person factor structures. In addition, although we confirmed sufficient within-person variability in these ICU items, the scale was originally developed for trait-level assessment. Some items may still reflect retrospective evaluations rather than context-sensitive behaviors. Future research could consider developing or validating CU trait measures specifically designed for ILDs. Third, it remains possible that the factor structure of CU traits is partly driven by method variance (Hawes et al., 2014; Ray & Frick, 2020), as all items in the Callousness factor are negatively worded, while items in the Uncaring factor are positively worded. 
Previous studies have found that positively worded items tend to better discriminate individuals with higher levels of CU traits, and negatively worded items discriminate best at lower levels (Hawes et al., 2014; Ray & Frick, 2020). The current study, unfortunately, cannot directly explore potential method variance as it would require positively worded Callousness items and negatively worded Uncaring items (Paiva-Salisbury et al., 2017). Future studies should integrate IRT at the within-person level to examine item discrimination efficiency in daily contexts, as well as consider using the full 24-item ICU scale to better separate substantive variance from method variance to facilitate more accurate comparisons of alternative models. Using the full ICU scale would also enable evaluating the psychometric validity of the Unemotional factor, which reflects a critical component of CU traits but was excluded from the short-form version due to its low internal consistency at the between-person level (Colins et al., 2016; Wang et al., 2020). Investigating its manifestation at the within-person level may offer important and novel insights into its dynamic properties and contribute to a more comprehensive understanding of CU traits. Finally, the current study assessed adolescents' and young adults' CU traits once per day, which limits our ability to detect meaningful fluctuations that may occur on finer timescales, such as within hours. Future research could employ ecological momentary assessment (Thunnissen et al., 2022) to capture more granular, moment-to-moment changes in CU traits and to examine whether there are meaningful and robust underlying structures on more refined timescales.

Conclusion

Daily CU traits exhibit meaningful within-person fluctuations in adolescents and young adults, highlighting their dynamic nature in daily contexts. The manifestation of CU traits appears to change across different developmental periods. In adolescents, both bifactor and correlated-factor models can be used at different levels depending on specific research purposes. Among young adults, the Callousness and Uncaring factors exhibit greater heterogeneity and distinctiveness, suggesting that these factors should be analyzed separately to explore their potentially divergent underlying mechanisms. These findings underscore the importance of ILDs in deepening our understanding of CU traits in real-world settings, which could inform context-sensitive intervention strategies aimed at managing daily fluctuations in CU traits and potentially reducing their progression to severe externalizing problems.

Footnotes

Acknowledgements

The authors gratefully acknowledge all the participants, research assistants, Elk Island and St. Albert public schools, and the following organizations at the University of Alberta for their support: International Student Services, English for Academic Purposes program, New Chinese Generation, Chinese Students and Scholars Association, iGeek, Undergraduate Research Initiative, China Institute, East Asian Studies Undergraduate Students Association, and Taiwanese Student Association. Study data were collected and managed using REDCap electronic data capture tools hosted and supported by the Women and Children’s Health Research Institute at the University of Alberta.

Data Availability Statement

Research data are not publicly available due to ethics agreements. However, the data required for the analyses performed in the study are available from the corresponding author upon reasonable request. This study was not preregistered. To promote transparency and openness, the codes for all the analyses are publicly available at .

Declaration of Conflicting Interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported partly with funding from the China Institute at the University of Alberta, the Social Sciences and Humanities Research Council (IDG 430-2018-00317 and 409-2020-00080) and Natural Sciences and Engineering Research Council (RGPIN-2020-04458 and DGECR-2020-00077) of Canada, and a Killam Research Fund Cornerstone Grant. HZ was supported by a Mitacs Accelerate Grant (IT 18227) awarded to YZ, the Ivy A Thomson and William A Thomson Scholarship, and the Women and Children’s Health Research Institute Graduate Studentship.

ORCID iDs

Hao Zheng

Yao Zheng

References

Allen, J. L., Shou, Wang, M.-C., & Bird (2021). Assessing the measurement invariance of the Inventory of Callous-Unemotional Traits in school students in China and the United Kingdom. Child Psychiatry and Human Development, 52(2), 343–354. https://doi.org/10.1007/s10578-020-01018-0

American Psychiatric Association. (2013). Diagnostic and statistical manual of mental disorders (5th ed.).

Anderson, S. L., Zheng, & McMahon, R. J. (2017). Predicting risky sexual behavior: The unique and interactive roles of childhood conduct disorder symptoms and callous-unemotional traits. Journal of Abnormal Child Psychology, 45(6), 1147–1156. https://doi.org/10.1007/s10802-016-0221-1

Anderson, S. L., Zheng, & McMahon, R. J. (2018). Do callous–unemotional traits and conduct disorder symptoms predict the onset and development of adolescent substance use? Child Psychiatry and Human Development, 49(5), 688–698. https://doi.org/10.1007/s10578-018-0789

Angulo, Rooks, B. T., Gill, Goldstein, Sakolsky, Goldstein, Monk, Hickey, M. B., Diler, R. S., Hafeman, Merranko, Axelson, & Birmaher (2017). Psychometrics of the Screen for Adult Anxiety Related Disorders (SCAARED)—A new scale for the assessment of DSM-5 anxiety disorders. Psychiatry Research, 253, 84–90. https://doi.org/10.1016/j.psychres.2017.02.034

Byrd, A. L., Kahn, R. E., & Pardini, D. A. (2013). A validation of the Inventory of Callous-Unemotional Traits in a community sample of young adult males. Journal of Psychopathology and Behavioral Assessment, 35(1), 20–34. https://doi.org/10.1007/s10862-012-9315-4

Cardinale, E. M., & Marsh, A. A. (2020). The reliability and validity of the Inventory of Callous-Unemotional Traits: A meta-analytic review. Assessment, 27(1), 57–71. https://doi.org/10.1177/1073191117747392

Ciucci, Baroncelli, Franchi, Golmaryami, F. N., & Frick, P. J. (2014). The association between callous-unemotional traits and behavioral and academic adjustment in children: Further validation of the Inventory of Callous-Unemotional Traits. Journal of Psychopathology and Behavioral Assessment, 36(2), 189–200. https://doi.org/10.1007/s10862-013-9384-z

Colins, O. F., Andershed, Hawes, S. W., Bijttebier, & Pardini, D. A. (2016). Psychometric properties of the original and short form of the Inventory of Callous-Unemotional Traits in detained female adolescents. Child Psychiatry and Human Development, 47(5), 679–690. https://doi.org/10.1007/s10578-015-0601-8

Cooke, E. M., Schuurman, N. K., & Zheng (2022). Examining the within- and between-person structure of a short form of the Positive and Negative Affect Schedule: A multilevel and dynamic approach. Psychological Assessment, 34(12), 1126–1137. https://doi.org/10.1037/pas0001167

Corbelli, Levantini, Muratori, Senese, V. P., Bravaccio, Pisano, Catone, & Paciello (2024). Assessing callous-unemotional traits across early adolescence: Further evaluation of short versions. Child Psychiatry and Human Development. Advance online publication. https://doi.org/10.1007/s10578-024-01746-7

Craig, S. G., Goulter, & Moretti, M. M. (2021). A systematic review of primary and secondary callous-unemotional traits and psychopathy variants in youth. Clinical Child and Family Psychology Review, 24(1), 65–91. https://doi.org/10.1007/s10567-020-00329-x

13.

Deng

Wang

M.-C.

Zhang

Shou

Gao

Luo

(2019). The Inventory of Callous-Unemotional Traits: A reliability generalization meta-analysis. Psychological Assessment, 31(6), 765–780. https://doi.org/10.1037/pas0000698

14.

Docherty

Boxer

Huesmann

L. R.

Bushman

B. J.

Anderson

C. A.

Gentile

D. A.

Dubow

E. F.

(2024). Within-person bidirectional associations over time between parenting and youths’ callousness. Journal of Clinical Child and Adolescent Psychology, 53(4), 607–622. https://doi.org/10.1080/15374416.2023.2188554

15.

Fleeson

(2004). Moving personality beyond the person-situation debate: The challenge and the opportunity of within-person variability. Current Directions in Psychological Science, 13(2), 83–87. https://doi.org/10.1111/j.0963-7214.2004.00280.x

16.

Fleming

G. E.

Neo

Briggs

N. E.

Kaouar

Frick

P. J.

Kimonis

E. R.

(2022). Parent training adapted to the needs of children with callous–unemotional traits: A randomized controlled trial. Behavior Therapy, 53(6), 1265–1281. https://doi.org/10.1016/j.beth.2022.07.001

17.

Fontaine

N. M. G.

Rozéfort

Bégin

(2023). Associations between callous-unemotional traits and psychopathology in a sample of adolescent females. Journal of the Canadian Academy of Child and Adolescent Psychiatry, 32(1), 14–26.

18.

Frick

P. J.

(2004). The Inventory of Callous–Unemotional Traits. [Unpublished rating scale]. The University of New Orleans.

19.

Frick

P. J.

Ray

J. V.

Thornton

L. C.

Kahn

R. E.

(2014a). Annual research review: A developmental psychopathology approach to understanding callous-unemotional traits in children and adolescents with serious conduct problems. Journal of Child Psychology and Psychiatry, 55(6), 532–548. https://doi.org/10.1111/jcpp.12152

20.

Frick

P. J.

Ray

J. V.

Thornton

L. C.

Kahn

R. E.

(2014b). Can callous-unemotional traits enhance the understanding, diagnosis, and treatment of serious conduct problems in children and adolescents? A comprehensive review. Psychological Bulletin, 140(1), 1–57. https://doi.org/10.1037/a0033076

21.

Goodman

Meltzer

Bailey

(1998). The Strengths and Difficulties Questionnaire: A pilot study on the validity of the self-report version. European Child & Adolescent Psychiatry, 7(3), 125–130. https://doi.org/10.1007/s007870050057

22.

Goulter

Cooke

E. M.

Zheng

(2024). Callous-unemotional traits in adolescents’ daily life: Associations with affect and emotional and conduct problems. Research on Child and Adolescent Psychopathology, 52(1), 51–63. https://doi.org/10.1007/s10802-023-01077-6

23.

Harris

P. A.

Taylor

Minor

B. L.

Elliott

Fernandez

O’Neal

McLeod

Delacqua

Kirby

Duda

S. N.

(2019). The RedCap consortium: Building an international community of software platform partners. Journal of Biomedical Informatics, 95, 103208. https://doi.org/10.1016/j.jbi.2019.103208

24.

Hawes

S. W.

Byrd

A. L.

Henderson

C. E.

Gazda

R. L.

Burke

J. D.

Loeber

Pardini

D. A.

(2014). Refining the parent-reported Inventory of Callous-Unemotional Traits in boys with conduct problems. Psychological Assessment, 26(1), 256–266. https://doi.org/10.1037/a0034718

25.

Hsu

H.-Y.

Kwok

O-m

Lin

J. H.

Acosta

(2015). Detecting misspecified multilevel structural equation models with common fit indices: A Monte Carlo study. Multivariate Behavioral Research, 50(2), 197–215. https://doi.org/10.1080/00273171.2014.977429

26.

Bentler

P. M.

(1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling: A Multidisciplinary Journal, 6(1), 1–55. https://doi.org/10.1080/10705519909540118

27.

Kemp

E. C.

Ray

J. V.

Frick

P. J.

Thornton

L. C.

Myers

T. D. W.

Robertson

E. L.

Steinberg

Cauffman

(2024). The Inventory of Callous-Unemotional Traits (ICU) self-report version: Factor structure, measurement invariance, and predictive validity in justice-involved male adolescents. Psychological Assessment, 36(9), 562–571. https://doi.org/10.1037/pas0001322

28.

Khojasteh

W.-J.

(2015). Investigating the sensitivity of goodness-of-fit indices to detect measurement invariance in a bifactor model. Structural Equation Modeling: A Multidisciplinary Journal, 22(4), 531–541. https://doi.org/10.1080/10705511.2014.937791

29.

Kimonis

E. R.

Frick

P. J.

Skeem

J. L.

Marsee

M. A.

Cruise

Munoz

L. C.

Aucoin

K. J.

Morris

A. S.

(2008). Assessing callous-unemotional traits in adolescent offenders: Validation of the Inventory of Callous-Unemotional Traits. International Journal of Law and Psychiatry, 31(3), 241–252. https://doi.org/10.1016/j.ijlp.2008.04.002

30.

Cooke

E. M.

Zheng

(2025). Dynamic links between daily anxiety symptoms and young adults’ daily well-being. Anxiety, Stress, & Coping, 38(3), 349–364. https://doi.org/10.1080/10615806.2024.2403437

31.

Muthén

B. O.

Muthén

L. K.

(1998–2017). Mplus User’s Guide (8th ed). Muthén & Muthén.

32.

Paiva-Salisbury

M. L.

Gill

A. D.

Stickle

T. R.

(2017). Isolating trait and method variance in the measurement of callous and unemotional traits. Assessment, 24(6), 763–771. https://doi.org/10.1177/1073191115624546

33.

Perlstein

Fair

Hong

Waller

(2023). Treatment of childhood disruptive behavior disorders and callous-unemotional traits: A systematic review and two multilevel meta-analyses. Journal of Child Psychology and Psychiatry, 64(9), 1372–1387. https://doi.org/10.1111/jcpp.13774

34.

Radloff

L. S.

(1977). The CES-D scale: A self-report depression scale for research in the general population. Applied Psychological Measurement, 1(3), 385–401. https://doi.org/10.1177/014662167700100306

35.

Ray

J. V.

Frick

P. J.

(2020). Assessing callous-unemotional traits using the total score from the Inventory of Callous-Unemotional Traits: A meta-analysis. Journal of Clinical Child and Adolescent Psychology, 49(2), 190–199. https://doi.org/10.1080/15374416.2018.1504297

36.

Reise

S. P.

Kim

D. S.

Mansolf

Widaman

K. F.

(2016). Is the bifactor model a better model or is it just better at modeling implausible responses? Application of iteratively reweighted least squares to the Rosenberg Self- Esteem Scale. Multivariate Behavioral Research, 51(6), 818–838. https://doi.org/10.1080/00273171.2016.1243461

37.

Revelle

Condon

D. M.

(2019). Reliability from α to ω: A tutorial. Psychological Assessment, 31(12), 1395–1411. https://doi.org/10.1037/pas0000754

38.

Rodriguez

Reise

S. P.

Haviland

M. G.

(2016). Applying bifactor statistical indices in the evaluation of psychological measures. Journal of Personality Assessment, 98(3), 223–237. https://doi.org/10.1080/00223891.2015.1089249

39.

Schuberth

D. A.

Zheng

Pasalich

D. S.

McMahon

R. J.

Kamboukos

Dawson-McClure

Brotman

L. M.

(2019). The role of emotion understanding in the development of aggression and callous-unemotional features across early childhood. Journal of Abnormal Child Psychology, 47(4), 619–631. https://doi.org/10.1007/s10802-018-0468-9

40.

Sliwinski

M. J.

(2011). Approaches to modeling intraindividual and interindividual facets of change for developmental research. In Fingerman

K. L.

Berg

C. A.

Smith

Antonucci

T. C.

(Eds.), Handbook of life-span development (pp. 1–25). Springer.

41.

Soto

C. J.

Tackett

J. L.

(2015). Personality traits in childhood and adolescence: Structure, development, and outcomes. Current Directions in Psychological Science, 24(5), 358–362. https://doi.org/10.1177/0963721415589345

42.

Thunnissen

M. R.

van den Hoofdakker

B. J.

Nauta

M. H.

(2022). Youth psychopathology in daily life: Systematically reviewed characteristics and potentials of ecological momentary assessment applications. Child Psychiatry & Human Development, 53(6), 1129–1147. https://doi.org/10.1007/s10578-021-01177-8

43.

Waller

Hyde

L. W.

(2017). Callous-unemotional behaviors in early childhood: Measurement, meaning, and the influence of parenting. Child Development Perspectives, 11(2), 120–126. https://doi.org/10.1111/cdep.12222

44.

Waller

Wright

A. G. C.

Shaw

D. S.

Gardner

Dishion

T. J.

Wilson

M. N.

Hyde

L. W.

(2015). Factor structure and construct validity of the parent-reported Inventory of Callous-Unemotional Traits among high-risk 9-year-olds. Assessment, 22(5), 561–580. https://doi.org/10.1177/1073191114556101

45.

Walz

L. C.

Nauta

M. H.

aan het Rot

(2014). Experience sampling and ecological momentary assessment for studying the daily lives of patients with anxiety disorders: A systematic review. Journal of Anxiety Disorders, 28(8), 925–937. https://doi.org/10.1016/j.janxdis.2014.09.022

46.

Wang

M. C.

Shou

Liang

Lai

Zeng

Chen

Gao

(2020). Further validation of the Inventory of Callous-Unemotional Traits in Chinese children: Cross-informants invariance and longitudinal invariance. Assessment, 27(7), 1668–1680. https://doi.org/10.1177/1073191119845052

47.

Watts

A. L.

Greene

A. L.

Bonifay

Fried

E. I.

(2024). A critical evaluation of the p-factor literature. Nature Reviews Psychology, 3(2), 108–122. https://doi.org/10.1038/s44159-023-00260-2

48.

Widaman

K. F.

Ferrer

Conger

R. D.

(2010). Factorial invariance within longitudinal structural equation models: Measuring the same construct across time. Child Development Perspectives, 4(1), 10–18. https://doi.org/10.1111/j.1750-8606.2009.00110.x

49.

Wright

A. G. C.

Simms

L. J.

(2016). Stability and fluctuation of personality disorder features in daily life. Journal of Abnormal Psychology, 125(5), 641–656. https://doi.org/10.1037/abn0000169

50.

Zheng

Cooke

E. M.

Zheng

(2023). Capturing hassles and uplifts in adolescents’ daily lives: Links of daily experiences with physical and mental well-being. Journal of Youth and Adolescence, 52(1), 177–194. https://doi.org/10.1007/s10964-022-01682-6

51.

Zheng

(2025). Understanding the within- and between-person structure of daily psychopathology among adolescents and young adults. Assessment, 32(6), 899–920. https://doi.org/10.1177/10731911241283908

52.

Zheng

(2024). Daily links between leisure activities, stress, and well-being during the transition to university. Applied Developmental Science, 58(4), 580–595. https://doi.org/10.1080/10888691.2023.2259789

53.

Zheng

Goulter

(2024). Introduction to the special issue: Novel insights into the externalizing psychopathology spectrum in childhood and adolescence from intensive longitudinal data. Research on Child and Adolescent Psychopathology, 52(1), 1–6. https://doi.org/10.1007/s10802-023-01154-w

54.

Zheng

Pasalich

D. S.

(2025). Daily associations between parental warmth and discipline and adolescent conduct problems and callous-unemotional traits. Prevention Science, 26(4), 519–529. https://doi.org/10.1007/s11121-024-01740-4

55.

Zheng

Zhang

Huang

Goulter

(2021). Validation and measurement invariance of the Inventory of Callous-Unemotional Traits in Chinese incarcerated and normative samples. Law and Human Behavior, 45(6), 542–553. https://doi.org/10.1037/lhb0000461

Examining the Within- and Between-Person Structure of Callous-Unemotional Traits in Adolescents and Young Adults in Daily Life
