Abstract
Temperament traits are early appearing and relatively stable phenotypic profiles of behavior that are present across space and time. This definition invariably reflects the timescale imposed when gathering repeated measures of our variables of interest and our reliance on aggregate, mean-level values. However, if the timescale of observations is shortened and the frequency of observations is increased, underlying or latent fluctuations and variability may emerge. Embedding short-term fluctuations into slower developmental trajectories may improve our understanding of behavior in the moment while also strengthening prediction. Researchers should embrace a more granular timescale in research, incorporating new technology and analytical approaches, enhancing our ability to capture developmental change. This article illustrates how shifting timescales can provide new insight into social, behavioral, and cognitive processes across development.
Four decades of research have centered on the form, distribution, and functional impact of individual differences in temperament beginning in the first months of life (Pérez-Edgar & Fox, 2018). Contemporaneous, cross-sectional data show that we can create reproducible profiles of temperament based on the intensity, longevity, and external triggers of specific emotional and behavioral responses to the environment. For example, as early as 4 months of age, infants who show vigorous limb movements, arching of the back, and negative emotion to novel sensory stimuli are defined as having a negative reactive temperamental profile (Anaya et al., 2024). In toddlerhood, children who freeze, withdraw, and refuse to engage in the face of novel social and nonsocial stimuli are defined as displaying behavioral inhibition (BI; Kagan, 2018).
Longitudinally, we can move across time and see how these early temperamental profiles are linked to each other and to other broader developmental outcomes. For example, infants with high negative reactivity are more likely to become behaviorally inhibited toddlers (Kagan, 2018). Behaviorally inhibited toddlers, in turn, are more likely to become socially withdrawn in middle childhood and then socially anxious as adolescents (Clauss & Blackford, 2012). In addition, children who maintain higher BI scores over time (i.e., stability) are more likely to have greater anxiety symptoms (Chronis-Tuscano et al., 2009). Currently, BI is the best characterized individual difference factor predicting social anxiety, in part because of its early emergence and relative stability. This work has proven to be robust, replicable, and a solid foundation for the creation of interventions and preventive programs (Rapee & Bayer, 2018). However, as strong as these associations are, mean-level relations may hide a great deal of heterogeneity under the surface. Indeed, only 40% to 60% of children high in BI go on to have an anxiety disorder.
Temperament research relies on imposing temporal ordering on data to capture patterns of relative “stability” or “change.” We anticipate that children will exhibit a good deal of stability in temperamental traits, and our conclusions about development are often drawn from mean-level relations between variables. That is, children are given aggregate scores with respect to infant reactivity, BI, or social anxiety. Children with higher scores on one measure tend to have higher scores on the other. Relations between averages are often taken to imply stable tendencies and static relations, both concurrently and prospectively (Petersen, 2024).
However, we cannot say that an individual’s presentation of a temperamental profile is by necessity also stable within any one point in time or over a course of days, hours, or minutes. Rather, research suggests that there can be dynamic patterns of fluctuations and change beneath the surface presentation of relative stability that are meaningful and observable (see Fig. 1; Anaya et al., 2021b). Indeed, not only are there individual differences in variability within traits as presented in the moment, but the pattern, magnitude, and stability of this variability provide additional insight into how an individual functions in the moment, predict trajectories over time, and reveal new interrelations among variables. These insights can be evident even among variables whose ordering seemed obvious and partially fixed based on mean-level comparisons. Thus, we believe that temperament researchers, and developmental scientists broadly, should diversify the temporal scale of their work to capture more nuanced patterns of stability and change (Table 1).

Fig. 1. Hypothetical distribution of performance at different timescales. The solid black line represents infant performance by age (in months) on a gold-standard A-not-B task. Although the aggregate performance is relatively smooth as infants move from floor to mastery, there are underlying levels of variability. The blue lines represent these levels of variability from trial to trial and in shorter retesting intervals. Note that variability is greater in “active” developmental phases rather than at the floor and ceiling.
Table 1. Issues Often Considered by Researchers to Account for Both Immediate (Micro) and Long-Term (Macro) Developmental Changes When Designing Longitudinal Studies
These ideas are not new, or highly original, given the robust literature on the temporal dynamics of psychological constructs (Adolph et al., 2008; Burt & Obradović, 2013; Helm et al., 2018; Oravecz & Brick, 2019). However, their application to the temperament literature does help us better understand constructs that are reflexively thought of as relatively stable, in part because of the timescale we impose. Much of the work in developmental science, including temperament research, focuses on the macroscale, looking at broad developmental changes over the course of months and years (Hollenstein et al., 2013). Other work focuses on the mesoscale, examining changes that are evident across multiple interactions or perturbations of the developmental system. Finally, work on the microscale examines fluctuations and variation within a particular task. Thus, a behavioral trait is defined as much by timescale as domain (Oravecz & Brick, 2019). We can create a profile of stability by defining a timescale that does not allow us to capture fluctuations or heterogeneity in state. Increasing or decreasing the frequency of our measurements of a behavior could potentially change our definition of what a trait is and how stable that trait may be.
If we presume that the individual is a complex dynamic system (Ram & Gerstorf, 2009), we are compelled to examine variability. If we shorten the timescale of change from a month or a year to mere minutes or seconds, we can capture more nuanced behavioral change with multiple instances of positive and negative variations from the mean (McKone & Silk, 2022). In some cases, behaviors and processes may revert such that relations evident in one moment may vanish in another, only to then return. This coupling and uncoupling of variables, although potentially random, could also be linked to variations in the eliciting components of the environment at that moment in time, including task (e.g., what the child is attempting to do), context (e.g., whether the environment is conducive to stable performance), or the child’s current state (i.e., their current mood, affect, or motivation). For example, Gabel et al. (2023) modeled variability in young children’s positive and negative emotion across emotionally evocative laboratory tasks. Emotion variability, in turn, predicted internalizing symptoms above and beyond mean emotion levels across tasks. In this case, averaging over repeated measures would not filter out noise but rather suppress a true predictive signal.
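The logic of testing whether variability predicts an outcome beyond the mean can be sketched in a few lines. This is a minimal illustration and not the model used in the cited study: the data layout (rows as children, columns as tasks), the function names, and the use of the within-child standard deviation with ordinary least squares are all assumptions made for the example.

```python
import numpy as np

def mean_and_variability(scores):
    """Return each child's mean and within-child SD across tasks.

    scores: 2-D array, rows = children, columns = tasks (hypothetical layout).
    """
    scores = np.asarray(scores, dtype=float)
    child_mean = scores.mean(axis=1)
    child_sd = scores.std(axis=1, ddof=1)  # sample SD across tasks
    return child_mean, child_sd

def incremental_r2(child_mean, child_sd, outcome):
    """R^2 gained by adding variability to a mean-only linear model."""
    outcome = np.asarray(outcome, dtype=float)

    def r2(predictors):
        # Ordinary least squares with an intercept column.
        X = np.column_stack([np.ones(len(outcome))] + predictors)
        beta, *_ = np.linalg.lstsq(X, outcome, rcond=None)
        resid = outcome - X @ beta
        return 1.0 - resid.var() / outcome.var()

    return r2([child_mean, child_sd]) - r2([child_mean])

# Two hypothetical children with identical means but different variability.
m, s = mean_and_variability([[3, 3, 3, 3], [1, 5, 1, 5]])
```

In this toy case the two children are indistinguishable on the aggregate score, so any outcome that tracks their fluctuation would be invisible to a mean-only analysis.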
Ironically, attitudes toward change in performance have somewhat stymied the conversation between researchers who examine broader windows of developmental change and researchers interested in moment-to-moment fluctuations. Researchers who study behavior at fast timescales must embed momentary behavioral change within the broader context of development. Thus, developmental change can be seen as a quantitative confound. Within-person deviations or variability are dependent on the person-level mean of that variable (Anvari et al., 2023). Extreme scores either at the ceiling or floor, by definition, constrain variability. There is a sweet spot for examining a task to capture maximal within- and between-person variability, either because of the current maturational stage of participants or task characteristics that do not find a stable central tendency through either learning or maturation (see Fig. 1).
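The dependence of variability on the mean can be made concrete for any bounded score. As a sketch under stated assumptions (a score with fixed lower and upper limits, and variability indexed by the population standard deviation), the largest possible standard deviation for a given mean follows from the Bhatia-Davis inequality. The function name and the example scale are hypothetical.

```python
import math

def max_within_person_sd(person_mean, scale_min, scale_max):
    """Largest population SD attainable for scores bounded in
    [scale_min, scale_max] that average to person_mean.

    Uses the Bhatia-Davis bound: var <= (max - mean) * (mean - min).
    """
    if not scale_min <= person_mean <= scale_max:
        raise ValueError("mean must lie within the scale bounds")
    return math.sqrt((scale_max - person_mean) * (person_mean - scale_min))

# Proportion-correct scores on a hypothetical task scored from 0 to 1.
ceiling_case = max_within_person_sd(0.9, 0.0, 1.0)  # near ceiling
middle_case = max_within_person_sd(0.5, 0.0, 1.0)   # mid-range
```

A child near the floor or ceiling simply has less room to vary than a child in the middle of the scale, which is why comparisons of variability are most informative when means fall in the mid-range.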
As an example, researchers often attempt to disentangle initial emotional reactivity from subsequent emotion regulation. This is quite difficult when higher order mean values can reflect both processes in action. Careful work at smaller, dynamic timescales can disentangle these processes by examining how fine-grain reactivity markers are modulated by regulatory behaviors to shift variable intensity, velocity, and acceleration over time (Cole & Hollenstein, 2018; Morales et al., 2018). At the neural level, the “variability architecture” of the brain appears to decrease with age, whereas spatiotemporal variability is associated with levels of self-regulation and negative affect (Guassi Moreira et al., 2019).
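Intensity, velocity, and acceleration can be approximated from a densely sampled series with finite differences. The sketch below is illustrative only and is not the modeling approach of the cited papers, which estimate these quantities within formal dynamic models. The function name and the coded intensity series are hypothetical.

```python
import numpy as np

def intensity_dynamics(intensity, times):
    """Approximate first and second derivatives of an intensity series.

    intensity: coded emotion intensity at each sample (hypothetical coding).
    times: sample times in seconds, same length as intensity.
    Returns (velocity, acceleration) as arrays of the same length.
    """
    intensity = np.asarray(intensity, dtype=float)
    times = np.asarray(times, dtype=float)
    velocity = np.gradient(intensity, times)      # rate of change
    acceleration = np.gradient(velocity, times)   # change in the rate
    return velocity, acceleration

# Hypothetical series: intensity rises quadratically over six seconds.
t = np.arange(6.0)
vel, acc = intensity_dynamics(t ** 2, t)
```

Estimates at the first and last samples rely on one-sided differences and are less accurate than interior values, so short series should be interpreted with caution.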
Within the temperament literature, distinct patterns emerge when examining mean-level between-subject differences. Under this lens, BI children often appear ruminative, rigid, and hypervigilant. When moving across contexts or across time, they often revert to past strategies, even when current events would call for different or more adaptive responses (Pérez-Edgar, 2018). For this reason, the intuition is that BI children show relatively little to no change over time, thus leading to social difficulties and increased risk for psychopathology. However, a smaller timescale, focused on within-person characteristics, may reveal greater degrees of variability and fluctuation (Gunther, Anaya, Myruski, et al., 2023). We should not assume that the relative stability seen in averages derived at one timescale is necessarily reflected in similar patterns at smaller timescales.
Repeated-measures data, whether micro or macro, provide the foundation for asking questions about how people exhibit both stable trait (central tendency) and fluctuating state (individual data point relative to central tendency) behaviors across time. For example, we examined variation over the first 2 years of life in relations between maternal anxiety, infant negative affect, and attention bias to affective stimuli (Vallorani et al., 2023). Measures were collected at five assessments, allowing us to capture both stable trait and fluctuating state associations. At the stable trait level (across assessments), mothers exhibiting more anxiety had infants who exhibited more affect-biased attention, matching the previous literature (Morales et al., 2017). However, at the fluctuating state level (individual assessments), mothers farther away from their trait levels of anxiety (more fluctuations) had infants who remained closer to their trait levels (more stability) of affect-biased attention and negative affect. This relation may reflect how individual infants respond to the fluctuating signals from caregivers. Indeed, analyses in the same sample (Gunther, Anaya, Myruski, et al., 2023) found that when mothers showed greater variability in attention bias, infants showed decreases in negative affect over time. Greater rigidity, in contrast, was associated with potentiated increases in negative affect.
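The trait and state components described above can be separated by centering each person's scores on that person's own mean. The following is a minimal sketch that assumes a wide data layout (rows as persons, columns as assessments, missed assessments coded as NaN). The function names are hypothetical, and the published analyses used formal multilevel models rather than this simple decomposition.

```python
import numpy as np

def split_trait_state(scores):
    """Decompose repeated measures into trait and state components.

    scores: 2-D array, rows = persons, columns = assessments; NaN = missed.
    trait: each person's mean across available assessments.
    state: each assessment's deviation from that person's own trait level.
    """
    scores = np.asarray(scores, dtype=float)
    trait = np.nanmean(scores, axis=1)
    state = scores - trait[:, None]
    return trait, state

def state_fluctuation(state):
    """Average absolute distance from one's own trait level."""
    return np.nanmean(np.abs(state), axis=1)

# Two hypothetical mothers measured at three assessments.
trait, state = split_trait_state([[2, 4, 6], [5, 5, 5]])
fluctuation = state_fluctuation(state)
```

Trait scores can then be related across people, whereas state deviations can be related within people from one assessment to the next, which is how associations at the two levels can differ in strength or even direction.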
Individual heterogeneity in the face of evocative or complex stimuli may further illuminate the underlying correlates of temperament and social behavior. Indeed, it may be that the literature shows stability and rigidity in individual responses to social stimuli over time because researchers have presented rigid and noncontingent stimuli that cannot evoke underlying heterogeneity (Fu & Pérez-Edgar, 2019; McKone & Silk, 2022). For example, data suggest that BI and social anxiety are linked to an attentional bias to social threat. However, depending on the stimuli used and the population of children, the effect waxes, wanes, and even reverses itself (Van Bockstaele et al., 2021), and patterns vary when examining mean-level bias versus variability in bias (Zvielli et al., 2015). Many studies (including our own) rely on static faces presenting canonical emotions (e.g., happy or angry). There is a conceptual leap from the momentary processing and response to noncontingent, flat two-dimensional faces to the behaviors of children as they engage in their social world. Indeed, Fu et al. (2019) showed variation in the relations between computer-based tasks using proxy stimuli and attention patterns in active social engagements.
Using mobile eye-tracking glasses, Pérez-Edgar et al. (2020) found that BI children can show either attention to (Gunther et al., 2021) or avoidance of (Gunther, Fu, et al., 2022) social partners on the basis of the specific demands of the task (e.g., acute fear). Moment-to-moment fluctuations in visual attention can also be traced, and these patterns are differentially related to individual variation in anxiety risk. As researchers move toward greater ecological validity and provide more opportunity for children to show a wider range of behavior, we could quite logically expect different concurrent and predictive relations to emerge between variables. Context, contingency, and ambiguity may be necessary ingredients to elicit greater heterogeneity within and among children. Indeed, these are the very characteristics that seem to trigger and maintain anxious behaviors and feelings. For example, neural correlates of negative affect and emotional self-regulation (Guassi Moreira et al., 2019) may be best revealed when incorporating complex social stimuli to examine inter- and intraparticipant variation.
Fast timescales may provide insight into an important mechanism of development—systematic plasticity and adaptability in the face of environmental change. The ability and willingness to explore new ways of interacting with the environment and experiencing positive and negative outcomes create a robust psychological toolbox that can be pulled from as needed. These new experiences and tools may lead to the developmental change that we have traditionally captured at the mean level. For example, in a study by MacNeill et al. (2018) in which infants tackled a Piagetian A-not-B task at monthly intervals, the authors observed substantial variation among infants, including the age at which they mastered the task. For many children, their outward behavior appeared static, if not stuck, for months at a time. The authors also collected resting EEG over those same months, and here change was occurring at the neural level month by month. Observable behavioral change appeared dramatic and nonlinear but was associated with linear neural changes.
It may be that in the short term within-person change is unstructured and more tethered to dynamic in-the-moment characteristics. However, over time, fluctuations may become more structured and goal-directed. Inherent levels of variability in the moment or in short time periods may allow us to probe for the capacity to change. If a system truly is inert or unchanging in the face of environmental pressures, the individual may not be ready for or capable of developmental progression.
We may also ask whether excessive variability or instability in the moment inhibits change. That is, a system may be so transient that it cannot coalesce on a new, more stable, form of behavior, even in the aggregate. Thus, the lack of developmental change over time at the surface, or the rigidity of a particular behavioral style, may reflect hypervariable dynamic fluctuations in the moment.
A focus on dynamic systems can also expand our ability to capture individuals as they are embedded in a social context. A dynamic systems approach often assumes that development is nonlinear and actively looks for small perturbations that generate disproportionately large outcomes. As a corollary, a dynamic systems approach requires that we examine real-time processes, which tend to fluctuate rapidly in shorter timescales. One recent set of studies illustrates these relations by capturing vagal flexibility.
Vagal flexibility indexes nonlinear changes in the parasympathetic nervous system as children face shifting demands across tasks and contexts (Shakiba et al., 2023). The literature linking vagal reactivity and socioemotional behavior has been inconsistent, in part because analyses typically rely on averages across a baseline. Hassan and Schmidt (2024) pointed out that vagal activity at baseline reflects the capacity to regulate in the face of a task or stressor. The actual introduction of a task or a stressor, in contrast, likely captures in-the-moment reactivity and regulatory efforts that require the implementation of a skill that was up to that point hypothetical. Broad aggregates may obscure variation in active responses to task demands from moment to moment (Ugarte et al., 2023). With respect to temperament specifically, Wagner et al. (2023) found that behaviorally inhibited children who displayed less vagal flexibility (i.e., more rigidity) across multiple novel social tasks were at increased risk for anxiety.
If the individual is a dynamic system, then surely instances of social engagement are as well. That is, the behavior of two children in a dyadic interaction will reflect both their trait-like temperamental characteristics (e.g., BI) and their ability to adjust their behavioral and emotional responses to social signals in the moment (Anaya et al., 2021a). We have often approached people as autonomous free-floating entities that occasionally interact with or bump into others. However, just as we can use rapid sampling in short time windows to reveal how activity in two brain regions changes moment to moment (Gunther, Petrie, et al., 2022), we can also examine dyadic processes across two individuals (Perlman et al., 2022). For example, mother-infant neural synchrony during the phases of a still-face paradigm has been shown to be associated with individual differences in maternal anxiety and infant temperament (Gunther, Anaya, Fisher, et al., 2023). Among 4- to 6-year-olds, changes in concurrent parent-child neural synchrony from free play to when completing a difficult puzzle are associated with temperamental effortful control and fear (Rocha-Hidalgo et al., 2024). In addition, neural similarity between young adult friends when observing videos of past shared social encounters is linked to individual differences in reported shared affect and social anxiety (Vallorani et al., 2024).
Historically, analytic and technical constraints often forced researchers to treat the individual as a self-contained or static unit. However, modern analytic approaches can more powerfully capture change within and across measures (Burt & Obradović, 2013; Oravecz & Brick, 2019). Relations, including cycles, lags, and synchrony in developmental function, are increasingly evident (Cole & Hollenstein, 2018; Helm et al., 2018). We could imagine that young children completing a task might show different levels of variability across trials of a task at the first attempt versus the third or fifth attempt, or if they were to repeat the task a year later. These differences in fluctuation based on timing might then fuel differences at the aggregate level as performance presumably improves over time.
Within a measure, such as vagal tone or eye gaze, microscale changes are nested within a larger task or time window. Depending on the specifics of the data, we now have an arsenal of analytic tools, including growth curve models, multilevel models, and dynamic structural equation models. Each approach has specific assumptions for the data that must be accounted for at the earliest stages of task design. For example, with eye-gaze data, a reliance on fixation-to-fixation data will generate a large number of data points that are also dotted with missing data (Vallorani et al., 2022). Unless the analytic approach can tolerate missingness, it may be necessary to aggregate to larger timescales to eliminate missing cells. In addition, it is possible to go too fine-grained such that the rate of data collection produces slices that cannot reconstruct the construct of interest. For example, although we can meaningfully capture neural activity at the level of milliseconds with EEG, attempting to capture facial expressions at the same timescale will yield uninterpretable data. Finally, the construct of interest should not be disturbed by our measurement. For example, although we can capture heart rate moment to moment in a task, we cannot capture interoceptive accuracy at the same time.
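Aggregating to a coarser timescale to reduce missing cells can be sketched as follows. This is an illustration under assumptions and not a recommended pipeline: the bin size, the 50% validity threshold, and the function name are hypothetical choices that would need to be justified for a given measure.

```python
import numpy as np

def aggregate_bins(samples, bin_size, min_valid=0.5):
    """Average a fine-grained series into coarser bins, tolerating gaps.

    samples: 1-D series with NaN marking missing samples (e.g., lost gaze).
    bin_size: number of consecutive samples per bin.
    min_valid: minimum proportion of non-missing samples required to keep
        a bin; bins below the threshold are returned as NaN.
    Trailing samples that do not fill a complete bin are dropped.
    """
    samples = np.asarray(samples, dtype=float)
    n_bins = len(samples) // bin_size
    binned = samples[: n_bins * bin_size].reshape(n_bins, bin_size)
    valid = ~np.isnan(binned)
    counts = valid.sum(axis=1)
    sums = np.where(valid, binned, 0.0).sum(axis=1)
    keep = (counts / bin_size >= min_valid) & (counts > 0)
    means = np.full(n_bins, np.nan)
    means[keep] = sums[keep] / counts[keep]
    return means

# Hypothetical gaze series with dropouts, aggregated into bins of three.
coarse = aggregate_bins([1, np.nan, 3, np.nan, np.nan, np.nan, 2, 2, 2], 3)
```

Larger bins remove more missing cells but also smooth away the fluctuations of interest, so the bin size is itself a decision about timescale.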
A second layer of complexity emerges when examining dyadic data because we now have to account for two moving relations (Perlman et al., 2022). Although the current article does not lend itself to deep discussion, we can highlight the work of Helm et al. (2018), who examined how to model physiological synchrony and noted the multiple forms relations can take, including trend, concurrent, and lagged synchrony, and the needed analytic approaches. We do want to highlight their admonition that trend synchrony may reflect shared responses to general conditions (e.g., two individuals completing a puzzle) that may artificially inflate the idiosyncratic person-to-person connection that is captured by concurrent and lagged synchrony.
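The distinction among trend, concurrent, and lagged synchrony can be illustrated with simple correlations. This sketch is not the multilevel approach described by Helm et al. (2018); it assumes two equally spaced series of the same length, removes only a linear trend, and uses hypothetical function names and simulated values.

```python
import numpy as np

def detrend(series):
    """Remove a linear trend so shared drift does not inflate synchrony."""
    series = np.asarray(series, dtype=float)
    t = np.arange(len(series))
    slope, intercept = np.polyfit(t, series, 1)
    return series - (slope * t + intercept)

def lagged_correlation(x, y, lag=0):
    """Correlate x at time t with y at time t + lag (lag >= 0).

    lag = 0 gives concurrent synchrony; lag > 0 asks whether x leads y.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if lag > 0:
        x, y = x[:-lag], y[lag:]
    return np.corrcoef(x, y)[0, 1]

# Simulated dyad: both partners drift upward across a task, but their
# moment-to-moment fluctuations move in opposite directions.
t = np.arange(40)
wiggle = np.sin(t)
partner_a = 0.5 * t + wiggle
partner_b = 0.5 * t - wiggle
raw = lagged_correlation(partner_a, partner_b)
concurrent = lagged_correlation(detrend(partner_a), detrend(partner_b))
```

In this simulated dyad the raw correlation is strongly positive because of the shared trend, whereas the detrended concurrent correlation is negative, which mirrors the caution that trend synchrony can mask or inflate the person-to-person connection.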
The emerging literature suggests that researchers may need two different types of measurements when examining temperamental variation, social behavior, and associated psychological outcomes, along with additional associated considerations (Table 1). First, we must keep the strong foundation of measures that examine variation in a trait, defining the trait via relative stability across larger time windows. In parallel, we can incorporate measures at a faster sampling rate to capture fluctuations at the shorter timescale. Both sets of data are needed to make the hypothesized leap between short-term process-level fluctuations and more stable patterns observable at a phenotypic level. Given that change in response to contextual demands is a core dynamic that elicits observable developmental trajectories (McKone & Silk, 2022), we need to think about the scale of our time measures from years to milliseconds. A methodical shift up and down the timescale will strengthen our understanding of phenotypic traits by capturing developmental mechanisms across their nested timescales of change.
Recommended Reading
Cole, P. M., & Hollenstein, T. (2018). (See References). Outlines the ways in which our metrics of emotion regulation are dependent on timescales.
Fu, X., & Pérez-Edgar, K. (2019). (See References). Thorough theoretical and empirical analysis of methodologies available for assessing patterns of attention, particularly in children.
Helm, J. L., Miller, J. G., Kahle, S., Troxel, N. R., & Hastings, P. D. (2018). (See References). Outlines several approaches for measuring and modeling physiological synchrony, including usable Mplus code.
Morales, S., Ram, N., Buss, K. A., Cole, P. M., Helm, J. L., & Chow, S. M. (2018). (See References). Empirical example of applying dynamic approaches to short time intervals of behavior in young children.
Pérez-Edgar, K., & Fox, N. A. (Eds.). (2018). (See References). Examines multiple perspectives and empirical approaches for studying behavioral inhibition.
