Sage Journals: Discover world-class research

Abstract

States are increasingly important in personality theory and research. Yet, the assessment of personality states usually relies on ad hoc measures whose development and evaluation are largely separated from theoretical considerations. To enable theory-guided development and evaluation of personality state measures, we introduce a framework based on the revised latent state-trait (LST-R) theory. The theory defines latent states as the expectation of an observed measure given a person in a specific situation, which can be decomposed into latent traits and latent situation-specific state residuals. Consequently, items and scales can be evaluated for their reliability due to latent traits (consistency) and situation-specific influences (specificity). We propose that specificity, in particular, is an appealing property for instruments designed to assess personality states. We illustrate this framework with experience sampling data on personality states. Our framework has implications for both the conceptualisation and the assessment of personality states. On the theoretical side, we provide a formal definition of personality states, which enables integration between trait-, process-, and development-focused theories. On the practical side, we show how using LST-R models allows researchers to develop and evaluate state measures on their own terms rather than applying criteria for trait measures to assess the qualities of state measures.

Plain language summary

A personality state is made up of the feelings, thoughts, behaviours, and/or desires a person experiences at a particular moment in time. Although personality is often thought of in terms of stable traits, personality states can fluctuate from moment to moment. Contemporary personality theories hold that personality traits reflect the distribution of states a person experiences over time. Despite the theoretical importance of personality states, personality (and other) researchers often lack the tools to assess states. One reason for this is the lack of a framework which can guide the development and evaluation of personality state measures. We propose such a framework. In doing so, we build on the revised latent state-trait (LST-R) theory. LST-R theory enables us to explicitly define personality states and evaluate personality state measures. In particular, we propose that personality state measures should be specific, which means that they reliably capture moment-to-moment fluctuations in personality states. We illustrate this framework by re-analysing personality states captured through experience sampling methods. Our framework has implications for both the conceptualisation and the assessment of personality states. On the theoretical side, we provide a formal definition of personality states, which enables integration between trait-, process-, and development-focused theories. On the practical side, we show how using LST-R models allows researchers to develop and evaluate measures which capture the most important properties of personality states.

Keywords

assessment, experience sampling, latent state-trait theory, personality states, scale development

Introduction

Personality states are increasingly important in personality research (Baumert et al., 2017; Fleeson & Jayawickreme, 2015; Horstmann & Ziegler, 2020). They are crucial ingredients of contemporary theoretical frameworks such as whole trait theory (WTT), which describes personality traits as distributions of states (Fleeson, 2001; Fleeson & Jayawickreme, 2015; Jayawickreme et al., 2019), and in recent work on personality dynamics (Danvers et al., 2020; Sosnowska et al., 2020). From a practical perspective, the ubiquity of smartphones has enabled researchers to gather intensive longitudinal data on personality states in everyday life (Hamaker & Wichers, 2017; Van Berkel et al., 2017). This has led to a proliferation of research on the variation of personality states across contexts and relationships (e.g. Church et al., 2013; Geukes et al., 2017; Kuper et al., 2022). Increasingly, personality states have also found conceptual and empirical applications outside of personality psychology. Examples span across clinical (Clark et al., 2003; Wright & Simms, 2016) and organisational psychology (Abrahams et al., 2023; Beckmann et al., 2021; Huang & Ryan, 2011; Judge et al., 2014; Nübold & Hülsheger, 2021) as well as computer science (Kalimeri et al., 2013; Staiano et al., 2011). This interdisciplinary popularity is a success story for personality psychology, but also underlines the importance of coherent approaches to the assessment of personality states.

Despite the theoretical and empirical popularity of personality states, there remains a surprising degree of uncertainty about the development, evaluation, and interpretation of personality state measures (Horstmann & Ziegler, 2020). This is all the more problematic as many personality and applied researchers may not be aware of these uncertainties. A recent survey of the field highlights that researchers typically develop and evaluate such measures ad hoc (Horstmann & Ziegler, 2020). As a consequence, the theoretical conceptualisation of personality states and empirical practices are frequently ill-aligned with each other. In particular, researchers may repurpose methods and criteria designed for trait constructs to validate and evaluate state measures (e.g. using cross-sectional confirmatory factor analysis to establish the reliability of state measures, which conflates the trait and state components of the construct). This mismatch between the theoretical construct of personality states on the one hand and empirical assessment practices on the other hand highlights the need for a unified framework which links the definition and assessment of personality states.

In this article, we define personality states within the revised latent state-trait theory (LST-R theory; Steyer et al., 2015; Steyer et al., 1999). On this basis, we provide a comprehensive framework for developing, evaluating, and interpreting state measures. The paper begins with a brief review of definitions of personality states and current approaches to measuring state constructs. In doing so, we highlight discrepancies between their theoretical conception and their assessment in practice. Second, we introduce a definition of personality states within the framework of LST-R theory (Steyer et al., 1999, 2015) and discuss plausible LST-R models for (intensive) longitudinal data. Third, we define several desiderata for personality state measures within the LST-R framework. Fourth, we illustrate how models of LST-R theory can be used to evaluate personality state measures using an empirical example. Finally, we discuss the implications of our framework for the theoretical conceptualisation of personality states and for the development, evaluation, and interpretation of personality state measures in practice.¹

States in contemporary personality theory

A prominent definition describes personality states as “quantitative dimension[s] describing the degree/extent/level of coherent behaviours, thoughts, and feelings at a particular time” (Baumert et al., 2017, p. 528). This definition contains two central elements: the coherence of internal characteristics and behaviour, and the temporal constraint to a particular situation. The first element reflects contemporary theories which define personality as individual differences along dimensions of coherent behaviours, thoughts, and feelings (Baumert et al., 2017). The emphasis on coherence also distinguishes personality states from purely idiosyncratic momentary experiences (DeYoung, 2015; McCabe & Fleeson, 2012). At this point, let us introduce our running example: Consider Ahmed, who has just arrived at his friend Bouke’s apartment for a party. Ahmed feels energised and is eager to meet new people. In this situation, Ahmed can be said to experience a state of Extraversion, reflecting his coherent behaviours (attending a party), feelings (feeling energised), and motives (meeting new people). However, part of Ahmed’s experience is idiosyncratic (e.g. when he recognises the door to Bouke’s apartment from his previous visits). In defining personality states as describing coherent behaviours, thoughts, and feelings, this idiosyncratic part of his experience is excluded from the construct.

The second element of the definition of personality states constrains them to a particular point in time. In practice, personality states are often elicited by asking what a person feels, thinks, or does ‘at this moment’ or ‘in this situation’ (Horstmann & Ziegler, 2020). It is implied that they may have felt, thought, and behaved differently earlier, and that they might again feel, think, and behave differently in the future. Indeed, while individual rank-order differences on many personality dimensions (‘traits’) can be fairly stable across situations and time (Bleidorn et al., 2021, 2022; Henry et al., 2022; Mõttus et al., 2019; Seifert et al., 2022), the extent to which an individual experiences or exhibits specific behaviours, thoughts, or feelings at a particular time (‘states’) may fluctuate from situation to situation (Baumert et al., 2017; DeYoung, 2015; Fleeson & Jayawickreme, 2015). Recall Ahmed, who is in a state of Extraversion at the party. The next morning, he may feel tired and cancel his brunch date (a more introverted state). Both of these situations are part of a distribution of states whose central tendency may fall somewhere between them.

Whole Trait Theory provides the most explicit account of the relationship between relatively stable individual differences in traits on the one hand and intrapersonal variation in states on the other hand. In doing so, WTT distinguishes between a descriptive and an explanatory (sub-)model. Descriptively, personality traits reflect the conceptually corresponding states a person experiences across situations (Fleeson, 2001; Fleeson & Jayawickreme, 2015), such that trait measures should correspond closely to the central tendency of the density distribution of states (Fleeson, 2001; Fleeson & Gallagher, 2009; Rauthmann et al., 2019). Explanatorily, personality traits correspond to social-cognitive mechanisms which process inputs and output personality states. Individual differences in social-cognitive mechanisms (as well as stable individual differences in inputs) can explain interindividual variation in traits (i.e. the central tendency of the distribution of states), while situation-to-situation variation in inputs can explain intraindividual variation in states.

Other theoretical accounts largely concur with the descriptive side of WTT, but diverge in their explanation of the relationship between states and traits. Cybernetic Big Five theory (CB5T; DeYoung, 2015) describes traits as individual differences in the parameters of cybernetic systems and posits that traits are situationally specific in that they ‘describe responses to specific classes of stimuli’. Similarly, interactionist or affordance-based theories hold that situational factors afford (or, conversely, constrain) the expression of specific traits in behaviour (e.g. Columbus et al., 2019; de Vries et al., 2016; Hilbig et al., 2018; Horstmann et al., 2021; Mischel & Shoda, 1995; Shoda & Mischel, 2000; Tett & Burnett, 2003; Thielmann et al., 2020; Zettler & Hilbig, 2010). Although such interactionist models are largely silent on personality states, they imply that interindividual differences in traits interact with intraindividual variation in inputs to produce variation in the central tendency and shape of the distribution of behaviours. Contemporary theories thus broadly agree that traits and states are linked because the interaction of social-cognitive mechanisms (or cybernetic system) and situation-specific inputs (or affordances) produces both cross-situational interindividual differences in traits and intraindividual variability in states across situations.

The substantive content of personality states

Personality dimensions encompass broad factors (e.g. HEXACO or Big Five dimensions), but also narrower facets and nuances lower in the personality hierarchy (Mõttus et al., 2017). Personality states can equally be positioned at any level of the personality hierarchy. WTT further requires that personality states ‘have the same affective, behavioural, and cognitive content as a corresponding trait’ (Fleeson & Jayawickreme, 2015). This isomorphism arises from the definition of traits as distributions of states (although the theory imposes the empirically derived structure of personality traits back onto personality states). However, it is imaginable that the structure of personality states differs from the structure of traits. Therefore, we consider structural trait-state isomorphism an empirical question rather than part of the definition of personality states (Rauthmann et al., 2019).

The proposed content of personality states reflects empirical analyses of Big Five trait scales, which have distinguished between references to affect, behaviours, cognitions, and desires (Wilt & Revelle, 2015; Zillig et al., 2002). Of these, behaviour arguably deserves a special status. Some researchers have equated personality states with behaviour (e.g. Horstmann et al., 2021) or have shown that feelings, thoughts, and motives co-vary with corresponding behaviours (McCabe & Fleeson, 2012). Other frameworks suggest that situational factors may afford or constrain the expression of internal states in overt behaviour (Columbus, Böhm, Moshagen, & Zettler, in prep; Columbus et al., 2019; Thielmann et al., 2020). Consequently, while feelings, thoughts, and desires may be aligned with behaviour in situations which afford their expression, this alignment may not exist in the absence of relevant affordances. Returning to our example, at the party, Ahmed’s energy and motivation to make new connections may express themselves in approaching people. Had Ahmed missed his train, however, he may have limited opportunity to express these internal states. In an affordance framework, the conceptualisation of personality states may thus explicitly exclude behaviour, and corresponding state measures should omit items with purely behavioural content.

Conceptualising and measuring personality states

Several recent publications have highlighted the demand for measures designed to assess personality states (Baumert et al., 2017; Horstmann & Ziegler, 2020; Ringwald et al., 2022). For example, a consensus article by a group of personality psychologists states that ‘we must have measures of cognitive, affective, motivational, and behavioural states under specified situational conditions’ (Baumert et al., 2017, p. 517). One reason that such measures are lacking may be that existing theoretical frameworks underspecify the criteria by which personality state scales should be evaluated. In the absence of dedicated measures, research on personality states has often relied on measures constructed ad hoc by adapting items from trait measures, typically by appending phrases such as ‘in this situation…’ (Horstmann & Ziegler, 2020). Such items were originally worded to capture stable individual differences across situations rather than moment-to-moment fluctuations (Horstmann & Ziegler, 2020). Trait scale evaluation also prioritises consistency over specificity², for example, by relying on factor analyses of cross-sectional data. Consequently, the use of such adapted trait scales may result in overly narrow distributions of states and may attenuate associations of personality states with situation-specific antecedents or consequences. Instead, we argue that personality state scales should be designed to be highly specific in order to capture moment-to-moment fluctuations in coherent affect, cognitions, and desires.

We propose a framework for conceptualising and measuring personality states. Our conceptualisation begins with the definition of personality states as coherent characteristics of a person at a particular time. To capture these core theoretical desiderata of coherence and specificity in a formal definition of personality states, we draw on LST-R theory (Steyer et al., 1999, 2015). LST-R theory defines states as latent variables representing an individual’s characteristics at a particular time in a specific situation. These latent state variables in turn reflect the influences of the person’s immutable characteristics and past experiences (i.e. traits) as well as purely situation-specific influences and person-situation interactions (i.e. state residuals). As such, the conceptualisation of personality states within an LST-R framework is commensurable with the explanatory models of WTT and CB5T without committing to a particular form of the social-cognitive mechanisms or cybernetic system which give rise to traits and states.

By decomposing the variance of observed variables into components due to traits (consistency) and due to systematic situation-specific influences (specificity) as well as measurement error, LST-R theory is particularly well-suited to assess the psychometric properties of state measures. By incorporating autoregressive effects, LST-R theory can also be used to model trait change as a consequence of situational experience in intensive longitudinal data (Eid et al., 2017; Stadtbaeumer et al., 2022). Below, we first introduce the fundamental definitions of states, traits, and state residuals in LST-R theory. In the second step, we translate these definitions into latent variable models for intensive longitudinal data.

Revised latent state-trait theory

Definitions of states, traits, and state residuals in LST-R theory

The basic idea underlying LST-R theory is that a person cannot be assessed in a situational vacuum. That is, observations made at a single time point capture a situation-specific characteristic of the person (state), which may depend on the characteristics of this person independent of the situation (trait), the situation, and the interaction between characteristics of the person and the situation (Steyer et al., 1999). Returning to our example, Ahmed’s state Extraversion may depend on how extraverted he is in general, on the fact that he is at a party (which may make him more extraverted), and on the situation-specific effects of his general Extraversion (as a generally extraverted person, Ahmed may become more outgoing at a party, whereas Bouke, a more introverted person, may feel less comfortable in the same situation). Typically, the items measuring such a state will not be perfectly reliable, which means that the observed variable (i.e. responses on a state measure) also contains measurement error.

Formalising these ideas based on probability theory, each observed variable, Y_it, can be described as the sum of its latent state variable, τ_it, and a measurement error variable, ε_it,

$$Y_{it} = τ_{it} + ε_{it}, \tag{1}$$

with i denoting the specific indicator (or item) and t denoting the time point (or measurement occasion). In our example, a value of the observed variable corresponds to Ahmed’s response on a measure of Extraversion as he arrives at the party, which reflects his true state Extraversion (the value on the latent state variable) as well as measurement error.

Importantly, the definition of states and traits in LST-R theory is based on a dynamic concept of a person, which assumes that individuals undergo trait changes over time due to their experiences. More specifically, time-specific person variables, U_t, are used to indicate that a person at time point t may differ from the person at the previous time point t − 1 because of the situation realised at time t − 1, the observations made at time t − 1, and the experiences between t − 1 and t (Steyer et al., 2015). For instance, Ahmed’s general level of Extraversion might change from time 1 to time 2 due to the situation occurring at time 1 (e.g. making new friends at the party), due to filling out a questionnaire at time 1 (e.g. because reflecting on the items encourages him to behave more boldly), or due to experiences between the two time points (e.g. becoming the victim of a mugging on the way home from the party).

The latent state variable is defined as the conditional expectation of the observed variable given the person variable, U_t, and the situation variable, S_t, at time t:

$$τ_{it} := E(Y_{it} \mid U_t, S_t). \tag{2}$$

Here, the := symbol indicates a definition and E (.|.) indicates a conditional expectation. In other words, the latent state variable represents the (error-free) state of a person being in a specific situation at a particular time point. Measurement error is the random deviation of the observed variable from this expected value. The latent state variable can further be decomposed into a latent trait variable, ξ_it, and a latent state residual variable, ζ_it, according to

$$τ_{it} = ξ_{it} + ζ_{it}. \tag{3}$$

The latent trait variable is defined as the conditional expectation of the observed variable given the person variable, U_t, at time t:

$$ξ_{it} := E(Y_{it} \mid U_t). \tag{4}$$

A value on the latent trait variable can be interpreted as the expected value for an individual at a particular time across all possible situations that could occur at this time point. Although the latent trait of a person is thus situation-independent, it is defined for a particular time point and can therefore be regarded as a time-specific disposition that is open to change (Eid et al., 2017). Conceptually, the latent trait reflects the influences of the person’s immutable characteristics and experiences up to that point (i.e. Ahmed’s general level of Extraversion) on their personality state in all possible situations that this person may encounter. This corresponds to the parameters of the cybernetic system of CB5T, which likewise may change as a result of new experiences (DeYoung, 2015). However, LST-R models have a decontextualised view on latent traits and assume that these are not situation-specific.

The latent state residual variable, in turn, is defined as the difference between the latent state and trait variables,

$$ζ_{it} := τ_{it} - ξ_{it}. \tag{5}$$

It represents a systematic situation-specific deviation from the trait variable and may include both situation effects and person × situation interactions. A positive value on the latent state residual variable would indicate that the state of a person in a specific situation was higher than expected based on the trait level of that person. Conceptually, the latent state residual reflects the influences of characteristics of the situation (i.e. the effect of being at a party) as well as person-situation interactions (i.e. the effect of being at a party conditional on Ahmed’s general level of Extraversion) on a personality state in the given situation.

Taken together, an observed variable can be decomposed as follows:

$$Y_{it} = ξ_{it} + ζ_{it} + ε_{it}. \tag{6}$$

This means that Ahmed’s score on a state Extraversion measure is composed of his true Extraversion trait level, a situation-specific deviation from his Extraversion trait level, and measurement error. In principle, each observed variable has its own indicator- and time-specific latent state, trait, state residual, and measurement error variable (as evident from the subscripts i and t). In practice, however, estimating such a model is not possible because the observed variables do not provide sufficient information to infer the free parameters of all latent variables, resulting in mathematical non-identification. Nevertheless, by making assumptions about the equivalence of the latent variables or about the stability and change in the latent variables across time, identified and estimable models can be derived. In the following, we present two types of models suitable for analysing state measures: the basic multistate-singletrait model and its extension including autoregressive effects, which is particularly appropriate for intensive longitudinal data.
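The decomposition in Equation 6 can be made concrete with a small simulation. The following is a minimal sketch rather than an analysis from this article: the variances (0.5, 0.3, 0.2), the normal distributions, and all variable names are illustrative assumptions. It generates mutually independent trait, state residual, and error components, adds them up, and recovers the share of observed variance due to each source.

```python
import numpy as np

rng = np.random.default_rng(seed=1)
n_persons = 200_000  # large sample so empirical variances sit close to the assumed ones

# Hypothetical component variances (illustrative values, not estimates from the article)
var_trait, var_state_resid, var_error = 0.5, 0.3, 0.2

trait = rng.normal(0.0, np.sqrt(var_trait), n_persons)              # xi
state_resid = rng.normal(0.0, np.sqrt(var_state_resid), n_persons)  # zeta
error = rng.normal(0.0, np.sqrt(var_error), n_persons)              # epsilon

latent_state = trait + state_resid  # Equation 3: tau = xi + zeta
observed = latent_state + error     # Equation 1: Y = tau + epsilon

var_y = observed.var()
consistency = trait.var() / var_y         # share of variance due to the trait
specificity = state_resid.var() / var_y   # share due to situation-specific influences
reliability = latent_state.var() / var_y  # share due to the latent state

print(f"consistency={consistency:.3f}, specificity={specificity:.3f}, "
      f"reliability={reliability:.3f}")
```

Because the simulated components are independent, consistency and specificity add up to reliability apart from sampling noise. In real data the components are not observed directly, which is why identified latent variable models are needed.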

Models of LST-R theory

Multistate-singletrait model

The multistate-singletrait model depicted in Figure 1 assumes that all observed variables assessed at a particular time point share the same latent state variable, τ_t. Therefore, only one common latent state variable per time point is specified. These variables represent, for instance, the level of state Extraversion at each time point. However, the observed variables might measure the latent state variable on a different metric, for example, due to using a different rating scale or differing in item difficulty (items might vary in the level of Extraversion that is required to agree to an item) or discrimination (some items might be more representative of the Extraversion domain [e.g. an item asking whether a person currently talks to others] than other items [e.g. an item asking whether a person is currently physically active]). Therefore, intercepts, ν_it, and factor loadings, λ_it, can be introduced that are allowed to vary between the indicators and time points:³

$$Y_{it} = ν_{it} + λ_{it} \cdot τ_t + ε_{it}. \tag{7}$$

Figure 1.

Alternative Representations of the Multistate-Singletrait Model. Note. The models in Panel A and B are equivalent in terms of the implied variance-covariance matrix and mean structure as well as model fit. For the sake of clarity, the mean structure of the model is omitted. ξ_t = latent trait variables; τ_t = latent state variables; ζ_t = latent state residual variables; Y_it = manifest variables; ε_it = measurement error variables; λ_it = regression coefficients of the latent state variables on the indicators; $λ_{T_{t}}$ = regression coefficients of the latent trait variable ξ₁; i = indicator; t = time.

It may also be assumed that the intercepts and factor loadings only vary between the indicators but not between time points, such that the latent state variables are measured invariantly across time. To identify the model, the intercept of the first indicator is typically set to 0 and the factor loading of the first indicator is set to 1 (note that the intercepts are omitted in Figure 1 for clarity).

As can be seen in Figure 1(a), the time-specific latent state variables are further decomposed into time-specific latent trait variables (ξ_t; e.g. the trait Extraversion level at each time point) and time-specific state residual variables (ζ_t; e.g. the situation-specific deviations of the state extraversion level from the trait Extraversion level). Importantly, although the latent trait variables are time-specific, they are assumed to represent functions of the latent trait variable at the first time point, as indicated by the paths from ξ₁ to ξ₂ and ξ₃. The scale on which the latent trait variables are measured may thus change over time but this change can be perfectly described by a linear function (i.e. adding the constant α_t and multiplying by the constant $λ_{T_{t}}$ ),

$$ξ_t = α_t + λ_{T_t} \cdot ξ_1,$$

with α₁ = 0 and $λ_{T_1}$ = 1. This implies that the latent trait variables correlate perfectly and that the rank-order of the individuals remains stable across time. As such, the latent state variables can alternatively be represented as sharing a common latent trait variable, defined as the latent trait variable at the first time point (see Figure 1(b); Eid et al., 2017). This common latent trait variable influences all latent state variables, but the influence does not have to be stable over time (i.e. the factor loadings $λ_{T_t}$ may differ across time).⁴ In other words, this model assumes that the trait (e.g. of Extraversion) remains stable over the duration of the study, although its influence on personality states may decrease (or grow). Given the short duration of most experience sampling or daily diary surveys compared to the time scale of personality change (Bleidorn et al., 2021), such an assumption may seem plausible.
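To show what the multistate-singletrait model implies for the data, the sketch below builds the model-implied covariance matrix of the observed variables from a set of parameters. It is an illustration under stated assumptions: the function name `msst_implied_cov`, the argument layout, and the numerical values are hypothetical, and the state residuals are taken to be uncorrelated with each other and with the common trait.

```python
import numpy as np

def msst_implied_cov(loadings, trait_loadings, var_trait1, var_state_resid, var_error):
    """Model-implied covariance matrix of a multistate-singletrait model.

    loadings:        (n_indicators, n_times) array of lambda_it
    trait_loadings:  (n_times,) array of lambda_T_t
    var_trait1:      variance of the common trait xi_1
    var_state_resid: (n_times,) variances of zeta_t
    var_error:       (n_indicators, n_times) variances of epsilon_it
    Rows and columns are ordered time-major: Y_11, Y_21, ..., Y_12, Y_22, ...
    """
    loadings = np.asarray(loadings, dtype=float)
    lam_t = np.asarray(trait_loadings, dtype=float)
    n_ind, n_times = loadings.shape

    # Latent states: shared trait part plus uncorrelated state residuals
    cov_states = var_trait1 * np.outer(lam_t, lam_t) + np.diag(var_state_resid)

    # Each indicator loads only on the latent state of its own time point
    big_lambda = np.zeros((n_ind * n_times, n_times))
    for t in range(n_times):
        big_lambda[t * n_ind:(t + 1) * n_ind, t] = loadings[:, t]

    theta = np.diag(np.asarray(var_error, dtype=float).T.ravel())
    return big_lambda @ cov_states @ big_lambda.T + theta

# Hypothetical example: two indicators, three time points, unit loadings
sigma = msst_implied_cov(
    loadings=np.ones((2, 3)),
    trait_loadings=[1.0, 1.0, 1.0],
    var_trait1=0.5,
    var_state_resid=[0.3, 0.3, 0.3],
    var_error=np.full((2, 3), 0.2),
)
print(np.round(sigma, 2))
```

In this example, two indicators from the same time point share both trait and state residual variance, whereas indicators from different time points share only the trait variance. This is the pattern that lets the model separate consistency from specificity.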

The multistate-singletrait model assumes that the stability of the latent state variables across time is entirely explained by the common latent trait factor and implies that the state residual variables, capturing the variability across time, are uncorrelated. This may be reasonable if the lag between time points is long or if the construct under investigation is highly stable, such that the situation-specific deviations from the general trait level are independent of each other. However, for more fluctuating constructs or in experience sampling studies in which individuals are assessed repeatedly within short periods of time, situational influences may carry over to subsequent time points, making the multistate-singletrait model inappropriate. For example, when Ahmed feels more extraverted than usual because he is at a party at time 1 (i.e. his Extraversion state deviates from his trait Extraversion level), this might affect how extraverted he feels at time 2, especially if these time points are closely spaced (e.g. because having made new friends at the party, Ahmed becomes more lively and communicative). To account for such carry-over effects between adjacent time points, LST-R models can be extended to include autoregressive effects.

Multistate-singletrait model with autoregressive effects

There are different approaches to formulating models with autoregressive effects in line with LST-R theory (Stadtbaeumer et al., 2022).⁵ Here, we present the trait-state occasion (TSO) model (Cole et al., 2005) as reformulated by Eid et al. (2017) according to LST-R theory, which represents a multistate-singletrait model with autoregressive effects. The basic idea of the reformulated TSO model is that individuals exhibit relatively stable levels of a construct over time, but that the situations or life events encountered by individuals (situation effects) and the reaction to those situations (person × situation interactions) can change the trait level. In contrast to the basic multistate-singletrait model, this implies that the latent state residual variables have an effect on all subsequent latent trait variables, as shown by the autoregressive effects, $λ_{S_{t}}$ , in Figure 2 (also compare with Figure 1(a)). The trait variables, ξ_t, from the second time point onwards are thus assumed to be determined by a linear combination of the trait variable at the first time point, ξ₁, and the state residuals at all previous time points, ζ_{t−1}, …, ζ₁. Therefore, the time-specific trait variables are not constrained to be perfectly correlated over time, implying that the rank-order of individuals regarding their trait levels may change (Geiser, 2021).

Figure 2.

Multistate-Singletrait Model with Autoregressive Effects. Note. For the sake of clarity, the mean structure of the model is omitted. ξ_t = latent trait variables; τ_t = latent state variables; ζ_t = latent state residual variables; Y_it = manifest variables; ε_it = measurement error variables; λ_it = regression coefficients of the latent state variables on the indicators; $λ_{T_{t}}$ = regression coefficients of the latent trait variable ξ₁; $λ_{S_{t}}$ = regression coefficients of the latent state residual variables (autoregressive effects); i = indicator; t = time.

To illustrate this idea, let us return to our example. At the beginning of our study, Ahmed has a certain level of trait Extraversion, captured by the value of the time-specific trait variable, ξ₁. At the first time point, Ahmed is at a party, which causes his state Extraversion to be higher than expected based on his time-specific trait level, implying a positive value for the first state residual variable, ζ₁. Ahmed’s experiences at the party strengthen his social self-esteem, which increases his trait level of Extraversion at the second time point, ξ₂. As a consequence, Ahmed is now more lively and communicative in various situations. At the second time point, Ahmed experiences social rejection, leading to a state Extraversion level lower than expected based on the time-specific trait variable, ξ₂, and a negative value for the state residual variable, ζ₂. This again can affect the subsequent time-specific trait level, ξ₃, by lowering the situation-independent Extraversion level at the third time point, but arguably less so than if Ahmed had not experienced a positive situation at the first time point. In other words, situation-specific experiences influence all future trait levels. Importantly, however, the effect of state residual variables on the trait variables is expected to fade out over time (implied by the multiplication of the autoregressive effects over multiple time points, see Figure 2; Eid et al., 2017). Thus, sustained trait change is more likely to occur through repeated situational experiences (Wrzus & Roberts, 2017).
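The fading of carry-over effects can be illustrated with a few lines of code. The sketch below assumes the indexing used in the example column of Table 1, where the first state residual enters the trait at the third time point with weight $λ_{S_{2}} \cdot λ_{S_{3}}$ and the second state residual with weight $λ_{S_{3}}$. The function name `carryover_weights` and the autoregressive values of 0.5 are hypothetical.

```python
import math

def carryover_weights(autoregressive, t):
    """Weights with which earlier state residuals enter the trait at time t.

    autoregressive: dict mapping time point k (k >= 2) to lambda_S_k
    Returns a dict {s: weight of zeta_s in xi_t} for s = 1, ..., t - 1,
    where each weight is the product of lambda_S_k for k = s + 1, ..., t.
    """
    return {
        s: math.prod(autoregressive[k] for k in range(s + 1, t + 1))
        for s in range(1, t)
    }

# Hypothetical autoregressive effects of 0.5 at every time point
lambda_s = {2: 0.5, 3: 0.5, 4: 0.5}
weights = carryover_weights(lambda_s, t=4)
print(weights)  # {1: 0.125, 2: 0.25, 3: 0.5}
```

With autoregressive effects below 1 in absolute value, the weight of a state residual shrinks with every additional time point. A single situational experience therefore leaves only a small trace on later trait levels.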

Coefficients of LST-R theory

One of our aims is to propose criteria for evaluating the ability of personality state measures to pick up on systematic fluctuations from situation to situation. The variance decomposition provided by LST-R models is a particular advantage for this purpose because it allows scrutinising to what extent the observed variables reflect trait versus state influences and it makes it possible to separate these systematic effects from random measurement error. To quantify the influence of these different sources of variance, several coefficients have been defined. Table 1 presents the general formulae of these coefficients and illustrates their calculation for an exemplary indicator in the model with autoregressive effects shown in Figure 2.

Table 1.

Variance decomposition coefficients in LST-R theory.

Coefficient	General formula	Example
Reliability	$\mathrm{Rel}(Y_{it}) = \frac{\mathrm{Var}(τ_{it})}{\mathrm{Var}(Y_{it})} = 1 - \frac{\mathrm{Var}(ε_{it})}{\mathrm{Var}(Y_{it})}$	$\mathrm{Rel}(Y_{23}) = \frac{λ_{23}^{2} \cdot \mathrm{Var}(τ_{3})}{\mathrm{Var}(Y_{23})} = \frac{λ_{23}^{2} \cdot (λ_{T_{3}}^{2} \cdot \mathrm{Var}(ξ_{1}) + λ_{S_{2}}^{2} \cdot λ_{S_{3}}^{2} \cdot \mathrm{Var}(ζ_{1}) + λ_{S_{3}}^{2} \cdot \mathrm{Var}(ζ_{2}) + \mathrm{Var}(ζ_{3}))}{λ_{23}^{2} \cdot (λ_{T_{3}}^{2} \cdot \mathrm{Var}(ξ_{1}) + λ_{S_{2}}^{2} \cdot λ_{S_{3}}^{2} \cdot \mathrm{Var}(ζ_{1}) + λ_{S_{3}}^{2} \cdot \mathrm{Var}(ζ_{2}) + \mathrm{Var}(ζ_{3})) + \mathrm{Var}(ε_{23})}$
Specificity	$\mathrm{Spe}(Y_{it}) = \frac{\mathrm{Var}(ζ_{it})}{\mathrm{Var}(Y_{it})}$	$\mathrm{Spe}(Y_{23}) = \frac{λ_{23}^{2} \cdot \mathrm{Var}(ζ_{3})}{\mathrm{Var}(Y_{23})}$
Consistency	$\mathrm{Con}(Y_{it}) = \frac{\mathrm{Var}(ξ_{it})}{\mathrm{Var}(Y_{it})}$	$\mathrm{Con}(Y_{23}) = \frac{λ_{23}^{2} \cdot \mathrm{Var}(ξ_{3})}{\mathrm{Var}(Y_{23})} = \frac{λ_{23}^{2} \cdot (λ_{T_{3}}^{2} \cdot \mathrm{Var}(ξ_{1}) + λ_{S_{2}}^{2} \cdot λ_{S_{3}}^{2} \cdot \mathrm{Var}(ζ_{1}) + λ_{S_{3}}^{2} \cdot \mathrm{Var}(ζ_{2}))}{\mathrm{Var}(Y_{23})}$
Predictability by trait 1	$\mathrm{Pred}_{\mathrm{trait\,1}}(Y_{it}) = \frac{λ_{T_{t}}^{2} \cdot \mathrm{Var}(ξ_{i1})}{\mathrm{Var}(Y_{it})}$	$\mathrm{Pred}_{\mathrm{trait\,1}}(Y_{23}) = \frac{λ_{23}^{2} \cdot λ_{T_{3}}^{2} \cdot \mathrm{Var}(ξ_{1})}{\mathrm{Var}(Y_{23})}$
Unpredictability by trait 1	$\mathrm{UPred}_{\mathrm{trait\,1}}(Y_{it}) = \mathrm{Con}(Y_{it}) - \mathrm{Pred}_{\mathrm{trait\,1}}(Y_{it})$	$\mathrm{UPred}_{\mathrm{trait\,1}}(Y_{23}) = \frac{λ_{23}^{2} \cdot (λ_{S_{2}}^{2} \cdot λ_{S_{3}}^{2} \cdot \mathrm{Var}(ζ_{1}) + λ_{S_{3}}^{2} \cdot \mathrm{Var}(ζ_{2}))}{\mathrm{Var}(Y_{23})}$

Note. The example shows the calculation of the coefficients for the second indicator of the third latent state variable in the model with autoregressive effects shown in Figure 2. Depending on the specific definition of a model (e.g. assuming a common state variable at each time point and imposing invariance constraints on the factor loadings across time), the calculation of the coefficients may differ from the general formulae presented. In models without autoregressive effects, the calculation of the coefficients simplifies because all $λ_{S_{t}}^{2}$ become 0. Coefficients for the predictability and unpredictability by trait 1 are only defined in models with autoregressive effects for observed variables from the second time point onwards because the latent trait variable at the first time point is unaffected by latent state residual variables.

The reliability coefficient gives the proportion of variance in an observed variable that is explained by the latent state variable, or, in other words, the proportion of variance that is not due to random measurement error (Steyer et al., 2015). The reliability of an observed variable is the sum of the proportion of variance explained by the latent state residual variable (captured by the specificity coefficient)⁶ and the proportion of variance explained by the latent trait variable (captured by the consistency coefficient). The specificity coefficient thus represents the degree to which the score on an observed variable is determined by situation-specific influences and the interaction between the person and the situation, with higher values indicating that the observed variable reflects a more state-like construct (Steyer et al., 2015). The consistency coefficient, in turn, represents the degree to which the score on an observed variable is determined by stable individual differences, with higher values indicating that an observed variable reflects a more trait-like construct (Steyer et al., 2015).

In models with autoregressive effects, the latent trait variable at a specific time point is influenced by both the latent trait variable at the first time point and previous state residuals, which allows the consistency coefficient to be decomposed further into these two components. The proportion of variance in an observed variable that is explained by the first latent trait variable is given by the predictability by trait 1 coefficient, indicating the degree to which the score on an observed variable reflects trait differences at the first time point. The proportion of variance in an observed variable that is explained by previous state residuals is given by the unpredictability by trait 1 coefficient, indicating the degree to which the score on an observed variable reflects carry-over effects (Eid et al., 2017). Unpredictability by trait 1 thus indicates to what degree individual differences in states can be attributed to trait changes as a result of diverging experiences over the course of the study.
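The relations among the five coefficients can be checked numerically. The following minimal sketch computes them for an indicator at the third time point, following the example column of Table 1. All parameter values are hypothetical and chosen only for illustration, and the function name `lst_r_coefficients` is not part of any package.

```python
def lst_r_coefficients(loading, lam_t3, lam_s2, lam_s3,
                       var_xi1, var_zeta1, var_zeta2, var_zeta3, var_eps):
    """Variance decomposition for one indicator at time point 3 (cf. Table 1)."""
    load_sq = loading ** 2
    # Trait variance at time 3: part predictable by trait 1 plus carry-over part
    var_pred = lam_t3 ** 2 * var_xi1
    var_carry = (lam_s2 ** 2 * lam_s3 ** 2 * var_zeta1
                 + lam_s3 ** 2 * var_zeta2)
    var_xi3 = var_pred + var_carry
    # Latent state variance = trait variance + state residual variance
    var_tau3 = var_xi3 + var_zeta3
    # Observed variance = systematic variance + measurement error variance
    var_y = load_sq * var_tau3 + var_eps
    return {
        "reliability": load_sq * var_tau3 / var_y,
        "specificity": load_sq * var_zeta3 / var_y,
        "consistency": load_sq * var_xi3 / var_y,
        "pred_trait1": load_sq * var_pred / var_y,
        "upred_trait1": load_sq * var_carry / var_y,
    }


# Hypothetical parameter values (assumptions, not estimates from any study)
coefs = lst_r_coefficients(
    loading=1.0, lam_t3=1.0, lam_s2=0.5, lam_s3=0.5,
    var_xi1=0.4, var_zeta1=0.3, var_zeta2=0.3, var_zeta3=0.3, var_eps=0.3,
)
for name, value in coefs.items():
    print(f"{name}: {value:.3f}")
```

By construction, reliability equals the sum of specificity and consistency, and consistency equals the sum of predictability and unpredictability by trait 1.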

Criteria for evaluating personality state measures

A recent review by Horstmann and Ziegler (2020) has documented that research on personality states largely relies on ad hoc measures. Existing personality state measures have rarely undergone a thorough psychometric evaluation with respect to their validity and reliability (for an exception, see Ringwald et al., 2022). If the validity and reliability of ad hoc state measures are assessed at all, this is often done using methods designed for trait measures (Horstmann & Ziegler, 2020; Wright & Zimmermann, 2019). Here, we describe a framework for developing and evaluating personality state scales on their own terms, and define criteria by which to do so.

LST-R theory provides a coherent framework for defining and measuring personality states. In line with contemporary personality theories, we have defined personality states as coherent affect, cognitions, and desires at a particular time, which can be formalised as latent state variables in LST-R theory. Within the LST-R theory framework, we can define criteria for good personality state measures, that is, criteria to assess the coherence and temporal specificity of proposed state measures. These criteria can be used to evaluate existing scales and to develop new instruments.

Theory-consistent personality state measures should fulfil four key criteria: (1) Correspondence between the internal structure of the measure and the theoretical structure of the construct; (2) longitudinal measurement invariance; (3) ability to capture systematic intrapersonal fluctuation (i.e. specificity); and (4) fidelity to the substantive interpretation of the construct (i.e. validity). We define the first three criteria within the LST-R framework and describe how LST-R models can be extended to examine the measure’s convergent, discriminant, and criterion validity. For an overview of the proposed criteria, see Table 2.

Table 2.

Criteria to evaluate the validity and reliability of state measures within an LST-R framework.

Criterion	Description	Recommendations
Validity	Evidence and theory support the interpretation of scores in line with the intended use of the measure.	• Specify the intended purpose of the measure.
		• Consider various sources of evidence to evaluate the validity of the measure.
Test content	The content of the measure is relevant to and representative of the construct that is intended to be measured.	• Provide a precise definition of the target construct.
		• Develop a comprehensive set of item content based on literature reviews, expert judgement, systematic observations of behaviour, critical incident technique etc.
		• Review the appropriateness of the items to the target population of participants and situations.
Internal structure	The relationship among indicators and latent variables conforms to the theoretical structure of the target construct.	• Translate the theoretical assumptions about the structure of the construct into a formal LST-R model.
		• Evaluate the tenability of those assumptions by assessing the goodness-of-fit of the model.
		• Test whether longitudinal measurement invariance can be assumed.
Convergent and discriminant relations	The latent trait and latent state residual variables of the measure exhibit theory-consistent associations with latent variables of other measures for the same construct (convergent evidence) or different constructs (discriminant evidence).	• Examine the convergent and discriminant relations at both the state level and trait level of the construct.
		• The latent state residual variables and latent trait variables should correlate with conceptually related constructs, but not with conceptually unrelated constructs.
Criterion relations	The latent trait and latent state residual variables of the measure are related to theoretically or practically relevant criteria.	• Investigate the relation of the state residual variables and trait variables to their respective outcomes.
Reliability (internal consistency)	The indicators of a measure assess the same construct.
Reliability	Proportion of variance in an observed variable that is due to the latent state variable.	• Reduce the influence of measurement error on scores by selecting indicators with high reliability.
Specificity	Proportion of variance in an observed variable that is due to the latent state residual variable.	• Increase the influence of situation effects and situation × person interactions on scores by selecting indicators with high specificity.
Consistency	Proportion of variance in an observed variable that is due to the latent trait variable.	• Prefer indicators with high specificity over indicators with high consistency.

Data considerations

We formally model personality states as latent state variables in LST-R models. To fit such models, the construct of interest should be measured with at least two (preferably three or more) items on three or more time points (Clark & Watson, 1995). The interval between measurement occasions should reflect how quickly the construct is expected to fluctuate. For personality states, which are expected to fluctuate from situation to situation rather than, for example, from day to day, experience sampling data with multiple measures per day may be most appropriate. Finally, data should come from the intended population of both persons and situations. For example, if personality states are elicited at the same hour each day, this may underestimate intrapersonal fluctuations because the sampled situations may not reflect the universe of situations the participants experience throughout the day.

Psychometric evaluation and item selection

Internal structure

The internal structure of a scale should correspond to the theoretical structure of the personality state construct it is meant to measure. To assess the degree of correspondence, it is necessary to translate the theoretical assumptions about the structure of the construct into a formal (LST-R) model. Within the LST-R framework, an important aspect to consider is whether the indicators are assumed to reflect multiple states and a single trait (translating into a multistate-singletrait model) or both multiple states and multiple traits (translating into a multistate-multitrait model; Steyer et al., 2015). One must also decide whether autoregressive effects should be included in the model, which may be necessary if the short time lag between measurement occasions or the fluctuating nature of a construct leads to carry-over effects. Note that the omission or inclusion of autoregressive effects also implies different assumptions about the rank-order stability of the trait levels of a construct over the course of the study. Whereas models without autoregressive effects assume a perfectly stable rank order of individuals regarding their trait levels, models with autoregressive effects allow for rank-order changes in trait levels (Geiser, 2021).

The degree of correspondence between the theoretical structure of the construct and the internal structure of the measure can be evaluated using typical goodness-of-fit indices for factor analytic models (Bader & Moshagen, 2022; Hu & Bentler, 1999; West et al., 2012). However, commonly recommended cut-off points may need to be adjusted for models of intensive longitudinal data with many degrees of freedom (see e.g. Norget & Mayer, 2022; Yuan et al., 2015). If multiple models are theoretically plausible, the most appropriate model can be selected using model comparison (e.g. Columbus, Norget, Mayer, & Balliet, in prep). Ultimately, however, an LST-R model should fit the data. Persistent model misfit may indicate that the internal structure of the measure does not fit the theoretical structure of the construct. This may occur when the observed variables do not cohere in the predicted manner. In this case, it may be necessary to reconsider the theoretical conception of the state construct, or to revise the items.

Personality state measures are often multidimensional (e.g. scales corresponding to the Big Five and HEXACO models of personality). When multidimensional measures are evaluated, it should also be assessed whether theoretical assumptions about the dimensionality of the measure (e.g. number of underlying factors and relation between the factors) are in line with its empirical structure. If correlated factors are assumed, specific attention should be paid to the magnitude of factor correlations to ensure that the correlations between factors are neither too low (indicating a lack of association between constructs) nor too high (indicating a lack of discriminability between constructs).

Longitudinal measurement invariance

When state measures are to be used to compare scores on the latent state and/or trait variables over time, it is necessary to establish the longitudinal measurement invariance of the manifest variables. Measurement invariance refers to the comparability of measurement across different time points or groups, which is important for interpreting changes in the levels of latent variables unambiguously (Widaman et al., 2010). Measurement invariance is typically tested by comparing sequences of nested models with increasingly strict equality constraints (Meredith, 1993). Configural invariance imposes an equal factor structure in terms of the number of factors and with regard to which items load on the factors. Metric invariance additionally constrains unstandardised factor loadings of corresponding items to equality across different time points, and scalar invariance additionally constrains the intercepts of corresponding items to equality across time, which is necessary to compare latent mean levels at different time points. Finally, strict invariance imposes additional equality constraints on the measurement error variables of the items, which implies a constant reliability of corresponding items and thus enables the comparison of manifest scale scores across time (Meredith, 1993).
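The nested constraints can be written compactly. The sketch below assumes a measurement equation with an item intercept $α_{it}$, a symbol that does not appear in Figure 2 because the mean structure is omitted there. Configural invariance adds no equality constraint and only requires the same pattern of loadings at each time point.

```latex
\begin{align}
Y_{it} &= \alpha_{it} + \lambda_{it}\,\tau_t + \varepsilon_{it} \\
\text{Metric:}\quad \lambda_{it} &= \lambda_i \quad \text{for all } t \\
\text{Scalar:}\quad \alpha_{it} &= \alpha_i \quad \text{for all } t \\
\text{Strict:}\quad \mathrm{Var}(\varepsilon_{it}) &= \mathrm{Var}(\varepsilon_i) \quad \text{for all } t
\end{align}
```

Each level retains the constraints of the previous one, so the models are nested and can be compared step by step.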

Specificity

Any psychometric measure should pick up on systematic variation in the proposed construct (and as little as possible on noise). Trait measures are therefore evaluated for reliability (i.e. the proportion of variance due to systematic individual differences) using indices of internal consistency such as Cronbach’s α or McDonald’s ω. Such indices assess whether a measure reliably captures stable interindividual differences; systematic intrapersonal fluctuations can create the impression that a measure is unreliable (Horstmann & Ziegler, 2020). Importantly, though, reliability and within-person variability are not opposed: Fluctuations can be measured reliably, and good personality state scales are designed to do so.⁷

Reliability alone is thus not sufficient to ensure that the measure is sensitive to systematic moment-to-moment fluctuations. However, with LST-R models of longitudinal data, it is possible to decompose reliability⁸ into specificity (the proportion of variance due to the latent state residuals, i.e. systematic situation-specific influences) and consistency (the proportion of variance due to the latent trait variables). High specificity is particularly desirable for personality state measures. A substantial amount of specificity ensures that the selected indicators are responsive to situation-specific influences and interactions between persons and situations, thereby capturing intraindividual variability in states across time points.

Low specificity is problematic when state measures are used to study the predictors, correlates, outcomes, or dynamics of personality states. If the reliability of indicators were solely attributable to consistency, the resulting measure could only provide insights into stable interindividual differences in trait levels. Using such a measure in a study of personality dynamics may give the impression that personality states have no meaningful antecedents or consequences. This could be the case even if the measure exhibits sizeable intrapersonal variability (e.g. as indicated by the intraclass correlation coefficient), if this variability is purely due to measurement error. Researchers interested in studying personality dynamics should therefore develop and select measures with sufficient specificity to answer their substantive research questions.
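A small numerical sketch illustrates why the intraclass correlation alone cannot reveal specificity. It assumes a simplified single-indicator model without autoregressive effects and with a unit loading, so that the observed variance is the sum of trait, state residual, and error variance. The two indicators and their variance components are hypothetical.

```python
def decompose(var_trait, var_state_residual, var_error):
    """Variance shares for one indicator in a simplified model
    (no autoregressive effects, unit loading)."""
    total = var_trait + var_state_residual + var_error
    return {
        # Share of variance between persons (intraclass correlation)
        "icc": var_trait / total,
        # Share of variance within persons (systematic or not)
        "within_person_share": (var_state_residual + var_error) / total,
        "specificity": var_state_residual / total,
        "reliability": (var_trait + var_state_residual) / total,
    }


# Hypothetical indicators with identical between-person variance
indicator_a = decompose(var_trait=0.4, var_state_residual=0.5, var_error=0.1)
indicator_b = decompose(var_trait=0.4, var_state_residual=0.0, var_error=0.6)

for label, result in (("A", indicator_a), ("B", indicator_b)):
    print(label, {key: round(value, 2) for key, value in result.items()})
```

Both indicators show the same share of within-person variance, but only indicator A captures systematic situation-specific influences. The within-person variance of indicator B consists entirely of measurement error.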

One might object that, if the specificity of items is maximised to such an extent that none of the reliable variance in indicators is due to consistency, the indicators could no longer be interpreted as measuring a common underlying trait. In practice, however, it is rather unlikely to obtain indicators that are unrelated to trait effects because psychometric evaluations of existing state measures suggest that indicators are typically affected by stable interindividual differences to a substantial extent (e.g. Rauthmann et al., 2019; Ringwald et al., 2022; Zimmermann et al., 2019), though there might be exceptions (e.g. a brief measure of relative power exhibited high specificity, but only negligible consistency; Columbus, Norget, et al., in prep). In contrast to overall reliability, it is thus not the case that higher specificity is always better, nor is there some optimal level of specificity. Instead, the specificity of state measures should be evaluated with respect to theoretical concerns (how much is the construct expected to vary) and in comparison to related measures.

Further validation

Convergent and discriminant relations

In addition to evidence based on the internal structure, the relation of the target construct to other conceptually related constructs is a further important source of validity evidence. To validate a newly developed state measure, one should therefore concurrently collect data using other measures of the same or very similar constructs as well as of distinct constructs. These constructs can then be included as additional latent variables into the LST-R model to assess their associations with the latent trait and latent state residual variables. Whereas convergent evidence may be obtained from substantial correlations between the latent variables of the newly developed state measure and of alternative measures for the same or very similar constructs, discriminant evidence may be obtained based on weak correlations to the latent variables of measures intended to assess different constructs (AERA, APA, & NCME, 2014).

Importantly, validation should include both the state and trait level of a construct (Wright & Zimmermann, 2019). Correlations at the aggregate level do not necessarily mean that individual states of the constructs co-vary. Therefore, it is necessary to (also) probe convergent and discriminant validity at the level of particular states. That is, there should be theory-consistent associations between latent state residual variables and conceptually related time-varying variables (e.g. a positive deviation of the state Extraversion level from the respective trait level should be positively related to the level of state sociality), but also theory-consistent associations between latent trait variables and related trait measures (e.g. trait levels of Extraversion should be positively related to trait levels of sociality).
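The difference between the two levels can be illustrated with made-up numbers. The sketch below uses manifest person means and person-centred deviations as crude stand-ins for latent trait and latent state residual variables. This is a simplification of the LST-R approach, and the scores for three hypothetical persons at four occasions are invented so that the two levels diverge.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Hypothetical manifest scores: 3 persons x 4 occasions
extraversion = {"p1": [0, 2, 0, 2], "p2": [1, 3, 1, 3], "p3": [2, 4, 2, 4]}
sociality = {"p1": [0, 0, 2, 2], "p2": [1, 1, 3, 3], "p3": [2, 2, 4, 4]}

persons = list(extraversion)
mean_e = {p: sum(extraversion[p]) / len(extraversion[p]) for p in persons}
mean_s = {p: sum(sociality[p]) / len(sociality[p]) for p in persons}

# Trait level: correlation of person means across persons
between_r = pearson([mean_e[p] for p in persons], [mean_s[p] for p in persons])

# State level: correlation of person-centred deviations, pooled over persons
dev_e = [score - mean_e[p] for p in persons for score in extraversion[p]]
dev_s = [score - mean_s[p] for p in persons for score in sociality[p]]
within_r = pearson(dev_e, dev_s)

print(f"between-person r = {between_r:.2f}, within-person r = {within_r:.2f}")
```

In these invented data, persons with higher average extraversion also have higher average sociality, yet momentary deviations in the two variables are unrelated. Evidence at the aggregate level would therefore say nothing about covariation of states.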

Criterion relations

A related source of validity evidence is the relation of the latent variables of the state measure to theoretically meaningful or practically relevant criteria (AERA, APA, & NCME, 2014). For personality states, such criteria may include self-reported behaviour, but also observational data (e.g. smartphone data; Stachl et al., 2021) or physiological measures. Again, the relation to relevant criteria should be demonstrated both with regard to the state and trait level of the measure. For instance, if, for a state measure of Extraversion, the latent state residual variables were shown to correlate significantly with objective measures of talkativeness (e.g. recorded via wearable cameras or electronically activated recorders; Brown et al., 2017) and the latent trait variables were shown to be positively related to the number of (Facebook) friends (Lönnqvist & Itkonen, 2014), this may be interpreted as evidence for the validity of the measure.

Developing personality state measures

The proposed criteria for evaluating personality state measures can also be applied in developing new instruments designed to assess personality states. Most personality state measures are generated ad hoc from existing trait measures (Horstmann & Ziegler, 2020). One reason for the lack of instruments developed explicitly for the assessment of personality states may be the relative scarcity of guidelines. The development of native state measures benefits from general good practice in scale development (DeVellis & Thorpe, 2021). However, there are several aspects of state measures which call for a distinct approach. Below, we highlight how the formal definition of personality states in the LST-R framework and criteria for model evaluation can inform scale development.

Initial theoretical considerations

The development of any psychological measurement instrument ideally begins with a clear account of the target construct and the proposed use of the measure. When developing (personality) state measures, the following questions should be addressed: (1) What is the construct being measured? (2) What is the intended purpose of the measure? and (3) What is the targeted population of persons and situations? (Horstmann & Ziegler, 2020).

Answering the first question requires providing a theoretical account of the structure and content of the proposed construct. The LST-R framework provides an explicit statement of the expected structure of the construct. Different models express different assumptions about the structure of a construct. In particular, one must specify whether the construct corresponds to one or multiple traits (e.g. in the case of personality dimensions with multiple facets). It is also important to consider at which time intervals the construct is assumed to fluctuate. Finally, one must define the content of the construct. When developing state scales corresponding to existing models of (trait) personality, one can draw on conceptual (e.g. Ashton et al., 2014) and empirical (e.g. Zettler et al., 2020) analyses to identify the content of relevant states. Scale developers may also decide to constrain the domain they are studying on theoretical grounds. For example, analyses of trait scales have shown that they capture a mixture of affect, behaviour, cognition, and desire (Wilt & Revelle, 2015; Zillig et al., 2002). However, in developing the HEXACO Personality States Inventory, Columbus, Böhm, et al. (in prep) explicitly excluded behaviours, arguing that concrete behaviours are better understood as consequences of personality states. Such conceptual concerns can inform item generation and scale validation.

Concerning the second question, two main purposes of state measures can be distinguished: (i) using time-specific state scores as an indicator of the state level of a person at a particular time point and (ii) using aggregated state scores as an indicator of the trait level. Depending on the purpose, desirable characteristics of a state measure may differ. For instance, whereas the use of time-specific state scores requires indicators that are responsive to situation effects and person × situation interactions (i.e. high specificity), the use of aggregated state scores benefits from indicators that are strongly influenced by trait effects (i.e. high consistency). Our focus will be on the first purpose, as the assessment of time-specific states is arguably the more common and natural use of state measures (for more information on developing state measures for the purpose of assessing traits, see Horstmann & Ziegler, 2020). Finally, the population for which the state measure is intended refers to both the target population of participants and the target population of situations. This should be specified because it informs the generation of items and the appropriateness of these items for the intended sample of participants (e.g. adolescents, older adults) and situations (e.g. interpersonal situations and situations at work). When items are irrelevant to the sampled situations, this may distort the measurement of personality states (Kritzler et al., 2023).

Item generation

A crucial next step in the development of state measures is the generation of the initial item pool. Guided by the initial theoretical considerations and empirical findings on the nomological net of a target construct, a comprehensive set of all content that might be relevant to the construct should be devised (Clark & Watson, 2019). Here, it is recommended to be overinclusive: subsequent psychometric analyses can detect and exclude unrelated content, but they cannot identify content that is missing (Clark & Watson, 1995; Loevinger, 1957). The set of item content then serves as a basis for formulating concrete items. A particular concern when developing state items is the trade-off between breadth and specificity (Horstmann & Ziegler, 2020). On the one hand, items should be broad enough to be applicable to most situations within the sampling frame (Kritzler et al., 2023) and to cover the content domain of the construct. On the other hand, items should refer to specific (momentary) affect, behaviour, cognition, or desire, lest they fail to capture occasion-specific manifestations of the personality dimension and lose specificity. Additionally, general recommendations for optimal item wording (e.g. using simple syntax), choosing an adequate rating scale, and developing appropriate instructions should be taken into account (Horstmann & Ziegler, 2020).

Item selection

The LST-R framework contributes most to state scale development during item selection. Once an initial item pool has been generated, the number of items must be reduced to the final scale. For this purpose, data should be collected that fit the criteria outlined in the initial considerations (e.g. the right population of participants and situations and an appropriate lag between measurement points). Selecting items for a state scale requires longitudinal data to which an LST-R model can be fitted (for a tutorial, see Norget et al., 2023).

While it is common to select individual items based on item-level criteria (e.g. factor loadings), a more principled approach selects item sets which meet the various desiderata of the scale. Using algorithmic methods, it is possible to reduce an initial item pool into a shorter scale by selecting the combination of items which performs best against a set of pre-specified criteria (for a review, see Olaru et al., 2019). This makes it feasible to select item sets using the criteria of model fit, measurement invariance, specificity, and/or convergent and discriminant validity outlined above. For example, Columbus, Böhm, et al. (in prep) used ant colony optimisation to develop a personality state measure corresponding to the HEXACO model of personality. This algorithm fits subsets of the overall item set to a multidimensional LST-R model to select four items per HEXACO dimension (one per facet) while maximising model fit and specificity and minimising correlations between traits across dimensions. Thus, algorithmic item selection methods can be used to create measures which fit the criteria for good personality state measures defined within the LST-R framework.
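The logic of selecting item sets rather than single items can be sketched in a few lines. The example below is not ant colony optimisation and does not fit any LST-R model. It is a plain exhaustive search over a tiny hypothetical item pool, picking one item per facet so that mean specificity is maximised while every item meets a minimum reliability. Item labels, facet names, coefficient values, and the threshold are all invented for illustration.

```python
from itertools import product

# Hypothetical item pool: (item label, facet, specificity, reliability)
item_pool = [
    ("e1", "sociability", 0.35, 0.70),
    ("e2", "sociability", 0.45, 0.55),
    ("e3", "liveliness", 0.30, 0.75),
    ("e4", "liveliness", 0.50, 0.65),
    ("e5", "boldness", 0.40, 0.60),
    ("e6", "boldness", 0.20, 0.80),
]
MIN_RELIABILITY = 0.60  # hypothetical threshold


def select_items(pool, min_reliability):
    """Return the one-item-per-facet set with the highest mean specificity
    among sets whose items all reach the minimum reliability."""
    by_facet = {}
    for item in pool:
        by_facet.setdefault(item[1], []).append(item)
    best_set, best_score = None, float("-inf")
    # Enumerate every combination containing exactly one item per facet
    for candidate in product(*by_facet.values()):
        if any(item[3] < min_reliability for item in candidate):
            continue
        score = sum(item[2] for item in candidate) / len(candidate)
        if score > best_score:
            best_set, best_score = candidate, score
    return best_set, best_score


selected, mean_specificity = select_items(item_pool, MIN_RELIABILITY)
print([item[0] for item in selected], round(mean_specificity, 3))
```

In a real application, the coefficients of an item depend on the model fitted to the whole item set, so each candidate set would have to be evaluated by refitting the LST-R model. Exhaustive search quickly becomes infeasible with larger pools, which is one reason heuristic algorithms such as ant colony optimisation are used instead.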

The proposed criteria for the evaluation of personality state measures can be adapted or complemented depending on the construct of interest or the intended purpose of the measure. For instance, for some state measures it may be desirable to reduce the impact of autoregressive effects on resulting scores by selecting indicators that are characterised by high specificity and high predictability by trait 1. Conversely, it may also be desirable to develop state measures that are particularly sensitive to the effect of current and previous latent state residual variables by selecting indicators with high specificity and high unpredictability by trait 1. Customising the set of desirable criteria is facilitated by a clear account of the target construct and the proposed use of the measure.

Either before or after the first selection is made from the initial item pool, items should be rated for construct validity by a set of domain experts. Important questions to ask include whether raters can correctly associate items with the relevant construct (especially when the scale captures multiple dimensions) and whether they consider the item appropriate for the theoretical construct. It can also be valuable to use techniques such as cognitive interviewing to probe whether participants interpret items as intended (Peterson et al., 2017; Ryan et al., 2012). Insights from these qualitative procedures can be used to improve or discard items which may be misinterpreted or fail to capture the construct of interest.

The initially collected evidence for the reliability and validity of personality state measures should not be regarded as the end of the scale construction process. The development of psychological measurement instruments ideally proceeds in an iterative fashion, whereby the psychometric properties of a measure should be replicated and potentially optimised across multiple independent samples. Often, LST-R analyses and expert ratings will reveal gaps in the initial item set. For example, it is possible that some facets of the construct of interest are not captured by any of the original items. In this case, items should be revised or replaced. Moreover, to avoid overfitting, the chosen LST-R model should be fitted to a separate, confirmatory sample once a final candidate item set has been identified.

Empirical example

To illustrate the proposed approach for developing and evaluating personality state measures within an LST-R framework, we use daily diary assessments of Big Five personality states from Ringwald et al. (2022). The data stem from three independent samples of undergraduate students (Sample 1: N = 330; 62% female; mean age = 18.6, SD = 0.96 years), community members (Sample 2: N = 342; 52% female; mean age = 27.6, SD = 4.9 years), and participants of the University of Pittsburgh Adult Health and Behavior project (Sample 3: N = 458; 54% female; mean age = 59.5, SD = 7.2 years). For the present analyses, we pooled the data of all three samples. Upon exclusion of 176 individuals who did not provide any data on the relevant variables at the time points of interest, the final sample comprised N = 954 participants. In all samples, Big Five personality states were measured daily using four items for each of the five traits (for more details on the procedure, see Ringwald et al., 2022). Participants were instructed to indicate which of two (opposing) adjectives described them best in the past 24 hours on a 7-point Likert scale. We investigate the personality state items for extraversion, namely Item 1 (lethargic/energetic), Item 2 (bold/timid; reverse scored), Item 3 (talkative/silent; reverse scored), and Item 4 (unassertive/assertive), and only consider data of the first three time points for the sake of simplicity. The focus of this empirical example is on the psychometric evaluation of and item selection for a state measure of extraversion. No hypotheses were preregistered and all analyses were performed in an exploratory fashion. The data and commented R code for the following analyses are publicly available on the Open Science Framework (OSF; https://osf.io/szfu7/).

To evaluate the psychometric properties of this extraversion state measure, we first translated the theoretical assumptions about the structure of the measure into an LST-R model. Specifically, we assumed that the four indicators measure a distinct extraversion state at each of the three time points as well as a common extraversion trait across all time points. This corresponds to a multistate-singletrait model as presented in Figure 1 (but with four manifest variables for each latent state variable).
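The logic of the multistate-singletrait model can be illustrated with a small simulation. The sketch below is not the analysis reported here (which relies on lavaan and lsttheory in R). It is a minimal Python illustration under simplifying assumptions: two parallel items per occasion with unit loadings, normally distributed components, and hypothetical standard deviations chosen only for demonstration. It uses simple moment-based estimates rather than maximum likelihood, and shows why several indicators per occasion are needed: covariances between items within an occasion reflect trait plus state residual variance, whereas covariances across occasions reflect trait variance only.

```python
import random


def simulate_msst(n_persons=20000, n_occasions=3, n_items=2,
                  trait_sd=0.8, residual_sd=0.6, error_sd=1.0, seed=1):
    """Simulate item scores as trait + state residual + measurement error.

    All standard deviations are hypothetical values for illustration.
    Returns data[person][occasion][item].
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n_persons):
        trait = rng.gauss(0.0, trait_sd)                 # stable across occasions
        person = []
        for _ in range(n_occasions):
            state = trait + rng.gauss(0.0, residual_sd)  # latent state at this occasion
            person.append([state + rng.gauss(0.0, error_sd) for _ in range(n_items)])
        data.append(person)
    return data


def cov(x, y):
    """Sample covariance of two equally long lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)


def decompose(data):
    """Moment-based estimates of reliability, consistency, and specificity."""
    y11 = [p[0][0] for p in data]  # item 1, occasion 1
    y21 = [p[0][1] for p in data]  # item 2, occasion 1
    y12 = [p[1][0] for p in data]  # item 1, occasion 2
    total_var = cov(y11, y11)
    state_var = cov(y11, y21)      # same occasion: trait + state residual variance
    trait_var = cov(y11, y12)      # different occasions: trait variance only
    consistency = trait_var / total_var
    specificity = (state_var - trait_var) / total_var
    return {"reliability": consistency + specificity,
            "consistency": consistency,
            "specificity": specificity}


estimates = decompose(simulate_msst())
print(estimates)
```

With the assumed values, the population coefficients are consistency = .32, specificity = .18, and reliability = .50, so the simulated estimates should fall close to these.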

As the measurement occasions were closely spaced in time with a lag of one day between adjacent time points, we also investigated the possibility of carry-over effects by additionally estimating a multistate-singletrait model with autoregressive effects as shown in Figure 2 (again with four manifest variables for each latent state variable). To identify the latter model, we imposed equality constraints on the autoregressive effects, the variances of the occasion residual factors, and the variances of the latent state residuals (Prenoveau, 2016). The models were estimated using the R package lavaan (Rosseel, 2012) as well as the R package lsttheory (Mayer, 2015), which facilitates the specification of LST-R models (see also Norget et al., 2023). Missing values and non-normally distributed data were addressed using full information maximum likelihood estimation with robust (Huber-White) standard errors and a test statistic that is asymptotically equal to the test statistic by Yuan and Bentler (2000).

The multistate-singletrait model without autoregressive effects fitted the data poorly, χ² (51) = 358.79, p < .001, RMSEA = .103, RMSEA 90% CI = [.093, .113], SRMR = .061, CFI = .813. We therefore allowed for correlations between the measurement error variables of the same items at different time points to account for stable indicator-specific variance. This modification significantly improved the model fit, ∆χ²(12) = 197.82, p < .001, and led to an acceptable fit of the modified model according to descriptive fit indices, χ² (39) = 153.75, p < .001, RMSEA = .072, RMSEA 90% CI = [.060, .084], SRMR = .044, CFI = .930.

Next, the longitudinal measurement invariance of the measure was assessed to test whether the manifest variables measure the latent state variables equivalently across time. Given the acceptable fit for the multistate-singletrait model assuming one latent state variable for each time point, configural invariance was assumed. Constraining the unstandardized factor loadings of the manifest variables to equality across time points did not lead to a significant deterioration of model fit, ∆χ²(6) = 10.90, p = .091, ∆CFI_{configural−metric} = .003, thus supporting metric measurement invariance. When constraining the intercepts of the manifest variables to equality across time points, the model fit deteriorated significantly, ∆χ²(6) = 36.13, p < .001, and the difference in the comparative fit index, ∆CFI_{metric−scalar} = .011, exceeded benchmarks typically considered indicative of a lack of measurement invariance (Chen, 2007; Cheung & Rensvold, 2002). Thus, scalar invariance cannot be assumed for the state measure and differences in the mean levels of the latent variables should be interpreted with caution. Given the lack of scalar invariance, strict invariance was not assessed.
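The two invariance decisions above can be retraced from the reported statistics. The following Python sketch is an illustration only, not the authors' R code. It assumes that the reported ∆χ² values can be referred to a χ² distribution with the stated degrees of freedom, uses a conventional α of .05, and adopts a ∆CFI cut-off of .01, which is one commonly used benchmark and is set here as an assumption. The function names are hypothetical.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variate (closed form for even df)."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive, even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k          # builds (x/2)^k / k!
        total += term
    return math.exp(-half) * total


def invariance_check(delta_chi2, delta_df, delta_cfi, alpha=0.05, cfi_cutoff=0.01):
    """Combine the difference test and the change in CFI (cut-offs are assumptions)."""
    p = chi2_sf_even_df(delta_chi2, delta_df)
    return {"p": p,
            "chi2_supports_invariance": p >= alpha,
            "cfi_supports_invariance": delta_cfi <= cfi_cutoff}


metric = invariance_check(10.90, 6, 0.003)   # loadings constrained across time
scalar = invariance_check(36.13, 6, 0.011)   # intercepts additionally constrained
print(metric)
print(scalar)
```

Under these assumptions, both criteria agree for each step: metric invariance is retained, whereas scalar invariance is rejected.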

When adding autoregressive effects to the multistate-singletrait model, the model with autoregressive effects (BIC = 33,025, AIC = 32,816) showed minor improvements in the BIC but not in the AIC compared to the model without autoregressive effects (BIC = 33,035, AIC = 32,816). Moreover, the autoregressive effects did not differ significantly from zero ( $λ_{S_{2}}$ = $λ_{S_{3}}$ = −0.05, p = .659). This suggests that no carry-over effects occurred between adjacent time points and also that there is no evidence for rank-order changes in the levels on the trait variables throughout the time period of three days.
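A BIC that favours one model while the AIC is tied can arise when the models differ in their number of free parameters. As a rough plausibility check, the sketch below back-calculates the implied number of free parameters from the reported criteria. It assumes the conventional definitions AIC = −2 log L + 2k and BIC = −2 log L + k ln(N) with N = 954, and it works with the rounded values reported above, so the results are approximate. The function name is hypothetical.

```python
import math


def implied_parameters(aic, bic, n):
    """Solve BIC - AIC = k * (ln(n) - 2) for k, the number of free parameters."""
    return (bic - aic) / (math.log(n) - 2.0)


n = 954
k_autoregressive = implied_parameters(aic=32816, bic=33025, n=n)
k_no_autoregressive = implied_parameters(aic=32816, bic=33035, n=n)
print(round(k_autoregressive), round(k_no_autoregressive))
```

Under these assumptions, the model with autoregressive effects would have about two fewer free parameters than the model without, which is in line with the equality constraints imposed for identification. Because the BIC penalises each parameter by ln(954) ≈ 6.86 rather than by 2, it can prefer the more constrained model even when the AIC does not discriminate between the two.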

Therefore, the multistate-singletrait model with correlated measurement error variables and metric measurement invariance was the final model used for the psychometric evaluation of the measure. This model fitted the data satisfactorily according to descriptive fit indices, χ² (45) = 166.09, p < .001, RMSEA = .068, RMSEA 90% CI = [.057, .079], SRMR = .048, CFI = .927. All manifest variables loaded significantly on the corresponding latent state variable, with the standardised factor loadings ranging from .41 to .76 (see Table 3). The latent state variables, in turn, were also significantly affected by the common latent trait variable (i.e. the latent trait variable at the first time point), with standardised estimates of $λ_{T_{1}}$ = .77, $λ_{T_{2}}$ = .82, and $λ_{T_{3}}$ = .80.

Table 3.

Descriptive statistics, standardised factor loadings, and variance decomposition coefficients for the indicators of the extraversion state measure.

Indicator	M	SD	Loading	Reliability	Specificity	Consistency
Y_it	M	SD	λ_it	Rel(Y_it)	Spe(Y_it)	Con(Y_it)
Y₁₁	4.08	1.34	.53	.29	.12	.17
Y₂₁	4.03	1.38	.75	.56	.23	.33
Y₃₁	4.41	1.56	.57	.33	.13	.20
Y₄₁	4.23	1.27	.54	.30	.12	.17
Y₁₂	3.91	1.54	.48	.23	.08	.15
Y₂₂	4.14	1.36	.76	.58	.19	.39
Y₃₂	4.27	1.54	.56	.32	.11	.21
Y₄₂	4.11	1.16	.60	.36	.12	.24
Y₁₃	3.94	1.43	.41	.17	.06	.11
Y₂₃	4.16	1.31	.65	.43	.15	.27
Y₃₃	4.33	1.41	.50	.25	.09	.16
Y₄₃	4.22	1.09	.53	.28	.10	.18

Note. Parameters were obtained from a multistate-singletrait model with metric measurement invariance and correlations between the measurement error variables of the same indicators at different time points. Standardised loadings may differ between time points because metric invariance imposes equality constraints on the unstandardised factor loadings. The reliability of an item corresponds to the square of its standardised factor loading. The specificity and consistency coefficients do not always sum up exactly to the reliability coefficient due to rounding.
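The coefficients in Table 3 can be approximately recovered from the standardised loadings. The Python sketch below is an illustration, not the code on the OSF. The reliability as the squared standardised item loading follows the Note above. Splitting the reliability by the squared standardised trait loading of the corresponding state is an assumption of this sketch, based on the structure of the multistate-singletrait model. Because the inputs are rounded to two decimals, results can differ from the table by about .01.

```python
# Standardised loadings of the latent states on the latent trait, per occasion.
TRAIT_LOADINGS = {1: 0.77, 2: 0.82, 3: 0.80}

# Standardised item loadings from Table 3, keyed by (item, occasion).
ITEM_LOADINGS = {
    (1, 1): 0.53, (2, 1): 0.75, (3, 1): 0.57, (4, 1): 0.54,
    (1, 2): 0.48, (2, 2): 0.76, (3, 2): 0.56, (4, 2): 0.60,
    (1, 3): 0.41, (2, 3): 0.65, (3, 3): 0.50, (4, 3): 0.53,
}


def decompose(item_loading, trait_loading):
    """Return (reliability, specificity, consistency) for one indicator."""
    reliability = item_loading ** 2                 # share of variance due to the latent state
    consistency = reliability * trait_loading ** 2  # part of that share due to the trait
    specificity = reliability - consistency         # remainder: situation-specific part
    return reliability, specificity, consistency


coefficients = {
    (item, occasion): decompose(loading, TRAIT_LOADINGS[occasion])
    for (item, occasion), loading in ITEM_LOADINGS.items()
}

for (item, occasion), (rel, spe, con) in coefficients.items():
    print(f"Y{item}{occasion}: Rel = {rel:.2f}, Spe = {spe:.2f}, Con = {con:.2f}")
```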

As can be seen in Table 3, the manifest variables exhibited only a small to moderate degree of reliability. Whereas Item 2, on average, exhibited the highest reliability, Item 1 exhibited the lowest reliability at all time points. For all indicators, a larger proportion of reliable variance was due to stable trait differences than due to situation effects and interactions between the person and the situation, as is evident from the higher values for the consistency coefficients compared to the specificity coefficients. For the total extraversion scale, the reliability was .69, .68, and .60, and the specificity was .28, .23, and .21 at the first, second, and third time point, respectively (see the R code on the OSF for how to calculate the reliability and specificity of the total scale). This means that, on average, only 24% of the variance in the extraversion state scores could be attributed to reliable situation-specific influences. These reliability estimates are lower than those obtained via multilevel modelling, where the within-person reliability for the extraversion scale was .52 and the between-person reliability was .83 (Ringwald et al., 2022).
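The scale-level figures can be summarised with simple arithmetic. The Python sketch below takes the reported scale reliabilities and specificities as inputs. It obtains consistency as reliability minus specificity, which follows from the additive decomposition described in the Note to Table 3, and additionally computes the share of reliable variance that is situation-specific. That last quantity is not reported in the text and is included only for illustration.

```python
# Reported scale-level coefficients per occasion: (reliability, specificity).
SCALE = {1: (0.69, 0.28), 2: (0.68, 0.23), 3: (0.60, 0.21)}

# Average specificity across the three occasions.
mean_specificity = sum(spe for _, spe in SCALE.values()) / len(SCALE)

summary = {}
for occasion, (rel, spe) in SCALE.items():
    summary[occasion] = {
        "consistency": rel - spe,      # reliable variance due to the trait
        "specific_share": spe / rel,   # share of reliable variance that is situation-specific
    }

print(f"Mean specificity: {mean_specificity:.2f}")
for occasion, values in summary.items():
    print(occasion, {name: round(value, 2) for name, value in values.items()})
```

The mean specificity of .24 corresponds to the 24% stated above. In this sketch, roughly one third to two fifths of the reliable variance at each occasion is situation-specific.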

Taken together, the results support the internal structure of the extraversion state measure and suggest that autoregressive effects between adjacent time points are not required. Whereas metric measurement invariance can be assumed for the measure, scalar invariance was not supported. Therefore, additional work should identify which indicators exhibit different values for their intercept across time and ideally replace those indicators with alternative items conforming to scalar invariance. Furthermore, some indicators showed only a small degree of reliability and very small specificity. In particular, Item 1 exhibited the lowest specificity at all three time points and should be replaced by an indicator that can measure situation-specific effects in extraversion more reliably. In addition to identifying more suitable indicators for the extraversion state measure, further steps in evaluating the measure are to investigate the convergent and discriminant relations of state and trait scores to other conceptually relevant constructs and, potentially, criterion variables.

Discussion

Personality states refer to the affect, behaviour, cognition, and desires of a person in a particular situation. We formally define personality states within LST-R theory, which we translate into testable latent variable models. To examine the predictors, correlates, outcomes, or dynamics of personality states, researchers must rely on valid and reliable measures. Such measures must capture the intrapersonal fluctuations arising from systematic situation-specific influences. On the basis of the LST-R framework, we propose a series of criteria for evaluating personality state measures. In particular, we highlight specificity – the proportion of variance in observed scores due to systematic situation-specific influences – as a key desideratum of personality state measures. These definitions, and the resulting criteria for state measures, have important implications for personality theory and for the assessment of personality states.
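For reference, the coefficients discussed here can be written compactly. The block below is a sketch in notation commonly used for LST-R theory, and the symbols are an assumption in so far as they may differ from the notation used elsewhere in this article: Y_it is the observed score on indicator i at time t, τ_it the latent state, ξ_it the latent trait, ζ_it the latent state residual, and ε_it the measurement error.

```latex
\begin{align}
Y_{it} &= \tau_{it} + \varepsilon_{it}, &
\tau_{it} &= \xi_{it} + \zeta_{it}, \\
\mathrm{Rel}(Y_{it}) &= \frac{\mathrm{Var}(\tau_{it})}{\mathrm{Var}(Y_{it})}, &
\mathrm{Con}(Y_{it}) &= \frac{\mathrm{Var}(\xi_{it})}{\mathrm{Var}(Y_{it})}, \\
\mathrm{Spe}(Y_{it}) &= \frac{\mathrm{Var}(\zeta_{it})}{\mathrm{Var}(Y_{it})}, &
\mathrm{Rel}(Y_{it}) &= \mathrm{Con}(Y_{it}) + \mathrm{Spe}(Y_{it}).
\end{align}
```

The additive decomposition in the last line holds because the latent state residual is uncorrelated with the latent trait.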

Implications for personality theory

LST-R theory provides a formal definition of key concepts such as states, traits, and state residuals, which map onto commonly used definitions of traits and states in contemporary personality theory (e.g. Baumert et al., 2017). Different models of LST-R theory specify the relationships between traits, states, and state residuals and can be used to inform theoretical accounts such as WTT (Fleeson, 2001; Fleeson & Jayawickreme, 2015; Jayawickreme et al., 2019). Of particular interest, recent models of LST-R theory allow for trait change as a consequence of situational experiences (Eid et al., 2017; Stadtbaeumer et al., 2022), which captures a proposed mechanism of trait change (Wrzus & Roberts, 2017). Defining and modelling personality states within this framework takes a step towards the integration of personality structure, personality processes, and personality development (Baumert et al., 2017).

Comparison to whole trait theory

WTT is a leading contemporary account of the relationship between personality traits and personality states (Fleeson, 2001; Fleeson & Jayawickreme, 2015; Jayawickreme et al., 2019). According to WTT, personality traits are made up of two distinct but linked parts. Descriptively, traits are density distributions of states, such that individual differences can be described in terms of the parameters of the distribution of states. Explanatorily, traits consist of social-cognitive mechanisms which generate states from internal and external cues. People differ in these information processing mechanisms, such that the same inputs can produce different states. Thus, individual differences in social-cognitive mechanisms, but also differences in the inputs experienced can explain differences in the distribution of states.

The latent states defined by LST-R theory map onto the state construct in WTT. They capture the characteristics of a person at a particular time (Baumert et al., 2017). In particular, by defining states as latent variables, the coherent affects, behaviours, cognitions, or desires which make up the personality state are disentangled from idiosyncratic influences and measurement error. In LST-R theory, these latent states reflect the influences of situation-independent characteristics of the person at this point in time – the latent trait – as well as situation-specific influences – the latent state residuals. The latent state residuals also capture the effects of person × situation interactions.

The latent trait in LST-R theory is an expectation across all possible situations in which the person might be. This maps broadly onto the location of the distribution of states, which has been used to operationalise traits in WTT (Fleeson, 2001). However, whereas WTT defines traits as the distribution of states, in LST-R theory, latent states reflect the influences of characteristics of the person (traits) and situation-specific influences (latent state residuals). Moreover, LST-R makes this specific to the time point, whereas WTT implicitly assumes a static trait, though recent extensions of the theory do allow for the possibility of trait change (Jayawickreme et al., 2019). Thus, the latent trait in LST-R theory is better understood to capture individual differences in social-cognitive mechanisms which produce individual differences in the distribution of states.

Conversely, latent state residuals capture situation-specific influences and person × situation interactions. This maps onto the role of cues in WTT. On the explanatory side, WTT posits that intrapersonal variation in states arises from variation in cues. Both LST-R theory and WTT further assume that individuals may differ in their response to cues, which can give rise to individual differences in the distribution of states. On the whole, LST-R theory is thus consistent with the core tenets of WTT as well as personality theories which adopt the descriptive side of WTT, such as CB5T (DeYoung, 2015). At the same time, formalising the definition of personality in LST-R theory allows for cumulative theory development, for example, by incorporating trait change in descriptive models of personality states.

Linking personality states and personality development

One challenge for the integration of personality processes and personality development is the question of how state changes may accumulate into trait change (Baumert et al., 2017; Nesselroade & Molenaar, 2010). According to the TESSERA model, personality states triggered by situational factors can be transferred into long-term personality development through reflective and associative processes (Wrzus & Roberts, 2017). Thus, repeatedly experiencing states which differ from one’s previous trait level can elicit personality change. Importantly, though, empirical evidence for such accumulation of states into trait changes is lacking (Baumert et al., 2017; Hofmann et al., 2009; Wrzus et al., 2021).

Recently developed LST-R models with autoregressive effects formalise the accumulation of state residuals (i.e. states not explained by traits) into traits: state residuals (i.e. situation-specific influences) lead to trait change, which in turn affects future personality traits and states (Eid et al., 2017; Stadtbaeumer et al., 2022). LST-R models with autoregressive effects are thus consistent with the TESSERA framework in that situational factors lead to trait changes through the states a person experiences. One advantage of LST-R models with autoregressive effects is their potential to account for experience-dependent trait change in a single model. Moreover, it is possible to include hypothesised time-varying predictors (e.g. situation factors which may trigger atypical personality states). Future research within the TESSERA framework may thus apply LST-R models to test whether personality states accumulate into trait change.

Implications for the development, evaluation, and interpretation of personality measures

In line with recent work (e.g. Horstmann & Ziegler, 2020), we hold that state constructs should be measured using ‘native’ state measures. In current practice, state constructs are often assessed using adapted trait measures (Horstmann & Ziegler, 2020). This may lead to an underestimation of intrapersonal variability and the effect of situation-specific influences if the items are designed to capture consistent individual differences rather than situation-specific states. Therefore, state measures should be designed to reliably capture intrapersonal fluctuations, that is, to have high specificity. The use of LST-R models in the development and evaluation of state measures allows researchers to quantify how well a scale can be expected to capture both stability and change in personality states.

A second problematic practice is the use of intraclass correlation coefficients to quantify the extent of intrapersonal variation (Horstmann & Ziegler, 2020). ICCs confound variation due to coherent situation-specific influences with measurement error. If researchers define personality states as coherent, ICCs thus inflate the degree of intrapersonal variation in personality states. Latent states as defined in LST-R theory correspond to this commonly adopted definition of personality states as the level of coherent affect, behaviour, cognitions, and desires at a particular time. Thus, specificity – the proportion of variance in the latent states that is due to situation-specific influences – is a better indicator of intrapersonal variability than intraclass correlation coefficients.
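The difference between the two indices can be made explicit. The sketch below assumes that the ICC is computed as the proportion of observed variance that lies between persons and that this between-person variance reflects a trait that is stable over the assessment period. The notation follows common LST-R usage and is an assumption of this sketch.

```latex
\begin{align}
\mathrm{ICC} &\approx \frac{\mathrm{Var}(\xi)}{\mathrm{Var}(Y)} = \mathrm{Con}(Y), \\
1 - \mathrm{ICC} &\approx \frac{\mathrm{Var}(\zeta) + \mathrm{Var}(\varepsilon)}{\mathrm{Var}(Y)}
  = \mathrm{Spe}(Y) + \bigl(1 - \mathrm{Rel}(Y)\bigr).
\end{align}
```

Under these assumptions, the within-person share implied by the ICC exceeds the specificity by exactly the proportion of error variance. For an indicator with a reliability of .56 and a specificity of .23 (the values reported for Y₂₁ in Table 3), it would amount to .23 + .44 = .67 rather than .23.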

Intensive longitudinal data often exhibit autoregressive effects. However, these are typically not modelled explicitly, especially in mixed-effects models. In LST-R models, autoregressive effects have a particular substantive interpretation. Specifically, autoregressive effects reflect experience-dependent trait change (Stadtbaeumer et al., 2022). Thus, including autoregressive effects in LST-R models enables researchers to examine the degree to which experiencing particular personality states accumulates into trait change, which is, in turn, reflected in future personality states. This provides an opportunity to integrate research on personality processes with personality development (Baumert et al., 2017; Wrzus & Roberts, 2017).

Applications to other state constructs

In this manuscript, we focus on introducing a framework for developing, evaluating, and interpreting personality state measures. To do so, we draw on personality theory to conceptualise personality states. However, there are other psychological state constructs such as affect (Kuppens, 2015), perceived situation characteristics (Rauthmann et al., 2015), and modes (Lazarus & Rafaeli, 2023). LST models have been applied to affect (e.g. Olatunji et al., 2020; Yasuda et al., 2004). However, given the vast and diverse literature on the nature of affective states (e.g. Barrett, 2017; Moors et al., 2013; Scherer, 2009), whether the framework we develop here applies to these constructs depends on their conceptual definition.

Another area in which the revised latent state-trait theory has been applied is perceived situation characteristics, which are conceptualised as dimensional mental representations of situations (Rauthmann et al., 2015). Thus, they are state variables which are shaped both by situational cues and by characteristics of the perceiver (Rauthmann et al., 2015, 2019). Columbus, Norget, et al. (in prep) analyse perceived situation characteristics (specifically, perceptions of multiple dimensions of interdependence) using LST-R models. They find that perceptions of multiple dimensions of interdependence reflect trait and state influences to different degrees. Perceived situation characteristics are in many ways similar to personality states in that they are coherent, dimensional, and situation-specific. Thus, our proposed criteria for the evaluation of personality state measures may similarly apply to measures of perceived situation characteristics.

Alternative approaches and possible extensions

A limitation of the LST-R models presented in this paper is that the latent state residuals are a composite of situation effects and person × situation interactions. As such, these models do not allow for insights into whether the situation-specific deviation from the trait level is purely due to situation-specific influences or also dependent on the trait level. For instance, is Ahmed’s high Extraversion state solely caused by the fact that he is at a party or does his generally high Extraversion trait level lead him to enjoy the party even more than a person with a lower Extraversion trait level, or both? However, there are alternative LST models that make it possible to disentangle trait-dependent influences of the situation (i.e. person × situation interactions) from main effects of the situation.

LST models for the combination of random and fixed situations (LST-RF) rely on a specific longitudinal measurement design to disentangle situation effects and person × situation interactions (Geiser et al., 2015). Whereas in LST-R models situations are assumed to be random (i.e. sampled randomly and interchangeably from the universe of possible situations) and unknown, LST-RF models additionally require fixed situations that are either experimentally induced (e.g. manipulated in a laboratory) or naturally occurring (e.g. recorded in ecological momentary assessment studies) and thus known to the researcher. This design allows comparing the effect of situations that are of particular substantive interest on states and traits (Geiser et al., 2015). In addition, compared to LST-R models, which implicitly conceptualise traits as situation-unspecific, LST-RF models enable a more contextualised view on traits and allow researchers to investigate whether and to which degree traits are situation-specific (Castro-Alvarez et al., 2022; Geiser et al., 2015).

Despite their appealing properties, LST-RF models have rarely been applied in empirical research. One challenge is that the models rely on known, fixed situations. One promising avenue for future research may be to combine the assessment of personality states with the assessment of situations using modern situation taxonomies (e.g. Situational Interdependence Scale, DIAMONDS; Gerpott et al., 2018; Rauthmann et al., 2015) or mobile sensing (Harari et al., 2020). Combining the assessment of personality states and situations would enable researchers to use LST-RF models to parse out the contribution of personality × situation interactions to personality states. This may be particularly valuable in the context of interactionist affordance models, which posit that manifestations of personality traits are context-dependent (e.g. de Vries et al., 2016; Thielmann et al., 2020).

Besides LST-R theory, there exist a range of alternative approaches to modelling intrapersonal dynamics which may be amenable to the assessment of personality states. For example, it has recently been suggested to estimate both between- and within-person reliabilities using a two-level random dynamic measurement model (Xiao et al., 2023). Moreover, latent Markov factor analysis can be used to probe intrapersonal changes in measurement models over time or situations (Vogelsmeier et al., 2019). Latent Markov factor analysis may be particularly valuable to identify context-specific changes in the measurement model of personality states.

LST-R models only address the level of states, but do not account for other parameters of their distribution. Therefore, the approach presented here does not provide insights into individual differences in variability across situations. However, mixed-effects location-scale models (Hedeker et al., 2008) allow for between- and within-person heterogeneity in variances, which makes it possible to identify person- and situation-level influences on variability in personality states (for recent applications, see Mader et al., 2023; Shrestha et al., 2024).
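To indicate how such models treat variability as an outcome, the block below sketches a mixed-effects location-scale model in the spirit of Hedeker et al. (2008). The notation is illustrative and should be checked against the original source: y_ij is the state score of person i at occasion j, υ_i a random location effect, ω_i a random scale effect, and x, u, and w are covariate vectors.

```latex
\begin{align}
y_{ij} &= \mathbf{x}_{ij}^{\top}\boldsymbol{\beta} + \upsilon_{i} + \varepsilon_{ij}, \\
\sigma^{2}_{\upsilon_{ij}} &= \exp\!\bigl(\mathbf{u}_{ij}^{\top}\boldsymbol{\alpha}\bigr), \\
\sigma^{2}_{\varepsilon_{ij}} &= \exp\!\bigl(\mathbf{w}_{ij}^{\top}\boldsymbol{\tau} + \omega_{i}\bigr).
\end{align}
```

The log-linear form keeps both variances positive. The random scale effect ω_i allows persons to differ in their within-person variability beyond what the covariates explain.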

Conclusion

Personality states describe how a person feels, thinks, and behaves in a particular situation. We formally define personality states within LST-R theory and translate this definition into testable latent variable models. In this framework, latent states reflect the influences of a person’s characteristics and prior experiences as well as those of systematic situation-specific influences. Within this framework, we propose criteria for evaluating and interpreting measures of personality states. We argue that it is particularly important to design and evaluate personality state measures for their specificity, that is, for their ability to reliably assess intrapersonal variability. An application of this framework to an existing measure of Extraversion illustrates how our approach leads to different interpretations and conclusions compared to common practices in state scale evaluation. Adopting an LST-R framework for research on personality states has the potential to improve measurement practices and to clarify and advance personality theory.

Footnotes

Acknowledgements

We thank Whitney R. Ringwald for sharing the data for the empirical example.

Author contributions

Martina Bader and Simon Columbus share first authorship on this manuscript.

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

Open science statement

The data and commented R code for the empirical example are publicly available on the Open Science Framework (OSF; https://osf.io/szfu7/).

ORCID iDs

Martina Bader

Simon Columbus

Ingo Zettler

Axel Mayer

References

Abrahams, Vergauwe, & De Fruyt (2023). Within-person personality variability in the work context: A blessing or a curse for job performance? Journal of Applied Psychology, 108(11), 1834–1855. https://doi.org/10.1037/apl0001101

AERA, APA, & NCME. (2014). Standards for educational and psychological testing. American Educational Research Association.

Ashton, M. C., Lee, & De Vries, R. E. (2014). The HEXACO honesty-humility, agreeableness, and emotionality factors: A review of research and theory. Personality and Social Psychology Review, 18(2), 139–152. https://doi.org/10.1177/1088868314523838

Bader & Moshagen (2022). Assessing the fitting propensity of factor models. Psychological Methods. https://doi.org/10.1037/met0000529

Barrett, L. F. (2017). The theory of constructed emotion: An active inference account of interoception and categorization. Social Cognitive and Affective Neuroscience, 12(1), 1–23. https://doi.org/10.1093/scan/nsw154

Baumert, Schmitt, Perugini, Johnson, Blum, Borkenau, Costantini, Denissen, J. J. A., Fleeson, Grafton, Jayawickreme, Kurzius, MacLeod, Miller, L. C., Read, S. J., Roberts, Robinson, M. D., Wood, & Wrzus (2017). Integrating personality structure, personality process, and personality development. European Journal of Personality, 31(5), 503–528. https://doi.org/10.1002/per.2115

Beckmann, Birney, D. P., Minbashian, & Beckmann, J. F. (2021). Personality dynamics at work: The effects of form, time, and context of variability. European Journal of Personality, 35(4), 421–449. https://doi.org/10.1177/08902070211017341

Bleidorn, Hopwood, C. J., Back, M. D., Denissen, J. J., Hennecke, Hill, P. L., Jokela, Kandler, Lucas, R. E., Luhmann, Orth, Roberts, B. W., Wagner, Wrzus, & Zimmermann (2021). Personality trait stability and change. Personality Science, 2, 1–20. https://doi.org/10.5964/ps.6009

Bleidorn, Schwaba, Zheng, Hopwood, C. J., Sosa, S. S., Roberts, B. W., & Briley (2022). Personality stability and change: A meta-analysis of longitudinal studies. Psychological Bulletin, 148(7-8), 588–619. https://doi.org/10.1037/bul0000365

Brown, N. A., Blake, A. B., & Sherman, R. A. (2017). A snapshot of the life as lived: Wearable cameras in social and personality psychological science. Social Psychological and Personality Science, 8(5), 592–600. https://doi.org/10.1177/1948550617703170

Castro-Alvarez, Tendeiro, J. N., de Jonge, Meijer, R. R., & Bringmann, L. F. (2022). Mixed-effects trait-state-occasion model: Studying the psychometric properties and the person–situation interactions of psychological dynamics. Structural Equation Modeling: A Multidisciplinary Journal, 29(3), 438–451. https://doi.org/10.1080/10705511.2021.1961587

Chen, F. F. (2007). Sensitivity of goodness of fit indexes to lack of measurement invariance. Structural Equation Modeling: A Multidisciplinary Journal, 14(3), 464–504. https://doi.org/10.1080/10705510701301834

Cheung, G. W., & Rensvold, R. B. (2002). Evaluating goodness-of-fit indexes for testing measurement invariance. Structural Equation Modeling: A Multidisciplinary Journal, 9(2), 233–255. https://doi.org/10.1207/S15328007SEM0902_5

Church, A. T., Katigbak, M. S., Ching, C. M., Zhang, Shen, Arias, R. M., Rincon, B. C., Morio, Tanaka-Matsumi, Takaoka, Mastor, K. A., Roslan, N. A., Ibáñez-Reyes, Vargas-Flores, J. d. J., Locke, K. D., Reyes, J. A. S., Wenmei, Ortiz, F. A., & Alvarez, J. M. (2013). Within-individual variability in self-concepts and personality states: Applying density distribution and situation-behavior approaches across cultures. Journal of Research in Personality, 47(6), 922–935. https://doi.org/10.1016/j.jrp.2013.09.002

Clark, L. A., Vittengl, Kraft, & Jarrett, R. B. (2003). Separate personality traits from states to predict depression. Journal of Personality Disorders, 17(2), 152–172. https://doi.org/10.1521/pedi.17.2.152.23990

Clark, L. A., & Watson (1995). Constructing validity: Basic issues in objective scale development. Psychological Assessment, 7(3), 309–319. https://doi.org/10.1037/1040-3590.7.3.309

Clark, L. A., & Watson (2019). Constructing validity: New developments in creating objective measuring instruments. Psychological Assessment, 31(12), 1412–1427. https://doi.org/10.1037/pas0000626

Cole, D. A., Martin, N. C., & Steiger, J. H. (2005). Empirical and conceptual problems with longitudinal trait-state models: Introducing a trait-state-occasion model. Psychological Methods, 10(1), 3–20. https://doi.org/10.1037/1082-989X.10.1.3

Columbus, Thielmann, & Balliet (2019). Situational affordances for prosocial behaviour: On the interaction between Honesty–Humility and (perceived) interdependence. European Journal of Personality, 33(6), 655–673. https://doi.org/10.1002/per.2224

Columbus, Böhm, Moshagen, & Zettler (in prep.). HEXACO-PSI: The HEXACO personality states inventory.

Columbus, Norget, Mayer, & Balliet (in prep.). State and trait components of perceived situation characteristics.

Danvers, A. F., Wundrack, & Mehl (2020). Equilibria in personality states: A conceptual primer for dynamics in personality states. European Journal of Personality, 34(6), 999–1016. https://doi.org/10.1002/per.2239

DeVellis, R. F., & Thorpe, C. T. (2021). Scale development: Theory and applications. Sage Publications.

de Vries, R. E., Tybur, J. M., Pollet, T. V., & van Vugt (2016). Evolution, situational affordances, and the HEXACO model of personality. Evolution and Human Behavior, 37(5), 407–421. https://doi.org/10.1016/j.evolhumbehav.2016.04.001

DeYoung, C. G. (2015). Cybernetic big five theory. Journal of Research in Personality, 56, 33–58. https://doi.org/10.1016/j.jrp.2014.07.004

Eid, Holtmann, Santangelo, & Ebner-Priemer (2017). On the definition of latent-state-trait models with autoregressive effects: Insights from LST-R theory. European Journal of Psychological Assessment, 33(4), 285–295. https://doi.org/10.1027/1015-5759/a000435

Fleeson (2001). Toward a structure- and process-integrated view of personality: Traits as density distribution of states. Journal of Personality and Social Psychology, 80(6), 1011–1027. https://doi.org/10.1037/0022-3514.80.6.1011

Fleeson & Gallagher (2009). The implications of big five standing for the distribution of trait manifestation in behavior: Fifteen experience-sampling studies and a meta-analysis. Journal of Personality and Social Psychology, 97(6), 1097–1114. https://doi.org/10.1037/a0016786

Fleeson & Jayawickreme (2015). Whole trait theory. Journal of Research in Personality, 56, 82–92. https://doi.org/10.1016/j.jrp.2014.10.009

Geiser (2021). Longitudinal structural equation modelling with Mplus: A latent state-trait perspective. The Guilford Press.

Geiser, Litson, Bishop, Keller, B. T., Burns, G. L., Servera, & Shiffman (2015). Analyzing person, situation and person × situation interaction effects: Latent state-trait models for the combination of random and fixed situations. Psychological Methods, 20(2), 165–192. https://doi.org/10.1037/met0000026

32. Gerpott, F. H., Balliet, Columbus, Molho, & de Vries, R. E. (2018). How do people think about interdependence? A multidimensional model of subjective outcome interdependence. Journal of Personality and Social Psychology, 115(4), 716–742. https://doi.org/10.1037/pspp0000166

33. Geukes, Nestler, Hutteman, Küfner, A. C., & Back, M. D. (2017). Trait personality and state variability: Predicting individual differences in within- and cross-context fluctuations in affect, self-evaluations, and behavior in everyday life. Journal of Research in Personality, 69, 124–138. https://doi.org/10.1016/j.jrp.2016.06.003

34. Hamaker, E. L., & Wichers (2017). No time like the present: Discovering the hidden dynamics in intensive longitudinal data. Current Directions in Psychological Science, 26(1), 10–15. https://doi.org/10.1177/0963721416666518

35. Harari, G. M., Müller, S. R., & Gosling, S. D. (2020). Naturalistic assessment of situations using mobile sensing methods. In Rauthmann, J. F., Sherman, R. A., & Funder, D. C. (Eds.), The Oxford handbook of psychological situations (pp. 299–311). Oxford University Press.

36. Hedeker, Mermelstein, R. J., & Demirtas (2008). An application of a mixed-effects location scale model for analysis of ecological momentary assessment (EMA) data. Biometrics, 64(2), 627–634. https://doi.org/10.1111/j.1541-0420.2007.00924.x

37. Henry, Baker, Bratko, Jern, Kandler, Tybur, J. M., de Vries, R. E., Wesseldijk, Zapko-Willmes, Booth, & Mõttus (2022). Nuanced HEXACO: A meta-analysis of HEXACO cross-rater agreement, heritability, and rank-order stability. Preprint. https://doi.org/10.31234/osf.io/sjthp

38. Hilbig, B. E., Kieslich, P. J., Henninger, Thielmann, & Zettler (2018). Lead us (not) into temptation: Testing the motivational mechanisms linking Honesty–Humility to cooperation. European Journal of Personality, 32(2), 116–127. https://doi.org/10.1002/per.2149

39. Hofmann, Gschwendner, & Schmitt (2009). The road to the unconscious self not taken: Discrepancies between self- and observer-inferences about implicit dispositions from nonverbal behavioural cues. European Journal of Personality, 23(4), 343–366. https://doi.org/10.1002/per.722

40. Horstmann, K. T., Rauthmann, J. F., Sherman, R. A., & Ziegler (2021). Unveiling an exclusive link: Predicting behavior with personality, situation perception, and affect in a preregistered experience sampling study. Journal of Personality and Social Psychology, 120(5), 1317–1343. https://doi.org/10.1037/pspp0000357

41. Horstmann, K. T., & Ziegler (2020). Assessing personality states: What to consider when constructing personality state measures. European Journal of Personality, 34(6), 1037–1059. https://doi.org/10.1002/per.2266

42. Hu & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling: A Multidisciplinary Journal, 6(1), 1–55. https://doi.org/10.1080/10705519909540118

43. Huang, J. L., & Ryan, A. M. (2011). Beyond personality traits: A study of personality states and situational contingencies in customer service jobs. Personnel Psychology, 64(2), 451–488. https://doi.org/10.1111/j.1744-6570.2011.01216.x

44. Jayawickreme, Zachry, C. E., & Fleeson (2019). Whole trait theory: An integrative approach to examining personality structure and process. Personality and Individual Differences, 136, 2–11. https://doi.org/10.1016/j.paid.2018.06.045

45. Judge, T. A., Simon, L. S., Hurst, & Kelley (2014). What I experienced yesterday is who I am today: Relationship of work motivations and behaviors to within-individual variation in the Five-Factor model of personality. Journal of Applied Psychology, 99(2), 199–221. https://doi.org/10.1037/a0034485

46. Kalimeri, Lepri, & Pianesi (2013). Going beyond traits: Multimodal classification of personality states in the wild. In Proceedings of the 15th International Conference on Multimodal Interaction (pp. 27–34). ACM. https://doi.org/10.1145/2522848.2522878

47. Kritzler, Haehner, Krasko, & Buecker (2023). What happens when you add a ‘not relevant’ response option to the unipolar response scales of personality state items? Personality Science, 4, 1–24. https://doi.org/10.5964/ps.8477

48. Kuper, Breil, S. M., Horstmann, K. T., Roemer, Lischetzke, Sherman, R. A., Back, M. D., Denissen, J. J. A., & Rauthmann, J. F. (2022). Individual differences in contingencies between situation characteristics and personality states. Journal of Personality and Social Psychology, 123(5), 1166–1198. https://doi.org/10.1037/pspp0000435

49. Kuppens (2015). It’s about time: A special section on affect dynamics. Emotion Review, 7(4), 297–300. https://doi.org/10.1177/1754073915590947

50. Lazarus & Rafaeli (2023). Modes: Cohesive personality states and their interrelationships as organizing concepts in psychopathology. Journal of Psychopathology and Clinical Science, 132(3), 238–248. https://doi.org/10.1037/abn0000699

51. Loevinger (1957). Objective tests as instruments of psychological theory. Psychological Reports, 3(7), 635–694. https://doi.org/10.2466/PR0.3.7.635-694

52. Lönnqvist, J.-E., & Itkonen, J. V. A. (2014). It’s all about extraversion: Why Facebook friend count doesn’t count towards well-being. Journal of Research in Personality, 53, 64–67. https://doi.org/10.1016/j.jrp.2014.08.009

53. Mader, Arslan, R. C., Schmukle, S. C., & Rohrer, J. M. (2023). Emotional (in)stability: Neuroticism is associated with increased variability in negative emotion after all. Proceedings of the National Academy of Sciences, 120(23), Article e2212154120. https://doi.org/10.1073/pnas.2212154120

54. Mayer (2015). lsttheory: An R package for estimating latent state-trait models [R package]. https://github.com/amayer2010/lsttheory

55. McCabe, K. O., & Fleeson (2012). What is extraversion for? Integrating trait and motivational perspectives and identifying the purpose of extraversion. Psychological Science, 23(12), 1498–1505. https://doi.org/10.1177/0956797612444904

56. Meredith (1993). Measurement invariance, factor analysis and factorial invariance. Psychometrika, 58(4), 525–543. https://doi.org/10.1007/BF02294825

57. Mischel & Shoda (1995). A cognitive-affective system theory of personality: Reconceptualizing situations, dispositions, dynamics, and invariance in personality structure. Psychological Review, 102(2), 246–268. https://doi.org/10.1037/0033-295X.102.2.246

58. Moors, Ellsworth, P. C., Scherer, K. R., & Frijda, N. H. (2013). Appraisal theories of emotion: State of the art and future development. Emotion Review, 5(2), 119–124. https://doi.org/10.1177/1754073912468165

59. Mõttus, Kandler, Bleidorn, Riemann, & McCrae, R. R. (2017). Personality traits below facets: The consensual validity, longitudinal stability, heritability, and utility of personality nuances. Journal of Personality and Social Psychology, 112(3), 474–490. https://doi.org/10.1037/pspp0000100

60. Mõttus, Sinick, Terracciano, Hřebíčková, Kandler, Ando, Mortensen, E. L., Colodro-Conde, & Jang, K. L. (2019). Personality characteristics below facets: A replication and meta-analysis of cross-rater agreement, rank-order stability, heritability, and utility of personality nuances. Journal of Personality and Social Psychology, 117(4), e35–e50. https://doi.org/10.1037/pspp0000202

61. Nesselroade, J. R., & Molenaar, P. C. M. (2010). Emphasizing intraindividual variability in the study of development over the life span: Concepts and issues. In Lerner, R. M., Lamb, M. E., & Freund, A. M. (Eds.), The handbook of life-span development (pp. 30–54). John Wiley & Sons, Inc. https://doi.org/10.1002/9780470880166.hlsd001002

62. Norget & Mayer (2022). Block-wise model fit for structural equation models with experience sampling data. Zeitschrift für Psychologie, 230(1), 47–59. https://doi.org/10.1027/2151-2604/a000482

63. Norget, Weiss, & Mayer (2023). Estimating latent state-trait models for experience sampling data in R with the lsttheory package: A tutorial. Preprint. https://doi.org/10.31234/osf.io/ds9rv

64. Nübold & Hülsheger, U. R. (2021). Personality states mediate the effect of a mindfulness intervention on employees’ work outcomes: A randomized controlled trial. European Journal of Personality, 35(4), 646–664. https://doi.org/10.1177/08902070211012915

65. Olaru, Schroeders, Hartung, & Wilhelm (2019). Ant colony optimization and local weighted structural equation modeling. A tutorial on novel item and person sampling procedures for personality research. European Journal of Personality, 33(3), 400–419. https://doi.org/10.1002/per.2195

66. Olatunji, B. O., Cox, R. C., & Cole, D. A. (2020). The longitudinal structure of disgust proneness: Testing a latent trait-state model in relation to obsessive-compulsive symptoms. Behaviour Research and Therapy, 135, Article 103749. https://doi.org/10.1016/j.brat.2020.103749

67. Peterson, C. H., Peterson, N. A., & Powell, K. G. (2017). Cognitive interviewing for item development: Validity evidence based on content and response processes. Measurement and Evaluation in Counseling and Development, 50(4), 217–223. https://doi.org/10.1080/07481756.2017.1339564

68. Prenoveau, J. M. (2016). Specifying and interpreting latent state–trait models with autoregression: An illustration. Structural Equation Modeling: A Multidisciplinary Journal, 23(5), 731–749. https://doi.org/10.1080/10705511.2016.1186550

69. Rauthmann, J. F., Horstmann, K. T., & Sherman, R. A. (2019). Do self-reported traits and aggregated states capture the same thing? A nomological perspective on trait-state homomorphy. Social Psychological and Personality Science, 10(5), 596–611. https://doi.org/10.1177/1948550618774772

70. Rauthmann, J. F., Sherman, R. A., & Funder, D. C. (2015). Principles of situation research: Towards a better understanding of psychological situations. European Journal of Personality, 29(3), 363–381. https://doi.org/10.1002/per.1994

71. Ringwald, W. R., Manuck, S. B., Marsland, A. L., & Wright, A. G. C. (2022). Psychometric evaluation of a Big Five personality state scale for intensive longitudinal studies. Assessment, 29(6), 1301–1319. https://doi.org/10.1177/10731911211008254

72. Rosseel (2012). lavaan: An R package for structural equation modeling. Journal of Statistical Software, 48(2), 1–36. https://doi.org/10.18637/jss.v048.i02

73. Ryan, Gannon-Slater, & Culbertson, M. J. (2012). Improving survey methods with cognitive interviews in small- and medium-scale evaluations. American Journal of Evaluation, 33(3), 414–430. https://doi.org/10.1177/1098214012441499

74. Scherer, K. R. (2009). The dynamic architecture of emotion: Evidence for the component process model. Cognition & Emotion, 23(7), 1307–1351. https://doi.org/10.1080/02699930902928969

75. Schmidt, F. L., Le, & Ilies (2003). Beyond alpha: An empirical examination of the effects of different sources of measurement error on reliability estimates for measures of individual-differences constructs. Psychological Methods, 8(2), 206–224. https://doi.org/10.1037/1082-989X.8.2.206

76. Seifert, I. S., Rohrer, J. M., Egloff, & Schmukle, S. C. (2022). The development of the rank-order stability of the big five across the life span. Journal of Personality and Social Psychology, 122(5), 920–941. https://doi.org/10.1037/pspp0000398

77. Shoda & Mischel (2000). Reconciling contextualism with the core assumptions of personality psychology. European Journal of Personality, 14(5), 407–428. https://doi.org/10.1002/1099-0984(200009/10)14:5<407::AID-PER391>3.0.CO;2-3

78. Shrestha, Sigdel, Pokharel, & Columbus (2024). Big Five traits predict within- and between-person variation in loneliness. European Journal of Personality. https://doi.org/10.1177/08902070241239834

79. Sosnowska, Kuppens, De Fruyt, & Hofmans (2020). New directions in the conceptualization and assessment of personality—a dynamic systems approach. European Journal of Personality, 34(6), 988–998. https://doi.org/10.1002/per.2233

80. Stachl, Boyd, R. L., Horstmann, K. T., Khambatta, Matz, S. C., & Harari, G. M. (2021). Computational personality assessment. Personality Science, 2, 1–22. https://doi.org/10.5964/ps.6115

81. Stadtbaeumer, Kreissl, & Mayer (2022). Comparing revised latent state–trait models including autoregressive effects. Psychological Methods, 29(1), 155–168. https://doi.org/10.1037/met0000523

82. Staiano, Lepri, Subramanian, Sebe, & Pianesi (2011). Automatic modeling of personality states in small group interactions. In Proceedings of the 19th ACM International Conference on Multimedia (pp. 989–992). ACM. https://doi.org/10.1145/2072298.2071920

83. Steyer, Mayer, Geiser, & Cole, D. A. (2015). A theory of states and traits—revised. Annual Review of Clinical Psychology, 11(1), 71–98. https://doi.org/10.1146/annurev-clinpsy-032813-153719

84. Steyer, Schmitt, & Eid (1999). Latent state-trait theory and research in personality and individual differences. European Journal of Personality, 13(5), 389–408. https://doi.org/10.1002/(SICI)1099-0984(199909/10)13:5<389::AID-PER361>3.0.CO;2-A

85. Steyer & Schmitt (1994). The theory of confounding and its application in causal modeling with latent variables. In von Eye & Clogg, C. C. (Eds.), Latent variables analysis: Applications for developmental research (pp. 36–67). Sage Publications, Inc.

86. Tett, R. P., & Burnett, D. D. (2003). A personality trait-based interactionist model of job performance. Journal of Applied Psychology, 88(3), 500–517. https://doi.org/10.1037/0021-9010.88.3.500

87. Thielmann, Spadaro, & Balliet (2020). Personality and prosocial behavior: A theoretical framework and meta-analysis. Psychological Bulletin, 146(1), 30–90. https://doi.org/10.1037/bul0000217

88. Van Berkel, Ferreira, & Kostakos (2017). The experience sampling method on mobile devices. ACM Computing Surveys, 50(6), 1–40. https://doi.org/10.1145/3123988

89. Vogelsmeier, L. V. D. E., Vermunt, J. K., Van Roekel, & De Roover (2019). Latent Markov factor analysis for exploring measurement model changes in time-intensive longitudinal studies. Structural Equation Modeling: A Multidisciplinary Journal, 26(4), 557–575. https://doi.org/10.1080/10705511.2018.1554445

90. West, S. G., Taylor, A. B., & Wu (2012). Model fit and model selection in structural equation modeling. In Hoyle, R. H. (Ed.), Handbook of structural equation modeling (pp. 209–231). The Guilford Press.

91. Widaman, K. F., Ferrer, & Conger, R. D. (2010). Factorial invariance within longitudinal structural equation models: Measuring the same construct across time. Child Development Perspectives, 4(1), 10–18. https://doi.org/10.1111/j.1750-8606.2009.00110.x

92. Wilt & Revelle (2015). Affect, behavior, cognition, and desire in the big five: An analysis of item content and structure. European Journal of Personality, 29(4), 478–497. https://doi.org/10.1002/per.2002

93. Wright, A. G. C., & Simms, L. J. (2016). Stability and fluctuation of personality disorder features in daily life. Journal of Abnormal Psychology, 125(5), 641–656. https://doi.org/10.1037/abn0000169

94. Wright, A. G. C., & Zimmermann (2019). Applied ambulatory assessment: Integrating idiographic and nomothetic principles of measurement. Psychological Assessment, 31(12), 1467–1480. https://doi.org/10.1037/pas0000685

95. Wrzus, Luong, Wagner, G. G., & Riediger (2021). Longitudinal coupling of momentary stress reactivity and trait neuroticism: Specificity of states, traits, and age period. Journal of Personality and Social Psychology, 121(3), 691–706. https://doi.org/10.1037/pspp0000308

96. Wrzus & Roberts, B. W. (2017). Processes of personality development in adulthood: The TESSERA framework. Personality and Social Psychology Review, 21(3), 253–277. https://doi.org/10.1177/1088868316652279

97. Xiao, Wang, & Liu (2023). Assessing intra- and inter-individual reliabilities in intensive longitudinal studies: A two-level random dynamic model-based approach. Psychological Methods. https://doi.org/10.1037/met0000608

98. Yasuda, Lawrenz, Van Whitlock, Lubin, & Lei, P.-W. (2004). Assessment of intraindividual variability in positive and negative affect using latent state-trait model analyses. Educational and Psychological Measurement, 64(3), 514–530. https://doi.org/10.1177/0013164403258445

99. Yuan, K.-H., & Bentler, P. M. (2000). Three likelihood-based methods for mean and covariance structure analysis with nonnormal missing data. Sociological Methodology, 30(1), 165–200. https://doi.org/10.1111/0081-1750.00078

100. Yuan, K.-H., Tian, & Yanagihara (2015). Empirical correction to the likelihood ratio statistic for structural equation modeling with many variables. Psychometrika, 80(2), 379–405. https://doi.org/10.1007/s11336-013-9386-5

101. Zettler & Hilbig, B. E. (2010). Honesty-Humility and a person-situation interaction at work. European Journal of Personality, 24(7), 569–582. https://doi.org/10.1002/per.757

102. Zettler, Thielmann, Hilbig, B. E., & Moshagen (2020). The nomological net of the HEXACO model of personality: A large-scale meta-analytic investigation. Perspectives on Psychological Science, 15(3), 723–760. https://doi.org/10.1177/1745691619895036

103. Zillig, L. M. P., Hemenover, S. H., & Dienstbier, R. A. (2002). What do we assess when we assess a big 5 trait? A content analysis of the affective, behavioral, and cognitive processes represented in big 5 personality inventories. Personality and Social Psychology Bulletin, 28(6), 847–858. https://doi.org/10.1177/0146167202289013

104. Zimmermann, Woods, W. C., Ritter, Happel, Masuhr, Jaeger, Spitzer, & Wright, A. G. C. (2019). Integrating structure and dynamics in personality assessment: First steps toward the development and validation of a personality dynamics diary. Psychological Assessment, 31(4), 516–531. https://doi.org/10.1037/pas0000625
