Abstract
Although research has provided rich data highlighting the value of team psychological safety, a common call among scholars is the need for longitudinal examinations of this construct to test implicit assumptions about time made in dominant team theories. By analyzing changes in both consensus and the average level of psychological safety over time, we respond to this need and provide insight into emergent state development. Results indicated that team psychological safety climate develops over time through consensus emergence, this process is initiated by early information sharing, and there is a decline in the average level of psychological safety over time.
“Google’s data indicated that psychological safety, more than anything else, was critical to making a team work. ‘We had to get people to establish psychologically safe environments,’ Rozovsky [a Google employee] told me. But it wasn’t clear how to do that.”
A large body of literature highlights the value that psychological safety (i.e., the shared perception that the team is safe for interpersonal risk-taking, Edmondson, 1999) brings to teams and organizations (Frazier et al., 2017). Team psychological safety aids teams in their development and contributes to performance and innovation by encouraging individuals to speak up, grow, learn, and contribute in their own right (Edmondson, 1999). Such evidence has led to many successful companies, such as Google and Amazon, aiming to capitalize on the benefits of psychological safety. However, although the value of psychological safety has been clearly established (e.g., Frazier et al., 2017), it is less clear how this climate emerges. For a team-level climate to emerge, team members must come to share a similar perception of psychological safety (Luria, 2019). Yet little is known about this consensus process, including how quickly it occurs and what may influence it.
As noted by Woodley et al. (2019), previous researchers have examined agreement indices (e.g., intraclass correlations [ICCs], Rwg) to demonstrate that a sufficient level of agreement has been reached to evidence consistency in individual viewpoints. Once this level of agreement has been deemed sufficient, as evidenced by reaching a certain threshold for the agreement index, psychological safety is generally operationalized as a team average (Roussin et al., 2016). But these agreement metrics provide only a “snapshot of sharedness. . .ignoring the dynamicity of the emergence process” (Woodley et al., 2019, p. 4). These metrics do not represent a strict test of the consensus process nor do they allow researchers to directly test changes in consensus (Lang et al., 2018), which is theoretically argued to represent the process by which individual-level phenomena become shared team-level properties (i.e., the emergence process; Kozlowski et al., 2013; Luria, 2019). Disagreements in consensus may have both beneficial and negative effects on the team, such as facilitating learning behaviors (Gibson & Vermeulen, 2003; Roussin et al., 2016) or relationship conflict (Lau & Murnighan, 1998). Although negative outcomes occurring as a result of distinct subgroups may be similar to effects stemming from low levels of psychological safety across the team, positive outcomes are potentially unique, and the appropriate remediation for negative outcomes may vary based on the root cause. For example, repairing fractured subgroups would likely require a different intervention than addressing low levels of psychological safety across the entire team.
Further impeding understanding, few studies have examined the temporal nature of emergent states (Edmondson & Bransby, 2022; Rapp et al., 2021), precluding the possibility of examining how consensus is reached over time. Most studies use cross-sectional designs and treat psychological safety as static (Newman et al., 2017). There are some exceptions, such as the organization-level study of psychological safety conducted by Higgins et al. (2022) and the longitudinal individual- and department-level study conducted by Bransby et al. (2024). Additional longitudinal work using time-lagged or panel designs, although scarce, highlights the variable nature of psychological safety over time. For example, correlations between team psychological safety across time range from as low as .15 (Schulte et al., 2012) to as high as .68 (Edmondson & Mogelof, 2005), indicating potential variation across contexts. The strength of the effect of certain variables on psychological safety has also been found to fluctuate over time. For example, goal clarity has been found to demonstrate a stronger relationship with psychological safety later in the team lifecycle (Edmondson & Mogelof, 2005). Without understanding such fluctuations, researchers may miss an important explanation for fluctuations in other important team phenomena like team performance.
To address these gaps, we leverage the multilevel group process framework (MGPF; Lang & Bliese, 2019; Lang et al., 2019) to closely examine changes in the level of psychological safety over time. The MGPF accounts for gradual increases and decreases in residual variance among team members. In using this perspective to address these gaps, our research contributes to the teams literature in three primary ways. First, we extend prior work on emergent states by conducting a formal test of whether it takes time for individuals to reach consensus on psychological safety. We conduct our investigation using the consensus emergence model (CEM; Lang et al., 2018), which allows for accurate modeling of the consensus emergence process (Lang & Bliese, 2019; Lang et al., 2019). This statistical approach enables scholars to examine group climates in a theoretically appropriate manner, assuming bottom-up (i.e., individual-level behaviors, attitudes, and cognitions shape and influence team-level phenomena), rather than top-down (i.e., higher-level phenomena influence and shape lower-level phenomena), emergence. The emergence of group climates, emergent states, or shared perceptions over time has been referred to by scholars as a “black box” (e.g., Lang et al., 2019), given the inability to accurately test it and the general trend to apply a top-down approach (Kozlowski et al., 2013). With this methodology, we study psychological safety in a manner that aligns with the theorized nature of its emergence (Kozlowski & Klein, 2000) and conduct a rigorous test of the process by which individual-level perceptions coalesce to represent a team-level state. This allows us to answer more nuanced questions, including how quickly teams develop shared perceptions of emergent states (and therefore, how quickly the climate forms; Luria, 2019) and how team characteristics influence the consensus emergence process (Lang & Bliese, 2019; Rapp et al., 2021).
Second, we provide insight into how consensus on psychological safety can be rapidly established within the team’s lifecycle by identifying an antecedent of psychological safety consensus. Although much work has been conducted on psychological safety, most work has examined relationships between antecedents and the average level of psychological safety, rather than relationships between antecedents and the degree to which there is consensus. As Luria (2019, pp. 1062–1063) notes, “an important missing link in climate studies is longitudinal studies that measure climate over time and capture convergence in employees’ climate perceptions.” Without such work, we run the risk of inaccurately evaluating the onset of climate, which is critical to understand; teams consistently grapple with developing and sustaining positive team climates. By providing insight into a factor that contributes to consensus, information sharing, we demonstrate how to establish team psychological safety, which is a key driver of effectiveness. Other research establishing antecedents likely reflects psychological safety with varying degrees of consensus, masking potential effects of fluctuations or disparities in team consensus. Moreover, in demonstrating how quickly consensus can be reached, we challenge traditional perspectives which hold that emergent states, such as team psychological safety, take a long time to develop (Coultas et al., 2014; Marks et al., 2001). We instead argue that through appropriate team processing, a shared perception of psychological safety can develop in a relatively short amount of time. From a practical standpoint, this provides insight into how to quickly foster agreement among team members on levels of psychological safety.
Lastly, we empirically demonstrate that psychological safety is a dynamic climate which fluctuates over time. As noted, theoretical work on teamwork has long emphasized the importance of time in influencing the development of team phenomena, particularly dynamic team constructs such as emergent states (Marks et al., 2001). Yet, empirical work has largely failed to parallel advancements in team theory by incorporating a focus on time. By addressing this gap, we demonstrate that, once established via consensus, there is no guarantee that psychological safety will remain consistently high. Instead, our results indicate that the average level of psychological safety may decrease towards the end of team lifespans, which may explain fluctuations in team outcomes.
Theory and Hypotheses
Emergent State Development
Schmidt et al. (2023) note that there is some evidence indicating that various team and task features, such as interdependence (Klein et al., 2001) and the observability of team constructs (Carter et al., 2018), influence agreement on several team constructs. There is also evidence that consensus-based indicators can predict team performance (LePine et al., 2008). More recent research investigating collective efficacy, featuring a focus on both consensus and the trajectory using the same statistical approach we adopt in our study (i.e., the CEM; Lang et al., 2018), found that both consensus and the overall level of collective efficacy decreased over time; however, smaller decreases were associated with higher levels of performance (McLarnon et al., 2021). Woodley et al. (2019) similarly found that the average level of group potency decreased over time; however, counter to McLarnon et al. (2021), they found that consensus increased with time, indicating different emergent states may temporally behave in distinct manners.
Despite these examples, work using the CEM and investigating both consensus and levels of emergent states over time is rare; most work examining emergent state development focuses on the level over time (Coultas et al., 2014) and is further divided between many different types of emergent states (with over 50 emergent state constructs in the literature, Rapp et al., 2021). As such, researchers have called for a larger focus on studying the temporal and dynamic nature of these constructs. Rapp et al. (2021) emphasized the need to “explore how long team psychological safety takes to develop, how it strengthens or weakens overtime, and what causes those changes,” (p. 87) a call which motivated our study.
Psychological Safety Emergence
When conceptualizing psychological safety emergence, we adopt the influential perspective of James and Jones (1974); they clarified previous inconsistencies in the climate literature by noting that climate results from both contextual and individual influences, rather than representing an objective characteristic of the organization or individual perceptions alone. We further conceive of psychological safety as emerging in a bottom-up fashion, representing a shared unit property or a construct that “emerge[s] from individual members’ shared perceptions, affect, and responses” (Kozlowski & Klein, 2000, p. 24). Consistent with Kozlowski and Klein (2000), we further view the emergence of psychological safety as interactive in nature, with interactions and processes in the team contributing to its team-level formation. Over time, team member interactions evidence some stability; previous interactions and team member role expectations and behaviors, in part, explain this stability, although emergent states are still subject to fluctuations due to various factors (Kozlowski & Klein, 2000).
We further conceptualize psychological safety following Edmondson (1999), who introduced this construct to the teams literature by explaining its role in creating a team environment conducive to learning. She argued that psychological safety enables learning by removing fear of others’ reactions to behavior vulnerable to embarrassment or threat. Individuals are more likely to engage in behaviors that lead to failure or negative perceptions (e.g., voicing an opinion that might not align with the majority view) and do not believe that there will be punishment for making a mistake if psychological safety is in place. This prompts a higher level of risk-taking and discussion about difficult issues (e.g., low performance), allowing teams to improve performance (Edmondson, 1999; Frazier et al., 2017).
Similar to the robust evidence establishing a positive effect of average psychological safety level on team outcomes, a large body of research identifies antecedents of psychological safety. Antecedents such as leadership (e.g., Nembhard & Edmondson, 2006), team emotional stability (Edmondson & Mogelof, 2005), and team cooperation (Tjosvold et al., 2004) have been identified as leading to high average levels of team psychological safety. Synthesizing this research, Newman et al. (2017) found that team leaders play an integral role in fostering average team psychological safety by providing support and coaching, cultivating trust, and promoting inclusiveness. Based on social learning theory (Bandura & Walters, 1977), Newman et al. (2017) suggested that such behaviors are particularly important to psychological safety; the leader uses them to model that risk-taking and honest communication are safe and acceptable within the team, establishing psychological safety. Support and resources stemming from social networks (e.g., quality of relationships) and certain team characteristics (e.g., shared team rewards) were further identified as cultivating high levels of psychological safety (Newman et al., 2017). Psychological safety has also been found to mediate the relationship between team structural features (i.e., supportive features providing clarity about the team and task environment, such as clarity of the team goal and clear rewards) and important outcomes (Edmondson, 1999; Edmondson & Lei, 2014). However, as noted, it is unclear to what degree team psychological safety consensus impacts these relationships.
Edmondson (1999) emphasized that “shared beliefs about how others will react are established over time; these cannot take shape in the laboratory in a meaningful way,” (p. 379). This type of logic was leveraged to inform the MGPF, which was introduced to investigate the longitudinal development of such phenomena (Lang & Bliese, 2019; Lang et al., 2019). The MGPF holds that states such as psychological safety change in regard to both consensus and direction (Lang et al., 2019). Consensus refers to the degree to which individuals within teams converge on a shared perception. Team consensus that there is a high, rather than low, level of psychological safety will have more positive effects. However, it may also be beneficial to identify teams that exhibit consensus on a low level of psychological safety, as this may represent a source of group dysfunction. Once this source of dysfunction has been identified, steps can be taken to improve psychological safety. On the other hand, without consensus, team members will possess conflicting ideas, which may present an additional host of problems (e.g., relational subgroups). Direction refers to a change in the average level or magnitude of psychological safety (i.e., change in the level of the slope, Lang et al., 2019). Changes in the average level of psychological safety can range from small to large and can move in either a positive or negative way. For example, psychological safety may shift from a very low to a very high score following a concerted effort from the organization and leadership to improve it. Similarly, psychological safety may also shift from a very high to a slightly less high score following instances of task conflict or similar circumstances.
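The two kinds of change described above, consensus and direction, can be written compactly. What follows is a simplified sketch of a consensus emergence model in the spirit of Lang et al. (2018). The symbols ($y_{ijt}$, $\beta_1$, $\delta_1$, and so on) are illustrative notation introduced here rather than taken from this article, and a full specification may include additional random effects.

```latex
% Rating of person i in team j at time t (illustrative notation)
y_{ijt} = \beta_0 + \beta_1\,\mathrm{TIME}_t
        + u_{0j} + u_{1j}\,\mathrm{TIME}_t + v_{0ij} + e_{ijt}

% The residual (within-team) variance is allowed to change with time
\operatorname{Var}(e_{ijt}) = \sigma^2_e \exp\!\left(2\,\delta_1\,\mathrm{TIME}_t\right)
```

In this sketch, $\beta_1$ captures direction (change in the average level), whereas $\delta_1 < 0$ indicates that ratings within teams grow more similar over time (i.e., consensus emerges) and $\delta_1 > 0$ would indicate divergence.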
The MGPF is consistent with climate emergence theory (Bliese, 2000; Kozlowski et al., 2013), which specifies that team states emerge in a bottom-up fashion of dynamic interaction over time (Kozlowski et al., 2013). Emergent state formation requires time, in part, because emergent states represent psychological climates shaped by individual team members’ cognitive appraisals of the work environment (James et al., 1988; Kessel et al., 2012). There must be time for team members to make cognitive evaluations before they converge upon a shared perception. Regarding the information which enables team members to make cognitive evaluations, Edmondson (1999) suggests that a set of strong structural features is pivotal to teams forming impressions of psychological safety, theorizing that such features reduce uncertainty about the team. Structural features refer to “a clear compelling team goal, an enabling team design (including context support such as adequate resources, information, and rewards), along with team leader behaviors such as coaching and direction” (Edmondson, 1999, p. 356; Hackman, 1987), although this list is not exhaustive. Structural features can also include characteristics of the task (Hackman, 1987) and would be classified in the “inputs” category featured in the seminal input-process-output (IPO) models of team effectiveness (Ilgen et al., 2005). In the IPO model, inputs encompass a range of task and team or structural components which, in turn, affect interactions and processes (i.e., the process component of the model) which subsequently impact important outcomes (i.e., the output component of the model). In our proposed model, psychological safety and communication would be subsumed within the “process” component of the classic team effectiveness models. 
Within-team interactions and communication are other important components which may foster psychological safety consensus, as repeated interactions are critical to forming shared impressions (Carter et al., 2018).
Taken together, we suggest that team members are influenced by the same structural features and exposed to the same interactions, leading them to develop similar perceptions of psychological climates over time (Edmondson, 1999). For example, if one team member is not punished for a mistake, but instead encouraged to openly discuss it, team members begin to form a shared impression that they are psychologically safe with the team as a whole. If the individual in this scenario is punished, team members will still begin to form a shared impression, although likely more negative in nature. As individuals both experience and witness additional interactions that indicate acceptance (or rejection) of vulnerability, failure, and interpersonal risk-taking, uncertainty is reduced and similar individual-level perceptions form and coalesce in a shared team-level perception. Consensus emergence, as described in the MGPF, represents this process of emergence over time (Lang et al., 2019).
We hypothesize that team members will reach consensus on a perception of psychological safety over time due to exposure to the same structural features and interactions. Teams may make sense of structural features together, and have opportunities to further solidify impressions, through interactions or processes. Information sharing represents a direct method by which team members can share information about structural features, mostly their team and task, and with which impressions can be formed. As a result of these experiences, individuals will develop a clear idea of what is expected of them and share experiences that reinforce that understanding, gaining and sharing information that allows them to form and reinforce team impressions. These impressions allow individuals to reach consensus and develop a shared perception—and therefore, a climate (Luria, 2019)—of psychological safety, whether it is positive or negative in nature. There is likely a relational component to developing psychological safety as well, yet Newman et al. (2017) suggest that the effect of relational elements on psychological safety stems from social learning processes (i.e., individuals form conclusions about psychological climates by observing behaviors modeled by others). This aligns with the idea that the more time individuals have to interact with one another and observe team behavior, the more information they have to form impressions about psychological climates. In accordance with this rationale, we hypothesize:
Hypothesis 1: Consensus on team psychological safety will increase over time.
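To make the pattern predicted by Hypothesis 1 concrete, the following toy simulation generates ratings whose within-team spread shrinks exponentially over six time points and then reports the pooled within-team variance at each point. It is a minimal sketch, not the analysis used in this study: the function names, the team means, and the decay parameter `delta = -0.15` are hypothetical choices for illustration, and the simulated ratings are not truncated to a 1-7 response scale.

```python
import math
import random
import statistics


def residual_sd(time, sigma=1.0, delta=-0.15):
    """Within-team residual SD at a time point: sigma * exp(delta * time)."""
    return sigma * math.exp(delta * time)


def simulate_within_team_variance(n_teams=26, team_size=6, n_times=6,
                                  delta=-0.15, seed=1):
    """Return the pooled within-team variance of simulated ratings per time point."""
    rng = random.Random(seed)
    # Each team gets its own (hypothetical) true mean so teams differ from one another.
    team_means = [rng.gauss(5.5, 0.5) for _ in range(n_teams)]
    pooled = []
    for t in range(n_times):
        sd_t = residual_sd(t, delta=delta)
        team_variances = []
        for mean in team_means:
            ratings = [rng.gauss(mean, sd_t) for _ in range(team_size)]
            team_variances.append(statistics.variance(ratings))
        # Team sizes are equal, so the pooled variance is the plain average.
        pooled.append(statistics.mean(team_variances))
    return pooled


if __name__ == "__main__":
    for t, var in enumerate(simulate_within_team_variance(), start=1):
        print(f"Time {t}: pooled within-team variance = {var:.2f}")
```

If consensus increases over time, the printed variances should trend downward; a flat or rising trend would be inconsistent with the hypothesis.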
As discussed, many antecedents can increase (or decrease) perceptions of psychological safety and, in turn, consensus that high (or low) levels are present in the team (Frazier et al., 2017). Antecedents, however, take time to influence psychological safety; teams require a sufficient number of interactions before reaching consensus on emergent states, in part, because emergent states represent underlying psychological phenomena that are difficult to observe (Carter et al., 2018). As such, we note the particular importance of early information sharing because it provides the necessary interaction opportunities to form impressions of psychological safety. Information sharing also serves as a direct conduit through which team members can make sense of the structural features they are exposed to and share observations about team behavior.
Information sharing, broadly defined, refers to the deliberate attempts of team members to exchange work-related information with one another, disseminate information to keep team members up-to-date, and inform team members about activities (Bunderson & Sutcliffe, 2002). We posit that when individuals make conscious efforts to share information and keep the team updated, they accelerate the rate at which shared impressions, reflecting either low or high levels of psychological safety, are formed. Exposure to information, particularly about the structural features in place, provides individuals with insight into their environment, allowing them to more quickly come to neutral, positive, or negative conclusions (O’Reilly, 1980). By sharing team and task information, individuals begin accumulating the necessary number of interactions to form a perception of the team and psychological climates like psychological safety (Carter et al., 2018). Engaging in information sharing also provides opportunities to discuss or make sense of structural features together, leading to similar impressions. Individuals may also convey information other team members missed, potentially reinforcing or challenging newly forming perceptions. If impressions are challenged in these exchanges, this provides team members with an opportunity to make sense of their impressions and come to agreement together. In accordance with this rationale, we hypothesize:
Hypothesis 2: Early team information sharing (a team-level predictor) is related to the degree to which teams come to consensus on psychological safety; teams reporting high levels of early information sharing will reach consensus at a faster rate than teams reporting low levels of early information sharing.
When considering the emergence of psychological safety, we would be remiss if we did not question whether the average level of team psychological safety remains stable (i.e., has minimal changes in the slope) over time. Given the lack of research on consensus and levels of psychological safety over time, the expected trajectory is unclear. When considering theoretical descriptions of psychological safety as well as assumptions made in existing research, it may seem that throughout the duration of a team’s lifecycle, the level will increase and remain stable once established. Previous theories often implicitly assume that levels of constructs are stable across time (Braun et al., 2020). Further, as team members consistently interact with one another over time, they are better equipped to develop and maintain strong—and positive—emergent states such as psychological safety (Marks et al., 2001). Psychological safety also implies a sense of comfort or ease of interaction between team members, one that would likely grow stronger because of increased contact. However, insight from the punctuated equilibrium model (PEM) of teams suggests that this may not be the case (e.g., Gersick, 1988, 1989; Sabherwal et al., 2001).
The PEM does suggest that levels of team processes remain consistent for long periods (Gersick, 1988, 1989). Any change that occurs is expected to be trivial because individuals resist change after processes and behaviors have become institutionalized and expected. Yet, the PEM further suggests that teams experience a short but impactful period of change. Teams experience two distinct phases, with each phase consisting of relatively stable periods, but experience a high degree of change at the mid-point between these two phases. The change occurring during the mid-point transition sets the tone for the direction of phase two which, once established, remains fairly static.
To assist teams through the mid-point such that the resulting change in tone is positive, scholars suggest teams utilize various resources (e.g., team training, Woolley, 1998). If handled effectively, and teams realize that the consequences of speaking up are positive, the average level of psychological safety may increase after the mid-point. On the other hand, teams that do not have a smooth mid-point transition may have negative experiences, including a decline in the average level of psychological safety. The PEM suggests that after the mid-point transition, teams become more focused on their goals and tasks because they are rapidly approaching deadlines and nearing the end of the team’s performance cycle. Because teams become more task-focused, it is plausible that team factors facilitating high levels of psychological safety (e.g., information sharing, which creates opportunities to address confusion and reduce uncertainty about aspects of the task) are no longer a priority.
As another consequence of an increased focus on deliverables, team members may have less time to interact with one another and form conclusions about the current psychological safety climate. This is in accordance with research on team cohesion, another emergent state, which suggests that a decline in the average level of cohesion may occur later in the team’s lifecycle if team members spend less time together and feel less included (Jones et al., 2018). Less interaction, and a related transition to an enhanced focus on elements of the task, may similarly decrease team levels of psychological safety. There is some preliminary evidence to support this notion, with the average level decreasing over time, although for the construct of collective efficacy, another emergent state (McLarnon et al., 2021). Due to conflicting suggestions from salient teams theory and a lack of strong empirical evidence suggesting whether the average psychological safety score remains stable or exhibits an increase or decrease, we pose the following research question:
Research Question 1: Does the average level of psychological safety increase or decrease in teams over the course of their lifecycle?
Method
Sample, Study Design, and Procedure
Teams were student product development teams from a private research university in the southern United States that worked together over the course of roughly 36 weeks. Individuals were randomly assigned to teams by the instructors. Participants were recruited via classroom and email solicitation at the beginning of their team’s lifecycle and received a gift card in exchange for their participation, which was voluntary. The beginning of the lifecycles of the teams coincided with the beginning of a school semester. Data were collected at three time points in the first semester (16 weeks), spaced roughly four weeks apart, and at three time points in the second semester, also spaced roughly four weeks apart. Participants consistently engaged and interacted frequently with their teams across the entire period. For instance, teams were expected to meet several times a week given the course structure and schedule, and they were expected to meet with their clients on a regular basis. There were also five major oral presentations, further necessitating consistent and frequent interaction.
Teams followed a cycle-based approach to product development, including defining the problem, developing a solution, implementing and testing, and finalizing and presenting the solution. Movement through the cycles, as well as sub-goal deadlines, created various natural points of interaction. Teams operated autonomously and were randomly assigned a client that was experiencing a challenge requiring a product to be developed. Clients spanned multiple industries; example clients included a leading oil and gas producer, a non-profit corporation, and a large hospital. For example, one assigned problem entailed prototyping a lower cost version of a medical device (i.e., a centrifuge). This constitutes a highly interdependent task as the outcome is influenced by each team member’s input.
Over their lifecycle, teams were responsible for working with the client to identify and address their needs through developing, building, and analyzing a product prototype. Teams consisted of engineering majors spanning various disciplines (e.g., bioengineering, mechanical engineering) and over the course of nine months designed, constructed, and documented a prototype system to meet specifications determined by their client. During this time, teams met with their client as necessary to ensure progress was being made. This product development task was part of a semester-long course that is focused on providing students with experience prototyping and deploying solutions to real-world engineering challenges. At the end of the nine-month period, teams were required to submit final documentation of their work and present their prototype to their client and course instructor. Teams were also invited to present their work at an engineering design showcase where they had the opportunity to win awards and prize money.
We collected data from 143 individuals (working in 26 teams) over the course of nine months, collecting six rounds of data. Approximately one month passed between each survey distribution, and the first survey was distributed the first week of the semester. Of 146 potential participants, a total of 139 participants nested within 26 teams responded to the survey at Time One (a response rate of 95.2%). At Time Two, 133 participants completed the survey (91.1% response rate); at Time Three, 130 participants completed the survey (89.0% response rate). At Time Four, 109 participants completed the survey (74.7% response rate); at Time Five, 94 (64.4% response rate); at Time Six, 97 (66.4% response rate). The average response rate across timepoints was 80%, and we treated missing data using listwise deletion at each stage. Response rates per team varied from 31% to 100% across timepoints (M = 81%), and the number of respondents per team ranged from two to nine, with a mean of 4.49 (SD = 1.55). Although there was not a 100% response rate from each participating team, at the final timepoint (T6), the lowest response rate consisted of 25% of members (a single team); 17 teams had team response rates over 60%. The within-team response rate range is consistent with typical within-team response rates for team-based studies (Maynard et al., 2021). All teams except for one had six data points of psychological safety data. We did not find significant differences between teams in our sample that had a 100% within-team response rate and those that did not in regard to psychological safety data; thus, there was no evidence for response bias. We also did not find a relationship between previous levels of psychological safety and subsequent dropout rates. Further, to reduce response bias, we kept the survey time period short and used previously validated measures.
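The response rates reported above can be reproduced from the respondent counts. The short sketch below does this arithmetic, assuming every rate uses the 146 potential participants as the denominator; the variable and function names are illustrative. The final attrition figure (change in respondents from Time One to Time Six) is one plausible way to compute attrition and is an assumption, not a calculation described in the text.

```python
POTENTIAL_PARTICIPANTS = 146  # eligible participants reported in the text
RESPONDENTS = {"T1": 139, "T2": 133, "T3": 130, "T4": 109, "T5": 94, "T6": 97}


def response_rates(respondents, eligible):
    """Return the response rate in percent (one decimal) for each time point."""
    return {wave: round(100 * n / eligible, 1) for wave, n in respondents.items()}


rates = response_rates(RESPONDENTS, POTENTIAL_PARTICIPANTS)
mean_rate = sum(rates.values()) / len(rates)

# Assumption: attrition taken as the relative drop in respondents from T1 to T6.
attrition = 100 * (RESPONDENTS["T1"] - RESPONDENTS["T6"]) / RESPONDENTS["T1"]

if __name__ == "__main__":
    for wave, rate in rates.items():
        print(f"{wave}: {rate:.1f}%")
    print(f"Mean response rate: {mean_rate:.0f}%")
    print(f"Attrition from T1 to T6: {attrition:.0f}%")
```

Under these assumptions the per-wave rates match the reported values, and the mean rounds to 80%.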
On average, at the final timepoint, each team had 5.6 members (SD = 1.19) with 28.08% of the sample identifying as female (with four participants not reporting their gender) and an average age of 21 (SD = 0.62).
Given the number of teams, members in each team, timepoints, and attrition over time (30%, which is consistent with expected levels of attrition for web-based surveys; Hoerger, 2010), the total number of observations was relatively large (N = 690). However, we note that the team-level sample size is limited at 26 teams. When planning for studies based on the consensus emergence model, power is determined mostly by the expected effect size and the overall number of observations (Lang et al., 2021) because the focal point of analyses is the change in observations over time within teams. We utilized the power simulation presented in Lang et al. (2021) and deemed our sample size sufficient with regard to power (power = .95) considering an anticipated moderate effect size (effect size of −.15) as well as the total number of teams (N = 26), members in each team (N = 6), and number of timepoints (N = 6).
The context and participant sample were uniquely chosen to answer our research questions because the makeup of the teams allowed for (a) longitudinal data collection and (b) accurate investigation of consensus emergence (i.e., teams were formed when we began the study, enabling us to assess emergence over time). First, participants provided demographic information as well as baseline levels of team psychological safety and information sharing perceptions. Then, following periods of recurrent interaction, we collected team psychological safety data five additional times. All surveys were completed through an online survey database, and participants were repeatedly informed that their answers would remain confidential and that they could exit the study at any point without penalty.
Measures
Information Sharing
Bunderson and Sutcliffe’s (2002) three-item information sharing measure was used. Participants rated items on a scale ranging from 1 (very strongly disagree) to 7 (very strongly agree). A sample item is “Information used to make key decisions was freely shared among the members of the team.” The referent “the team” was used, consistent with how this variable has been measured in the past (Bunderson & Sutcliffe, 2002), as well as Kozlowski and Klein’s (2000) recommendation to use a referent-shift approach (Chan, 1998) when evaluating a team-level variable that arises in a compositional fashion and represents a shared unit property. Internal consistency was acceptable (α = .80). Aggregation within each team was also justified given adequate agreement (average r*wg = .64; median r*wg = .76).
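For readers unfamiliar with the agreement index, the following is a minimal sketch of a multi-item r*wg computation. It assumes the common formulation in which agreement equals one minus the ratio of the mean observed item variance to the variance expected under a uniform (no agreement) null distribution. The ratings are hypothetical and the function name is illustrative; the text does not report which computational variant or software routine was used.

```python
from statistics import mean, variance

def r_star_wg(ratings, n_options=7):
    """Multi-item r*wg for one team.

    ratings: one list per team member, each holding that member's item scores.
    n_options: number of response options (7 for the 1-7 scale in the text).
    """
    # Variance of a discrete uniform distribution over n_options categories
    # (4.0 for a 7-point scale); this null distribution is an assumption.
    expected_var = (n_options ** 2 - 1) / 12
    n_items = len(ratings[0])
    # Sample variance across team members, computed separately for each item.
    item_vars = [variance([member[i] for member in ratings]) for i in range(n_items)]
    return 1 - mean(item_vars) / expected_var

# Hypothetical team of four members rating the three information sharing items.
team = [
    [6, 5, 6],
    [5, 5, 6],
    [6, 6, 7],
    [5, 6, 6],
]
print(round(r_star_wg(team), 2))
```

Values near 1 indicate that members rate the team almost identically, whereas values near 0 indicate ratings as dispersed as random responding would produce.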
Psychological Safety
Team psychological safety was assessed using Edmondson’s (1999) seven-item scale of psychological safety and was distributed to team members six times throughout the study. Participants rated items on a scale ranging from 1 (very inaccurate) to 7 (very accurate). A sample item is “it is safe to take a risk on this team.” Similar to information sharing, the referent “this team” was used to evaluate this construct, in accordance with Edmondson’s (1999) approach to measurement and Kozlowski and Klein’s (2000) suggestions for evaluating a compositional climate variable that represents a shared unit property (i.e., referent-shift approach, Chan, 1998). This scale demonstrated adequate reliability across timepoints (Time 0: α = .65; Time 1: α = .81; Time 2: α = .73; Time 3: α = .78; Time 4: α = .78; Time 5: α = .79).
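The reliability coefficients above are Cronbach's alpha values. As a minimal sketch of how such a coefficient is computed, assuming the standard formula based on item variances and the variance of the summed scale, consider the example below. The responses are hypothetical (three items rather than seven, for brevity), and the function name is illustrative.

```python
from statistics import variance

def cronbach_alpha(responses):
    """Cronbach's alpha; responses holds one list of item scores per respondent."""
    n_items = len(responses[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([r[i] for r in responses]) for i in range(n_items)]
    # Sample variance of the summed scale score.
    total_var = variance([sum(r) for r in responses])
    return (n_items / (n_items - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical responses from five participants on a 1-7 scale.
responses = [
    [6, 5, 6],
    [4, 4, 5],
    [7, 6, 6],
    [3, 4, 4],
    [5, 5, 6],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 2))
```

In a repeated-measures design such as this one, the same function would be applied separately to the responses from each timepoint.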
Control Variables
We controlled for team size (team-level variable) as well as familiarity with team members at the beginning of the study period (individual-level variable). Given psychological safety is presumed to take some time to develop and emerges as a result of interpersonal exchanges (Edmondson & Lei, 2014), an individual’s level of familiarity with other team members before the start of the study could affect their perceptions of this construct as well as similarity in perceptions within teams. Team size may also affect similarity in perceptions given larger teams may have a more challenging time reaching consensus on motivational states; further, larger teams are prone to experiencing variability in communication patterns (e.g., Bales et al., 1951) potentially making it more challenging for psychological safety to emerge.
Analyses
To test our hypotheses and research question, we utilized the CEM (Lang et al., 2018). The CEM is a statistical technique that examines how consensus forms within teams over time; this model is relevant for studying constructs where the degree of agreement or shared perceptions within a team is important. The emergence process occurs through increased similarity among individual team member perceptions of team-relevant constructs. The CEM analysis examines the rate at which individual team members reach consensus on their perceptions. To examine this process, the CEM systematically models change in residual variances over time to offer insights into the development of shared perceptions; if the variance in individual scores decreases over time (i.e., individual perceptions of team phenomena become more similar), consensus is said to take place. Conceptually, consensus emergence represents a team-level phenomenon; methodologically, however, CEM analyses focus on the change in individual responses (i.e., observations) within teams over time. Thus, if a relatively large number of observations are collected, results are informative even with a relatively small number of teams. CEM is therefore a variant of a three-level model (i.e., observations nested within individuals, nested within teams; see Lang et al., 2018 for more detail). To conduct our analyses, we employed methods for conducting CEM outlined by Lang et al. (2018) using R (R Core Team, 2021).
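The logic of modeling change in residual variance can be illustrated with a deliberately simplified sketch. It is not the three-level mixed model fitted in R; it only shows descriptively how shrinking within-team variance signals consensus. The sketch assumes that the residual standard deviation changes exponentially with time, so that the log of the within-team variance is linear in time with slope 2δ. All data, names, and the value of TRUE_DELTA are hypothetical.

```python
import math
from statistics import mean

def within_team_variance(scores_by_team):
    """Pooled within-team variance: mean squared deviation from each team's own mean."""
    sq_devs = []
    for scores in scores_by_team:
        team_mean = mean(scores)
        sq_devs.extend((s - team_mean) ** 2 for s in scores)
    return mean(sq_devs)

def ols_slope(xs, ys):
    """Slope of a simple least-squares line of ys on xs."""
    x_bar, y_bar = mean(xs), mean(ys)
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    den = sum((x - x_bar) ** 2 for x in xs)
    return num / den

# Hypothetical toy data: three teams of five members, six waves coded 0..5.
TRUE_DELTA = -0.13                        # illustrative rate of consensus emergence
team_means = [5.6, 5.2, 4.9]              # illustrative team-level perceptions
base_devs = [-0.8, -0.3, 0.0, 0.4, 0.7]   # member deviations from the team mean at wave 0
times = list(range(6))

log_var = []
for t in times:
    shrink = math.exp(TRUE_DELTA * t)     # deviations shrink as consensus forms
    teams = [[m + d * shrink for d in base_devs] for m in team_means]
    log_var.append(math.log(within_team_variance(teams)))

# Variance scales with exp(2 * delta * t), so delta is half the log-variance slope.
delta_hat = ols_slope(times, log_var) / 2
print(round(delta_hat, 3))
```

A negative estimate indicates that members' perceptions converge on their team's level over time; a full analysis would additionally separate team-level and individual-level random effects, as the three-level structure described above requires.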
Results
Descriptive statistics are presented in Table 1, and Figure 1 provides a sample summary of the data set. Tables 2 and 3 present results from the main CEM analyses; these results provide a test of Hypothesis 1 as well as RQ1. The basic three-level CEM model consists of two parts (two models). The first model provides a test of whether perceptions of psychological safety change over time within teams, and the second model provides a test of whether variance among individual scores decreases or increases over time (a decrease in variance over time would indicate that consensus emergence does in fact occur). Model 1 results revealed that over time, the average perception of psychological safety decreased significantly (γ100 = −.19, p < .001). These findings suggest that perceptions of psychological safety within teams decline over time, providing us with an answer to RQ1. Model 2 results provided evidence that individual perceptions of psychological safety became more similar over time, indicating that consensus does in fact emerge. In particular, the average individual perception of team psychological safety moved closer to the latent measure of their teams, δ1 = −.13, χ²(1) = 34.02, p < .001, and the residual variance decreased from .39 to .11 over the course of the study. Furthermore, we compared the fit between the two models (Table 2) using a −2 log-likelihood test; these results indicate that the model allowing for residual error variance change (Model 2) fit better than the model that assumes equal variance (Model 1). Thus, Hypothesis 1 is supported.
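The reported decline in residual variance is consistent with the estimated δ1 under one plausible reading of the model. The sketch below assumes an exponential variance function in which the residual standard deviation is multiplied by exp(δ1 × time), and it assumes the six waves are coded 0 through 5; neither assumption is stated explicitly in the text, and the names are illustrative.

```python
import math

DELTA_1 = -0.13          # change-in-variance parameter reported in the text
INITIAL_VARIANCE = 0.39  # residual variance at the first measurement, as reported
FINAL_TIME = 5           # six waves coded 0..5 (coding assumed)

def residual_variance(t, initial=INITIAL_VARIANCE, delta=DELTA_1):
    # Assumed exponential variance function: the residual SD scales by
    # exp(delta * t), so the variance scales by exp(2 * delta * t).
    return initial * math.exp(2 * delta * t)

for t in range(FINAL_TIME + 1):
    print(t, round(residual_variance(t), 2))
```

Under these assumptions the implied variance at the last wave rounds to the .11 given in the text, which suggests that the two reported figures and δ1 describe the same trajectory.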
Table 1. Means, SDs, and Intercorrelations Among Study Variables.
Note. PS = psychological safety. N = 91–146. Nteams = 26.
*p < .05. **p < .01.

Figure 1. Psychological safety perceptions by time, within team.

Table 2. Basic Modeling Steps in the Analyses of Consensus Emergence.
Note. N = 690 observations nested in 141 team members and 26 teams. AIC = Akaike’s information criterion; BIC = Bayesian information criterion.
***p < .001.

Table 3. Parameter Estimates for Model 2 in the Analyses of Consensus Emergence.
Note. N = 690 observations nested in 141 team members and 26 teams.
***p < .001.
To test Hypothesis 2, we examined whether engaging in information sharing in the beginning of the team’s lifecycle (modeled as the team-level rating of information sharing at Time 0, a team-level predictor) was related to the rate at which teams reached consensus emergence in psychological safety. In particular, we tested whether team information sharing at Time 0 was related to changes in the residual variance; results are reported in Tables 4 and 5. Model contrast results for three different models are provided in Table 4, presenting a formal test of whether team information sharing (T0) predicted consensus emergence.
Table 4. Explanatory Modeling Steps: Test of Early Information Sharing as a Predictor of the Rate of Consensus Emergence.
Note. N = 690 observations nested in 141 team members and 26 teams. AIC = Akaike’s information criterion; BIC = Bayesian information criterion.
*p < .05.

Table 5. Parameter Estimates for Model 5 in the Best-Fitting Models from the Predictor Analysis.
Note. N = 690 observations nested in 141 team members and 26 teams.
*p < .05. **p < .01. ***p < .001.
We begin by testing Model 3, which includes an interaction between time and team information sharing (T0) on mean differences of psychological safety, as well as time as a predictor in the variance portion of the model. In Model 4, we add team information sharing (T0) as an additional predictor of within-team variance, and in Model 5 we add the interaction effect between time and team information sharing (T0) to the variance portion of the model. The formal test of Hypothesis 2 is the model contrast between Models 4 and 5, which supports our hypothesis (χ²(1) = 4.15, p < .05). The model fit improves when accounting for the interaction term as a predictor of within-team variance. In particular, at Time 0 (first measurement, corresponding to the initial phase of the teams’ lifecycle), residual variances were different between teams engaging in high levels of early information sharing (1 SD above the mean) and low levels of early information sharing (1 SD below the mean; σ² = 4.87 and σ² = 14.05, respectively). Over time, however, the difference in the residual variance between these groups subsided: at the final timepoint in our study (T5), the residuals were similar (σ² = .005 for teams with low levels of early information sharing and σ² = .001 for teams with high levels of early information sharing). The residual variances for both groups over time are displayed in Figure 2, further demonstrating these results. Thus, Hypothesis 2 is supported.
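The model contrasts reported for both hypotheses are likelihood ratio tests with one degree of freedom. As a minimal sketch, the corresponding p-values can be recovered from the reported test statistics using only the standard library. The function names and the log-likelihood values in the last line are hypothetical, since the text reports the test statistics rather than the log-likelihoods themselves.

```python
import math

def chi2_df1_p_value(statistic):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    # With 1 df the statistic is a squared standard normal variable,
    # so P(X > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(statistic / 2))

def likelihood_ratio_statistic(loglik_restricted, loglik_full):
    """-2 log-likelihood difference between two nested models."""
    return -2 * (loglik_restricted - loglik_full)

p_h1 = chi2_df1_p_value(34.02)  # Model 1 vs. Model 2 (Hypothesis 1)
p_h2 = chi2_df1_p_value(4.15)   # Model 4 vs. Model 5 (Hypothesis 2)
print(f"H1: p = {p_h1:.2e}; H2: p = {p_h2:.3f}")

# Hypothetical log-likelihoods that would yield the Hypothesis 2 statistic.
print(round(likelihood_ratio_statistic(-500.0, -497.925), 2))
```

The closed form applies only because each contrast adds a single parameter; contrasts with more degrees of freedom would require the general chi-square distribution.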

Figure 2. Residual variance over time.
Discussion
This research, leveraging climate emergence theories (Kozlowski et al., 2013; Marks et al., 2001) and the MGPF (Lang & Bliese, 2019; Lang et al., 2019), contributes to our understanding of emergent state development within teams. There were several key findings that provide insight into psychological safety emergence. In line with climate and emergent state theory (Kozlowski et al., 2013), first, we found that the team-level psychological safety climate emerges through consensus of individual perceptions of psychological safety over time. Second, the greater the degree of early information sharing, the quicker individuals reach consensus on psychological safety climate. Finally, we found that the average level of psychological safety is not stable once consensus has been reached: it begins at a relatively high level and declines over time (see Table 1 and Figure 1). In uncovering this insight, we respond to the call of researchers to study emergent states in a more precise way, conducting a test of the consensus emergence process (Luria, 2019), providing credence to the importance of conceptualizing emergent state formation as a bottom-up process that begins with individual inputs (Kozlowski & Klein, 2000), and indicating how team characteristics influence the consensus emergence process (Lang & Bliese, 2019).
Theoretical Contributions
Our primary contribution was to build and test a longitudinal model of the team psychological safety development process, incrementally contributing to climate emergence theory (Kozlowski et al., 2013). Team researchers often discuss the importance of time but note that, historically, most teams research has not had a longitudinal focus (Mathieu et al., 2014). In line with such claims, our results show that it takes time for individuals in newly formed teams to reach consensus on the degree of psychological safety present in the team. Thus, when studying emergent states in newly formed teams, it is important to account for this emergence process and formally test whether consensus has been achieved by using appropriate methods (see Lang et al., 2019) rather than relying on agreement indices. Our findings also highlight how studying the emergence process itself may yield additional insight regarding how emergent states function. For instance, we found that the emergence of team-level psychological safety through consensus on individual-level perceptions can be accelerated by information sharing.
Further contributing to climate emergence theory and work on emergent states, we found that teams that share more information from the beginning come to consensus on psychological safety more quickly. We suggest that this effect occurs because, through the act of sharing information, there are more opportunities to provide clarity about aspects of the team and task; this aligns with climate emergence theory (Kozlowski et al., 2013). This clarity and information allow team members to make similar cognitive appraisals of the work environment, leading them to form impressions of team psychological climates like psychological safety. Of course, these suggested mechanisms are speculative and more research is needed to determine the precise mechanisms through which early information sharing fosters psychological safety. But our findings indicate that team behavior does play different roles at various times in a team’s lifecycle: information sharing appears to be especially important early in the life of a team as it allows individuals to more quickly come to consensus on psychological safety.
In addition to providing evidence of the climate emergence process and what accelerates it, our results also suggest that the average level of psychological safety, once consensus is established, is not stationary and declines over time. This emphasizes the importance of considering where the team is in their lifecycle when measuring psychological safety. For instance, we may have concluded the teams in our sample had a moderate level of psychological safety if we had only collected data at the final timepoint. Instead, examining the trajectory of psychological safety across timepoints, we observed a more complicated pattern of psychological safety levels over time. We found that average team psychological safety climate levels emerged at a fairly high level (M = 5.63, SD = 0.68) before declining over time, with the lowest levels reported towards the end of a team’s lifecycle (M = 4.87, SD = 0.52).
Fluctuations in average levels of psychological safety may also occur due to various demands of the task or characteristics and behaviors of the team (e.g., Schulte et al., 2012). The teams in our sample had a set endpoint, which is one possible explanation for this decline; team member behavior is often influenced by deadlines (Gersick, 1988, 1989). In accordance with the PEM, team members may have become less invested in the team towards the end, with less energy and resources geared towards supporting team functioning or psychological safety. In teams with no set endpoint, it may be that the average level of psychological safety demonstrates a more stable pattern, with fewer fluctuations over time. Alternatively, this decline may have occurred because of other challenges associated with finalizing taskwork at the end of the project, which required teams to finalize their prototypes. These end stages of ensuring the prototype met client needs may have created circumstances (e.g., confusion regarding task demands and associated feelings of uncertainty; relationship conflict, Martins et al., 2013) that would naturally erode psychological safety.
Our results indicating a decline in the average level of psychological safety over time also align with recent work conducted by Bransby et al. (2024). They similarly found that psychological safety was higher for newcomers in an organization and subsequently decreased as these individuals became more tenured, “revealing a protracted period where individuals were vulnerable to becoming less willing to take interpersonal risks” (p. 21). However, it is important to note that this trend was found at the individual level for newcomers, while their department-level analysis focused on a sample with varying tenure lengths. Our sample is more similar to the individual-level sample composed of newcomers, albeit at a different level of analysis, as it was composed only of newly formed teams. Thus, it may be that new teams and newcomers begin with higher levels of psychological safety that subsequently decline.
Bransby et al. (2024) speculated that this decline could arise from “reality shocks” (Van Maanen & Schein, 1977); these shocks occur once individuals have developed a more accurate and deeper understanding of the requirements of their new role. Related to “reality shocks” is the idea of the honeymoon hangover effect (Boswell et al., 2005); this effect refers to an initial positive shift in the average level of job satisfaction at the beginning of a new job (i.e., the honeymoon effect), followed by a decline (i.e., the hangover effect). Boswell et al. (2005) note that organizations tend to emphasize favorable information about jobs when individuals are at the recruitment and entry stages of hire (Van Maanen, 1975). Similarly, individuals beginning a new position are likely to focus on more positive aspects of the new role, overlooking less positive elements (Ashforth, 2001). But as individuals gain familiarity with their new position and its requirements, they become more aware of negative aspects of the job and subsequently experience a decline in job satisfaction.
At the beginning of a team working together, individuals may be motivated to present themselves in the most favorable way, make a good impression, and focus on more positive elements of their new role, laying the groundwork for initial positive perceptions of psychological safety. However, over time, individuals are likely to become more familiar with fellow team members and the task, becoming aware of negative aspects and growing less certain about engaging in interpersonal risk-taking behavior; this may lead to a decrease in the average level of team psychological safety. The explanation for the decrease in psychological safety in teams over time poses an interesting topic for future research.
Practical Contributions
Our findings have two notable takeaways for organizations with self-managed teams. First, teams that share more information from the beginning are likely to agree more quickly that the team has a high average level of psychological safety. Sharing a perception that psychological safety is high should lead to teams performing at a higher level more quickly, given the strong relationship between psychological safety and performance (Edmondson, 1999; Edmondson & Lei, 2014). As such, organizations should ensure that teams are encouraged to share information from the beginning. Some examples of encouraging information sharing include requiring teams to meet on a regular basis such that key pieces of information can be shared and more informal exchanges can occur; setting up multiple communication platforms; and providing some guidance regarding communication norms (e.g., encouraging team members in more leader-like roles to share weekly email updates).
Second, the average level of psychological safety may decline over time for some teams. This may be especially likely for teams with a defined endpoint or teams that are nearing the end of a certain project. Consequently, organizations should not simply assume that once a high average level of psychological safety is established it will not falter. One way to account for the dynamic nature of psychological safety level is to regularly measure this construct, given its noted importance, such that organizations can intervene if necessary. Organizations should also continue to encourage behaviors that promote psychological safety even after it has been clearly established and the average level appears high.
Limitations and Future Research Directions
Conclusions drawn from this work should account for the limitations of this study. Given that team and task characteristics can affect relationships among team constructs (e.g., interdependence, LePine et al., 2008), we encourage scholars to examine whether teams with different characteristics demonstrate dynamic patterns of psychological safety in terms of both consensus and average level. Moreover, given that the sample consisted of self-managed student teams, we cannot be certain findings will generalize to working teams with a more formalized hierarchy (e.g., teams with leaders who engage in behaviors such as feedback-sharing may not experience a decline in psychological safety levels, Coutifaris & Grant, 2021). Conversely, work teams are also subject to membership change (i.e., individuals leave and/or enter the team), which can be disruptive to team processes, emergent states, and outcomes (Trainer et al., 2020; van der Vegt et al., 2010). The sample in our study experienced no such changes so we were unable to examine their potential impact; however, it is likely that membership changes may lead to substantial fluctuations in psychological safety and other emergent states, given the departure or introduction of knowledge, skills, and abilities associated with such an event, a possibility future work should explore. Future research should further build on our findings and test whether declines in psychological safety levels occur in other samples, as well as explore mechanisms underlying the decline and potential mitigators.
We also note that while the current study demonstrates sufficient power required to detect consensus emergence (Lang et al., 2021), the sample size at the group level may raise questions regarding the generalizability of our findings not rooted in this model (e.g., the decline of average levels of team psychological safety). Conducting longitudinal work on teams is often associated with the limitation of a small group-level sample size (e.g., Larson et al., 2019; Leon & Venables, 2015), as was the case in this study. Another limitation is the attrition evidenced by our sample over time, which reduced our sample size. Participants may have dropped out as the demands of the semester gradually increased or due to other elements such as individual differences (e.g., less conscientious, more extraverted, and less neurotic individuals are more likely to drop out of surveys, Ward et al., 2017). However, we echo the sentiment of others (Park et al., 2020; Rapp et al., 2021) and note that this type of research advances the field in notable ways. We further note that the teams in our sample, although students, had a vested interest in the outcome of their tasks and performed in a more realistic setting than a traditional laboratory setting. As such, we argue that our sample has value in increasing our understanding of workplace phenomena (Mathieu, 2024).
Our findings also suggest that early information sharing contributes to the consensus or the emergence of shared perceptions of psychological safety across team members. We note that a limitation of our work is that we did not assess the behavioral process in question (i.e., information sharing) across all time points. While this design aligns with the project’s proposed theory and research purpose, we recognize that it limited our ability to examine more exploratory questions. For instance, it could be that team behavioral processes and emergent states also hold a reciprocal relationship with one another (similar to other important team constructs; Mathieu et al., 2015). Thus, we urge future scholars to continue examining team processes and emergent states using longitudinal designs such that we gain more insight into the dynamic nature of teamwork.
Conclusion
Research has established that psychological safety is a critical driver of team effectiveness (Frazier et al., 2017) yet has failed to fully examine the effects of time on this construct. In response to this limitation, researchers have called for longitudinal studies to further understanding regarding how this climate forms at the team-level (Edmondson & Bransby, 2022; Rapp et al., 2021). Given that it can be a challenge for teams to form a shared perception of psychological safety, such longitudinal research has both important theoretical and practical implications. Anecdotally, the challenge of establishing team psychological safety was emphasized by an employee of Google following the company’s push to create psychologically safe teams, as indicated in the opening quote. By analyzing consensus on and average levels of team psychological safety over time, we respond to the calls of researchers to explore the effects of time and the challenges practitioners face in understanding and supporting psychological safety development. It is our goal that this investigation will act as a catalyst for additional research on the formation, variation, and sustainment of team emergent states.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported in part by research grants from the Ann and John Doerr Institute for New Leaders at Rice University.
