Abstract
Failures of listening to individuals raising concerns are often implicated in safety incidents. To better understand this and theorize the communicative processes by which safety voice averts harm, we undertook a conceptual review of “safety listening” in organizations: responses to any voice that calls for action to prevent harm. Synthesizing research from disparate fields, we found 36 terms/definitions describing safety listening which typically framed it in terms of listeners’ motivations. These motivational accounts, we propose, are a by-product of the self-report methods used to study listening (e.g., surveys, interviews), which focus on listening perceptions rather than actual responses following speaking-up. In contrast, we define safety listening as a behavioral response to safety voice in organizational contexts to prevent harms. Influenced by cognitive, interactional, and environmental factors, safety listening may prevent incidents through enabling cooperative sensemaking processes for building shared awareness and understanding of risks and hazards.
Plain Language Summary
Numerous global accidents (e.g., Tenerife air disaster), disasters (e.g., Challenger space-shuttle crash), and scandals (e.g., Enron’s accounting scandal) stem from a shared cause: listeners failing to act upon legitimate voiced concerns. While research mainly centers on understanding and encouraging individuals to raise concerns, fewer studies explore listeners’ responses. In this review, we advocate developing the concept of ‘safety listening’ (listeners’ responses to voice aiming to prevent major harms). Our conceptual review positions safety listening as the necessary counterpart to safety voice; differentiates safety listening from other listening types (e.g., active listening); examines its terms/definitions, explanations, and measurements; and explores its influencers (e.g., listener motives) and impacts (e.g., injuries). Scholars have used 36 unique terms/definitions to describe safety listening (e.g., whistleblower retaliation) and often link it to listeners’ motivations (i.e., they listened because they wanted to). This motivational focus, we argue, is a by-product of using self-report methods (e.g., surveys, interviews) that elicit perceptions rather than observing the actual behavioral responses to voice. To advance the literature, we propose using a standard definition, observing safety listening in real-life data (e.g., 9-1-1 calls, airline transcripts), and exploring mechanisms for how voice and listening cause outcomes (e.g., accidents). In sum, this review creates the foundation for future research to develop a comprehensive and cumulative understanding of safety listening and will ultimately contribute to preventing future accidents and scandals.
Theory on preventing institutional failures (e.g., accidents, scandals) has long emphasized the importance of stakeholders raising concerns to address and avert harm (Westrum, 2014). Accordingly, various research streams on speaking-up have emerged (Jones et al., 2021), which have primarily focused on safety voice (Noort et al., 2019) but have also incorporated work on ethical voice (Chen & Treviño, 2023) and whistleblowing (Blenkinsopp et al., 2019). The unifying theme across such research is that individuals voice concerns to elicit action to address serious breakdowns in organizations and prevent harm (for parsimony, and due to a common focus on avoiding harm, we refer to such phenomena as “safety voice” hereafter).
While safety voice acts are necessary to stop accidents and ensure safety, they are also insufficient: incident analyses often highlight how voice can go unheard before and during major institutional failures (Hald et al., 2020). Indeed, as illustrated in Table 1, ineffective listening is identified as a contributory factor in multiple failures, including the Tenerife air disaster (583 killed; Weick, 1990), the Bhopal disaster (500,000 exposed to toxic gas; Taylor, 2014), and the U.S. Gymnastics scandal (300 sexually assaulted; Kirby, 2018). Such cases illustrate that safety voice is only effective when followed by “safety listening.”
Table 1. Example Organizational Disasters Preceded or Prolonged by Listening Failures
Understanding why organizations’ members fail to listen to legitimate and consequential concerns before avoidable failures is acknowledged as a critical knowledge gap in the organizational psychology, voice, whistleblowing, and healthcare literatures (Jones & Kelly, 2014; Vandekerckhove et al., 2014). We address this in our review, and, through its undertaking, develop a literature-based conceptualization of safety listening.
Background
Effective communication within organizations is crucial for ensuring that safety-related information is shared and acted upon (Westrum, 2014), with many models and theories focusing on linear message transmission from voicers to listeners. For instance, in their seminal communication model, Shannon and Weaver (1949) conceptualize dyadic communication in terms of voicers encoding and sending messages through specific channels (e.g., verbal); these messages may be negatively impacted by noise (e.g., static, disruptions) and are decoded by listeners. Likewise, at the organizational level, Westrum's (2004) information flow theory posits that organizations with freely flowing risk-related information across levels are more effective in addressing problems than organizations characterized by withholding, distorting, or siloing information.
Scholars have argued that such communication conceptualizations primarily focus on message transmission (i.e., voice) rather than reception (i.e., listening; Macnamara, 2018; Vandekerckhove et al., 2014). Accordingly, as communication is an ongoing, interactive, and iterative process where individuals interpret and develop the topic being communicated, scholars (e.g., Schramm, 1955) have extended the Shannon and Weaver model to include feedback from receivers to senders. For example, one feedback type may be “third turn repairs” where listeners try to confirm understandings and/or correct misunderstandings between interlocutors (Schegloff, 1992).
Like the wider communication literature, research on safety in teams and organizations has primarily explored how safety-related information (e.g., about hazards) is shared, rather than how it is listened to. Notably, safety research has focused on safety voice, which is generally conceptualized in terms of discretionary, typically upward communication where individuals raise significant concerns about potential harms to those who can intervene or escalate the problem. Safety voice studies have typically examined speaking-up with safety-related information (Noort et al., 2019), or incorporated related concepts like employee voice (Morrison, 2014); whistleblowing (Near & Miceli, 1985); voice in action teams (Krenz & Burtscher, 2021); and ethical voice (Chen & Treviño, 2023).
As safety voice is considered essential for ensuring information flow about hazards, risks, and potential safety improvements, investigations have generally focused on speaking-up's antecedents (Chamberlin et al., 2017). Common findings are that individuals often engage in voice acts depending on their attitudes and skills for managing safety (Salas et al., 2020) and psychological safety where they believe speaking-up will lead to change (Morrison, 2014) and will not result in negative consequences (Edmondson, 2018). However, while safety voice is an increasingly well understood phenomenon (Bazzoli & Curcuruto, 2021), the mechanism by which it prevents accidents and harms—namely, safety listening—is less examined (Vandekerckhove et al., 2014). This is a significant gap: it is essential to understand the factors that determine how and whether voice is heard, understood, and responded to (Barlow et al., 2019) to explain how safety voice can best achieve its goal of eliciting action. While research outside of safety has provided insight on how listening in organizations can be encouraged (e.g., by signaling aids; Itzchakov & Kluger, 2017), little research has focused on dynamic and high-stakes contexts (e.g., aviation emergencies), or how safety voice invokes actions for preventing accidents and avoiding harm (e.g., raising alarms, instituting changes).
In summary, while there has been extensive research on improving safety voice in organizations, there is no established safety listening counterpart. Although the workplace listening literature has focused on responses to voice acts that might be considered routine (e.g., listening to performance evaluation concerns), the nature of listening to voice where serious failures might be averted has not been studied. Indeed, the broader psychology literature does not indicate precisely how listening to safety voice should be conceptualized: for instance, researchers have diversely conceived listening as being how receivers perceive voicers, observable (e.g., nodding) and unobservable (e.g., comprehension) behaviors in response to voice, feelings about voice (e.g., exhaustion), intentions based on listening to voice, and voicers’ perception of being heard (Kluger & Itzchakov, 2022; Yip & Fisher, 2022).
We conducted a conceptual review of research investigating safety listening in organizations. Our goal was, through synthesizing research and observations on how safety voice is listened to and leads to outcomes, to conceptualize the nature, antecedents, indicators, and outcomes of safety listening. This involved describing what safety listening is (e.g., behaviorally), understanding what drives it (e.g., attitudes, culture), exploring how it should be measured (e.g., naturalistic data), and considering its relationship with outcomes (e.g., how it prevents accidents). Guided by the idea that safety voice is a call to action and a successful listening episode must result in some action (e.g., stopping take-off; Noort et al., 2021a), our initial conceptualization of safety listening was: listeners’ behavioral responses to safety voice acts to address potential harms in organizational contexts.
Our review's contribution is to develop foundational principles for the concept of safety listening, outline how it can be advanced theoretically and empirically, and identify how it can improve psychological research on topics like safety and ethics in organizations.
After outlining our search methods, we describe how researchers have conceptualized and investigated safety listening, including its distinctions from other listening types, its terms/definitions, its theoretical explanations, its measurements, and its pragmatics/antecedents/outcomes. To integrate research on safety listening into the wider psychology literature, we interpreted our findings using the Shannon and Weaver model, third turn repairs, and Westrum's information flow theory. Through critically evaluating the literature—for instance, in relation to fragmented terms/definitions, a focus on listeners’ motivations, and the overuse of self-report measures—we recommend avenues for future investigation.
Methods
We undertook a robust literature search to identify safety listening conceptualizations. Our conceptual review's purpose was to synthesize and reconcile fragmented terminologies, definitions, and measurements of safety listening to aid conceptual demarcation and theory development (Hulland, 2020). Accordingly, rather than performing an exhaustive literature search, we aimed to understand how scholars have conceptualized safety listening.
First, we performed a systematic literature review following Siddaway et al.'s (2019) recommendations. Using Scopus, Web of Science, and PsycINFO, we searched for publications that included the terms listening, concerns, failure, and organizations (or synonyms) within abstracts (Web of Science and PsycINFO) or within titles, abstracts, or keywords (Scopus; see Table 2 for the search strategy). We verified our search strategy by ensuring that it included five articles central to our research (i.e., Hald et al., 2020; Harlos, 2001; Jones & Kelly, 2014; Martin et al., 2021; Peirce et al., 1998). As listening is implicit to many literatures (e.g., teamwork, psychological safety), we included publications that explicitly theorized or investigated responses to high consequence concerns (see Table 3 for inclusion criteria). Alongside focusing on safety voice studies, we considered other potentially relevant research: for example, investigations on voice more generally, whistleblowing, speaking-up about problems in healthcare or dangerous workplace behavior (e.g., sexual harassment), and unethical conduct. AMP (research psychologist) conducted searches, removed duplicates, and screened results based on titles, abstracts, and full texts in March 2022 (see Figure 1 for the flowchart diagram). We conducted an interrater reliability assessment where a psychology PhD candidate screened 30 randomly selected publications, resulting in a Cohen's kappa of 0.86 (substantial strength of agreement). All authors regularly and collectively discussed borderline case inclusion and interpreted findings.
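The interrater agreement statistic reported above can be illustrated with a minimal Python sketch. It assumes two screeners each assign an include/exclude decision to the same 30 publications. The decision lists and the function name `cohens_kappa` are hypothetical and do not reproduce the review's actual screening data.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: sum over labels of the product of marginal proportions.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (p_o - p_e) / (1 - p_e)


# Hypothetical decisions for 30 publications (not the review's actual data).
first_screener = ["include"] * 10 + ["exclude"] * 20
second_screener = ["include"] * 9 + ["exclude"] * 21
kappa = cohens_kappa(first_screener, second_screener)
print(round(kappa, 3))  # 0.923
```

With these invented decisions the screeners disagree on one publication, and kappa corrects the raw agreement of 29/30 for the agreement expected by chance.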

Figure 1. Flow diagram of the systematic literature search on safety listening.
Table 2. Systematic Review Search Strategy
For Scopus, we manually applied these at the title screening stage for remaining databases.
We excluded unpublished work and did not apply date parameters to our search, as we were interested in the literature's existing conceptualizations.
As we were interested in how organizational psychology conceptualizes responses to high-consequence concerns in organizations, we initially focused on social science, business, and psychology publications and broadened this criterion in the hand search to include healthcare articles.
Table 3. Inclusion Criteria
Next, we conducted two manual searches in March–June 2022 and June–July 2023 because our initial search strategy excluded relevant papers due to terminology variations (e.g., “cockpit crew” instead of “team”). First, we examined publications citing and cited by those included in our systematic review and hand search. Second, we searched journals likely to contain relevant publications (e.g., International Journal of Listening). Finally, we investigated key authors’ Google Scholar pages and reviews in the following domains for safety listening research: safety voice (Noort et al., 2019), employee voice (Lainidi et al., 2023; Morrison, 2011, 2014), voice in action teams (Krenz & Burtscher, 2021), whistleblowing (Blenkinsopp et al., 2019; Mesmer-Magnus & Viswesvaran, 2005), ethical voice (Chen & Treviño, 2023), workplace listening (Kluger & Itzchakov, 2022; Yip & Fisher, 2022), safety culture (Bisbey et al., 2021), teamwork (Salas et al., 2020), and ethical leadership (Brown & Treviño, 2006). We recognize that publications post-July 2023 were not included, and others may have been missed despite our multiple search strategies.
Our review is contextualized against recent workplace listening reviews (Kluger et al., 2023; Kluger & Itzchakov, 2022; Yip & Fisher, 2022): it is narrower in topic (i.e., safety listening) yet broader in scope, because these reviews focused on the management, communication studies, and psychology literatures. AMP and a psychology MSc postgraduate research assistant assessed listening-related terminologies and definitions; explanations (i.e., theories, models, processes); methods and data sources; and pragmatics, antecedents, and outcomes for each publication.
Findings
We identified 57 articles, published between 1982 and 2023, focusing on listening to high consequence concerns in organizational contexts. Of these, 43 were empirical (comprising 46 studies), 11 were theoretical, and three were reviews. Table 4 synthesizes our findings into a set of key observations, critiques, and recommendations.
Table 4. Key Conceptual Review Findings
Safety listening conceptualizations
Safety listening's distinguishability
In developing the concept of safety listening, our first goal was to distinguish it from other listening forms (e.g., active listening). It is distinguishable in four ways. First, it can result in significant and sometimes life-or-death outcomes (Hällgren et al., 2018), including physical (e.g., fatalities), psychological (e.g., trauma), and environmental (e.g., pollution) harms.
Second, safety listening has been studied in risky (e.g., aviation; Noort et al., 2021a) and emergency (e.g., healthcare; Long et al., 2020) contexts, though it may also occur in disrupted contexts (e.g., bomb in subway stations; Hällgren et al., 2018). Such contexts include formalized organizational settings with distinct role definitions and imbalances in roles, responsibilities, and expertise between voicers and listeners (e.g., nurses voicing to doctors; McDonald & Ahern, 2000), and dynamic environments where events cannot be fully rehearsed (e.g., security).
Third, listeners often make decisions amidst high cognitive loads, stress, urgency, and danger (e.g., military). Listeners face a dilemma: acting on voice (e.g., cancelling take-off following technical concerns) could lead to proximal consequences (e.g., disruption) yet potentially prevent future incidents while inaction may create distal risk (e.g., accidents). Likewise, erroneous actions may be challenging to reverse (e.g., shutting down the wrong engine; Krenz & Burtscher, 2021). Thus, listeners must engage with concerns, compare risks tied to different actions, and determine appropriate actions in uncertain conditions.
Last, safety listening is verifiable. As safety voice requests novel or corrective action by listeners (e.g., stopping harassment; Peirce et al., 1998), safety listening is observable in listeners’ responses (e.g., creating plans; Groves et al., 2021), team-members’ shared understandings of the situation, and situations’ outcomes (e.g., de-activated bomb following citizens’ concerns). Table 5 outlines safety listening's defining features, including its content, observability, and context.
Table 5. Safety Listening Definition and Defining Features
Safety listening's fragmentation
To test the idea that safety listening is a concept that is implicitly recognized as important within the literature but has not been crystallized into a defined term or phenomenon, we examined how listening was defined across the included studies. Publications used 36 unique terms/definitions for listening, many of which seemed to overlap or were unclearly delineated. For example, scholars used “retaliation” (Rehg et al., 2008), “whistleblower retaliation” (Kenny et al., 2019), and “official” and “unofficial reprisals” (McDonald & Ahern, 2000) to describe negative consequences following voicing. Similarly, Jones and Kelly (2014) renamed the “deaf effect” (Cuellar et al., 2006) as “organizational disregard” while retaining the same definition.
While some terminologies conveyed neutral or positive connotations (e.g., “reaction to speaking up”; Lemke et al., 2021), most illustrated consequences following voice (e.g., “retaliation”; Rehg et al., 2008) or exclusively negative responses (e.g., “silencing”; Tiitinen, 2020). Moreover, some terms were framed at the dyadic level (e.g., “receiver response”; Long et al., 2020), while others pertained to organizational dynamics (e.g., “organizational silencing”; Fernando & Prasad, 2019).
Safety listening's motivational framing
We next investigated how the literature has explained safety listening, finding that terminologies and definitions generally frame it as motivational. Terms such as “willful blindness” (Cleary & Duke, 2019) and “deaf ear syndrome” (Peirce et al., 1998) insinuate that inaction is intentional; listeners choose to turn blind eyes or deaf ears following concerns. Likewise, publications’ definitions primarily focused on listeners’ (in)actions following voice, including addressing concerns (Tucker & Turner, 2015), ignoring complaints (Cleary & Duke, 2019), and retaliating (Rehg et al., 2008). Most definitions implied actions were deliberate (e.g., listeners’ “willingness to refrain from […] retaliation”; Vandekerckhove et al., 2014, p.300) to pursue objectives (e.g., preventing further whistleblowing; Tiitinen, 2020). Such intentionality seems consistent with the behavioral literature (Skinner, 1963) where listeners’ actions (e.g., punishment) may serve to extinguish future voice behaviors.
Safety listening explanations generally posit that listeners intentionally choose responses based on their attitudes, strategy, or motives. Following voice, listeners are thought to determine responses after rationally assessing whether problems exist, whether they are responsible for them, and whether they can address them (Pierce et al., 2004). Applying the theory of planned behavior, Vandekerckhove et al. (2014) posit that listeners are likelier to retaliate if they possess negative whistleblowing attitudes, unfavorable subjective norms (e.g., witnessing others retaliate), and poor perceived behavioral control (e.g., believing they cannot help). Similarly, Near and Miceli (1985) argue that, following whistleblowing, organizations determine whether the misconduct should cease and/or the whistleblower should be punished; choosing retaliation is seen as proportionate to the organization's dependence on the wrongdoing and inversely related to the whistleblower's power (Miceli et al., 2008).
Research often frames poor listening as intentional. Ineffective responses are viewed as intended to silence or discredit voicers (Fernando & Prasad, 2019), highlight voicers’ out-group membership (Barlow, 2021), relieve listeners’ negative emotions (Sumanth et al., 2011) and cognitive dissonance (Atkinson et al., 2022), prompt voicer conformity (Wellman et al., 2016), and minimize reputation loss (Near & Jensen, 1983). Moreover, Martin and Rifkin (2004) and Roulet and Pichler (2020) conceptualize whistleblowing responses as strategic maneuvers within games (“organizational jiu-jitsu” and “blame games”, respectively), designed to minimize personal and organizational culpability. However, explaining listening as motivational and strategic may overlook alternative explanations, including misunderstandings (Schegloff, 1992).
Safety listening methodologies
Having established how safety listening tends to be conceptualized and understood in organizational contexts, we investigated how it is studied. Given voice's aim of soliciting action, we were particularly interested in whether and how it is studied behaviorally. Like safety voice (Noort et al., 2019), the 46 empirical safety listening studies occurred most frequently in healthcare contexts (n = 14) and in America (n = 19).
Although safety listening is ultimately about responding to voice acts that aim to elicit action, only two studies directly measured safety listening behaviors in naturalistic contexts. Lemke et al. (2021) observed voice and listening in teams administering pre-surgery anesthesia. They created behavioral codes assessing listeners’ verbal, behavioral, and affective responses, and coded behaviors in situ. Noort et al. (2021a) analyzed behavioral trace data from conversations preceding airplane crashes between 1962 and 2018. They identified safety listening in the three conversational turns following safety voice, classifying it as immediate action, verbal affirmation, ignoring, or disaffirmation. The remaining studies used self-reported measures (e.g., surveys) or measured behavior in contrived settings (e.g., experiments); Table 6 details these methodologies. Notably, almost half of the studies (20/46) did not specify whether complaints were spoken or written, and few explicitly investigated technology-mediated complaints (e.g., phone calls, emails).
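To make the turn-based coding approach described above more concrete, the following Python sketch shows one way such coded transcripts might be represented. It is an illustration under stated assumptions, not the cited authors' procedure: the class names, the `listening_windows` helper, and the transcript are all hypothetical, and the response codes are assumed to be assigned by human coders rather than inferred automatically.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterator, List, Optional, Tuple


class Response(Enum):
    # The four response categories named in the text.
    IMMEDIATE_ACTION = "immediate action"
    VERBAL_AFFIRMATION = "verbal affirmation"
    IGNORING = "ignoring"
    DISAFFIRMATION = "disaffirmation"


@dataclass
class Turn:
    speaker: str
    text: str
    is_safety_voice: bool = False             # flagged by a human coder
    response_code: Optional[Response] = None  # assigned by a human coder


def listening_windows(
    turns: List[Turn], window: int = 3
) -> Iterator[Tuple[Turn, List[Turn]]]:
    """Yield each safety voice turn with the turns that directly follow it."""
    for i, turn in enumerate(turns):
        if turn.is_safety_voice:
            yield turn, turns[i + 1 : i + 1 + window]


# Invented transcript for illustration only (not from any real recording).
transcript = [
    Turn("first officer", "Our airspeed is dropping.", is_safety_voice=True),
    Turn("captain", "Yes, I see it.", response_code=Response.VERBAL_AFFIRMATION),
    Turn("captain", "Adding power now.", response_code=Response.IMMEDIATE_ACTION),
    Turn("first officer", "Speed recovering."),
    Turn("captain", "Continue the approach."),
]

# Collect the coded listening responses found within each three-turn window.
coded = [
    [t.response_code for t in following if t.response_code is not None]
    for _, following in listening_windows(transcript)
]
```

Representing each voice act together with its following turns keeps the unit of analysis behavioral (what was said and done next) rather than perceptual, which is the distinction the review draws between naturalistic and self-report measurement.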
Table 6. Safety Listening Assessments
In sum, like voice (Lainidi et al., 2023) and other listening types (Kluger et al., 2023), safety listening insights were primarily obtained through measurements assessing self-reported imagined or recalled responses to high consequence concerns. Despite variously operationalizing safety listening, researchers have generally positioned “better” listening as agreeing with voicers, making voicers feel heard, and addressing problems (Barlow et al., 2023b; Noort et al., 2021a; Reader, 2022). Studies typically framed voice and listening as one-shot, studying instances where voicers knew how to effectively raise concerns and listeners could address them. Scenarios where voicers were uncertain how to complain and where listeners could not act (e.g., about a third party's error) were rarely examined.
Safety listening findings
To establish the existing corpus of safety listening knowledge and identify potential areas for future investigation, we explored observations on the following: i) safety listening's pragmatics and how it manifests in organizations, ii) safety listening's antecedents (e.g., preventative and promotive factors), and iii) safety listening's outcomes. When evaluating publications’ findings, we considered communication repairs and organizational information flow.
Safety listening pragmatics
Three studies explored how listeners responded to voice, typically conceptualizing effective listening as agreement. Self-reported (i.e., interviews) and observed (i.e., simulated behavior) “appropriate” listening included acknowledgment, thanking voicers for speaking-up, and validating emotions (Barlow et al., 2023b; Groves et al., 2021). Conversely, responses using task-based questioning (e.g., “What actions are necessary to discharge this patient?”) hindered listening because listeners did not address voicers’ concerns. In contrast, Lemke et al. (2021) observed that responses were often neutral or validating, consisting primarily of short approvals or detailed explanations. While Groves et al. (2021) conceptualized voice and listening as one-shot (i.e., voice, then listening, then outcome), Barlow et al. (2023b) and Lemke et al. (2021), who assessed naturalistic or simulated behavior, described listening as iterative; it furthered the conversation by prompting sense-making or inviting future voice acts. Notably, all studies were in healthcare contexts; it may be that listening pragmatics differ in other situations.
Safety listening antecedents
Publications measured antecedents at three levels: listeners’ cognitive/skill-based factors, interactional dynamics among voicers and listeners, and structural factors within organizations. These studies either quantitatively investigated listening's antecedents and outcomes or qualitatively described them using examinations of institutional failures and voicer/listener interviews.
Cognitive/skill-based factors
Publications explored listeners’ motivations as a listening antecedent. Ineffective listening was preceded by listeners’ poor motivated reasoning (Cleary & Duke, 2019), negative attitudes toward complaints (Hsieh et al., 2005), and expectations that voicers would address problems themselves (Wilkinson et al., 2015). Additionally, inadequate listening (e.g., retaliation) was likelier when listeners perceived voicers as threatening (Kenny, 2019), cold, or unlikeable (Wellman et al., 2016). Receivers also were unlikely to listen if they were stressed (Long et al., 2020) or feared disciplinary actions (Martin et al., 2021).
Complaints’ perceived legitimacy also influences safety listening. Effective listening is encouraged when listeners receive compelling evidence (Mesmer-Magnus & Viswesvaran, 2005) and believe voicers were not responsible for incidents (Pierce et al., 2004). Yet, listening may be hindered if listeners excessively confirm complaints’ legitimacy instead of focusing on comprehending and resolving issues (van Dael et al., 2022).
Inadequate listening skills may underpin ineffective listening. Some listeners indicated not knowing how to respond to voice (Barlow et al., 2023a) and having insufficient training to do so (Hsieh et al., 2005). Conversely, trained managers with experience with whistleblowers were likelier to safeguard them from retaliation (Vandekerckhove et al., 2014). A listening skill which would benefit from more investigation would be identifying and correcting misunderstandings (e.g., the voicer incorrectly perceived a problem when one did not actually exist)—but this would require looking beyond one-shot voice acts.
Interactional factors
Interactional dynamics may shape listeners’ cognitions and responses, with research focusing on voicer/listener power and team members’ support. Listening is likelier when there are low voicer/listener power disparities (Miceli et al., 2008)—for example, if voicers had high-status positions (Cortina & Magley, 2003), their roles included whistleblowing (e.g., auditing; Casal & Zalkind, 1995), and listeners respected voicers’ seniority (Long et al., 2020). Conversely, ineffective listening was often preceded by hierarchical and expertise differences (e.g., junior pilots voicing to captains; Noort et al., 2021a). In short, listeners may have preconceptions as to who might have relevant safety information, which reduces their openness to voices from unexpected sources.
Third-party support may discourage retaliation. Voicers were unlikely to experience retaliation if colleagues understood why they voiced (Park et al., 2020), and voicers had supervisor (Mesmer-Magnus & Viswesvaran, 2005) and/or top management support (Miceli et al., 1999).
Structural factors
Interactional factors are influenced by structural factors, including reporting channels, organizational cultures, and organizational characteristics. Scholars distinguish between different reporting channels: external (e.g., to regulators), informal internal (e.g., open-door policies), and formal internal (e.g., complaints systems). All are associated with ineffective listening. Using external whistleblowing channels increased retaliation's likelihood and severity (Mesmer-Magnus & Viswesvaran, 2005). Likewise, voicers using informal internal reporting channels believed these inadequately addressed complaints (Harlos, 2001). Ineffective formal reporting channels were poorly demarcated, bureaucratic, inadequately captured complaints’ nuances, and prioritized achieving performance targets (e.g., reducing complaint numbers) over addressing complaints (Martin et al., 2021; van Dael et al., 2022). Studies have typically investigated “clear-cut” speaking-up where voicers have straightforward concerns, know how to navigate multiple channels, and have a single intended audience.
Organizations with poor safety cultures (Reader, 2022), cultures prioritizing performance over safety and ethics (Wilkinson et al., 2015), and cultures where complaints conflict with taken-for-granted assumptions (Hald et al., 2020) are associated with ineffective listening. Additionally, complaints were likelier to be disregarded in small firms, family-owned enterprises, decentralized or multinational setups, male-dominated sectors, and rural locations (Peirce et al., 1998). Organizations with ineffective listening often exhibited limited information sharing (Hsieh et al., 2005), possessed poorly defined policies concerning wrongdoings (Peirce et al., 1998), and excluded employees from problem-solving processes (Tiitinen, 2020).
In sum, consistent with Westrum (2014), structural factors including poor reporting channels and pathological or bureaucratic organizational cultures blocked information flow within organizations.
Safety listening outcomes
Five studies investigated consequential outcomes following safety listening using surveys, interviews, and behavioral trace data. Ineffective listening was associated with increased injuries (Tucker & Turner, 2015), airplane damage (Noort et al., 2021a), and death (Hald et al., 2020). Experiencing retaliation also worsened voicers’ physical (Cortina & Magley, 2003) and mental health (Kenny et al., 2019).
Voicers who experienced retaliation also reported diminished job satisfaction (Cortina & Magley, 2003), supervisory relationships (Rehg et al., 2008), and career advancements (McDonald & Ahern, 2000). Voicers receiving multiple retaliation forms were likelier to report retaliation (Near & Miceli, 1986), and women who experienced retaliation were likelier to whistle-blow externally (Rehg et al., 2008). Certain listening behaviors extinguished future voice behaviors: voicers were reluctant to voice again when listeners provided extensive explanations (Lemke et al., 2021) and when senior listeners disaffirmed junior voicers (Noort et al., 2021a).
Discussion
Our review found research on safety listening to be highly fragmented in terms of definitions and conceptualizations and to be overly focused on attitudes toward listening. We concluded that, because safety listening influences outcomes through responding to voice acts, its conceptual basis lies in consequential (and thus observable) behavior (i.e., action) and its outcomes (e.g., harm avoidance), rather than listeners’ or voicers’ attitudes. From this standpoint, safety listening can be understood as a “world-making” behavior (Power et al., 2023), because what listeners say and do after hearing safety voice—for example, addressing concerns, correcting misunderstandings, and starting sense-making processes—determines how and whether action is taken to prevent accidents and avoid harm.
Like the communication literature (e.g., Shannon and Weaver's one-shot communication model; i.e., voice-listening-outcome), safety listening research tended to consider only “one turn” of communication, and not how individuals iterate and sense-make across many turns to understand safety concerns, for example through third-turn repairs, which are important for correcting misunderstandings in how listeners have understood voice acts (Noort et al., 2021a). Accordingly, like extensions to the Shannon and Weaver model (Schramm, 1955), we propose that safety listening should not be considered a ballistic one-shot behavior, but rather part of an iterative sense-making process, with feedback loops between voice and listening.
Our review has implications for Westrum's (2004) information flow theory. Like Westrum, we found that structural factors (e.g., improvable organizational cultures) block organizational information flow, sometimes with deleterious outcomes. Yet Westrum focused on whether information was being voiced within organizations and, when considering listeners’ roles, primarily explored listeners’ motivations upon hearing problems (e.g., preoccupation with power). As such, information flow theory can be broadened to incorporate misunderstandings and miscommunications to offer a more comprehensive perspective, including safety in uncertain conditions. Moreover, empirical studies should apply information flow theory at the moment of voice and listening to enhance this theory's applicability.
For the safety literature, our safety listening conceptualization is significant because it explains how different forms of voice, whether promotive (e.g., safety-related improvements) or prohibitive (e.g., safety complaints), can change the status quo, and that there are factors which underlie this (e.g., individual, contextual). For prohibitive voice, listening is essential for individuals and teams to form a shared situation awareness with the voicer, and thus act (or otherwise; Endsley, 1995). For promotive voice, how listening shapes outcomes is less clear, is likely to occur in a less urgent context, and may require more ongoing and iterative processes (e.g., on deciding whether to improve a safety procedure). Following the behavioral literature, for example, on operant conditioning (Skinner, 1963) and melioration theory (Herrnstein & Prelec, 1991), it would also be valuable to understand how safety listening influences safety voice (e.g., through encouraging or extinguishing future voice acts).
Implications for workplace voice and listening literatures
Beyond safety, our findings have significance for both the workplace voice and listening literatures. For example, researchers often theorize listening in terms of attitudes and perceptions around voice, rather than how individuals react and respond to the content or intentions of a voice act. This is especially important for determining whether high-consequence voice acts, such as whistleblowing (as opposed to, e.g., small talk), are listened to.
Similarly, the voice literature generally assumes that listening will occur following speaking-up; accordingly, responses to such voice are rarely examined. Yet, voice and listening are iterative and complementary speech acts. Oftentimes safety or ethical problems are not resolved through single voice acts, but a series of voice acts and responses (e.g., during problem-solving), with some focusing on addressing and understanding concerns (e.g., potential medical errors), and others on ensuring third-turn repair (Schegloff, 1992). As such, listening may not always occur after a single voice act, but a sequence of voice acts that, potentially, have inconsistent information yet demand attention (Macnamara, 2018). In such exchanges, boundaries between voicers and listeners may become blurred, as listeners may transition to being voicers in team sense-making processes and while clarifying concerns.
Likewise, the role of who “voices” and who “listens” also requires consideration. For example, listeners can become future voicers if they cannot address voicers’ concerns directly and must voice upwards. Vandekerckhove and colleagues (2012, 2014) conceptualize the process of voicing upwards as follows: if, after hearing concerns, listeners do not believe they can address the problem, they can voice the concern to someone who can. Thus, listeners can become future voicers, with the voice/listening sequence continuing until the problem is addressed or the concern is discarded. Such systems are limited in that listeners must have the courage to voice to higher ups (Vandekerckhove & Langenberg, 2012), and must accurately recall and relay concerns (Tiitinen, 2020). To avoid perceptions that complaints were left unheard—and to encourage future voice (King et al., 2019)—listeners should keep voicers informed by communicating decisions related to concerns back to voicers, including if concerns will not be addressed. Future empirical research should examine how safety voice cascades upwards in organizations and how listeners can voice upwards while keeping voicers abreast.
The voice literature should also explore how voice and listening occur in multi-member teams, in organizations with multiple reporting systems, and outside of organizations (e.g., to regulators). This suggestion is especially pertinent to safety research, where individuals often engage in undirected voice to multi-person audiences (e.g., nurses pointing out abnormal bleeding to surgery teams; Kolbe et al., 2014), and listeners may not always know who is expected to respond. Poorly demarcated internal reporting systems may lead voicers to mistakenly believe that they raised formal complaints despite using informal channels (van Dael et al., 2022); likewise, voicers may have unclear concerns and may voice to incorrect audiences. It would be helpful to understand what happens following such misunderstandings and how organizations might correct this.
Safety listening: relationship with climate and culture?
The safety literature conceptualizes voice as a component of both safety climate and safety culture, and includes voice-related survey items in climate/culture assessments (Bisbey et al., 2021). While some organizational safety climate scales include listening-related items (e.g., Bahari & Clarke, 2013; Huang et al., 2013), current organizational culture conceptualizations typically do not include safety listening, despite its essential role in nurturing safe and ethical cultures and preventing potential scandals. Like Hald et al. (2020), we propose that safety listening should be seen as a subset of organizational cultures, warranting its inclusion in assessments. For instance, scholars may incorporate safety listening items into pre-existing culture surveys and capture safety listening through unobtrusive indicators of culture (e.g., quality of responses to whistleblowing complaints). This integration would provide a more comprehensive understanding of how speaking-up and listening behaviors combine to contribute to organizational cultures.
New research avenues
Here, we outline conceptual, theoretical, methodological, and empirical gaps, and recommend avenues for future research.
Proposing a standard term/definition
The review highlights a duplication of listening terms/definitions and a predominantly motivational focus, which likely impedes knowledge accumulation and theory development. It remains unclear whether the 36 terms are interchangeable or represent distinct concepts—for example, would improvable responses be opposite to positive or neutral ones? Research streams appear to be advancing independently, potentially leaving gaps in comprehending the spectrum of listening behaviors and their underlying drivers.
Expanding existing terms/definitions, we propose a standardized concept, safety listening, defined as listeners’ behavior responding to safety voice demanding action to prevent harms in organizational contexts. Our definition emphasizes listeners’ observable responses to voice acts (e.g., engaging, ignoring) as they directly influence outcomes (e.g., hazard mitigation). We underscore that voice acts may be inaccurate (e.g., voicers may have incomplete information); therefore, inaction may be appropriate following erroneous concerns. Crucially, effective listening is not necessarily agreeing with voicers; rather, it includes actions like investigation, intervention, and inaction (e.g., for false alarms). Our safety listening conceptualization encompasses both promotive and prohibitive voice messages: for instance, when individuals voice about improving safety (e.g., requesting an improved ventilation system), there is likely still an underlying concern (i.e., better ventilation would reduce the likelihood of harm). Thus, raising safety-related suggestions and concerns both include sharing observations or safety-related information to improve the status quo.
Extending explanations beyond listeners’ motivations
Scholars typically position safety listening as resulting from listeners’ motivations. This conceptualization likely influences and is influenced by challenges obtaining naturalistic listening data. Difficulties with reliably capturing listening behavior likely led to using self-report methodologies; these methodologies’ findings are likely interpreted in terms of listeners’ motivations due to biases and misattributions (e.g., voicers may attribute poor listening to listeners’ motivations due to the fundamental attribution error). Moreover, conceptualizations framing listening as motivational prompt the use of self-report data on the assumption that listening is deliberate; these data then provide supporting evidence for explaining listening in terms of motivations. Thus, conceptualizations of ineffective listening and reliance upon self-report measures likely mutually reinforce each other and have constrained our understandings of listening.
Research should move beyond the focus on motivation-driven conceptualizations, especially since organizational disasters reveal instances where listeners were motivated to listen but failed to do so (e.g., pilots are motivated to safely fly airplanes; Noort et al., 2021a). Consequently, Martin et al. (2021) have questioned the prevailing notion that mishandling complaints solely arises from deliberate efforts to enforce silence in organizations. We concur—recognizing that while motivations undoubtedly influence listening behaviors, concentrating solely on listeners’ attitudes narrows our understandings and neglects alternative explanations.
The literature would benefit from developing a conceptual model and theory that explain the organizational and system dynamics through which safety voice and listening impact outcomes (e.g., scandals). A plausible explanation posits that safety voice signifies shared cognition discrepancies within teams, while safety listening serves to re-establish shared cognition. Empirically investigating this proposition in future research may hold promise, and researchers should continue refining explanations for the relationships between voice, listening, and outcomes.
Assessing safety listening behaviors
Most empirical studies employed self-report measures rather than assessing naturalistic listening behavior. Using proxies is understandable because safety listening behavior is elusive: it can only occur once a complaint has been raised, and complaints are infrequent. Moreover, truly high-consequence experiments and simulations are unethical as they expose participants to significant risks.
Yet, relying on self-report, simulation, and experimental methodologies assumes that findings in safe and controlled situations generalize to dynamic and dangerous environments (Diener et al., 2022). In such decontextualized conditions, listening intentions or behaviors may be over-reported or inaccurately described, recalled, or attributed. For instance, self-report methods assume that individuals can (and would) precisely describe and attribute intentions to their own and others’ listening behaviors. Human errors (e.g., misunderstandings), social desirability biases (van de Mortel, 2008), attribution errors (Ross, 1977), and other factors (e.g., primes; Bargh & Chartrand, 2000) may influence listening behaviors. Similarly, vignettes measuring listening intentions may be abstract, over-rely on participants’ imaginations, and may inaccurately predict listening behavior (Sheeran, 2005). Moreover, simulations and experiments involving confederates require precise execution of researchers’ instructions to be believed by participants (Yeomans et al., 2023). Addressing discrepancies between self-reported perspectives (e.g., listener says they listened; voicer disagrees) poses another challenge. Such discrepancies may be frequent, as Bodie et al. (2014) found no association between voicers’ perceptions of receivers’ listening, receivers’ perceptions of their listening, and behavioral listening measures. Likewise, voicers may incorrectly believe that listeners can address problems; however, listeners may hear safety concerns but cannot transparently show what action has been taken (e.g., addressing complaints about another's errors) and cannot do more than pass the complaint onwards.
Although self-report measures are helpful for addressing specific research inquiries (e.g., assessing employees’ commitment to safety), they offer limited insights into the actual behaviors and underlying mechanisms driving individuals’ responses to safety voice. Self-reports’ limitations are heightened in this context because this form of measurement is not truly high consequence in nature.
Only two out of 46 studies assessed listening behavior. Like Baumeister et al. (2007), we advocate for a more balanced approach in future research, with more behavioral measures using naturalistic data. We describe methodologies to measure naturalistic safety listening behaviors (including naturalistic and ecological observations and behavioral trace data) in Table 7. Measuring naturalistic listening behaviors would enable us to validate assumptions about this phenomenon (e.g., is effective safety listening always agreement?), uncover unexpected “real world” manifestations (e.g., listening to conflicting voice messages), and causally examine safety listening's relationship with important outcomes (e.g., airplane crashes). Additionally, researchers can triangulate behavioral and non-behavioral findings to converge on evidence that is more compelling than that generated by one method alone (Barnes et al., 2018). Table 8 illustrates sample high validity behavioral trace datasets, which are publicly available and are unobtrusive (Hill et al., 2014). Owing to their high validity, such data may be distressing; scholars should consider how to safeguard participants, researchers, and transcribers during data collection and analysis.
Table 7. Methodologies for Assessing Naturalistic Safety Listening Behaviors
Table 8. Example Behavioral Trace Data Sources to Study Safety Listening
There is little consensus on how to code safety listening behaviors, likely due to an unclear understanding of what such behaviors entail in naturalistic settings. Studies, often using self-reported data, may have conflated effective listening with agreement; however, effective listening may be disagreement if voicers are incorrect. To address this, researchers should establish a taxonomy of safety listening behaviors, which can be translated into a reliable and valid coding framework. This framework should be empirically grounded and incorporate observable listening behaviors (e.g., asking questions; Kluger & Itzchakov, 2022) and defensive tactics (Gillespie, 2020).
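One way to check whether such a coding framework is reliable is to have two coders label the same listening responses independently and compute chance-corrected agreement. The minimal sketch below calculates Cohen's kappa in plain Python. The category labels ("engage", "dismiss", "ignore") and the example codes are hypothetical illustrations only; they are not a taxonomy or dataset drawn from the studies reviewed here.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Chance-corrected agreement between two coders' categorical labels."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("Both coders must label the same non-empty set of items.")
    n = len(coder_a)

    # Observed agreement: proportion of items given the same label by both coders.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n

    # Agreement expected by chance, from each coder's marginal label frequencies.
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)

    if expected == 1.0:
        # Both coders used one identical label throughout; kappa is undefined,
        # so this sketch treats the case as perfect agreement.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical codes assigned to eight listening responses (illustrative only).
coder_a = ["engage", "engage", "ignore", "dismiss", "engage", "ignore", "dismiss", "engage"]
coder_b = ["engage", "dismiss", "ignore", "dismiss", "engage", "ignore", "engage", "engage"]

kappa = cohens_kappa(coder_a, coder_b)
print(round(kappa, 3))
```

Kappa is used here rather than raw percentage agreement because some agreement is expected by chance when a few categories dominate. Other reliability statistics may be better suited to designs with more than two coders or with ordinal codes.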
Given the considerable sample size of some of the behavioral trace datasets in Table 8 (i.e., thousands of instances of safety voice and safety listening), machine-learning-based natural language processing models and, more recently, large language models may revolutionize safety listening measurement. These techniques can rapidly analyze vast amounts of complex textual communications, generating high-quality results approaching human-level performance (Kjell et al., 2019; Luo et al., 2024; Törnberg, 2023). Accordingly, these techniques could efficiently and accurately identify patterns, trends, and potential risks in transcribed or written safety conversations and may detect novel nuanced insights that might not be possible with manual coders (Berger & Packard, 2022; Speer et al., 2022). For instance, artificial intelligence measures of safety voice and listening could code data at a sufficient scale, rigor, and subtlety to begin to identify the types of voice and listening behaviors that are associated with aviation accidents.
Proposing novel safety listening antecedents
Here, we suggest additional possible cognitive, interactional, and structural factors influencing safety listening.
Cognitive/skill-based factors
Stress likely influences safety listening, yet this relationship and its underlying mechanisms have not been empirically examined. Studies have shown that listening to trauma increases listeners’ stress (Michelson & Kluger, 2021) and clinicians reported that stress influenced their listening (Long et al., 2020). We propose that stress impairs listening by diminishing cognitive capabilities and information processing (Sandi, 2013), which are required for situation awareness and listening abilities. Future research should examine how stressful environments influence safety listening and the skills required to effectively listen under stress.
Safety listening can be viewed as a skill rather than just an attitude, aligning with crew resource management (Kanki et al., 2019), non-technical skills (Fletcher et al., 2004), and workplace listening literatures (Itzchakov, 2020). This perspective is supported by listeners reporting they were under-skilled in responding to complaints (Barlow et al., 2023a). In addition to general listening skills like suspending judgment (Itzchakov, 2020), safety listening skills encompass recognizing voice, determining when and what to listen to, and listening under stress. For instance, listeners must notice muted voice (Noort et al., 2021b) and undirected concerns (Kolbe et al., 2014), assess concerns’ legitimacy, and discern between conflicting voice messages. These skills are trainable, as evidenced by Noort et al. (2021a), who found improved listening following crew resource management training. Training programs could incorporate real conversation recordings to help participants assess how others in their roles communicate (Stokoe, 2014).
Interactional factors
Groupthink—where the group's urge for conformity impedes the critical evaluation of signals indicating problems (Janis, 1972)—may discourage effective safety listening. Groupthink symptoms which can obstruct listening include collectively rationalizing warnings, categorizing voicers as inferior, and enforcing conformity by pressuring dissenters (Mannion & Thompson, 2014).
Interactions with technology and environments can impede the reception of voice, a factor often overlooked in the literature. Wilson et al. (2007) illustrate that faulty communication devices (e.g., dead batteries), human errors (e.g., mismatched radio frequencies), background noise (e.g., gunfire), and environmental obstacles (e.g., terrain obstructing radio signals) can impede voice's reception. Likewise, concerns may be unheard (e.g., whispered), incorrectly sent (e.g., wrong address), or lost (e.g., improperly archived). Future research may consider hearing as a mediator between safety voice and safety listening and explore this relationship empirically.
Given that many communications rely on technology, both verbal (e.g., phone calls) and written (e.g., email, instant messaging), future research should explore whether safety voice and listening using these channels differ from face-to-face communication. Some behavioral trace data in Table 8 may aid this endeavor. It may be that non-verbal information gleaned from face-to-face or video communication (e.g., nodding) better facilitates developing shared understandings. It could also be that—with avenues to publicly complain about organizational practices (e.g., Glassdoor, blogs)—having a written and public record of concerns may prompt listeners’ action to address hazards to “save face”. Moreover, certain technology-mediated communications (e.g., phone calls) prompt instant safety listening while others (e.g., emails) may not be immediately responded to.
Structural factors
Organizational policies and procedures aimed at addressing specific wrongdoings have received limited attention in the literature despite their potential impact on safety listening. Organizational failure investigations reveal that these policies were absent, unclear, or insufficient in guiding complaint handling (e.g., Crofts, 2017). These investigations also underscore that protocols can fail in unforeseen circumstances. For instance, in response to the September 11, 2001 attacks, organizations initially followed standard hijack protocols assuming the hijackers would make demands upon landing; however, these protocols were deemed inadequate for hijackings carried out as attacks (Waller & Uitdewilligen, 2008). Consequently, we propose that clear and adequate policies/procedures highlighting their possible fallibility in unexpected situations would encourage safety listening.
Conclusion
Research has highlighted the significance of voicing high consequence concerns to avert harm. Nonetheless, although voicing is often necessary, it is insufficient, as is evident in many organizational disasters where raised concerns went unaddressed. Recognizing this, we conducted this integrated conceptual review to establish the concept of safety listening as the necessary counterpart to safety voice. This review synthesizes existing publications to define safety listening as listeners’ behavioral responses to safety voice acts in organizational settings which are intended to avoid physical and/or social harms. In advancing the field, we distinguish safety listening from other listening forms, recommend non-motivational explanations, advocate for the utilization of naturalistic data to measure listening behaviors, and suggest novel contributory factors. This review lays the foundation for future research to foster a comprehensive and cumulative understanding of safety listening, ultimately contributing to the prevention of future organizational failures.
Footnotes
Authors’ Note
This work was presented at the 21st European Association of Work and Organizational Psychology conference in Katowice, Poland, in May 2023; the 19th Cultural Psychology Network meeting in Nicosia, Cyprus, in December 2023; and the 39th Society for Industrial and Organizational Psychology conference in Chicago, USA, in April 2024.
Acknowledgments
The authors would like to thank Dr. Ganga Shreedhar for commenting on an earlier draft, Alex Goddard for helping with inter-rater reliability testing, and Hannah Bunt for helping with data extraction.
Declaration of competing interests
The authors declare that there is no conflict of interest.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by a London School of Economics PhD studentship awarded to Alyssa M. Pandolfo.
