Abstract
Individuals who hear voices (i.e. auditory verbal hallucinations) have been reported to exhibit a range of difficulties when listening to and processing the speech of other people. These speech processing challenges are observed even in the absence of hearing voices; however, some appear to be exacerbated during periods of acute symptomology. In this advisory piece, key findings from pertinent empirical research into external speech processing in voice-hearers are presented with the intention of informing healthcare professionals. Our view is that a better understanding of the speech processing deficits faced by individuals who hear voices will enable more effective communication with such patients.
Introduction
The experience of voice-hearing (i.e. auditory verbal hallucination; AVH) refers to the perception of voices in the absence of an external auditory input. Although commonly associated with schizophrenia, voice-hearing is reported in numerous clinical conditions, as well as in many individuals within the non-clinical population (Baumeister et al., 2017). Alongside an often challenging or distressing set of perceived psychotic experiences, individuals who hear voices have been reported to exhibit numerous difficulties in processing external speech (Conde et al., 2016). Importantly, these difficulties persist even if AVHs are effectively managed (Løberg et al., 2004). Thus, an awareness of these difficulties will aid effective communication with people who either have a history of hearing voices or are actively hearing them. In this context, the training of Australian trainee psychiatrists in effective communication with individuals with psychosis is generally lacking (Ditton-Phare et al., 2015), despite this being a vital component of successful therapeutic relationships and hence improved psychiatric outcomes (Priebe et al., 2011). In light of this, pertinent neuropsychiatric research into the external speech processing deficits experienced by those who hear voices is highlighted below, followed by some suggestions as to how to accommodate these deficits.
Subverbal communication
Underlying speech production and perception are numerous basic auditory processes, which constitute subverbal communication. Spectral aspects of subverbal communication refer to the modulation of the tone of speech sounds, including the pitch (i.e. fundamental frequency), amplitude (i.e. loudness) and timbre (i.e. colour or quality) of speech. A wealth of information is communicated through these spectral components, including emotion, mood and cues for interpreting more complex linguistic information (Frühholz and Belin, 2018). Thus, they enable a listener to navigate the verbal information of speech efficiently and accurately, providing important information on the context and intended meaning of words spoken. Greater spectral processing deficits are a highly replicated finding in voice-hearers with schizophrenia compared to those with no history of AVH (McLachlan et al., 2013; Rossell and Boundy, 2005).
Pitch, which describes the relative highness or lowness of a sound’s frequency, is the main spectral acoustic correlate of tone and intonation, and arguably one of the most important carriers of meaning in auditory information. Poor pitch discrimination capabilities are among the most well-documented deficits in voice-hearers (McLachlan et al., 2013). Furthermore, these pitch processing challenges are underpinned by abnormalities in the functional organisation of the primary auditory cortex, the central receptor of auditory information in the brain (Doucet et al., 2019). The burden of pitch processing deficits becomes apparent when considering the use of pitch in speech, one such instance being its importance during the communication of emotion. For example, low speech pitch with little variability is associated with sadness, while high pitch with high variability is associated with happiness or anger (Frühholz and Belin, 2018). Voice-hearers experience a decreased ability to identify pitch-based emotion accurately (Rossell and Boundy, 2005). A second central function of pitch in speech is its role in the perception of voice identity. Numerous speaker attributes – such as physical (e.g. age and gender), social (e.g. regional origin) and personal attributes (e.g. competence and friendliness; Frühholz and Belin, 2018) – can be determined from the pitch of a voice. A wealth of literature has shown that pitch perception difficulties make it challenging for voice-hearers to accurately identify and characterise the source of a voice (Conde et al., 2016). This emphasises not only that voice-hearers struggle to differentiate between pitches, but also that such deficits can lead them to make incorrect assumptions about the context and motivations behind a conversation, or the identity and emotional state of the speaker.
Timbre also plays a key role in identifying voices. Timbre describes the complex overtones of a voice which create its unique qualities, allowing us to recognise separate voices. When listening to speech, a non-voice-hearing individual will register the timbre of a voice and associate it with the memory of an individual, to help differentiate between and recognise voices, a process referred to as source monitoring (Brébion et al., 2002). Source monitoring deficits are a well-documented, frequently reported challenge in people who hear voices (Brébion et al., 2002; Johns et al., 2001). For these individuals, the subtlety of distinction between voices is lacking and as a result, voices, including their own, are misattributed to an incorrect source or simply not recognised. This is important to be aware of in clinical environments such as hospitals and emergency services, where high volumes of staff work on rotating rosters. Thus, mental health professionals are urged to be cognisant of source monitoring deficits in voice-hearers.
Alongside these spectral processing deficits, temporal aspects of subverbal communication – i.e. the speed and duration of speech – can also be challenging for voice-hearers. However, such temporal processing deficits are observed in psychiatric groups (e.g. schizophrenia) regardless of voice-hearing (McLachlan et al., 2013). As temporal processing challenges are not a distinct deficit of voice-hearers, they are not a key focus of this article. Nevertheless, a moderate, even pace of speech, and an awareness that long conversations may be challenging, are recommended while speaking to any patient with psychosis.
Verbal communication
In addition to the important subverbal features described above, a critical part of understanding externally spoken speech involves the accurate processing of its verbal content, ranging from speech sounds (e.g. ‘ah’ and ‘ba’), to single words, and complex sentences. When moving from subverbal to verbal communication, it becomes evident that there are additional speech processing challenges for voice-hearers. Even when listening to speech sounds, a voice-hearer is more likely to mishear what has been said than a non-voice hearer with the same diagnosis (Steinmann et al., 2017). This is evidenced by studies of dichotic listening, which involve presenting different speech sounds to each ear. Due to the contralateral and left hemispheric dominance of auditory processing in the cortex, a robust right ear advantage is reported in the non-voice hearing population: i.e. they will report hearing the syllable presented to the right ear over the left ear (Hugdahl, 2003). In contrast, voice-hearers reliably exhibit a loss of right ear advantage (Green et al., 1994; Løberg et al., 2004), which represents deviant activity of and between core auditory regions (Hugdahl, 2003).
With increasing complexity of verbal language, further issues arise. When listening to whole words or sentences, voice-hearers can completely mishear what has been said (Hoffman et al., 1999). In such instances, what a voice-hearer may expect to hear can override what has actually been said (Conde et al., 2016). Other studies have shown that voice-hearers struggle to accurately extract, retain and encode the meaning of speech (Siddi et al., 2017). These challenges are underscored by widespread changes in relevant cortical function. When voice-hearers are listening to whole words and sentences, regions involved with not only rudimentary audition, but also with the processing and evaluation of language, show deviant function and connectivity (Richards et al., 2021). To assist voice-hearers in processing speech as accurately as possible, the clarity and simplicity of speech content are pertinent, as is continually monitoring and receiving confirmation from the individual that they have understood what has been said.
The content of speech
It is also essential to consider the content of speech while conversing with voice-hearers. This includes both how things are said and the topic of conversation. Two themes are relevant: the use of pragmatics and the discussion of emotional topics. Pragmatics refers to the contextual use of language to convey specific meaning. A key feature of pragmatics in communication is the use of non-literal language, which requires a listener to use inference to come to the correct conclusions about what the speaker intends to communicate. Therefore, accurate pragmatic processing is important for social cognition, theory of mind and fostering relationships (Frühholz and Belin, 2018). Many studies of pragmatic processing have investigated individuals with schizophrenia as a broad diagnostic group, where severe, widespread challenges in pragmatic comprehension are commonly reported (Champagne-Lavau and Stip, 2010). Further work has shown that voice-hearers are at a more pronounced disadvantage during pragmatic comprehension compared to non-voice hearers with schizophrenia, which may represent a higher degree of concrete thinking.
Metaphor is a type of figurative language which requires acute pragmatic comprehension to decipher meaning. To be interpreted correctly, this form of speech requires the listener to create a mental image and look past a literal definition. Voice-hearers have been found to have an impaired ability to create mental images (Conde et al., 2016). It is therefore likely that metaphor comprehension is challenging for voice-hearers (Siddi et al., 2016). However, other types of figurative language which do not require mental imagery, such as idioms, are also challenging for these individuals (Siddi et al., 2016). As such, it is recommended that figurative language is avoided when communicating with voice-hearers.
To our knowledge, other types of pragmatic comprehension have not been explicitly investigated in voice-hearers relative to non-voice hearers with the same diagnosis. However, it is likely that further distinctions between these two groups exist. For instance, sarcasm, often used to convey humour, amicability and hostility, relies heavily on pitch modulation to give cues to negate the literal meaning of a statement and distinguish between sincerity and deception (Frühholz and Belin, 2018). It is therefore not unreasonable to assume that voice-hearers, as a group with notable pitch processing deficits (McLachlan et al., 2013), will have more pronounced sarcasm detection challenges than non-voice hearers. Each of these challenges in pragmatic comprehension highlights the fact that voice-hearers are less likely to decipher the true meaning of non-literal language in speech, which may result in jokes being misinterpreted, abstract ideas being interpreted in a concrete manner or nuances of conversation being missed.
Finally, we consider emotional topics during conversations with voice-hearers. More so than in non-voice hearers, emotion in speech automatically captures and maintains the attention of voice-hearers, leading to an impaired ability to focus on the less-emotive contents of speech (Alba-Ferrara et al., 2013). Similarly, voice-hearers tend to have difficulty in accurately shifting their attention when emotion is present in speech, resulting in emotional expressions being misidentified and the true meaning of speech being misinterpreted (Rossell et al., 2013). While these challenges are observed with emotion of any valence, evidence suggests that negative emotional content amplifies speech processing deficits in voice-hearers. For example, voice-hearers are more likely to misidentify emotions (Rossell and Boundy, 2005) or make source monitoring errors (Johns et al., 2001) when listening to negative rather than positive emotions. Furthermore, negative topics are known to exacerbate AVH symptomology (Freeman and Garety, 2003), which can consequently lead voice-hearers to withdraw from conversation and experience difficulties connecting with others (Sheaves et al., 2020).
In clinical settings, these findings are relevant for discussing distressing topics such as people’s voice-hearing experiences. Many voice-hearers report that speaking directly about the content of their AVH can be emotionally challenging (Freeman and Garety, 2003). Therefore, increased speech processing deficits should be anticipated during conversations about voice-hearing. Nonetheless, there is evidence that, with practice, talking about AVHs with a trusted person can ease the burden and paranoia caused by distressing, derogatory and threatening voice content (Sheaves et al., 2020). Although little research exists on the effects of discussing other distressing topics with voice-hearers, the likelihood of such conversations during clinical encounters is high. Of importance are conversations around adverse life events or negative self-schemas, which occur at notably high rates in voice-hearers (Baumeister et al., 2017). Due to their personal and negative themes, similar exacerbations of speech processing deficits should be expected during these conversations. In sum, in environments where emotions are being discussed or expressed, clinicians need to be aware of the aggravating effect that this may have on speech processing challenges, as well as understand that voice-hearers are prone to misinterpret emotional expressions and topics.
The effects of voice-hearing severity
Speech processing deficits are reported in individuals with a history of voice-hearing, even when they are not actively hallucinating (Løberg et al., 2004). This is due to the enduring changes in cortical regions involved with audition and language processing observed in these individuals (Richards et al., 2021). As such, health professionals should anticipate speech processing challenges in any individual who endorses a history of AVH. However, it is important to highlight that the magnitude of some of the speech processing deficits outlined increases with voice-hearing severity. Of note, increased source monitoring deficits and an increased likelihood of voice-hearers mishearing speech should be expected during periods of acute symptomology (Aleman et al., 2003; Brébion et al., 2002; Steinmann et al., 2017). Therefore, it is critical that healthcare professionals interacting with any individual who is actively hearing voices have an awareness of these speech processing deficits. This is particularly important while communicating with voice-hearers during a first psychotic episode. These individuals are experiencing a range of distressing psychotic symptoms for the first time, which are likely to worsen their diminished ability to understand what is being said to them. To our knowledge, a comprehensive characterisation of the effects of increased AVH severity on the full range of speech processing deficits is yet to be conducted.
Further cognitive challenges confound the speech processing deficits observed while an individual is actively hearing voices. Two basic requirements for processing speech are attentional and intentional inhibitory processes. In combination, these inhibitory processes enable the suppression of irrelevant information to optimise control over the perception, encoding and behavioural responses to auditory information. Compromised auditory attention and intentional inhibitory control in people who are actively hearing voices are well-documented and appear to worsen with increased AVH severity (Waters et al., 2003). These individuals are reported to have an internally orientated bias for auditory information, and thus pay attention to their AVH preferentially, instead of the external speech in their environment. Poor attention and intentional inhibitory capabilities in voice-hearers may also result in poor concentration, forgetfulness and disorganised response patterns, particularly in busy, highly stimulating environments and especially during periods of acute symptomology.
Recommendations and conclusion
As evidenced above, individuals who hear voices experience a number of deficits in their ability to process externally spoken speech accurately. The omnipresent nature of these external speech processing deficits needs to be considered and accommodated when conversing with people who hear voices, regardless of whether or not they are actively hallucinating at the time of the conversation. It should also be appreciated that when voices are present, some of the speech processing deficits outlined above can be amplified. In situations where voice-hearing is present, an individual’s abilities to hear speech correctly, identify voices, process information and focus on a conversation may further decline. In response to the issues highlighted, Table 1 presents some broad recommendations on how to adapt speech when conversing with individuals who hear voices. When considering these recommendations, it should be noted that some of the evidence base from which they have been derived, in particular the neuroimaging data, has been previously critiqued as suffering from generally poor and inconsistent methodological quality (Richards et al., 2021). To the best of our knowledge, external speech processing deficits in voice-hearers are synergistic and no deficit is more important than another. As such, there is no level of priority within this list of recommendations; they should be adopted as a whole and evenly prioritised.
Table 1. Recommendations for clinicians and allied health professionals when speaking to voice-hearers.
AVH: auditory verbal hallucination.
Finally, it is acknowledged that considerably less research has been conducted into external speech processing deficits in voice-hearing groups outside of schizophrenia. Phenomenologically, individuals with schizophrenia exhibit AVHs which are often more severe (e.g. occur at a higher frequency, with more negative content or with greater associated distress) than those of non-clinical voice-hearers or voice-hearers with another clinical diagnosis (Baumeister et al., 2017). Thus, a continuum of severity appears to exist. Given this, it may be safe to assume that these speech processing challenges will have a more severe presentation in an individual with schizophrenia compared to a voice-hearer with a different diagnosis. However, this is a proposition which needs further investigation.
Overall, empirical research has highlighted extensive external speech processing deficits in voice-hearers. Challenges while processing speech are a separate, yet compounding, factor for the distraction and distress associated with AVH, and need to be front-of-mind during clinical encounters. An awareness of these difficulties and effective adaptations to communication will optimise the chances for a voice-hearer to process the content of speech accurately and efficiently. This may increase how much information a voice-hearer is able to retain and correctly encode. Furthermore, changes in communication may improve patient–clinician relationships, leading to better clinical outcomes for voice-hearers.
Footnotes
Declaration of Conflicting Interests
The author(s) declared the following potential conflicts of interest with respect to the research, authorship and/or publication of this article: D.J.C. has received grant monies for research from Eli Lilly, Janssen Cilag, Roche, Allergen, Bristol-Myers Squibb, Pfizer, Lundbeck, AstraZeneca and Hospira; travel support and honoraria for talks and consultancy from Eli Lilly, Bristol-Myers Squibb, AstraZeneca, Lundbeck, Janssen Cilag, Pfizer, Organon, Sanofi-Aventis, Wyeth, Hospira, Servier and Seqirus; is a current or past Advisory Board Member for Lu AA21004: Lundbeck; Varenicline: Pfizer; Asenapine: Lundbeck; Aripiprazole LAI: Lundbeck; Lisdexamfetamine: Shire; Lurasidone: Servier; Brexpiprazole: Lundbeck; Treatment Resistant Depression: LivaNova; Cariprazine: Seqirus. He is founder of the Optimal Health Program, currently operating as Optimal Wellness, and is part owner of Clarity Healthcare. He is on the scientific advisory of The Mental Health Foundation of Australia. He does not knowingly have stocks or shares in any pharmaceutical company. He was a member of the original schizophrenia and related disorders CGP working group: the views expressed here have been endorsed by neither that group nor the RANZCP.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship and/or publication of this article: This work was supported by the Australian Government Research Training Program (S.E.R.), the Australian National Health and Medical Research Council (NHMRC; fellowship to S.L.R. [ID: 1154651] and a project grant to S.L.R. [ID: 1060664]) and a Barbara Dicker Brain Sciences Foundation project grant (S.L.R.).
