Abstract

Individuals who hear voices (i.e. auditory verbal hallucinations) have been reported to exhibit a range of difficulties when listening to and processing the speech of other people. These speech processing challenges are observed even in the absence of hearing voices; however, some appear to be exacerbated during periods of acute symptomology. In this advisory piece, key findings from pertinent empirical research into external speech processing in voice-hearers are presented with the intention of informing healthcare professionals. Our view is that a better understanding of the speech processing deficits faced by individuals who hear voices will enable more effective communication with such patients.

Keywords

Auditory verbal hallucination, external speech listening, functional neuroimaging, schizophrenia, schizoaffective disorder, subverbal communication, verbal communication, voice-hearing

Introduction

The experience of voice-hearing (i.e. auditory verbal hallucination; AVH) refers to the perception of voices in the absence of an external auditory input. Although commonly associated with schizophrenia, voice-hearing is reported in numerous clinical conditions, as well as by many individuals within the non-clinical population (Baumeister et al., 2017). Alongside an often challenging or distressing set of perceived psychotic experiences, individuals who hear voices have been reported to exhibit numerous difficulties in processing external speech (Conde et al., 2016). Importantly, these difficulties persist even if AVHs are effectively managed (Løberg et al., 2004). Thus, an awareness of these difficulties will aid in effective communication with people who either have a history of hearing voices or are actively hearing them. In this context, the training of Australian trainee psychiatrists in effective communication with individuals with psychosis is generally lacking (Ditton-Phare et al., 2015), despite this being a vital component of successful therapeutic relationships and hence improved psychiatric outcomes (Priebe et al., 2011). In light of this, pertinent neuropsychiatric research into the external speech processing deficits experienced by those who hear voices is highlighted below, followed by some suggestions on how to accommodate these deficits.

Subverbal communication

Underlying speech production and perception are numerous basic auditory processes, which constitute subverbal communication. Spectral aspects of subverbal communication refer to the modulation of the tone of speech sounds, including the pitch (i.e. fundamental frequency), amplitude (i.e. loudness) and timbre (i.e. colour or quality) of speech. A wealth of information is communicated through these spectral components, including emotion, mood and cues for interpreting more complex linguistic information (Frühholz and Belin, 2018). Thus, they enable a listener to navigate the verbal information of speech efficiently and accurately, providing important information on the context and intended meaning of words spoken. Greater spectral processing deficits are a highly replicated finding in voice-hearers with schizophrenia compared to those with no history of AVH (McLachlan et al., 2013; Rossell and Boundy, 2005).

Pitch, which describes the relative highness or lowness of a sound’s frequency, is the main spectral acoustic correlate of tone and intonation, and arguably one of the most important carriers of meaning in auditory information. Poor pitch discrimination capabilities are among the most well-documented deficits in voice-hearers (McLachlan et al., 2013). Furthermore, these pitch processing challenges are underpinned by abnormalities in the functional organisation of the primary auditory cortex, the central receptor of auditory information in the brain (Doucet et al., 2019). The burden of pitch processing deficits becomes apparent when considering the use of pitch in speech, one such instance being its importance during the communication of emotion. For example, low speech pitch with little variability is associated with sadness, while high pitch with high variability is associated with happiness or anger (Frühholz and Belin, 2018). Voice-hearers experience a decreased ability to identify pitch-based emotion accurately (Rossell and Boundy, 2005). A second central function of pitch in speech is its role in the perception of voice identity. Numerous speaker attributes – such as physical (e.g. age and gender), social (e.g. regional origin) and personal attributes (e.g. competence and friendliness; Frühholz and Belin, 2018) – can be determined from the pitch of a voice. A wealth of literature has shown that pitch perception difficulties result in voice-hearers having challenges in accurately identifying and characterising the source of a voice (Conde et al., 2016). This emphasises that not only do voice-hearers struggle to differentiate between pitches, but that such deficits can lead them to make incorrect assumptions about the context and motivations behind a conversation, or the identity and emotional state of the speaker.

Timbre also plays a key role in identifying voices. Timbre describes the complex overtones of a voice which create its unique qualities, allowing us to recognise separate voices. When listening to speech, a non-voice-hearing individual will register the timbre of a voice and associate it with the memory of an individual, to help differentiate between and recognise voices, a process referred to as source monitoring (Brébion et al., 2002). Source monitoring deficits are a well-documented, frequently reported challenge in people who hear voices (Brébion et al., 2002; Johns et al., 2001). For these individuals, the subtlety of distinction between voices is lacking and as a result, voices, including their own, are misattributed to an incorrect source or simply not recognised. This is important to be aware of in clinical environments such as hospitals and emergency services, where high volumes of staff work on rotating rosters. Thus, mental health professionals are urged to be cognisant of source monitoring deficits in voice-hearers.

Alongside these spectral processing deficits, temporal aspects of subverbal communication – i.e. the speed and duration of speech – can also be challenging for voice-hearers. However, such temporal processing deficits are observed in psychiatric groups (e.g. schizophrenia) regardless of voice-hearing (McLachlan et al., 2013). As temporal processing challenges are not a distinct deficit of voice-hearers, they are not a key focus of this article. Nevertheless, a moderate, even pace of speech, and an awareness that long conversations may be challenging, are recommended while speaking to any patient with psychosis.

Verbal communication

In addition to the important subverbal features described above, a critical part of understanding externally spoken speech involves the accurate processing of its verbal content, ranging from speech sounds (e.g. ‘ah’ and ‘ba’), to single words, and complex sentences. When moving from subverbal to verbal communication, it becomes evident that there are additional speech processing challenges for voice-hearers. Even when listening to speech sounds, a voice-hearer is more likely to mishear what has been said than a non-voice hearer with the same diagnosis (Steinmann et al., 2017). This is evidenced by studies of dichotic listening, which involve presenting different speech sounds to each ear. Due to the contralateral and left hemispheric dominance of auditory processing in the cortex, a robust right ear advantage is reported in the non-voice hearing population: i.e. they will report hearing the syllable presented to the right ear over the left ear (Hugdahl, 2003). In contrast, voice-hearers reliably exhibit a loss of right ear advantage (Green et al., 1994; Løberg et al., 2004), which represents deviant activity of and between core auditory regions (Hugdahl, 2003).

With an increasing complexity of verbal language, further issues arise. When listening to whole words or sentences, voice-hearers can completely mishear what has been said (Hoffman et al., 1999). In such instances, what a voice-hearer may expect to hear can override what has actually been said (Conde et al., 2016). Other studies have shown that voice-hearers struggle to extract, retain and encode the meaning of speech accurately (Siddi et al., 2017). These challenges are underscored by widespread changes in relevant cortical function. When voice-hearers are listening to whole words and sentences, regions involved with not only rudimentary audition, but also with the processing and evaluation of language, show deviant function and connectivity (Richards et al., 2021). To assist voice-hearers in processing speech as accurately as possible, the clarity and simplicity of speech content are pertinent, as is continually monitoring and receiving confirmation from the individual that they have understood what has been said.

The content of speech

It is also essential to consider the content of speech while conversing with voice-hearers. This includes both how things are said, as well as the topic of conversation. Two themes are relevant: the use of pragmatics and discussing emotional topics. Pragmatics refers to the contextual use of language to convey specific meaning. A key feature of pragmatics in communication is the use of non-literal language, which requires a listener to use inference to come to the correct conclusions about what the speaker intends to communicate. Therefore, accurate pragmatic processing is important for social cognition, theory of mind and fostering relationships (Frühholz and Belin, 2018). Many studies of pragmatic processing have investigated individuals with schizophrenia as a broad diagnostic group, where severe, widespread challenges in pragmatic comprehension are commonly reported (Champagne-Lavau and Stip, 2010). Further work has shown that voice-hearers are at a more pronounced disadvantage during pragmatic comprehension compared to non-voice hearers with schizophrenia, which may represent a higher degree of concrete thinking.

Metaphor is a type of figurative language which requires acute pragmatic comprehension to decipher meaning. To be interpreted correctly, this form of speech asks the listener to create a mental image and look past a literal definition. Voice-hearers have been found to have an impaired ability to create mental images (Conde et al., 2016). It is therefore likely that metaphor comprehension is challenging for voice-hearers (Siddi et al., 2016). However, other types of figurative language which do not require mental imagery, such as idioms, are also challenging for these individuals (Siddi et al., 2016). As such, it is recommended that figurative language is avoided when communicating with voice-hearers.

To our knowledge, other types of pragmatic comprehension have not been explicitly investigated in voice-hearers relative to non-voice hearers with the same diagnosis. However, it is likely that further distinctions between these two groups exist. For instance, sarcasm, often used to convey humour, amicability and hostility, relies heavily on pitch modulation to give cues to negate the literal meaning of a statement and distinguish between sincerity and deception (Frühholz and Belin, 2018). It is therefore not unreasonable to assume that voice-hearers, as a group with notable pitch processing deficits (McLachlan et al., 2013), will have more pronounced sarcasm detection challenges than non-voice hearers. Each of these challenges in pragmatic comprehension highlights the fact that voice-hearers are less likely to decipher the true meaning of non-literal language in speech, which may result in jokes being misinterpreted, abstract ideas being interpreted in a concrete manner or nuances of conversation being missed.

Finally, we consider emotional topics during conversations with voice-hearers. In voice-hearers, more so than in non-voice hearers, emotion in speech automatically captures and maintains attention, leading to an impaired ability to focus on the less-emotive contents of speech (Alba-Ferrara et al., 2013). Similarly, voice-hearers tend to have difficulty in accurately shifting their attention when emotion is present in speech, resulting in emotional expressions being misidentified and the true meaning of speech being misinterpreted (Rossell et al., 2013). While these challenges are observed with emotion of any valence, evidence suggests that negative emotional content amplifies speech processing deficits in voice-hearers. For example, voice-hearers are more likely to misidentify emotions (Rossell and Boundy, 2005) or make source monitoring errors (Johns et al., 2001) when listening to negative rather than positive emotions. Furthermore, negative topics are known to exacerbate AVH symptomology (Freeman and Garety, 2003), which can consequently lead voice-hearers to withdraw from conversation and experience difficulties connecting with others (Sheaves et al., 2020).

In clinical settings, these findings are relevant for discussing distressing topics such as people’s voice-hearing experiences. Many voice-hearers report that speaking directly about the content of their AVH can be emotionally challenging (Freeman and Garety, 2003). Therefore, increased speech processing deficits should be anticipated during conversations about voice-hearing. Nonetheless, there is evidence that with practice, people talking about their AVHs with someone they trust can ease the burden and paranoia caused by distressing, derogatory and threatening voice content (Sheaves et al., 2020). Although little research exists on the effects of discussing other distressing topics with voice-hearers, the likelihood of such conversations during clinical encounters is high. Of importance are conversations around adverse life events or negative self-schemas, which occur at notably high rates in voice-hearers (Baumeister et al., 2017). Due to their personal and negative themes, similar exacerbations to speech processing deficits should be expected during these conversations. In sum, in environments where emotions are being discussed or expressed, clinicians need to be aware of the aggravating effect that this may have on speech processing challenges, as well as understand that voice-hearers are prone to misinterpret emotional expressions and topics.

The effects of voice-hearing severity

Speech processing deficits are reported in individuals with a history of voice-hearing, even when they are not actively hallucinating (Løberg et al., 2004). This is due to the enduring changes in cortical regions involved with audition and language processing observed in these individuals (Richards et al., 2021). As such, health professionals should anticipate speech processing challenges in any individual who endorses a history of AVH. However, it is important to highlight that the magnitude of some of the speech processing deficits outlined increases with voice-hearing severity. Of note, increased source monitoring deficits and an increased likelihood of voice-hearers mishearing speech should be expected during periods of acute symptomology (Aleman et al., 2003; Brébion et al., 2002; Steinmann et al., 2017). Therefore, it is critical that healthcare professionals interacting with any individual who is actively hearing voices have an awareness of these speech processing deficits. This is particularly important while communicating with voice-hearers during a first psychotic episode. These individuals are experiencing a range of distressing psychotic symptoms for the first time, which are likely to worsen their diminished ability to understand what is being said to them. To our knowledge, a comprehensive characterisation of the effects of increased AVH severity on the full range of speech processing deficits is yet to be conducted.

Further cognitive challenges confound the speech processing deficits observed while an individual is actively hearing voices. Two basic requirements for processing speech are attentional and intentional inhibitory processes. In combination, these inhibitory processes enable the suppression of irrelevant information to optimise control over the perception, encoding and behavioural responses to auditory information. Compromised auditory attention and intentional inhibitory control in people who are actively hearing voices are well-documented and appear to worsen with increased AVH severity (Waters et al., 2003). These individuals are reported to have an internally orientated bias for auditory information, and thus pay attention to their AVH preferentially, instead of the external speech in their environment. Poor attention and intentional inhibitory capabilities in voice-hearers may also result in poor concentration, forgetfulness and disorganised response patterns, particularly in busy, highly stimulating environments and especially during periods of acute symptomology.

Recommendations and conclusion

As evidenced above, individuals who hear voices experience a number of deficits in their ability to process externally spoken speech accurately. The omnipresent nature of these external speech processing deficits needs to be considered and accommodated when conversing with people who hear voices, regardless of whether or not they are actively hallucinating at the time of the conversation. It should also be appreciated that when voices are present, some of the speech processing deficits outlined above can be amplified. In situations when voice-hearing is present, an individual’s abilities to hear speech correctly, identify voices, process information and focus on a conversation may further decline. In response to the issues highlighted, Table 1 presents some broad recommendations on how to adapt speech when conversing with individuals who hear voices. When considering these recommendations, it should be noted that some of the evidence base from which they have been derived, in particular the neuroimaging data, has been previously critiqued as suffering from generally poor and inconsistent methodological quality (Richards et al., 2021). To the best of our knowledge, external speech processing deficits in voice-hearers are synergistic and no deficit is more important than another. As such, there is no level of priority within this list of recommendations; they should be adopted as a whole and evenly prioritised.

Table 1.

Recommendations for clinicians and allied health professionals when speaking to voice-hearers.

Key empirical finding	Clinical recommendation
Many speech processing challenges are present in voice-hearers, even if they are not currently hallucinating.^a	Ensure that the content of speech is clear and ask for confirmation from the patient that they understand what is being said. Avoid clinical jargon where possible.
Voice-hearers struggle to process several subverbal cues.^b	Use an even tone and pace in your voice at all times.
Source monitoring deficits are a significant difficulty for voice-hearers.^c	A reduced ability to recognise the identity of voices should be expected. In busy environments where voice-hearers are interacting with numerous unfamiliar people, frequent and repetitive introductions are needed, including both the name and role of the health professional. Visible nametags are recommended.
Poor pragmatic comprehension is associated with voice-hearing.^d	Complex language should be avoided where possible. This includes metaphor, sarcasm, irony or idioms. Instead, use simple language.
Emotional content in speech increases speech processing deficits in voice-hearers.^e	Emotional topics should be approached with caution. This is particularly important when discussing topics with a negative connotation, such as distressing voice content and adverse life events.
Voice-hearers have poor attention and intentional inhibitory control.^f	Be aware of lapses in attention and ensure that the patient is focussing on the conversation.
Many speech processing challenges are amplified if the patient is experiencing AVH.^g	Enquire or assess if the patient is experiencing AVH at the beginning of the conversation. If AVH is present, expect the patient to have increased speech processing challenges and, based on the recommendations above, adjust your conversation style.

AVH: auditory verbal hallucination.

^a Conde et al. (2016), Hoekert et al. (2007).

^b Hoekert et al. (2007), McLachlan et al. (2013), Rossell and Boundy (2005).

^c Brébion et al. (2002), Johns et al. (2001).

^d Conde et al. (2016), McLachlan et al. (2013), Siddi et al. (2016).

^e Alba-Ferrara et al. (2013), Freeman and Garety (2003), Johns et al. (2001), Rossell and Boundy (2005), Rossell et al. (2013), Sheaves et al. (2020).

^f Waters et al. (2003).

^g Aleman et al. (2003), Brébion et al. (2002), Steinmann et al. (2017), Waters et al. (2003).

Finally, it is acknowledged that considerably less research has been conducted into external speech processing deficits in voice-hearing groups outside of schizophrenia. Phenomenologically, individuals with schizophrenia exhibit AVHs which are often more severe (e.g. occur at a higher frequency, with more negative content or with greater associated distress) than non-clinical voice-hearers or voice-hearers with another clinical diagnosis (Baumeister et al., 2017). Thus, a continuum of severity appears to exist. Given this, it may be safe to assume that these speech processing challenges will have a more severe presentation in an individual with schizophrenia compared to a voice-hearer with a different diagnosis. However, this is a proposition which needs further investigation.

Overall, empirical research has highlighted extensive external speech processing deficits in voice-hearers. Challenges while processing speech are a separate, yet compounding, factor for the distraction and distress associated with AVH, and need to be front-of-mind during clinical encounters. An awareness of these difficulties and effective adaptations to communication will optimise the chances for a voice-hearer to process the content of speech accurately and efficiently. This may help a voice-hearer to retain and correctly encode more information. Furthermore, changes in communication may improve patient–clinician relationships, leading to better clinical outcomes for voice-hearers.

Footnotes

Declaration of Conflicting Interests

The author(s) declared the following potential conflicts of interest with respect to the research, authorship and/or publication of this article: D.J.C. has received grant monies for research from Eli Lilly, Janssen Cilag, Roche, Allergen, Bristol-Myers Squibb, Pfizer, Lundbeck, AstraZeneca and Hospira; travel support and honoraria for talks and consultancy from Eli Lilly, Bristol-Myers Squibb, AstraZeneca, Lundbeck, Janssen Cilag, Pfizer, Organon, Sanofi-Aventis, Wyeth, Hospira, Servier and Seqirus; is a current or past Advisory Board Member for Lu AA21004: Lundbeck; Varenicline: Pfizer; Asenapine: Lundbeck; Aripiprazole LAI: Lundbeck; Lisdexamfetamine: Shire; Lurasidone: Servier; Brexpiprazole: Lundbeck; Treatment Resistant Depression: LivaNova; Cariprazine: Seqirus. He is founder of the Optimal Health Program, currently operating as Optimal Wellness, and is part owner of Clarity Healthcare. He is on the scientific advisory of The Mental Health Foundation of Australia. He does not knowingly have stocks or shares in any pharmaceutical company. He was a member of the original schizophrenia and related disorders CGP working group: the views expressed here have been endorsed by neither that group nor the RANZCP.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship and/or publication of this article: This work was supported by the Australian Government Research Training Program (S.E.R.), the Australian National Health and Medical Research Council (NHMRC; fellowship to S.L.R. [ID: 1154651] and a project grant to S.L.R. [ID: 1060664]) and a Barbara Dicker Brain Sciences Foundation project grant (S.L.R.).

ORCID iDs

Sophie E Richards

Sean P Carruthers

Susan L Rossell

References

1. Alba-Ferrara, De Erausquin, Hirnstein, et al. (2013) Emotional prosody modulates attention in schizophrenia patients with hallucinations. Frontiers in Human Neuroscience 7: 59.

2. Aleman, Böcker KBE, Hijman, et al. (2003) Cognitive basis of hallucinations in schizophrenia: Role of top-down information processing. Schizophrenia Research 64: 175–185.

3. Baumeister, Sedgwick, Howes, et al. (2017) Auditory verbal hallucinations and continuum models of psychosis: A systematic review of the healthy voice-hearer literature. Clinical Psychology Review 51: 125–141.

4. Brébion, Gorman, Amador, et al. (2002) Source monitoring impairments in schizophrenia: Characterisation and associations with positive and negative symptomatology. Psychiatry Research 112: 27–39.

5. Champagne-Lavau and Stip (2010) Pragmatic and executive dysfunction in schizophrenia. Journal of Neurolinguistics 23: 285–296.

6. Conde, Gonçalves and Pinheiro (2016) A cognitive neuroscience view of voice-processing abnormalities in schizophrenia: A window into auditory verbal hallucinations? Harvard Review of Psychiatry 24: 148–163.

7. Ditton-Phare, Halpin, Sandhu, et al. (2015) Communication skills in psychiatry training. Australasian Psychiatry 23: 429–431.

8. Doucet, Luber, Balchandani, et al. (2019) Abnormal auditory tonotopy in patients with schizophrenia. NPJ Schizophrenia 5: 1–6.

9. Freeman and Garety (2003) Connecting neurosis and psychosis: The direct influence of emotion on delusions and hallucinations. Behaviour Research and Therapy 41: 923–947.

10. Frühholz and Belin (2018) The Oxford Handbook of Voice Perception. Oxford: Oxford University Press.

11. Green, Hugdahl and Mitchell (1994) Dichotic listening during auditory hallucinations in patients with schizophrenia. The American Journal of Psychiatry 151: 357–362.

12. Hoekert, Kahn, Pijnenborg, et al. (2007) Impaired recognition and expression of emotional prosody in schizophrenia: Review and meta-analysis. Schizophrenia Research 96: 135–145.

13. Hoffman, Rapaport, Mazure, et al. (1999) Selective speech perception alterations in schizophrenic patients reporting hallucinated ‘voices’. American Journal of Psychiatry 156: 393–399.

14. Hugdahl (2003) Dichotic listening in the study of auditory laterality. In: Hugdahl and Davidson (eds) The Asymmetrical Brain. Cambridge, MA: MIT Press, pp. 441–475.

15. Johns, Rossell, Frith, et al. (2001) Verbal self-monitoring and auditory verbal hallucinations in patients with schizophrenia. Psychological Medicine 31: 705–715.

16. Løberg E-M, Jørgensen and Hugdahl (2004) Dichotic listening in schizophrenic patients: Effects of previous vs. ongoing auditory hallucinations. Psychiatry Research 128: 167–174.

17. McLachlan, Phillips, Rossell, et al. (2013) Auditory processing and hallucinations in schizophrenia. Schizophrenia Research 150: 380–385.

18. Priebe, Richardson, Cooney, et al. (2011) Does the therapeutic relationship predict outcomes of psychiatric treatment in patients with psychosis? A systematic review. Psychotherapy and Psychosomatics 80: 70–77.

19. Richards, Hughes, Woodward, et al. (2021) External speech processing and auditory verbal hallucinations: A systematic review of functional neuroimaging studies. Neuroscience & Biobehavioral Reviews 131: 663–687.

20. Rossell and Boundy (2005) Are auditory-verbal hallucinations associated with auditory affective processing deficits? Schizophrenia Research 78: 95–106.

21. Rossell, Van Rheenen, Groot, et al. (2013) Investigating affective prosody in psychosis: A study using the comprehensive affective testing system. Psychiatry Research 210: 896–900.

22. Sheaves, Johns, Černis, et al. (2020) The challenges and opportunities of social connection when hearing derogatory and threatening voices: A thematic analysis with patients experiencing psychosis. Psychol Psychother 94: 341–356.

23. Siddi, Petretto, Burrai, et al. (2017) The role of set-shifting in auditory verbal hallucinations. Comprehensive Psychiatry 74: 162–172.

24. Siddi, Petretto, Scanu, et al. (2016) Deficits in metaphor but not in idiomatic processing are related to verbal hallucinations in patients with psychosis. Psychiatry Research 246: 101–112.

25. Steinmann, Leicht, Andreou, et al. (2017) Auditory verbal hallucinations related to altered long-range synchrony of gamma-band oscillations. Scientific Reports 7: 1–10.

26. Waters FAV, Badcock, Maybery, et al. (2003) Inhibition in schizophrenia: Association with auditory hallucinations. Schizophrenia Research 62: 275–280.

Speech processing in voice-hearers: Bridging the gap between empirical research and clinical implications
