Abstract
Background and aims
Fascinations for or aversions to particular sounds are a familiar feature of autism, as is an ability to reproduce another person's utterances, precisely copying the other person's prosody as well as their words. Such observations seem to indicate not only that autistic people can pay close attention to what they hear, but also that they have the ability to perceive the finer details of auditory stimuli. This is consistent with the previously reported consensus that absolute pitch is more common in autistic individuals than in neurotypicals. We take this to suggest that autistic people have perception that allows them to pay attention to fine details. It is important to establish whether or not this is so as autism is often presented as a deficit rather than a difference. We therefore undertook a narrative literature review of studies of auditory perception, in autistic and nonautistic individuals, focussing on any differences in processing linguistic and nonlinguistic sounds.
Main contributions
We find persuasive evidence that nonlinguistic auditory perception in autistic children differs from that of nonautistic children. This is supported by the additional finding of a higher prevalence of absolute pitch and enhanced pitch discriminating abilities in autistic children compared to neurotypical children. Such abilities appear to stem from atypical perception, which is biased toward local-level information necessary for processing pitch and other prosodic features. Enhanced pitch discriminating abilities tend to be found in autistic individuals with a history of language delay, suggesting possible reciprocity. Research on various aspects of language development in autism also supports the hypothesis that atypical pitch perception may account for observed differences in language development in autism.
Conclusions
The results of our review of previously published studies are consistent with the hypothesis that auditory perception, and particularly pitch perception, in autism is different from the norm but not always impaired. Detail-oriented pitch perception may be an advantage given the right environment. We speculate that unusually heightened sensitivity to pitch differences may be at the cost of the normal development of the perception of the sounds that contribute most to early language development.
Implications
The acquisition of speech and language may be a process that normally involves an enhanced perception of speech sounds at the expense of the processing of nonlinguistic sounds, but autistic children may not give speech sounds this same priority.
Introduction
Autism spectrum disorders (ASD) are heterogeneous neurodevelopmental conditions. There are individuals who are nonverbal with learning disability, and individuals with impaired nonverbal communication but no delays or obvious anomalies in their language acquisition. Although the manifestation of autistic symptoms changes with time and can be alleviated by intervention (Warren et al., 2011), ASD usually persists throughout life. Conditions vary in severity without sharp distinction between normality and pathology (Tantam, 2012). High-functioning autism and Asperger syndrome are part of ASD and not independent diagnoses in the latest editions of the Diagnostic and Statistical Manual of Mental Disorders of the American Psychiatric Association (the 5th edition) and the International Classification of Diseases of the World Health Organization (the 11th edition). They are normally placed at the high-functioning end of the spectrum as individuals with those conditions are verbal and do not have learning disability (American Psychiatric Association, 2013; World Health Organization, 2019). Those terms are still used by some (National Autistic Society, 2023) but we use “verbal autistic individuals with language delay” for high-functioning autism, and “verbal autistic individuals without language delay” for Asperger syndrome in this article wherever possible, unless using the old terms seems better for the purpose of clarity.
Also included in the latest diagnostic criteria are differences in sensory perception. ASD is often accompanied by either hypersensitivity or hyposensitivity to a wide range of stimuli. Hypersensitivity may mean that intense sensations or particular sensations, for example, the sound of a baby crying, may trigger challenging behavior (Myles & Simpson, 2002). According to Marco et al., 96% of children with ASD report hypersensitivity or hyposensitivity in more than one domain (Marco et al., 2011). It is possible that a child who is hypersensitive in one sensory modality is hyposensitive in another (Tantam, 2012). Both hyperacusis, or increased auditory sensitivity (American Speech-Language-Hearing Association, 2022), and hearing impairment are more prevalent in autism (Rosenhall et al., 1999).
Autistic individuals, like other neurodiverse people, may be “savants” with “islets of ability” where particular abilities are much higher than their overall skill level (Uddin, 2022). Exceptional ability, in which the ability exceeds that of neurotypical peers, does occur in these “islets” but it is rare (Frith, 1989). It has been argued that this unexpected ability is due to a practice effect, consequent on the repetition of an activity related to the ability, by which the child is fascinated (Heaton & Wallace, 2004). This profile is sometimes called savant syndrome, a condition in which exceptional abilities are accompanied by serious mental disabilities. The prevalence of savant syndrome in the autistic population is estimated to be up to 10%, whereas approximately 50% of people with savant syndrome have ASD (Treffert, 2009). Autism and savant syndrome do not always co-occur, but are understood to be closely linked (Heaton & Wallace, 2004). Autistic individuals without enhanced skills also display better abilities in specific areas such as visual searching skills (O’Riordan et al., 2001; Samson et al., 2012).
Enhanced skills are also observed in the auditory modality. Auditory perception is “the recognition, discrimination, and interpretation of sounds, including frequency, pitch, timbre, loudness, speech, and music, together with sound localization, temporal analysis, and the perceptual organization of auditory phenomena” (Oxford Reference, 2021). It differs from auditory sensation, or hearing, which is “the physiological process of attending to sound within one's environment” (University of Minnesota Libraries, 2021). According to percussionist Evelyn Glennie, who is profoundly deaf, “Listening is an act of choice which is not only about hearing a sound. Anyone can engage in the act of listening should they make that choice” (Glennie, 2019). Observations of autistic individuals show that they are likely to listen to sounds actively (Ockelford, 2013). Enhanced sensitivity to pitch has been widely reported, as well as auditory sensitivity to particular sounds or fascinations for environmental sounds. Music processing including beat perception may be relatively spared in autistic people (Dahary et al., 2023) who have other severe cognitive difficulties. Studies on music therapy have shown its benefits to autistic individuals (Cahart et al., 2022) supporting their appreciation of music (Geretsegger et al., 2014; Kim et al., 2009).
Sensitivity to sounds may extend to speech perception that relies on auditory perception. Apparently unimpaired language has been used to define what was called Asperger syndrome. Whereas many with this condition are highly verbal, their speech is often reported as unusual (Landa, 2000), particularly in terms of prosody (Klin et al., 2000; Shriberg et al., 2001). Mottron, Ostrolenk, and Gagnon propose that autistic individuals may master language via a nonsocial route, which also serves the acquisition of special savant abilities such as musical, artistic, or calculation skills (Mottron et al., 2021). According to them, some autistic people develop language using their ability to detect and use complex embedded structures that are shared by language and materials of their special interests. They argue that this nonsocial path is an alternative method to language acquisition and an extension to human capacities rather than deficits, although it may be more elaborative and can potentially cause language delays, peculiarities or failure (Mottron et al., 2021). Despite unusual prosody in their own speech, people with Asperger syndrome are observed to copy other people's prosody when they mimic, so that the prosodic impairment in their own speaking voice is replaced by unimpaired prosody in their mimicking voice (Tantam, 2000). One way of interpreting this is that autistic people can remember prosody and content together, as a package, whereas they are more usually stored separately (Stevens, 2004). This is likely why people with good mimicking abilities are rare. One can repeat content but repeating prosody as well is not easy. This would be consistent with anecdotal reports that “foreign accent syndrome,” in which a native speaker habitually speaks with a foreign accent, is more common in autism (Tantam, 2000). It is also consistent with the reduced use of their native regional dialect by autistic children and adolescents (Matsumoto & Sakihara, 2011; Wire, 2005).
80% of instances of foreign accent syndrome reported in a literature review involve disruption of networks linking left and right motor cortical areas associated with phonation (Higashiyama et al., 2021). This would be consistent with the right hemisphere normally being the common pathway for encoding prosody and the left for syntax and semantics, with the networks between them being required to synthesize the encoded information into motor control of speech apparatus (Friederici, 2011; McWhirter et al., 2019). The almost direct connection between the input and the output in mimicking indicates autistic people correctly perceive the utterances of other people and can reproduce them. This observation further indicates that their auditory perception is intact and potentially detail-oriented, although it may not contribute to their prosody when they speak.
Up to now, studies that investigated auditory perception in autism have reported either deficits or superior abilities. In a recent literature review, Chen et al. investigated the potential factors that account for inconsistent findings in pitch perception, and whether autistic people have enhanced pitch processing abilities. Their findings indicated that studies using pitch intervals and studies using isolated pitch differed in their results. It was isolated pitch as the stimulus that was associated with superior auditory abilities in autistic people, who surpassed neurotypicals in tasks using isolated pitch rather than in those using pitch intervals. This advantage for pitch perception was found to coexist with speech deficits in autism (Chen et al., 2022). In the current article we independently reviewed studies on auditory perception in autism and several studies indicated that some autistic individuals with language delay have atypically enhanced sensitivity to pitch (Bonnel et al., 2010; Heaton et al., 2008c). Pitch perception in autism has been less studied in recent years although heightened pitch perception and absolute pitch are often reported in autistic individuals (Frith, 1989). Absolute pitch is commonly known as “the ability to identify or produce the pitch of a sound without any reference point” (Zatorre, 2003), and its prevalence in the general population is estimated to be less than 0.01%, that is less than one in 10,000, in North America and Europe (Bachem, 1955; Profita et al., 1988, cited in Takeuchi & Hulse, 1993). This ability is generally considered to be an element of musical giftedness and is not known to directly affect language development. It is unlikely that pitch perception is the only aspect of auditory perception that may be different in autism as audiovisual integration and auditory temporal processing are observed to be impaired in autistic individuals (Chan et al., 2023; Foss-Feig et al., 2017; Huang et al., 2018; Stevenson et al., 2014).
However, whereas those atypicalities are reported, it is not very clear how uncommon many of such deficits or abilities are, as it is also unclear how uncommon they are in the general population. Absolute pitch, on the other hand, is found in autistic individuals across different age groups (Mayer et al., 2016), which may be unique as some atypicalities seem to change or diminish through development, and its prevalence is reported to be 5% to 11% (Rimland & Fein, 1988; DePape et al., 2012, cited in Mottron et al., 2013). Set against a general-population prevalence of less than 0.01%, this is at least 500 to 1,100 times higher, which seems to be a significant difference.
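As a quick check on this comparison, the ratio can be worked through from the two figures quoted above: a reported prevalence of 5% to 11% in autism and a general-population prevalence of less than 0.01%. The symbols $p_{\mathrm{ASD}}$ and $p_{\mathrm{gen}}$ are introduced here purely for illustration. Because the general-population figure is an upper bound, the resulting ratios should be read as lower bounds.

```latex
\[
p_{\mathrm{gen}} < 0.01\% = 0.0001
\quad\Longrightarrow\quad
\frac{p_{\mathrm{ASD}}}{p_{\mathrm{gen}}} > \frac{0.05}{0.0001} = 500
\quad\text{(lower estimate, } p_{\mathrm{ASD}} = 5\%\text{)},
\]
\[
\frac{p_{\mathrm{ASD}}}{p_{\mathrm{gen}}} > \frac{0.11}{0.0001} = 1100
\quad\text{(upper estimate, } p_{\mathrm{ASD}} = 11\%\text{)}.
\]
```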
In this article we focussed on pitch perception and particularly absolute pitch in autism for its unusual profile based on what we already know, and for its potential consequences. To our knowledge, the present article is original in its view on heightened sensitivity to pitch, which seems to stem from the processing bias towards low-level information observed in autism. Although heightened pitch perception itself is unlikely to cause developmental issues, findings suggest that such detail-oriented pitch perception may cause atypical attention or processing of information potentially during perceptual narrowing, with cascading effects on the development of language in autism.
Method
As the present article is focussed on the listening ability in autism that is not limited by whether or not the sounds are speech related, the term “auditory perception” was chosen to search for relevant papers. Academic papers were searched using the key words “autism”, “ASD”, “auditory”, and “auditory perception” in PubMed, APA PsycArticles and Mendeley. When the abstracts were screened, the papers that were initially retrieved were found to have been published mainly in the past 20 years and tended to focus on speech related sounds such as phonemes. As performance in processing both linguistic and nonlinguistic sounds was of interest, we selected those papers that focussed on features relevant to both linguistic and nonlinguistic sounds such as pitch. We excluded studies that focussed on speech sound perception. Papers were included if their main investigation was auditory perception of people who were diagnosed with autism regardless of their verbal and cognitive abilities or age. As musical training is known to improve auditory perception of both nonlinguistic and linguistic stimuli, studies on musically trained subjects such as musical savants with autism were not included. Further papers were sourced through their referenced papers. All papers whose abstracts satisfied the criteria were read through for the final selection (see Table 1).
Table 1. Details of the reviewed studies.
ASD: autism spectrum disorders; WCC: weak central coherence theory; AS: Asperger syndrome; HFA: high-functioning autism.
Hypotheses and reports on auditory perception in autism
There are two main hypotheses as to why auditory perception might be atypical in autism: the weak central coherence theory and the enhanced perceptual functioning model. We first describe these hypotheses, along with miscellaneous alternative hypotheses to the two main ones, and then consider the findings from our selected papers in relation to each hypothesis.
The weak central coherence theory
Frith proposed the weak central coherence theory (WCC) to explain “a fragmented world” (Frith, 1989) of autistic individuals, particularly referring to their restricted, stereotyped, repetitive repertoire of interests. Taking the example of autistic children who construct a jigsaw puzzle using the shape of the edges of the pieces rather than the picture, Frith explained how this is a piecemeal process, as the aim is not to complete the picture but to enjoy fitting pieces together. She argued that for nonautistic people, the final picture is the meaning of the greater unit and the greater unit represents central coherence on a larger scale, which is the tendency to process information for “meaning and gestalt (global) form”, often at the expense of attention to or memory for details, as opposed to the pieces as fragments that autistic individuals appreciate. According to Frith, weak central coherence is a processing bias for local information that compromises the ability to see the bigger picture (Happé & Frith, 2006). WCC was thought to be advantageous for certain tasks such as the Block Design and Embedded Figures tests, where autistic children scored above average (Frith, 1989), but detrimental for other tasks such as homograph reading, where autistic individuals were found to perform worse (Hill & Frith, 2003). As central coherence is a cognitive style of information processing, weak central coherence in one modality implied that it could be present in other modalities. It was expected to be present in the auditory modality, where enhanced pitch perception and absolute pitch were observed in autistic individuals suggesting bias toward local processing (Heaton, 2019). Those contradictory reports led to further investigation of local and global information processing in the auditory modality, but the findings appear more inconsistent.
Autism and gestalt
WCC was built on the notion of gestalt and explained that autistic individuals are more influenced by the local parts of the gestalt than the gestalt itself (Hill & Frith, 2003). The idea of this particular cognitive style existed before WCC and was known as field independence/dependence, which is a cognitive style with a tendency to see objects as independent of or dependent on the field as a whole (Happé & Frith, 2006). Prizant (1983) compared echolalia to a gestalt style of language acquisition in typically developing children. He argued that echolalic utterances often produced by autistic children are similar to chunks or imitations of sentences that typically developing children produce beyond their true language level. In a gestalt style of language acquisition, such chunks or deferred imitations are considered to be single units or “gestalt” language forms. A child may use those memorized whole utterances or phrases with little understanding of internal structure or meaning. The use of such forms is an important part of typical language acquisition before a child becomes capable of analyzing those chunks. Prizant maintained that those gestalt forms are similar to echolalic utterances in autistic children, particularly delayed echolalia as they may be produced as whole units with little understanding of the internal structure (Prizant, 1983). Chunking is normally a process of dividing large pieces of information into smaller units to make them more manageable (American Psychological Association, 2022). As some echolalic utterances could be quite long, it is not certain whether all echolalic utterances can be explained in terms of chunks produced by typically developing children. However, the two are similar in that both are likely treated as single units, that is, as gestalt forms.
Autistic children may remember some auditory stimuli as units even when several different sound sources are involved, as observed in a musical savant who incorporated the sound of page turning when playing a piece by Chopin (Ockelford, 2013). According to Prizant, the literature on autism was “replete with the description of wholistic or gestalt learning patterns” as of 1983. The gap between global and local feature processing was also supported by studies on neural mechanisms, with more brain activation in the right hemisphere for global features and the left hemisphere for local features (Fink, 1997; Heinze et al., 1998) in the visual modality. However, there may be variations within the autistic population. A study on perception of facial and nonfacial stimuli in autistic children, including those with and without language delay, found that verbal autistic children varied in their abilities to perceive global information in both facial and nonfacial stimuli (Davies et al., 1994).
Absence of global interference
Foxton et al. (2003) investigated auditory perception in autism based on the very idea of WCC that the unusual cognitive style should be present across modalities. Unlike other studies that used short melodies and regarded the actual pitches in them as the local feature and the pitch contour as the global feature, Foxton et al. regarded the pitches as part of the local feature together with the exact time points of pitch changes, and an integration of those local features as the global feature forming an auditory gestalt. This seems to reflect actual music perception better than the contour paradigm often used in such studies (Heaton et al., 2007). A report that dyslexic children who are learning music tend to take a sequence of notes as a block rather than as individual notes (Macmillan, 2004) may also support the idea of pitch contours forming units. Foxton et al. predicted that the performance of the control group would deteriorate with increasing mismatches in the local features due to their gestalt percept, whereas that of the clinical group would not, as mismatches would not violate their coherent whole. The results were consistent with their hypothesis. As the aim of the study was to investigate whether WCC existed in the auditory domain in autism, they interpreted the results as consistent with WCC, and concluded that the results showed abnormal interactions between local and global auditory perception (Foxton et al., 2003).
Superior chord decomposing ability
As musical savants and nonautistic absolute pitch possessors demonstrate high levels of a chord decomposing ability, Heaton investigated chord decomposing abilities of autistic children, based on the hypothesis that nonsavant autistic individuals can decompose musical chords into individual components (Heaton, 2003). A group of autistic children and two control groups, one matched for verbal IQ and another matched for age, all without musical training, participated in three experiments. In experiment 1, the children were presented with four different notes and four different animals, each of which had its favorite note. Their task was to match the animals with their favorite notes upon hearing the notes in random order. The clinical group outperformed both comparison groups. In experiment 2, the children were presented with chords that had three of the four notes used in experiment 1 and were asked which animal's favorite note was missing. The clinical group performed better than the comparison groups. In experiment 3, 24 chords were played, each followed by either the notes used in the chord or different notes, and the children were asked if those notes were part of the chord they had just heard. There was no significant difference across the groups. Heaton interpreted the findings to mean that the superior chord decomposing abilities in the ASD group were dependent on the experimental conditions, which differed in memory demand. As sensory memory lasts only a few seconds and is vulnerable to interference (Cowan, 1984), experiment 2 would be difficult for those without absolute pitch, as they would be more dependent on auditory working memory. Experiment 3 did not demand such memory and the results did not show a group difference. Heaton therefore argued that the advantage reflects not enhanced perceptual processing ability but rather superior pitch memory and working memory, as enhanced perceptual processing ability would have helped the clinical group perform better in experiment 3 as well.
She concluded that a local bias toward featural processing at the perceptual level (Happé, 1999, cited in Heaton, 2003) best accounts for autistic cognition without deficits in global processing.
The enhanced perceptual functioning model
Mottron et al. (2006) proposed the enhanced perceptual functioning model to account for perception in autistic individuals. This was a modified version of their hierarchization deficit hypothesis, which was more similar to WCC but explained the local bias as the result of nonhierarchical access to information, favoring local targets, and not of the inability to integrate parts into wholes (Mottron et al., 2006). The hypothesis was proposed based on savant syndrome in the visual modality; however, it produced conflicting results when applied to another savant case in the auditory modality, where global perception was uncompromised. Further conflicts were found with nonsavant autistic individuals, thus the hierarchization deficit hypothesis was modified and the enhanced perceptual functioning model emerged.
Local versus global features
Mottron et al. (2000) tested local and global processing of musical stimuli in verbal autistic individuals with and without language delay, without musical experience. They used a series of melodic contours, the pattern of ups and downs in pitch that characterizes a melody (Dowling, 1978), that had mismatches in terms of pitch as well as keys (Mottron et al., 2000). Pitches were regarded as the local and melodic contours as the global features in this study. The basic melodic contour was either violated by a different pitch, or was transposed while keeping the melodic contour itself. The task was to make same/different judgments and both their hierarchization deficit hypothesis and WCC predicted different performances in the transposed condition for the clinical and control groups. The control group was expected to be able to recognize the basic melodic contour as long as the melodic contours were kept, while the clinical group would struggle with transposition as WCC predicted they tend to rely on absolute pitch rather than relative pitch. Based on the same idea, the clinical group was expected to differentiate contour-preserved and contour-violated melodies at a similar level. Results proved otherwise and both groups performed similarly in the contour-preserved and contour-violated conditions, with the clinical group showing generally better scores. Against expectation, they also exhibited the global advantage in the contour-violated condition over the contour-preserved condition, as well as better performance in local processing when they could use absolute pitch for discrimination. The study demonstrated superior pitch discriminating abilities in autistic individuals, in the presence of normal processing of global musical features, suggesting it is the enhancement of local feature processing, rather than weakened global feature processing.
Pure-tone discrimination in autistic individuals with language delay
Bonnel et al. investigated pitch sensitivity in verbal autistic individuals with language delay by testing their discrimination and categorization abilities of pure tones presented in isolation (Bonnel et al., 2003). This was based on research where verbal autistic individuals with language delay showed superiority in detecting pitch changes in melodies, as well as observations of musical savants and their possession of absolute pitch. Autistic individuals were predicted to excel in those tasks if they were better than nonautistic individuals at discriminating between pitches presented in the form of musical stimuli. The tasks required sensory memory to remember previous tones to judge if they were the same or different, or to categorize frequencies of tones as being high or low in pitch. Those tasks were expected to be demanding. The clinical group performed better than the control group in both tasks, and particularly so in the categorization task, as the performance of the control group deteriorated in the latter. The findings were interpreted as superior pitch sensitivity, as well as less sensitivity to tasks in autistic individuals, potentially relevant to the higher incidence of absolute pitch in musical savants. As this study demonstrated superior performance in autistic individuals and not in savants, the authors argued that peaks of abilities in pitch perception in autism are not relative in comparison to compromised abilities elsewhere, but rather absolute peaks of abilities.
Pure-tone discrimination in autistic individuals without language delay
Bonnel et al. (2010) performed a study on pure-tone pitch discrimination in verbal autistic individuals with and without language delay. The two groups were differentiated based on two criteria: participants without language delay had produced their first single words before 24 months and their first two-word phrases before 33 months. Pure-tone pitch discrimination, nonvocal and vocal timbre, and loudness were investigated, and the verbal autistic individuals with language delay, but not those without language delay, displayed enhanced auditory discrimination abilities across all four areas compared to the comparison group. This was consistent with findings from another study in which a subgroup of autistic children with a history of language delay and minimal musical training demonstrated remarkable pitch discrimination abilities (Heaton et al., 2008c). Bonnel et al. referred to their previous findings where verbal autistic individuals with language delay demonstrated enhanced performance in visuo-motor perception. They suggested that enhanced perception in auditory and visual modalities may constitute cognitive correlates of delayed speech onset in a subgroup of autistic individuals.
Other hypotheses
Unusual early brain development
Based on studies that suggest unusual brain development early in life in autism, DePape et al. (2012) hypothesized that some aspects of speech and music learning, which normally develop early, could be affected, causing unusual auditory perceptual processing in autism. As speech and music are also influenced by enculturation processes (Hannon & Trehub, 2005), they further predicted that the perception of autistic people could be less specialized for the language or music they are exposed to in their native environment. Monolingual English-speaking autistic adolescents with and without language delay and typically developing controls were recruited. In the domain of speech, the tests covered their (1) ability to filter out sounds irrelevant to a task and focus on what is relevant, (2) sensitivity to phonemic categories relevant to their language, and (3) multisensory integration of auditory and visual information in speech including the McGurk effect. In the domain of music, the tests covered their (4) tendency to use absolute pitch, (5) development of specialization for rhythm in the musical system in their native environment, and (6) internalization of tonal harmony rules in the musical system in their native environment. Findings were mainly consistent with their hypothesis except for three categories: the high prevalence of absolute pitch, less specialization for native language phonemic categories, and filtering difficulties. For the high prevalence of absolute pitch, found to be 11% in the ASD group and 0% in the control group, the authors referred to relative pitch that some studies showed to develop early (Trehub, 2001) and then suggested that the prevalence of absolute pitch in autism is consistent but possibly with a genetic propensity for absolute pitch. Less specialization for native language phonemic categories was more difficult to explain, and they referred to the importance of early social communication.
The filtering difficulties remained unexplained as studies suggested the ability to segregate simultaneous and sequential sounds develops early but improves until 9 to 11 years of age (Sussman et al., 2007). Autistic children are observed to have difficulties in segregating sound streams when there is more than one stream (Lepistö et al., 2009). This may be particularly prominent during childhood and adolescence as a meta-analysis suggests significantly reduced connectivity in white matter for language-related tracts in autistic individuals, and it is more pronounced in children than in adults (Li et al., 2022).
Auditory perception in the periphery
Based on a hypothesis that the mechanisms that cause WCC effects may be perceptual, Plaisted et al. investigated difficulties in understanding speech in noise often experienced by autistic individuals (Plaisted et al., 2003). As people with hearing impairment have the same difficulties due to their auditory filter being too wide to select frequency in complex sounds, the authors tested frequency selectivity for a group of verbal autistic individuals with and without language delay. A masking experiment was performed to measure their frequency selectivity by giving the participants signal sounds and masking sounds close in frequency. This was to test their ability to separate the two which would indicate the limits of their frequency selectivity. The results showed the frequency selectivity of the participants was worse than for normal hearing people without ASD, suggesting their auditory filter was wider than normal, which would allow a large amount of noise that interferes with the signal. They argued this could be at least partially the cause for the speech-in-noise difficulties, and that it is perceptual processing in its earliest stages at the auditory periphery level which is unusual in autism. There was no within-group difference although the ASD group consisted of verbal individuals with and without language delay. However, this may suggest that the difference in the auditory periphery may account for the speech-in-noise difficulties but not necessarily other language-related impairments or delay, as the autistic participants were all verbal and some had no delay. Another study that investigated the speech-in-noise difficulties in verbal autistic individuals with and without language delay also found decreased speech recognition in both groups, as well as reduced ability to take advantage of temporal dips in the noise, indicating atypical temporal processing (Alcantara et al., 2004). 
As the findings did not account for other aspects such as enhanced sensitivity to pitch, the authors proposed that abnormalities in the auditory periphery may not have adverse effects on all later perceptual processing stages, referring to studies that found hearing-impaired people with wider than normal auditory filters do not always display problems in pitch perception and frequency discrimination (Moore & Peters, 1992; Moore et al., 1995, cited in Plaisted et al., 2003).
Increased perceptual capacity
Remington and Fairnie investigated auditory capacity in autism based on their hypothesis that autistic individuals have an “increased perceptual capacity that allows them to process more information at any given time” (Remington & Fairnie, 2017). Two experiments were performed with a group of verbal autistic individuals and a group of IQ-matched individuals. Experiment 1 tested auditory search and detection abilities using animal sounds and a target sound of a car coming from various directions. The perceptual load was altered by adding more animal sounds and car sounds to 50% of the trials. The primary task was to search for the target animal, which was either present or absent, and the secondary task was to detect the car sound. Results showed no significant group effects in the search task, but a significant group difference was observed in the detection task, where the performance of the control group deteriorated. In experiment 2, the participants heard a short scene with four people talking, in which another character appeared, repeating “I’m a gorilla.” The participants were asked questions that they could answer only by listening to the conversation. They all answered correctly, demonstrating that they were paying attention to the conversation. However, when asked if they heard anything unusual, the clinical group outperformed the control group in detecting the auditory gorilla. On closer inspection, those who noticed the gorilla also showed a greater ability to detect the car sound in experiment 1. The results demonstrated that the performance of the clinical group was less affected by levels of auditory load, suggesting autistic individuals have increased auditory capacity rather than filtering difficulties or an inability to maintain attention. 
The authors maintained that this capacity could work either for or against them, allowing them to demonstrate superior performance in tasks such as pitch discrimination, while also leading to detrimental effects such as distractibility.
Findings
WCC explains autism as a fragmented world where autistic people experience sensations or treat objects as fragments without the need to form a coherent whole with those fragments. Their preference for detailed local-level information accounted well for islets of ability in the visual modality, predicting poor performance in tasks where central coherence mattered more. This was also expected in the auditory modality; however, neither Foxton et al. nor Heaton found weakened global processing that could support enhanced local processing. Instead, they found differences in the way autistic individuals process local- and global-level stimuli (Foxton et al., 2003; Heaton, 2003). As weakened global processing is the essential element of WCC, its absence made it inconclusive whether WCC accounts equally well for the auditory modality. The enhanced perceptual functioning model takes the same local preference as “enhancement” and it seems to better explain auditory perception in autism, as Mottron et al. (2000) found detail-focussed local processing in the presence of normal global processing. At a slightly different perceptual level, the findings by Plaisted et al. (2003) suggest that autistic people have poorer frequency selectivity, that is, wider auditory filters, and that auditory perception may be atypical at the periphery. Although this appears inconsistent with detail-focussed pitch perception, studies on hearing impairment indicate that wider than normal auditory filters are not always accompanied by problems in pitch perception. It might be that certain brain structures allow pitch perception to be unaffected by the filtering difficulties. On a larger scale at the attention level, this seems to be the case. Findings by Remington and Fairnie (2017) suggest that auditory perception in autism is capable of perceiving more information almost indiscriminately. 
Whereas the consequence of this quality is not always detrimental, it may shed some light on the filtering difficulties that remained unexplained in the study by DePape et al. (2012).
Discussion
It seems auditory perception in autism is different from that in neurotypicals. That may account for many of the behaviors observed in autistic individuals. The increased auditory perceptual capacity that causes distractibility may also promote fascinations for and aversions to certain sounds, which would be unnoticeable for nonautistic individuals but perceivable for some autistic individuals. Enhanced perception of loudness and reduced tolerance towards it were found in autistic children, even for sounds considered to be of moderate intensity (Khalfa et al., 2004), and it may account for hyperacusis in some. Whereas the prevalence of decreased sound tolerance in the general population is reported to be 3.5% (Jastreboff & Jastreboff, 2014), 50% to 70% of people with autism are said to experience it at some point in their lives. The condition is understood to cause distress, anxiety, and challenging behaviors among others (Williams et al., 2021). Gomot and Wicker maintain that hypersensitivity to details contributes to exaggerated perception of even slight changes in the environment, which may make the world appear more unpredictable and challenging for autistic people (Gomot & Wicker, 2012). On the other hand, there are also reports of atypical auditory perception having positive effects on the lives of autistic individuals and foreign language learning is one such example. Wire maintains that autistic pupils could be “good literal mimics” of the foreign accent and they have the potential to have the best accents in the class (Wire, 2005). Some autistic subjects in a study on musical experience reported timbre and performance as the important characteristics of their experience, indicating their perception of fine details (Allen et al., 2009). There are parental reports that their autistic children with heightened pitch sensitivity also have intense music interest or expertise (Eigsti & Fein, 2013). 
Ockelford, who has been teaching the piano to autistic children and young autistic people as well as to blind children, reports some children with absolute pitch enjoy playing a piece of music in different keys, suggesting they hear something different in each version that may sound quite the same for a person with relative pitch (Ockelford, 2008, 2013). Despite difficulties that may arise, given the right environment, atypical perception may offer qualitatively different and positive experiences to those who are equipped with it. Absolute pitch and enhanced sensitivity to pitch are frequently mentioned as heightened abilities in autistic individuals and most studies reviewed here also touched upon them. Whereas it would be beneficial to look at other aspects of auditory perception such as temporal processing also reported to be atypical, this article chose to focus on absolute pitch and look into it in detail for its potential influence on the lives of autistic individuals.
Absolute pitch
According to Frith, absolute pitch is not an “uncommon finding” in autistic people and it has enabled a number of autistic individuals to become skillful piano tuners (Frith, 1989). Such a remark may make this ability seem not so special, but the prevalence of absolute pitch in the general population is estimated to be less than 0.01%; it is therefore a rare finding. In the autistic population, on the other hand, enhanced pitch discrimination ability and pitch memory are often reported strengths, the prevalence of absolute pitch is estimated to be 5% to 11% (Frith, 1989, p. 8), and it is constant in savant autistic musicians (Miller, 1999, cited in Mottron et al., 2013). That absolute pitch is not uncommon in the autistic population does not seem a mere coincidence. WCC, which interprets a local bias as resulting from impaired global coherence, predicts that autistic people with absolute pitch should be impaired in their global processing of music. However, this is not the case, as seen in autistic individuals with savant syndrome who are especially gifted in music. Although there are differences in the use of local and global processing in the autistic population, they are not observed to have impaired or weakened global processing. Whereas global processing is the mandatory cognitive style for the general population, it may be an optional style for autistic people (Haesen et al., 2011).
It is thought that absolute pitch can be acquired under some circumstances. Whereas some studies argue that postcritical period adults can acquire absolute pitch up to a certain level (Van Hedger et al., 2015), it is generally understood that early musical training is required to acquire absolute pitch (Miyazaki, 1988; Miyazaki et al., 2018), and that this must take place by the age of 6 (Chin, 2003; Takeuchi & Hulse, 1993). However, not all early musical training guarantees absolute pitch; it is therefore suggested that absolute pitch results from a combination of genetic and nongenetic influences (Baharloo et al., 1998), as well as the type of early musical training to some extent (Wilson et al., 2009; Zatorre, 2003). This account does not seem applicable to absolute pitch in autism, as the reviewed studies specifically recruited individuals without early musical training. Unlike absolute pitch in the general population, which emerges after early and appropriate musical training, absolute pitch in the autistic population is normally present before the development of musical skills (Mottron et al., 2013) or without such skills (Heaton, 2009; Ockelford, 2013).
Another possibility for the acquisition of absolute pitch is by means of a compensation mechanism. Absolute pitch is also more prevalent in the population of blind musicians (Deutsch, 2013) and Ockelford suggests one in 20 professional musicians and four in ten blind children, 5% and 40% respectively, possess absolute pitch (Ockelford, 2013). It is also noteworthy that the prevalence of ASD is higher in the population of people with congenital blindness (Tantam, 2012). According to a study on the pitch discrimination ability of blind people, those who became blind at an early age had better ability than those who became blind later in life and sighted individuals, suggesting that enhanced pitch perception in this population is a compensatory mechanism with the time of visual deprivation accounting for their performance level (Gougoux et al., 2004). Enhanced abilities often result from deprivation in other areas and a compensatory mechanism that follows. However, it is inappropriate to apply this to absolute pitch or enhanced pitch perception in autism as many of the difficulties they have are of a social nature (Mottron et al., 2000). It seems also inappropriate to consider it as compensation for language delay or deficits, as its low prevalence in the general population suggests absolute pitch is not particularly advantageous for language acquisition except for tonal languages (Giuliano et al., 2011; Krishnan et al., 2005).
Absolute pitch in autism
Some studies on pitch perception suggest that typically developing children shift in their use of pitch cues from absolute to relative pitch, and the use of relative pitch becomes dominant by adulthood (Saffran, 2003; Saffran & Griepentrog, 2001). Other studies found relative pitch use in infants (Plantinga & Trainor, 2005; Trehub, 2001), so this is an area that requires further research. However, both accounts indicate that relative pitch use is dominant in the general population, while this is not the case for at least 5% to 11% of autistic people. The participants in the reviewed studies varied in age from 7.9 to 31 years and had neither early musical training nor a condition that contributed to neural reorganization, but some had either absolute pitch or enhanced sensitivity to pitch.
It is very rare to possess absolute pitch but it is understood to be even rarer to possess absolute pitch that detects pitches irrespective of variation in other qualities such as register or timbre (Deutsch, 2013; Ockelford, 2013). Ockelford suggests that autistic children perceive pitch by a means that is not attenuated by fatigue. There are cases of autistic individuals who correctly identify pitch even of a fan buzzing in the key of F (Brenton et al., 2008, cited in Mottron et al., 2013) without formal musical training and with unusually high accuracy. Musicians with absolute pitch, on the other hand, regularly make errors by a semitone and they are not necessarily better at discriminating tones at almost the same frequency (Parncutt & Levitin, 2001). This is particularly interesting as it indicates absolute pitch in the autistic population might be different from that in the nonautistic populations.
The definition for absolute pitch helps us understand the skill but it does not imply that there are variations in accuracy or strategies to work out a given pitch (Wilson et al., 2009), or that it may have different origins. Whereas absolute pitch as a result of early musical training is likely “the association of pitch names with particular absolute pitches” (Takeuchi & Hulse, 1993) which is learned, or as a result of a compensatory mechanism (Gougoux et al., 2004; Hamilton et al., 2004), absolute pitch in autism is considered to be linked to the atypical perceptual bias toward local-level information. This coincides with what is required to process pitch in music, which is also “detail-oriented, local-level processing” (Peretz, 1990, cited in Mottron et al., 2013). This view is also supported by a positive correlation found between enhanced pitch perception and nonverbal ability that is higher than verbal ability in an individual. Higher nonverbal ability is suggested to contribute to processing advantage and bias for local-level information in autism (Chen et al., 2022). Absolute pitch in autism may be qualitatively different from what is generally understood as absolute pitch in nonautistic populations with the potential to influence language acquisition in autism.
Autism and language development
Perceptual narrowing
Music and early language development are often quoted together and supposed to proceed along parallel tracks (Brandt et al., 2012; Mueller et al., 2012). Saffran and Griepentrog (2001) suggest that shifting to relative pitch might be loosely analogous to perceptual narrowing toward native language. Perceptual narrowing is observed to take place sometime between 6 and 12 months of age for typically developing infants, who transition from universal to language-specific perception (Kuhl, 2000); however, this might be slightly different for autistic children. Based on the idea that perceptual narrowing depends on learning in a social context and interactions, Seery et al. hypothesized that infants at high risk for autism may limit their tuning into language due to a relative lack of interest in social stimuli. They investigated event-related-potential (ERP) responses to native and non-native speech in infants at high and low risk for autism at the ages of 6, 9, and 12 months. Contrary to their expectation, both groups of infants showed typical ERP responses, suggesting perceptual narrowing was taking place; however, the high-risk group did not display the lateralized response that the low-risk group had. The authors suggested this atypical lateralization may be a potential ASD marker in the first year of life. They also suggested that there may not be any delays in perceptual narrowing for their participants, who all began speaking before the age of 2 years, even though some of them were diagnosed with autism later at 3 (Seery et al., 2013). In another study, typically developing monolingual children who had better native language perception skills at 7.5 months showed faster acquisition of language at 24 and 30 months, whereas those who had better non-native language perception skills at 7.5 months showed slower advancement at 24 and 30 months (Kuhl et al., 2008). 
In other words, those who were likely to still possess universal perception showed slower language growth, similar to that of autistic individuals who have remarkable pitch discrimination abilities with a history of language delay. Kuhl maintains that language development in young children depends on their ability to attend to the linguistically relevant phonetic distinctions in social contexts rather than attending to all phonetic distinctions (Kuhl, 2007).
In speech sound production, Schoen et al. found autistic children aged 18 to 36 months had speech-like vocalizations similar to language-matched children, but produced more atypical vocalizations that were not present in their native language environment (Schoen et al., 2011). This was also observed in slightly older children at a mean age of 44.67 months (Sheinkopf et al., 2000). Those findings make an interesting comparison to the reviewed study by DePape et al. in which autistic adolescents had less “enculturation” to the phonemic categories of their native language. As perceptual narrowing is an enculturation process, it is not surprising that atypical perceptual narrowing may result in less enculturated speech sound perception and production later in life. Schoen et al. (2011) propose that because autistic children do not attend and tune into their native language, their vocalizations are not aligned to the sound properties of the language, negatively affecting language acquisition. Observations suggest that infants who have a larger set of contrasts to learn may be slower to acquire the phonotactic regularities that are informative (Eigsti & Fein, 2013). Research on early speech perception in children exposed to two languages from birth showed that their perceptual narrowing takes longer and settles at a slightly later age (Genesee & Nicoladis, 2008). Having to learn a larger set of contrasts is more time-consuming and it is not difficult to understand that attending to all phonetic distinctions could contribute to a delay in language acquisition.
Audiovisual information processing
Based on studies that demonstrated that infants integrate the audio and visual information available on a speaker's face to process speech (Kuhl & Meltzoff, 1984; Lewkowicz et al., 2015), Chawarska et al. (2022) investigated attention to audiovisual information of speech in high-risk and low-risk infants for autism. Both groups paid attention to the face and mouth at 12 months and their receptive and expressive language scores did not differ at that point. For the low-risk infants, however, greater attention to the face and more time spent monitoring the mouth contributed to better language outcomes at 18 months, particularly in receptive language skills. This effect was not observed in the high-risk infants and the authors suggested that intact attention to the face and mouth does not guarantee better language outcomes for high-risk children. This seems consistent with studies that demonstrated lip-reading difficulties, which may contribute to the speech-in-noise difficulties (Iarocci et al., 2010; Smith & Bennetto, 2007), and reduced perception of the McGurk effect (DePape et al., 2012; Wallace et al., 2020) in children and adolescents with autism.
One interpretation for this is atypical temporal processing. Stevenson et al. (2014) reported speech-specific deficits in multisensory temporal integration in verbal autistic children. This is in line with the findings in the literature review by Casassus et al. that atypical temporal processing is observed more consistently across studies on the integration of audiovisual speech stimuli in children. While more deficits were found in studies with children, studies with adults found enhancement, thus Casassus et al. (2019) argued that deficits in temporal processing may diminish with age. This may relate to other observations such as multisensory processing where autistic children may be delayed but improve later (Beker et al., 2018), and the McGurk effect according to which children and adolescents are reported to be less susceptible (Wallace et al., 2020). Casassus et al. also found inconsistencies between studies with different cognitive demands and suggested that atypicalities in temporal processing may depend on the complexity of other cognitive tasks involved (Casassus et al., 2019).
Another interpretation for atypical audiovisual integration is individuals’ propensity for their “preferred” modality. A study that investigated the McGurk effect in skilled musicians found significantly reduced sensitivity to the illusion likely due to their superior auditory abilities or the stronger focus on auditory stimuli (Proverbio et al., 2016). The McGurk effect in the general population is understood to be visually driven (Buchan & Munhall, 2011; Alsius et al., 2014) but it may differ for certain groups of people with a focus on sounds. Schwartz maintains that audiovisual integration for speech perception is subject-dependent based on the weights they place on the auditory and visual inputs, and suggests that subjects should be considered differently depending on whether they are auditory or visual in their integration performance for interventions (Schwartz, 2010). This may explain the low-level audiovisual integration in some children with autism (Brandwein et al., 2013; Ostrolenk et al., 2019) should they have enhanced auditory or visual perception. Whereas language acquisition for typically developing children is a multimodal process, it may be different for autistic children.
Brain morphology and lateralization
The coexistence of enhanced sensitivity to pitch and delayed speech is also consistent with some findings in the area of specific language impairment, now called developmental language disorder, and its brain structure. Absolute pitch is found to be related to the left planum temporale, a region in the core of Wernicke's area, integral to auditory processing and receptive language. This region in most human brains is shown to be leftward asymmetrical (Geschwind & Levitsky, 1968), and it was found to be exaggerated among nonautistic absolute pitch possessors (Schlaug et al., 1995) as well as in some autistic individuals with language impairment (Tager-Flusberg, 2006). According to a study on language-association cortex asymmetry in autistic individuals with language impairment and individuals with specific language impairment, both language-impaired groups were found to have significant leftward asymmetry while autistic individuals without language impairment did not (De Fossé et al., 2004). It may be worth noting that this asymmetry is more variable in blind musicians with absolute pitch (Hamilton et al., 2004). Exaggerated leftward asymmetry of the planum temporale is suggested to be closely related to language diagnosis in ASD (Tager-Flusberg & Joseph, 2003). Whereas absolute pitch is not a condition, it may have consequences that are long-lasting and possibly extend to the general population as well. There are observations such as autistic adults with hyposensitivity to prosody and hypersensitivity to pitch (Haigh et al., 2022), as well as musicians with absolute pitch demonstrating a higher prevalence of autistic traits, including eccentric social aspects of language and better scores on the Block Design test (Brown et al., 2003). Similar correlations between enhanced pitch discrimination and autistic traits were also seen in typically developing nonmusician adults (Mayer et al., 2016).
Another observation that is frequently mentioned is atypical brain lateralization in autistic individuals. In typically developing individuals, there is a strong predisposition to process speech in the left and music in the right auditory cortex (Tervaniemi & Hugdahl, 2003) but this pattern may be different in autistic individuals as more activation of the right hemisphere is seen for speech processing. Atypical ear and hand preferences are also observed (Escalante-Mead et al., 2003) which could contribute to atypical brain activation, but it is proposed that this atypical pattern of right hemisphere dominance may be related to enhanced local pitch processing in autism (Haesen et al., 2011). Despite the strong predisposition for the left hemisphere to process language in neurotypicals, there is also evidence that this functional specialization is more vulnerable to relatively small changes in sound features or familiarity of individuals (Tervaniemi & Hugdahl, 2003). A study showed that the pattern of brain activation can vary depending on how subjects are instructed. Using the same verbal and musical stimuli, subjects were instructed to “categorize the phoneme” which led to activity in the left hemisphere, while the right hemisphere was activated when they were instructed to “discriminate between the pitch contents” (Zatorre et al., 1992). Whereas this was a result based on explicit instructions, those who are equipped with detail-oriented pitch perception may naturally follow the latter instruction without being instructed.
Perception, attention, and context
The theory on language acquisition in autism proposed by Mottron et al. (2021, p. 5) maintains that language in autism is primarily “self-taught,” much the same way as they acquire special savant skills including pitch detection and three-dimensional drawing, based on their enhanced perceptual functioning. Nadia Chomyn, who had autism and savant skills in drawing, was able to produce photographically realistic drawings between the age of 3 and 7 despite her severe learning difficulties and no communicative speech. She lost her drawing skills, however, as her language developed and her communication ability improved when she was exposed to schooling after the age of 6 (Selfe, 2015). Savant skills usually persist or increase with intense interest and preoccupation with their areas of interest and such trade-off is exceptional (Treffert, 2009). However, Nadia's case is indicative of how different it could be for some autistic children to integrate multimodal information or transition to the use of more global information.
The transition from reliance on local information to the use of global information is a general developmental process that occurs with perceptual stimuli (Rhodes et al., 1989, cited in Davies et al., 1994). However, this transition may not occur in some autistic people who continue to rely on local information. Verbal autistic children with and without language delay in the study by Davies et al. (1994) showed variations in their perception of global information in facial and nonfacial stimuli. In the auditory modality, perceiving sounds in relative pitch may be equally important as acquiring language-specific perception. Terhardt suggests that what matters to identify speech sounds is the frequency relations and not the absolute frequencies among the partials of voiced speech sounds, therefore the shift from absolute to relative features of pitch is a by-product of language acquisition (Terhardt, 1974, cited in Takeuchi & Hulse, 1993).
People also speak at different pitches as we have different voice ranges. Heaton, Davis, and Happé report on a verbal autistic child with language delay and absolute pitch. He started producing single words from the age of 3, and semantically meaningful sentences around the age of 6. As his father was an amateur pianist, he was exposed to music at home and showed evidence of absolute pitch by 3 years. Once he learned that each pitch has a name, he started asking his parents questions such as “Why do you make a D when you call ‘Dinner is ready’ and Mum makes an A?” (Heaton et al., 2008a). This report gives us a glimpse into the atypical auditory experience of someone with absolute pitch and shows which features of stimuli may become salient even though they contribute little or nothing to the context. It is difficult to imagine that such atypical attention to contextually irrelevant information would not have any effect on language acquisition or comprehension. Hyper-attentiveness to sounds has been considered as a factor that may affect language and communication (Constantino et al., 2007).
Whereas pitch in speech is understood to be more difficult to discriminate than pitch in music, autistic people are found to be better at it than neurotypicals and musicians with absolute pitch (Heaton et al., 2008a, 2008b). In a study, autistic children demonstrated enhanced pitch discrimination on speech as well as sentence comprehension difficulties, although they were matched to control children on measures of age, nonverbal ability and receptive vocabulary, and were instructed to attend to context. The authors suggested that attending to multiple cues in speech may limit the processing capacity and resources, resulting in poorer sentence comprehension in autistic children. Typically developing children, on the other hand, may have increased attention to content information, resulting in poorer perceptual performance (Järvinen-Pasley et al., 2008). In a study that investigated the sensory and attentional processing of tones of different complexity and vowels, verbal autistic children with language delay failed to show involuntary orienting only to vowel changes, while showing intact perception of all types of stimuli. The authors interpreted the findings to mean that sensory sound processing is intact in autistic children and they can perceive both speech and nonspeech sounds, but they may not attend to speech sounds (Ceponiene et al., 2003). Lepistö et al. also investigated pre-attentive brain responses to pitch and vowel changes in autistic children with language difficulties. They showed enhanced responses compared to control children to both pitch and vowel changes in a constant condition, where a stream of one vowel sound was sometimes presented at different pitches or a stream of different vowels was presented at the same pitch. However, in a condition where both pitch and vowels changed, which is more similar to speech, their enhanced responses to vowel changes diminished while those to pitch changes remained. 
Lepistö et al. (2008) maintain that the diminished responses to phoneme changes only in a speech-like condition indicate their auditory perception is governed by acoustical characteristics rather than linguistic relevance and this may adversely affect their speech perception. This is consistent with another study that also investigated brain responses to tones and vowels in verbal autistic children. Whitehouse and Bishop found diminished responses to complex tones that were embedded in a stream of speech sounds, while phonemes embedded in a stream of nonspeech sounds were attended (Whitehouse & Bishop, 2008). They argue that poor orienting to speech sounds in autistic children is not related to difficulties in shifting attention to speech sounds, but they actively inhibit responses to speech sounds which they call “top-down inhibition” to speech sounds (ibid.). In tonal languages such as Mandarin, similar inhibition toward speech sounds was observed as diminished brain responses to lexical tones while responses to harmonic sounds were greater in autistic children compared to control children (Wang et al., 2017). In this study both types of stimuli were tones yet they were treated differently, and categorical perception of lexical tones, which serve phonemic roles in Mandarin, was reduced similar to phonemic perception in nontonal languages. Whereas phonemic or lexical tone perception appears to be intact, it may become suppressed when there are concurrent nonspeech pitch stimuli to process, possibly due to top-down inhibition to speech sounds. It is suggested that auditory perception is at the root of language learning for both typically developing infants and populations with language-related disorders (Mueller et al., 2012). 
Although infants initially perceive speech not as speech but more as music in a broad sense, once they realize that words have meanings, semantic and syntactic development takes over and the musical aspect of language becomes secondary (Brandt et al., 2012). Language is the most important class of stimuli in the auditory domain for typically developing children (Heaton et al., 2008c), but some autistic children may not have the same priority.
Potential explanations and limitations
Pitch in speech is usually supposed to help comprehension. Pitch contours that rise and fall define prosody in nontonal languages and meaning in tonal languages (Oxenham, 2012), and pitch changes can be used to identify speakers (Xie & Myers, 2015). It is also suggested that the ability to extract linguistic rules, which develops in infancy and also in some adult language learners, is closely linked to pitch perception (Mueller et al., 2012). Pitch in speech is there to facilitate the development and comprehension of language and not to be identified or excessively attended to. In a study that investigated pitch discrimination in speech processing in autistic children, adolescents and adults aged between 6 and 59 years, pitch discrimination was found to be enhanced in all autistic groups, and this ability did not correlate with language scores. In the matched control groups, on the other hand, pitch discrimination performance increased with age and correlated with language scores (Mayer et al., 2016). Mayer et al. (2016) propose that it may be the allocation of initial attentional resources that differs between autistic and typically developing individuals, and that the increase in attentional resources through development and changes in the allocation of attention guide differential trajectories in speech and pitch processing. The study also found that enhanced pitch discrimination in autistic adults was associated with sensory atypicalities and autism symptom severity. Similar results were obtained by Eigsti and Fein in their study on pitch sensitivity and autistic symptoms, including delayed language development. Their autistic participants with the best pitch discrimination abilities tended to have language delay (Eigsti & Fein, 2013).
These observations support the view that it is atypical processing of low-level information that affects higher-order cognitive functions in autism (Groen et al., 2008). Particularly heightened perception of any one aspect of incoming stimuli may lead to over-focussing on and/or over-processing of certain information, overshadowing the context. This may in turn disrupt or lessen the effect of expected developmental processes such as perceptual narrowing, resulting in various atypicalities elsewhere including, but potentially not limited to, language. Pitch perception is only one aspect of auditory perception, and auditory perception is itself only one of many factors that could contribute to atypical language acquisition in autism. Deficits in temporal processing appear to play a role in atypical audiovisual integration that is part of language development. As temporal deficits may diminish with age and are also observed in verbal autistic people without obvious language delay or impairment, their effect on language acquisition may not be detrimental. But they coexist with enhanced sensitivity to pitch, and those atypicalities may interact. There are various other factors, such as genetic or cognitive differences as well as comorbidities in addition to autism, which may contribute to atypical perception and need to be taken into account. A significant discrepancy between verbal and nonverbal abilities in favor of the latter is one of them, and it is observed with enhanced pitch perception (Chen et al., 2022; Heaton et al., 2008c). However, the few areas briefly examined in this paper indicate that language delay is more likely to occur in the presence than in the absence of atypically enhanced sensitivity to pitch and absolute pitch, which are unlikely to result from a compensation mechanism or early musical training but derive from the processing bias toward local-level information in autism. 
In this respect, atypically enhanced sensitivity to pitch and absolute pitch may be a by-product of atypical perception in autism. A recent study found greater variability in pitch perception in autistic individuals than in neurotypicals, and exceptional pitch sensitivity in a subgroup of autistic individuals (Wang et al., 2023). Also observed and likely to occur is the combination of heightened visual perception and language delay, as the visual modality is also crucial for language acquisition. Autistic individuals with particularly enhanced perception in either of those modalities may constitute a subgroup of people who present with language delay. Differences in intensity and affected modality may partially explain why some people are more affected, do not benefit from certain types of therapy as much as others, or exhibit certain types of difficulties, contributing to the variability within the spectrum. More research in modalities other than auditory and visual may also be beneficial.
According to one longitudinal study involving autistic children and young people between the ages of 2 and 19, their language trajectories were strikingly stable after the age of 6 despite their heterogeneity in the preceding years. Those who caught up had done so before the age of 6, and the strongest predictor was their verbal ability between the ages of 2 and 3 (Pickles et al., 2014). Whereas a lot of research has been done on the early years of language acquisition in general, there are not many studies that span up to the age of 6 years. We do not know what other factors may play significant roles in shaping the language abilities of children during those years. It is also worth noting that the age of 6 is important for the acquisition of absolute pitch through early musical training. The first 6 years of life may be particularly critical for the development of both language and musical abilities. Research on speech perception has largely progressed separately from research on general auditory perception; however, evidence suggests transfer of learning between nonlinguistic and linguistic stimuli, indicating that the nature of speech perception is not necessarily different from that of other types of auditory perception (Holt & Lotto, 2008). More research to fill the gap in knowledge for the missing years and greater collaboration between autism researchers and auditory science researchers are required.
The link between heightened pitch perception and language abilities in autistic individuals has been investigated in the past (Eigsti & Fein, 2013; Heaton et al., 2008b; Lepistö et al., 2008; Mayer et al., 2016). Although findings indicated a possible link, which has been suggested once again in a recent literature review (Chen et al., 2022), it is still unclear how such atypical pitch perception and language delay are related. As our paper primarily focussed on nonlinguistic pitch perception, particularly the potentially atypical nature of absolute pitch in autism, studies on language perception were excluded from review at the time of selection. We did not consider language abilities unless related to pitch perception, nor did we focus on particular age groups or certain cognitive abilities. Our review and observations suggest that absolute pitch in autism is different in nature, and this atypical pitch perception may partially explain language delay in autism. How such atypical pitch perception contributes to language delay in autism, and whether the former causes the latter or they simply correlate, is yet to be investigated. Future investigation also needs to take into account other aspects of auditory perception as well as cognitive factors. Some autistic people seem naturally auditory-oriented and have high sensitivity to pitch. While having detail-oriented and context-independent perception may give certain advantages in the right environment, it may compete with and have adverse effects on language acquisition and communication when it becomes excessively heightened and starts detecting pitch, especially in speech. Knowing what aspect of atypical perception could affect language acquisition and how it works would be important to better understand the mechanisms behind language delay in autism, and to develop more individually targeted interventions to support autistic children based on their strengths and weaknesses.
Conclusion and implications
Findings from research on both auditory perception and language acquisition in autism indicate that autistic children have an atypical relationship with sounds, in which they demonstrate enhanced sensitivity to pitch. Although often regarded as musical giftedness, absolute pitch in autism raises questions because of its implications for language delay. Behavioral and ERP studies on language acquisition and perception found atypicalities in autistic individuals with enhanced pitch discrimination abilities. If present in autistic individuals without musical training or a condition that contributes to brain reorganization, particularly heightened sensitivity to pitch appears to be related to or possibly caused by their innate bias for local-level information. In typical development, progress in language acquisition appears to override an enhanced sensitivity to nonlinguistic sounds, and pitch perception diminishes once a child becomes more accurate at detecting linguistic sounds. However, children with a focus on a particular aspect of auditory stimuli due to their inner bias may follow atypical language development, which may in turn cause language delay. Our findings are preliminary, and further investigation in conjunction with other aspects and factors is required. However, pitch perception seems to be an area that is worth researching further, given its potential to help us better understand mechanisms for language delay and impairment in autism. There is a need for more integrative research.
Footnotes
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
