Abstract
Sound localization testing is key for comprehensive hearing evaluations, particularly in cases of suspected auditory processing disorders. However, sound localization is not commonly assessed in clinical practice, likely due to the complexity and size of conventional measurement systems, which require semicircular loudspeaker arrays in large and acoustically treated rooms. To address this issue, we investigated the feasibility of testing sound localization in virtual reality (VR). Previous research has shown that virtualization can lead to an increase in localization blur. To measure these effects, we conducted a study with a group of normal-hearing adults, comparing sound localization performance in different augmented reality and VR scenarios. We started with a conventional loudspeaker-based measurement setup and gradually moved to a virtual audiovisual environment, testing sound localization in each scenario using a within-participant design. The loudspeaker-based experiment yielded results comparable to those reported in the literature, and the results of the virtual localization test provided new insights into localization performance in state-of-the-art VR environments. By comparing localization performance between the loudspeaker-based and virtual conditions, we were able to estimate the increase in localization blur induced by virtualization relative to a conventional test setup. Notably, our study provides the first proxy normative cutoff values for sound localization testing in VR. As an outlook, we discuss the potential of a VR-based sound localization test as a suitable, accessible, and portable alternative to conventional setups and how it could serve as a time- and resource-saving prescreening tool to avoid unnecessarily extensive and complex laboratory testing.
Introduction
Hearing is a complex process. It entails the transduction of acoustic information arriving at the ears into neural impulses, their transmission through the auditory nerves, and their appropriate interpretation by the central nervous system (Werner et al., 2012). Sound localization and lateralization, auditory pattern recognition, temporal integration and discrimination, and speech understanding in challenging acoustic situations are just a few basic skills that rely on our auditory processing abilities (Bellis, 2003a; Chermak & Musiek, 1997).
Auditory processing disorders (APDs) are difficulties in the perceptual processing of auditory information by the central nervous system, evidenced by poor performance on one or more of the aforementioned tasks (Chermak & Musiek, 2013; de Wit et al., 2016; Geffner & Ross-Swain, 2019). Children and adults with APDs have impaired abilities to attend to, discriminate, organize, or comprehend auditory information despite having average intelligence and normal hearing (NH) sensitivity (Keith, 1986). Thus, Bellis described this phenomenon as “when the brain cannot hear” (Bellis, 2003b). It is estimated that around 5% of school-aged children (Chermak & Musiek, 2013; Geffner & Ross-Swain, 2019), more than 40% of children with learning disorders (Iliadou et al., 2009), and between 26% and 76% of people over the age of 55 are affected by APDs (Cooper & Gates, 1991; Golding et al., 2004; Stach et al., 1990). However, diagnosing APDs can be challenging due to their heterogeneous presentation and similarities to other common disorders such as attention deficit hyperactivity disorder (ADHD), autism spectrum disorder (ASD), language impairments, and learning disabilities (Musiek & Chermak, 2014).
Despite previous efforts to develop methods for diagnosing and treating APDs, providing timely, widely available, and efficient access to diagnosis and treatment remains an ongoing topic of research. Audiologists currently use a comprehensive set of behavioral tests to diagnose APDs (American Academy of Audiology, 2010; American Speech-Language-Hearing Association, 2005; Geffner & Ross-Swain, 2019; Jerger & Musiek, 2000; Musiek & Chermak, 2014). These tests help narrow down the specific deficits contributing to the patient's hearing difficulties and design suitable treatment plans. Therefore, it is crucial to select the right tests for each patient, taking into account their individual history, auditory complaints, and potential comorbidities.
The test battery should include tasks that assess different levels and regions of the central auditory nervous system and different auditory processes, such as speech-in-noise tests (Cameron & Dillon, 2007; Dillon et al., 2012; Nilsson et al., 1994; Soli & Wong, 2008), dichotic listening tests (Hurley & Musiek, 1997; Musiek et al., 1991), auditory discrimination tests (Cranford et al., 1982), or tests of temporal processes such as within-channel gap detection (Musiek et al., 2005), among others.
The test results are interpreted using criterion-referenced scores, also known as normative cutoff values. These normative cutoff values are set at performance levels to provide the best balance between sensitivity (detection rate) and specificity (correct rejection rate). In order to diagnose APDs, there must be performance deficits of at least two standard deviations below the mean on two or more tests in the battery.
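As a minimal illustration of how such a criterion-referenced cutoff is derived from a normative sample (a sketch only; the normative scores below are hypothetical, and error-type scores are assumed, where higher values mean worse performance):

```python
import statistics

def cutoff_two_sd(normative_scores, higher_is_worse=True):
    """Criterion-referenced cutoff set at two standard deviations
    from the normative mean (sample standard deviation)."""
    mean = statistics.mean(normative_scores)
    sd = statistics.stdev(normative_scores)
    # For error-type scores (higher = worse), the cutoff lies above the mean.
    return mean + 2 * sd if higher_is_worse else mean - 2 * sd

# Hypothetical normative RMS localization errors in degrees:
norms = [4.0, 5.5, 6.0, 4.5, 5.0, 6.5, 5.5, 5.0]
cutoff = cutoff_two_sd(norms)
print(round(cutoff, 2))  # → 6.85
```

A patient whose error score exceeds this cutoff would be flagged as performing outside the normative range on that test.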
Sound localization testing is a key part of screening for suspected APDs as it is sensitive to central auditory nervous system involvement. Poor sound localization skills have been observed in individuals with temporal lobe impairment (Moore et al., 1990; Sanchez-Longo & Forster, 1958; Sanchez-Longo et al., 1957), multiple sclerosis (Cranford et al., 1990), the aging population (Cranford et al., 1993), and patients with hearing loss (Musiek & Chermak, 2014). However, the clinical adoption of sound localization testing remains low, in part due to time and resource constraints (Musiek & Chermak, 2015) and concerns about the validity and reliability of sound localization results unless they are obtained in a well-attenuated sound room or even an anechoic chamber using complex loudspeaker arrays (American Academy of Audiology, 2010; Musiek & Chermak, 2015). An accessible, reproducible, and widely accepted method for assessing sound localization abilities is still lacking (Musiek & Chermak, 2015).
Efforts to develop a suitable configuration and procedure for sound localization testing are not new. Directional audiometry, or spatial audiometry, originated in the 1950s (Goodhill, 1954; Jongkees & Groen, 1946; Jongkees & Veer, 1957; Sanchez-Longo & Forster, 1958), and in the early 1970s, Tonning published eight papers on the development and use of directional hearing tests for audiological applications. Six focused on directional speech intelligibility testing (Tonning, 1971a, 1971b, 1972a, 1972b, 1972c, 1973b), and only two addressed localization issues (Tonning, 1970, 1973a). Several publications followed that proposed some form of directional audiometry (Humes et al., 1980; Link & Lehnhardt, 1966; Newton & Hickson, 1981; Vermiglio et al., 1998). However, although several test configurations and data collection procedures have been proposed, the clinical community has not yet agreed on a single standard procedure for assessing sound localization abilities. Unresolved issues include the technical requirements for testing systems, comprehensive yet flexible procedures, and normative data for directional hearing (American Academy of Audiology, 2010; Letowski & Letowski, 2016).
More recently, the ERKI method was developed to fill this gap and as an attempt to embed sound localization testing into typical clinical audiological procedures (Plotz & Schmidt, 2017). ERKI is an acronym for “Erfassung des Richtungshörens bei Kindern” in German, which translates to “measurement of directional hearing in children.” The method determines the listener’s angular localization error in nonspeech localization tasks over the horizontal plane using an approved medical device of the same name. The setup consists of five loudspeakers arranged in a frontal semicircle around the patient (0°, ± 45°, ± 90°; r = 1 m), hidden behind an acoustically transparent curtain so that neither the number nor the position of the loudspeakers is known. The system can display 37 sound sources; five are real (loudspeakers), and the remaining 32 are generated as phantom sources between adjacent loudspeakers using the vector base amplitude panning (VBAP) method (Pulkki, 1997).
The patient sits in the center of the loudspeaker array with the head aligned to the 0° azimuth position. A brief noise signal is presented, and their task is to determine the location of the perceived signal by using a rotatory switch to position a light on an LED bar placed around the semicircle. They press a button to confirm their choice, and the whole procedure can take less than ten minutes.
To ensure that the results were not compromised by the use of phantom sound sources, Plotz and Schmidt compared subjects’ localization performance using a setup with discrete real loudspeakers to that using the proposed ERKI setup (using only five real loudspeakers and generating the remaining 32 phantom sound sources using VBAP). The comparison revealed no significant differences in performance between the two setups, indicating that sound localization testing can be reliably performed using the proposed ERKI method in children and adults.
The ERKI method offers a higher spatial resolution than similar devices previously available and improves the user-friendliness and automatability of the procedure. However, although it is a suitable technical solution with basic hardware equipment, sound localization abilities are still not widely evaluated as a typical audiological practice. Reasons for this may be lack of access to the equipment or lack of space, as typical setups (ERKI and similar) require semicircular loudspeaker arrays with a radius of at least 1 m in sufficiently large and acoustically treated rooms.
To address these concerns, we evaluate the feasibility of performing sound localization testing in virtual reality.
The field of digital healthcare is advancing rapidly, and virtual reality and augmented reality (VR and AR) technologies are playing an important role in medical care. These innovations have made medical training, diagnosis, and treatment more portable, accessible, and affordable. Hearing healthcare is no exception (Murphy, 2017). Previous studies on sound localization training (Steadman et al., 2019), auditory spatial analysis in multi-talker environments (Ahrens & Lund, 2022; Ahrens et al., 2019a), and speech intelligibility measurements (Ahrens et al., 2019b; Salorio-Corbetto et al., 2022) have successfully demonstrated the potential of AR and VR technologies to improve audiological research and care. They could provide cost-efficient alternatives to bulky and technically complex setups.
With great potential to support tele-audiology, they could improve diagnostic and intervention services and facilitate access to hearing healthcare services across geographic boundaries. Therefore, the development of efficient auditory tests and training procedures that can be performed on inexpensive consumer-grade hardware with simple setups is of great research interest today. In this context, modern VR peripherals can be a suitable alternative for evaluating sound localization abilities because they support spatial audio, are portable, and are comparatively inexpensive. In addition, recent versions support standalone operation, which facilitates the reproducibility and scalability of the setups.
To create a virtual sound localization test, it is necessary to make some modifications to the conventional test setup, such as replacing the loudspeaker arrays with a headphone-based binaural audio presentation for the auditory stimuli and using a head-mounted display (HMD) for the visual feedback. However, these changes may reduce test accuracy by increasing localization blur. To measure these effects, we conducted a study with a group of normal-hearing adults, comparing sound localization performance in different AR and VR scenarios. We started with a conventional loudspeaker-based measurement setup and gradually moved to a virtual audiovisual environment, testing sound localization in each scenario using a within-participant design. By comparing localization performance between the conventional loudspeaker-based and virtualized conditions, we could estimate the increase in localization blur induced by virtualization at each step. Furthermore, the results of the virtual localization test provided new insights into localization performance in state-of-the-art VR environments, allowing us to estimate first proxy normative cutoff values for sound localization testing using consumer-grade VR peripherals.
As a baseline against which to compare the (potentially degraded) performance in the virtualized scenarios C2 and C3, we created a conventional loudspeaker-based scenario, hereafter referred to as C1.
C1 basically replicates the ERKI method, but we replaced the original allocentric pointing method, that is, the rotating disk, with an egocentric pointing method using a handheld controller. We chose an egocentric pointing method because it is known to lead to more accurate performance than allocentric methods (Bahu et al., 2016; Djelani et al., 2000; Pernaux et al., 2003). Such egocentric pointing methods are also intuitive and user-friendly, and they are the most commonly used in VR.
As a second test condition, we used an AR scenario (C2), in which the loudspeaker-based audio playback was replaced by a (static) binaural headphone-based presentation using nonindividual head-related transfer functions (HRTFs). All other test components remained identical to C1. For the third condition (C3), we virtualized the visual feedback by replacing the real LED array with its virtual counterpart: a virtual LED array presented through a standalone HMD. In addition, the auditory stimuli were presented using state-of-the-art headphone-based dynamic binaural rendering with nonindividual HRTFs. Thus, C3 was a completely virtual audiovisual environment.
Our first hypothesis was that conditions C2 and C3 would result in lower performance (i.e., higher localization errors) compared to C1 due to the use of headphone-based audio presentation and nonindividual HRTFs. Typically, the use of nonindividual (generic) HRTFs results in reduced localization accuracy due to the lack of appropriate monaural (spectral) cues, increased front-back confusion, and usually higher localization errors in the median sagittal plane (Begault et al., 2001; Møller et al., 1996; Wenzel et al., 1993). However, generic HRTFs still provide robust binaural cues for localization in the frontal horizontal plane (Wenzel et al., 1993), often without a relevant increase in localization error compared to individual HRTFs (Begault et al., 2001). Furthermore, since sound localization testing is typically limited to the frontal horizontal plane when assessing binaural interaction functions, using nonindividual HRTFs may be sufficient for this use case (Brungart et al., 2017).
Our second hypothesis was that localization accuracy in C3 would be equal to or better than in C2. We expected that any performance difference between C2 and C3 would be due to the switch from real to virtual visual presentation, that is, the introduction of the HMD. This is because, with the procedure and short stimulus duration used in our study, the switch from static to dynamic binaural rendering for the audio presentation is unlikely to affect localization performance (see “Setup and Stimuli” section for details). In addition, we expected that the inclusion of matching HMD-based visuals would improve the overall plausibility of the virtual scene and the perceived immersion, thus aiding the sound localization task.
With this initial feasibility study, we aimed to gather preliminary evidence that sound localization abilities in the frontal horizontal plane can be tested in VR (C3 in this study) and that a virtualized version of the conventional test setup could be useful in screening of sound localization abilities, despite the (expected) overall performance degradation induced by virtualization.
Methods
Participants
Twenty engineering students and fellow researchers from the TH Köln University of Applied Sciences participated in the study. They all had NH sensitivity, verified by standard pure-tone audiometry in octave frequency bands from 125 Hz to 8 kHz (hearing threshold ≤ 25 dB HL), and had previous experience participating in listening experiments. Table S1 in the supplemental material lists the relevant demographic details of the study participants and their pure-tone audiogram results for both ears (Ramírez et al., 2024).
One participant was excluded from the study because they had physical difficulties performing the task, that is, extending their arm to point in the direction of the perceived sound source. Therefore, we report data from nineteen participants (n = 19, age 21–51 years, M = 28 years, Mdn = 26 years).
Setup and Stimuli
We replicated the ERKI setup (Plotz & Schmidt, 2017) in the sound-insulated anechoic chamber of the acoustics laboratory at the TH Köln, which has dimensions of 4.5 × 11.7 × 2.3 m (W × D × H), a lower cutoff frequency of about 200 Hz, and a background noise level of ∼ 20 dB(A) SPL. We used five Genelec 8020D loudspeakers as the real sources and generated the remaining 32 virtual sound sources as phantom sources between adjacent loudspeakers using the VBAP method (Pulkki, 1997).
Using VBAP-produced virtual sound sources instead of real sound sources does not reduce localization accuracy in this setup, as shown by Plotz and Schmidt (2017). Furthermore, Frank (2013) showed that VBAP yields sufficient localization accuracy for the setup used in the present study (5° steps in the frontal horizontal plane).
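In the horizontal plane, VBAP reduces to solving a 2 × 2 linear system for the active loudspeaker pair and normalizing the resulting gains to constant power. The following sketch illustrates the principle for a single pair; it is not the ERKI or laboratory implementation, and the example directions merely reuse two of the setup's loudspeaker azimuths (0° and 45°):

```python
import math

def vbap2d_gains(phi, theta1, theta2):
    """2-D VBAP gains (after Pulkki, 1997) for a phantom source at azimuth
    `phi` between loudspeakers at azimuths `theta1` and `theta2` (degrees).
    Returns power-normalized gains (g1, g2)."""
    # Unit vectors of the two loudspeakers and the target direction
    l1 = (math.cos(math.radians(theta1)), math.sin(math.radians(theta1)))
    l2 = (math.cos(math.radians(theta2)), math.sin(math.radians(theta2)))
    p = (math.cos(math.radians(phi)), math.sin(math.radians(phi)))
    # Solve [l1 l2] * g = p for g via Cramer's rule
    det = l1[0] * l2[1] - l2[0] * l1[1]
    g1 = (p[0] * l2[1] - p[1] * l2[0]) / det
    g2 = (p[1] * l1[0] - p[0] * l1[1]) / det
    # Normalize to constant power
    norm = math.hypot(g1, g2)
    return g1 / norm, g2 / norm

# Phantom source halfway between loudspeakers at 0° and 45°:
g1, g2 = vbap2d_gains(22.5, 0, 45)  # g1 ≈ g2 ≈ 0.707 by symmetry
```

A phantom source located exactly at a loudspeaker position yields gains (1, 0), so the real loudspeakers are reproduced unchanged.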
We used two daisy-chained strands of Adafruit WS2801 LED pixels to display the locations of the 37 sound sources as individual LED lights, driven over a serial peripheral interface (data and clock lines) by a microcontroller board (Arduino Mega 2560) (Figure 1a). An opaque, acoustically transparent fabric covered the setup.

(a) Experimental setup in the anechoic chamber of the TH Köln. Rendering of the setup without the fabric cover (K. Altwicker, TH Köln). (b) Participants had to extend their arms and point to the location of the perceived sound object relative to their body axis (egocentric pointing). They received visual feedback about the direction they were pointing by changing the color of the LED dot (R. Gillioz, TH Köln). (c) Screenshot of the VR scenario. A child-friendly, gamified, and fully automated application for sound localization testing based on the ERKI method was developed. It ran on a standalone HMD (Oculus Quest 2).
An OptiTrack system with an update rate of 120 Hz tracked the listener's head orientation and the handheld controller used for pointing. We used head tracking to ensure that stimuli were not presented unless the participant's head was oriented at 0° azimuth (central loudspeaker) for at least two seconds. In addition, the tracking information from the handheld controller was used to know where listeners were pointing and to provide visual feedback by changing the color of the corresponding LED dot (Figure 1b).
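The gating logic described above can be sketched as follows. This is an illustrative Python sketch, not the actual MATLAB/OptiTrack implementation; the sample format and function names are assumptions:

```python
AZIMUTH_TOLERANCE_DEG = 2.0  # allowed deviation from 0° azimuth (per the procedure)
REQUIRED_HOLD_S = 2.0        # head must stay centered at least this long

def head_centered(yaw_deg):
    """True if the tracked head yaw lies within the tolerance around 0°."""
    return abs(yaw_deg) <= AZIMUTH_TOLERANCE_DEG

def may_present_stimulus(samples):
    """Given (timestamp_s, yaw_deg) tracker samples in chronological order,
    allow the next trial to start only if the latest contiguous run of
    centered samples spans at least REQUIRED_HOLD_S."""
    run_start = None
    for t, yaw in samples:
        if head_centered(yaw):
            if run_start is None:
                run_start = t  # run of centered samples begins here
        else:
            run_start = None   # head moved away; restart the hold timer
    return run_start is not None and samples[-1][0] - run_start >= REQUIRED_HOLD_S
```

If the check fails, the real system flashes the central LED and plays a beep to redirect the listener's attention before retrying.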
The stimulus was a 300 ms broadband white noise with 10 ms cosine-squared onset and offset ramps. We chose a duration of 300 ms to ensure that the signal was long enough to be accurately perceived by NH individuals (Tobias & Zerlin, 1959), yet short enough to prevent listeners from turning their heads toward the perceived direction during the stimulus presentation (Brungart et al., 2017; Gaveau et al., 2022; Higgins et al., 2023; Pollack & Rose, 1967; Thurlow & Mergener, 1970), as typical reaction times for such head movements are around 400 ms (Savelsbergh et al., 1991).
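A stimulus of this kind can be generated as in the following sketch. The sampling rate is an assumption (it is not stated in the text), and the pure-Python noise generator stands in for whatever generator the MATLAB application used:

```python
import math
import random

FS = 48_000    # sampling rate in Hz (assumed)
DUR_S = 0.300  # stimulus duration
RAMP_S = 0.010 # cosine-squared onset/offset ramp duration

def make_stimulus(seed=0):
    """300 ms broadband white-noise burst with 10 ms cosine-squared
    onset and offset ramps."""
    rng = random.Random(seed)
    n = int(FS * DUR_S)        # 14400 samples
    n_ramp = int(FS * RAMP_S)  # 480 samples per ramp
    x = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    for i in range(n_ramp):
        w = math.sin(0.5 * math.pi * i / n_ramp) ** 2  # cos^2-shaped fade
        x[i] *= w              # onset ramp
        x[n - 1 - i] *= w      # offset ramp
    return x

stim = make_stimulus()
```

The ramps start and end at exactly zero amplitude, avoiding audible clicks at stimulus onset and offset.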
A custom MATLAB application running on a PC controlled the test procedure. Game elements were included to increase user engagement and motivation. For example, child-friendly voice prompts guided the participant through the procedure explanation, initial training trials, and the experiment, making it easy to use and resulting in a fully automated process that did not necessarily require an experimenter.
Materials
Test Condition 1: Baseline Measurement (C1)
As described in the “Introduction” section, we replicated the ERKI setup. However, we replaced their original allocentric pointing method, that is, the rotating disk, with a modified handheld Oculus Quest 2 controller, thus changing the spatial coding of the pointing method from allocentric to egocentric. In this experimental condition, the audio presentation was loudspeaker-based, using the five loudspeakers as real sources and generating the remaining 32 virtual sound sources as phantom sources between adjacent loudspeakers.
A short video illustrating some trials in this experimental condition is part of the supplemental material (Ramírez et al., 2024).
Test Condition 2: Introduction of Headphone-Based Static Binaural Rendering (C2)
In this test condition (C2), we created an AR environment in which the loudspeaker-based audio playback was replaced by a static binaural headphone-based presentation. The test environment was otherwise identical in design and procedure to C1. The subject sat in the center of the semicircle, listened to the stimuli through headphones (Sennheiser HD 600), and indicated the location of the perceived sound source by extending their arm and pointing in the direction of the perceived sound object in the LED array.
For the binaural presentation, we used measured far-field HRTFs from a Neumann KU100 dummy head (Bernschütz, 2013). The HRTF set, initially measured on a Lebedev grid with 2702 spatial sampling points, was transformed to the spherical harmonic domain at a sufficiently high spatial order of N = 44, allowing artifact-free spherical harmonic interpolation to obtain HRTFs for any desired direction using the open-source SUpDEq toolbox (Pörschmann et al., 2019). This processing resulted in accurate HRTFs for the 37 sound source directions, which we then used to generate the corresponding virtual sound sources by convolution with the noise test signal.
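The final rendering step, convolving the test signal with the direction-specific head-related impulse responses, can be sketched as follows. The HRIRs below are deliberately toy ones (a pure interaural delay and level difference), not the measured KU100 set, and the function name is an assumption:

```python
import numpy as np

def binauralize(stimulus, hrir_left, hrir_right):
    """Render a mono stimulus as a static binaural signal by convolving
    it with the left/right head-related impulse responses for the
    desired source direction."""
    left = np.convolve(stimulus, hrir_left)
    right = np.convolve(stimulus, hrir_right)
    return np.stack([left, right])  # shape: (2, len(stimulus) + len(hrir) - 1)

# Toy HRIRs (hypothetical): the right ear receives a delayed, attenuated copy
stim = np.random.default_rng(0).uniform(-1, 1, 480)
hrir_l = np.zeros(64); hrir_l[0] = 1.0  # left ear: direct
hrir_r = np.zeros(64); hrir_r[8] = 0.8  # right ear: 8 samples later, quieter
sig = binauralize(stim, hrir_l, hrir_r)
```

With these toy HRIRs, the rendered source would be lateralized toward the left ear; in the study, one such convolution pair was precomputed for each of the 37 source directions.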
Given the availability of a high-quality, dense HRTF set, we used the 37 discrete HRTFs covering the frontal horizontal plane with a resolution of 5°, rather than only five HRTFs with VBAP interpolation between them (as in C1). Switching from loudspeaker-based to headphone-based binaural rendering with nonindividual HRTFs is already known to degrade localization performance, and interpolating between only five HRTFs using VBAP could reduce the localizability of the virtual sound sources even further. Using discrete HRTFs for all 37 directions avoids this additional degradation and allowed us to evaluate the feasibility of testing sound localization using simple setups as accurately as possible, that is, using the best technology currently available.
A generic headphone compensation filter was applied to the precomputed stimuli (noise test signal convolved with the respective HRTF) to minimize the influence of the headphones used. The filter is based on 12 measurements (putting the headphones on and off the dummy head) to account for repositioning variability and was designed by regularized inversion of the complex mean of the headphone transfer functions (Lindau & Brinkmann, 2012) using the implementation of Erbes et al. (2017).
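The design principle, regularized inversion of the complex mean of repeated headphone transfer function measurements, can be sketched as below. This is a simplified frequency-domain illustration, not the Lindau and Brinkmann (2012) or Erbes et al. (2017) implementation; the regularization weight and the simulated measurements are hypothetical:

```python
import numpy as np

def compensation_filter(htfs, beta=0.005):
    """Headphone compensation filter via regularized inversion of the
    complex mean of repeated headphone transfer function measurements.

    htfs: complex array of shape (n_measurements, n_bins), one half-spectrum
    per headphone repositioning. `beta` (hypothetical value) limits the
    inverse filter gain where |H| is small."""
    h_mean = htfs.mean(axis=0)  # complex mean over repositionings
    inv = np.conj(h_mean) / (np.abs(h_mean) ** 2 + beta)
    return np.fft.irfft(inv)    # impulse response of the compensation filter

# 12 simulated repositioning measurements of a near-flat system (hypothetical)
rng = np.random.default_rng(1)
htfs = 1.0 + 0.05 * (rng.standard_normal((12, 257)) + 1j * rng.standard_normal((12, 257)))
comp_ir = compensation_filter(htfs)
```

For a near-flat system, the resulting impulse response is close to a unit impulse; for real headphones, it flattens the average transfer function while the regularization prevents excessive boosts at deep notches.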
Test Condition 3: Introduction of Headphone-Based Dynamic Binaural Rendering and HMD-Based Visual Feedback (C3)
The third test condition (C3) was a fully immersive audiovisual virtual environment. The LED array was replaced by its virtual counterpart (Figure 1c). Additional gamification elements were included in this condition. For example, the player earned stars as they progressed through the test, and there were short text prompts that displayed encouraging messages in a gamified manner to support the voice instructions.
The subject simply wore the HMD and headphones and started playing the game. The virtual LED array automatically adjusted to the listener’s interaural axis height when the game started. Stimuli were presented when the listener's head was oriented at 0° azimuth and if their head or the HMD was not tilted (roll control). In addition, this scenario integrated dynamic binaural rendering with head tracking.
We evaluated two different renderers for the dynamic binaural presentation: the Steam® Audio SDK (Valve Corporation, 2022) and the Unity wrapper for the 3D Tune-In Toolkit (Cuevas-Rodríguez et al., 2019; Reyes-Lecuona & Picinali, 2022). For the current study, we chose the 3D Tune-In Toolkit because it is open-source, well-documented, and explicitly developed for hearing research. We used the same headphone compensation filter and the same Neumann KU100 HRTFs as in the previous condition (C2), but in this case as a full-spherical HRTF set in SOFA format (Majdak et al., 2022).
A short video illustrating some trials in this test condition is included in the Supplemental Material (Ramírez et al., 2024). Additionally, Table 1 summarizes the parameter settings in all three test conditions and compares them to the traditional ERKI setting.
Parameter Settings: Comparison Between ERKI and All Scenarios Implemented in the Current Study.
Abbreviations: ERKI = Erfassung des Richtungshörens bei Kindern; HRTFs = head-related transfer functions; VBAP = vector base amplitude panning; HMD = head-mounted display; LED = light emitting diode.
Experimental Procedure
We used a within-participant design so that the procedure was identical for all three test conditions (C1, C2, and C3), and the order of presentation of the test conditions was randomized. The subject sat in the center of the loudspeaker array with the head oriented at the 0° azimuth. The stimuli (65 dB(A) SPL) were presented if their head was aligned to 0° (a slight tolerance of ±2° was allowed). If their head was not aligned, the LED dot at 0° azimuth flashed, and a beep signal sounded to direct their attention to the desired direction. Additionally, voice messages instructed the listener to look forward.
The subjects’ task was to determine the location of the perceived signal by extending their arm and pointing in the direction of the perceived sound object in space and pressing a button on the handheld controller to confirm their response. Participants heard encouraging messages and tones regardless of the accuracy of their responses. They could not repeat a trial or proceed without responding, and no feedback was provided on whether a response was correct. The flashing central light and beep redirected listeners’ attention until their head was reoriented to 0° azimuth, and the next stimulus was presented.
Before each test condition, child-friendly voice prompts guided listeners through the procedure. Following the voice instructions, the subject was guided through the first practice trial in a game-like fashion. A total of five practice trials were presented to familiarize them with the setup, stimuli, and procedure. After the practice trials, there was room for questions about the task.
In the experiment, stimuli were randomly presented once from all possible 37 positions (from −90° to +90° azimuth), corresponding to the frontal horizontal plane with a resolution of 5°. We used one trial per position as the previous ERKI studies showed that the number of trials could be reduced to one without negatively affecting the reliability of the results (Plotz & Schmidt, 2017). A complete experimental session lasted ∼ 1 hr, including instructions, training, and short breaks between test conditions.
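The trial list described above, one randomized presentation per position, can be sketched as follows (an illustrative Python sketch; the original application was implemented in MATLAB):

```python
import random

def make_trial_list(seed=None):
    """One trial per source position: 37 azimuths from -90° to +90°
    in 5° steps, presented in random order."""
    positions = list(range(-90, 95, 5))  # 37 positions over the frontal plane
    rng = random.Random(seed)
    rng.shuffle(positions)
    return positions

trials = make_trial_list(seed=42)
```

Each participant thus completed 37 trials per test condition, with an independent random order per condition.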
The study was conducted following the principles of the Declaration of Helsinki (World Medical Association, 2013) and the guidelines of the local institutional review board of the Institute of Computer and Communication Technology at the TH Köln University of Applied Sciences. All participants gave written informed consent for voluntary participation in the study and the subsequent publication of the results. All personal data and experimental results were collected, processed, and archived according to country-specific data protection regulations.
Parameters and Statistical Analysis
We estimated localization performance for each test condition using the root-mean-square (RMS) localization error, that is, the root of the mean squared angular deviation between the perceived and the presented source azimuths across all trials.
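The metric can be computed as in this minimal sketch (the example deviations are hypothetical):

```python
import math

def rms_localization_error(presented_deg, perceived_deg):
    """RMS localization error: root of the mean squared angular deviation
    between perceived and presented azimuths (degrees)."""
    assert len(presented_deg) == len(perceived_deg)
    sq = [(b - a) ** 2 for a, b in zip(presented_deg, perceived_deg)]
    return math.sqrt(sum(sq) / len(sq))

# Example: three trials with 0°, +5°, and -5° deviations
err = rms_localization_error([0, 45, -45], [0, 50, -50])  # ≈ 4.08
```

Because deviations are squared before averaging, occasional large errors (e.g., toward the lateral regions) weigh more heavily than many small ones.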
For statistical analysis of the results, we first applied a Lilliefors test for normality to the RMS localization error data.
We performed a two-way repeated measures analysis of variance (ANOVA) on the RMS localization errors with the within-subjects factors test condition (C1, C2, and C3) and region (frontal vs. lateral).
Results
Sound Localization Performance
Figure 2 shows the results of the listening experiment for two different subjects: one with above-average localization accuracy (subject No. 5) and one with relatively higher localization errors (subject No. 15). The plot shows the perceived (subjective) position of the sound source as a function of its real (objective) position for all the test conditions.

Subjective localization of a sound source as a function of its position over the horizontal plane. The graph shows the objective position (abscissa) and the corresponding subjective localization (ordinate) of two study participants: Subject No. 5 (left) and Subject No. 15 (right) in all the experimental conditions (C1, C2, and C3). The solid lines are the fifth-order best-fit polynomial curves for the discrete data. The solid black line represents the ideal correct localization, and the dotted gray lines represent the closest possible responses (adjacent positions) to both the left and right sides of the ideal correct localization.
Note that our experiment allows localization responses with a resolution of 5° at best. Therefore, for ease of interpretation, we have added dotted gray lines to the graph to represent the closest possible adjacent responses to the left and right sides of the ideal correct localization. Figure S6 in the Supplemental Material (Ramírez et al., 2024) shows the individual results for all subjects.
Figure 3 shows the pooled results of all subjects as a two-dimensional histogram. It shows the perceived localization of a sound source as a function of its position over the horizontal plane in all three test conditions.

Two-dimensional histograms of the perceived localization of a sound source as a function of its position over the horizontal plane in all the experimental conditions: C1 (left), C2 (middle), and C3 (right), pooled over all the subjects (n = 19). The solid black line represents the ideal correct localization, and the dotted gray lines represent the closest possible responses (adjacent positions) to both the left and right sides of the ideal correct localization.
The plots show that localization accuracy tends to decrease in conditions C2 and C3 compared to C1, especially for the lateral locations, which is consistent with our hypothesis and the literature. The average RMS localization errors across subjects with their standard deviations are
To better interpret these results, we present the RMS localization errors for all test conditions as box plots.

RMS localization error in degree across test conditions (C1, C2, and C3) for all subjects (n = 19) over the entire frontal horizontal plane (left) and by region (right). The individual RMS localization errors are shown as colored markers. The boxes represent the IQR across participants, and the medians are shown as solid black lines. Whiskers indicate 1.5 × IQR below the 25th percentile or above the 75th percentile, and asterisks indicate outliers beyond this range. Left: Trend lines connect the results per participant. The color of the line indicates higher (in red) or lower (in green) RMS localization error compared to the previous test condition. Right: Frontal region: Area containing stimuli between ± 45° azimuth in front of the listener. Lateral region: All areas beyond these limits, that is, from −45° to −90° azimuth to the left and from 45° to 90° azimuth to the right.
In addition, we present the RMS localization error by region (Figure 4, right).
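As a concrete illustration of the error metric used here, the RMS localization error, overall and split into the frontal and lateral regions defined in the figure caption, can be computed as in the following minimal sketch. The trial data are hypothetical values chosen for illustration only:

```python
import math

def rms_error(targets, responses):
    """Root-mean-square localization error in degrees."""
    return math.sqrt(sum((t - r) ** 2 for t, r in zip(targets, responses)) / len(targets))

# Hypothetical trial data: target azimuths and one listener's responses (degrees)
targets   = [-90, -45, 0, 45, 90]
responses = [-80, -45, 5, 50, 75]

overall = rms_error(targets, responses)

# Split by region as in the figure: frontal = |azimuth| <= 45 deg, lateral otherwise
frontal = [(t, r) for t, r in zip(targets, responses) if abs(t) <= 45]
lateral = [(t, r) for t, r in zip(targets, responses) if abs(t) > 45]
rms_frontal = rms_error(*zip(*frontal))
rms_lateral = rms_error(*zip(*lateral))
```

Squaring before averaging makes the metric penalize large lateral confusions more heavily than small frontal deviations, which matches the pattern visible in the by-region plots.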
A two-way repeated measures ANOVA for the RMS localization error with the within-subjects factors
Post-hoc Tukey's HSD tests (Tukey, 1949) showed significant differences in the paired comparisons between the RMS localization error
Furthermore, although the listeners’ performance, that is, localization accuracy, was on average
To determine whether subjects’ ability to localize stimuli changed during the experiment due to factors such as fatigue, increased familiarity with certain stimulus features, or increased proficiency with the experimental procedure, we divided the results of each condition into four time periods, or epochs, and calculated RMS localization errors separately for each epoch. Epoch 1 included trials 1–9, epoch 2 included trials 10–18, epoch 3 included trials 19–27, and epoch 4 included trials 28–37.
We performed a one-way repeated measures ANOVA with epoch as the within-subjects factor for statistical analysis. The results showed no significant effect of epoch for any of the three conditions (
Figure S7 in the Supplementary Material (Ramírez et al., 2024) shows the corresponding RMS localization errors for the three conditions as a function of the four epochs using box plots.
Localization Performance in VR Relative to Baseline
Although the sample sizes used in our study are not large enough to establish a normative sample, the data can provide a first hint at classification criteria relevant for localization testing over the frontal horizontal plane in NH adults using conventional setups and in AR and VR environments. In the following, we do not discuss test condition C2 any further: the primary objective of our study is to evaluate the overall feasibility of assessing sound localization skills in VR, and since our results showed no significant differences between the RMS localization errors of C2 and C3, our analysis and subsequent discussion focus on the VR-based application for sound localization testing (C3).
Figure 5 shows the best-fitting normal distribution of the data collected in our study for test conditions C1 and C3, using the mean and standard deviation of the RMS localization error of these test conditions. The vertical lines in the figure represent the cutoff values for normal sound localization abilities based on the mean and standard deviation of our baseline measurement (C1) and the VR condition (C3) for NH adult listeners. The solid lines represent the cutoff values for a scoring criterion of localization deficits of two standard deviations below the mean, and the dashed lines the cutoff values for a criterion of three standard deviations below the mean.

Best fitting normal distribution of the experimental data for test conditions C1 (purple) and C3 (pink). The vertical lines indicate the cutoff values, that is, the maximum RMS localization error a subject could have to pass the localization test. They are calculated based on the mean and standard deviation of the data collected in C1 (purple lines) and C3 (pink lines) from NH listeners. The cutoffs for performance deficit scoring criteria of two and three standard deviations below the mean are shown respectively:
Figure 5 shows that, as expected, the virtual sound localization test (C3) has a reduced mean accuracy compared to the baseline (C1 in our study). The higher localization errors in C3 also lead to a wider distribution and, consequently, higher cutoff values.
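The cutoff computation itself is straightforward: for an error metric, a performance deficit of k standard deviations corresponds to an RMS error k standard deviations above the normative mean. The sketch below illustrates this with hypothetical per-subject values, not the data from our study:

```python
import math

def cutoffs(rms_errors, criteria=(2, 3)):
    """Cutoff RMS error = mean + k * SD for each deficit criterion k.

    A listener whose RMS error exceeds the cutoff would be flagged as
    performing k standard deviations worse than the normative mean.
    """
    n = len(rms_errors)
    mean = sum(rms_errors) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in rms_errors) / (n - 1))  # sample SD
    return {k: mean + k * sd for k in criteria}

# Hypothetical per-subject RMS errors (degrees) for a baseline and a VR condition
baseline = [6.0, 7.5, 5.5, 8.0, 6.5]
vr = [10.0, 14.0, 11.5, 16.0, 12.0]

cut_baseline = cutoffs(baseline)
cut_vr = cutoffs(vr)
```

Because the VR condition has both a higher mean error and a wider spread, its cutoffs necessarily sit higher, which is exactly the pattern described for C3 relative to C1.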
The relationship between C1 and C3 scores for individual listeners is shown in Figure 6. The graph shows that most subjects performed significantly worse in C3 than in C1, with only a few subjects performing similarly or better in C3. The solid gray regression line predicts localization accuracy in VR from localization accuracy in the more standard baseline condition. In addition, the cutoff values, indicated as purple solid and dashed lines, provide a first proxy for classification criteria for sound localization testing in the horizontal plane in real and virtual conditions.

Relationship between RMS localization errors in the loudspeaker-based (C1) and virtual (C3) conditions for individual listeners (pink dots) in our study. The solid gray line is the least-squares regression line, and the dashed gray line depicts a perfect correlation. The purple solid and dashed lines show the cutoff values for evaluation criteria of performance deficits of
There is a low to moderate, positive but nonsignificant correlation between the two data sets (
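The regression and correlation analysis behind Figure 6 can be sketched with plain least squares and Pearson's r; the per-subject values below are hypothetical stand-ins, not our measured data:

```python
import math

def linear_fit(x, y):
    """Least-squares slope/intercept and Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r

# Hypothetical per-subject RMS errors (degrees): baseline (C1) vs. VR (C3)
c1 = [5.0, 6.0, 7.0, 8.0, 9.0]
c3 = [11.0, 12.5, 12.0, 15.0, 14.5]

slope, intercept, r = linear_fit(c1, c3)

# Predict a listener's VR error from their baseline error
predict_c3 = lambda baseline_error: intercept + slope * baseline_error
```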
Discussion
On the Availability of Normative Data
To determine whether sound localization testing in VR is a viable alternative to the traditional loudspeaker-based setups in the context of comprehensive auditory processing diagnostic testing, it is necessary to consider the criteria used in clinical practice to evaluate auditory test performance. As noted in the “Introduction” section, international guidelines and expert consensus recommend that performance deficits of at least
Published data on the sound localization abilities of NH listeners mainly refer to the abilities observed in laboratory studies on adults (Blauert, 1996; Makous & Middlebrooks, 1990; Middlebrooks & Green, 1991; Perrott & Saberi, 1990; Sabin et al., 2005). Data on children are even scarcer. In most cases, studies of sound localization skills in infants and school-aged children seem to have been motivated by the need to benchmark the development of these skills in NH individuals so that their performance can be compared with the evolution of patients after hearing aid and/or cochlear implant fitting (Beijen et al., 2010; Bess et al., 1986; Grieco-Calub & Litovsky, 2010).
Furthermore, the terms “minimum audible angle” and “localization accuracy” have been used interchangeably in some studies, although both measures are based on different psychoacoustic paradigms. The first measure is derived from a relative spatial discrimination task that assesses the ability to distinguish between different sound source locations. The second is derived from an absolute identification task that measures the ability to identify a single sound location without a reference. It is still unclear whether or how the measures from the two tasks are related (Moore et al., 2008; Werner et al., 2012, Chapter 6). To date, researchers agree that absolute localization (source identification) and spatial discrimination are two distinct auditory tasks that tax (at least in part) different stages of the ascending auditory pathway (Kühnle et al., 2013; Spierer et al., 2009; Zatorre et al., 2002). Moreover, absolute localization is considered a more ecologically relevant measure of localization ability (Werner et al., 2012). Therefore, we limit our discussion to previous studies that used the same psychoacoustic paradigm as in our study, i.e., an absolute localization task.
The methods used to measure localization accuracy in previous studies vary widely, making it difficult to compare their results. Several factors may influence and explain the performance variability seen in previously published data. These factors include the number of loudspeakers used and their distribution in the horizontal plane, whether they are visible or not, the type and duration of the stimuli, the response/pointing methods, and the training protocols prior to the experimental session. In addition, some studies report absolute mean localization errors
Populin (2008) collected data from adults with NH
Our study has the most methodological similarities with the study by Yost et al. (2013), and the results of both studies are also quite comparable. The mean RMS localization error that we obtained in our loudspeaker-based scenario
The development of portable and accessible sound localization testing setups and procedures, such as the one presented in this study, could facilitate the collection of normative data in clinical and laboratory settings.
Sound Localization Testing in VR
VR-based sound localization testing using consumer-grade devices has great potential to simplify the screening of APDs (as part of a larger virtualized comprehensive test battery). However, our results confirm that virtualization increases localization blur.
Binaural reproduction of virtual sound sources using nonindividual HRTFs has the greatest negative impact on localization accuracy. On the other hand, switching from real to virtual visual representation did not significantly affect performance. The use of VR for sound localization testing may be a trade-off between cost efficiency and accessibility on the one hand and lower accuracy on the other.
The results of our baseline localization test are consistent with similar studies in the literature, as are the corresponding proxy cutoff values. This consistency further validates the reliability of our loudspeaker-based localization test. The virtual replica of this test provides the first insight into the localization performance of NH listeners in virtual audiovisual test environments. In particular, our study provides the first proxy cutoff values for sound localization screening using VR-based setups.
One point to consider is that our study included only NH adult listeners. The available data on the sound localization abilities of NH children suggest that they show higher RMS localization errors and intersubject variability than NH adults. Based on the data reported by Grieco-Calub and Litovsky (2010) and Litovsky and Godar (2010), as well as clinical practice guidelines (American Academy of Audiology, 2010; American Speech-Language-Hearing Association, 2005; Musiek & Chermak, 2014), it can be estimated that children (5+ years old) with sound localization deficits would need average RMS localization errors greater than ∼32° before their performance can be considered abnormal (using the most conservative criterion of
In general, our analysis of the feasibility of VR-based sound localization testing is based on the assumption that patients with sound localization deficits will not perform better in the virtual version of the sound localization test than in conventional loudspeaker-based setups. In this context, Brungart et al. (2017) showed that similar to NH listeners, HI listeners have higher sound localization errors when using virtualized auditory stimuli (headphone-based dynamic rendering with nonindividual HRTFs) than when using a traditional loudspeaker-based setup. They conclude that their study provides evidence that a virtual auditory display based on nonindividual HRTFs could be a valuable tool for evaluating the localization abilities of HI listeners. However, it should be noted that the case of patients with suspected APDs may be different from the case of HI patients. Further studies including patients with suspected APDs are needed to verify this.
At the expense of (already expected) reduced sound localization accuracy, the use of headphone-based binaural presentation using nonindividual HRTFs in the assessment of sound localization abilities opens up other possibilities, such as the chance to independently manipulate specific parameters, for example, interaural time and level differences, spectral information, and different HRTF sets, among others. In addition, VR technology allows us to further investigate the effects of multimodal stimulation in such behavioral tests, for example, by manipulating visual and proprioceptive feedback independently and in a controlled manner. The ability to conduct auditory processing tests in an immersive and controlled VR environment free from distraction is also attractive. In particular, gamification is an aspect that can be highly relevant, especially for children who may have shorter attention spans.
If the performance of the virtual sound localization test is still considered insufficient, it could be further improved by enhancing spatialization using individual or individualized HRTFs, thereby improving localization accuracy in general. Many individualization approaches exist (Guezenoc & Séguier, 2018), and methods for adapting generic HRTFs based on individual anthropometric data could be implemented in a VR application in a way that is reasonably manageable for users. For example, a comparatively simple individualization approach would be to adapt the interaural time differences of the generic HRTFs to the listener's individual head size, which would need to be measured in some way. This individualization should generally improve localization accuracy in the horizontal plane. However, individualization (even using individual HRTFs) conflicts with making sound localization testing in VR easy to use and accessible. A balance must be sought between (potentially) increasing accuracy through (sometimes complex) individualization approaches and making an assessment tool general and easy to use.
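As an illustration of such a comparatively simple individualization step, the sketch below uses the Woodworth spherical-head formula, a textbook approximation, to scale the ITDs of a generic HRTF set to a listener's measured head radius. Both the formula and the numeric values are illustrative assumptions, not the procedure proposed in this article:

```python
import math

SPEED_OF_SOUND = 343.0  # m/s

def woodworth_itd(azimuth_deg, head_radius_m):
    """Interaural time difference (s) for a rigid spherical head (Woodworth model).

    Valid for source azimuths in the frontal horizontal plane (-90 to 90 degrees).
    """
    theta = math.radians(azimuth_deg)
    return (head_radius_m / SPEED_OF_SOUND) * (theta + math.sin(theta))

# Scale factor to adapt the ITDs of a generic HRTF set (recorded on a head of
# radius r_generic) to a listener with measured head radius r_listener
r_generic, r_listener = 0.0875, 0.095  # meters (hypothetical values)
scale = r_listener / r_generic

itd_generic = woodworth_itd(45, r_generic)
itd_individual = itd_generic * scale  # equivalent to woodworth_itd(45, r_listener)
```

Because the model is linear in head radius, rescaling the ITDs of an entire generic HRTF set reduces to one multiplication per measurement, which keeps such an individualization step manageable inside a VR application.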
The use of low-cost, standalone HMDs, such as the one used in this study, may increase clinical efficiency by allowing testing to be performed in any clinic, at home, and periodically as needed. Simplifying the equipment and space requirements to assess complex listening skills, including the development of home testing versions, could increase equity of access to hearing care across geographic boundaries, improve the quality of care, and enhance the experience of patients and their families.
Future work should evaluate sound localization skills in VR with NH children, adolescents, and patients with suspected APDs. In addition, the effectiveness of gamified auditory training suites in VR for the remediation of APDs remains to be investigated.
This manuscript presents an initial feasibility study along with the first estimates of normative cutoff values for sound localization abilities in VR in NH adult listeners. However, in order to use such a test for individual screening, it is necessary to collect large, dedicated standardization samples. These samples should include patients with spatial hearing deficits and normal controls in different age groups and should be collected under both conventional loudspeaker-based and VR conditions. This will help to determine appropriate cutoff criteria for sound localization testing using VR setups.
Conclusion
We examined how the virtualization of sound localization testing setups and procedures degrades subjects’ localization accuracy. Moreover, we discussed how, at the same time, sound localization testing in VR could be a viable alternative to conventional loudspeaker-based setups. With the goal of advancing VR-based sound localization testing to aid in the screening of APDs, the study provides initial proxy normative data and approximate cutoff values for sound localization testing in virtual audiovisual environments using state-of-the-art technology and consumer-grade standalone hardware.
Our results encourage further evaluation of sound localization testing in VR as an alternative to bulky and more complex systems. VR may provide a time- and resource-saving prescreening tool. These findings support the further development of VR applications for the assessment of spatial hearing abilities, facilitating the screening and diagnosis of patients with suspected APDs and further research into the mechanisms of spatial auditory processing.
Supplemental Material
Supplemental material for “Toward Sound Localization Testing in Virtual Reality to Aid in the Screening of Auditory Processing Disorders” by Melissa Ramírez, Johannes M. Arend, Petra von Gablenz, Heinrich R. Liesefeld, and Christoph Pörschmann, Trends in Hearing.
Acknowledgments
The authors would like to thank all the participants in the study for their time. The authors thank Kai Altwicker for his help in designing and building the loudspeaker and LED arrays, Miguel Ángel Olivares for his help in developing the VR scenario, and Raphaël Gillioz for the visual documentation of the experimental setup. The authors are grateful to the action editors and reviewers who provided constructive feedback that significantly improved earlier versions of the manuscript.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was sponsored by the German Federal Ministry of Education and Research BMBF (13FH666IA6-VIWER-S) and partly by the German Research Foundation (DFG WE 4057/21-1).
Supplemental Material
Supplemental material for this article is available online at https://doi.org/10.5281/zenodo.10655341.
