Abstract
Hearing-impaired listeners are known to have difficulties not only with understanding speech in noise but also with judging source distance and movement, and these deficits are related to perceived handicap. It is possible that the perception of spatially dynamic sounds can be improved with hearing aids (HAs), but so far this has not been investigated. In a previous study, older hearing-impaired listeners showed poorer detectability for virtual left-right (angular) and near-far (radial) source movements due to lateral interfering sounds and reverberation, respectively. In the current study, potential ways of improving these deficits with HAs were explored. Using stimuli very similar to before, detailed acoustic analyses were carried out to examine the influence of different HA algorithms for suppressing noise and reverberation on the acoustic cues previously shown to be associated with source movement detectability. For an algorithm that combined unilateral directional microphones with binaural coherence-based noise reduction and for a bilateral beamformer with binaural cue preservation, movement-induced changes in spectral coloration, signal-to-noise ratio, and direct-to-reverberant energy ratio were greater compared with no HA processing. To evaluate these two algorithms perceptually, aided measurements of angular and radial source movement detectability were performed with 20 older hearing-impaired listeners. The analyses showed that, in the presence of concurrent interfering sounds and reverberation, the bilateral beamformer could restore source movement detectability in both spatial dimensions, whereas the other algorithm only improved detectability in the near-far dimension. Together, these results provide a basis for improving the detectability of spatially dynamic sounds with HAs.
Introduction
Hearing-impaired listeners are known to exhibit considerable difficulties in complex environments relative to normal-hearing peers. For example, in situations where multiple talkers are present their ability to understand speech deteriorates more dramatically compared with quiet conditions (Dirks, Morgan, & Dubno, 1982; Plomp, 1978); thus, hearing-impaired listeners require a better signal-to-noise ratio (SNR) to achieve a performance similar to that of normal-hearing listeners (Plomp, 1986). Hearing aids (HAs) can help by restoring audibility and by improving the SNR. This can improve speech reception in noise, but it may also compromise spatial hearing abilities such as movement perception (e.g., Carlile & Leung, 2016). Together with source distance judgments (e.g., Akeroyd, Gatehouse, & Blaschke, 2007), the perception of source movement has been found to be difficult for hearing-impaired listeners, which in turn has been related to the experience of handicap (Gatehouse & Noble, 2004). Although bilateral HA fittings can provide some benefit in complex situations compared with unaided listening, research has shown that there clearly is room for further improvement (Noble & Gatehouse, 2006). Therefore, the identification of possible avenues for improving these abilities by studying the effects of HA signal processing on the perception of spatial dynamics is a worthwhile goal.
Despite this, research into the spatial hearing abilities of hearing-impaired listeners has been largely restricted to localization and discrimination performance in relatively simple acoustic scenarios. In realistic environments, sound sources and listeners typically both move around but so far hardly any research into motion perception with hearing impairment has been carried out. One exception is the study of Brimijoin and Akeroyd (2014) who found that for both normal-hearing and hearing-impaired listeners detection thresholds for self-motion (i.e., rotations of the head) were smaller than for source motion, suggesting more accurate spatial hearing abilities with self-motion cues. In a recent study, we investigated sensitivity to angular and radial source movements as a function of acoustic complexity with young normal-hearing (YNH) and older hearing-impaired (OHI) listeners (Lundbeck, Grimm, Hohmann, Laugesen, & Neher, 2017). That is, we used virtual acoustics to simulate complex sound scenarios with a moving target source and multiple static interferers. The YNH listeners were only slightly affected by the number of concurrent sound sources in the left-right (L-R) dimension and did not show poorer performance in the near-far (N-F) dimension when reverberation was added. For the (linearly aided) OHI listeners, we found that concurrent interfering sounds impaired the detectability of L-R source movements, and reverberation that of N-F source movements.
The results from our previous study raise the question of how to compensate for these deficits with HA signal processing. Over the past decades, researchers have studied L-R localization performance in static auditory space for listeners with hearing loss with or without HAs. In a recent review chapter, Akeroyd and Whitmer (2016) summarized several studies investigating different types of aided performance in L-R localization tasks. For example, Keidser et al. (2006); Keidser, O'Brien, Hain, McLelland, and Yeend (2009); Picou, Aspell, and Ricketts (2014); and Van den Bogaert, Klasen, Moonen, Van Deun, and Wouters (2006) all investigated different HA features including directional microphone settings in terms of their influence on the sound localization performance of hearing-impaired listeners. While Keidser et al. (2006) found that there was scope for improving the localization performance of hearing-impaired listeners, the other studies indicated minimal effects of directional microphones.
In contrast to most other research on spatial hearing with HAs, Best, Mejia, Freeston, Van Hoesel, and Dillon (2015) recently addressed the performance of bilateral beamformers in static and dynamic multitalker environments. They found that, relative to conventional directional microphones, the tested beamformer algorithms were generally superior for speech perception and L-R localization in noise involving fixed frontal targets but not for situations involving head movements. Neither source movement nor N-F spatial perception was addressed in their study.
The studies reviewed above have in common that they used static or pseudo-dynamic (i.e., successive presentations of a target signal from different discrete locations) scenarios to investigate the influence of HA algorithms. So far, no systematic research appears to have been conducted into the effects of HAs on target source movement detection in the L-R and N-F dimensions in the presence of interfering sounds and reverberation. In the current study, we therefore investigated the influence of different multimicrophone signal enhancement algorithms on source movement detection in acoustically complex situations. To that end, we used a higher order Ambisonics-based system for simulating complex sound scenes together with a computer simulation of bilateral multimicrophone HAs. The HAs were configured to process the input signals in different ways. More specifically, the algorithms that we chose suppressed sound coming from non-frontal directions and furthermore attenuated spatially diffuse sound components such as reverberation. Reverberant sound is known to enhance auditory distance perception but it can also degrade localization performance (Valimaki et al., 2012).
In our previous study, we found that OHI listeners were negatively affected by reverberation in terms of their ability to perform a source movement detection task. In addition, we found that monaural spectral changes clearly increased as a result of source movements. On the basis of these findings, we started the current study with a detailed acoustic analysis to investigate changes in acoustical parameters that we previously had found to be related to source movement perception: changes in monaural spectral information, direct-to-reverberant energy ratio (DRR), as well as SNR. In doing so, our aim was to find out whether the chosen HA algorithms would alter these measures in a way that could be beneficial for HA users in a source movement detection task. We then evaluated the most promising HA algorithms in a listening test to explore the potential for improving source movement detection with HAs.
In summary, the current study had the following aims:
To acoustically evaluate various signal enhancement algorithms in terms of their ability to enhance acoustic cues that have previously been shown to be associated with L-R and N-F source movement detection;
To evaluate the most promising HA algorithms for improving L-R and N-F source movement detection with a group of OHI listeners.
Methods
The current study was approved by the ethics committee of the University of Oldenburg. All participants provided written informed consent and received financial compensation for their participation.
Experimental Setup
The experimental setup was based on the one from our previous study (Lundbeck et al., 2017). The acoustic environment was simulated using a toolbox for creating dynamic virtual acoustic environments (TASCARpro version 0.128; Grimm, Luberadzka, Herzke, & Hohmann, 2015). A two-dimensional (horizontal-plane) 23rd-order Ambisonics receiver with max-rE decoding (Daniel, 2000) was used, resulting in a theoretical frequency range of up to 16 kHz without spatial aliasing artifacts. It was configured to produce 48 virtual loudspeaker signals with a spatial resolution of 7.5°. The virtual listener was seated at the center of the simulated loudspeaker array. As the aim of the current study was to include different HA algorithms, multimicrophone signals were generated by convolving the virtual loudspeaker signals with binaural room impulse responses for the corresponding directions. The impulse response measurements were taken from the database of Thiemann and van de Par (2015). They were recorded in an anechoic chamber (volume = 238 m³) with a head-and-torso simulator equipped with two behind-the-ear (BTE) HA
The simulated acoustic scenario was an entrance hall of approximately 10.5 m × 6 m × 2.8 m with solid walls (including various large glass surfaces) and a wooden floor (see Lundbeck et al., 2017). The head of the virtual listener was placed 1 m away from the middle of the shorter wall facing along the longer side at a height of 1.5 m. In the reference condition, the target source was located 1 m away from, and directly in front of, the listener. A change in complexity of the scenario was achieved by adding four static interfering sound sources at a distance of 1 m each with azimuthal angles of ±45° and ±90° relative to the frontal direction.
HA Signal Processing
For the simulation of the different HA algorithms, we used the master hearing aid (MHA) research platform (Grimm, Herzke, Berg, & Hohmann, 2006). The MHA comes with a set of representative multimicrophone signal enhancement algorithms. In the current study, we compared algorithms based on a pair of unilateral directional microphones (DIR), a binaural coherence-based noise reduction (NR) scheme for the suppression of spatially diffuse signals (COH), a bilateral beamformer with binaural cue preservation (BEAM), as well as different combinations thereof. In addition, we included a reference condition without any processing (UNPROC). All signal processing was carried out at a sampling rate of 16 kHz. Prior to their presentation, all stimuli were resampled to 44.1 kHz.
Unprocessed
The UNPROC condition corresponded to a pair of omnidirectional microphones that were simulated using the front microphone signals of the two BTE devices. No other processing was applied.
Combination of DIR and COH
Apart from testing the DIR and COH algorithms separately, we also tested them in series (combination of DIR and COH [DIRCOH]) to achieve stronger suppression of noise and reverberation.
DIR: In the DIR condition, the front and rear microphone signals of each BTE device were processed using a simple delay-and-subtract beamformer (e.g., Doclo & Moonen, 2003) to simulate a pair of static forward-facing cardioid microphones. To compensate for the high-pass characteristic that is typical of directional microphones, we spectrally equalized each output signal with a finite impulse response (FIR) filter that ensured that the frontal target signals sounded highly similar across the UNPROC and DIR conditions.
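The delay-and-subtract principle described above can be illustrated with a minimal sketch. It assumes an integer-sample delay that equals the acoustic travel time between the two microphones; the function names, the one-sample delay, and the 1-kHz test tone are hypothetical, and the spectral equalization filter is not included. This is not the MHA implementation.

```python
import math

def delay(signal, n_samples):
    """Delay a signal by an integer number of samples (zero-padded at the start)."""
    return [0.0] * n_samples + signal[:len(signal) - n_samples]

def delay_and_subtract(front, rear, n_samples):
    """Forward-facing cardioid: subtract the delayed rear signal from the front signal."""
    rear_delayed = delay(rear, n_samples)
    return [f - r for f, r in zip(front, rear_delayed)]

def rms(x):
    return math.sqrt(sum(v * v for v in x) / len(x))

# Hypothetical values: 16-kHz sampling rate, one-sample inter-microphone delay
fs = 16000
d = 1
tone = [math.sin(2 * math.pi * 1000 * n / fs) for n in range(160)]

# Source behind the listener: the rear microphone receives the sound d samples earlier
rear_src_out = delay_and_subtract(front=delay(tone, d), rear=tone, n_samples=d)
# Source in front of the listener: the front microphone receives the sound first
front_src_out = delay_and_subtract(front=tone, rear=delay(tone, d), n_samples=d)

print(rms(rear_src_out), rms(front_src_out))
```

In this idealized case, the rear source is cancelled completely, whereas the frontal source passes with a frequency-dependent (high-pass) gain, which is why an equalization filter is needed.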
COH: In the COH condition, a binaural NR scheme for attenuating incoherent signal segments (Grimm, Hohmann, & Kollmeier, 2009) was used. This algorithm is effective in spatially diffuse environments. Using a time constant of 40 ms, it first estimates the short-term coherence
Signal processing chain used for the acoustical analyses. Following the generation of the stimuli using TASCAR (left) and the hearing aid processing in the MHA (middle), different output channels (1–6) were analyzed using different measures (right).
Bilateral beamformer
The BEAM algorithm corresponded to a bilateral beamforming algorithm after Rohdenburg, Hohmann, and Kollmeier (2007). This algorithm makes use of the minimum variance distortionless response beamformer (Bitzer & Simmer, 2001). It generates a single-channel estimate of the desired target signal based on the available input signals and then applies a binaural postfilter (Lotter & Vary, 2006) to this estimate. By applying identical gains to the left and right sides, the original ITDs and ILDs of the input signal are preserved. For the current study, we used a nonadaptive, forward-facing implementation based on six input signals (three per side) and the front BTE microphone signals as reference signals for the binaural postfilter. Furthermore, we spectrally equalized the output signals using another FIR filter that compensated for any monaural spectral changes in the frontal (0°) direction.
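To make the distortionless constraint concrete, the following sketch computes textbook minimum variance distortionless response weights, w = R⁻¹d / (dᴴR⁻¹d), for a single frequency bin. The three-microphone steering vector, the phase shift, and the noise covariance matrix are hypothetical values chosen for illustration; the study used six input signals, and the binaural postfilter is not modeled here.

```python
import cmath

def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0j] * n
    for row in reversed(range(n)):
        acc = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - acc) / aug[row][row]
    return x

def mvdr_weights(noise_cov, steering):
    """MVDR weights w = R^-1 d / (d^H R^-1 d) for one frequency bin."""
    r_inv_d = solve(noise_cov, steering)
    denom = sum(d.conjugate() * v for d, v in zip(steering, r_inv_d))
    return [v / denom for v in r_inv_d]

# Hypothetical example: three microphones, partially coherent noise field
phase = 0.4  # assumed inter-microphone phase shift in radians
steering = [cmath.exp(-1j * phase * m) for m in range(3)]
noise_cov = [[1.0, 0.5, 0.25],
             [0.5, 1.0, 0.5],
             [0.25, 0.5, 1.0]]

weights = mvdr_weights(noise_cov, steering)
# Response toward the look direction (w^H d) and residual noise power (w^H R w)
target_response = sum(w.conjugate() * d for w, d in zip(weights, steering))
noise_power = sum(weights[i].conjugate() * noise_cov[i][j] * weights[j]
                  for i in range(3) for j in range(3)).real
print(abs(target_response), noise_power)
```

The look-direction response equals one by construction, while the residual noise power falls below that of the first microphone alone (1.0 in this example).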
Tested HA algorithms
Overview of the HA Conditions Used for the Acoustic and Perceptual Measurements.
HA = hearing aid.
Stimuli
The stimuli that we used were very similar to those from our previous study (Lundbeck et al., 2017). That is, we made use of five different environmental sounds. As the target sound, we chose a broadband noise-like fountain signal. This was because pilot measurements had shown that, compared with other more modulated signals such as a ringing phone, the fountain signal led to clearer acoustic changes, presumably due to its large bandwidth, which is an important factor for movement detection, especially in the angular dimension (e.g., Chandler & Grantham, 1992). As interfering sounds, we used recordings of ringing bells, bleating goats, pouring water, and humming bees. The target sound (S1) was presented at a nominal level of 65 dB sound pressure level (SPL) and the other sounds (S2–S5) at 62 dB SPL (nominal) each, as measured under reverberant conditions at the position of the virtual listener. The duration of each sound was 2.3 s without reverberation and 3.1 s with reverberation.
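As a rough orientation, the nominal levels given above imply a broadband SNR of about -3 dB when all four interferers are active. The sketch below derives this figure under the assumption that the interferers add incoherently (power summation); the value is not reported in the study and ignores the temporal fluctuations of the sounds.

```python
import math

def sum_levels_db(levels_db):
    """Power summation of incoherent sources whose levels are given in dB."""
    return 10 * math.log10(sum(10 ** (level / 10) for level in levels_db))

target_db = 65.0               # nominal target level (dB SPL)
interferers_db = [62.0] * 4    # nominal level of each of the four interferers

# Assumption: interferers are mutually incoherent, so their powers add
nominal_snr_db = target_db - sum_levels_db(interferers_db)
print(round(nominal_snr_db, 2))
```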
Technical Measurements
The acoustic analyses that we conducted were based on the results of our previous study (Lundbeck et al., 2017). Those results indicated that under reverberant conditions, monaural spectral changes play a particular role for source movement detection. Furthermore, the listening tests indicated that interfering sounds lead to poorer detection thresholds in the L-R dimension and that reverberation leads to poorer detection thresholds in the N-F dimension. For the L-R dimension, we therefore chose monaural spectral and SNR changes as our measures of interest. For the N-F dimension, we additionally analyzed changes in the DRR.
General setup and procedure
We performed the analyses separately for the L-R and N-F dimensions. For the generation of the stimuli, we used the median L-R and N-F detection thresholds across all OHI listeners and conditions tested earlier (Lundbeck et al., 2017). Specifically, we generated stimuli where the target signal moved 28° in the L-R direction or 1.5 m in the N-F direction relative to the reference position (0°, 1 m re. the listener). The signal processing chain used for the acoustical analyses is shown in Figure 1. The virtual listener was equipped with two BTE devices with up to three microphones each. The microphone signals generated in this manner were then processed in the MHA. More specifically, we used the shadow-filtering method, that is, we estimated any gains or filter coefficients based on the signal mixture (target + interferers) and then applied these gains or filter coefficients separately to the target and interfering signals. Depending on the measure of interest (see later), we then analyzed different output signals. To be able to reveal short-time changes in the chosen measures, we used a 100-ms analysis window with 50% overlap.
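The shadow-filtering idea can be sketched as follows. The gain rule used here (attenuating low-level blocks of the mixture) is a hypothetical stand-in for the actual HA algorithms, and the toy signals are invented for illustration; the point is only that gains are estimated once on the mixture and then applied unchanged to each component.

```python
def estimate_gains(mixture, block=4):
    """Hypothetical gain rule: attenuate blocks in which the mixture power is low."""
    gains = []
    for start in range(0, len(mixture), block):
        seg = mixture[start:start + block]
        power = sum(v * v for v in seg) / len(seg)
        gains.extend([1.0 if power > 0.5 else 0.25] * len(seg))
    return gains

def apply_gains(signal, gains):
    """Apply sample-wise gains to a signal."""
    return [g * v for g, v in zip(gains, signal)]

# Toy signals (illustrative values only)
target = [1.0, -1.0, 1.0, -1.0, 0.1, -0.1, 0.1, -0.1]
interferer = [0.2, 0.2, -0.2, -0.2, 0.3, 0.3, -0.3, -0.3]
mixture = [t + i for t, i in zip(target, interferer)]

gains = estimate_gains(mixture)                  # estimated on the mixture only
target_out = apply_gains(target, gains)          # same gains, target alone
interferer_out = apply_gains(interferer, gains)  # same gains, interferer alone
mixture_out = apply_gains(mixture, gains)
```

Because the gains are fixed once estimated, the separately processed components sum to the processed mixture, which is what allows the SNR to be evaluated at the output.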
Binaural cues
To analyze the influence of the HA algorithms on binaural cues, we applied the binaural hearing model of Dietz, Ewert, and Hohmann (2011). This model takes a binaural stimulus as input and then estimates ITDs and ILDs, which are the dominant cues for L-R spatial hearing. Consistent with our expectations, we found that the binaural cues were generally unchanged by the HA algorithms that we tested (data not shown). In the following, we therefore focus on the other measures.
Monaural spectral changes
To analyze the influence of the different HA algorithms on monaural spectral cues, we applied a coloration measure of Moore and Tan (2004). After the simulation of peripheral auditory processing, this measure computes the internal excitation pattern for a given input stimulus. In doing so, it considers both the magnitude of the changes in the excitation pattern and the rapidity with which the excitation pattern changes as a function of frequency. It then combines this information into a single dimensionless measure of spectral distance or coloration. In its original form, this measure is further transformed into a prediction of perceived naturalness. For our purposes, however, we used the measure of coloration. Furthermore, we always analyzed the stimulus channel ipsilateral to the movement direction (captured at the frontal BTE microphone) and referenced it to the stationary equivalent of the same stimulus. We did this as this measure needs an
SNR changes
As the HA algorithms included directional and NR processing for the attenuation of unwanted signal components, we estimated the SNR changes due to the applied signal processing. To that end, we used the separate target and interfering signals (see Figure 1, middle panel; channels 3 + 4: target alone, channels 5 + 6: interferers alone). Based on these signals, we calculated the short-term level ratio between the target and interferers at the ipsilateral side (L-R dimension) or averaged across the two sides (N-F dimension).
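A short-term level ratio of this kind could be computed as in the sketch below, which uses the 100-ms window with 50% overlap mentioned earlier. The function name, the rectangular window, and the constant test signals are assumptions made for illustration; frames in which the interferer energy is zero would need separate handling.

```python
import math

def short_term_snr_db(target, interferers, fs, win_s=0.1, overlap=0.5):
    """Frame-wise target-to-interferer energy ratio in dB (rectangular window)."""
    win = int(round(win_s * fs))
    hop = int(round(win * (1.0 - overlap)))
    snrs = []
    for start in range(0, len(target) - win + 1, hop):
        e_target = sum(v * v for v in target[start:start + win])
        e_interf = sum(v * v for v in interferers[start:start + win])
        snrs.append(10 * math.log10(e_target / e_interf))
    return snrs

# Hypothetical 1-s signals at a reduced sampling rate to keep the example small
fs = 1000
target_alone = [1.0] * fs       # separately processed target
interferers_alone = [0.5] * fs  # separately processed interferers

snr_track = short_term_snr_db(target_alone, interferers_alone, fs)
print(len(snr_track), round(snr_track[0], 2))
```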
DRR changes
For the stimuli moving along the N-F dimension, we calculated the DRR. The DRR decreases with increasing source distance in reverberant rooms (e.g., Bronkhorst & Houtgast, 1999). To estimate the DRR, we created two stimuli per condition: one with reverberation and one without it. We then subtracted the anechoic stimulus (comprising the direct sound only) from the reverberant stimulus to create the diffuse-sound stimulus. We then fed the direct- and diffuse-sound signals separately into the MHA and processed them with the different HA algorithms. By comparing the DRR at the input and output of the MHA, we could measure DRR changes due to the applied HA processing.
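The DRR estimation steps described above can be sketched as follows. The toy signals and the stand-in processing (halving the diffuse-sound amplitude while leaving the direct sound untouched) are hypothetical and only serve to show how a DRR change between input and output would be obtained.

```python
import math

def energy(signal):
    return sum(v * v for v in signal)

def drr_db(direct, diffuse):
    """Direct-to-reverberant energy ratio in dB."""
    return 10 * math.log10(energy(direct) / energy(diffuse))

# Toy stimuli (illustrative values only)
anechoic = [1.0, 0.0, 0.0, 0.0]       # direct sound only
reverberant = [1.0, 0.5, 0.5, 0.0]    # direct sound plus reverberation

# Diffuse-sound stimulus: reverberant minus anechoic
diffuse = [r - a for r, a in zip(reverberant, anechoic)]
drr_in = drr_db(anechoic, diffuse)

# Hypothetical processing: direct sound unchanged, diffuse sound attenuated by 6 dB
direct_out = list(anechoic)
diffuse_out = [0.5 * v for v in diffuse]
drr_out = drr_db(direct_out, diffuse_out)

drr_change = drr_out - drr_in
print(round(drr_in, 2), round(drr_out, 2), round(drr_change, 2))
```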
Perceptual Measurements
Participants
The participants were 20 OHI listeners (14 men, 6 women) aged 63 to 80 years (median: 72.5 years). Fifteen of them had bilateral HA experience of at least 2 years. Initially, we measured the participants' hearing thresholds at the standard audiometric frequencies from 0.125 to 8 kHz. All participants had symmetric, sloping mild-to-moderate sensorineural hearing losses. The participants were divided into two groups, 9 for the 1 + 0 (target without interferers) and 11 for the 1 + 4 (target with four interferers) group, based on the results of an initial target detection task (see Detectability of Target Signal section). The average audiograms of the two resultant groups are shown in Figure 2. The mean pure-tone average hearing loss calculated across 0.5, 1, 2, and 4 kHz and both ears (PTA4) was 58.3 dB HL for the 1 + 0 group and 46.8 dB HL for the 1 + 4 group. The median age was 78 years (Group 1 + 0) and 71 years (Group 1 + 4), respectively. Two paired
Mean hearing thresholds averaged across left and right ears for the two participant groups. Error bars denote ± 1 standard deviation.
General setup and procedure
The test setup for the perceptual measurements was based on the one from the technical analyses. As our earlier study (Lundbeck et al., 2017) had revealed a clear negative influence of reverberation on N-F but not L-R movement detection, we carried out the listening test under reverberant (
Illustration of the effects of NAL-RP amplification on the target stimulus (fountain). Gray dashed line: Grand average hearing thresholds (in dB SPL) for the OHI group. Error bars denote ± 1 standard deviation. Black solid line without symbols: LTAS of the target signal at the eardrum without amplification. Black solid line with diamonds: LTAS of the target signal at the eardrum with NAL-RP amplification.
To investigate the perceptual consequences of the different HA algorithms, we carried out a listening test with 20 OHI listeners. Initially, we assessed each listener's ability to detect the target signal in the presence of the four interferers (see Detectability of Target Signal section). Subsequently, we measured movement detection thresholds in the UNPROC condition and with two HA algorithms that we selected based on our acoustical analyses (see Tested HA Algorithms section). For those participants who had problems detecting the target signal in the presence of the interferers (Group 1 + 0), we performed the detection threshold measurements with the target signal alone (no interferers). For the other participants (Group 1 + 4), we performed the measurements with all five signals (see Source Movement Detection Thresholds section).
Prior to the actual measurements, we familiarized the participants with the stimuli and the procedure. Using a graphical user interface, they could listen to several static target stimuli, first without (both groups) and then with the four interferers (only Group 1 + 4). Furthermore, we also varied the HA algorithm, so that the participants could acquaint themselves with the different sounds.
Detectability of target signal
To assess target detectability in the presence of the four interferers, we used a single-interval two-alternative-forced-choice paradigm with 50 trials. In half of the trials, a static target sound was present, while in the other trials, only the four interferers were presented. Each interval had a duration of 3.1 s. On each trial, the task of the participants was to indicate whether they heard the target sound by pressing a button on the screen (
Source movement detection thresholds
Depending on the outcome of the target detectability measurements, we carried out the source movement detection threshold measurements with (Group 1 + 4, good performers) or without (Group 1 + 0, poor performers) the four interferers. The procedure for measuring the detection thresholds was very similar to that used in our previous study (Lundbeck et al., 2017). On half of the trials, we simulated a moving target sound, whereas in the other trials, the target sound remained static at the reference position (0°, 1 m). For the angular measurements, we randomized the direction of movement (toward the left or right), whereas for the radial measurements, we always simulated a withdrawing (N-F) movement. In this manner, we ensured the same reference position (0°, 1 m) for both movement dimensions. To control the extent of the movement, we varied the velocity (in °/s or m/s) in the adaptive procedure. For the angular source movement measurements, the velocity ranged from 2 to 30°/s (starting value: 17.4°/s) across all tracks. For the radial source movement measurements, it ranged from 0.25 to 3.7 m/s (starting value: 1.74 m/s). The smallest step size was 2° or 0.25 m. The stimulus duration was constant (2.3 s), thus the amount of movement was proportional to the velocity. On each trial, the task of the participants was to indicate whether they heard a movement (independent of the direction) of the target sound or not by pressing a button on the screen (
For the adaptive procedure, we used the single-interval adjustment-matrix method of Kaernbach (1990). This procedure takes
We estimated the detection thresholds by taking the arithmetic mean of the last eight reversal points of each measurement run. In this manner, we quantified the smallest displacement (in ° or m) of the target source that the participants were able to detect within the 2.3 s over which the movements occurred. In the following, we will refer to these thresholds as the minimum audible movement angle (MAMA) and minimum audible movement distance (MAMD) thresholds. In our paradigm, an optimal test run would have resulted in a MAMA threshold of 4.6° and a MAMD threshold of 0.35 m, and all our participants had thresholds that were clearly higher than those values.
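The threshold estimate can be illustrated with a short sketch for the angular case. It assumes that the reversal points are stored as velocities (in °/s) and converted to displacements by multiplying with the fixed 2.3-s stimulus duration; the function name and the example track are hypothetical.

```python
def threshold_from_reversals(reversal_velocities, duration_s=2.3, n_last=8):
    """Mean of the last n reversal velocities, converted to a displacement."""
    last = reversal_velocities[-n_last:]
    mean_velocity = sum(last) / len(last)
    return mean_velocity * duration_s

# A run that stays at the lowest angular velocity (2 deg/s) yields the optimal MAMA
optimal_mama = threshold_from_reversals([2.0] * 8)

# Hypothetical reversal velocities (deg/s) from an adaptive track
example_track = [10.0, 14.0, 12.0, 16.0, 14.0, 18.0, 14.0, 16.0, 12.0, 14.0]
example_mama = threshold_from_reversals(example_track)
print(optimal_mama, round(example_mama, 2))
```

With all reversals at 2°/s, the result reproduces the optimal MAMA of 4.6° stated above.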
We carried out the L-R and N-F source movement measurements in separate blocks. Within these blocks, we tested the various conditions in randomized order. After 1 to 2 weeks, we performed a set of retest measurements. In total, we measured six L-R thresholds and six N-F thresholds per listener (and thus 240 thresholds in total).
Prior to the statistical analyses, we examined the distributions of the various data sets. According to Kolmogorov–Smirnov's test, all data sets fulfilled the requirements for normality (all
Results
Technical Measurements
L-R dimension
Concerning the L-R dimension, the acoustical changes that we observed were generally as expected. Regarding the monaural spectral changes, our analyses revealed that the BEAM and DIRCOH algorithms generally led to clear increases (except for DIRCOH under reverberant conditions), suggesting that they could be suited for improving source movement detectability. Figure 4 shows the resultant spectral coloration relative to the static condition in the presence of four interferers with and without reverberation. As can be seen, reverberation by itself also increased the spectral changes in the target signal.
Monaural spectral coloration relative to a static stimulus subjected to the same processing (black: UNPROC, light gray: BEAM, dark gray: DIRCOH) as a function of source azimuth. Left: Without reverberation. Right: With reverberation. The legends show the mean values for the three HA algorithms as calculated over the whole stimulus.
Figure 5 shows the SNR changes caused by the three HA algorithms over the course of the target source movement in the presence of the four interferers. The panel on the left shows the SNR in the condition without reverberation and the panel on the right that with reverberation. It is noticeable that the SNR generally varied substantially over the course of the source movement. This was because of the inherent temporal fluctuations of the environmental sounds that we used as stimuli. Concerning the influence of the DIRCOH and BEAM algorithms, the DIRCOH algorithm led to a larger SNR improvement in the condition without reverberation compared with the BEAM algorithm. In the reverberant condition, the SNR improvements were greater for BEAM than for DIRCOH.
SNR for the UNPROC (black), BEAM (light gray), and DIRCOH (dark gray) conditions as a function of source azimuth. Left: Without reverberation. Right: With reverberation. The legends show the mean values for the three HA conditions as calculated over the whole stimulus.
N-F dimension
Concerning the N-F dimension, the acoustical changes were also as expected. To illustrate, the DRR generally decreased with increasing source distance, irrespective of the HA algorithm (Figure 6, left). Furthermore, the BEAM and especially the DIRCOH algorithm led to DRR increases. The same was essentially true with respect to monaural spectral coloration (Figure 6, right), suggesting that monaural spectral cues may provide salient information about N-F source movements.
Left: DRR for the UNPROC (black), BEAM (light gray), and DIRCOH (dark gray) conditions as a function of source distance. Right: Monaural spectral coloration relative to a static stimulus subjected to the same processing. The legends show the mean values for the three HA conditions as calculated over the whole stimulus.
Regarding the SNR changes relative to UNPROC, the BEAM and especially DIRCOH algorithm led to clear increases, as shown in Figure 7.
SNR for the UNPROC (black), BEAM (light gray), and DIRCOH (dark gray) conditions as a function of source distance. The legend shows the mean values for the three HA conditions as calculated over the whole stimulus.
Summary
The acoustic analyses showed that the DIRCOH and BEAM algorithms led to changes in SNR, DRR, and monaural spectral coloration, suggesting better target signal detectability in the presence of multiple interferers as well as reverberation. For the spectral coloration and DRR measures, these changes were largely monotonic in nature, thus providing a cue proportional to the source movement.
Perceptual Measurements
Initially, we examined the test–retest reliability of the MAMA and MAMD thresholds. For Group 1 + 4, we found relatively strong correlations for the MAMA (Pearson's correlation coefficient,
L-R dimension
Figure 8 shows means and 95% confidence intervals of the MAMA thresholds for the different groups and HA conditions. For Group 1 + 0, the thresholds varied little across HA conditions and listeners, as already noted earlier. For Group 1 + 4, the thresholds were much higher for the UNPROC and DIRCOH conditions than for the BEAM condition. Furthermore, the UNPROC condition was characterized by the largest spread and the BEAM algorithm by the smallest spread.
Means and 95% confidence intervals of the MAMA thresholds for the different groups and HA conditions.
To test for statistical differences among the three HA conditions, we conducted a repeated-measures analysis of variance per group with HA condition (UNPROC, DIRCOH, and BEAM) as within-subject factor. For Group 1 + 0, we found no effect of HA condition,
N-F dimension
Figure 9 shows means and 95% confidence intervals of the MAMD thresholds for the different groups and HA conditions. As can be seen, Group 1 + 0 obtained thresholds of around 1 m or lower in all conditions. In other words, the different HA conditions did not appear to affect the performance of these participants. Furthermore, the variance across them was generally small. In contrast, for Group 1 + 4, there appeared to be a clear influence of HA condition on movement detectability.
Means and 95% confidence intervals of the MAMD thresholds for the different groups and HA conditions.
To test for statistical differences among the three HA conditions, we conducted a repeated-measures analysis of variance per group with HA condition (UNPROC, DIRCOH, and BEAM) as within-subject factor. For Group 1 + 0, the effect of HA condition was not significant,
Discussion
The current study aimed to evaluate HA algorithms that can enhance acoustic cues that are relevant to L-R and N-F source movement detection in acoustically complex scenarios. Another aim was to test the most promising HA algorithms in terms of improving L-R and N-F source movement detection with a group of OHI listeners. For that purpose, we used a test setup based on virtual acoustics together with a computer simulation of different HA algorithms. For the acoustic analyses, we used stimuli akin to those used for the perceptual measurements. The analyses showed that the serial combination of directional microphones and binaural coherence-based noise reduction (DIRCOH) as well as a bilateral beamformer with binaural cue preservation (BEAM) caused consistently greater changes in monaural spectral cues compared with no HA processing. Furthermore, whereas there were only small SNR increases in the L-R dimension, we found large SNR and DRR improvements in the N-F dimension. Based on these results, we evaluated these two algorithms perceptually with the help of 20 OHI listeners. The data analyses revealed clear improvements in the ability to detect dynamic changes in azimuth or distance under complex (but not single-source) conditions.
Acoustic Effects
In general, the changes in acoustic cues that we observed were as expected and can be related to the physical effects of the HA algorithms that we tested. To recapitulate, these algorithms focused on spatial filtering and de-reverberation. Due to their narrow main lobes, bilateral beamformers have better spatial selectivity in the acoustic look direction than unilateral directional microphones (e.g., Dillon, 2012). At the same time, their polar patterns are typically characterized by larger spectral ripples (e.g., see Neher, Wagener, & Latzel, 2017, their Figure 3). In the current study, we spectrally equalized the DIR and BEAM algorithms in the 0° direction (see HA Signal Processing section). When the target source moved around the static beamformer pattern, it was subjected to clear spectral coloration, as demonstrated by our acoustical measurements (Figure 4).
As the DIR algorithm was not as spatially selective, the suppression of sounds near the acoustic look direction was not as strong as for the BEAM algorithm. Nevertheless, sounds (including reflections) from the sides and especially behind the listener were clearly attenuated. In addition, the COH algorithm effectively de-reverberated the stimuli, as apparent from our DRR measurements (Figure 6). In the N-F dimension where the target always stayed in front of the listener, the DIRCOH algorithm most likely led to better performance compared to the BEAM algorithm because of the greater SNR improvements (Figure 7).
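To illustrate the DRR measure referred to above, the following is a minimal Python sketch rather than the analysis used in the study, whose exact computation is not described in this section. It assumes that the direct sound is the largest peak of an impulse response and that everything after a short window around that peak counts as reverberant energy. The function names `drr_db` and `toy_ir`, the 2.5-ms window, and the toy impulse response (a 1/distance direct path plus a distance-independent, exponentially decaying noise tail) are all illustrative assumptions.

```python
import numpy as np

def drr_db(ir, fs, direct_window_ms=2.5):
    """Direct-to-reverberant energy ratio (dB) of an impulse response.

    Assumption: the direct sound is the largest absolute peak, and all
    samples after a short window around it are treated as reverberation.
    """
    ir = np.asarray(ir, dtype=float)
    peak = int(np.argmax(np.abs(ir)))
    half = int(round(direct_window_ms * 1e-3 * fs))
    start = max(peak - half, 0)
    stop = min(peak + half + 1, ir.size)
    direct_energy = np.sum(ir[start:stop] ** 2)
    reverb_energy = np.sum(ir[stop:] ** 2)
    return 10.0 * np.log10(direct_energy / reverb_energy)

def toy_ir(distance_m, fs=16000, rt60_s=0.5, length_s=1.0, seed=0):
    """Hypothetical impulse response: 1/r direct path plus a diffuse tail."""
    c = 343.0  # speed of sound in m/s (assumed)
    n = int(length_s * fs)
    ir = np.zeros(n)
    onset = int(round(distance_m / c * fs))
    ir[onset] = 1.0 / distance_m  # direct sound, inverse-distance law
    rng = np.random.default_rng(seed)
    t = np.arange(n - onset - 1) / fs
    # Noise tail whose amplitude drops by 60 dB after rt60_s seconds.
    tail = 0.02 * rng.standard_normal(t.size) * np.exp(-6.91 * t / rt60_s)
    ir[onset + 1:] += tail
    return ir

if __name__ == "__main__":
    for distance in (1.0, 2.0, 4.0):
        value = drr_db(toy_ir(distance), fs=16000)
        print(f"distance {distance:.0f} m: DRR = {value:.1f} dB")
```

In this toy model the DRR falls as the source moves away, because the direct energy decreases with distance while the reverberant tail stays the same. An algorithm that attenuates the tail would raise the DRR at every distance.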
As mentioned earlier, stimulus velocity generally co-varies with the stimulus duration. For our study, we decided to use a fixed stimulus duration and varied the velocity in the adaptive measurements. It is possible that the velocity influenced the effects of some of the algorithms, especially the COH algorithm. Within a given time window, a high velocity may lead to a different degree of binaural coherence than a low velocity, for example. As a result, the acoustical effects of the COH algorithm could have co-varied with source velocity, which in turn might have affected movement detection performance. However, pilot measurements (data not shown) showed that for the range of velocities tested here the resultant acoustical differences were negligible.
We also observed a clear influence of the room condition (with vs. without reverberation). That is, the changes in monaural spectral coloration were generally greater under reverberant than under anechoic conditions. In the simulated environment with reflective surfaces, constructive and destructive interference patterns arose between the direct and indirect sound components, which were likely perceivable as spectral coloration. This implies that there was a specific room contribution to these changes and that rooms with other characteristics could, in principle, lead to different spectral changes.
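The interference between direct and reflected sound can be illustrated with a deliberately simplified model: a direct path plus a single reflection, which acts as a comb filter. The Python sketch below is not the room simulation used in the study. The single-reflection model, the reflection gain of 0.5, the path-length differences, and the function names are assumptions chosen for illustration only.

```python
import numpy as np

C = 343.0  # speed of sound in m/s (assumed)

def comb_magnitude_db(freqs_hz, extra_path_m, reflection_gain):
    """Magnitude response (dB) of direct sound plus one delayed reflection."""
    tau = extra_path_m / C  # extra travel time of the reflection in seconds
    h = 1.0 + reflection_gain * np.exp(-2j * np.pi * freqs_hz * tau)
    return 20.0 * np.log10(np.abs(h))

def ripple_db(freqs_hz, extra_path_m, reflection_gain):
    """Peak-to-trough spectral ripple (dB) over the given frequencies."""
    mag = comb_magnitude_db(freqs_hz, extra_path_m, reflection_gain)
    return float(mag.max() - mag.min())

def notch_spacing_hz(extra_path_m):
    """Frequency distance between adjacent notches of the comb filter."""
    return C / extra_path_m

if __name__ == "__main__":
    freqs = np.linspace(100.0, 8000.0, 2000)
    for extra_path in (0.5, 1.0):
        print(
            f"extra path {extra_path} m: "
            f"ripple {ripple_db(freqs, extra_path, 0.5):.2f} dB, "
            f"notch spacing {notch_spacing_hz(extra_path):.0f} Hz"
        )
```

In this model the ripple depth depends only on the reflection gain, whereas the notch positions depend on the path-length difference. A moving source changes that difference over time, so the notches shift in frequency. This is one way in which a reflective room can add movement-related spectral coloration that is absent under anechoic conditions.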
Perceptual Effects
Regarding the perceptual results for the two movement dimensions, it is likely that the differences that we observed were related to differences in the acoustic measures (SNR, monaural spectral coloration, and DRR). In the N-F dimension, we observed large SNR improvements (likely because the target signal did not overlap spatially with the interferers) and coloration changes for both BEAM and DIRCOH. Consistent with this, the detection thresholds of Group 1 + 4 improved with both algorithms. In the L-R dimension, the BEAM algorithm produced the largest changes in the monaural coloration measure. Qualitatively speaking, the perceptual data were consistent with this finding in the sense that there was also a clear threshold improvement for Group 1 + 4. DIRCOH, on the other hand, produced neither large spectral coloration changes nor a clear SNR improvement. This was probably why it did not result in a threshold improvement for this group in this dimension.
It is also worth noting that the variance was quite low in the data from Group 1 + 0 in general and, for Group 1 + 4, in the BEAM condition in the L-R dimension and in the DIRCOH and BEAM conditions in the N-F dimension (Figures 8 and 9). For Group 1 + 0, the detection task was generally easy, which probably resulted in a perceptual floor effect. In principle, the same might have been true for those thresholds of Group 1 + 4 that improved significantly with the HA signal processing.
In the N-F dimension, a few participants from Group 1 + 4 performed as well as YNH listeners (Lundbeck et al., 2017) and as the OHI listeners tested only with the target sound (Group 1 + 0). For them, neither reverberation nor the concurrent interferers seemed to increase thresholds. Broadly speaking, these differences across participants are consistent with the large variability among hearing-impaired listeners that is typically observed in relation to spatial hearing (and many other) tasks (e.g., Noble, Byrne, & Ter-Horst, 1997). Comparison with corresponding data of Brimijoin and Akeroyd (2014) is not straightforward as these authors used speech signals for their moving target measurements. Furthermore,
Interestingly, we observed a clear influence of HA algorithm on the detection thresholds of Group 1 + 4 but not on those of Group 1 + 0. In fact, the detection thresholds of Group 1 + 0 were unaffected by the HA algorithms. A potential explanation for this is that Group 1 + 0 obtained rather low thresholds in the UNPROC condition that were only 2 to 3° higher than those of YNH subjects (Lundbeck et al., 2017), so there was less room for improvement with the BEAM and DIRCOH algorithms than for Group 1 + 4. Another possible explanation could be that the algorithms did not provide sufficiently large acoustic changes for low baseline detection thresholds (and thus low movement velocities). In the acoustic analyses, we found increases in, for example, monaural spectral cues with greater source movements, but for participants with low UNPROC thresholds, these were perhaps not perceivable. Because we tested these participants only in the target-only scenario, it is unclear how they would perform when tested with interferers. It is noteworthy that the two groups differed in PTA4 (see Participants section). That is, Group 1 + 0 had a greater hearing loss than Group 1 + 4. This could be an explanation for their inability to detect the target signal in the scenario with four interferers. Furthermore, Group 1 + 4 included nine listeners who had already taken part in our previous study. Consequently, their greater experience with the task could have put them at an advantage, despite the training included in the current study for all of our participants (see Perceptual Measurements section).
Given the sloping hearing losses of our participants, it is possible that they did not weight all frequencies equally when making their source movement judgments. However, most of them were experienced HA users and so were accustomed to listening to high-frequency sound, as provided by the NAL-RP amplification used here (Figure 3). In principle, more research could be devoted to developing effective predictors of the spatial hearing abilities of hearing-impaired listeners, but this was beyond the scope of the current study. Instead, our approach was to rely on established acoustic measures.
Limitations
Our test scenarios were created using a higher order Ambisonics-based toolbox. As discussed in Lundbeck et al. (2017), the simulation method is an important factor for the accuracy with which a sound field can be synthesized. The aim of the toolbox that we used is not to reproduce a given sound field in a physically correct manner but rather to achieve a perceptually plausible approximation. Research into higher order Ambisonics has shown that the spatial hearing abilities of normal-hearing listeners are essentially unaffected at the center position of the array (Daniel, 2000; Daniel, Moreau, & Nicol, 2003). In addition, Ambisonics rendering of order 23 over 48 horizontal-plane loudspeakers (as used in the current study) has been found to be sufficient for an accurate technical evaluation of different multimicrophone HA algorithms, in terms of both beam pattern analysis and SNR behavior (Grimm, Ewert, & Hohmann, 2015). With this degree of spatial resolution, spatial aliasing at the ear position occurs above approximately 16 kHz and thus above the Nyquist frequency of the HA algorithms tested here (see HA Signal Processing section). What is more, the setup used here has been found to be capable of room simulations with room acoustical parameters comparable to those of the actual rooms (Grimm, Heeren, & Hohmann, 2015). Altogether, these results provide support for the general validity of our simulation approach.
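The relation between loudspeaker count, Ambisonics order, and the upper frequency limit can be sketched with two commonly quoted rules of thumb: a horizontal-only (2D) system of order N needs at least 2N + 1 loudspeakers, and reproduction is approximately accurate within a radius r as long as 2πfr/c ≤ N. The Python sketch below applies these rules. The distance of 0.08 m from the array center to the ear, the speed of sound, and the function names are assumptions for illustration and are not taken from the study.

```python
import math

def max_2d_order(n_loudspeakers):
    """Highest horizontal-only Ambisonics order N with 2N + 1 <= loudspeakers."""
    return (n_loudspeakers - 1) // 2

def alias_limit_hz(order, radius_m, c=343.0):
    """Rule-of-thumb upper frequency for accurate reproduction within radius_m.

    Uses the approximation k * r <= N, where k = 2 * pi * f / c.
    """
    return order * c / (2.0 * math.pi * radius_m)

if __name__ == "__main__":
    order = max_2d_order(48)
    # 0.08 m is an assumed distance from the array center to the ear.
    limit = alias_limit_hz(order, radius_m=0.08)
    print(f"order {order}, approximate limit {limit / 1000.0:.1f} kHz")
```

Under these assumptions, 48 loudspeakers support order 23, and the estimated limit at the assumed ear position lies somewhat below 16 kHz, which is of the same magnitude as the value stated above. The exact figure depends on the assumed radius.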
Nevertheless, due to the lack of a head-tracking device, we effectively prevented our listeners from following the source movements. Natural head movements have been found to differ substantially among individuals (Grimm, Luberadzka, et al., 2015) and are considered an important factor for spatial perception under dynamic conditions (e.g., Brimijoin & Akeroyd, 2012). Thus, future work should ideally address the influence of head movements on source movement detectability.
Furthermore, our study was limited to one particular acoustic environment. It would be important to investigate other types of environments and scenarios that reflect other complex listening tasks such as a traffic situation or a group discussion (cf. Grimm, Kollmeier, & Hohmann, 2016). Challenges in real life with multiple sources occur in various scenarios that differ in spatial complexity and the task of the listener. Another limitation is that we only considered a frontal starting position for the target signal. In addition, the BEAM and DIR algorithms were non-adaptive and always steered towards 0°. Consequently, for the L-R dimension the target source moved outside of the (frequency-dependent) main lobe of these algorithms and was thus attenuated, especially for BEAM. The main lobe of BEAM had a width of about ±10° over a broad frequency range (see Rohdenburg, 2008, Figure 4.5a). For greater source azimuths, the target signal was spectrally filtered, leading to improved detectability. In future studies, it would be important to investigate the influence of adaptive beamforming algorithms, which are likely to lead to different results.
It would also be important to assess performance with non-frontal source movements in multi-source environments, for which bilateral beamformers have been found to degrade performance (Best et al., 2015). In addition, it would make sense to test different target signals that are representative of real-world communication scenarios (e.g., speech). Finally, other aspects of spatial perception such as counting or locating multiple concurrent sources in a complex environment, as studied, for example, by Best, Buchholz, and Weller (2017), should ideally also be covered in order to assess spatial awareness more holistically.
Summary
Based on a computer simulation of a complex listening environment combined with bilateral HA processing, the current study showed that selected multi-microphone signal enhancement algorithms can enhance acoustical features that are related to source movement perception. Furthermore, it showed clear improvements in source movement detectability for a group of OHI listeners in complex scenarios with reverberation and concurrent interfering signals. In future studies, it would be of interest to investigate movement perception further in combination with wearable HAs.
Footnotes
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was funded by the Oticon Foundation and the DFG Cluster of Excellence EXC 1077/1 “Hearing4all”.
