Abstract
This study evaluates the perceptual effects of single-microphone noise reduction in hearing aids. Twenty subjects with moderate sensorineural hearing loss listened to speech in babble noise processed via noise reduction from three different linearly fitted hearing aids. Subjects performed (a) speech-intelligibility tests, (b) listening-effort ratings, and (c) paired-comparison ratings on noise annoyance, speech naturalness, and overall preference. The perceptual effects of noise reduction differ between hearing aids. The results agree well with those of normal-hearing listeners in a previous study. None of the noise-reduction algorithms improved speech intelligibility, but all reduced the annoyance of noise. The noise reduction that scored best with respect to noise annoyance and preference had the worst intelligibility scores. The trade-off between intelligibility and listening comfort shows that preference measurements might be useful in addition to intelligibility measurements in the selection of noise reduction. Additionally, this trade-off should be taken into consideration to create realistic expectations in hearing-aid users.
Introduction
Single-microphone noise reduction is a common feature in modern hearing aids that should determine whether the input signal is contaminated with noise and then adjust the hearing aid’s gain in specific frequency bands to suppress unwanted background noise. Generally, hearing-aid noise reduction is presented as a
In a recent study, we compared noise reduction from different hearing aids to gain some insight into the effects of noise reduction (the
In this follow-up study, we investigated whether these findings also hold true for hearing-impaired listeners. It might be that hearing-impaired listeners are less sensitive to differences between processing conditions because of suprathreshold deficits such as reduced frequency selectivity or impaired modulation detection (Marzinzik, 2000). On the other hand, because of their decreased ability to understand speech in noise, it might be more important for hearing-impaired listeners to avoid distortions of the speech signal. In this phase, we evaluated noise reduction in isolation. In a later stage, possible interactions between noise reduction and compression should be addressed. The goal of the current study was to answer the following research question: Does hearing-aid noise reduction influence speech intelligibility, listening effort, noise annoyance, speech naturalness, and preference for listeners with a moderate sensorineural hearing loss, compared with (a) no noise reduction and (b) noise reduction from other linearly fitted hearing aids?
Methods
The methods for hearing aid recording, perceptual measurements, and statistical analyses were identical to those described by Brons et al. (2013). Approval for this experiment was obtained from the Medical Ethical Committee of the Academic Medical Center on November 29, 2011 (MEC2011_310).
Hearing-Aid Recordings
We recorded the output of three linearly fitted hearing aids from different manufacturers (Phonak Exélia M, ReSound Azure AZ80-DVI, and Widex Mind 440) using the method described by Houben, Brons, and Dreschler (2011). Analyses of recordings from these hearing aids fitted for different hearing losses revealed that noise-reduction processing in this selection of hearing aids was independent of hearing loss (i.e., the patterns of gain reduction remained the same for the same input signals when the hearing aids were fitted for another hearing loss), so that it was not necessary for the current purpose to fit the hearing aids to other targets than in Houben et al. We therefore took the same hearing aids and settings as in that study. In short, all hearing aids were linearly fitted with all signal-processing features turned off for the unprocessed condition, and with the strongest available noise-reduction setting on for the noise-reduction conditions. Compensation for individual hearing loss was applied after recording and filtering, according to the linear National Acoustic Laboratories' Revised, Profound (NAL-RP) prescription for hearing-aid gain and frequency response (Byrne & Dillon, 1986). Recordings of the three hearing aids with noise reduction activated were randomly coded as conditions NR1, NR2, and NR3. This coding is the same as in Brons et al. (2013).
All recordings were filtered with an inverse filter (Houben et al., 2011) that corrected for differences in frequency response between hearing aids. Thus, if noise reduction was inactive, all recordings had the same output spectrum as the input signal. There were no perceptual differences between recordings from different hearing aids when noise reduction was inactive, as was verified by Houben et al. (2011). We therefore used recordings of one hearing aid that formed the
Stimuli consisted of Dutch sentences (Versfeld, Daalder, Festen, et al., 2000) in babble noise (20 talkers simultaneously reading different passages; Auditec, St. Louis, MO), recorded from the hearing aids with a hearing-aid input noise level of 65 dB(A). The recorded stimuli were presented monaurally to the subjects with Sennheiser HDA200 headphones. During the listening experiments, the noise level was 65 dB(A) for all the stimuli in the unprocessed condition, and the average speech levels ranged from 61 to 75 dB(A) to obtain all input SNRs that were required for the stimuli (−4 to +10 dB SNR). Additional amplification was applied according to the linear NAL-RP prescription (Byrne & Dillon, 1986) to compensate for listeners’ individual hearing loss.
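The relation between the fixed noise level and the speech levels quoted above is simple decibel arithmetic, sketched below. The function names are hypothetical and the snippet only illustrates the level relation; it is not the stimulus-generation procedure used in the study.

```python
NOISE_LEVEL_DBA = 65.0  # fixed babble-noise level stated in the text


def speech_level_for_snr(target_snr_db, noise_level_db=NOISE_LEVEL_DBA):
    """Speech level (dB) needed to reach a target SNR at a fixed noise level."""
    return noise_level_db + target_snr_db


def amplitude_scale(level_change_db):
    """Linear amplitude factor corresponding to a level change in dB."""
    return 10 ** (level_change_db / 20)


# The extreme SNRs of -4 and +10 dB correspond to speech at 61 and 75 dB(A).
lowest = speech_level_for_snr(-4)
highest = speech_level_for_snr(10)
```

A 14 dB span in speech level thus covers the full range of input SNRs while the noise level stays constant.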
Acoustical analyses of the noise-reduction processing of these hearing aids are given in Brons et al. (2013) and summarized here in Figure 1, where the long-term average gain reduction for the three noise-reduction conditions is plotted for the six different SNRs (−1 through +4 dB) that were used in this study. A more negative gain value in Figure 1 indicates stronger noise reduction. All noise-reduction algorithms apply more gain reduction at lower SNRs, except for NR3 at frequencies between 1 and 2 kHz. NR3 reduced gain only for frequencies below 1 kHz and increased gain slightly between 1 and 2 kHz. NR1 and NR2 applied gain reduction over a broader range of frequencies, but NR1 was more cautious around 1 kHz. The analyses in Brons et al. showed that NR1 and NR2 change gain rather quickly to amplify speech when present and to attenuate the noise when speech is not present (e.g., in between speech segments), whereas NR3 applies its gain reduction more gradually.
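As a rough illustration of what a long-term average gain reduction expresses, the sketch below computes a per-band level difference in dB between a recording with noise reduction and the unprocessed recording. The band powers are made-up values and the function name is hypothetical; the actual analysis is the one described in Brons et al. (2013).

```python
import math


def gain_reduction_db(power_nr, power_unprocessed):
    """Per-band level difference in dB (negative values mean gain was reduced)."""
    return [
        10 * math.log10(p_nr / p_un)
        for p_nr, p_un in zip(power_nr, power_unprocessed)
    ]


# Hypothetical long-term band powers (arbitrary linear units) in three bands.
unprocessed = [1.0, 1.0, 1.0]
with_nr = [0.5, 1.0, 2.0]
reduction = gain_reduction_db(with_nr, unprocessed)
```

In this toy example the first band is attenuated by about 3 dB, the second is unchanged, and the third is amplified by about 3 dB.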
Long-term average gain reduction as a function of frequency for the three different noise-reduction conditions. Every line represents the difference between noise reduction 
Subjects
Twenty hearing-impaired subjects between 48 and 69 years of age (average = 61.3 years) participated in this study. We used the results obtained in Brons et al. (2013) from normal-hearing subjects for a power calculation. The power calculation revealed that a number of 10 subjects would be sufficient to detect a difference of about 13% in intelligibility score (which is equal to about 1 dB change in SRT50, i.e., the speech reception threshold, the SNR at which the subject can correctly reproduce 50% of the sentences). Also, 10 subjects would be sufficient to detect a difference of 1 rating point in perceived listening effort. Because we expected that the variation between subjects would be higher for hearing-impaired subjects than for normal-hearing subjects, we decided to include 20 subjects in the current study. The subjects’ audiograms were similar (i.e., no more than 10 dB difference at octave frequencies) to audiogram type N3 (moderate hearing loss with moderate slope) in the set of standard audiograms proposed by Bisgaard, Vlaming, and Dahlquist (2010). Figure 2 shows the hearing thresholds for the ears included (one per subject) averaged over all subjects, and the corresponding standard deviation. Figure 2 also shows the standard audiogram N3 on which the selection of the subjects was based.
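The logic of such a power calculation can be sketched with the common normal-approximation formula for a paired comparison. The standard deviation of the within-subject differences used below is a hypothetical placeholder, not a value taken from Brons et al. (2013), and the article does not state which formula or software was used.

```python
import math
from statistics import NormalDist


def paired_sample_size(delta, sd_diff, alpha=0.05, power=0.80):
    """Approximate number of subjects needed to detect a mean paired
    difference `delta`, given the SD of the differences `sd_diff`
    (two-sided test, normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(((z_alpha + z_beta) * sd_diff / delta) ** 2)


# Hypothetical: detect a 1 dB SRT difference when the SD of differences is 1 dB.
n_subjects = paired_sample_size(delta=1.0, sd_diff=1.0)
```

The normal approximation tends to give a slightly smaller number than an exact calculation based on the t distribution, so results from a sketch like this should be read as a lower bound.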
The average hearing thresholds of the 20 subjects (one ear per subject). Error bars show the standard deviation. The standard audiogram N3 is shown in gray.
The two intelligibility outcome measures were (a) the subjects’ individual SRT50 and (b) the percentage correct words at a fixed SNR of +4 dB. The outcomes for listening effort and preference were measured at both subjects’ individual SRT50 (averaged over the four conditions) and at a fixed SNR of +4 dB. The +4 dB SNR was previously used in Brons et al. (2013) for measurements with normal-hearing subjects and roughly corresponds to daily listening situations for the hearing impaired (Smeds, Wolters, & Rung, 2012).
Intelligibility
Following the adaptive procedure described by Plomp and Mimpen (1979), we measured the SRT50. At the fixed SNR of +4 dB, we measured the percentage of words correctly repeated. This is similar to Brons et al. (2013) but at a higher SNR. Both measurements started with 13 training sentences followed by one list of 13 sentences per processing condition. The order of the lists and noise-reduction conditions as well as the combinations of list and condition were balanced across subjects to minimize possible effects of differences between lists or training effects.
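A simple up-down track of this kind can be sketched as follows. The sketch assumes a 2 dB step, sentence-level scoring, and an SRT estimate taken as the mean SNR of sentences 5 through 14 (the 14th being the level that would have followed the last response); these details are a common variant and are assumptions here, not confirmed by the text. The repeated presentation of the first sentence at increasing level is omitted.

```python
def adaptive_srt(responses, start_snr, step=2.0, first_scored=5):
    """Estimate the SRT from a one-up one-down adaptive track.

    responses    -- list of booleans, True if the whole sentence was repeated correctly
    start_snr    -- SNR (dB) at which the first sentence is presented
    step         -- step size in dB (assumed value)
    first_scored -- 1-based index of the first sentence included in the average
    """
    snrs = [start_snr]
    for correct in responses:
        # A correct response makes the next sentence harder, an error makes it easier.
        snrs.append(snrs[-1] - step if correct else snrs[-1] + step)
    scored = snrs[first_scored - 1:]  # includes the hypothetical next level
    return sum(scored) / len(scored)


# Hypothetical listener who alternates between correct and incorrect responses.
alternating = [i % 2 == 0 for i in range(13)]
srt = adaptive_srt(alternating, start_snr=0.0)
```

For this alternating pattern the track oscillates between 0 and −2 dB, so the estimate settles halfway between the two levels.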
Listening Effort
The subjects rated the listening effort on a 9-point rating scale that ranged from
Paired-Comparison Rating
We used paired-comparison ratings to measure noise annoyance, speech naturalness, and overall preference. All processing conditions were compared with each other, by presenting the same sentence in two different conditions. For each combination of processing conditions, subjects indicated which was best on noise annoyance, which on speech naturalness, and which they would prefer for prolonged listening. For each of these three criteria, there were seven possible answers, ranging from
Results
Intelligibility
Figure 3 shows the group results for speech intelligibility. We used paired
Mean and 95% confidence interval of the SRT50 (left panel) and of the percentage of words correctly repeated by the subjects at +4 dB SNR (right panel). “Unpr” is the unprocessed reference condition, and NR1, NR2, and NR3 are the hearing-aid noise reductions. The horizontal line indicates which processing conditions differ significantly from each other (*
The other outcome measures described in this article were measured at the individual SRT50, averaged over noise reductions and rounded to whole decibels, ranging from −1 to +4 dB.
Listening Effort
Figure 4 shows the group-average listening-effort ratings relative to that for unprocessed at SRT50 level and averaged over the three fixed levels (upper panel), and the average absolute ratings for the three fixed SNRs separately (lower panel).
Mean and 95% confidence interval of the listening effort ratings assigned by the 20 subjects relative to unprocessed (Δ listening effort, upper panel) at SRT level (left) and averaged over the three fixed SNRs (right), and absolute ratings at −4, +4, and +10 dB SNR (lower panel). Horizontal lines indicate which processing conditions differ significantly from each other (*
Because the rating scale has an upper and a lower bound, the variance will be lower in the scores near the bounds of the scale than in the middle. We therefore applied an arcsine transformation to the listening effort scores to satisfy the criteria for an analysis of variance (ANOVA; Fink, 2009). A repeated measures ANOVA on the arcsine-transformed absolute data on SRT level showed a significant effect of processing condition,
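To illustrate the idea of an arcsine transformation on a bounded rating scale, the sketch below rescales a 9-point rating to a proportion and applies the classical 2·arcsin(√p) form. The exact variant used in the study is not stated in the text, so both the rescaling and the formula are assumptions.

```python
import math


def arcsine_transform(rating, low=1, high=9):
    """Map a bounded rating to [0, pi] with a variance-stabilizing arcsine transform."""
    proportion = (rating - low) / (high - low)  # rescale to [0, 1]
    return 2 * math.asin(math.sqrt(proportion))


# The transform stretches values near the scale bounds relative to the middle.
transformed = [arcsine_transform(r) for r in range(1, 10)]
```

Steps near the ends of the scale become larger after transformation than steps in the middle, which counteracts the compression of variance near the bounds.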
Paired Comparison Rating
Figure 5 shows the average rating scores for each processing condition for the three judgment criteria. Scores from −3 to 3 represent the seven categories in the paired comparison scale, with 0 indicating
Mean rating scores derived from the paired-comparison data for the three judgment criteria and two SNRs. Scores from −3 to +3 were assigned as 0, indicating 
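One plausible way to turn seven-category paired comparisons into a mean score per condition is sketched below. It assumes that a positive score favors the first condition of a pair and that each comparison contributes with opposite sign to the two conditions involved; the sign convention, the data, and the function name are hypothetical and may differ from the derivation used in the study.

```python
from collections import defaultdict


def mean_rating_scores(comparisons):
    """Average signed paired-comparison scores per condition.

    comparisons -- iterable of (first, second, score) with score in -3..+3,
                   where a positive score is assumed to favor `first`.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for first, second, score in comparisons:
        totals[first] += score
        totals[second] -= score
        counts[first] += 1
        counts[second] += 1
    return {cond: totals[cond] / counts[cond] for cond in totals}


# Hypothetical ratings from one subject for three of the conditions.
example = [("Unpr", "NR1", -2), ("Unpr", "NR2", -3), ("NR1", "NR2", -1)]
scores = mean_rating_scores(example)
```

With this convention the scores of all conditions sum to zero when every pair is rated once, so each score expresses a preference relative to the other conditions.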
Discussion
Intelligibility
Word scores for NR2 were significantly lower than those for unprocessed. In the results for normal-hearing listeners (Brons et al., 2013), NR2 also had the lowest word score although not significantly lower than unprocessed. Most studies have found no effect of noise reduction on speech intelligibility (Loizou & Kim, 2011; Nordrum, Erler, Garstecki, & Dhar, 2006). Results of Hu and Loizou (2007) suggest that noise reduction reduces intelligibility more at lower SNRs. The more the noise dominates the input signal, the more difficult it is for the noise-reduction algorithm to recognize the speech and to correctly separate the speech and noise. This might introduce more classification errors, resulting in speech distortions. Our intelligibility results did not confirm this larger decline due to noise reduction at lower SNRs; whereas the intelligibility was reduced somewhat by noise reduction at +4 dB SNR, it was not at the SRT level, which was measured at lower SNRs for most subjects. A possible explanation can be found in the dynamic behavior of the noise reduction. At higher SNR, the noise reduction shows larger and quicker changes in gain to separate speech from noise, whereas at lower SNRs, gain is more gradually reduced because speech is not well recognized (Brons et al., 2013). Speech distortions by quick gain transitions may be more detrimental to speech intelligibility than a more gradual suppression of the speech signal. Hilkhuysen, Gaubitch, Brookes, and Huckvale (2012) found no interaction between noise reduction and SNR. However, they did not take measurements at positive SNRs.
Listening Effort
Subjects rated effort at the fixed SNRs significantly higher for NR3 than for unprocessed and NR1. This finding agrees with that for normal-hearing listeners, who also rated the highest listening effort for NR3 (Brons et al., 2013). Whereas Bentler, Wu, Kettel, and Hurtig (2008) found a reduction of perceived listening effort due to hearing-aid noise reduction, most other studies using a rating scale for determining listening effort did not (Alcántara, Moore, Brian, Kühnel, & Launer, 2003; Brons et al., 2013; Desjardins & Doherty, 2014). Desjardins and Doherty (2014) measured listening effort using both a dual task and a rating scale and found a reduction in listening effort with the dual task due to hearing-aid noise reduction in the same conditions where ratings of listening effort showed no difference between noise reduction on and off. This suggests that a dual task may be more sensitive than a rating scale to effects of noise reduction on listening effort. However, the positive effect of noise reduction on listening effort in Desjardins and Doherty was only found at SRT50 levels and not for higher, arguably more relevant, SNRs. Recently, cognitive factors such as listening effort have received increased attention in the evaluation of hearing-aid functions. Apart from factors such as noise type and SNR, the cognitive capacity of the listener may also be important in determining which noise reduction processing should be applied in a specific situation (Lunner, Rudner, & Rönnberg, 2009; Rudner, Lunner, Behrens, Thorén, & Rönnberg, 2012).
The data in this study reveal that the absolute effort ratings by the hearing-impaired subjects for the −4 and +4 dB SNR conditions were higher than those reported for normal-hearing subjects in Brons et al. (2013). This difference between subject groups was also found by Luts, Eneman, Wouters, et al. (2010) and reflects the fact that hearing-impaired listeners have more difficulty listening to speech in noise, which is also apparent in the intelligibility results.
Noise Annoyance, Speech Naturalness, and Overall Preference
Hearing-impaired listeners indicated differences in noise annoyance, speech naturalness, and overall preference between the conditions of noise reduction on and off and between noise-reduction algorithms of different linear hearing aids. Except for the higher speech-naturalness rating for NR2 by the hearing impaired, the results at +4 dB SNR agree well with those of Brons et al. (2013) for normal-hearing subjects. Although the strength of preference cannot directly be compared because the ratings by the normal-hearing subjects were based on comparisons of five conditions instead of four, the ranking of these four conditions is the same in the normal-hearing listeners and the hearing-impaired listeners.
NR2 reduced noise annoyance more than the other conditions and had higher speech naturalness than the other conditions. The combination of reduced noise annoyance with a high speech naturalness is remarkable because in general stronger reduction of noise is accompanied by more speech distortion (Houben, Dijkstra, & Dreschler, 2012; Loizou, 2007). Perhaps this uncharacteristically high rating of speech naturalness was based on aspects other than speech distortion. At the SNR under consideration (SNR = +4 dB), normal-hearing listeners rated no difference in speech naturalness among conditions (Brons et al., 2013). This suggests that at +4 dB SNR, the noise reduction introduced no audible distortions because that would have reduced the perceived speech naturalness for normal-hearing listeners. In the absence of audible distortions, the hearing-impaired subjects might have used other cues to rate naturalness, for instance, the absence of noise (Marzinzik, 2000). At −4 dB SNR, normal-hearing subjects rated speech naturalness lower with noise reduction, indicating that the distortions due to noise reduction increase with decreasing SNR. To determine whether this effect was also present in the data for hearing-impaired subjects, we repeated the analysis with the subjects divided into two groups based on their SRT (12 subjects with SRT −1, 0, or +1 dB and 8 subjects with SRT +2, +3, or +4 dB). For the first group, the ratings at SRT were based on relatively unfavorable SNRs, and here subjects rated the speech naturalness highest for the unprocessed condition and significantly lower for NR3. For the latter group, the ratings were based on higher SNRs, and subjects from this group rated the naturalness lowest for the unprocessed condition and significantly higher for NR2. This finding suggests that noise reduction affects speech naturalness more at lower SNRs, as was previously found for normal-hearing subjects.
Neher, Grimm, and Hohmann (2014) also found that preference for noise reduction over no noise reduction was stronger at higher than at lower input SNRs. Boymans and Dreschler (2000) and Ricketts and Hornsby (2005) also found preference for noise reduction on over off at positive SNRs, in contrast to Alcántara et al. (2003), who measured mainly at negative SNRs.
The condition that was most preferred by the subjects (NR2) also produced the lowest intelligibility scores. Such a trade-off between preference and intelligibility was previously found to be inherent to noise reduction in several studies (Brons, Houben, & Dreschler, 2012; Neher et al., 2014; Wang, 2008). Apparently, due to the reduced noise level, it is more comfortable to listen to speech and noise that were processed with noise reduction, even though the reduction in speech level or distortions of the speech signal may cause lower intelligibility scores.
Signal-to-Noise Ratios for Evaluation of Noise Reduction
Noise-reduction processing depends on the input SNR (Hoetink, Körössy, & Dreschler, 2009). We therefore measured not only at an individual SNR (SRT50) for each subject (to ensure an equal performance level for all subjects) but also at a fixed SNR to ensure equal noise-reduction processing for all subjects. Group results were similar between the two independent datasets obtained at SRT level and at fixed SNR. This implies that, for the (small) range of hearing losses included, the choice between a fixed and an individually adjusted SNR did not influence the results. This conclusion might not hold for a broader range of hearing losses. In that case, evaluating from both a listener’s perspective (individually adjusted SNR) and a processing perspective (fixed SNR) might be considered, although results from fixed SNRs are easier to interpret because the effects of noise reduction and hearing ability are easier to separate.
Limitations
The results of this study were measured in a laboratory setting and cannot easily be generalized beyond the limited number of conditions included. In addition, noise reduction was studied in isolation, whereas in practice it will often be used in combination with other hearing-aid features such as directional microphones and other signal processing algorithms. The most important signal processing that should be investigated in combination with noise reduction is dynamic-range compression, the effects of which may be opposite to that of noise reduction (Anderson, Arehart, & Kates, 2009; Chung, 2007).
Implications for Hearing Aid Fitting
Because of the limitations mentioned above, the hearing-aid-specific results cannot be used to conclude which of the hearing aids tested is best. Nevertheless, this study indicates that differences in the types and implementations of hearing-aid noise reduction are perceptually relevant, and this may have consequences for the selection and fitting of hearing aids. At least for listeners with specific complaints in noisy environments, it might be worthwhile to perform technical and perceptual comparisons to select the best noise-reduction system for the individual listener. Additionally, it is important to create realistic expectations about the effects of noise reduction. Listeners should be aware that no improvement in intelligibility scores in noise should be expected, but that single-microphone noise reduction might improve listening comfort and reduce noise annoyance.
Conclusions
Noise reduction from the three hearing aids tested was able to reduce the annoyance of babble noise perceived by listeners with moderate sensorineural hearing loss. The noise reduction that reduced noise annoyance the most and that was most preferred caused poorer intelligibility scores, confirming a trade-off between listening comfort and intelligibility.
Overall, the results of hearing-impaired subjects agree well with those obtained for normal-hearing listeners in a previous study.
Footnotes
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by grants from the Heinsius-Houbolt Fund.
