Abstract
Hearing aids and other hearing devices should provide the user with a benefit, for example, compensate for effects of a hearing loss or cancel undesired sounds. However, wearing hearing devices can also have negative effects on perception, previously demonstrated mostly for spatial hearing, sound quality and the perception of the own voice. When hearing devices are set to transparency, that is, provide no gain and resemble open-ear listening as well as possible, these side effects can be studied in isolation. In the present work, we conducted a series of experiments that are concerned with the effect of transparent hearing devices on speech perception in a collocated speech-in-noise task. In such a situation, listening through a hearing device is not expected to have any negative effect, since both speech and noise undergo identical processing, such that the signal-to-noise ratio at the ear is not altered and spatial effects are irrelevant. However, we found a consistent hearing device disadvantage for speech intelligibility and similar trends for rated listening effort. Several hypotheses for the possible origin of this disadvantage were tested by including several different devices, gain settings and stimulus levels. While effects of self-noise and nonlinear distortions were ruled out, the exact reason for a hearing device disadvantage on speech perception is still unclear. However, a significant relation to auditory model predictions demonstrates that the speech intelligibility disadvantage is related to sound quality, and is most probably caused by insufficient equalization, artifacts of frequency-dependent signal processing and processing delays.
Introduction
Hearing aids and other ear-worn hearing devices provide benefit to the user by changing the acoustic environment in a way that is desired depending on the application. Hearing aids compensate for effects of hearing impairment by applying frequency and level-dependent amplification, enhance desired speech sources, and include other advanced features (Launer et al., 2016). In other applications, desired modifications can be reduction of high-level sounds for electronic hearing protectors (Killion et al., 2011), or augmenting an acoustic scene by adding virtual sound sources with augmented reality headsets or earbuds (Härmä et al., 2004). Ideally, listening through any hearing device should sound as if the desired gain or other modification were applied to the source signal. Current hearing devices cannot achieve this ideal perception and additionally introduce undesired alterations of the perceived acoustic environment (Brungart et al., 2007; Cubick et al., 2018; Schepker et al., 2020). Such undesired side effects are referred to as hearing device disadvantage in the following. A hearing device disadvantage counters the possible benefit of a device and limits the range in which it is useful and accepted by a user; it should thus be minimized. The hearing device disadvantage can be studied in isolation when a hearing device is brought into a transparent setting, that is, when adjusted to approach open-ear listening as closely as possible. Understanding the physical correlates of hearing device disadvantages in different dimensions of auditory perception by quantifying the effects of listening through transparent hearing devices is therefore a key to optimizing hearing devices of the future.
It was previously shown that hearing devices impair spatial perception (Best et al., 2010, 2020; Brungart et al., 2007; Cubick et al., 2018; Denk et al., 2019; Van den Bogaert et al., 2011), mostly due to limited spatial information captured by the hearing device microphones and effects of limited bandwidth and processing delays. Further, inaccurate reproduction of the open-ear frequency response as well as delay artifacts or binaural distortions lead to a reduced perceived sound quality (Biberger et al., 2021; Lelic et al., 2022; Schepker et al., 2020; Stone et al., 2008). Another disturbance for the wearer is an alteration of the own voice and other body-generated sounds due to the occlusion effect (Denk et al., 2024; Kiessling et al., 2005; Killion, 1988). Regarding speech perception, only a few studies examining a potential hearing device disadvantage have been conducted. Cubick and Dau (2016) as well as Cubick et al. (2018) demonstrated that listening through behind-the-ear (BTE) hearing aids leads to decreased speech intelligibility in noise in scenes where speech and noise are spatially separated and, to a very small extent, also in situations where speech and noise are collocated. The disadvantage of a BTE hearing aid in a spatially separated setting can be predicted by the effectively lower signal-to-noise ratio (SNR) reaching the eardrum due to microphone location effects. In contrast, the disadvantage in the collocated condition remains to be explained. In our previous work (Denk et al., 2021a), we were able to largely replicate the results of Cubick et al. (2018), including a significant decrease in speech intelligibility for listeners wearing a transparent hearing device in a collocated setting.
It is still unclear whether a hearing device disadvantage for speech perception due to listening through transparent hearing devices can be persistently observed, and whether similar trends are observed for objective metrics like speech intelligibility and subjective ratings like listening effort or sound quality. Further, the origin of such hearing device disadvantages in collocated speech in noise situations is still unclear.
In the present contribution, we report on a series of experiments examining the effects of transparent hearing devices on speech perception in a collocated setting, with the aim to assess the occurrence of a hearing device disadvantage on speech perception in such a scenario and pinpoint its origin. To this end, we measured speech reception thresholds (SRT), that is, the SNR at 50% speech intelligibility, as well as rated listening effort depending on the SNR, in young normal-hearing participants with and without transparent hearing devices. In addition, various technical measurements were performed to characterize the utilized hearing devices. Specifically, we examined four possible hypotheses for the underlying reason for a hearing device disadvantage of linear, transparent hearing devices on speech perception in a collocated noise setting:
A. Deviation from the individual open-ear transfer function (linear distortion)
B. Artifacts of signal processing strategy and delay
C. Device self-noise
D. Nonlinear transducer distortions
These hypotheses were assessed in two separate experiments. Experiment I was conducted to assess Hypotheses A and B by measurements with a set of different hearing devices from different manufacturers and intended for different applications, to include devices with different signal processing strategies. The devices included six high-end, BTE receiver-in-canal hearing aids from different manufacturers, an insert earphone with active noise control (ANC) and transparency features, and a research hearing aid based on an open-source signal processing platform and high-end hearing aid hardware called portable hearing lab (PHL) (Denk et al., 2022; Kayser et al., 2022; Pavlovic et al., 2020). To study the specific effect of linear distortions (Hypothesis A), the PHL was operated both in a generic equalization setting directly taken from our previous work (Denk et al., 2021a) and in an individually adjusted setting that closely approached the individual open-ear transfer function for the present acoustic setting. The other devices were also individually programmed to approach the open-ear transfer function as closely as possible, in order to minimize the effect of linear distortions in these devices. Please note that in all previous works (Cubick and Dau, 2016; Cubick et al., 2018; Denk et al., 2021a), no individual equalization of the hearing devices was made. Microphone location effects are not expected to play a role in the present experiments, since only collocated scenarios are considered. We expected that reducing linear distortion would reduce or avoid a hearing device disadvantage.
Hypotheses C and D were assessed in Experiment II, where the same measurements were done with the PHL at different stimulus levels. We expected significant self-noise of the hearing device (Hypothesis C) to result in an increased hearing device disadvantage at low stimulus levels, and that nonlinear transducer distortions (Hypothesis D) would be more prominent at high stimulus levels than at low levels. Independent of the origin of a hearing device disadvantage, we expected either similar effects for speech intelligibility and rated listening effort or a larger effect size for the listening effort, under the assumption that any loss of signal fidelity would first impair subjective ratings before it would affect objectively measured speech intelligibility.
Methods
General Setup
All experiments were conducted in a sound-isolated and acoustically treated auditory booth (2.6 × 3.6 × 2.5 m, T20 = 0.1 s). Stimuli were presented from a loudspeaker (Genelec 8351) facing the participants at 1 m distance, with a location chosen to avoid the occurrence of room modes at the location of the participants. The height of the acoustic axis of the loudspeaker was 1.3 m above the floor, and the height of the participants’ ears was adjusted by altering the height of their chair appropriately. The loudspeaker was connected to a PC through a soundcard (RME Fireface 802) and calibrated using a free-field microphone (Brüel&Kjær 4192 with 2669 preamplifier) that was connected to the same soundcard and calibrated using a pistonphone (Brüel&Kjær 2829).
Measurements were conducted using the Oldenburg Measurement Application version 2.2. Speech intelligibility measurements determined the SRT using the German male Oldenburg Sentence Test (OLSA), which contains 5-word sentences that are compiled out of a 50-word matrix without semantic context (Kollmeier et al., 2015). In this test, the noise level is held constant, and the speech level adapted until a speech intelligibility of 50%, measured by word scoring, is reached. This was done using lists of 20 sentences, starting at an SNR of 0 dB, corresponding to 100% intelligibility in normal-hearing participants. Both speech and noise were presented from the same loudspeaker in front of the subjects. The test was conducted in an open fashion, that is, the subject repeated the words they understood, and scoring was conducted by the experimenter. On each appointment, every participant completed two test lists prior to the evaluated experiments to minimize training effects of the OLSA (Wagener and Brand, 2005). The rated listening effort was measured using the Adaptive Categorical Listening Effort Scaling (ACALES) method (Krueger et al., 2017), where subjects rate the perceived listening effort for the OLSA material for different adaptively determined SNRs, starting at +10 dB SNR corresponding to no effort at all. The outcome of the ACALES procedure is a function linking the so-called Effort Scaling Categorical Units (ESCU) on a scale from 1 (no effort at all) to 13 (extreme effort) depending on the SNR. To obtain thresholds analogous to the SRT, the SNRs corresponding to no effort (ESCU 1), medium effort (ESCU 7) and extreme effort (ESCU 13) were evaluated.
The hearing device disadvantage was calculated by subtracting the unaided threshold from the threshold measured while wearing a hearing device. A positive value thus corresponds to a poorer performance while wearing the hearing device. The hearing device disadvantage was calculated for each participant and hearing device, prior to further statistical evaluation.
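The subtraction described above can be sketched in a few lines. This is a minimal illustration and not the analysis code used in the study: the function name `device_disadvantage`, the participant labels and all threshold values are hypothetical.

```python
from statistics import mean, stdev


def device_disadvantage(aided, unaided):
    """Per-participant disadvantage in dB: aided threshold minus unaided threshold.

    Positive values mean poorer performance while wearing the device.
    """
    return {pid: aided[pid] - unaided[pid] for pid in aided}


# Hypothetical SRTs in dB SNR for three participants (illustrative values only).
unaided_srt = {"P01": -7.0, "P02": -6.5, "P03": -7.5}
aided_srt = {"P01": -6.0, "P02": -6.0, "P03": -7.0}

disadvantage = device_disadvantage(aided_srt, unaided_srt)
values = list(disadvantage.values())

# The differences are formed per participant first; statistics follow afterwards.
print(disadvantage)
print(mean(values), stdev(values))
```

Forming the difference per participant before averaging removes between-subject offsets in the unaided threshold from the variance of the disadvantage.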
Experiment I
In the first experiment, subjects underwent measurements unaided and while wearing a total of eight different hearing devices. The subjects thus underwent nine conditions, each comprising verification and adjustment of the hearing device (see below), an SRT measurement and a round of ACALES, in this order. These measurements were divided into two sessions of approximately 90 to 120 minutes each. The order of conditions was balanced using a randomized Latin Square approach, which results in a random succession of conditions in each participant while maintaining a balanced occurrence. Due to practical considerations, both PHL conditions (see below) were always conducted consecutively, in alternating order across participants. Eighteen normal-hearing subjects with thresholds of no more than 20 dB at all audiometric frequencies, no self-reported tinnitus, no relevant allergies, and no audiological conditions participated (9 male, age 26.5 ± 2.6 years).
The tested hearing devices included six commercial hearing aids, the PHL (Kayser et al., 2022; Pavlovic et al., 2020), and a pair of commercial earphones with advanced ANC and hear-through functionalities. The commercial and research hearing aids were BTE receiver-in-canal style devices. The commercial hearing aids were high-end devices and are referred to as HA1–6 in the following. Their exact types are given in the Supplementary Material without disclosing the assignment. All features were turned off as far as possible in the fitting software (feedback reduction, noise reduction, impulse and wind noise reduction, all kinds of directional processing, automatic program adaptation). Medium or small receivers, typically used for the mildest hearing loss as recommended by the fitting software, were used to allow fitting to acoustic transparency. The earphones were insert-type, without a vent and coupled to the ear canal with silicone domes whose size was chosen for each individual subject. The hear-through mode made use of microphones located both in the pinna and in the ear canal, and included active reduction of the occlusion effect when the subjects were speaking (Denk et al., 2024). The software running on the PHL was equivalent to the default firmware image (MAHALIA 4.16.0-r1, http://mahalia.openmha.org/), with all hearing aid processing (feedback reduction, coherence-based noise reduction, multi-band dynamic compression) turned off. Processing included an overlap-add based filterbank with 55 samples at a sampling rate of 24 kHz, which allowed individual gains to be set for each frequency bin (plugin “equalize”). A technical evaluation of the PHL is presented elsewhere (Denk et al., 2022), showing that the electro-acoustic performance metrics of the device are on par with current commercial hearing aids.
Coupling of the hearing aids and PHL to the ear canal was achieved using custom-modified foam earplugs with a 20-mm long piece of sound tube with an inner diameter of 2 mm glued in for inserting the hearing aid receiver. A closed coupling to the ear was intended to minimize the influence of sound directly leaking into the ear canal and to achieve consistent results across participants. Note that the 20 mm tubing is longer than standard for RIC couplings. However, all resulting transmission effects were compensated by individual in-situ fitting.
Probe tube microphones (Etymotic ER7C-B, Etymotic Research) were inserted into the ear canal for adjusting and verifying the settings of the hearing devices. On the one hand, this included a verification of a closed fit by assessing the passive attenuation with respect to the unaided case (real-ear occluded insertion gain, REOIG) of at least 10 dB in all third-octave bands between 500 Hz and 4 kHz. On the other hand, the gains of all devices were programmed individually to minimize the level difference between the aided and unaided case (real-ear insertion gain, REIG) and achieve a linear setting, that is, a compression ratio of 1:1. Here, the goal was a difference of less than 5 dB in all third-octave bands between 200 Hz and 8 kHz, measured with pink noise at 70 dB SPL. This target could be achieved with all devices except HA1, where the lowest possible gain setting was not sufficient to reduce the insertion gain around 6 kHz below an average 8 dB (cf. Figure 1). This individual adjustment was applied in all commercial hearing aids and the individualized PHL condition (termed PHL-ind). With the generic PHL condition (PHL-gen), an adjustment that resulted in a 0-dB REIG on KEMAR for diffuse-field sound incidence was reused (Denk et al., 2021a). This condition is the one from the current experiment that most closely matches that of Cubick et al. (2018). Note that equalization to diffuse-field incidence is an optimal choice for most acoustic scenes in daily life. However, the present experimental condition is close to a free-field condition, which requires a different equalization filter to achieve an REIG close to 0 dB (Ohlmann et al., 2024). Therefore, the PHL-gen condition introduces a linear distortion compared to unaided listening that is inherently associated with choosing a generic, non-scene-matched equalization approach. This distortion comes in addition to the effects of individual differences in ear acoustics as compared to KEMAR.
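The two verification criteria described above (REIG within ±5 dB between 200 Hz and 8 kHz, REOIG of −10 dB or lower between 500 Hz and 4 kHz) can be expressed as simple band-wise checks. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the band levels are invented for illustration, and only a handful of bands are listed instead of the full third-octave series.

```python
def insertion_gain(with_device_db, unaided_db):
    """Band-wise level difference (dB) between a device condition and the open ear."""
    return {f: with_device_db[f] - unaided_db[f] for f in unaided_db}


def within_transparency_target(reig_db, tol_db=5.0, f_lo=200.0, f_hi=8000.0):
    """True if |REIG| is below tol_db in every band between f_lo and f_hi."""
    return all(abs(g) < tol_db for f, g in reig_db.items() if f_lo <= f <= f_hi)


def closed_fit(reoig_db, min_atten_db=10.0, f_lo=500.0, f_hi=4000.0):
    """True if the occluded gain is -min_atten_db or lower in every band of interest."""
    return all(g <= -min_atten_db for f, g in reoig_db.items() if f_lo <= f <= f_hi)


# Hypothetical probe-tube levels in dB SPL per band centre frequency in Hz.
unaided = {250: 60.0, 500: 61.0, 1000: 62.0, 2000: 68.0, 4000: 70.0, 8000: 58.0}
aided = {250: 58.0, 500: 60.0, 1000: 63.0, 2000: 70.0, 4000: 73.0, 8000: 55.0}
occluded = {250: 52.0, 500: 48.0, 1000: 45.0, 2000: 40.0, 4000: 45.0, 8000: 40.0}

reig = insertion_gain(aided, unaided)      # device switched on, transparent setting
reoig = insertion_gain(occluded, unaided)  # device inserted but passive

print(within_transparency_target(reig), closed_fit(reoig))
```

In this sketch the 250 Hz band does not enter the closed-fit check because it lies outside the 500 Hz to 4 kHz range named in the text.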
For the earphone, no equalization options were available, and the standard setting was used for all participants. In this device, the 10-dB REOIG could not be achieved with the passive device due to its construction. However, it is assumed that the ANC functionality is active also in the transparency mode and effectively lowered the leakage component well below the criterion for the other devices (Denk et al., 2024).

Measured real-ear insertion gains (REIG) in transparent setting and occluded case in third-octave bands for four devices as denoted in the panel title. Thin lines denote results from individual ears, thick lines arithmetic averages. Green shaded areas mark the target ranges for the transparent setting (upper area, around 0 dB) and the occluded case (below −10 dB between 500 Hz and 4 kHz). The grey shaded area shows the noise floor of the insertion gain measurement, given by the difference between the microphone noise and the real-ear unaided response.
Experiment II
The second experiment focused on the influence of stimulus level on the hearing device disadvantage. To this end, measurements of the SRT and rated listening effort with the same methods as in Experiment I were conducted, using only the PHL in generic setting as a hearing device. However, this device was operated with either 0 dB or 15 dB broadband gain on top of the previously explained frequency-dependent equalization. This broadband gain was included to amplify possible effects of device input noise and transducer nonlinearities as compared to a transparent setting, while reducing the influence of possible direct sound leaking into the ear canal. Based on the experience in Experiment I, it was not deemed necessary to conduct real-ear verification of the passive attenuation with the utilized earpieces.
Speech intelligibility and listening effort measurements were conducted at noise levels of 30, 45, 60 and 75 dB SPL, where the 75 dB condition was excluded in the hearing device condition with 15 dB gain. Note that the increments in noise level equal the hearing device insertion gain, such that the hearing aid in the 15 dB gain setting elevates the stimulus level to that of another experimental condition. Including an unaided condition with all noise levels, this amounted to 11 conditions. Effects of condition order were balanced using a nested Randomized Latin Square approach, randomizing first the order of devices (Unaided, transparent, 15 dB gain) and then the order of noise levels in each device. This experiment was divided into two separate tracks, where subjects in one track conducted the speech intelligibility measurements, and the other track conducted the rated listening effort measurements. Sixteen normal-hearing subjects participated in each track (inclusion criteria same as in Experiment I, speech intelligibility measurements: 5 males, age 24.0 ± 2.5 years; rated listening effort: 6 males, age 24.8 ± 2.7 years).
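The condition count and the nested ordering can be illustrated as follows. This is only a sketch of one possible nested randomized Latin square scheme: the construction (a cyclic square with shuffled rows, columns and symbols), the assignment of participants to rows, and all names are assumptions, not the procedure actually used in the study.

```python
import random

# Noise levels in dB SPL per device condition; 75 dB is excluded with 15 dB gain.
LEVELS = {
    "unaided": [30, 45, 60, 75],
    "transparent": [30, 45, 60, 75],
    "15 dB gain": [30, 45, 60],
}


def randomized_latin_square(items, rng):
    """Cyclic Latin square with randomly permuted rows, columns and symbols."""
    n = len(items)
    symbols = items[:]
    rng.shuffle(symbols)
    rows = list(range(n))
    cols = list(range(n))
    rng.shuffle(rows)
    rng.shuffle(cols)
    return [[symbols[(r + c) % n] for c in cols] for r in rows]


def condition_orders(n_participants, seed=0):
    """Order devices first, then the noise levels within each device."""
    rng = random.Random(seed)
    device_square = randomized_latin_square(list(LEVELS), rng)
    level_squares = {d: randomized_latin_square(lv, rng) for d, lv in LEVELS.items()}
    orders = []
    for p in range(n_participants):
        order = []
        for device in device_square[p % len(device_square)]:
            square = level_squares[device]
            order += [(device, level) for level in square[p % len(square)]]
        orders.append(order)
    return orders


orders = condition_orders(16)
print(len(orders[0]))  # number of conditions per participant
print(orders[0])
```

With 16 participants and three device conditions, the rows of the device square cannot be used equally often, so the balance in this sketch is only approximate.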
Technical Measurements
Various measurements were conducted to characterize the changes the hearing devices made on the sound pressure reaching the ear. This included the insertion gain measurements as already assessed during hearing device fitting (see Experiment I) and other more advanced measurements.
First, a recording of sound waveforms at the eardrum was made in the unaided case and the aided case. These recordings were intended for an assessment of overall perceptual differences using a modified version of the GPSMq sound quality model (Biberger et al., 2018; Flessner, 2018, chap. 5). The output of this model is a perceptual difference metric that accounts for both linear and non-linear distortions between two samples, using the recordings at the aided and unaided eardrum as test and reference. Previous comparisons to other models showed that the GPSMq model is well suited for assessing various hearing device-related artifacts (Biberger et al., 2021). Since higher sensitivity to arbitrary nonlinear distortions was observed for independent samples of the same broadband noise (Flessner, 2018, chap. 5), independent pink noise samples of 10 seconds length were used for this evaluation. These measurements were made within Experiment I, directly after the programming of devices was finished.
Second, the processing delay of all devices was assessed. To this end, they were placed in an anechoic test box (Brüel&Kjær 4228) including a reference microphone placed near the hearing device input microphone, and a 2 cc coupler (Brüel&Kjær 4946) to which the hearing device receiver was attached. Pink noise was presented from the test box loudspeaker, and the recordings of both the reference and coupler microphone were passed through a third-octave filterbank. Finally, the hearing aid delay in each frequency channel was determined by finding the peak of the cross-correlation function between the reference and coupler microphone. Note that frequency-dependent group delays introduced by the fractional octave analysis itself are compensated by measuring the relative delay between the identically processed input and output signal of the hearing device under test.
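The cross-correlation step can be sketched as below. This is a simplified broadband illustration and not the measurement code used here: it uses white instead of pink noise, omits the third-octave filterbank, and the function name, the 24 kHz sampling rate and the simulated 4 ms delay are assumptions. For a per-band estimate, the same function would be applied to the band-filtered reference and coupler signals.

```python
import random


def estimate_delay_samples(reference, output, max_lag):
    """Lag in samples at which the cross-correlation of output and reference peaks.

    Only non-negative lags are searched, since the device output cannot lead
    its input.
    """
    n = len(reference) - max_lag  # common summation length for all lags
    best_lag, best_val = 0, float("-inf")
    for lag in range(max_lag + 1):
        val = sum(reference[i] * output[i + lag] for i in range(n))
        if val > best_val:
            best_lag, best_val = lag, val
    return best_lag


fs = 24000  # assumed sampling rate in Hz
rng = random.Random(1)
reference = [rng.gauss(0.0, 1.0) for _ in range(4000)]

# Simulated device output: the reference signal delayed by 96 samples (4 ms).
true_delay = 96
output = [0.0] * true_delay + reference[:-true_delay]

lag = estimate_delay_samples(reference, output, max_lag=300)
delay_ms = 1000.0 * lag / fs
print(lag, delay_ms)
```

Because both signals would pass through identical band filters in a per-band analysis, the group delay of the filters cancels in the relative lag, which matches the compensation argument made in the text.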
Results
Technical Measurements
Figure 1 shows the REIG determined within Experiment I for a selection of utilized hearing devices. The appropriate results for all devices are provided as Supplementary Material. HA6 (top panel) is representative also for the remaining devices not depicted here, with an insertion gain that lies within the target range of ±5 dB in all regarded third-octave bands, and an average across all ears close to 0. A low-frequency roll-off that could not be compensated by adjustments in the fitting software is noted below approximately 200 Hz. Similar low-frequency roll-offs were observed in the other commercial hearing aids at similar frequencies, but not in the PHL (see also result for PHL-gen depicted in Figure 1). The highest cut-off frequency was observed in HA4 at approximately 300 Hz. In the other devices shown in Figure 1, deviations from the 0 dB insertion gain target were observed. With the PHL-gen, a peak around 4 kHz and on average across ears at +10 dB is observed. In some ears, the peak REIG exceeded 20 dB. As noted in the “Methods” section, this peak is most probably related to using a generic equalization for diffuse-field incidence for a frontal sound source, as well as effects of individual ear acoustics. In HA1, the available gain settings in the fitting software prohibited lowering the output level at frequencies above 2 kHz to 0 dB insertion gain for many ears, leading to an average REIG peak of 7 dB around 5 kHz. In some ears, the peak REIG exceeded 20 dB. In the earphone, a flat 0 dB insertion gain with very low variation across ears is noted below 1 kHz. Frequencies above 2 kHz are reduced by about 3 dB on average across ears, with a larger variation across ears than in the low-frequency regime.
Figure 2 shows the signal quality (SQ) metrics as predicted by the modified GPSMq model, which can be interpreted as a perceptual difference index between the unaided case and the device denoted by the

Predicted signal quality by the auditory model for devices from Experiment I, showing average and standard deviation across subjects for each device as denoted on the
Figure 3 shows the measured processing delays of the hearing devices across frequency. The lowest delay of around 100 µs, except for a peak of nearly 1 ms around 1.5 kHz, is observed in the earphone, which most likely uses a time-domain filtering approach. A very low delay of around 1 ms is also observed in HA4 at frequencies of 500 Hz and below, with sub-ms delays comparable to the earphone at higher frequencies. The other devices have delays in the range between 4 and 10 ms, with HA5 having the lowest (3.5–5 ms, depending on frequency) and the PHL (10 ms) having the highest delay. Most devices show some frequency dependence of the delay, with delays decreasing with increasing frequency, usually in steps. The frequency dependence of delay is most pronounced in HA1, and clearly visible in HA4, HA5 and HA6.

Measured processing delays of utilized hearing devices depending on frequency.
Speech Intelligibility Thresholds
Figure 4 shows the hearing device disadvantage regarding speech intelligibility for all devices included in Experiment I. Note that the general pattern mirrors the one seen for the predicted SQ (Figure 2) very well. A Shapiro–Wilk test with Bonferroni correction verified that the values in each condition were normally distributed. A statistically significant positive difference from 0 is marked by a star above each condition, denoting

Hearing device disadvantage regarding SRT in Experiment I, showing mean and standard deviation across subjects based on individually determined SRT values.
Figure 5 shows the SRTs observed in Experiment II depending on the noise level and the aiding condition. Independent of the aiding condition, the SRT is dependent on the noise level. In both the unaided and transparent conditions, a minimum SRT at 60 dB is seen (−7.3 and −6.8 dB in the unaided and transparent conditions, respectively), with an increase at both higher and lower levels. The SRT minimum in the 15 dB gain condition appears shifted to 45 dB SPL at −6.9 dB. The maximum SRT, that is, poorest performance, was observed at 30 dB SPL in all conditions, ranging from −5.1 dB in the +15 dB condition to −4.2 dB in the transparent condition. For reference, the level-dependent SRTs for the OLSA test reported by Wagener and Brand (2005) are also reproduced in Figure 5. They were measured at levels of 45, 55, 65, 75 and 80 dB SPL and also show a level dependence of the SRT, with a minimum at 55 dB, and a maximum at 80 dB. However, the previously reported level dependence is generally lower, with a difference of 0.3 dB SRT between 55 and 75 dB SPL, compared to a difference of 0.8 dB SRT between 60 and 75 dB SPL in our unaided data. Also, the reference SRTs are generally higher than our present unaided data, most pronounced around 60 dB SPL. It is worth noting that the reference measurements were conducted using monaural sound presentation over headphones, instead of the binaural free-field presentation used in the present investigation (Wagener and Brand, 2005).

Speech reception thresholds and standard deviations across subjects observed in Experiment II depending on the noise level and aiding condition. Note that the SRTs of the unaided and 15 dB gain conditions were shifted on the
Figure 6 shows the hearing device disadvantage regarding speech intelligibility obtained at various noise levels in Experiment II. Note that with the 15 dB gain setting of the hearing device, the hearing device disadvantage can be regarded for two different cases. On the one hand, the SRT obtained in the aided condition can be compared to the SRT in the unaided condition at the same presented noise level. This hearing device disadvantage thus relates to a device that provides a broadband gain of 15 dB with respect to the reference unaided condition. On the other hand, the hearing device disadvantage can be calculated by relating the unaided SRT to the aided SRT obtained at a 15 dB lower input level, that is, at the same output level. The hearing device disadvantage is shown in Figure 6 as dark and light red colors for these “same input level” and “same output level” interpretations, respectively.
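The two interpretations differ only in which unaided SRT serves as the reference. The sketch below illustrates this with invented group-level SRTs; the function names and all values are hypothetical, the per-participant evaluation is omitted for brevity, and indexing the "same output level" result by the aided input level is a choice made for this sketch.

```python
GAIN_DB = 15  # broadband gain of the device, equal to the noise level step


def disadvantage_same_input(aided_srt, unaided_srt):
    """Aided minus unaided SRT at the same presented noise level."""
    return {lvl: aided_srt[lvl] - unaided_srt[lvl]
            for lvl in aided_srt if lvl in unaided_srt}


def disadvantage_same_output(aided_srt, unaided_srt, gain_db=GAIN_DB):
    """Aided SRT at input level L minus unaided SRT at level L + gain_db."""
    return {lvl: aided_srt[lvl] - unaided_srt[lvl + gain_db]
            for lvl in aided_srt if lvl + gain_db in unaided_srt}


# Hypothetical SRTs in dB SNR, indexed by noise level in dB SPL.
unaided = {30: -4.5, 45: -6.5, 60: -7.0, 75: -6.0}
gain_15 = {30: -5.0, 45: -6.5, 60: -6.0}

same_input = disadvantage_same_input(gain_15, unaided)
same_output = disadvantage_same_output(gain_15, unaided)

print(same_input)
print(same_output)
```

With these invented numbers the device appears beneficial at 30 dB in the "same input level" view (negative disadvantage) but disadvantageous in the "same output level" view, which shows why the two comparisons have to be kept apart.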

Hearing device disadvantage regarding SRT in Experiment II, showing mean and standard deviations across subjects, depending on the presentation level of noise. Note that individual symbols have been shifted on the
A Shapiro–Wilk test with Bonferroni correction for 10 conditions showed that the data in each condition are normally distributed. Statistically significant positive differences from 0 are marked by stars below each condition, denoting
Rated Listening Effort
Figure 7 shows the hearing device disadvantage regarding the listening effort from Experiment I. Similar to Figure 4, the values denote the average and standard deviation across participants, using the individual SNR differences for equal rated listening effort. Generally, the results lie around or above 0 dB with maximum effect sizes around 2 dB (HA2, ESCU 2), albeit with large standard deviations. In most devices, the largest effect sizes are seen at the lowest rated listening effort level. A set of single-sided

Hearing device disadvantage regarding rated listening effort from Experiment I, showing mean and standard deviation across subjects. Colors indicate the results for different rated listening effort levels.
Figure 8 shows the rated listening effort scaling results from Experiment II grouped by hearing device condition, with the results of all noise levels shown in each panel. Independent of the hearing device condition, the curves are shifted to the left with increasing noise level, that is, the rated listening effort is reduced. However, no notable differences are observed between 60 and 75 dB in the unaided and transparent conditions, and between 45 and 60 dB in the 15 dB gain condition. It should also be noted that the steepness of the curves increases with increasing noise level, such that the SNRs at high rated listening efforts are less different between levels than those at low listening effort. In all conditions, a large variation across subjects is seen, with the largest outliers visible at 30 dB noise.

Rated listening effort scaling results from Experiment II, showing the results with all presentation levels in each panel for one hearing device condition as denoted in the panel title. Thin lines show individual fits of the scaling function, thick lines the mean across subjects, averaging the SNR at fixed ESCU values.
For further evaluation of a possible hearing device disadvantage regarding the listening effort, only the rated listening effort levels corresponding to very little (ESCU 1), moderate (ESCU 7) and extreme (ESCU 13) effort were regarded, and the difference between the corresponding SNRs observed with the hearing device and in the unaided condition was evaluated across subjects, in the same way as for the speech intelligibility (cf. Figure 6 and evaluation). The resulting hearing device disadvantage in dB SNR to achieve the same rated listening effort is shown in Figure 9. Two-sided

Hearing device disadvantage regarding rated listening effort from Experiment II, showing mean and standard deviation across subjects. Each panel denotes the difference in ESCU rating between aided and unaided conditions. Note that individual symbols have been shifted on the
In the transparent condition, the disadvantage does not exceed approximately 1 dB in size, with a pattern that is not consistent across noise levels or listening effort levels, and an effect size that is below half of the appropriate standard deviation in all conditions. The conditions closest to interpretable results are observed in the extreme rated listening effort case (ESCU 13), where the hearing device disadvantage lies around 1 dB, corresponding to about half of one standard deviation. In the 15-dB condition, the results differ largely between the “same input level” and “same output level” evaluations. In the “same input level” case, that is, amplification of the overall listening level, the hearing device disadvantage for little and moderate effort (ESCU 1 and 7) at 30 and 45 dB noise level is negative, that is, the device provides a benefit of 2–3 dB SNR. This benefit vanishes at the highest listening effort level (ESCU 13) as well as at a noise level of 60 dB. These observations are consistent with the general effects of level on rated listening effort (see Figure 8), showing a decrease in rated listening effort at the same SNR with increasing level, which is more pronounced at low or moderate listening effort levels. However, no trend towards a disadvantage of the hearing device is seen in this evaluation mode. In contrast, in the “same output level” case, the resulting hearing device disadvantage tends to be above or at 0 dB. The largest effect is seen at low rated listening effort levels and low noise levels (approx. 2 dB for ESCU 1 and 45 dB noise level) and decreases with both increasing noise level and increasing rated listening effort level.
Discussion
Disadvantage of Transparent Hearing Devices for Speech Perception
Across several experiments in comparable conditions, hearing devices with a setting that aimed at conserving the transmission through the open external ear with good or even best possible accuracy showed a disadvantage regarding speech intelligibility in a collocated noise setting. On the contrary, no statistically significant effects of transparent hearing devices were found for the subjectively rated listening effort. Figure 10 shows a summary of SRTs with and without a transparent hearing device worn by the users from the present experiments with the PHL-gen, our previous work (Denk et al., 2021a) using the same device, and the study of Cubick et al. (2018), which also used a custom research hearing device without individual equalization. Across all studies, the SRT was increased by the presence of such a hearing device. The effect appears to be robustly present, and measurable across different speech tests, sentence material, languages, and test rooms, which led to different unaided SRTs.

Summary of SRT obtained unaided and with transparent hearing devices, from current experiments and literature references. Symbols show mean and standard deviation across subjects, numbers above brackets denote average hearing aid disadvantage, experimental conditions are summarized on the
This is a somewhat surprising effect that is not easily explained, since both speech and noise undergo identical processing by the hearing device. We interpret the consistent disadvantage of transparent hearing devices regarding speech intelligibility as an objectively well-measurable and relevant symptom of undesired subtle modifications of the perceived signals that were also previously reported for other perceptual dimensions like sound quality (Schepker et al., 2020) and spatial perception (Cubick et al., 2018; Denk et al., 2019). We thus consider understanding its origin an important step to improving transparency of hearing devices in the future. A motivating result in this respect is the observation from Experiment I that the hearing device disadvantage can apparently be eliminated with appropriate device design, as shown by the earphone.
Regarding the effect of hearing devices in a transparent setting on rated listening effort, the present results showed no significant effects, although trends towards a slight general disadvantage with an effect size in the same order of magnitude as for the speech intelligibility were observed. In conclusion, we do not reject the assumption that a similar hearing aid disadvantage exists for both speech intelligibility and experienced listening effort, but assume that the accuracy of the assessed rated listening effort (Krueger et al., 2017) was not sufficient to result in a significant effect in the present experiments. However, the present data suggest that the disadvantage regarding rated listening effort is not larger than for speech intelligibility.
It should be noted that hearing aids are designed to compensate for effects of a hearing loss in hearing-impaired users, and not for transparent listening by normal-hearing subjects as used in the present investigation. The main difference to the actual use case would be additional amplification of the input signal, which the hearing aids are capable of. In the present results, we showed that neither self-noise nor nonlinear distortions caused the observed disadvantage with no amplification, while their influence is larger when the hearing aid provides amplification (cf. Figure 6). Therefore, we consider it likely that the same or even larger side effects of wearing hearing aids can be observed in the actual use case with hearing-impaired users. However, it is also possible that the artifacts introduced by the hearing device that caused a disadvantage in normal-hearing participants are not audible to hearing-impaired users, who in consequence do not suffer from a disadvantage of hearing devices. When hearing aids do amplify sounds, “transparency” is defined such that no changes other than the desired amplification are introduced by the hearing aid. Hence, testing for the device disadvantage could be conducted by comparing outcomes while listening to a signal at normal level through a hearing aid, and listening in an unaided condition but presenting appropriately pre-amplified signals.
Potential Reasons for Hearing Device Disadvantage
Listening through hearing devices potentially introduces various undesired modifications of the signal reaching the eardrum, which were hypothesized as potential underlying reasons for a hearing device disadvantage and intentionally varied within the presented experiments. The different possible influencing factors are discussed in the following.
Linear Distortions
Hypothesis A stated that a deviation from the individual open-ear transfer function causes the hearing device disadvantage. The main means to test this assumption is a comparison of the generic and individualized versions of the PHL in Experiment I, with results shown in Figure 4. Indeed, a significantly lower hearing device disadvantage was observed when the same device was individually equalized as compared to a generic equalization that was deliberately chosen as not ideal for the given acoustic scene. This demonstrates an influence of the hearing device equalization on speech intelligibility in noise that probably would not have been expected if an equivalent filtering had been applied to the source signals. Further evidence for the relevance of linear distortions is given by the fact that HA1, where the fitting software did not permit settings of 0 dB insertion gain across all frequencies in all subjects, also yielded a significantly higher hearing device disadvantage compared to the other devices used.
The earphone showed no significant hearing device disadvantage despite some deviations from a 0 dB insertion gain (cf. Figure 4). However, while HA1 and PHL-gen resulted in a peak of 5–15 dB in a frequency range of several kHz that was dependent on the subject, the earphone showed a smooth high-frequency shelving leading to an approximately 3 dB reduction of frequencies above 2 kHz, which is very consistent across participants. The evaluation using the auditory model shown in Figure 2 verified that this results in a sound quality that is predicted to be better than with HA1 or the PHL-gen, which is well in line with the authors’ subjective impression. Generally, the model sound quality predictions from Figure 2 reflect an inverse pattern of the hearing device disadvantage well, that is, the perceptual similarity of the aided and unaided conditions as measured by an auditory model predicts the observed hearing aid disadvantage to some extent. By equalizing the residual REIG in the aided case before passing the signals into the model, we verified that the present modelling results, and in particular the observed between-device differences, were mainly driven by linear distortions. On an individual level, we found a weak but significant correlation (
For future work, a direct prediction of the hearing aid disadvantage using (modified) speech intelligibility models would provide a useful tool. Our previous modelling work using the binaural speech intelligibility model that relies on the speech intelligibility index (Beutelmann et al., 2010; Denk et al., 2021a) showed that this model is not suitable to predict such effects. Cubick et al. (2018) reproduced the observed disadvantage in a collocated condition well by model predictions in case of a noise masker but not with a speech masker. A systematic evaluation of speech intelligibility models in the context of hearing device disadvantage could provide additional insights into the origin of this effect.
We conclude that linear distortions are an important factor for the occurrence and size of a hearing device disadvantage, if they occur in a way that reduces the perceived sound quality. These distortions and their influence on the hearing device disadvantage can be predicted by appropriate sound quality models with reasonable accuracy. However, minimizing linear distortions does not guarantee that a hearing device disadvantage is avoided, as seen with our research hearing device PHL.
Signal Processing Strategy and Delay
Hearing device processing could have further, more subtle effects on the signal reaching the eardrum that cannot be easily measured by means of the insertion gain. The processing chains in current commercial devices are complex, although they were adjusted to have a minimum effect as far as permitted by the fitting software. Our research device (PHL) contained an overlap-add based filterbank, and gain adjustments for the individual frequency bins (Pavlovic et al., 2020). This constitutes a setting that represents the most transparent processing in state-of-the-art hearing aids, and it is assumed that the commercial hearing aids were in a similar state, although it cannot be ruled out that more complex features were still in action.
Hypothesis B states that specific implementations of the signal processing framework affect the occurrence and size of a hearing device disadvantage. As can be seen in Figure 4, no relative differences in hearing device disadvantage that could not already be explained by linear distortions were observed among the commercial hearing aids and the PHL. While the exact processing in the devices is unknown, the results regarding the processing delay shown in Figure 3 give some insights into the implementation of frequency-dependent processing (Launer et al., 2016) in the individual devices. That is, while a pure overlap-add analysis would result in a frequency-independent delay, a filterbank with auditory resolution (Hohmann, 2002) would result in delays of specific channels increasing with decreasing frequency. None of the commercial hearing aids in the present study showed a behavior that is perfectly consistent with either of these extremes. Most of the devices show steps in the delay that are most probably associated with discrete hearing aid channel groups for which the frequency resolution might be chosen differently. HA1 shows a behavior that is most consistent with a classic filterbank at frequencies above approximately 500 Hz, and HA4 shows a delay that is probably not achievable based on an overlap-add analysis. The other devices show mostly frequency-independent delays, where single steps in delay may relate to overlap-add processing with dual frequency resolution. In conclusion, the different hearing aids probably used somewhat different but generally comparable frequency-dependent processing strategies, the details of which did not apparently affect the observed hearing device disadvantage.
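The two delay patterns contrasted above can be illustrated with a rough sketch. It is not the implementation of any tested device: a generic fourth-order gammatone filter with the equivalent rectangular bandwidth (ERB) after Glasberg and Moore (1990) serves as an assumed stand-in for a filterbank with auditory resolution, the envelope peak of its impulse response is used as a simple delay measure, and the frame length and sampling rate of the overlap-add processing are hypothetical values.

```python
import math

def erb_hz(fc_hz):
    """Equivalent rectangular bandwidth after Glasberg and Moore (1990)."""
    return 24.7 * (4.37 * fc_hz / 1000.0 + 1.0)

def gammatone_envelope_delay_ms(fc_hz, order=4):
    """Time of the envelope peak of a gammatone impulse response, in ms.

    The envelope t**(order-1) * exp(-2*pi*b*t) peaks at (order-1)/(2*pi*b).
    The bandwidth parameter b = 1.019 * ERB is a common choice (assumption).
    """
    b = 1.019 * erb_hz(fc_hz)
    return 1000.0 * (order - 1) / (2.0 * math.pi * b)

def overlap_add_delay_ms(frame_len, fs_hz):
    """Rough algorithmic delay of block processing: one frame length, in ms."""
    return 1000.0 * frame_len / fs_hz

# Hypothetical overlap-add setting: 128-sample frames at 32 kHz.
for fc in (250, 500, 1000, 2000, 4000):
    print(f"{fc:5d} Hz  filterbank ~{gammatone_envelope_delay_ms(fc):4.1f} ms  "
          f"overlap-add {overlap_add_delay_ms(128, 32000):4.1f} ms")
```

Under these assumptions, the filterbank delay grows towards low frequencies because the channels become narrower, whereas the overlap-add delay is the same in every frequency bin. Practical filterbanks may additionally compensate or limit channel delays, which this sketch ignores.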
Again, the results looked different in the earphone, which showed no significant hearing device disadvantage and a delay well below 1 ms. In contrast to the other devices, the low delays and limited adjustment options point towards the interpretation that processing was performed in the time domain. The operation mode of this device used in the experiment is a pure transparency mode, that is, in contrast to hearing aids, no complex frequency-dependent processing is required, which spares the need for frequency analysis that introduces delay and might introduce undesired artifacts. We consider it likely that the absence of frequency-dependent processing was a main contributor for the non-occurrence of a hearing device disadvantage in this device.
Processing delays themselves are a disturbance that has frequently been reported to lead to decreased sound quality (Lelic et al., 2022; Stone et al., 2008), and recently also to poorer neural encoding of speech (Zhou et al., 2024) and poorer speech intelligibility in special situations (Roth et al., 2024). They are thus another processing-related issue that should be discussed as a potential cause of the hearing device disadvantage. In the present study, we aimed at reducing the level of sound directly leaking into the ear canal and verified an attenuation of at least 10 dB at 500 Hz by real-ear measurements (see Figure 1). In consequence, the interaction of the hearing device output and the sound leaking directly into the ear canal, which leads to spectral interference effects, was also reduced, which should lead to a less pronounced effect of processing delays. However, the amount of direct sound increased towards lower frequencies, such that mild comb-filtering effects became visible as dips in the insertion gain that are consistent across participants in devices with a larger delay (see Figure 1, top three panels). It is thus possible that delay and comb filtering artifacts were audible despite the very controlled experimental setup that aimed at achieving a tight fit, especially in subjects with less attenuation of external sounds. All devices except the earphone and HA4 feature delays above 4 ms in the low-frequency regime, which is in the audible range as long as sufficient direct sound is present (Denk et al., 2021b; Stone et al., 2008). However, when these devices are grouped, no significant correlation between the low-frequency attenuation of direct sounds and the hearing aid disadvantage was obtained. It can thus be concluded with sufficient certainty that interactions between direct sound and the delayed output of the hearing devices were suppressed sufficiently to not contribute to a hearing device disadvantage in the present study.
This does not mean that a delay cannot cause a hearing aid disadvantage regarding speech intelligibility when a more vented fitting is used as in common practice. Indeed, Roth et al. (2024) recently predicted SRT disadvantages due to delays in open-fit monaural, but not bilateral fits.
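The interaction between direct sound and delayed device output discussed above can be sketched with a simple two-path model. This is an idealized illustration under assumptions: both paths are treated as frequency-independent, the direct sound is undelayed, and the 4 ms delay and 10 dB level difference in the example are illustrative values rather than measured data. The function names are hypothetical.

```python
import numpy as np

def comb_level_db(freq_hz, delay_s, direct_level_db):
    """Level (dB re device output alone) of device output plus leaked direct sound.

    The device output is the delayed reference path; the direct sound arrives
    undelayed at direct_level_db relative to the device output.
    """
    a = 10.0 ** (direct_level_db / 20.0)
    f = np.asarray(freq_hz, dtype=float)
    h = a + np.exp(-2j * np.pi * f * delay_s)
    return 20.0 * np.log10(np.abs(h))

def notch_frequencies_hz(delay_s, f_max_hz):
    """Frequencies where both paths are in antiphase: (2k + 1) / (2 * delay)."""
    first = 1.0 / (2.0 * delay_s)
    return np.arange(first, f_max_hz, 2.0 * first)

# Illustrative values: 4 ms delay, direct sound 10 dB below the device output.
notches = notch_frequencies_hz(0.004, 1000.0)
levels = comb_level_db([125.0, 250.0], 0.004, -10.0)
```

For a 4 ms delay, the notches of this model lie at 125, 375, 625 and 875 Hz, that is, in the low-frequency range where the attenuation of direct sound is typically weakest.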
HA4 and the earphone showed very low delays that should be undetectable even if direct sound leaks into the ear at sufficient level (Denk et al., 2021b; Lelic et al., 2022; Stone et al., 2008). Although the latencies of both devices are comparable, a significant hearing device disadvantage was observed in HA4 but not in the earphone. We conclude that if strong comb filtering effects detectable by spectral notches are avoided, the processing delay is not sufficient to explain the occurrence of a hearing device disadvantage. If a more open coupling is used, however, we would expect a contribution of delays to the hearing device disadvantage due to the spectral distortions that are introduced. If such comb filtering effects occur, we consider it likely that their influence on sound quality and possibly hearing device disadvantage could be predicted by auditory models as used in the present study.
It should also be noted that in the condition with hearing device gain of 15 dB, the prominence of delay and comb-filtering effects is further reduced since the level difference between sound leaking directly into the ear canal and hearing device output is increased. The same principle was applied by Cubick et al. (2018) to reduce the effects of processing delay. In Experiment II, the hearing aid disadvantage was not decreased by this setting at intermediate levels (“same output level” in Figure 6), as compared to the transparent setting. This observation has three possible explanations: (a) delay and comb filtering effects have no influence on the hearing aid disadvantage with non-occluding fits, (b) delay and comb filtering effects were still audible in the 15-dB gain condition, or (c) increased device self-noise (see below) compensated for the reduction of the delay and comb filtering effects in the 15-dB gain condition.
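The reduction of comb-filtering prominence with gain can be quantified with a short worked equation. It assumes an idealized superposition of two frequency-independent paths whose levels differ by ΔL; the values of 10 dB (transparent setting, using the minimum attenuation verified at 500 Hz) and 25 dB (with 15 dB gain added) are illustrative assumptions.

```latex
% Peak-to-notch ripple of two superimposed paths with level difference \Delta L (in dB)
R(\Delta L) = 20 \log_{10} \frac{1 + 10^{-\Delta L / 20}}{1 - 10^{-\Delta L / 20}}

% Transparent setting, direct sound 10 dB below the device output:
R(10\,\mathrm{dB}) = 20 \log_{10} \frac{1.316}{0.684} \approx 5.7\,\mathrm{dB}

% With 15 dB gain, level difference 25 dB:
R(25\,\mathrm{dB}) = 20 \log_{10} \frac{1.056}{0.944} \approx 1.0\,\mathrm{dB}
```

Under these assumptions, the spectral ripple shrinks from several dB to about 1 dB when the gain is applied, which is why a smaller contribution of delay effects would be expected in the 15-dB gain condition.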
Non-Linear Distortions and Self-Noise
Experiment II was conducted to assess Hypotheses C and D, namely that self-noise and transducer nonlinearities were responsible for the observed hearing device disadvantage. To this end, the measurements were conducted at different noise levels, with the assumption that self-noise of the hearing device would have an increasing negative impact towards low levels, and transducer nonlinearities would have an increasing negative impact towards high levels. With the hearing device set to transparency, no such trend was observed, neither for the SRT nor for the rated listening effort. Hypotheses C and D are thus rejected for the case of transparent hearing devices with state-of-the-art hardware such as the PHL (Denk et al., 2022).
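The reasoning behind the level dependence expected under Hypothesis C can be illustrated with a minimal sketch. It assumes that the device self-noise is uncorrelated with the external noise and simply adds in power, and it uses a hypothetical input-referred self-noise level of 25 dB SPL; neither the function names nor this value are taken from the study.

```python
import math

def effective_snr_db(speech_db, noise_db, self_noise_db):
    """SNR when uncorrelated self-noise adds in power to the external noise."""
    total_noise = 10.0 ** (noise_db / 10.0) + 10.0 ** (self_noise_db / 10.0)
    return speech_db - 10.0 * math.log10(total_noise)

def snr_loss_db(noise_db, self_noise_db):
    """Reduction of the effective SNR caused by the self-noise, in dB."""
    return 10.0 * math.log10(1.0 + 10.0 ** ((self_noise_db - noise_db) / 10.0))

# Hypothetical input-referred self-noise level of 25 dB SPL (assumption).
SELF_NOISE_DB = 25.0
losses = {level: snr_loss_db(level, SELF_NOISE_DB) for level in (30, 45, 60)}
```

With these assumed values, the effective SNR is reduced by about 1.2 dB at a noise level of 30 dB, but by less than 0.1 dB at 45 and 60 dB, which reflects the expectation that self-noise matters only at low stimulus levels.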
Some trends regarding the disadvantage were noted when hearing device gain was provided. In the case where the outcomes are compared to the unaided results with the same output level (i.e., lower stimulus level compensated by the gain of the hearing device), a hearing aid disadvantage very comparable to that of the transparent hearing devices was observed at 60 and 75 dB SPL noise, which was increased at 45 dB SPL (cf. Figures 6 and 9). This level dependence fits Hypothesis C well, and it is likely that the amplified device self-noise was the reason for this increase, very similarly for the SRT and rated listening effort.
When the results with the hearing device providing 15 dB gain are compared to the unaided results for the same stimulus level, the listening level is increased. This means that effects of a higher device output, which were included to emphasize the effects of transducer distortions (Figures 6 and 9), are confounded by effects of listening at different levels that are also observed in the unaided case (Figures 5 and 8). In this case, the hearing aid disadvantage regarding the SRT actually did increase from low towards 60 dB noise levels. However, most of the hearing aid disadvantage in this case at 60 dB is already explained by the increase in listening level seen in the unaided condition (Figure 5). Hence, we reject Hypothesis D and conclude that at normal conversational levels up to 75 dB, transducer nonlinearities do not contribute to a hearing device disadvantage regarding speech perception in noise also when 15 dB gain is provided by the hearing device.
At lower levels, the effect of amplification of the listening level differs between the SRT and listening effort results. While a benefit regarding the listening effort of up to 3 dB SNR is seen, especially at the low and medium effort stages (on average across subjects, 45 dB noise at ESCU 1 in Figure 9), no benefit is noted for the SRT. This benefit is smaller than the SNR difference for equal ESCU at different levels in the unaided case (cf. Figure 8). At 30 dB noise level, the 15 dB level increase due to the hearing device should have provided a benefit of more than 1.5 dB SRT due to the higher resulting listening level. The fact that no such advantage was present is further support for the limiting role of the device self-noise. While hearing aid amplification can make listening at low levels less effortful also for normal-hearing subjects, the self-noise apparently prevents speech intelligibility from being improved as well.
Cross-Modal and Psychological Effects
Besides the possible explanations discussed above, factors not directly affecting the stimuli reaching the eardrums of the participants might have contributed to the persistent hearing device disadvantage. First, it could be argued that acclimatization to the devices could eliminate the hearing device disadvantage. In the present experiment, the devices were worn for approximately 20–30 minutes, while longer periods of time were included in our previous work (Denk et al., 2021a, approx. 90 minutes) and other studies observing a hearing device disadvantage (Cubick and Dau, 2016; Cubick et al., 2018). Also, the fact that no disadvantage was observed in one tested device shows that acclimatization effects cannot be the only reason. Still, future work could include an assessment of the hearing device disadvantage over the course of time.
Second, cross-modal effects could have contributed to the observed disadvantages. Since subjects had to repeat the sentences, alterations of the own voice due to the occlusion effect with the closed-fit devices may have led to perceived disturbances during vocalization that ultimately also led to worse performance than in the unaided condition. The earphone in the present study included a feature that uses ANC-like processing to reduce the excess low-frequency levels that usually occur in the ear canal while speaking when the ear canal is occluded (Denk et al., 2024). This potentially provides another explanation why this device showed no disadvantage. Finally, the measured hearing aid disadvantage could also have a psychological origin. The participants were always aware when they were wearing a device in their ear and might thus have been prompted to expect that their perception was altered. Such an explanation may be supported by recent findings from Lelic et al. (2023) showing that hearing aid outcomes can be influenced by directing the attitude of participants.
Conclusions
We systematically evaluated the negative effect that linear, transparent hearing devices have on speech perception in noise that comes from the same direction as a speech source. Although listening through such a hearing device does not change the relative level between speech and noise, a consistent hearing device disadvantage was seen both for speech intelligibility and, as a trend, also for rated listening effort. This hearing device disadvantage was only avoided in one tested device, which was an earphone with hear-through feature that was characterized by a low delay, processing in time domain and no large deviations from a 0 dB insertion gain. While self-noise of the device had a significant effect only at low stimulus levels and when the hearing device provides gain, transducer distortions did not contribute to the hearing device disadvantage observed in the present experiments. Comparison of results between devices and settings showed that the observed hearing device disadvantage is most probably caused by parameters that are commonly associated with sound quality, namely linear distortions, subtle artifacts of frequency-dependent processing, and potentially delay and comb filtering effects. The present results thus demonstrate that the sound quality of hearing devices should be optimized as much as possible to avoid negative side effects—not only for the perceived fidelity as a luxury feature but also for audiologically relevant aspects like speech intelligibility in noise.
Supplemental Material
sj-pdf-1-tia-10.1177_23312165241246597 - Supplemental material for (Why) Do Transparent Hearing Devices Impair Speech Perception in Collocated Noise?
Supplemental material, sj-pdf-1-tia-10.1177_23312165241246597 for (Why) Do Transparent Hearing Devices Impair Speech Perception in Collocated Noise? by Florian Denk, Luca Wiederschein, Markus Kemper and Hendrik Husstedt in Trends in Hearing
Footnotes
Acknowledgements
The authors thank our participants for their time and patience during these measurements, Alina-Sophie Bockelmann and Laureen Moschner for executing Experiment II, Thomas Biberger for providing the code and support for the GPMSq modelling, the Lübeck academy of hearing acoustics for access to the commercial hearing aid samples, and Chaslav Pavlovic for providing the PHL. Data obtained within this work will be made available upon reasonable request.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
