Abstract
The combination of directional microphones and noise reduction (DIR + NR) in hearing aids offers substantial improvement in speech intelligibility and reduction in listening effort in spatial acoustic scenarios. Pupil dilation can be used as an ocular marker of listening effort. However, pupil dilation is also known to depend crucially on luminance. The present study investigates the effects of a state-of-the-art DIR + NR algorithm (implemented in commercial hearing aids) on pupil dilation of hearing aid users both in darkness and ambient light conditions. Speech intelligibility and peak pupil dilations (PPDs) of 29 experienced hearing aid users were measured during a spatial speech-in-noise task at a signal-to-noise ratio (SNR) matching the individual's speech reception threshold. While speech intelligibility improvements due to DIR + NR were substantial (about 35 percentage points) and independent of luminance, PPDs were significantly reduced by DIR + NR only in ambient light, not in darkness. This finding suggests that the reduction in PPD due to DIR + NR (most likely through improvement in SNR) is dependent on luminance and should be interpreted with caution as a marker for listening effort. Relations of reduction in PPD due to DIR + NR in ambient light to subjectively reported long-term fatigue, age, and pure-tone average were not statistically significant, which indicates that all patients benefitted similarly in listening effort from DIR + NR, irrespective of these patient-specific factors. In conclusion, luminance needs to be carefully controlled in hearing aid studies inferring listening effort from pupillometry data.
Introduction
Listening effort is an important outcome for studying the effects of rehabilitative measures in hearing-impaired (HI) listeners. Unlike the far more commonly used outcome measure of speech intelligibility (Winn & Teece, 2021), listening effort measures “the allocation of attentional and cognitive resources toward auditory tasks, such as detecting, decoding, processing, and responding to speech” (Bess & Hornsby, 2014). Listening effort also addresses the cognitive burden of mentally correcting misperceived words, even when speech intelligibility is close to maximum (Winn & Teece, 2021).
If the quality of auditory input is reduced, as in HI listeners, they may need to allocate listening effort differently in everyday situations compared to normal-hearing (NH) listeners (Ohlenforst et al., 2017b). This can result in increased subjective (or experienced) listening effort (Pichora-Fuller et al., 2016). Consequently, increased listening effort is one of the primary complaints of HI listeners (Hughes et al., 2018). Long-term effects of increased listening effort can be mental fatigue (Bess & Hornsby, 2014), avoidance of social situations (Hughes et al., 2018), and increasing need for recovery time after work (Nachtegaal et al., 2009).
Pupillometry can be used as an objective measure of listening effort (Zekveld et al., 2010). Typically, an increase in pupil size can be observed in response to target sentences (i.e., pupil response), which has produced reliable results in hearing research (Winn et al., 2018). In contrast to subjective measures of listening effort (i.e., questionnaires), the pupil response can be seen as an objective measure of
One goal of this line of research has been to find relations between task-evoked listening effort (Pichora-Fuller et al., 2016) and (long-term) fatigue (Bess & Hornsby, 2014; Hornsby et al., 2016), as well as to determine their relations to hearing impairment. For example, Wang et al. (2018a) found a negative correlation between self-reported fatigue and peak pupil dilation (PPD), indicating that persons with higher levels of fatigue show reduced PPD. Along the same lines, smaller task-evoked pupil responses were associated with increased tiredness from listening (McGarrigle et al., 2021). Wang et al. (2018b) investigated PPD in darkness and ambient light and were thus able to separate the influence of the parasympathetic nervous system (PNS) and the sympathetic nervous system (SNS) on the pupil response. The interpretation of their results is based on evidence that the PNS has a negligible effect on the task-induced pupil dilation in darkness, because of relaxed sphincter muscles resulting in a larger overall pupil diameter (Steinhauer et al., 2004). Since both PNS and SNS are actively contributing to PPD in ambient light, a contrast of pupil responses in darkness and ambient light makes it possible to disentangle SNS and PNS contributions. Thus, Wang et al. (2018b) were able to relate higher levels of self-reported fatigue to an overall more activated PNS.
The usage of hearing aids (HAs) may lead to a reduction in listening effort in HI listeners (Wendt et al., 2017). For instance, recent research has shown that longer HA experience is associated with less subjectively reported listening effort in everyday life (Ferschneider & Moulin, 2023). Although amplification provided by HAs was reported in some studies to reduce listening effort (e.g., Downs, 1982), there is no clear systematic evidence that HA amplification alone reduces listening effort (see Ohlenforst et al., 2017a for a review). HAs used by NH listeners can even increase listening effort (Denk et al., 2024). It is, however, well established that the improvement of the SNR translates to reductions in objective measures of listening effort in HA users (Seifi Ala et al., 2020), even in NH listeners (Książek et al., 2021; Ohlenforst et al., 2017b).
Classical single-channel noise reduction (NR) algorithms that are widely used in HAs were shown not to reduce listening effort systematically (e.g., Brons et al., 2014), which mirrors the nonexistent effect of these algorithms on speech intelligibility (Bentler et al., 2008). A reduction in listening effort through NR was shown only in very difficult listening conditions using a dual-task paradigm (Desjardins & Doherty, 2014). However, since novel NR algorithms based on deep neural networks today offer a small but significant improvement in speech intelligibility (Andersen et al., 2021), these relations need to be revisited.
In contrast, directional microphone (DIR) algorithms (either alone or in combination with NR) in HAs showed significant reductions of measures of listening effort in spatial acoustic scenarios consistently across studies: using subjective ratings (Johnson et al., 2016; Picou et al., 2017; Winneke et al., 2020), a dual-task paradigm (Picou et al., 2017), pupillometry (Fiedler et al., 2021; Wendt et al., 2017) and EEG (Bernarding et al., 2017; Fiedler et al., 2021; Winneke et al., 2020). DIR + NR also reduces the susceptibility to fatigue (Hornsby, 2013). Although the effects of DIR + NR on objective measures of listening effort are well documented, to the authors’ knowledge the effect of luminance on this release from listening effort through state-of-the-art DIR + NR and its relation to fatigue and other patient-related factors have not yet been studied.
The present study used a research design similar to Wendt et al. (2017) and Wang et al. (2018b) to address the following goals:
To investigate whether the expected DIR + NR-related reduction of the pupil response (as a measure of reduced listening effort accompanied by increased speech intelligibility) depends on luminance. It is hereby hypothesized that there is an improvement in speech intelligibility and a reduction in pupil dilation through DIR + NR both in ambient light and darkness.
To investigate correlations of pupil dilation and pupil dilation changes due to DIR + NR to patient-specific factors, such as age, audiogram, and self-reported fatigue. Based on the results of Wang et al. (2018a) it is hereby hypothesized that HA users with a higher level of fatigue show reduced pupil dilation in ambient light, and based on the results of Winn et al. (1994) that age and the audiogram are negatively correlated with pupil dilation change due to DIR + NR.
Methods
Participants
Twenty-nine (14 female and 15 male) experienced HA users participated in this study. All participants were native German speakers. Ages ranged from 29 to 78 years, with an average of 63 years and a standard deviation (SD) of 13.1 years. Standard pure-tone audiometry was done using an Affinity 2.0 audiometer (Interacoustics, Middelfart, Denmark). All participants had a bilateral sensorineural hearing loss (i.e., air-bone gap averaged across 500, 1000, 2000, and 4000 Hz < 15 dB) with no known neurotological or cognitive disorders. Participants showed across-ear threshold differences ≤ 15 dB for at least 4 out of the 6 octave frequencies (250, 500, 1000, 2000, 4000, and 8000 Hz), indicating rather symmetric hearing losses. Figure 1 shows the average hearing threshold (HT) as blue crosses for the left and as red circles for the right side. Error bars denote SD. The pure-tone average (PTA) across the frequencies 500, 1000, 2000, and 4000 Hz was 54.1 dB hearing level (HL, SD = 13.1 dB HL) averaged over left and right side.

Average Pure-Tone Hearing Thresholds Measured in the Left (Blue) and Right (Red) Ear Across Frequencies (125 Hz–8 kHz) for All 29 Participants. Error Bars Indicate ±1 Standard Deviation.
Apparatus
Participants were seated in a room of the University of Applied Sciences Lübeck with living-room-like acoustics. Three loudspeakers (Genelec 8040A, Iisalmi, Finland) were positioned at a distance of 1.2 m around the participant, one in front and two at ±100°, as shown in Figure 2. Signals were generated on a laptop via customized scripts in MATLAB (The MathWorks, Natick, USA), which was connected to a FireFace UC sound card (RME, Haimhausen, Germany) that fed the three loudspeakers. Calibration was done for each loudspeaker using an NTI XL2 level meter (Schaan, Liechtenstein) at the position of the listener's head. Pupil diameters of both the left and right eye were recorded at a sampling rate of 200 Hz using a Pupil Labs Core eye tracking device (Pupil Labs GmbH, Berlin, Germany). The light in the room was controlled by two continuously dimmable LED lamps (Viltrox VL-500 T, Jueying Technology Co., Shenzhen, China) placed behind the participant's chair, which were directed toward a white projection screen in front of the participant (i.e., behind the 0° loudspeaker) to avoid reflection in the pupils. Light intensity was measured with a luxometer (Testo 545, Testo SE, Titisee-Neustadt, Germany) that was calibrated according to DIN 5032 part 7.

Setup of the Experiment. Target Speech was Presented from the Front. Masker Signals were Presented at ± 100°.
Hearing Aid Settings
Participants in the present study were recruited from the pool of individuals participating in a longitudinal HA study (Jürgens et al., 2025; Zaar et al., 2023). To this end, all participants were fitted with Oticon More HAs (Oticon A/S, Smørum, Denmark) and had at least 4 months of experience with these HAs. HAs were fitted using the NAL-NL2 fitting formula (Keidser et al., 2011) by a professional HA acoustician, who also verified the fitting using real-ear measurements. Insert ear-tips or individually fitted custom-made ear molds were used, as prescribed by the fitting software Genie 2 (Oticon A/S, Smørum, Denmark). Participants used the same acoustic coupling and amplification settings as in the HA study, that is, they were accustomed to both this amplification and the closedness of the acoustic coupling (Jürgens et al., 2025). For the purpose of the present study, two settings were programmed, one with DIR + NR on and one with DIR + NR off. DIR + NR consists of a fast-acting combination of a minimum variance distortionless response (MVDR) beamformer (Kjems & Jensen, 2012) and a deep-neural-network based noise suppression for postfiltering (Andersen et al., 2021). The strength of DIR + NR was chosen maximum adaptive. Other signal processing algorithms (e.g., feedback cancelation) were left at default.
Fatigue Questionnaires
Two different fatigue questionnaires were used in the present study. The “Need for Recovery” (NfR) scale by Van Veldhoven and Broersen (2003) consists of eleven items that capture the extent of the need for rest at the end of a workday. Examples of the items are “I find it difficult to wind down at the end of a working day” and “When I get home from work, I find it hard to get involved with other people.” These statements can be answered using a four-point rating scale from 0 “never” to 3 “always.” The evaluation is done by forming a sum value across the eleven items, where the lowest value that can be achieved is 0 and the highest value that can be achieved is 33. The higher the score in the NfR, the higher the need for recovery at the end of a working day. Because many retirees participated in this study, the term “working day” was rewritten as “weekday.” For example, “I find it difficult to wind down at the end of a weekday.” The participants were asked to indicate how they felt after their daily routine.
The “Checklist Individual Strength” (CIS) questionnaire is a 20-item questionnaire and was designed to measure several aspects of fatigue (Vercoulen et al., 1994). Examples of the items are “I feel tired” and “I think I do a lot during the day.” Using a 7-point Likert scale (ranging from completely true to completely false), participants indicated how they have felt over the past two weeks. Ratings were assigned numerical values such that the total score ranges between 20 and 140 points. The higher the score, the higher the impact of fatigue, concentration problems, reduced motivation, and reduced activity (Beurskens et al., 2000).
To calculate a participant's overall fatigue score in this study, the CIS and NfR scores were converted to a percentage for each participant and then averaged.
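The scoring described above can be sketched as follows. This is a minimal illustration and not the authors' script: the text only states that both sum scores were "converted to a percentage", so rescaling each score to its possible range (0 to 33 for the NfR, 20 to 140 for the CIS) is an assumption of this sketch, and all function names are hypothetical.

```python
# Score ranges as described in the text.
NFR_MIN, NFR_MAX = 0, 33      # 11 items, each rated 0..3
CIS_MIN, CIS_MAX = 20, 140    # 20 items, each rated 1..7


def to_percent(score, lowest, highest):
    """Rescale a sum score to 0..100 % of its possible range (assumed convention)."""
    if not lowest <= score <= highest:
        raise ValueError(f"score {score} outside [{lowest}, {highest}]")
    return 100.0 * (score - lowest) / (highest - lowest)


def nfr_sum(item_ratings):
    """Sum the eleven NfR item ratings (each 0..3)."""
    if len(item_ratings) != 11 or any(r not in (0, 1, 2, 3) for r in item_ratings):
        raise ValueError("NfR expects eleven ratings between 0 and 3")
    return sum(item_ratings)


def overall_fatigue(nfr_score, cis_score):
    """Overall fatigue score: mean of the two percentage scores."""
    nfr_pct = to_percent(nfr_score, NFR_MIN, NFR_MAX)
    cis_pct = to_percent(cis_score, CIS_MIN, CIS_MAX)
    return (nfr_pct + cis_pct) / 2.0
```

If the percentage were instead taken relative to the maximum score alone, the CIS contribution would never fall below about 14%, which is why the range-based convention is used in this sketch.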
Measurement Procedure
After informed consent and standard audiometry, participants were asked to fill out the NfR and CIS questionnaires. Luminance was controlled with each participant seated. Light intensity was adjusted manually by the trained experimenter using the continuously adjustable dimmers of the two LED lamps to ensure 370 ± 5 lux for the ambient light and <1 lux for the dark condition at the participant's head in their line of sight. Participants were asked not to wear corrective glasses during the measurement to avoid reflections that may reduce the quality of the pupil size measurements. As target speech, German sentences (male speaker) of the hearing in noise test (HINT, Joiko et al., 2021) were presented from the loudspeaker positioned in front of the listener at 0° azimuth with 1.2 m distance as shown in Figure 2. Two different maskers were presented from ±100° azimuth. Both maskers were intelligible speech signals, each about a different topic, from two male speakers mixed with speech-shaped noise at −6 dB relative to the running speech level (Zaar et al., 2024).
First, a training run was conducted with DIR + NR switched off in ambient light using one HINT list with 20 sentences. Then an adaptive procedure was used to estimate the individual SNR required for 50% sentence intelligibility (SRT50) in ambient light with DIR + NR switched off, using two HINT lists. A one-up-one-down procedure with SNR adjusted in 2 dB steps (as proposed by Joiko et al., 2021) was used. Hereby, the level of the maskers was fixed at 65 dB SPL (measured at the position of the participant) and the level of the target signal was varied, starting with an SNR of 0 dB. To avoid mixing the pupil responses that are indicative of listening effort with pupil responses originating from participants’ verbal responses in the speech intelligibility task, participants were asked to repeat the target-sentence words only after they heard a signal tone (at 1000 Hz) that was presented 3 s after the sentence ended (cf. Winn et al., 2018). Using word scoring, the experimenter marked all words that the participant repeated correctly before the next sentence was presented. In each of the subsequent four conditions, the HINT test was conducted using two lists (i.e., 40 sentences per condition) at a fixed SNR equal to the individual SRT50. Hereby, both pupil dilation and speech intelligibility scores in percent correct were recorded. HINT lists were randomized across participants and conditions using a Latin square design. Two conditions in ambient light with DIR + NR on and off, and two conditions in darkness with DIR + NR on and off were conducted in randomized order. For each condition, the participant was given a minimum of 3 min to adjust to the new luminance condition.
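The one-up-one-down track described above can be illustrated with a short sketch. The 2 dB step, the 0 dB starting SNR and the fixed 65 dB SPL masker level are taken from the text; the criterion for counting a sentence as correct and the rule for averaging the track into an SRT50 estimate are not given here, so both are assumptions, and the function names are hypothetical.

```python
MASKER_LEVEL_DB_SPL = 65.0  # fixed masker level (from the text)


def one_up_one_down(sentence_correct, start_snr=0.0, step_db=2.0):
    """Return the SNR of every presented sentence plus the SNR that would follow.

    sentence_correct: iterable of booleans, True if the sentence counted as
    correctly repeated (the pass criterion is an assumption of this sketch).
    """
    track = [start_snr]
    for correct in sentence_correct:
        # Correct response -> harder (lower SNR); incorrect -> easier (higher SNR).
        delta = -step_db if correct else step_db
        track.append(track[-1] + delta)
    return track


def target_level(snr_db, masker_level=MASKER_LEVEL_DB_SPL):
    """Target speech level in dB SPL for a given SNR with a fixed masker level."""
    return masker_level + snr_db


def estimate_srt50(track, skip=4):
    """Average the SNR track after discarding the first `skip` presentations.

    The number of skipped presentations is an assumed convention, not a value
    reported in the study.
    """
    used = track[skip:]
    if not used:
        raise ValueError("track too short for the chosen skip")
    return sum(used) / len(used)
```

Because a one-up-one-down rule moves toward the SNR at which correct and incorrect responses are equally likely, the track converges on the 50% point of the psychometric function, which is why it suits an SRT50 estimate.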
Pupillometry and Pupil Data Processing
Pupil diameters of both eyes were recorded. Before preprocessing, the datasets of the left and right eye were both analyzed and the dataset with fewer missing samples (zeros) was selected. The following analysis was done with MATLAB R2021a (The MathWorks Inc., Natick, Massachusetts, USA) and accompanied by visual inspection of the raw and preprocessed data to ensure plausibility. Only pupil data with more than 50% confidence (as provided as metadata by the eye tracking device) were kept for further processing to ensure quality of pupil size estimation. Accordingly, pupil data with confidence less than or equal to 50% were replaced by NaN values (“not a number” as placeholder for missing samples), including the time range 35 ms before and 100 ms after their occurrence to minimize artifacts caused by eye blinks and other noise (Fiedler et al., 2021; Winn et al., 2018). The pupil diameter was divided into trials based on markers generated during the recording at sentence onset. A trial ranged from 1 s before to 5 s after sentence onset. In each condition, the first and last of the 40 trials were discarded. Any trial with more than 30% missing samples was excluded from the analysis (Fiedler et al., 2021; Winn et al., 2018). Additionally, a participant's entire dataset was discarded if less than 70% of their data across all trials was available to ensure that enough trials were available to estimate the individual pupil response. As a result of this data processing, the data of 3 out of the 29 participants needed to be discarded completely. In the remaining 26 participants, 71.2% and 71.9% of the trials in darkness were available with DIR + NR off and on, respectively, and 80.6% and 79.3% of the trials in ambient light were available with DIR + NR off and on, respectively. For these remaining datasets a linear interpolation was performed to reconstruct missing samples. The data were temporally smoothed by convolution with a Hamming window of 0.5 s length.
For calculating the baseline pupil diameter (BPD), the pupil diameter in the 1-s interval before sentence onset was used. Within this time range the mean pupil size was calculated and subtracted from the pupil size of the whole trial, thereby isolating the response to the sentence. The BPD was calculated for each participant for each sentence. For further investigations the PPD was determined. First, the pupil responses were averaged across all valid trials within each condition for each participant. The PPD was then extracted as the maximum value of the pupil response relative to BPD from these averaged pupil dilations.
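The preprocessing chain described above (confidence masking with padding, trial rejection, linear interpolation, Hamming smoothing) can be sketched in plain Python. This is an illustration under stated assumptions, not the authors' MATLAB code: confidence is assumed to be on a 0 to 1 scale, gaps at the trial edges are filled with the nearest valid value, the window length is rounded up to an odd number of samples, and the window is renormalized at the edges. All function names are hypothetical.

```python
import math

FS = 200                          # sampling rate in Hz (from the text)
PAD_BEFORE = round(0.035 * FS)    # 35 ms  -> 7 samples
PAD_AFTER = round(0.100 * FS)     # 100 ms -> 20 samples


def mask_low_confidence(diameter, confidence, threshold=0.5):
    """Replace samples with confidence <= threshold, plus padding, by NaN."""
    out = list(diameter)
    n = len(out)
    for i, c in enumerate(confidence):
        if c <= threshold:
            for j in range(max(0, i - PAD_BEFORE), min(n, i + PAD_AFTER + 1)):
                out[j] = math.nan
    return out


def missing_fraction(trial):
    """Fraction of NaN samples in a trial."""
    return sum(math.isnan(x) for x in trial) / len(trial)


def keep_trial(trial, max_missing=0.30):
    """A trial is excluded if more than 30% of its samples are missing."""
    return missing_fraction(trial) <= max_missing


def interpolate_linear(trial):
    """Fill NaN runs linearly; edge gaps take the nearest valid value (assumption)."""
    valid = [i for i, x in enumerate(trial) if not math.isnan(x)]
    if not valid:
        raise ValueError("trial has no valid samples")
    out = list(trial)
    for i in range(valid[0]):
        out[i] = trial[valid[0]]
    for i in range(valid[-1] + 1, len(out)):
        out[i] = trial[valid[-1]]
    for a, b in zip(valid, valid[1:]):
        for i in range(a + 1, b):
            w = (i - a) / (b - a)
            out[i] = (1 - w) * trial[a] + w * trial[b]
    return out


def hamming_smooth(trial, length_s=0.5):
    """Convolve with a Hamming window, normalized by the overlapping weights."""
    m = round(length_s * FS) | 1   # odd length avoids a half-sample shift (assumption)
    window = [0.54 - 0.46 * math.cos(2 * math.pi * k / (m - 1)) for k in range(m)]
    half = m // 2
    out = []
    for i in range(len(trial)):
        acc = norm = 0.0
        for k, w in enumerate(window):
            j = i + k - half
            if 0 <= j < len(trial):
                acc += w * trial[j]
                norm += w
        out.append(acc / norm)
    return out
```

A trial would pass through `mask_low_confidence`, then `keep_trial`, and only retained trials would be interpolated and smoothed.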
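Baseline correction and PPD extraction as described above can be sketched as follows. The 200 Hz sampling rate and the trial layout (1 s before to 5 s after sentence onset) come from the text; restricting the peak search to samples after sentence onset is an assumption of this sketch, and the function names are hypothetical.

```python
FS = 200            # sampling rate in Hz (from the text)
BASELINE_S = 1.0    # each trial starts 1 s before sentence onset


def baseline_correct(trial):
    """Subtract the mean of the 1-s pre-onset interval (the BPD) from a trial.

    Returns the baseline-corrected trial and the BPD itself.
    """
    n_base = int(BASELINE_S * FS)
    bpd = sum(trial[:n_base]) / n_base
    return [x - bpd for x in trial], bpd


def average_trials(trials):
    """Sample-wise mean across equally long, baseline-corrected trials."""
    n = len(trials)
    return [sum(samples) / n for samples in zip(*trials)]


def peak_pupil_dilation(mean_response):
    """PPD: maximum of the averaged response after sentence onset (assumption)."""
    n_base = int(BASELINE_S * FS)
    return max(mean_response[n_base:])
```

Taking the maximum of the averaged response, rather than averaging per-trial maxima, keeps noise peaks in single trials from inflating the PPD.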
Statistical Analysis
Statistical analysis was done using Jamovi version 2.5.6 (Jamovi, 2024). PPDs and speech intelligibility scores were first tested for normal distribution using the Shapiro-Wilk test. Since speech intelligibility scores were not normally distributed, they were transformed to rationalized arcsine units (RAU, Studebaker, 1985). Next, two separate two-way analyses of variance (ANOVAs) with the independent variables (i.e., factors) luminance (levels: ambient light and darkness) and DIR + NR (levels: on and off) were performed, one on the dependent variable PPD and one on the dependent variable RAU-transformed speech intelligibility score. Post hoc paired-samples t-tests with Bonferroni correction were conducted to check for significant differences in PPDs (and speech intelligibility) between DIR + NR on and off. A generalized linear mixed model was used to investigate possible correlations between age, fatigue score, audiogram, luminance, and algorithm (as factors) and PPD as output. In addition, a second generalized linear mixed model with factors age, fatigue score, audiogram, luminance and output
Post hoc power analyses were run with the software G*Power version 3.1.9.6. For the difference between two dependent means (matched pairs), Cohen's d = 0.79 (based on the data in the ambient light condition) and the final sample size of 26 participants yielded a statistical power of >98% to detect a difference in PPD between DIR + NR on and off. For the correlational approaches (correlation and point biserial model), a medium effect size (Cohen's d = 0.5) yielded a statistical power of 89% to detect an effect.
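The rationalized arcsine transform applied to the speech intelligibility scores can be sketched as below, using the formula of Studebaker (1985). The transform needs the number of correct items and the total number of scored items; how many words were scored per condition is not stated in this section, so the totals used when calling the function are hypothetical.

```python
import math


def rau(correct, total):
    """Rationalized arcsine units for `correct` out of `total` scored items.

    theta = asin(sqrt(X / (N + 1))) + asin(sqrt((X + 1) / (N + 1)))
    RAU   = (146 / pi) * theta - 23
    """
    if total <= 0 or not 0 <= correct <= total:
        raise ValueError("need total > 0 and 0 <= correct <= total")
    theta = (math.asin(math.sqrt(correct / (total + 1)))
             + math.asin(math.sqrt((correct + 1) / (total + 1))))
    return (146.0 / math.pi) * theta - 23.0
```

The transform leaves mid-range scores close to their percentage values while stretching scores near 0% and 100%, which stabilizes the variance for the subsequent ANOVA.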
Results
Speech Intelligibility
Individual SRT50s of all participants ranged between −7.6 and +1.7 dB with an average of −3.7 dB. Figure 3a shows boxplots of speech intelligibility scores in percent for both DIR + NR on and off, in ambient light and in darkness, measured at the constant SNR corresponding to the individual SRT50. In addition, individual values are shown as gray dots. Figure 3b shows the difference in speech intelligibility scores between DIR + NR on and off in darkness (dark gray dots) and in ambient light (light gray dots). A two-way ANOVA on the RAU-transformed speech intelligibility scores indicated a main effect of DIR + NR (F(1,28) = 602.18,

(a) Boxplots of Speech Intelligibility Scores for All 29 Participants with DIR + NR Off and on Both in Ambient Light and Darkness. Each Individual HA User is Shown as a Gray Dot; and (b) Boxplot of Individually Calculated Speech Intelligibility Benefit (Percentage Scores) Due to the DIR + NR Algorithm Both in Darkness and in Ambient Light. DIR+NR: Directional Microphones and Noise Reduction; HA: Hearing Aid.
Pupil Dilation
The corresponding average pupil responses across the 26 participants whose data were usable are displayed in Figure 4a. The pupil size difference from the baseline is shown as a function of time relative to sentence onset. Curves represent grand average pupil dilations (i.e., averaged across all participants and all trials) for ambient light and DIR + NR off (blue), ambient light and DIR + NR on (green), dark and DIR + NR off (black), and dark and DIR + NR on (red). The standard error of the mean (SEM) of the pupil dilation is indicated as a shaded band in each condition. In each condition the pupil dilation time course shows a maximum between 2.5 and 3.5 s after sentence onset.

(a) Time Course of the Grand Average Pupil Dilations with Respect to Baseline for Two Conditions in Ambient Light with DIR + NR Off (Blue) and on (Green) and Two Conditions in Darkness with DIR + NR Off (Black) and on (Red); (b) Boxplot of PPDs Using the Same Color Code as in (a). Each Individual Participant's PPD is Shown as a Gray Dot; (c) Boxplot of PPD Differences Between DIR + NR on and Off in Darkness (Dark Gray Dots) and in Ambient Light (Light Gray Dots). DIR+NR: Directional Microphones and Noise Reduction; PPD: Peak Pupil Dilations.
Figure 4b shows PPD in darkness and in ambient light with DIR + NR on and off (n = 26) as boxplots with individual data (gray dots). In the dark condition a mean PPD of 0.09 mm (SD = 0.07 mm) with DIR + NR off (black box) and 0.08 mm (SD = 0.08 mm) with DIR + NR on (red box) was observed. In ambient light the mean PPD was 0.17 mm (SD = 0.07 mm) with DIR + NR off (blue box) and 0.12 mm (SD = 0.06 mm) with DIR + NR on (green box). Figure 4c shows the difference in PPD between DIR + NR on and off in darkness (dark gray dots) and in ambient light (light gray dots). A two-way ANOVA revealed a significant main effect of DIR + NR (F(1,25) = 15.35,
Relations of Patient-Specific Factors to Pupil Dilation
A generalized linear mixed model with output PPD and independent factors algorithm (DIR + NR on/off), luminance, PTA, age, fatigue score as well as interactions between luminance and algorithm (PPD ∼1 + luminance + algorithm + PTA + age + fatigue score + luminance:algorithm + (1|subject)) indicated main effects of luminance (
Another generalized linear mixed model with output PPD difference between DIR + NR on and off was done in order to investigate if the amount of reduction of PPD depended on either of the independent factors luminance, PTA, age or fatigue score (PPD difference ∼ 1 + luminance + PTA + age + fatigue score + (1|subject)). This analysis indicated again a main effect of luminance (
Discussion
The present study showed that speech intelligibility of HA users measured at a fixed, individual SNR improved substantially using state-of-the-art commercially available DIR + NR, but PPDs (as an ocular marker of listening effort) only decreased significantly in ambient light. Luminance itself had a significant effect on PPDs irrespective of DIR + NR on or off. In addition, there was no significant effect of age, PTA, or fatigue score on either the PPDs or the PPD difference due to DIR + NR in the ambient light condition.
Influence of DIR + NR on Speech Intelligibility
The effect of combined DIR + NR on speech intelligibility reported here qualitatively matches the improvements in SRT50 reported by Jürgens et al. (2025) using the same algorithm and the same spatial scenario. Note that the improvement in speech intelligibility can largely be attributed to an improvement in SNR due to spatial filtering from the DIR algorithm. The chosen spatial setup of two interfering sources from ±100° is beneficial for this type of algorithm. Furthermore, there is good quantitative agreement with the percentage improvements reported by Wendt et al. (2017) (28 percentage point increase due to DIR + NR), despite differences in the language of the test (Danish vs. German here), different arrangements of the maskers (5-speaker circle and 4 overlapping talkers vs. 3 speakers and 2 overlapping talkers here), and the use of a predecessor algorithm in their study. The increase of about 35 percentage points in the present study and speech intelligibility scores themselves both with and without DIR + NR were hereby, as expected, independent of luminance. This is in line with almost identical SRT50s in darkness and ambient light reported in Wang et al. (2018b). Note that the study design of the present study aimed at testing participants at their individual SRT50, thus expecting 50% correct word recognition with DIR + NR off. This expectation was not statistically confirmed in the dark with an average speech intelligibility score of 54.1% (t(28) = 2.58,
Influence of DIR + NR on Pupil Response in Ambient Light
Reductions of measures of listening effort due to DIR + NR have been shown by numerous studies with various techniques, most of them without stating the light intensity during the experiment (Bernarding et al., 2017; Johnson et al., 2016; Picou et al., 2017; Wendt et al., 2017; Winneke et al., 2020). Others used lower light intensities than in the present study such as an average of 84.3 lux (Ohlenforst et al., 2018) or 45 lux (Fiedler et al., 2021). Luminance is known to have a crucial effect on both tonic and phasic pupil dilation (Pan et al., 2022; Peysakhovich et al., 2017). Systematic investigations of different light intensities on the strength of pupil response have shown that dim light maximizes the effect (Baldock et al., 2024). Therefore, it is possible that the ambient light condition used in the present study was not optimal for eliciting the largest effects. As a qualitative comparison the study of Wendt et al. (2017) is particularly worth mentioning here due to the high similarity of measurement methods (i.e., DIR + NR algorithm, spatial speech-in-noise setup). They showed higher PPD (indicating increased listening effort) in their lower intelligibility condition (level at 50% speech intelligibility) compared to their higher intelligibility condition (95% speech intelligibility). When DIR + NR was switched on, PPDs were significantly reduced in both measurement conditions (indicating reduced listening effort). This means that even with a high SNR and saturated speech intelligibility, a benefit in listening effort from DIR + NR can still be measured, which shows the effectiveness of the combination DIR + NR that was also used here.
Ohlenforst et al. (2018) explored the effects of the same DIR + NR algorithm as in Wendt et al. (2017) under various SNR conditions to examine its impact within different difficulty levels of the hearing situation. They constructed psychometric functions illustrating the relationship between SNR and pupil dilation, as well as between SNR and speech intelligibility. A shift in the psychometric function for SNR versus speech intelligibility and versus mean PPD by about −4 dB was observed when applying DIR + NR. These results support the general hypothesis that DIR + NR in HA leads to reduced listening effort as indicated by smaller pupil dilations simply through improvement of the SNR.
Influence of DIR + NR on Pupil Response in Darkness
The present study's finding that there is no reduction in PPD due to DIR + NR in darkness (in contrast to the substantial reductions observed in ambient light) is novel and has to the authors’ knowledge not yet been reported elsewhere. As the DIR + NR most likely improves the SNR, our finding can be compared with the study of Zhang et al. (2022), who contrasted PPDs to speech material at two different SNRs in darkness, dim and ambient light. Zhang et al. (2022) found very small PPD differences between the two SNRs in darkness, much smaller than in dim and ambient light. One possible interpretation of our finding and the finding of Zhang et al. (2022) is the differential effect of the autonomic nervous system in dark and light-adapted conditions. During cognitive events, such as speech understanding in noise under ambient light conditions, pupil dilation is influenced by both the SNS, which activates the dilator muscle, and the PNS, whose inhibition relaxes the sphincter muscle (Steinhauer et al., 2004). If the task is more challenging, such as speech comprehension without DIR + NR, increased SNS and reduced PNS activity can be expected. In contrast, when DIR + NR is switched on, the SNR improves, making speech easier to understand, thereby reducing listening effort and resulting in less SNS and more PNS activity. Therefore, the difference in pupil dilation between DIR + NR on and off is more pronounced in ambient light, because both the stimulation of the dilator muscle and the relaxation of the sphincter muscle contribute to the pupil dilation.
In darkness, however, the contribution of the PNS is minimal, and the sphincter muscle is already relaxed, resulting in a larger overall pupil diameter for both DIR + NR on and off. Pupil dilation in response to cognitive events in darkness is solely mediated by the SNS (Steinhauer et al., 2004). Consequently, the difference in pupil dilation between DIR + NR on and off is expected to be smaller in the dark than in ambient light, because only the SNS significantly contributes to the dilation. The absence of a statistically significant difference in PPD between DIR + NR off and on may be attributed to pupil variability of the individual participants (see standard deviation in Figure 4a).
Taken together, one possible interpretation is that the observed reduction of PPD due to DIR + NR in ambient light can be almost exclusively related to an increase in PNS activity, mediated by the relaxation of sphincter muscles. This finding can be set into the context of the findings of Wang et al. (2018b), who investigated pupil dilation in dark and ambient light among NH listeners and HI participants without HAs. Their study demonstrated that the NH group exhibited greater pupil dilation in ambient light compared to the HI group. This difference was attributed to the lower inhibitory effect on the pupil via the PNS pathway in the HI group, indicating higher overall PNS activity (which leads to smaller pupil dilation) in the HI group. In darkness, no significant difference in pupil dilation was observed between the NH and HI groups, suggesting that the effect of SNS activity on the pupil is similar across both groups. The difference in PPD between ambient light and dark conditions was more pronounced for the NH group compared to the HI group. This suggests that the difference between ambient light and dark conditions within a group reflects the extent of the inhibitory effect of PNS activity on the pupil. In other words, the HI group exhibits higher overall PNS activity that was in their study also related to higher levels of overall fatigue, based on questionnaires.
With respect to the findings of the present study, the difference in PPD between DIR + NR switched on and off, however, cannot be related to higher overall PNS activity of one group, because the present study tested the same participants in all conditions. Instead, this difference reflects an
Questionnaires
Our hypothesis, based on the findings of Wang et al. (2018a), was that HA users with higher levels of fatigue should exhibit reduced pupil dilation in ambient light, regardless of usage of DIR + NR. The results from our study showed no significant effect: irrespective of fatigue score, reduced PPD was present with DIR + NR switched on.
Wang et al. (2018a) investigated the relationship between self-reported daily-life fatigue, hearing status, and pupil dilation during a speech-in-noise task. Daily-life fatigue was assessed using the NfR and CIS questionnaires, participants performed a speech-in-noise task to determine the SRT50, and listening effort was measured using the PPD at that SRT50. They found that participants with higher levels of fatigue had smaller pupil dilations. One reason for this could be that more fatigued individuals are less motivated to perform well in the speech-in-noise task, leading to reduced exertion and consequently smaller PPDs. Wang et al.’s (2018a) findings support the idea that fatigue can influence the pupil dilation response, possibly through the involvement of the autonomic nervous system, indicating that fatigue could be a barrier to accurately measuring listening effort via PPDs. A reason for the nonsignificance here may be that average fatigue scores have been used for our participants, who (as volunteers willing to participate in a longitudinal HA study) can be considered relatively high performers compared to other HA users, who may suffer more from fatigue effects. Consequently, the average fatigue score here is also lower than the fatigue scores of other studies. The significant effect observed in Wang et al. (2018a) may thus be due to the fact that their participants were on average more fatigued than ours.
Another reason for the absence of a significant result here may be different evaluation methods and usage of the questionnaires. In Wang et al. (2018a), the NfR questionnaire was used, where participants answered the 11 items by selecting either “Yes” or “No.” The overall NfR score was calculated as the number of “Yes” responses divided by the total number of items, expressed as a percentage (i.e., ranging from 0 to 100). NH participants scored an average of 33.0%, while HI participants scored 45.7%. In the present study, participants answered the 11 items using a four-point rating scale ranging from 0 “never” to 3 “always.” To align the scores of the two evaluation methods of the NfR questionnaire, the total scores from the four-point rating scale were divided by the maximum possible score (33) to obtain a percentage. The overall average score across all participants in the present study was 26.6%. This result is substantially lower than that of Wang et al. (2018a) and may be attributed to the sensitivity of the scales. When comparing the CIS scores, it becomes apparent that participants in previous studies (Wang et al., 2018a,b) also scored higher than those in the present study. Their HI participants achieved an average score of 61.6%, whereas our participants had an average CIS score of 46.5%. This further indicates that their participants were more fatigued than ours.
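The two NfR scoring schemes described above can be illustrated with a minimal sketch. It only restates the arithmetic given in the text (11 items; "Yes" count divided by the number of items, or rating sum divided by the maximum of 33). The function names and the example ratings are hypothetical and are not taken from the study data or from any analysis script of the authors.

```python
from typing import Sequence

NFR_ITEMS = 11  # number of NfR items, as stated in the text


def nfr_percent_yes_no(answers: Sequence[bool]) -> float:
    """Yes/No scoring: share of 'Yes' answers in percent (0 to 100)."""
    if len(answers) != NFR_ITEMS:
        raise ValueError(f"expected {NFR_ITEMS} answers, got {len(answers)}")
    return 100.0 * sum(answers) / NFR_ITEMS


def nfr_percent_rating(ratings: Sequence[int], max_rating: int = 3) -> float:
    """Rating-scale scoring: summed ratings (0 'never' to 3 'always')
    divided by the maximum possible sum (11 * 3 = 33), in percent."""
    if len(ratings) != NFR_ITEMS:
        raise ValueError(f"expected {NFR_ITEMS} ratings, got {len(ratings)}")
    if any(r < 0 or r > max_rating for r in ratings):
        raise ValueError("each rating must lie between 0 and max_rating")
    return 100.0 * sum(ratings) / (NFR_ITEMS * max_rating)


# Hypothetical participant (illustration only, not study data)
example_ratings = [1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0]
print(f"NfR score: {nfr_percent_rating(example_ratings):.1f}%")
```

Both functions return a value between 0 and 100, but the two scales are not strictly equivalent: a participant answering "sometimes" on every item contributes about 33% on the rating scale, whereas the same participant might answer either "Yes" or "No" on the binary scale. This is one way in which the sensitivity of the scales could contribute to the lower average score.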
Reliability of Pupil Dilation in Darkness
The nonsignificant difference in pupil dilation between DIR + NR on and off in darkness raises questions about the reliability of the recorded pupil dilation data in darkness. While pupillometry in darkness has been performed in several previous studies (e.g., Ohlenforst et al., 2017b; Steinhauer et al., 2004; Wang et al., 2018b; Zhang et al., 2022), the amount and quality of usable data may be indicative of its reliability. In the present study, the amount of usable data in darkness (71.5% on average) is slightly smaller than that in ambient light (79.9%). However, the amount of data is still comparable to studies in dim light (e.g., Fiedler et al. (2021) had a minimum of 71% available trials and Peysakhovich et al. (2017) had an average of 69% available trials). Therefore, the amount of evaluated data in darkness is deemed to be sufficient. Also, we found the quality of the data from the Pupil Labs Core setup to be good. Pupil Labs Core measures pupil diameter in pixel units and transforms it to pupil diameter in mm using its 3D eye model. Investigating the raw pixel data instead of the postprocessed millimeter data revealed the same statistical effect for the primary objective of this study: A two-way ANOVA revealed a significant main effect of DIR + NR (F(1,25) = 10.78,
Another objection could be whether the pupil was fully adapted to the light conditions during the measurement and whether there was sufficient headroom in pupil size to detect an effect between DIR + NR on and off in darkness. However, since there were at least 3 min between setting the illumination within the room and the start of the measurement, it is very likely that the pupil was fully adapted to the light conditions during the measurement, because 30 s have been suggested to be sufficient (Pan et al., 2022). It is furthermore unlikely that the lack of difference in PPD between DIR + NR on and off in darkness was due to saturation of the pupil dilation in dark conditions. To investigate this, we extracted the absolute pupil diameter of all participants in darkness. With a median of 5 mm and a maximum of 6.5 mm, these are much lower than the diameters reported by Steinhauer et al. (2004, their Figure 1), who showed baseline diameters in darkness of 7 mm and response diameters of 7.5 mm. Even after accounting for an age-related reduction in pupil diameter (Winn et al., 1994, suggest a decline of 0.043 mm per year), our median pupil diameter remains over 1 mm smaller than the response diameters reported by Steinhauer et al. (2004). This leaves sufficient margin for a potential maximal effect of 0.14 mm (see our Figure 4a). Just one single participant (out of our 26) had a diameter of about 10 mm and we cannot rule out a saturation in this single participant. Therefore, although not likely, it still cannot be ruled out with certainty that saturation effects occurred in single participants.
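The headroom argument above can be written out as a short calculation. The sketch below uses only the values quoted in the text (response diameter of 7.5 mm from Steinhauer et al., 2004; decline of 0.043 mm per year from Winn et al., 1994; median diameter of 5 mm; maximal effect of 0.14 mm). The age gap between the two study populations is not given in the text, so the value of 30 years is a purely hypothetical placeholder, and the linear age correction is a simplifying assumption.

```python
# Values quoted in the text
WINN_DECLINE_MM_PER_YEAR = 0.043  # age-related decline (Winn et al., 1994)
STEINHAUER_RESPONSE_MM = 7.5      # response diameter in darkness (Steinhauer et al., 2004)
MEDIAN_DARK_DIAMETER_MM = 5.0     # median diameter in darkness, present study
MAX_EXPECTED_EFFECT_MM = 0.14     # potential maximal DIR + NR effect (Figure 4a)


def dilation_headroom_mm(
    age_gap_years: float,
    observed_mm: float = MEDIAN_DARK_DIAMETER_MM,
    reference_mm: float = STEINHAUER_RESPONSE_MM,
    decline_mm_per_year: float = WINN_DECLINE_MM_PER_YEAR,
) -> float:
    """Remaining margin between an age-adjusted reference diameter and the
    observed diameter, assuming a linear decline with age."""
    age_adjusted_reference = reference_mm - decline_mm_per_year * age_gap_years
    return age_adjusted_reference - observed_mm


# Hypothetical age gap between the two study populations (assumption, not from the text)
HYPOTHETICAL_AGE_GAP_YEARS = 30.0

headroom = dilation_headroom_mm(HYPOTHETICAL_AGE_GAP_YEARS)
print(f"Headroom: {headroom:.2f} mm")
print(f"Exceeds maximal expected effect: {headroom > MAX_EXPECTED_EFFECT_MM}")
```

Under this assumed age gap, the margin stays well above the 0.14 mm effect. A larger age gap or a participant with an unusually large diameter would shrink the margin, which corresponds to the caveat about saturation in single participants.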
Future studies should also collect directional eye-gaze information to resolve whether participants were fixating their gaze in the respective condition, which may be an indicator whether participants were distracted within the dark or the ambient light condition.
Significance for Clinical Practice
The assessment of objective markers of listening effort in persons with hearing impairment and rehabilitative devices is an important and growing field in audiology research. Our results support evidence from other studies that noise reduction significantly impacts the pupil dilation response in ambient light, serving as an objective marker for listening effort. Our data furthermore indicate that this effect is dependent on the light condition, with no effect of NR in darkness. The absence of a difference between PPDs with DIR + NR on and off in darkness suggests that luminance levels should be carefully considered in such studies. This missing difference may point to contributions from the autonomic nervous system, but it does not necessarily imply that participants experience less listening effort in darkness or do not benefit from DIR + NR. Instead, it might indicate that sympathetic and parasympathetic activities are differentially involved in light and dark responses. Although the interaction of PNS and SNS activity cannot be fully examined with the current study, it is important to recognize these differences. To further study the role of PNS activity measured by pupillometry, it would be interesting to incorporate additional measurements sensitive to autonomic nervous system activity, such as heart rate variability or skin conductance.
For clinical applications, it is crucial to recognize that luminance can mitigate the pupil effect when used as a marker of effort. Therefore, it is recommended that luminance be carefully controlled when designing experiments to study listening effort, particularly in clinical environments where there may be limited control over lighting.
Conclusions
The present study investigated the effect of state-of-the-art DIR + NR in commercially available hearing aids on speech intelligibility and PPD as an ocular marker of listening effort both in dark and ambient light conditions. The following conclusions can be drawn:
Speech intelligibility significantly increased due to DIR + NR from the baseline of about 50% both in darkness and ambient light by the same amount of about 35 percentage points. This increase may be largely attributable to improvements of SNR due to the DIR + NR algorithm.
PPDs were reduced significantly due to DIR + NR only in the ambient light condition, but not in darkness. While the result in ambient light is in line with the notion of reduction of listening effort due to DIR + NR known from other studies, the result of the measurement in darkness is novel.
There were no significant correlations of either PPD or the change in PPD due to DIR + NR with age, pure-tone average, or fatigue score. The absence of a correlation with fatigue score may be related to the relatively active and less fatigued participants recruited for the present study.
Supplemental Material
Supplemental material, sj-docx-1-tia-10.1177_23312165251336652 for Influence of Noise Reduction on Ocular Markers of Listening Effort in Hearing Aid Users in Darkness and Ambient Light by Jessica Herrmann, Lorenz Fiedler, Dorothea Wendt, Sébastien Santurette, Hendrik Husstedt and Tim Jürgens in Trends in Hearing
Acknowledgments
The authors would like to thank all participants for their valuable time.
Ethical Considerations
All participants gave written informed consent before participating. Ethical approval was granted by the Ethics Committee of the University of Applied Sciences Lübeck (approval 311.012.17 from May 08, 2020).
Funding
The authors acknowledge funding from William Demant Foundation (20-2461) and the Institute of Acoustics, Technische Hochschule Lübeck.
Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data Availability
Data will be made available upon reasonable request.
