Abstract
During sound lateralization, the information provided by interaural differences in time (ITD) and level (ILD) is weighted, with ITDs and ILDs dominating for low and high frequencies, respectively. For mid frequencies, the weighting between these binaural cues can be changed via training. The present study investigated whether binaural-cue weights change gradually with increasing frequency region, whether they can be changed in various frequency regions, and whether such binaural-cue reweighting generalizes to untrained frequencies. In two experiments, a total of 39 participants lateralized 500-ms, 1/3-octave-wide noise bursts containing various ITD/ILD combinations in a virtual audio-visual environment. Binaural-cue weights were measured before and after a 2-session training in which, depending on the group, either ITDs or ILDs were visually reinforced. In experiment 1, four frequency bands (centered at 1000, 1587, 2520, and 4000 Hz) and a multiband stimulus comprising all four bands were presented during weight measurements. During training, only the 1000-, 2520-, and 4000-Hz bands were presented. In experiment 2, the weight measurements only included the two mid-frequency bands, while the training only included the 1587-Hz band. ILD weights increased gradually from low- to high-frequency bands. When ILDs were reinforced during training, ILD weights increased for the 4000-Hz band (experiment 1) and the 2520-Hz band (experiment 2). When ITDs were reinforced, ITD weights increased only for the 1587-Hz band (at specific azimuths). This suggests that ILD reweighting requires high frequencies, whereas ITD reweighting requires low frequencies that do not include regions providing fine-structure ITD cues. The changes in binaural-cue weights were independent of the trained bands, suggesting some generalization of binaural-cue reweighting.
Introduction
Spatial hearing is an important ability of the auditory system, as it allows the localization of sound sources and improves speech understanding in complex environments. This ability relies on the integration of information provided by different auditory cues (for a recent review on sound localization cues, see Stecker & Gallun, 2012). For azimuthal sound localization (i.e., in the horizontal dimension, referred to as lateralization for in-head localization judgements), which is the focus of this study, humans rely primarily on two binaural cues, namely interaural differences in time (ITD) and level (ILD).
Psychophysical and physiological studies (e.g., Grothe et al., 2010; Henning, 1980) have revealed that ITD cues are available in the temporal fine structure at low carrier frequencies (up to approx. 1.4 kHz) and in the temporal envelope at higher carrier frequencies. ILD cues, on the other hand, are much more pronounced at high frequencies, due to the frequency dependence of the head shadow (e.g., Middlebrooks & Green, 1991). Consequently, there are disparate cues available at low (mainly ITDs) and high (mainly ILDs) frequencies, a phenomenon known as the duplex theory of sound localization (Strutt, 1907). To establish the perceived azimuth of a sound source, the auditory system weights the information provided by the binaural cues. Given that the information carried by ITD and ILD cues varies with frequency, it is an important question to what extent these binaural-cue weights depend on the sound's frequency content. In fact, studies have shown that ITDs dominate for broadband sounds and at low frequencies, while ILDs dominate at higher frequencies (Ahrens et al., 2020; Macpherson & Middlebrooks, 2002; Wightman & Kistler, 1992). It is unclear, however, whether this binaural-cue weighting gradually changes with increasing frequency or whether it abruptly switches from being ITD dominant to being ILD dominant. Macpherson and Middlebrooks (2002) investigated the effect of delaying or attenuating the signal at one ear (while keeping the other localization cues intact) on localization of low-pass (0.5–2 kHz), high-pass (4–16 kHz), or wideband (0.5–16 kHz) stimuli, but did not test intermediate frequency bands. Ahrens et al. (2020) presented multiband stimuli consisting of up to 11 frequency bands which varied independently in either ITD or ILD while fixing the other cue at zero. 
They observed divergence for the lowest and highest frequency bands (i.e., stronger ITD weights for the lowest and stronger ILD weights for the highest band) but constant weights for in-between bands. This simultaneous presentation may, however, not show the weighting of each frequency band individually and instead be affected by simultaneous grouping effects, particularly binaural interference (Ahrens et al., 2020). The diverging weights for the lowest and highest bands appear to reflect an “edge effect” (i.e., stronger weighting of information at the edges), because the same pattern was observed in different edge bands when only intermediate frequency bands were presented (“removed” condition, Ahrens et al., 2020). Therefore, we were interested in whether binaural-cue weights measured similarly to the method of Macpherson and Middlebrooks (2002), that is, by presenting each frequency band in isolation, show a pattern comparable to that of Ahrens et al. (2020) or whether they gradually change with increasing frequency. Such a gradual change would be more in line with the information carried by the binaural cues as discussed above as well as with previous literature on the basic sensitivity to ITD and ILD cues, assuming higher sensitivity (i.e., a lower detection threshold) correlates with stronger weighting of the respective cue. ILD thresholds tend to gradually improve between 1000 and 4000 Hz with increasing center frequency of narrow-band noise stimuli (Gabriel et al., 1992; Goupell & Stakhovskaya, 2018). In contrast, ITD thresholds worsen in a similar frequency range with increasing center frequency (Buchholz et al., 2018; Gabriel et al., 1992; Klumpp & Eady, 1956; Smoski & Trahiotis, 1986; Trahiotis & Bernstein, 1990). Note that the noise bands provide temporal fine-structure ITDs at lower frequencies and temporal envelope ITDs at higher frequencies.
Therefore, if binaural cue weights are related to binaural cue thresholds, one would expect a gradual change in binaural-cue weighting with increasing frequency.
In addition to frequency, the binaural-cue weighting is influenced by other stimulus properties such as the overall intensity (David et al., 1959; Deatherage & Hirsh, 1959), the inter-click interval of click trains (Stecker, 2010), or the presence of reverberation (Rakerd & Hartmann, 2010) and shows substantial variation across listeners (Klingel et al., 2021; Klingel et al., 2020; Macpherson & Middlebrooks, 2002). Such a dependence on stimulus, environmental, and personal properties is not surprising, given that listeners adapt their processing of sound localization cues when exposed to cue alterations (see Carlile, 2014, for a review). Such adaptation can either be mediated by remapping (i.e., learning a new relationship between sound localization cues and corresponding locations in space; e.g., Shinn-Cunningham et al., 1998) or reweighting (i.e., a stronger relative weighting of unaltered or reliable cues compared to altered or unreliable cues). Such reweighting has been shown for monaural (spectral-shape) cues (resulting from the directional filtering of the outer ears) relative to binaural cues for horizontal sound localization (Keating et al., 2013; Kumpik et al., 2010; Van Wanrooij & van Opstal, 2007). Usually, monaural cues do not contribute to horizontal sound localization if binaural cues are available (Macpherson & Middlebrooks, 2002; Slattery & Middlebrooks, 1994).
Additionally, three recent studies have demonstrated reweighting of binaural cues. Kumpik et al. (2019) observed an increase in ILD weighting after the ITDs of broadband noise stimuli were randomized during the completion of a visual task, but no corresponding increase in ITD weighting after ILDs were randomized. Klingel et al. (2020) also observed an increase in ILD weighting after participants received response feedback consistent with the ILDs of a mid-frequency (2–4 kHz) narrow-band noise stimulus in an auditory discrimination task, but did not test a group receiving feedback consistent with the ITDs. Finally, Klingel et al. (2021) observed a comparably strong increase in either the ITD or the ILD weighting of the same 2–4-kHz noise stimulus, depending on which cue was visually reinforced during a lateralization task in a virtual audio-visual environment.
In summary, there is evidence that listeners can adjust their weighting of sound localization cues, and particularly their binaural-cue weighting. It is, however, still unclear whether such binaural-cue reweighting can be achieved for low- or high-frequency stimuli, for which either ITDs or ILDs are known to dominate. If the baseline weight of the reinforced cue is already strong, there might not be room for a further increase (i.e., ceiling effects). On the other hand, if the baseline weight of the reinforced cue is low, the dominance of the other cue might limit the access to the reinforced cue, thus preventing binaural-cue reweighting. Finally, we were interested in whether such reweighting is specific to the trained frequency band or whether it generalizes to untrained stimuli. Wright and Fitzgerald (2001), for example, observed different generalization patterns for ITD and ILD sensitivity training.
In two experiments, the present study investigates the pattern of baseline binaural-cue weights across different frequency regions (i.e., whether weights gradually or abruptly change from ITD- to ILD-dominant with increasing frequency), whether the weighting can be changed via visual reinforcement for different frequencies, and whether such binaural-cue reweighting generalizes to untrained stimuli. Experiment 1 addressed all of these questions, but open questions remained about the potential roles of the stimulus’ azimuthal range, the presentation mode of frequency bands (randomized vs. blocked), and the frequency range required for reweighting. These questions were addressed in experiment 2.
Experiment 1
The aim of this experiment was to test whether binaural-cue weights, measured similarly to Macpherson and Middlebrooks (2002), show gradual weight changes across different frequency regions or whether they show divergence for the lowest and highest frequency bands but constant weights in between, similar to the results reported by Ahrens et al. (2020). We further investigated whether binaural-cue reweighting can be induced for different frequency regions using the paradigm of Klingel et al. (2021; see description of methods below). Finally, we were interested in whether binaural-cue reweighting is specific to the trained frequency band or whether it generalizes to untrained frequency bands or to a broadband stimulus.
Methods
Binaural auditory stimuli were generated using a computer and output via a digital audio interface (ADI-8, RME) at a 48-kHz sampling rate and presented via headphones (HD 580, Sennheiser). They were band-pass filtered white noise bursts (1/3 octave wide), randomly generated on each trial. Four frequency bands were used: a low-frequency band centered at 1000 Hz (low, 793.7–1259.9 Hz), a mid-low-frequency band centered at 1587.4 Hz (mid-low, 1414.2–1781.8 Hz), a mid-high-frequency band centered at 2519.8 Hz (mid-high, 2244.9–2828.4 Hz), and a high-frequency band centered at 4000 Hz (high, 3563.6–4489.8 Hz). These stimuli were chosen to have equally spaced frequency bands while minimizing physical and perceptual overlap between bands in a similar frequency range to Ahrens et al.’s (2020) “removed” condition. We did not test the full range presented in Ahrens et al. (2020) to keep the experimental time reasonable, since we presented the frequency bands sequentially. Additionally, a stimulus comprising all four bands (multiband) was used. The stimulus duration was 500 ms, including 50-ms raised-cosine on/off ramps. The mean overall sound pressure level (SPL) was 63 dB for an ILD of zero. For the multiband condition, the individual bands were equalized in level. To discourage participants from using differences in the absolute level rather than ILDs for lateralization, the overall level was roved randomly from trial to trial within a ±2.5-dB range. The stimuli were not filtered with head-related transfer functions (HRTFs). This was to ensure that they did not convey monaural spectral localization cues that are potentially informative about the stimulus azimuth (e.g., Hebrank & Wright, 1974), which might confound the binaural-cue weight estimation (although the narrow bandwidth and frequency range of the stimuli would have made this unlikely). Instead, ITDs ranging from −396 to +396 μs and ILDs ranging from −14.95 to +14.95 dB were imposed on these source stimuli.
ITDs that matched azimuths ranging from −45° to +45° with a 6° spacing were determined based on Xie’s (2013) estimation using the HRTFs of the KEMAR head with DB-61 small pinna at a source distance of 1.4 m. In Xie (2013), ITD values were obtained via broadband cross-correlation of the left and right ear head-related impulse responses (HRIRs). ILDs matching the same azimuths were determined individually for each frequency band based on the mean HRTF magnitudes at the respective center frequencies of four participants who did not take part in this study (taken from the HRTF database of the Acoustics Research Institute). For the multiband stimulus, the ILDs determined for each frequency band were used for the respective part of the stimulus. The azimuth range of ±45° was chosen to ensure monotonically increasing ILDs with increasing azimuths for all frequency bands. Figure 1 shows the dependence of ITDs (panel a) and ILDs (panel b) on azimuth as determined according to the described methods, with the used binaural cues marked by symbols up to the dashed black line (symbols beyond that line refer to experiment 2).
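The level-related stimulus parameters described above (48-kHz sampling rate, 500-ms duration, 50-ms raised-cosine ramps, ±2.5-dB level rove, ILDs of up to ±14.95 dB) can be illustrated with a minimal Python sketch. It is not the authors' implementation: band-pass filtering and the ITD are omitted, the ILD is assumed to be split symmetrically across the two ears, positive ILDs are assumed to favor the right ear, and all function names are hypothetical.

```python
import math
import random

FS = 48_000      # sampling rate in Hz (from the text)
DUR_S = 0.5      # stimulus duration in s (from the text)
RAMP_S = 0.05    # raised-cosine on/off ramp duration in s (from the text)


def raised_cosine_envelope(n_total, n_ramp):
    """Envelope that rises from 0 to 1 and falls back with raised-cosine ramps."""
    env = [1.0] * n_total
    for i in range(n_ramp):
        gain = 0.5 * (1.0 - math.cos(math.pi * i / n_ramp))
        env[i] = gain                  # onset ramp
        env[n_total - 1 - i] = gain    # offset ramp (mirror image)
    return env


def ild_gains(ild_db):
    """Linear gains (left, right) for a given ILD in dB.

    Assumption: the ILD is split symmetrically between the ears and
    positive values favor the right ear.
    """
    half = ild_db / 2.0
    return 10.0 ** (-half / 20.0), 10.0 ** (half / 20.0)


def make_burst(ild_db, rove_db=2.5, rng=random):
    """Return (left, right) sample lists of a ramped noise burst with an ILD."""
    n_total = int(FS * DUR_S)
    n_ramp = int(FS * RAMP_S)
    env = raised_cosine_envelope(n_total, n_ramp)
    # Overall level rove, drawn uniformly in dB on every trial.
    rove = 10.0 ** (rng.uniform(-rove_db, rove_db) / 20.0)
    # Fresh white noise on every trial (band-pass filtering omitted here).
    noise = [rng.gauss(0.0, 1.0) for _ in range(n_total)]
    g_left, g_right = ild_gains(ild_db)
    left = [rove * g_left * e * x for e, x in zip(env, noise)]
    right = [rove * g_right * e * x for e, x in zip(env, noise)]
    return left, right


def rms(samples):
    """Root-mean-square amplitude of a list of samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


left, right = make_burst(ild_db=14.95)
measured_ild = 20.0 * math.log10(rms(right) / rms(left))
print(f"{len(left)} samples per ear, measured ILD = {measured_ild:.2f} dB")
```

Because both ears receive the same noise token and the rove is applied to both channels, the rove changes the overall level but leaves the ILD untouched, which is the property the roving is meant to exploit.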

Figure 1. Experimental setup and stimuli. Panels a) and b) show the functional relation between the azimuth and the binaural cues. ITDs were derived by Xie (2013) based on broadband cross-correlation of the left and right ear head-related impulse responses (HRIRs) of the KEMAR head. ILDs are based on the mean HRTF magnitudes at the respective center frequency of four new participants. In experiment 1, only binaural cues up to the black dashed line were used. Panel c) shows all the ITD/ILD-azimuth combinations used in the pre- and posttest. In experiment 1, only yellow combinations were used. In experiment 2, both yellow and blue combinations were included. The frame indicates the azimuthal offsets Δ ITD and Δ ILD that were used to estimate the model parameters for the pre-/posttest data at one example azimuth (9°). Panel d) shows all cue combinations used in the training. For the ITD group, reinforced and unreinforced cues were ITD and ILD, respectively, and for the ILD group, reinforced and unreinforced cues were ILD and ITD, respectively.
ITDs and ILDs were combined into “consistent-cue” and “inconsistent-cue” conditions. In consistent-cue conditions, the ITD and ILD cue of the auditory stimulus corresponded to the same azimuth, while they corresponded to disparate azimuths in inconsistent-cue conditions. Cue disparities (i.e., the difference between the azimuths corresponding to the binaural cues within each stimulus) were restricted to a maximum value of 24° to avoid the perception of split images, which can occur in case of large cue disparities (Gaik, 1993). During training, either the ITD or the ILD cue was visually reinforced as described below. Reinforced-cue azimuths ranged from −21° to +21° (with 6° spacing) and azimuths corresponding to the unreinforced cue were uniformly distributed ±24° (also with 6° spacing) around each reinforced-cue azimuth (Figure 1d in yellow). By symmetrically varying the unreinforced-cue azimuth around each reinforced-cue azimuth [resulting in a larger range of unreinforced-cue azimuths (±45°) than reinforced-cue azimuths (±21°)], the reinforced cue was more stable, which might encourage reweighting in addition to the visual reinforcement (Dahmen et al., 2010). In the pre- and posttest, in which neither of the two cues was reinforced, azimuths in both the ITD and ILD dimension were uniformly distributed ±24° around each azimuth from −21° to +21° (Figure 1c in yellow).
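The training grid described above can be enumerated in a few lines of Python. This is a sketch with hypothetical names, not the authors' code. It pairs each reinforced-cue azimuth with every unreinforced-cue offset and checks the stated constraints: a maximum disparity of 24° and an unreinforced-cue range of ±45°.

```python
# Azimuths in degrees, as described in the text for experiment 1.
REINFORCED_AZIMUTHS = list(range(-21, 22, 6))   # -21, -15, ..., 21
OFFSETS = list(range(-24, 25, 6))               # -24, -18, ..., 24
TRAINED_BANDS = ["low", "mid-high", "high"]     # bands presented during training


def training_grid():
    """All (reinforced azimuth, unreinforced azimuth) pairs, in degrees."""
    return [(a, a + d) for a in REINFORCED_AZIMUTHS for d in OFFSETS]


grid = training_grid()
max_disparity = max(abs(r - u) for r, u in grid)
unreinforced_range = (min(u for _, u in grid), max(u for _, u in grid))
trials_per_block = len(grid) * len(TRAINED_BANDS)

print(len(grid), max_disparity, unreinforced_range, trials_per_block)
# prints: 72 24 (-45, 45) 216
```

The 72 cue combinations, each presented once for the three trained bands, give 216 trials, which matches the block length reported for the training sessions.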
Time course (top to bottom) for each experiment. The tested frequency bands are shown in parentheses. Tasks involving visual reinforcement are shown in italics.

Time course of a trial during the pre- and posttest (panels 1-2) and the practice session as well as the training (panels 1-6). 1) Participants oriented towards the reference position (indicated by a red sphere) and pressed a button to elicit the sound presentation. 2) Participants turned their head (guiding a green crosshair) to the perceived azimuth and pressed the button (in this example, they turned their head to the left). 3) Visual reinforcement (a rotating red cube) appeared at the reinforced-cue azimuth. 4) The reinforced-cue azimuth was confirmed via a head-turn to the visual reinforcement and a button-press. 5) The visual reinforcement turned green, participants returned to the reference position, and elicited the second sound presentation (while the visual reinforcement was still visible) with another button-press. 6) Participants confirmed the reinforced-cue azimuth again via another head-turn and button-press.
Auditory stimuli included both inconsistent and consistent ITD/ILD combinations, as shown in Figure 1d in yellow. The training procedure was the same for the two groups except for which cue was visually reinforced and presented in a limited azimuth range. Thus, for the ITD group, ITD azimuths did not exceed ±21° and for the ILD group, ILD azimuths did not exceed ±21°. Each training session consisted of 432 trials presented in two blocks of 216 trials each. Within each block, trials were presented in a random order and each ITD/ILD combination shown in Figure 1d in yellow was presented once for the low, mid-high, and high bands. We chose the low and high bands for training to test for reweighting potential in frequency regions where one of the cues clearly dominates. The mid-high band was chosen because it was close in center frequency to the stimuli used in Klingel et al. (2021), for which both ITD and ILD reweighting was shown. The mid-low and multiband conditions were not presented during training so that we could test whether the training effects generalize to them. After every 72 trials, participants took a short break.
where $R_{\mathrm{ITD}}$ ($R_{\mathrm{ILD}}$) is the participant's mean response azimuth in a trial for which the ILD (ITD) corresponded to azimuth $\alpha$ and the ITD (ILD) corresponded to azimuth $\alpha + \Delta_{\mathrm{ITD}}$ ($\alpha + \Delta_{\mathrm{ILD}}$). That is, $\Delta_{\mathrm{ITD}}$ and $\Delta_{\mathrm{ILD}}$ are not values in μs or dB but refer to the azimuth difference between $\alpha$ and the azimuth corresponding to the other cue. The parameters
The data were analyzed using MATLAB R2018b (The MathWorks, Natick, MA). Statistical analyses were performed using SPSS Statistics 20 (IBM, Armonk, NY).
Results

Pretest ILD weights (normalized, i.e., ILD weight = 1 – ITD weight) for each frequency band, averaged across azimuths. Blue circles show the results of experiment 1 and red triangles show the results of experiment 2. The averaged azimuths were restricted to the range that was tested in both experiments (3°-21°). Error bars show the standard error of the mean. ILD weights gradually and significantly increase for single-band conditions from low (794-1260 Hz), to mid-low (1414-1782 Hz), to mid-high (2245-2828 Hz), to high (3564-4490 Hz). These conditions are connected by a line. The multiband condition (multi, plotted as a separate data point) shows ILD weights similar to the mid-low condition.
To test whether ILD weights change gradually from low- to high-frequency bands, we ran a 5 (frequency band) x 4 (azimuth) ANOVA across the pretest data (to exclude any possible effects of training) of all participants. The ANOVA yielded a significant main effect of frequency band (F(4,72) = 143.91, p < .001, ηp2 = .889). Follow-up pairwise comparisons showed significant differences in the ILD weight between all bands (all p < .008, Bonferroni-corrected) except between the mid-low and the multiband conditions (p > .999, Bonferroni-corrected). ILD weights gradually increased from the low (M = .195, SEM = .030) to mid-low (M = .368, SEM = .040) to mid-high (M = .722, SEM = .015) to the high band (M = .796, SEM = .019). The multiband condition (M = .336, SEM = .026) showed ILD weights similar to the mid-low band.
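For readers who want to relate the reported effect sizes to the test statistics, partial eta squared (ηp2) can be recovered from an F ratio and its degrees of freedom via the standard identity ηp2 = F·df_effect / (F·df_effect + df_error). The following Python sketch (function name hypothetical) applies it to the main effect of frequency band reported above.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Main effect of frequency band: F(4,72) = 143.91
eta_band = partial_eta_squared(143.91, 4, 72)
print(round(eta_band, 3))
# prints: 0.889
```

The same identity holds for Greenhouse-Geisser-corrected tests when the corrected (fractional) degrees of freedom are used, because the correction scales both degrees of freedom by the same factor.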
While the pretest ILD weight averaged across the four narrow bands was close to 0.5 (M = .486, SEM = .019), it was significantly smaller than 0.5 for the multiband (M = .337, SEM = .026; t(18) = −6.17, p < .001, two-tailed, dZ = −1.42). This suggests that the low (i.e., ITD dominant) frequencies received more weight in the multiband stimulus.
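The effect size dZ reported for this one-sample t-test is related to the t statistic by dZ = t / √n, with n = df + 1 participants. A minimal Python sketch (function name hypothetical) reproduces the reported value from the reported t statistic.

```python
import math


def cohens_dz_from_t(t_value, df):
    """Cohen's d_z for a one-sample (or paired) t-test, from t and its df."""
    n = df + 1  # number of participants
    return t_value / math.sqrt(n)


# Multiband ILD weight tested against 0.5: t(18) = -6.17
dz_multiband = cohens_dz_from_t(-6.17, 18)
print(round(dz_multiband, 2))
# prints: -1.42
```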
The ANOVA additionally yielded a significant main effect of azimuth (F(3,54) = 3.71, p = .017, ηp2 = .171). Follow-up pairwise comparisons showed significant differences in the ILD weight between 3° and 21° azimuth (p = .015, Bonferroni-corrected) with larger ILD weights at 3° (M = .504, SEM = .019) compared to 21° azimuth (M = .458, SEM = .024). This suggests that ILD weights were larger at central compared to lateral azimuths. Figure 4 depicts the detailed patterns of ILD weights across azimuths, separately for each band and group (blue circles denote data from experiment 1). Note how in most panels, the blue lines with filled circles (pretest results) trend downwards (indicating lower ILD weights) from central (left on the x-axis) to lateral (right on the x-axis) azimuths.

Figure 4. ILD weights (normalized, i.e., ILD weight = 1 – ITD weight) as a function of azimuth for each frequency band (columns) and group (rows). The top row shows results of the ITD groups (i.e., participants for whom ITDs were reinforced during training) and the bottom row shows results of the ILD groups (i.e., participants for whom ILDs were reinforced during training). Solid lines and filled symbols show pretest results, dashed lines and open symbols show posttest results. Blue circles show the results of experiment 1 and red triangles show the results of experiment 2. Error bars show the standard error of the mean. ILD weights gradually increase from low-, to mid-low-, to mid-high-, to high-frequency stimuli and on average are lower for lateral azimuths. Significant weight changes, indicated by the asterisks in the respective colors (* denotes p < .05, ** denotes p < .01), from pre- to posttest were observed for the ILD group in experiment 1 for the high-frequency band as well as in experiment 2 for the mid-high-frequency band. Additionally, a significant weight change from pre- to posttest was observed for the ITD group in experiment 1 for the mid-low-frequency band at 9° azimuth as well as in experiment 2 at 39° azimuth.
There was no significant frequency band x azimuth interaction (F(5.38,96.87) = 1.56, p = .175, ηp2 = .080, Greenhouse-Geisser-corrected).
In the ITD group, a 2 (pre- vs. posttest) x 4 (azimuth) repeated measures (RM) ANOVA yielded no significant effects for the low band (all p > .199).
For the mid-low band, the 2 × 4 RM ANOVA yielded a significant main effect of azimuth (F(3,27) = 6.70, p = .002, ηp2 = .427) as well as a significant time x azimuth interaction (F(3,27) = 4.92, p = .007, ηp2 = .353), but no significant main effect of time. Follow-up pairwise comparisons showed that the time x azimuth interaction was driven by a significant decrease of ILD weights at 9° azimuth (p = .022, Bonferroni-corrected), but no difference between time points at other azimuths (all p > .532, Bonferroni-corrected). This suggests that participants reweighted the binaural cues in the expected direction, but only at 9° azimuth. The follow-up pairwise comparisons further showed significant differences in the ILD weights between 3° and 21° azimuth (p = .005, Bonferroni-corrected), as well as between 9° and 21° azimuth (p = .049, Bonferroni-corrected). ILD weights at 21° azimuth (M = .286, SEM = .058) were lower compared to 3° (M = .422, SEM = .049) and 9° azimuth (M = .435, SEM = .051). This suggests that ILD weights were larger at central compared to lateral azimuths.
For the mid-high band, the 2 × 4 RM ANOVA yielded no significant effects (all p > .219).
For the high band, the 2 × 4 RM ANOVA yielded a significant main effect of azimuth (F(3,27) = 6.33, p = .002, ηp2 = .413) as well as a significant time x azimuth interaction (F(3,27) = 3.08, p = .044, ηp2 = .255), but no significant main effect of time. Follow-up pairwise comparisons showed significant differences in the ILD weight between 3° and 21° azimuth (p = .017, Bonferroni-corrected) with lower ILD weights at 3° azimuth (M = .796, SEM = .021) compared to 21° azimuth (M = .878, SEM = .024). In contrast to the other conditions, this suggests larger ILD weights at lateral compared to central azimuths. There were no significant weight changes from pre- to posttest at any azimuth (all p > .107, Bonferroni-corrected).
For the multiband condition, the 2 × 4 RM ANOVA yielded no significant effects (all p > .086).
In the ILD group, the 2 × 4 RM ANOVA for the low band yielded a significant main effect of azimuth (F(3,24) = 4.25, p = .015, ηp2 = .347), but neither a significant main effect of time nor a time x azimuth interaction. Follow-up pairwise comparisons showed significant differences in the ILD weight between 3° and 21° azimuth (p = .034, Bonferroni-corrected) with higher ILD weights at 3° azimuth (M = .239, SEM = .048) compared to 21° azimuth (M = .149, SEM = .033). This again suggests that ILD weights were larger at central compared to lateral azimuths.
The 2 × 4 RM ANOVA yielded no significant effects for either the mid-low band (all p > .200) or the mid-high band (all p > .289).
For the high band, the 2 × 4 RM ANOVA yielded a significant main effect of time (F(1,8) = 21.67, p = .002, ηp2 = .730) with smaller ILD weights in the pretest (M = .756, SEM = .020) compared to the posttest (M = .847, SEM = .029), but neither a significant main effect of azimuth nor time x azimuth interaction. This suggests that participants reweighted the binaural cues in the expected direction.
For the multiband condition, the 2 × 4 RM ANOVA yielded no significant effects (all p > .090).

Predicted responses for consistent-cue combinations by the regression analysis (i.e., factor Q) as a function of azimuth. Solid lines show pretest results and dashed lines show posttest results. Error bars show the standard error of the mean. Lateralization slopes are steeper for higher frequencies. Experiment 1 shows response compression from pre- to posttest (i.e., lateralization slopes are shallower in the post- compared to the pretest) while in experiment 2, responses are already compressed in the pretest and therefore no difference between time points is observed.
We then submitted the resulting lateralization slopes of each participant to a 5 (frequency band) x 2 (pre- vs. posttest) RM ANOVA. The ANOVA yielded significant main effects of frequency band (F(2.77,49.81) = 39.24, p < .001, ηp2 = .686, Greenhouse-Geisser-corrected) and time (F(1,18) = 39.76, p < .001, ηp2 = .688), with steeper lateralization slopes in the pretest (M = 1.08, SEM = 0.04) compared to the posttest (M = 0.86, SEM = 0.05). Additionally, there was a significant frequency band x time interaction (F(4,72) = 8.93, p < .001, ηp2 = .332). Follow-up pairwise comparisons showed significant differences between the pre- and posttest for all frequency bands (all p < .002, Bonferroni-corrected). This suggests a compression of responses from pre- to posttest. They further showed significant differences between all frequency bands (all p < .003, Bonferroni-corrected) except between the multiband and the mid-low (p = .286, Bonferroni-corrected) as well as the mid-high bands (p = .904, Bonferroni-corrected), and between the low and mid-low bands (p = .230, Bonferroni-corrected). Lateralization slopes gradually increased from the low band (M = 0.832, SEM = 0.04) to the mid-high band (M = 1.03, SEM = 0.04) to the high band (M = 1.13, SEM = 0.04). The multiband (M = 0.97, SEM = 0.04) showed lateralization slopes similar to the mid-low band (M = 0.88, SEM = 0.04) and the mid-high band.
We further tested if lateralization slopes differed significantly from 1. A lateralization slope of 1 would indicate the expected lateralization behavior, smaller slopes would indicate compressed responses, and larger slopes would indicate expanded responses. In the pretest, lateralization slopes did not differ significantly from 1 for the low and mid-low band as well as the multiband (all p > .053, two-tailed). However, they were significantly larger than 1 for the mid-high (t(18) = 4.09, p = .001, two-tailed, dZ = 0.94) and the high band (t(18) = 6.97, p < .001, two-tailed, dZ = 1.60). In the posttest, slopes were significantly smaller than 1 for the low (t(18) = −5.18, p < .001, two-tailed, dZ = −1.19), mid-low (t(18) = −4.01, p = .001, two-tailed, dZ = −0.92), and mid-high band (t(18) = −2.37, p = .029, two-tailed, dZ = −0.54) as well as the multiband (t(18) = −2.71, p = .014, two-tailed, dZ = −0.62), but not for the high band.
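The interpretation of lateralization slopes relative to 1 can be made concrete with a small Python sketch. It is illustrative only. The study derives the slopes from its regression analysis, whereas the sketch below substitutes an ordinary least-squares fit of response azimuth on stimulus azimuth. The response values are hypothetical and the function name is an assumption.

```python
def lateralization_slope(target_azimuths, response_azimuths):
    """Ordinary least-squares slope of response azimuth on target azimuth."""
    n = len(target_azimuths)
    mean_x = sum(target_azimuths) / n
    mean_y = sum(response_azimuths) / n
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(target_azimuths, response_azimuths))
    sxx = sum((x - mean_x) ** 2 for x in target_azimuths)
    return sxy / sxx


# Consistent-cue azimuths of experiment 1, in degrees.
targets = list(range(-21, 22, 6))

# Hypothetical listeners: one compresses responses, one expands them.
compressed = [0.8 * a for a in targets]
expanded = [1.2 * a for a in targets]

slope_compressed = lateralization_slope(targets, compressed)
slope_expanded = lateralization_slope(targets, expanded)
print(f"{slope_compressed:.2f} {slope_expanded:.2f}")
```

A slope below 1 means that responses lie closer to the midline than the stimulus azimuths (compression), and a slope above 1 means that they lie further out (expansion).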
Summary of Results
In this section, we summarize the results of experiment 1, which will then be discussed together with those of experiment 2 in the General Discussion. The baseline (i.e., pretest) results showed that binaural-cue weights gradually changed from being ITD dominant to ILD dominant from low-, to mid-low-, to mid-high-, to high-frequency-band conditions. The multiband stimulus was weighted similarly to the mid-low band, meaning that ITDs dominated the percept when both ITD- and ILD-dominant bands were combined into one stimulus. On average across frequency bands, we further observed higher ILD weights at central compared to lateral azimuths, although deviations from this effect were observed for some conditions in band-specific analyses that included the posttest data. Additionally, we found steeper lateralization functions for higher frequencies and shallower lateralization functions in the post- compared to the pretest.
The results regarding training-induced changes in binaural-cue weights from pre- to posttest were unexpected. We expected to see an increase in the reinforced-cue weight and a respective decrease in the unreinforced-cue weight from pre- to posttest (i.e., reweighting) at least for the trained mid-high band, since it is similar in center frequency to the stimulus used in Klingel et al. (2021), where reweighting was shown in both the ITD and ILD groups. However, for the mid-high band, no reweighting was found in either group. There were, however, reweighting effects for some other conditions. The ILD group showed reweighting for the trained high band, which did not generalize to untrained conditions (i.e., mid-low and multiband). The ITD group, on the other hand, showed reweighting for the mid-low band, but only at 9° azimuth. This is surprising, as the mid-low condition was not trained. For all other bands, the ITD group showed no reweighting effects.
A closer look at the mid-high band showed that while it was similar in center frequency to the frequency band tested in Klingel et al. (2021), the baseline (pretest) binaural-cue weighting differed between the two studies. The ILD weight reported in Klingel et al. (2021) was lower than the ILD weight for the mid-high band in the present study. Instead, it closely matched the ILD weight for the mid-low band. This is likely due to the different bandwidths of the stimuli: In Klingel et al. (2021), stimuli were 1-octave wide while they were only 1/3-octave wide in the present study to minimize spectral overlap between frequency bands. As ITDs are known to dominate for broadband stimuli (Macpherson & Middlebrooks, 2002), it is likely that the lower-frequency part of the 1-octave wide band in Klingel et al. (2021) dominated the overall binaural-cue weighting of that stimulus.
Considering the design of the present study and of Klingel et al. (2021), we identified three potential reasons for the lack of reweighting in most conditions. 1) We had to restrict the azimuth range to ensure monotonic ILDs for all frequency bands. In Klingel et al. (2021), however, the largest effect was observed for more lateral azimuths (although the time x azimuth interaction did not reach significance). 2) We presented the different frequency bands randomly in one block. As they all had very different baseline weights, this may have been confusing, especially during training where visual reinforcement was provided. 3) We did not train participants with the mid-low band, which was closest in baseline ILD weight to the stimuli in Klingel et al. (2021). If the baseline ILD weight predicts the reweighting potential, training with this stimulus might increase the chances of reweighting. To address these issues, we conducted a follow-up experiment. In experiment 2, we only included the mid-low and mid-high bands, for which ILDs were monotonic up to 63° azimuth. This allowed us to use a wider range of azimuths than in experiment 1 (reinforced-cue azimuths up to 39° and unreinforced-cue azimuths up to 63°, whereas experiment 1 included reinforced-cue azimuths only up to 21° and unreinforced-cue azimuths up to 45°). To avoid confusion, the two frequency bands were presented in blocks during practice as well as the pre- and posttest, and only one frequency band (mid-low) was used during training.
Experiment 2
Based on the considerations discussed above, a follow-up experiment was performed. Experiment 2 was conducted identically to experiment 1 except for the changes described in the following.
Methods
Results
In the ITD group, a 2 (pre- vs. posttest) x 7 (azimuth) RM ANOVA yielded a significant time x azimuth interaction for the mid-low band (F(6,54) = 2.36, p = .042, ηp2 = .208), but neither a significant main effect of time nor azimuth. Follow-up pairwise comparisons showed that the interaction was driven by a significant decrease in ILD weight from pre- (M = .508, SEM = .070) to posttest (M = .312, SEM = .073) at 39° azimuth (p = .021, Bonferroni-corrected), but no significant differences at other azimuths (all p > .127, Bonferroni-corrected). This suggests that participants reweighted the binaural cues in the expected direction, but only at 39° azimuth.
For the mid-high band, the 2 × 7 RM ANOVA yielded a significant main effect of azimuth (F(6,54) = 4.09, p = .002, ηp2 = .312), but neither a significant main effect of time nor time x azimuth interaction. Follow-up pairwise comparisons showed a significant difference between 3° and 15° azimuth (p = .024, Bonferroni-corrected) with larger ILD weights at 3° (M = .725, SEM = .030) compared to 15° azimuth (M = .632, SEM = .032). This suggests that ILD weights were larger at central compared to lateral azimuths.
In the ILD group, the 2 × 7 RM ANOVA yielded no significant effects for the mid-low band (all p > .407).
For the mid-high band, the 2 × 7 RM ANOVA yielded a significant main effect of time (F(1,9) = 5.17, p = .049, ηp2 = .365), with smaller ILD weights in the pre- (M = .600, SEM = .046) compared to the posttest (M = .666, SEM = .040). This suggests that participants reweighted the binaural cues in the expected direction. There further was a significant main effect of azimuth (F(2.45,22.08) = 5.97, p = .006, ηp2 = .399, Greenhouse-Geisser-corrected). Follow-up pairwise comparisons showed significant differences between 33° and 3° (p = .039, Bonferroni-corrected), 15° (p = .030, Bonferroni-corrected), as well as 21° azimuth (p = .018, Bonferroni-corrected). ILD weights were smaller at 33° azimuth (M = .435, SEM = .058) compared to 3° (M = .706, SEM = .046), 15° (M = .649, SEM = .050), and 21° azimuth (M = .684, SEM = .051). This suggests that ILD weights were again larger at central compared to lateral azimuths. There was no significant time x azimuth interaction.
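The effect sizes reported above can be cross-checked from the F ratios and their degrees of freedom. The following is a minimal sketch, assuming the standard identity ηp2 = F·df_effect / (F·df_effect + df_error); the function name is illustrative and not part of the study's analysis code. For the Greenhouse-Geisser-corrected test, the uncorrected degrees of freedom (6, 54) are an assumption inferred from the other 2 × 7 analyses; the correction scales both degrees of freedom by the same factor, so the ratio is unaffected apart from rounding.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta squared from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (label, F, df_effect, df_error, reported eta_p^2), values taken from the text.
reported_effects = [
    ("ITD group, mid-low, time x azimuth", 2.36, 6, 54, 0.208),
    ("ITD group, mid-high, azimuth", 4.09, 6, 54, 0.312),
    ("ILD group, mid-high, time", 5.17, 1, 9, 0.365),
    # Assumption: uncorrected df (6, 54) instead of the reported
    # Greenhouse-Geisser-corrected df (2.45, 22.08).
    ("ILD group, mid-high, azimuth", 5.97, 6, 54, 0.399),
]

for label, f_value, df_effect, df_error, eta_reported in reported_effects:
    eta = partial_eta_squared(f_value, df_effect, df_error)
    print(f"{label}: computed {eta:.3f}, reported {eta_reported:.3f}")
```

Because the F values are themselves rounded to two decimals, agreement is only expected to within about one unit in the third decimal.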
All lateralization slopes were significantly smaller than 1. In the pretest, both the mid-low band (M = .825, SEM = .064; t(19) = −2.73, p = .013, two-tailed, dZ = −0.61) and the mid-high band (M = .892, SEM = .034; t(19) = −3.17, p = .005, two-tailed, dZ = −0.71) had lateralization slopes smaller than 1. This suggests that responses were already compressed in the pretest. In the posttest, again both the mid-low band (M = .804, SEM = .021; t(19) = −9.37, p < .001, two-tailed, dZ = −2.09) and the mid-high band (M = .906, SEM = .026; t(19) = −3.58, p = .002, two-tailed, dZ = −0.80) had lateralization slopes smaller than 1.
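The slope tests above are one-sample t-tests against a reference slope of 1, so they can be approximately reconstructed from the reported means and standard errors. The sketch below assumes t = (M − 1) / SEM and dZ = t / √n with n = 20 (implied by df = 19); the function and variable names are illustrative. Since M and SEM are rounded to three decimals in the text, the recomputed values can deviate slightly from the reported ones.

```python
import math


def one_sample_t_from_summary(mean, sem, n, reference=1.0):
    """Return (t, d_z) for a one-sample t-test computed from summary statistics."""
    t_value = (mean - reference) / sem
    d_z = t_value / math.sqrt(n)  # standardized effect size for one-sample designs
    return t_value, d_z


N_PARTICIPANTS = 20  # implied by the reported df of 19

# (condition, M, SEM) as reported in the text.
slopes = [
    ("pretest, mid-low", 0.825, 0.064),
    ("pretest, mid-high", 0.892, 0.034),
    ("posttest, mid-low", 0.804, 0.021),
    ("posttest, mid-high", 0.906, 0.026),
]

for condition, mean, sem in slopes:
    t_value, d_z = one_sample_t_from_summary(mean, sem, N_PARTICIPANTS)
    print(f"{condition}: t({N_PARTICIPANTS - 1}) = {t_value:.2f}, dZ = {d_z:.2f}")
```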
Summary of Results
In experiment 2, we observed binaural-cue reweighting in the ILD group at the mid-high band, which was not trained, but not for the trained mid-low band. This is surprising, given that we did not observe reweighting at the mid-high band in experiment 1, even though it was trained. In the ITD group, we observed binaural-cue reweighting for the trained mid-low band only at 39° azimuth, which is in line with the expectation based on Klingel et al. (2021) that a stronger reweighting effect occurs for more lateral azimuths. This reweighting did not generalize to the mid-high band. Moreover, unlike in experiment 1, we did not observe reweighting at 9° azimuth.
We suspected that the trial-by-trial switching between frequency bands might have contributed to the overall small reweighting effect in experiment 1 and, therefore, used blocked presentation during testing and included just one band during training in experiment 2. However, as only the untrained mid-high band in the ILD group showed more reweighting in experiment 2 compared to experiment 1, there is no evidence that trial-by-trial switching between frequency bands had an impact on reweighting.
Consistent with the results of experiment 1, we observed stronger ILD weights and steeper lateralization functions for the mid-high compared to the mid-low band. However, there was no response compression from pre- to posttest. Instead, especially at the larger azimuths tested in experiment 2, responses were already compressed in the pretest (i.e., pretest lateralization slopes were significantly smaller than 1 and shallower than in experiment 1). This might be attributable to the larger azimuth range presented in experiment 2. Namely, participants may have tended to respond inside of the visual markers of ± 45°, even though the unreinforced cue extended beyond that range. Note that while a larger azimuth range was also used in Klingel et al. (2020), who observed compression only in the posttest, they used visual markers every 15° and participants were therefore accustomed to responding beyond the markers.
General Discussion
The present study investigated whether binaural-cue weights gradually change from low- to high-frequency bands, whether binaural-cue reweighting can be induced in different frequency regions, and whether binaural-cue reweighting is specific to the trained frequency band or whether it generalizes to untrained frequency bands.
Binaural-Cue Weights Across Spectral Regions
We observed gradually increasing ILD weights with increasing spectral region of narrow-band noise stimuli. These baseline weights appear to be robust, as similar pretest weights were obtained in both experiments involving different participants and presentation modes (randomized vs. blocked). The finding of gradually increasing ILD weights with increasing frequency extends Macpherson and Middlebrooks’ (2002) results involving a low- and a high-pass stimulus but differs from the observation of Ahrens et al. (2020), who measured the contributions of different frequency bands in a multiband stimulus to binaural-cue weighting. Ahrens et al. (2020) found significant differences mainly for the lowest and the highest frequency band and similar weights for in-between frequency bands, irrespective of the absolute frequency range of presented bands. These different results likely resulted from differences in the methodology: While the frequency bands were presented in isolation in the present study, they were presented as part of a multiband stimulus in Ahrens et al. (2020), leading to an “edge effect” (i.e., stronger weighting of information at the edges). Instead, the present results are more in line with the sensitivity pattern to ITD and ILD as a function of center frequency for narrow-band noise stimuli. Compared to ITD thresholds, ILD thresholds are fairly constant across frequencies, but tend to decrease with increasing center frequency (Gabriel et al., 1992), although there are some local peaks in the threshold function (Goupell & Stakhovskaya, 2018). Additionally, ILDs are physically smaller for low- compared to high-frequency stimuli (leading to a shallower slope of the cue-versus-azimuth function) and therefore more informative at high frequencies. In contrast, ITDs are physically almost frequency-independent and ITD thresholds increase with increasing center frequency from approx. 800 Hz towards higher (at least up to 4000 Hz) frequencies (Buchholz et al., 2018; Gabriel et al., 1992; Klumpp & Eady, 1956; Smoski & Trahiotis, 1986; Trahiotis & Bernstein, 1990), because of decreasing access to fine-structure ITD.
For the multiband stimulus, comprising all tested bands, ITDs dominated the percept, similar to Macpherson and Middlebrooks’ (2002) broadband stimuli. While the pretest ILD weight averaged across the four frequency bands of experiment 1 was close to 0.5, it was significantly lower than 0.5 for the multiband stimulus. This suggests that for broadband sounds, the auditory system performs a weighted integration of binaural cues across frequencies, where ITDs receive more weight than ILDs.
Training-Induced Binaural-Cue Reweighting Across Spectral Regions
In the ILD group, a training-induced change in binaural-cue weights was observed for the trained high-frequency band (experiment 1) and for the untrained mid-high-frequency band (experiment 2), but not for the other conditions. In the ITD group, binaural-cue reweighting was observed for the untrained mid-low-frequency band (experiment 1, only at 9° azimuth) and for the same band when it was trained (experiment 2, only at 39° azimuth). While it may seem like these results are driven by noise in the pretest (given that participants get more accustomed to the setup and task over time), it should be noted that the chosen correction method for the post-hoc comparisons (Bonferroni) is very conservative and similar noise in other conditions did not yield significant results, suggesting that the reported results are meaningful. There was no significant reweighting effect for the other frequency bands.
Given that both trained and untrained stimuli showed some effects in both groups, band-specific training does not seem to be crucial. This may suggest some across-frequency generalization of binaural-cue reweighting. Instead, the occurrence of reweighting seems to depend on an appropriate match between the stimulus’ spectral region and the binaural cue that is reinforced. In the ILD group, effects were only observed for the mid-high and high bands, while in the ITD group, effects were only observed for the mid-low band (at specific azimuths). This suggests that inducing an increase in ILD weighting only works for sufficiently high frequencies: The strongest effect was observed for the high band (centered at 4000 Hz) and a less robust effect (only present in experiment 2, close to the significance threshold) occurred for the mid-high band (centered at 2520 Hz). Similarly, sufficiently low frequencies seem to be needed to induce an increase in ITD weighting, as only the mid-low band (centered at 1587 Hz) showed some effects in the ITD group. However, if the stimuli included the low-frequency temporal fine structure region that provides exquisite ITD sensitivity (i.e., for the low band centered at 1000 Hz and the multiband), no change in binaural-cue weights was observed. This suggests that ITD reweighting was based on envelope-ITD information. Compared to Klingel et al. (2021), the reweighting effects observed in the present study are relatively weak. In the ITD group, the effect for the mid-low band was so small that it only reached significance for specific azimuths. In the ILD group, the effect for the mid-high band reached significance in only one of the two experiments. Solely the effect for the high band in the ILD group showed comparable strength to the effects reported in Klingel et al. (2021). Nevertheless, the present results agree with the results reported by Klingel et al. (2021), who used bandpass-filtered (2–4 kHz) noise and found a reweighting effect in both directions (i.e., for both their ILD and ITD group). The 2–4 kHz stimulus included both “sufficiently high” and “sufficiently low” frequencies but did not touch the frequency region providing highest ITD sensitivity (i.e., the fine-structure region), and thus seems to be optimally suited to induce binaural-cue reweighting.
Additional Effects Independent of Binaural-Cue Reweighting
On average across frequency bands, we observed higher ILD weights at central compared to lateral azimuths. This is surprising, as the opposite pattern was observed in Klingel et al. (2020), who used a discrimination task to measure binaural-cue weights. In Klingel et al. (2021), who used the same auditory stimuli as Klingel et al. (2020) and the same lateralization task as the present study, no significant effect of azimuth was observed. Therefore, the effect of azimuth does not appear to be robust and may depend on the task and auditory stimuli used. The pattern of higher ILD weights at central azimuths observed in the present study may be due to different sensitivity patterns for ITDs and ILDs across azimuths. Variations of ITD sensitivity across azimuths appear to be fully accounted for by the azimuth dependence of the ITD cue itself rather than by decreasing ITD sensitivity with increasing reference ITD (Smith & Price, 2014). Given that for the azimuth range included here ITDs increased relatively linearly with increasing azimuth (see Figure 1a), ITD sensitivity is expected to be constant across azimuths. ILD sensitivity, on the other hand, decreases with increasing reference ILD in addition to the physical dependence of ILD magnitude on azimuth (Brown et al., 2018). For the mid-low, mid-high, and high bands, the ILD magnitude increases less strongly (i.e., the slope of the cue-versus-azimuth function is shallower) for azimuths between approximately 21°–39° compared to azimuths between 3°–15° (see Figure 1b). The combined effect of decreasing ILD sensitivity with increasing reference ILD as well as the reduced increase in ILD magnitude for more lateral azimuths (leading to a reduced acoustic “goodness” of the cue) may, therefore, have contributed to the stronger ILD weighting at central compared to lateral azimuths.
Additionally, we found steeper lateralization functions for higher frequencies. This could be explained by two factors regarding ILDs, which dominate the percept at higher frequencies. First, for natural sounds arriving at the outer ears (at least for the frequency range used in this study), ITDs are almost frequency-independent, whereas ILDs are larger at higher frequencies (e.g., Middlebrooks & Green, 1991). Second, Bernstein and Trahiotis (2011) showed that even for a constant ILD magnitude, higher frequencies produce a larger extent of laterality than lower frequencies. The steeper lateralization functions at higher frequencies in our study could be due to both of these factors. We also observed shallower lateralization functions in the post- compared to the pretest in experiment 1. The same pattern was observed in Klingel et al. (2021) and attributed to a training-induced response compression, due to a response mapping to the visually reinforced range. This compression was not observed in experiment 2, possibly because responses were already compressed in the pretest. Importantly, any differences in the slope of the lateralization functions do not influence the estimated binaural-cue weights, as they affect k ITD and k ILD equally and therefore cancel out in the final weight estimation.
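The cancellation argument can be illustrated numerically. The sketch below assumes, for illustration only, that the ILD weight is the ILD slope's share of the summed slopes, k ILD / (k ITD + k ILD); the weight formula actually used is defined in the Methods and is not reproduced in this section. The slope values are hypothetical, not data from the study. Any weight that depends only on the ratio of the two slopes would behave the same way.

```python
def ild_weight(k_itd, k_ild):
    """ILD weight as the ILD slope's share of the summed slopes (assumed form)."""
    return k_ild / (k_itd + k_ild)


# Hypothetical regression slopes, not values from the study.
k_itd, k_ild = 0.55, 0.35

# A response compression multiplies both slopes by the same factor,
# so the overall lateralization slope changes but the weight does not.
for compression in (1.0, 0.9, 0.8):
    overall_slope = compression * (k_itd + k_ild)
    weight = ild_weight(compression * k_itd, compression * k_ild)
    print(f"compression {compression:.1f}: "
          f"overall slope {overall_slope:.2f}, ILD weight {weight:.3f}")
```

The printed ILD weight is identical for every compression factor, whereas the overall slope shrinks in proportion to it.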
Limitations and Future Directions
A general limitation of the impact of training-induced changes in binaural-cue weights is their strong dependence on the auditory stimulus. In addition to requiring either sufficiently high or sufficiently low frequencies, reweighting seems to be restricted to conditions where low-frequency temporal-fine-structure ITDs do not contribute, which limits the ecological importance of the phenomenon in the normal auditory system where such cues are often available.
However, the results may be relevant for hearing-impaired or cochlear-implant (CI) listeners. Lacher-Fougère and Demany’s (2005) results suggest that listeners with sensorineural hearing loss may not have access to fine-structure ITD cues, while retaining some sensitivity to envelope ITD cues. Also, CI listeners seem to have access to envelope ITD cues only. Many CI stimulation strategies encode ITDs only via the envelope of the stimulus waveform and, even when encoding ITDs via the pulse timing, CI listeners’ sensitivity pattern resembles that for envelope ITDs in acoustic hearing (Bernstein & Trahiotis, 2002; Laback et al., 2007). In fact, binaural-cue reweighting has recently been observed in CI listeners when ITDs were encoded via the pulse timing of low-rate pulse trains (Klingel & Laback, 2021).
We presented auditory stimuli via headphones without HRTF filtering. This ensured that participants did not have access to monaural spectral localization cues, which might provide information about the stimulus azimuth and, thus, might prevent purely binaural-cue reweighting. Kumpik et al. (2010), for example, observed an increased weighting of monaural cues for azimuthal sound localization but no adaptation to changed binaural cues after training, when binaural cues were modified but monaural spectral cues were preserved at one ear. But open questions remain. For example, it is unclear whether the resulting lack of externalization in our study affected the weighting. Kumpik et al. (2019) used (non-individualized) HRTFs as well as reverberation to promote externalization and observed slightly (but significantly) ILD-dominant weights for their broadband stimuli. However, given that Macpherson and Middlebrooks (2002) also included HRTFs, the higher ILD weights in Kumpik et al. (2019) compared to Macpherson and Middlebrooks’ wideband and our multiband stimuli more likely resulted from the added reverberation, which makes ITDs less reliable (Rakerd & Hartmann, 2010), than from the HRTFs or from externalization. Comparing our low- and high-frequency band as well as the multiband stimuli to Macpherson and Middlebrooks’ low-pass, high-pass, and wideband stimuli results in very similar weight estimates. Additionally, it would be an interesting topic for future studies to clarify under which conditions binaural-cue reweighting and binaural-to-monaural-cue reweighting occur for azimuthal sound localization. Particularly interesting are situations where listeners have access to spectral localization cues but access to low-frequency fine-structure ITD cues is prevented, e.g., due to traffic noise.
Summary and Conclusions
The present study is, to our knowledge, the first to show gradually increasing ILD weights (or decreasing ITD weights) with increasing frequency (i.e., spectral region) of narrowband stimuli. It therefore extends our knowledge on the duplex theory of sound localization, particularly on the transition from ITD- to ILD-dominant frequency regions. The across-frequency pattern of weights appears to be consistent with the corresponding patterns of ITD and ILD sensitivity from the literature as well as with the acoustic “goodness” of the cue. We further showed that binaural-cue reweighting is frequency dependent: To induce an increase in ILD weighting, the stimulus frequency needs to be sufficiently high and to induce an increase in ITD weighting, the stimulus frequency needs to be sufficiently low without including the low-frequency region providing fine-structure ITD cues. However, the observed increase in ITD weighting was so small that it reached significance only at specific azimuths. Which frequency band is used for training does not appear to influence the results systematically, suggesting some across-frequency generalization of binaural-cue reweighting. Such reweighting likely plays a role when listeners adapt to acoustic environments with altered robustness of binaural cues (e.g., when low-frequency ITD cues are masked by traffic noise) and has potential applications, such as supervised training after introducing a previously impeded cue to hearing devices such as cochlear implants.
Footnotes
Research Data Availability Statement
The data is directly available for download from http://amtoolbox.org/amt-1.1.1-dev/auxdata/klingel2022/ and part of the Auditory Modeling Toolbox (AMT) development repository (i.e., they will be part of the next release). In the meantime, they can be accessed via the AMT functions “data_klingel2022” (raw data), and “exp_klingel2022” (ILD weights) as a download from sourceforge: https://sourceforge.net/p/amtoolbox/code/ci/master/tree/data/data_klingel2022.m and
, respectively.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the uni:docs Fellowship Program for Doctoral Candidates of the University of Vienna and a VDS CoBeNe final fellowship.
