Abstract
Physiological studies have shown that noise-induced sensorineural hearing loss (SNHL) enhances the amplitude of envelope coding in auditory-nerve fibers. As pitch coding of unresolved complex tones is assumed to rely on temporal envelope coding mechanisms, this study investigated pitch-discrimination performance in listeners with SNHL. Pitch-discrimination thresholds were obtained for 14 normal-hearing (NH) and 10 hearing-impaired (HI) listeners for sine-phase (SP) and random-phase (RP) complex tones. When all harmonics were unresolved, the HI listeners performed, on average, worse than NH listeners in the RP condition but similarly to NH listeners in the SP condition. The increase in pitch-discrimination performance for the SP relative to the RP condition (
Keywords
Introduction
Pitch perception and its underlying coding mechanisms have been investigated for decades to understand what information is necessary for the human auditory system to extract pitch (for a review, see de Cheveigné, 2005). Although some studies favored either a place-based (e.g., Goldstein, 1973; Ohm, 1843; Terhardt, 1974; von Helmholtz, 1877; Wightman, 1973) or a temporal approach (e.g., Licklider, 1951; Rutherford, 1886), more recent investigations suggested that both types of cues may be important for pitch coding (e.g., Cedolin & Delgutte, 2005; Heinz, Colburn, & Carney, 2001; Moore, 2003; Shamma & Klein, 2000; Oxenham, Bernstein, & Penagos, 2004).
Numerous studies have focused on the coding mechanisms underlying pitch perception of complex tones (Bernstein & Oxenham, 2003, 2006a, 2008; Carlyon & Shackleton, 1994; Hoekstra & Ritsma, 1977; Houtsma & Smurzynski, 1990; Kaernbach & Bering, 2001; Moore, Glasberg, Flanagan, & Adams, 2006; Moore, Glasberg, & Hopkins, 2006; Moore & Glasberg, 2011; Shackleton & Carlyon, 1994). Different coding mechanisms were suggested for complex tones containing either low-numbered resolved harmonics or high-numbered unresolved components. While resolved components are processed by separate auditory filters and produce distinct ripples in the excitation pattern, neighboring unresolved components are processed within the same auditory filter and their interaction gives rise to a smooth excitation pattern which does not convey place information from which the frequency of individual harmonics can be retrieved (Plack, 2005). As a result, the pitch of resolved complex tones may be retrieved by fine spectral or temporal cues, while the pitch of unresolved complex tones can only be retrieved by the temporal information conveyed by envelope coding (Moore & Moore, 2003).
Sensorineural hearing loss (SNHL) is commonly associated with reduced frequency selectivity (Glasberg & Moore, 1986) and a reduced ability to extract temporal fine structure information (Hopkins & Moore, 2007; Moore, Glasberg, & Hopkins, 2006). However, recent physiological studies in animals showed that noise-induced SNHL increases the temporal precision and the amplitude of envelope coding in single auditory-nerve fibers (Henry, Kale, & Heinz, 2014; Kale & Heinz, 2010). These findings were ascribed to a variety of factors, such as broader auditory filters, a reduction of cochlear compression due to outer hair cell damage, and altered auditory-nerve response temporal dynamics (Scheidt, Kale, & Heinz, 2010). Thus, while fine spectro-temporal cues are disrupted, temporal envelope cues may be enhanced and the relative importance of spectral and temporal cues for pitch processing may be altered in listeners with SNHL. Although several studies reported that hearing-impaired (HI) listeners have disrupted abilities in pitch discrimination of complex tones (Arehart, 1994; Arehart & Burns, 1999; Bernstein & Oxenham, 2006b; Hoekstra & Ritsma, 1977; Moore & Glasberg, 1988; Moore & Peters, 1992; Moore & Moore, 2003), it has been found that the performance of HI listeners is not always disrupted as compared with NH listeners (Moore, 1998).
In fact, while most studies reported a degraded performance of HI listeners in pitch discrimination of stimuli containing low-order harmonics (Arehart, 1994; Bernstein & Oxenham, 2006b; Hoekstra, 1979; Hoekstra & Ritsma, 1977; Moore & Glasberg, 1990; Moore & Peters, 1992), which may be related to a reduced frequency selectivity (Bernstein & Oxenham, 2006b; Moore & Glasberg, 2011), some studies showed a similar performance of HI versus NH listeners for pitch discrimination of unresolved complex tones and also a comparable performance of HI listeners for pitch discrimination of resolved versus unresolved stimuli (Arehart, 1994; Bernstein & Oxenham, 2006b). Since the broadening of auditory filters in HI listeners leads to an increased number of unresolved harmonics as compared with NH listeners, it seems plausible that HI listeners rely more on the temporal information conveyed by the unresolved harmonics, rather than on the fine spectro-temporal information conveyed by the resolved harmonics (Moore & Carlyon, 2005). It is still unclear whether the altered importance of temporal versus spectral cues for pitch discrimination may be additionally due to the suggested enhancement of temporal envelope coding with SNHL (Henry et al., 2014; Kale & Heinz, 2010).
The aim of the present behavioral study was to clarify: (a) whether human listeners with SNHL show an enhancement of temporal envelope coding, (b) whether this enhancement is related to the broadening of auditory filters and/or to the reduction of cochlear compression, and (c) how this enhancement affects pitch discrimination of complex tones. Pitch discrimination of complex tones was investigated behaviorally as a function of the fundamental frequency (
While in previous studies (e.g., Bernstein & Oxenham, 2006b; Glasberg & Moore, 1989; Hoekstra, 1979; Moore & Glasberg, 1990; Moore & Glasberg, 2011; Moore & Peters, 1992) the individual performance in pitch discrimination was correlated with individual measures of frequency selectivity, the novelty of the current study is that pitch discrimination was further investigated as a potential indicator of temporal envelope processing, on which pitch coding of unresolved complex tones is assumed to rely.
Methods
Listeners and Experimental Setup
Fourteen NH listeners (6 females), aged from 22 to 28 years, and 10 HI listeners (4 females), aged from 65 to 81 years, participated in this study. All NH listeners had hearing thresholds of less than 20 dB hearing level (HL) at all audiometric frequencies between 125 Hz and 8 kHz. The HI listeners had hearing thresholds between 25 and 65 dB HL at the audiometric frequencies between 1 and 4 kHz. The individual hearing thresholds of the HI listeners are reported in Figure 1 and the hearing thresholds at 2 kHz are listed in Table 1. All experiments were carried out monaurally: the NH listeners were tested in their right ear and the HI listeners in the better ear that met the inclusion criteria. All experiments were approved by the Science-Ethics Committee for the Capital Region of Denmark.
Hearing thresholds in the test ear for the 10 HI listeners who participated in this study. The thresholds were obtained via conventional audiometry.
Summary of Stimulus Levels and Auditory Profiling Measures Estimated for the Mean of NH Listeners and 10 Individual Hearing-Impaired Listeners.
Experiment I: Pitch-Discrimination of Complex Tones
The ability to discriminate the pitch of resolved and unresolved complex tones was assessed via difference limens for fundamental frequency (
Procedure
A three-alternative forced choice (3-AFC) paradigm was used in combination with a weighted up-down method (Kaernbach, 1991) to measure the 75% point on the psychometric function. For each trial, two intervals contained a reference complex tone with a fixed fundamental frequency (
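The weighted up-down rule can be illustrated with a short sketch. It assumes the standard property of the method (Kaernbach, 1991): the track converges on the target proportion correct when the ratio of the upward step (after an incorrect response) to the downward step (after a correct response) equals p/(1 - p). The function names and the unit step size are hypothetical and are not taken from the study.

```python
def weighted_steps(target, step_down):
    """Return (step_up, step_down) for a weighted up-down track.

    At equilibrium the expected movement is zero:
        (1 - target) * step_up = target * step_down
    """
    if not 0.0 < target < 1.0:
        raise ValueError("target must lie strictly between 0 and 1")
    step_up = step_down * target / (1.0 - target)
    return step_up, step_down


def update(value, correct, target=0.75, step_down=1.0):
    """Move the tracked variable after one trial (hypothetical units)."""
    step_up, step_down = weighted_steps(target, step_down)
    # A correct response makes the task harder, an incorrect one easier.
    return value - step_down if correct else value + step_up


def expected_drift(p_correct, target=0.75, step_down=1.0):
    """Mean change of the tracked variable per trial at a given accuracy."""
    step_up, step_down = weighted_steps(target, step_down)
    return (1.0 - p_correct) * step_up - p_correct * step_down


if __name__ == "__main__":
    print(weighted_steps(0.75, 1.0))
    print(expected_drift(0.75), expected_drift(0.9))
```

For a 75% target the upward step is three times the downward step, so the expected drift vanishes only when the listener is correct on 75% of the trials.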
Stimuli
All signals were generated digitally in MATLAB at a sampling rate of 48 kHz and consisted of 300-ms complex tones embedded in threshold equalizing noise (TEN, Moore et al., 2000). For the NH listeners, the sound pressure level (SPL) of the TEN was set to 55 dB per equivalent rectangular bandwidth (ERB, Glasberg & Moore, 1990) to mask the combination tones. For the HI listeners, pure tone detection in quiet was performed at 1.5, 2, and 3 kHz (two repetitions per frequency), and the level of the TEN was set at the maximum threshold measured in this range. The complex tones were created by summing harmonic components either in sine phase (SP) or random phase (RP) to vary the envelope peakiness. Summing the harmonics in SP yields a peaky signal envelope, while summing the harmonics in RP yields a much flatter envelope. All HI listeners were tested in the SP and RP conditions, whereas only 9 out of the 14 NH listeners completed the measurements for both conditions. Conditions of varying resolvability were achieved by bandpass filtering the complexes in a high-frequency region (HF, 1500–3500 Hz), with 50 dB/octave slopes, and by varying the
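The difference in envelope peakiness between the SP and RP complexes can be illustrated with a minimal sketch. It is written in Python rather than MATLAB and makes several simplifying assumptions that are not part of the study: an ideal rectangular passband stands in for the sloped bandpass filter, all harmonics have equal amplitude, no TEN is added, and the fundamental frequency of 100 Hz is only an illustrative value. The crest factor (peak divided by RMS) is used here as a simple stand-in for peakiness.

```python
import math
import random

FS = 48_000               # sampling rate in Hz (from the text)
DURATION = 0.3            # tone duration in s (from the text)
BAND = (1500.0, 3500.0)   # HF passband edges in Hz (from the text)


def harmonic_complex(f0, phase="sine", seed=0):
    """Sum equal-amplitude harmonics of f0 that fall inside BAND.

    Assumption: an ideal rectangular passband replaces the sloped
    bandpass filter, and no background noise is added.
    """
    first = math.ceil(BAND[0] / f0)
    last = math.floor(BAND[1] / f0)
    harmonics = list(range(first, last + 1))
    if phase == "sine":
        phases = [0.0] * len(harmonics)
    elif phase == "random":
        rng = random.Random(seed)
        phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in harmonics]
    else:
        raise ValueError("phase must be 'sine' or 'random'")
    # Angular frequency per sample for each harmonic.
    omegas = [2.0 * math.pi * h * f0 / FS for h in harmonics]
    n_samples = round(FS * DURATION)
    return [
        sum(math.sin(w * i + p) for w, p in zip(omegas, phases))
        for i in range(n_samples)
    ]


def crest_factor(signal):
    """Peak magnitude divided by RMS; larger values mean a peakier waveform."""
    rms = math.sqrt(sum(v * v for v in signal) / len(signal))
    return max(abs(v) for v in signal) / rms


if __name__ == "__main__":
    f0 = 100.0  # illustrative F0 in Hz, not a value taken from the study
    print("SP crest factor:", crest_factor(harmonic_complex(f0, "sine")))
    print("RP crest factor:", crest_factor(harmonic_complex(f0, "random")))
```

Under these assumptions the SP complex has the larger crest factor, because all components reach their maxima almost simultaneously once per period.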
Experiment II: Amplitude-Modulation Detection
The temporal modulation transfer function (TMTF), i.e., the amplitude-modulation (AM) detection threshold as a function of the modulation frequency (
Procedure
A 3-AFC paradigm, in combination with a weighted up-down rule, was used to measure modulation detection thresholds at the 75% point of the psychometric function. For each trial, two intervals contained a pure tone at 2 kHz and one interval contained a sinusoidally amplitude-modulated 2-kHz sinusoid modulated at
For each listener, the auditory filter bandwidth was estimated as the
Stimuli
All signals were generated digitally in MATLAB at a sampling rate of 48 kHz and consisted of 300-ms pure tones. The carrier level was set to the same level as the nominal components of the complex tones in the pitch discrimination experiment (i.e., at 12.5 dB SL re the TEN level used in experiment I, see Table 1). No background noise was used. The stimuli were presented via Sennheiser HDA 200 headphones.
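The sinusoidally amplitude-modulated target described in the procedure can be written out explicitly to show its spectral content. The expression below is a standard trigonometric identity. The symbols are introduced here for illustration: $m$ is the modulation depth, $f_c$ the carrier frequency (2 kHz in this experiment), and $f_m$ the modulation frequency. The sine starting phase of the modulator is an assumption and is not specified in the text.

```latex
\begin{align}
s(t) &= \bigl[1 + m \sin(2\pi f_m t)\bigr]\,\sin(2\pi f_c t) \\
     &= \sin(2\pi f_c t)
        + \frac{m}{2}\cos\bigl(2\pi (f_c - f_m)\, t\bigr)
        - \frac{m}{2}\cos\bigl(2\pi (f_c + f_m)\, t\bigr)
\end{align}
```

The stimulus therefore contains three components, at $f_c - f_m$, $f_c$, and $f_c + f_m$. Each sideband has amplitude $m/2$ relative to the carrier, and the spacing between sidebands and carrier grows with $f_m$.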
Experiment III: Estimates of BM I/O Function and Cochlear Compression
The residual peripheral compression was estimated in 9 out of the 10 HI listeners (all except HI 7) by estimating the individual BM I/O functions at 2 kHz. The BM I/O functions were derived from the temporal masking curves (TMCs) measured via a forward masking experiment for the nine listeners.
Procedure
Masker thresholds were measured as a function of the temporal gap between a 2-kHz probe and a masker tone, either “on-frequency” at 2 kHz or “off-frequency” at 0.6 times the probe frequency. The thresholds were tracked using the Grid method (Fereczkowski, 2015), which reduces the duration of the forward-masking experiment. After three repetitions of the measurement, the on-frequency thresholds were fitted for each listener with either one or two sections, depending on the estimated value of the Bayesian Information Criterion (Schwarz, 1978). This criterion was used to avoid model overfitting. Off-frequency thresholds were fitted with a single section in all cases. The fits were used to infer BM I/O functions following the paradigm of Nelson et al. (2001). The inverse slope of the section comprising the input stimulus level was taken as an estimate of the compression ratio (CR) at 2 kHz.
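The last step of this procedure, going from paired masker thresholds to a compression ratio, can be sketched as follows. The sketch assumes that the BM I/O function is obtained by plotting the off-frequency masker threshold (output) against the on-frequency masker threshold (input) at the same masker-probe gap, and it fits one straight line through all points. The study instead fitted the TMCs first and allowed one or two sections for the on-frequency data. The threshold values and function names are hypothetical.

```python
def linear_slope(x, y):
    """Least-squares slope of y against x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long sequences with >= 2 points")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def compression_ratio(on_freq_db, off_freq_db):
    """Inverse slope of the inferred I/O function (dB output per dB input).

    on_freq_db  : on-frequency masker thresholds, taken as BM input level
    off_freq_db : off-frequency masker thresholds at the same gaps,
                  taken as BM output level
    """
    io_slope = linear_slope(on_freq_db, off_freq_db)
    return 1.0 / io_slope


if __name__ == "__main__":
    # Hypothetical thresholds in dB SPL for four masker-probe gaps.
    on_freq = [60.0, 70.0, 80.0, 90.0]
    off_freq = [80.0, 82.5, 85.0, 87.5]
    print("I/O slope:", linear_slope(on_freq, off_freq))
    print("CR:", compression_ratio(on_freq, off_freq))
```

A compression ratio close to 1 corresponds to an I/O function parallel to the linear reference, that is, to little or no residual compression.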
Stimuli
The masker tone duration was 200 ms, and the probe tone duration was 16 ms. Both were gated with 4-ms raised-cosine onset and offset ramps; hence, the lengths of the steady-state portions were 192 and 8 ms, respectively. The probe level was set at 10 dB above the absolute probe threshold. The stimuli were generated in MATLAB (44100 Hz sampling rate, 24-bit resolution) and presented via Sennheiser HDA 200 headphones.
Modeling the Effects of Cochlear Compression and Frequency Selectivity on Envelope Peakiness
HF-filtered complex tones (
Results
Experiment I: Pitch-Discrimination of Complex Tones
Figure 2 (top panels) depicts the mean pitch-discrimination thresholds for NH listeners (black solid symbols), as well as the individual thresholds for HI listeners (open symbols), for the SP condition (left panel), the RP condition (middle panel) and the ratio between the RP and the SP thresholds (right panel). The thresholds for the SP and RP conditions showed similar trends for the NH listeners, whereby
Pitch-discrimination thresholds for the SP condition (left panels) and RP condition (middle panels). The right panels depict the ratio of the RP and SP thresholds (
The mean performance of the 10 HI listeners was generally worse than that of the NH listeners. In fact, although some HI listeners showed a better performance than the NH listeners at low
Experiment II: Amplitude-Modulation Detection
Figure 3(a) depicts the amplitude-modulation detection thresholds for the individual HI listeners (open symbols), as well as the mean modulation thresholds for the five NH listeners who completed Experiment II (filled squares). The modulation thresholds for the NH listeners were independent of
(a) Amplitude-modulation detection thresholds for a 2-kHz sinusoidal carrier as a function of the modulation frequency for the 10 HI listeners (same open symbols as in Figure 1; error bars depict the standard deviation across the three repetitions of each experimental condition). The mean thresholds for five NH listeners are also depicted in each panel for comparison (filled squares; error bars depict the standard error of the mean). The dashed vertical lines depict the estimated filter bandwidth as the 
Experiment III: Estimates of BM I/O Function and Cochlear Compression
Figure 4 depicts the TMC thresholds (on-frequency masker: open symbols; off-frequency masker: filled circles) measured in nine HI listeners, together with the corresponding fits. The measured masking thresholds increased with increasing masker-probe gap, consistent with the TMC data reported in the literature (e.g., Nelson et al., 2001). For most listeners, the sections fitted to the on-frequency TMCs (solid lines) were steeper than the corresponding off-frequency fits (dashed lines), while for other listeners (HI 6, HI 10), the on- and off-frequency fits showed similar slopes. This is consistent with some residual peripheral compression affecting the on-frequency maskers for the former listeners, but not for the latter.
Temporal masking curves (TMCs) for nine HI listeners (HI 7 not measured), together with the corresponding fits. The on- and off-frequency thresholds are depicted with open and filled circles, respectively. The fits to the on-frequency data are shown with a solid line while the single-section fits to the off-frequency data are shown with a dashed line.
Figure 5 depicts the BM I/O functions (solid lines) estimated for the same nine listeners from the TMC fits. The linear reference is indicated by the dashed lines. The portions of the BM I/O functions that are shallower than the linear reference indicate the presence of peripheral compression in a given listener. The BM I/O functions represent the off-frequency TMC threshold on the ordinate (i.e., the BM output level) versus the on-frequency TMC threshold on the abscissa (i.e., the BM input level) for each given masker-probe gap. Thus, as the BM I/O functions were estimated only in the range where both on- and off-frequency TMCs were measured, the obtained BM input-level range differed among listeners (i.e., from 12 dB for HI 3 to 34 dB for HI 1 and HI 4). The individual peripheral compression at 2 kHz was estimated as the inverse of the slope (i.e., the compression ratio, CR, see Table 1) of the fitted section comprising the input stimulus level (depicted by the asterisks in Figure 5). This level was estimated for each listener as the overall level of a HF-filtered complex tone (at
BM I/O functions (solid lines) estimated from the TMCs for nine HI listeners (HI 7 not measured). The dashed line depicts the linear reference, that is, the BM I/O function assuming absent peripheral compression. The asterisks show the estimated levels of a HF-filtered complex tone at 
Effects of Cochlear Compression and Frequency Selectivity on Pitch Discrimination
As influencing factors such as musical training and individual cognitive resources, as well as individual limitations (e.g., neural synchrony, internal noise level) are likely to affect the overall pitch-discrimination performance, the ratio between the RP and SP thresholds ( Mean 
Modeling the Effects of Cochlear Compression and Frequency Selectivity on Envelope Peakiness
The left panels in Figure 7 depict the modulation power of the SP (open symbols) and RP (closed symbols) complex tones, estimated at the output of a peripheral model individually adjusted according to the auditory profiles of the nine HI and the mean of the NH listeners. In the model, three simulations were run to clarify the relative effect of auditory-filter bandwidth and cochlear compression on the envelope representation of unresolved complex tones. In a first simulation (top panels), auditory-filter bandwidth was varied according to the estimates from Experiment II, while cochlear compression was fixed at a common value for NH listeners (CR = 6, Lopez-Poveda et al., 2003). The simulation revealed no effect of filter bandwidth on the modulation power of either the SP or RP signals. In a second simulation (middle panels), cochlear compression was varied according to the estimates from Experiment III, while filter bandwidth was fixed at the value of 325 Hz estimated for NH listeners (Experiment II). Reducing cochlear compression yielded an increase in the modulation power of the SP complex tone, indicating an increase of the envelope peakiness, while hardly affecting the modulation power of the RP complex tones. In fact, since compression is a non-linear operation, it mainly reduces the modulation depth of peaky signals. Thus, a reduction of compression yielded a much larger enhancement of the modulation depth for the SP than for the RP stimuli. In a third simulation (bottom panels), both filter bandwidth and cochlear compression were varied according to the estimates from Experiments II and III, respectively, yielding qualitatively similar results as for the second simulation. 
While filter bandwidth had no effect in the first simulation (i.e., when the CR was fixed at a high value), in the third simulation filter bandwidth had a small but consistent effect, increasing the modulation power by about a factor of 1.2 when the CR was close to 1 (i.e., in the case of a large loss of compression: diamond, star, left-pointing triangle), as a consequence of more harmonic components passing through the filter.
Left panels: Envelope modulation power of a complex tone (
Thus, these results demonstrate that the modulation power of the RP complex tones was low (only slightly above 1, which would imply a flat envelope) and almost independent of both filter bandwidth (top left panel in Figure 7) and compression (middle left panel in Figure 7). In contrast, the modulation power of the SP complex tone increased with increasing loss of compression (almost perfectly linear increase, middle left panel) and, to a minor extent, when increasing filter bandwidth (only at CRs close to 1). Thus, the envelope peakiness of the SP complex tone was increased as compared with the RP envelope up to a factor of 3, mostly as a result of reduced compression.
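The qualitative pattern described above can be reproduced with a deliberately simplified sketch. It is not the peripheral model used in the study. It assumes equal-amplitude harmonics 15 to 35 (an illustrative choice), computes the envelope as the magnitude of the sum of complex exponentials over one period of the fundamental, applies a power-law compression with exponent 1/CR directly to the envelope, and uses the ratio mean(e^2)/mean(e)^2 as a hypothetical stand-in for modulation power. This ratio equals 1 for a flat envelope.

```python
import cmath
import math
import random


def envelope(harmonics, phases, n_points=2048):
    """Envelope over one F0 period: magnitude of the analytic signal."""
    env = []
    for i in range(n_points):
        x = 2.0 * math.pi * i / n_points  # phase of the fundamental
        z = sum(cmath.exp(1j * (h * x + p)) for h, p in zip(harmonics, phases))
        env.append(abs(z))
    return env


def compress(env, cr):
    """Power-law compression with ratio cr (assumption: applied to the envelope)."""
    return [e ** (1.0 / cr) for e in env]


def normalized_power(env):
    """mean(e^2) / mean(e)^2; equals 1 for a perfectly flat envelope."""
    mean = sum(env) / len(env)
    return (sum(e * e for e in env) / len(env)) / (mean * mean)


if __name__ == "__main__":
    harmonics = list(range(15, 36))           # illustrative harmonic numbers
    sine_phases = [0.0] * len(harmonics)
    rng = random.Random(0)
    random_phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in harmonics]
    for cr in (1.0, 6.0):                      # linear vs. normal-like compression
        sp = normalized_power(compress(envelope(harmonics, sine_phases), cr))
        rp = normalized_power(compress(envelope(harmonics, random_phases), cr))
        print(f"CR={cr}: SP={sp:.2f}  RP={rp:.2f}  SP/RP={sp / rp:.2f}")
```

In this sketch, removing compression (CR of 1) raises the metric for the SP envelope far more than for the RP envelope, which mirrors the direction of the simulated effect without reproducing its magnitude.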
This envelope enhancement was estimated as the ratio of the modulation power for the SP complex versus the RP complex (
Discussion
Relation Between Behavioral Results and Envelope Representation
The hypothesis of the current study was that if the envelope representation is enhanced for listeners with SNHL (Henry et al., 2014; Kale & Heinz, 2010), pitch cues for unresolved complex tones should also be enhanced if one assumes an envelope coding mechanism for pitch extraction of unresolved harmonics. The pitch-discrimination thresholds measured in the present study (Experiment I) revealed that the HI listeners performed worse than the NH listeners for the RP unresolved conditions (gray-shaded area on middle panels in Figure 2). However, the performance of the HI listeners was similar to that of the NH listeners when the harmonics were added in SP (gray-shaded area on left panels in Figure 2). This finding is in agreement with previous studies showing similar performance of the HI and NH listeners for pitch discrimination of complex tones with unresolved harmonics (Arehart, 1994; Bernstein & Oxenham, 2006b) and with stronger phase effects for the HI than for the NH listeners (e.g., Bernstein & Oxenham, 2006b; Moore & Carlyon, 2005; Moore & Peters, 1992). In fact, in the presence of a peaky envelope (SP condition), the pitch-discrimination performance of NH listeners improved, on average, by a factor of 1.3 relative to the RP condition (for the two unresolved conditions), while the performance of the HI listeners improved, on average, by a factor of 2.6. Thus, although the overall performance of the HI listeners was not better than that of the NH listeners, these findings suggest that, in terms of pitch discrimination, HI listeners benefited more than NH listeners from a peaky signal relative to a signal with a flatter envelope. Hence, the behavioral findings of Experiment I do not rule out an enhanced envelope representation following SNHL.
In fact, an envelope enhancement at the output of peripheral stages of the auditory system might be counteracted by other factors limiting the behavioral performance of the HI listeners (e.g., disrupted temporal fine-structure cues, degradation of auditory-nerve coding, higher internal noise level, age-related cognitive deficits). In agreement with this hypothesis, the results of Experiment II revealed significantly lower (better) modulation detection thresholds for the HI listeners (up to 100 Hz) as compared with NH listeners, consistent with previous findings (Moore & Glasberg, 2001; Moore et al., 1996). Thus, when amplitude-modulation detection is based on temporal envelope cues (i.e., when the sidebands are not resolved), the HI listeners showed a higher sensitivity in detecting amplitude modulations imposed on a sinusoidal carrier as compared with NH listeners.
While the larger benefit of HI listeners in pitch-discrimination performance for the SP relative to the RP condition might be a consequence of more harmonics being processed within broader than normal auditory filters, the lower thresholds obtained in Experiment II for HI listeners cannot be explained by the larger number of harmonics within the same auditory filter. In fact, since the sinusoidally amplitude-modulated tones of Experiment II contained only three frequency components (
F0DL Ratio and Individual Measures of Cochlear Compression and Filter Bandwidth
To quantify the changes in the internal envelope representation, the increase in pitch-discrimination performance for the SP condition relative to the RP condition (
Figure 8 depicts the correlation between the estimates of auditory-filter bandwidth and cochlear compression obtained from Experiments II and III, respectively. Although not significant, there was a trend of increasing bandwidth with increasing loss of compression (
Correlation between the estimated auditory-filter bandwidth and loss of cochlear compression across the nine HI listeners (open symbols) that participated in both Experiments II and III.
Modeling Results and Envelope Enhancement
Although auditory-filter bandwidth and cochlear compression are physiologically linked, they may have different effects on the envelope at the output of the auditory filters. Therefore, a simplified peripheral model that considers auditory-filter bandwidth and cochlear compression as independent factors was used to qualitatively describe the relative effect of one factor versus the other on the envelope representation of the unresolved complex tones.
The modulation power of a complex tone at the output of the model was used as an indicator of the salience of temporal envelope cues for pitch discrimination of unresolved complexes. The assumption was that the higher the modulation power (i.e., the peakier the envelope), the greater the salience of temporal pitch cues. Thus, a higher modulation power would correspond to an improved performance in pitch discrimination (i.e., a lower behavioral threshold). The simulation outcomes revealed that reducing cochlear compression and, to a minor extent, increasing the filter bandwidth led to an increase in the modulation power for the unresolved SP complex tone, with reduction of compression clearly being the dominant factor (left panels in Figure 7). In contrast, the modulation power for the RP complex did not vary with either reducing compression or increasing filter bandwidth. Thus, the modeling outcomes suggest that the envelope cues for a RP complex tone may be similar for HI and NH listeners at the output of peripheral stages of the auditory system (provided that audibility is compensated for). Assuming similar processes for NH and HI listeners after the cochlear stages and assuming a temporal-envelope pitch coding mechanism for unresolved complex tones, one would predict similar performance for the RP condition in listeners with SNHL as compared with NH listeners. However, the behavioral performance of the HI listeners for the RP condition was, on average, worse than for NH listeners. This finding suggests that individual factors other than outer-hair-cell damage might limit the performance of the HI listeners for both SP and RP conditions (e.g., disrupted temporal fine-structure cues, degradation of auditory-nerve coding, internal noise, age-related cognitive deficits).
Thus, a possible enhancement of envelope cues following SNHL cannot be revealed based on a comparison of pitch-discrimination thresholds in HI versus NH listeners, but rather on a comparison between SP versus RP thresholds, whereby the RP thresholds represent the baseline condition in each listener.
The ratio between the modulation power (
AM-Detection as a Measure of Frequency Selectivity
The lack of correlation between the
Conclusion
Overall, the results of the pitch-discrimination experiment revealed that the performance of the HI listeners was, on average, similar to that of the NH listeners for the SP unresolved complex tones, and worse for the RP complexes. Thus, the increase in performance for the SP condition relative to its RP counterpart (
Footnotes
Acknowledgments
The authors would like to thank Andrew Oxenham and the two anonymous reviewers for their constructive feedback.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Technical University of Denmark.
