The auditory peripheral system filters broadband sounds into narrowband waves and decomposes narrowband waves into quickly varying temporal fine structures (TFSs) and slowly varying envelopes. When a noise is presented binaurally (with the interaural correlation being 1), human listeners can detect a transient break in interaural correlation (BIC), which does not alter monaural inputs substantially. The central correlates of BIC are unknown. This study examined whether phase locking-based frequency-following responses (FFRs) of neuron populations in the rat auditory midbrain [inferior colliculus (IC)] to interaurally correlated steady-state narrowband noises are modulated by introduction of a BIC. The results showed that the noise-induced FFR exhibited both a TFS component (FFRTFS) and an envelope component (FFREnv), signaling the center frequency and bandwidth, respectively. Introduction of either a BIC or an interaurally correlated amplitude gap (which had the summated amplitude matched to the BIC) significantly reduced both FFRTFS and FFREnv. However, the BIC-induced FFRTFS reduction and FFREnv reduction were not correlated with the amplitude gap-induced FFRTFS reduction and FFREnv reduction, respectively. Thus, although introduction of a BIC does not affect monaural inputs, it causes a temporary reduction in sustained responses of IC neuron populations to the noise. This BIC-induced FFR reduction is not based on a simple linear summation of noise signals.
- frequency-following responses
- inferior colliculus
- interaural correlation
- temporal fine structure
Interaural correlation (IAC) is defined as the similarity of sound waves presented at the two ears (Jeffress et al. 1962). IAC-based binaural processing plays a critical role in both sound localization (Coffey et al. 2006; Franken et al. 2014; Soeta and Nakagawa 2006) and target-object detection in noisy environments (Durlach et al. 1986; Palmer et al. 1999). IAC also affects auditory perception. For example, when the IAC drops from 1 to 0, without affecting the spectra of monaural inputs, the auditory image of the simultaneously arriving binaural sounds changes vividly from a single image located at the head center into two separated images, one at each ear (Blauert and Lindemann 1986; Culling et al. 2001). Accordingly, human listeners with normal hearing can easily detect an interaurally uncorrelated fragment embedded in interaurally correlated noise (Akeroyd and Summerfield 1999; Boehnke et al. 2002; Huang et al. 2008, 2009a, 2009b; Kong et al. 2012, 2015; Li et al. 2009, 2013), i.e., a transient change of IAC from 1 to 0, then back to 1 [so-called “break in interaural correlation” (BIC)]. Note that introduction of a BIC does not significantly alter monaural inputs. Until now, the neural correlates of the BIC in the central auditory system have not been reported in the literature.
The peripheral auditory system not only band-pass filters broadband sounds into a series of narrowband waves distributed in an orderly manner along the basilar membrane but also decomposes narrowband waves into both quickly varying temporal fine structures (TFSs) and slowly varying envelopes (Moore 2008; Rosen 1992). These two temporal components are subsequently represented by temporal firing patterns of the auditory nerves (Johnson 1980; Joris and Yin 1992; Young and Sachs 1979). Although the neural representation of a BIC in the central auditory system may contain TFS and envelope components, listeners do not perceive the BIC as separate TFS and envelope percepts.
Both scalp-recorded and intracranially recorded frequency-following responses (FFRs) are sustained neuroelectrical potentials based on precisely phase-locked responses of neuron populations to instantaneous waveforms of low- to middle-frequency acoustic stimuli (Chandrasekaran and Kraus 2010; Du et al. 2009a, 2009b, 2011, 2012; Marsh and Worden 1969; Moushegian et al. 1973; Ping et al. 2008; Weinberger et al. 1970; Worden and Marsh 1968). FFRs can efficiently convey both TFS information (e.g., Chandrasekaran and Kraus 2010; Du et al. 2011; Galbraith 1994; Krishnan 2002; Krishnan and Gandour 2009; Russo et al. 2004) and envelope information (also called envelope-following response) (e.g., Aiken and Picton 2006, 2008; Dolphin and Mountain 1992, 1993; Hall 1979; Shinn-Cunningham et al. 2013; Supin and Popov 1995; Zhu et al. 2013). FFRs start to occur in the auditory nerve (Dau 2003) and can be intracranially recorded in both the lower auditory brain stem structures (Kuokkanen et al. 2010; Wagner et al. 2005, 2009) and the auditory midbrain, the inferior colliculus (IC) (Du et al. 2009b; Ping et al. 2008). In humans, the origin of scalp-recorded FFRs has been widely considered to be the IC (e.g., Chandrasekaran and Kraus 2010; Marsh 1974; Smith et al. 1975; Sohmer et al. 1977; Weinberger et al. 1970).
The IC is the end point at which inputs from lower auditory brain stem structures converge and IAC signals are processed (Palmer et al. 1999; Shackleton et al. 2005; Shackleton and Palmer 2006; Yin et al. 1987). It is also considered the critical generator for human scalp-recorded FFRs (Chandrasekaran and Kraus 2010; Marsh 1974; Smith et al. 1975; Sohmer et al. 1977; Weinberger et al. 1970). This study investigated the following four issues with rats as the mammalian model: 1) in the IC, whether a narrowband noise can evoke local-field FFRs that contain the TFS component (FFRTFS) and the envelope component (FFREnv); 2) whether the FFRTFS and/or FFREnv to interaurally correlated noises are affected by introduction of a BIC; 3) whether the BIC-evoked change in FFRTFS contributes to the neural BIC detection differently from that in FFREnv; and 4) whether the binaural integration of IC neuron populations for neural detection of a BIC is based on a simple linear summation (i.e., cross-correlation) of noise signals from the two ears.
MATERIALS AND METHODS
Eight young adult male Sprague-Dawley rats (age 10–12 wk, weight 280–350 g) were purchased from the Vital River Experimental Animal Company. They were anesthetized with 10% chloral hydrate (400 mg/kg ip), and the state of anesthesia was maintained throughout the experiment by supplemental injection of the same anesthetic. Stainless steel recording electrodes (10–20 kΩ) insulated by a silicon tube (0.3 mm in diameter) except at the 0.25-mm-diameter tip (Du et al. 2009b; Ping et al. 2008) were aimed at the central nucleus of the IC bilaterally. Based on the stereotaxic coordinates of Paxinos and Watson (1997) and referenced to bregma, the coordinates of the aimed IC site were AP, −8.8 mm; ML, ±1.5 mm; DV, −4.5 to −5.0 mm. Two electrodes were inserted per animal, one on each side of the IC.
Rats used in this study were treated in accordance with the Guidelines of the Beijing Laboratory Animal Center and the Policies on the Use of Animals and Humans in Research approved by the Society for Neuroscience (2006). The experimental procedures were also approved by the Committee for Protecting Human and Animal Subjects in the Department of Psychology at Peking University.
Apparatus and stimuli.
All sound waves were processed by a TDT System II (Tucker-Davis Technologies) and presented through two ED1 earphones. Two 12-cm TDT sound-delivery rubber tubes were connected to the ED1 earphones and inserted into each of the rat's ear canals for sound delivery. All narrowband noises were calibrated with a Larson Davis Audiometer Calibration and Electroacoustic Testing System (AUDit and System 824, Larson Davis). The sound pressure level (SPL) of all signals was 72 dB for each earphone.
Gaussian wideband noises (10-kHz sampling rate and 16-bit amplitude quantization) were generated and filtered by a 512-point digital filter with a center frequency of 2,000 Hz and a bandwidth of 0.466 octaves with MATLAB (MathWorks, Natick, MA). The stimulus duration was 900 ms with 10-ms linear onset/offset ramps, and the (offset-onset) interstimulus interval was 100 ms.
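As a worked check on these filter parameters, the following minimal sketch converts the 2,000-Hz center frequency and the 0.466-octave bandwidth into band edges. It assumes that the center frequency is the arithmetic midpoint of the passband; the text does not state whether an arithmetic or a geometric center was used. Under that assumption the passband is about 640 Hz wide, which agrees with the envelope range of 0 to 640 Hz given for this noise in the results. The function name `band_edges` is illustrative and is not taken from the authors' MATLAB code.

```python
def band_edges(center_hz, bandwidth_octaves):
    """Return (low, high) cutoffs in Hz.

    Assumption: center_hz is the arithmetic midpoint of the band,
    and high / low = 2 ** bandwidth_octaves.
    """
    ratio = 2.0 ** bandwidth_octaves       # high / low
    low = 2.0 * center_hz / (1.0 + ratio)  # from (low + high) / 2 = center
    high = ratio * low
    return low, high


low_hz, high_hz = band_edges(2000.0, 0.466)
bandwidth_hz = high_hz - low_hz
print(f"low = {low_hz:.1f} Hz, high = {high_hz:.1f} Hz, width = {bandwidth_hz:.1f} Hz")
```

If the geometric center were assumed instead, the same octave width would give a slightly wider band, so the arithmetic reading is the one consistent with the 640-Hz figure.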
Under the baseline-stimulation condition that occurred before and after the occurrence of either the BIC or the interaurally correlated amplitude gap (Corgap), the interaurally correlated noises (IAC = 1) were presented for the total duration of 900 ms. Under the BIC-stimulation condition, a 200-ms uncorrelated noise fragment (IAC = −0.046) was substituted into the temporal middle of the noise (i.e., from 350 to 550 ms from the noise onset) with no interaural delays. Note that mathematically the amplitude of the linear summation of two uncorrelated noises is smaller than that of two correlated noises (Fig. 1). Thus if the central binaural integration follows the simple theoretical summation, the magnitude of neural signals under the BIC-stimulation condition should be smaller than that under the baseline-stimulation condition (Fig. 1B, left).
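The amplitude argument above can be illustrated numerically. The sketch below is not the authors' stimulus code: it uses unfiltered Gaussian samples as a stand-in for the narrowband noise, and the sample count and seed are arbitrary. It compares the root-mean-square (RMS) amplitude of the left-plus-right sum when the two ears receive identical noise with the RMS when they receive independent noise. For equal-power independent noises the ratio is expected to be close to 1/√2.

```python
import math
import random


def rms(samples):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(v * v for v in samples) / len(samples))


rng = random.Random(0)  # fixed seed so the sketch is repeatable
n = 20000               # arbitrary number of samples (assumption)

left = [rng.gauss(0.0, 1.0) for _ in range(n)]
right_uncorrelated = [rng.gauss(0.0, 1.0) for _ in range(n)]

# Correlated case (IAC = 1): the right ear receives the same waveform as the left.
sum_correlated = [2.0 * v for v in left]
# Uncorrelated case (IAC near 0): the right ear receives an independent noise.
sum_uncorrelated = [a + b for a, b in zip(left, right_uncorrelated)]

ratio = rms(sum_uncorrelated) / rms(sum_correlated)
print(f"uncorrelated / correlated RMS ratio: {ratio:.3f}")
```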
Since the linear summation of binaural signals under the BIC-stimulation condition leads to an amplitude reduction (Fig. 1A), the Corgap-stimulation condition was introduced as the stimulation control condition. Under the Corgap-stimulation condition, the two monaurally presented noises were identical (correlated), but their amplitudes were equal to 50% of the left-right summated signal amplitude under the BIC-stimulation condition. In other words, the linearly summated left ear and right ear signals under the BIC-stimulation condition and those under the Corgap-stimulation condition are identical (Fig. 1). The BIC and Corgap were distinguished in the value of the IAC coefficient (during the fragment period from 350 to 550 ms after the sound onset). Note that monaurally the intensity of the Corgap-stimulation condition was reduced during the fragment period compared with the pre- and postfragment periods but the monaural intensity under the BIC-stimulation condition was not reduced.
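One reading of this control condition is sketched below. It assumes that, during the fragment, each ear in the Corgap condition receives half of the summed left and right BIC waveforms. This is an interpretation of the description above, not the authors' code, and unfiltered Gaussian samples again stand in for the narrowband noise. Under this construction the binaural sums of the two conditions are identical, the Corgap fragment remains fully correlated across the ears, and its monaural level is lower than that of the BIC fragment.

```python
import math
import random


def rms(samples):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(v * v for v in samples) / len(samples))


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(1)
n = 2000  # arbitrary fragment length in samples (assumption)

# BIC fragment: independent noises at the two ears.
left_bic = [rng.gauss(0.0, 1.0) for _ in range(n)]
right_bic = [rng.gauss(0.0, 1.0) for _ in range(n)]

# Corgap fragment (assumed construction): the same waveform at both ears,
# equal to half of the summed BIC waveforms.
corgap = [0.5 * (a + b) for a, b in zip(left_bic, right_bic)]
left_corgap, right_corgap = corgap, corgap

sum_bic = [a + b for a, b in zip(left_bic, right_bic)]
sum_corgap = [a + b for a, b in zip(left_corgap, right_corgap)]

print(f"IAC during BIC fragment:    {pearson(left_bic, right_bic):+.3f}")
print(f"IAC during Corgap fragment: {pearson(left_corgap, right_corgap):+.3f}")
print(f"monaural RMS, BIC vs Corgap: {rms(left_bic):.3f} vs {rms(left_corgap):.3f}")
```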
Evoked neural potentials were recorded in a sound-attenuating chamber, amplified 1,000 times by a TDT DB4 amplifier, filtered through a 100- to 10,000-Hz band-pass filter (with a 50-Hz notch), and averaged 100 times per stimulation condition. Online recordings were processed with TDT Biosig software, digitized at 16 kHz, and stored on a disk for off-line analyses. For a given stimulation condition, the same stimuli were used for every animal. In addition, neither the prefragment nor the postfragment noise section differed across stimulation conditions.
Theoretically, a steady-state Gaussian narrowband noise with a center frequency of c Hz and a bandwidth of b Hz has a TFS energy around c Hz and an envelope energy within the frequency range between 0 and b Hz (Longtin et al. 2008). Thus for a narrowband noise with bandwidth b, the TFS energy distributes from the low-cut (flc) to the high-cut (fhc) frequencies, and the fhc is below the frequency b. The normalized amplitude of FFRTFS can be calculated by the following function: (1)
The normalized amplitude of FFREnv can be calculated by the following function: (2) where the denominator represents the level of noise floor ranging from 2 to 5,000 Hz while the numerator represents the spectral region of interest. The FFRTFS and FFREnv components were extracted, and their normalized amplitudes were calculated with functions 1 and 2.
To estimate the neural detection of the BIC fragment and that of the Corgap fragment, responses in each of the three 200-ms periods were separately processed: prefragment (100–300 ms after noise onset), fragment (350–550 ms), and postfragment (600–800 ms). Furthermore, the (neural) fragment detection index (FDI) was defined as the relative difference between the amplitude of the fragment (BIC or Corgap) and the average of prefragment amplitude and postfragment amplitude (normalized against the average of pre- and postfragment amplitudes).
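The FDI definition can be written out as a short sketch. The sign convention is an assumption: the index is taken here as positive when the fragment amplitude falls below the pre/post average, which matches the later interpretation that a larger FDI indicates a larger reduction. The function name and the example amplitudes are hypothetical.

```python
def fragment_detection_index(pre, fragment, post):
    """Relative amplitude change during the fragment.

    Assumed form: (baseline - fragment) / baseline, where baseline is the
    mean of the prefragment and postfragment normalized amplitudes.
    """
    baseline = 0.5 * (pre + post)
    if baseline == 0:
        raise ValueError("baseline amplitude must be nonzero")
    return (baseline - fragment) / baseline


# Hypothetical normalized amplitudes, for illustration only.
fdi = fragment_detection_index(pre=2.0, fragment=1.5, post=2.0)
print(f"FDI = {fdi:.2f}")
```

Normalizing by the pre/post average makes the index dimensionless, so FDIs can be compared between FFRTFS and FFREnv even though the two components differ in absolute amplitude.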
Statistical analyses were performed with IBM SPSS Statistics 20 (SPSS, Chicago, IL). Within-subjects, repeated-measures analyses of variance (ANOVAs), t-tests, and Pearson correlation were conducted to examine differences between stimulation conditions or correlation between responses. The null hypothesis rejection level was set at 0.05.
When all recordings were completed, rats were euthanized with an overdose of chloral hydrate. Lesion marks were made via the recording electrodes with an anodal DC current (500 μA for 10 s). The brains were stored in 10% formalin with 30% sucrose and then sectioned at 55 μm in the frontal plane in a cryostat (−20°C). Sections were examined to determine locations of recording electrodes.
Histological results and response latencies.
According to the histological examination, all 16 electrodes were located precisely within the central nucleus of IC in all rats (Fig. 2A). Each of the electrodes was used in experimental recordings. The response latency to the noise stimulus onset was examined by cross-correlation analyses of the best delay between the noise stimulus waveform and the evoked neural response waveform (Burkard 1991; Dobie and Wilson 1984). The best delay, at which the stimulus-response correlation reached the maximum, ranged from 5.8 to 6.1 ms with a mean of 6.0 ms, consistent with the results reported by previous studies (Du et al. 2011; Ping et al. 2008).
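The best-delay estimate described above can be sketched as follows. This is a minimal illustration, not the analysis code used in the study. It assumes that stimulus and response are on a common 16-kHz time base, uses unfiltered Gaussian samples in place of the narrowband noise, and builds a synthetic "response" as a delayed, noisy copy of the stimulus. The 96-sample delay (6.0 ms at 16 kHz) is a hypothetical value chosen to resemble the reported latency.

```python
import random


def best_delay(stimulus, response, max_lag):
    """Return the lag (in samples) at which the stimulus-response
    cross-correlation is largest, searching lags 0..max_lag."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(max_lag + 1):
        n = len(stimulus) - lag
        # Mean product of the stimulus and the response shifted by `lag`.
        score = sum(stimulus[i] * response[i + lag] for i in range(n)) / n
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


fs_hz = 16000   # assumed common sampling rate for stimulus and response
true_lag = 96   # hypothetical delay: 96 samples = 6.0 ms at 16 kHz

rng = random.Random(2)
stimulus = [rng.gauss(0.0, 1.0) for _ in range(1600)]

# Synthetic response: delayed stimulus plus independent background noise.
delayed = [0.0] * true_lag + stimulus[: len(stimulus) - true_lag]
response = [d + 0.5 * rng.gauss(0.0, 1.0) for d in delayed]

lag = best_delay(stimulus, response, max_lag=200)
latency_ms = 1000.0 * lag / fs_hz
print(f"best delay: {lag} samples = {latency_ms:.1f} ms")
```

With a real narrowband stimulus the cross-correlation function oscillates at roughly the center frequency, so in practice the search window would need to be restricted to a physiologically plausible range of lags.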
To estimate whether binaural sluggishness exists in IC FFRs, the latency of the IC FFR to the BIC was compared with that to the Corgap by cross-correlation analyses of the best delay between the fragment waveform (350–550 ms after sound onset) and the evoked neural response waveform. A paired t-test showed no significant difference between the latency for the BIC (mean = 5.99 ms, SD = 0.35 ms) and that for the Corgap (mean = 6.01 ms, SD = 0.39 ms) [t(13) = −0.034, P = 0.974]. The results suggest that there was no binaural sluggishness at the midbrain level (also see Fitzpatrick et al. 2009).
Effects of BIC or Corgap on FFRTFS and FFREnv.
The results of this study clearly showed that narrowband noises could evoke FFRs containing both the FFRTFS and FFREnv components under each of the stimulation conditions (baseline, BIC, and Corgap; see Fig. 2B for examples of BIC- and Corgap-stimulation conditions).
The narrowband noise used in this study had a TFS energy around 2,000 Hz (center frequency) and an envelope energy within the frequency range between 0 and 640 Hz (bandwidth). As shown in the example in Fig. 2B, bottom, the spectra of the fragment-induced FFRTFS and FFREnv were similar to those of the stimulus TFS and envelope, respectively (see Longtin et al. 2008).
To examine how faithful the FFRTFS and FFREnv were in representing acoustic features of the noise stimulus, the significance of the stimulus-to-response (S-R) correlation for the prefragment noise section (100–300 ms after sound onset) was examined at each recording site with Pearson correlation tests. The results showed that the S-R correlation between the noise stimulus TFS and the IC FFRTFS was significant (for all recording sites, P < 0.05); the S-R correlation between the noise envelope and the IC FFREnv was also significant (for all recording sites, P < 0.001).
To examine whether the BIC and Corgap fragments affected IC FFRs, normalized amplitudes of FFRTFS and FFREnv in the three periods (prefragment, fragment, and postfragment) were calculated separately. Figure 3A shows that both FFRTFS and FFREnv decreased when either the BIC or the Corgap occurred. For the FFRTFS, a 2 × 3 (stimulation condition: BIC, Corgap; response period: prefragment, fragment, and postfragment) two-way repeated-measures ANOVA showed that both the main effect of stimulation condition (F1,15 = 8.889, P = 0.009, partial η2 = 0.372) and the main effect of response period (F1,15 = 22.249, P < 0.001, partial η2 = 0.597) were significant but the interaction effect was not significant (F2,30 = 0.606, P = 0.552, partial η2 = 0.039). Post hoc tests confirmed that the amplitude of FFRTFS during the BIC was significantly lower than that during the Corgap (P = 0.031, with Bonferroni adjustment).
For FFREnv, a two-way repeated-measures ANOVA showed that both the main effect of stimulation condition (F1,15 = 5.563, P = 0.032, partial η2 = 0.271) and the main effect of response period (F1,15 = 17.629, P < 0.001, partial η2 = 0.540) were significant but the interaction between the two factors was not significant (F2,30 = 2.122, P = 0.137, partial η2 = 0.124). Post hoc tests confirmed that the amplitude of FFREnv during the BIC was significantly lower than that during the Corgap (P = 0.040, with Bonferroni adjustment).
Post hoc tests also showed no significant differences between the pre- and postfragment periods under either stimulation condition (all P > 0.05, with Bonferroni adjustment). Thus the normalized amplitudes of the pre- and postfragment periods were averaged in the following analyses of the fragment effects.
Correlations between fragment detection indexes.
The FDI was introduced as the relative amplitude difference between FFRs during the fragment and the average of pre- and postfragment FFRs (for details see materials and methods). To test whether the FDI under the BIC-stimulation condition and that under the Corgap-stimulation condition shared a common neural mechanism, Pearson correlation tests for FDI between the two conditions were conducted for FFRTFS and FFREnv, separately. As shown in Fig. 3B, no significant correlation was found between the two stimulation conditions for either FFRTFS or FFREnv.
To compare the effect of introduction of a BIC or a Corgap on the FFRTFS and on the FFREnv, the FFRTFS-FFREnv FDI matrix was examined, in which FFRTFS was presented on the y-axis and FFREnv was presented on the x-axis (Fig. 4). As shown in Fig. 4, the majority of the BIC FFRTFS FDIs were larger than the BIC FFREnv FDIs (most filled circles are above the diagonal). However, this pattern was not present for Corgap FDIs.
This study showed that a steady-state narrowband noise can elicit remarkable FFRs in the auditory midbrain IC, which is the end point integrating inputs from lower auditory brain stem structures for binaural processing (Li and Kelly 1992; Palmer et al. 1999; Shackleton et al. 2005; Shackleton and Palmer 2006; Yin et al. 1987). Since FFRs are based on precisely phase-locked responses of neuron populations to instantaneous waveforms of acoustic stimuli (Chandrasekaran and Kraus 2010; Du et al. 2009a, 2009b, 2011, 2012; Marsh and Worden 1969; Moushegian et al. 1973; Ping et al. 2008; Weinberger et al. 1970; Worden and Marsh 1968), narrowband noises are useful for investigating phase locking-based neural mechanisms underlying binaural integration.
Moreover, the noise-evoked FFRs exhibit two temporal components: the fast-varying FFRTFS signaling the center frequency and the slow-varying FFREnv signaling the bandwidth. Thus the FFRTFS and FFREnv precisely represent the spectral features of a narrowband noise. The results support the concept that FFRs efficiently convey both TFS information (e.g., Chandrasekaran and Kraus 2010; Du et al. 2011; Galbraith 1994; Krishnan 2002; Krishnan and Gandour 2009; Russo et al. 2004) and envelope information (also called envelope-following response) (e.g., Aiken and Picton 2006, 2008; Dolphin and Mountain 1992, 1993; Hall 1979; Shinn-Cunningham et al. 2013; Supin and Popov 1995; Zhu et al. 2013).
More importantly, this study for the first time provides evidence showing that introduction of a BIC reduces both the FFRTFS and FFREnv. Since introducing a BIC does not substantially change monaural inputs, the FFR reduction must be based on binaural interactions, which have been demonstrated previously (Du et al. 2009b). The BIC-induced FFR reduction may be the neural correlate underlying perceptual detection of the BIC (Akeroyd and Summerfield 1999; Boehnke et al. 2002; Huang et al. 2008, 2009a, 2009b; Kong et al. 2012, 2015; Li et al. 2009, 2013).
When the FDI is used to estimate the degree of FFR changes caused by introducing a fragment (BIC or Corgap), the BIC-induced FDI for FFRTFS is larger than that for FFREnv, indicating that introduction of a BIC causes more reduction in FFRTFS than in FFREnv. The Boehnke et al. (2002) study showed that the envelope information is not as important as the TFS information in determining detection of the BIC. Clearly, further perceptual work is needed to verify whether the processing of FFRTFS contributes more to BIC detection than the processing of FFREnv. However, the Corgap-induced FDI for FFRTFS is not significantly different from that for FFREnv. Since IAC-based binaural processing plays a role in both sound localization (Coffey et al. 2006; Franken et al. 2014; Soeta and Nakagawa 2006) and target-object detection against masking (Durlach et al. 1986; Palmer et al. 1999), further perceptual work is also needed to verify whether FFRTFS signals are more involved in sound localization and target unmasking than FFREnv signals. Smith et al. (2002) have suggested that TFS signals and envelope signals are most important for pitch/location perception and speech recognition, respectively. It is of interest to know whether this functional dichotomy between TFS and envelope is associated with certain differences in sensitivity to the BIC between FFRTFS and FFREnv.
The IC is the end point converging inputs from lower auditory brain stem structures (Palmer et al. 1999; Shackleton et al. 2005; Shackleton and Palmer 2006; Yin et al. 1987). Previous studies have suggested that binaural integration occurs in the IC (Du et al. 2009b; Kelly and Li 1997; Li and Kelly 1992). Does the IAC-based binaural integration follow a simple linear summation (cross-correlation) function? The results of this study indicate that for either FFRTFS or FFREnv the BIC-induced FDI is independent of the Corgap-induced FDI. Thus the BIC-induced changes in FFRs cannot be explained by a simple signal input reduction.
In the IC, a narrowband noise can efficiently induce FFRs that contain both the FFRTFS and FFREnv components, signaling the center frequency and bandwidth, respectively. Introduction of a BIC reduces both FFRTFS and FFREnv, and the FFR reductions cannot be explained by a simple reduction in linear summation of signal inputs from the two ears.
This work was supported by the National Natural Science Foundation of China (31470987) and the “985” Project of Peking University.
No conflicts of interest, financial or otherwise, are declared by the author(s).
Author contributions: Q.W. and L.L. conception and design of research; Q.W. performed experiments; Q.W. analyzed data; Q.W. and L.L. interpreted results of experiments; Q.W. prepared figures; Q.W. and L.L. drafted manuscript; Q.W. and L.L. edited and revised manuscript; Q.W. and L.L. approved final version of manuscript.
- Copyright © 2015 the American Physiological Society