1Ear Institute and 2Department of Neuroscience, Physiology and Pharmacology, University College London, London, United Kingdom
Submitted 17 October 2007; accepted in final form 21 August 2008
| ABSTRACT |
|---|
| INTRODUCTION |
|---|
Despite the impressive ability of the peripheral auditory system to phase-lock to frequencies as high as 2 kHz (Johnson 1980; Joris et al. 1994), there is a limit to the frequencies that can be represented in this manner. Although auditory nerve fibers with high characteristic frequencies (CFs) cannot follow the carrier of high-frequency signals, they do phase-lock to the envelope of amplitude-modulated signals (Joris et al. 2004). These envelope-locked spikes can serve as a substrate for coincidence detection in the medial superior olive (MSO), and high-CF neurons can be as sensitive to envelope interaural time differences (ITDs) as low-CF neurons are to carrier ITDs (Batra et al. 1993; Griffin et al. 2005).
This suggests that carrier ITDs are used for localization at low frequencies, with envelope ITDs of primary relevance at higher frequencies. Here we demonstrate an envelope-sensitive component to the ITD sensitivity of low-CF (<1.5 kHz) neurons, in two midbrain nuclei that receive direct input from the MSO: the inferior colliculus (IC) and the dorsal nucleus of the lateral lemniscus (DNLL). This envelope sensitivity is responsible for asymmetry in the ITD-tuning curves in both IC and DNLL, with the strongest effect over the range of ITDs experienced under natural listening conditions (the range created by the size of the head). The presence of this envelope-sensitive component provides a mechanism by which envelope structure can influence sound localization at low frequencies (Bernstein and Trahiotis 1985).
| METHODS |
|---|
All experiments were carried out in accordance with the Animal (Scientific Procedures) Act 1986 of Great Britain and Northern Ireland. Young-adult guinea pigs (Cavia porcellus), with body masses ranging from 0.3 to 0.7 kg, were anesthetized with an intraperitoneal (ip) injection of urethane (1.0 g kg–1, 25% solution); analgesia was induced with a 0.1-ml intramuscular injection of Hypnorm (fentanyl citrate/fluanisone; Janssen–Cilag, High Wycombe, UK). Anesthesia was monitored throughout the experiment via the withdrawal reflex and was maintained with supplementary 0.1-ml doses of Hypnorm as required. A 0.1-ml subcutaneous injection of atropine sulfate (0.6 mg ml–1; Animalcare, York, UK) was used to reduce bronchial secretions and local anesthesia at surgical sites was produced using subcutaneous injections of lidocaine (2% solution). On completion of an experiment, animals were killed by a 1- to 2-ml ip injection of sodium pentobarbital (60 mg ml–1, Pentoject, Animalcare).
A tracheotomy was performed to maintain airway patency and a homeothermic blanket was used to maintain body temperature at 36°C. The animal was positioned in a stereotaxic apparatus housed in a sound-attenuating chamber (IAC, Winchester, UK). Custom-made ear bars allowed positioning of the animal within the restraint so that the tympanic membranes could be clearly visualized. The skull was leveled stereotaxically and middle ear pressure was maintained throughout the experiment by ventilation through small cannulae sealed into the bullae with petroleum jelly. A craniotomy above the location of the right IC or the right DNLL was performed and the dura overlying the cortex was removed. A warm 4% agar solution was placed over the craniotomy and allowed to cool; this reduced brain pulsation and protected the cortical surface from desiccation.
Single-neuron recordings
An electrode was stereotaxically positioned in the brain above the site of either the right IC or the right DNLL. The DNLL and IC were located using stereotaxic coordinates that had previously been histologically verified to be accurate (McAlpine 2004; Medvedeva 1977). Either glass-coated tungsten electrodes (manufactured in house) or parylene-coated tungsten electrodes (World Precision Instruments, Stevenage, UK) were used. A brass screw, used as a ground, was positioned contralateral to the recording site, 1 mm anterior and 1 mm lateral of bregma.
Electrical signals were recorded using a Medusa preamp and RA16 base station (Tucker–Davis Technologies [TDT], Alachua, FL), which 16-bit quantized the signal at a 25-kHz sampling rate, high-pass filtered at 300 Hz and low-pass filtered at 10 kHz. The filtered signal was monitored both aurally (over a speaker) and visually (on an oscilloscope). Putative action potentials were isolated based on a template match to the voltage waveform, defined by a negative-threshold crossing, a positive-threshold crossing, and the time between the negative-threshold crossing and the peak of the waveform. The spike shape was continuously monitored to ensure a single-neuron recording.
Single neurons were isolated using tone pips with variable frequency and intensity. The characteristic frequency (CF) of a neuron was measured as the frequency of the pure-tone stimulus (at zero ITD) at the minimal sound level able to produce a detectable (audiovisual) change in discharge rate above spontaneous levels. The threshold of the neuron was measured as this minimum sound level. If the recorded response did not appear to be sensitive to changes in the ITDs of either tones or noise, the recording was abandoned.
Sound presentation
Acoustic stimuli were delivered via the ear bars using Beyerdynamic DT48 audiological speaker transducers (Beyerdynamic, Burgess Hill, UK). Small audiological microphones (FG3452, Knowles Europe, Burgess Hill, UK) were connected to the ear bar channels by high-impedance tubing, 1 to 2 mm from the tympanic membrane. The impulse response of these microphones had been previously calibrated with respect to the output of the speakers using a high quality 1/8-in. microphone (Type 4138, Brüel & Kjær, Stevenage, UK), 1/2-in. preamplifier (Type 2669, Brüel & Kjær) and measurement amplifier (Type 2610, Brüel & Kjær). On completion of surgery, these microphones were used to check the transfer function of the system and thereby assess the quality of acoustic coupling. In all experiments, the gain of the transfer function was flat to within ±5 dB over the frequency range 50 Hz to 2 kHz, and matched to ±2 dB between the ears. No interaural phase differences arose from differing transfer functions at frequencies relevant to this study.
Stimulus presentation and data collection were controlled by the Brainware program (TDT). Noise stimuli were generated using MATLAB (The MathWorks, Cambridge, UK) and presented via a custom Brainware interface we programmed in Delphi (Borland, Twyford, UK). Noise stimuli were then loaded onto signal processing hardware (RP2.1, TDT) and output at the maximum possible gain of the D/A converter, with digitally controlled analog attenuators used to control the sound intensity (PA5, TDT).
Recorded noise-delay functions
To investigate both envelope- and carrier-sensitive components of responses in IC and DNLL, responses were recorded to noise bursts that incorporated both interaural time delays (affecting both envelope and carrier) and interaural phase disparities (IPDs, affecting only the carrier) (Yin et al. 1987). Noise bursts of 300-ms duration were generated digitally via the inverse Fourier transform, at a sampling rate of 50 kHz. The power spectrum of each noise burst was flat, covering a frequency range of 50 Hz to 5 kHz, whereas the phase spectrum was drawn at random from a uniform circular distribution. Stimuli were gated with 5-ms ramped-cosine windows, triggered synchronously to ensure no onset time differences between ipsilateral and contralateral stimuli. For each presentation, a new sample of noise was generated and presented ipsilaterally (the right ear). The contralateral (left ear) stimulus had a similarly flat power spectrum, but its phase spectrum ψC(f) was formed from the ipsilateral phase spectrum ψI(f) by taking each component with frequency f, time delaying it by τ milliseconds and phase delaying it by φ cycles (cyc)
![]() | (1) |
… τ milliseconds and an IPD of φ cyc. Positive ITDs corresponded to a time lead of the contralateral stimulus and IPDs in the range 0 to 0.5 cyc corresponded to a phase lead of the contralateral stimulus. The presented range of ITDs varied between neurons, to account for CF-dependent changes in the width of tuning curves. The ITD was varied between ±1.5 periods of the CF (cyc re CF) in 0.05 cyc re CF steps, and the IPD ranged from –0.375 to 0.5 cyc in 0.125-cyc steps. Each ITD/IPD pair was repeated ten times, using a new sample of noise on each repeat (i.e., each noise sample was presented only once). On each repeat, all presentations of each ITD/IPD pair were randomly interleaved, to control for neural adaptation. The minimum interstimulus interval was 600 ms, although in practice it could be longer because of the computational overhead of uploading the stimulus to the hardware.
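The construction of the two phase spectra can be sketched in Python. This is an illustration under stated assumptions, not the authors' MATLAB code: the function names are hypothetical, the sign convention (each contralateral component advanced by 2π(fτ + φ), so that positive values give a contralateral lead) is assumed, and the waveform is built by summing cosines directly rather than by an inverse Fourier transform, which is equivalent for a flat spectrum but slower.

```python
import math
import random

def make_phase_spectra(freqs_hz, itd_ms, ipd_cyc, rng):
    """Return (ipsilateral, contralateral) phase spectra in radians.

    Assumed convention: every contralateral component is advanced by
    2*pi*(f*itd + ipd), so positive values mean a contralateral lead.
    """
    ipsi = [rng.uniform(-math.pi, math.pi) for _ in freqs_hz]
    contra = [p + 2 * math.pi * (f * itd_ms * 1e-3 + ipd_cyc)
              for p, f in zip(ipsi, freqs_hz)]
    return ipsi, contra

def synthesize(freqs_hz, phases, fs_hz, n_samples):
    """Sum equal-amplitude cosines, i.e. a flat power spectrum."""
    return [sum(math.cos(2 * math.pi * f * n / fs_hz + p)
                for f, p in zip(freqs_hz, phases))
            for n in range(n_samples)]
```

With an ITD of 1 ms and zero IPD, the contralateral waveform is the ipsilateral waveform advanced by 1 ms, envelope and carrier together. With zero ITD and an IPD of 0.5 cyc, the contralateral waveform is simply the inverted ipsilateral waveform, which leaves the envelope in place.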
The sound pressure level (SPL) at which this noise stimulus was presented was determined for each neuron from its response to uncorrelated noise (for one neuron the response to correlated noise was used instead). Independent (i.e., uncorrelated) samples of noise were presented binaurally and the firing rate was recorded as the SPL (measured in dB SPL) was varied from 15 to 85 dB (–15 to 55 dB per spectral component). The delayed noise stimuli were presented at a level that was in the linear part of this rate–intensity function (to minimize any nonlinearity) and were sufficiently intense to ensure a response for all parameter conditions (so that the shape of the noise-delay function could be clearly seen). For neurons that showed saturation in their rate–intensity functions, the level used was the midpoint of the linear portion of the curve. If the rate–intensity function did not saturate over the range of levels tested, the level used was either the midpoint of the visible linear portion or a level 10 dB above threshold. One noise-delay function was recorded at a level 10 dB quieter than intended, although since it showed only mild rectification, the shape of the noise-delay function could still be seen. This resulted in stimulus intensities ranging from 35 to 80 dB in IC (5 to 50 dB per component) and from 40 to 60 dB in DNLL (10 to 30 dB per component).
To compare these levels for neurons with different sensitivities, neural thresholds to noise were estimated by finding the sound level above which mean spike rates were all significantly greater than the mean spike rate at the lowest sound level (P < 0.05, permutation test). Estimated thresholds varied from 25 to 70 dB in IC and from 25 to 55 dB in DNLL. The range of stimulus sound levels was therefore 0 to 15 dB above threshold in DNLL and 10 to 35 dB above threshold in IC. However, for the majority of neurons, stimuli were 10 or 15 dB above threshold (DNLL: 12 of 15 neurons; IC: 14 of 20 neurons). Note that for the neuron whose noise-delay function was recorded at threshold, it is likely that the true threshold was lower but obscured by its high spontaneous firing rate.
Data analysis
QUANTIFYING THE ASYMMETRY. The noise-delay function consists of a family of eight ITD tuning curves, one for each IPD (Fig. 1A; see APPENDIX). Each IPD phase shifts the carrier of the noise-delay function, causing the family of curves to tile an area (shaded in Fig. 1A) that is constrained by upper and lower bounds. These upper and lower bounds are the maximum and minimum firing rates at each ITD. However, since only 8 data points are available at each ITD, the maximum and minimum firing rates could fall between data points. To find these extrema, the IPD function at each ITD was interpolated from 8 to 1,024 points (via the Fourier transform). The upper bounds were then measured from the maximum firing rate at each ITD and the lower bounds from the minimum firing rate at each ITD. To reduce noise, the bounds were then smoothed using the "robust loess" algorithm (smooth function, MATLAB). This used a robust least-squares method to fit a quadratic polynomial with a span of 25% of the entire data set around each data point. The value of the fitted polynomial at the central data point was then taken as the smoothed value of the bounds at that point.
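The interpolation step can be illustrated with a short Python sketch. It is a minimal example under assumptions, not the authors' implementation: `fourier_interpolate` is a hypothetical name, the interpolation is done by evaluating the discrete Fourier series of one period on a finer grid, and the smoothing step is omitted.

```python
import cmath
import math

def fourier_interpolate(samples, n_out):
    """Band-limited interpolation of one period of a periodic function."""
    n_in = len(samples)
    # Discrete Fourier coefficients of the sampled period.
    coeffs = [sum(s * cmath.exp(-2j * math.pi * k * m / n_in)
                  for m, s in enumerate(samples)) / n_in
              for k in range(n_in)]
    out = []
    for j in range(n_out):
        x = j / n_out  # position within the period, 0 <= x < 1
        total = coeffs[0].real
        for k in range(1, n_in // 2 + 1):
            if 2 * k == n_in:
                # Nyquist term, which is real-valued for real input.
                total += coeffs[k].real * math.cos(2 * math.pi * k * x)
            else:
                # Positive- and negative-frequency terms combined.
                total += 2 * (coeffs[k] * cmath.exp(2j * math.pi * k * x)).real
        out.append(total)
    return out
```

For one ITD, `max(fourier_interpolate(rates, 1024))` and the corresponding `min` give the upper and lower bounds. When the true peak of the IPD function falls between two of the eight sampled IPDs, the interpolated maximum exceeds the largest sampled rate.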
![]() | (2) |
The delay asymmetry was measured from the cross-correlogram of the upper and lower bounds by finding the delay that produced the most negative correlation. This usually, but not always, corresponded to the difference between the ITDs at the maximum of the upper bounds and the minimum of the lower bounds. Positive delay asymmetry indicated the upper bounds were shifted toward more positive ITDs than the lower bounds, whereas negative delay asymmetry indicated that the upper bounds were shifted to more negative ITDs than the lower bounds.
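The delay-asymmetry measurement can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, Pearson correlation over the overlapping samples is assumed as the correlation measure, and the lag is returned in samples, to be multiplied by the ITD step (0.05 cyc re CF here) to express it as a delay.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def delay_asymmetry(upper, lower, max_lag):
    """Lag (in samples) at which the two bounds are most anticorrelated.

    A positive lag means the upper bound sits at more positive ITDs
    than the lower bound.
    """
    best_lag, best_r = 0, float("inf")
    for lag in range(-max_lag, max_lag + 1):
        # Pair upper[i + lag] with lower[i] wherever both exist.
        pairs = [(upper[i + lag], lower[i]) for i in range(len(lower))
                 if 0 <= i + lag < len(upper)]
        r = pearson([u for u, _ in pairs], [v for _, v in pairs])
        if r < best_r:
            best_lag, best_r = lag, r
    return best_lag, best_r
```

The search range `max_lag` should stay well below the number of ITD samples so that every lag is evaluated over a reasonable overlap.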
ANALYSIS OF IPD DEPENDENCE.
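Noise-delay functions were decomposed into IPD-independent and IPD-dependent components using Fourier analysis (Fig. 3). Taking the Fourier transform of a noise-delay function r(τ, φ) over the IPD dimension resulted in eight frequency components: z–3(τ) through z4(τ)
$$z_n(\tau) = \frac{1}{8}\sum_{\varphi} r(\tau,\varphi)\,e^{-i2\pi n\varphi}, \qquad n = -3, \ldots, 4$$ | (3) |
… r0(τ) through r4(τ, φ), each of which had a different dependence on the IPD
$$r_n(\tau,\varphi) = a_n(\tau)\cos\left[2\pi n\varphi + \theta_n(\tau)\right]$$ | (4) |
where an(τ) is the absolute value of zn(τ) (compensated for any power loss arising from a missing negative-frequency component)
$$a_n(\tau) = \begin{cases} \lvert z_n(\tau)\rvert, & n = 0, 4 \\ 2\,\lvert z_n(\tau)\rvert, & n = 1, 2, 3 \end{cases}$$ | (5) |
and θn(τ) is the argument of the corresponding zn(τ)
$$\theta_n(\tau) = \arg z_n(\tau)$$ | (6) |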
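The decomposition over IPD can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' MATLAB code. It assumes a discrete Fourier transform over the eight IPDs normalized by 1/8, doubles the amplitude of harmonics 1 to 3 to account for their negative-frequency twins, and uses the hypothetical names `decompose` and `component`.

```python
import cmath
import math

IPDS = [-0.375 + 0.125 * m for m in range(8)]  # cyc, as listed in the text

def decompose(rates):
    """Split one IPD function (8 rates) into harmonics n = 0..4.

    Returns a list of (amplitude, phase) pairs, one per harmonic.
    """
    n_ipd = len(IPDS)
    parts = []
    for n in range(n_ipd // 2 + 1):
        z = sum(r * cmath.exp(-2j * math.pi * n * p)
                for r, p in zip(rates, IPDS)) / n_ipd
        # Harmonics 1..3 also have a negative-frequency twin, so double them.
        scale = 1 if n in (0, n_ipd // 2) else 2
        parts.append((scale * abs(z), cmath.phase(z)))
    return parts

def component(parts, n, ipd):
    """Evaluate harmonic n at a given IPD (cyc)."""
    amp, phase = parts[n]
    return amp * math.cos(2 * math.pi * n * ipd + phase)
```

Applying `decompose` at every ITD gives one amplitude and one phase function per harmonic. Summing `component(parts, n, ipd)` over n = 0 to 4 returns the original rates at the eight sampled IPDs, so no information is lost by the decomposition.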
BEST ITD. Best and worst ITDs are traditionally measured for a correlated noise stimulus as the ITDs producing the maximum and minimum firing rates. To estimate best ITD and worst ITD from our noise-delay functions it was necessary to identify the location of maxima and minima in the noise-delay function recorded at 0 cyc IPD. These turning points were estimated from zero crossings in the first derivative of this response, using linear interpolation. The maximum was taken to be the positive-to-negative zero crossing corresponding to the highest firing rate and the minimum to be the negative-to-positive zero crossing corresponding to the lowest firing rate. The first derivative was estimated at each point from the gradient of the line between the two data points on either side of that point. Because of the sensitivity of this method to noise, recorded data analyzed in this way were first smoothed with a three-point moving-average filter before taking the derivative and identifying turning points.
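The turning-point search described above can be sketched as follows. This is a minimal sketch with hypothetical names. The text does not say how the ends of the curve were treated, so leaving the two end points unsmoothed is an assumption.

```python
def best_itd(itds, rates):
    """Return (ITD, rate) of the largest peak, or None if no peak is found."""
    n = len(rates)
    # Three-point moving average; the two end points are left unsmoothed
    # (an assumption, since the text does not describe edge handling).
    sm = [rates[0]] + [(rates[i - 1] + rates[i] + rates[i + 1]) / 3
                       for i in range(1, n - 1)] + [rates[-1]]
    # Slope at each interior point from the line joining its two neighbours.
    slope = [None] + [(sm[i + 1] - sm[i - 1]) / (itds[i + 1] - itds[i - 1])
                      for i in range(1, n - 1)] + [None]
    best = None
    for i in range(1, n - 2):
        if slope[i] > 0 >= slope[i + 1]:  # positive-to-negative crossing
            frac = slope[i] / (slope[i] - slope[i + 1])  # linear interpolation
            itd = itds[i] + frac * (itds[i + 1] - itds[i])
            rate = sm[i] + frac * (sm[i + 1] - sm[i])
            if best is None or rate > best[1]:
                best = (itd, rate)
    return best
```

The worst ITD follows from the same routine applied to the negated rates, which turns negative-to-positive crossings into positive-to-negative ones.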
Neurons were loosely identified as peak-type, trough-type, or intermediate-type on the basis of their characteristic phase (CP; see next section). For the analysis of best ITD and worst ITD neurons were classified according to whether their CP was nearer 0 cyc (peak-type) or 0.5 (trough-type). Elsewhere, exact classification was unnecessary since CP itself was used; in the text, we use peak-type to refer to neurons with CPs close to 0 cyc, trough-type to neurons with CPs close to 0.5 cyc, and intermediate-type to neurons with all other CPs.
CHARACTERISTIC DELAY AND CHARACTERISTIC PHASE.
The internal delays of ITD-sensitive neurons are often modeled as having a time component (the characteristic delay [CD]) and a phase component (the characteristic phase [CP]). Usually, these are extracted for a single neuron using pure-tone stimuli (Yin and Kuwada 1983) by regression using the model
$$\varphi_{\mathrm{best}}(f) = \mathrm{CD}\cdot f + \mathrm{CP}$$ | (7) |
where φbest(f) is the best ITD (in cyc) measured using a pure-tone stimulus with frequency f. The CD and CP can be similarly extracted by considering their effects on different frequencies when presented as a complex rather than individually. For a component with a linear dependence on interaural correlation, an internal ITD (the CD) will affect both carrier and envelope, translating the entire noise-delay function along the ITD axis. Any internal IPD (the CP) will phase-shift only the sinusoidal carrier, leaving the envelope untouched (see APPENDIX).
The time and phase delays affecting the noise-delay function were measured from r1(τ, φ) (Eq. 4). The CD was measured from the centroid of its amplitude envelope, a1(τ) (Eq. 5)
$$\mathrm{CD} = \frac{\sum_{\tau} \tau\, a_1(\tau)}{\sum_{\tau} a_1(\tau)}$$ | (8) |
… θ1(τ) (Eq. 6) to the model
![]() | (9) |
![]() | (10) |
…, φ) was used as a weighting factor at each ITD
![]() | (11) |
…, φ). The regression was performed using a subspace trust-region algorithm (fminunc, MATLAB). All parameters were unconstrained in the regression, with initial estimates determined from the dominant component in the Fourier transform of r1(τ, φ) over ITD (at zero IPD).
The regression converged and was significant for all neurons (P < 0.05, F-test). In the DNLL, the R2 for the fits ranged from 0.94 to 1.00 (median 0.99, interquartile range 0.97 to 1.00) and in the IC, the R2 ranged from 0.79 to 0.99 (median 0.96, interquartile range 0.90 to 0.98). For 13 neurons in DNLL and 15 neurons in IC, the residuals showed a systematic deviation with ITD (P < 0.05, circular runs test; Mardia and Jupp 2000), indicating significant phase modulation, albeit with a weak effect. Visual inspection of θ1(τ) revealed an increase in the instantaneous frequency at central ITDs, which was largely captured by the fits. The majority of unexplained variance arose from the slower instantaneous frequency at more extreme ITDs. Despite these errors, the high R2 value indicated that this regression was a suitable method for measuring the phase of the central peak.
EQUIVALENCE CONTOURS. To assess whether the observed rate asymmetry could be explained by a static nonlinear dependence on interaural correlation, we examined the line that marked the intersections of ITD functions corresponding to IPDs differing by 0.5 cyc (the equivalence contour, black line in Fig. 8A). A noise-delay function produced by a static nonlinear dependence on the interaural correlation will show a flat equivalence contour.
… = φz(τ) in Eq. 4, where
![]() | (12) |
Modeling noise-delay functions
INCORPORATING HALF-WAVE RECTIFICATION.
Noise-delay functions predicted from half-wave rectification of the cochlear filtered input stimuli were generated as described elsewhere (Albeck and Konishi 1995). Briefly, white-noise stimuli were generated at a sampling rate of 100 kHz, band-pass filtered at 500 Hz using a rounded exponential (roex) filter, half-wave rectified, and then cross-correlated. To include the effect of IPDs, phase delays were introduced into the contralateral stimulus, as described earlier. A time delay was also introduced to produce a best ITD of 1/8 cyc re CF. Finally, the bandwidth of cochlear filtering was increased from that originally used to model noise-delay functions in owl to reproduce the narrower noise-delay functions in guinea pig (the P value of the roex filter was lowered from 20 to 4). To investigate the effect of additional low-pass filtering, responses were smoothed using a filter with a Gaussian impulse response. The degree of low-pass filtering was adjusted by varying the width of the filter.
INTERMEDIATE-TYPE RESPONSES.
Intermediate-type responses were modeled as a linear combination of an ipsilateral peak-type input and a contralateral trough-type input. Both sets of inputs were modeled using a delay-symmetric response recorded from DNLL that showed a characteristic delay around 1/8 cyc re CF and a characteristic phase around 0 cyc (Fig. 8B). The peak-type input was constructed from the r0(τ) and r1(τ, φ) components of this response (Eq. 4), with the degree of rate asymmetry adjusted by a weighting parameter, the proportion of explained variance in r0(τ)
![]() | (13) |
![]() | (14) |
… controlling the relative contribution of the inputs
![]() | (15) |
… and different degrees of envelope sensitivity could be produced by changing the proportion of explained variance in r0(τ). Although Eq. 15 can result in negative firing rates, this did not matter for the purposes of this model since only the shape of the response was of interest. Thus the contralateral input could be considered either an excitatory trough-type input or an inhibitory peak-type input and the ipsilateral input could be considered either an excitatory peak-type input or an inhibitory trough-type input.
The model was also tested using inputs generated from the responses of the model incorporating half-wave rectification. Instead of Eq. 13, the noise-delay function predicted by half-wave rectification was used as rpk(τ, φ), and the parameter controlling the relative contribution of the inputs was fixed at 0.5 (the rate-asymmetry parameter was not used).
| RESULTS |
|---|
Noise-delay functions in IC and DNLL are asymmetric
Noise-delay functions recorded from the IC and the DNLL (Fig. 1, B–G) differed from those predicted by a simple interaural-correlation model of MSO neurons (Fig. 1A; see APPENDIX). Differences in details such as the degree of damping were expected because these can reflect differences in cochlear filtering. However, more fundamental differences were observed in the shape of the bounds of the noise-delay functions. The upper and lower bounds of the recorded noise-delay functions were similar but asymmetrical: one bound was often larger than the other (Fig. 1, B–E and G) and the maximum of the upper bounds and the minimum of the lower bounds sometimes occurred at different ITDs (Fig. 1, C and D, F and G), often on opposite sides of zero ITD. In contrast, the upper and lower bounds of the model noise-delay function have the same shape and size and both are centered at the same ITD. When the stimulus in one ear is inverted (i.e., by adding an additional 0.5 cyc IPD), the interaural correlation of a sound is similarly inverted. The noise-delay function in Fig. 1A therefore inherits this property by virtue of its linear dependence on interaural correlation. This produces reflectional symmetry between the upper and lower bounds. The asymmetry in the recorded data therefore indicated some process not captured in the model.
The rate asymmetry—the asymmetry in the size of the upper and lower bounds—was quantified using the rate asymmetry index (RAI; see METHODS). Positive RAI values indicated that the upper bounds were larger in area, whereas negative RAI values indicated that the lower bounds were larger. In the DNLL, RAIs were near zero at the lowest CFs, but increased with increasing CFs (Fig. 2A; r = 0.69, P = 0.006, Spearman's rank correlation coefficient). This indicated that the upper and lower bounds were roughly the same area for low-CF neurons, but that the upper bounds became larger as CF increased. In contrast, RAI values from IC were more often negative than those from DNLL (Fig. 2A; DNLL: one neuron; IC: seven neurons) and neither the RAI (P = 0.43) nor the absolute value of the RAI (P = 0.85) showed any correlation with CF. Thus rate asymmetry in IC responses was less stereotyped than that in DNLL.
Asymmetry is produced by a carrier-insensitive component
The deviation from linearity producing the asymmetric boundaries was investigated by decomposing noise-delay functions into IPD-independent and IPD-dependent components using Fourier analysis (Fig. 3; see METHODS). The response at each ITD was replotted as a function of IPD (e.g., Fig. 3E, which shows the IPD functions at 0 cyc re CF ITD) and decomposed into harmonic sinusoids (Fig. 3, F–H). This was repeated for each ITD and the components were recombined into several ITD functions, one for each harmonic (Fig. 3, B–D). The r0(τ) component (Fig. 3B, produced by the DC term of the Fourier transform) was represented as a function independent of the IPD since the response at each IPD was identical.
The dominant component in the noise-delay functions was the r1(τ, φ) component (Fig. 3C), which accounted for a median 90% of the total variance in DNLL and 73% in IC. The r1(τ, φ) component arises from the fundamental component of the Fourier transform and so has a sinusoidal dependence on IPD. This is consistent with a linear dependence on the interaural correlation (Fig. 1A) and the dominance of this component reflects the fact that deviations from linearity were not extreme. The r0(τ) component (Fig. 3B) accounted for a median 8% of the total variance in DNLL and 9% in IC. This component arises from the DC component of the Fourier transform and so is insensitive to phase shifts in the carrier. The r2(τ, φ) (Fig. 3D) and higher components (not shown) arose from the second and higher harmonics in the Fourier transform. These made very little contribution in DNLL (median 2%), although they were stronger in IC (median 14%). This partially reflected a higher noise contribution in IC arising from the higher trial-to-trial variability (50%, IC; 25%, DNLL; P = 0.032, Wilcoxon rank-sum test). However, around half the neurons showed significant ITD-dependent variation in their r2(τ, φ) components (IC: 11 neurons; DNLL: 6 neurons; P < 0.05, Wald–Wolfowitz runs test applied to the sign of the deviation from the median; Lehman 2006), which would not be expected if these components had arisen solely as a consequence of noise.
Since the component r1(τ, φ) is necessarily both rate and delay symmetric due to its sinusoidal dependence on IPD, any asymmetry is produced by the other components. Removing the harmonic components from the responses produced little change in either the rate asymmetry (Fig. 4A) or the delay asymmetry (Fig. 4B), whereas removing the r0(τ) component produced largely symmetrical responses (Fig. 4, C and D). Thus despite harmonic components making a significant contribution to the shape of many noise-delay functions, they made little contribution to the observed asymmetry. This indicated that the asymmetric responses in DNLL and IC could be well described by just two ITD-sensitive components: one sensitive to the phase of the carrier and the other insensitive to it. In DNLL, the weak harmonics meant that this approximation was close (median R2 = 0.98, interquartile range 0.95 to 0.99), but the stronger harmonics in IC resulted in a poorer approximation (median R2 = 0.87, interquartile range 0.78 to 0.89). In both nuclei, the quality of the approximation improved at higher CFs (DNLL: r = 0.59, P = 0.024; IC: r = 0.48, P = 0.033; Spearman's rank correlation coefficient), with R2 values reaching around 0.99 in DNLL and 0.9 in IC. This reflected a decreasing contribution of the harmonic components with increasing CF.
… τ) component and the upper and lower bounds of the r1(τ, φ) component. Rate asymmetric responses resulted from r0(τ) directly increasing the size of one bound and decreasing the size of another. No delay asymmetry was produced if r0(τ) was symmetrical with respect to the envelope of r1(τ, φ) (Fig. 5, A and D), but delay asymmetry was produced if r0(τ) was asymmetric with respect to the envelope (Fig. 5, B, C, E, and F). Finally, multimodal upper or lower bounds arose when r0(τ) was large compared with the envelope of r1(τ, φ) (Fig. 5, C and F). The asymmetry in the noise-delay functions was therefore entirely an effect of the carrier-insensitive component.
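The construction described here can be written out explicitly. As a sketch that neglects the second and higher harmonics (an approximation introduced for illustration, not part of the original analysis), and writing a1(τ) for the amplitude envelope of the carrier-sensitive component, sweeping the IPD moves that component between +a1(τ) and –a1(τ), so the bounds are approximately:

```latex
\begin{align}
r_{\mathrm{upper}}(\tau) &\approx r_0(\tau) + a_1(\tau) \\
r_{\mathrm{lower}}(\tau) &\approx r_0(\tau) - a_1(\tau)
\end{align}
```

On this approximation the two bounds are mirror images of each other only when r0(τ) is constant. Any ITD dependence in r0(τ) shifts both bounds in the same direction, and a peak in r0(τ) that is offset from the peak of a1(τ) moves the maximum of the upper bound and the minimum of the lower bound to different ITDs.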
Neurons in the MSO produce peak-type noise-delay functions (at zero IPD), with a large central peak (Yin and Chan 1990). The internal delay is indicated by the best ITD, that is, the ITD producing the highest firing rate (at 0 cyc IPD). For these neurons, the best ITD compensates for the internal delay, bringing inputs into register, maximizing interaural correlation, and thereby producing a peak in the noise-delay function. In contrast, neurons in the lateral superior olive (LSO) produce trough-type noise-delay functions (at zero IPD), with a large central trough (Batra et al. 1997; Tollin and Yin 2005), due to a preference for interaural anticorrelation. For these neurons the internal delay is indicated by the worst ITD, that is, the ITD producing the lowest firing rate (at 0 cyc IPD).
In agreement with previous reports (Hancock and Delgutte 2004; Joris et al. 2006; McAlpine et al. 2001), best ITDs of peak-type responses in IC showed a negative correlation with CF (Fig. 6C; r = –0.72, P = 0.002, Spearman's rank correlation coefficient) and were distributed roughly around 1/8 cyc re CF. Trough-type responses showed worst ITDs distributed roughly around –1/8 cyc re CF. In DNLL, best ITDs showed a similar dependence on CF (Fig. 6A; r = –0.78, P = 0.001), again distributed around roughly 1/8 cyc re CF (median: 0.14 cyc re CF; interquartile range: 0.11 to 0.17 cyc re CF). No trough-type responses were observed in DNLL, but this sampling difference between the DNLL and IC could have arisen by chance (P = 0.12, Fisher's exact test).
Although a long-standing model suggested that internal delays are time delays arising from differences in axonal propagation time (Jeffress 1948), this does not always sufficiently describe the experimental data. Instead, internal delays are often modeled as having some dependence on the stimulus frequency with both a time component (the characteristic delay [CD]) and a phase component (the characteristic phase [CP]) (Yin and Kuwada 1983). A neuron's best ITD is therefore determined by both its CD and its CP
$$\mathrm{ITD}_{\mathrm{best}} = \mathrm{CD} + \frac{\mathrm{CP}}{f}$$ | (16) |
Since the best ITD was determined by r1(τ, φ) (Fig. 6, B and D), the CD and CP could be measured from their effects on that component alone. The CD was estimated from the centroid of the amplitude envelope a1(τ), whereas the CP was estimated from the difference between the CD and the phase of its carrier θ1(τ) (see METHODS). Although any internal ITD (CD) will translate the whole of r1(τ, φ) along the ITD axis, any internal IPD (CP) will phase shift only the sinusoidal carrier θ1(τ), leaving the envelope a1(τ) untouched. This allows the effects of the CD and the CP to be distinguished. Note that since the phase of the carrier determines the best ITD, the CP was in effect the deviation of the CD from the best ITD, as suggested by Eq. 16.
Similar to the best ITD, the CD of peak-type neurons was negatively correlated with the CF in both DNLL (Fig. 7A; r = –0.56, P = 0.032, Spearman's rank correlation coefficient) and IC (Fig. 7C; r = –0.74, P = 0.002). No significant difference was seen between CD and the best/worst ITD (DNLL: P = 1.00, IC: P = 0.82, sign test; see Fig. 6, A and C). Since the CP determines the deviation of the CD from the best ITD, the similarity between CD and best ITD reflected the lack of a systematic contribution from the CP. The CP showed a broad range of values with no dependence on the CF (Fig. 7, B and D; DNLL: P = 0.65, IC: P = 0.20). As expected from the lack of a significant difference between the CD and the best ITD, the CP of peak-type neurons showed no significant bias away from 0 cyc in either the DNLL (P = 1.00, sign test; median 0.00 cyc) or the IC (P = 0.45, median 0.08 cyc). Thus the inverse relationship we observed between best ITD and CF was chiefly determined by internal time delays (i.e., CD) that varied with the CF of the neuron, with no significant contribution from internal phase delays.
Equation 16 suggests that any best ITD is achievable with the right CP. However, this assumes that the CD and CP of a neuron are independent; in fact they covary. When the CD was normalized by the CF, a correlation between CP and CD was observed in both IC and DNLL (Fig. 11, A and D; DNLL: Dn = 0.44, P < 0.05; IC: Dn = 0.31, P < 0.05; Mardia's linear-circular rank correlation coefficient; Mardia 1976). This indicated different CDs for different classes of neurons: peak-type neurons had CDs around 1/8 cyc re CF, trough-type neurons had CDs around –1/8 cyc re CF, and intermediate-type neurons (those with CPs around 0.25 cyc) showed CDs around 0 cyc re CF.
This correlation indicates a more complex interaction of CD and CP than suggested by Eq. 16. This can be seen by approximating the relationship between CD and CP by
![]() | (17) |
![]() | (18) |
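As a worked illustration only, the three class values quoted above (CDs of about 1/8, 0, and –1/8 cyc re CF at CPs of 0, 0.25, and 0.5 cyc) lie on a straight line. The linear form below is an assumption made for this sketch and should not be read as the original equations. Combining it with the usual Yin and Kuwada (1983) relation, best ITD = CD + CP/CF, gives:

```latex
\begin{align}
\mathrm{CD}\cdot\mathrm{CF} &\approx \tfrac{1}{8} - \tfrac{1}{2}\,\mathrm{CP} \\
\mathrm{ITD}_{\mathrm{best}}\cdot\mathrm{CF} &= \mathrm{CD}\cdot\mathrm{CF} + \mathrm{CP}
  \approx \tfrac{1}{8} + \tfrac{1}{2}\,\mathrm{CP}
\end{align}
```

On this sketch, a trough-type neuron (CP = 0.5 cyc) would have a best ITD near 3/8 cyc re CF and hence a worst ITD, half a cycle away, near –1/8 cyc re CF, in line with the worst ITDs reported for trough-type responses. Because CD and CP covary, changing the CP moves the best ITD by only about half as much as it would if CD were held fixed.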
Origin of rate asymmetry
We have shown that a carrier-insensitive component is responsible for the observed asymmetries in the noise-delay functions, but what is the source of this carrier insensitivity? One trivial explanation is the existence of a (static) nonlinear dependence on the interaural correlation. Neural responses to noise bursts with different degrees of statistical correlation between the two ears are nonlinear (Albeck and Konishi 1995; Coffey et al. 2006; Shackleton et al. 2005) and have previously been interpreted as reflecting a (static) nonlinear dependence on interaural correlation (Hancock and Delgutte 2004). Such nonlinearity could arise either directly in the MSO or through a transformation of the MSO response by the DNLL or IC. As demonstrated by the example in Fig. 8A, such a model can produce rate-asymmetric responses, although not delay-asymmetric responses. Furthermore, the r0(τ) component of these responses will be carrier insensitive, similar to those in Fig. 5. Thus this model would appear to show an envelope-sensitive component, without requiring envelope sensitivity in the input to the MSO.
A simple prediction of this model is that antiphasic ITD-tuning curves will intersect each other at the same firing rate. At ITDs where two antiphasic ITD-tuning curves cross, inverting the carrier (by adding an additional 0.5 cyc IPD) produces no change in firing rate. For a response linearly dependent on the interaural correlation (Fig. 1A), the underlying interaural correlation at these points must be zero and these points will all sit along a line. Since a static nonlinearity is dependent only on the underlying interaural correlation, these intersections at zero interaural correlation will remain intersections in the nonlinearly transformed response and will be mapped onto the same firing rate. Thus for a noise-delay function with a (static) nonlinear dependence on the interaural correlation (Fig. 8A), the equivalence contour (the line along which these points sit) will be independent of the ITD. However, the equivalence contours obtained in both IC and DNLL (see METHODS) were not constant, contrary to this prediction, but showed significant ITD dependence (Fig. 8, B–G; IC: 17 neurons; DNLL: 11 neurons, P ≤ 0.05, Wald–Wolfowitz runs test applied to the sign of the deviation from the median). Thus a (static) nonlinear dependence on the interaural correlation cannot adequately describe the recorded responses, either because of their delay asymmetry or, even when delay symmetric, because of the ITD dependence of their equivalence contours.
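The prediction of a flat equivalence contour can be checked numerically with a toy model. The sketch below is not the model used in the study: the Gaussian-windowed cosine standing in for the interaural correlation, the exponential nonlinearity, and all parameter values are assumptions chosen only to make the argument concrete.

```python
import math

CF = 500.0     # Hz, hypothetical characteristic frequency
SIGMA = 0.002  # s, hypothetical width of the correlation envelope

def correlation(itd, extra_ipd=0.0):
    """Toy interaural correlation: Gaussian envelope times a carrier shifted by extra_ipd (cycles)."""
    envelope = math.exp(-0.5 * (itd / SIGMA) ** 2)
    return envelope * math.cos(2.0 * math.pi * (CF * itd + extra_ipd))

def rate(rho):
    """Hypothetical expansive static nonlinearity mapping correlation to firing rate (spikes/s)."""
    return 20.0 * math.exp(1.5 * rho)

# ITDs at which the carrier term is zero, so the antiphasic curves cross
crossing_itds = [(2 * k + 1) / (4.0 * CF) for k in range(-4, 4)]
contour = [rate(correlation(itd)) for itd in crossing_itds]        # added IPD = 0 cyc
inverted = [rate(correlation(itd, 0.5)) for itd in crossing_itds]  # added IPD = 0.5 cyc

# Rate asymmetry: the rise above the zero-correlation rate exceeds the fall below it
rise = rate(1.0) - rate(0.0)
fall = rate(0.0) - rate(-1.0)
print(contour, rise, fall)
```

In this sketch every crossing maps onto the rate at zero correlation, so the equivalence contour is flat at all ITDs even though the response itself is rate asymmetric.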
Another possible explanation for the nonlinear dependence on interaural correlation is nonlinearity in the monaural pathways projecting to MSO. This nonlinearity (frequently modeled as half-wave rectification) will introduce carrier-insensitive distortion components into the input to MSO. In this model, carrier insensitivity in the noise-delay function arises from the interaural correlation of carrier-insensitive components in the input. Figure 9A shows a noise-delay function predicted by half-wave rectification of the filtered input stimuli (see METHODS). Similar to that in Fig. 8A, the noise-delay function shows rate asymmetry, but appears to have a flat equivalence contour. Because the recorded equivalence contours were ITD dependent, envelope sensitivity introduced by half-wave rectification of stimuli at the cochlea cannot, by itself, account for the envelope sensitivity observed in DNLL and IC.
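To illustrate how half-wave rectification introduces a carrier-insensitive component and higher harmonics, the sketch below correlates two half-wave rectified tones while stepping the added IPD, then Fourier-decomposes the result across IPD. It is a simplified pure-tone stand-in, not the filtered-noise model described in METHODS; the sample counts and function names are assumptions. For a tone the zeroth component is a constant, whereas in a noise-delay function it would vary with ITD.

```python
import cmath
import math

N_TIME = 512   # samples per carrier cycle (assumed)
N_PHASE = 16   # equally spaced added IPDs spanning one cycle (assumed)

def half_wave(x):
    """Half-wave rectification: pass positive values, zero the rest."""
    return x if x > 0.0 else 0.0

# One cycle of a half-wave rectified unit-amplitude tone (stand-in for a monaural input)
tone = [half_wave(math.cos(2.0 * math.pi * n / N_TIME)) for n in range(N_TIME)]

def binaural_response(phase_index):
    """Mean product of the two rectified inputs, one shifted by phase_index / N_PHASE cycles."""
    shift = phase_index * N_TIME // N_PHASE
    return sum(tone[n] * tone[(n + shift) % N_TIME] for n in range(N_TIME)) / N_TIME

responses = [binaural_response(j) for j in range(N_PHASE)]

def component(k):
    """Amplitude of the k-th Fourier component of the response across added IPD."""
    c = sum(r * cmath.exp(-2j * math.pi * k * j / N_PHASE)
            for j, r in enumerate(responses)) / N_PHASE
    return abs(c) if k == 0 else 2.0 * abs(c)

r0, r1, r2 = component(0), component(1), component(2)
print(r0, r1, r2)
```

The nonzero `r0` and `r2` values arise purely from the rectification; a correlation of unrectified tones would contain only the fundamental.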
Responses like those in Fig. 9A (and those predicted from auditory nerve and AVCN responses) contain the components r0(·), r2(·, ·), and higher, which are distortion components arising from nonlinearities such as half-wave rectification. As demonstrated in Fig. 5, if the r2(·, ·) and higher components are removed from the noise-delay function, the equivalence contour is completely determined by the r0(·) component. Thus the elevated equivalence contours can be produced from distorted responses by attenuating the higher components while leaving the r0(·) and r1(·, ·) components relatively unaffected. Figure 9B shows the same response as in Fig. 9A but smoothed with a Gaussian kernel. Such a smoothing process can arise from the convergence of heterogeneous MSO inputs in the DNLL or IC. If a given IC neuron receives input from several MSO neurons with a Gaussian distribution of best ITDs, it would effectively low-pass filter the input in the manner illustrated. With the harmonic components attenuated, the equivalence contour is determined by r0(·) and so becomes elevated (Fig. 9B). Note that the rate asymmetry is unchanged because the harmonic components contribute only weakly to it (see Fig. 4A). As the cutoff of the low-pass filter shifts to lower frequencies (Fig. 9C), the r1(·, ·) component becomes more attenuated, increasing the relative contribution of r0(·) and thereby increasing both the rate asymmetry and the equivalence contour.
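The selective attenuation described above follows from the frequency response of a Gaussian kernel: smoothing along the ITD axis with a unit-area Gaussian of standard deviation sigma scales a sinusoidal component of frequency f by exp(-2 pi^2 sigma^2 f^2). The sketch below evaluates this gain under the simplifying assumptions that r0 varies slowly along the ITD axis (treated here as 0 Hz), r1 oscillates near CF, and the second harmonic oscillates near 2 CF. The CF and kernel widths are hypothetical values, not those used for Fig. 9.

```python
import math

def gaussian_gain(freq_hz, sigma_s):
    """Gain applied to a sinusoid at freq_hz by smoothing with a unit-area Gaussian of SD sigma_s."""
    return math.exp(-2.0 * (math.pi * sigma_s * freq_hz) ** 2)

CF = 500.0  # Hz, hypothetical characteristic frequency
gains = []
for sigma in (0.0001, 0.0003, 0.0006):  # s, hypothetical kernel widths
    g0 = gaussian_gain(0.0, sigma)       # slowly varying component
    g1 = gaussian_gain(CF, sigma)        # component at the carrier frequency
    g2 = gaussian_gain(2.0 * CF, sigma)  # second-harmonic component
    gains.append((sigma, g0, g1, g2))
    print(sigma, g0, g1, g2)
```

Because the exponent grows with the square of frequency, the gain at 2 CF is the fourth power of the gain at CF, so modest smoothing removes the harmonic components well before it affects the fundamental.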
Low-pass filtering before cross-correlation has an equivalent effect: attenuating harmonic distortions in the input to MSO will attenuate the harmonic components in its output. This temporal low-pass filtering could arise from processes such as synaptic kinetics and membrane capacitance. Alternatively, it could arise from the heterogeneity of inputs to the MSO since multiple inputs phase-locked to a variety of preferred phases would broaden the spike-timing distribution compared with that of a single fiber.
Thus the elevated equivalence contours suggest additional processing unaccounted for in both the model incorporating half-wave rectification and the model based on the cross-correlation of peripheral responses. This processing must therefore occur either at the level of the MSO (cellular filtering, heterogeneous inputs) or above it (convergence in IC or DNLL).
Origin of delay asymmetry
Intermediate-type responses in IC have previously been suggested to be formed from the combination of an ipsilateral peak-type input and a contralateral trough-type input (McAlpine et al. 1998; Shackleton et al. 2000), so it was of interest to see whether such a pattern of convergence could explain the observed delay asymmetry. Figure 10, A–E shows noise-delay functions produced by this model (see METHODS). When the peak-type input is dominant (Fig. 10A), the noise-delay function reflects that input, showing no delay asymmetry. As the trough-type input becomes stronger (Fig. 10C), delay asymmetry develops, only to disappear again when the trough-type input dominates (Fig. 10E). The relative strength of the inputs is reflected by the characteristic phase: responses dominated by a peak-type input will themselves be peak-type with a CP of 0 cyc, whereas those dominated by a trough-type input will be trough-type with a CP of 0.5 cyc. Responses where neither input dominates will be intermediate-type with CPs around 0.25 cyc. As the response shifts from a peak-type response to a trough-type response, the characteristic delay shifts from that observed for peak-type responses (1/8 cyc re CF) to that observed for trough-type responses (–1/8 cyc re CF). Thus this model predicts a negative correlation between the CP and the CD (Fig. 10F). It also predicts that rate asymmetry should be negatively correlated with the CP (Fig. 10G), with positive RAIs for peak-type responses falling to negative RAIs for trough-type responses. Finally, the delay asymmetry is expected to be zero for both peak- and trough-type responses, but higher for intermediate-type responses (Fig. 10H). Note that the exact degree of rate and delay asymmetry is dependent on the strength of the envelope-sensitive component to the response; if the inputs contain no envelope sensitivity (and therefore no rate asymmetry) then the resulting intermediate-type responses will show no envelope sensitivity (and hence no delay asymmetry). When the model incorporating half-wave rectification was used to generate inputs like that in Fig. 9A, delay asymmetry still resulted, although the equivalence contour appeared flat (Fig. 9D). Again, low-pass filtering was required to modulate the equivalence contours (Fig. 9, E and F).
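The predicted negative correlation between CP and CD can be illustrated with a simple phasor sketch. This is not the noise-based convergence model described in METHODS: each input is reduced to a unit phasor whose best phase varies linearly with frequency, the input CDs are the values quoted above, and the weights, frequency range, and function names are assumptions. The sketch says nothing about rate or delay asymmetry, which depend on the envelope-sensitive component.

```python
import cmath
import math

CD_PEAK = 0.125     # cycles re CF, value quoted in the text for peak-type responses
CD_TROUGH = -0.125  # cycles re CF, value quoted in the text for trough-type responses

def best_phase(freq, w_trough):
    """Best IPD (cycles) of a weighted sum of a peak-type and a trough-type input.

    freq is expressed in units of CF; w_trough is the trough-type weight (0 to 1).
    """
    peak = (1.0 - w_trough) * cmath.exp(2j * math.pi * freq * CD_PEAK)
    trough = w_trough * cmath.exp(2j * math.pi * (freq * CD_TROUGH + 0.5))
    return cmath.phase(peak + trough) / (2.0 * math.pi)

def fit_cd_cp(w_trough, freqs=(0.8, 0.9, 1.0, 1.1, 1.2)):
    """Least-squares line through best phase vs. frequency: slope = CD, intercept = CP."""
    phases = [best_phase(f, w_trough) for f in freqs]
    mean_f = sum(freqs) / len(freqs)
    mean_p = sum(phases) / len(phases)
    slope = (sum((f - mean_f) * (p - mean_p) for f, p in zip(freqs, phases))
             / sum((f - mean_f) ** 2 for f in freqs))
    return slope, mean_p - slope * mean_f

results = [(w, *fit_cd_cp(w)) for w in (0.0, 0.25, 0.5, 0.75, 1.0)]
for w, cd, cp in results:
    print(f"w_trough={w:.2f}  CD={cd:+.3f} cyc re CF  CP={cp:.3f} cyc")
```

As the trough-type weight increases, the fitted CP rises from 0 to 0.5 cyc while the fitted CD falls from 1/8 to –1/8 cyc re CF, passing through a CP of 0.25 cyc and a CD of 0 when the inputs are balanced.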
| DISCUSSION |
|---|
Envelope sensitivity
Although a previous study (Joris 2003) suggested the existence of low-CF envelope sensitivity in IC, it could not address whether this was simply a side effect of neural firing rates having a nonlinear dependence on interaural correlation. In our study, measuring equivalence contours allowed us to demonstrate that this is not the case. Both the asymmetries and the elevated equivalence contours observed in this study can be seen in noise-delay functions recorded in other studies in cat IC (Joris et al. 2006; Louage et al. 2005), thus indicating that the low-CF envelope sensitivity is a general property of mammalian IC and DNLL. The existence of this envelope-sensitive component explains how envelope modulation biases the perceived lateral position of a sound at frequencies where ITD sensitivity was previously considered to be carried only by stimulus fine structure (Bernstein and Trahiotis 1985). With envelope modulation, the perceived lateralization space was broadened, producing more separation between different ITDs and potentially enhancing the spatial discriminability. Consistent with our findings, this influence of the envelope was greater at higher frequencies.
Envelope sensitivity in the peripheral auditory system is insufficient to explain the elevated equivalence contours observed in this study. Some enhancement of envelope sensitivity clearly occurs, as evidenced by the transition from carrier to envelope sensitivity in IC occurring at CFs roughly 1 kHz lower than those in auditory nerve (Joris 2003). However, it is unclear how much of this enhancement occurs at the level of the MSO and how much at the level of the IC and DNLL. The peaks of noise-delay functions in MSO (Yin and Chan 1990