1 Department of Physiology, The Hebrew University
Hadassah Medical School, Jerusalem 91120, Israel; and 2 Department of Biomedical Engineering and Center for Hearing Sciences, The Johns Hopkins University, Baltimore, Maryland 21205
ABSTRACT
Nelken, Israel, Peter J. Kim, and Eric D. Young. Linear and nonlinear spectral integration in type IV neurons of the dorsal cochlear nucleus. II. Predicting responses with the use of nonlinear models. J. Neurophysiol. 78: 800-811, 1997. Two nonlinear modeling methods were used to characterize the input/output relationships of type IV units, which are one principal cell type in the dorsal cochlear nucleus (DCN). In both cases, the goal was to derive predictive models, i.e., models that could predict the responses to other stimuli. In one method, frequency integration was estimated from response maps derived from single tones and simultaneous pairs of tones presented over a range of frequencies. This model combined linear integration of energy across frequency and nonlinear interactions of energy at different frequencies. The model was used to predict responses to noisebands with varying width and center frequency. In almost all cases, predictions using two-tone interactions were better than linear predictions based on single-tone responses only. In about half the cases, reasonable quantitative fits were achieved. The fits were best for noisebands with narrow bandwidth and low sound levels. In the second nonlinear method, the spectrotemporal receptive field (STRF) was derived from responses to broadband stimuli. The STRF could account for some qualitative features of the responses to broad noisebands and spectral notches embedded in broad noisebands. Quantitatively, however, the STRFs failed to predict the responses of type IV units even to simple broadband noise stimuli. For narrowband stimuli, the STRF failed to predict even qualitative features (such as excitatory and inhibitory frequency bands). The responses of DCN type IV units presumably result from interactions of two inhibitory sources, a strong one that is preferentially activated by narrowband stimuli and a weaker one that is preferentially activated by broadband stimuli. 
The results presented here suggest that the STRF measures effects related to the broadband inhibition, whereas two-tone interactions measure mostly effects related to narrowband inhibition. This explains why models based on two-tone interactions predict the responses to narrow noisebands much better than models based on STRFs. It is concluded that a minimal stimulus set for characterizing type IV units must contain both broadband and narrowband stimuli, because each stimulus class by itself only partially activates the integration mechanisms that shape the responses of type IV units. Similar conclusions are expected to hold in other parts of the auditory system: when characterizing a complex auditory unit, it is necessary to use a range of stimuli to ensure that all integration mechanisms are activated.
The dorsal cochlear nucleus (DCN) in unanesthetized animals displays striking nonlinear response characteristics (Nelken and Young 1994
Animal preparation and protocol
The methods have been described in detail elsewhere (Nelken and Young 1997
Acoustic stimuli
Sound was delivered from an electrostatic driver to the left (ipsilateral) ear through a closed acoustic system connected to a hollow ear bar. Acoustic calibrations at the eardrum were performed for each animal by sweeping a tone between 0.04 and 40 kHz and measuring the resulting sound pressure level with a probe tube near the tympanic membrane. All stimuli (except for the pseudorandom noise used for the STRF, described below) were 200 ms long with 10 ms rise-fall times and a 1-s repetition period. Because of the length of the total paradigm, responses of any one unit were obtained only to subsets of the stimuli described below.
Linear and nonlinear summation of tones
The single- and two-tone response map data were used to predict responses to noisebands and notch-noise stimuli. Two models were used: a first-order model in which the response to a noise is predicted as the sum of the responses to tones within the passband of the noise, and a second-order model in which the summation is taken over all single- and two-tone combinations within the passband. Because the responses of DCN type IV units change dramatically with sound level, it is necessary to choose the level of the tones to be comparable with that of the noise; in this case, the tone level was set so that the tone power was equal to the power in a narrow noiseband of width bw Hz, where the spectrum level of the narrow noiseband was the same as in the passband of the noiseband being predicted; bw is called the reference bandwidth below and was usually 200 Hz.
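The level-matching rule described above reduces to a short decibel calculation. The following is a minimal Python sketch, assuming the noise spectrum level is given in dB per Hz and that tone and noise levels share the same reference. The function name `matched_tone_level_db` is hypothetical and does not come from the paper.

```python
import math


def matched_tone_level_db(noise_spectrum_level_db, bw_hz=200.0):
    """Tone level (dB) whose power equals that of a noiseband of width bw_hz.

    Assumption: noise_spectrum_level_db is the power per 1 Hz in dB, so the
    power in a band of width bw_hz is 10*log10(bw_hz) dB higher.
    """
    if bw_hz <= 0:
        raise ValueError("bw_hz must be positive")
    return noise_spectrum_level_db + 10.0 * math.log10(bw_hz)


# Offset between tone level and noise spectrum level for bw = 200 Hz.
offset_db = matched_tone_level_db(0.0, 200.0)
```

With the usual reference bandwidth of 200 Hz, the tone is set about 23 dB above the noise spectrum level.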
STRF calculations
STRFs were computed with the use of the same algorithms as in Kim and Young (1994)
This paper is based on the responses of 25 type IV units to the two-tone stimulus paradigm. The responses of these units to other stimuli are described in the companion paper, which treats regions of linearity in type IV responses (Nelken and Young 1997
Two-tone stimuli can be used to predict the responses to narrow noisebands
Examples of two-tone response planes for three different units are shown in Fig. 2. The axes of each plane are the two frequencies composing the stimulus. The firing rate is represented by gray levels, according to the scale to the right of each plane. The line plot below each plane shows the responses to single tones on the same frequency axis. Figure 2A shows a two-tone response plane for one unit at a low level, close to the maximum of the BF rate-level function. Significant responses occur only when the two-tone combination contains near-BF tones, resulting in the cross pattern in the figure. In Fig. 2A, the responses along the diagonal (where the 2 tones are identical and behave as a single tone with a level 6 dB higher) are almost the same as the single-tone responses off the diagonal, because the former is measured at a level slightly above and the latter slightly below the maximum of the BF rate-level function. The nonlinear interaction term c(f1,f2) (Fig. 2D) is essentially zero except when both tones are very close to BF, where it is negative.
STRFs of DCN type IV units
STRFs were computed for nine type IV units, usually at multiple levels. Figure 6 shows three examples of type IV STRFs. The STRF is interpreted as the mean triggering event for spikes, in the sense that the STRF is an estimate of the average power spectrum in the stimulus as a function of time preceding spikes. A triggering event may be either an increase in the level of some frequencies in the noise (white regions in Fig. 6) or a decrease in the level of other frequencies (black regions in Fig. 6), or both. Spike latency is the time from the triggering event to the spike, which occurs at time 0 on the ordinate of the STRF plots.
Prediction of responses to other stimuli based on the STRF
The STRF has limited predictive power in the sense that it cannot be used to describe the responses of DCN units to most other stimuli. It can be seen in Fig. 7 that the STRF fails to predict the responses of most type IV units to tones. In fact, except for the case in Fig. 7B and one other in which the STRF was entirely inhibitory, the STRF predicts excitatory effects of BF tones at all levels; this occurs because the largest component of type IV STRFs at all levels is the excitatory peak near BF. By contrast, type IV units show inhibitory responses to BF tones at suprathreshold levels. This failure of the STRF to predict inhibitory responses to tones extends also to the prediction of the responses to narrow noisebands, because they give essentially the same responses as do tones (Nelken and Young 1997
Nonlinear modeling methods
The arsenal of nonlinear modeling methods is rather limited at present. The most widely tested methods for nonlinear modeling of auditory units have been second-order reverse-correlation-based methods, used either in the time or frequency domains (Wiener kernel) (Wickesberg et al. 1984
Tone and two-tone predictions
First-order predictions of responses to noisebands with the use of summation of responses to single tones have been found to have very limited power (Spirou and Young 1991
STRF predictions
In contrast to the two-tone predictions, the STRFs of type IV units are computed with the use of responses to broadband stimuli. They do show qualitative agreement with the responses of some type IV units to notch stimuli, but are unable to predict the responses to narrowband stimuli (Fig. 7). In this sense, they complement the two-tone data, which are taken with the use of narrowband stimuli and are useful for predicting the responses to some narrowband stimuli. The main use of the STRF may be in defining and classifying the large variability in properties of type IV units (Nelken and Young 1994
Implications for auditory modeling
The main advantage of the modeling approach used in this study is that there are no free parameters in the prediction formulas, and therefore no training. The formulas are completely general, and they are made specific to DCN type IV units by plugging in the measured responses to a selected family of sounds. However, the results presented here show that it is very important to select correctly a family of sounds that captures the whole complexity of the integration mechanisms of the unit.
INTRODUCTION
, 1997; Spirou and Young 1991). As a result, it has been difficult to predict the responses of DCN principal cells to stimuli with arbitrary spectral shape. This paper focuses on type IV units, which are one response type recorded from DCN principal cells. Tones and narrow bands of noise evoke inhibitory responses in type IV units over a broad range of frequencies and sound levels (Spirou and Young 1991; Young and Brownell 1976); from these responses, one would predict that broadband noise (BBN) stimuli would also evoke inhibitory responses. However, the responses to BBN in type IV units are in fact predominantly excitatory. Other examples of nonlinear behavior, in which the response to the sum of two stimuli is far different from the sum of the responses to the two stimuli presented separately, have been documented (Nelken and Young 1997; Spirou and Young 1991).
) and semiquantitative (Blum et al. 1995; Reed and Blum 1995) models of DCN have been developed that account for its known properties. However, this approach is not general and can be applied to a structure only after extensive experimental studies. The alternative is to develop models that capture the important aspects of the input/output processing in a structure, without requiring a detailed knowledge of its internal processing mechanisms (e.g., Marmarelis and Marmarelis 1978). A variety of quasilinear and nonlinear modeling techniques has been applied to the auditory system (e.g., Aertsen et al. 1981; Backoff and Clopton 1991; Eggermont et al. 1983b; Nelken et al. 1994a,b; Schreiner and Calhoun 1994; Shamma et al. 1995; Wickesberg et al. 1984). The predictive power of these methods has been evaluated to a limited extent. In some cases, good predictions are obtained, e.g., predictions of responses to four-tone complexes from responses to single- and two-tone complexes (Nelken et al. 1994b) and predictions of responses to arbitrary spectral shapes from a linear superposition of rippled-spectra stimuli of different ripple frequencies (Shamma and Versnel 1995). However, other studies have emphasized poor results of prediction attempts, particularly when the stimuli used to derive the model differ substantially from the test stimuli (Bonham et al. 1996; Eggermont et al. 1983a).
,b) and the other on the spectrotemporal receptive field (STRF) (Aertsen and Johannesma 1981). Both methods should capture some nonlinearities in the Wiener-Volterra sense. We emphasize predictions in which the levels of the stimuli used to develop the models and the stimuli used to test them are similar. Thus we emphasize the spectral integration capabilities of the models rather than their ability to accurately predict behavior as a function of level. For both methods, the crucial assumption for applicability is that the neglected higher-order terms are small relative to the terms that are kept in the model.
METHODS
). Briefly, single-unit recordings were made in DCN of decerebrate cats with the use of tipped platinum-iridium microelectrodes. Responses to best-frequency (BF) tones and BBN were used to classify units as type I, II, III, I/III, or IV (Young 1984). This paper reports on responses of type IV units only.
).
and their generation is described in that paper. The noise consisted of 32,768 samples and was played at a sampling rate of 10, 20, 40, or 100 kHz, depending on the BF of the unit; the sampling rate was chosen to keep unit BF between one-sixth and one-third of the sampling rate, equivalent to one- to two-thirds of the aliasing rate. The noise was presented through a 16-bit D/A converter as a continuous sound of 100-200 periods.
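The sampling-rate rule stated above can be written out as a small check. This is a minimal Python sketch, assuming the rule is applied exactly as worded (BF between one-sixth and one-third of the sampling rate); the names `candidate_rates` and `period_seconds` are hypothetical. BFs that fall between the allowed ranges return an empty list, and the text does not say how such cases were handled.

```python
SAMPLING_RATES_HZ = (10_000, 20_000, 40_000, 100_000)  # rates listed in the text
N_SAMPLES = 32_768                                      # samples per noise period


def candidate_rates(bf_hz):
    """Sampling rates that keep bf_hz between fs/6 and fs/3."""
    return [fs for fs in SAMPLING_RATES_HZ if fs / 6 <= bf_hz <= fs / 3]


def period_seconds(fs_hz):
    """Duration of one period of the pseudorandom noise at sampling rate fs_hz."""
    return N_SAMPLES / fs_hz
```

For example, a unit with a BF of 5 kHz would be tested at 20 kHz, where one noise period lasts about 1.64 s.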
). The HRTF is the transfer function from free field to a point near the eardrum; when a stimulus with this spectral shape is presented through a closed sound system, the result simulates free-field presentation of a BBN from a particular direction in space. The HRTF stimulus was stored in digital form and played through a D/A converter at a variable sampling rate chosen to place the prominent spectral notch in the stimulus (arrow) at various frequencies relative to unit BF.

FIG. 1.
Spectra of 3 noise stimuli used to test predictions of models. A: noiseband of bandwidth BW; center frequency is arithmetic middle of band, as shown. B: noise filtered with cat head-related transfer function (HRTF), shown at D/A sampling rate of 100 kHz. Arrow: prominent spectral null or notch, which was moved to various frequencies above and below unit's best frequency (BF) by changing sampling rate of D/A converter. HRTF, obtained from Rice et al. (1992), applies to sound originating at 15° ipsilateral azimuth and 30° elevation in 1 cat. C: notch noise generated as described in text. NW: notch width. Notch was usually centered linearly on BF of unit.
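then the predicted response R1noiseband was computed by
$$R^{1}_{\mathrm{noiseband}} = R_{\mathrm{spontaneous}} + \sum_{f_L < f_k < f_H} \frac{w_k}{bw}\,\bigl[R(f_k) - R_{\mathrm{spontaneous}}\bigr] \tag{1}$$
where Rspontaneous is the spontaneous rate, R(fk) is the response to the tone of frequency fk, and the weights wk account for the fact that the frequencies are usually logarithmically, rather than linearly, spaced.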
That is, each R(fk) is assumed to represent the response rate to noise energy in a frequency band of width wk centered on fk. The factor wk/bw in Eq. 1 corrects the tone response rate R(fk) for the fact that the tone is used to approximate a bandwidth wk of the noise signal, not the reference bandwidth bw, for which the level of the tone is appropriate. Note that we assume that it is driven rates that summate linearly, not total rates; that is, spontaneous rate is treated as if it were an independent input to the neuron. This is theoretically required, because the first- and higher-order terms in any functional expansion can only give zero output to zero input. A formula similar to Eq. 1 was used by Spirou and Young (1991), except that arbitrary scaling was used instead of division by bw.
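As an illustration of the first-order prediction, here is a minimal Python sketch. It assumes that Eq. 1 has the form Rspontaneous + Σ (wk/bw)[R(fk) − Rspontaneous], summed over the tones inside the passband, which is what the description of driven rates and of the factor wk/bw implies. The weight rule in `band_widths` (half the distance between neighboring tone frequencies, one-sided at the ends) is an assumption made for this sketch, and all function names are hypothetical.

```python
def band_widths(freqs):
    """Assumed weights w_k: width of the frequency band each tone stands for."""
    n = len(freqs)
    if n < 2:
        raise ValueError("need at least two tone frequencies")
    widths = []
    for k in range(n):
        if k == 0:
            widths.append(freqs[1] - freqs[0])            # one-sided at low edge
        elif k == n - 1:
            widths.append(freqs[-1] - freqs[-2])          # one-sided at high edge
        else:
            widths.append((freqs[k + 1] - freqs[k - 1]) / 2.0)
    return widths


def first_order_prediction(freqs, rates, spont_rate, f_lo, f_hi, bw=200.0):
    """Sum of driven tone rates inside the passband (f_lo, f_hi), scaled by w_k/bw."""
    total = spont_rate
    for f, r, w in zip(freqs, rates, band_widths(freqs)):
        if f_lo < f < f_hi:
            total += (w / bw) * (r - spont_rate)  # driven rate, not total rate
    return total
```

Because only driven rates are summed, a unit whose tone responses all equal the spontaneous rate is predicted to stay at the spontaneous rate for any noiseband.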
). The two-tone data can be used to derive a second-order correction factor c(f1,f2) that captures higher-order interactions between tones of frequencies f1 and f2 as (Nelken et al. 1994a,b)
$$c(f_1,f_2) = R(f_1,f_2) - R(f_1) - R(f_2) + R_{\mathrm{spontaneous}} \tag{2}$$
where R(f1,f2) is the response rate to the two-tone combination of f1 and f2, and R(f1) and R(f2) are the rates to the tones individually, as above. The spontaneous rate is added to compensate for the fact that it is subtracted twice in the R(fi). Note that c(f1,f2) = 0 if the two tones combine linearly in the sense of Eq. 1. The nonlinear terms c(f1,f2) were computed from the single- and two-tone data with the use of Eq. 2. The second-order prediction R2noiseband was obtained from R1noiseband by adding terms derived from c(f1,f2) as
(3)
where
The summation is taken over the region fL < fj < fH and fL < fk < fH and the weighting factor wjk is defined similarly to wk in R1noiseband and D1. The need for the diagonal term D1 is explained in the APPENDIX.
The correction factor bw/(fH − fL) in Eq. 3 was developed empirically. An intuitive explanation for this correction factor is as follows: on the one hand, the neuron sums up roughly n inputs, where n is proportional to the integration bandwidth of the neuron; on the other hand, the summation defining C2 has contributions from about n² two-tone combinations. The correction factor bw/(fH − fL) is roughly proportional to 1/n, because it measures the width of the noiseband in units of bw, and so fixes this discrepancy.
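The interaction term c(f1,f2) described in the text (two-tone rate minus the two single-tone rates, with the spontaneous rate added back once) can be sketched in a few lines of Python. The data layout, a square list of two-tone rates indexed by tone number, is an assumption made for illustration, and the function names are hypothetical. The sketch applies the same formula on the diagonal, whereas the paper treats diagonal entries separately through the term D1.

```python
def interaction_term(r_pair, r_1, r_2, r_spont):
    """c(f1,f2): departure of a two-tone rate from linear summation of driven rates."""
    return r_pair - r_1 - r_2 + r_spont


def interaction_matrix(two_tone, single, spont):
    """Apply interaction_term to every entry of a square two-tone rate matrix."""
    n = len(single)
    return [
        [interaction_term(two_tone[j][k], single[j], single[k], spont)
         for k in range(n)]
        for j in range(n)
    ]
```

When driven rates add linearly every entry is zero; a negative entry means the pair evokes less than the sum of its parts, as reported near BF in Fig. 2D.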
(4)
where L is the level of the noiseband, Lref is the reference level (at which the 2-tone data were presented), and the other terms are those computed for the prediction of the noiseband at the reference level. The dependence of the various first- and second-order terms on the level is derived from Eq. A1. The first-order terms (R1noiseband and D1) depend only on the level at a single frequency, and therefore scale linearly with level. The second-order term (C2) depends on products of tone levels at two frequencies, and therefore scales quadratically with level. Note that for L = Lref, Eq. 4 reduces to Eq. 3. These rate-level predictions were compared with the measured rate-level functions, which were always taken for noisebands centered at the BF.
. An STRF is a function of time and frequency that measures the average spectral density of the stimulus as a function of time before action potentials from a neuron. A peristimulus time histogram was computed from the responses to the pseudorandom BBN. For each bin of the peristimulus time histogram, the time-frequency distribution of the sound segment (10-20 ms) just preceding it was computed (the Wigner distribution was used here) (Eggermont and Smith 1990; Kim and Young 1994), and those distributions were averaged over all bins with weights proportional to the number of spikes per bin of the peristimulus time histogram. This is equivalent to averaging the Wigner distribution of the sound preceding each spike, as in Eq. 7 of Kim and Young (1994). The expected time-frequency distribution in the absence of correlation between spikes and sound was subtracted to account for small spectral irregularities in the noise; the result of the subtraction is the STRF. The expected distribution was computed as the STRF for a sequence of Poisson spikes, uncorrelated with the pseudorandom BBN. The STRF can be thought of as representing the average spectrotemporal "triggering event" for the unit and for the stimulus used to compute it (Aertsen et al. 1981).
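The spike-triggered averaging described above can be illustrated with a simplified Python sketch. It is not the authors' algorithm: it substitutes a short-time power spectrum (naive DFT of consecutive windows) for the Wigner distribution, and it uses uniformly random trigger times in place of a Poisson spike train for the uncorrelated reference. All function names and default parameters are assumptions made for illustration.

```python
import cmath
import math
import random


def power_spectrum(segment):
    """Power at each non-negative DFT frequency of a short segment."""
    n = len(segment)
    out = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(segment))
        out.append(abs(acc) ** 2 / n)
    return out


def spectrogram_before(stimulus, index, n_lags, win):
    """Rows are windows preceding `index` (row 0 is the most recent)."""
    rows = []
    for lag in range(n_lags):
        end = index - lag * win
        rows.append(power_spectrum(stimulus[end - win:end]))
    return rows


def triggered_average(stimulus, triggers, n_lags, win):
    """Average time-frequency distribution preceding each trigger index."""
    usable = [i for i in triggers if n_lags * win <= i <= len(stimulus)]
    if not usable:
        raise ValueError("no trigger has enough preceding stimulus")
    avg = [[0.0] * (win // 2 + 1) for _ in range(n_lags)]
    for i in usable:
        for a_row, s_row in zip(avg, spectrogram_before(stimulus, i, n_lags, win)):
            for k, value in enumerate(s_row):
                a_row[k] += value / len(usable)
    return avg


def strf_estimate(stimulus, spikes, n_lags=4, win=16, n_null=300, seed=0):
    """Spike-triggered average minus the average for uncorrelated triggers."""
    rng = random.Random(seed)
    null = [rng.randint(n_lags * win, len(stimulus)) for _ in range(n_null)]
    driven = triggered_average(stimulus, spikes, n_lags, win)
    expected = triggered_average(stimulus, null, n_lags, win)
    return [[d - e for d, e in zip(d_row, e_row)]
            for d_row, e_row in zip(driven, expected)]
```

Subtracting the average for uncorrelated triggers plays the role of the expected distribution in the text: any spectral irregularity of the stimulus itself appears in both averages and cancels.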
RESULTS
). The STRFs of nine type IV units were collected; three of these units were also tested with two-tone stimuli and noisebands. The results are presented in two stages. First it will be shown that two-tone stimuli can be used to model some nonlinear phenomena, mostly for narrowband stimuli. Second it will be shown that the STRF predicts some aspects of responses to broadband, but not narrowband, stimuli.

FIG. 2.
A-C: examples of 2-tone responses from 3 units. In each part, 2-tone responses R(f1,f2) are shown at left as matrix of gray-scale values, with scale shown at right. Arrow: spontaneous discharge rate. Line plots: responses to single-tone R(f1) at same sound level; spontaneous rate is given by horizontal line. Note that 2-tone matrixes are symmetrical around major diagonal, because frequency combination f1,f2 is identical to frequency combination f2,f1 and each frequency combination was presented only once. Responses on major diagonal, where f1 = f2, were presented as single tone, with sound level increased by 6 dB, corresponding to in-phase addition of 2 identical tones. Each frequency combination was presented once. D-F: nonlinear interaction terms c(f1,f2), Eq. 2, for the 3 examples in A-C.
), except for some oscillations at center frequencies between 4 and 5 kHz. This oscillation may be the result of noise in the measurements. Figure 3B is another example of a good fit. Figure 3C shows a borderline case. Although the second-order prediction correctly identifies the main peak of the response and its bandwidth, the prediction for the maximal rate is too large by a factor of ~2.

FIG. 3.
Responses to noisebands (Fig. 1A) as center frequency is varied ( ). First-order predictions (R1 in Eq. 1, · · ·) and 2nd-order predictions (R2 in Eq. 3, - - -) calculated from tone and 2-tone response maps are superimposed. Values of d given in each plot are for 2nd-order prediction. Examples are ordered by quality. A and B are good fits, C is medium fit, D and E are bad fits. Unit BFs, noise bandwidths, and noise spectrum levels: (A) 4.23 and 0.4 kHz, level = −10 dB; (B) 2.62 and 1.6 kHz, level = −20 dB; (C) 17.7 and 1.6 kHz, level = −4 dB; (D) 5.9 and 2.4 kHz, level = 5 dB; (E) 17.8 and 2 kHz, level = −10 dB.
2 of the difference between the measured and predicted functions; if the difference between two functions is entirely due to noise, then d should be near 1, and d increases as the difference between the functions increases. This measure is described in detail elsewhere (Nelken and Young 1997). In the predictions made here, cases with d ~ 1 were rare. This is probably the result of two factors: first, the different character of the predicting stimuli (2-tone complexes) and the predicted stimuli (noisebands); and second, the noise in the estimates of the second-order contributions (they are computed as the difference between 3 measured rates, increasing their SD by a factor of ~1.7). Fits with d < 30 were clearly good, as judged subjectively; fits with d > 300 were clearly bad. In between, there was a large group of results for which the objective d value and the subjective judgment did not correlate well (e.g., Fig. 3, C vs. D, both of which have similar d values). Values of d for the second-order fits are shown in Fig. 3, A-E.
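The factor of ~1.7 quoted above is consistent with a simple variance argument. The short derivation below assumes that the three measured rates are statistically independent with a common standard deviation σ and that the spontaneous rate is known without error; these assumptions are ours, not stated in the text.

```latex
\operatorname{Var}\bigl[c(f_1,f_2)\bigr]
  = \operatorname{Var}\bigl[R(f_1,f_2)\bigr]
  + \operatorname{Var}\bigl[R(f_1)\bigr]
  + \operatorname{Var}\bigl[R(f_2)\bigr]
  = 3\sigma^{2}
\quad\Longrightarrow\quad
\operatorname{SD}\bigl[c(f_1,f_2)\bigr] = \sqrt{3}\,\sigma \approx 1.73\,\sigma
```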

FIG. 4.
A: scatter plot of d vs. test noise bandwidth for measured and predicted responses to test noisebands of varying center frequency (data as in Fig. 3). Bandwidth is measured relative to reference bandwidth used in predictions (bw in Eqs. 1 and 3, usually 200 Hz). B: d as function of noise bandwidth, measured in octaves re BF. C: d values at low and high spectrum levels for those cases in which 2 levels were measured (usually 30 dB apart). Line shows where d values are equal. Note that most of data lie above this line. Most extreme departure from line is case in which firing rate at higher level was almost completely depressed by both tones and noisebands, giving rise to almost perfect prediction. D: scatter plot of d values for 1st-order (Eq. 1) and 2nd-order (Eq. 3) predictions. Line shows where d values are equal; 2nd-order prediction improved fit in almost all cases.
showed that for units in the anteroventral cochlear nucleus, the second-order Wiener kernels did not improve response predictions in some cases. For the method presented here, the second-order prediction was almost always better than the first-order prediction, as judged by the d measure. In Fig. 4D, the scatter plot of the d values for first- and second-order predictions lies almost entirely below the diagonal. The d value in some cases decreased by 3 orders of magnitude.
threshold is too low and the maximum firing rate of the unit is too high. As expected, there was strong correlation between cases in which the fixed-level predictions (i.e., Fig. 3) were good and cases in which the rate-level predictions were good. Twenty-one units were tested with at least one noiseband rate-level function; for a large majority (19 of 21) the rate-level prediction was successful (at least at low levels, as in Fig. 5C) at some bandwidths. Nonmonotonicity was predicted in 16 of 21 of the units. Only cases in which the two-tone data were measured close to the peak of the rate-level function (as in Fig. 2, A and B) showed reasonable predictions for noiseband rate-level functions, and the predictions were best for narrower noisebands (200-800 Hz). These results show that, in some circumstances, the two-tone response maps capture not only information about spectral integration at a fixed level but also the information needed to describe the behavior of type IV units as a function of level.

FIG. 5.
Responses to noisebands as function of level ( ). Noisebands were always arithmetically centered on unit BF. First-order (· · ·) and 2nd-order (- - -) predictions (Eq. 4) are superimposed. A: good fit with 2nd-order prediction between threshold and level at which unit is strongly inhibited. B: good fit of 2nd-order prediction between threshold and ~10 dB re level of peak rate. C: good fit of both 1st- and 2nd-order predictions below peak of rate-level function; in this case, 2nd-order prediction did not show nonmonotonicity. D: bad fit of both 1st- and 2nd-order predictions over full level range. Qualitatively, however, 2nd-order fit does predict nonmonotonicity of rate-level function.

FIG. 6.
Examples of spectrotemporal receptive fields (STRFs) for type IV units. STRFs are shown in rectangles with the use of gray scale; scale is given at right. Because average stimulus spectrum has been subtracted, STRF would be constant near 0 in absence of any correlated response from neuron. White regions: peaks of energy (excitatory regions). Dark regions: valleys (inhibitory regions). Stimulus frequency is on abscissa and time preceding spike discharges is on ordinate. A: narrow excitatory region at 12.5 kHz with somewhat wider inhibitory flanks. Note repetition of excitation at longer latencies. B: wide excitatory region centered on 6 kHz with wide inhibitory region at longer times. C: medium-width excitatory region near 10 kHz that lasts for a long time, and flanking inhibitory bands.
; Kipke et al. 1991; Parham and Kim 1992). To check this possibility, the autocorrelation was computed for this unit; the interval between the two peaks in Fig. 6A (~1.5 ms) was within the refractory period, so the second peak is not produced by simple regularity in this unit. However, multipeaked STRFs were observed in three other units and, in those cases, the autocorrelation showed a significant peak at the repeat period of the STRF. Thus, in the majority of cases, repetitions in the STRF result from regularity of firing of the unit; that is, they are related to the intrinsic properties of the unit rather than to spectral integration mechanisms. A possible mechanism to explain the exceptional case in Fig. 6A is that the unit has an intrinsic oscillatory rhythm at the right frequency (~600 Hz), which is not observed ordinarily in spike discharges because of refractoriness. However, in those cases in which a spike was not discharged at the usual short latency (~4 ms) from the favorable triggering event, the triggering event still evoked the intrinsic oscillation, which increased the firing probability 1.5 ms later.
80 dB), which has a width similar to that of the strong excitatory region in the STRF; however, above 10 dB re threshold, the unit gave predominantly inhibitory responses to tones. Thus there is no qualitative correspondence between the tone response map and the STRF.

FIG. 7.
Tone response maps and STRF response maps for 2 type IV units. A and C: tone response maps consist of rate-vs.-frequency plots at 8 or 9 attenuations; 0-dB attenuation is ~100 dB SPL, but actual sound level varies with frequency according to acoustic calibration (not shown). Horizontal lines: spontaneous rates; common rate scale is shown at bottom left. Inhibitory regions are shaded in gray and excitatory regions in black. B and D: STRF plots show frequency marginals for STRFs computed from responses at 6 attenuations (0-dB attenuation corresponds to spectrum level of ~40 dB SPL). Frequency marginals are computed by averaging STRF along time dimension. Solid lines: frequency marginals that include BF excitatory peak in STRF but not (refractory) inhibitory effect at longer latencies; dashed lines include long-latency inhibitory effect. Marginals computed from latencies as follows: (B) solid lines, 4.5-7 ms, dashed lines, 4.5-10 ms; (D) solid lines, 3-6 ms, dashed lines, 3-8 ms. Ordinate scale for STRF is arbitrary, but is same at different stimulus levels.
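The frequency marginals in this caption are averages of the STRF over a latency window. A minimal Python sketch follows, assuming the STRF is stored as a list of rows (one per latency) with one column per frequency; the storage layout and the function name `frequency_marginal` are assumptions made for illustration.

```python
def frequency_marginal(strf, latencies_ms, t_start_ms, t_end_ms):
    """Average the STRF rows whose latency lies in [t_start_ms, t_end_ms].

    strf[i][k] is the STRF value at latency latencies_ms[i] and frequency bin k.
    """
    rows = [row for row, t in zip(strf, latencies_ms)
            if t_start_ms <= t <= t_end_ms]
    if not rows:
        raise ValueError("no STRF rows fall inside the latency window")
    n_freq = len(rows[0])
    return [sum(row[k] for row in rows) / len(rows) for k in range(n_freq)]
```

Widening the window to include later rows, as for the dashed lines, pulls the marginal down wherever long-latency inhibition is present.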
). Inhibitory regions at frequencies just above or just below BF were also observed in most units (6 of 8); the case in Figs. 6B and 7B is one exception. The cases in Figs. 6, A and C, and 7D are more typical. These off-BF inhibitory regions probably result from neural interactions. In four of six cases with both BF excitation and off-BF inhibition, the inhibitory region had a slightly longer latency than the excitatory region, which is consistent with a multisynaptic inhibitory pathway. Very small off-BF inhibitory regions are observed in auditory nerve fibers (Kim and Young 1994), but these are too small to explain the kind of inhibitory effects shown in Fig. 6.
).
DISCUSSION
) or as the STRF (Aertsen and Johannesma 1981; Backoff and Clopton 1991; Eggermont et al. 1983b). In all cases, some function of the waveform just preceding a spike is averaged over all spike occurrences. Various forms of STRF have been used, but all are related to the Fourier transform of the second-order Wiener kernel across one time dimension, giving the average time-frequency distribution of energy preceding a spike.
). There are two problems: first, the STRF assumes a stationary system and is generated from steady-state stimuli, whereas the responses of auditory neurons to interesting stimuli are not stationary. That problem is bypassed in this paper by attempting to predict only average discharge rate behavior and not detailed temporal aspects of responses. This is appropriate in DCN, because average discharge rate captures reasonably well the temporal modulation of units' responses (Young and Brownell 1976). The second problem is that there is no reason to assume a priori that the nonlinearity of an auditory neuron is limited to second-order terms in the Wiener-Volterra sense. As is discussed below, the results of this paper suggest that the nonlinearities are of higher than second order in the DCN. Therefore, although the full Wiener-Volterra series would represent the unit completely, there is no reason to think that truncating it at the second order would give a useful approximation (Johnson 1980). Unfortunately, estimating higher-order kernels is technically difficult.
; Blum et al. 1995; Pont and Damper 1991; Reed and Blum 1995). Although these models show some promise, they have not yet demonstrated predictive power for the kinds of stimuli analyzed here.
). In this paper, we show that two-tone responses can be used to improve the quality of these predictions, but that the improvement depends on the bandwidth of the noise and also on its level (Fig. 4). The second-order predictions almost always improve the fit qualitatively, but sometimes overcompensate for the errors in the single-tone fits (e.g., Fig. 3D). Presumably, this behavior reflects the fact that higher-order nonlinear terms are necessary to fully model the unit's responses. In addition, there are indications that some of the problems in the second-order prediction are caused by noise in the two-tone responses, because only one repetition of each combination of parameters was presented and the two-tone contributions are computed as differences of these noisy values.
), we argue that the major source of nonlinearity in type IV responses is the inhibitory input from type II units. The limited success that was achieved with second-order predictions is probably due to the fact that type II units are relatively weakly activated in the parameter range used. This follows from the fact that the tone levels used for the single- and two-tone response maps were usually at the peak of the rate-level function or on its descending limb, which is where type II units are just beginning to fire (Young and Voigt 1981). Assuming that type II units are the major source of nonlinearity, the fact that the predictions deteriorated with increasing bandwidth (Fig. 4A) can be explained as resulting from an increasing difference between the narrowband predictor, which activates the type II units, and the broadband test response, which does not.
in fact, the units usually shut down completely. As a result, the predictions based on two-tone measurements cannot predict the rate-level functions at higher levels. It is nevertheless noteworthy that one measurement, near the peak of the rate-level function, is sometimes able to capture the quantitative features of the whole rate-level functions from threshold to peak and to the inhibitory area.
; Spirou and Young 1991; Young and Brownell 1976).
). With broadband stimuli, type II units are minimally activated and the inhibition observed with notch-noise stimuli probably comes from a second inhibitory interneuron, the wideband inhibitor. The wideband inhibitor's inhibition of type IV units is hypothesized to be weak (Nelken and Young 1994), and the STRF suffers from a narrow dynamic range (Kim and Young 1994). Thus the explanation for the difference in inhibitory bandwidth may be that only the central, strongest portion of the wideband inhibitor's input is reflected in the STRF. Alternatively, the narrow inhibitory inputs in the STRF may be the weak remains of type II inhibition or of inhibition from other unknown sources.
| |
ACKNOWLEDGEMENTS |
|---|
The comments of Prof. Haim Sompolinsky were helpful in preparing this manuscript.
This work was supported by National Institute on Deafness and Other Communication Disorders Grant DC-00115 and by a grant from the Israeli Academy of Sciences.
| |
APPENDIX |
|---|
In this appendix, mathematical derivations related to the prediction formulas are outlined. The basic model used here is the following
|
|
(A1) |
For a first-order model, in which the second-order kernel is null, single tones are sufficient. Single tones are delta functions in the frequency domain, and so
|
|
(A2) |
For the second-order model, in which both kernels are assumed to be nonzero, both single-tone and two-tone data are necessary. The following combinations are used
|
|
|
The predicted responses to these stimuli are, respectively
|
|
|
|
|
|
|
(A3) |
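As a hedged illustration of the kind of model described in this appendix (not a reproduction of Eqs. A1-A3), a generic second-order expansion of discharge rate in terms of the stimulus power spectrum can be written as follows. The symbols $r_0$ (spontaneous rate), $S(f)$ (stimulus power spectrum), $k_1$ and $k_2$ (first- and second-order kernels, with $k_2$ assumed symmetric), and $P$ (tone power) are assumptions introduced here; the authors' notation and normalization may differ.

```latex
\begin{align}
r[S] &= r_0 + \int k_1(f)\,S(f)\,df
        + \iint k_2(f_1,f_2)\,S(f_1)\,S(f_2)\,df_1\,df_2 \\[4pt]
% single tone: S(f) = P\,\delta(f - f_a)
r(f_a) &= r_0 + P\,k_1(f_a) + P^2\,k_2(f_a,f_a) \\[4pt]
% tone pair: S(f) = P\,\delta(f - f_a) + P\,\delta(f - f_b)
r(f_a,f_b) &= r_0 + P\,[k_1(f_a) + k_1(f_b)]
        + P^2\,[k_2(f_a,f_a) + k_2(f_b,f_b) + 2\,k_2(f_a,f_b)] \\[4pt]
r(f_a,f_b) - r(f_a) - r(f_b) + r_0 &= 2P^2\,k_2(f_a,f_b)
\end{align}
```

In the first-order case ($k_2 = 0$) the single-tone line reduces to $k_1(f_a) = [r(f_a) - r_0]/P$, so single tones suffice. In the second-order case the last line shows why both single-tone and two-tone data are needed: the cross term $k_2(f_a,f_b)$ appears only in the difference between the two-tone response and the single-tone responses.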
In practice, the second-order correction was very often in the right direction but too large. The discrepancy increased with bandwidth, and over the population the increase in discrepancy was proportional to the bandwidth being predicted. This finding led to the introduction of the factor bw/(fH − fL) to scale the second-order correction. The surprising finding is that the proportionality factor is not significantly different from 1, although there is no a priori reason for that. In the physics literature, it is known that infinite series may sometimes be summed by multiplying a lower-order term in the series by a scaling factor. Further investigation of this finding may bring additional insight into the spectral integration mechanisms of type IV units.
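The idea of a first-order prediction plus a scaled second-order correction can be sketched in code. This is a minimal illustration under stated assumptions, not the authors' procedure: the noiseband is approximated by a set of equal-power tones at the sampled frequencies, the function name `predict_band_rate` and all of its parameters are hypothetical, and `scale` is a generic stand-in for a bandwidth-dependent factor such as the one discussed above (how that factor is computed from the stimulus is not modeled here).

```python
import numpy as np

def predict_band_rate(r0, single, pair, in_band, scale=1.0):
    """Predict the rate to a multi-tone complex from tone response maps.

    r0      : spontaneous rate (assumed baseline)
    single  : (n,) rates to single tones at n sampled frequencies
    pair    : (n, n) symmetric array of rates to tone pairs
              (diagonal entries are ignored)
    in_band : (n,) boolean mask of the frequencies inside the band
    scale   : hypothetical multiplier applied to the second-order term
    Returns (first_order_prediction, second_order_prediction).
    """
    single = np.asarray(single, dtype=float)
    pair = np.asarray(pair, dtype=float)
    idx = np.flatnonzero(in_band)

    # First-order term: sum of driven rates of the tones in the band.
    first = (single[idx] - r0).sum()

    # Pairwise interaction: two-tone rate minus both single-tone rates,
    # plus the baseline that was subtracted twice.
    inter = pair - single[:, None] - single[None, :] + r0
    sub = inter[np.ix_(idx, idx)]

    # Count each unordered pair of distinct tones once.
    second = np.triu(sub, k=1).sum()

    linear = r0 + first
    return linear, linear + scale * second
```

For a unit that really is second order, this prediction is exact for a complex of the sampled tones when `scale` is 1; any systematic overshoot of the second-order term in real data would then show up as a best-fitting `scale` different from 1.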
| |
FOOTNOTES |
|---|
Address for reprint requests: I. Nelken, Dept. of Physiology, Hebrew University-Hadassah Medical School, PO Box 12272, Jerusalem 91120, Israel.
Received 30 August 1996; accepted in final form 15 April 1997.
| |
REFERENCES |
|---|