Center for Hearing and Balance and Department of Biomedical Engineering, Johns Hopkins University, Baltimore, Maryland
Submitted 14 May 2007; accepted in final form 18 September 2007
ABSTRACT
INTRODUCTION
The spectrotemporal receptive field (STRF) has a frequency and a time axis, variations along which show the neuron's frequency tuning and its modulation filtering properties, respectively. The frequency selectivity of neurons in the cochlear nucleus and auditory nerve can be characterized by a weighting-function model (Young and Calhoun 2005; Yu and Young 2000), which attempts to predict only the average discharge rate of the neuron to a spectrally stationary stimulus, i.e., only its frequency integration properties. Such a frequency-weighting function model can be related to the STRF assuming separability of STRFs in time and frequency (Young et al. 2005). The STRF is often separable into a function of frequency and a function of time (Depireux et al. 2001; Qiu et al. 2003), at least at levels of the auditory system up to the inferior colliculus.
A weighting-function model is advantageous because it is feasible to incorporate second-order components into the model, providing insights into nonlinear properties. Such second-order terms capture the interactions between energy at different frequencies to model one form of nonlinearity in neurons' responses. In the cochlear nucleus, second-order terms improve the performance of the weighting-function model (Yu 2003; Yu and Young 2000), but in the dorsal cochlear nucleus (DCN), significant nonlinearities remain even when second-order terms are incorporated.
A second source of nonlinearity in auditory receptive fields is that the neurons' responses change with sound level (Nagel and Doupe 2006; Nelken et al. 1997). This is apparent in Fig. 1, where the tone response map and the weighting functions change with the overall sound level of the stimulus. The change in weighting functions with level at 3-dB contrast for the example neuron in Fig. 1 is more subtle than the changes observed in another neuron shown later in Fig. 7. The importance of sound level is also suggested by changes in both the gain and prediction performance of weighting functions, depending on stimulus contrast (Reiss et al. 2007).
Herein, we develop a receptive-field model for DCN principal neurons that incorporates sound level as an explicit parameter in the weighting function. We show that such models predict responses to stimuli with large contrast, typical of natural sounds. However, for stimuli with low spectral contrast, the effects of interactions between different frequencies are more important and a level-dependent model is unnecessary.
METHODS
Experiments were conducted on a total of 14 adult cats (3–4 kg) with infection-free ears and clear tympanic membranes. Animal-use protocols were approved by the Johns Hopkins Animal Care and Use Committee. Cats were tranquilized with xylazine [2 mg, administered intramuscularly (im)] and anesthetized with ketamine (40 mg/kg im). Atropine (0.1 mg im) was given to control mucous secretion. A tracheal tube was inserted. Cats were decerebrated by aspirating through the brain stem between the superior colliculus and thalamus, after which anesthesia was discontinued. Core body temperature was maintained at about 38°C using a regulated heating blanket and lactated Ringer solution was given intravenously to maintain fluid volume.
The DCN was exposed by opening the skull and dura above the cerebellum and aspirating the part of the cerebellum overlying the DCN. Platinum–iridium microelectrodes were advanced into the DCN under visual control and single neurons were isolated and recorded extracellularly. Action potentials were detected with a Schmitt trigger and spike times recorded with a precision of 10 µs.
Experimental protocol
Recordings were made in a sound-attenuating chamber. Acoustic stimuli were delivered to the ipsilateral ear by an electrostatic speaker coupled to a hollow ear bar. The bulla was vented through a length of PE-90 tubing. The speaker was calibrated in situ using a probe tube placed about 2 mm from the eardrum. The calibration was essentially flat with fluctuations of <10 dB from 0.5 to 30 kHz. Within the bandwidth used for the analysis of each neuron (1.25 octaves, subsequently discussed), the SD of the calibrations around their values at the BFs of the neurons was about 2 dB. No correction for the calibration was applied during the analysis. The effects of fluctuations in the calibration on the analysis are small and have been ignored.
All data are from well-isolated single neurons, judged from the separation of the action-potential amplitude from noise and other action potentials and the presence of a refractory period. Isolated neurons were characterized using a combination of tones and broadband noise. Rate versus level functions were collected for best-frequency (BF) tones and noise by presenting 200-ms stimulus bursts (10-ms rise/fall times) once per second over an 80- to 100-dB range of sound levels. Type IV neurons were classified as having moderate spontaneous rates and BF-tone rate–level functions with excitation at low sound levels and inhibition at high sound levels (Shofner and Young 1985). Only neurons located along the electrode track before a BF gradient shift, which indicates a transition from DCN to VCN, were classified as DCN neurons. This paper describes data from DCN type IV neurons only.
The acoustic stimuli described in the next section were presented at a rate of one stimulus per 1.1 s. The stimulus duration was 399 ms. Each set of stimuli was presented over a range of sound levels, spaced at 5–10 dB, beginning near threshold. Response rate was computed as the number of spikes during the stimulus divided by the duration.
Stimuli
The random spectral shape (RSS) stimuli used here are similar to those used before (Young and Calhoun 2005; Yu and Young 2000). Each stimulus consists of a sum of tones spaced logarithmically at 1/64th octave. The tones are grouped into frequency bins of 1/8th octave and all 8 tones within a bin have the same amplitude; the sound level S(f) in each frequency bin is the sum of the energies of these tones. The starting phases of the tones were randomized for each stimulus to avoid a click at stimulus onset. Linear 10-ms onset and offset ramps were added to the time-domain signal. The stimuli were not corrected for spectral irregularities in the speaker calibration.
The RSS stimuli had a bandwidth of 6.125 octaves, centered on 5.75 kHz. Each RSS set consisted of 410 stimuli. In 400 of these, the dB amplitudes of the bins S(f) were selected pseudorandomly from an approximately Gaussian distribution with 0 mean and SD of 12, 6, or 3 dB; the SD is subsequently called "spectral contrast." S(f) is the dB level, relative to a reference sound level, of the sound in the bin centered on frequency f. The remaining 10 stimuli had the reference sound level in all bins, i.e., S(f) = 0 dB for all f. Stimuli were organized into successive plus–minus pairs, so that the dB levels of the first stimulus of the pair S_i(f) were inverted in the second stimulus, S_{i+1}(f) = –S_i(f), for i odd. These plus–minus pairs were used to separate the estimation of even- and odd-order terms, as described previously (Reiss et al. 2007). The 10 all-0-dB stimuli were used to estimate the reference rate R0 in the quadratic model described in the next section. Note that the all-0-dB stimuli are not "flat," in the usual sense. Because of the logarithmic spacing of tones, this spectrum actually has a 1/f shape.
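The construction of an RSS level set can be sketched as follows. This is a minimal illustration, not the original stimulus-generation code: it assumes exactly Gaussian levels, 49 bins (6.125 octaves divided into 1/8-octave bins), and places the 10 all-0-dB stimuli at the end of the set; the function name `make_rss_levels` and the ordering are hypothetical.

```python
import random


def make_rss_levels(n_bins=49, n_pairs=200, contrast_db=12.0, n_flat=10, seed=0):
    """Return per-bin dB levels S(f) for one RSS set, one list per stimulus.

    Assumptions (not from the paper): exact Gaussian draws, and the
    all-0-dB stimuli appended after the 400 random stimuli.
    """
    rng = random.Random(seed)
    stimuli = []
    for _ in range(n_pairs):
        # "Plus" stimulus: independent Gaussian dB level in every bin.
        plus = [rng.gauss(0.0, contrast_db) for _ in range(n_bins)]
        # "Minus" stimulus: the same levels with inverted sign.
        minus = [-s for s in plus]
        stimuli.append(plus)
        stimuli.append(minus)
    # Reference stimuli: S(f) = 0 dB in all bins.
    stimuli.extend([[0.0] * n_bins for _ in range(n_flat)])
    return stimuli


levels = make_rss_levels()
print(len(levels), len(levels[0]))
```

Only the dB levels per bin are generated here; converting them to tone amplitudes, randomizing phases, and applying the onset and offset ramps are separate steps.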
Data from four type IV neurons were included for which the 6- and 12-dB contrast stimulus sets were different from the earlier description. They were 3.75 octaves wide with a periodic structure that repeated every 1.25 octaves (9 frequency bins), and had 210 stimuli (including 10 all-0-dB stimuli) in each set. These stimuli were resampled during presentation to be centered at the BF of the neuron. Additionally, these sets did not have plus–minus pairs and thus their corresponding quadratic model was computed as in Young and Calhoun (2005). The weighting functions computed with the two types of stimuli behaved in the same fashion.
Quadratic weighting function model
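The average discharge rate r is modeled using a quadratic weighting function as follows
$$r = R_0 + \sum_j w_j S(f_j) + \sum_j \sum_k m_{jk} S(f_j) S(f_k) \qquad (1)$$
where R0 is the reference rate (the rate in response to the all-0-dB stimulus), wj are the first-order weights, and mjk are the second-order weights.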
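The role of the plus–minus stimulus pairs can be illustrated with a small numerical sketch. It assumes the quadratic model has the usual form of a reference rate R0 plus first-order terms with weights w and second-order terms with weights m; the function name and all numerical values are hypothetical. Inverting the sign of every S(f) flips the first-order (odd) terms but leaves the second-order (even) terms unchanged, so half the difference and half the sum of the paired rates isolate the two contributions.

```python
def quadratic_rate(S, R0, w, M):
    """Rate from a quadratic weighting model:
    R0 + sum_j w[j]*S[j] + sum_j sum_k M[j][k]*S[j]*S[k]."""
    n = len(S)
    first = sum(w[j] * S[j] for j in range(n))
    second = sum(M[j][k] * S[j] * S[k] for j in range(n) for k in range(n))
    return R0 + first + second


# Hypothetical two-bin example (values chosen only for illustration).
R0 = 50.0
w = [1.5, -0.5]
M = [[0.10, 0.02], [0.02, -0.05]]
S_plus = [3.0, -2.0]
S_minus = [-s for s in S_plus]

r_plus = quadratic_rate(S_plus, R0, w, M)
r_minus = quadratic_rate(S_minus, R0, w, M)

odd_part = (r_plus - r_minus) / 2.0        # first-order contribution
even_part = (r_plus + r_minus) / 2.0 - R0  # second-order contribution
print(odd_part, even_part)
```

In this sketch `odd_part` equals the first-order sum and `even_part` equals the second-order sum, which is why the two sets of weights can be estimated separately from paired stimuli.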
Testing validity and generality of the model
To test the quality of the fit, the model was estimated from 75% of the data points and then tested by using it to predict the remaining 25% of the data points. Confidence intervals were calculated using 200 bootstraps (Efron and Tibshirani 1993), each time estimating the model from 75% of the data and predicting responses to the remaining 25% of stimuli. The measure of prediction performance was the fraction of variance, defined as
$$fv = 1 - \frac{\sum_j (r_j - \hat{r}_j)^2}{\sum_j (r_j - \bar{r})^2} \qquad (2)$$
where $r_j$ is the measured rate in response to stimulus j, $\hat{r}_j$ is the rate computed by the model, and $\bar{r}$ is the mean rate. fv has a maximum value of 1, when the model fits the data perfectly, and decreases as the error increases. It is zero when the mean rate fits as well as the model and can go negative for poor models. Here, fv was not limited at zero.
Level-dependent weight model
To test the importance of weight variation with sound level, we developed a model in which the weights are explicitly a function of stimulus level. The quadratic model contains a form of weight variation with sound level. By regrouping the terms in Eq. 1, the quadratic model can be rewritten as a first-order weight summation with level-dependent weights as follows
$$r = R_0 + \sum_j \Big[ w_j + \sum_k m_{jk} S(f_k) \Big] S(f_j) \qquad (3)$$
In the level-dependent weight model (LDWM), the bracketed weight is replaced by a general function gj of the sound level Sj = S(fj) in bin j
$$r = R_0 + \sum_j g_j(S_j)\, S_j \qquad (4)$$
The cross-frequency terms (mjk for j ≠ k in Eqs. 1 and 3) are not explicitly present in this model. The weight in a particular frequency bin is assumed to be a function of the sound level in that bin only. Although this assumption reduces the performance of the model at low spectral contrast, it was made to keep down the number of parameters that have to be estimated. Figure 2 shows an example of a weight function gj(Sj) as a function of bin frequency fj (abscissa) and the stimulus level in that bin Sj (ordinate). The weights as a function of stimulus energy in the bin at BF, gBF(SBF), are shown as the red line on the back wall of the plot. In both fitting the model and computing its response to a stimulus, Eq. 4 is used and the weight in each frequency bin is determined from the stimulus energy in the bin by interpolation as subsequently described. This local linear variation of weights makes the LDWM locally quadratic, meaning that the weights vary with level as in Eq. 3, but only for the diagonal terms mjj, and not the cross-frequency terms mjk for j ≠ k. However, the slope of the weight variation also changes with level, which distinguishes Eq. 4 from Eq. 3.
The functions gi(Si) are piecewise linear and are specified by a matrix of weights W in successive segments, defined by the elbow points in the vector e
$$g_i(S_i) = W_{i,k} + \left(W_{i,k+1} - W_{i,k}\right)\frac{S_i - e_k}{e_{k+1} - e_k}, \qquad e_k \le S_i \le e_{k+1} \qquad (5)$$
where $W_{i,k}$ is the weight in frequency bin i at elbow point $e_k$. The elbow points are spaced Δ dB apart and the weights within each Δ-dB step are linearly interpolated based on the weights at the two ends of the step. Because the stimuli had some sound levels outside the endpoints of e, linear extrapolation of the gains was done at these levels, continuing the slope of the segment adjacent to the boundary of e. Because the rate for S = 0 is R0, by definition (Eq. 4) the weight at 0 dB cannot be estimated from the data. Thus the elbow points nearest 0 dB were usually offset from 0, i.e., placed at ±Δ/2. The highest elbow was usually 10–15 dB above the highest reference level in the data and similarly for the lowest elbow; this limit was necessary to guarantee sufficient data to estimate the gains.
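The evaluation of a piecewise-linear, level-dependent weight and of the resulting model rate can be sketched as follows. This is an illustrative sketch under stated assumptions, not the original analysis code: the rate is taken to be the reference rate plus the sum over bins of weight times level, with the output thresholded at 0; the function names, the 5-dB elbow spacing, and all weight values are hypothetical.

```python
from bisect import bisect_right


def weight_at(level, elbows, weights):
    """Piecewise-linear weight at `level` (dB).

    `elbows` is an increasing list of elbow levels and `weights` the weight
    at each elbow. Levels outside the elbow range are linearly extrapolated
    by continuing the slope of the nearest end segment.
    """
    k = bisect_right(elbows, level) - 1
    k = max(0, min(k, len(elbows) - 2))  # clamp to first/last segment
    slope = (weights[k + 1] - weights[k]) / (elbows[k + 1] - elbows[k])
    return weights[k] + slope * (level - elbows[k])


def ldwm_rate(S, R0, elbows, W):
    """Model rate: R0 + sum_j g_j(S_j) * S_j, thresholded at 0."""
    drive = sum(weight_at(s, elbows, W[j]) * s for j, s in enumerate(S))
    return max(0.0, R0 + drive)


# Hypothetical single-bin example; elbows offset from 0 dB by half a step.
elbows = [-7.5, -2.5, 2.5, 7.5]
W = [[0.0, 1.0, 1.0, -1.0]]  # weights at the elbows for bin 0

print(weight_at(5.0, elbows, W[0]))       # interpolated inside a segment
print(weight_at(10.0, elbows, W[0]))      # extrapolated above the top elbow
print(ldwm_rate([10.0], 10.0, elbows, W)) # negative drive is thresholded
```

Because the weight multiplies the level, the value of the weight at exactly 0 dB never affects the rate, which is consistent with placing the elbows nearest 0 dB away from 0.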
The LDWM was fit to a neuron's responses to several sets of RSS stimuli with different contrasts and overall sound levels. However, in the LDWM, the stimulus energies are all expressed relative to a single reference or all-0-dB stimulus component level, which is the S = 0 stimulus for the LDWM. Thus when used with Eq. 4 the stimuli from an RSS set with reference (all-0-dB) level A dB SPL are corrected for the reference sound level B of the LDWM by adding A – B to the Si of the RSS set. The model was fit by minimizing the chi-square error between rates and model predictions using a gradient-descent algorithm (the Matlab function lsqcurvefit available in the Matlab Optimization Toolbox). When computing rates from the model, its output was thresholded at 0, disallowing negative firing rates. The number of parameters estimated for the LDWM is the number of elements in the vector e times the number of frequency bins, which is of the order of 100. To maximize the ratio of data to parameters, the LDWM weights were computed over a continuous range of frequencies (approximately 1.25 octaves wide) symmetric around the BF of the neurons. To compare performances, the corresponding quadratic models were computed over the same range of frequencies. This range is wide enough to include all the significant nonzero weights in most neurons (Yu 2003).
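The reference-level correction described above amounts to a constant shift of all bin levels. A minimal sketch, with a hypothetical function name and hypothetical example levels:

```python
def to_ldwm_levels(S, set_reference_db, ldwm_reference_db):
    """Re-express RSS bin levels relative to the LDWM reference level.

    S holds dB levels relative to the RSS set's own all-0-dB level
    (A = set_reference_db, in dB SPL); the result is relative to the LDWM
    reference (B = ldwm_reference_db), obtained by adding A - B to every bin.
    """
    shift = set_reference_db - ldwm_reference_db
    return [s + shift for s in S]


# Example: an RSS set presented at 0 dB SPL per component, expressed
# relative to an LDWM reference of -20 dB SPL per component.
shifted = to_ldwm_levels([-3.0, 0.0, 6.0], 0.0, -20.0)
print(shifted)
```

After this shift, stimuli from sets with different reference levels share one level axis, so a single set of level-dependent weights can be fit to all of them.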
RESULTS
Two assumptions are important to the LDWM: first, that the weight varies with the stimulus level and second that the stimulus level in a particular frequency bin is the primary determinant of the weight in that bin. The plausibility of these assumptions is supported by Fig. 3, which shows first-order weights (wi in Eq. 1) computed from subsets of the overall RSS stimulus set by constraining the amplitudes in one frequency bin. A subset of 100 of the 400 RSS stimuli from a set with spectral contrast of 12 dB was chosen, these being the ones with the smallest 100 amplitudes in a particular frequency bin, say fC. The result was to constrain the amplitudes S(fC) to range from approximately –3.8 to +3.8 dB, with small variation depending on the choice of fC. Because the bin amplitudes are independent, this selection did not systematically change the distribution of amplitudes in the other bins, and those remained at 12-dB spectral contrast. The first-order weights were then recomputed for the chosen subset of stimuli. In Fig. 3, these constrained weights are compared with the weights for the full 3- and 12-dB RSS stimulus sets, for three constraint bins, as indicated in the figure legend. The symbols mark the bin with the stimulus-amplitude constraint.
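The quoted range of about ±3.8 dB can be checked with a short calculation. This sketch assumes that the bin levels are exactly Gaussian with SD 12 dB and that "smallest amplitudes" means smallest absolute dB values: keeping the 100 smallest of 400 retains the central 25% of the distribution, so the upper bound is the 62.5th percentile.

```python
from statistics import NormalDist

contrast_db = 12.0          # SD of the bin levels (spectral contrast)
fraction_kept = 100 / 400   # stimuli retained in the constrained subset

# Central 25% of a zero-mean Gaussian: P(|S| < a) = 0.25,
# so a is the quantile at 0.5 + 0.25 / 2 = 0.625.
upper_quantile = 0.5 + fraction_kept / 2
bound_db = contrast_db * NormalDist().inv_cdf(upper_quantile)
print(round(bound_db, 1))  # 3.8
```

The small variation with the choice of constraint bin noted in the text is expected, because each bin contains a finite sample of 400 levels.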
Figure 3 shows that constraining the stimulus amplitude in one bin causes the first-order weights to change substantially from those computed with 12-dB contrast stimuli (heavy dotted line); it is important that the change occurs primarily in the constrained bin. For example, the weights in bins 6 (circle) and 7 (left triangle) increase significantly when those bins are constrained. The weight in bin 5 (diamond) does not change when it is constrained, perhaps because the 3- and 12-dB contrast stimuli give roughly the same weight in bin 5. In this example, the weights in constrained bins approach the weight size for 3-dB contrast (heavy dashed line). In other neurons, the weight in the constrained bin does not always approach the 3-dB-contrast value, but the change in weight due to constraint is always in the constrained bin. This result shows that the estimated weight wC in a particular bin is affected by the presence of high energy [large S(fC)] in that bin. Although the constrained weights do not equal the weights for 3-dB spectral contrast, this analysis shows that the effects of changes in level are largely local, confined to the frequency bin itself, and provides a motivation for the LDWM formulation.
Level-dependent weight model (LDWM)
The LDWM of Eq. 4 was fit to data from 21 DCN type IV neurons. Generally, data from multiple spectral contrasts and reference levels were used depending on the data available. For comparison, quadratic models were fit to the same data. Usually the LDWM was fit to the whole data set and the quadratic model was fit separately to each RSS data set (i.e., a set of RSS stimuli at one reference level and spectral contrast).
Figure 4 shows the weights for a type IV DCN neuron (BF = 21.1 kHz) in a three-dimensional (3D) plot (Fig. 4A). The same weights are plotted as contours of weight versus frequency at various sound levels (Fig. 4B). These sound levels are the elbow points for the piecewise linear fit of the weights (ek in Eq. 5). In this case, the reference level was set below the threshold of the neuron (–20-dB SPL per component). At levels just above threshold (17–23 dB re reference in Fig. 4B), the weights are positive and narrowly tuned near BF. At higher levels, the neuron is inhibited by frequencies at and below BF and is excited by frequencies just above BF. This pattern of first-order weight variation is observed in quadratic models of about 60% of DCN type IV neurons (Yu 2003). Looking across sound level at a fixed frequency, it is clear that the weight variation with sound level is not linear, as assumed in the quadratic model (Eq. 1). Furthermore, the weights in each frequency bin change differently, disallowing a separable model consisting of a frequency tuning function multiplied by a single nonlinear function of level.
The SDs of the model parameters were estimated by bootstrap, where the estimation was done 100 times based on 75% of the data, randomly chosen without replacement from the full set. The SDs of the weights are shown as error bars in Fig. 4B; they are small because this is a highly overdetermined system of equations (approximately 100 parameters estimated from approximately 2,000 equations). The robustness of the estimation algorithm was also tested by starting from random initial values for the weights and repeating the gradient descent; the resulting weight estimates had small variations, comparable to the error bars in Fig. 4B.
The quality of the model was tested using the fv (Eq. 2) when predicting responses to the 25% of data not included in the model estimation. Figure 4C shows a distribution of fv values for this neuron. The performance is quite good for all spectral contrasts and sound levels, except for the cases with fv near 0 (arrow). These were all cases of 3-dB spectral contrast at a sound level near threshold where the rate responses were small and the threshold nonlinearity was not well fit by the model. Cases with 3-dB spectral contrast at higher reference levels gave good performance and lie in the peak centered near 0.8.
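The prediction test can be sketched as follows. The sketch assumes that fv is one minus the ratio of the summed squared prediction error to the summed squared deviation of the rates from their mean, which matches the properties stated in METHODS (1 for a perfect fit, 0 when the mean rate does as well as the model, negative for poor models). The function names and example rates are hypothetical.

```python
import random


def fraction_of_variance(rates, predicted):
    """fv = 1 - sum (r - r_hat)^2 / sum (r - mean(r))^2; may be negative."""
    mean_rate = sum(rates) / len(rates)
    err = sum((r - p) ** 2 for r, p in zip(rates, predicted))
    total = sum((r - mean_rate) ** 2 for r in rates)
    return 1.0 - err / total


def split_75_25(n, rng):
    """Random split of n stimulus indices into 75% fit and 25% test sets."""
    idx = list(range(n))
    rng.shuffle(idx)
    cut = (3 * n) // 4
    return idx[:cut], idx[cut:]


rates = [10.0, 20.0, 30.0]
print(fraction_of_variance(rates, [10.0, 20.0, 30.0]))  # perfect prediction
print(fraction_of_variance(rates, [20.0, 20.0, 20.0]))  # predicting the mean
print(fraction_of_variance(rates, [30.0, 20.0, 10.0]))  # poor prediction

fit_idx, test_idx = split_75_25(400, random.Random(0))
```

Repeating the split, refitting the model on `fit_idx`, and computing fv on `test_idx` yields a distribution of fv values like the one summarized in Fig. 4C.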
A second example of an LDWM is shown in Fig. 5. In this case the model was estimated from data collected with 3- and 12-dB spectral contrasts at only one reference level, so the range of levels over which the LDWM was estimated is small compared with the neuron in Fig. 4. Again, the weights are nonmonotonic with level, decreasing at the highest level. Clearly the quadratic model will not be able to fit this weight variation. In Fig. 5B, the performance of the LDWM in Fig. 5A is compared with that of three quadratic models in predicting data obtained with 3-, 6-, and 12-dB spectral contrast (in each case the quadratic model was fit to data at the same spectral contrast). At 3 dB, both models work well (fv = 0.77 for the LDWM and 0.87 for the quadratic model). At 6 and 12 dB, the LDWM does better than the quadratic model (at 6-dB contrast, fv = 0.66 LDWM and 0.20 quadratic; at 12-dB contrast, fv = 0.47 LDWM and 0.0 quadratic). An important point to note is that the LDWM does better than the quadratic model in predicting responses to the 6-dB stimuli, even though stimuli of 6-dB spectral contrast were not used in fitting the LDWM. The fv values given in the previous sentences are bootstrap averages and not the values for the examples in the figure.
Relative performance of the LDWM and quadratic model
Over all the 21 neurons studied, the LDWM generally performed better at predicting rates for 12-dB spectral contrast, whereas the quadratic model generally performed better at 3-dB spectral contrast. Figure 6 shows the fv values for rate predictions at the two spectral contrasts with the two models. The fv values for the quadratic model's fits are shown along the abscissa; they are better for 3-dB contrast (x symbols, median 0.71) than for the 12-dB contrast (circles, median 0.26, significantly different, P ≈ 6 × 10–6 by rank-sum). For the LDWM on the ordinate, the data have similar medians (0.49 for 12 dB and 0.46 for 3 dB, NS). However, it is better to compare fv values within a neuron; for this comparison, notice that the x symbols, which show data for 3-dB contrast, are mostly below the dashed line (41/58), meaning better performance for the quadratic model; the circles, for 12-dB contrast, are mostly above the line (26/38; different at P < 10–4 by χ2), meaning better performance for the LDWM.
Finally, in comparing prediction performance of the models directly, for 12-dB contrast the median performance of the LDWM (0.49) is significantly greater (P ≈ 0.0003 by rank-sum) than that of the quadratic model (0.26). On the other hand, for 3-dB contrast the median performance of the quadratic model (0.71) is significantly greater (P ≈ 0.0033 by rank-sum) than that of the LDWM (0.46). Thus the quadratic model does better for small contrast, whereas the LDWM does better for large contrast.
Quadratic models derived from the LDWM
In a previous paper (Reiss et al. 2007) it was shown that the weights of the quadratic model are larger for responses to stimuli with 3-dB contrast (as in Fig. 1B) than with 12-dB contrast. If the LDWM is an accurate measure of the neuron's receptive field, then it should predict this change in weight amplitude with spectral contrast. A test of this idea is to fit the LDWM to a set of rate responses to RSS stimuli, compute artificial rate responses from this model for 3- and 12-dB RSS stimulus sets, and then fit the quadratic model to the two sets of artificial data. Figure 7 shows this calculation for the same neuron as in Fig. 4. Quadratic model weights computed from the actual and model data are compared; results from the 3-dB contrast are in the top half of the figure and for 12-dB contrast in the bottom half. The first-order weights computed from actual rate data (dotted lines in Fig. 7, A and C) and model data (solid lines in Fig. 7, A and C) agree qualitatively in that they have the same excitatory and inhibitory regions, although the weight values often differ by >1 SD. The two kinds of weights also change shape in the same way as stimulus level changes (indicated by different colors). More important, as with actual data, the weights computed from the model data are significantly larger for the 3-dB contrast (top row) compared with the 12-dB contrast (see following text). The second-order weights (Fig. 7, B and D) are compared at three sound levels and show a similar qualitative agreement. Note that the on-diagonal second-order weights (terms mjj) are better reproduced than the off-diagonal weights. Apparently the effects of the off-diagonal weights are small, although they may account for some of the difference between the data and model in Fig. 7, A and C.
For this neuron, the quadratic model weights (Fig. 7) are similar in shape to the cross-sectional weights of the LDWM (Fig. 4B). Note, however, that the magnitude of the weights for the LDWM cannot be directly compared with the weights of the quadratic model. The quadratic model's weights are always expressed with reference to the all-0-dB stimulus of the RSS stimulus set. However, for the LDWM, the weights are expressed with respect to a particular fixed reference level for all stimuli. As explained in METHODS, the two reference levels are not necessarily the same. Changing the LDWM's reference will change the magnitude of the weights of the LDWM.
DISCUSSION
The LDWM differs from previous STRF and weight-function models by explicitly accounting for the sound levels of the frequency components of a stimulus when calculating a weighted sum across frequency. As such it provides a method of evaluating the importance of stimulus level in the formulation of auditory receptive-field models. The results across a population of DCN type IV neurons show a nonmonotonic variation of weights with sound level in which weights first increase with level and then decrease rapidly and become negative, or approach negative values, at levels a few tens of dB above threshold (Figs. 2, 4, and 5). It is important to note that the variation of weights with level is different for different frequencies and thus cannot be represented by a separable model of frequency and level tuning. A test of separability done by performing a singular value decomposition (SVD) on the matrix of LDWM weights (Sen et al. 2001) indicates inseparability in frequency and level.
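One common way to summarize an SVD-based separability test is the fraction of the total squared singular values carried by the largest one; a value of 1 means the matrix is an outer product of a frequency function and a level function. Whether this exact index was used here is not stated, so the following is only an illustrative sketch. To stay dependency-free it estimates the largest singular value by power iteration rather than a full SVD; the function name and the example matrices are hypothetical.

```python
def separability_index(W, n_iter=500):
    """Return sigma_1^2 / sum_i sigma_i^2 for matrix W (list of rows).

    The sum of squared singular values equals the squared Frobenius norm.
    sigma_1 is found by power iteration on W^T W (a stand-in for a full SVD).
    Assumes W is nonzero and the start vector is not orthogonal to the
    leading right singular vector.
    """
    n_rows, n_cols = len(W), len(W[0])
    total = sum(x * x for row in W for x in row)
    v = [1.0 + 0.1 * c for c in range(n_cols)]  # arbitrary start vector
    for _ in range(n_iter):
        u = [sum(W[r][c] * v[c] for c in range(n_cols)) for r in range(n_rows)]
        v = [sum(W[r][c] * u[r] for r in range(n_rows)) for c in range(n_cols)]
        norm = sum(x * x for x in v) ** 0.5
        v = [x / norm for x in v]
    u = [sum(W[r][c] * v[c] for c in range(n_cols)) for r in range(n_rows)]
    sigma1_sq = sum(x * x for x in u)  # |W v|^2 for unit-norm v
    return sigma1_sq / total


# Separable example: weights are a frequency profile times a level profile.
freq_profile = [1.0, 2.0, -1.0]
level_profile = [0.5, 1.0, -2.0]
separable = [[f * l for l in level_profile] for f in freq_profile]

# Inseparable example: two equal, independent components.
inseparable = [[1.0, 0.0], [0.0, 1.0]]

print(separability_index(separable), separability_index(inseparable))
```

An index well below 1 for a matrix of level-dependent weights indicates that no single frequency profile scaled by a single level function reproduces the weights.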
The nonmonotonic behavior of LDWMs seems to be sufficient to account for the apparent nonlinearity of DCN type IV neurons (Reiss et al. 2007; Yu 2003; Yu and Young 2000) observed with RSS stimuli or with natural spectra such as head-related transfer functions, which show a similar 10- to 20-dB range of variation of component stimulus levels as the RSS stimuli (Musicant et al. 1990; Rice et al. 1992). In both cases the quadratic model does a poor job of predicting responses to stimuli with approximately 12-dB spectral contrast (Fig. 6). The quadratic model assumes a linear dependence of weights on level (Eq. 3), which is inadequate to fit the weights plotted in Figs. 2, 4, and 5. As a result the calculation of weights for such neurons represents an averaging between larger weights for small stimulus-level deviations and smaller weights for large deviations. The result is a poorly fitting quadratic model that is a compromise between the small- and large-deviation regimes causing weights to be smaller for 12- than for 3-dB spectral contrast. For stimuli with 3-dB spectral contrast, the quadratic model is adequate because the small level-deviations fall within a range where the weights are linearly dependent on level.
The considerations in the previous paragraph do not explain why the quadratic model does better than the LDWM at 3-dB spectral contrast (Fig. 6). Presumably the quadratic model includes important interactions between frequencies (the terms involving mjk for j ≠ k in Eqs. 1 and 3) that are not included in the LDWM explicitly. At 12-dB contrast, the effect of sound level is the dominant effect and the LDWM does better, but at 3-dB contrast the effect of sound level is smaller and is well approximated by the quadratic model, so the interaction among frequencies becomes the important effect.
Comparison with tone response maps and implicit cross-frequency interactions
The dimensions of the LDWM, frequency and level, match with those of the classical receptive field characterization, i.e., tone response maps (e.g., Fig. 1A). In spite of this match the two are fundamentally different characterizations. The LDWM gives the sensitivity of the neuron to changes in stimulus energy in a particular frequency bin at a particular energy level in a broadband sound. Such sensitivity cannot be derived using tones. The differences between the spectral weighting functions (Fig. 1B) and the tone response map profiles (Fig. 1A) clearly underline this fact. Tone response maps provide information about the sensitivity of the neuron to narrowband stimuli in the absence of energy in surrounding frequencies. Thus the independence of frequency channels shown in Fig. 3 should not be taken to mean that the LDWM can be derived using tones or narrowband stimuli separately in each frequency channel. Nelken and colleagues (1994a,b) investigated a similar question using multiunit activity in auditory cortex. They found that the responses were determined mainly by the single-tone tuning with strong modulation by two-tone interactions and only weak modulation by additional tones. Thus tone and broadband tuning are expected to differ substantially, as in Fig. 1, with most of the difference occurring in the transition from one frequency component to two.
Furthermore, the broadband nature of the stimuli used to derive the LDWM creates implicit frequency interactions. The LDWM weight in a particular frequency bin at a particular level is the sensitivity of the neuron to changes in stimulus energy in that bin and level when energy in all the other bins is at the average energy level of all the stimuli. Thus an average cross-frequency interaction is present in the LDWM and this interaction effectively changes with reference level. As a result, the second-order models derived from the LDWM (Fig. 7, B and D) have weak cross-frequency terms in spite of having no such explicit interactions in the model. Frequency response maps, on the other hand, are devoid of any such interactions.
Contrast and luminance dependence in vision
A considerable amount of literature exists that addresses the issue of contrast and luminance gain control in the early visual system (Bonin et al. 2006; Mante et al. 2005; Shapley and Victor 1981; Zaghloul et al. 2005). Luminance adaptation occurs at the level of the retina, and contrast gain control is present in the retina and enhanced in later stages. In all these cited studies, gain control has been studied in terms of a spatiotemporal or temporal filter, and not just spatial filters that are analogous to the frequency weighting functions in the present study. Contrast gain control in vision also shows similar results in that the sizes of the filters change with stimulus contrast, whereas similar shape is maintained. Additionally, these filters have shorter integration times with increased contrast. However, unlike our study, prediction performance of the linear filters (followed by a static nonlinearity) usually remained equally good at all contrast sizes; for example, in the LGN, mean explained variances are consistently >70% (Bonin et al. 2006). Thus even in early vision, although fairly linear, contrast gain control requires determination of the filters for each different contrast and mean luminance to obtain a predictive model. Thus finding a single predictive model for all contrasts together requires a luminance-dependent weight model much like the LDWM. However, independence of contrast and luminance gain controls (Mante et al. 2005) suggests other ways of combining data from different luminance and contrasts into a single model. Such independence of contrast and sound level is absent in DCN type IV neurons.
Implications for STRFs
Because the STRF, minus its temporal component, is similar to the first-order weight function of the quadratic model (Young et al. 2005), the results shown here imply that STRFs should also depend on stimulus contrast and stimulus level. In addition, the predictive ability of STRFs should improve for stimuli with lower contrast. STRFs derived from stimuli with "natural" contrasts, on the scale of 12 dB, may reflect a compromise process similar to that postulated earlier for weights, providing an explanation for the poor prediction performance of STRFs (Machens et al. 2004) and the stimulus dependence of STRF shape (Theunissen et al. 2000; Valentine and Eggermont 2004). STRFs represent a linear function of the stimulus parameters, but those parameters are obtained with a nonlinear transformation (energy or envelope) of the stimulus (Escabi and Read 2003; Theunissen et al. 2000), like dB energy in the present study. It may be possible to improve the prediction performance of STRFs by properly choosing the nonlinear measure, as in the square-law function in models of complex cells in visual cortex (Carandini et al. 2005). However, it is doubtful that such a strategy could capture both the nonmonotonicity of the LDWM and the benefits of frequency interactions, demonstrated at low spectral contrast. Finally, this discussion is based on the spectral nonlinearity alone. It may be that nonlinearities are present in temporal interactions as well, which could additionally affect performance of STRFs.
Sources of nonlinearity in the DCN
The nonlinear behavior typical of DCN neurons is not seen in the inputs to the DCN from the auditory nerve (Young and Calhoun 2005) nor in neurons of the VCN (Yu 2003; Yu and Young 2000). Thus the nonlinearity of DCN principal cells is a computational property of the DCN's interneuronal circuits. Nonmonotonicity of rate responses across sound level is a defining feature of DCN principal-cell (type IV) responses (Spirou and Young 1991) and has been attributed to inhibitory inputs from vertical cells, the so-called type II interneurons (Voigt and Young 1990). Previous analyses of DCN nonlinearity led to the conclusion that nonlinear responses are observed in DCN principal neurons for stimuli that activate the type II interneurons (Nelken and Young 1997; Nelken et al. 1997). Although these interneurons project to the VCN (Ostapoff et al. 1999; Wickesberg and Oertel 1988; Zhang and Oertel 1993) and inhibitory responses are seen in VCN neurons in vivo (e.g., Caspary et al. 1994; Ingham et al. 2006; Kopp-Scheinpflug et al. 2002), inhibitory effects seem to be weaker in VCN, and specific effects of type II inhibition in VCN have not been identified in vivo. Thus the nonmonotonicity of type IV responses produced by inhibitory inputs from vertical cells remains the most likely source of the nonlinearity of the DCN output representation. The role of other inhibitory inputs (Davis and Young 2000; Reiss and Young 2005) is not clear.
Different degrees of edge sensitivity in type IV neurons: predictions of the LDWM
A previous paper (Reiss and Young 2005) identified three classes of DCN type IV neurons according to their sensitivity to steep rising spectral edges, such as the lower-frequency edge of a noise band or the upper-frequency edge of a noise notch. These three groups were generally characterized by different responses to broadband noise, as seen in rate versus level functions. Figure 8, A–C, shows rate–level functions for three neurons studied here whose properties correspond to the groups defined by Reiss and Young. All three neurons have nonmonotonic tone rate–level functions (solid lines) that are typical of type IV neurons; the noise rate–level functions (dashed) define the three groups, as subsequently described. The corresponding LDWMs are plotted in Fig. 8, D–F.
The differences between the three LDWMs in Fig. 8 are subtle, involving the depth and strength of inhibitory inputs. These results illustrate that small differences in the receptive field can lead to large differences in responses to properly chosen stimuli. The fact that the LDWM shows differences in responses to edge stimuli that correspond to those predicted for these neurons provides support for the usefulness of the LDWM.
Advantages and limitations of the LDWM
The LDWM has the drawback of all weighting-function models relative to STRFs, in that it does not contain information about time-domain responses. Nevertheless, it is an efficient way to characterize the spectral characteristics of a neuron. Although the number of stimuli used in deriving the LDWMs in this study was usually about 1,000, such models can be derived from far fewer stimuli. In fact, an advantage of the LDWM over a series of quadratic models covering the same range of sound levels is that the latter requires more stimuli because more parameters must be estimated. Of course, this efficiency occurs because the LDWM lacks explicit interactions between frequencies. Inspection of Fig. 7, B and D shows that the second-order weights derived from model data have smaller interactive terms (off-diagonal elements) than those from the actual data. Interactions across frequency can be explicitly added to the LDWM, at a cost of many additional parameters.
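The parameter-count argument can be made concrete with a rough tally. The sketch below is illustrative only and rests on assumptions that are not stated in the text: the quadratic model is taken to have a constant, one first-order weight per frequency bin, and a full symmetric matrix of second-order weights, and the level-dependent model is represented by a hypothetical form with a fixed number of parameters per frequency bin and no cross-frequency terms. The bin count, number of levels, and parameters per bin are arbitrary values, not those of the study.

```python
# Illustrative parameter tally; model forms and numbers are assumptions, not the study's.

def quadratic_param_count(n_bins):
    """Constant + first-order weights + unique entries of a symmetric second-order matrix."""
    return 1 + n_bins + n_bins * (n_bins + 1) // 2


def quadratic_series_param_count(n_bins, n_levels):
    """One independent quadratic model fitted at each sound level."""
    return n_levels * quadratic_param_count(n_bins)


def level_dependent_param_count(n_bins, params_per_bin):
    """Hypothetical level-dependent model: constant + a few parameters per frequency bin."""
    return 1 + n_bins * params_per_bin


n_bins = 20          # hypothetical number of frequency bins
n_levels = 5         # hypothetical number of sound levels covered
params_per_bin = 4   # hypothetical parameters describing each bin's level dependence

series_total = quadratic_series_param_count(n_bins, n_levels)
level_dependent_total = level_dependent_param_count(n_bins, params_per_bin)
print(series_total, level_dependent_total)
```

Under these assumptions most of the difference comes from the second-order terms, which grow roughly with the square of the number of frequency bins and are repeated at every level.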
One alternative to models like the LDWM is a network model based on the hypothesized organization of the DCN. Network models replicate many aspects of DCN type IV responses (Hancock and Voigt 1999; Reiss and Young 2005; Zheng and Voigt 2006), although estimating parameters of the network from principal cell responses suffers from a lack of uniqueness. Further, because of incomplete understanding of the circuitry and properties of the circuit elements in the DCN (Davis and Young 2000; Reiss and Young 2005) and higher auditory nuclei, such models require additional assumptions. Systems models like the LDWM do not suffer from such problems.
GRANTS |
|---|
|
|
|
ACKNOWLEDGMENTS |
|---|
|
Present address of L. Reiss: Department of Speech Pathology and Audiology, University of Iowa, Iowa City, IA 52242.
FOOTNOTES |
|---|
Address for reprint requests and other correspondence: E. D. Young, Department of Biomedical Engineering, Johns Hopkins University, 720 Rutland Ave., Baltimore, MD 21205 (E-mail: eyoung@jhu.edu).
REFERENCES |
|---|
|
Bonin V, Mante V, Carandini M. The statistical computation underlying contrast gain control. J Neurosci 26: 6346–6353, 2006.
Carandini M, Demb JB, Mante V, Tolhurst DJ, Dan Y, Olshausen BA, Gallant JL, Rust NC. Do we know what the early visual system does? J Neurosci 25: 10577–10597, 2005.
Caspary DM, Backoff PM, Finlayson PG, Palombi PS. Inhibitory inputs modulate discharge rate within frequency receptive fields of anteroventral cochlear nucleus neurons. J Neurophysiol 72: 2124–2133, 1994.
Davis KA, Young ED. Pharmacological evidence of inhibitory and disinhibitory neuronal circuits in dorsal cochlear nucleus. J Neurophysiol 83: 926–940, 2000.
de Boer E, de Jongh HR. On cochlear encoding: potentialities and limitations of the reverse-correlation technique. J Acoust Soc Am 63: 115–135, 1978.
deCharms RC, Blake DT, Merzenich MM. Optimizing sound features for cortical neurons. Science 280: 1439–1443, 1998.
Depireux DA, Simon JZ, Klein DJ, Shamma SA. Spectro-temporal response field characterization with dynamic ripples in ferret primary auditory cortex. J Neurophysiol 85: 1220–1234, 2001.
Efron B, Tibshirani R. An Introduction to the Bootstrap. New York: Chapman & Hall, 1993.
Eggermont JJ, Aertsen AM, Johannesma PI. Prediction of the responses of auditory neurons in the midbrain of the grass frog based on the spectro-temporal receptive field. Hear Res 10: 191–202, 1983a.
Eggermont JJ, Aertsen AM, Johannesma PI. Quantitative characterisation procedure for auditory neurons based on the spectro-temporal receptive field. Hear Res 10: 167–190, 1983b.
Escabi MA, Read HL. Representation of spectrotemporal sound information in the ascending auditory pathway. Biol Cybern 89: 350–362, 2003.
Escabi MA, Schreiner CE. Nonlinear spectrotemporal sound analysis by neurons in the auditory midbrain. J Neurosci 22: 4114–4131, 2002.
Fritz J, Shamma S, Elhilali M, Klein D. Rapid task-related plasticity of spectrotemporal receptive fields in primary auditory cortex. Nat Neurosci 6: 1216–1223, 2003.
Hancock KE, Voigt HF. Wideband inhibition of dorsal cochlear nucleus type IV units in cat: a computational model. Ann Biomed Eng 27: 73–87, 1999.
Ingham NJ, Bleeck S, Winter IM. Contralateral inhibitory and excitatory frequency response maps in the mammalian cochlear nucleus. Eur J Neurosci 24: 2515–2529, 2006.
Kopp-Scheinpflug C, Dehmel S, Dorrscheidt GJ, Rubsamen R. Interaction of excitation and inhibition in anteroventral cochlear nucleus neurons that receive large endbulb synaptic endings. J Neurosci 22: 11004–11018, 2002.
Machens CK, Wehr MS, Zador AM. Linearity of cortical receptive fields measured with natural sounds. J Neurosci 24: 1089–1100, 2004.
Mante V, Frazor RA, Bonin V, Geisler WS, Carandini M. Independence of luminance and contrast in natural scenes and in the early visual system. Nat Neurosci 8: 1690–1697, 2005.
Musicant AD, Chan JC, Hind JE. Direction-dependent spectral properties of cat external ear: new data and cross-species comparisons. J Acoust Soc Am 87: 757–781, 1990.
Nagel KI, Doupe AJ. Temporal processing and adaptation in the songbird auditory forebrain. Neuron 51: 845–859, 2006.
Nelken I, Kim PJ, Young ED. Linear and nonlinear spectral integration in type IV neurons of the dorsal cochlear nucleus. II. Predicting responses with the use of nonlinear models. J Neurophysiol 78: 800–811, 1997.
Nelken I, Prut Y, Vaadia E, Abeles M. Population responses to multifrequency sounds in the cat auditory cortex: four-tone complexes. Hear Res 72: 223–236, 1994a.
Nelken I, Prut Y, Vaadia E, Abeles M. Population responses to multifrequency sounds in the cat auditory cortex: one- and two-parameter families of sounds. Hear Res 72: 206–222, 1994b.
Nelken I, Young ED. Linear and nonlinear spectral integration in type IV neurons of the dorsal cochlear nucleus. I. Regions of linear interaction. J Neurophysiol 78: 790–799, 1997.
Ostapoff EM, Morest DK, Parham K. Spatial organization of the reciprocal connections between the cat dorsal and anteroventral cochlear nuclei. Hear Res 130: 75–93, 1999.
Qiu A, Schreiner CE, Escabi MA. Gabor analysis of auditory midbrain receptive fields: spectro-temporal and binaural composition. J Neurophysiol 90: 456–476, 2003.
Reiss LA. Spectral Coding and Nonlinearity in the Dorsal Cochlear Nucleus (PhD thesis). Baltimore, MD: Johns Hopkins Univ., 2005.
Reiss LA, Bandyopadhyay S, Young E. Effects of stimulus spectral contrast on receptive fields of dorsal cochlear nucleus neurons. J Neurophysiol (August 1, 2007). doi:10.1152/jn.01239.2006.
Reiss LA, Young ED. Spectral edge sensitivity in neural circuits of the dorsal cochlear nucleus. J Neurosci 25: 3680–3691, 2005.
Rice JJ, May BJ, Spirou GA, Young ED. Pinna-based spectral cues for sound localization in cat. Hear Res 58: 132–152, 1992.
Sen K, Theunissen FE, Doupe AJ. Feature analysis of natural sounds in the songbird auditory forebrain. J Neurophysiol 86: 1445–1458, 2001.
Shapley RM, Victor JD. How the contrast gain control modifies the frequency responses of cat retinal ganglion cells. J Physiol 318: 161–179, 1981.
Shofner WP, Young ED. Excitatory/inhibitory response types in the cochlear nucleus: relationships to discharge patterns and responses to electrical stimulation of the auditory nerve. J Neurophysiol 54: 917–939, 1985.
Spirou GA, Young ED. Organization of dorsal cochlear nucleus type IV unit response maps and their relationship to activation by bandlimited noise. J Neurophysiol 66: 1750–1768, 1991.
Sutter ML, Schreiner CE, McLean M, O'Connor KN, Loftus WC. Organization of inhibitory frequency receptive fields in cat primary auditory cortex. J Neurophysiol 82: 2358–2371, 1999.
Temchin AN, Recio-Spinoso A, van Dijk P, Ruggero MA. Wiener kernels of chinchilla auditory-nerve fibers: verification using responses to tones, clicks, and noise and comparison with basilar-membrane vibrations. J Neurophysiol 93: 3635–3648, 2005.
Theunissen FE, Sen K, Doupe AJ. Spectral-temporal receptive fields of nonlinear auditory neurons obtained using natural sounds. J Neurosci 20: 2315–2331, 2000.
Valentine PA, Eggermont JJ. Stimulus dependence of spectro-temporal receptive fields in cat primary auditory cortex. Hear Res 196: 119–133, 2004.
Versnel H, Shamma SA. Spectral-ripple representation of steady-state vowels in primary auditory cortex. J Acoust Soc Am 103: 2502–2514, 1998.
Voigt HF, Young ED. Cross-correlation analysis of inhibitory interactions in dorsal cochlear nucleus. J Neurophysiol 64: 1590–1610, 1990.
Wickesberg RE, Oertel D. Tonotopic projection from the dorsal to the anteroventral cochlear nucleus of mice. J Comp Neurol 268: 389–399, 1988.
Yeshurun Y, Wollberg Z, Dyn N. Prediction of linear and non-linear responses of MGB neurons by system identification methods. Bull Math Biol 51: 337–346, 1989.
Young ED, Calhoun BM. Nonlinear modeling of auditory-nerve rate responses to wideband stimuli. J Neurophysiol 94: 4441–4454, 2005.
Young ED, Yu JJ, Reiss LA. Non-linearities and the representation of auditory spectra. Int Rev Neurobiol 70: 135–168, 2005.
Yu JJ. Spectral Information Encoding in the Cochlear Nucleus and Inferior Colliculus: A Study Based on the Random Spectral Shape Method (PhD thesis). Baltimore, MD: Johns Hopkins Univ., 2003.
Yu JJ, Young ED. Linear and nonlinear pathways of spectral information transmission in the cochlear nucleus. Proc Natl Acad Sci USA 97: 11780–11786, 2000.
Zaghloul KA, Boahen K, Demb JB. Contrast adaptation in subthreshold and spiking responses of mammalian Y-type retinal ganglion cells. J Neurosci 25: 860–868, 2005.
Zhang S, Oertel D. Tuberculoventral cells of the dorsal cochlear nucleus of mice: intracellular recordings in slices. J Neurophysiol 69: 1409–1421, 1993.
Zheng X, Voigt HF. Computational model of response maps in the dorsal cochlear nucleus. Biol Cybern 95: 233–242, 2006.