The Journal of Neurophysiology Vol. 84 No. 1 July 2000, pp. 255-273
Copyright ©2000 by the American Physiological Society
Center for Neural Science, New York University, New York, New York 10003
ABSTRACT
Krishna, B. Suresh and Malcolm N. Semple. Auditory Temporal Processing: Responses to Sinusoidally Amplitude-Modulated Tones in the Inferior Colliculus. J. Neurophysiol. 84: 255-273, 2000. Time-varying envelopes are a common feature of acoustic communication signals like human speech and induce a variety of percepts in human listeners. We studied the responses of 109 single neurons in the inferior colliculus (IC) of the anesthetized Mongolian gerbil to contralaterally presented sinusoidally amplitude-modulated (SAM) tones with a wide range of parameters. Modulation transfer functions (MTFs) based on average spike rate (rMTFs) showed regions of enhancement and suppression, where spike rates increased or decreased respectively as stimulus modulation depth increased. Specifically, almost all IC rMTFs could be described by some combination of a primary and a secondary region of enhancement and an intervening region of suppression, with these regions present to varying degrees in individual rMTFs. rMTF characteristics of most neurons were dependent on sound pressure level (SPL). rMTFs in most neurons with "onset" or "onset-sustained" peri-stimulus time histograms (PSTHs) in response to brief pure tones showed only a peaked primary region of enhancement. The region of suppression tended to occur in neurons with "sustained" or "pauser" PSTHs, and usually emerged at higher SPLs. The secondary region of enhancement was only found in eight neurons. The lowest modulation frequency at which the spike rate reached a clear peak ("best modulation frequency" or BMF) was measured. All but two mean BMFs lay between 0 and 100 Hz. Fifty percent of the 49 neurons tested over at least a 20-dB range of SPLs showed a BMF variation larger than 66% of their mean BMF. 
MTFs based on vector strength (tMTFs) showed a variety of patterns; although mostly similar to those reported from the cochlear nucleus, tMTFs of IC neurons showed higher maximum values, smaller dynamic range with depth, and a lower high-frequency limit for significant phase locking. Systematic and large increases in phase-lead commonly occurred as SPL increased. rMTFs measured at multiple carrier frequencies (Fcs) showed that the suppressive region was not the result of sideband inhibition. There was no systematic relationship between BMF and Fc of stimulation in the cells studied, even at low carrier frequencies. The results suggest various possible mechanisms that could create IC MTFs, and strongly support the idea that inhibitory inputs shape the rMTF by sharpening regions of enhancement and creating a suppressive region. The paucity of BMFs above 100 Hz argues against simple rate-coding schemes for pitch. Finally, any labeled line or topographic representation of modulation frequency is unlikely to be independent of SPL.
INTRODUCTION
Sinusoidally amplitude-modulated (SAM) tones
induce in human listeners, over different modulation frequency ranges,
a variety of percepts such as fluctuation, roughness, and pitch. SAM
signals have also been used in psychophysical experiments within a
linear systems framework to study the temporal filtering properties of the auditory system, as distinguished (but not separate) from its
spectral filtering properties (e.g., Dau et al. 1997; Salvi et al. 1982; Viemeister 1979). It
is known that human speech and animal communication sounds contain many
amplitude-modulated features (e.g., Rosen 1992). In
particular, the envelopes found in natural speech have been shown to
contain sufficient information below 50 Hz to allow a high degree of
speech recognition performance, even in the relative absence of
spectral cues (Shannon et al. 1995). Further, it has
been reported that training language-learning impaired children with
modified speech that included enhancement of envelope components in a
similar frequency range (3-30 Hz) led to significant improvements in
performance on various speech and language tests (Tallal et al. 1996).
Understanding the processing of modulated signals is
also of relevance to the design of stimulation strategies for cochlear
implants, where one attempts to convey information about speech signals
with a temporally varying stimulation strength at a limited number of
points on the cochlea.
There now exist considerable data on physiological
responses to SAM tones in the auditory nerve (e.g., Joris and Yin 1992) and
cochlear nucleus (e.g., Frisina et al. 1990; Moller 1974; Rhode 1994; Rhode and
Greenberg 1994). Neurons in these lower levels
of the ascending auditory pathway convey information about the envelope
by "phase locking" to the modulation frequency of the SAM tone.
Spike rate variations with modulation frequency are generally poorly
tuned or nonexistent in the auditory nerve and for most neuronal types
in the cochlear nucleus. In contrast, neurons at higher levels of the
auditory system, including the inferior colliculus (IC), have been
shown to exhibit large variations in average spike rate as the
modulation frequency of the SAM tone is varied (e.g., Langner and Schreiner
1988; Rees and Moller 1983; Schreiner and Urbas 1988). It has been suggested that
this represents a transformation from a "temporal code" for
modulation frequency in the auditory nerve and cochlear nucleus to a
"rate code" at higher levels, and that this transformation is
essentially complete at the level of the IC (Langner and Schreiner 1988).
However, many details of the parametric
dependence of the responses to SAM tones of single neurons at locations
other than the auditory nerve and cochlear nucleus remain unclear. As a
result, existing models of the processing of amplitude-modulation (AM)
at higher levels of the auditory system are poorly constrained by
experimental data and hence difficult to evaluate.
The IC is an obligatory relay in the primary lemniscal pathway from the
extensively interconnected auditory midbrain network to the cortex. It
receives ascending projections from various ipsilateral and
contralateral sites in the auditory periphery and descending
projections from the cortex (Oliver and Huerta 1991). As
a result of this extensive set of excitatory and inhibitory inputs and
because it is the primary source of projections to the thalamus and
cortex, it is often considered to perform an "integrative" role in
auditory processing. In particular, in contrast to auditory nerve
fibers and most neurons in the cochlear nucleus, prior studies (e.g.,
Heil et al. 1995; Langner and Schreiner 1988) of the responses of IC neurons to SAM tones with
different modulation frequencies (and all other parameters fixed) have
reported that most neurons in the IC show systematic variations of
average spike rate with modulation frequency, with neurons often
showing a clear maximum response at a best modulation frequency (BMF).
However, studies (e.g., Rees and Moller 1983; Rees and Palmer 1989) that have varied other parameters
of the stimulus have indicated that response patterns to SAM tones can
be strongly dependent on stimulus parameters other than modulation
frequency, like the sound pressure level (SPL) of stimulation.
Nonetheless, a systematic description of the spike rate and
synchronization properties of single IC neurons to SAM tones with a
wide range of parameters is still lacking. Such a description would
help clarify the implications of the emergent rate tuning in the
auditory midbrain, both for neural processing mechanisms and for the
representation of modulation frequency in the auditory system.
With this in mind, we present here the results of a detailed characterization of extracellularly recorded responses from single neurons in the physiologically defined central nucleus of the IC of the Mongolian gerbil (Meriones unguiculatus) to SAM tones with a range of modulation frequencies presented via a closed system to the contralateral ear at multiple modulation depths, SPLs, and carrier frequencies (Fcs).
METHODS
Surgical and recording procedures
Adult Mongolian gerbils with clean external ears and no sign of
middle ear infections were initially anesthetized with an intraperitoneal injection of pentobarbital sodium (60 mg/kg) and subsequently maintained in an areflexive state with supplemental intramuscular injections of ketamine hydrochloride (approximately 30 mg · kg⁻¹ · h⁻¹). A
heating pad was used to maintain a constant rectal temperature of
37°C. The pinnae were removed, and craniotomy of the interparietal bone was performed to expose the cerebellum. The animal was then transferred to a double-walled sound-attenuated room (IAC), where sound
delivery specula were sealed to the temporal bone and muscle around
the opening of the external auditory meatus bilaterally.
Activity of single units was recorded with platinum-plated,
glass-insulated tungsten microelectrodes advanced into the IC through
the intact cerebellum with a stepping motor microdrive (CalTech).
Electrodes typically had exposed tip lengths of 5-10 µm and an
impedance of 2-5 MΩ at 1 kHz. Electrical signals recorded from the
brain were amplified (variable gain) and filtered (typically 0.25-10
kHz). Neural signals were displayed on an oscilloscope and fed to an
audio monitor. Single units were identified and isolated using the
criteria of waveform similarity and separation of successive spikes by
an absolute refractory period; only very well isolated single units
were studied. An event timer (MALab, Kaiser Instruments) logged the
occurrence of discriminated action potentials together with stimulus
zero-crossings to a resolution of 1 µs. Event times were stored in a
FIFO buffer from which they were retrieved by the host computer.
Acoustic stimuli were presented as the electrode was advanced to
isolate responsive units. As the electrode passed through the dorsal
mantle of the IC, responses that were broadly tuned, driven better by
broadband noise than tones, and often habituating to repeated tonal
stimulation were found. Occasionally, a coarse descending tonotopic
sequence was encountered. Entry into the physiologically defined
central nucleus was always marked by an increase in spontaneous
background activity and the beginning of a clear ascending tonotopic
progression, with narrowly tuned tone response areas that showed no
signs of habituation to repeated stimulation. Within the central
nucleus, both modulated and unmodulated tonal stimuli were effective in revealing responsive units.
Stimulus generation and data acquisition
Stimulus waveforms were generated by two digital synthesizers
controlled by a microprocessor and custom hardware for timing, data
logging, and waveform control (MALab, Kaiser Instruments). The
dedicated microprocessor communicated with a host computer via an
IEEE-488 interface. Stimuli were digitally attenuated and transduced by
electrostatic earphones (Stax Lambda, housing designed by G. Sokolich)
coupled to the ear pieces. For each individual animal, SPL (expressed
in dB re 20 µPa) near the tympanic membrane was calibrated for both
ears from 40 Hz to 40 kHz under computer control using a previously
calibrated probe tube and a condenser microphone (Bruel and Kjaer, type
4134). Phase was calibrated for both ears from 40 Hz to 5 kHz by
measuring the disparity in phase between a reference sinusoid generated
electrically and the recorded acoustic signal. Note that all the data
in this paper result from monaural contralateral presentation of sound.
The magnitude transfer function of the speaker was usually smooth, with
a slow rolloff until 25 kHz, the maximum frequency used; two or three
local resonances, never deviating by more than 10 dB from the baseline,
were also present. Deviations of the phase response from linearity were
small, and not usually present. Appropriate compensations were made to
the carrier prior to modulation. All acoustic stimuli were shaped
digitally with a cosine-squared ramp of 5-50 ms rise time to reduce
spectral splatter at onset and offset. The SAM tone is represented by
the formula $A \sin(\omega_c t)\,[1 + m \sin(\omega_m t)]$, where
$\omega_c$ is the angular frequency of the carrier,
$\omega_m$ that of the modulator, A the
amplitude of the carrier, t the time after signal onset, and
m the modulation depth (0-100%). Such a stimulus, if
sufficiently long, has a simple three-component spectrum centered at
Fc, as shown in Fig. 1. The SPL of the SAM tone is defined as
the SPL of the carrier that is being modulated. The modulation
frequency is the frequency corresponding to $\omega_m$.
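As an illustration of the stimulus definition above, the following Python sketch synthesizes a SAM tone with cosine-squared onset and offset ramps and lists the three spectral components of a long SAM tone (the carrier at Fc with amplitude A, and sidebands at Fc minus and plus the modulation frequency, each with amplitude Am/2). The function names, the sampling rate, and the default 5-ms ramp are assumptions made for this example; the stimuli in this study were produced by dedicated digital synthesizers, not by this code.

```python
import math


def sam_tone(fc_hz, fm_hz, depth, duration_s, fs_hz, amplitude=1.0, ramp_s=0.005):
    """Samples of A*sin(wc*t)*[1 + m*sin(wm*t)] with cosine-squared gating.

    depth is the modulation depth m as a fraction (0.0 to 1.0).
    fs_hz and ramp_s are illustrative choices, not values from the paper.
    """
    n = int(round(duration_s * fs_hz))
    n_ramp = int(round(ramp_s * fs_hz))
    wc = 2.0 * math.pi * fc_hz
    wm = 2.0 * math.pi * fm_hz
    samples = []
    for i in range(n):
        t = i / fs_hz
        x = amplitude * math.sin(wc * t) * (1.0 + depth * math.sin(wm * t))
        if n_ramp > 0:
            if i < n_ramp:
                # Rising ramp: sin^2 from 0 to 1 (a cosine-squared shape).
                x *= math.sin(0.5 * math.pi * i / n_ramp) ** 2
            elif i >= n - n_ramp:
                # Falling ramp: mirror image of the rising ramp.
                x *= math.sin(0.5 * math.pi * (n - 1 - i) / n_ramp) ** 2
        samples.append(x)
    return samples


def sam_spectrum(fc_hz, fm_hz, depth, amplitude=1.0):
    """(frequency, amplitude) of the three components of a long SAM tone."""
    side = amplitude * depth / 2.0
    return [(fc_hz - fm_hz, side), (fc_hz, amplitude), (fc_hz + fm_hz, side)]
```

For example, `sam_spectrum(1500, 100, 1.0)` gives components at 1,400, 1,500, and 1,600 Hz, with each sideband at half the carrier amplitude.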
Frequency response areas, either at a single SPL or at multiple SPLs,
were recorded using pure tones (usually 100 ms, with a rise time of 10 ms, and a repetition rate <2.5 Hz) and the best frequency (BF) was
noted. Following this, a complete spike rate versus SPL function
[rate-level function (RLF)] was usually measured (at BF). Neurons
were then studied using pure tones (at multiple SPLs) and SAM tones
while monitoring the responses (spike rate and vector strength) on-line
to gain an initial qualitative characterization of the neuron's
response properties. This initial exploration helped confine the formal
presentation to the ranges of levels, depths, carrier frequencies,
durations, and modulation frequencies where clear variations in
response were present, thus maximizing the use of the limited recording
time available. Modulation frequencies were varied within the range of
0.1 Hz to a value just less than the
Fc used. Stimulus durations varied
from 1 to 10 s (this was determined partly by the period of the
lowest modulation frequency used, so that at least 1 period was
presented), while rise times varied from 5 to 50 ms. Unless it was
itself the parameter being varied, Fc
was set to the BF of the neuron (as measured using pure tones) at the
SPL being tested. If responses were being recorded at multiple SPLs,
Fc was kept constant and usually equal
to the BF at an SPL in the middle of the spanned SPL range. Because it was not uncommon for the BF to vary with SPL (e.g., Kuwada et al. 1984), Fc sometimes
differed slightly from the BF at some SPLs. In any case, the BF can
also depend on the kind of stimulation (SAM tone vs. pure tone), the
duration of stimulation (100 ms vs. 1-10 s) and the modulation
frequency of stimulation (see RESULTS).
Data analysis
Three descriptive measures of the pure-tone response are used in
this paper. The first is the mean spike rate averaged over the duration
of stimulation. The second is the minimum (across all SPLs tested) mean
(averaged across stimulus presentations) latency at the BF of the
neuron or close to it (if the BF varied with SPL). Finally, neurons
were classified into one of four classes (onset, onset-sustained,
pauser, and sustained) on the basis of their peri-stimulus time
histograms (PSTHs) (as in Le Beau et al. 1996). However,
because PSTHs can change with SPL, we also used some additional
criteria: neurons were classified as pausers if they showed a pauser
pattern at any SPL, and neurons were classified as sustained or onset
types only if they showed this pattern at all SPLs. The few broad onset
neurons found were included in the onset-sustained category, and since
the identification of choppers requires more tone presentations than we
used, these were not separately identified.
The response to AM is also characterized by three measures: mean spike
rate (averaged over the duration of stimulation), vector strength at
the modulation frequency¹ (Goldberg and Brown 1969) and the mean phase-lead of the
response relative to the modulating sinusoid. The functions describing the variation of each of these measures with modulation frequency will
be referred to as modulation transfer functions (MTFs). MTFs plotted
using the spike rate, vector strength, and mean phase-leads are called
rate modulation transfer functions (rMTFs), temporal modulation
transfer functions (tMTFs), and phase modulation transfer functions
(pMTFs), respectively.
The vector strength is a measure of the synchrony to the modulating
waveform and is equal to F1/F0, where F1 is the spectral magnitude of
the response at the modulation frequency ($\omega_m$) and F0 the average spike rate. It varies
from a minimum of zero to a maximum of one, and as a reference, the
vector strength of the sinusoidal modulating waveform of a 100% depth
SAM tone is 0.5, while that of a half-wave rectified sinusoid is 0.784. A response with all spikes at precisely the same unique phase has a
vector strength of 1 ("perfect" phase locking). The significance of
the vector strength was assessed using the Rayleigh statistic
(Stephens 1969) at the 1% significance level.
Additionally, to minimize the contribution of onset spikes, at least
six spikes per stimulus presentation were required for significance.
The mean phase-lead was computed as $(90^\circ - \theta)$, where $\theta$ is the
direction of the mean vector in the vector strength calculation. The
mean phase-lead is equal to the phase of the spectral component of the
response at the modulation frequency ($\omega_m$), relative to that of the
modulating waveform. (The direction of the mean vector of the
modulating waveform is 90°, since it produces a unimodal nonnegative
period histogram symmetric about its peak at 90°.) The mean
phase-lead was decremented by an appropriate multiple of 360°, i.e.,
unwrapped, whenever it went from a response close to 360° to one
closer to 0°. pMTFs were truncated when the sampling resolution
became poor enough that the phase might have skipped more than one
cycle; this was not usually required, and usually only affected values
above 300 Hz. Only significant vector strengths and their associated
mean phase-leads are shown in the plots in this paper. Consistent with the fact that phase locking to pure tones is common only below 600 Hz
(Kuwada et al. 1984), we observed significant vector
strength at the carrier frequency only in the few neurons in our sample with BFs in that frequency range. Synchrony to the carrier will not be
discussed further in this report.
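The synchrony measures described above can be sketched in Python as follows. The sketch computes the vector strength and mean direction from spike times, a Rayleigh significance estimate, the mean phase-lead (90° minus the mean direction), and a simple unwrapping of phase-leads across successive modulation frequencies. Several points are assumptions rather than details given in the text: the function names, the large-sample approximation p ≈ exp(−nR²) for the Rayleigh test (the authors cite Stephens 1969 and may have used tabulated critical values), and the unwrapping rule, which assumes that the true phase changes by less than 180° between neighboring modulation frequencies.

```python
import cmath
import math


def vector_strength(spike_times_s, fm_hz):
    """Return (R, mean direction in degrees) relative to sin(2*pi*fm*t)."""
    n = len(spike_times_s)
    if n == 0:
        return 0.0, float("nan")
    # Each spike is a unit vector at its phase within the modulation cycle.
    mean_vec = sum(cmath.exp(2j * math.pi * fm_hz * t) for t in spike_times_s) / n
    return abs(mean_vec), math.degrees(cmath.phase(mean_vec)) % 360.0


def rayleigh_p(r, n_spikes):
    """Approximate Rayleigh p-value, exp(-n*R^2); an assumed approximation."""
    return math.exp(-n_spikes * r * r)


def mean_phase_lead_deg(mean_direction_deg):
    """Phase-lead re the modulator, whose own mean direction is 90 degrees."""
    return 90.0 - mean_direction_deg


def unwrap_phase_leads(leads_deg):
    """Remove 360-degree jumps between successive modulation frequencies."""
    out = []
    offset = 0.0
    for i, p in enumerate(leads_deg):
        if i > 0:
            step = p + offset - out[-1]
            offset -= 360.0 * round(step / 360.0)
        out.append(p + offset)
    return out
```

Spikes that all fall at the peak of a 10-Hz modulator (for instance at 25 ms, 125 ms, 225 ms, and so on) give a vector strength of 1, a mean direction of 90°, and hence a mean phase-lead of 0°.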
Most MTFs did not vary much when different 1-s time windows after the beginning of the stimulus were chosen for analysis, even though most neurons showed an adaptation of their mean firing rate during the stimulation period. All measures were therefore calculated over the entire stimulation period.
A BMF was defined for each rMTF as follows: first, the modulation frequency that elicited the maximum spike rate was identified. If there were two distinct maxima (e.g., Fig. 3A), the one at the lower modulation frequency (primary peak) was taken. Following this, a range of modulation frequencies (b1 to b2, Fig. 1E) where the response was >90% of this response maximum was extracted. The BMF was chosen as the mean of b1 and b2. This procedure essentially corrected for skewed or irregular peaks and was almost always close to the BMF as chosen by eye. BMFs were only measured if the spike rate dropped by at least 70% on both sides of the BMF. If a 70% drop was only present on the high-frequency side, then the modulation frequency (higher than the BMF) that elicited 90% of the maximum response was chosen as the corner frequency of the rMTF. A cutoff frequency was also measured; this was the frequency at which the response fell to the minimum spike rate plus 10% of the difference between the maximum and minimum spike rates, on the high-frequency side of the primary peak (w1 in Fig. 1E). Finally, a worst modulation frequency (WMF) was extracted in rMTFs with a clear suppressive region, as evidenced by a lower spike rate in comparison with that to one or more lower depth stimuli. The method was similar to that used for the BMF: the mean of a 10% range (w1 to w2, Fig. 1E) was taken as the WMF. In all cases, linear interpolation was used if required.
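The BMF procedure described above can be sketched in Python as follows: locate the peak, find by linear interpolation the two modulation frequencies (b1 and b2) at which the rate falls to 90% of the peak, and take their mean. The function names are hypothetical, and the sketch makes assumptions that the text does not specify: interpolation is done on a linear frequency axis, the caller supplies the index of the primary peak when an rMTF has two maxima, and the 70% drop criterion is not checked here.

```python
def crossing(f_a, r_a, f_b, r_b, level):
    """Frequency between f_a and f_b at which the rate equals level."""
    if r_a == r_b:
        return f_a
    return f_a + (level - r_a) * (f_b - f_a) / (r_b - r_a)


def best_modulation_frequency(freqs_hz, rates, peak_index=None, fraction=0.9):
    """Return (BMF, b1, b2) for an rMTF sampled at increasing freqs_hz.

    peak_index selects the primary peak; by default the global maximum
    (the lowest-frequency one if several points tie) is used.
    """
    if peak_index is None:
        peak_index = max(range(len(rates)), key=lambda i: rates[i])
    level = fraction * rates[peak_index]

    # Extend the above-criterion range downward and upward from the peak.
    lo = peak_index
    while lo > 0 and rates[lo - 1] > level:
        lo -= 1
    hi = peak_index
    while hi < len(rates) - 1 and rates[hi + 1] > level:
        hi += 1

    # Interpolate the edges unless the range runs into the sampled limits.
    if lo == 0:
        b1 = freqs_hz[0]
    else:
        b1 = crossing(freqs_hz[lo - 1], rates[lo - 1], freqs_hz[lo], rates[lo], level)
    if hi == len(rates) - 1:
        b2 = freqs_hz[-1]
    else:
        b2 = crossing(freqs_hz[hi], rates[hi], freqs_hz[hi + 1], rates[hi + 1], level)
    return 0.5 * (b1 + b2), b1, b2
```

For a symmetric peak the result coincides with the frequency of the maximum; for a skewed peak it shifts toward the broader side, which is the correction the procedure is meant to provide.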
Kendall's $\tau$ (Press et al. 1993) is used often in this
paper as a nonparametric measure of correlation between two or three variables.
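For readers unfamiliar with the statistic, a minimal Python sketch of Kendall's rank correlation for two variables is given here. It counts concordant and discordant pairs and normalizes so that tied pairs are excluded (the tau-b convention). The function name is hypothetical, and whether the authors used exactly this treatment of ties is an assumption.

```python
import math


def kendall_tau(x, y):
    """Kendall's rank correlation (tau-b style tie handling), O(n^2)."""
    n_x = 0  # pairs not tied in x
    n_y = 0  # pairs not tied in y
    score = 0  # concordant minus discordant pairs
    n = len(x)
    for i in range(n - 1):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            prod = dx * dy
            if prod != 0:
                n_x += 1
                n_y += 1
                score += 1 if prod > 0 else -1
            else:
                if dx != 0:
                    n_x += 1
                if dy != 0:
                    n_y += 1
    if n_x == 0 or n_y == 0:
        return float("nan")
    return score / math.sqrt(n_x * n_y)
```

The statistic is +1 for identical rankings, −1 for reversed rankings, and near 0 when the rankings are unrelated.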
RESULTS
We measured responses from 109 single neurons in the
physiologically characterized central nucleus of the IC in 34 Mongolian gerbils. Three onset neurons were unresponsive to AM across the entire
parameter space, a proportion similar to that reported earlier
(Rees and Palmer 1989).
Varying modulation depth
Varying SAM tone modulation depth revealed that IC rMTFs were composed of 1 or 2 regions of enhancement (modulation frequency ranges where the spike rate increases with increase in stimulus modulation depth) and/or 1 region of suppression (a range of modulation frequencies where the spike rate decreases as stimulus modulation depth increases). Figures 2 and 3 show illustrative examples of rMTFs and tMTFs measured from individual neurons (with a range of BFs) using SAM tones with varying modulation depths and constant SPL. In all cases (except Fig. 2D, see legend), the Fc was equal to the BF of the neuron at that SPL. The rMTFs in Fig. 2 show examples of neurons with rMTFs that predominantly show either an increase (A-C) or decrease (D) in spike rate with increasing modulation depth at all modulation frequencies. The magnitude of change depends on the modulation frequency. These changes create band-pass (A-C) or bandsuppressive (D) rMTFs that are characterized by their single prominent region of enhancement and suppression, respectively. In contrast, Fig. 3 illustrates examples of neurons whose rMTFs show both regions of enhancement and suppression. In other words, both the direction (increase or decrease) and magnitude of change (in spike rate with modulation depth) depends on the modulation frequency. The rMTFs in Fig. 3, A, C, and D, show secondary peaks at higher modulation frequencies; these were only found in eight neurons (see DISCUSSION). However, in contrast to responses at the primary peak (BMF), responses at the secondary peak possessed low vector strength; i.e., they were not synchronized to the modulation frequency. We formally recorded the effects of systematic variation of modulation depth from 31 neurons; in most cases, the spike rate at any given modulation frequency increased or decreased almost monotonically with increasing modulation depth.
tMTFs ranged from low-pass to band-pass (more peaked) shapes; this is
related to the SPL of stimulation, as will be shown in a subsequent
section. Notice that the vector strength remains high (and often
reaches its peak) as the spike rate reaches its minimum (Figs.
2D and 3). Also, low modulation depths can elicit responses
with high vector strengths, i.e., the dynamic range of the vector
strength measure is small as modulation depth is varied. For example,
the neuron in Fig. 2C showed a vector strength >0.8 in
response to a 10% stimulus modulation (an increase of 0.82 dB and a
decrease of 0.91 dB from the mean SPL) at 30 Hz. In contrast to the
rapid saturation of the vector strength of the response (VSr), the
vector strength of the modulating waveform (VSs; equal to half the
modulation depth) increases linearly with depth from 0 to 0.5. As a
result, an alternative measure of synchrony, modulation gain [20
log(VSr/VSs)] (e.g., Frisina et al. 1990) decreases
almost monotonically as the depth increases. No consistent trend was
found in the behavior of pMTFs at different modulation depths, and in
most cases, the magnitude of variation was small.
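The arithmetic behind these statements can be checked with a short Python sketch. It computes the peak and trough level excursions of a SAM tone relative to the unmodulated carrier, 20 log10(1 ± m), and the modulation gain, 20 log10(VSr/VSs) with VSs = m/2. The function names are hypothetical, and the response vector strength of 0.8 used in the usage note is simply the approximate value quoted above for Fig. 2C.

```python
import math


def depth_to_db_excursions(depth):
    """Level change (dB) at the envelope peak and trough for depth m < 1."""
    return 20.0 * math.log10(1.0 + depth), 20.0 * math.log10(1.0 - depth)


def stimulus_vector_strength(depth):
    """Vector strength of the modulating waveform, equal to m/2."""
    return depth / 2.0


def modulation_gain_db(vs_response, depth):
    """Modulation gain, 20*log10(VSr/VSs)."""
    return 20.0 * math.log10(vs_response / stimulus_vector_strength(depth))
```

For a 10% depth, `depth_to_db_excursions(0.1)` returns approximately +0.83 and −0.92 dB, in line with the 0.82-dB and 0.91-dB figures quoted above. With a response vector strength of 0.8, the gain is about 24 dB at 10% depth but only about 4 dB at 100% depth, which illustrates why the gain falls as depth increases when the response vector strength saturates.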
Varying sound pressure level
The rMTFs for each of the neurons in Figs. 2 and 3 were recorded at a fixed SPL and Fc. Therefore it is of interest to know whether the observed rMTFs are invariant characteristics of the neuron, independent of the particular parameters (SPL and Fc) chosen for stimulation. Figure 4 partially answers this question by demonstrating that it is possible for a region of enhancement in the rMTF from a single neuron to be transformed systematically into a region of suppression as the SPL is increased. Note that these rMTFs were measured with the carrier at 1,500 Hz, while the effects of using a 1,200-Hz carrier are shown in Fig. 2D for the same neuron. The position of the rate minimum is not affected by this change in Fc, suggesting that spectral effects (like the positioning of the AM sidebands in putatively inhibitory regions of the frequency response area) are not important in its generation (see subsequent section on effects of varying carrier frequency). The tMTFs at multiple SPLs and 100% depth show a systematic "low-pass to band-pass shift," as a result of the decrease in vector strength at low modulation frequencies with increasing SPL (Fig. 4F).
To explore the emergence of suppression at higher SPLs further, we
systematically varied the SPL while keeping the depth at 100% and the
carrier at or close to the BF in 56 neurons. Illustrative examples
spanning the range of behaviors seen are shown in Figs. 5 and 6.
Figure 5 shows examples of MTFs at multiple SPLs from four neurons with
a range of best frequencies and with rMTFs that remain band pass (but
with varying bandwidths) at all tested SPLs. The neurons depicted in
A and B responded at all modulation frequencies tested, but suffered a loss of tuning at higher SPLs (possibly due to
saturation of the spike rate). A qualitatively similar pattern has been
reported previously (Rees and Palmer 1989). The neurons
in C and D were "transient responders," with
a poor or absent response to long-duration unmodulated tones. Neurons
of this kind maintain their tuning over a broad range of SPLs. The spike rate at a given modulation frequency varied with SPL in either a
monotonic or a nonmonotonic fashion, in a manner that could depend on
modulation frequency (A and D). The BMF showed shifts of variable direction and magnitude as the SPL increased. As in
Fig. 4, the tMTFs show variable degrees of a "low-pass to band-pass
shift," similar to that seen in the cochlear nucleus (Frisina
et al. 1990; Rhode 1994; Rhode and Greenberg 1994).
Four examples of neurons with a suppressive region in their rMTFs are shown in Fig. 6. The suppressive region either emerges at higher SPLs (Fig. 6, A-C) as in the example of Fig. 4 or is present at all tested SPLs (Fig. 6D). The tMTFs show the same low-pass to band-pass shift seen in Fig. 5. Again, responses at modulation frequencies within the suppressive region retain considerable synchrony (i.e., show high vector strengths).
A correlation with some other neuronal response property would clearly be useful in delineating the possible mechanisms underlying the variety of rMTF changes seen with increases in SPL. No systematic relationship seems to exist between the presence of a suppressive region and the shape of the RLF (i.e., whether it is monotonic or nonmonotonic). However, it was found that the presence of a suppressive region in the rMTF at 1 or more SPLs was significantly correlated with the PSTH pattern in response to pure tones (Table 1). An rMTF region was defined as suppressive if the spike rate at modulation frequencies within that region showed a clear decrease as the modulation depth was increased (see METHODS). Neurons classified as pausers or sustained types on the basis of their PSTHs were more likely to possess rMTFs with a suppressive region, while rMTFs from neurons with onset or onset-sustained PSTHs usually only had a single region of enhancement (Fig. 5). However, it should be noted that using our aforementioned criterion, suppression can only be demonstrated clearly for neurons that respond well to unmodulated tones; for example, the rMTFs in Fig. 2B probably possess a suppressive region of very small magnitude between 20 and 100 Hz. Eleven of the 18 onset or onset-sustained neurons, and 2 of the 5 pauser neurons without a suppressive region in their rMTFs did not respond well to long-duration unmodulated tones. It therefore remains possible that a putative suppressive mechanism (possibly inhibition: see DISCUSSION) sharpens the high-frequency slope in rMTFs like those in Fig. 5, C and D, without actually resulting in a visible suppressive region. This is consistent with the fact that spike rates are often nonmonotonic with SPL over broad ranges of modulation frequency, even in rMTFs without suppression (e.g., Fig. 5D).
The final measure used to characterize MTFs was the mean phase of the response. Figure 7 shows the pMTFs for three neurons, whose rMTFs and tMTFs have been shown in earlier figures (Figs. 5, A and D, and 6D). A systematic increase in the phase lead (phase-advance) with increasing SPL, as illustrated in Fig. 7, A and B, was observed in the pMTFs from 30 of 56 neurons. Other neurons showed both increasing and decreasing leads over different modulation frequency ranges in their pMTFs. A decreasing phase lead (phase-delay) was sometimes seen in the pMTF in frequency regions corresponding to the decreasing high-frequency slope of the associated rMTF (especially in neurons with sustained PSTHs). This behavior was usually less systematic, but a fairly clear example is shown in Fig. 7C. The phase-advance is not always accompanied by an increase in spike rate; for example, the neuron in Fig. 7A shows a phase-advance even though its rMTFs are nonmonotonic with SPL (Fig. 5D). The phase-advance in Fig. 7B seems to be at least partly due to the marked adaptation seen in the response that shifts the peak of the response toward the beginning of the cycle (see the period histograms). The phase-advance in Fig. 7A could be due to a phasic response that occurs earlier in the cycle as the SPL increases; this could possibly also be viewed as a very rapid adaptation. There was a consistent tendency for the phase-advance to be maximum at intermediate modulation frequencies, as exemplified by the data in Fig. 7B. This implies that it is inaccurate to view the phase-advance as simply reflecting a decreased time delay at higher SPLs that mirrors the known decrease in latency with SPL for pure tones, because that explanation predicts, at least in its simplest form, a linear relationship between phase-advance and modulation frequency (i.e., the phase-advance ought then to be maximum at the highest modulation frequency).
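The linear prediction of the pure time-delay explanation can be made concrete with a short Python sketch: a latency decrease of Δt seconds advances the response phase by 360 · fm · Δt degrees at modulation frequency fm. The function name and the 1-ms latency decrease in the usage note are hypothetical values chosen only for illustration.

```python
def phase_advance_from_delay_deg(fm_hz, latency_decrease_s):
    """Phase-advance (degrees) expected from a pure reduction in delay."""
    return 360.0 * fm_hz * latency_decrease_s
```

Under this simplest model, a 1-ms latency decrease would produce a phase-advance of 3.6° at 10 Hz, 36° at 100 Hz, and 108° at 300 Hz, growing in proportion to modulation frequency.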
On the contrary, the phase-advance is often minimal at the highest modulation frequencies. Finally, the modulation frequency at which the maximum phase-advance in a given neuron was found was not necessarily identical to that at which its rMTF peaked.
The properties of the only offset neuron observed in this study are
shown in Fig. 8. The rMTF showed a
band-pass shape, while the vector strength remained high throughout the
range of modulation frequencies that elicited a response. The pMTF
showed a systematic phase-delay with SPL at low modulation frequencies.
The period histogram suggests that this is the result of a phasic
response occurring later in the cycle as SPL increases. Consistent with its "offset" nature, the neuron fires during the falling phase of
the amplitude envelope (as indicated by the negative value of the low
modulation frequency asymptote). These properties are similar to the
recently described properties of offset neurons from periolivary
regions (Kuwada and Batra 1999).
MTF characteristics across the population
Of the 106 neurons responsive to SAM tones, 96 were studied by
stimulating at 100% depth and with a
Fc that was at or close to the BF of
the neuron at all the SPLs studied. Figure
9 is a representation of the BMFs at
100% depth and different SPLs for all 96 neurons. The mean BMF
(averaged across the different SPLs tested and including corner
frequencies; see METHODS) did not show a significant
correlation (Kendall's $\tau$ = 0.0708; P = 0.1533) with the BF. Some interesting properties are evident in the cumulative probability distribution of the mean BMF, shown in Fig. 10A. The maximum mean BMF
encountered was 140 Hz, and about 50% of the mean BMFs lay below 25 Hz. The range of variation of the BMF for an individual neuron (both
absolute and relative to the mean) observed at different SPLs is also
shown on the same graph. Fifty percent of the 49 neurons tested over at
least a 20-dB range of SPLs showed a BMF variation larger than 66% of
their mean BMF; in absolute terms, 50% of the neurons showed a range
larger than 10.9 Hz. No systematic pattern was found in the variation
of BMFs with SPL for individual neurons (Fig. 10B).
At least one rMTF with a suppressive region was observed in 43 of the 96 neurons. The mean WMF (averaged across SPL; see METHODS) lay between 0 and 200 Hz in most cases, with a mode near 100 Hz (Fig. 10C). No systematic pattern was found in the variation of WMFs with SPL for individual neurons (data not shown).
The minimum mean latency in response to pure tones at BF is
significantly correlated (Kendall's τ = 0.3276, P = 0.0001) with the mean BMF (averaged across SPL) of
the neuron in response to SAM tones at 100% depth and with a carrier
at or close to BF (Fig. 11). Neither
measure showed a significant correlation with the BF [τ(BF-latency) = 0.0288 (P = 0.366) and τ(BF-mean BMF) = 0.0329 (P = 0.3479)].
Accordingly, a three-way test resulted in a τ of 0.3269, only
slightly different from that for the BMF-latency relationship,
suggesting that the BF of the neuron was not a confounding variable.
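The text does not name the three-way test. As an illustration only, and assuming it corresponds to Kendall's partial rank correlation (an assumption, not something stated in the source), the partial coefficient can be computed directly from the three pairwise values printed above. The function name `partial_kendall_tau` is hypothetical.

```python
import math

def partial_kendall_tau(tau_xy, tau_xz, tau_yz):
    """Kendall's partial rank correlation of x and y, controlling for z."""
    numerator = tau_xy - tau_xz * tau_yz
    denominator = math.sqrt((1.0 - tau_xz ** 2) * (1.0 - tau_yz ** 2))
    return numerator / denominator

# Pairwise coefficients as printed in the text (signs as printed).
tau_bmf_latency = 0.3276
tau_bf_latency = 0.0288
tau_bf_bmf = 0.0329

tau_partial = partial_kendall_tau(tau_bmf_latency, tau_bf_latency, tau_bf_bmf)
print(round(tau_partial, 3))
```

With these inputs the result is about 0.327, close to the reported 0.3269; because the two correlations with BF are so small, controlling for BF barely changes the BMF-latency coefficient.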
Figure 12A reveals that cutoff frequencies (see METHODS) are markedly lower in rMTFs that possess a suppressive region, when compared with those that do not. This could be interpreted as an effect of the suppressive mechanism sharpening the high-frequency slope of the rMTF, thus leading to a lower rMTF cutoff frequency.
The tMTF cutoff frequency (defined here as the maximum modulation
frequency at which neurons retain synchrony to the modulation frequency) can be regarded as an index of the low-pass filtering and
internal noise in the system. The measure shows no significant correlation with the Fc used
(Kendall's τ = 0.003, P = 0.4829). Over 85% of
neurons lack significant synchrony above 300 Hz (Fig. 12B).
Cutoff frequencies in the IC are thus substantially lower than those
found at lower levels like the cochlear nucleus and the lateral
superior olive (Joris and Yin 1998; Rhode and Greenberg 1994; both in the cat).
Some other aspects of the vector strength transformation between the
cochlear nucleus and the IC are shown in Fig.
13. As shown in Fig. 13A,
the maximum synchrony found in responses from almost all IC neurons
(0.907 ± 0.009, mean ± SE, n = 96) is
clearly higher than the means reported for any neuron type in the
cochlear nucleus, with the possible exception of
OI units (Rhode 1994; Rhode and Greenberg 1994). Also, responses from neurons that possess
BMFs (i.e., the rMTF shows a clear peak, see METHODS) at
all SPLs studied are almost maximally synchronized at the BMF; i.e.,
the mean difference between the vector strength at BMF and the maximum
vector strength is close to zero (e.g., MTFs in Figs. 2,
A-C, and 5). In contrast, neurons that possess corner
frequencies show larger mean differences between the vector strength at
their rMTF peak (BMF or corner) and the maximal vector strength in the
associated tMTF (for example, see MTFs in Figs. 3, 4, and 6).
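Vector strength and mean phase, the quantities compared throughout this section, are conventionally obtained from the mean resultant vector of the spike phases. The following is a minimal sketch of that standard computation. The spike trains are hypothetical, and the choice of phase zero (time zero of the modulation cycle) is an assumption that may differ from the convention defined in METHODS.

```python
import cmath
import math

def vector_strength_and_phase(spike_times_s, mod_freq_hz):
    """Return (vector strength, mean phase in cycles) relative to mod_freq_hz."""
    phases = [2.0 * math.pi * mod_freq_hz * t for t in spike_times_s]
    # Mean resultant vector of unit vectors, one per spike.
    resultant = sum(cmath.exp(1j * p) for p in phases) / len(phases)
    mean_phase_cycles = (cmath.phase(resultant) / (2.0 * math.pi)) % 1.0
    return abs(resultant), mean_phase_cycles

# Hypothetical spike trains for a 50-Hz modulation (20-ms period).
locked = [0.005 + 0.020 * k for k in range(50)]  # one spike per cycle, fixed phase
spread = [0.0025 * k for k in range(80)]         # spikes at 8 evenly spaced phases

vs_locked, phase_locked = vector_strength_and_phase(locked, 50.0)
vs_spread, _ = vector_strength_and_phase(spread, 50.0)
print(vs_locked, phase_locked, vs_spread)
```

A perfectly locked train gives a vector strength near 1 with a well-defined mean phase, whereas spikes distributed evenly over the cycle give a value near 0, for which the mean phase is not meaningful.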
The examples in Figs. 2 and 3 suggest that IC neurons show considerable
phase locking (i.e., high vector strength) at relatively low depths of
modulation. This is confirmed in Fig. 13B, where the lowest
modulation depth at which a significant vector strength was found is
plotted against the value of the vector strength for the 31 neurons
whose MTFs were recorded at multiple depths. Comparing this to
modulation depth-vector strength functions recorded from the cat
cochlear nucleus population (Rhode 1994: Fig. 13) confirms that vector strengths of
IC neurons at low depths are higher than those for almost all cochlear
nucleus neuron types (onset-choppers being the exception).
One pMTF each from 95 of the 96 neurons is plotted in Fig.
14A to display the
population characteristics of the pMTFs. The positive low modulation
frequency asymptotes indicate a tendency for most neurons to fire on
the rising portion of the sinusoidal envelope. A straight line was
often a good fit to the high-frequency (greater than or equal to 100 Hz; in our observations, responses in this range usually showed small
or absent SPL-dependent phase shifts) phase responses; the maximum and
minimum values of the slope (see legend) across all neurons were 15.6 and 6.09 ms. These may be interpreted as time delays (e.g.,
Anderson et al. 1971; but also see Ruggero 1980) that contain possible contributions from the fixed delays
and filtering properties of the system.
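The interpretation of the slope as a time delay follows from the units: when phase is expressed in cycles and modulation frequency in hertz, the slope of a straight-line fit has units of seconds. The following is a minimal sketch of such a fit. The phase values are hypothetical, generated from an assumed 10-ms delay and a fixed 0.2-cycle offset, and phase is treated as an unwrapped lag; the sign and unit conventions of the actual figure legend may differ.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical unwrapped phase lags (cycles); not data from the paper.
mod_freqs_hz = [100.0, 150.0, 200.0, 250.0, 300.0]
true_delay_s = 0.010
phase_lag_cycles = [0.2 + true_delay_s * f for f in mod_freqs_hz]

slope, intercept = fit_line(mod_freqs_hz, phase_lag_cycles)
delay_ms = slope * 1000.0  # cycles per Hz is seconds; convert to ms
print(delay_ms, intercept)
```

The fit recovers the 10-ms delay as the slope and the 0.2-cycle offset as the intercept, which is why slopes such as the 6.09 to 15.6 ms reported above can be read as delays.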
Thirty of the 56 neurons studied at multiple SPLs showed a systematic phase-advance as the SPL increased. This is shown in Fig. 14B, where phase leads (at the modulation frequency at which the largest phase-advance was found) are plotted as a function of SPL for each of the 30 neurons. The distribution of the maximum phase-advance (i.e., the range of each of the 30 functions in Fig. 14B) per 10-dB rise in SPL is shown as a histogram in Fig. 14C. The mean maximum increase was 23.81° per 10 dB (SE, 1.78). This may be larger than similar increases reported from responses at lower levels in the auditory pathway (see DISCUSSION).
Varying carrier frequency
We also investigated in 34 neurons the effects of stimulating different parts of the frequency response area by varying the Fc of the SAM tone. Illustrative examples are shown in Figs. 15 and 16. The examples in Fig. 15 show that band-pass rMTFs derived from an individual neuron can have different shapes; i.e., they are not scaled versions of each other. The BMFs can show shifts of varying magnitude and direction. The spike rate varies with carrier frequency in a manner that is roughly consistent with the frequency response area. Finally, as discussed earlier in the context of Fig. 4, one might imagine that suppressive regions could result from the positioning of the AM sidebands over inhibitory regions of the frequency response area. This explanation might also predict that as Fc is varied, suppressive regions would show shifts of the same magnitude as the shift in Fc. This does not seem to be the case (Fig. 16, A-C).
Figure 16D shows data from a neuron whose rMTF seems to lack suppressive regions when stimulated well away from its BF. Because suppressive regions can emerge at higher SPLs (see Fig. 6A for MTFs at multiple SPLs from the same neuron), this raises the issue of whether the effects of Fc variation can be explained in part on the basis of the SPL relative to threshold at each Fc. We do not yet have sufficient data to address this question.
The tMTF variation with Fc is roughly
similar to the variation observed in the same neuron when SPL is
varied. Thus when Fc is varied at a
particular SPL, tMTFs appear to vary over the same range of values seen
when stimulated between threshold and that particular SPL at BF. For
example, the neuron in Fig. 15C showed almost invariant
tMTFs as the SPL was varied (data not shown, but similar in this
respect to Fig. 5, C and D); this was mirrored in
the tMTF invariance with Fc.
Similarly, the neuron in Fig. 16A showed a large drop in
vector strength at low modulation frequencies as SPL increased (Fig.
4); the tMTF in Fig. 16A shows the same property as
Fc became closer to BF. However, the
details of the variation within this range did not show any clear
pattern, especially at higher levels; for example, the vector strength
did not show a consistent relationship either with
Fc or with spike rate. These results
are similar to those reported from the cochlear nucleus (Rhode 1994). Finally, no systematic changes in the pMTFs were found
when Fc was varied.
One scheme for the generation of spike rate tuning to modulation
frequency in the midbrain (Langner 1981) for low carrier frequencies predicts a linear relation between 1/BMF and
1/Fc (and therefore a monotonic
relationship between BMF and Fc).
Figure 17, A-E, shows data
from the five cells (in this study) with peaked rMTFs that were studied
at multiple low carrier frequencies (<5 kHz) within their frequency
response area. No systematic pattern was found for BMF shifts with
Fc. Plotting these shifts as functions of 1/BMF versus 1/Fc (Fig.
17F) confirms the lack of any systematic linearity in our
data.
| |
DISCUSSION |
|---|
|
|
|---|
The present study provides detailed descriptions of the variations in spike rate, synchrony, and phase of spike discharges of single neurons in the IC in response to variations in modulation depth, SPL, and Fc of SAM tone stimuli at a range of modulation frequencies. The systematic (and often dramatic) changes seen in all these response measures when stimulus parameters are varied have been described in RESULTS. We now discuss the implications of these findings both for the neural mechanisms generating AM responses in the IC and for performance in psychophysical tasks involving auditory temporal processing.
Excitation and inhibition together create the IC rMTF
Most neuron types in the cochlear nucleus predominantly show
poorly tuned or nonexistent variations in spike rate (i.e., low-pass or
flat rMTFs) with modulation frequency (Frisina et al. 1990; Rhode and Greenberg 1994). Information
about the modulation frequency is instead present in the response
component locked to the modulation frequency (i.e., a "temporal
code" rather than a "rate code"). In contrast, neurons in the IC
show large variations of spike rate with modulation frequency. These
variations have previously been reported to result in rMTFs of varied
complex shapes, with band-pass rMTFs that possess a BMF being the most
common variety (Heil et al. 1995; Langner and Schreiner 1988; Rees and Palmer 1989). The data
in this paper show that rMTFs are composed of regions of enhancement
and suppression, where the spike rate increases or decreases,
respectively, with an increase in the modulation depth of the SAM tone
stimulus. In particular, almost all IC rMTFs could be described by some
combination of a primary and a secondary region of enhancement and an
intervening region of suppression (Fig.
18), with these regions present to
varying degrees in individual rMTFs. The regions of enhancement have
band-pass shapes, with the low-frequency region of enhancement (E1)
forming the primary peak (BMF or corner frequency), and the much less
commonly found high-frequency region (E2) the secondary peak. The
region of suppression creates the band-suppressive shape, with its
trough forming the WMF.
In the spirit of most current models (see Mechanisms generating
the secondary region of enhancement), we speculate that
the primary region of enhancement is primarily the result of a
transformation of excitatory inputs to the IC. This seems plausible
because at least some rMTFs in the mustache bat IC remain band-pass
after iontophoretic application of various inhibitory blockers
(Burger and Pollak 1998). A similar result has been
reported for onset neurons in the mustache bat dorsal nucleus of the
lateral lemniscus (DNLL) (Yang and Pollak 1997),
supporting the notion that purely excitatory mechanisms are capable of
creating band-pass rMTFs. However, it is possible that both in the IC
and the DNLL, the inputs themselves show band-pass spike rate tuning
(see next section); this could also account for the minimal effect of
inhibitory blockers on the band-pass tuning of IC rMTFs. On the other
hand, it seems very likely that inhibitory inputs shape the rMTF by
creating the region of suppression. In addition, inhibition may sharpen the high-frequency rolloff of the primary region of enhancement without
resulting in a visible region of suppression (as a result of the low
firing rate in response to unmodulated tones: see discussion of Table 1
in RESULTS). The properties of the secondary region of
enhancement remain unclear; potential mechanisms that could generate it
are discussed in a later section.
In the next few sections, various issues related to the observed MTFs are discussed in greater detail. However, it must be emphasized that in the absence of much pertinent data about intrinsic properties and input patterns of IC neurons, as well as the AM response properties of inputs to the IC, much of this discussion must remain speculative.
Is rate tuning created de novo in the IC?
With contralateral sound presentation, the excitatory afferents to
the IC that are likely to be active include those from the
contralateral cochlear nucleus (in particular, stellate cells in the
ventral cochlear nucleus, and fusiform cells in the dorsal cochlear
nucleus), contralateral lateral superior olive (LSO), ipsilateral
medial superior olive (MSO), and possibly, the contralateral IC
(Moore et al. 1998; Oliver and Huerta 1991). Most neurons in the cochlear nucleus do not show much
rate tuning. Although little is known about the response properties of
MSO neurons to SAM tones, it appears from preliminary data that neurons
in the LSO possess, in addition to tuned tMTFs, rMTFs that show
systematic changes of spike rate with modulation frequency (Thornton SK
and Semple MN, unpublished observations). LSO rMTFs can show peaks
(BMFs); the majority of BMFs seem to lie below 400 Hz. Thus rMTFs of IC neurons may at least in part reflect the rate tuning present in their
rate-tuned inputs (e.g., from the LSO); low-pass filtering of the
inputs (which are also phase locked to the modulation frequency) may
potentially account for the lower BMFs in the IC. However, the extent
to which different inputs overlap in their projections onto single IC
cells is not clear. It therefore remains plausible that at least some
of the rate tuning seen in the IC emerges as a result of collicular processing.
Coincidence detection mechanisms may create the primary region of enhancement
Mechanisms clearly exist in the auditory midbrain to create tuned
rMTFs from inputs (e.g., from the cochlear nucleus) that show varying
amounts of synchrony to the modulation frequency, but little or no
spike rate changes with modulation frequency. One candidate scheme
suggests that band-pass rMTFs (regions of enhancement) seen in IC
neurons result from coincidence detection of synchronized excitatory
inputs (Hewitt and Meddis 1994). In this model, the IC
neuron is considered to be a coincidence detector that fires maximally
when its inputs (from the cochlear nucleus) are maximally synchronized,
thus converting the peak in the input tMTFs to a rMTF peak in the IC.
Such a model can reproduce some aspects of the previously reported
data, including the flattening of some rMTFs at high SPLs (Rees and Palmer 1989) (Fig. 5A in present study: because
the coincidences generated by the high input spike rates at high SPLs
fire the coincidence detector independent of the synchrony in the
inputs). However, in comparison to the tMTF peaks in the inputs from
the cochlear nucleus (80-520 Hz: Frisina et al. 1990, gerbil; mean = 330 Hz: Rhode and Greenberg 1994,
cat) and the lateral superior olive (200-600 Hz: Joris and Yin 1998, cat), IC BMFs appear to span a much lower range (0-100 Hz: Fig. 10A). In other words, rMTF peaks in the IC do not
seem to be equal to tMTF peaks in their inputs, as specified in the model. Other mechanisms (possibly including some combination of a stage
of low-pass filtering, inhibitory inputs causing a sharper high-frequency rolloff, or intrinsic cellular properties) may need to
be included to account for the data. However, it seems prima facie
possible that such a model (with the addition of inhibitory inputs to
create the region of suppression: see next section) might serve
as a good first attempt to reproduce various other aspects of AM
responses reported in this paper. An extensive modeling study would
also offer insight into input patterns and cellular properties that
could generate the diversity of MTF characteristics seen in the IC.
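The two behaviors attributed to the coincidence-detection scheme (output rate rising with input synchrony, and loss of that dependence at high input rates) can be illustrated with a toy simulation. This sketch is not the Hewitt and Meddis (1994) model. It assumes ten independent inputs whose per-bin spike probability is sinusoidally modulated, and a detector that fires whenever at least four inputs spike in the same 1-ms bin. The function name `coincidence_rate` and all parameter values are hypothetical.

```python
import math
import random

def coincidence_rate(mean_p, sync, mod_freq_hz, threshold=4,
                     n_inputs=10, n_bins=5000, bin_s=0.001, seed=0):
    """Fraction of bins in which at least `threshold` inputs spike together."""
    rng = random.Random(seed)
    hits = 0
    for b in range(n_bins):
        # Per-bin spike probability of each input, modulated at mod_freq_hz.
        phase = 2.0 * math.pi * mod_freq_hz * b * bin_s
        p = min(max(mean_p * (1.0 + sync * math.cos(phase)), 0.0), 1.0)
        n_spikes = sum(rng.random() < p for _ in range(n_inputs))
        if n_spikes >= threshold:
            hits += 1
    return hits / n_bins

# Low input rate: synchronized inputs drive the detector far more often.
low_sync = coincidence_rate(0.1, 1.0, 50.0)
low_flat = coincidence_rate(0.1, 0.0, 50.0)
# High input rate: chance coincidences already exceed threshold,
# so synchrony no longer raises the output rate.
high_sync = coincidence_rate(0.5, 1.0, 50.0)
high_flat = coincidence_rate(0.5, 0.0, 50.0)
print(low_sync / low_flat, high_sync / high_flat)
```

In this toy version the mean input rate is identical in the synchronized and unsynchronized conditions, so any difference in output reflects input timing alone. It does not by itself produce band-pass tuning, which in the model depends on the shape of the input tMTFs.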
An alternate scheme that has been proposed to explain peaked rMTFs also treats the IC neuron as a coincidence detector. However, the structure of this model is very different from that above: it is posited that a cross-correlation analysis is performed by neurons that detect coincidences between spike trains synchronized to the modulation frequency and carrier frequency, respectively, and delayed by different small time periods (Langner 1981