J Neurophysiol (November 1, 2002). 10.1152/jn.00253.2002
Submitted on 2 April 2002
Accepted on 30 July 2002
Laboratory of Auditory Neurophysiology, Department of Biomedical Engineering, Johns Hopkins University School of Medicine, Baltimore, Maryland 21205
ABSTRACT
Barbour, Dennis L. and Xiaoqin Wang. Temporal Coherence Sensitivity in Auditory Cortex. J. Neurophysiol. 88: 2684-2699, 2002. Natural sounds often contain energy over a broad spectral range and consequently overlap in frequency when they occur simultaneously; however, such sounds under normal circumstances can be distinguished perceptually (e.g., the cocktail party effect). Sound components arising from different sources have distinct (i.e., incoherent) modulations, and incoherence appears to be one important cue used by the auditory system to segregate sounds into separately perceived acoustic objects. Here we show that, in the primary auditory cortex of awake marmoset monkeys, many neurons responsive to amplitude- or frequency-modulated tones at a particular carrier frequency [the characteristic frequency (CF)] also demonstrate sensitivity to the relative modulation phase between two otherwise identically modulated tones: one at CF and one at a different carrier frequency. Changes in relative modulation phase reflect alterations in temporal coherence between the two tones, and the most common neuronal response was found to be a maximum of suppression for the coherent condition. Coherence sensitivity was generally found in a narrow frequency range in the inhibitory portions of the frequency response areas (FRA), indicating that only some off-CF neuronal inputs into these cortical neurons interact with on-CF inputs on the same time scales. Over the population of neurons studied, carrier frequencies showing coherence sensitivity were found to coincide with the carrier frequencies of inhibition, implying that inhibitory inputs create the effect. The lack of strong coherence-induced facilitation also supports this interpretation. Coherence sensitivity was found to be greatest for modulation frequencies of 16-128 Hz, which is higher than the phase-locking capability of most cortical neurons, implying that subcortical neurons could play a role in the phenomenon. 
Collectively, these results reveal that auditory cortical neurons receive some off-CF inputs temporally matched and some temporally unmatched to the on-CF input(s) and respond in a fashion that could be utilized by the auditory system to segregate natural sounds containing similar spectral components (such as vocalizations from multiple conspecifics) based on stimulus coherence.
INTRODUCTION
Sound processing in the auditory system of mammals begins in the cochlea, which contains an array of sensory epithelium that decomposes sound energy into many parallel pathways of frequency information (Fletcher 1940; Von Békésy 1960; Zwicker et al. 1957). These frequency pathways or "channels" persist throughout the ascending auditory system at least as far as primary auditory cortex (Aitkin et al. 1986; Howard et al. 1996; Kosaki et al. 1997; Merzenich et al. 1973, 1975, 1976; Morel et al. 1993; Reale and Imig 1980; Rose and Woolsey 1949; Woolsey and Walzl 1942, 1944) and represent a fundamental organizing principle for auditory neuroscience. Complex, natural sounds generally contain energy at many frequencies that tends to undergo coordinated changes in amplitude and frequency (Nelken et al. 1999), implying that the auditory system could exploit correlations in the auditory filter outputs to extract more information about the acoustic environment. Psychophysicists have tested this idea with stimuli that simultaneously excite multiple auditory filters and have concluded that such phenomena as comodulation masking release (CMR) and modulation detection/discrimination interference (MDI) reflect, at least partially, the operation of multiple auditory filters in detection tasks. In CMR, masking noise bands incoherently modulated in amplitude with respect to a probe tone render the probe tone more easily detectable (Hall et al. 1984; Moore 1990; Moore et al. 1990); in MDI, incoherent AM of one tone renders the presence of AM more easily detectable on another tone (Moore and Shailer 1992; Strickland et al. 1989; Yost et al. 1989).
Not only does AM seem to influence the grouping of tones (Bregman 1990; Bregman et al. 1985; Strickland et al. 1987; Von Békésy 1963; Wakefield and Edwards 1987), but so, too, can FM (Bregman 1990; Bregman et al. 1985; Chalikia and Bregman 1989; Cohen and Chen 1992; Furukawa and Moore 1996, 1997; Wilson et al. 1990). The role FM plays in auditory grouping, however, remains poorly elucidated relative to the role of AM. While some studies have indicated similarities between the perception of AM and FM (Saberi and Hafter 1995; Zwicker 1962), other studies of masking effects indicate a stronger role for harmonicity in simultaneous FM than in AM (Bregman and Doehring 1984; Carlyon 1991, 1992, 1994, 2000; Carlyon and Stubbs 1989; Chalikia and Bregman 1989, 1993; Culling and Summerfield 1995; Gardner and Darwin 1986; Marin and McAdams 1991; McAdams 1989). Despite the difficulty in verifying the precise nature of FM coherence in grouping tasks, this cue has been shown to provide information that could be useful for the processing of multiple simultaneous sounds.
Modulated tonal sound components feature prominently in the natural vocalizations of the common marmoset (Callithrix jacchus, a vocal primate), most notably in trill calls (Fig. 1A). Trill calls typically contain a fundamental and harmonic that share similar AM and FM. Trill calls from different monkeys, however, have distinct modulations, providing one potential cue for the marmoset auditory system to differentiate between two simultaneous callers (Agamaite 1997; Agamaite and Wang 1997; Bregman 1990) (Fig. 1B). Marmosets in their natural habitat and in captivity rely heavily on their vocal repertoire to communicate within their familial groups. They regularly hear concurrent vocalizations from several monkeys, which they must process effectively to survive and function in their social structure (Epple 1968, 1975). Findings from behavioral experiments on vervet monkeys in their natural habitats suggest that nonhuman primates can extract caller and message content of species-specific vocalizations in a noisy environment (Seyfarth et al. 1980).
Simultaneous modulated tones appear to represent logical, behaviorally relevant candidates for probing marmoset auditory cortex for sensitivity to stimulus coherence at different carrier frequencies (Fig. 1, C and D). Such stimuli are analogous to two marmoset monkeys simultaneously producing trill calls. Stimuli are coherent if all components have precisely the same modulation frequency (fmod) and phase (φmod) and incoherent if either fmod or φmod is mismatched. Differences in φmod between components reflect only temporal properties of the stimulus, while differences in fmod reflect both temporal and spectral properties. The temporal properties of the neurons under investigation were studied by altering the relative phase of modulation between two modulated tones (φrel = φmod2 − φmod1), a property to which the auditory system has been shown to be sensitive and which is believed to influence sound segregation for AM (Bregman et al. 1985; Strickland et al. 1987, 1989; Wakefield and Edwards 1987; Yost and Sheft 1989) and FM (A. S. Bregman and P. Abdel Ahad, unpublished data; Bregman 1990; Cohen and Chen 1992; Furukawa and Moore 1996, 1997; Wilson et al. 1990). The stimuli contained a primary tone (f1) fixed at CF and a secondary tone (f2) at one carrier frequency near CF. Both tones were sinusoidally modulated in either amplitude or frequency with identical fmod unless otherwise noted; φrel was varied as described in METHODS.
METHODS
Animal preparation
Preparation of marmosets followed institution-approved chronic physiology procedures. Beginning 2-4 wk prior to surgical implantation, the animal was accommodated to a primate chair for progressively longer periods of time until it sat quietly for 4-6 h. Preparatory aseptic surgeries were performed under initial ketamine (20 mg/kg) and sustained isoflurane (0.5-2.0%, combined with a 50/50 mixture of oxygen/nitrous oxide) anesthesia. The temporalis muscles were removed bilaterally, and 1.1- or 1.5-mm-diam holes were drilled approximately 1 mm through the skull in two semicircular patterns around the laterally located auditory cortices. Screws were inserted tightly into these holes for support, and dental cement was applied around the screws onto the entire exposed skull except for the regions immediately covering auditory cortex. These bare regions of skull were later protected by covering them with a pliable, easily removable polyvinylsiloxane dental impression compound (Kerr). Two stainless steel headposts were also affixed into the dental cement for later immobilization of the head. Following surgery, the animal was monitored closely and administered regular doses of antibiotics and pain relievers during the recovery period, normally 10-14 days.
During a neurophysiological recording experiment, the animal sat in the
primate chair with its body minimally restrained and its head
immobilized by a stainless-steel arm attached to the skull-mounted
restraining headposts. The protecting layer of impression compound over
the animal's skull was removed, and a small hole (approximately 1 mm
diam) was drilled through 80-90% of the skull thickness with a custom
drill mounted on a commercial micromanipulator (SM-11, Narishige). The
remaining bone was removed by hand with sterile, custom microsurgical
instruments under 40× magnification on a surgical microscope (Carl
Zeiss). Anesthetics were not required during the drilling because no
soft tissue was manipulated. A sterile electrode was then attached to a
hydraulic microdrive (Trent-Wells) mounted on the micromanipulator and
carefully lowered through the hole until it rested on the dura. This
hydraulic microdrive was operated by the experimenter from outside the
chamber to alter the electrode depth during experimentation. The
preparation was adapted from a similar technique developed to record
from mustached bat (Pteronotus parnellii parnellii) auditory
cortex without exposing excessive amounts of neural tissue
(Suga 1965a,b). The recordings achieved with this method
were typically stable over several hours of recording time, allowing
for the accumulation of large amounts of data from individual neurons.
At the end of each day of recording, the exposed hole was filled with
an antibiotic compound and the skull sealed over again with impression
compound. At the end of a week of recording, a small amount of dental
cement was placed into the hole to seal it permanently for infection
control, stabilization of later recordings, and retardation of
extraneous soft tissue growth.
Stimulus generation
Two tones were generated simultaneously using custom software,
and all their individual parameters were varied independently. Each
tone was computed from
[Equation 1]
[Equation 2]
[...] φAM, the AM phase. The phase was defined such that when φAM = 0 and t0 = 0, A(t0) = 0, indicating a cosine function. The sinusoidal FM dφc/dt was governed by three parameters: MFM, the FM depth as a fraction of carrier frequency fc, which could take on values between 0 and 1; fFM, the modulation frequency of the FM; and φFM, the modulation phase of the FM. The phase was defined such that when φFM = 0 and t0 = 0, φc(t0) = 0, indicating a sine function for the FM. Note that the form φc(t) took was that of phase modulation, which was derived from the desired FM as follows
[Equation 3]
[...] φc(t0) = 0, KFM must equal MFMfc/fFM.
Each tone therefore had 10 free parameters that could be varied: A0, fc, t0, t1, MAM, fAM, φAM, MFM, fFM, and φFM.
Note that no random carrier phase was incorporated into the tones.
Relative modulation phase was defined as the AM or FM phase of the
secondary (f2) tone at
t = 0 minus the corresponding phase of the primary
(f1 or CF) tone at t = 0. This definition allowed relative modulation phase values to be
meaningful even when the modulation frequencies of the two tones
differed. Examples of two tones with equal modulation frequencies but
different relative modulation phase values (hence, different temporal
coherence) are shown in Fig. 1, C and D. Linear
ramps of 10 ms ON-OFF were added to all stimuli used in
these experiments.
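The stimulus description above can be illustrated with a minimal sketch. Because the tone equations are not reproduced in this copy, the functional forms below are assumptions chosen only to agree with the surrounding text: an envelope A(t) = A0[1 − MAM cos(2πfAMt + φAM)] (zero at t = 0 for full depth and φAM = 0), a carrier phase φc(t) = KFM[1 − cos(2πfFMt + φFM)] with KFM = MFMfc/fFM (zero at t = 0 for φFM = 0, giving a sinusoidal frequency deviation of MFMfc), no random carrier phase, and 10-ms linear on/off ramps. All function and parameter names are hypothetical.

```python
import math

def am_envelope(t, a0, m_am, f_am, phi_am):
    """Assumed AM envelope: A0 * (1 - M_AM * cos(2*pi*f_AM*t + phi_AM))."""
    return a0 * (1.0 - m_am * math.cos(2.0 * math.pi * f_am * t + phi_am))

def fm_phase(t, fc, m_fm, f_fm, phi_fm):
    """Assumed carrier phase term: K_FM * (1 - cos(2*pi*f_FM*t + phi_FM)),
    with K_FM = M_FM * fc / f_FM as stated in the text."""
    if m_fm == 0.0 or f_fm == 0.0:
        return 0.0
    k_fm = m_fm * fc / f_fm
    return k_fm * (1.0 - math.cos(2.0 * math.pi * f_fm * t + phi_fm))

def modulated_tone(fc, dur, fs=100_000, a0=1.0,
                   m_am=0.0, f_am=0.0, phi_am=0.0,
                   m_fm=0.0, f_fm=0.0, phi_fm=0.0, ramp=0.010):
    """One tone sampled at fs, gated by linear on/off ramps of `ramp` seconds."""
    n = int(round(dur * fs))
    samples = []
    for i in range(n):
        t = i / fs
        gate = min(1.0, t / ramp, (dur - t) / ramp)
        carrier = math.sin(2.0 * math.pi * fc * t
                           + fm_phase(t, fc, m_fm, f_fm, phi_fm))
        samples.append(gate * am_envelope(t, a0, m_am, f_am, phi_am) * carrier)
    return samples

def two_am_tones(f1, f2, f_mod, phi_rel_deg, dur=0.5, fs=100_000, m_am=1.0):
    """Sum of a primary (f1) and secondary (f2) AM tone; the relative
    modulation phase is phase(f2) minus phase(f1) at t = 0."""
    phi_rel = math.radians(phi_rel_deg)
    tone1 = modulated_tone(f1, dur, fs, m_am=m_am, f_am=f_mod, phi_am=0.0)
    tone2 = modulated_tone(f2, dur, fs, m_am=m_am, f_am=f_mod, phi_am=phi_rel)
    return [x + y for x, y in zip(tone1, tone2)]
```

Under these assumptions, `two_am_tones(cf, f2, 16.0, 0.0)` corresponds to the coherent condition and `two_am_tones(cf, f2, 16.0, 180.0)` to the maximally incoherent one.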
Physiological recordings
The extracellular tungsten microelectrodes (3-5 MΩ, A-M
Systems; 1-3 MΩ, Uwe Thomas Recording) allowed for reliable
single-unit isolation (>40 dB SNR possible; typical SNR > 30 dB;
see Fig. 2). In all cases, action
potentials were continuously monitored during each recording session
and sorted on-line by template-matching digital signal processing (DSP)
software (Alpha-Omega Engineering). The spike timing information was
logged in a Pentium-based personal computer. Electrode voltage traces
were also digitized periodically for confirmation of on-line spike
sorting. Well-isolated single units were rigorously sought out, and
data were not collected under poor signal conditions.
Custom software running in MatLab on the same personal computer
generated all sound stimuli digitally at 100,000 samples/s, which were
then low-pass filtered at 50 kHz, fed into two serially-linked TDT PA4
attenuator modules (each of which was set to 1/2 the desired overall attenuation), and passed into a power amplifier (Crown) and
finally into the recording chamber. A value of 0 dB attenuation represents the most intense sound deliverable at a particular amplifier
setting [approximately 93 dB sound pressure level (SPL) for pure tones
at 1 kHz in these experiments]. The stimuli were all delivered in
free-field through a single two-way crossover, open bass reflex
loudspeaker (B&W 601) located 70 cm in front of the animal's head. The
speaker had ±4 dB passband ripple from 100 Hz to 36 kHz, which
encompasses the hearing range of marmosets (Seiden 1957). Speaker transfer function was measured using pure tones
stepped in one-twelfth-octave increments and recorded using a
Brüel and Kjær condenser microphone placed in the empty primate chair at the location of the animal's head. Both the loudspeaker and
the primate chair were located within a double-walled acoustic chamber
(IAC-1024, Industrial Acoustics), whose interior was lined with
three-inch acoustic foam (Sonex, Illbruck).
All stimuli were presented pseudorandomly for multiple repetitions
(usually 5-10). Spontaneous firing rates
(Rsp) were determined from the spiking
during the silent periods preceding the stimuli (usually 200-500 ms
long). Stimuli were 500-1,000 ms in duration, and successive stimuli
were always separated by at least 1 s of silence, usually more. The
interstimulus interval was adjusted to longer values for units with
clear activity/suppression long after stimulus offset.
Units were sampled from all cortical layers but predominantly from supragranular layers, as judged by recording depth and response properties. Primary auditory cortex was located stereotactically and confirmed by its short-latency, tone-responsive units and its tonotopic map, which was determined from electrode penetrations made through closely spaced holes in the skull. When all desired physiological experiments for an animal had been conducted, electrolytic lesions and fluorescent dye injections were made at various sites around auditory cortex. The animal was then deeply anesthetized with Nembutal, euthanized and perfused with formalin to preserve the brain tissue. Serial sectioning and staining, in conjunction with the experimental record, can reveal the electrode tracks, thereby pinpointing the recording sites.
A total of 84 single units isolated in the primary auditory cortex (A1)
of two awake marmoset monkeys (Callithrix jacchus) was
analyzed in this study. The units were selected for their sustained
response to modulated tones, a category that constitutes the majority
of units encountered in A1 (Liang et al. 2002).
Responsive units were located with a standardized search routine while
advancing the microelectrode through cortical tissue orthogonally to
the surface. Once a tone-responsive unit was identified, its tunings to
tone carrier frequency and SPL were determined. Frequency response functions (FRF; spike rate in response to a single pure tone fixed in
amplitude and varied in carrier frequency) were taken at or near lowest
response threshold, and rate-level response functions were measured at
characteristic frequency (CF), as determined by the peak of the
threshold FRF. Units were also characterized with respect to single
tone sinusoidal AM at the CF, yielding a modulation transfer function
(MTF). Some units were instead characterized by FM tones, and the
choice between the two types of modulation was made based on which type
of stimulus elicited more spikes from the unit. In the case of the few
units with no sustained response to pure tones, frequency and level
tunings were determined using the modulated stimuli.
Once a responsive unit's basic excitatory properties were defined, it was often probed with two simultaneous pure tones for its inhibitory properties. Because most auditory cortical units spontaneously discharge at very low rates, an FRF typically reveals little inhibition, thereby requiring the use of one tone at CF to bias the excitatory response and a second tone to probe for inhibition (Brosch and Schreiner 1997; Shamma and Symmes 1985; Suga 1965b; Sutter et al. 1999). A primary tone (f1) was always delivered at CF in such protocols while a secondary tone (f2) was varied in carrier frequency around CF to generate a two-tone FRF (TTFRF). The f1 tone was delivered with a SPL near threshold but intense enough to drive the unit consistently (often the peak of the rate-level function); the f2 tone was delivered at the same amplitude or 20 dB (in rare cases 40 dB) more intense as necessary to reveal inhibition. Density of sampling ranged from 10 to 40 steps per octave, depending on narrowness of tuning. A similar process generated a two-modulated-tone FRF (TMT FRF) for the coherent case (relative modulation phase or φrel = 0°) and the incoherent case (φrel = 180°). Modulation frequencies were chosen that reliably drove the units (e.g., peak of the rate modulation transfer function), and both tones were modulated at the same modulation frequency unless otherwise noted. Finally, secondary frequency regions showing apparent coherence sensitivity were probed more closely by fixing both f1 and f2 carrier frequencies and varying φrel in steps of either 22.5° or 45°. This process was often repeated at many f2 frequencies. Not all stimulus conditions could be studied in all units because of limited recording time.
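The sweep protocols described above amount to building a grid of secondary carrier frequencies and relative modulation phases and presenting it in pseudorandom order. The following sketch shows one way such a grid could be assembled; the function names, the log-spaced frequency grid centered on CF, the 0°-to-just-below-360° phase range, and the shuffle-within-repetition ordering are all assumptions for illustration, not the authors' software.

```python
import random

def secondary_frequencies(cf, octaves_below, octaves_above, steps_per_octave):
    """Log-spaced f2 values around CF at a fixed number of steps per octave."""
    lo = -round(octaves_below * steps_per_octave)
    hi = round(octaves_above * steps_per_octave)
    return [cf * 2.0 ** (k / steps_per_octave) for k in range(lo, hi + 1)]

def phase_sweep(step_deg):
    """Relative modulation phases covering one cycle in steps of step_deg."""
    n = int(round(360 / step_deg))
    return [k * step_deg for k in range(n)]

def trial_list(f2_values, phases, repetitions, seed=0):
    """Every (f2, phase) pair once per repetition, shuffled within each
    repetition by a seeded pseudorandom generator."""
    rng = random.Random(seed)
    trials = []
    for _ in range(repetitions):
        block = [(f2, p) for f2 in f2_values for p in phases]
        rng.shuffle(block)
        trials.extend(block)
    return trials
```

For example, `trial_list(secondary_frequencies(7000.0, 1, 1, 10), phase_sweep(45), 5)` would produce five shuffled repetitions of a one-octave-per-side sweep at 10 steps per octave and eight phases.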
Data analysis
Spike trains were converted into average discharge rate (or just
"rate") measures over a fixed time window beginning at stimulus onset and ending 50 ms following stimulus offset. Because all the
neurons tested for coherence sensitivity exhibited sustained spiking
patterns, limiting the time window to the beginning, middle, or end
portions of the stimulus did not substantially alter any of the
population measures. No systematic variations in synchronous spiking
(i.e., vector strength) as a function of relative modulation phase were
observed or found to be statistically significant across the population
of neurons studied (P > 0.05, Pearson's correlation test); hence, analysis was limited to spike rates. For the purposes of
this work, the rate modulation transfer function (rateMTF) was defined
as the number of spikes elicited by each of a series of tones modulated
at different frequencies divided by the duration of the stimuli. This
definition is identical to all other rate functions (e.g., the
rate-level function) and provides no information about the temporal
structure of the spiking. The synchronization modulation transfer
function (syncMTF) was defined here to be the vector strength as a
function of modulation frequency. Vector strength was defined
classically, and significance was evaluated using a Rayleigh test
(P < 0.001)
[Equation 4]
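As an illustration of the two response measures used here, the sketch below computes an average discharge rate over the analysis window and the vector strength in its classical form, the length of the mean resultant of spike phases relative to the modulation cycle. The notation and function names are assumptions, since the equation itself is not reproduced in this copy. The Rayleigh P value uses the common large-sample approximation P ≈ exp(−n·VS²), which is also an assumption about how the test was applied.

```python
import math

def discharge_rate(spike_times, stim_on, stim_off, post=0.050):
    """Spikes/s in a window from stimulus onset to `post` s after offset."""
    n = sum(1 for t in spike_times if stim_on <= t < stim_off + post)
    return n / (stim_off + post - stim_on)

def vector_strength(spike_times, f_mod):
    """Classical vector strength: |sum of unit phase vectors| / spike count."""
    n = len(spike_times)
    if n == 0:
        return 0.0
    c = sum(math.cos(2.0 * math.pi * f_mod * t) for t in spike_times)
    s = sum(math.sin(2.0 * math.pi * f_mod * t) for t in spike_times)
    return math.hypot(c, s) / n

def rayleigh_p(spike_times, f_mod):
    """Approximate Rayleigh-test P value, exp(-n * VS^2) (large-sample form)."""
    n = len(spike_times)
    return math.exp(-n * vector_strength(spike_times, f_mod) ** 2)
```

With this approximation, a spike train would be called significantly synchronized at the criterion used in the text when `rayleigh_p(...) < 0.001`.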
For population summary of phase responses, units were determined to be sensitive to φrel by statistical analysis of the φrel sweep rate curve having the largest range of values. The null hypothesis (i.e., all values of φrel elicited the same firing rates) was rejected if P < 0.05 in a one-way ANOVA of rates for all phase values tested. Rate curves as a function of phase were three-point-triangle smoothed for determination of the phase at minimum (Rmin) and maximum (Rmax) rate responses, although plotted data represent actual, unsmoothed rates. Carrier frequency-dependent spike measures were sampled at 20 frequencies/octave for common comparison. An inhibition index (II) was computed at each f2 carrier frequency as the two-tone frequency response function rate at CF minus the rate at the f2 carrier frequency, normalized by the rate at CF

$$\mathrm{II}(f_2) = \frac{R_{\mathrm{TT}}(\mathrm{CF}) - R_{\mathrm{TT}}(f_2)}{R_{\mathrm{TT}}(\mathrm{CF})} \tag{5}$$

[...] φrel = 0° minus the rate at f2 and 180°, normalized by the rate at CF and 0°

$$\mathrm{CSI}(f_2) = \frac{R(f_2, 0^\circ) - R(f_2, 180^\circ)}{R(\mathrm{CF}, 0^\circ)} \tag{6}$$

[...] 1, and dividing it by the largest value for that neuron. Large values of the adjusted inverse CSI reflect coherence sensitivity.
RESULTS
Temporal coherence sensitivity measured by sinusoidal AM
Stimulus parameters useful for testing temporal coherence sensitivity were determined on a neuron-by-neuron basis. Figure 3 shows the behavior of one representative neuron in response to a series of protocols designed to measure coherence sensitivity in response to AM tones. First, the carrier frequency, sound level, and modulation frequency ranges eliciting the greatest discharge rate from the neuron were determined by systematic stimulus sweeps of a single tone or modulated tone. Figure 3, A and B, demonstrates the resulting spiking patterns, as well as two separate measures of response to amplitude modulated tones: discharge rate and vector strength. Arrows in these plots indicate parameter values used for the two-tone and/or two-modulated-tone protocols.
When a primary tone (unmodulated) was fixed at the carrier frequency
and at sound level values indicated in Fig. 3A, a secondary unmodulated tone varied in carrier frequency revealed flanking inhibition (Fig. 3C, violet curve), the classical two-tone
response. When both of the tones were modulated at the same modulation
frequency (16 Hz, see Fig. 3B) with a relative modulation
phase of 0°, similar flanking inhibition was revealed (Fig. 3,
C and D, green curve). If, however, both tones
were modulated at the same modulation frequency but with a relative
modulation phase of 180°, the outer portion of the lower inhibitory
flank showed a markedly increased discharge rate (Fig. 3, C
and D, blue curve).
This phenomenon was explored in more detail by fixing the primary tone
as before and fixing the secondary tone carrier frequency while varying
the relative modulation phase. This process was repeated stepwise
across many secondary tone carrier frequencies. Figure 3E
depicts the result of this procedure at the secondary carrier frequency
indicated by arrows in Fig. 3, C and D. This frequency was chosen for demonstration because it shows the effect of interest and because it lies relatively far from CF, thereby eliminating the
possibility that the AM sidebands would excite the neuron. This neuron
showed a clear minimum rate response (maximum of suppression) for
coherent (φrel = 0°) stimuli, the most common
response type observed in this study. Some changes in vector strength
could be seen (Fig. 3E, right), but alteration in
temporal spiking was rare throughout the population of neurons studied.
Figure 4 shows another example neuron studied with similar AM protocols. Just as this neuron has a carrier frequency and modulation frequency range of maximal excitatory response different from that of the previous neuron (Fig. 4, A and B), its carrier frequency range of incoherent release from inhibition differs as well (Fig. 4, C and D). These two neurons are representative of the range of flanking carrier frequencies that reveal release from inhibition under temporally incoherent AM conditions: carrier frequencies both above and below CF, on both the inner and outer inhibitory flanks can show the phenomenon. Neurons may also show release from inhibition at all flanking frequencies or none, indicating considerable diversity in the nature of flanking inputs to A1 neurons. Figure 4E shows again that large phase-induced rate changes are typically unaccompanied by corresponding changes in vector strength.
Temporal coherence sensitivity measured by sinusoidal FM
Concurrently-delivered AM tones constitute a spectrally compact stimulus useful for exploring issues of temporal coherence sensitivity. Auditory cortex neurons also respond in sustained fashion to FM stimuli, so the question of FM-induced temporal coherence sensitivity may be addressed with a similar protocol. Figure 5 shows a neuron that exhibited a mild flanking release from inhibition for temporally coherent FM. Similarly to the AM case, changes in relative modulation phase between two FM tones can modify the inhibition evident in the rate response of A1 neurons.
The similarities between AM and FM temporal coherence sensitivity in
individual neurons can be seen most easily by comparing the relative
modulation phase tuning uncovered by the protocol used in Fig.
3E. Figure 6 shows the
relative modulation phase tuning responses of several neurons tested
with either AM or FM tones. While the stimulus parameters at which
coherence sensitivity exists can be seen to differ from neuron to
neuron, the phase responses generally show either coherent suppression
(Fig. 6, top and middle) or incoherent
suppression (Fig. 6, bottom). The main differences between
AM and FM become apparent when population responses are considered and
will be discussed in a later section. Again, no systematic alterations
in the vector strength of the spiking patterns were observed across the
population of neurons studied; the predominant effect seems to be a
strong modification of discharge rate.
Variety of temporal coherence sensitivity
Examples of neurons tested for temporal coherence sensitivity at many carrier frequencies are shown in Fig. 7 for AM and Fig. 8 for FM. These three-dimensional plots indicate the discharge rate of the neuron as a function of both secondary carrier frequency and relative modulation phase. The varying degree of inhibition release is evident in these examples. To aid the visualization of the coherence effect, the TMT FRF at 0° of relative modulation phase is projected through all phase values for each example (Fig. 7, right), creating a hypothetical response showing no coherence sensitivity. Figure 7, A and B shows neurons similar to the example in Fig. 3. Flanking inhibition in these neurons lessens under temporally incoherent stimulus conditions. Both inhibitory flanks in these neurons seem to be sensitive, although to varying degrees.
Figure 7C shows a mixed effect for the three inhibitory flanking regions. The inhibition nearest CF persists at all relative modulation phase values, and on-CF suppression becomes apparent at incoherent phase values. The far flanking region shows a mild release from inhibition for incoherent phase values. Figure 7D depicts a neuron displaying little temporal coherence sensitivity at any secondary carrier frequency for the stimulus parameters tested.
Figure 8A shows a neuron with coherent FM inhibition around
CF, similar to that depicted for AM in Fig. 7C. Figure
8B shows a more complex response, with an incoherent release
from inhibition just below CF (as in Fig. 7, A and
B) and a biphasic phase tuning response (troughs at both
0° and 180°) at the highest carrier frequencies audible to
marmosets. Such responses were never observed for two tones amplitude
modulated at the same modulation frequency and probably reflect
differential sensitivity to the upward and downward FM inherent in
sinusoidal modulation (Liang et al. 2002).
Two main categories of phase tuning functions
As mentioned previously, the relative modulation phase
tuning functions depicted in Fig. 6 appeared to constitute two distinct categories. From all the phase tuning functions collected for a neuron
that demonstrated a dependence of rate on relative modulation phase
(1-way ANOVA of rates, P < 0.05), the one with the
largest difference between maximum and minimum rates
(Rmax − Rmin) was used to characterize that
neuron. Some neurons showed no significant alteration of rate at any
phase tested. The phases of minimum and maximum response
(φmin and φmax) for the
significant phase sweeps are plotted against the percent decrease from
maximum, computed by (Rmax − Rmin)/(Rmax − Rsp) × 100, in Fig.
9A. The median discharge rate
differential (Rmax − Rmin) was 15 spikes/s; the range, [2,
53]. Distributions of the phases of minimum and maximum responses are
shown in Fig. 9, B and C. Neurons tested with AM show a clear bias toward response minima (maximum suppression) at phases near 0° and response maxima at phases near 180°. While fewer neurons were tested with FM, a higher overall percentage showed
significant phase effects (77% or 17/22 vs. 53% or 31/58 for AM), and
response minima tended to be fairly evenly divided between 0° and
180°. The differences in phase distributions and relative proportions
of neurons showing detectable effects constitute the most obvious
distinctions between data gathered with AM versus FM.
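The per-neuron characterization described above can be sketched as follows. The smoothing weights (1/4, 1/2, 1/4), the circular wrap-around at the ends of the phase axis, and the use of unsmoothed rates for the rate differential are assumptions, since the text specifies only a three-point triangle smoothing for locating the phases of minimum and maximum response. Function names are hypothetical.

```python
def triangle_smooth_circular(rates):
    """Three-point triangular smoothing with weights 1/4, 1/2, 1/4,
    treating the phase axis as circular (assumption)."""
    n = len(rates)
    return [0.25 * rates[(i - 1) % n] + 0.5 * rates[i] + 0.25 * rates[(i + 1) % n]
            for i in range(n)]

def characterize_phase_tuning(phases_deg, rates, r_spont):
    """Return (phase of minimum, phase of maximum, percent decrease from
    maximum), with the percent decrease computed as
    (Rmax - Rmin) / (Rmax - Rsp) * 100."""
    smoothed = triangle_smooth_circular(rates)
    idx = range(len(rates))
    phi_min = phases_deg[min(idx, key=lambda i: smoothed[i])]
    phi_max = phases_deg[max(idx, key=lambda i: smoothed[i])]
    r_max, r_min = max(rates), min(rates)  # unsmoothed extremes (assumption)
    percent = (r_max - r_min) / (r_max - r_spont) * 100.0
    return phi_min, phi_max, percent
```

For a hypothetical sweep with rates of 4, 8, 14, 18, 20, 18, 14, and 8 spikes/s at 45° steps and a spontaneous rate of 2 spikes/s, this yields a minimum at 0°, a maximum at 180°, and a decrease of about 89% from maximum.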
Each phase tuning function was taken with the primary tone fixed at the
neuron's CF and the secondary tone fixed at another carrier frequency.
A scatterplot of CF versus secondary tone frequency where
Rmax − Rmin was greatest for each neuron is
shown in Fig. 10. Three observations
can be made: 1) coherence sensitivity is found over a wide
range of CFs and well beyond the physiological carrier frequencies of
trill calls (5-7 kHz); 2) maximum coherence sensitivity is
observed both above and below the CF; and 3) carrier frequencies of maximum coherence sensitivity are generally found near
the CF. The range of CFs involved encompasses a large portion of the
audible frequency range of marmosets (Seiden 1957) and the entire range of their vocalizations (Agamaite 1997; Agamaite and Wang 1997) but does not appear to be
concentrated within any one frequency range.
Average population temporal coherence sensitivity
The clustering of rate minima and maxima around AM phase values of 0° and 180° seen in Fig. 9 implies that the maximally coherent and incoherent stimulus conditions may adequately capture the temporal coherence properties revealed by AM tones. Figure 11A shows the median CSI as a function of secondary tone carrier frequency relative to CF for all the neurons tested with the two-AM-tone protocol depicted in Fig. 3C. Positive values of CSI indicate greater rates in response to 0° than to 180°; negative values indicate greater rates in response to 180° than to 0° (see METHODS). The maximum median CSI value lies at CF, indicating that two coherent AM tones with carrier frequencies very close or equal to CF elicit greater population rate responses than do two incoherent tones with the same carrier frequencies. The minimum median CSI values lie approximately ±1/4 octave from CF, approximately one critical band (Fletcher 1940; Greenwood 1961; Hamilton 1957). Negative CSI values indicate incoherent release from inhibition, and the frequency range of low median CSI corresponds closely with high values of the median inhibition index (II) as measured by two pure tones (see METHODS).
Figure 11B depicts all of the AM CSI data for CF and CF ±1/4 octave. The greatest number of neurons (30/61) showed a bilateral flanking decrease of CSI relative to the value at CF (black), indicating flanking inhibition that was generally stronger for coherently modulated stimuli. A smaller number (13/61) showed a bilateral flanking increase of CSI relative to CF (gray), indicating either coherent release from flanking inhibition or a diminished response near CF resulting from incoherence. The remainder of the neurons (18/61) showed an asymmetric response.
The median CSI values at ±1/4 octave significantly differed from the median value at CF (P < 0.01, Wilcoxon signed-rank test), and neurons with CSI decreases had significantly lower CSI values at ±1/4 octave than those with increases (P < 0.001, sign test), indicating that the overall AM population response was dominated by neurons demonstrating incoherent release from flanking inhibition. Results for FM (data not shown) were similar but did not reach statistical significance because of the smaller number of neurons tested with FM (n = 25).
One important result seen for both AM and FM derives from the CSI
measure and can be seen in Fig. 11B for AM: most of the CSI values lie in the range [−1, 1], indicating that coherence
sensitivity does not normally result in facilitating responses, which
would yield CSI values >1 for coherent and <−1 for incoherent
facilitation. Facilitation would be expected if latent off-CF
excitatory inputs became active as a function of coherence. The CSI at
±1/4 octave was uncorrelated with carrier frequency, modulation
frequency, sound level, discharge rate, or vector strength
(P > 0.05, Pearson's correlation test).
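The interpretation of CSI ranges given above can be made concrete with a minimal sketch. The actual CSI definition appears in METHODS and is not reproduced in this section, so the normalization used here (rate difference between the 0° and 180° conditions divided by the rate for the CF tone alone) is an assumption chosen only because it is consistent with the sign convention and the [−1, 1] facilitation boundary described in the text. The function names and the example rates are hypothetical.

```python
def coherence_sensitivity_index(rate_coherent, rate_incoherent, rate_cf_alone):
    """Hypothetical CSI: (rate at 0 deg - rate at 180 deg) / rate for CF tone alone.

    ASSUMPTION: this normalization is illustrative; the paper's own
    definition is given in METHODS and may differ.
    """
    if rate_cf_alone <= 0:
        raise ValueError("rate_cf_alone must be positive")
    return (rate_coherent - rate_incoherent) / rate_cf_alone


def interpret_csi(csi):
    """Map a CSI value onto the interpretation used in the text."""
    if csi > 1:
        return "coherent facilitation"
    if csi < -1:
        return "incoherent facilitation"
    if csi > 0:
        return "greater rate at 0 deg than at 180 deg"
    if csi < 0:
        return "greater rate at 180 deg than at 0 deg"
    return "equal rates at 0 deg and 180 deg"


# Hypothetical discharge rates in spikes/s (not data from the study).
csi = coherence_sensitivity_index(10.0, 30.0, 40.0)
print(csi)                 # -0.5
print(interpret_csi(csi))  # greater rate at 180 deg than at 0 deg
```

Under this assumed normalization, a CSI above 1 requires the coherent two-tone rate to exceed the CF-alone rate, which is why values outside [−1, 1] would be read as facilitation rather than as modulation of inhibition.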
Temporal coherence sensitivity persists across modulation frequency
Modulation frequencies were normally chosen to elicit large, sustained discharge rates from each neuron. A subset of neurons, however, was tested for coherence sensitivity over a range of modulation frequencies. Figure 12, A-E shows several examples for AM where the relative modulation phase tuning maintained the same general form over a range of modulation frequencies (right, color-coded to reflect modulation frequencies indicated at left). The modulation frequency ranges exhibiting the greatest coherence sensitivity appeared to be unrelated to any particular features of the modulation transfer functions.
While modulation frequencies showing strong coherence sensitivity are unique to each neuron, a population trend can be seen in Fig. 12F. Plotted are the adjusted inverse CSI values (see METHODS) for the neurons tested at multiple modulation frequencies. Each adjusted inverse CSI curve shown on the left corresponds to one neuron; mean values across modulation frequency are shown on the right. At higher modulation frequencies, all neurons tested tended to show inhibition throughout the entire stimulus cycle, losing all coherence sensitivity. At low modulation frequencies, some neurons still exhibited coherence sensitivity ("low-pass") and some tended to lose it much as at higher frequencies ("band-pass"), making the population response look band-pass with a broad intermediate range of modulation frequencies (16-128 Hz) where coherence sensitivity was most commonly found.
Examples of neurons tested with a primary tone modulated at one frequency and a secondary tone modulated at a different frequency are shown in Fig. 13. When the mismatched modulation frequencies were harmonically related (e.g., fmod1 = 16 Hz, fmod2 = 32 Hz represents "low/high" harmonic modulation frequencies), either a relatively flat (Fig. 13, A and B) or a double-peaked (Fig. 13, C and D) phase tuning function resulted from the low/high or high/low combinations or both. Recall that the amplitude modulating function is always defined to be a cosine, and "relative modulation phase" in cases of mismatched modulation frequencies refers only to the starting phase of the modulation on the secondary tone (see METHODS). No clear population trends were evident with this protocol.
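To illustrate how the starting phase of the secondary modulator relates to envelope coherence, the following sketch computes the zero-lag correlation between two cosine modulating functions. It is an illustration only, not the stimulus-generation code of the study: the function name, sampling parameters, and the use of Pearson correlation as a coherence measure are assumptions. Because correlation is unaffected by offset and scale, the modulation depth and DC term of the envelopes are omitted.

```python
import math


def envelope_correlation(fmod1_hz, fmod2_hz, phase_deg,
                         duration_s=1.0, n_samples=8000):
    """Zero-lag Pearson correlation between two cosine modulators.

    The primary modulator starts at phase 0; the secondary modulator
    starts at phase_deg (the "relative modulation phase").
    """
    phase = math.radians(phase_deg)
    e1, e2 = [], []
    for k in range(n_samples):
        t = duration_s * k / n_samples
        e1.append(math.cos(2 * math.pi * fmod1_hz * t))
        e2.append(math.cos(2 * math.pi * fmod2_hz * t + phase))
    m1 = sum(e1) / n_samples
    m2 = sum(e2) / n_samples
    cov = sum((a - m1) * (b - m2) for a, b in zip(e1, e2))
    var1 = sum((a - m1) ** 2 for a in e1)
    var2 = sum((b - m2) ** 2 for b in e2)
    return cov / math.sqrt(var1 * var2)


# Matched modulation frequencies: correlation follows cos(phase).
for phase_deg in (0, 90, 180):
    print(phase_deg, round(envelope_correlation(16, 16, phase_deg), 6))

# Harmonically related, mismatched frequencies (16 and 32 Hz):
# the zero-lag correlation stays near zero at every starting phase.
for phase_deg in (0, 90, 180):
    print(phase_deg, round(envelope_correlation(16, 32, phase_deg), 6))
```

For matched modulation frequencies the correlation runs from +1 at 0° to −1 at 180°, corresponding to the maximally coherent and incoherent conditions. For the 16/32 Hz pair the starting phase does not change this particular measure, which is one reason "relative modulation phase" has a more limited meaning in the mismatched case.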
Coherence sensitivity exists over a limited range of relative sound level
Generally, the sound level of the primary tone was chosen to
elicit a high discharge rate from the neuron, and the secondary tone
sound level was chosen to be the lowest value that revealed flanking
inhibition. Therefore, because of the variety of level tuning properties
found in A1 (Brugge and Merzenich 1973; Calford and Semple 1995;
Pfingst and O'Connor 1981; Phillips and Irvine 1981; Phillips et al.
1994; Wang et al. 1999), sound levels over a
wide range (0-70 dB SPL) were tested during the course of the coherence sensitivity experiments. A small number of neurons was tested
at several primary and secondary tone sound levels, and two
representative examples of phase tuning at different values of
secondary sound level are shown in Fig.
14. The secondary tone sound levels at
which coherence sensitivity could be found tended to be greatest near
the primary tone level or
20 dB more intense, as shown in the
right-hand plots of Fig. 14.
DISCUSSION
Summary of findings and neural circuitry model
Most neurons in the primary auditory cortex of marmosets exhibit a
property whereby their discharge rates are modified by the relative
modulation phase between a modulated tone at CF and another modulated
tone nearby in carrier frequency. Because relative modulation phase
represents a purely temporal feature that can alter the coherence
between the two modulated tones, this discharge rate modification
occurs as a function of the temporal coherence of the two tones. Each
neuron demonstrates a unique range of carrier frequencies where
temporal coherence sensitivity can be found, but the range of the
population as a whole generally coincides with the population range of
flanking inhibition measured by pure tones. This observation, coupled
with the finding that little phase-induced facilitation occurs,
implicates inhibitory mechanisms as the underlying cause of temporal
coherence sensitivity. Had a significant amount of facilitation been
found, latent flanking excitatory inputs might have appeared the more
likely explanation. Because most inhibition observed in intracellular
studies of A1 neurons arises locally (de Ribaupierre et al. 1972;
DeWeese and Zador 2000; Serkov 1984; Serkov and Volkov 1984, 1985),
these inhibitory inputs are likely to be located in the cortex.
If all inhibitory inputs into A1 neurons possessed temporal dynamics
(i.e., modulation transfer functions) similar to those of the excitatory
input(s), then all flanking inhibition would be expected to show an
incoherent release from inhibition. Conversely, if all inhibitory
inputs possessed temporal dynamics that operated on longer time scales
than the excitatory inputs, then flanking inhibition would be expected
to persist at all relative modulation phase values. The finding that a
mixture of these response types exists for most neurons
indicates that only some inhibitory inputs match the excitatory inputs
in their temporal dynamics. This situation is consistent with input
neurons, both excitatory and inhibitory, possessing a variety of
modulation transfer functions and converging onto A1 neurons. The
simplest circuitry model explaining the response of the neuron depicted
in Fig. 3A is shown in Fig. 15. At the modulation frequency tested
for this model neuron (indicated by tick marks in the hypothetical MGB
modulation transfer functions), the lowest frequency
(leftmost) inhibitory input shows significant synchronous firing, as does the excitatory input. The other inhibitory inputs show
high firing rates but no significant synchronization at that modulation
frequency, indicating that their spikes will inhibit the A1 neuron
regardless of the temporal structure of the stimulus at that modulation
frequency (i.e., relative modulation phase).
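The qualitative logic of this argument can be illustrated with a toy rate model. The sketch below is not the circuit of Fig. 15 and makes several simplifying assumptions that do not come from the study: a cosine-modulated excitatory drive, purely subtractive inhibition, half-wave rectification of the net drive, and inhibitory inputs that are either fully synchronized to the secondary modulator or unmodulated ("sustained") with the same mean strength. All names and parameter values are hypothetical.

```python
import math


def mean_rate(phase_deg, inhibition="synchronized", gain=1.0, n_steps=3600):
    """Mean rectified net drive over one modulation cycle (arbitrary units).

    phase_deg  : relative modulation phase of the secondary tone
    inhibition : "synchronized" (follows the secondary modulator) or
                 "sustained" (constant, same mean strength)
    gain       : hypothetical inhibitory strength relative to excitation
    """
    phase = math.radians(phase_deg)
    total = 0.0
    for k in range(n_steps):
        theta = 2 * math.pi * k / n_steps
        excitation = 1.0 + math.cos(theta)
        if inhibition == "synchronized":
            inhib = gain * (1.0 + math.cos(theta + phase))
        elif inhibition == "sustained":
            inhib = gain  # no modulation, so phase has no effect
        else:
            raise ValueError("inhibition must be 'synchronized' or 'sustained'")
        # Subtractive inhibition followed by half-wave rectification.
        total += max(0.0, excitation - inhib)
    return total / n_steps


for phase_deg in (0, 90, 180):
    print(phase_deg,
          round(mean_rate(phase_deg, "synchronized"), 3),
          round(mean_rate(phase_deg, "sustained"), 3))
```

In this toy model, synchronized inhibition cancels the excitation at 0° and is progressively escaped as the relative phase approaches 180°, reproducing an incoherent release from inhibition, whereas sustained inhibition yields the same rate at every relative phase. A mixture of the two input types would give phase tuning riding on a phase-independent suppression.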
The proper protocol to explore inhibitory temporal dynamics more
thoroughly is to deliver a pure tone at CF to provide excitatory input
and then present a flanking tone at a range of modulation frequencies,
constructing an inhibitory modulation transfer function (IMTF). This procedure performed in the inferior colliculus (IC) has
revealed IMTFs mirroring those seen in excitatory cases, that is, MTFs
with typically band-pass rate and temporal characteristics (Li
et al. 2002). The data from the current study indicate that the
rates of A1 neurons are more strongly affected by temporal dynamics of input stimuli than are their temporal response properties. Such a result, if borne out by more detailed analysis of cortical IMTFs, would suggest that the convergence of subcortical inputs of
differing temporal dynamics onto A1 neurons can largely be read out as
a rate code in cortex.
FM as an entity separate from AM
As mentioned in the INTRODUCTION, some
psychophysical experiments have suggested that AM and FM are
perceived, and thus are likely to be coded, similarly. Masking studies,
however, have revealed potential differences between the perception of
AM and FM. The physiological picture in cortex appears somewhat more complicated for FM than for AM, as well, with greater stimulus selectivity for FM (e.g., Fig. 5) and complex phase-tuning responses (e.g., Fig. 8B).
Nevertheless, the FM data collected showed interesting parallels and contrasts to the AM data. First, the phenomenon of temporal coherence sensitivity took the same general form in FM neurons as in AM neurons. Second, the presence of coherence sensitivity seemed to be more easily detected using FM than AM, having produced 17 of 22 significant responses (vs. 31 of 58 for AM). Finally, while the AM responses were heavily weighted by incoherent release from inhibition, the FM responses showed a nearly equal distribution between incoherent and coherent release from inhibition (Fig. 9). The significance of these findings remains unclear at present and requires further experimentation. A handful of neurons tested with both AM and FM showed somewhat inconsistent effects between the two stimulus types, indicating that FM coherence sensitivity cannot easily be predicted from AM data.
Range of modulation frequency involved
The persistence of temporal coherence sensitivity at numerous
joint modulation frequencies implies that the inputs responsible for
the phenomenon have spike discharges relatively well-aligned with the
modulating function (i.e., a fairly high vector strength) over a range
of modulation frequencies. The collective loss of coherence sensitivity
at high modulation frequencies also bolsters this conclusion. Loss of
coherence sensitivity at low modulation frequencies for some neurons
implies that some inputs exhibit band-pass temporal characteristics.
Figure 16 compares the modulation frequency range of temporal coherence sensitivity for 17 A1 neurons with syncMTF data obtained for 200 A1 neurons in a separate study in
awake marmosets (Liang et al. 2002). The syncMTF data
are shown as the percentage of neurons with significant AM envelope
synchronization (P < 0.001, Rayleigh test) at each
modulation frequency. The large difference between these two curves at
intermediate modulation frequencies explains why coherence sensitivity
is evident in the rate but not the temporal responses of A1 neurons.
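For readers unfamiliar with the synchronization measures referred to here, the following minimal sketch computes vector strength and the Rayleigh statistic from spike times. It is not the analysis code of either study. The function names and the example spike trains are hypothetical, and the threshold of 13.8 on 2nR² is the commonly used large-sample criterion corresponding to P < 0.001, stated here as an assumption about how such a test is typically applied.

```python
import math


def vector_strength(spike_times_s, fmod_hz):
    """Vector strength R of spike times relative to a modulation frequency."""
    n = len(spike_times_s)
    if n == 0:
        raise ValueError("at least one spike is required")
    c = sum(math.cos(2 * math.pi * fmod_hz * t) for t in spike_times_s)
    s = sum(math.sin(2 * math.pi * fmod_hz * t) for t in spike_times_s)
    return math.hypot(c, s) / n


def rayleigh_statistic(spike_times_s, fmod_hz):
    """Rayleigh statistic 2*n*R^2; values above ~13.8 correspond to P < 0.001."""
    n = len(spike_times_s)
    return 2 * n * vector_strength(spike_times_s, fmod_hz) ** 2


fmod_hz = 32.0

# Hypothetical spike train locked to one phase of every modulation cycle.
locked = [k / fmod_hz for k in range(20)]
# Hypothetical spike train spread evenly over eight phases of the cycle.
spread = [k / (8 * fmod_hz) for k in range(8)]

print(round(vector_strength(locked, fmod_hz), 3),
      rayleigh_statistic(locked, fmod_hz) > 13.8)
print(round(vector_strength(spread, fmod_hz), 3),
      rayleigh_statistic(spread, fmod_hz) > 13.8)
```

A neuron can therefore fire at a high rate while showing a vector strength near zero, which is the distinction between rate and temporal responses drawn in the comparison above.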
Collective synchronization boundaries of MGB neurons have been reported
ranging from 100 to 300 Hz (Langner 1992), which is
within the range predicted by Fig. 16, but diversity of animal models,
experimental preparations, stimulus selection and analysis methods
makes conclusions regarding the coding transformation between MGB and
cortex somewhat premature. It has been established, however, that
cortical neurons typically show lower synchronization boundaries than
geniculate input neurons monosynaptically connected to them
(Creutzfeldt et al. 1980). The balance of the evidence, from this
study and previous studies, supports the assertion
that temporal coherence sensitivity reflects the temporal properties of
the inputs rather than those of the A1 neurons themselves and
that t