J Neurophysiol 88: 2684-2699, 2002; doi:10.1152/jn.00253.2002
0022-3077/02 $5.00

J Neurophysiol (November 1, 2002). 10.1152/jn.00253.2002
Submitted on 2 April 2002
Accepted on 30 July 2002

Temporal Coherence Sensitivity in Auditory Cortex

Dennis L. Barbour and Xiaoqin Wang

Laboratory of Auditory Neurophysiology, Department of Biomedical Engineering, Johns Hopkins University School of Medicine, Baltimore, Maryland 21205


    ABSTRACT

Barbour, Dennis L. and Xiaoqin Wang. Temporal Coherence Sensitivity in Auditory Cortex. J. Neurophysiol. 88: 2684-2699, 2002. Natural sounds often contain energy over a broad spectral range and consequently overlap in frequency when they occur simultaneously; however, such sounds under normal circumstances can be distinguished perceptually (e.g., the cocktail party effect). Sound components arising from different sources have distinct (i.e., incoherent) modulations, and incoherence appears to be one important cue used by the auditory system to segregate sounds into separately perceived acoustic objects. Here we show that, in the primary auditory cortex of awake marmoset monkeys, many neurons responsive to amplitude- or frequency-modulated tones at a particular carrier frequency [the characteristic frequency (CF)] also demonstrate sensitivity to the relative modulation phase between two otherwise identically modulated tones: one at CF and one at a different carrier frequency. Changes in relative modulation phase reflect alterations in temporal coherence between the two tones, and the most common neuronal response was found to be a maximum of suppression for the coherent condition. Coherence sensitivity was generally found in a narrow frequency range in the inhibitory portions of the frequency response areas (FRA), indicating that only some off-CF neuronal inputs into these cortical neurons interact with on-CF inputs on the same time scales. Over the population of neurons studied, carrier frequencies showing coherence sensitivity were found to coincide with the carrier frequencies of inhibition, implying that inhibitory inputs create the effect. The lack of strong coherence-induced facilitation also supports this interpretation. Coherence sensitivity was found to be greatest for modulation frequencies of 16-128 Hz, which is higher than the phase-locking capability of most cortical neurons, implying that subcortical neurons could play a role in the phenomenon. 
Collectively, these results reveal that auditory cortical neurons receive some off-CF inputs temporally matched and some temporally unmatched to the on-CF input(s) and respond in a fashion that could be utilized by the auditory system to segregate natural sounds containing similar spectral components (such as vocalizations from multiple conspecifics) based on stimulus coherence.


    INTRODUCTION

Sound processing in the auditory system of mammals begins in the cochlea, which contains an array of sensory epithelium that decomposes sound energy into many parallel pathways of frequency information (Fletcher 1940; Von Békésy 1960; Zwicker et al. 1957). These frequency pathways or "channels" persist throughout the ascending auditory system at least as far as primary auditory cortex (Aitkin et al. 1986; Howard et al. 1996; Kosaki et al. 1997; Merzenich et al. 1973, 1975, 1976; Morel et al. 1993; Reale and Imig 1980; Rose and Woolsey 1949; Woolsey and Walzl 1942, 1944) and represent a fundamental organizing principle for auditory neuroscience. Complex, natural sounds generally contain energy at many frequencies that tends to undergo coordinated changes in amplitude and frequency (Nelken et al. 1999), implying that the auditory system could exploit correlations in the auditory filter outputs to extract more information about the acoustic environment. Psychophysicists have tested this idea with stimuli that simultaneously excite multiple auditory filters and have concluded that such phenomena as comodulation masking release (CMR) and modulation detection/discrimination interference (MDI) reflect, at least partially, the operation of multiple auditory filters in detection tasks. In CMR, masking noise bands incoherently modulated in amplitude with respect to a probe tone render the probe tone more easily detectable (Hall et al. 1984; Moore 1990; Moore et al. 1990); in MDI, incoherent AM of one tone renders the presence of AM more easily detectable on another tone (Moore and Shailer 1992; Strickland et al. 1989; Yost et al. 1989).

Not only does AM seem to influence the grouping of tones (Bregman 1990; Bregman et al. 1985; Strickland et al. 1987; Von Békésy 1963; Wakefield and Edwards 1987), but so, too, can FM (Bregman 1990; Bregman et al. 1985; Chalikia and Bregman 1989; Cohen and Chen 1992; Furukawa and Moore 1996, 1997; Wilson et al. 1990). The role FM plays in auditory grouping, however, remains poorly elucidated relative to the role of AM. While some studies have indicated similarities between the perception of AM and FM (Saberi and Hafter 1995; Zwicker 1962), other studies of masking effects indicate a stronger role for harmonicity in simultaneous FM than in AM (Bregman and Doehring 1984; Carlyon 1991, 1992, 1994, 2000; Carlyon and Stubbs 1989; Chalikia and Bregman 1989, 1993; Culling and Summerfield 1995; Gardner and Darwin 1986; Marin and McAdams 1991; McAdams 1989). Despite the difficulty in verifying the precise nature of FM coherence in grouping tasks, this cue has been shown to provide information that could be useful for the processing of multiple simultaneous sounds.

Modulated tonal sound components feature prominently in the natural vocalizations of the common marmoset (Callithrix jacchus, a vocal primate), most notably in trill calls (Fig. 1A). Trill calls typically contain a fundamental and harmonic that share similar AM and FM. Trill calls from different monkeys, however, have distinct modulations, providing one potential cue for the marmoset auditory system to differentiate between two simultaneous callers (Agamaite 1997; Agamaite and Wang 1997; Bregman 1990) (Fig. 1B). Marmosets in their natural habitat and in captivity rely heavily on their vocal repertoire to communicate within their familial groups. They regularly hear concurrent vocalizations from several monkeys, which they must process effectively to survive and function in their social structure (Epple 1968, 1975). Findings from behavioral experiments on vervet monkeys in their natural habitats suggest that nonhuman primates can extract caller and message content of species-specific vocalizations in a noisy environment (Seyfarth et al. 1980).



Fig. 1. Motivation and stimuli used to test for temporal coherence sensitivity. A: spectrogram (top) and amplitude envelope (bottom) of a trill call from a single marmoset monkey. B: spectrogram (top) and amplitude envelope (bottom) of the same call as in A from monkey M1 superimposed on the trill call of a 2nd monkey (M2). These particular calls have frequency components that do not overlap, although the distribution of trill carrier frequencies for any given marmoset generally overlaps that of any other. Each call contains a fundamental and 1st harmonic, which are sinusoidally amplitude and frequency modulated. Differences in temporal structure evident in both types of modulation provide a potential cue for segregating the 2 sounds. C and D: stimuli used to test for temporal coherence sensitivity. Two tones were delivered simultaneously and sinusoidally modulated either in amplitude (C) or frequency (D). Shown for AM are spectrograms (top) and amplitude envelopes (bottom); for FM, spectrograms only. The 2 tones were always modulated at the same frequency unless otherwise noted. Relative modulation phase was varied systematically and represented a purely temporal measure of the coherence between the 2 tones.

Simultaneous modulated tones appear to represent logical, behaviorally relevant candidates for probing marmoset auditory cortex for sensitivity to stimulus coherence at different carrier frequencies (Fig. 1, C and D). Such stimuli are analogous to two marmoset monkeys simultaneously producing trill calls. Stimuli are coherent if all components have precisely the same modulation frequency (fmod) and phase (phi mod) and incoherent if either fmod or phi mod is mismatched. Differences in phi mod between components reflect only temporal properties of the stimulus, while differences in fmod reflect both temporal and spectral properties. The temporal properties of the neurons under investigation were studied by altering the relative phase of modulation between two modulated tones (phi rel = phi mod2 - phi mod1), a property to which the auditory system has been shown to be sensitive and which is believed to influence sound segregation for AM (Bregman et al. 1985; Strickland et al. 1987, 1989; Wakefield and Edwards 1987; Yost and Sheft 1989) and FM (A. S. Bregman and P. Abdel Ahad, unpublished data; Bregman 1990; Cohen and Chen 1992; Furukawa and Moore 1996, 1997; Wilson et al. 1990). The stimuli contained a primary tone (f1) fixed at CF and a secondary tone (f2) at one carrier frequency near CF. Both tones were sinusoidally modulated in either amplitude or frequency with identical fmod unless otherwise noted; phi rel was varied as described in METHODS.


    METHODS

Animal preparation

Preparation of marmosets followed institution-approved chronic physiology procedures. Beginning 2-4 wk prior to surgical implantation, the animal was accommodated to a primate chair for progressively longer periods of time until it sat quietly for 4-6 h. Preparatory aseptic surgeries were performed under initial ketamine (20 mg/kg) and sustained isoflurane (0.5-2.0%, combined with a 50/50 mixture of oxygen/nitrous oxide) anesthesia. The temporalis muscles were removed bilaterally, and 1.1- or 1.5-mm-diam holes were drilled approximately 1 mm through the skull in two semicircular patterns around the laterally located auditory cortices. Screws were inserted tightly into these holes for support, and dental cement was applied around the screws onto the entire exposed skull except for the regions immediately covering auditory cortex. These bare regions of skull were later protected by covering them with a pliable, easily removable polyvinylsiloxane dental impression compound (Kerr). Two stainless steel headposts were also affixed into the dental cement for later immobilization of the head. Following surgery, the animal was monitored closely and administered regular doses of antibiotics and pain relievers during the recovery period, normally 10-14 days.

During a neurophysiological recording experiment, the animal sat in the primate chair with its body minimally restrained and its head immobilized by a stainless-steel arm attached to the skull-mounted restraining headposts. The protecting layer of impression compound over the animal's skull was removed, and a small hole (approximately 1 mm diam) was drilled through 80-90% of the skull thickness with a custom drill mounted on a commercial micromanipulator (SM-11, Narishige). The remaining bone was removed by hand with sterile, custom microsurgical instruments under 40× magnification on a surgical microscope (Carl Zeiss). Anesthetics were not required during the drilling because no soft tissue was manipulated. A sterile electrode was then attached to a hydraulic microdrive (Trent-Wells) mounted on the micromanipulator and carefully lowered through the hole until it rested on the dura. This hydraulic microdrive was operated by the experimenter from outside the chamber to alter the electrode depth during experimentation. The preparation was adapted from a similar technique developed to record from mustached bat (Pteronotus parnellii parnellii) auditory cortex without exposing excessive amounts of neural tissue (Suga 1965a,b). The recordings achieved with this method were typically stable over several hours of recording time, allowing for the accumulation of large amounts of data from individual neurons. At the end of each day of recording, the exposed hole was filled with an antibiotic compound and the skull sealed over again with impression compound. At the end of a week of recording, a small amount of dental cement was placed into the hole to seal it permanently for infection control, stabilization of later recordings, and retardation of extraneous soft tissue growth.

Stimulus generation

Two tones were generated simultaneously using custom software, and all their individual parameters were varied independently. Each tone was computed from
$$Y(t) = A_0 A(t) \sin\left[2\pi f_c t + \phi_c(t)\right] \tag{1}$$
where
$$A(t) = \begin{cases} \tfrac{1}{2} M_{AM} - \tfrac{1}{2} M_{AM} \cos(2\pi f_{AM} t + \phi_{AM}) & \text{for AM} \\ 1 & \text{for FM} \end{cases}$$

$$\phi_c(t) = \begin{cases} 0 & \text{for AM} \\ M_{FM} \dfrac{f_c}{f_{FM}} - M_{FM} \dfrac{f_c}{f_{FM}} \cos(2\pi f_{FM} t + \phi_{FM}) & \text{for FM} \end{cases}$$

$$t \in [t_0, t_1] \tag{2}$$
The sinusoidal AM $A(t)$ was governed by three parameters: $M_{AM}$, the AM depth, which could take on values between 0 and 1; $f_{AM}$, the AM frequency; and $\phi_{AM}$, the AM phase. The phase was defined such that when $\phi_{AM} = 0$ and $t_0 = 0$, $A(t_0) = 0$, indicating a cosine function. The sinusoidal FM $d\phi_c/dt$ was governed by three parameters: $M_{FM}$, the FM depth as a fraction of carrier frequency $f_c$, which could take on values between 0 and 1; $f_{FM}$, the modulation frequency of the FM; and $\phi_{FM}$, the modulation phase of the FM. The phase was defined such that when $\phi_{FM} = 0$ and $t_0 = 0$, $\phi_c(t_0) = 0$, indicating a sine function for the FM. Note that the form $\phi_c(t)$ took was that of phase modulation, which was derived from the desired FM as follows
$$\phi_c(t) = M_{FM}\, 2\pi f_c \int \sin(2\pi f_{FM} t + \phi_{FM})\, dt = -M_{FM} \frac{2\pi f_c}{2\pi f_{FM}} \cos(2\pi f_{FM} t + \phi_{FM}) + K_{FM} \tag{3}$$
where $K_{FM}$ represents the constant of integration. To have $\phi_c(t_0) = 0$ when $\phi_{FM} = 0$ and $t_0 = 0$, $K_{FM}$ must equal $M_{FM} f_c/f_{FM}$. Each tone therefore had 10 free parameters that could be varied: $A_0$, $f_c$, $t_0$, $t_1$, $M_{AM}$, $f_{AM}$, $\phi_{AM}$, $M_{FM}$, $f_{FM}$, and $\phi_{FM}$. Note that no random carrier phase was incorporated into the tones. Relative modulation phase was defined as the AM or FM phase of the secondary (f2) tone at t = 0 minus the corresponding phase of the primary (f1 or CF) tone at t = 0. This definition allowed relative modulation phase values to be meaningful even when the modulation frequencies of the two tones differed. Examples of two tones with equal modulation frequencies but different relative modulation phase values (hence, different temporal coherence) are shown in Fig. 1, C and D. Linear onset and offset ramps of 10 ms were added to all stimuli used in these experiments.
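The tone definition in Eqs. 1-3 can be illustrated with a minimal Python sketch. This is not the authors' custom software; the function names, the default amplitude `a0=0.5`, and the choice to give the primary tone a modulation phase of 0 are assumptions made for illustration. The 100,000 samples/s rate is the value reported under Physiological recordings, and the 10-ms linear ramps are omitted for brevity.

```python
import math

def envelope_am(t, m_am, f_am, phi_am=0.0):
    """A(t) for the AM case of Eq. 2."""
    return 0.5 * m_am - 0.5 * m_am * math.cos(2 * math.pi * f_am * t + phi_am)

def phase_fm(t, m_fm, f_c, f_fm, phi_fm=0.0):
    """phi_c(t) for the FM case of Eq. 2, with K_FM = M_FM * f_c / f_FM."""
    k_fm = m_fm * f_c / f_fm
    return k_fm - k_fm * math.cos(2 * math.pi * f_fm * t + phi_fm)

def tone_sample(t, a0, f_c, mode, depth, f_mod, phi_mod=0.0):
    """One sample of Y(t) from Eq. 1 for an AM or FM tone."""
    if mode == "AM":
        amp, phi_c = envelope_am(t, depth, f_mod, phi_mod), 0.0
    elif mode == "FM":
        amp, phi_c = 1.0, phase_fm(t, depth, f_c, f_mod, phi_mod)
    else:
        raise ValueError("mode must be 'AM' or 'FM'")
    return a0 * amp * math.sin(2 * math.pi * f_c * t + phi_c)

def two_tone_stimulus(f1, f2, mode, depth, f_mod, phi_rel_deg,
                      duration, fs=100_000, a0=0.5):
    """Sum of a primary tone (f1, phase 0) and a secondary tone (f2, phase phi_rel).

    a0 and the zero phase of the primary tone are illustrative assumptions;
    onset/offset ramps are not applied here.
    """
    phi_rel = math.radians(phi_rel_deg)
    n_samples = int(round(duration * fs))
    return [
        tone_sample(i / fs, a0, f1, mode, depth, f_mod, 0.0)
        + tone_sample(i / fs, a0, f2, mode, depth, f_mod, phi_rel)
        for i in range(n_samples)
    ]
```

Setting `phi_rel_deg=0` gives the coherent condition and `phi_rel_deg=180` the incoherent condition for two tones sharing one modulation frequency.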

Physiological recordings

The extracellular tungsten microelectrodes (3-5 MΩ, A-M Systems; 1-3 MΩ, Uwe Thomas Recording) allowed for reliable single-unit isolation (>40 dB SNR possible; typical SNR > 30 dB; see Fig. 2). In all cases, action potentials were continuously monitored during each recording session and sorted on-line by template-matching digital signal processing (DSP) software (Alpha-Omega Engineering). The spike timing information was logged in a Pentium-based personal computer. Electrode voltage traces were also digitized periodically for confirmation of on-line spike sorting. Well-isolated single units were rigorously sought out, and data were not collected under poor signal conditions.



Fig. 2. Voltage traces of a typical unit stimulated with 2 modulated tones (TMT) at various relative modulation phase values. Expanded time scale on right shows spike waveform for 2 action potentials, marked by rectangles on the traces. Shaded area indicates stimulus duration.

Custom software running in MatLab on the same personal computer generated all sound stimuli digitally at 100,000 samples/s, which were then low-pass filtered at 50 kHz, fed into two serially linked TDT PA4 attenuator modules (each of which was set to 1/2 the desired overall attenuation), and passed into a power amplifier (Crown) and finally into the recording chamber. A value of 0 dB attenuation represents the most intense sound deliverable at a particular amplifier setting [approximately 93 dB sound pressure level (SPL) for pure tones at 1 kHz in these experiments]. The stimuli were all delivered in free-field through a single two-way crossover, open bass reflex loudspeaker (B&W 601) located 70 cm in front of the animal's head. The speaker had ±4 dB passband ripple from 100 Hz to 36 kHz, which encompasses the hearing range of marmosets (Seiden 1957). Speaker transfer function was measured using pure tones stepped in one-twelfth-octave increments and recorded using a Brüel and Kjær condenser microphone placed in the empty primate chair at the location of the animal's head. Both the loudspeaker and the primate chair were located within a double-walled acoustic chamber (IAC-1024, Industrial Acoustics), whose interior was lined with three-inch acoustic foam (Sonex, Illbruck).

All stimuli were presented pseudorandomly for multiple repetitions (usually 5-10). Spontaneous firing rates (Rsp) were determined from the spiking during the silent periods preceding the stimuli (usually 200-500 ms long). Stimuli were 500-1,000 ms in duration, and successive stimuli were always separated by ≥1 s of silence, usually more. The interstimulus interval was adjusted to longer values for units with clear activity/suppression long after stimulus offset.

Units were sampled from all cortical layers but predominantly from supragranular layers, as judged by recording depth and response properties. Primary auditory cortex was located stereotactically and confirmed by its short-latency, tone-responsive units and its tonotopic map, which was determined from electrode penetrations made through closely spaced holes in the skull. When all desired physiological experiments for an animal had been conducted, electrolytic lesions and fluorescent dye injections were made at various sites around auditory cortex. The animal was then deeply anesthetized with Nembutal, euthanized and perfused with formalin to preserve the brain tissue. Serial sectioning and staining, in conjunction with the experimental record, can reveal the electrode tracks, thereby pinpointing the recording sites.

A total of 84 single units isolated in the primary auditory cortex (A1) of two awake marmoset monkeys (Callithrix jacchus) was analyzed in this study. The units were selected for their sustained response to modulated tones, a category that constitutes the majority of units encountered in A1 (Liang et al. 2002). Responsive units were located with a standardized search routine while advancing the microelectrode through cortical tissue orthogonally to the surface. Once a tone-responsive unit was identified, its tunings to tone carrier frequency and SPL were determined. Frequency response functions (FRF; spike rate in response to a single pure tone fixed in amplitude and varied in carrier frequency) were taken at or near lowest response threshold, and rate-level response functions were measured at characteristic frequency (CF), as determined by the peak of the threshold FRF. Units were also characterized with respect to single tone sinusoidal AM at the CF, yielding a modulation transfer function (MTF). Some units were instead characterized by FM tones, and the choice between the two types of modulation was made based on which type of stimulus elicited more spikes from the unit. In the case of the few units with no sustained response to pure tones, frequency and level tunings were determined using the modulated stimuli.

Once a responsive unit's basic excitatory properties were defined, it was often probed with two simultaneous pure tones for its inhibitory properties. Because most auditory cortical units spontaneously discharge at very low rates, an FRF typically reveals little inhibition, thereby requiring the use of one tone at CF to bias the excitatory response and a second tone to probe for inhibition (Brosch and Schreiner 1997; Shamma and Symmes 1985; Suga 1965b; Sutter et al. 1999). A primary tone (f1) was always delivered at CF in such protocols while a secondary tone (f2) was varied in carrier frequency around CF to generate a two-tone FRF (TTFRF). The f1 tone was delivered with a SPL near threshold, but intense enough to drive the unit consistently (often the peak of the rate-level function); the f2 tone was delivered at the same amplitude or up to 20 dB (in rare cases 40 dB) more intense as necessary to reveal inhibition. Density of sampling ranged from 10 to 40 steps per octave, depending on narrowness of tuning. A similar process generated a two-modulated-tone FRF (TMT FRF) for the coherent case (relative modulation phase or phi rel = 0°) and the incoherent case (phi rel = 180°). Modulation frequencies were chosen that reliably drove the units (e.g., peak of the rate modulation transfer function), and both tones were modulated at the same modulation frequency unless otherwise noted. Finally, secondary frequency regions showing apparent coherence sensitivity were probed more closely by fixing both f1 and f2 carrier frequencies and varying phi rel in steps of either 22.5° or 45°. This process was often repeated at many f2 frequencies. Not all stimulus conditions could be studied in all units because of limited recording time.
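The sampling grids described in this protocol can be sketched as follows. The function names, the octave span around CF, and the decision to cover phases from 0° up to but excluding 360° are assumptions for illustration; the text specifies only the sampling density (10 to 40 steps per octave) and the phase step (22.5° or 45°).

```python
def f2_frequencies(cf, octaves_below, octaves_above, steps_per_octave):
    """Secondary-tone carrier frequencies, log-spaced around CF.

    The span (octaves_below, octaves_above) is a hypothetical parameter;
    the source gives only the density of sampling.
    """
    lowest = -round(octaves_below * steps_per_octave)
    highest = round(octaves_above * steps_per_octave)
    return [cf * 2 ** (k / steps_per_octave) for k in range(lowest, highest + 1)]

def phi_rel_sweep(step_deg):
    """Relative modulation phases in degrees, assuming 0 <= phi_rel < 360."""
    n_steps = int(round(360 / step_deg))
    return [k * step_deg for k in range(n_steps)]

# Example: one octave on either side of a hypothetical 8-kHz CF, 10 steps/octave
carriers = f2_frequencies(8000.0, 1, 1, 10)
phases = phi_rel_sweep(22.5)
```

Under these assumptions a 22.5° step yields 16 phase conditions per f2 carrier frequency, and a 45° step yields 8.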

Data analysis

Spike trains were converted into average discharge rate (or just "rate") measures over a fixed time window beginning at stimulus onset and ending 50 ms following stimulus offset. Because all the neurons tested for coherence sensitivity exhibited sustained spiking patterns, limiting the time window to the beginning, middle, or end portions of the stimulus did not substantially alter any of the population measures. No systematic variations in synchronous spiking (i.e., vector strength) as a function of relative modulation phase were observed or found to be statistically significant across the population of neurons studied (P > 0.05, Pearson's correlation test); hence, analysis was limited to spike rates. For the purposes of this work, the rate modulation transfer function (rateMTF) was defined as the number of spikes elicited by each of a series of tones modulated at different frequencies divided by the duration of the stimuli. This definition is identical to all other rate functions (e.g., the rate-level function) and provides no information about the temporal structure of the spiking. The synchronization modulation transfer function (syncMTF) was defined here to be the vector strength as a function of modulation frequency. Vector strength was defined classically, and significance was evaluated using a Rayleigh test (P < 0.001)
$$VS = \frac{1}{n} \sqrt{\left(\sum_{i=1}^{n} \cos(2\pi t_i/T)\right)^2 + \left(\sum_{i=1}^{n} \sin(2\pi t_i/T)\right)^2} \tag{4}$$

$$RS = 2n(VS)^2$$
where VS is the vector strength, n is the total number of spikes in the analysis window, $t_i$ is the time of occurrence of the ith spike, T is the reciprocal of the modulation frequency, and RS is the Rayleigh statistic, which takes on values above 13.8 for significance levels of P < 0.001 (Goldberg and Brown 1968, 1969; Mardia and Jupp 2000). These MTF definitions have been used previously to analyze the spiking patterns of cortical neurons in response to modulated tones (Gaese and Ostwald 1995).
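As an illustration of Eq. 4, the following Python sketch computes vector strength and the Rayleigh statistic from a list of spike times. The function names are hypothetical, spike times are assumed to be in seconds, and synchronization is treated as significant at P < 0.001 when RS exceeds 13.8, the standard Rayleigh criterion.

```python
import math

def vector_strength(spike_times, f_mod):
    """VS from Eq. 4; spike_times in seconds, f_mod in Hz."""
    n = len(spike_times)
    if n == 0:
        return 0.0  # no spikes: treat synchronization as absent
    period = 1.0 / f_mod  # T, the reciprocal of the modulation frequency
    cos_sum = sum(math.cos(2 * math.pi * t / period) for t in spike_times)
    sin_sum = sum(math.sin(2 * math.pi * t / period) for t in spike_times)
    return math.hypot(cos_sum, sin_sum) / n

def rayleigh_statistic(spike_times, f_mod):
    """RS = 2 n VS^2."""
    return 2 * len(spike_times) * vector_strength(spike_times, f_mod) ** 2

def is_synchronized(spike_times, f_mod, criterion=13.8):
    """True when RS exceeds the P < 0.001 criterion."""
    return rayleigh_statistic(spike_times, f_mod) > criterion

# Example: 8 spikes locked to the same phase of a 16-Hz modulation
locked = [k / 16.0 for k in range(8)]
vs = vector_strength(locked, 16.0)
```

Because RS scales with the spike count n, a perfectly locked train needs at least 7 spikes (RS = 14) to pass the 13.8 criterion.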

For population summary of phase responses, units were determined to be sensitive to phi rel by statistical analysis of the phi rel sweep rate curve having the largest range of values. The null hypothesis (i.e., all values of phi rel elicited the same firing rates) was rejected if P < 0.05 in a one-way ANOVA of rates for all phase values tested. Rate curves as a function of phase were three-point-triangle smoothed for determination of the phase at minimum (Rmin) and maximum (Rmax) rate responses, although plotted data represent actual, unsmoothed rates. Carrier frequency-dependent spike measures were sampled at 20 frequencies/octave for common comparison. An inhibition index (II) was computed at each f2 carrier frequency as the two-tone frequency response function rate at CF minus the rate at the f2 carrier frequency, normalized by the rate at CF
$$II = \frac{TTFRF(CF) - TTFRF(f_2)}{TTFRF(CF)} \tag{5}$$
This index was computed for each unit for which TTFRF data were collected, and the population median was used to determine the carrier frequencies of flanking inhibition. A normalized coherence sensitivity index (CSI) was computed at each f2 carrier frequency as the two-modulated-tone frequency response function rate at the f2 carrier frequency and phi rel = 0° minus the rate at f2 and 180°, normalized by the rate at CF and 0°
$$CSI = \frac{TMTFRF(f_2, 0°) - TMTFRF(f_2, 180°)}{TMTFRF(CF, 0°)} \tag{6}$$
This index was computed for each unit for which AM TMTFRF data were collected, and the population median was used to determine the carrier frequencies of maximal temporal coherence sensitivity. The CSI values at ±1/4 octave were compared with the values at CF by a Wilcoxon signed-rank test and a sign test (see RESULTS). For neurons tested at multiple matched modulation frequencies (i.e., fmod1 = fmod2), an adjusted inverse CSI was computed by taking the CSI for each modulation frequency, multiplying by -1, and dividing it by the largest value for that neuron. Large values of the adjusted inverse CSI reflect coherence sensitivity.
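The two indices and the adjusted inverse CSI can be written out as a short Python sketch. The function names and the dictionary layout (CSI keyed by modulation frequency in Hz) are assumptions for illustration. The text does not say whether "the largest value" is taken before or after the sign inversion; this sketch assumes it is the largest of the sign-inverted values, so the most coherence-sensitive modulation frequency maps to 1.

```python
def inhibition_index(ttfrf_cf, ttfrf_f2):
    """II from Eq. 5: fractional rate reduction at f2 relative to CF."""
    return (ttfrf_cf - ttfrf_f2) / ttfrf_cf

def coherence_sensitivity_index(rate_f2_0deg, rate_f2_180deg, rate_cf_0deg):
    """CSI from Eq. 6; negative when the coherent condition is more suppressed."""
    return (rate_f2_0deg - rate_f2_180deg) / rate_cf_0deg

def adjusted_inverse_csi(csi_by_fmod):
    """Multiply each CSI by -1 and divide by the largest inverted value.

    Normalizing by the largest sign-inverted value is an assumption
    about an ambiguous step in the text.
    """
    inverted = {f_mod: -csi for f_mod, csi in csi_by_fmod.items()}
    largest = max(inverted.values())
    if largest == 0:
        raise ValueError("cannot normalize: largest inverted CSI is zero")
    return {f_mod: value / largest for f_mod, value in inverted.items()}

# Example with made-up rates (spikes/s), not data from the study
ii = inhibition_index(40.0, 10.0)                       # 0.75
csi = coherence_sensitivity_index(10.0, 30.0, 40.0)     # -0.5
adjusted = adjusted_inverse_csi({16: -0.5, 64: -0.25})  # {16: 1.0, 64: 0.5}
```

With this sign convention, a neuron whose rate is lower for coherent than for incoherent stimuli has a negative CSI and a positive adjusted inverse CSI.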


    RESULTS

Temporal coherence sensitivity measured by sinusoidal AM

Stimulus parameters useful for testing temporal coherence sensitivity were determined on a neuron-by-neuron basis. Figure 3 shows the behavior of one representative neuron in response to a series of protocols designed to measure coherence sensitivity in response to AM tones. First, the carrier frequency, sound level, and modulation frequency ranges eliciting the greatest discharge rate from the neuron were determined by systematic stimulus sweeps of a single tone or modulated tone. Figure 3, A and B, demonstrates the resulting spiking patterns, as well as two separate measures of response to amplitude modulated tones: discharge rate and vector strength. Arrows in these plots indicate parameter values used for the two-tone and/or two-modulated-tone protocols.



Fig. 3. Auditory cortex neurons show stereotypical responses to variations in relative AM phase. A: raster plots of tone frequency response functions (FRF) and rate-level function of 1 representative neuron. Shaded areas on all raster plots represent stimulus duration; dark or colored dots represent spikes that fall within the rate analysis window. Arrows indicate parameter values used later for 2-tone and TMT protocols. B: AM transfer functions at a modulation depth of 1 (100%), showing both mean discharge rate over the analysis window (black) and vector strength (gray, significant values filled). Horizontal lines on all rate plots show spontaneous discharge rate; error bars show SE. C: plotted to the same scale are the tone FRF (orange); a fixed, unmodulated f1 with an unmodulated f2 varied in carrier frequency (violet); a fixed, AM f1 with a coherent AM f2 (phi rel = 0°) varied in carrier frequency (green); and a fixed, AM f1 with an incoherent AM f2 (phi rel = 180°) varied in carrier frequency (blue). Some lower carrier frequencies were released from inhibition by incoherent vs. coherent stimuli. D: raster plots for phi rel = 0° and phi rel = 180° TMT FRFs in C. E: response of the neuron when f1 was fixed at characteristic frequency (CF), f2 was fixed at the carrier frequency indicated by arrows in C and D, and phi rel was altered in 22.5° increments: raster plot of spikes (left) and rate/vector strength curves (right). This relative modulation phase tuning function shows minimal rate responses (black) in the coherent condition with minor changes in vector strength as a function of phase (gray).

When a primary tone (unmodulated) was fixed at the carrier frequency and at sound level values indicated in Fig. 3A, a secondary unmodulated tone varied in carrier frequency revealed flanking inhibition (Fig. 3C, violet curve), the classical two-tone response. When both of the tones were modulated at the same modulation frequency (16 Hz, see Fig. 3B) with a relative modulation phase of 0°, similar flanking inhibition was revealed (Fig. 3, C and D, green curve). If, however, both tones were modulated at the same modulation frequency but with a relative modulation phase of 180°, the outer portion of the lower inhibitory flank showed a markedly increased discharge rate (Fig. 3, C and D, blue curve).

This phenomenon was explored in more detail by fixing the primary tone as before and fixing the secondary tone carrier frequency while varying the relative modulation phase. This process was repeated stepwise across many secondary tone carrier frequencies. Figure 3E depicts the result of this procedure at the secondary carrier frequency indicated by arrows in Fig. 3, C and D. This frequency was chosen for demonstration because it shows the effect of interest and because it lies relatively far from CF, thereby eliminating the possibility that the AM sidebands would excite the neuron. This neuron showed a clear minimum rate response (maximum of suppression) for coherent (phi rel = 0°) stimuli, the most common response type observed in this study. Some changes in vector strength could be seen (Fig. 3E, right), but alteration in temporal spiking was rare throughout the population of neurons studied.

Figure 4 shows another example neuron studied with similar AM protocols. Just as this neuron has a carrier frequency and modulation frequency range of maximal excitatory response different from that of the previous neuron (Fig. 4, A and B), its carrier frequency range of incoherent release from inhibition differs as well (Fig. 4, C and D). These two neurons are representative of the range of flanking carrier frequencies that reveal release from inhibition under temporally incoherent AM conditions: carrier frequencies both above and below CF, on both the inner and outer inhibitory flanks can show the phenomenon. Neurons may also show release from inhibition at all flanking frequencies or none, indicating considerable diversity in the nature of flanking inputs to A1 neurons. Figure 4E shows again that large phase-induced rate changes are typically unaccompanied by corresponding changes in vector strength.
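
The vector strength values referred to above can be illustrated with a short sketch. It uses the standard textbook definition (resultant length of spike phases relative to the modulation period) and the common Rayleigh approximation p ≈ exp(-n·VS²); the function name, the argument names, and the assumption that spike times are measured from modulation onset are illustrative and are not taken from this paper's METHODS, which may differ in detail.

```python
import math

def vector_strength(spike_times_s, mod_freq_hz):
    """Return (vector strength, Rayleigh statistic 2*n*VS^2, approximate p-value).

    Assumes spike times are in seconds relative to modulation onset
    (a hypothetical convention for this sketch).
    """
    n = len(spike_times_s)
    if n == 0:
        return 0.0, 0.0, 1.0
    # Convert each spike time to a phase on the modulation cycle.
    phases = [2.0 * math.pi * mod_freq_hz * t for t in spike_times_s]
    c = sum(math.cos(p) for p in phases)
    s = sum(math.sin(p) for p in phases)
    vs = math.hypot(c, s) / n          # 1 = perfect locking, 0 = none
    rayleigh = 2.0 * n * vs * vs       # values > 13.8 correspond to p < 0.001
    p_approx = math.exp(-n * vs * vs)  # large-sample approximation
    return vs, rayleigh, p_approx

# Spikes locked to one phase of a 16-Hz modulation vs. spikes spread evenly over the cycle.
locked = [k / 16.0 for k in range(20)]
spread = [k / 128.0 for k in range(8)]
print(vector_strength(locked, 16.0))
print(vector_strength(spread, 16.0))
```

A neuron can change its mean discharge rate substantially across relative modulation phases while this measure stays nearly constant, which is the pattern described for Figs. 3E and 4E.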



Fig. 4. Another example revealing the diversity of responses to relative AM phase. Format is identical to that of Fig. 3. A: tone FRF and rate-level function. B: AM transfer functions at a modulation depth of 1. C: tone FRF (orange), 2-tone FRF (violet), 2-AM-tone FRF at 0° (green), and 2-AM-tone FRF at 180° (blue). Greatest incoherent flanking release from inhibition occurs at carrier frequencies immediately above CF. D: raster plots for the 0° and 180° TMT FRFs from C. E: phase tuning function with f1 fixed at CF and f2 fixed at the carrier frequency indicated by the arrows in C and D. Steps in phi rel were made in 45° increments.

Temporal coherence sensitivity measured by sinusoidal FM

Concurrently delivered AM tones constitute a spectrally compact stimulus useful for exploring issues of temporal coherence sensitivity. Auditory cortex neurons also respond in sustained fashion to FM stimuli, so the question of FM-induced temporal coherence sensitivity may be addressed with a similar protocol. Figure 5 shows a neuron that exhibited a mild flanking release from inhibition for temporally coherent FM. As in the AM case, changes in relative modulation phase between two FM tones can modify the inhibition evident in the rate response of A1 neurons.



Fig. 5. Example of responses to variations in relative FM phase. Format is identical to that of Fig. 3. A: FM tone FRF and rate-level function. This neuron does not respond to pure tones. B: FM transfer functions at 10% modulation depth. C: FM tone FRF (orange), 2-FM-tone FRF at 0° (green), and 2-FM-tone FRF at 180° (blue). Some flanking inhibition is released under coherent stimulus conditions for this neuron. D: raster plots for the 0° and 180° TMT FRFs in C. E: phase tuning function with f1 fixed at CF and f2 fixed at the carrier frequency indicated by the arrows in C and D. Steps in phi rel were made in 45° increments.

The similarities between AM and FM temporal coherence sensitivity in individual neurons can be seen most easily by comparing the relative modulation phase tuning uncovered by the protocol used in Fig. 3E. Figure 6 shows the relative modulation phase tuning responses of several neurons tested with either AM or FM tones. While the stimulus parameters at which coherence sensitivity exists can be seen to differ from neuron to neuron, the phase responses generally show either coherent suppression (Fig. 6, top and middle) or incoherent suppression (Fig. 6, bottom). The main differences between AM and FM become apparent when population responses are considered and will be discussed in a later section. Again, no systematic alterations in the vector strength of the spiking patterns were observed across the population of neurons studied---the predominant effect seems to be a strong modification of discharge rate.



Fig. 6. Additional examples of relative modulation phase tuning functions for AM (left) and FM (right). Variation in discharge rates can be quite large, but no systematic alterations in vector strength have been observed for the neurons studied. Carrier frequencies, attenuations, and modulation frequencies used are indicated above each plot. Incoherence discharge (top and middle) represents the most common response observed; coherence discharge (bottom) is less common.

Variety of temporal coherence sensitivity

Examples of neurons tested for temporal coherence sensitivity at many carrier frequencies are shown in Fig. 7 for AM and Fig. 8 for FM. These three-dimensional plots indicate the discharge rate of the neuron as a function of both secondary carrier frequency and relative modulation phase. The varying degree of inhibition release is evident in these examples. To aid the visualization of the coherence effect, the TMT FRF at 0° of relative modulation phase is projected through all phase values for each example (Fig. 7, right), creating a hypothetical response showing no coherence sensitivity. Figure 7, A and B, show neurons similar to the example in Fig. 3. Flanking inhibition in these neurons lessens under temporally incoherent stimulus conditions. Both inhibitory flanks in these neurons seem to be sensitive, although to varying degrees.
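
The projection described above (repeating the 0° TMT FRF at every phase to obtain a hypothetical surface with no coherence sensitivity) can be sketched as follows. The data layout (one row of rates per relative modulation phase, one column per f2 carrier frequency) and the function names are assumptions made for illustration only.

```python
def project_zero_phase(phases_deg, rates):
    """Repeat the 0-degree row of `rates` for every phase value.

    rates[i][j] is the discharge rate at phase phases_deg[i] and the
    j-th secondary carrier frequency (hypothetical layout).
    """
    zero_row = rates[phases_deg.index(0)]
    return [list(zero_row) for _ in phases_deg]

def coherence_effect(phases_deg, rates):
    """Actual surface minus the projected surface; zero everywhere
    means no dependence of rate on relative modulation phase."""
    projected = project_zero_phase(phases_deg, rates)
    return [[a - p for a, p in zip(actual_row, proj_row)]
            for actual_row, proj_row in zip(rates, projected)]

# Toy example: the middle carrier frequency is released from inhibition as phase departs from 0.
phases = [0, 90, 180]
rates = [[10, 2, 10],
         [10, 5, 10],
         [10, 9, 10]]
print(coherence_effect(phases, rates))
```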



Fig. 7. Temporal coherence measures for AM at multiple carrier frequency and relative modulation phase combinations. Actual data (left) are compared with the data at 0° projected through all phase values for visual comparison. Parameters in parentheses are f1 tone sound level in dB attenuation, f2 tone sound level in dB attenuation, modulation frequency, and modulation depth. A: flanking release from inhibition at both high and low frequencies (70 dB, 50 dB, 32 Hz, 1). B: these examples show different degrees of incoherent flanking release from inhibition at both high and low frequencies (60 dB, 40 dB, 64 Hz, 1). C: example showing incoherent on-CF suppression and incoherent release from inhibition at a flanking frequency away from CF (40 dB, 40 dB, 8 Hz, 1). Incoherent on-CF suppression was rarely found for AM. D: example showing no coherence-dependent inhibitory properties (60 dB, 40 dB, 128 Hz, 1).



Fig. 8. Temporal coherence measures for FM at multiple carrier frequency and relative modulation phase combinations. Format is identical to that of Fig. 7. A: example showing strong incoherent on-CF suppression (30 dB, 30 dB, 32 Hz, 0.005). This type of response is much more common for neurons tested with FM than with AM. B: example showing incoherent release from inhibition below CF and biphasic response at high frequencies (70 dB, 70 dB, 32 Hz, 0.4).

Figure 7C shows a mixed effect for the three inhibitory flanking regions. The inhibition nearest CF persists at all relative modulation phase values, and on-CF suppression becomes apparent at incoherent phase values. The far flanking region shows a mild release from inhibition for incoherent phase values. Figure 7D depicts a neuron displaying little temporal coherence sensitivity at any secondary carrier frequency for the stimulus parameters tested.

Figure 8A shows a neuron with incoherent FM inhibition around CF, similar to that depicted for AM in Fig. 7C. Figure 8B shows a more complex response, with an incoherent release from inhibition just below CF (as in Fig. 7, A and B) and a biphasic phase tuning response (troughs at both 0° and 180°) at high carrier frequencies. Such responses were never observed for two tones amplitude modulated at the same modulation frequency and probably reflect differential sensitivity to the upward and downward FM inherent in sinusoidal modulation (Liang et al. 2002).

Two main categories of phase tuning functions

As mentioned previously, the relative modulation phase tuning functions depicted in Fig. 6 appeared to constitute two distinct categories. From all the phase tuning functions collected for a neuron that demonstrated a dependence of rate on relative modulation phase (1-way ANOVA of rates, P < 0.05), the one with the largest difference between maximum and minimum rates (Rmax - Rmin) was used to characterize that neuron. Some neurons showed no significant alteration of rate at any phase tested. The phases of minimum and maximum response (phi min and phi max) for the significant phase sweeps are plotted against the percent decrease from maximum, computed by (Rmax - Rmin)/(Rmax - Rspont) × 100, in Fig. 9A. The median discharge rate differential (Rmax - Rmin) was 15 spikes/s; the range, [2, 53]. Distributions of the phases of minimum and maximum responses are shown in Fig. 9, B and C. Neurons tested with AM show a clear bias toward response minima (maximum suppression) at phases near 0° and response maxima at phases near 180°. While fewer neurons were tested with FM, a higher overall percentage showed significant phase effects (77% or 17/22 vs. 53% or 31/58 for AM), and response minima tended to be fairly evenly divided between 0° and 180°. The differences in phase distributions and relative proportions of neurons showing detectable effects constitute the most obvious distinctions between data gathered with AM versus FM.
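
As a minimal sketch of the characterization described above, the following computes Rmax - Rmin, phi min, phi max, and the percent decrease from maximum, (Rmax - Rmin)/(Rmax - Rspont) × 100, for one phase tuning function. The function and argument names are hypothetical, the rates are made-up numbers, and the 1-way ANOVA screening step is not reproduced here.

```python
def characterize_phase_tuning(phases_deg, rates, r_spont):
    """Summarize one relative-modulation-phase tuning function.

    phases_deg: relative modulation phases tested (degrees)
    rates:      mean discharge rate at each phase (spikes/s)
    r_spont:    spontaneous discharge rate (spikes/s)
    """
    r_max, r_min = max(rates), min(rates)
    denom = r_max - r_spont
    # Undefined when the maximum response does not exceed the spontaneous rate.
    pct = (r_max - r_min) / denom * 100.0 if denom > 0 else float("nan")
    return {
        "rate_differential": r_max - r_min,
        "phi_max": phases_deg[rates.index(r_max)],
        "phi_min": phases_deg[rates.index(r_min)],
        "percent_decrease": pct,
    }

phases = [0, 45, 90, 135, 180, 225, 270, 315]
rates = [5, 8, 14, 20, 25, 19, 13, 9]   # illustrative values only
print(characterize_phase_tuning(phases, rates, r_spont=5))
print(characterize_phase_tuning(phases, rates, r_spont=10))
```

In the second call the minimum rate falls below the spontaneous rate, so the percent decrease exceeds 100%, matching the convention stated in the legend of Fig. 9.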



Fig. 9. Relative modulation phase tunings reveal clustering of responses. A: plotted for each neuron is the greatest percent decrease from maximum discharge rate elicited by relative modulation phase [i.e., (Rmax - Rmin)/(Rmax - Rspont) × 100]. Suppression beyond spontaneous firing yields values >100%. The greatest decrease is determined over all f2 carrier frequencies tested and is plotted against the phases at which the minimum and maximum rates (phi min and phi max) occur. Significant AM (black) and FM (gray) data are shown. B: histogram of the phi min values is shown for AM (black), FM (gray), and nonsignificant (light gray) data. C: histogram as in B for phi max values.

Each phase tuning function was taken with the primary tone fixed at the neuron's CF and the secondary tone fixed at another carrier frequency. A scatterplot of CF versus secondary tone frequency where Rmax - Rmin was greatest for each neuron is shown in Fig. 10. Three observations can be made: 1) coherence sensitivity is found over a wide range of CFs and well beyond the physiological carrier frequencies of trill calls (5-7 kHz); 2) maximum coherence sensitivity is observed both above and below the CF; and 3) carrier frequencies of maximum coherence sensitivity are generally found near the CF. The range of CFs involved encompasses a large portion of the audible frequency range of marmosets (Seiden 1957) and the entire range of their vocalizations (Agamaite 1997; Agamaite and Wang 1997) but does not appear to be concentrated within any one frequency range.



Fig. 10. Carrier frequency distribution of relative modulation phase sensitivity. Scatterplot of f2 carrier frequencies generating the greatest percent decreases of Fig. 9A plotted against CF for each neuron. CFs of phase-sensitive neurons span a large portion of the audible frequencies, and flanking frequencies of maximum effect can be either above or below CF.

Average population temporal coherence sensitivity

The clustering of rate minima and maxima around AM phase values of 0° and 180° seen in Fig. 9 implies that the maximally coherent and incoherent stimulus conditions may adequately capture the temporal coherence properties revealed by AM tones. Figure 11A shows the median CSI as a function of secondary tone carrier frequency relative to CF for all the neurons tested with the two-AM-tone protocol depicted in Fig. 3C. Positive values of CSI indicate greater rates in response to 0° than to 180°; negative values indicate greater rates in response to 180° than to 0° (see METHODS). The maximum median CSI value lies at CF, indicating that two coherent AM tones with carrier frequencies very close or equal to CF elicit greater population rate responses than do two incoherent tones with the same carrier frequencies. The minimum median CSI values lie approximately ±1/4 octave from CF, approximately one critical band (Fletcher 1940; Greenwood 1961; Hamilton 1957). Negative CSI values indicate incoherent release from inhibition, and the frequency range of low median CSI corresponds closely with high values of the median inhibition index (II) as measured by two pure tones (see METHODS).



Fig. 11. Coherence sensitivity of all neurons tested with AM. A: plot of median coherence sensitivity index (CSI) as a function of carrier frequency relative to CF (black). Region of greatest decrease in median CSI from the median at CF corresponds to flanking frequencies most commonly associated with inhibition, as indicated by the median inhibition index (II; gray). B: CSI for each neuron in the data set at CF and CF ± 1/4 octave (approximately 1 critical band). A significantly greater number of neurons show flanking CSIs decreased from CF (black) than increased (gray). Most CSI values lie in the range [-1, 1], indicating inhibitory effects; large positive or negative values of CSI typically result from poor normalization.

Figure 11B depicts all of the AM CSI data for CF and CF ±1/4 octave. The greatest number of neurons (30/61) showed a bilateral flanking decrease of CSI relative to the value at CF (black), indicating flanking inhibition that was generally stronger for coherently modulated stimuli. A smaller number (13/61) showed a bilateral flanking increase of CSI relative to CF (gray), indicating either coherent release from flanking inhibition or a diminished response near CF resulting from incoherence. The remainder of the neurons (18/61) showed an asymmetric response.
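
The three response groups described above can be expressed as a small classification sketch. It takes already-computed CSI values as inputs (the CSI itself is defined in METHODS and is not reproduced here); the function names and the treatment of exact ties as "asymmetric" are assumptions made for illustration.

```python
def flank_frequencies(cf_hz, octaves=0.25):
    """Carrier frequencies `octaves` below and above CF (default ±1/4 octave)."""
    return cf_hz * 2.0 ** (-octaves), cf_hz * 2.0 ** octaves

def classify_flanks(csi_low, csi_cf, csi_high):
    """Compare CSI at CF - 1/4 octave and CF + 1/4 octave with CSI at CF."""
    if csi_low < csi_cf and csi_high < csi_cf:
        return "bilateral decrease"   # flanking inhibition stronger for coherent stimuli
    if csi_low > csi_cf and csi_high > csi_cf:
        return "bilateral increase"
    return "asymmetric"               # includes ties (assumption of this sketch)

low_hz, high_hz = flank_frequencies(8000.0)
print(round(low_hz), round(high_hz))
print(classify_flanks(-0.4, 0.2, -0.3))
print(classify_flanks(0.5, 0.1, 0.3))
print(classify_flanks(-0.2, 0.1, 0.4))
```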

The median CSI values at ±1/4 octave significantly differed from the median value at CF (P < 0.01, Wilcoxon signed-rank test), and neurons with CSI decreases had significantly lower CSI values at ±1/4 octave than those with increases (P < 0.001, sign test), indicating that the overall AM population response was dominated by neurons demonstrating incoherent release from flanking inhibition. Results for FM (data not shown) look similar but were not statistically significant because of the smaller number of neurons tested with FM (n = 25).

One important result seen for both AM and FM derives from the CSI measure and can be seen in Fig. 11B for AM: most of the CSI values lie in the range [-1, 1], indicating that coherence sensitivity does not normally result in facilitating responses, which would yield CSI values >1 for coherent and < -1 for incoherent facilitation. Facilitation would be expected if latent off-CF excitatory inputs became active as a function of coherence. The CSI at ±1/4 octave was uncorrelated with carrier frequency, modulation frequency, sound level, discharge rate, or vector strength (P > 0.05, Pearson's correlation test).

Temporal coherence sensitivity persists across modulation frequency

Modulation frequencies were normally chosen to elicit large, sustained discharge rates from each neuron. A subset of neurons, however, was tested for coherence sensitivity over a range of modulation frequencies. Figure 12, A-E, shows several examples for AM where the relative modulation phase tuning maintained the same general form over a range of modulation frequencies (right, color-coded to reflect modulation frequencies indicated at left). The modulation frequency ranges exhibiting the greatest coherence sensitivity appeared to be unrelated to any particular features of the modulation transfer functions.



Fig. 12. A-E: examples of neurons tested for relative modulation phase sensitivity at several joint modulation frequencies. Colored plot symbols on the rate modulation transfer functions (left) indicate the modulation frequencies used for correspondingly colored phase tuning functions (right). Other stimulus parameters were chosen for the highest elicited spike rates. F: adjusted inverse CSI values (see text) for each neuron tested at multiple modulation frequencies (left) and the mean values (right). Values for individual neurons are connected by lines; black lines represent the examples in A-E (computed from 0° and 180° values indicated by colored symbols in the phase tuning plots). Mean population response profile reveals a band-pass nature of coherence sensitivity with respect to modulation frequency.

While modulation frequencies showing strong coherence sensitivity are unique to each neuron, a population trend can be seen in Fig. 12F. Plotted are the adjusted inverse CSI values (see METHODS) for the neurons tested at multiple modulation frequencies. Each adjusted inverse CSI curve shown on the left corresponds to one neuron; mean values across modulation frequency are shown on the right. At higher modulation frequencies, all neurons tested tended to show inhibition throughout the entire stimulus cycle, losing all coherence sensitivity. At low modulation frequencies, some neurons still exhibited coherence sensitivity ("low-pass") and some tended to lose it much as at higher frequencies ("band-pass"), making the population response look band-pass with a broad intermediate range of modulation frequencies (16-128 Hz) where coherence sensitivity was most commonly found.

Examples of neurons tested with a primary tone modulated at one frequency and a secondary tone modulated at a different frequency are shown in Fig. 13. When the mismatched modulation frequencies were harmonically related (e.g., fmod1 = 16 Hz, fmod2 = 32 Hz represents "low/high" harmonic modulation frequencies), either a relatively flat (Fig. 13, A and B) or a double-peaked (Fig. 13, C and D) phase tuning function resulted from the low/high or high/low combinations or both. Recall that the amplitude modulating function is always defined to be a cosine, and "relative modulation phase" in cases of mismatched modulation frequencies refers only to the starting phase of the modulation on the secondary tone (see METHODS). No clear population trends were evident with this protocol.
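
To make the stimulus definition concrete, the sketch below synthesizes two summed AM tones whose envelopes are cosines, with the relative modulation phase applied as the starting phase of the modulation on the secondary tone. The envelope form 1 + m·cos(·), the sine carrier, the equal tone amplitudes, the sample rate, and all function and parameter names are assumptions for illustration; the actual stimulus generation is specified in METHODS.

```python
import math

def am_envelope(mod_hz, mod_phase_deg, depth, t):
    """Cosine amplitude envelope with the given starting phase (degrees)."""
    return 1.0 + depth * math.cos(2.0 * math.pi * mod_hz * t + math.radians(mod_phase_deg))

def two_am_tones(f1_hz, f2_hz, fmod1_hz, fmod2_hz, rel_phase_deg,
                 depth=1.0, duration_s=0.5, sample_rate_hz=100000):
    """Sum of a primary AM tone (modulation starting phase 0) and a secondary
    AM tone whose modulation starts at rel_phase_deg. Levels, onset ramps,
    and calibration are omitted in this sketch."""
    n = round(duration_s * sample_rate_hz)
    samples = []
    for i in range(n):
        t = i / sample_rate_hz
        primary = am_envelope(fmod1_hz, 0.0, depth, t) * math.sin(2.0 * math.pi * f1_hz * t)
        secondary = am_envelope(fmod2_hz, rel_phase_deg, depth, t) * math.sin(2.0 * math.pi * f2_hz * t)
        samples.append(primary + secondary)
    return samples

# Matched modulation frequencies: coherent (0 deg) vs. incoherent (180 deg) envelopes.
coherent = two_am_tones(8000.0, 6000.0, 16.0, 16.0, 0.0, duration_s=0.01, sample_rate_hz=48000)
incoherent = two_am_tones(8000.0, 6000.0, 16.0, 16.0, 180.0, duration_s=0.01, sample_rate_hz=48000)
# Harmonically related ("low/high") modulation frequencies: only the starting phase is defined.
low_high = two_am_tones(8000.0, 6000.0, 16.0, 32.0, 90.0, duration_s=0.01, sample_rate_hz=48000)
print(len(coherent), len(incoherent), len(low_high))
```

With matched modulation frequencies the two envelopes keep a fixed phase offset for the whole stimulus, whereas with mismatched frequencies the offset drifts over time, which is why "relative modulation phase" then describes only the starting condition.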



Fig. 13. Relative AM phase tuning at disparate modulation frequencies. Format similar to that of Fig. 12. When 2 tones amplitude modulated by harmonically related frequencies had their relative modulation phase values varied (see METHODS), the phase tuning functions (right) showed a variety of responses, including relatively flat responses (A and B) and responses with more than 1 peak (C and D). Color coding for matched modulation frequency phase tuning curves follows the format of Fig. 12; phase tuning curves for mismatched modulation frequencies are labeled as fmod1/fmod2. Other stimulus parameters were chosen for the highest elicited spike rates.

Coherence sensitivity exists over a limited range of relative sound level

Generally, the sound level of the primary tone was chosen to elicit a high discharge rate from the neuron, and the secondary tone sound level was chosen to be the lowest value that revealed flanking inhibition. Therefore, because of the variety of level tuning properties found in A1 (Brugge and Merzenich 1973; Calford and Semple 1995; Pfingst and O'Connor 1981; Phillips and Irvine 1981; Phillips et al. 1994; Wang et al. 1999), sound levels over a wide range (0-70 dB SPL) were tested during the course of the coherence sensitivity experiments. A small number of neurons were tested at several primary and secondary tone sound levels, and two representative examples of phase tuning at different values of secondary sound level are shown in Fig. 14. The secondary tone sound levels at which coherence sensitivity could be found tended to lie near the primary tone level or ≤20 dB more intense, as shown in the right-hand plots of Fig. 14.



Fig. 14. Phase tuning functions taken at various f2 sound levels (left) reveal coherence sensitivity for only a relatively small range of values. At low relative sound levels, no inhibition results at any modulation phase; at high relative sound levels, inhibition results at all modulation phase values. Rate-level functions of the f2 tone (right) show the ranges of coherence sensitivity. Other stimulus parameters were chosen for the highest elicited spike rates.


    DISCUSSION

Summary of findings and neural circuitry model

Many neurons in the primary auditory cortex of marmosets exhibit a property whereby their discharge rates are modified by the relative modulation phase between a modulated tone at CF and another modulated tone nearby in carrier frequency. Because relative modulation phase represents a purely temporal feature that can alter the coherence between the two modulated tones, this discharge rate modification occurs as a function of the temporal coherence of the two tones. Each neuron demonstrates a unique range of carrier frequencies where temporal coherence sensitivity can be found, but the range of the population as a whole generally coincides with the population range of flanking inhibition measured by pure tones. This observation, coupled with the finding that little phase-induced facilitation occurs, implicates inhibitory mechanisms as the underlying cause of temporal coherence sensitivity. Had a significant amount of facilitation been found, latent flanking excitatory inputs might have appeared more likely. Because most inhibition observed in intracellular studies of A1 neurons arises locally (de Ribaupierre et al. 1972; DeWeese and Zador 2000; Serkov 1984; Serkov and Volkov 1984, 1985), these inhibitory inputs are likely to be located in the cortex.

If all inhibitory inputs into A1 neurons possessed temporal dynamics (i.e., modulation transfer functions) similar to that of the excitatory input(s), then all flanking inhibition would be expected to show an incoherent release from inhibition. Conversely, if all inhibitory inputs possessed temporal dynamics that operated on longer time scales than the excitatory inputs, then flanking inhibition would be expected to persist at all relative modulation phase values. The finding that a mixture of these response types exists for most neurons indicates that only some inhibitory inputs match the excitatory inputs in their temporal dynamics. This situation is consistent with input neurons (both excitatory and inhibitory) possessing a variety of modulation transfer functions and converging onto A1 neurons. The simplest circuitry model explaining the response of the neuron depicted in Fig. 3A is shown in Fig. 15. At the modulation frequency tested for this model neuron (indicated by tick marks in the hypothetical MGB modulation transfer functions), the lowest frequency (leftmost) inhibitory input shows significant synchronous firing, as does the excitatory input. The other inhibitory inputs show high firing rates but no significant synchronization at that modulation frequency, indicating that their spikes will inhibit the A1 neuron regardless of the temporal structure of the stimulus at that modulation frequency (i.e., relative modulation phase).



Fig. 15. Model neuronal circuitry accounting for temporal coherence sensitivity. Results of the present coherence sensitivity experiments can be explained if subcortical neurons possessing differing syncMTFs project onto a single cortical neuron. The MGB modulation transfer functions (MTF) and TMTFRF formats match those of the previous figures and reflect the simplest hypothetical examples that can explain the results.

The proper protocol to explore inhibitory temporal dynamics more thoroughly is to deliver a pure tone at CF to provide excitatory input and then present a flanking tone at a range of modulation frequencies, constructing an inhibitory modulation transfer function (IMTF). This procedure performed in the inferior colliculus (IC) has revealed IMTFs mirroring those seen in excitatory cases, that is, MTFs with typically band-pass rate and temporal characteristics (Li et al. 2002). The data from the current study indicate that the rates of A1 neurons are more strongly affected by temporal dynamics of input stimuli than are their temporal response properties. Such a result, if borne out by more detailed analysis of cortical IMTFs, would suggest that the convergence of subcortical inputs of differing temporal dynamics onto A1 neurons can largely be read out as a rate code in cortex.

FM as an entity separate from AM

As mentioned in the INTRODUCTION, some psychophysical experiments have suggested that AM and FM are perceived, and thus are likely to be coded, similarly. Masking studies, however, have revealed potential differences between the perception of AM and FM. The physiological picture in cortex appears somewhat more complicated for FM than for AM, as well, with greater stimulus selectivity for FM (e.g., Fig. 5) and complex phase-tuning responses (e.g., Fig. 8B).

Nevertheless, the FM data collected showed interesting parallels with, and contrasts to, the AM data. First, the phenomenon of temporal coherence sensitivity took the same general form in neurons tested with FM as in neurons tested with AM. Second, the presence of coherence sensitivity seemed to be more easily detected using FM than AM, having produced 17 of 22 significant responses (vs. 31 of 58 for AM). Finally, while the AM responses were heavily weighted toward incoherent release from inhibition, the FM responses showed a nearly equal distribution between incoherent and coherent release from inhibition (Fig. 9). The significance of these findings remains unclear at present and requires further experimentation. A handful of neurons tested with both AM and FM showed somewhat inconsistent effects between the two stimulus types, indicating that FM coherence sensitivity cannot easily be predicted from AM data.

Range of modulation frequency involved

The persistence of temporal coherence sensitivity at numerous joint modulation frequencies implies that the inputs responsible for the phenomenon have spike discharges relatively well-aligned with the modulating function (i.e., a fairly high vector strength) over a range of modulation frequencies. The collective loss of coherence sensitivity at high modulation frequencies also bolsters this conclusion. Loss of coherence sensitivity at low modulation frequencies for some neurons implies that some inputs exhibit band-pass temporal characteristics. Figure 16 compares the modulation frequency range of temporal coherence sensitivity for 17 A1 neurons with syncMTF data obtained for 200 A1 neurons in a separate study in awake marmosets (Liang et al. 2002). The syncMTF data are shown as the percentage of neurons with significant AM envelope synchronization (P < 0.001, Rayleigh test) at each modulation frequency. The large difference between these two curves at intermediate modulation frequencies may explain why coherence sensitivity is evident in the rate but not the temporal responses of A1 neurons. Collective synchronization boundaries of MGB neurons have been reported ranging from 100 to 300 Hz (Langner 1992), which is within the range predicted by Fig. 16, but diversity of animal models, experimental preparations, stimulus selection and analysis methods makes conclusions regarding the coding transformation between MGB and cortex somewhat premature. It has been established, however, that cortical neurons typically show lower synchronization boundaries than geniculate input neurons monosynaptically connected to them (Creutzfeldt et al. 1980). The balance of the evidence (from this study and previous studies) supports the assertion that temporal coherence sensitivity reflects the temporal properties of the inputs rather than that of the A1 neurons themselves and that t