Journal of Neurophysiology

Long-Lasting Modulation by Stimulus Context in Primate Auditory Cortex

Edward L. Bartlett, Xiaoqin Wang

Abstract

A sound embedded in an acoustic stream cannot be unambiguously segmented and identified without reference to its stimulus context. To understand the role of stimulus context in cortical processing, we investigated the responses of auditory cortical neurons to 2-sound sequences in awake marmosets, with a focus on stimulus properties other than carrier frequency. Both suppressive and facilitatory modulations of cortical responses were observed by using combinations of modulated tone and noise stimuli. The main findings are as follows. 1) Preceding stimuli could suppress or facilitate responses to succeeding stimuli for durations >1 s. These long-lasting effects were dependent on the duration, sound level, and modulation parameters of the preceding stimulus, in addition to the carrier frequency. They occurred regardless of whether the 2 stimuli were separated by a silent interval. 2) Suppression was often tuned such that preceding stimuli whose parameters were similar to succeeding stimuli produced the strongest suppression. However, the responses of many units could be suppressed, although often weaker, even when the 2 stimuli were dissimilar. In some cases, only a dissimilar preceding stimulus produced suppression in the responses to the succeeding stimulus. 3) In contrast to suppression, facilitation of responses to succeeding stimuli by the preceding stimulus was usually strongest when the 2 stimuli were dissimilar. 4) There was no clear correlation between the firing rate evoked by the preceding stimulus and the change in the firing rate evoked by the succeeding stimulus, indicating that the observed suppression was not simply a result of habituation or spike adaptation. These results demonstrate that persistent modulations of the responses of an auditory cortical neuron to a given stimulus can be induced by preceding stimuli. 
Decreases or increases of responses to the succeeding stimuli are dependent on the spectral, temporal, and intensity properties of the preceding stimulus. This indicates that cortical auditory responses to a sound are not static, but instead depend on the stimulus context in a stimulus-specific manner. The long-lasting impact of stimulus context and the prevalence of facilitation suggest that such cortical response properties are important for auditory processing beyond forward masking, such as for auditory streaming and segregation.

INTRODUCTION

The acoustic environment is typically composed of one or more sound sources that change over time. Temporal transitions in the acoustic environment are often behaviorally or semantically relevant, such as consonant–vowel combinations in speech or phrases of animal vocalizations. In these cases, a sound cannot be unambiguously identified without reference to its stimulus context.

Psychophysical studies in humans have demonstrated that stimulus context strongly influences the perception of sounds. A simple example of the influence of stimulus context is forward masking, which occurs when the presence of one sound elevates the threshold to detect a subsequent sound or to detect a change in the sound (Fastl 1976; Moore 1978; Moore and Glasberg 1983; Penner 1974; Zwislocki et al. 1968). Masking effects are strongest when the spectral content of the first sound is similar to the second sound, when there is no delay between the sounds, and when the masker duration is long (Kidd and Feth 1982; Moore 1978; Moore and Glasberg 1983; Oxenham and Plack 2000; Zwislocki et al. 1968). Forward masking can occur whether the masker (preceding) stimulus is a pure tone or a narrowband or wideband noise, as long as it overlaps with the frequency of the probe (succeeding) stimulus (Kidd and Feth 1982; Penner 1974). Psychophysically, forward masking produced by a brief masker has a relatively short duration (≤100 ms), but the duration of masking increases with the masker's sound level (Kidd and Feth 1982; Moore and Glasberg 1983; Penner 1974; Zwicker 1984) and duration (Penner 1974; Zwicker 1984).

In neurophysiological studies of mammalian auditory cortex, stimulus context has been studied in relation to forward masking (Brosch and Scheich 2002; Brosch and Schreiner 1997; Calford and Semple 1995; Fitzpatrick et al. 1999), representation of pure tone stimuli in a continuous noise background (Phillips 1985), mismatch negativity (Ulanovsky et al. 2003), dynamic interaural phase disparity representation (Malone et al. 2002), sound directional acuity (Reale and Brugge 2000), and foreground versus background representations (Bar-Yosef et al. 2002). Forward-masking stimulus paradigms produced neural suppression for 200–300 ms in auditory cortical neurons of anesthetized cats (Brosch and Schreiner 1997; Calford and Semple 1995; but see also Hocherman and Gilat 1981), anesthetized monkeys (Brosch and Scheich 2002), and awake rabbits (Fitzpatrick et al. 1999). Similar to psychophysical data in humans, neural responses to forward-masking stimuli were strongest when the preceding stimulus (the first stimulus, denoted S1) and the succeeding stimulus (the second stimulus, denoted S2) had similar carrier frequencies and when S1 was louder than S2. In these studies, masker and probe stimuli were short-duration pure tones (≤100 ms) or individual clicks.

However, many natural stimuli, including primate vocalizations, have components that are substantially longer (≳1 s) (e.g., Wang 2000) and are often temporally modulated. The stimulus context–dependent effects of time-varying and long-duration signals have not been systematically investigated in previous studies of auditory cortex. Furthermore, stimulus context also influences processes other than masking, including stream segregation of time-varying sounds (Grimault et al. 2002) and vowel perception (Holt et al. 2000). Thus the use of temporally modulated sound sequences may reveal neural response properties that extend beyond simple masking responses. The main hypothesis that we were testing was that the modulation parameters of the first stimulus (in addition to carrier frequency) would affect the neural responses to the second stimulus. Previous studies focused largely on the effects of carrier frequency. Although carrier frequency determines which part of the spectrum a sound is centered on, spectral or temporal modulations are the means by which information about a complex sound is encoded. The other hypothesis that we were testing was that using long-duration S1 stimuli should produce long-duration alterations in the S2 responses, as opposed to the effects induced by brief stimuli used in most previous studies.

At the neural level, the converse of a forward-masking response is a combination-sensitive facilitatory response in which the response to a given sound is enhanced by preceding sounds. Combination sensitivity is well documented in auditory brain regions of specialized species such as bats (Suga et al. 1978; see Suga 1992 for review) and songbirds (see Doupe and Kuhl 1999 for review; Margoliash and Fortune 1992). Although reports of facilitation are relatively few in other species, enhancement of the responses produced by pure tone sequences has been observed in macaques (Brosch et al. 1999; Malone et al. 2002), cats (Brosch and Schreiner 2000; McKenna et al. 1989) and rats (Kilgard and Merzenich 1999). Similar to the forward-masking studies, facilitation was usually studied using short-duration sounds and occurred at short interstimulus intervals.

The majority of previous neurophysiological studies cited above were conducted in anesthetized animals. Given the suppressive effects of anesthetics on cortical neurons (Gaese and Ostwald 2001; Goldstein et al. 1959), it is possible that the extent of the context-dependent processing in auditory cortex was not fully revealed in previous studies even for tonal stimuli. In the present study, we investigated stimulus context–dependent processing in the auditory cortex of awake marmosets using combinations of temporally modulated tone or noise stimuli.

METHODS

Animal preparation and single-unit recording procedures

All experiments were performed at Johns Hopkins University in AAALAC-approved facilities following protocols approved by the Institutional Animal Care and Use Committee (IACUC). The methods for preparing marmoset monkeys for electrophysiological recording have been previously reported (Barbour and Wang 2002; Lu et al. 2001) and are only briefly described here. A marmoset was progressively adapted to sit in a minimally restraining, custom-made primate chair until it sat comfortably for the duration of the experiment. An aseptic surgery was performed in which 2 stainless steel head posts were attached to a thick cap of dental acrylic (Dentsply) covering the exposed skull. A small region overlying auditory cortex on each side was covered by a thin layer of dental acrylic and filled with a polyvinylsiloxane dental impression compound (Kerr) between recording sessions.

Electrophysiological recording sessions began after the animal fully recovered from the surgery. Only one recording hole (nearly 1.0 mm in diameter) was open at any given time during recording sessions to ensure stability and minimize tissue exposure. To record neural activity in auditory cortex, a sterile microelectrode (3–5 MΩ, A-M Systems) was advanced through the dura into the brain using a manually controlled hydraulic microdrive (Trent-Wells or David Kopf Instruments) located outside of the recording chamber.

Single units were isolated and detected using an online template-matching system (Alpha-Omega Engineering). Units were continuously monitored throughout a recording session to ensure stability and isolation quality. Stimulus presentation and recording were halted or paused if the recording quality deteriorated or the animal was suspected to be drowsy when monitored through a camera placed in the recording chamber (trials were resumed after the animal was confirmed to be awake with eyes open). Using this preparation, we generally achieved signal-to-noise ratios >10:1 for well-isolated single units that were often stable for >1 h (see examples in Fig. 1B).

FIG. 1.

Explanation of stimuli and analysis and examples of recorded spike waveforms. A: in the fixed ISI paradigm, each trial consisted of a variable stimulus 1 (S1) followed by a silent interstimulus interval (ISI) and then a constant stimulus 2 (S2). In this paradigm, the ISI was usually set at 0 ms. Both S1 and S2 durations were typically long-lasting and >500 ms. Silent periods were present before the S1 stimulus to record the spontaneous activity and after the S2 stimulus to record poststimulus discharges. Each set of stimuli also included one condition in which no S1 stimulus was played (S2 alone), so that the isolated S2 response could be used as a reference to compare with the other S2 responses that were preceded by S1 stimuli (S2|S1). Comparisons between S2-alone responses and S2|S1 responses were made for a series of temporally shifted 500-ms windows starting at S2 onset and then shifted in 100-ms increments for the duration of the S2 stimulus. In the variable ISI paradigm, S1 and S2 were constant. S2 had a shorter duration than in the fixed ISI paradigm, with durations <500 ms (usually 200–300 ms). An S1 stimulus was present in all conditions for this stimulus set. ISI was varied from 25 to 3,000 ms in logarithmic steps. B: spike waveforms from 2 well-isolated single units were recorded in response to the following stimuli: Stimulus 1 (S1): amplitude-modulated tone, carrier frequency (fc) = 2.84 kHz; sound level (SL) = 0 to 60 dB SPL (all levels are in dB SPL unless otherwise stated), modulation frequency (fmod) = 128 Hz. Stimulus 2 (S2): amplitude-modulated tone, fc = 2.84 kHz, SL = 20 dB, fmod = 64 Hz. ISI: 0 ms. Top row is the response to S2 alone. In the 2nd row, S1 = 0 dB. Time marked by the asterisk was expanded in time and shown in the inset. At this expanded time scale (75-ms duration), 2 distinct waveforms, one larger and one smaller, can be seen. In the 3rd–5th rows, S1 increases in 20-dB increments from 20 to 60 dB.

Most units were collected from supragranular layers 2/3 and layer 4, as estimated from recording depth and density of units (i.e., layer 4 was distinguished by densely packed units, causing an audible change in background neural activity). Recording depths were typically 300–1,500 μm from the cortical surface. Recordings in the present study were obtained from tone-preferring auditory cortical neurons immediately lateral to the lateral sulcus (putative primary auditory cortex) and from tone- or noise-preferring auditory cortical neurons located about 1–2 mm lateral to the lateral sulcus (putative lateral belt area). Data were obtained from the auditory cortices of three marmosets during a total of 115 recording sessions (animal M36N: 49 sessions, M41M: 37 sessions, M36L: 29 sessions). Typically, 1–3 single units were fully studied in each recording session that lasted 3–5 h.

Acoustic stimuli

Stimuli were presented in free field from a speaker (B & W 601) located 70 cm in front of the animal and level with the animal's head (0° azimuth, 0° elevation). The interiors of the double-walled soundproof chambers (Industrial Acoustics) in which recordings took place were covered with 3-in. acoustic absorption foam (Sonex, Ilbruck). Stimuli were created from custom programs using MATLAB software (The MathWorks), generated through a D/A converter (Tucker–Davis Technologies) with a 100-kHz sampling rate, low-pass filtered at 50 kHz, attenuated with 2 serially linked attenuators (Tucker–Davis) and amplified by a power amplifier (Crown). The measured speaker output was within 4 dB of the intended output for frequencies from 100 Hz to 36 kHz, which encompasses the hearing range of marmosets (Fay 1988; Seiden 1957), with a calibrated sound level of 90 dB SPL at 0 dB attenuation for 1-kHz tones. Stimuli used in this study included pure tones, band-pass–filtered noises, sinusoidally amplitude-modulated (sAM) tones or noises and sinusoidally frequency-modulated (sFM) tones.

Experimental protocols

A “NEURON-CENTERED” EXPERIMENTAL DESIGN.

Well-isolated single units in the auditory cortex of the awake marmoset are highly selective for stimulus parameters (in time, frequency, and intensity domains) and are diverse in their properties, particularly for neurons in the upper cortical layers. A given stimulus could evoke activity in only a small subset of neurons in auditory cortex. This was in sharp contrast to multiunit activities recorded from middle cortical layers in anesthetized marmosets (Wang et al. 1995). Therefore we adopted a “neuron-centered” approach to studying the effects of stimulus context in awake marmoset auditory cortex. That is, rather than using a set of standard stimuli to test all neurons, we first identified a unit's preferred stimulus (specific stimulus type and parameters) that evoked maximal firings. We then tailored the stimuli used in the S1–S2 paradigms to accommodate the particular stimulus selectivity of the individual neuron.

PROCEDURES TO DETERMINE THE PROPERTIES OF A UNIT'S PREFERRED STIMULUS.

Once a single unit was isolated, we first determined its preference for tone or noise. Qualitative assessment was used to determine which class or type of stimuli to choose (e.g., tone or noise) and quantitative assessment (based on firing rates) was then used to determine which stimulus parameter (e.g., modulation frequency) was preferred by a unit. Firing rates were calculated and displayed online along with the dot rasters of the evoked responses. Responses were also inspected visually for consistency of rate and temporal pattern. For tone-preferring units, the next step was to determine a unit's characteristic frequency (CF) using pure tone stimuli delivered at frequencies at 10 dB above a unit's estimated threshold. We tested ±1 octave around the putative preferred frequency at a sampling density of 10–20 tones/octave, or we presented tones from 1 to 32 kHz at a sampling density of 10 tones/octave. For noise-preferring units, a center frequency (also referred to as CF) was determined using band-pass noise stimuli with varying center frequencies at 10 dB above a unit's estimated threshold and with a preferred bandwidth that maximized evoked responses. Some units did not show a clear preference for tone or noise and could be driven by both types of stimuli. CF estimated by tones was generally close to that estimated by band-pass noises in these units.

The third step was to measure the rate-level function of a unit at its CF using the stimulus optimized in the preceding procedures. Most neurons in A1 or lateral belt area in awake marmosets had nonmonotonic rate-level functions, for which a preferred sound level could be determined. About 2/3 of the units analyzed in the present study (97/134 units) had more than 50% firing rate reduction at the loudest sound level compared with the peak firing rate.
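
The 50% reduction measure mentioned above can be illustrated with a short sketch. The function name, the example rate-level values, and the choice to take the preferred level as the level evoking the peak rate are assumptions made for illustration; this is not the analysis code used in the study.

```python
def rate_reduction_at_loudest(levels_db, rates):
    """Fractional drop in rate at the loudest level relative to the peak rate."""
    if not rates or len(levels_db) != len(rates):
        raise ValueError("levels_db and rates must be non-empty and of equal length")
    peak = max(rates)
    if peak <= 0:
        raise ValueError("peak rate must be positive")
    loudest_rate = rates[levels_db.index(max(levels_db))]
    return (peak - loudest_rate) / peak


# Hypothetical rate-level function (sound level in dB SPL, rate in spikes/s).
levels = [0, 20, 40, 60, 80]
rates = [2.0, 18.0, 30.0, 12.0, 6.0]

reduction = rate_reduction_at_loudest(levels, rates)
preferred_level = levels[rates.index(max(rates))]  # level evoking the peak rate
exceeds_half = reduction > 0.5                     # the >50% reduction measure
```

In this made-up example the rate falls from 30 to 6 spikes/s at the loudest level, so the unit would count among those with more than 50% reduction.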

The next step was to determine the preferred modulation frequency using sAM (or sFM) tones or sAM noises, for tone- or noise-preferring units, respectively, at various modulation frequencies (typically 4–512 Hz). The carrier frequency of sAM tones or noises was set at the unit's CF, the sound level set at the preferred level (for nonmonotonic units), or 20–40 dB above threshold (for monotonic units). The equations used to generate sAM or sFM were the same as those used in Barbour and Wang (2002). The majority of units (in A1 or lateral belt area) showed a preference for a particular modulation frequency (Liang et al. 2002). A unit's preferred stimulus had the “best” carrier type (tone or noise), CF, modulation frequency, and sound level determined through the above optimization procedures.

SEQUENTIAL TWO-SOUND STIMULI.

Each trial (or presentation) consisted of 2 sequential stimuli. The first (preceding) stimulus was referred to as S1, and the second (succeeding) stimulus as S2. S1 and S2 stimuli each had 5-ms linear onset and offset ramps and were separated by an interstimulus interval (ISI). S1 was the variable stimulus (equivalent to the “masker” in a masking experiment), whereas S2 was the constant stimulus (equivalent to the “probe” in a masking experiment). The observed preferences for stimulus parameters for a unit were used to guide our choices of S1 and S2 to construct 2-sound stimuli. S1 typically had the same parameters as S2 except that the parameter being tested was varied.
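
The trial structure described above (S1, a silent ISI, then S2, each with 5-ms linear ramps) can be sketched as follows. This is a minimal illustration under stated assumptions: the sAM expression is a generic textbook form, not necessarily the equation of Barbour and Wang (2002); the modulation depth parameter and all function names are hypothetical; and the 100-kHz sampling rate is taken from the stimulus description.

```python
import math

FS = 100_000  # samples/s, matching the 100-kHz D/A rate given in the text


def sam_tone(fc_hz, fmod_hz, dur_ms, depth=1.0, ramp_ms=5.0, fs=FS):
    """sAM tone with linear onset/offset ramps (generic sAM form, an assumption)."""
    n = int(round(dur_ms * fs / 1000))
    n_ramp = int(round(ramp_ms * fs / 1000))
    samples = []
    for k in range(n):
        t = k / fs
        # Sinusoidal envelope, normalized so that its maximum is 1.
        env = (1.0 + depth * math.sin(2 * math.pi * fmod_hz * t)) / (1.0 + depth)
        # Linear ramp rising from 0 at onset and falling to 0 at offset.
        ramp = min(1.0, k / n_ramp, (n - 1 - k) / n_ramp) if n_ramp > 0 else 1.0
        samples.append(ramp * env * math.sin(2 * math.pi * fc_hz * t))
    return samples


def two_sound_trial(s1, s2, isi_ms, fs=FS):
    """Concatenate S1, a silent interstimulus interval, and S2."""
    gap = [0.0] * int(round(isi_ms * fs / 1000))
    return s1 + gap + s2


# Illustrative values: S1 differs from S2 only in carrier frequency.
s1 = sam_tone(fc_hz=3330.0, fmod_hz=4.0, dur_ms=1000)
s2 = sam_tone(fc_hz=3080.0, fmod_hz=4.0, dur_ms=750)
trial = two_sound_trial(s1, s2, isi_ms=0)
```

Varying only one argument of the S1 call while holding S2 fixed mirrors the design in which S1 shares S2's parameters except for the one under test.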

S1–S2 stimuli were constructed using the following parameters. 1) S1–S2 combinations: sAM–sAM tones, sAM–sAM noises, and sFM–sFM tones. Occasionally, unmodulated tones or unmodulated noise stimuli were used for S2. 2) S1 and S2 sound levels: for units with nonmonotonic rate-level functions, S1 and S2 sound levels were typically set at the unit's preferred sound level, unless S1 sound level was the variable parameter. For units with monotonic rate-level functions, S1 and S2 were set at 20–40 dB above threshold. 3) S1 and S2 duration: S1 and S2 durations were chosen to be longer than those used in previous studies and to reflect the time spans of marmoset vocalizations (Wang 2000). The range used was 500–2,500 ms, with most S1 stimuli between 750 and 1,500 ms in duration and most S2 stimuli between 1,000 and 2,000 ms in duration. 4) ISI: in the fixed ISI paradigm (Fig. 1A), the ISI was fixed at a constant value (between 0 and 25 ms) when the effects of varying S1 parameters were tested. In the variable ISI paradigm (Fig. 1A), ISI became a variable when its effect was tested while S1 parameters were kept constant.

TESTING PROCEDURES.

We studied the modulatory effects of S1 on S2 responses by varying one S1 parameter at a time with a fixed ISI (Fig. 1A). The effect of varying ISI was separately tested, usually after all tests with S1 parameters were completed. Control trials consisted of silence for the duration of S1 followed by S2 (Fig. 1A, “S2 alone”) and were included in fixed-ISI tests. Each stimulus set contained multiple S1–S2 combinations in which S1 varied in one parameter in multiple steps. One or more stimulus sets were tested for each unit. Each S1–S2 combination in a stimulus set as well as the S2-alone stimulus was played for 5–10 repetitions. Trials in each stimulus set (including control trials) were presented in pseudorandom order. The intertrial interval (ITI) was typically 1–2 s, which was increased if it appeared that the long-lasting effects outlasted the ITI. The following stimulus sets were used in the experiments reported here.

  1. To test for carrier frequency–dependent modulations, the S1 carrier frequency was varied, typically in 6–10 steps/octave and usually ±1 octave around the S2 carrier frequency.

  2. To test for modulation frequency–dependent modulations, S1 modulation frequency was varied, typically from 4 to 512 Hz in octave steps. Depending on the S2 modulation frequency, we sometimes used S1 modulation frequencies as low as 2 Hz or as high as 1,024 Hz to test S1 modulation frequencies at least 2 octaves away from the S2 modulation frequency.

  3. To test for sound level–dependent modulations, the S1 sound level was varied, typically from −40 to +40 dB relative to the S2 sound level. We rarely used stimuli ≥80 dB SPL and mostly used stimuli ≤60 dB SPL.

  4. To test for S1 duration–dependent modulations, S1 duration was varied from 100 to 1,300 ms in 200-ms steps.

  5. To test for ISI effect, ISI was systematically varied (25–3,000 ms in 15–16 logarithmic steps) while S1 and S2 were held constant. An S1 stimulus that produced significant suppression or facilitation of S2 responses in the fixed-ISI test was chosen for the variable-ISI test.
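
The parameter ranges listed above can be generated with a few helper functions. The sketch below is illustrative only: the helper names are hypothetical, the 3.08-kHz S2 carrier with 9 steps/octave is one example configuration (taken from an example unit in RESULTS), and 16 ISI steps is one of the two step counts mentioned.

```python
def octave_spaced(center, octaves_each_side, steps_per_octave):
    """Values spaced evenly on a log2 axis around a center value."""
    n = octaves_each_side * steps_per_octave
    return [center * 2 ** (i / steps_per_octave) for i in range(-n, n + 1)]


def octave_steps(lo, hi):
    """Values from lo to hi (inclusive) in one-octave steps."""
    values = []
    v = lo
    while v <= hi:
        values.append(v)
        v *= 2
    return values


def log_spaced(lo, hi, n):
    """n values from lo to hi with a constant ratio between neighbors."""
    ratio = (hi / lo) ** (1 / (n - 1))
    return [lo * ratio ** i for i in range(n)]


carriers_khz = octave_spaced(3.08, 1, 9)       # set 1: +/-1 octave, 19 values
fmods_hz = octave_steps(4, 512)                # set 2: 4, 8, ..., 512 Hz
s1_durations_ms = list(range(100, 1301, 200))  # set 4: 100 to 1,300 ms
isis_ms = log_spaced(25, 3000, 16)             # set 5: 25 to 3,000 ms
```

With these settings the carrier list runs from 1.54 to 6.16 kHz, and the value one step above the center rounds to 3.33 kHz.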

Data analysis

As described later in the results, the effects of S1 stimuli on S2 responses and the proportion of units affected by S1 were similar regardless of whether the stimuli were tone based or noise based. Therefore although we listed the number of units tested for each stimulus combination (Table 1), units were not further subdivided by their estimated anatomic locations in A1 or lateral belt.

TABLE 1.

Number of units tested for each S1–S2 stimulus paradigm

For each unit, spontaneous firing rates were averaged over all trials from the stimulus set during the prestimulus window (0–200 or 0–500 ms). The median spontaneous firing rate of the sampled units was 1.98 spikes/s (n = 170 units). Overall firing rate for a stimulus (S1 or S2) was calculated as the average number of spikes/trial during the stimulus divided by the stimulus duration in seconds. The mean spontaneous firing rate was subtracted from the total firing rate to obtain a mean driven rate. Responses to a stimulus set were included in analyses if the mean driven rate of the S2-alone condition was >5 spikes/s and the total firing rate was ≥50% higher than the spontaneous firing rate or if the driven firing rate was 1–5 spikes/s and the total firing rate was at least twice as large as the spontaneous rate.
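
The rate calculation and inclusion criteria above can be written compactly. In this sketch the function names are hypothetical, "≥50% higher" is read as a total rate of at least 1.5 times the spontaneous rate, and treating the 1–5 spikes/s range as inclusive at both ends is an assumption.

```python
def mean_rate(spike_counts, duration_s):
    """Average spikes per trial divided by stimulus duration (spikes/s)."""
    return sum(spike_counts) / len(spike_counts) / duration_s


def include_stimulus_set(total_rate, spont_rate):
    """Apply the two-tier inclusion rule to the S2-alone response."""
    driven = total_rate - spont_rate  # mean driven rate
    if driven > 5.0:
        # Strong responses: total rate at least 50% above spontaneous.
        return total_rate >= 1.5 * spont_rate
    if 1.0 <= driven <= 5.0:
        # Weak responses: total rate at least twice the spontaneous rate.
        return total_rate >= 2.0 * spont_rate
    return False


# Hypothetical spike counts on three S2-alone trials of a 0.5-s stimulus.
s2_alone_rate = mean_rate([6, 9, 9], 0.5)
keep = include_stimulus_set(s2_alone_rate, spont_rate=1.98)
```

The stricter ratio for weakly driven units guards against including responses that are barely distinguishable from spontaneous activity.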

DETERMINATION OF THE DURATION OF SUPPRESSION OR FACILITATION INDUCED BY S1.

Statistical comparisons of overall firing rates were made between the response to S2-alone (denoted as S2-alone) and the responses to S2 when preceded by S1 (denoted as S2|S1) (Wilcoxon rank-sum test, P < 0.05 considered statistically significant). Comparisons were also made using the rank-sum test between firing rates of S2-alone (RS2) and S2|S1 (RS2|S1) computed from a 500-ms time window beginning at S2 onset and then subsequently shifted in 100-ms increments until it reached the end of the S2 stimulus (Fig. 1A). The firing rate for a given comparison window was computed over the 500-ms (0.5-s) window duration. Thus the first comparison window was for spikes that occurred 0–500 ms after S2 onset, the second window for spikes that occurred 100–600 ms after S2 onset, and so forth (Fig. 1A, bottom). This procedure yielded multiple comparison points that allowed us to assess the duration of modulatory effects induced by S1. The 500-ms windows were used to calculate firing rates because many single units in auditory cortex of awake marmosets have low firing rates, especially when S2 responses were suppressed by preceding S1. Shorter comparison windows could inflate short-term fluctuations in firing rate, therefore creating spurious “significant” differences. To further ensure that the modulatory effects were not influenced by random fluctuations of firing rate, we required that S2|S1 responses were significantly different from the S2-alone responses in at least 2 consecutive comparison windows. Two parameters of suppression or facilitation caused by S1 were measured. 1) Beginning of detected suppression or facilitation was defined as the beginning of the first comparison window for which the S2|S1 response differed significantly from the S2-alone response. The earliest possible onset time of suppression or facilitation was at S2 onset (beginning of the first comparison window). 
2) Duration of the suppression or facilitation was calculated as the number of consecutive comparison windows in which the S2|S1 response differed significantly from the S2-alone response multiplied by 100 ms.
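
The sliding-window procedure above can be sketched in three steps: build the comparison windows, compute per-trial rates in each window, and find the run of significant windows. The function names are hypothetical, and taking the first run of at least 2 consecutive significant windows is an assumption about how runs were selected. The per-window significance flags are supplied by the caller, for example from a Wilcoxon rank-sum test such as `scipy.stats.ranksums` at P < 0.05.

```python
def comparison_windows(s2_duration_ms, win_ms=500, step_ms=100):
    """(start, end) times in ms relative to S2 onset, up to S2 offset."""
    return [(start, start + win_ms)
            for start in range(0, s2_duration_ms - win_ms + 1, step_ms)]


def window_rates(trials, start_ms, end_ms):
    """Per-trial rate (spikes/s) in [start_ms, end_ms); spike times in ms."""
    dur_s = (end_ms - start_ms) / 1000.0
    return [sum(start_ms <= t < end_ms for t in trial) / dur_s
            for trial in trials]


def effect_onset_and_duration(significant, windows, step_ms=100, min_run=2):
    """Return (onset_ms, duration_ms) of the first qualifying run, or None."""
    i = 0
    while i < len(significant):
        if significant[i]:
            j = i
            while j < len(significant) and significant[j]:
                j += 1
            if j - i >= min_run:
                # Onset is the start of the first significant window in the run.
                return windows[i][0], (j - i) * step_ms
            i = j
        else:
            i += 1
    return None


windows = comparison_windows(1500)  # 11 windows for a 1,500-ms S2
# Hypothetical significance flags, one per window.
flags = [False, True, False, True, True, True,
         False, False, False, False, False]
result = effect_onset_and_duration(flags, windows)
```

In this made-up case the isolated significant window at 100 ms is ignored, and the effect is reported as beginning at 300 ms with a duration of 300 ms.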

DETERMINATION OF THE S1 PARAMETERS RESULTING IN SIGNIFICANT CHANGES IN S2 RESPONSE.

For each stimulus set in which one S1 parameter was varied (e.g., modulation frequency, sound level), the parameter value that produced the largest significant suppression or facilitation in the S2|S1 response (RS2|S1) compared with the S2-alone response (RS2) was determined. Significance was assessed by comparing the S2|S1 response with the S2-alone response for the moving comparison window as described above. If more than one S1 parameter value produced significant changes in the S2|S1 response, a weighted average of the differences between the S1 parameter values and the S2 parameter value was calculated by the following formula

\[
\mathrm{S1}_{avg} = \frac{\sum_{i=1}^{N} \left\{ [R_{S2|S1}(i) - R_{S2}] / R_{S2} \right\} \, [P_{S1}(i) - P_{S2}]}{\sum_{i=1}^{N} \left\{ [R_{S2|S1}(i) - R_{S2}] / R_{S2} \right\}}
\]

where N is the number of S1 parameter values that produced significant changes in RS2|S1, {[RS2|S1(i) − RS2]/RS2} is the percentage change in S2|S1 firing rate compared with the S2-alone firing rate, and PS1(i) and PS2 are the parameters for S1 and S2, respectively. If only one S1 parameter significantly altered S2 responses, the S1avg value was equal to PS1 − PS2. S1 parameters were included in the weighted average only if they significantly altered S2 responses and were continuous with the S1 parameter that produced the strongest suppression or facilitation of S2 responses. When S1 carrier frequency or modulation frequency was the varied S1 parameter, the difference between PS1(i) and PS2 was measured in octaves such that PS1(i) − PS2 = log2 [PS1(i)/PS2]. The weighted average (S1avg) captured more accurately the center of the S1 parameter values that produced the greatest suppression or facilitation of RS2|S1.
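
The weighted average described above can be sketched as follows. The function and argument names are hypothetical, and normalizing by the sum of the weights is an assumption chosen because it reduces to PS1 − PS2 when only one S1 value is significant, as the text requires. The caller is expected to pass only the S1 values that were significant and contiguous with the strongest effect.

```python
import math


def s1_weighted_average(p_s1, r_s2_given_s1, r_s2, p_s2, in_octaves=False):
    """Weighted mean of (P_S1 - P_S2), weighted by fractional change in S2 rate."""
    # Fractional change in S2 rate for each included S1 parameter value.
    weights = [(r - r_s2) / r_s2 for r in r_s2_given_s1]
    if in_octaves:
        # Carrier or modulation frequency: differences measured in octaves.
        diffs = [math.log2(p / p_s2) for p in p_s1]
    else:
        diffs = [p - p_s2 for p in p_s1]
    total = sum(weights)
    if total == 0:
        raise ValueError("weights sum to zero; weighted average is undefined")
    return sum(w * d for w, d in zip(weights, diffs)) / total


# Hypothetical suppression: S1 levels of 0 and 40 dB, S2 at 20 dB,
# S2-alone rate 10 spikes/s, S2|S1 rates of 8 and 4 spikes/s.
level_avg = s1_weighted_average([0, 40], [8.0, 4.0], r_s2=10.0, p_s2=20)
```

Because the stronger suppression occurs at the higher S1 level in this made-up example, the weighted average is pulled above the S2 level rather than sitting at the midpoint.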

RESULTS

S1–S2 stimulus sequences were tested on 170 well-isolated single units. About 90% of the units tested (152/170) had their responses to S2 stimuli significantly suppressed or facilitated by the presence of S1 stimuli.

Sensitivity to changes in S1 carrier frequency

SUPPRESSION.

We tested the influence of S1 carrier frequency in 46 units for which S1 and S2 were both modulated tones (both sAM or both sFM, Table 1). S1 and S2 both were modulated at the same modulation frequency. In 27 of 46 units tested (59%), responses to S2 were significantly suppressed at one or more S1 carrier frequencies, as compared with the responses to S2 alone. A representative example is shown in Fig. 2A. The S1 and S2 stimuli for the unit shown in Fig. 2A were both sAM tones with matching modulation frequencies. The S2 carrier frequency was set at 3.08 kHz (near the unit's CF), while the S1 carrier frequency varied from 1.54 to 6.16 kHz in 19 steps (±1 octave around S2's carrier frequency with 9 steps/octave). The S2 firing rate was significantly reduced when the S1 carrier frequency was similar to that of S2. This can be seen by comparing the dot raster in Fig. 2Aa of S2 alone (top row) with that of S2 when preceded by an S1 with the same (Fig. 2Aa, arrow) or similar carrier frequency. Note that the strongest suppression of the S2 response occurred when the S1 carrier frequency was close to, but not identical to, the S2 carrier frequency (e.g., at S1 carrier frequency of 3.33 kHz) (see Fig. 2Aa, middle shaded box; also Fig. 2Ab).

FIG. 2.

Long-lasting modulations of S2 response by changes in S1 carrier frequency. Aa: dot raster display of the spike responses of unit 36n-157a for an S1–S2 sequence in which the S1 carrier frequency was varied. Each dot represents one action potential. Solid lines separate each S1–S2 combination. S2 alone response is displayed in the top row. Variable S1 parameter, S1 carrier frequency in this case, is shown on the y-axis. Black bar under the raster indicates the duration of the S1 stimulus, and the gray bar indicates the duration of the S2 stimulus. Arrow indicates the S1–S2 combination for which the carrier frequency (fc) of S1 is equal to S2 (fc S1 = fc S2). A similar format is used in subsequent figures. Gray boxes indicate stimuli that are compared in Ab. S1: amplitude-modulated (sAM) tone, fc = 1.54–6.16 kHz, SL = 20 dB SPL, fmod = 4 Hz, ISI = 0 ms. S2: amplitude-modulated tone, fc = 3.08 kHz, SL = 20 dB, fmod = 4 Hz. Ab: shown are the peristimulus time histograms (PSTHs) of the S1–S2 combinations indicated by the gray boxes in Aa. Top PSTH is that of S2 alone (light gray box). Middle PSTH is for the S1 stimulus that produced the greatest reduction in the S2 response (S1 = 3.33 kHz, medium gray box). Bottom PSTH is for an S1 frequency that was not near the S2 frequency and is one octave lower than the S1 frequency in the middle PSTH (S1 = 1.66 kHz, dark gray box); 50-ms bins were used for all plots. Plots were scaled by the maximum spike rate. Ac: mean rates of the S1 (S1 rate, dashed line) and S2 (S2|S1 rate, open circles, solid line) responses for each S1–S2 stimulus combination for the same unit as in Aa and Ab. Dotted line represents the firing rate for S2 presented alone (S2 alone rate), whereas the solid line with no symbols represents the neuron's spontaneous rate (spont. rate). S2|S1 firing rates that were significantly different from the rate of S2 alone were noted with enlarged filled circles. 
An arrowhead points to the carrier frequency of the S2 stimulus. An open gray star indicates the S1 weighted average of carrier frequencies that significantly altered S2 responses (see methods). A similar format is used for all subsequent plots. Ba: dot raster display for a different unit, m36n-35a, whose S2 responses were facilitated by a particular S1 frequency. Same format as Aa. S1: frequency-modulated tone, fc = 7.0–21.0 kHz, SL = 10 dB, fmod = 32 Hz, FM depth (depth) = 512 Hz, ISI = 0 ms. S2: frequency-modulated tone, fc = 15.0 kHz, SL = 10 dB, fmod = 32 Hz, depth = 512 Hz. Gray boxes indicate stimuli that are compared in Bb. Bb: PSTHs of the stimuli indicated in Ba. Top PSTH shows the response to S2 alone, whereas the bottom PSTH shows a facilitated S2 response when preceded by an S1 stimulus with a frequency of 9.58 kHz. Bc: mean rates of the S1 and S2 stimuli using the same format as in Ac. In this example, none of the S1 exactly matched the S2 carrier frequency. Arrowhead marking S2 is at 15 kHz, whereas the nearest S1 stimulus is 15.34 kHz.

Because more than one S1 frequency suppressed S2 responses, a weighted average of the S1 frequencies that suppressed S2 responses was computed (see methods and below). Only S1 frequencies that were in the group that included the peak suppression and not interrupted by nonsignificant S1 frequencies were included in the weighted average. Therefore although S1 frequencies of 1.66, 4.52, and 4.89 kHz suppressed S2, they were not included in the weighted average. The weighted S1 carrier frequency in Fig. 2Aa (3.41 kHz, open star) was near the S1 frequency that produced maximal suppression (3.33 kHz).

As shown in Fig. 2A, S1 stimuli whose carrier frequencies were well outside of this unit's excitatory frequency tuning also reduced the S2 firing rate, even when there was little or no response to S1 (Fig. 2Aa). For example, when S1 was played at a carrier frequency of 1.66 kHz, it evoked very few spikes but greatly reduced the responses to S2 (Fig. 2Ab, bottom shaded box).

What was striking in this representative example in light of previous studies was that the suppression lasted for hundreds of milliseconds and persisted for the duration of the long-duration S2 stimulus. As shown in the peristimulus time histogram (PSTH) in Fig. 2Ab, S1 evoked only a weak response at 3.33 kHz (off the unit's CF). Nevertheless, the responses to S2 were diminished through the entire duration of the S2 stimulus (750 ms), hundreds of milliseconds after the offset of S1. Figure 2Ac plots the mean firing rates during S1 and S2, respectively, versus S1 carrier frequency. Filled circles in Fig. 2Ac correspond to significant changes in the S2|S1 rate compared with the S2-alone rate. Because firing rate for a stimulus (S1 or S2) was calculated as the average number of spikes/trial during the stimulus divided by the stimulus duration in seconds, in this example, S1 and S2 rates for matched parameters (arrowhead) appeared to be different because they were computed over different durations: 1,000 ms for S1 and 750 ms for S2. They were not significantly different when rates were computed using windows of equal duration. Interestingly, in this example, the S1 carrier frequency that maximally suppressed S2 response (3.33 kHz) was harmonically related to a distant S1 carrier frequency (1.66 kHz) that produced strong suppression of S2 response (Fig. 2Ac).
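The rate calculation described above, and the effect of the analysis window on it, can be illustrated with a short Python sketch. The function name and the toy spike times are hypothetical; only the definition of rate (mean spikes per trial divided by duration in seconds) and the 1,000-ms and 750-ms durations come from the text.

```python
def mean_rate(spike_times_per_trial, duration_s, start_s=0.0):
    """Mean firing rate in spikes/s.

    Average number of spikes per trial inside the window, divided by the
    window duration in seconds. Spike times are relative to stimulus onset.
    """
    counts = [
        sum(1 for t in trial if start_s <= t < start_s + duration_s)
        for trial in spike_times_per_trial
    ]
    return (sum(counts) / len(counts)) / duration_s


# Toy spike times in seconds for two trials (hypothetical values)
toy_trials = [[0.05, 0.20, 0.90], [0.10, 0.95]]

rate_full_s1 = mean_rate(toy_trials, duration_s=1.0)      # full 1,000-ms S1 window
rate_equal_win = mean_rate(toy_trials, duration_s=0.75)   # window matched to 750-ms S2
```

With these toy values the same spike train gives 2.5 spikes/s over the full window and 2.0 spikes/s over the matched window, which shows why rates computed over unequal durations are not directly comparable.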

FACILITATION.

In 20 of 46 (43%) units tested by varying S1 carrier frequency, S2 responses were significantly facilitated at one or more S1 carrier frequencies. Unlike the suppression by S1 that was usually observed near matched S1 and S2 carrier frequencies, the facilitation often occurred for S1 carrier frequencies that were distant from the S2 carrier frequencies. Figure 2B shows a different unit for which S1 and S2 were sFM tones. S1 significantly enhanced the S2 firing rate when S1 and S2 differed by >5 kHz (shaded box, Fig. 2Ba,b). The increase in S2 firing rate lasted for the duration of S2 (1,500 ms). Figure 2Bc shows that S2 firing rate significantly increased by almost 25 spikes/s compared with the response to S2 alone.

POPULATION PROPERTIES.

More than one S1 carrier frequency could, and often did, produce significant changes in the S2 response. A weighted average of the S1–S2 carrier frequency differences was computed. This represents the relationship between S1 and S2 carrier frequencies for S1 to induce the largest suppression or facilitation in each unit (see methods). Figure 3 shows the distribution of these differences for suppression (Fig. 3A) and facilitation (Fig. 3B), respectively. Note that suppression or facilitation of the S2 response could occur for the same unit depending on the S1 carrier frequency. Figure 3A shows that for a majority of the units the weighted suppression occurred when the S1 carrier frequency was near the S2 carrier frequency. This was the case regardless of whether S1 and S2 were sAM tones (filled bars) or sFM tones (open bars). The difference between S1 and S2 carrier frequencies was <0.05 octaves in 9 of 27 units (Fig. 3A, central bin). In contrast, when S1 facilitated the S2 response, the weighted averages of the S1 carrier frequencies that produced facilitation were within 0.05 octaves in only 1 of 20 units (Fig. 3B, central bin), which was a significantly lower proportion than that for the case of suppression (χ2 test, P < 0.05). Figure 3, C and D, shows the weighted S1 carrier frequency as a function of S2 carrier frequency for suppression (Fig. 3C) and facilitation (Fig. 3D). Over the range of carrier frequencies tested, the weighted S1 carrier frequencies that suppressed S2 responses were much closer to the S2 carrier frequency (solid line is S1 = S2) than the S1 carrier frequencies that facilitated S2 responses. These results demonstrate that sAM and sFM tones often produced their strongest suppressive effects on S2 with similar carrier frequencies, but their strongest facilitatory effects were distributed across a wider range of carrier frequencies.

FIG. 3.

Distributions of S1–S2 frequency differences. Histograms of the weighted differences in carrier frequencies between S1 and S2 for S1 parameters that produced suppression (A) or facilitation (B) of S2 responses when S1 carrier frequency was varied (see methods). x-axis bin size = 0.10 octaves. Each bin is the proportion of units tested for which the weighted S1–S2 difference fell within the bin. Proportions do not necessarily add to 100% because not all stimulus sets in which carrier frequency was varied tested the same range of S1–S2 frequency differences. A: solid black bars are for units tested using sAM tones for S1 and S2. Gray outlined open bars are for units tested using sFM tones for S1 and S2. B: histogram of the weighted S1–S2 carrier frequency differences for facilitation. Same format as for the top histogram. C: S1 weighted average of carrier frequencies that significantly suppressed S2 responses was plotted as a function of S2 carrier frequency. Both values were expressed in Hertz. Stars indicate sAM S1 parameters that suppressed S2 responses and crosses indicate sFM S1 parameters that suppressed S2 responses. Solid diagonal line indicates y = x. Dashed lines are y = x ± 0.1 octaves. D: same format as in C. Open circles indicate sAM S1 parameters that facilitated S2 responses, and open triangles indicate sFM S1 parameters that facilitated S2 responses.

Sensitivity to changes in S1 modulation frequency

To test whether manipulations of S1's temporal modulation properties affected S2 responses, we systematically varied the AM frequency of S1 stimuli that were either sAM tones or sAM noises. In these tests, S1 and S2 had matching carrier frequencies, sound levels, and modulation depths, as well as equal bandwidths for sAM noise stimuli. Thus the only varying parameter in S1 was AM frequency. We tested 119 units for their sensitivity to S1 modulation frequency (Table 1).

SUPPRESSION.

We tested 97 units using sAM tone stimuli, with 72 of them (74%) showing significant decreases in S2 firing rate. Twenty-six units were tested using sAM noise stimuli, with 22 of them (85%) showing significant decreases in S2 firing rate. Figure 4A illustrates the sensitivity of S2 responses to changes in S1 modulation frequency for a representative single unit. S1 and S2 were sAM tone stimuli with carrier frequencies of 3.34 kHz. S1 modulation frequency was varied from 4 to 512 Hz, while the S2 modulation frequency was fixed at 16 Hz. S2 responses were strongly suppressed for S1 modulation frequencies from 4 to 64 Hz, or ±2 octaves away from the S2 modulation frequency (Fig. 4Ac). In Fig. 4Ab, the PSTH of the response to S2 alone was compared with the PSTH when S2 was preceded by the S1 modulated at 4 Hz. Despite a barely noticeable response to S1, the S2 response was clearly suppressed for several hundred milliseconds. Maximal suppression occurred when the S1 and S2 modulation frequencies were equal at 16 Hz (arrow in Fig. 4Aa), which also corresponded to the peak of the S1 response (Fig. 4Ac). However, significant suppression of the S2 response was also observed even when S1 only weakly excited the unit at 4, 8, and 64 Hz (Fig. 4Ac), illustrating that suppression of the S2 response was possible in the absence of a high S1 firing rate and when the S1 modulation frequency was different from that of S2.
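The octave distances quoted here follow from the base-2 logarithm of the frequency ratio. A minimal Python sketch, with a hypothetical function name, reproduces the values for this stimulus set:

```python
import math


def octave_difference(f_s1_hz, f_s2_hz):
    """Signed S1-S2 distance in octaves (positive when S1 is higher than S2)."""
    return math.log2(f_s1_hz / f_s2_hz)


# S2 modulation frequency fixed at 16 Hz; S1 varied from 4 to 512 Hz
low_edge = octave_difference(4, 16)     # lower edge of the suppressed range
high_edge = octave_difference(64, 16)   # upper edge of the suppressed range
max_tested = octave_difference(512, 16) # highest S1 modulation frequency tested
```

The suppressed range of 4 to 64 Hz spans −2 to +2 octaves around the 16-Hz S2, and the highest S1 modulation frequency tested lies 5 octaves above it.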

FIG. 4.

Long-lasting modulations of S2 response by changes in S1 modulation frequency. Aa: dot raster display for unit m36n-133a for stimulus set in which sAM fmod was varied, showing suppression of S2 responses by S1 stimuli. Gray boxes indicate S1–S2 combinations that are compared in Ab. Arrow indicates fmod(S1) = fmod(S2). S1: amplitude-modulated tone, fc = 3.34 kHz, SL = 60 dB, fmod = 4–512 Hz, ISI = 0 ms. S2: amplitude-modulated tone, fc = 3.34 kHz, SL = 60 dB, fmod = 16 Hz. Ab: PSTH of responses of the same unit as in Aa. PSTH of S2 alone (top, lighter gray box) and PSTH of S2 responses with S1 fmod = 4 Hz (bottom, darker gray box). Ac: mean S1 and S2 rates for the unit in Aa. Labeling of components in the plot is the same as in Fig. 2 and is shown in the figure legend. Ba: dot raster display for unit m36n-86a showing facilitation of S2 responses by S1 at a particular S1 sAM fmod. Arrow indicates fmod(S1) = fmod(S2). S1: amplitude-modulated tone, fc = 7.39 kHz, SL = 10 dB, fmod = 4–512 Hz, ISI = 0 ms. S2: amplitude-modulated tone, fc = 7.39 kHz, SL = 10 dB, fmod = 16 Hz. Bb: PSTH of responses of the same unit as in Ba. PSTH of S2 alone (top, lighter gray box) and PSTH of S2|S1 responses with S1 fmod = 128 Hz (bottom, darker gray box). Bc: mean S1 and S2 rates for the same unit as in Ba. Arrowheads in Ac and Bc indicate when the S1 and S2 modulation frequencies are equal.

FACILITATION.

In 29 of 97 units (30%) tested with sAM tones and 12 of 26 units (46%) tested with sAM noise, one or more S1 modulation frequencies significantly facilitated the S2 response (Table 1). For the representative unit shown in Fig. 4B, the S1 modulation frequency was varied from 4 to 512 Hz, while the S2 modulation frequency was fixed at 16 Hz (Fig. 4Ba). Significant facilitation of S2 firing rate was observed when the S1 modulation frequency was 128 Hz (Fig. 4Bc), which was much higher than the S2 modulation frequency. Unlike suppression, facilitation often occurred when the S1 and S2 modulation frequencies were far apart. Similar to the example in Fig. 2B, facilitation of S2 responses in Fig. 4B lasted for several hundred milliseconds (Fig. 4Bb).

POPULATION PROPERTIES.

The histograms in Fig. 5 show the weighted average of the modulation frequency difference between S1 and S2 when suppression (Fig. 5A) or facilitation (Fig. 5B) was induced in each unit. These units were analyzed in the same way as that shown in Fig. 3 (see methods). For the population of units tested for their AM frequency sensitivity, suppression of the S2 responses was most common when S1 and S2 modulation frequencies were similar, with a peak in the population histogram centered at matched S1 and S2 sAM modulation frequencies (Fig. 5A). This was the case for both sAM tone stimuli (Fig. 5A, filled bars) and sAM noise stimuli (Fig. 5A, open bars). Of 94 units, 25 (27%) were suppressed only by S1 modulation frequencies more than one octave different from S2. sFM tones were not included in this analysis because sinusoidal FM results in time-varying changes in both frequency and amplitude, therefore complicating interpretation of the S1–S2 interactions. Facilitation of S2 responses was observed for S1 modulation frequencies that were broadly distributed as far as 5 octaves away from S2 modulation frequency (Fig. 5B). However, for any given unit, facilitation of the S2 response often occurred for only a small range of modulation frequencies (e.g., Fig. 4B). The number of units whose S1 modulation frequencies were within 2 octaves of S2 modulation frequency was significantly lower for facilitation compared with suppression (χ2 test, P < 0.05). The patterns of the population data shown in Fig. 5 were obtained from neurons for which S1 AM modulation frequency was varied while S1 and S2 had the same carriers (i.e., same carrier frequency and bandwidth, if applicable). The patterns are similar to those shown in Fig. 3 (carrier frequency test), suggesting that it is the similarity of both the spectral and temporal parameters of S1 and S2, not just the similarity of their carrier frequencies, that underlies the suppressive or facilitatory interactions between the 2 stimuli. 
Although a large proportion of units was suppressed when S1 and S2 modulation frequencies were similar, there were also a substantial number of units that were suppressed by dissimilar modulation frequencies.

FIG. 5.

Distributions of S1–S2 modulation frequency differences. Same format as in Fig. 3. A: histogram of the weighted differences in sAM modulation frequencies between S1 and S2 for S1 parameters that produced suppression of S2 responses. x-axis bin size = 1 octave. Each bin is the proportion of units tested for which the weighted S1–S2 difference fell within the bin. Solid black bars are for units tested using sAM tones for S1 and S2. Gray open bars are for units tested using sAM noise for S1 and S2. B: histogram of the weighted differences in sAM modulation frequencies between S1 and S2 for S1 parameters that produced facilitation of S2 responses. Same format as above. Solid black bars are for units tested using sAM tones for S1 and S2. Gray open bars are for units tested using sAM noise for S1 and S2. C: S1 weighted average of AM modulation frequencies that significantly suppressed S2 responses was plotted as a function of S2 AM modulation frequency. Both values were expressed in Hertz. Stars indicate sAM tone S1 parameters that suppressed S2 responses (correlation coefficient r = 0.44, P < 0.001) and crosses indicate sAM noise S1 parameters that suppressed S2 responses (r = 0.79, P < 0.001). Solid diagonal line indicates y = x. Dashed lines are y = x ± 0.5 octaves. D: same format as C. Open circles indicate sAM tone S1 parameters that facilitated S2 responses (r = 0.13, P > 0.05) and open triangles indicate sAM noise S1 parameters that facilitated S2 responses (r = −0.52, P > 0.05).

Figure 5C shows that the weighted S1 modulation frequencies that suppressed S2 responses were similar to the S2 modulation frequencies across the range of sAM modulation frequencies tested (star and cross symbols). Weighted S1 modulation frequencies that facilitated S2 responses were often at least one octave different from the S2 modulation frequency (outside of dashed lines in Fig. 5D, open circle and triangle symbols). This was especially clear at the extremes of the range of modulation frequencies tested. At one of the lowest modulation frequencies tested (4 Hz), facilitatory responses were observed when the S1 modulation frequencies were much higher than S2, whereas at the highest modulation frequencies tested (256, 512 Hz), facilitatory responses were observed when the S1 modulation frequencies were much lower than S2 (Fig. 5D).

Sensitivity to changes in S1 sound level

Once an S1 stimulus was found that suppressed or facilitated S2 response at equal sound levels, the effects of varying S1 sound level were tested using the S1 with the parameter values that produced the largest change in S2 responses. Thus in many cases, S1 and S2 differed in modulation frequency or carrier frequency.

SUPPRESSION.

Figure 6 shows an example of enduring suppression of the S2 response that lasted for >2 s. S1 consisted of 10 amplitude-modulated tone bursts (carrier frequency 3.91 kHz, modulation frequency 8 Hz) whereas S2 consisted of ten 3.91-kHz pure tone bursts. Each tone burst was 200 ms in duration, separated by a 25-ms interval. At low S1 sound levels (0–10 dB SPL), there was little effect on the S2 response (Fig. 6Aa). As the S1 sound level increased, the S2 firing rate began to decrease (Fig. 6Ac). At the peak S1 response (40–50 dB SPL), S2 was maximally suppressed for the entire duration of the S2 stimulus, which was >2 s (Fig. 6Ab,c). A further increase in S1 sound level (60–80 dB SPL) resulted in less suppression of S2 responses (Fig. 6Ac).

FIG. 6.

Long-lasting suppression of S2 responses by changes in S1 sound level. Aa: dot raster display for unit m41m-64a whose S2 responses were suppressed when S1 sound level was varied. Gray boxes indicate S1–S2 combinations that are compared in Ab. S1: amplitude-modulated tone train, fc = 3.91 kHz, SL = 0–80 dB, fmod = 8 Hz, 10 × 200-ms duration, 25-ms interval. S1–S2 ISI = 10 ms. S2: tone train, fc = 3.91 kHz, SL = 40 dB, 10 × 200-ms duration, 25-ms interval. Arrow indicates stimulus where S1 and S2 sound levels are equal. Ab: PSTH comparing combined responses for S1 = 0 dB and S1 = 10 dB (lighter gray) with responses for S1 = 40 dB (darker gray). S2 was 40 dB in both PSTHs. Ac: mean S1 and S2 rates for the same unit as in Aa. B: mean S1 and S2 rates for unit m36n-37a, different from the unit in A, are plotted as a function of S1 sound level. S1: amplitude-modulated white noise (0–50 kHz), SL = −10 to 60 dB, fmod = 4 Hz, ISI = 0 ms. S2: white noise, SL = 20 dB. Arrowhead indicates S1 and S2 sound levels are equal. C: histogram of the weighted difference in sound levels between S1 and S2 for S1 parameters that suppressed S2 responses. White bars are for units tested using sAM tones for S1 and S2. Black bars are for units tested using sAM noise stimuli for S1 and S2. Gray bars are for units tested using sFM tones for S1 and S2.

The example in Fig. 6A was representative of units for which suppression occurred over a limited range of S1 sound levels (Fig. 6Ac). In these units, suppression of S2 by S1 was often strongest when S1 and S2 sound levels were nearly equal rather than when S1 was loudest. In other units, there was a threshold of S1 sound level above which S2 responses were significantly suppressed by preceding S1, such as the example shown in Fig. 6B. The magnitude of suppression generally increased with increasing S1 sound level in these units and the S1 sound level that produced the greatest suppression was usually ≥20 dB louder than S2. For the unit shown in Fig. 6B, when the S1 sound level was ≤S2 sound level, the S2 firing rate was not significantly affected by S1. As S1 became louder, the S2 firing rate was significantly reduced and remained so at higher S1 sound levels. This occurred despite a lack of response to S1 at sound levels higher than that of S2 (Fig. 6B). Figure 6C summarizes the weighted differences between S1 and S2 sound levels for S1 parameters that produced the strongest suppression of S2 responses. There was a concentration of units with an S1–S2 level difference of 0–10 dB. For the majority of units, the strongest suppression of S2 responses occurred when S1 was louder than S2 (Fig. 6C). There were no major differences in the distributions of S1–S2 level differences when sAM tones (white bars), sAM noise (black bars), or sFM tones (gray bars) were used in the tests.

FACILITATION.

Whereas suppression of S2 responses could increase or decrease with increasing S1 level, facilitation of S2 responses by S1 almost always increased as S1 sound level increased. A representative example is shown in Fig. 7A for which S1 and S2 were sAM stimuli modulated at 16 and 32 Hz, respectively. As shown in Fig. 7Ac, the S1 response was nonmonotonic with respect to sound level with a peak at 20 dB. The sound level of the S2 stimulus was set at 20 dB. For S1 sound levels ≤10 dB louder than that of S2, there was no significant change in S2 firing rate (Fig. 7Ac). Significant facilitation occurred when the S1 sound level was ≥20 dB louder than S2, at which point the unit no longer responded to S1. At the highest sound level tested (60 dB), the facilitation of the S2 response reached its maximum and lasted for hundreds of milliseconds (Fig. 7Ab).

FIG. 7.

Long-lasting facilitation of S2 responses by changes in S1 sound level. Aa: Dot raster display for unit m36n-67b whose S2 responses were facilitated when S1 sound level was varied. S1: amplitude-modulated tone, fc = 7.18 kHz, SL = −10 to 60 dB, fmod = 16 Hz, ISI = 0 ms. S2: amplitude-modulated tone, fc = 7.18 kHz, SL = 20 dB, fmod = 32 Hz. Gray boxes show S1–S2 stimuli that will be compared in Ab. Arrow indicates when S1 and S2 sound levels are equal. Ab: PSTH comparing S2 alone (lighter gray) with S1 level at 60 dB (darker gray). Ac: mean S1 and S2 rates for the same unit as in Aa. Arrowhead indicates S1 and S2 sound levels are equal. B: histogram of the weighted S1 sound levels relative to S2 for the S1 parameters that produced facilitation of S2 responses. White bars are for units tested using sAM tones for S1 and S2. Black bars are for units tested using sAM noise stimuli for S1 and S2. Gray bars are for units tested using sFM tones for S1 and S2. C: S1 weighted average of sound levels that significantly affected S2 responses was plotted as a function of S2 sound level. Both values were expressed in dB. Stars indicate sAM tone S1 parameters that suppressed S2 responses, crosses indicate sAM noise S1 parameters that suppressed S2 responses, × symbols indicate sFM S1 parameters that suppressed S2 responses, open circles indicate sAM tone S1 parameters that facilitated S2 responses, squares indicate sFM S1 parameters that facilitated S2 responses, and open triangles indicate sAM noise S1 parameters that facilitated S2 responses. Solid diagonal line indicates y = x. Dashed lines are y = x ± 5 dB.

As shown by the population summary in Fig. 7B, the strongest facilitation of S2 response usually occurred at S1 sound levels that were ≥20 dB higher than S2, in contrast to the prevalence of suppression at similar S1 and S2 sound levels (Fig. 6C). This was the case whether the S1 and S2 were sAM tones (white bars), sAM noises (black bars), or sFM tones (gray bars).

Regardless of the S2 sound level, S1 generally facilitated S2 responses when S1 was much louder than S2 (Fig. 7C, open circles, triangles, and squares). Suppression of S2 responses often occurred when S1 was slightly louder than S2, and a greater proportion of suppressive S1 stimuli (star, cross, and x symbols) were within 5 dB of the S2 sound level (within the dashed lines).

For suppression, the rate-level function of the S1 response was not well correlated with the changes in the S2 response with regard to sound level. As stated in methods, most auditory cortex neurons had nonmonotonic rate-level functions. The examples in Figs. 6B and 7A had clearly nonmonotonic S1 rate-level responses, but the suppression (Fig. 6B) or facilitation (Fig. 7A) of S2 responses was monotonic with respect to sound level. Overall, when S1 had a nonmonotonic rate-level function, the S2 firing rate could be monotonically decreasing (32 of 53 units tested, Table 2) or nonmonotonically decreasing (21 of 53 units tested, Table 2). A similar split was observed when S1 had a monotonic rate-level function (S2 response: 12/29 nonmonotonic, 17/29 monotonic, Table 2), as illustrated by representative units shown in Fig. 8. In Fig. 8A, the S1 response increased monotonically, whereas the S2 response was suppressed monotonically with little change in the amount of suppression for S1 sound levels ≥50 dB. In Fig. 8B, the reduction of S2 rate by S1 was strongest and significant at an intermediate S1 sound level but was absent at the highest S1 sound levels. This occurred even though the S1 rate increased monotonically. In contrast, the facilitation of S2 responses was almost always monotonically increasing (40/43 units), regardless of whether S1 rate-level function was monotonic or nonmonotonic (Table 2).

FIG. 8.

Examples of different S1 rate-level responses and changes in the S2 response. A: mean S1 and S2 rates for unit m36n-58a plotted as a function of S1 sound level. S1: amplitude-modulated tone, fc = 9.01 kHz, SL = 20–80 dB, fmod = 4 Hz, ISI = 0 ms. S2: tone, fc = 9.01 kHz, SL = 60 dB. B: mean S1 and S2 rates for unit m41m-162a are plotted as a function of S1 sound level. S1: amplitude-modulated tone, fc = 1.83 kHz, SL = −10 to 50 dB, fmod = 32 Hz, ISI = 0 ms. S2: amplitude-modulated tone, fc = 1.83 kHz, SL = 10 dB, fmod = 4 Hz. Arrowheads in A and B indicate equal S1 and S2 sound levels.

TABLE 2.

Characteristics of suppression and facilitation for changes in S1 sound level

Suppression and facilitation in the same unit

In some units, S1 suppressed and facilitated S2 responses, depending on the S1 parameters. When carrier frequency was the variable S1 parameter, 9 neurons showed both suppression (9/27 = 33% units) and facilitation (9/20 = 45% units). In this subset of units, suppression was more likely to occur at S1 frequencies more distant from the S2 frequency (6/9 = 67% units >0.15 octaves S1–S2 difference, median S1–S2 difference = 0.48 octaves) than in the total population (12/27 = 44% units >0.15 octaves S1–S2 difference, median S1–S2 difference = 0.11 octaves). Facilitation in these units occurred for a median S1–S2 difference = 0.44 octaves and was usually >0.5 octaves away from the weighted S1 carrier frequency that produced the strongest suppression (mean suppression–facilitation frequency difference = 0.75 ± 0.42 octaves, median difference = 0.59 octaves). When modulation frequency was the variable S1 parameter, 29 neurons showed both suppression (29/94 = 30.9% units) and facilitation (29/42 = 69.0% units). Similar to the total population, suppression in this subset of units occurred at significantly smaller S1–S2 sAM fmod differences than facilitation (suppression, mean S1–S2 difference = 1.6 ± 1.5 octaves; facilitation, mean S1–S2 difference = 3.0 ± 1.7 octaves; P < 0.01, signed rank sum). For a given unit, the weighted S1 modulation frequency that facilitated S2 responses was on average 3.1 ± 2.2 octaves away from the weighted S1 modulation frequency that suppressed S2. In 24 units, constituting 29% (24/82) of the suppressed S2 responses and 56% (24/43) of the facilitated S2 responses, both suppression and facilitation were observed when sound level was the S1 parameter. Suppression occurred at significantly lower sound levels (mean S1 level = 22.1 ± 19.9 dB SPL) than facilitation (mean S1 level = 50.5 ± 17.5 dB) in these units (P < 0.0001, signed rank sum), similar to what was observed in the overall population. 
For a given unit, the weighted S1 level that facilitated S2 responses was on average 33 ± 16 dB louder than the weighted S1 level that suppressed S2.

Figure 1B shows an example of suppression and facilitation in the same neuron for the single unit represented by the larger action potentials. Compared with the S2-alone response (Fig. 1B, top trace), the S2|S1 response was suppressed when S1 and S2 were matched in sound level (Fig. 1B, middle trace), whereas the S2|S1 response was facilitated when S1 was 40 dB louder than S2 (Fig. 1B, bottom trace).

Response stability using long-duration stimuli

An important issue for investigations of stimulus context was whether responses returned to a baseline value between trials. As a control analysis, we compared the S1 responses with the S2 responses for the S1 stimulus that matched S2. We computed the rates over equal durations for S1 and S2, starting from stimulus onset and using the shorter of the 2 stimulus durations. If there was no significant difference between the S1 and S2 rates, it indicated that the neuron had settled to a baseline state that was similar for S1 and S2. For units in which S1 suppressed S2 responses, S1 and S2 rates were not significantly different (Wilcoxon rank-sum test, P > 0.05) in 91/94 units for which S1 sAM modulation frequency was varied and in 27/27 units for which S1 carrier frequency was varied. For units in which S1 facilitated S2 responses, S1 and S2 rates were not significantly different (Wilcoxon rank-sum test, P > 0.05) in 40/42 units for which S1 sAM modulation frequency was varied and in 18/20 units for which S1 carrier frequency was varied (some units were tested using both protocols). Thus neurons returned to a baseline state between trials that was comparable between S1 and S2 in almost all units studied.

Relationship between S1 firing rate and S1's effects on S2 responses

Because S1-induced suppression of S2 responses was typically strongest when S1 was similar to S2, it was interesting to know whether the observed decreases in S2 firing rate could be simply attributed to spike habituation or adaptation of the neuron. There were a number of observations from our experiments that suggested this was not the case. We first analyzed the relationship between the amount of suppression in S2 responses and the firing rate evoked by the S1 stimulus. Figure 9A plots RS2|S1/RS2 ratio versus RS1/RS2 ratio for each S1 stimulus that significantly suppressed S2 responses, where RS2|S1 is the S2|S1 rate, RS2 is the S2 alone rate, and RS1 is the S1 rate. If suppression was a result of spike habituation, there should be a negative correlation between the 2 ratios in this plot such that higher S1 firing rates are related to lower S2 firing rates. The data in Fig. 9A showed no such correlation, indicating that S1 firing rates were not reliable predictors of the magnitudes of reductions in S2 firing rates. Thus the reduced S2 responses when preceded by S1 were not simply a result of spike adaptation. In addition, there was no correlation between S1 firing rates and S2|S1 responses when S1 induced facilitation (Fig. 9B), suggesting that the facilitation was not predictable by S1 firing rates. Furthermore, 31% of the S1 stimuli (268/878 samples) that suppressed the S2 response (Fig. 9A) and 42% of the S1 stimuli (112/268 samples) that facilitated the S2 response (Fig. 9B) had S1 driven rates <1 spike/s. This further suggests that the spiking activity produced by S1 did not predict S1's modulation of the S2 response.
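The ratio analysis can be sketched as follows. This is a minimal illustration under stated assumptions: the function names and toy rates are hypothetical, and R² is computed here as the squared Pearson correlation between the two ratios, which equals the R² of a simple linear regression. The source does not state which fitting procedure produced the reported R² values.

```python
def rate_ratios(samples):
    """Convert (R_S1, R_S2|S1, R_S2) triples into the two plotted ratios.

    Returns x = R_S1 / R_S2 and y = R_S2|S1 / R_S2 for each sample.
    """
    x = [r_s1 / r_s2 for r_s1, _, r_s2 in samples]
    y = [r_s2_given_s1 / r_s2 for _, r_s2_given_s1, r_s2 in samples]
    return x, y


def r_squared(x, y):
    """Squared Pearson correlation coefficient between x and y."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant data")
    return (sxy * sxy) / (sxx * syy)


# Toy samples in spikes/s (hypothetical): S2 suppression unrelated to S1 rate
toy_samples = [(10.0, 8.0, 10.0), (20.0, 4.0, 10.0),
               (30.0, 4.0, 10.0), (40.0, 8.0, 10.0)]
toy_x, toy_y = rate_ratios(toy_samples)
toy_r2 = r_squared(toy_x, toy_y)
```

If suppression scaled with S1 spiking, R² would approach 1 with a negative slope; in the toy data it is essentially zero, the pattern expected when S1 rate does not predict the S2 reduction.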

FIG. 9.

Lack of correlation between S1 firing rate and reduction of S2 rate. A: (S2|S1 rate)/(S2 alone rate) was plotted vs. the (S1 rate)/(S2 alone rate). S2 responses that were significantly suppressed by S1 are shown. Open circles indicate that the responses to S1 (RS1) and the S2|S1 responses (RS2|S1) were from stimulus sets in which S1 carrier frequency was varied (n = 71), × symbols were from stimulus sets in which S1 modulation frequency was varied (n = 475) and stars were from stimulus sets in which S1 sound level was varied (n = 332). R2 values for carrier frequency, modulation frequency, or level as the variable S1 parameter were 0.025, 0.003, and 0.022, respectively. B: same measures as in A for S2 responses that were facilitated by S1. Symbols are the same as in A (n = 60 for S1 carrier frequency, n = 91 for S1 modulation frequency, n = 117 for S1 sound level). R2 values for carrier frequency, modulation frequency, or level as the variable S1 parameter were 0.003, 0.001, and 0.038, respectively.

FIG. 10.

Population distributions of the timings and durations of the S1 effects on S2. For each unit in A, B, D, and E, the S1 stimulus that produced the maximum duration of suppression (A, D) or facilitation (B, E) on S2 was chosen. See methods for information on how values were measured. A: distribution of durations of suppression of S2 responses by S1 stimuli (median: 600 ms, mean ± SD: 673 ± 367 ms). B: distribution of durations of facilitation of S2 responses by S1 stimuli (median: 500 ms, mean ± SD: 599 ± 371 ms). C: average percentage change in S2 firing rate (RS2|S1 − RS2)/RS2 was plotted as a function of S1 duration for units whose responses were suppressed by S1. Average was computed over all 34 neurons for which S1 duration was varied. Filled circles indicate a significant difference from the 100-ms S1 duration condition (Kruskal–Wallis test, P < 0.05). D: distribution of beginning times of detected suppression of S2 responses by S1 stimuli measured in terms of comparison window number. S2 times spanned by the first three comparison windows are shown above their respective bars on the histogram. E: distribution of beginning times of detected facilitation of S2 responses by S1 stimuli measured in terms of comparison window number.

Duration and onset time of S1-induced suppression and facilitation

DURATION OF SUPPRESSION OR FACILITATION.

One striking feature of the S1-induced changes in S2 responses was the long-lasting nature of the suppression or facilitation. To determine the timing and duration of a given S1's suppression or facilitation of S2 responses, we compared the firing rate of S2 alone with the S2|S1 rate using 500-ms comparison windows that were incrementally shifted in 100-ms steps (see methods and Fig. 1A). For a given unit, the duration of the suppression or facilitation was calculated as the number of consecutive comparison windows in which the S2|S1 response differed significantly from the S2-alone response, multiplied by 100 ms.
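The duration measure described above can be illustrated with a minimal Python sketch. It assumes that a significance flag has already been computed for each comparison window (the statistical test itself is described in methods and is not reproduced here). The function names, the example flags, and the choice to report the first qualifying run of windows are illustrative assumptions, not the authors' analysis code. The 2-window minimum follows the criterion stated under Technical considerations.

```python
from typing import List, Optional, Tuple

STEP_MS = 100  # shift between successive comparison windows (from the text)


def first_significant_run(
    significant: List[bool], min_windows: int = 2
) -> Optional[Tuple[int, int]]:
    """Return (start_index, run_length) of the first run of consecutive
    significant windows that is at least `min_windows` long, or None."""
    start = None
    # A trailing False acts as a sentinel that closes a run reaching the end.
    for i, flag in enumerate(list(significant) + [False]):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            if i - start >= min_windows:
                return start, i - start
            start = None
    return None


def modulation_duration_ms(significant: List[bool]) -> int:
    """Duration = number of consecutive significant windows x 100 ms."""
    run = first_significant_run(significant)
    return 0 if run is None else run[1] * STEP_MS


# Hypothetical flags for one unit: windows 0-5 differ significantly from S2 alone.
flags = [True, True, True, True, True, True, False, False, True, False]
duration = modulation_duration_ms(flags)              # 6 windows x 100 ms
onset_ms = first_significant_run(flags)[0] * STEP_MS  # start of the first window
```

In this sketch an isolated significant window (such as window 8 in the example) does not contribute to the duration, because it does not meet the 2-consecutive-window criterion.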

We analyzed 133 units whose S2 responses were significantly suppressed by S1 stimuli and 89 units whose S2 responses were significantly facilitated. If multiple sets of S1 stimuli were tested in a unit, measurements from the S1 stimulus set that produced the longest lasting change in the S2 response were used to represent that unit in the population plots shown in Fig. 10. All measurements were obtained from stimulus sets using the fixed ISI paradigm (Fig. 1A).

Across the population, the mean duration of suppression was 673 ± 367 ms and the median duration was 600 ms (Fig. 10A). A similar analysis for facilitation revealed that the facilitation lasted on average for 599 ± 371 ms, with a median duration of 500 ms (Fig. 10B), which was not significantly different from the duration of suppression. The durations of suppression and facilitation found in our experiments are much longer than what has been reported in studies using short-duration S1 stimuli.

Not only was the duration of suppression longer when using longer-duration stimuli, but the magnitude of suppression also increased as S1 duration increased, as shown in Fig. 10C. In 34 units, S1 duration was varied from 100 to 1,300 ms in 200-ms increments. For these units, S1 had already been determined to induce significant suppression of S2 firing rate from stimulus sets using long-duration S1 stimuli and varying S1 carrier frequency or modulation frequency. A population average of the percent change in S2 firing rate as a function of S1 duration showed that the reduction in S2 firing rate increased as the S1 duration increased (Fig. 10C). Compared with S2 firing rates following S1 durations of 100 ms, there was a significant decrease in the S2 firing rates for S1 durations ≥700 ms.
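As an illustration of the percentage-change measure (RS2|S1 − RS2)/RS2 and its population average across S1 durations, the following Python sketch may be helpful. The firing rates are hypothetical, the function and variable names are invented for this example, and averaging the per-unit percentages (rather than, for instance, pooling rates before normalizing) is an assumption about how the average was formed.

```python
from statistics import mean
from typing import Dict, List

# S1 durations tested: 100 to 1,300 ms in 200-ms increments (from the text).
S1_DURATIONS_MS = list(range(100, 1301, 200))


def percent_change(rate_s2_given_s1: float, rate_s2_alone: float) -> float:
    """Percentage change in S2 firing rate: 100 * (R_S2|S1 - R_S2) / R_S2."""
    if rate_s2_alone == 0:
        raise ValueError("S2-alone rate must be nonzero")
    return 100.0 * (rate_s2_given_s1 - rate_s2_alone) / rate_s2_alone


def population_average(units: List[Dict]) -> List[float]:
    """Average the per-unit percentage change at each S1 duration."""
    return [
        mean(percent_change(u["s2_given_s1"][k], u["s2_alone"]) for u in units)
        for k in range(len(S1_DURATIONS_MS))
    ]


# Hypothetical rates (spikes/s) for two units, one value per S1 duration.
units = [
    {"s2_alone": 20.0, "s2_given_s1": [18.0, 16.0, 14.0, 12.0, 10.0, 10.0, 8.0]},
    {"s2_alone": 10.0, "s2_given_s1": [10.0, 9.0, 8.0, 6.0, 5.0, 4.0, 4.0]},
]
avg = population_average(units)  # one (negative) value per S1 duration
```

Negative values indicate suppression, so a curve that becomes more negative with increasing S1 duration corresponds to the growth in suppression described above.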

FIG. 11.

Example of late-acting facilitation. A: PSTH for unit m36n-41a for a stimulus set in which S1 sAM fmod was varied. S1: amplitude-modulated tone, fc = 5.6 kHz, SL = 20 dB, fmod = 4–1,024 Hz, ISI = 0 ms. S2: amplitude-modulated tone, fc = 5.6 kHz, SL = 20 dB, fmod = 128 Hz. B: S2|S1 rate of the initial 300 ms of response as a function of S1 modulation frequency. C: S2|S1 rate of the last 1,000 ms of the S2 response (500–1,500 ms) as a function of S1 modulation frequency. Enlarged filled circles indicate significant decrease (B, gray circles) or increase (C, black circles) in mean S2|S1 rate vs. S2 alone measured over the same time period.

BEGINNING OF DETECTED SUPPRESSION OR FACILITATION.

We measured the beginning of detected suppression or facilitation, defined as the first comparison window in which the S2|S1 response and S2 alone were significantly different. This was not intended as a precise measure of latency, which would require a large number of stimulus repetitions not available in the present study, but rather provided the ability to compare the timing of suppression and facilitation across units. As shown in Fig. 10D, suppression of S2 responses by S1 stimuli often was detected by the first comparison window (first comparison window beginning at 0 ms). There were, however, some units in which the suppression began at later times. For facilitation, there was a significantly larger proportion of units with late onset times (beginning later than the second comparison window, 45/89 units, χ2 test, P < 0.05) (Fig. 10E), and the median beginning of detected facilitation was the third comparison window (beginning at 200 ms after S2 onset). In comparison, 35% (47/133) of units were suppressed beginning later than the second comparison window (Fig. 10D).
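The comparison of late-onset proportions can be sketched as a 2 × 2 χ2 test in Python. The counts are those given in the text (45/89 facilitated units and 47/133 suppressed units with onsets later than the second comparison window). The layout of the contingency table and the absence of a continuity correction are assumptions, because the text does not specify how the test was set up, and the function name is invented for this example.

```python
def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square statistic (no continuity correction) for the table
    [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("table has an empty row or column")
    return n * (a * d - b * c) ** 2 / denom


# Counts from the text: late onset in 45 of 89 facilitated units
# and in 47 of 133 suppressed units.
late_fac, total_fac = 45, 89
late_sup, total_sup = 47, 133

chi2 = chi_square_2x2(
    late_fac, total_fac - late_fac,
    late_sup, total_sup - late_sup,
)
CRITICAL_DF1_P05 = 3.841  # chi-square critical value, 1 degree of freedom, alpha = 0.05
significant = chi2 > CRITICAL_DF1_P05
```

Under these assumptions the statistic is approximately 5.1, which exceeds the critical value for P < 0.05 and is therefore consistent with the reported result.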

An example of late facilitation is shown in Fig. 11. Presentation of S2 alone, a 5.6-kHz sAM stimulus modulated at 128 Hz, generated a strong early response (Fig. 11A). As the S1 modulation frequency varied from 4 to 1,024 Hz, the firing rates of the initial 300-ms portion of S2 responses were all below the firing rate of S2 alone measured within the same time window (Fig. 11B). However, the firing rates of the late portion of the S2 responses were greater than the corresponding S2-alone firing rate, significantly so at S1 modulation frequencies of 256 and 1,024 Hz (Fig. 11C). Examples such as this imply that S1-induced changes in S2 rate observed using short S2 stimuli (Brosch and Schreiner 1997; Calford and Semple 1995) possibly represent only a subset of the possible S1-induced effects on S2. In fact, in a substantial proportion of units, S1-dependent modulation, especially facilitation of the S2 rate, began at later times after the beginning of S2 (Fig. 10, D and E).

Sensitivity to changes in S1–S2 interstimulus interval using short S2 stimuli

Previous studies investigating contextual suppression or facilitation of auditory-evoked responses in auditory cortex used short S2 probe stimuli and varied ISI (Brosch and Schreiner 1997; Brosch et al. 1999; Calford and Semple 1995; Fitzpatrick et al. 1999). To compare our results from awake primates using long S1 stimuli with the results of previous studies, we also tested 55 units with short S2 durations (200–300 ms) while varying ISI from 25 to 3,000 ms. Long-duration S1 stimuli were still used in these tests. S1 was chosen based on the parameters that produced the largest change in the S2 response in the fixed ISI paradigm. An example is shown in Fig. 12A. The S2 response was significantly suppressed by S1 for ISI ≤540 ms, with the amount of the suppression diminishing monotonically with increasing ISI (Fig. 12Ab) and returning to the S2-alone rate by an ISI of 1,540 ms. A second example in Fig. 12B showed significant suppression for ISI ≤840 ms. Forty-four units whose S2 responses were significantly suppressed at 0 ms ISI were tested with variable ISI (Fig. 12D, top). The mean duration of suppression assessed by varying ISI (mean ± SD: 730 ± 621 ms, median: 543 ms) was comparable to the duration of suppression observed for consecutive, 0 ms ISI S1–S2 stimuli (Fig. 10A, mean ± SD: 673 ± 367 ms, median: 600 ms, P > 0.05 vs. variable ISI paradigm, Kruskal–Wallis test). This indicated that the observed suppression did not depend on the presence of a continuous S2 stimulus and did not decay more quickly with the insertion of a silent interval between S1 and S2.
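For the variable-ISI paradigm, the maximum ISI at which S1 still modulated the S2 response can be extracted as in the following Python sketch. The ISI values and significance flags are hypothetical (only the 25–3,000 ms range comes from the text), the function name is invented for this example, and the rule that the effect must be significant at every shorter ISI as well is an assumption.

```python
from typing import List, Optional


def max_effective_isi(isis_ms: List[int], significant: List[bool]) -> Optional[int]:
    """Largest ISI up to which every tested ISI (in ascending order) still shows
    a significant change in the S2 response; None if the shortest ISI does not."""
    longest = None
    for isi, flag in sorted(zip(isis_ms, significant)):
        if not flag:
            break
        longest = isi
    return longest


# Hypothetical ISI set (ms) and significance flags for one unit.
isis = [25, 40, 140, 240, 540, 840, 1540, 3000]
flags = [True, True, True, True, True, False, False, False]
longest = max_effective_isi(isis, flags)
```

Collecting this value across units gives a distribution of maximum ISI of the kind summarized in Fig. 12D.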

FIG. 12.

Long-lasting effects of S1 on S2 occurred when S1 and S2 were separated by silence. Aa: PSTH for unit m36n-58a as ISI was varied. Gray bars indicate onset and offset of S1. Gray boxes indicate onset and offset of S2. S1: amplitude-modulated tone, fc = 9.01 kHz, SL = 70 dB, fmod = 4 Hz, ISI = 25–3,000 ms. S2: tone, fc = 9.01 kHz, SL = 60 dB. Ab: mean S1 and S2 rates for the same unit as in Aa. B: mean S1 and S2 rates for a different unit, m36n-160a. S1: amplitude-modulated tone, fc = 3.55 kHz, SL = 20 dB, fmod = 256 Hz, ISI = 25–3,000 ms. S2: amplitude-modulated tone, fc = 3.55 kHz, SL = 10 dB, fmod = 32 Hz. Ca: PSTH for another unit, m36n-67b, as ISI was varied. Gray bars indicate onset and offset of S1. S1: amplitude-modulated tone, fc = 7.18 kHz, SL = 60 dB, fmod = 128 Hz, ISI = 25–3,000 ms. S2: amplitude-modulated tone, fc = 7.18 kHz, SL = 20 dB, fmod = 32 Hz. Cb: mean S1 and S2 rates for unit in Ca. D, top: distribution of maximum ISI for units in which S1 stimuli suppressed S2 responses (mean ± SD: 730 ± 621 ms, median: 543 ms). Bottom: distribution of maximum ISI for units in which S1 stimuli facilitated S2 responses (mean ± SD: 401 ± 468 ms, median: 200 ms).

Facilitation of the S2 response was also observed with varying ISI. In an example shown in Fig. 12C, S1 strongly facilitated S2 responses at short ISI, with significant facilitation persisting for ISI ≤270 ms. Note that the S1 stimulus produced no responses in this example. Eleven units that showed facilitation at 0 ms ISI were tested with varying ISI. The mean duration of facilitation was 401 ± 468 ms (Fig. 12D, bottom, median: 200 ms). In contrast, the duration of facilitation observed with long duration S2 in the fixed ISI paradigm was longer (Fig. 10B, mean ± SD: 599 ± 371 ms, median: 500 ms, P < 0.001 vs. variable ISI paradigm, Kruskal–Wallis test). Thus unlike suppression, the presence of a silent interval and/or the duration of the S2 stimulus affected the duration of facilitation of S2 responses by S1. These data show that the long-lasting suppression and facilitation exhibited by cortical neurons can be revealed by both fixed-ISI protocols (primarily used in the present study) and by variable-ISI protocols (extensively used in previous studies). However, the use of long-duration S1 stimuli produced much longer lasting effects on the S2 responses.

DISCUSSION

Summary of results

Two-sound sequences (S1–S2 sequences) consisting of temporally modulated, long-duration stimuli were used to investigate how the cortical responses to a sound were affected by preceding sounds. Long-duration S1 stimuli were found to produce suppression or facilitation of S2 responses that lasted from hundreds of milliseconds to more than one second by fairly strict statistical criteria (Fig. 10, A and B), which was far longer than had been reported using short S1 stimuli. These S1-dependent changes were observed using both fixed-ISI and variable-ISI testing protocols. Suppression of S2 responses was the most common effect produced by S1 stimuli. S2 responses usually showed a greater amount of suppression when S1 and S2 had similar rather than dissimilar carrier frequencies (Fig. 3A). Introducing AM of S1 further modulated S2 responses in addition to the modulatory effects that resulted from the relationship between the carrier frequencies of S1 and S2. Suppression was often strongest when the modulation frequencies of the 2 stimuli were similar (Fig. 5A). In addition, suppression of S2 responses depended on the relationship between sound levels of the 2 stimuli (Figs. 6C and 7B). Higher S1 sound levels often resulted in greater suppression (Fig. 6C), but many units were maximally suppressed when S1 and S2 sound levels were similar (Figs. 6A and 8B). This evidence indicated that S1-induced suppression was sensitive to spectral, temporal and intensity characteristics of S1. Our analyses showed that the suppression of S2 responses by S1 stimuli could not be simply attributed to spike habituation or adaptation.

In many of the units studied, particular S1 parameters significantly facilitated S2 responses, often hundreds of milliseconds after the offset of S1 (Figs. 2B, 10B, and 11). Facilitation was observed when varying S1's carrier frequency, modulation frequency and sound level, demonstrating that the facilitation was dependent on spectral, temporal and intensity characteristics of S1. Unlike suppression, facilitation of S2 responses induced by S1 was observed for a wide range of S1–S2 combinations across the population of sampled units and was often strongest when S1 and S2 had mismatched parameters (Figs. 3B, 5B, and 7B). For a given unit, facilitation was usually observed for a narrow range of particular S1–S2 combinations (Figs. 2B and 4B). Facilitation of S2 responses increased as S1 sound level increased (Fig. 7B). Finally, like suppression, facilitation could be observed in the absence or near-absence of discharges evoked by the S1 stimulus (Figs. 4B, 7A, and 12C) suggesting subthreshold or subcortical interactions between the 2 sequential stimuli.

Technical considerations

One potential limitation of the present study is that although the animals were awake, they were not required to engage in an auditory task. In our data, the consistency of the observed response modulations over multiple, randomized trials suggested that the changes in the responses were either insensitive to the animal's behavioral state or that the animal remained in a relatively stable behavioral state over the course of the trials. Furthermore, one study in awake macaques used 2-sound sequences (tone–tone or noise–noise) and found that suppression was similar whether the animal was passively listening or performing a sound localization task (Werner-Reiss et al. 2003). In another study in awake macaques, spontaneous and evoked firing rates to tones slightly increased during performance of a sound localization task when compared with passive listening (Scott et al. 2003), but this nonspecific increase in excitability was different from the specific effects of stimulus context that we have reported here. Moreover, both mismatch negativity and auditory stream segregation have attention-insensitive and attention-sensitive components (Muller et al. 2002; Sussman et al. 1998), so any parallels discussed here only address the attention-insensitive components.

A separate issue is that in analyzing responses obtained using the fixed ISI paradigm, we required that the S2|S1 responses significantly differed from the S2 alone responses in 2 consecutive comparison windows (see methods). In doing so, we may have excluded from our analysis responses for which significant differences were confined to the initial 100 ms of S2. Such responses constituted only a small fraction of the significant differences that were observed (93/1,048 = 8.9% of cases). Therefore the population data shown in Fig. 10 would not be substantially affected by these cases.

Comparison with previous studies in the auditory cortex

EFFECTS OF CARRIER FREQUENCY.

Effects arising from stimulus context have been described in the auditory cortex for simple tone stimuli (Brosch and Scheich 2002; Brosch and Schreiner 1997; Calford and Semple 1995; Malone et al. 2002; McKenna et al. 1989; Ulanovsky et al. 2003) or for click pairs (Fitzpatrick et al. 1999). These studies found that suppression was strongest and occurred most often when the S1 and S2 frequencies were similar (Brosch and Scheich 2002; Brosch and Schreiner 1997; Calford and Semple 1995; Ulanovsky et al. 2003). In the present study, we tested the effects of overlapping S1 and S2 carrier frequencies using temporally modulated sAM tones and sFM tones. Similar to the results reported for unmodulated tones, suppression of S2 responses by S1 was strongest when S1 and S2 carrier frequencies were similar (Figs. 2 and 3). This reinforces the notion that strong suppression occurs when S1 and S2 share overlapping spectral contents and that this suppression is a general feature of forward masking (Moore 1978; Zwislocki et al. 1968).

Unlike suppression, S1-induced facilitation found in this study occurred over a range of S1–S2 carrier-frequency differences (Fig. 3B) but was restricted to a small range of carrier frequencies for a given unit. This finding was similar to what was reported in awake marmosets using simultaneous rather than sequential tones (Kadia and Wang 2003). One study using oddball tone stimuli reported facilitation when the oddball tone neighbored the standard tone (Ulanovsky et al. 2003). Brosch and colleagues (Brosch and Schreiner 2000; Brosch et al. 1999) reported that maximal facilitation occurred when frequencies of 2 sequential tones differed by approximately one octave when stimulus onset asynchrony between 2 tones was approximately 100 ms. They also reported that facilitation was observed for a wide range of frequencies of the first tone (Brosch et al. 1999). It was found in a previous study in awake marmosets that facilitation could be induced for a wide range of frequency differences between 2 simultaneous tones ≥2 octaves (Kadia and Wang 2003). In the present study, we chose to sample S1 carrier frequencies densely (6–10 frequencies/octave) within ±1 octave of the S2 frequency. We showed that long-lasting facilitatory responses elicited by long duration stimuli were typically restricted to a small range of S1–S2 carrier-frequency combinations for a given unit and, over the sampled population, were approximately equally distributed for all S1–S2 combinations within ±1 octave (Fig. 3B). There were methodological differences between this study and the previous studies using 2 sequential tones cited above. Short-duration tones were used in the studies by Brosch et al. (1999) (100 ms) and Brosch and Schreiner (2000) (30 ms), in contrast to long duration tones used in the present study (500–2,000 ms). As a result, the time difference between the onsets of 2 tones was much longer in the present study than in the studies by Brosch and colleagues (Brosch and Schreiner 2000; Brosch et al. 1999). Moreover, the present study was conducted in awake marmosets, whereas the previous studies discussed above were conducted in anesthetized cats (Brosch and Schreiner 2000) or macaque monkeys (Brosch et al. 1999).

EFFECTS OF STIMULUS SOUND LEVEL.

Studies in anesthetized cats have found that suppression generally increased with increasing sound level, but a minority of units decreased suppression at high sound levels (Brosch and Schreiner 1997; Calford and Semple 1995). A study in anesthetized cats related the differential responses to the rate-level function of the S1 response (Calford and Semple 1995). They found that neurons with nonmonotonic, but not monotonic, S1 rate-level functions displayed nonmonotonic suppression of S2. Increasing the sound level of the S1 stimulus has also been shown to broaden the range of effective suppressing S1 frequencies (Brosch and Schreiner 1997; Calford and Semple 1995). Although S1 sound level had a profound influence on the amount of suppression or facilitation of the S2 response (Figs. 6–8), we found no systematic relationship between the S1 rate-level function and S1 suppression or facilitation of S2 responses (Figs. 6 and 8, Table 2). In many instances, the magnitude of S2 suppression decreased as the S1 sound level increased (Figs. 6A and 8B), but this was not related to the S1 rate-level function. Decreases in suppression with increasing S1 sound level could reflect the sound level tuning of suppressive synaptic inputs to the recorded cell, shifts in the balance of excitation and suppression, or the interplay of more than one mechanism (see following text for further discussion of potential mechanisms). On the other hand, facilitation had a simple relationship with S1 sound level and increased monotonically as S1 sound level increased (Fig. 7), but was not related to the rate-level function of the S1 responses. This could arise from some combination of increased excitation and decreased inhibition during the S2 stimulus.

EFFECTS OF INTERSTIMULUS INTERVAL.

To compare the durations of suppression and facilitation more directly with previous studies (Brosch and Schreiner 1997; Brosch et al. 1999; Fitzpatrick et al. 1999; Werner-Reiss et al. 2003), we tested 55 units with a short S2 stimulus and variable ISI. Moreover, the variable ISI paradigm focused on changes in the transient portion of the S2 response induced by S1. We demonstrated that the mean duration of suppression in awake marmosets was similar whether or not there was a silent interval between S1 and S2 (Figs. 10 and 12). This suggested that despite the stimulus and perceptual differences between the fixed and variable ISI paradigms, the neural mechanisms by which S1 suppressed S2 responses appeared to be similar between the 2 paradigms.

Unlike suppression, facilitation of S2 by S1 persisted for a shorter time in the variable ISI paradigm compared with the fixed ISI paradigm (Figs. 10B and 12D, bottom). This result implies that facilitation weakens and decays within a few hundred milliseconds in the absence of ongoing sound-driven neural activity, which constrains the potential mechanisms for facilitation. Nevertheless, facilitation beginning long after the onset of S2, which was prevalent in the neurons studied (Fig. 10E), may require longer duration S2 stimuli to be observed. If facilitation did not require a longer-duration stimulus, we should have observed an increase in facilitation at longer ISIs corresponding to the longer onset time of the facilitation when long S2 stimuli were used, which we did not. Facilitation decreased monotonically for all of the units tested with variable ISI and short-duration S2 stimuli and likely reflects the duration of facilitation of the onset portion of the response only. Use of short S2 stimuli could miss increases in S2 firing rate if the facilitation occurred only in the later sustained portion of the S2 response (e.g., Figs. 2B, 4B, and 11).

Sensitivity to S1 modulation frequency

The carrier frequency of a narrow-band sound (e.g., sAM tone) reflects the location of the sound's spectral energy on the frequency axis. If similarity in S1 and S2 carrier frequency is sufficient by itself to produce suppression, then the amount of S2 suppression should be predicted simply by the proximity of the S1 and S2 carrier frequencies and should be insensitive to the temporal parameters of S1 and S2. If, on the other hand, the relationships between the temporal parameters of S1 and S2 contribute to the amount of suppression, then there may be some S1 modulation frequencies that reduce the S2 response, while others may have no effect on S2 (essentially a release from masking) or even facilitate the S2 response. Data from the present study suggest that it could be the similarities between S1 and S2 in general, not just the similarity between their carrier frequencies, that determine the degree of suppression or facilitation produced by S1. S1's suppression of S2 responses was shown to depend on the modulation frequency difference between S1 and S2 (Figs. 4 and 5). The interesting implication of this result was that S1-dependent suppression was not solely based on a comparison of the spectral characteristics of S1 and S2. Rather, suppression was sensitive to the temporal characteristics of S1. It is also important to consider the modulation frequencies that did not generate the expected suppression produced when S1 and S2 carrier frequencies matched. In some cases, the suppression probably mirrored the modulation frequency tuning of inputs that have already been formed at subcortical levels (Creutzfeldt et al. 1980; Langner and Schreiner 1988; Preuss and Muller-Preuss 1990). However, many S1 stimuli drove the neuron and still did not produce suppression, supporting the assertion that suppression differed from spiking habituation. For some units, certain S1 modulation frequencies facilitated S2 responses (Figs. 4B and 11), which was even more unexpected given the same S1 and S2 carrier frequencies.

Temporal modulation parameters have not been investigated in previous studies for S1-dependent suppression of S2 firing rates. However, a recent human psychophysical study demonstrated forward masking by AM noise stimuli (Wojtczak and Viemeister 2004). In that study, masking was strongest when the masker and probe stimuli had similar modulation frequencies, and masking lasted for hundreds of milliseconds, which is very similar to the neurophysiological data of the present study. The tendency for the strongest suppression to occur when S1 and S2 sAM rates differ by less than or equal to one octave (Fig. 5A) also concurs with psychophysical data on auditory stream segregation (Grimault et al. 2002). Subjects typically perceived sAM stimuli differing by less than one octave as part of a single stream, but perceived stimuli differing by greater than one octave as 2 streams (Grimault et al. 2002). These psychophysical studies are consistent with the neural responses in auditory cortex (Figs. 4 and 5) and may provide some insight into the functions of stimulus context-dependent changes in S2 responses. While it is clear that the temporal characteristics of S1 contribute to forward masking, further study is required to determine whether the observed persistent changes in firing rate induced by previous stimuli indicate the perception and persistence of auditory streams or objects.

Whereas suppression of S2 responses could usually be induced by more than one S1 modulation frequency, facilitation was usually restricted to a narrower range of S1 modulation frequencies that were different from the S2 modulation frequency (Figs. 4B and 11). Across the population, facilitation occurred at all combinations of S1–S2 differences (Fig. 5B). Therefore when an acoustic stimulus changes its modulation frequency, different groups of neurons could have their responses facilitated in a manner that could encode the direction of the modulation frequency change and enhance the response to the new modulation frequency.

Sensitivity to S1 duration

This was the first study to vary systematically the S1 duration and use long-lasting stimulus durations that were comparable to the durations of natural vocalizations. Data from the auditory cortex of the anesthetized cat have hinted that the preceding stimulus duration exerts an effect on subsequent neural activity for S1 durations ≤200 ms (Brosch and Schreiner 1997). Duration has been shown to affect the proportion of neurons affected and the magnitude of S1-dependent effects in the gerbil inferior colliculus (Malone et al. 2001). We found that the magnitude of suppression increased as S1 duration increased (Fig. 10C). S1-dependent effects were much shorter with short-duration S1 stimuli (Brosch and Schreiner 1997; Brosch et al. 1999; Calford and Semple 1995; Fitzpatrick et al. 1999; but see also Hocherman and Gilat 1981) than with long-duration S1 stimuli (present study), regardless of whether the effects were suppressive or facilitatory. The durations of S1 stimuli in this study extended far beyond what has been used in most psychophysical studies of masking and are more comparable to durations of stream buildup (Beauvois and Meddis 1997; Beauvois 1998) or the standard blocks in oddball paradigms investigated by mismatch negativity (Ulanovsky et al. 2003). In these studies, similar to the present study, auditory streams consisted of a single repetitive stimulus that influenced the perception (Beauvois and Meddis 1997; Beauvois 1998) or neuronal firing rates (current study; Ulanovsky et al. 2003) for seconds after the introduction of a new stimulus.

Potential neural mechanisms of S1-induced modulations

The potential mechanisms for the contextual modulation reported in this study are unknown but are constrained by their characteristics. Since many cells could be either suppressed or facilitated depending on the S1 stimulus, it is more likely a function of how a given S1 stimulus engages the excitatory and inhibitory inputs rather than a frequency or brain region-specific effect. For suppression, the most probable mechanisms are presynaptic depression of neurotransmitter release (Chance et al. 1998; Nelson 1991; Varela et al. 1997; Wei et al. 2002), postsynaptic activation of intrinsic currents in the recorded neurons, or postsynaptic activation of neuromodulator receptors. Long-lasting postsynaptic inhibition is less likely, since it would require inhibitory inputs to fire for long durations in the absence of an auditory stimulus. Synaptic depression has been shown to occur at high-release probability synapses in pyramidal neurons in layer 2/3 of the rat auditory cortex (Atzori et al. 2001), which is the layer at which most of the units in the present study were putatively recorded. Another possibility is that induction of postsynaptic currents reduces neuronal excitability. During visual adaptation in V1 neurons, a Na+-activated K+ current was activated and reduced the firing rate of the neurons for seconds (Sanchez-Vives et al. 2000). Presynaptic depression or postsynaptic activation of a Na+-activated K+ current would mainly occur during rapid afferent firing and suggests that the cause of suppression was often tuned similarly to the cell's excitatory receptive field or related to the cell's level of excitation.

For facilitatory effects, S1-evoked activation of NMDA, metabotropic glutamate, dopamine (Seamans et al. 2001), noradrenaline or acetylcholine (Bakin and Weinberger 1996; Manunta and Edeline 1997; Weinberger 2004) receptors could induce cell depolarization or sustained firing in the cell's inputs. The relatively long onset time and lasting effects of neuromodulators are in agreement with the observations that the expression of the facilitatory effects often occurred later in the S2 response (Fig. 10). Another scenario is that reciprocal excitatory connections between the recorded cell and some of the cells from which it receives input set up a positive feedback loop, which is similar to mechanisms proposed for generating persistent activity in prefrontal and visual cortex (McCormick et al. 2003). This mechanism would permit only short S1–S2 ISIs to maintain the reverberation, which is consistent with the relatively rapid decay of facilitation with increasing ISI (Fig. 12).

Functional implications

At the behavioral level, sound context affects the detection of sounds, the perception of sound changes, and the formation of perceptual streams. For example, the perceptions of consonant sounds were dramatically altered by their vowel context, even when the preceding vowels were separated from the consonants by hundreds of milliseconds (Holt et al. 2002). The effects of stimulus context can be altered by pathological states such as autism, schizophrenia, Alzheimer's disease and language impairment (Alain et al. 2002; Arlinger and Dryselius 1990; Ceponiene et al. 2003; Dubno et al. 2003; Pekkonen et al. 2001, 2002). The S2 stimuli used in this study were often much longer than the probe stimuli used in most masking studies, and the time course of changes in the S2 responses outlasted the reported durations of masking (Kidd and Feth 1982; Moore and Glasberg 1983; Penner 1974; Zwicker 1984). Thus it is important to consider what roles other than masking might be served by the long-lasting effects of S1 on S2.

The prevalent suppression of S2 stimuli by S1 stimuli whose modulation frequencies were within one octave of S2 suggests a means by which an auditory object could be insensitive to small fluctuations, since such fluctuations would not be likely to signify a relevant change in the auditory scene. By contrast, introduction of a new, substantially different stimulus could potentially increase the magnitude of the change in neural response to the new stimulus, either by the same unit or within a small population of neurons. Such an effect was found both neurally and behaviorally for adaptation of orientation tuning in visual cortical neurons (Dragoi et al. 2002). This interpretation also fits with the observation that changes in S2 rate increase as S1 duration increases, since the perceptual strength of an auditory stream grows over time and decays with a time constant >1 s (Beauvois and Meddis 1997; Bregman 1978). Not only can suppression play a role in stream tracking, but facilitation of nonmatched S1 and S2 stimuli could enhance the contrast for stream segregation and work in opposition to the perceptual capture and neural suppression for nearly matched stimuli. As the difference in spectrum, envelope, or loudness between S1 and S2 increases, the likelihood of perceiving 2 streams increases (Moore and Gockel 2002).

Facilitation of S2 responses by S1 is a form of temporal combination selectivity, which can be used to enhance the signaling of stimulus transitions along a particular stimulus dimension, such as modulation frequency or sound level. Whereas temporal combination selectivity in bats is limited to tens of milliseconds (Kawasaki et al. 1988; O'Neill and Suga 1982), temporal combination sensitivity in songbirds can span hundreds of milliseconds (Margoliash 1983), which is comparable to what was observed in this study. Such temporal combination selectivity could be important in representing certain classes of marmoset calls that have components that change hundreds of milliseconds into the call (Wang 2000) or for representing situation-dependent combinations of calls.

GRANTS

This research was supported by a Whitaker Distinguished Postdoctoral Fellowship to E. L. Bartlett and National Institute on Deafness and Other Communication Disorders Grant R01-DC-03180 to X. Wang.

ACKNOWLEDGMENTS

We thank A. Pistorio for animal assistance and S. Gardner, C. DiMattina, D. Bendor, E. Issa, A. Pistorio, and S. Sadagopan for helpful comments on the manuscript.

Footnotes

  • The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

REFERENCES
