Journal of Neurophysiology

Excitatory and Inhibitory Intensity Tuning in Auditory Cortex: Evidence for Multiple Inhibitory Mechanisms

M. L. Sutter, W. C. Loftus


The intensity tuning of excitatory and suppressive domain frequency response areas was investigated in 230 cat primary auditory cortical and 92 posterior auditory field neurons. Suppressive domains were explored using simultaneous 2-tone stimulation with one tone at the best excitatory frequency. The intensity tuning of excitatory and suppressive domains was negatively correlated, supporting the hypothesis that inhibitory sidebands are related to excitatory domain intensity tuning. To further test this hypothesis, we compared the slopes of the edges of suppressive bands to the intensity tuning of excitatory domains. Edges of suppressive bands next to excitatory domains had slopes significantly more slanted toward the excitatory area in neurons with intensity-tuned excitatory domains. This relationship was not observed for suppressive band edges not next to the excitatory domain (e.g., the lower edge of lower suppressive bands). This indicates that intensity tuning ultimately observed in the excitatory domain results from overlapping excitatory and inhibitory inputs. In combination with results using forward masking, our results suggest that there are separate early and late sources of inhibition contributing to cortical frequency response areas, and only the early-stage inhibition contributes to excitatory domain intensity tuning.


Neurons that are tuned for sound intensity (also called nonmonotonic neurons) are common throughout the central auditory system. Intensity tuning is interesting from both a functional and mechanistic perspective. Intensity tuning is potentially important for loudness encoding (Phillips 1993; Phillips and Carr 1998), analyzing complex sound spectra (Phillips 1993; Sutter and Schreiner 1995), spectral analysis in the presence of noise (Phillips 1987, 1990), and envelope sensitivity (Heil 1998; Heil and Irvine 1998; Phillips 1988; Phillips et al. 1995), and must be accounted for in any population code of auditory signal processing (Phillips et al. 1994; Semple and Kitzes 1993b). The mechanisms underlying intensity tuning are intriguing because they must rely on neural inhibition. Inhibition from the CNS is required because all auditory nerve fibers, the input to the central auditory system, have monotonically increasing response versus intensity functions. Therefore to reduce activity preferentially at high intensities requires inhibition.

Intensity-tuned neurons can be found in most auditory brain areas, including the cochlear nucleus (CN) (Greenwood and Maruyama 1965; Young and Brownell 1976), inferior colliculus (IC) (Aitkin 1991; Kuwabara and Suga 1993), medial geniculate body (MGB) (Aitkin and Webster 1972; Rodrigues-Dagaeff et al. 1989; Rouiller et al. 1983), and auditory cortex (AC) (Barone et al. 1996; Erulker et al. 1956; Evans and Whitfield 1964; Phillips and Irvine 1981). Ionophoretic injections of GABAergic and glycinergic receptor agonists and antagonists indicate that intensity tuning is constructed de novo in many auditory areas (e.g., CN: Caspary et al. 1979; Davis and Young 2000; Evans and Zhao 1993; superior olivary complex (SOC): Grothe 1994; Zheng and Hall 2000; IC: Chen and Jen 2000; Faingold et al. 1991; Fuzessery and Hall 1996; Hall 1999; Pollak and Park 1993; Vater et al. 1992; Yang et al. 1992; MGB: Suga et al. 1997; AC: Wang et al. 2000). In primary auditory cortex (A1), ¼ to ½ of the neurons are intensity tuned (e.g., Clarey et al. 1994; Imig et al. 1990; Phillips and Irvine 1981; Sutter and Schreiner 1995), depending on the definition of intensity tuning and the experimental methods used (Heil 1997; Phillips 1988). Although intensity-tuned neurons can be found throughout A1, two areas in A1 have a concentration of intensity-tuned cells resulting in a topographic map of intensity (Heil et al. 1994; Suga and Manabe 1982; Sutter and Schreiner 1995).

The reduction in responsiveness at high intensities in intensity-tuned neurons may be produced by the overlap of non-intensity-tuned excitatory and inhibitory inputs. Greenwood and Maruyama (1965) concluded on the basis of firing patterns of cochlear nucleus neurons that many stimuli evoke both excitatory and inhibitory processes with the net effect on firing determined by the relative timing and strength of the excitation and inhibition.1 Phillips (1988) found evidence that excitatory domain (ED-) intensity tuning of cortical neurons was related to lateral inhibitory processes, and discussed the possibility that ED-intensity tuning emerges from the overlap of an inhibitory input with a non-intensity-tuned excitatory input at high intensities. The schematic in Fig. 1A illustrates how this might occur. The frequency tuning of the level 1 neurons that provide input is sufficiently broad that inhibitory and excitatory inputs overlap in the level 2 neuron. Because the inhibitory synapses are as strong as the excitatory synapses, the inhibition cancels the excitation in the region of overlap, with the net result that the excitatory domain is intensity tuned and the inhibitory domains are not intensity tuned. Figure 1B illustrates a similar scenario, except that the inhibitory synapses are weaker than the excitatory synapses. As a result, level 2 neurons have a non-intensity-tuned excitatory domain flanked by intensity-tuned inhibitory domains that slant away from it. In the third scenario (Fig. 1C) the inputs are tuned narrowly enough that inhibitory and excitatory inputs do not overlap. As a result, neither the excitatory nor the inhibitory domains are intensity tuned in level 2 neurons. Note that it would not be possible to derive intensity tuning for both the excitatory and inhibitory domains. Thus this model predicts that the shapes and intensity tuning of excitatory and inhibitory domains will be correlated in certain ways. An experimental goal of this study is to test this prediction.

fig. 1.

A: possible mechanism creating cells with intensity-tuned excitatory domains and non-intensity-tuned suppressive domains. Level 1 shows broadly frequency tuned input neurons (circles). To the left of the circles are frequency response areas (FRAs) of corresponding neurons. At level 2, broad fast lateral inhibition (red feed forward connections) inhibits responses at the best excitatory frequency (BEF) at high intensities. This creates cells with intensity-tuned excitatory domains, but untuned suppressive domains (gray shading in FRAs to the right of level 2 neurons). B: possible mechanism for creating neurons with inhibitory domains that are intensity tuned at the best inhibitory frequency (BIF), but with excitatory domains that are not intensity tuned. All connections are the same as in A. All that changes is the strength of inhibition. C: possible mechanism for creating cells with excitatory and inhibitory domains not tuned for intensity. All connections are the same as in A. All that has changed is the frequency tuning of inputs. Note: we used lateral inhibition in the model for simplicity. All observed properties could be created with surround inhibition matched in frequency to excitation.

Measuring inhibitory/suppressive response areas of cortical neurons in anesthetized preparations is complicated by low spontaneous firing rates. The only study examining the correlation in the intensity tuning of inhibitory and excitatory domains in A1 neurons is Calford and Semple (1995). They used a forward-masking stimulus paradigm, in which an initial tone is followed by a second tone in the neuron's excitatory domain, to reveal persistent inhibition. It was observed that excitatory and inhibitory domain properties were correlated, but not in the manner predicted by the model described above. For example, neurons with intensity-tuned excitatory domains tended to have forward-masked inhibitory domains that were also intensity tuned and that did not account for the excitatory tuning. The authors hypothesized that the inhibition responsible for creating intensity-tuned excitatory domains was distinct from forward-masking inhibition. Another goal of this study is to expand on this idea, and propose that ED-intensity-tuned cortical neurons display at least 2 types of inhibition: 1) a short latency, short-acting inhibition that abolishes responses at high intensities and thereby carves out intensity-tuned excitatory domains, and 2) a longer latency, longer-lasting lateral inhibition derived from local clusters of intensity-tuned cells that does not contribute to ED-intensity tuning.

Short latency inhibition relative to excitation may be necessary to produce ED-intensity tuning at higher levels of the auditory system, where the responses are primarily phasic, rather than sustained. Most cat A1 neurons produce only a short latency phasic onset response where all spikes fall in a narrow time probability window (about 5–10 ms). Furthermore, for many ED-intensity-tuned A1 neurons, this onset response is completely abolished by inhibition at high intensities (e.g., Phillips et al. 1985, 1995). For inhibitory postsynaptic potentials (IPSPs) to create the complete cessation of short latency phasic onset responses at high intensities, they must arrive nearly simultaneously with excitatory postsynaptic potentials (EPSPs). Otherwise, part of the onset response could occur before the inhibition can exert an effect. This stands in contrast to earlier stations of the auditory system where intensity tuning can result from inhibition later in a sustained response (e.g., Ramachandran et al. 1999; Rhode and Kettner 1987; Rhode and Smith 1986; Rhode et al. 1983; Shofner and Young 1985; Spirou et al. 1999). Thus in the model of Fig. 1, level 1 neurons are connected to level 2 neurons by a fast, monosynaptic inhibitory pathway and a delayed, disynaptic excitatory pathway. The inhibition that underlies ED-intensity tuning in A1 neurons does not necessarily reside in A1, but may be derived from any area in the auditory pathway. Wherever the source of the inhibition is located, however, it must have a short latency compared with excitatory inputs in the same region. At the same time, one cannot automatically assume that ED-intensity tuning in A1 neurons is inherited from ED-intensity-tuned neurons at earlier stages of the pathway.

The posterior auditory field (PAF, also called “field P”) is a cortical area where roughly 80% of the neurons have intensity-tuned excitatory domains (Heil and Irvine 1998; Kitzes and Hollrigel 1996; Phillips and Orman 1984). Despite the prevalence of ED-intensity tuning in PAF, measurements of inhibitory/suppressive domains and their relationship to ED-intensity tuning in PAF have been lacking. Although PAF neurons have different spectral and temporal properties than those of A1 neurons (e.g., Loftus and Sutter 2001b), Phillips et al. (1995) hypothesized that mechanisms responsible for ED-intensity tuning in PAF neurons may be similar to those for ED-intensity tuning in A1 neurons. If this is correct, then the correlations between the suppressive and excitatory domain shapes that we find in A1 should also be observed for PAF.

In the present study, we measure excitatory and suppressive domains of A1 and PAF single neurons. Simultaneous 2-tone stimulation is used to reveal the effects of short latency inhibition. Quantitative measures of intensity tuning and other aspects of response domain shape are derived. Correlations between the excitatory and suppressive domains are examined and compared with the model predictions, and the model is expanded to account for both simultaneous and forward-masking data.


Surgical preparation

We recorded single neurons from A1 in 3 left, and 23 right hemispheres of 25 young adult cats, and from PAF in 6 left and 2 right hemispheres of 8 young adult cats. Surgical preparation, stimulus delivery, and recording procedures for A1 are the same as those from a previous study (Sutter and Schreiner 1991), with exceptions noted below. Stimulus delivery and recording procedures for PAF are the same as those in Loftus and Sutter (2001b), with exceptions noted below.

Briefly, anesthesia was induced with an intramuscular (im) injection of ketamine hydrochloride (10 mg/kg) and acetylpromazine maleate (0.28 mg/kg). After venous cannulation, an initial dose of sodium pentobarbital (30 mg/kg) was administered. Animals were maintained at a surgical level of anesthesia with a continuous infusion of sodium pentobarbital (2 mg kg–1 h–1) and if necessary, with supplementary intravenous injections. Lactated Ringer solution was injected through a separate catheter for a total fluid volume of 3.5–4.0 ml kg–1 h–1. The cats were also given dexamethasone sodium phosphate (0.14 mg/kg, im) to prevent brain edema, and atropine sulfate (1 mg, im) to reduce salivation. The temperature of the animals was monitored with a rectal temperature probe and maintained between 37.5 and 38.0°C by means of a heated water blanket with feedback control.

Three-point head fixation was achieved with palatal-orbital restraint, leaving the external meati unobstructed. The temporal muscle was retracted and the lateral cortex exposed by craniotomy. The dura overlying the middle and/or posterior ectosylvian gyrus was removed, the cortex was covered with silicone oil, and a video image of the surface vasculature was taken to record the electrode penetration sites. If brain pulsation interfered with stable single-unit recording, a semiclosed system was used consisting of a wire mesh placed over the craniotomy. The space between the grid and cortex was filled with a 1% solution of clear agarose. The agarose-filled grid/chamber diminished pulsation of the cortex and provided a fairly unobstructed view of identifiable locations across the exposed cortical surface.

To confirm electrode locations, near the end of some experiments (72–120 h of recording) penetrations were marked with 2 or 3 electrolytic lesions (about 10 μA DC; electrode negative; about 10 s). The cats were transcardially perfused with physiological saline followed by 4% paraformaldehyde. The brains were blocked and stored in 10% sucrose. Frozen sections were cut in the horizontal plane at a thickness of 50–60 μm, counterstained with cresyl violet, and examined under a light microscope. A1 and PAF borders were defined from the established tonotopic borders (Merzenich et al. 1975; Reale and Imig 1980). Additional confirmation that recordings were in PAF was provided by retrograde tracer experiments in 2 animals. In both cases, several tracer injections were made along the dorsoventral extent of putative PAF, as defined by the physiological criteria of the present study. The resulting pattern of labeling within the medial geniculate body (MGB) matched the expected distribution of thalamic projections to PAF (Morel and Imig 1987; Rodrigues-Dagaeff et al. 1989).

Stimulus generation and delivery

Experiments were conducted in double-walled sound-shielded rooms (IAC). Stimuli were generated by a microprocessor (TMS32010 or TDT; 16 bit D/A converter at 120 kHz; low-pass filter of 96 dB/octave at 15, 35, or 50 kHz). A pair of passive attenuators provided attenuation.

For A1 neurons, sounds were delivered with insert speakers. We used calibrated headphones (STAX 54) enclosed in small chambers that were connected to sound-delivery tubes sealed into the acoustic meati. This sound-delivery system was calibrated with a sound level meter (Brüel and Kjäer) and distortions were measured either with waveform analyzers or a computer acquisition system. The frequency response of the system was essentially flat (≤14 kHz) and did not have major resonances deviating more than ±6 dB from the average level. Above 14 kHz, the output rolled off at a rate of 10 dB/octave. Because our calibration ignores potential influences of the outer ear, measurements of the slopes of frequency tuning curves might be affected, particularly at high frequencies. Slope measurements were adjusted for the transfer function roll-off, but this does not include the influences of notches in the head-related transfer function. However, because we sampled neurons with a wide range of best frequencies, this likely changed the variance of inverse slope measures and less likely the mean. Harmonic distortion was better than 55 dB below the primary.

For PAF, auditory stimuli were presented “near field” (speaker distance to the center of the head 3.0 ft) by calibrated speakers: a Radio Shack Optimus Pro-7AV and a Radio Shack dual-radial horn tweeter (cat. no. 40-1377) with a crossover circuit at 7 kHz. A Radio Shack MPA-200 amplifier drove the speakers. Speakers were placed at ±90° azimuth and 0° elevation relative to the animal and oriented directly toward the pinna contralateral to the recorded hemisphere. The sound system was calibrated with a sound level meter (Brüel and Kjäer type 2231) with a probe microphone positioned near the pinna. The frequency response of the system was essentially flat from 0.5 to 40 kHz except for 2 notches of <14 dB (peak-to-peak) centered at 1.2 and 2.8 kHz; otherwise, major resonances deviated less than ±6 dB from the average level. Above 40 kHz, the output rolled off at a rate of 37 dB/octave. Harmonic distortion was <60 dB below the primary.

Pure-tone bursts were 50 ms in duration with a 3-ms rise/fall time. To assess suppressive domains, 2-tone bursts were constructed digitally by combining 2 simultaneous pure tones with the same shaping as the pure-tone bursts. The interstimulus interval was 400–1,200 ms for pseudo-randomly presented pure tones and 600–2,000 ms for pseudo-randomly presented tone pairs.
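The burst construction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text gives the 50-ms duration, the 3-ms rise/fall time, and the 120-kHz D/A rate, but not the ramp shape, so a linear ramp is assumed here. The function names and the linear amplitude scale factors are hypothetical, and in the experiments level was set by passive attenuators rather than in software.

```python
import math

def tone_burst(freq_hz, scale=1.0, dur_s=0.050, ramp_s=0.003, fs=120_000):
    """Return the samples of one pure-tone burst with rise/fall ramps."""
    n = round(dur_s * fs)        # 6000 samples for 50 ms at 120 kHz
    n_ramp = round(ramp_s * fs)  # 360 samples for a 3-ms ramp
    samples = []
    for i in range(n):
        # Linear onset/offset envelope (ramp shape is an assumption).
        env = min(1.0, (i + 1) / n_ramp, (n - i) / n_ramp)
        samples.append(scale * env * math.sin(2 * math.pi * freq_hz * i / fs))
    return samples

def two_tone_burst(bef_hz, bef_scale, var_hz, var_scale, **kwargs):
    """Digitally sum two simultaneous tones that share the same shaping."""
    bef = tone_burst(bef_hz, bef_scale, **kwargs)
    var = tone_burst(var_hz, var_scale, **kwargs)
    return [a + b for a, b in zip(bef, var)]
```

Keeping the two scale factors at or below 0.5 keeps the summed waveform within the ±1 range of this sketch.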

Recording procedure

Parylene-coated tungsten microelectrodes (Microprobe) with impedances of 1–10 MΩ at 1 kHz were introduced into the auditory cortex with a hydraulic microdrive. For A1, all penetrations were approximately orthogonal to the brain surface; for PAF, penetrations were made into the caudal bank of the posterior ectosylvian sulcus, to depths ≤3,300 microns, and located at various dorsoventral positions and distances from the sulcus (200–1,200 microns). Histological verification from several animals indicated that lesioned recording sites from A1 were from cortical layers 3 and 4; from PAF the recording sites were more distributed throughout the cortical layers. Neuronal activity of single units was amplified, band-pass filtered, and monitored on an oscilloscope and an audio monitor. During early experiments, action potentials from individual neurons were isolated from background noise with a window discriminator (BAK Electronics) and the time of each action potential was saved to disk. During later experiments, on-line single-unit isolation was performed with software, and spike waveforms were saved to disk to allow for additional off-line discrimination after the experiments.

Single-tone frequency response areas

Frequency response areas (FRAs) were obtained for each recorded unit. These procedures are described in Sutter et al. (1999) and are very similar to those of Evans (1974). Briefly, we presented 675 tone bursts in a pseudo-random sequence of different frequency/intensity combinations selected from 15 intensities and 45 frequencies. The intensities were spaced 5 dB apart for a total of 75 dB presented range. The frequency range covered by 45 steps ranged between 2.0 and 5.6 octaves, depending on the estimated width of the frequency-tuning curve. Typically we used a 3-octave range that provided 0.067-octave resolution between frequency samples. In later experiments the number and range of the tested frequencies could be more flexibly varied to allow the investigators to obtain a high-resolution response area covering the full range of the neuron's response.
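As a rough illustration of the stimulus set, the following Python sketch builds the 15 × 45 grid and shuffles it. The step size is taken as range/45 so that a 3-octave range yields the stated 0.067-octave resolution. Centering the grid on a chosen frequency, the lowest level of 0 dB, the seed, and the function name are all assumptions made for the example and are not taken from the study.

```python
import random

def fra_grid(center_hz, octave_range=3.0, n_freqs=45, n_levels=15,
             level_step_db=5.0, min_level_db=0.0, seed=0):
    """Return a pseudo-randomly ordered list of (frequency, level) pairs."""
    step_oct = octave_range / n_freqs  # 3 / 45 = 0.067 octave
    # Frequency samples centered on center_hz (centering is an assumption).
    offsets = [(k - (n_freqs - 1) / 2) * step_oct for k in range(n_freqs)]
    freqs = [center_hz * 2 ** o for o in offsets]
    levels = [min_level_db + j * level_step_db for j in range(n_levels)]
    grid = [(f, lev) for f in freqs for lev in levels]
    random.Random(seed).shuffle(grid)  # pseudo-random presentation order
    return grid
```

Calling `fra_grid(8000.0)` returns 675 unique frequency/level combinations, one presentation each, matching the single-pass count given in the text.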

Because of the time constraints of single-unit recording, we characterized FRAs based on as few stimulus repetitions as possible. If a response was evoked by more than about 50% of the stimuli inside of each excitatory band, the curve was deemed well defined. If after one presentation per frequency/intensity combination, the resulting FRA was not well defined, the process was repeated with the same 675 stimuli, and the resulting evoked activity was added to the first. If necessary, the FRA recording procedure was repeated up to 5 times. This method has provided statistically reliable characterizations of cells based on repeated-measure controls (see Table 3 of Sutter and Schreiner 1991).

Two-tone frequency response areas

For most single neurons recorded in cat auditory cortex under barbiturate anesthesia, spontaneous activity is very low, and does not provide sufficient activity to judge background suppression by a stimulus. Therefore we used a 2-tone simultaneous masking paradigm to measure response suppression. For 2-tone FRAs, 675 different tone pairs were presented. For each tone pair, one component (the “BEF tone” or “probe-tone”) was at the cell's best excitatory frequency (BEF) with energy just above response threshold. The BEF was determined from a single-tone FRA. The second component (the “variable tone” or “masker tone”) had a frequency and intensity chosen using the same pseudo-random procedure as described for the single-tone FRA. The purpose of this element was to determine which frequency/intensity ranges suppressed the activity associated with the fixed BEF tone. If the response to the fixed BEF tone was not reliable (e.g., the mean of the BEF tone activity, in spikes per presentation, was less than the SD or the probability of response <0.25), the procedure was repeated with the same 675 tone pairs. The resulting evoked activity of multiple presentations was then added. Because we presented over 675 stimuli that contained the fixed BEF tone, habituation and/or adaptation sometimes caused the response to decrease over time. In those cases, we repeated the 2-tone FRA several times.
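The example reliability criterion for the BEF tone (mean response below the SD, or response probability below 0.25) can be written as a small check. This is a minimal sketch, not the authors' implementation: the text gives these criteria only as examples, the use of the population SD is an assumption, and the function name is hypothetical. The input is whatever list of per-presentation spike counts is taken to represent the BEF-tone response.

```python
from statistics import mean, pstdev

def bef_response_is_reliable(spike_counts, min_prob=0.25):
    """Return True if the BEF-tone response passes both example criteria.

    spike_counts: spikes per presentation attributed to the BEF tone.
    """
    avg = mean(spike_counts)
    sd = pstdev(spike_counts)  # population SD (an assumption)
    # Probability of response: fraction of presentations with >= 1 spike.
    prob = sum(1 for c in spike_counts if c > 0) / len(spike_counts)
    return avg >= sd and prob >= min_prob
```

A unit that fails the check would have the 675 tone pairs presented again, with the evoked activity added to the earlier pass.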

Suppressive bands were identified in the 2-tone FRAs using the methods of Sutter et al. (1999). A suppressive band is defined as a contiguous frequency/intensity space on the 2-tone FRA where fixed BEF tone activity was reduced by 50% or more by the variable tone. Lower suppressive bands by definition were on the low-frequency side of the excitatory domain, and upper bands were on the high-frequency side. If there were multiple lower or upper suppressive bands (Sutter et al. 1999), only the lower and upper bands closest to the excitatory domain were used. After each band was identified, the best inhibitory frequency (BIF)2 and threshold were determined for each suppressive sideband.
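The band definition above (a contiguous region where the BEF-tone response is reduced by 50% or more) can be illustrated with a simple flood fill over the 2-tone FRA. The sketch below is an assumption-laden illustration rather than the method of Sutter et al. (1999): it treats cells as contiguous when they share an edge (4-connectivity), takes the probe-alone rate as a given input, and uses hypothetical names throughout.

```python
def suppressive_bands(fra, probe_rate, criterion=0.5):
    """Find contiguous suppressed regions in a 2-tone FRA.

    fra[i][j]: spikes per presentation at level index i, frequency index j.
    Returns a list of bands, each a set of (level_index, freq_index) cells.
    """
    n_rows, n_cols = len(fra), len(fra[0])
    limit = (1 - criterion) * probe_rate  # response after a 50% reduction
    suppressed = [[fra[i][j] <= limit for j in range(n_cols)]
                  for i in range(n_rows)]
    seen, bands = set(), []
    for i in range(n_rows):
        for j in range(n_cols):
            if not suppressed[i][j] or (i, j) in seen:
                continue
            stack, band = [(i, j)], set()
            seen.add((i, j))
            while stack:  # flood fill over edge-sharing neighbors
                r, c = stack.pop()
                band.add((r, c))
                for rr, cc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                    if (0 <= rr < n_rows and 0 <= cc < n_cols
                            and suppressed[rr][cc] and (rr, cc) not in seen):
                        seen.add((rr, cc))
                        stack.append((rr, cc))
            bands.append(band)
    return bands
```

Sorting the returned bands by their frequency indices relative to the BEF column would then separate lower from upper bands.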

Determination of strength of excitatory domain intensity tuning

We used the monotonicity ratio to determine the strength of ED-intensity tuning (e.g., Sutter and Schreiner 1995). The excitatory monotonicity ratio (eMR) is based on the response versus intensity function near the unit's characteristic frequency [CF, the most sensitive frequency; also in this study called the best excitatory frequency (BEF)]. Response versus intensity functions for single-tone excitatory bands (Fig. 2, A–C) were created as in Sutter and Schreiner (1995). At each intensity level, the number of action potentials from a ¼-octave bin around the unit's BEF and a 15-dB-wide intensity bin were summed. One-quarter octave usually constituted 4 different frequencies and 15 dB usually covered 3 levels of intensity. This provided a minimum of 12 different stimulus presentations per data point in the response versus intensity functions. Only units that were recorded from a minimum of 45 dB above threshold were used to analyze eMR. The eMR is the number of spikes elicited at the highest intensity divided by the number of spikes at the maximum of the response versus intensity function

$$\mathrm{eMR} = \frac{\mathrm{Spikes}_{\text{highest intensity}}}{\mathrm{Spikes}_{\text{maximum}}} \tag{1}$$

Therefore a cell that fired maximally at the highest tested intensity had an eMR of 1 (Fig. 2A); a cell that was completely inhibited at the highest intensities had an eMR of 0 (similar to Fig. 2C).
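The eMR as defined in words above (spikes at the highest intensity divided by spikes at the maximum of the response versus intensity function) reduces to a one-line ratio. The sketch below assumes the spike counts have already been pooled over the ¼-octave and 15-dB bins and are ordered from lowest to highest intensity. The function name and the example counts are hypothetical.

```python
def excitatory_monotonicity_ratio(spikes_by_level):
    """eMR: response at the highest intensity over the peak response.

    spikes_by_level: pooled spike counts, lowest to highest intensity.
    """
    peak = max(spikes_by_level)
    if peak <= 0:
        raise ValueError("no driven response; eMR is undefined")
    return spikes_by_level[-1] / peak

# Hypothetical response versus intensity functions:
print(excitatory_monotonicity_ratio([0, 2, 5, 9, 12]))  # monotonic: 1.0
print(excitatory_monotonicity_ratio([0, 4, 10, 8, 5]))  # weakly tuned: 0.5
print(excitatory_monotonicity_ratio([0, 4, 10, 6, 0]))  # tuned: 0.0
```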

fig. 2.

Relationship of monotonicity and strength ratios to 6 response vs. intensity functions. A lower monotonicity or strength ratio means a function is more sharply tuned. On the left (A–C) the relationship of excitatory monotonicity ratio (eMR) to single tone response vs. intensity functions are shown. Single-tone responses at the best excitatory frequency (BEF) were used. The relationship is shown for a neuron with a monotonically increasing function (A), intensity-tuned function (C), and a function with weak intensity tuning (B). On the right (D–F) relationship of the inhibitory strength ratio (iSR) and inhibitory monotonicity ratio (iMR) to 2-tone response vs. intensity functions are shown for 3 neurons. In this case the response vs. intensity function is taken from simultaneous 2-tone stimulation with one tone at each cell's BEF and one at the cell's best inhibitory frequency (BIF). The intensity of the BIF component is varied and the BEF tone intensity is fixed at 5–20 dB above threshold. Dashed lines represent response to the BEF tone presented by itself. D: cell with monotonically increasing inhibitory strength as function of intensity. Note that monotonically increasing inhibitory strength corresponds to monotonically decreasing function because a larger decrease in response corresponds to stronger inhibition. F: cell with an intensity-tuned suppressive domain. E: intermediate case.

Determination of inhibitory strength and intensity tuning

Response versus intensity functions (Fig. 2, D–F) were also created for suppressive bands from the 2-tone FRA. At each level of the variable component of the 2-tone stimulus, the number of action potentials from a ¼-octave bin around each band's BIF and a 15-dB-wide intensity bin were summed. We derived two metrics from the suppressive/inhibitory domain response versus intensity function: the inhibitory strength ratio (iSR) and the inhibitory monotonicity ratio (iMR).

The iSR is 1 minus the ratio between the number of spikes elicited at the highest intensity of the variable component of the 2-tone stimulus and the number of spikes at the maximum of the suppressive/inhibitory response versus intensity function

$$\mathrm{iSR} = 1 - \frac{\mathrm{Spikes}_{\text{highest intensity}}}{\mathrm{Spikes}_{\text{maximum}}} \tag{2}$$

Therefore units that were strongly inhibited at high intensities would have iSR values near 1 (Fig. 2D) and neurons with weak inhibition at higher intensities would have iSR values near 0 (Fig. 2F).

The iMR is more closely analogous to the eMR measure because it is the amount of suppression at the highest intensity divided by the maximal suppression in the suppressive/inhibitory domain response versus intensity function

$$\mathrm{iMR} = \frac{\mathrm{Spikes}_{\text{probe}} - \mathrm{Spikes}_{\text{highest intensity}}}{\mathrm{Spikes}_{\text{probe}} - \mathrm{Spikes}_{\text{minimum}}} \tag{3}$$

$\mathrm{Spikes}_{\text{probe}}$ is the response to the probe tone alone in units of spikes per presentation (dotted lines in Fig. 2, D–F). This is estimated from the 90 presentations to the lowest two “masker” intensities used in the 2-tone FRA. $\mathrm{Spikes}_{\text{highest intensity}}$ is as in Eq. 2 and $\mathrm{Spikes}_{\text{minimum}}$ is the spikes per presentation at the minimum of the suppressive response versus intensity function, where suppression is maximal. Suppression is estimated by subtracting the 2-tone response from $\mathrm{Spikes}_{\text{probe}}$. Suppressive bands with monotonically decreasing firing rate (and therefore monotonically increasing inhibitory strength) will have iMR values near 1 (Fig. 2D), and bands with intensity-tuned suppressive domains will have values closer to 0 (Fig. 2F). In cases where the highest intensity response is greater than the probe response, the iMR is clipped to zero rather than allowing it to take on negative values (e.g., Fig. 2F). Only units that were recorded from a minimum of 40 dB above suppression threshold were used to analyze iSR and iMR.

To directly contrast these two measures, the iSR reflects the magnitude of the inhibition at the highest tested intensity, regardless of whether the suppressive-domain response versus intensity function is intensity tuned; on the other hand, the iMR is sensitive to the intensity tuning of the suppressive domain response versus intensity function, but is normalized for the strength of inhibition at the highest intensity. For example, if the response versus intensity function in Fig. 2D reached a minimum of 0.6 spikes per presentation rather than 0 spikes per presentation, the iSR would be reduced from 1 to about 0.5, but the iMR would remain unchanged at 1. Conversely, if the minimum of the suppressive response versus intensity function in Fig. 2E had reached 0 at +20 dB, thereby forming a deeper trough, the iSR would remain unchanged at 0.68 but the iMR would get slightly smaller (from 0.83 to 0.72).
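The contrast between the two measures can be made concrete with a short Python sketch that follows the verbal definitions: iSR is 1 minus the response at the highest masker intensity over the maximum response, and iMR is the suppression at the highest intensity over the maximal suppression, clipped at zero. The function names and the response values are hypothetical and are not read from Fig. 2.

```python
def inhibitory_strength_ratio(rates):
    """iSR from a 2-tone response versus intensity function.

    rates: spikes per presentation, lowest to highest masker intensity.
    """
    peak = max(rates)
    if peak <= 0:
        raise ValueError("no response; iSR is undefined")
    return 1.0 - rates[-1] / peak

def inhibitory_monotonicity_ratio(rates, probe_rate):
    """iMR: suppression at the highest intensity over maximal suppression."""
    max_suppression = probe_rate - min(rates)
    if max_suppression <= 0:
        raise ValueError("no suppression; iMR is undefined")
    # Clip to zero when the highest-intensity response exceeds the probe.
    return max(0.0, (probe_rate - rates[-1]) / max_suppression)

probe = 1.2  # hypothetical probe-alone response, spikes per presentation
full = [1.2, 1.1, 0.8, 0.4, 0.0, 0.0]   # suppression down to 0
floor = [1.2, 1.1, 0.9, 0.7, 0.6, 0.6]  # suppression stops at 0.6
for rates in (full, floor):
    print(inhibitory_strength_ratio(rates),
          inhibitory_monotonicity_ratio(rates, probe))
```

With these made-up values, raising the floor from 0 to 0.6 spikes per presentation halves the iSR (1.0 to 0.5) while the iMR stays at 1.0, which mirrors the first example in the paragraph above.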

The model of Fig. 1A predicts that both absolute level of inhibition (red lines in Fig. 1, A and B) and intensity tuning are important contributors to cortical responses. There is no reason therefore to suspect that one measure is better than the other a priori. Accordingly, for simplicity we will only report results on iMR because it is the most analogous to the eMR measure, except when iMR and iSR differences are substantial enough to influence interpretations.


Descriptive statistics of suppressive and excitatory domain intensity tuning

Although the properties of excitatory domain (ED) intensity tuning in A1 and PAF have been reported by several investigators, the properties of inhibitory domain intensity tuning have yet to be reported quantitatively. In this section we report the descriptive statistics of ED and inhibitory domain intensity tuning in A1 and PAF and compare the results for the two cortical areas.

A plurality of A1 cells had monotonically increasing response versus intensity functions, whereas the majority of PAF cells were intensity tuned for BEF tones. Strongly intensity-tuned excitatory domains [excitatory monotonicity ratio (eMR) ≤0.5] were observed in 33.0% (76/230) of the recorded A1 cells, intermediately intensity-tuned excitatory domains (0.5 < eMR < 0.8; Sutter and Schreiner 1995) occurred in 29.1% (67/230), and monotonically increasing response versus intensity functions (eMR ≥0.8) were observed in 37.8% (87/230) of A1 neurons (Table 1). The distribution was highly nonnormal with a median eMR of 0.685 (Fig. 3). PAF had a higher incidence of ED-intensity tuning with 58.7% (54/92), 23.9% (22/92), and 17.4% (16/92) of PAF neurons having strongly, intermediately, and untuned excitatory domains, respectively. The median eMR of the relatively flat distribution was 0.44 (Fig. 3). The differences between A1 and PAF in the proportions of strongly, intermediately, and untuned cells were significant (χ², P < 0.0001), as were the differences in median eMR (Mann–Whitney, P < 0.0001).
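The three eMR categories and the comparison of category counts between fields can be checked with a short Python sketch. The text does not say exactly how the χ² test was set up, so a Pearson χ² test on the 2 × 3 table of reported counts is assumed here. The function names are hypothetical. For 2 degrees of freedom the upper-tail probability is exp(−χ²/2), so no statistics library is needed.

```python
import math

def classify_emr(emr):
    """Assign an eMR value to one of the three categories used in the text."""
    if emr <= 0.5:
        return "strong"
    if emr < 0.8:
        return "intermediate"
    return "monotonic"

def chi_square_2x3(row_a, row_b):
    """Pearson chi-square for a 2 x 3 table; returns (statistic, P)."""
    col_totals = [a + b for a, b in zip(row_a, row_b)]
    n_a, n_b = sum(row_a), sum(row_b)
    n = n_a + n_b
    chi2 = 0.0
    for observed_row, row_total in ((row_a, n_a), (row_b, n_b)):
        for observed, col_total in zip(observed_row, col_totals):
            expected = row_total * col_total / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2, math.exp(-chi2 / 2)  # exact tail probability for 2 df

a1_counts = [76, 67, 87]   # strong, intermediate, monotonic (of 230)
paf_counts = [54, 22, 16]  # strong, intermediate, untuned (of 92)
chi2, p = chi_square_2x3(a1_counts, paf_counts)
print(round(chi2, 1), p < 0.0001)
```

Under that assumption the statistic comes out near 19.9, which agrees with the reported P < 0.0001.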

table 1.

Comparison of intensity tuning between PAF and A1

fig. 3.

Histograms of eMR and iMR distributions for different bands in A1 and PAF.

For suppressive bands, the distributions of intensity tuning and strength measures are more bimodally distributed with an extremely high proportion of suppressive domains that were not tuned for intensity with inhibitory strength ratios (iSRs) and inhibitory monotonicity ratios (iMRs) near 1 (Fig. 3). A summary of the intensity tuning of suppressive bands abutting the excitatory domain can be seen in Table 1. The median iMR and iSR for lower and upper bands always indicated stronger inhibition at higher intensities in PAF than in A1; however, the differences between A1 and PAF were small and never reached significance except for the upper band iMR. We also looked at whether the upper band was more tuned than the lower band; once again, differences in the median did not reach significance, although for A1 iMR, A1 iSR, PAF iMR, and PAF iSR median values were all higher for upper than for lower bands. The strong bimodal tendency for these metrics around values of 1 and 0 probably interfered with making statistical comparisons between A1 and PAF and upper and lower bands.

The ED-intensity tuning of a given neuron may have impacted our ability to measure its suppressive domain tuning. If a neuron with strong ED-intensity tuning responded weakly or habituated to repeated presentations of the BEF tone as a result of heightened inhibition, then it might not be possible to collect a 2-tone FRA for that neuron. This is supported by Table 2, which shows that the subset of neurons for which a 2-tone FRA could not be derived had stronger ED intensity tuning than the rest of the sample. This was noticeable for PAF, where the median monotonicity ratio was significantly higher for the neurons with 2-tone FRAs than for those without (0.51 vs. 0.36, P < 0.05, Mann–Whitney test). In A1, the difference was not significant, consistent with less habituation in A1.

table 2.

Relationship of intensity tuning to the ability to collect a 2-tone FRA

In summary, these data indicate that ED-intensity tuning is stronger in PAF than in A1. There may also be overall differences in suppressive domain tuning between A1 and PAF that could be related to differences in the incidence of ED-intensity tuning in these 2 fields. The ability to detect this is weakened by a bias against recording 2-tone FRAs from strongly inhibited PAF neurons. We now consider neurons for which both 2-tone and single-tone FRAs were obtained (to mitigate this sampling bias effect), and ask whether the intensity tuning of the excitatory domain is correlated with that of the suppressive domains.

Relationship of excitatory domain (ED-) intensity tuning to suppressive domain intensity tuning in A1

The model of Fig. 1, A and B predicts an inverse relationship between excitatory and suppressive domain intensity tuning; that is, neurons with intensity-tuned excitatory domains should have non-intensity-tuned suppressive domains and vice versa. A Spearman rank test demonstrated this relationship in A1 for both lower and upper suppressive bands (P < 0.05, Table 3).
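For readers unfamiliar with the rank test, a minimal sketch of Spearman's ρ follows. Tied values receive the mean of their ranks, which matters here because many iMR values cluster near 1. The function names and the example values are hypothetical and are not data from this study; the sketch returns only ρ and does not compute the associated P value.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # ranks are 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the tie-corrected ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = sum((a - mean_x) ** 2 for a in rx) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in ry) ** 0.5
    return cov / (sd_x * sd_y)

# Hypothetical values for illustration only (not data from the study).
emr = [0.2, 0.4, 0.6, 0.9, 1.0]
imr_lower = [1.0, 1.0, 0.7, 0.3, 0.1]
rho = spearman_rho(emr, imr_lower)  # negative: tuned EDs pair with untuned suppression
```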

table 3.

Relationship of eMR to inhibitory measures

Although the ρ values for all the Spearman tests were negative, indicating an inverse relationship, a linear trend was difficult to determine because of the nonuniform distribution of eMR and iMR values (Fig. 4). One of the most striking aspects of Fig. 4 was the paucity of cells with both intensity-tuned suppressive and excitatory domains. To better view this effect (in light of the many overlapping points in the scatter plot) we assigned each neuron to one of 4 quadrants and made a bubble histogram. For simplicity, we will describe the method only for iMRlower versus eMR (Figs. 4 and 5A, left), although the same method was used for iMRupper vs. eMR, iSRlower vs. eMR, and iSRupper vs. eMR. Cells with eMR and iMRlower both <0.5 (i.e., with intensity-tuned suppressive and excitatory domains) lie in the lower left quadrant. Cells with eMR and iMRlower both >0.5 lie in the upper right quadrant, and so forth. The size of each circle is proportional to the percentage of neurons with the corresponding joint values of eMR and iMRlower. A1 neurons with both non-intensity-tuned suppressive and excitatory domains were common (Fig. 5, A and B, upper right quadrants of each plot) in contrast to the uncommonly encountered neurons with intensity-tuned suppressive and excitatory domains (lower left quadrants). This suggests that the negative correlation may be chiefly attributed to a lack of cells with dually intensity-tuned excitatory and suppressive domains. Alternatively, the apparently large percentage of neurons with non-intensity-tuned excitatory and suppressive domains might be attributable to the high proportion of untuned neurons in the individual distributions (Fig. 3). It might even be that the percentages of neurons with non-intensity-tuned excitatory and suppressive domains (upper right quadrant) are smaller than predicted by chance pairings of eMR and iMR values taken from the individual distributions.
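The quadrant assignment described above can be sketched as follows. This is a minimal illustration, assuming that eMR is plotted on the x-axis and iMR on the y-axis, and that a value of exactly 0.5 is counted with the upper or right quadrant (the text specifies only <0.5 and >0.5). The function names and example values are hypothetical.

```python
from collections import Counter

QUADRANTS = ("lower left", "lower right", "upper left", "upper right")

def quadrant(emr, imr, cut=0.5):
    """Quadrant label for one neuron (assumed axes: x = eMR, y = iMR)."""
    horizontal = "left" if emr < cut else "right"
    vertical = "lower" if imr < cut else "upper"
    return f"{vertical} {horizontal}"

def quadrant_percentages(emrs, imrs, cut=0.5):
    """Percentage of neurons in each quadrant; these set the bubble sizes."""
    counts = Counter(quadrant(e, i, cut) for e, i in zip(emrs, imrs))
    n = len(emrs)
    return {q: 100.0 * counts.get(q, 0) / n for q in QUADRANTS}

# Hypothetical values for illustration only (not data from the study).
emrs = [0.2, 0.3, 0.9, 1.0, 0.8]
imrs = [1.0, 0.9, 0.2, 1.0, 1.0]
pct = quadrant_percentages(emrs, imrs)
```

In this toy sample no neuron falls in the lower left quadrant, mimicking the paucity of cells with both intensity-tuned excitatory and suppressive domains.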

fig. 4.

Relationship of suppressive to excitatory domain intensity tuning in A1. Scatter plots are shown for lower and upper suppressive bands vs. eMR. Every point corresponds to one neuron. Statistical significance of Spearman rank correlation test is shown in top right-hand corner of each plot.

fig. 5.

Bubble histogram demonstrating nonlinear correlation of suppressive to excitatory domain intensity tuning in A1. Size of bubbles (circles) corresponds to the percentage of cells with combinations of monotonicity and strength ratios given by the intersection of x- and y-axes. A: A1 iMR vs. eMR for upper and lower suppressive bands. Histograms are binned into 4 quadrants by ratios of >0.5 or <0.5. Lower left quadrant: neurons with low eMR and iMR values, that is, cells that have intensity-tuned excitatory and suppressive domains. Percentages of cells in each quadrant were statistically compared with random pairings from individual eMR and iMR distributions with a Monte Carlo analysis. If the proportion of cells falls significantly below the expectation with random pairings at P < 0.05, an * is placed in the corner. If significance is at the P < 0.01 level, ** is used. If the proportion of cells is significantly above the expectation with random pairing at the P < 0.05 level, a ♦ is placed in the corner, ♦ ♦ for P < 0.01. B: same plot for iSR in A1. C and D: similar plots for posterior auditory field (PAF).

Therefore we decided to determine the nature of the relationship between excitatory and suppressive domain intensity tuning more precisely with a Monte Carlo analysis. For simplicity we will describe only the specific analysis of iMRlower versus eMR (Fig. 5A, left), although analogous analyses were performed for all other inhibitory metrics. For each iMRlower value we randomly assigned an eMR value (without replacement) from one of the eMR values in the actual data. In this way the individual distributions of iMRlower and eMR were not changed, but the pairing of values was random, creating a new joint distribution. This procedure was performed 1,000 times to get 1,000 different joint distributions with identical individual iMRlower and eMR distributions. We then created 2-dimensional 2 × 2 histograms (such as those for the real data in Fig. 5) for all 1,000 simulated data sets. The actual percentages falling in each quadrant were then compared with the distribution of 1,000 simulations. The percentage of neurons in a quadrant was considered significantly below chance pairing at the P < 0.05 level if it was smaller than 95% of the simulated values (i.e., 950/1,000). The results were considered significant at the P < 0.01 level if the actual percentages were smaller than 99% of the simulated values. This test was performed on the lower left and upper right quadrants. Based on the negative correlation of the Spearman test (Table 3), the expectation would be that either (or both) of these quadrants would have significantly fewer observations. Similar analyses were performed to determine whether the percentage of neurons in the upper left and/or lower right quadrants was above chance as would be predicted by a negative correlation.
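The shuffling procedure can be sketched as follows for the lower left quadrant. This is a minimal illustration and not the authors' code: the function names, the fixed random seed, and the example values are assumptions, and the other quadrants (and the above-chance tests) would be handled analogously by changing the quadrant condition and the direction of the comparison.

```python
import random

def lower_left_fraction(emrs, imrs, cut=0.5):
    """Fraction of neurons with both eMR and iMR below the cut."""
    hits = sum(1 for e, i in zip(emrs, imrs) if e < cut and i < cut)
    return hits / len(emrs)

def below_chance_score(emrs, imrs, n_sim=1000, seed=0):
    """Fraction of shuffled data sets whose lower-left fraction exceeds the observed one."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatability
    observed = lower_left_fraction(emrs, imrs)
    shuffled = list(emrs)
    exceed = 0
    for _ in range(n_sim):
        # Re-pair eMR with iMR without replacement; both marginal
        # distributions stay exactly as in the real data.
        rng.shuffle(shuffled)
        if observed < lower_left_fraction(shuffled, imrs):
            exceed += 1
    return exceed / n_sim

# Hypothetical, strongly anticorrelated sample (not data from the study).
emrs = [0.2] * 20 + [0.9] * 20
imrs = [0.9] * 20 + [0.2] * 20
score = below_chance_score(emrs, imrs)
significant_05 = score >= 0.95  # observed value smaller than 95% of simulations
significant_01 = score >= 0.99  # observed value smaller than 99% of simulations
```

Because only the pairing is shuffled, any deviation of the observed quadrant percentage from the simulated ones reflects the joint relationship between eMR and iMR rather than the shape of either individual distribution.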

For A1 lower suppressive band iMR, the lower left and upper right quadrants had proportions of neurons significantly below chance (Fig. 5A, * represents significantly below chance), and the lower right and upper left quadrants had proportions of neurons significantly above chance (♦ in Fig. 5 represents significantly above chance). For A1 upper suppressive bands, the 2 lower quadrants reached significance, but the upper quadrants only approached significance (0.07 > P > 0.05). Inhibitory strength ratio (iSR), which is an index of the inhibitory strength at the highest intensity (see methods), reached significance for all quadrants with both lower and upper suppressive bands (Fig. 5B).

Relationship of ED-intensity tuning to suppressive domain intensity tuning in PAF

Although for PAF neurons we also found an inverse relationship between excitatory and inhibitory domain intensity tuning, the relationship was not as obvious as in A1 (Fig. 5, C and D). Spearman rank tests demonstrated that this relationship was statistically significant for lower suppressive bands, but did not reach significance for upper suppressive bands (Table 3). The same method of Monte Carlo analysis was performed as in A1 to try to determine what relationships caused the significant Spearman rank correlation, and only revealed significant deviations from chance for iSR lower in PAF. In this instance all quadrants reached significance (Fig. 5D, left).

For PAF, the data were dominated by monotonic suppressive bands, with very little intensity-tuned suppression. This strong tendency likely decreased the ability to detect trends in PAF neurons because the cells were more homogeneous in their suppressive properties. The Spearman rank analysis was also performed on combined A1 and PAF data. Overall, the introduction of PAF data increased significance (relative to A1 data alone) and had no net effect on ρ values. It is hard to determine whether the lack of significant correlation of PAF upper bands reflects true correlation differences between A1 and PAF, or a lack of power in PAF attributed to the small N (Table 3) and a small percentage (about 25%) of PAF neurons with intensity-tuned upper suppressive bands.

When considered together with the A1 data, these results suggest that the negative correlation between the intensity tuning of suppressive and excitatory domains is not solely the result of a lack of cells with dually intensity-tuned suppressive and excitatory domains. Below chance pairing of neurons with both suppressive and excitatory domains that are not tuned for intensity, as well as above chance pairing of tuned suppressive and untuned excitatory domains and vice versa, also contribute.

Relationship of the intensity tuning of lower and upper bands in A1 and PAF

There was a strong correlation of the strength of inhibition between lower and upper suppressive bands. Lower band suppression that monotonically increases as a function of intensity is often accompanied by upper band suppression that also increases monotonically with intensity (Fig. 6). A notable aspect of the relationship between suppressive upper and lower band intensity tuning is the preponderance of cells for which both suppressive bands were untuned (Fig. 6, upper right quadrants). All correlations in PAF and A1 were significant with correlation coefficients in A1 about 0.36 and in PAF about 0.5 (Table 4).

fig. 6.

Three-dimensional bubble histogram demonstrating the nonlinear correlation of upper and lower suppressive band intensity tuning. Left column plot: data from A1; right column: data from PAF.

table 4.

Relationship of intensity tuning of lower and upper suppressive bands

Relationship of excitatory domain (ED-) intensity tuning to the edges of suppressive bands

If inputs creating suppressive sidebands are responsible for ED-intensity tuning one might expect to see a correlation between the slopes of edges of suppressive sidebands and the degree of ED-intensity tuning. Specifically one would expect that inhibitory inputs would impinge on excitatory edges at high intensities (Fig. 7A), causing the suppressive domains to slope toward the excitatory domain (Fig. 7B, Fig. 1A, level 2). Conversely, one would expect that for excitatory domains that are not intensity tuned, suppressive sidebands would slope more away from them (Fig. 1, B and C, level 2). To address this issue we compared the inverse slope (IS) of suppressive band edges from 5 to 45 dB above the neuron's threshold (IS5–45), to the eMR. The IS of the lower edge of a hypothetical suppressive band with a threshold of 5 dB is shown in Fig. 7C. In this case the IS5–45 is –1 octave/40 dB. Negative values denote edges that slant toward lower frequencies with increasing intensity. Inverse slopes were chosen, rather than slopes, because of the added stability for vertical tuning curve edges (see Sutter 2000).
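The sign convention and the stability argument can be illustrated with a minimal two-point sketch. It assumes that IS5–45 is simply the octave shift of an edge between 5 and 45 dB above threshold; the actual procedure (see Sutter 2000) may differ, for example by fitting across intermediate levels. The function name and the frequencies are hypothetical.

```python
import math

def inverse_slope_5_45(freq_at_5_db_khz, freq_at_45_db_khz):
    """Octave shift of a band edge over the 40-dB span from 5 to 45 dB above threshold.

    Negative values mean the edge slants toward lower frequencies
    with increasing intensity.
    """
    return math.log2(freq_at_45_db_khz / freq_at_5_db_khz)

# Hypothetical edge that moves from 8 kHz down to 4 kHz: -1 octave/40 dB.
sloping = inverse_slope_5_45(8.0, 4.0)

# Vertical edge: the inverse slope is 0, whereas the ordinary slope
# (dB per octave) would require dividing by zero.
vertical = inverse_slope_5_45(8.0, 8.0)
```

The vertical-edge case shows why the inverse slope is the more stable measure: it stays finite for edges that do not shift in frequency.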

fig. 7.

Schematized tuning curve showing a possible mechanism of how inhibitory inputs might combine to create intensity tuning, and the terminology of edges of bands, and how the inverse slope (IS)5–45 is measured. A: putative inhibitory inputs with broad sloping edges that infringe on the BEF. Dark solid line outlines the excitatory tuning curve of the input. Shaded gray areas mark the putative tuning curves of the inhibitory inputs. Darkest shaded area corresponds to frequencies at which the lower and upper inhibitory inputs overlap. B: predicted excitatory and suppressive domains of an output neuron receiving input from A. Dark solid line outlines the excitatory domain. The BEF is represented by a vertical dashed line. Dark areas demarcate inhibitory bands. Arrows point to various edges of the tuning curve indicating what they are called. Note that the lower suppressive band's upper edge (LBUE) and upper band's lower edge (UBLE) abut and impinge on the excitatory frequency tuning curve. The lower band's lower edge (LBLE) and upper band's upper edge (UBUE) do not abut the excitatory frequency tuning curve. Note that this schematic is for a neuron with a circumscribed excitatory domain. C: how IS5–45 was calculated from the lower edge of a suppressive band. Note that this band's threshold was 5 dB and IS5–45 would be –1 octave/40 dB.

In both A1 and PAF there was a correlation between the inverse slope of the suppressive band edges abutting the excitatory domain and the ED-intensity tuning (Table 5, Fig. 8). The lower suppressive band's upper edge (LBUE, Fig. 7B) inverse slope tended to be more positive for cells with lower eMRs (i.e., more intensity-tuned excitatory domains; Fig. 8). This corresponds to LBUE slanting toward the excitatory domain for ED-intensity-tuned cells. Similarly, the upper suppressive band's lower edge (UBLE) tended to be more negative for cells with lower eMRs (Fig. 8), corresponding to the UBLE slanting toward the excitatory domain for ED-intensity-tuned cells. Both of these effects were significant for A1 and for PAF (Table 5). For edges that did not abut the excitatory domain [the lower edge of the lower band (LBLE) and the upper edge of the upper band (UBUE)] the correlation never reached significance (Table 5). These results are consistent with the notion that “surround” inhibition impinging on the excitatory tuning curve helps to create ED-intensity tuning.

table 5.

Relationship of the inhibitory domain to excitatory domain intensity tuning

fig. 8.

Plots demonstrating the relationship of the inverse slope of the suppressive band edges abutting the excitatory domain, and excitatory domain (ED-) intensity tuning. Each plot shows the relationship between a suppressive band edge's IS5–45 (inverse slope between 5 and 45 dB above threshold) and eMR. In the top box labeled A1 the different plots show the relationships of the 4 different suppressive band edges to eMR in A1. The lower suppressive band's upper edge (LBUE) and upper band's lower edge (UBLE) abut the lower and upper edge of the excitatory tuning curve, respectively, whereas the lower suppressive band's lower edge (LBLE) and upper band's upper edge (UBUE) are distant from the excitatory tuning curve. Note that the 2 abutting-edge ISs vary greatly with eMR, but the nonabutting edges do not (Table 5). The middle box shows the same relationships for PAF, and the lower box shows the relationships for combined A1 and PAF data.

Relationship of frequency separation of suppressive and excitatory domains to ED-intensity tuning

The model of Fig. 1 predicts the above result that intensity-tuned excitatory domains would have suppressive sidebands with steeper slopes toward the BEF (Fig. 1, A vs. C). This reflects the notion that inhibitory inputs overlap with the excitatory domain at high intensities. The model's predictions for low intensities are less clear. On the one hand, the assumption that all inhibition is similar in strength (or alternatively that the slopes of the edges of inhibitory inputs are all the same) would lead the model to predict that for cells with strong ED-intensity tuning, the best frequency of adjacent suppressive bands (BIF) would be closer to the BEF. The logic is that when the BIFs are far from the BEFs, the inhibitory inputs would be less likely to overlap with BEF excitatory inputs at high intensities; therefore this would predict monotonically increasing responses as a function of BEF tone intensity (Fig. 9A). When the BIFs are closer to the BEF (Fig. 9B), inhibitory inputs of equivalent bandwidth would overlap with the BEF excitatory inputs at higher intensities, thereby causing the firing rate to decline for higher intensity BEF tones (inhibitory inputs aligned with double arrow at BEF in Fig. 9B). On the other hand, it could be that frequency separation between inhibitory and excitatory input at low intensities is not a major factor in the model when compared with slopes. To distinguish these two possibilities we examined the relationship between eMR and the frequency separation of the suppressive and excitatory domains at both low (threshold) and high (45 dB above suppression threshold) intensities.

fig. 9.

Schematic showing a possible relationship between BEF–BIF separation and ED-intensity tuning. Solid outline depicts the excitatory domain and shaded areas represent suppressive domains; dashed lines represent BEFs and BIFs. A: frequency tuning of 2 inhibitory inputs with distant BIFs from excitation (arrows). In this case, because the BIFs are far in frequency from the BEF, the inhibitory inputs do not impinge on the excitatory input. Therefore the excitatory domain is not intensity tuned. B: expectation when the BIF of the inhibitory input (and therefore the measured suppressive domain) is close to the BEF. In this case, the inhibitory input impinges on the excitatory input at high intensities and both inhibitory inputs overlap at the BEF, creating strong inhibition and ED-intensity tuning. C and D: how high-intensity inhibition could correlate with intensity tuning, independent of BEF–BIF separation. In these plots, excitatory and suppressive response domains are shown rather than the inputs, as in A and B. C: the suppressive bands are distant from the excitatory domain at higher intensities, whereas in D they abut it. The frequency difference (ΔLBUE) between the LBUE and the BEF is shown in C. The analogous difference, ΔUBLE, is also shown in C.

First, we investigated the correlation between ED-intensity tuning and the difference between the best frequencies of the excitatory and suppressive domains (BEF–BIF separation). For lower and upper A1 and PAF suppressive bands, no significant correlations were found and no trends were observed (Table 5). It is also interesting to note that the separation between the BIF of the upper band and the BEF in PAF was larger than that in A1 (PAF median = 0.65 octaves, A1 median = 0.34 octaves, Mann–Whitney U test: P = 0.0064). This trend also argues against greater BEF–BIF separation corresponding to less ED-intensity tuning, given that PAF has more ED-intensity tuning but has upper suppressive bands whose BIFs are farther from the BEFs.

Neurons whose suppressive domains hug the edge of the excitatory domain (e.g., Fig. 10) can provide insight into the lack of relationship between BEF–BIF separation and ED-intensity tuning. In this cell with an eMR of 0.73, the BEFs and BIFs are similar, but both suppressive bands slant away from the excitatory domain, suggesting that there is competition between excitation and inhibition and the dominance of one over the other is independent of the BEF–BIF difference. The inverse relationship between suppressive and ED-intensity tuning also supports this idea.

fig. 10.

Suppressive and excitatory domains of an A1 neuron with BIFs close to the BEF. This cell did not have strong ED-intensity tuning.

This result does not necessarily contradict the model (but implies that the assumption of equal strength of all inhibitory bands is not warranted). The critical aspect in the model is not the frequency of the inhibition at low intensities, but the frequency and strength of inhibitory inputs at high intensities, where responses to BEF tones start to decline for intensity-tuned excitatory domains. This is because at suppression threshold, 2-tone suppression/inhibition reflects only the most sensitive input, not necessarily the strongest.

To test whether the high-intensity suppressive domains correlate with ED-intensity tuning (and the observed results with inverse slopes), we investigated the relationship between ΔUBLE and eMR. ΔUBLE (Fig. 9, C and D) is calculated by taking the frequency difference (in octaves) between UBLE 45 dB above threshold and the BEF. For both A1 and PAF a significant correlation (Table 5) was observed between ΔUBLE and eMR, supporting the notion that high-intensity inhibition is responsible for ED-intensity tuning. A significant correlation (Table 5) was also observed between ΔLBUE and eMR. The results of this section are consistent with the model. When the edges of strongly sloping inhibitory inputs (Fig. 1A, level 1 and red lines) intrude into the BEF of the input, an intensity-tuned excitatory domain is created. An outcome of this interaction is that the edge of the suppressive domain (Fig. 1A, level 2) gets closer to the BEF.


Mechanisms underlying intensity tuning: fast inhibition and delayed excitation

These results are consistent with the working hypothesis that inhibitory input neighboring the excitatory edges of frequency tuning curves is responsible for excitatory domain intensity tuning by inhibiting BEF responses at higher intensities. Supporting this is 1) the inverse relationship between excitatory and neighboring suppressive domain intensity tuning, and 2) the relationship between the slopes of suppressive band edges and ED-intensity tuning.

In addition to these results, other work strongly constrains the properties of inhibitory input carving out ED-intensity tuning. In particular the inhibition must be 1) effective at high intensities and 2) strong and/or fast. Intensity-dependent inhibition consistent with case 1 has been reported throughout the auditory system, with midbrain and brain stem binaural processing providing especially compelling examples (Park et al. 1996, 1997; Sanes 1990; Yin et al. 1985). Fast inhibition (case 2) is required because the highly phasic onset response of many cortical cells contrasts with the more sustained responses of many subcortical neurons. Accordingly, the ipsps responsible for removing onset responses at high intensities must affect the cell before epsps can bring the neuron past action potential threshold.

The inhibition responsible for the observed cortical intensity tuning need not be localized to cortex. Both fast ipsps and slow epsps consistent with these results have been demonstrated throughout the auditory system (e.g., brain stem: Ferragamo et al. 1998; Wu and Kelly 1996; midbrain: Carney and Yin 1989; Covey et al. 1996; Kuwada et al. 1997; Nelson and Erulkar 1963; cortex: Bartlett and Smith 1999). This inhibition of onset responses by effective “feed-forward” inhibition is a relatively unique property of the auditory system when compared with other senses.

So how can the auditory system create this effective “feed-forward” inhibition? This can be produced either anatomically (Fig. 1) or physiologically. Anatomically, shorter latency feed-forward inhibitory pathways and/or delayed excitatory pathways could create this fast inhibition. There is evidence for such feed-forward inhibition or delayed excitation in projections from brain stem to midbrain (Adams and Mugnaini 1984; Bauer et al. 2000; Moore et al. 1998; Shneiderman et al. 1988; Vater et al. 1997; Zhang et al. 1998) and from IC to MGB (Bartlett and Smith 1999; Peruzzi et al. 1997; Saint Marie et al. 1997; Winer et al. 1996), as well as within the brain stem and in cortico-cortical projections. Feed-forward inhibition can also be achieved physiologically. For example, a combination of weak excitatory and strong inhibitory input could allow the inhibition to delay or prevent the neuron from crossing the action potential threshold. Similarly, if ipsps are integrated (in time and space) more efficiently than epsps, inhibitory inputs can exert a stronger influence before the action potential threshold is crossed. Faster receptor dynamics and/or transmitter release for inhibition could also generate the same effect. There is physiological evidence for such mechanisms, although it is difficult to segregate these physiological effects from anatomical contributions (e.g., brain stem: Grothe and Sanes 1993; Zhang and Oertel 1994; cortex: Cox et al. 1992; Hefti and Smith 2000).

Mechanisms underlying intensity tuning: multiple stages of inhibition

The constraint that inhibition must be fast to create ED-intensity tuning can help resolve the seemingly contradictory results of this study and those of Calford and Semple (1995), who found that the intensity tuning of excitatory and suppressive domains was positively correlated. In other words, they found that neurons with intensity-tuned excitatory domains also had intensity-tuned inhibitory domains, and cells with non-intensity-tuned excitatory domains also had untuned inhibitory domains. In contrast to our study, which used simultaneous masking to characterize suppressive domains, Calford and Semple used forward masking where the putative inhibitory tone was always presented before a BEF tone with no temporal overlap (i.e., no time when both tones were on). In our study the suppressive and BEF tones were gated together and always simultaneously presented, completely overlapping. So why do results of these 2 studies seem opposite? One possible explanation is that the inhibitory input responsible for creating intensity tuning is separable and distinct from that responsible for forward-masked inhibition (Calford and Semple 1995).

In particular, one can think of 2 stages of inhibition: a fast “feed-forward” inhibition that creates intensity tuning; and a longer latency interneuron-mediated inhibition responsible for forward masking. By making reasonable assumptions about the interneuron inhibition, such as that cells with similar excitatory domain intensity tuning laterally inhibit each other, one can account for the results of both studies. Figure 11 presents a model that incorporates both fast and slow inhibition and accounts for both the presented and the forward-masking data. Level 1 and level 2 were presented in the introduction (Fig. 1). The fundamental premise of this model is that ED-intensity tuning first is created by inhibition, and then ED-intensity-tuned cells laterally inhibit each other. The model has 3 successive levels. Level 1 neurons are connected to level 2 neurons by 2 pathways: a monosynaptic fast feed-forward lateral inhibitory pathway and a delayed excitatory pathway (the delay is schematized by going through two synapses). Level 3 neurons laterally inhibit each other. The levels refer to successive stages of processing that could be connections within or between brain areas.

fig. 11.

Our results when combined with those of Calford and Semple support 2 distinct stages of inhibition. One creates ED-intensity tuning (level 1 to level 2 in A). In the other ED-intensity-tuned cells laterally inhibit each other (level 3 in A). This stage is like a segregated ED-intensity-tuned “channel” where ED-intensity-tuned cells inhibit only each other, but do not interact with non-intensity-tuned neurons. All effects of the first stage are shown in red; all of the second stage in blue. This model architecture can explain all the results observed by this and Calford and Semple's study. Note the phrase two-tone inhibitory (TTI) domain is used in this figure to be consistent with the literature, because the term two-tone suppression is historically used to refer to peripheral suppression in the basilar membrane. A: possible mechanism creating cells with intensity-tuned excitatory domains, untuned simultaneous masked suppression, and intensity-tuned forward masked suppression. Level 1 shows broad excitatory frequency tuning curves (left) of different input cells (circles). At level 2, broad, fast lateral inhibition (red feed forward connections) inhibits responses at the best excitatory frequency at high intensities. This creates cells with intensity-tuned excitatory domains, but non-intensity-tuned simultaneous masked suppression (tuning curves to right of level 2 cells). At level 3, cells with intensity-tuned excitatory domains laterally inhibit each other (blue dashed lines) creating intensity-tuned forward masked suppression. The reason this intensity-tuned inhibition would not be seen with simultaneous masking is because this inhibition experiences typical synaptic delays. B: possible mechanism for creating cells that are not intensity tuned for excitatory or forward masked suppressive domains, but are intensity-tuned for simultaneous masked suppression. All connections are the same as in A. All that has changed is the strength of inhibition. 
C: how cells that have both excitatory and suppressive domains that are not intensity tuned could be formed. All connections are the same as in A. All that has changed is frequency tuning of the inputs. D: schematic figure similar to plots in Fig. 5 summarizing the results of this study using simultaneous masking and Calford and Semple's study using forward masking. Each quadrant is marked with roman numerals (lowercase for simultaneous 2-tone suppression, and uppercase for forward masked inhibition). Quadrant numerals are also placed next to the frequency tuning curves shown in A–C.

Figure 11A shows that the convergence of an array of broadly tuned inputs creates a level 3 cell with intensity-tuned excitatory and forward-masked suppressive domains (Fig. 11D, quadrant III), but with simultaneous 2-tone suppressive domains that are not intensity tuned (Fig. 11D, quadrant iv). Intensity-tuned excitatory domains at level 2 are caused by the broad frequency tuning of the level 1 inputs, some of which create broad lateral inhibition at level 2. This inhibition is particularly strong at high intensities where the low frequency tail of the upper band's inhibitory input infringes on the cell's BEF (Fig. 11A, level 2). Therefore level 2 neurons have intensity-tuned excitatory domains and broad sloping simultaneous 2-tone suppressive domains that are not intensity tuned.

An essential feature of this model is that neurons with similar ED-intensity tuning cluster together and laterally inhibit each other, whereas cells with dissimilar ED-intensity tuning do not. This lateral inhibition, delayed by going through an interneuron (Fig. 11A, blue dashed line), can be found in level 3. In Fig. 11A this creates intensity-tuned forward-masked suppressive domains. For simplicity we have shown this interneuron in level 3, but it also could be in the same brain area that hosts the cells of level 2 or could be mediated through feedback from level 3 to level 2. Evidence supports all of these possibilities (e.g., He 1997; Jen and Zhang 1999; Moore et al. 1998; Paolini et al. 1998; Rhode 1999; Shofner and Young 1987; Suga et al. 2000; Wickesberg and Oertel 1990; Winer et al. 1995; Zhang and Oertel 1994; Zhang et al. 1997), which certainly are not mutually exclusive. Rather than focusing on where the inhibition occurs, we choose to focus on the essential feature of this model that cells with similar ED-intensity tuning cluster together and laterally inhibit each other, whereas cells with dissimilar ED-intensity tuning do not laterally inhibit each other. Neurons with similar ED-intensity tuning spatially cluster in auditory cortex (Clarey et al. 1994; Heil et al. 1994; Imig et al. 1990; Schreiner et al. 1992; Sutter and Schreiner 1995), possibly providing a site for such lateral inhibition. Although to our knowledge clustering of neurons with similar ED-intensity tuning has not been reported subcortically, there is evidence that certain brain areas have a higher proportion of ED-intensity-tuned cells (e.g., DCN: Voigt and Young 1980; caudal MGB: Rodrigues-Dagaeff et al. 1989). Finally, feedback connections are topographic with respect to several information-bearing parameters of sound (Suga et al. 2000; Yan and Suga 1999; Zhang et al. 1997). If topography also holds with respect to ED-intensity tuning, this could be a source of delays in forward-masked lateral inhibition.

The model assumes segregated parallel pathways for neurons with intensity-tuned (Fig. 11A) and untuned (e.g., Fig. 11B) excitatory domains. In the pathway with non-intensity-tuned excitatory domains of Fig. 11B, broad inputs and weak inhibition at level 2 allow the ED to remain untuned because the inhibition does not have the strength to significantly influence the BEF response. Inhibitory domains are intensity tuned because the excitation is strong enough to overcome inhibition at the best inhibitory frequency at higher intensities. Because the excitation is untuned, responses using simultaneous masking in this pathway should fall in quadrant ii. Strong inhibition at level 3 of neurons in this segregated untuned pathway creates untuned forward-masked inhibition (blue), and because the ED is also untuned, forward-masked responses fall in quadrant I.

The results with respect to the model of Fig. 11C are a bit more complex. In this model, inputs with narrow frequency tuning allow for non-intensity-tuned excitatory domains in level 2 because the lateral inhibitory inputs do not extend to the BEF. Although we found a high proportion of cells with non-intensity-tuned excitatory and suppressive domains (quadrant i), the proportion of cells is lower than one would predict by chance pairings of the individual distributions of eMR and iMR (Fig. 5). This indicates that the high proportion of cells in quadrant i is attributable solely to the large percentage of neurons with non-intensity-tuned domains. This supports the view that level 3 neurons with these properties exist in high proportion, but that at the cortical level, interaction between inhibitory and excitatory domains (potentially through the mechanisms of Fig. 11, A and B) serves to counteract this. The data argue that even for non-intensity-tuned EDs, interactions between inhibitory and excitatory inputs at overlapping frequencies are prevalent.
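The comparison with chance pairings can be made concrete with a small calculation. The sketch below is illustrative only: the counts are hypothetical and are not data from this study, the function name is an assumption, and tuning is treated as a binary category rather than a continuous eMR or iMR value. Under independence, the expected count in each quadrant is the product of the two marginal counts divided by the number of cells.

```python
def expected_counts(observed):
    """Expected quadrant counts if ED and suppressive-domain tuning paired by chance.

    Quadrants follow the text: i = both untuned, ii = ED untuned / SD tuned,
    iii = both tuned, iv = ED tuned / SD untuned.
    """
    n_cells = sum(observed.values())
    n_ed_tuned = observed["iii"] + observed["iv"]
    n_sd_tuned = observed["ii"] + observed["iii"]
    n_ed_untuned = n_cells - n_ed_tuned
    n_sd_untuned = n_cells - n_sd_tuned
    return {
        "i": n_ed_untuned * n_sd_untuned / n_cells,
        "ii": n_ed_untuned * n_sd_tuned / n_cells,
        "iii": n_ed_tuned * n_sd_tuned / n_cells,
        "iv": n_ed_tuned * n_sd_untuned / n_cells,
    }


# Hypothetical counts for 100 cells (NOT data from this study).
observed = {"i": 50, "ii": 20, "iii": 0, "iv": 30}
expected = expected_counts(observed)
excess = {q: observed[q] - expected[q] for q in observed}

print(expected)  # {'i': 56.0, 'ii': 14.0, 'iii': 6.0, 'iv': 24.0}
print(excess)    # {'i': -6.0, 'ii': 6.0, 'iii': -6.0, 'iv': 6.0}
```

In this made-up example quadrant i is the most populated cell yet still falls below its chance expectation, which is the kind of pattern the paragraph above describes.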

This relatively simple model explains the results of Calford and Semple using forward masking and of the present study using simultaneous 2-tone masking. A critical assumption built into this model is that there are segregated parallel pathways for neurons with intensity-tuned and untuned excitatory domains. By restricting level 3 flanking inhibition to cells within a pathway, the results of Calford and Semple can be explained. Figure 11A level 3 neurons have intensity-tuned excitatory and forward-masked suppressive domains, and Fig. 11B level 3 neurons have non-intensity-tuned excitatory and forward-masked suppressive domains. With regard to our simultaneous 2-tone suppressive results, the model consistently predicts the types of correlation described in Fig. 5. In particular the model predicts a lack of neurons with both intensity-tuned excitatory and simultaneous masked suppressive domains (Fig. 11D, quadrant iii). When the inputs are broad (Fig. 11A), excitatory domains are intensity tuned and simultaneous 2-tone suppressive domains are not (Fig. 11D, quadrant iv), a condition that was quite common, particularly in PAF. The other common condition (by overall proportion and by above chance pairing of individual eMR and iMR distributions) of intensity-tuned simultaneous 2-tone suppressive and untuned excitatory domains (Fig. 5, upper left quadrants of bubble histograms) can be explained by broad inputs and weak inhibition at level 2, followed by strong inhibition at level 3 (Fig. 11B). Strong inhibition is required at level 3 to keep the results consistent with Calford and Semple's finding that neurons without intensity-tuned excitatory domains also have untuned forward-masking inhibitory domains. In neural network simulations, it has been our experience that the model structure of Fig. 11 does not produce both intensity-tuned simultaneous 2-tone suppressive and excitatory domains in the same level 3 neurons (personal observation), a finding that is consistent with the results (Fig. 5, lower left quadrants). Finally, because only cells with similar excitatory domain intensity tuning laterally inhibit each other, this model also predicts a lack of cells with oppositely intensity-tuned excitatory and forward-masked inhibitory domains, which is consistent with the findings of Calford and Semple.

In conclusion, we believe the most parsimonious explanation of our results and those of Calford and Semple is separate stages of inhibition carving out ED-intensity tuning and creating forward-masked suppressive domains. Calford and Semple made similar arguments based on the strong evidence from their studies that forward-masking inhibition and the inhibition responsible for intensity tuning were the result of 2 independent inhibitory mechanisms.

Mechanisms underlying intensity tuning: loci of inhibition

With the current evidence it is impossible to assign a specific brain area corresponding to level 2 and/or level 3 of the model. Pharmacological experiments indicate that excitatory domain intensity tuning is created de novo in many auditory areas (e.g., CN: Caspary et al. 1979; Davis and Young 2000; Evans and Zhao 1993; IC: Faingold et al. 1991; Fuzessery and Hall 1996; Lu and Jen 2001; Pollak and Park 1993; Vater et al. 1992; Yang et al. 1992; Wang et al. 2000; MGB: Suga et al. 1997; A1: Chen and Jen 2000). Although a role for cortical inhibition in shaping intensity tuning has not been reported for non-echolocating mammals, the first published combined pharmacological and physiological studies in chinchilla auditory cortex report an inhibitory role in shaping excitatory domains (Wang et al. 2000, 2002). However, because ED-intensity tuning at the brain stem and midbrain level can result from inhibition of sustained responses without affecting the onset responses, it is unclear whether a narrower subset of auditory pathway neurons is responsible for the ED-intensity tuning observed in cortex, requiring fast inhibition to remove onset responses. More refined studies of the timing of inhibition creating ED-intensity tuning in different brain areas are required.

There are also several areas that might be the site(s) of delayed, possibly lateral, inhibition that carves out cortical forward-masked areas (Brosch and Schreiner 1997; Calford and Semple 1995); inhibitory interneurons exist in many brain structures (Winer and Larue 1989, 1996). However, the topography of cortical intensity-tuned neurons makes cortex a prime candidate.

Limitations on interpretation attributed to the use of simultaneous 2-tone masking

Although suppressive spectral receptive field properties have been reported using simultaneous 2-tone masking (e.g., Ehret and Merzenich 1988b; Fuzessery 1994; Fuzessery and Feng 1982; Fuzessery and Hall 1996; Suga 1965, 1968; Suga and Tsuzuki 1985; Sutter and Schreiner 1991; Sutter et al. 1999; Zhang and Feng 1998; Zhang et al. 1999), to our knowledge the only previous study of the relationship between the frequency extent of suppressive sidebands and ED-intensity tuning that directly measured suppressive bands used forward masking (Calford and Semple 1995). This choice was at least partially because the use of simultaneous 2-tone stimulation introduces inherent limitations in the interpretation of neural inhibitory contributions attributed to the well-established mechanical effect of simultaneous 2-tone suppression (TTS) (Sachs and Kiang 1968) in the basilar membrane (Rhode 1977; Ruggero et al. 1992).

Although the simultaneous paradigm poses some problems, its use is essential to investigate the role of inhibition in shaping ED-intensity tuning, because cortical ED-intensity tuning requires fast inhibition. The most common models of mechanisms underlying ED-intensity tuning support this. For example, a common argument made is that ED-intensity tuning might result from “spectral splash or splatter,” caused by the rapid onset of BEF tone pips, extending the tone-pips' power spectrum into lateral inhibitory sidebands (Phillips 1988; Phillips et al. 1995). Another possible mechanism is that tails of inhibitory sideband input might impinge on excitatory areas. Finally, inhibitory input with BIF matched to the BEF could create ED-intensity tuning if the inhibition becomes more effective at higher intensities. For all of the above-mentioned mechanisms that could create ED-intensity tuning an essential condition is that there be inhibition fast enough to eliminate onset responses. Therefore when a single loud BEF tone is presented, in addition to excitatory inputs being activated, inhibitory inputs also must be simultaneously activated. Because a single tone simultaneously evokes both the excitatory and inhibitory inputs, the spectral properties of inhibition under these conditions can be completely explored only by using simultaneously shaped and presented BEF and inhibitory tones.
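The "spectral splatter" argument can be illustrated numerically. The following is a minimal sketch under stated assumptions, not an analysis from this study: the carrier frequency, duration, ramp times, sampling rate, and analysis band are all hypothetical, the ramps are linear, and the function names are illustrative. It compares the energy that a tone pip places in a band well above its carrier frequency when the onset and offset ramps are fast versus slow.

```python
import cmath
import math


def tone_pip(freq_hz, duration_s, ramp_s, fs_hz):
    """Sine tone with linear onset and offset ramps (trapezoidal envelope)."""
    n_samples = int(round(duration_s * fs_hz))
    n_ramp = max(1, int(round(ramp_s * fs_hz)))
    samples = []
    for i in range(n_samples):
        envelope = min(1.0, (i + 1) / n_ramp, (n_samples - i) / n_ramp)
        samples.append(envelope * math.sin(2 * math.pi * freq_hz * i / fs_hz))
    return samples


def band_energy(samples, fs_hz, f_lo_hz, f_hi_hz, step_hz=50.0):
    """Sum of squared DFT magnitudes evaluated directly at frequencies in a band."""
    energy = 0.0
    freq = f_lo_hz
    while freq <= f_hi_hz:
        coeff = sum(
            s * cmath.exp(-2j * math.pi * freq * i / fs_hz)
            for i, s in enumerate(samples)
        )
        energy += abs(coeff) ** 2
        freq += step_hz
    return energy


FS_HZ = 16000.0    # hypothetical sampling rate
BEF_HZ = 1000.0    # hypothetical best excitatory frequency
fast_pip = tone_pip(BEF_HZ, duration_s=0.05, ramp_s=0.0005, fs_hz=FS_HZ)
slow_pip = tone_pip(BEF_HZ, duration_s=0.05, ramp_s=0.005, fs_hz=FS_HZ)

# Energy falling 300 to 1000 Hz above the carrier, where a sideband might lie.
fast_splatter = band_energy(fast_pip, FS_HZ, 1300.0, 2000.0)
slow_splatter = band_energy(slow_pip, FS_HZ, 1300.0, 2000.0)
print(fast_splatter > slow_splatter)  # True: faster ramps spread more energy off-frequency
```

The comparison is qualitative; the size of the difference depends on the assumed ramp shape and analysis band.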

Because we used simultaneous 2-tone stimuli it is possible that the observed suppressive domains simply reflect basilar membrane TTS. Although it is likely that TTS contributes to some of the observed results, several lines of evidence argue that neural inhibition is involved. The bandwidths, shapes, thresholds, and strength of cortical 2-tone suppressive domains often are quite distinct from peripheral TTS (Loftus and Sutter 2001b; Sutter et al. 1999). The time course of cortically observed inhibition (studied using simultaneous, overlapping, and forward masking and varying the delay of the onset of probe tone relative to the masker) is also incompatible with pure TTS (Brosch and Schreiner 1997; Loftus and Sutter 2001a). The properties of suppression reported herein further argue against pure TTS. For example, the frequency separation between excitatory and suppressive domains is greater than expected from peripheral TTS. Additionally, the relationship between suppressive band edges and ED-intensity tuning is not expected if TTS were solely being measured. The a priori assumption would have to be that TTS is not related to excitatory domain intensity tuning, given that TTS is not observed with a single tone and intensity tuning is not seen in the basilar membrane. Even in cases where suppressive domains share some TTS properties (e.g., in Fig. 10, the threshold and frequency extent of the lower and upper suppressive bands nearest the excitatory area are consistent with 2-tone suppression), other properties such as the intensity tuning of the upper suppressive band are inconsistent with peripheral TTS. In fact, some cortical cells (even some that are quite narrowly tuned) do not show any 2-tone suppressive domains, whereas TTS is ubiquitous in auditory nerve fibers. In aggregate these data indicate that 1) peripheral TTS is not solely responsible for the observed effects and 2) under some conditions the effects of TTS with a BEF probe tone cannot even be seen at the cortical level, presumably because of excitatory integration and properties of synaptic transmission (such as threshold).

Excitatory domain (ED-) intensity tuning reflects inhibition at high intensities

It was a bit surprising that suppressive domains with BIFs close to the BEF did not appear to be closely associated with ED-intensity tuning, particularly because the slopes of suppressive sidebands' edges were correlated with ED-intensity tuning. This result is even more surprising because in the inferior colliculus when the BEF and forward-masked suppressive domain BIFs of a cell were close, lateral inhibition was more effective in shaping location sensitivity (Zhou and Jen 2000).

There are good explanations, though, for the results we observed. First, a neuron's spectral receptive field is not a clear picture of distinct separable inhibitory and excitatory inputs, but rather represents the net sum of excitatory and inhibitory inputs at each frequency. In this context one can think of the spectral receptive field, particularly near borders of excitatory and suppressive domains, as resulting from competition between excitatory and inhibitory inputs. For ED-intensity tuning, which depends on the competing influences of excitatory and inhibitory inputs at high intensities, it is not necessarily the most sensitive inhibitory inputs that are important, but the number and strength of inhibitory inputs at higher intensities. Second, TTS, which does not play a role in the creation of intensity tuning, is expected to have BIFs very close to BEFs (e.g., see Sachs and Kiang 1968). Therefore in cells with less of an influence of neural inhibition (and therefore less intensity-tuned excitation), one might expect the peripheral TTS properties to dominate the suppressive domain near threshold. Thus, when using simultaneous 2-tone masking and recording from cells with little neural inhibition, one might expect to find the best “inhibitory” frequencies very close to the excitatory BEF, due to the presence of TTS.

In conclusion, when looking at excitatory domain intensity tuning one must account for inhibition at high intensities. Using low-intensity threshold measures (such as best frequency) and assumptions about inhibitory domain shape is not necessarily reliable.


Aside from the interpretational limitations introduced by TTS, using 2-tone masking poses another limitation. The need to drive a cell repetitively (≥675 times) with a BEF tone biases the recorded samples to cells that can be reliably repetitively driven. This means cells with weak responses or that rapidly habituate to BEF tones will be undersampled. However, these are exactly the cells that are most likely to have strong inhibition (given that inhibition can cause weak responsiveness or habituation). Table 2 suggests that in PAF this type of bias was present because ED-intensity-tuned cells (cells presumably experiencing stronger inhibition, and therefore potentially not having intensity-tuned suppressive domains) were undersampled with the 2-tone paradigm. This effect was not observed in A1. This could partially explain why, when not using paired comparisons, the strength and lack of suppressive domain intensity tuning in PAF neurons was not as large as one might expect given the differences in ED-intensity tuning between A1 and PAF (Table 1). However, paired comparisons (Table 3) revealed the relationship between excitatory and suppressive domain intensity tuning in PAF.

Binaural versus monaural

In PAF we used binaural stimulation. This could affect the proportion of recorded intensity-tuned neurons. For example, Klug et al. (1999) show fast ipsilateral inhibition in the IC putatively from a feed-forward pathway from the lateral lemniscus; such feed-forward inhibition could contribute to ED-intensity tuning. Also, ED-intensity tuning can vary with different binaural stimulating conditions (Semple and Kitzes 1993a). Potential binaural stimulating differences are mitigated by two effects: 1) whereas overall response magnitudes change with binaural stimulating conditions, ED-intensity tuning in PAF does not seem to change (Orman and Phillips 1984); and 2) the 90 degree contralateral stimuli used in PAF would create substantially weaker ipsilateral stimulation than a stimulus presented at equal intensities to the 2 ears (the most common binaural stimulating condition).

Potential anesthetic effects

Pentobarbital may have an effect on these studies by potentiating the effect of GABAergic inhibition. The usual manifestation is reduced spontaneous activity and changes in temporal response patterns. Studies of excitatory and inhibitory response properties in cat and monkey primary auditory cortex indicate that barbiturate anesthesia does not have a significant effect on frequency, intensity, and temporal tuning (Calford and Semple 1995; Merzenich et al. 1984; Pfingst and O'Conner 1981; Pfingst et al. 1977; Stryker et al. 1987), even though the anesthetic nonspecifically decreases cortical activity. For interpretation, however, we cannot rule out that the anesthetic used has some effect on the results. Therefore all interpretations should be made cautiously.

Implications of upper and lower band correlation

That upper and lower band iMRs tended to covary is interesting because it indicates that the creation of these two bands is not independent. There are several possible ways this could happen. One way is that upper and lower bands ostensibly result from one broad inhibitory input that extends beyond both lower and upper edges of the excitatory frequency tuning curve. Another possibility is that the upper and lower bands are derived from two distinct inhibitory inputs, but that the inputs are matched in their intensity tuning, possibly because of the known clustering of similarly intensity-tuned cells in the auditory system.

In conclusion, by measuring fast inhibition in A1 and PAF we were able to demonstrate that intensity-tuned excitatory domains are inversely related to the cells' suppressive domain intensity tuning. The results are consistent with a model of multiple independent inhibitory mechanisms and parallel segregated intensity-tuned and untuned pathways in the central auditory system.


This work was supported by National Institutes of Health Grants R01 DC-02514 and F31 MH-11518 and the Sloan Foundation (M. L. Sutter is a Sloan Fellow).

Present address of W. C. Loftus: Department of Neuroscience, University of Connecticut Health Center, 263 Farmington Ave, Farmington, CT 06030.


We thank K. O'Connor and G. Recanzone for useful comments on the manuscript and M. McLean for technical assistance.


  • 1 Accordingly, a careful distinction between excitatory inputs and the tuning of the net responses to these inputs is required. When we refer to a portion of the frequency response area (FRA) that has a net excitatory effect, we will use the term excitatory domain. This refers to the measured response and helps to distinguish the tuning of the entire area from the excitatory input, which might have different tuning properties. The same nomenclature holds for suppressive domains and inputs, except suppressive domains may also be referred to as suppressive “bands” (Fig. 1). The word suppressive is used rather than inhibitory so as not to confuse the domains with the mechanisms that might create them; it is not intended to imply the mechanisms of 2-tone suppression in the basilar membrane.

    Furthermore, when we refer henceforth to what historically are called “non-monotonic” or “intensity-tuned” neurons, we will use the nomenclature excitatory domain (ED-) intensity-tuned neurons to distinguish this type of intensity tuning from intensity tuning of the inhibitory domain, which will be called inhibitory domain intensity tuning.

  • 2 The term inhibitory is used here to keep the nomenclature consistent with previous literature (e.g., Sutter et al. 1999).

  • 3 Note that lateral inhibition is not required to create intensity tuning. There are several other possibilities including high-threshold inhibition matched in BIF to the BEF. However, the inhibition must be fast relative to the excitation.

  • The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

