Based on properties of excitatory frequency (spectral) receptive fields (esRFs), previous studies have indicated that cat primary auditory cortex (A1) is composed of functionally distinct dorsal and ventral subdivisions. Dorsal A1 (A1d) has been suggested to be involved in analyzing complex spectral patterns, whereas ventral A1 (A1v) appears better suited for analyzing narrowband sounds. However, these studies were based on single-tone stimuli and did not consider how neuronal responses to tones are modulated when the tones are part of a more complex acoustic environment. In the visual and peripheral auditory systems, stimulus components outside of the esRF can exert strong modulatory effects on responses. We investigated the organization of inhibitory frequency regions outside of the pure-tone esRF in single neurons in cat A1. We found a high incidence of inhibitory response areas (in 95% of sampled neurons) and a wide variety in the structure of inhibitory bands ranging from a single band to more than four distinct inhibitory regions. Unlike the auditory nerve where most fibers possess two surrounding “lateral” suppression bands, only 38% of A1 cells had this simple structure. The word lateral is defined in this sense to be inhibition or suppression that extends beyond the low- and high-frequency borders of the esRF. Regional differences in the distribution of inhibitory RF structure across A1 were evident. In A1d, only 16% of the cells had simple two-banded lateral RF organization, whereas 50% of A1v cells had this organization. This nonhomogeneous topographic distribution of inhibitory properties is consistent with the hypothesis that A1 is composed of at least two functionally distinct subdivisions that may be part of different auditory cortical processing streams.
The sensory receptive field (RF) is a fundamental concept in neuroscience. In the visual system, the classical RF (CRF) has been defined as the RF determined using single spots or single bars of light (review in Allman et al. 1985). Stimulus elements outside of the CRF modulate neurons' responses to complex stimuli (e.g., Gilbert and Wiesel 1990; McIlwain 1966). These modulatory influences correlate with perceptual processes, such as visual segmentation (Knierim and Van Essen 1992; Lamme 1995), brightness induction (Rossi et al. 1996), and contour integration (Kapadia et al. 1995), implying that stimulus components from beyond the CRF influence our perceptions. Characterizing these influences is critical to understanding the relationship between neurophysiology and perception.
Frequency tuning curves (FTCs) define a spectral or frequency-intensity RF for auditory neurons. Typically, FTCs of auditory neurons have been defined with single-tone stimuli (analogous to single-spot light stimuli used to define the CRF) and have focused on the excitatory spectral RF (esRF). However, the effects of modulatory tones from outside of the esRF on responses to excitatory tones imply that single-tone tuning curves do not adequately characterize the spectral input of neurons (e.g., Imig et al. 1997; Nelken et al. 1994; Sachs and Kiang 1968; Suga and Tsuzuki 1985). As in the visual system, these influences should be crucial contributors to the analysis of complex sounds and auditory scenes.
On the basis of results in echolocating bats (Suga 1994), one would expect that two-tone modulatory effects differ between subdivisions of auditory cortex. However, nonlinear frequency integration in the auditory cortex of nonecholocating mammals has received little attention (Abeles and Goldstein 1972; Katsuki et al. 1959; Shamma and Symmes 1985; Shamma et al. 1993). Recently, it has been hypothesized that cat primary auditory cortex (A1) contains functionally distinct dorsal (A1d) and ventral (A1v) subdivisions (Sutter and Schreiner 1991, 1995). In A1, two gradients of bandwidth have been described with multiple-unit recordings (Schreiner and Mendelson 1990). The bandwidth gradients run in the dorsoventral direction, orthogonal to A1's tonotopic map. In the dorsoventral center of A1, where multiple-unit clusters are most narrowly tuned for frequency, there is a reversal in the bandwidth map, marking a natural dividing point. Broadly tuned single neurons (Schreiner and Sutter 1992) and neurons with multipeaked frequency tuning (Sutter and Schreiner 1991) are mostly found in A1d, implying that A1d is involved in analyzing complex patterns of frequency. Neurons in A1v are more sharply tuned, and the organization of A1v is consistent with a role in analyzing and detecting narrowband sounds (Schreiner and Sutter 1992; Sutter and Schreiner 1995). If A1d were involved in analyzing complex spectral patterns, one would expect to see complex inhibitory frequency RFs in A1d, contrasting with the simpler inhibitory tuning curves found at the level of the inferior colliculus.
In the present study, we investigated the spectral structure of two-tone inhibitory bands in cat A1 and show that A1 neurons have complex inhibitory spectral RFs (isRFs). The topographical distribution of different types of isRFs is consistent with the hypothesis that A1 contains at least two functionally distinct subdivisions that may belong to different auditory cortical processing streams.
We recorded single units from 3 left and 23 right hemispheres of 25 young adult cats. Surgical preparation, stimulus delivery, and recording procedures have been described previously (Sutter and Schreiner 1991), with exceptions noted in the following text.
Briefly, anesthesia was induced with an intramuscular injection of ketamine hydrochloride (10 mg/kg) and acetylpromazine maleate (0.28 mg/kg). After venous cannulation, an initial dose of pentobarbital sodium (30 mg/kg) was administered. Animals were maintained at a surgical level of anesthesia with continuous infusion of sodium pentobarbital (2 mg · kg−1 · h−1) in lactated Ringer solution (infusion volume: 3.5 ml/h) and, if necessary, with supplementary intravenous injections of pentobarbital. Pentobarbital may affect these studies by potentiating the effects of GABAergic inhibition. The usual effect is reduced spontaneous activity and changes in temporal response patterns. Studies of excitatory and inhibitory response properties in cat and monkey primary auditory cortex indicate that barbiturate anesthesia does not have a significant effect on frequency, intensity, and temporal tuning (Calford and Semple 1995; Merzenich et al. 1984; Pfingst and O'Conner 1981; Pfingst et al. 1977; Stryker et al. 1987), even though the anesthetic nonspecifically decreases cortical activity. For interpretation, however, we cannot rule out that the anesthetic used has some effect on the results obtained. The cats also were given dexamethasone sodium phosphate (0.14 mg/kg im) to prevent brain edema, and atropine sulfate (1 mg im) to reduce salivation. The rectal temperature of the animals was maintained at 37.5°C by means of a heated water blanket with feedback control.
Three-point head fixation was achieved with palatal-orbital restraint, leaving the external meati unobstructed. The temporal muscle was retracted and lateral cortex exposed by craniotomy. The dura overlying the middle ectosylvian gyrus was removed, the cortex was covered with silicone oil, and a video image of the vasculature was taken to record the electrode penetration sites. If brain pulsation interfered with single-unit recording, a system was used consisting of a wire mesh placed over the craniotomy. The space between the grid and cortex was filled with 1% clear agarose solution. The agarose-filled grid diminished cortical pulsation and provided an unobstructed view of cortex.
After 72–120 h of recording, the animal was anesthetized deeply and perfused transcardially with saline followed by formalin so that brain tissue could be processed for histology. Electrode positions were marked with electrolytic lesions (5–15 μA for 5–15 s) at the final few recording sites. Cresyl violet and fiber staining were used to reconstruct electrode positions from serial frontal 50-μm sections.
Stimulus generation and delivery
Experiments were conducted in double-walled sound-shielded rooms (IAC). Stimuli were generated by a microprocessor (TMS32010; 16 bit D/A converter at 120 kHz; low-pass filter of 96 dB/octave at 15, 35, or 50 kHz), and passively attenuated.
We used calibrated insert speakers (STAX 54) enclosed in small chambers that were connected to sound-delivery tubes sealed into the acoustic meati. This sound-delivery system was calibrated with a sound-level meter (Brüel and Kjaer), and distortions were measured either with waveform analyzers or a computer acquisition system. The frequency response of the system was essentially flat ≤14 kHz and did not have major resonances deviating more than ±6 dB from the average level. Above 14 kHz, the output rolled off at a rate of 10 dB/octave. Harmonic distortion was better than 55 dB below the primary.
Stimuli consisted of either short-duration shaped tones or tone pairs. Tone pairs were generated by adding the waveforms of two sine waves to create one complex waveform. Tone pairs were presented simultaneously, methodologically similar to presenting two simultaneous spots of light to the visual system. However, when the two tones are very close in frequency and amplitude, a complex form of AM of the envelope can occur. We did not notice any effect of these envelope fluctuations on the responses, which almost exclusively consisted of one to two short-latency spikes. This type of response is common both in awake cats (Abeles and Goldstein 1970, 1972) and mustached bats (e.g., Suga 1984) but not in all species (e.g., squirrel monkeys, Funkenstein and Winter 1973; macaque monkeys, Pfingst and O'Conner 1981). Each stimulus was 50 ms long, with 3-ms rise/fall time. The interstimulus interval (ISI) was 400–1,200 ms for pseudorandomly presented pure tones and between 600 and 1,200 ms for pseudorandomly presented tone pairs. ISIs were chosen based on the habituation of each cell. Typical ISIs were 600 ms for pure tones and 900 ms for tone pairs. Because A1d cells tended to habituate more than A1v cells, longer ISIs were more commonly used in A1d.
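The tone-pair construction described above (two sine waves summed into one waveform, 50 ms long with 3-ms rise/fall) can be sketched as follows. This is a minimal illustration, not the original stimulus-generation code: the linear ramp shape, the unit amplitudes, and the function name `tone_pair` are assumptions, and the 120-kHz sample rate is taken from the D/A converter rate stated above.

```python
import math

def tone_pair(f1_hz, f2_hz, a1=1.0, a2=1.0,
              dur_s=0.050, ramp_s=0.003, fs_hz=120_000):
    """Sum two sine waves and apply a rise/fall envelope.

    Assumption: the ramp is linear; the text gives only its duration.
    """
    n = round(dur_s * fs_hz)        # total number of samples
    n_ramp = round(ramp_s * fs_hz)  # samples in each ramp
    samples = []
    for i in range(n):
        t = i / fs_hz
        # Envelope rises from 0, holds at 1, and falls back to 0.
        env = min(1.0, i / n_ramp, (n - 1 - i) / n_ramp)
        wave = (a1 * math.sin(2 * math.pi * f1_hz * t)
                + a2 * math.sin(2 * math.pi * f2_hz * t))
        samples.append(env * wave)
    return samples

# Example: a 10-kHz BEF tone paired with a 7-kHz variable tone.
pair = tone_pair(10_000, 7_000)
```

When the two frequencies are close, the summed waveform beats at the difference frequency, which is the envelope modulation mentioned in the text.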
Parylene-coated tungsten microelectrodes (Microprobe) with impedances of 1.0–8.5 MΩ at 1 kHz were inserted into auditory cortex with a hydraulic microdrive. Penetrations were approximately orthogonal to the brain surface. Recordings were made at depths from 600 to 1,000 μm below the cortical surface as determined by the microdrive. Dimpling of the cortical surface was usually <100 μm and thus was not a major factor in determining electrode depth. Histological verification from several animals indicated that the recording sites were from cortical layers 3 and 4. The electrical signal from the electrode was amplified, band-pass filtered, and monitored on an oscilloscope. Action potentials from individual neurons were isolated with a window discriminator.
Single-tone frequency response areas
Frequency response areas (FRAs) were obtained for each unit. We presented 675 tone bursts in a pseudorandom sequence of different frequency-intensity combinations selected from 15 intensities and 45 frequencies. The intensities were spaced 5 dB apart for a total range of 75 dB. The frequencies covered by 45 logarithmically spaced steps ranged between 2 and 5 octaves, depending on the estimated frequency tuning curve (FTC) width. Typically we used a three-octave range, centered on the cell's characteristic (most sensitive) frequency (CF), which provided 0.067-octave resolution between samples (Fig. 1A).
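The stimulus grid just described can be sketched as below. The step size is taken as range/45 to match the stated 0.067-octave resolution for a three-octave range. The lowest intensity (`min_level_db`), the shuffling seed, and the function name are hypothetical and serve only to make the example runnable.

```python
import math
import random

def fra_stimulus_grid(cf_hz, octave_range=3.0, n_freqs=45,
                      n_levels=15, level_step_db=5.0,
                      min_level_db=5.0, seed=0):
    """Build a pseudorandomly ordered frequency-intensity grid.

    Assumptions: min_level_db and seed are illustrative values only.
    """
    step_oct = octave_range / n_freqs  # 3 / 45 = 0.067 octaves
    # Logarithmically spaced frequencies centered on CF.
    freqs = [cf_hz * 2 ** ((i - (n_freqs - 1) / 2) * step_oct)
             for i in range(n_freqs)]
    levels = [min_level_db + j * level_step_db for j in range(n_levels)]
    grid = [(f, level) for f in freqs for level in levels]
    random.Random(seed).shuffle(grid)  # pseudorandom presentation order
    return freqs, levels, grid

freqs, levels, grid = fra_stimulus_grid(cf_hz=10_000)
```

Each of the 45 × 15 = 675 combinations appears once per pass through the grid.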
Because of the time constraints of single-unit recording, we characterized FRAs based on as few stimulus repetitions as possible. If a response was evoked for more than ∼50% of the stimuli inside of each excitatory band, the curve was deemed well defined. In A1, this 50% criterion corresponds roughly to the mean spikes per presentation minus the standard deviation. If after one presentation per frequency-intensity combination the resulting FRA was not well defined, the process was repeated with the same 675 stimuli, and the resulting responses were added. The FRA recording procedure was repeated up to five times.
Single-tone FTC construction
FTCs, the borders of the excitatory spectral receptive fields, were constructed from the FRA based on a criterion of the estimated spontaneous rate plus 20% of the peak rate. This criterion was applied after the response to each stimulus in the FRA had been smoothed by weighted averaging with its eight frequency-intensity neighbors. Smoothing increased the effective number of presentations per frequency-intensity combination by a factor of 2.5 at the expense of frequency resolution. This method was robust, yielding comparable results across repeated measures (see Table 3 of Sutter and Schreiner 1991).
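A minimal sketch of this smoothing-plus-criterion step is given below. The text does not specify the smoothing weights, so the values used here (center 1, edge neighbors 0.25, corner neighbors 0.125, summing to 2.5) are an assumption chosen only to be consistent with the stated factor of 2.5. Taking the peak from the smoothed FRA is also an assumption.

```python
def smooth_fra(fra, center=1.0, edge=0.25, corner=0.125):
    """Weighted average of each FRA point with its eight neighbors.

    Assumption: the weights are illustrative (they sum to 2.5).
    At the borders, only the neighbors that exist are used.
    """
    n_rows, n_cols = len(fra), len(fra[0])
    out = [[0.0] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for c in range(n_cols):
            total = 0.0
            weight_sum = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < n_rows and 0 <= cc < n_cols:
                        if dr == 0 and dc == 0:
                            w = center
                        elif dr == 0 or dc == 0:
                            w = edge
                        else:
                            w = corner
                        total += w * fra[rr][cc]
                        weight_sum += w
            out[r][c] = total / weight_sum
    return out

def excitatory_mask(fra, spont_rate):
    """Mark points above spontaneous rate + 20% of the peak rate."""
    smoothed = smooth_fra(fra)
    peak = max(max(row) for row in smoothed)
    criterion = spont_rate + 0.2 * peak
    return [[value > criterion for value in row] for row in smoothed]
```

The border of the region marked `True` corresponds to the FTC in this simplified picture.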
Spontaneous activity of most cat A1 neurons under barbiturate anesthesia is too low to measure suppression by a single stimulus. Therefore we used a two-tone simultaneous masking paradigm to measure response suppression. For two-tone FRAs, 675 different tone-pairs were presented. For each tone-pair, one component, “the BEF-tone” was at the cell's best excitatory frequency (BEF) with energy just above response threshold. The purpose of this component was to drive the cell reliably at the lowest possible intensity (usually 10–20 dB above threshold). The second component (the “variable tone”) had a frequency and intensity chosen using the same pseudorandom procedure as described for single-tone FRAs. This component allowed us to determine the frequency-intensity ranges that suppressed BEF tone responses. If the BEF tone response was not reliable (e.g., the mean BEF tone activity, in spikes per presentation, was less than the standard deviation or response probability was <0.25), the procedure was repeated. The resulting responses of multiple presentations then were added. Because we presented >675 stimuli all containing the BEF tone, habituation or adaptation sometimes caused the response to decrease over time. In those cases, we repeated the two-tone FRA several times.
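The reliability check on the BEF-tone response can be written compactly. The sketch below assumes a list of spike counts, one per presentation, and uses the sample standard deviation; the text does not say whether the sample or population form was used, and the function name is hypothetical.

```python
import statistics

def bef_response_unreliable(spikes_per_presentation):
    """Return True if the two-tone FRA should be repeated.

    Criteria from the text: mean spikes per presentation below the
    standard deviation, or response probability below 0.25.
    Assumption: the sample standard deviation is used.
    """
    mean = statistics.mean(spikes_per_presentation)
    sd = statistics.stdev(spikes_per_presentation)
    n_responses = sum(1 for s in spikes_per_presentation if s > 0)
    response_prob = n_responses / len(spikes_per_presentation)
    return mean < sd or response_prob < 0.25
```

A cell firing about one spike on most presentations passes this check, whereas a cell firing a rare burst does not.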
We used low-intensity BEF tones for several reasons. First, we wanted to provide as little excitatory drive to the cell as possible to have a high sensitivity in detecting suppression. Second, we wanted to remain in the linear part of the firing rate versus intensity function, which usually is less affected by habituation and by matched-CF inhibition. Excitatory responses often result from both inhibitory and excitatory inputs, and therefore we chose low intensity probe tones to minimize the potential inhibitory effects of the BEF tone (e.g., in “nonmonotonic,” intensity-tuned cells) on the variable tone.
Three-dimensional plots of the response magnitude versus the variable tone's intensity and frequency were constructed (Figs. 1B and 2, B and C). In Fig. 1B, the BEF was 10 kHz at 40 dB. Notice that the bottom row of the two-tone FRA is almost completely filled. For example, a strong response can be seen when the variable tone is 5 kHz at 20 dB (bottom arrow in Fig. 1B), demonstrating that the BEF tone can drive the cell reliably when added to a low-intensity (≤20 dB) variable tone. However, when the variable tone was 7 kHz at 90 dB (top arrow), the response to the BEF tone was not present, indicating that energy at 7 kHz and 90 dB inhibits the response normally associated with the BEF tone of 10 kHz and 40 dB. A similar inhibitory relationship can be found when the variable tone is 12 kHz at 80 dB for this cell and for a wide range of variable tone frequencies for the cell the two-tone FRA of which is depicted in Fig. 2.
Two-tone FTC construction
Inhibitory tuning curves marking the spectral borders of two-tone inhibition were derived using the methods of Sutter and Schreiner (1991). These analyses were performed blindly with respect to dorsoventral location. The response to the BEF tone was approximated by calculating the mean and standard error of the mean for the responses from the bottom two rows of the FRA (90 presentations), which essentially consisted only of responses to the BEF tone. Controls with audiovisual monitoring were used to confirm that these low-intensity variable tones did not affect the response. For each point on the FRA, eight-point weighted smoothing (or 5-point smoothing at the edges of the FRA) was performed with nearest neighbors on the FRA. Then an iso-inhibitory response criterion of a 50% reduction of the mean response to the BEF tone was applied to all stimulus conditions in the FRA. Examples of FTCs derived by this method are shown in Fig. 4.
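The baseline and 50% criterion can be sketched as follows. For brevity the example omits the smoothing step and assumes the two-tone FRA is a list of rows ordered from lowest to highest variable-tone intensity; the function names are hypothetical.

```python
import statistics

def bef_baseline(two_tone_fra):
    """Mean and SEM of responses in the two lowest-intensity rows.

    Assumption: rows are ordered from lowest to highest intensity.
    """
    samples = [v for row in two_tone_fra[:2] for v in row]
    mean = statistics.mean(samples)
    sem = statistics.stdev(samples) / len(samples) ** 0.5
    return mean, sem

def inhibitory_mask(two_tone_fra, reduction=0.5):
    """Mark points where the response falls to 50% of baseline or less."""
    mean, _ = bef_baseline(two_tone_fra)
    cutoff = (1.0 - reduction) * mean
    return [[v <= cutoff for v in row] for row in two_tone_fra]
```

With 45 frequencies per row, the two bottom rows supply the 90 presentations mentioned in the text.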
The first step of FTC analysis was to identify each excitatory, suppression, and inhibitory band in an FRA. Then the upper and lower frequency bounds of each excitatory and inhibitory or suppression band were calculated at all intensities. These, as well as all measured and calculated values, were entered into a relational database. The following definitions are useful in understanding how we classified bands. It is important to note that we use the term “inhibitory bands” to describe the appearance of distinct inhibitory and/or suppressive regions in the spectral RF and not to allude to any specific mechanisms creating the RF (see discussion). Although the term two-tone suppression usually refers to mechanisms encountered in the basilar membrane and two-tone inhibition usually refers specifically to neural inhibition in the central auditory system, for simplicity we henceforth will use the term inhibition. Because an often nonseparable mixture of both mechanisms contributes to our results, we must be careful to state that the term inhibition is not meant to allude to mechanisms that exclude basilar membrane contributions.
Definitions of the bands are as follows. “Excitatory band” is a continuous frequency-intensity space on the FTC for which excitatory responses were recorded. Most cells only had one excitatory band, but cells with “multi-peaked” FTCs had more than one distinct excitatory region. “Inhibitory band” is a continuous frequency-intensity space on the two-tone FTC where BEF tone activity was reduced by 50% or more by the variable tone. Most cells had multiple inhibitory bands. “Lower inhibitory band” is an inhibitory band on the low-frequency side of the excitatory FTC. In Fig. 1B, the 7-kHz, 90-dB tone falls within a lower inhibitory band. “Upper inhibitory band” is an inhibitory band on the high-frequency side of the excitatory FTC. In Fig. 1B, the 12-kHz, 80-dB tone falls within an upper inhibitory band. “Middle inhibitory band” is an inhibitory band completely contained within the frequency range of the excitatory FTC. If an inhibitory band extended beyond the lateral frequency borders of the excitatory FTC, it was not classified as a middle band.
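These definitions can be illustrated with a simplified classifier that compares the frequency limits of one inhibitory band with those of the excitatory FTC. It works on a single pair of frequency ranges rather than on the full frequency-intensity outlines, and the "ambiguous" label for a band spanning both borders is an assumption added for the example. The frequency values in the usage lines are hypothetical.

```python
def classify_band(band, excitatory):
    """Label an inhibitory band relative to the excitatory range.

    band, excitatory: (low_hz, high_hz) tuples.
    Assumption: a band overlapping both excitatory borders is
    returned as "ambiguous" rather than forced into a class.
    """
    b_lo, b_hi = band
    e_lo, e_hi = excitatory
    if e_lo <= b_lo and b_hi <= e_hi:
        return "middle"     # completely contained in the excitatory range
    if b_lo < e_lo and b_hi > e_hi:
        return "ambiguous"  # extends beyond both lateral borders
    if b_lo < e_lo:
        return "lower"      # lies on or crosses the low-frequency side
    return "upper"          # lies on or crosses the high-frequency side

excitatory = (9_000, 11_000)  # hypothetical excitatory range in Hz
labels = [classify_band(b, excitatory)
          for b in [(6_500, 7_500), (11_500, 13_000), (9_500, 10_500)]]
```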
Determination of strength of intensity tuning
We used the monotonicity ratio measure to quantify the strength of intensity tuning (Sutter and Schreiner 1995). The monotonicity ratio is based on the spike count versus intensity function near the unit's CF. Spike count versus intensity functions (Fig. 3) were created from the FRA in the following manner: at each intensity level, the number of action potentials from a one-quarter-octave bin around the unit's BEF and a 15-dB wide intensity bin were summed. One-quarter octave usually comprised four different frequencies and 15 dB usually covered three levels of intensity, providing a minimum of 12 different stimulus presentations per data point in the spike count versus intensity functions. Only units that were recorded from a minimum of 45-dB above threshold were used to analyze monotonicity ratio: the number of spikes elicited at the highest intensity divided by the number of spikes at the maximum of the spike count versus intensity function. Therefore a cell that fired maximally at the highest tested intensity had a monotonicity ratio of 1; a cell that was completely inhibited at the highest intensities had a monotonicity ratio of 0. For example, the neuron the FRA of which is shown in Fig. 2 A has a monotonicity ratio of 1, and the cell the FRA of which is depicted in Fig. 4 D has a monotonicity ratio of 0.2.
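As a worked illustration, the monotonicity ratio can be computed from a spike count versus intensity function as shown below. The input is assumed to be a list of summed spike counts ordered from lowest to highest intensity, and the example values are invented.

```python
def monotonicity_ratio(spike_counts):
    """Spikes at the highest intensity divided by the maximum count.

    spike_counts: summed spikes per intensity level, ordered from
    lowest to highest intensity.
    """
    peak = max(spike_counts)
    if peak == 0:
        raise ValueError("no spikes recorded; ratio is undefined")
    return spike_counts[-1] / peak

monotonic = monotonicity_ratio([0, 2, 5, 9, 12])     # fires most at the top level
nonmonotonic = monotonicity_ratio([0, 4, 10, 6, 2])  # intensity tuned
```

The first example gives a ratio of 1 and the second a ratio of 0.2, matching the two kinds of cells described in the text.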
Topographical assignment of single units
The method of assigning single cells as belonging to A1d or A1v was described in a previous paper (Schreiner and Sutter 1992). Briefly, a multiple-unit premapping is used to determine the boundary between dorsal and ventral A1. The multiple-unit mapping is only used to define the boundaries of dorsal and ventral A1. All inhibitory properties reported in this paper are based on single-unit recordings. We characterized the multiple-unit map of excitatory bandwidth, which has a highly consistent form across animals, along dorsoventrally oriented isofrequency lines. Clusters are broadly tuned at the dorsal edge of A1 and become progressively more sharply tuned with more ventral recording sites. An absolute minimum in bandwidth is reached near the dorsoventral center of A1 where most clusters are sharply tuned to frequency, i.e., they have narrow frequency receptive fields. The trend then reverses and clusters become more broadly tuned with further ventral progression. Although there can be relative minima of bandwidth in dorsal regions as well (Schreiner and Mendelson 1990), we used the location of the gradient reversal at the absolute minimum in the bandwidth to define the border between dorsal and ventral A1 and assigned it a value of 0 mm. Within this coordinate system, we pooled data across animals (Schreiner and Sutter 1992). For some animals, we recorded from as many single units as possible in A1 and did not premap. Therefore although all cells in this study could be localized to A1, not all cells could be assigned to A1d or A1v.
We recorded two-tone FRAs from 163 single neurons in A1 with sufficient reliability to calculate inhibitory FTCs. In general, A1 neurons exhibited a wide variety in the properties of their inhibitory frequency tuning curves. Some neurons had single lower and upper inhibitory bands with a fairly narrowly tuned single excitatory band (Fig. 1B). Other neurons had broad inhibitory frequency ranges (Fig. 2, B and C). In the example shown in Fig. 2, B and C, the inhibition was extremely broad, extending beyond the tested frequency range of five octaves. Other classes of two-tone inhibitory bands are shown in Fig. 4. Intensity-tuned (“nonmonotonic”) cells could have substantial inhibition at the characteristic frequency (Fig. 4, D–F). Cells with multiple excitatory bands (multipeaked) also characteristically had inhibitory areas within the boundaries of the excitatory tuning curve (Fig. 4, G–I). Additionally, cells could have fewer (Fig. 4, J–L) or more (Fig. 5) than two inhibitory bands. Overall, it is not possible to identify a prototypical pattern of two-tone inhibitory frequency tuning in A1 as has been possible in the auditory nerve.
Spectral properties of two-tone inhibitory bands
To quantify the diverse spectral properties of inhibitory components, every inhibitory band was assigned one of five labels (Fig. 5). An inhibitory band positioned on the low- or high-frequency side of the tuning curve was assigned as a “lower” or “upper” band, respectively. An inhibitory band completely within the frequency range of single-tone excitation was assigned as a “middle band” (Figs. 5B and 4, G–I). When there were two inhibitory bands on the low-frequency side of the excitatory tuning curve, the band closest to the low-frequency edge of the excitatory curve was called “the lower inhibitory band.” The band farther from the excitatory curve was assigned as the “second lower inhibitory band” (Fig. 5A). Similar nomenclature was used when there were two upper bands (Fig. 5C).
The largest single class was neurons with only one upper and one lower inhibitory band (Fig. 6, A and B), constituting 38% (62/163) of the recorded population. Cells with two lower bands and one upper band (Fig. 6, C and D) comprised the next largest class (13%, 21/163). Cells with two upper bands and one lower band (Fig. 6, E and F) comprised the next most common class (10%, 16/163), followed by cells combining lower, middle, and upper bands (8%, 13/163) and cells with a single lower band (6%, 11/163). Other configurations of inhibitory bands comprised <6% each of the population.
In Fig. 7, the percentages of neurons with different band structures are shown. Schematized frequency tuning curves are illustrated below the histogram. The percentages must be viewed as approximate because assigning a frequency range to a band could be ambiguous. Ambiguities could occur when two or more bands merged into one (e.g., Figs. 4, D–F, and 6, E and F) or when a band crossed over from a low to middle or from a high to middle position (Figs. 4, D–F, and 6B). When multiple assignments were possible, we reverted toward a two-band model with one upper and one lower band.
Topography of inhibitory band structure
Functional implications can be gleaned from the spatial distribution of inhibitory band patterns along the isofrequency domain of A1. Previous studies of excitatory RF properties revealed distinct differences between dorsal and ventral locations in cat A1 (Sutter and Schreiner 1991). To test whether a similar distinction can be based on inhibitory effects from beyond the esRF, we analyzed the distribution of inhibitory bands separately for A1v and A1d.
For a subset of neurons (n = 61), assignment to A1d or A1v was possible. The inhibitory tuning curves of these cells, which were analyzed blindly with respect to dorsoventral location, provide further support that A1d and A1v are functionally distinct.
Cells with nontraditional band structures were significantly more common in A1d than in A1v (χ2, P = 0.0065). Half of the A1v cells (50%, 18/36) had only one upper and one lower inhibitory band, whereas this structure was less common (16%, 4/25) in A1d. In fact, this band structure was not the most commonly encountered in A1d, where cells with two lower and one upper inhibitory band (20%, 5/25) and cells with a lower, middle, and upper band (20%, 5/25) were most common (Fig. 8). The two-band structure became progressively more probable as one moved more ventral within A1 (Fig. 9).
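The regional comparison can be reproduced approximately from the counts given above (18 of 36 A1v cells and 4 of 25 A1d cells with the two-band structure). The sketch below assumes a Pearson χ2 test on the resulting 2 × 2 table without continuity correction; the text does not state which variant was used. Under that assumption the result is χ2 of about 7.4 with P of about 0.0065.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for [[a, b], [c, d]].

    Returns the statistic and the P value for 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom the upper-tail probability is
    # erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Rows: A1v, A1d. Columns: two-band (LU) cells, all other cells.
chi2, p_value = chi_square_2x2(18, 18, 4, 21)
```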
Dependence of inhibitory characteristics on BEF tone parameters
One concern is that the choice of BEF tone might significantly affect the observed band structure either directly or by having an effect on the strength of excitatory drive. As mentioned in methods, we attempted to keep the BEF tone at the BEF and at as low an intensity as possible while driving the cells reliably; however, occasionally, experimental constraints made such a choice impossible. Some cells could not be driven strongly enough at the standard selection of frequency and intensity, e.g., due to habituation, and we had to modify the BEF tone parameters during the recording session. To test whether the range of BEF-tone levels relative to threshold had an influence on the structure of the inhibitory bands observed, we divided the population of two-tone tuning curves into three groups: those with low-intensity probe tones (<12 dB above threshold); those with moderate-intensity probe tones (>12 and <25 dB above threshold); and those with high-intensity probe tones (>25 dB above threshold). This grouping was chosen to obtain as close to three equally sized groups as possible. No statistically significant differences among these groups were found. It appears that at very high probe intensities the percentage of cells with simple two-band structure decreased, although this effect was not significant (χ2-test, P = 0.50; Table 1). The mean probe intensities were identical (19.4 dB above threshold, Kolmogorov-Smirnov, P = 0.68) for A1d and A1v, indicating that the observed differences in complex band structure between these brain areas could not be due to choice of probe intensity. Similarly, the small variation in probe tone frequency relative to CF did not significantly affect our results (Table 1). Also neither the strength of activity evoked by the BEF tone (χ2, P = 0.84, n = 146) nor the characteristic frequency (χ2, P = 0.55, n = 146) had a significant effect on the percentages of simple-banded cells.
Relationship of inhibitory bands to excitatory properties
We analyzed the relationship between inhibitory band structure and sharpness of frequency tuning. Sharpness of tuning was assessed by the BW40 measure, which is the bandwidth, in octaves, 40 dB above a neuron's threshold. For cells with two discrete excitatory ranges, the bandwidth was computed by taking the outer limits of the two. In general, cells with a larger number of inhibitory bands tended to have sharper tuning (Fig. 10, linear regression, P = 0.001, slope = −0.18 octaves per band). To probe the relationship between inhibition and frequency selectivity in more detail, a number of statistical tests were performed. For each of the first five tests, cells first were classified into one of two groups depending on the presence or absence of one type of inhibitory band in the isRF (e.g., lower band, upper band, 2nd lower band, etc.). Then the BW40s of the two groups were compared with a nonparametric statistical test. The results of each of these tests are in Table 2. When interpreting these results, it is important to note that the same population of cells is used in each test (i.e., each row of Table 2), and the cells are grouped differently for the purposes of each test. (For example, cells with a lower band include both cells with and without an upper band.)
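The BW40 calculation, including the outer-limits rule for cells with two discrete excitatory ranges, can be sketched as below. The input format (a list of frequency ranges measured 40 dB above threshold) and the example frequencies are hypothetical.

```python
import math

def bw40_octaves(excitatory_ranges):
    """Bandwidth in octaves from the outer limits of the excitatory ranges.

    excitatory_ranges: list of (low_hz, high_hz) tuples measured
    40 dB above threshold; multipeaked cells supply more than one.
    """
    low = min(lo for lo, _ in excitatory_ranges)
    high = max(hi for _, hi in excitatory_ranges)
    return math.log2(high / low)

single_peaked = bw40_octaves([(8_000, 16_000)])
multi_peaked = bw40_octaves([(4_000, 5_000), (10_000, 16_000)])
```

Here the single-peaked example spans one octave and the multipeaked example spans two octaves, because the gap between its two excitatory ranges is included.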
The presence of upper inhibitory bands in the isRF was clearly related to sharper frequency tuning in the esRF. The mean bandwidth of cells with an upper band was 0.49 octaves less than the mean bandwidth of cells lacking an upper band (Mann-Whitney U, P < 0.001). Similarly, cells with a second upper inhibitory band had 0.42 fewer octaves of bandwidth at 40 dB above threshold than cells lacking a second upper band (Mann-Whitney U, P < 0.01). The presence of lower inhibitory bands in the isRF was associated with sharper frequency tuning in the esRF, but these comparisons were not statistically significant. Cells with the classic LU inhibitory band structure also tended to have excitatory structure typical of A1 cells with sharp frequency tuning; however, this difference was not statistically significant (Table 2).
A result that appears surprising, at first, is that cells with a middle inhibitory band were significantly more broadly frequency tuned than neurons lacking a middle inhibitory band. This result, however, might be expected because of our strict definition of a middle inhibitory band. By our definition, the frequency range of a middle inhibitory band is contained completely within the frequency range of the esRF. Therefore for a middle band to be classified, excitation must be broader than inhibition. Accordingly inhibition extending beyond the esRF is associated with sharp frequency tuning, whereas inhibition completely within the esRF is associated with a broad extent of excitatory input.
Because intensity tuning must be created by inhibition in the central auditory system, one would expect inhibitory properties to be related to intensity tuning. Therefore we analyzed the relationship between inhibitory band structure and strength of intensity tuning. Sharpness of intensity tuning was assessed by the monotonicity ratio, which describes whether a cell's firing rate increases monotonically as a function of stimulus intensity. The relationship between monotonicity ratio and number of bands was not significant (not shown). Therefore we took a closer look at the relationship in a manner analogous to the analysis of frequency selectivity, above. The results are presented in Table 2.
The presence of each type of inhibitory band was related to sharper intensity tuning. Neurons with a second upper inhibitory band had stronger intensity tuning, with a mean monotonicity ratio of 0.48 compared with a mean monotonicity ratio of 0.68 for cells with no second upper band. This means that the average response to loud stimuli of a neuron with a second upper inhibitory band was only 48% as great as its response to the best intensity. In cells without a second upper inhibitory band, the average response at loud intensities was 68% of the response at the best intensity. This difference was significant (Mann-Whitney U, P = 0.017, see Table 2). On average, sharper intensity tuning was always seen for cells having a lower, upper, second lower, second upper, or middle inhibitory band when compared with cells lacking one of these bands. These five comparisons represent all five possible band locations, and the probability that all five tests would yield the same effect is 2^−5, or 0.031. Cells with traditional two-band inhibitory structure (LU) also were less tuned for intensity than cells with other structures, further supporting the idea that additional inhibitory bands are associated with stronger intensity tuning.
Topographical differences in BW40 and intensity tuning cannot completely account for the topography of inhibitory properties. Although LU structure was significantly more common in A1v, monotonicity ratio was not statistically different between these two subdivisions. Also, differences in BW between A1d and A1v did not reach significance in this study. It should be noted that in other studies, a statistically significant relationship has been reported between BW40 and dorsal-ventral location (Schreiner and Sutter 1992). The lack of a significant effect of BW in this study was probably due to the smaller sample of topographically localized cells (n = 63) than in Schreiner and Sutter 1992 (n = 103). However, the differences in the topography of inhibitory properties do reach significance in this study, implying a stronger relationship between inhibitory properties and topography than between excitatory properties and topography.
These studies demonstrate various complex inhibitory band structures arising from tones outside of esRFs in A1 neurons. In general the structure of inhibitory frequency regions substantially differed from simpler inhibitory band structures described in the ascending auditory system. The spectral structure of two-tone inhibitory response areas varied systematically within A1. A simple, “center-surround” inhibitory band structure was more common in A1v, and more complex, derived side-band structures were seen in A1d. These data suggest that A1d neurons are better suited to analyze spectrally diverse sounds, such as vowels with multiple formants or other communication sounds with multiple spectral components, whereas cells in A1v view the world through a simpler frequency “aperture.” This difference between A1v and A1d constitutes further evidence supporting the hypothesis that A1v and A1d are distinct subregions of A1 (Sutter and Schreiner 1991).
Potential importance of two-tone inhibition in auditory perception and signal processing
A fundamental question is, How do the complex isRFs reported herein relate to sound analysis? Although speculative, there is evidence that two-tone isRFs are related to the analysis of more complex signals. For example, two-tone inhibitory areas are necessary contributors to A1 cells' abilities to differentiate between frequency-modulated sweeps (e.g., Fuzessery and Hall 1996; Shamma et al. 1993; Suga 1965) and monaural sound localization cues (Imig et al. 1997). Two-tone isRFs can be used to predict cells' responses to spectrally complex sounds such as “ripple spectra,” the energy of which is distributed across both esRF and isRF (Schreiner and Calhoun 1994; Shamma and Versnel 1995). It has been demonstrated that the location and extent of two-tone inhibitory bands provide a better prediction of the response to ripple spectra than properties of the esRF.
More generally, two-tone sRFs predict auditory cortical cells' responses to arbitrarily complex stimuli better than esRFs. Nelken and colleagues (1994) determined the predictive power of single-tone, two-tone, and higher-order responses to complex stimuli. They concluded that the combination of single- and two-tone RFs was sufficient to predict responses to complex stimuli and that increasing the stimulus content beyond two tones provided only marginal improvement in predictive power. Therefore the inhibitory modulatory effect of extra-esRF tones may play an important role in complex spectral analysis.
Support that two-tone effects are indicative of spectral integration can be gleaned further from a closer look at neurons with multi-peaked frequency tuning curves. More than ¾ of these cells, which are primarily found in A1d, display two-tone enhancement when tones are presented from each tuning curve peak simultaneously (Sutter and Schreiner 1991), indicating an integrative role for these cells. About half (11/23) of the multipeaked neurons from this study had a middle inhibitory band. Taking all the data, one might predict that A1 cells prefer complex spectra with energy in the excitatory areas and lacking energy in the middle inhibitory areas.
In this respect, it is interesting to note that the esRFs of some cells are quite similar with respect to shape and bandwidth, but the frequency extent of the inhibition outside of the esRF is strikingly different. For example, the cells whose FRAs are shown in Figs. 2, 4, A and J, and 6 C have similar excitatory curves, yet vastly different inhibitory curves, as do the tuning curves from the cells in Fig. 6, D and E. These differences in the isRFs of cells with similar excitatory tuning are consistent with the notion that the isRF is carving out a reject band rather than solely shaping the excitatory response.
Relationship of isRFs in this study to other studies of inhibition and intensity tuning in A1
Many A1 neurons are intensity tuned, i.e., their firing rates decrease at high intensities. Intensity tuning must result from centrally generated inhibition for two reasons. First, all auditory nerve fiber firing rates increase monotonically with increasing intensity. Second, the highly phasic short-latency onset-only response of auditory cortical neurons rules out adaptation as the cause of intensity tuning. Several hypotheses have been advanced to account for the mechanisms underlying intensity tuning (temporal envelope: Heil 1997; Heil and Irvine 1998; spectral splatter: Phillips 1988; Phillips et al. 1994; matched BF inhibition: Caspary et al. 1994). All of these hypotheses require short-latency inhibition. Our results show a relationship between short-latency inhibition and intensity tuning that is consistent with all of these hypotheses. Although understanding the frequency extent of short-latency inhibition could distinguish among these hypotheses, the variety of inhibitory band structures in intensity-tuned neurons (e.g., Figs. 4 and 6) does not suggest one single mechanism but rather supports the existence of several mechanisms.
Calford and Semple (1995) measured isRFs in A1 neurons using a forward masking paradigm where inhibitory tones were presented before BEF tones. Forward masking has the advantages of ruling out peripheral two-tone suppression and of allowing the dynamics of inhibition to be studied. However, forward masking is not an ideal method to characterize how inhibition shapes esRFs because any inhibition shaping esRFs has to be very fast to affect the highly phasic short-latency excitatory responses of A1 neurons. The delay between presentation of the inhibitory tone and the BEF tone and potential offset effects of the inhibitory tone can confound the relationship between isRFs and tuning curve shape.
Using short intertone intervals, Calford and Semple reported that no isRFs were intensity tuned. This result is consistent with ours, where inhibitory and BEF tones are presented simultaneously. However, when using an 80-ms difference between onsets, they found that many of the inhibitory bands were intensity tuned, particularly when the excitatory response was intensity tuned. There are too many possible mechanisms and not enough data to speculate on the cause of these differences at longer delays between inhibitory and BEF tones. Contributors to these mechanisms could be differences in latency-intensity functions and phasicness of inhibition and excitation, postinhibitory rebounds, subthreshold offset responses, desensitization of inhibitory receptors, and adaptation of inhibition and excitation, to name a few.
Finally, it can be argued that the inhibition observed in intensity-tuned cells occurring when the variable tone is at the BEF is due to the tone pair essentially behaving like a loud BEF tone. This can be the case because at frequencies near the BEF, two-tone and intensity nonlinearities are related. Nevertheless, the weakening of response to loud tones in the single-tone case must be due to inhibition. Two-tone techniques therefore still quantify the spectral extent of the inhibition that carves out intensity tuning in ways that cannot be achieved with single tones (e.g., Fig. 4 F).
Potential underlying mechanisms creating complex isRFs
Another fundamental question is how are these complex isRFs formed? Any mechanism could be implemented by cortical or subcortical inhibition. Because our study was designed to characterize inhibitory regions and not to determine the underlying mechanisms, the following hypotheses must be considered highly speculative.
There are two potential mechanisms for creating LU band structure. Pharmacological studies (Caspary et al. 1994; Evans and Zhao 1993) indicate that inhibition in anteroventral cochlear nucleus (AVCN) and dorsal cochlear nucleus (DCN) can be in tonotopic alignment with the excitatory response areas. In pallid bats (Fuzessery and Hall 1996), horseshoe bats (Vater et al. 1992), and chinchillas (Palombi and Caspary 1996), there are data to support the idea that the inhibitory inputs in the central nucleus of the inferior colliculus (ICc) are frequency-matched to the excitation. Intracellular and extracellular cortical responses in cats also are consistent with matched or recurrent inhibition (e.g., Volkov and Galuzjuk 1991). Therefore “simple two-banded structure” in the isRF can result from one broad inhibitory input in tonotopic register with a more frequency-limited excitatory input (Fig. 11). Another potential cause of LU band structure is the convergence of two distinct inhibitory inputs. In mustached bat ICc, both matched and unmatched inhibitory inputs have been hypothesized (Yang et al. 1992).
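The first mechanism above, one broad inhibitory input in tonotopic register with a narrower excitatory input, can be illustrated with a toy calculation. This is a minimal sketch and not a model from this study: the Gaussian tuning shapes, the widths and gains, and the linear subtraction of inhibition from excitation are all assumptions chosen only to show how two flanking bands can arise from a single inhibitory input.

```python
import math
from typing import List, Tuple


def gaussian(x: float, center: float, width: float, gain: float) -> float:
    """Gaussian tuning profile; x, center, and width are in octaves."""
    return gain * math.exp(-0.5 * ((x - center) / width) ** 2)


def net_input(octaves_from_bf: float) -> float:
    """Narrow excitation minus broad inhibition, both centered on the BF."""
    excitation = gaussian(octaves_from_bf, 0.0, 0.25, 1.0)  # assumed values
    inhibition = gaussian(octaves_from_bf, 0.0, 1.0, 0.5)   # assumed values
    return excitation - inhibition


def inhibitory_bands(step: float = 0.05, span: float = 2.0) -> List[Tuple[float, float]]:
    """Contiguous frequency ranges (octaves re BF) where the net input is negative."""
    n = round(span / step)
    bands: List[Tuple[float, float]] = []
    start = None
    for i in range(-n, n + 1):
        x = i * step
        inhibited = net_input(x) < 0.0
        if inhibited and start is None:
            start = x                                # a band opens here
        elif not inhibited and start is not None:
            bands.append((start, (i - 1) * step))    # band closed at previous point
            start = None
    if start is not None:
        bands.append((start, n * step))              # band runs to the sampled edge
    return bands


bands = inhibitory_bands()
print(len(bands))   # number of net inhibitory bands
print(bands)        # one band below and one band above the BF
```

With these illustrative parameters, excitation dominates near the BF and the single inhibitory input is unmasked on both flanks, giving one lower and one upper band.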
Analogous to the potential mechanisms creating LU band structure, there are two potential mechanisms for creating two lower, two upper, or middle inhibitory bands in cortical neurons. First, multiple inhibitory bands might result from one broad inhibitory input split by a facilitatory-excitatory input (as in Fig. 11 B), and/or each band might result from distinct inhibitory inputs (as in Fig. 11 C).
Results from echolocating bats support the existence of both mechanisms. As in Fig. 11 B (middle), mustached bats' ventral nucleus of the medial geniculate body (MGBv) neurons have two-tone facilitation (TTF) (Suga et al. 1997). When an excitatory tone is paired with a tone outside the esRF, the cell fires more strongly than to the excitatory tone alone. This TTF is shaped through interactions of GABAergic inhibition and N-methyl-d-aspartate-receptor mediated excitation/facilitation and is passed on to A1 cells (Fitzpatrick et al. 1993). Some A1 cells in mustached bats have a second lower inhibitory band that appears to have been split off from a larger low-frequency band by the TTF band as in Fig. 11 B (Kanwal et al. 1999). Other second lower bands don't appear to be related to TTF as in Fig. 11 C (Kanwal et al. 1999).
Although this study provides the first step of characterizing isRFs in A1 of nonecholocating mammals, further studies are needed to unravel the underlying mechanisms creating spectrally intricate inhibition. For much of the tuning curve, there are overlapping inhibitory and excitatory inputs, and this technique only looks at the net effects. Pharmacological methods (particularly if combined with 2-tone stimulation) that can selectively block excitation or inhibition, possibly in combination with intracellular recording, would be required to tease apart the inhibitory and excitatory contributions in frequency response areas with overlapping inhibition and excitation. For complete understanding, these studies would need to be performed throughout the auditory system, including but not limited to cortex, to determine how complex cortical isRFs are formed.
Differences between two-tone suppression in the auditory nerve and cortical isRFs
Although a few cells had two-tone suppression areas similar to those reported in the auditory nerve (e.g., Fig. 1), most of the inhibitory properties reported herein are distinct from those described in the auditory nerve. For example, multiple discontinuous bands on either side of the BEF have not been described in the auditory nerve. Even cells with only one lower and one upper inhibitory band, a characteristic associated with the auditory nerve, do not simply reflect a relaying of these peripheral suppressive properties. In particular, for many cells with simple two-band structure (e.g., Fig. 4 D), the thresholds of the inhibitory bands are lower than one usually encounters in the auditory nerve (Sachs and Kiang 1968). Also, inhibitory frequency tuning is often broader than that of two-tone suppression in the auditory nerve. For example, the FRA shown in Fig. 2 has much broader low-threshold inhibition than any reported for auditory nerve fibers. The example in Fig. 2 is the most extreme we recorded, although we encountered other cells with low-threshold inhibitory bands more than an octave wide. The example in Fig. 2 is extreme enough to raise concerns about potential stimulus artifacts; however, we are confident that there were no artifacts such as clicks. Furthermore, cells recorded both before and after this unit in this experiment had more “normal” band-limited isRFs. Therefore we believe this cell is truly an example of extremely broad low-threshold inhibition in A1.
Comparisons of subcortical isRFs with isRFs recorded in A1
If complex cortical isRFs are different from two-tone suppression seen in auditory nerve fibers, where are they formed? This is difficult to answer because inhibitory properties can be shaped by many nuclei (>10 stations at 6 levels) of the ascending auditory system [e.g., medial geniculate body (MGB), Suga et al. 1997; ICc, Yang et al. 1992; DCN, Evans and Zhao 1993; AVCN, Caspary et al. 1994]. Feedback connections also are known to play a role in physiological processing of sounds (Yan and Suga 1998; Zhang et al. 1997). Therefore the properties reported in this paper probably are not solely cortical in origin but rather reflect a complex dynamic feed-forward and feed-back system, for which identifying a “locus” is not straightforward.
Furthermore all comparisons between studies must be interpreted cautiously because confounding experimental variables such as differences in stimulus complexity, species, anesthetics, spatial sampling biases, search stimuli, binaural conditions, and the frequency-intensity ranges and resolutions used, make comparisons across labs difficult. Because of these difficulties and the paucity of studies of isRFs in the auditory system, one must be careful not to interpret the lack of reporting of complex isRFs in an auditory area as conclusive evidence that they do not exist. With these caveats in mind, what can be said about isRFs in the auditory system?
In a rare study of two-tone inhibition in the cochlear nucleus (CN) of cats, all of the isRFs had simple two-banded structure (Rhode and Greenberg 1994). Because CN neurons have moderate spontaneous activity, inhibition is more commonly studied with single tones. Inspection of AVCN single-tone tuning curves has provided little evidence for inhibitory surround structures with more than two bands or for inhibitory bands that carve out multi-peaked excitatory frequency tuning curves. Type-IV principal cells in the DCN have more complex tuning curves than AVCN cells, providing some evidence for complex isRFs in the brain stem. Inspection of DCN single-tone tuning curves (Evans and Zhao 1993; Goldberg and Brownell 1973; Young and Brownell 1976) yielded ∼50% (n = 24) of cells having simple LU band structure compared with 38% in this A1 study.
Although some DCN neurons show the flavor of complexity we report for cortical neurons, there are three main reasons to suspect that DCN single-tone isRFs are not solely responsible for the isRFs we are reporting. First, most of the complex inhibitory properties reported in the DCN were in the tonic response of the neurons. A1 neurons in awake (Abeles and Goldstein 1972) and anesthetized (deRibaupierre et al. 1972) cats respond almost exclusively phasically at stimulus onset. Second, Young and Brownell (1976) reported that under barbiturate anesthesia, isRFs of type-IV cells were spectrally simpler than in unanesthetized, decerebrate preparations. Because we used barbiturate anesthesia, the results of complex isRFs cannot be due to passing on of complex DCN properties alone. Third, the excitatory and inhibitory RF structure of principal DCN neurons from decerebrate cats appears more complex by our criteria than most neurons of the brain stem and midbrain areas receiving DCN inputs. Although presently under study (Ramachandran et al. 1999), the degree to which complex DCN isRF properties are passed on to subsequent areas in the lemniscal pathway remains an open question.
Inspection of tuning curves from the ICc of nonecholocating animals also has provided little evidence for inhibitory surround organization other than simple two-band structure or a single inhibitory band on one side of the CF. Using methods similar to this study, Ehret and Merzenich (1988) found mainly simple two-band structure (6/8 cells) and no cells with more than two bands in cat ICc. In echolocating mustached bats, evidence is emerging that complex two-tone isRFs are present in the ICc (Portfors and Wenstrup 1999). However, this investigation focused on cells with fewer than three inhibitory bands and does not mention multiple lower or upper bands. Therefore whether complex isRFs exist in ICc of cats still should be considered an open question.
In a study of the medial geniculate body of cats using similar methods to ours, Imig et al. (1997) have shown that many neurons have complex isRFs. Inspection of that paper's examples of tuning curves recorded from the MGBv and the lateral part of the posterior group of thalamic nuclei (PO) reveals that only two of seven cells had LU structure. Although the examples from Imig et al. (1997) were taken mainly from a subset of MGB neurons, i.e., the monaurally direction selective neurons, these results still support a high percentage of complex isRFs in the parts of the MGB belonging to the lemniscal pathway.
These data lead us to formulate the speculative working hypothesis that complex isRFs are progressively transformed along the ascending auditory pathway of cats. From this perspective, we speculate that complex isRFs are simpler and less frequent in the ICc than in thalamus or cortex. Whether there is a progressive transformation of complexity from ICc to cortex in the sRFs, however, still remains a subject of debate.
Implications for auditory cortical processing
When combined with other data, our results indicate that A1d has a physiologically distinct function from A1v and suggest that A1d is involved in analyzing complex spectra. A1d cells, on average, respond with longer latencies and have broader and more complex spectral RFs than A1v neurons (Mendelson et al. 1997; Sutter and Schreiner 1991). Additionally, many A1d cells exhibit two-tone enhancement (Sutter and Schreiner 1991), binaural facilitation, preference for broadband sounds (Middlebrooks and Zook 1983), and duration tuning (He et al. 1997). Thalamocortical and corticocortical projections to and within A1d appear more complex and widespread than those to and within A1v (He and Hashikawa 1998; Read et al. 1997). These results are consistent with the notion that A1d cells are well suited for integrating across frequencies and analyzing spectrally complex sounds. In contrast, A1v cells tend to be sharply frequency tuned and respond poorly to broadband sounds. Additionally there is a topographic representation of intensity in A1d. At the ventral border of A1d, one encounters the most sensitive cells that are also intensity tuned. Progressing more ventrally the intensity thresholds of cells increase and intensity tuning decreases (Sutter and Schreiner 1995). All of this evidence supports the notion that A1d and A1v are part of different processing streams: a ventral stream for detection and local spectral processing and a dorsal stream for discrimination and spectral shape processing.
We thank G. H. Recanzone, B. Mulloney, H. Read, and two anonymous reviewers for helpful comments on the manuscript.
This work was supported by Grants DC-02514 and DC-02260 from the National Institute of Deafness and Other Communication Disorders and an Alfred P. Sloan Research Fellowship (to M. L. Sutter).
Address for reprint requests: M. L. Sutter, University of California, Center for Neuroscience, 1544 Newton Ct., Davis, CA 95616.
Copyright © 1999 The American Physiological Society