Journal of Neurophysiology

Processing of Band-Passed Noise in the Lateral Auditory Belt Cortex of the Rhesus Monkey

Josef P. Rauschecker, Biao Tian


Neurons in the lateral belt areas of rhesus monkey auditory cortex were stimulated with band-passed noise (BPN) bursts of different bandwidths and center frequencies. Most neurons responded much more vigorously to these sounds than to tone bursts of a single frequency, and it thus became possible to elicit a clear response in 85% of lateral belt neurons. Tuning to center frequency and bandwidth of the BPN bursts was analyzed. Best center frequency varied along the rostrocaudal direction, with 2 reversals defining borders between areas. We confirmed the existence of 2 belt areas (AL and ML) that were laterally adjacent to the core areas (R and A1, respectively) and a third area (CL) adjacent to area CM on the supratemporal plane (STP). All 3 lateral belt areas were cochleotopically organized with their frequency gradients collinear to those of the adjacent STP areas. Although A1 neurons responded best to pure tones and their responses decreased with increasing bandwidth, 63% of the lateral belt neurons were tuned to bandwidths between 1/3 and 2 octaves and showed either one or multiple peaks. The results are compared with previous data from visual cortex and are discussed in the context of spectral integration, whereby the lateral belt forms a relatively early stage of processing in the cortical hierarchy, giving rise to parallel streams for the identification of auditory objects and their localization in space.


Electrophysiological studies of the auditory system in higher mammals have traditionally used pure tones for stimulation because the prevailing theory considered audition primarily a process of frequency analysis that necessitated the decoding of broad-band complex sounds by narrow frequency filters (Kiang 1968; von Békésy 1960; von Helmholtz 1885). What this theory overlooks is the necessity of a resynthesis of the narrow filter outputs at the level of recognition and storage of complex sound patterns. It appears likely that at that stage auditory objects are coded by the simultaneous activity of lower-order filters, and an integration of these signals in higher-order neurons has to occur. Neuroethological studies in bats, songbirds, and owls have shown that such may indeed be the case (Doupe 1997; Konishi 1990, 1991; Margoliash 1997; Suga 1988, 1992).

Although an understanding of the perception of complex natural sounds is the ultimate goal of our studies, it may be advantageous at first to take an approach using sounds of intermediate complexity. Band-passed noise (BPN) bursts seem ideal in many ways because they have a well-defined center frequency and bandwidth, which can be varied systematically. Furthermore, BPN bursts are essential elements of many natural sounds, including those used for communication by many species.

Besides primary auditory cortex (A1), several other areas have been identified in earlier studies of rhesus monkey auditory cortex. In one seminal study, Merzenich and Brugge (1973) established the existence of 2 tonotopically organized areas, RL and CM, on either side of A1 along the rostrocaudal dimension. In addition, they recognized belt regions responsive to auditory stimuli medially and laterally of A1. Neither of these regions could be characterized in detail, however, because they did not respond well to tones. We considered the use of BPN bursts a possible next step in the analysis of the lateral belt because the increased bandwidth of BPN bursts may be suitable in eliciting responses from higher-order neurons.

The use of BPN bursts was partly motivated also by considerations of analogy between auditory and visual cortex. Neurons in the central visual pathways normally do not respond to small spots of light and in some areas, such as V4, require stimuli of a specific size (Desimone and Schein 1987; Petersen et al. 1980). Auditory frequency and 2-dimensional visual space can be considered analogous because they are the parameters encoded on the epithelia of their respective sensory end organs and again in the topographic maps of the cerebral cortex. According to this analogy, size of a visual stimulus corresponds to bandwidth of an auditory stimulus, and the existence of neurons specific for particular bandwidths could be expected. Prior studies have indeed provided clues to the existence of such neurons (Rauschecker 1998a; Rauschecker et al. 1995).

After the initial, preliminary reports, the present study was designed to analyze band-pass–selective neurons in greater detail and in a larger number of monkeys. This will help to provide comprehensive information about the organization of lateral belt areas in terms of center frequency and bandwidth as well as full details of our experimental approach. Furthermore, by discussing similarities between visual and auditory cortical organization we hope to learn more about general principles of cortical function.


Animals and surgery

A total of 969 cells were collected from the auditory cortex of 11 adult rhesus monkeys (Macaca mulatta), which were used in acute recording experiments under gas anesthesia without muscle paralysis. The bulk of the data reported in the present study was obtained from 7 of the monkeys. Four additional monkeys were used in a different study under identical recording conditions (Rauschecker et al. 1995) and pertinent data were added to the present analyses. The external meatus of all animals was kept clean at all times. The body weight of the monkeys ranged from 5 to 12 kg, and both sexes were used in the recordings.

For the experiment, animals were treated with atropine sulfate (0.05 mg/kg, subcutaneous) and were initially anesthetized with ketamine (10 mg/kg, intramuscular). A venous catheter was placed in one of the hind leg veins, and a tracheal tube was inserted. The monkey was then placed in a specially designed monkey head holder and artificially respirated with a ventilator (Harvard) at an intrapulmonary pressure of 500 to 1,000 Pa. Expiratory CO2 content was monitored continuously (Beckman, Medical Gas Analyzer LB-2) and kept at about 3.8% by varying ventilation volume and frequency. Anesthesia was maintained with isoflurane (1%) in a mixture of 50% nitrous oxide and 50% oxygen, the same type of anesthesia that was previously used in both nonprimary visual as well as auditory cortex (Rauschecker et al. 1987, 1995; Tian et al. 2001). EKG was monitored to ensure adequate anesthesia and stable physical condition of the animal. If the animal showed any acceleration of its heart rate in response to noxious stimuli, recording of data was suspended, and the concentration of anesthetic gases was increased, or a short-acting barbiturate (Biotal) was given through the venous catheter. The animal's core temperature was maintained at 38° C with a heating pad (Gaymar). Fluid (0.5% dextrose in 0.9% saline) was administered through the venous catheter by an infusion pump at a rate of 12–30 ml/h.

A craniotomy was performed over the parietal and superior temporal cortex in a sterile surgery suite. A temporary well was built with bone cement. The dura was left intact to minimize brain pulsation and protect the cortex from possible mechanical damage during surgery and transportation. Only after the monkey was transferred to the recording room was the dura dissected and retracted to expose the superior temporal gyrus. The well was filled with sterile 0.9% saline to prevent the cortex from drying.

Acoustic stimulation

stimulus generation. Pure-tone (PT) stimuli were generated with a stimulus generator (Wavetek 148A). The tone bursts were gated by a control unit (HI-MED, HG 300G) to 50-ms duration with 5-ms rise/fall time. The output amplitude could be adjusted through the gain control of the function generator and was monitored on an oscilloscope.

Band-passed noise (BPN) bursts were generated with the program SIGNAL (Engineering Design) on an IBM-compatible AT-486 personal computer. The SIGNAL program generated a random noise, which was band-pass filtered with different cutoff frequencies. By varying the upper and lower cutoff frequencies, BPN bursts could be generated at different center frequencies and with different bandwidths (Fig. 1). The sampling rate (“temporal grain”) was set to 100 kHz so that no significant quantization steps were present in the signal. The stimuli had a rise/fall time of 5 ms to reduce the effect of transients. The stimulus duration was 50 ms, but when there was any indication that the neuron was tuned to stimulus duration, stimuli of other durations could also be generated and tested. The root-mean-square (RMS) values of the BPNs in the time signal were normalized to those of the standard output of PT stimuli. Stimulus level remained constant during each trial but was varied between trials through an attenuator (HP 350D). PT stimuli at frequencies matching the center frequencies of BPN series could also be generated with the SIGNAL program, so that a direct comparison of response strength between BPN and PT was possible.
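The stimulus construction described above can be illustrated with a minimal sketch. It is not the SIGNAL implementation. The brick-wall frequency-domain filter, the linear ramps, the placement of the cutoffs symmetrically around the center frequency on a logarithmic axis, and the names `bpn_burst`, `target_rms`, and `seed` are all assumptions made for illustration. Only the 100-kHz sampling rate, the 50-ms duration, the 5-ms rise/fall time, and the RMS normalization come from the text.

```python
import numpy as np

def bpn_burst(fc_hz, bw_oct, fs=100_000, dur_s=0.050, ramp_s=0.005,
              target_rms=1.0, seed=0):
    """Return one band-passed noise burst (hypothetical helper).

    fc_hz  : center frequency in Hz
    bw_oct : bandwidth in octaves (e.g. 1/3, 1/2, 1, 2)
    """
    rng = np.random.default_rng(seed)
    n = int(round(dur_s * fs))
    noise = rng.standard_normal(n)  # Gaussian random noise

    # Assumption: cutoffs lie half a bandwidth below and above fc on a log axis.
    f_lo = fc_hz * 2.0 ** (-bw_oct / 2.0)
    f_hi = fc_hz * 2.0 ** (bw_oct / 2.0)

    # Assumption: ideal (brick-wall) band-pass applied in the frequency domain.
    spec = np.fft.rfft(noise)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    spec[(freqs < f_lo) | (freqs > f_hi)] = 0.0
    x = np.fft.irfft(spec, n=n)

    # Linear onset and offset ramps to reduce transients.
    n_ramp = int(round(ramp_s * fs))
    ramp = np.linspace(0.0, 1.0, n_ramp)
    env = np.ones(n)
    env[:n_ramp] = ramp
    env[-n_ramp:] = ramp[::-1]
    x *= env

    # Normalize the RMS value so that bursts of all bandwidths carry the same energy.
    x *= target_rms / np.sqrt(np.mean(x ** 2))
    return x
```

Calling `bpn_burst(700.0, 1/3)` and `bpn_burst(700.0, 1.0)` with the same `target_rms` would give two bursts of equal overall level but different bandwidth; `target_rms` stands in for the RMS of the reference pure tone.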

fig. 1.

Schematic spectrograms of band-pass noise (BPN) bursts with the same center frequency (Fc) but differing bandwidth. For comparison, pure tone (PT) and white noise (WN), regarded as the 2 extremes of the spectrum, are also included.

Normalizing the RMS value of the stimuli aims at keeping the total energy (or overall sound level, in dB SPL) of all stimuli constant. However, as bandwidth of a BPN burst increases, total energy is distributed over a broader frequency range, causing a slight decrease of the spectrum level (in our case 5 dB between bandwidths of 1/3 octave and 2 octaves) near the best frequency (BF) with increasing bandwidth. Because of this (negative) covariation of spectrum level and bandwidth a monotonic neuron would decrease its firing rate as a function of bandwidth, whereas the response of a nonmonotonic neuron would depend on the best intensity and dynamic range of the neuron. However, this covariation is small, as stated above, and works against our hypothesis of increasing spectral integration in the lateral belt. The alternative approach, keeping the spectrum level constant by keeping the level of all frequency components at the level of the PT, would result in an increase of the overall BPN sound level and thus a positive covariation of level and bandwidth, which is much less preferable. Therefore keeping the overall sound level constant seemed the most reasonable approach, and our results (see following text) that the majority of neurons actually increased their firing rate with bandwidth over a large range, even though the spectrum level decreased, appear to justify this reasoning.

All stimuli were amplified with a power amplifier (Hafler, SE 120) and played back with a high-fidelity loudspeaker (Infinity 5 Kappa) in free field (see following text). The loudspeaker was positioned 1.14 m in front of the monkey at the height of the ears. The interstimulus interval was ≥1 s. Each stimulus was repeated 20 times for each neuron.

stimulus calibration and sound field. The stimulus delivery system was calibrated with a Brüel and Kjaer (B&K) ½-in. condenser microphone (#4133, free-field) and a B&K Precision Sound Level Meter (#2235; A-weighting scale) (Fig. 2). Between 0.4 and 24 kHz, the output varied by ±6 dB. Above and below this range a roll-off existed, partially because the signal was outside the measuring range of the microphone. The fidelity of the system in producing complex sounds was also tested, and near–free-field conditions were ensured. Details of calibration were described in prior publications (Tian and Rauschecker 1994, 1998).

fig. 2.

Calibration curve of the sound delivery system. Sound pressure level (SPL) of standard stimuli was at about 70–80 dB. Between 0.4 and 24 kHz, the output varied only by about ±6 dB. Below and above this range, there was a roll-off. Dashed line represents the SPL of the background noise.

Electrophysiological recording experiments were carried out in a large, dimly lit laboratory room (4.7 × 7.6 × 2.6 m), as previously reported (Tian and Rauschecker 1994, 1998), which was kept as quiet as possible. The sound pressure level (SPL in dB, re 20 μPa) of the noise in the recording room was measured with the same B&K equipment. The constant background noise had its peak level (35 dB) at 0.5 kHz (Fig. 2). Nonconstant noise from equipment was either insignificant or was periodic and could therefore be averaged out. The audiomonitor was silenced during recording and headphones were used to listen to the output of the recording amplifier. The amplitude of the sound stimuli was set ≥10–20 dB above the background level. The standard SPL was 50–85 dB, as measured at the monkey's head, which was well above the background noise level but still within the linear range of our sound delivery system.

Electrophysiological recording

For extracellular recording of neuronal spike activity, lacquer-coated tungsten electrodes (F. Haer, impedance approximately 1 MΩ) were advanced into the brain by a hydraulic micropositioner with remote-controlled stepping motor (David Kopf, Model 650). Each penetration position was recorded with the National Institutes of Health (NIH) Image program on a Macintosh IIfx computer and a CCD camera (Panasonic) mounted on a surgical microscope (Zeiss), so that the penetration sites could be reconstructed. The electric signals were band-pass filtered (0.3–20 kHz) and amplified in 2 stages (A-M Systems preamplifier, Model 1800; Tektronix, AM 502). A signal-processing unit (“slicer”; A. B. Bonds, Nashville, TN) was used to set the threshold for filtering out background noise and to reliably separate spikes from more than one neuron recorded at the same site. The output of the slicer was monitored with an audio monitor (Grass Instruments, AM 8), and a window discriminator was used to convert spikes with different amplitude levels into TTL signals. In addition, the signal at each step was monitored on an oscilloscope (Tektronix, 5113 Dual Beam Storage), so that we are quite confident that only isolated single-unit activity was recorded. The TTL signals were then registered on an IBM AT-386 PC with a data collection program (HIST, Spikes Systems), which produced peristimulus time histograms (PSTH) and raster displays with a bin width of 1 ms for on- or off-line evaluation.

Standard sets of digitized natural complex sounds (key jingling, finger snapping, hissing, clapping, etc.) that were shown to be effective in driving lateral belt neurons were used as search stimuli while the electrode was lowered. The SIGNAL program was again used to record and play back these sounds. When a unit was isolated, the best frequency (BF) of the neuron and its lowest excitation threshold at the BF were determined with tone bursts. Threshold was defined as the amplitude of a BF tone at which an increase of activity above the spontaneous level was just noticeable. The frequency tuning range of the neuron was determined at a set SPL 6–20 dB above threshold. If a neuron was unresponsive or only weakly responsive to pure tones, the best center frequency (BFc) of the neuron was determined with BPNs at different center frequencies, normally with a bandwidth of 1/3 or 1 octave. PSTHs were recorded when sound stimuli were played back. The center frequency of a BPN at which the response was strongest was taken as the BFc of the neuron. To determine the bandwidth tuning of neurons, BPNs with bandwidths of 1/3, 1/2, 1, and 2 octaves at a given center frequency were presented. For comparison, a pure tone at the same center frequency and white noise (WN) were also presented. The bandwidth that yielded the strongest response was taken as the best bandwidth (BBW).

At the end of each penetration, as well as at specific depths during retraction of the electrode, electrolytic microlesions (7 μA, 7 s) were made to mark the electrode tracks and specific recording sites.

Data analysis

To quantify the response of neurons to stimulation with tone bursts and other complex stimuli, a “peak firing rate” was determined with a standard routine as follows: a 10-ms window was slid across the PSTH, which had been recorded with a bin width of 1 ms; the number of spikes in every 10-ms interval was measured until the maximum was found, and the average firing rate in the peak interval was calculated after subtracting spontaneous activity. The peak firing rate was plotted against the center frequency of the BPN, forming a “rate–center frequency” or “isointensity” curve for each neuron, similar to the “rate–frequency” curves with pure tones. As stated above, the frequency at which the neuronal response was maximal was taken as the BFc. Sometimes, several local maxima were apparent in the rate–frequency curve. A secondary or tertiary peak was defined if the firing rate between 2 local maxima dropped to <50% of either maximum. Similar to the “rate–frequency” curve, a “rate–bandwidth” curve could be derived from the neuronal responses to different bandwidths. Again, as stated above, the BBW was defined as the bandwidth at which the firing rate was maximal.
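The peak firing rate routine can be sketched as follows. This is a minimal illustration rather than the original analysis code. The function name `peak_firing_rate` and its parameters are hypothetical, and the spontaneous rate is passed in as a number because the text does not state how it was estimated for this step (the prestimulus interval of the PSTH would be one option).

```python
import numpy as np

def peak_firing_rate(psth_counts, n_trials, spont_rate_hz,
                     bin_ms=1.0, win_ms=10.0):
    """Peak firing rate from a PSTH (hypothetical helper).

    psth_counts   : spike counts per bin, summed over all trials
    n_trials      : number of stimulus repetitions in the PSTH
    spont_rate_hz : spontaneous rate in spikes/s (assumed to be known)
    Returns (driven peak rate in spikes/s, window start time in ms).
    """
    counts = np.asarray(psth_counts, dtype=float)
    win_bins = int(round(win_ms / bin_ms))

    # Spike count in every 10-ms window, advanced one bin at a time.
    window_counts = np.convolve(counts, np.ones(win_bins), mode="valid")

    # First window position with the maximal count.
    start = int(np.argmax(window_counts))

    # Convert the count to a per-trial rate, then subtract spontaneous activity.
    rate_hz = window_counts[start] / (n_trials * win_ms / 1000.0)
    return rate_hz - spont_rate_hz, start * bin_ms
```

With 20 repetitions per stimulus, as used here, a window containing 40 spikes in total corresponds to 40 / (20 × 0.01 s) = 200 spikes/s before the spontaneous rate is subtracted.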


At the end of an experiment, which was limited to 48 h by our protocol, the animal was deeply anesthetized with sodium pentobarbital (Nembutal, 60 mg/kg, intravenous) and perfused transcardially with 0.9% saline followed by 4% paraformaldehyde in 0.1 M phosphate buffer. The brain was removed from the skull, blocked stereotaxically, and stored in fixative at 4° C. The part of the brain containing the auditory cortex was sunk in increasing sucrose gradients and cut in 50-μm-thick coronal sections on a freezing microtome. The sections were mounted on slides and Nissl-stained with cresyl violet or thionin. Electrolytic microlesions were identified under a microscope (Wild) and the sections drawn with a camera lucida (Leitz). The lesion sites were compared with the stereotaxic measurements taken during the recording session to verify the location of the recording sites. Core and belt areas were distinguished easily on the basis of cell density in layer IV and cytoarchitectonic criteria established previously (Hackett et al. 1998a; Jones et al. 1995). Different areas within core (e.g., A1 and R) and belt (e.g., AL, ML, and CL) were distinguished on the basis of best-frequency gradients within their cochleotopic maps (Rauschecker et al. 1995).


Recording sites

A total of 969 cells were collected from the auditory cortex of 11 adult rhesus monkeys (Macaca mulatta) in a total of 201 penetrations. Recordings were made on both sides of the lateral sulcus (LS) from the superior temporal gyrus (STG) and supratemporal plane (STP) of the left hemisphere, roughly between the central sulcus and the confluence of the LS and the superior temporal sulcus (STS). The lateral belt region was explored in 125 penetrations in 10 of the monkeys. For comparison, 79 penetrations were also made into the primary-like core areas A1 and R (the rostral area: Kaas and Hackett 2000; Morel et al. 1993; previously also termed RL or rostrolateral area: Merzenich and Brugge 1973) in 5 of the monkeys. The numbers of penetrations and units recorded in each area are given in Table 1. An average of 18 electrode tracks per monkey were distributed across the STG and STP. At least 3 recordings per penetration were obtained at varying depths throughout all cortical layers, yielding a total of 969 recording sites. Of these, 569 units responded to acoustic stimuli in a quantifiable manner and 345 neurons were analyzed with BPN. The minimum distance between units recorded within a penetration was 200 μm, and all penetrations were spaced ≥500 μm apart. This ensured sufficient distance of neighboring recording sites for independence of data.

table 1.

Number of recording tracks and units in each area

Vertical or near-vertical penetrations were made into the STG. Because the cortex folds into the LS, penetrations made into the STP through the parietal cortex were sometimes oblique because they were oriented parallel to those in the STG.

Response to BPN

In the lateral belt region, neurons clearly preferred BPN over PT, as reported previously (Rauschecker et al. 1995). Whereas PT bursts often elicited little or no measurable response, 256 (84.5%) of the 303 neurons yielded a clear response when tested with BPN. One such example is shown in Fig. 3: PT bursts elicited only a small response, whereas BPN bursts centered at the same frequencies consistently evoked a much larger response. The best response in this neuron was elicited by a BPN with 1/3 octave bandwidth, and a BPN with 1 octave bandwidth yielded a solid response, still much larger than the response to PT. When the rate–frequency curves for PT and for BPNs with 1/3 and 1 octave bandwidth were plotted, the response to BPN was better than that to PT at every single frequency (Fig. 4A). Similar to the rate–frequency curve for PT, the rate–frequency curves for BPNs at different center frequencies also had an optimum, by which the BFc was defined. In most cases, the BFc and the best frequency (BF) for PT were in good agreement (Fig. 4). For some neurons, BFc values differed slightly from BF values, often simply because data points were collected at higher resolution with BPN (Fig. 4D).

fig. 3.

Peristimulus time histograms (PSTH) and raster displays of one AL neuron (unit B06–5–1) in response to PT (A) and BPNs at 1/3 (B) and 1 (C) octave bandwidth, respectively, and at different center frequencies (Fcs), indicated in the top-right corner of each panel in A. Short bar between each PSTH and raster indicates the onset time (500 ms) and duration of the stimulus. All stimuli were energy-matched on the basis of their RMS value. Responses at the best center frequency (BFc, 0.70 kHz) are highlighted with dark shading. Bin width: 10 ms; epoch length: 2 s.

fig. 4.

Rate–frequency curves of 4 units, including the one (A) shown in Fig. 3, tested with PT and BPNs at different bandwidths and Fcs. Filled symbols indicate responses to different stimuli. Open symbols refer to the corresponding baseline activity in the 500-ms interval preceding stimulus onset. These are displayed to demonstrate the stability of recordings. As indicated in A, circles refer to PT stimuli; squares to BPN bursts of 1/3 octave; diamonds to BPN of 1 octave. Data for A from Rauschecker et al. 1995.

When the best responses of neurons in the lateral belt region to both BPN and PT at the same intensity level were directly compared, responses to BPN were clearly better than those to PT (Fig. 5A, paired t-test, P < 0.0001). To quantify the comparison between the responses to BPN and PT, the enhancement of BPN over PT was calculated for each of these neurons by calculating the ratio (BPN–PT)/PT × 100%. The distribution of the enhancement is plotted in Fig. 5B. In more than half of the neurons (72/136 = 52.9%), the enhancement of the BPN response over the PT response was >50%, and increases above 150% were seen in 22 neurons (16.2%). The opposite effect (i.e., a suppression of the response by >50%) occurred in only 2 neurons (1.5%). There was a trend for the enhancement to be greater in AL than in ML or CL, but this trend was not significant.
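The enhancement measure is a simple ratio, sketched below. The function name `enhancement_percent` is hypothetical, and returning NaN when the PT response is zero is an assumption, because the text does not say how such cases were handled.

```python
import numpy as np

def enhancement_percent(bpn_rate, pt_rate):
    """Enhancement of the BPN response over the PT response, in percent.

    Implements (BPN - PT) / PT x 100%. Positive values indicate enhancement,
    negative values suppression.
    """
    bpn = np.asarray(bpn_rate, dtype=float)
    pt = np.asarray(pt_rate, dtype=float)
    with np.errstate(divide="ignore", invalid="ignore"):
        out = (bpn - pt) / pt * 100.0
    # Assumption: the ratio is left undefined (NaN) when there is no PT response.
    return np.where(pt > 0, out, np.nan)
```

For example, a neuron firing at 30 spikes/s to BPN and 20 spikes/s to PT has an enhancement of 50%, and one firing at 50 spikes/s to BPN and 20 spikes/s to PT has an enhancement of 150%.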

fig. 5.

Enhancement of auditory single-unit responses in the lateral belt by greater bandwidth compared with pure tones. A: scattergrams of best responses to BPN vs. PT in the 3 lateral belt areas, AL, ML, and CL. B: proportions of neurons with different degrees of enhancement by BPN over PT response. AL, ML, and CL data are stacked on top of each other. Positive values indicate actual enhancement; negative values, suppression.

For comparison, data recorded in the primary auditory cortex (A1) showed no difference between PT and BPN responses (Fig. 6A, paired t-test, P = 0.371). Enhancement in A1 neurons was mostly around 0%, with more neurons showing suppression of some degree than enhancement (Fig. 6B). Only very few neurons (4 out of 40) showed enhancement effects of >50%. This result is also indicated in the scattergram of Fig. 6A where most data points are clustered along the diagonal. The difference between lateral belt and A1 neurons was highly significant (χ2 test, P < 0.0001).

fig. 6.

A: scattergram of best responses to BPN vs. PT in A1 neurons. B: histogram of response enhancement to BPN over PT in A1 neurons.

Tuning to center frequency (Fc)

On the basis of responses to BPN and PT, the BFc (and BF) of most neurons could be determined. In electrode penetrations perpendicular to the cortical surface, BFc values remained largely unchanged, so that a BFc value could be calculated for each penetration from the average BFc values of all neurons recorded in that penetration. When the electrode positions on the STG and STP were plotted with their BFc values across the image of the brain surface, an orderly representation of BFc values became apparent in each monkey. These maps revealed continuous rostrocaudal progressions in the frequency domain, with 2 reversals along these progressions, indicating a quasitonotopic (cochleotopic) organization of these lateral belt areas on the STG.

In monkey B06, for example, recordings in the most rostral penetration on the STG yielded a BFc of 11 kHz. From this point backward, BFc values in penetrations made into the STG along the LS decreased from penetration to penetration down to 0.35 kHz over several steps, reversed direction and, after increasing to 20 kHz, reversed again and decreased to 6.5 kHz (Fig. 7, A and B). A region nonresponsive to our acoustic stimuli [“no clear response” (ncr)] was reached caudal to the 6.5-kHz penetration. This area was situated close to where the LS and STS join together. Another row of penetrations made more medially, through the parietal cortex but parallel to the penetration row on STG, showed the same frequency reversal in the anterior part of the recorded region. The mapping could not be completed more caudally because of the limited recording time.
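One way to locate such reversals in a rostrocaudal sequence of BFc values is to look for sign changes in the step from one penetration to the next on a logarithmic frequency axis. The sketch below is only an illustration under that assumption, not the procedure used for the maps. The function name `find_reversals` and the `min_step_oct` tolerance are hypothetical.

```python
import numpy as np

def find_reversals(bfc_khz, min_step_oct=0.0):
    """Indices of penetrations at which the BFc gradient changes direction.

    bfc_khz      : BFc values ordered from rostral to caudal
    min_step_oct : steps at or below this size (in octaves) are ignored,
                   as a crude guard against measurement scatter (assumption)
    """
    log_f = np.log2(np.asarray(bfc_khz, dtype=float))
    steps = np.diff(log_f)  # step in octaves between neighbors

    keep = np.abs(steps) > min_step_oct
    signs = np.sign(steps[keep])
    idx = np.nonzero(keep)[0]  # index of the penetration that starts each kept step

    reversals = []
    for k in range(1, len(signs)):
        if signs[k] != signs[k - 1]:
            reversals.append(int(idx[k]))
    return reversals
```

Applied to only the four turning values reported for monkey B06, `find_reversals([11, 0.35, 20, 6.5])` returns the two middle positions (0.35 kHz and 20 kHz), which is the 2-reversal pattern described in the text.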

fig. 7.

Map of BFc values (in kHz) along the lateral sulcus (ls) in one monkey (B06). BFc refers to the center frequency of a BPN burst. A: posterior part of the superior temporal gyrus (STG) is seen in the center. Filled circles (•) refer to electrode tracks that entered directly into its lateral surface; filled diamonds (♦) represent tracks that entered the supratemporal plane after traveling through overlying parietal cortex. All electrode tracks were verified by histology. BFc was averaged for each penetration from at least 3 recordings in different depths. ncr, “no clear response”; sts, superior temporal sulcus; AL, anterolateral area; ML, middle lateral area; CL, caudolateral area; m, medial; l, lateral; r, rostral; c, caudal. B: progression of BFc is shown projected onto a horizontal frequency scale with best bandwidth (BBW) added for each electrode penetration at which it could be determined.

Similar to the previous case, BFc progression with 2 frequency reversals along LS was also seen in monkey AM27 (Fig. 8). Starting from the most rostral position, a region nonresponsive to our acoustic stimuli was first encountered, indicated by the “ncr” on the brain image. The next penetration yielded a BFc of 4 kHz. From there on, BFc decreased to 0.6 kHz, reversed, increased to 17 kHz, then reversed again, and decreased to 1 kHz (Fig. 8). Again, a nonresponsive area was found at the confluence of the LS and STS. Another row of penetrations through parietal cortex into the STP showed a frequency reversal in the caudal part when the BFc dropped to 10 kHz after reaching 17 kHz. In the rostral part, the lowest BFc was reached in the most anterior penetrations, indicating another frequency reversal point was imminent. Thus the same frequency progression pattern of BFc values, from rostral to caudal with 2 reversals, was found in this monkey as in B06.

fig. 8.

Map of BFc values (in kHz) along the lateral sulcus in another monkey (AM27). Third parallel row of penetrations was made here through the parietal cortex, indicated by asterisks (*). Other conventions as in Fig. 7.

Mapping studies in the remaining monkeys revealed the same frequency progression pattern with 2 reversals along the LS. Thus at least 3 coarsely cochleotopic areas can be defined within the lateral belt of the macaque auditory cortex. To avoid any confusion with prior terminology (L, RL; Merzenich and Brugge 1973) we previously named these the anterolateral (AL), middle lateral (ML), and caudolateral (CL) areas, respectively (Rauschecker 1998a; Rauschecker et al. 1995). Based on the penetrations both into STG and STP, it appears that these 3 areas are parallel and collinear with the 3 previously known areas on the STP, R, A1, and CM, respectively (see Rauschecker 1998a).

Tuning to bandwidth

As mentioned above, neurons in the lateral belt areas responded not only differentially to BPN with different center frequencies, but also to different bandwidths at the same center frequency (see examples in Figs. 3 and 4). To quantify this tuning property, BPNs with 4 different bandwidths (1/3, 1/2, 1, and 2 octaves) as well as PT centered at the same frequency (usually the BFc of the neuron) and WN were presented. PT and WN can be regarded as the 2 extremes in the series of bandwidths: PT has a bandwidth of 0 octave and the bandwidth of WN is limited only by the sound system. In the lateral belt areas, 132 neurons (25 in AL, 92 in ML, and 15 in CL) were tested at these 6 bandwidths. Among them, complete PSTHs were recorded in 102 neurons (19 in AL, 71 in ML, and 12 in CL). Because it was sometimes difficult to determine the BFc on-line during the experiment, the Fc chosen for testing bandwidth tuning was not always exactly at the BFc of the neurons. Therefore only neurons that were tested with BPN series centered within 0.1 octave of the BFc were included in the database for further population statistics. Among the 102 neurons recorded with PSTHs, this was the case for 75 neurons.

Figure 9 shows examples of PSTHs of 4 typical neurons in response to 6 different bandwidths, including PT and WN, at their BFc. In the most common response profile of rate–bandwidth curves, a neuron responded poorly to PT or WN, but strongly to BPN with a certain bandwidth between 1/3 and 2 octaves with a single peak (Fig. 9B and Fig. 10, A, B, and C). These neurons were classified as single-peaked (SP) in bandwidth tuning. Out of the 75 neurons analyzed in the lateral belt, 38 (50.7%) showed this type of bandwidth tuning profile (Fig. 11). Sometimes, 2 or more peaks were found in the bandwidth tuning curves (Fig. 9C, and Fig. 10D). They were classified as multipeaked (MP), using the criteria defined in methods. This type of bandwidth tuning was seen in 20 (26.7%) neurons (Fig. 11). In 9 lateral belt neurons (12%) (Fig. 11), responses increased monotonically with increasing bandwidth (Fig. 9D and Fig. 10E). They were classified as increasing (IN). In another 8 neurons (10.7%) (Fig. 11), the response decreased monotonically with increasing bandwidth (Fig. 9A and Fig. 10F). They were classified as decreasing (DE). For comparison, the majority of neurons (14/23 = 60.9%) in A1 showed DE profiles in bandwidth tuning (Fig. 11). Some A1 neurons (6/23 = 26.1%) had a single peak, but only very few had multiple peaks (1/23 = 4.3%) or were monotonically increasing (2/23 = 8.7%).
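The four tuning classes can be expressed as a small decision rule on the rate–bandwidth curve. The sketch below is a hedged illustration, not the classification procedure actually used. It assumes strict monotonicity for IN and DE, treats the first and last points as possible peaks, and reads the <50% criterion as requiring the trough to fall below half of the smaller of the 2 neighboring maxima. The function names are hypothetical.

```python
import numpy as np

def count_peaks(rates, dip_fraction=0.5):
    """Count peaks separated by a trough below dip_fraction of the smaller maximum."""
    r = np.asarray(rates, dtype=float)
    n = len(r)
    # Candidate local maxima; the first and last points are allowed (assumption).
    cands = [i for i in range(n)
             if (i == 0 or r[i] > r[i - 1]) and (i == n - 1 or r[i] >= r[i + 1])]
    peaks = []
    for i in cands:
        if not peaks:
            peaks.append(i)
            continue
        j = peaks[-1]
        trough = r[j:i + 1].min()
        if trough < dip_fraction * min(r[j], r[i]):
            peaks.append(i)      # deep enough dip: a separate peak
        elif r[i] > r[j]:
            peaks[-1] = i        # same peak: keep the larger maximum
    return len(peaks)

def classify_bandwidth_tuning(rates):
    """Classify a rate-bandwidth curve as 'SP', 'MP', 'IN', or 'DE'.

    rates : responses ordered by bandwidth (PT, 1/3, 1/2, 1, 2 octaves, WN)
    """
    r = np.asarray(rates, dtype=float)
    d = np.diff(r)
    if not np.any(d):
        raise ValueError("flat curve: no tuning to classify")
    if np.all(d >= 0):
        return "IN"              # monotonically increasing
    if np.all(d <= 0):
        return "DE"              # monotonically decreasing
    return "SP" if count_peaks(r) == 1 else "MP"
```

A curve such as `[5, 20, 40, 25, 10, 4]` would be labeled SP, whereas `[5, 30, 8, 35, 12, 6]` would be labeled MP because the response between the 2 maxima falls below half of the smaller one. Real curves are noisy, so a practical rule would probably need some tolerance before calling a curve monotonic.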

fig. 9.

Examples of 4 lateral belt neurons in response to BPN bursts at their BFc values but with different bandwidths, indicated by the number in the top right corner of the top row. Each row (A–D) represents the response of one neuron. Short bar below each PSTH indicates the onset time and duration of the stimulus.

fig. 10.

Examples of bandwidth tuning curves from 6 lateral belt neurons. In addition to varying the bandwidth of the energy-matched stimuli, the sound intensity level was varied in C and D, as indicated by different symbols and thickness of lines. Dashed lines are used to indicate that PT and WN are not on a continuum with the rest of the curve. Open symbols refer to baseline activity. Data for D from Rauschecker et al. 1995.

fig. 11.

Distribution of bandwidth tuning types in lateral belt and A1 neurons. SP, single-peaked; MP, multipeaked; IN, increasing; DE, decreasing.

Based on the rate–bandwidth tuning curves, a BBW could be determined for most neurons within the range tested. As mentioned above, a large majority of neurons (47/75 = 62.7%) in the lateral belt areas had a BBW between 1/3 and 2 octaves. The distribution of BBW was quite even among the bandwidths tested (Fig. 12A). Only a few neurons showed their best response to PT (11/75 = 14.7%). In contrast, the majority of A1 neurons (14/23 = 60.9%) showed a bandwidth preference for PT (Fig. 12B). If A1 neurons were tuned to BPN at all, then the BBW was usually ≤1/2 octave. Only 4 neurons (6.3%) in A1 were tuned to a bandwidth of ≥1 octave.

fig. 12.

Distribution of BBW in lateral belt (A) and A1 (B) neurons. BBW is evenly distributed for lateral belt neurons, whereas it is concentrated in the PT category for A1 neurons.

To test the effects of stimulus intensity on BW tuning, we varied the sound level in a small sample of neurons. The main result was that BBW for a given neuron was largely unaffected by changes of stimulus intensity over a wide range. Although the degree of enhancement and the width of the BW tuning curve sometimes varied at different sound levels, the peak of the BW tuning curve always remained at the same place (Fig. 10, B–D). This was found in 15 lateral belt neurons tested with at least 2 different intensity levels. Some of these neurons displayed nonmonotonic behavior with respect to sound intensity. In the example shown in Fig. 10B, the neuron yielded solid responses at 65 dB, but the responses at 75 dB were much stronger. However, when the intensity increased to 85 dB, the responses were weaker again than at 75 dB.

When responses were compared at different mediolateral positions, BBW generally increased from medial to lateral, that is, roughly orthogonally to the BFc axis. In one monkey, bandwidth was mapped extensively in the mediolateral dimension (Rauschecker et al. 1995). A highly significant correlation was found in this monkey between BBW and the median of the electrode positions relative to the LS (P < 0.003, Spearman rank correlation).
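For readers unfamiliar with the statistic, the following sketch shows how a Spearman rank correlation between recording position and BBW can be computed. The positions and BBW values are invented purely for illustration (with PT coded as 0 octaves, an assumption of this sketch) and are not the data of the study; the helper names are hypothetical, and no P value is derived.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Invented example: distance lateral to the LS (mm) and BBW (octaves).
position_mm = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5]
bbw_octaves = [0.0, 0.33, 0.33, 0.5, 1.0, 1.0, 2.0]
print(f"rho = {spearman_rho(position_mm, bbw_octaves):.3f}")
```

Working on ranks rather than raw values suits BBW, which is sampled at a few discrete bandwidths and therefore produces many ties.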

There were also indications that BBW may vary as a function of depth from the cortical surface or, in other words, as a function of laminar position. When BBW was plotted against depth of each recording site in vertical electrode penetrations, neurons with narrow BBW (≤1/3 octave) were concentrated between 600 and 1,400 μm, whereas the majority of neurons with broader BBW (≥1/2 octave) were distributed outside of this range. Specifically, 7 of 9 neurons with a BBW of PT were located between 600 and 1,400 μm; for neurons with a BBW of 1/3 octave, the ratio was 7 out of 12. However, the ratio shifted to 3 out of 7, 3 out of 8, 8 out of 14, and 4 out of 11 for neurons with a BBW of 1/2, 1, and 2 octaves and WN, respectively.
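The ratios quoted above can be pooled into the narrow (≤1/3 octave) and broad (≥1/2 octave) groups with a few lines of arithmetic. The counts are taken from the text; the variable and function names are illustrative only.

```python
# (neurons between 600 and 1,400 um, neurons in total) for each BBW class,
# as quoted in the text.
depth_counts = {
    "PT": (7, 9),
    "1/3 oct": (7, 12),
    "1/2 oct": (3, 7),
    "1 oct": (3, 8),
    "2 oct": (8, 14),
    "WN": (4, 11),
}
NARROW = ("PT", "1/3 oct")
BROAD = tuple(label for label in depth_counts if label not in NARROW)


def pooled_fraction(counts, labels):
    """Pool the listed classes; return (inside, total, inside / total)."""
    inside = sum(counts[label][0] for label in labels)
    total = sum(counts[label][1] for label in labels)
    return inside, total, inside / total


for label, (inside, total) in depth_counts.items():
    print(f"{label:>8}: {inside}/{total} = {inside / total:.0%}")

for name, labels in (("narrow", NARROW), ("broad", BROAD)):
    inside, total, frac = pooled_fraction(depth_counts, labels)
    print(f"{name:>8}: {inside}/{total} = {frac:.0%}")
```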


The present study confirms the existence of 3 functionally defined areas in the “lateral belt” of rhesus monkey auditory cortex, as previously postulated (Rauschecker et al. 1995). It demonstrates that neurons in the lateral belt respond vastly better to BPN bursts than to pure tones. Furthermore, the neurons are tuned to both bandwidth and best frequency of the BPN, which suggests their involvement in the early cortical stages of spectral analysis of complex sound patterns (see also Kadia and Wang 2003; Recanzone et al. 1999; Schreiner et al. 2000). Just as functional convergence of thalamic input helps to create receptive fields in primary auditory cortex (Miller et al. 2001), integration of frequency-specific input from primary cortical areas leads to more complex response properties in lateral belt neurons.

The 3 lateral belt areas, AL, ML, and CL, are identified on the basis of responses to BPN bursts, which characterize a given neuron in terms of its BFc and BBW. The mapping of BFc establishes mirror-symmetric cochleotopic maps for all 3 belt areas. The reversal points of BFc along a rostrocaudal axis determine the borders between AL, ML, and CL, which are situated in parallel and adjacent to the primary core areas R and A1 as well as area CM, respectively, on the supratemporal plane. A substantial portion of the lateral belt areas is situated on the exposed surface of the superior temporal gyrus (as opposed to the core areas, which are buried in the depths of the lateral sulcus). Therefore they provide an opportunity for the extensive analysis of neuronal response properties at a relatively early stage of cortical processing while recording electrodes can be placed under visual guidance.

Anatomically, the lateral belt can be differentiated from the core areas by its lower density of histochemical staining (Hackett et al. 1998a; Jones et al. 1995; Kosaki et al. 1997; Morel et al. 1993). Laterally adjacent to the belt, even paler staining identifies another zone of nonprimary auditory cortex, which has been termed “parabelt” (Hackett et al. 1998a; Morel et al. 1993). Projections from the core do not reach the parabelt directly but are always relayed through the belt (Hackett et al. 1998a), indicating a serial processing hierarchy. Although it is clear that the transition between belt and parabelt occurs somewhere on the open surface of the STG, their exact border has never been determined physiologically. Because our recordings were made as closely as possible along the lateral sulcus, it is safe to assume that they were located within the belt proper and not in the parabelt.

In a prior study, we noticed a gradient in preferred bandwidth from narrow (PT) to broad as we went mediolaterally on the STG (Rauschecker et al. 1995). We did not explore this further in the present study, but if the gradient exists, it most likely reflects the transition from core to belt areas. However, it is entirely possible that the same trend exists even within either core or belt. Previous studies in cats have found systematic differences in frequency tuning within A1, depending on the recording position along the isofrequency axes (Schreiner and Mendelson 1990; Schreiner et al. 2000), and this could be an important organizing principle for auditory cortex.

Neurons in the lateral belt areas generally respond much better to BPN than to pure tones, whereas the reverse is true for primary auditory cortex. Even more interestingly, lateral belt neurons usually respond best to particular bandwidths of BPN bursts, independent of sound intensity. The preference of lateral belt neurons for BPN over PT stimuli cannot be explained by increased sound levels of the BPN bursts, given that RMS values were normalized between all stimuli, so overall sound level was constant. It is furthermore unlikely that the BW selectivity is created by the small differences (∼5 dB) in spectrum level resulting from energy matching [i.e., the different distribution of energy within the BW (see methods)] because they go in the opposite direction. If this were the case, one would expect the response in most neurons to decrease with BW as the spectrum level decreases, whereas the opposite was the case. In addition, BBW was found to remain constant in neurons that were tested with different stimulus amplitudes, although this was not systematically explored.
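The direction of the spectrum-level argument can be checked with a simple calculation. The sketch below assumes an idealized noise band with a flat spectrum, centered geometrically on the center frequency, and with total power held constant across bandwidths; under these assumptions the per-hertz level falls by 10 log10 of the ratio of the linear bandwidths. The function names are hypothetical, and the values it yields depend on these assumptions rather than on the stimulus construction described in methods.

```python
import math


def bandwidth_hz(center_hz, octaves):
    """Linear width (Hz) of a band spanning `octaves`, centered geometrically."""
    return center_hz * (2 ** (octaves / 2) - 2 ** (-octaves / 2))


def spectrum_level_change_db(octaves, ref_octaves, center_hz=1000.0):
    """Per-hertz level of a band relative to a reference band (dB),
    assuming both bands carry the same total power and are spectrally flat."""
    ratio = bandwidth_hz(center_hz, octaves) / bandwidth_hz(center_hz, ref_octaves)
    return -10.0 * math.log10(ratio)


# The center frequency cancels in the ratio, so 1 kHz is an arbitrary choice.
for bw in (1 / 3, 1 / 2, 1, 2):
    change = spectrum_level_change_db(bw, 1 / 3)
    print(f"{bw:.2f} octave: {change:+.1f} dB re 1/3 octave")
```

Because the change is negative for every bandwidth wider than the reference, a neuron driven by spectrum level alone would be expected to respond less, not more, as bandwidth increases.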

The finding of a response enhancement with increasing BW in nonprimary auditory cortex corresponds well with studies of visual cortex, where receptive field size increases as one moves away from primary visual cortex to extrastriate areas. At the same time, single-unit responses in nonprimary visual areas improve as visual stimuli are enlarged because inputs get integrated over a larger area (Creutzfeldt 1995). However, the response enhancement has its limits in that diffuse light is rarely the optimal stimulus and neurons usually respond best to stimuli of a certain size.

The correspondence between the bandwidth tuning of lateral belt neurons described here and the size selectivity of neurons in visual area V4 (Desimone and Schein 1987; Petersen et al. 1980) is most striking. Just as the size selectivity of V4 neurons suggests their involvement in shape and object perception, one can argue that neurons in the lateral belt are crucially involved in auditory object analysis. However, BW-selective neurons would be equally useful for the extraction of spectral information in auditory spatial processing. Cues generated by the head and pinnae contribute to spatial hearing by means of spectral signatures, particularly spectral notches, specific for certain positions in both azimuth and elevation (Blauert 1997; Wightman and Kistler 1989). Combined activation of specific sets of BW-selective neurons could thus signal the presence of a sound source in specific spatial locations. Neurons selective for spectral notches could be created by inhibitory input from BW-selective neurons in a specific frequency region (Sutter et al. 1999).

Cortical integration occurs over various stages. Within each cortical area, it occurs between the thalamic input layer (IVc) and layers above and below. It fits with this concept that BBW in the lateral belt varied with depth from the cortical surface (i.e., laminar position): Neurons with the narrowest bandwidths were found at the input stage, whereas broader bandwidths prevailed outside. This is again reminiscent of visual cortical organization, where receptive field (RF) size varies along perpendicular electrode penetrations (Gilbert 1993; LeVay and Gilbert 1976), with neurons in layer IVc having the smallest RFs. Feedforward projections to other cortical areas originate primarily from neurons in supragranular layers with larger RFs (Maunsell and van Essen 1983).

For the processing of complex sounds, the bandwidth-selective neurons described here constitute only a first step. However, such neurons may exemplify the way in which cortex goes about building networks that are increasingly selective to increasingly complex stimuli. It is conceivable that BPN-selective neurons could be used for a variety of tasks. Communication sounds, for instance, contain combinations of BPN bursts of various center frequencies and bandwidths (Hauser 1996; Rauschecker and Tian 2000; Wang 2000; Wang and Kadia 2001). One possibility, therefore, is to build higher-order vocalization detectors by combining the output of BW-selective neurons in a nonlinear fashion (i.e., by their simultaneous activation of higher-order neurons acting as coincidence detectors) (Rauschecker 1998b). In previous studies of bats and songbirds, a similar concept has been suggested for creating neurons with selectivity to echo-location signals or song (Doupe 1997; Margoliash 1997; Suga 1988). This has been termed “spectral combination sensitivity.” Nonlinear integration also occurs in the time domain and seems to be a fundamental mechanism in cortical sensory processing, which is highly conserved across species.

We reported previously that neurons in area AL are on average more selective for communication sounds than those in areas ML and CL (Tian et al. 2001). Conversely, neurons in CL are by far the most selective for spatial position. Based on its location relative to A1 and CM, area CL is closely situated to area Tpt, as cytoarchitectonically defined by Pandya and coworkers (Pandya and Sanides 1973; Pandya et al. 1994). Area Tpt and the adjacent caudal belt project to area VIP in the posterior parietal cortex (Lewis and Van Essen 2000). CL also projects directly to regions of dorsolateral prefrontal cortex (Romanski et al. 1999), which have been implicated in spatial working memory (Goldman-Rakic 1996). Both findings support a role of caudal belt in auditory spatial processing, which is made even more likely by the narrower spatial tuning of CL neurons (Tian et al. 2001). CL is thus hypothesized to be the origin of an auditory “where” stream, whereas AL is thought to give rise to a “what” stream involved in the identification of auditory objects (Rauschecker and Tian 2000), with ML being in between both anatomically and functionally.

Human neuroimaging studies have confirmed the organization of auditory cortex into core and belt areas by using the same types of stimuli as in the present study (Wessinger et al. 2001). Two core areas, robustly activated by pure-tone stimuli and showing mirror-symmetric tonotopic organization, were found along Heschl's gyri. A third such area was sometimes seen more laterally. Although the first 2 areas quite obviously correspond to areas A1 and R, the third area may be homologous to area ML, which [like the former (Rauschecker et al. 1997)] has been shown to receive direct input from the MGv (Hackett et al. 1998b). These 3 pure-tone responsive areas were surrounded by belt regions both medially and laterally, which were activated only by BPN bursts. An exploration of the medial belt region in the monkey with BPN bursts is therefore indicated.

Findings from human neuroimaging also support the dual-stream hypothesis of auditory processing (Alain et al. 2001; Binder et al. 2000; Maeder et al. 2001; Scott et al. 2000; Warren et al. 2002; Zatorre et al. 2002). It thus appears that, as in the visual system, studies of nonhuman primates can serve as excellent models for future human studies. Conversely, human imaging studies can provide useful guidance for microelectrode studies in nonhuman primates, which permit analysis at much higher spatial and temporal resolution than would be possible in most human studies, with some exceptions (Howard et al. 2000).



This work was supported by National Institute on Deafness and Other Communication Disorders Grant R01-DC-003489.


  • The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

