Processing of Frequency-Modulated Sounds in the Lateral Auditory Belt Cortex of the Rhesus Monkey

Biao Tian, Josef P. Rauschecker


Single neurons were recorded from the lateral belt areas, anterolateral (AL), mediolateral (ML), and caudolateral (CL), of nonprimary auditory cortex in 4 adult rhesus monkeys under gas anesthesia, while the neurons were stimulated with frequency-modulated (FM) sweeps. Responses to FM sweeps, measured as the firing rate of the neurons, were invariably greater than those to tone bursts. In our stimuli, frequency changed linearly from low to high frequencies (FM direction “up”) or high to low frequencies (“down”) at varying speeds (FM rates). Neurons were highly selective to the rate and direction of the FM sweep. Significant differences were found between the 3 lateral belt areas with regard to their FM rate preferences: whereas neurons in ML responded to the whole range of FM rates, AL neurons responded better to slower FM rates in the range of naturally occurring communication sounds. CL neurons generally responded best to fast FM rates at a speed of several hundred Hz/ms, which have the broadest frequency spectrum. These selectivities are consistent with a role of AL in the decoding of communication sounds and of CL in the localization of sounds, which works best with broader bandwidths. Together, the results support the hypothesis of parallel streams for the processing of different aspects of sounds, including auditory objects and auditory space.


Neurophysiological studies in cats and monkeys have traditionally identified auditory cortical areas on the basis of their tonotopic organization (Knight 1977; Merzenich and Brugge 1973; Reale and Imig 1980). It has been recognized recently that neurons in higher auditory cortical areas, in contrast to primary core areas, no longer respond optimally to pure tones but require more complex sound stimuli (Rauschecker 1997; Rauschecker et al. 1995). Nevertheless, the organization of these areas into cochleotopic maps remains an overriding principle, and borders between areas are identified as the reversal points between gradients of best frequency (Hackett et al. 1998; Morel et al. 1993; Rauschecker et al. 1995, 1997). Superimposed on cochleotopic organization are other stimulus representations: response properties of single neurons may vary a great deal between areas and help to identify any functional specializations that these areas may possess (Schreiner 1998; Tian et al. 2001).

Multiple cortical representations of the auditory world are thought to be part of a parallel processing hierarchy, in which each area processes preferentially a certain aspect of an auditory scene. “Preferentially” means that the parameter domains represented in nonprimary auditory cortex are by no means processed exclusively in one area, but significant differences between areas are expected in the distribution of response preferences within any such domain. Similar ideas have long been proposed for the role of multiple cortical representations in vision (Felleman and van Essen 1991; Ungerleider and Mishkin 1982; Van Essen and Maunsell 1983; Zeki 1978). The proportion of neurons selective for the color or direction of motion of a visual stimulus, for instance, varies substantially between extrastriate visual areas and has led to the characterization of certain areas as “color” (V4) or “motion” (V5/MT) areas. Astonishingly little progress has been made to date in terms of characterizing in a similar fashion any functional differences between higher auditory cortical areas.

Using tone bursts, various parameter domains of auditory processing have been analyzed in the past, but most of these analyses were restricted to primary auditory cortex (A1), especially of cats. This has included the analysis of sharpness of frequency tuning, level sensitivity, and binaural interactions (Imig and Adrián 1977; Phillips and Irvine 1981; Phillips et al. 1994; Schreiner and Cynader 1984; Schreiner and Mendelson 1990; Sutter and Schreiner 1991, 1995). A few of these studies have gone beyond A1 to areas such as the anterior auditory field (AAF) (Knight 1977; Phillips and Irvine 1982) or the posterior auditory field (PAF) (Loftus and Sutter 2001; Phillips and Orman 1984; Phillips et al. 1995). AAF is similar to A1 in many respects, although there is some indication that frequency-tuning curves are broader than those in A1 and more often display multiple peaks (Knight 1977; Tian and Rauschecker 1994). In ferrets, the excitatory bandwidth in AAF neurons is twice as large as that in A1 (Kowalski et al. 1995). In addition, a higher proportion of nonmonotonic, intensity-tuned cells has been found in PAF than in A1 (Phillips et al. 1995). Nevertheless, none of these parameters has been sufficient to differentiate the various areas functionally.

Clearly, what is needed is an analysis with more complex sounds comparing neuronal responses in the different cortical areas. Suga was one of the first to successfully differentiate auditory cortical areas on the basis of their response preferences to various complex sounds (for reviews see Suga 1988, 1992). In the mustached bat, the use of behaviorally relevant sounds markedly facilitated the success of this approach, and the choice of complex stimuli was narrowed down considerably by modeling them after species-specific sounds used during echolocation. In other mammals, such an approach is made more difficult by their less specialized behavioral repertoire and a bewildering number of natural complex sounds that may be expected to trigger responses in nonprimary cortical neurons. One group of species-specific sounds that has been used with some success in nonhuman primates in the past (Newman and Symmes 1974; Symmes 1981; Winter and Funkenstein 1973; Wollberg and Newman 1973) and again more recently (Eliades and Wang 2003; Rauschecker and Tian 2000; Rauschecker et al. 1995; Tian et al. 2001; Wang 2000; Wang and Kadia 2001; Wang et al. 1995) is that of vocalizations. Although their spectra are known, however, they are still too complex to be used as a first- or second-order stimulus with any hope of finding stimulus-specific characteristics across different cortical areas.

It thus appears most promising to go back initially to the use of stimuli of intermediate complexity, such as amplitude-modulated (AM) or frequency-modulated (FM) sounds. Both types of stimuli have been used quite extensively for the study of A1 in cats (Eggermont 1994; Heil et al. 1992a, b; Mendelson and Cynader 1985; Mendelson and Grasse 1992; Mendelson et al. 1993; Phillips et al. 1985; Schreiner and Urbas 1986; Whitfield and Evans 1965), ferrets (Kowalski et al. 1995; Shamma et al. 1993), and squirrel monkeys (Bieser and Müller-Preuss 1996). Again, however, few attempts have been made to analyze nonprimary auditory areas: AM sounds have been tested in AAF, PAF, and VPAF (ventro-posterior auditory field) of the cat (Schreiner and Urbas 1986, 1988), FM sounds in AAF and PAF of cats (Tian and Rauschecker 1994, 1998), and AAF of ferrets (Kowalski et al. 1995).

In our previous cat studies we found that neurons in AAF prefer high FM rates (Tian and Rauschecker 1994), whereas neurons in PAF prefer, on average, lower FM rates (Tian and Rauschecker 1998). At the same time, a high prevalence of spatially tuned neurons in AAF and in the anterior ectosylvian sulcus (AES) was found (Korte and Rauschecker 1993; Rauschecker and Korte 1993). These findings led to the hypothesis that PAF may be more suitable for the analysis of communication sounds with their slower FM rates (Brown et al. 1978; Capranica 1972; Leppelsack 1983; Liberman et al. 1967; Rauschecker 1997, 1998), whereas AAF and AES may be more specialized for the analysis of sound location. However, the conclusions from the cat work remained incomplete and purely hypothetical because no tests of call selectivity, for instance, were conducted and no direct comparison between areas was performed in the same animals.

A similar functional dichotomy has been proposed for the auditory cortical system of nonhuman primates, based on a more direct analysis of neurons in the lateral belt areas with species-specific vocalizations presented at varying spatial locations (Rauschecker and Tian 2000; Tian et al. 2001). Area AL (anterolateral) was found to have neurons with greater vocalization specificity, whereas area CL (caudolateral) had significantly more neurons with greater spatial selectivity. This suggests that AL is more closely involved in the analysis and identification of specific complex sound patterns (or auditory “objects”), including those used in acoustic communication, whereas CL is more specialized for the analysis of sound location. If the above species comparisons hold up, the functional gradient in the rostro-caudal direction would appear to be reversed in comparison with the cat. This could be explained by a disproportionate growth of the temporal lobe in monkeys during phylogeny, which would lead to a reversal of tonotopic gradients along the rostro-caudal axis (Jones 1985).

Based on these considerations, it will be instructive to perform a detailed analysis with FM sweeps of the lateral belt areas in rhesus monkeys. A prediction can be made as to the FM rate preferences of neurons in AL versus CL: AL with its higher selectivity for communication calls should prefer lower FM rates in the range of these calls, whereas CL should prefer higher rates. As pointed out previously, the use of linear FM sweeps is interesting for a second reason: they can be seen as analogous to light stimuli moving at a particular speed and in a particular direction (Mendelson and Cynader 1985; Tian and Rauschecker 1994, 1998), which are highly effective in driving neurons of extrastriate visual cortex (Hubel and Wiesel 1959). A comparison of FM tuning and visual direction tuning could lead to a more general understanding of cortical processing algorithms regardless of sensory modality.

Preliminary accounts of this work were previously presented (Rauschecker 1997, 1998; Tian and Rauschecker 1995, 1996).


Animals and surgery

The experimental procedures were generally the same as those used in our previous studies of FM selectivity in cats (Tian and Rauschecker 1994, 1998). Four adult rhesus monkeys (Macaca mulatta) with no signs of middle ear infections were used. A total of 216 cells were collected, 190 from the lateral belt areas [AL, ML (mediolateral), and CL], 14 cells from area CM, and 12 cells from A1. Recordings were made exclusively from the left hemisphere.

For the experiment, animals were treated with atropine sulfate (0.05 mg/kg, subcutaneous) and were initially anesthetized with ketamine (10 mg/kg, intramuscular). A venous catheter was placed in one of the hind leg veins, and a tracheal tube was inserted. The monkey was then placed in a specially designed monkey head holder and artificially respirated with a ventilator (Harvard) at an intrapulmonary pressure of 500 to 1,000 Pa. Expiratory CO2 content was monitored continuously (Beckman, Medical Gas Analyzer LB-2) and kept at about 3.8% by varying ventilation stroke volume and frequency. Anesthesia was maintained with isoflurane (1–2%) in a mixture of 50% nitrous oxide and 50% oxygen. Isoflurane creates a light state of anesthesia that can be easily controlled. EKG was monitored to ensure adequate anesthesia and stable physical condition of the animal. If the animal showed any acceleration of its heart rate in response to noxious stimuli, the concentration of anesthetic gases was increased, or a short-acting barbiturate (Biotal) was given through the venous catheter. The animal's core temperature was maintained at 38°C with a heating pad (Gaymar). Fluid (0.5% dextrose in 0.9% saline) was administered through the venous catheter by an infusion pump at a rate of 12–30 ml/h.

A craniotomy was performed over the parietal cortex in aseptic surgery. A temporary well was built with bone cement. The dura was left intact to minimize brain pulsation and prevent the cortex from possible mechanical damage during surgery and transportation. After the monkey was transferred to the recording room, the dura was dissected and retracted to expose the superior temporal gyrus (STG). The well was filled with saline to prevent the cortex from drying.

Acoustic stimulation


Pure-tone (PT) stimuli were for the most part generated with a stimulus generator (Wavetek 148A). The tone bursts were gated by a control unit (HI-MED, HG 300G) to 50-ms duration with 5-ms rise/fall time. The amplitude was monitored on an oscilloscope.

To determine the extent of the lateral belt areas, the supratemporal plane was first mapped with band-passed noise (BPN) (Rauschecker et al. 1995). BPN bursts were generated with the SIGNAL software (Engineering Design) on an IBM-compatible personal computer. The SIGNAL program generated a random noise, which was band-pass filtered with different cutoff frequencies. By varying upper and lower cutoff frequencies, BPN bursts could be generated with different bandwidths and center frequencies. The sampling rate was chosen as 100 kHz to avoid significant quantization steps in the signal. The stimuli had a rise/fall time of 5 ms to reduce the effect of transients. Standard stimulus duration was 50 ms, but duration could be varied if there was any indication of a neuron being tuned to this parameter. The root-mean-square (RMS) values of BPNs were normalized to the standard output for PT stimuli, which was 1 Vpeak.

The major stimulus category tested in the present study consisted of linear FM sweeps. These stimuli were also generated with SIGNAL at the same sampling rate of 100 kHz. The frequency range of the FM sweeps was chosen large enough to exceed the excitatory PT tuning range of the neuron by one octave on either side, unless this exceeded the frequency range of the sound delivery system. While keeping the frequency range constant, the duration of the FM sweep was systematically varied from 50 to 1,600 ms in logarithmic steps, to test the responses to different FM rates (FMR). The total range of FMR tested reached from 6.25 to 640 Hz/ms. Like PT stimuli, the FM sweeps also had a rise/fall time of 5 ms to reduce the effect of transients. The amplitude of the FM sweeps generated by SIGNAL was 1 Vpeak, the same as those of PT stimuli, which could also be generated with the SIGNAL program, so that a direct comparison of response strength between FM, BPN, and PT was possible. Stimulus amplitude remained constant during each trial but was varied between trials through an attenuator (HP 350D). The interstimulus interval was ≥1 s. Each stimulus was repeated 20 times for each neuron. All stimuli were amplified with a power amplifier (Hafler, SE 120) and played back with a high-fidelity loudspeaker (Infinity 5 Kappa) in free field. The loudspeaker was positioned 1.14 m in front of the monkey at the height of the ears.
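The relation between sweep duration and FM rate described above can be illustrated with a minimal sketch. It is not the SIGNAL code used in the study: the function names, the linear shape of the 5-ms ramps, the doubling of duration between steps, and the example frequency ranges are all assumptions made for illustration.

```python
import math

def linear_fm_sweep(f_start_hz, f_end_hz, duration_ms, fs_hz=100_000, ramp_ms=5.0):
    """Return samples of a linear FM sweep with a peak amplitude of 1.

    The 100-kHz sampling rate and 5-ms rise/fall time follow the text;
    the linear ramp shape is an assumption.
    """
    n = int(round(duration_ms * fs_hz / 1000.0))
    slope_hz_per_s = (f_end_hz - f_start_hz) / (duration_ms / 1000.0)
    n_ramp = int(round(ramp_ms * fs_hz / 1000.0))
    samples = []
    for i in range(n):
        t = i / fs_hz
        # Phase is the time integral of f(t) = f_start + slope * t.
        phase = 2.0 * math.pi * (f_start_hz * t + 0.5 * slope_hz_per_s * t * t)
        # Onset and offset ramps limit spectral splatter from transients.
        env = min(1.0, (i + 1) / n_ramp, (n - i) / n_ramp) if n_ramp > 0 else 1.0
        samples.append(env * math.sin(phase))
    return samples

def fm_rate_hz_per_ms(f_low_hz, f_high_hz, duration_ms):
    """FM rate of a linear sweep covering f_low..f_high in duration_ms."""
    return (f_high_hz - f_low_hz) / duration_ms

# Assumed duration series: doubling from 50 to 1,600 ms.
durations_ms = [50 * 2 ** i for i in range(6)]
```

With a fixed frequency range, each doubling of duration halves the FM rate. As a purely hypothetical illustration, a 32-kHz range swept in 50 ms gives 640 Hz/ms, and a 10-kHz range swept in 1,600 ms gives 6.25 Hz/ms, which match the extremes of the tested FMR range.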

Electrophysiological recording experiments were carried out in a large dimly lit laboratory room (4.7 × 7.6 × 2.6 m), which was kept as quiet as possible. The sound pressure level (SPL in dB, re 20 μPa) of the noise in the recording room was measured with a Brüel and Kjaer (B & K) -in. condenser microphone (#4133, free-field) and a B & K Precision Sound Level Meter (#2235; A-weighting scale). The constant background noise had its peak level (35 dB) at 0.5 kHz (i.e., outside the effective range of most neurons). The amplitude of the sound stimuli was set ≥10–20 dB above the background level. The standard SPL for pure tones and FM sweeps was 60–85 dB, as measured at the monkey's head, which was well above the background noise level but still within the linear range of our sound delivery system.


The stimulus delivery system was calibrated with the same B & K equipment. Between 0.4 and 24 kHz, the output varied by ±6 dB. Above and below this range a roll-off existed, partially because the signal was outside the measuring range of the microphone. The fidelity of the system in producing rapid FM sweeps was also tested, and near free field conditions were ensured. Details of calibration were reported in previous papers (Tian and Rauschecker 1994, 1998).

Electrophysiological recording

For extracellular recording of neuronal spike activity, a lacquer-coated tungsten electrode (F. Haer, impedance about 1 MΩ) was advanced into the brain by a hydraulic micropositioner with remote-controlled stepping motor (Model 650, David Kopf). Each penetration position was recorded with the National Institutes of Health (NIH) Image program on a Macintosh computer and a CCD camera (Panasonic) mounted on a surgical microscope (Zeiss), so that the penetration sites could be reconstructed. The electric signals were band-pass filtered (0.3–20 kHz) and amplified in 2 stages (A-M Systems preamplifier, Model 1800; Tektronix, AM 502). A “slicer” module (A.B. Bonds, Vanderbilt University) was used to reliably separate spikes from more than one neuron and filter out background noise. The output of the slicer was monitored with an audio monitor (Grass Instruments, AM 8), and a window discriminator was used to convert spikes with different amplitude levels into TTL signals. In addition, the signal at each step was monitored on an oscilloscope (Tektronix, 5113 Dual Beam Storage), so that we are quite confident that only isolated single-unit activity was recorded. The TTL signals were then registered on an IBM-compatible PC with a data collection program (HIST, Spikes Systems), which produced peristimulus time histograms (PSTHs) and raster displays with a bin width of 1 ms for on- or off-line evaluation.

Standard sets of digitized complex sounds were used as search stimuli while the electrode was lowered. When a unit was isolated, we first attempted to determine the best frequency (BF) of the neuron and its lowest excitation threshold at the BF with tone bursts. In this case, threshold was defined as the amplitude of a BF tone at which an increase of activity above the spontaneous level was just noticeable. The frequency-tuning range of the neuron was then determined at a set sound pressure level 6–20 dB above threshold. If a neuron was not, or only slightly, responsive to pure tones, the best center frequency (BFc) of the neuron was determined by varying the center frequency of BPN bursts, generally with a bandwidth of 1/3 or 1 octave. The center frequency of a BPN at which the response was strongest was taken as the BFc of the neuron. FM sweeps were played back at the same or lower amplitudes as PT or BPN stimuli. PSTHs and raster displays were recorded during presentation of all sound stimuli.

At the end of each penetration, electrolytic microlesions (7 μA, 7 s) were made to mark the electrode tracks and specific recording sites.

Data analysis

To quantify the response to stimulation with PT, BPN bursts, and FM sweeps in the lateral belt areas, the same “peak firing rate” was determined from the PSTHs as described in previous studies (Tian and Rauschecker 1994, 1998). In brief, a 10-ms window was slid at 1-ms steps across the PSTH, the number of spikes in this 10-ms interval was counted at each step until the maximum was found, and the average firing rate in the peak interval was calculated after subtracting spontaneous activity. The rate–frequency curves thus obtained were smoothed with a 3-point smoothing routine. Sometimes several local maxima were detected in these rate–frequency curves. A secondary or tertiary peak was defined if the firing rate between 2 local maxima dropped by ≥50% compared with either maximum.
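The sliding-window measurement can be sketched as follows. This is an illustrative reading of the procedure, not the original analysis program: the function name, the argument layout, and the assumption that the PSTH holds spike counts summed over trials in 1-ms bins are ours. The 3-point smoothing step is not included.

```python
def peak_firing_rate(psth_counts, n_trials, spont_rate_hz, window_ms=10):
    """Peak firing rate above spontaneous activity from a PSTH with 1-ms bins.

    psth_counts: spikes per 1-ms bin, summed over n_trials (assumed layout).
    Returns (rate in spikes/s minus spontaneous rate, start bin of peak window).
    """
    if len(psth_counts) < window_ms:
        raise ValueError("PSTH is shorter than the analysis window")
    window_sum = sum(psth_counts[:window_ms])
    best_sum, best_start = window_sum, 0
    # Slide the window in 1-ms steps, updating the running sum incrementally.
    for start in range(1, len(psth_counts) - window_ms + 1):
        window_sum += psth_counts[start + window_ms - 1] - psth_counts[start - 1]
        if window_sum > best_sum:
            best_sum, best_start = window_sum, start
    # Convert the spike count to a per-trial rate in spikes/s.
    rate_hz = best_sum / (n_trials * window_ms / 1000.0)
    return rate_hz - spont_rate_hz, best_start
```

For example, 40 spikes falling in the best 10-ms window across 20 repetitions correspond to 200 spikes/s before the spontaneous rate is subtracted.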

FM responses were analyzed in an analogous fashion by creating “rate–frequency” curves from instantaneous frequency (IF). Because linear FM sweeps were used, the time axis in the PSTH can be easily converted into a frequency axis, as follows. The duration of the FM stimulus was divided into 50 equal intervals (the shortest FM stimulus was 50 ms), after which the peak firing rate in each interval was determined. Thus instead of looking for the peak response within the 10-ms window for the whole PSTH, the local peak response was determined in each interval. In the case of an FM duration of 1,600 ms, for example, the duration of each interval was 32 ms, but the width of the sliding window was kept at 10 ms. The obtained values were smoothed and plotted for both upward and downward FM sweeps against the IF at the center of the peak interval. Best instantaneous frequency (BIF) was defined as the IF at the global maximum in the rate–frequency curve. The BIFs at different FMRs were averaged separately for each direction (upward and downward). These averaged values were compared with the BFs determined from PT stimulation. The appropriate transformation into the frequency domain required an estimation of response latency, which was obtained from the shortest onset latency in response to any of the stimuli tested. Because it was sometimes difficult to estimate the minimum latency due to the relatively high spontaneous activity, the calculated rate–frequency curve was compared with the peak response for the whole FM duration. If the peak was not included in the rate–frequency curve, then the latency was reassessed until this was the case. Multiple maxima in the response to FM stimuli were defined by the 50% criterion (as for PT responses) and by their occurrence at corresponding IFs for both upward and downward sweep at least for one FMR.
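The conversion from PSTH time to instantaneous frequency can be sketched as below. This is a hedged illustration only: the function names are hypothetical, the PSTH is assumed to have 1-ms bins starting at stimulus onset, and centering the 10-ms window on each bin is one possible reading of the procedure.

```python
def instantaneous_frequency_hz(t_ms, latency_ms, f_start_hz, f_end_hz, duration_ms):
    """IF of a linear sweep at the stimulus time that evoked activity at t_ms."""
    t_stim = min(max(t_ms - latency_ms, 0.0), duration_ms)
    return f_start_hz + (f_end_hz - f_start_hz) * t_stim / duration_ms

def rate_frequency_curve(psth_counts, f_start_hz, f_end_hz, duration_ms,
                         latency_ms, n_intervals=50, window_ms=10):
    """Return one (IF in Hz, peak spike count) pair per stimulus interval."""
    step_ms = duration_ms / n_intervals
    half = window_ms // 2
    curve = []
    for k in range(n_intervals):
        # Response bins for this stimulus interval, shifted by the latency.
        first = int(round(latency_ms + k * step_ms))
        last = max(first + 1, int(round(latency_ms + (k + 1) * step_ms)))
        best_count, best_bin = -1, first
        for center in range(first, last):
            lo = max(0, center - half)
            hi = min(len(psth_counts), center + half)
            count = sum(psth_counts[lo:hi])
            if count > best_count:
                best_count, best_bin = count, center
        f_hz = instantaneous_frequency_hz(best_bin + 0.5, latency_ms,
                                          f_start_hz, f_end_hz, duration_ms)
        curve.append((f_hz, best_count))
    return curve

def best_instantaneous_frequency(curve):
    """BIF: the IF at the global maximum of the rate-frequency curve."""
    return max(curve, key=lambda point: point[1])[0]
```

Downward sweeps are handled by passing a start frequency above the end frequency. An error in the latency estimate shifts every IF by the same amount in time, which is why the latency has to be checked against the overall peak response.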

To assess the neuron's response to different FMR, the peak firing rates at each FMR were determined and plotted as FMR tuning curves. According to the neuron's responses to the different FMRs, FMR tuning was categorized as high-pass (HP), low-pass (LP), band-pass (BP), all-pass (AP), or band-rejection (BR) behavior. A neuron was classified as HP or LP if the response dropped below 75% of the maximum response at lower or higher FMRs, respectively. A BP neuron had a clear optimum in the middle of the FMR tuning range, whereas the response of an AP neuron was never <75% of the maximum. A BR neuron had 2 optimal FMRs with <5% difference in firing rates separated by FMRs with firing rates <75% of the maximum (Tian and Rauschecker 1994, 1998). The preferred FMR (PFMR) was defined as the FMR at which the response was maximal in a given FM direction. For statistical evaluation, the PFMR in the preferred direction was used if the neuron was FM direction-selective (see following text); otherwise, the PFMR at which the response was maximal was used. Because linear FMs were used in our study but the cochleotopic scale is rather logarithmic, the actual FMR slows down at higher frequencies with regard to the cochleotopic organization. In other words, a neuron with a higher BFc could have the same instantaneous PFMR as that of another neuron with a lower BFc, although their overall PFMRs measured on a linear scale (Hz/ms) were different. To compensate for this effect, one has to determine the instantaneous PFMR on a logarithmic scale (octaves/s) at the BFc of a neuron. When Δt is made small enough, the instantaneous FMR at the BFc of a neuron can be approximated for the upward sweep by the equation

$$\mathrm{iFMR} = \frac{1}{\Delta t}\,\log_2\!\left[\frac{\mathrm{BF_c} + \Delta t\,(F_h - F_l)/D}{\mathrm{BF_c}}\right]$$

where iFMR is the instantaneous FMR at the BFc; Fl and Fh are the lower and higher borders of the FM sweep, respectively; and D is the duration of the FM sweep.
Similarly, the instantaneous FMR for the downward sweep can be approximated by

$$\mathrm{iFMR} = \frac{1}{\Delta t}\,\log_2\!\left[\frac{\mathrm{BF_c}}{\mathrm{BF_c} - \Delta t\,(F_h - F_l)/D}\right]$$

with the same symbols as for upward sweeps.
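The logarithmic-scale rate can be illustrated numerically. The sketch below assumes a linear sweep whose frequency changes at (Fh − Fl)/D Hz per second and evaluates the change in octaves over a small step Δt around the BFc. The function names and the example values are hypothetical, and the closed-form limit is our own derivation from that assumption.

```python
import math

def ifmr_octaves_per_s(bfc_hz, f_low_hz, f_high_hz, duration_s,
                       delta_t_s=1e-6, upward=True):
    """Instantaneous FM rate at the BFc in octaves/s for a linear sweep."""
    slope_hz_per_s = (f_high_hz - f_low_hz) / duration_s
    step_hz = slope_hz_per_s * delta_t_s  # frequency change during delta_t
    if upward:
        octaves = math.log2((bfc_hz + step_hz) / bfc_hz)
    else:
        octaves = math.log2(bfc_hz / (bfc_hz - step_hz))
    return octaves / delta_t_s

def ifmr_limit(bfc_hz, f_low_hz, f_high_hz, duration_s):
    """Limit of the expression above as delta_t approaches zero."""
    return (f_high_hz - f_low_hz) / (duration_s * bfc_hz * math.log(2))

# Hypothetical example: the same 160-Hz/ms sweep seen at two BFc values.
slow = ifmr_octaves_per_s(8000.0, 500.0, 16500.0, 0.1)
fast = ifmr_octaves_per_s(2000.0, 500.0, 16500.0, 0.1)
```

In this example the sweep passes the 2-kHz neuron about four times faster in octaves/s than the 8-kHz neuron, even though its linear rate in Hz/ms is identical for both.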

Another way to assess a neuron's sensitivity to FMR is to calculate the centroid (Kowalski et al. 1995). Because the steps of FMR were logarithmic, the original formula had to be modified to reflect the logarithmic steps

$$C = \frac{\sum \log(\mathrm{FMR}) \cdot R(\mathrm{FMR})}{\sum R(\mathrm{FMR})}$$

where C is the centroid of FMRs, and R(FMR) is the neuronal response at a particular FMR.
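A response-weighted centroid on a logarithmic FMR axis can be sketched as follows. The use of base-2 logarithms and the function name are assumptions made for this illustration; the source states only that the formula was adapted to logarithmic steps.

```python
import math

def fmr_centroid_log2(fmrs_hz_per_ms, responses):
    """Response-weighted mean of log2(FMR); base 2 is an assumed choice."""
    total = sum(responses)
    if total <= 0:
        raise ValueError("responses must sum to a positive value")
    weighted = sum(math.log2(f) * r for f, r in zip(fmrs_hz_per_ms, responses))
    return weighted / total

# Hypothetical tuning curve, symmetric on the logarithmic FMR axis.
fmrs = [10, 20, 40, 80, 160, 320]
rates = [1, 2, 3, 3, 2, 1]
centroid_hz_per_ms = 2 ** fmr_centroid_log2(fmrs, rates)
```

Converting back with a power of 2 expresses the centroid in Hz/ms. For a tuning curve that is symmetric on the logarithmic axis, as in this example, the centroid falls at the geometric center of the tested range.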

To assess FM direction selectivity (DS), a quantitative index was calculated using the equation

$$\mathrm{DS} = \frac{R_u - R_d}{R_u + R_d}$$

where Ru and Rd are the responses to upward and downward FM sweeps, respectively, at a particular FMR (Heil et al. 1992a, b; Mendelson and Cynader 1985; Phillips et al. 1985; Shamma et al. 1993; Tian and Rauschecker 1994, 1998).

In case a neuron was tested at more than one intensity level, the best response was used for population analysis.


At the end of an experiment, the animal was deeply anesthetized with sodium pentobarbital (Nembutal, 60 mg/kg, intravenous) and perfused transcardially with 0.9% saline followed by 4% paraformaldehyde in 0.1 M phosphate buffer. The brain was removed from the skull, blocked stereotaxically, and stored in fixative at 4°C. The part of the brain containing the auditory cortex was sunk in increasing sucrose gradients and cut in 50-μm-thick frontal sections on a freezing microtome. The sections were mounted on slides and stained with cresyl violet or thionin. Electrolytic microlesions were identified under a microscope (Wild) and the sections drawn with a camera lucida (Leitz). The lesion sites were compared with the stereotaxic measurements taken during the recording session to verify the location of the recording sites.


A total of 216 auditory cortical units from 4 monkeys were studied with extracellular single-unit recording. Of these, 190 were located in the lateral belt areas, AL, ML, and CL, the remainder in areas A1 and CM. Among lateral belt neurons, 175 were analyzed with FM sweeps, and 165 neurons were responsive to such FM stimuli. Among those lateral belt neurons, 25 were tested with FM sweeps at more than one intensity level. In these cases, the best response was used for population analysis.

Recording sites

Parallel electrode penetrations were made into the superior temporal gyrus (STG) along the lateral sulcus (LS) under visual guidance. Figure 1A shows surface electrode penetration sites through the left hemisphere in one monkey (CR696). In most cases, we aimed to angle the penetrations orthogonally to the cortical surface. However, because of the curvature of the STG and the goal to keep penetrations parallel to one another, some electrode tracks ended up being slightly oblique to the cortical surface (Fig. 1B). In some cases (with more medial coordinates), penetrations into the supratemporal plane (STP) were also made through overlying parietal cortex.

FIG. 1.

A: electrode track positions in one monkey (CR696). Location of the tracks on the exposed surface of the superior temporal gyrus (STG) was recorded with a digital camera. Each dot indicates an electrode penetration. Number next to it shows the best center frequency (BFc) in kHz at this penetration. Border between the anterolateral (AL) and the mediolateral (ML) areas is indicated by the dashed line running through the reversal point of the frequency gradient along the lateral sulcus. B: histological reconstruction of electrode tracks in another monkey (89.1520). Penetrations are indicated by the straight lines traversing cortex on the STG; recording sites are indicated by tick marks. ls, lateral sulcus; sts, superior temporal sulcus; r, rostral; c, caudal; m, medial; l, lateral.

Based on the best center frequency (BFc) in each penetration along the LS, a BFc gradient with 2 frequency reversals could be established along the LS, indicating the previously defined 3 lateral belt areas AL, ML, and CL (Rauschecker et al. 1995). As a consequence, neurons recorded in the STG could be assigned to a particular lateral belt area. Figure 1A shows a BFc reversal in the anterior part of the lateral belt, indicating the border between AL and ML. Mapping of the caudal part of the STG could not be completed in this case because of time constraints during the recording session. Extensive mapping was performed in the caudal region of the second monkey, where a frequency reversal at the high-frequency border between CL and ML could be established. In a third monkey, recordings were made from all 3 belt areas, and in a fourth monkey the recordings concentrated again on ML and CL. Of the 190 neurons recorded in the lateral belt areas, 34, 104, and 52 were located in AL, ML, and CL, respectively. Of those neurons, 32 in AL, 101 in ML, and 42 in CL were tested with FM stimuli while PSTHs were recorded. Among them, 32, 100, and 33 in AL, ML, and CL, respectively, were responsive to FM stimuli. These neurons formed the database for further analysis.

Responses to FM sweeps: general properties

When stimulated with linear FM sweeps, 165 of 175 lateral belt neurons (94%) responded in at least one FM direction (Fig. 2). In only 10 units (6%), largely from CL, was no response detectable at any FM rate or direction. It cannot be excluded that the PFMR in these cases was outside the range of FMR tested.

FIG. 2.

An example of a lateral belt neuron responding to frequency-modulated (FM) sweeps with different rates and directions (up and down). For each response, the peristimulus time histogram (PSTH) is displayed on top, the raster display in the middle, and a schematic spectrogram of the stimulus on the bottom of each panel. Frequency-modulation rate (FMR, in Hz/ms) of the stimulus is indicated by the number in the upper-right corner of each of the top panels. Top panels: upward sweeps; bottom panels: downward sweeps.

In general, neurons in the lateral belt areas showed only a brief response to FM stimuli. The response period was much shorter than the actual stimulus duration (Fig. 2). Because the onset latency of the response changed clearly when the duration of the FM sweeps (i.e., the FMR) was varied, it seems that the response was triggered by a particular portion of the stimulus. When tested at different FMRs in a given sweep direction (upward or downward), the response varied with the FMR, generally with a clear maximum at a certain rate (Fig. 2, top panels). This indicates tuning for FMR. When the responses to the upward and downward FM directions at the same FMR were compared (Fig. 2, middle panels), one direction (the upward sweep in the case of this example) was often preferred by a neuron over the other. This suggests that the neuron was FM direction-selective. We will discuss these response properties in more detail below.

Selectivity for FM direction

To characterize FM direction selectivity quantitatively, a direction selectivity (DS) index was calculated at each FMR, as described in methods, and was plotted against FMR (Fig. 3, bottom panels). A neuron was considered direction-selective when the response in one FM direction for one or more FMR was at least twice as large as that in the other direction (Mendelson and Cynader 1985). This corresponds to an absolute value of the DS index of 0.33, indicated by the dashed lines in Fig. 3. If there were points above the upper dashed line (i.e., the DS value was >0.33), then the neuron was classified as preferring upward sweeps (Fig. 3, A and B). Conversely, if there were points below the lower dashed line (i.e., the DS value was < −0.33), then the neuron was classified as preferring downward sweeps (Fig. 3C). Sometimes, the direction preference depended on the FMR, i.e., the neuron preferred one direction at one FMR but the opposite direction at another FMR (not shown in Fig. 3). These neurons were classified as preferring up- and downward sweeps. If the DS value was not greater than +0.33 or smaller than −0.33 at any FMR tested, then the neuron was classified as not direction-selective.
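The classification rule can be sketched in a few lines. The index is assumed here to take the conventional contrast form (Ru − Rd)/(Ru + Rd), which is consistent with a 2:1 response ratio giving 1/3 (about 0.33). The function names and category labels are illustrative only.

```python
def ds_index(r_up, r_down):
    """Direction selectivity index in the assumed form (Ru - Rd) / (Ru + Rd)."""
    total = r_up + r_down
    if total == 0:
        return 0.0  # no response in either direction
    return (r_up - r_down) / total

def classify_direction(up_rates, down_rates, criterion=1.0 / 3.0):
    """Classify a neuron from paired responses, one pair per FMR tested."""
    eps = 1e-12  # tolerance so that an exact 2:1 ratio meets the criterion
    indices = [ds_index(u, d) for u, d in zip(up_rates, down_rates)]
    prefers_up = any(ds >= criterion - eps for ds in indices)
    prefers_down = any(ds <= -criterion + eps for ds in indices)
    if prefers_up and prefers_down:
        return "up/down"
    if prefers_up:
        return "up"
    if prefers_down:
        return "down"
    return "none"
```

A single FMR at which the criterion is met is enough to make the neuron direction-selective under this rule, so the "up/down" class arises when the criterion is met in opposite directions at different FMRs.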

FIG. 3.

Selectivity for FM direction in lateral belt neurons. Firing rate (top panel) and direction selectivity (DS) index (bottom panel) at different FMRs are plotted for 3 neurons (A to C). In the top panel, open circles symbolize the responses to upward FM sweeps, filled circles to downward sweeps. Formula to calculate DS is displayed in the figure. In the bottom panel, the dashed lines at a DS index of ±0.33 indicate the criterion for DS applied to an FM sweep (Mendelson and Cynader 1985). Values beyond these limits indicate that the response to one FM direction is <50% of that in the other direction. Thus the neuron is direction-selective at this particular FMR. Positive and negative values refer to upward and downward preference, respectively. A: neuron with a preference for moderate FMR (BP neuron) in both FM directions and a direction preference to upward sweeps. B: neuron with a preference for moderate FMR (BP neuron) in upward direction, but for higher FMR (HP neuron) in downward direction, and direction preference for upward sweep at lower FMRs. C: neuron with a preference for lower FMR (LP neuron) in upward direction and moderate FMR in downward direction and direction preference for downward FM sweep at moderate FMRs. Note that in B, the FMR tuning curve for upward direction has 2 peaks, although because the 2 peaks differed by more than 5% in firing rate, the neuron did not qualify as a BR neuron.

Of 165 units in the lateral belt areas, 99 units (60%) fulfilled the criterion and were classified as direction-selective (Fig. 4). Among them, in 18 units (11% of all 165 cells) direction preference depended on FMR, so they were classified as preferring up- and downward sweeps. Of the remaining 81 neurons, 40 (24% of all 165 cells in the sample) preferred upward and 41 (25%) downward sweeps. Thus, among the neurons with only one preferred direction, upward and downward preferences were about equally common (40:41). The remaining 66 units (40%) did not meet the criterion for FM direction selectivity and responded about equally to upward and downward direction at all FMRs.

FIG. 4.

Distribution of AL (top), ML (middle), and CL (bottom) neurons with different types of FM direction selectivity. Up, Down: neurons that responded better to an upward or downward FM sweep, respectively, at all FMRs. Up/Down: neurons that responded better to one FM direction at some FMRs, but better to the other direction at other FMRs. None: no difference between both FM directions at any FMR.

When the numbers of neurons in each area were counted, 7 of 32 (21.9%) AL neurons preferred upward sweeps, and 15 neurons (46.9%) preferred downward sweeps (Fig. 4). In 2 AL neurons (6.2%), the direction preference depended on FMR. Eight AL neurons (25%) had no directional preference. In ML, 34 and 24 of 100 units (34 and 24%) preferred upward and downward sweeps, respectively. In 13 units (13%), the preference depended on FMR. The remaining 29 units (29%) had no directional preference. In CL, 8 and 9 of 33 neurons (24.2 and 27.3%) preferred upward and downward sweeps, respectively. In 4 units (12.1%) the directional preference depended on FMR, and the remaining 12 units (36.4%) had no directional preference.

Tuning to FM rate

The example in Fig. 2 showed that lateral belt neurons responded differentially to different FMRs. When the spike rate was plotted against the FMR in a particular FM sweep direction, FMR tuning curves could be derived (Fig. 3). Often there was a maximum in the middle range of FMRs tested (Fig. 3A, up- and downward; C, downward); in other cases, the maximum was at the lower (Fig. 3C, upward) or higher extreme (Fig. 3B, downward). Occasionally, 2 peaks could be detected in the FMR tuning curve (Fig. 3B, upward). Additional examples of FMR tuning curves are shown in Fig. 5. The responses here at different FMRs were normalized to the maximum response. As defined in methods, FMR tuning was categorized as high-pass (HP) (Fig. 5A), low-pass (LP) (Fig. 5C), band-pass (BP) (Fig. 5B), all-pass (AP) (not shown here), or band-rejection (BR) (Fig. 5D).

FIG. 5.

Examples of FMR tuning curves for 4 of the 5 categories: HP neurons (A); BP neurons (B); LP neurons (C); BR neurons (D). The fifth category (AP) is not displayed here. Each diagram depicts several examples from one category. Responses are normalized to the maximum response (100%) for each neuron. Dashed line marks 75% of the maximal response, which was used as the criterion to categorize the neurons. Note that the FMR tuning curves cover the whole range of FMRs tested. Up, upward sweeps; Down, downward sweeps; HP, high-pass behavior; BP, band-pass behavior; LP, low-pass behavior; AP, all-pass behavior; BR, band-rejection behavior. n refers to number of neurons.
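The use of the 75% criterion can be illustrated with a simplified classifier. The exact category definitions are given in methods and are not reproduced here, so the rule in this sketch is an assumption: a curve is labeled by where its normalized response stays at or above the criterion. Additional conditions, such as the similarity of the 2 peaks required for a BR neuron, are deliberately omitted, and the function name is hypothetical.

```python
def classify_fmr_tuning(responses, criterion=0.75):
    """Label a tuning curve sampled at increasing FMRs.

    Simplified, assumed rule (not the paper's exact definition):
      AP: at or above criterion at every FMR
      BR: at or above criterion at both extremes, below it somewhere between
      LP: at or above criterion at the lowest FMR only
      HP: at or above criterion at the highest FMR only
      BP: below criterion at both extremes
    """
    peak = max(responses)
    if peak <= 0:
        raise ValueError("tuning curve has no positive response")
    above = [r / peak >= criterion for r in responses]
    if all(above):
        return "AP"
    lowest, highest = above[0], above[-1]
    if lowest and highest:
        return "BR"
    if lowest:
        return "LP"
    if highest:
        return "HP"
    return "BP"
```

Under this rule a curve that peaks in the middle of the tested range and falls below 75% at both ends is labeled BP, and one that stays above 75% only at the lowest rates is labeled LP.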

Presentation of FM sweeps in both directions (upward and downward changes of frequency) resulted in all kinds of combinations of FMR tuning. A neuron could, for example, show HP behavior for FM sweeps in one direction, but BP or even LP in the other. Figure 6 shows the percentages of neurons with different combinations in the 3 lateral belt areas in a 3-dimensional display. In AL, the LP-BP, LP-LP, and BP-LP neurons formed the 3 largest populations (9/32 = 28%, 7/32 = 22%, and 5/32 = 16%, respectively), followed by the BP-BP neurons (3/32 = 9%) (Fig. 6, top panel). AP or BR behavior appeared only in combination with others, such as HP, LP, or BP. Three quarters of AL neurons (24/32 = 75%) showed LP and 2 units (6%) showed BR behavior in at least one FM direction. In ML, the BP-BP neurons formed the largest proportion (25/100 = 25%), followed by BP-HP, LP-LP, and LP-BP neurons (14/100 = 14%, 12/100 = 12%, and 11/100 = 11%, respectively) (Fig. 6, middle panel). There was one (1/100 = 1%) BR-BR neuron. Sixty-nine neurons (69%) in ML showed BP behavior in at least one FM direction. In CL, the BP-BP neurons made up the largest proportion (14/33 = 42%) (Fig. 6, bottom panel). As in AL, AP and BR behavior appeared only in combination with HP, BP, or LP behavior. Overall, 76% (25/33) of CL neurons showed BP behavior in at least one FM direction.

FIG. 6.

Distribution of cells in AL (top), ML (middle), and CL (bottom) in FMR tuning categories. All possible combinations of responses in the 2 FM directions are shown here. Abbreviations as in Fig. 5.

When the tuning behavior only in the preferred direction was counted for each neuron (or in cases where there was no strict direction preference, data in the direction with maximal response were used), LP neurons were the largest population in AL (17/32 = 53%), followed by BP and HP neurons with 34 and 13%, respectively (Fig. 7, top). In ML, the largest proportion of neurons were the BP neurons (50/100 = 50%), followed by LP (26%) and HP (21%) neurons, whereas 3 neurons were classified as BR neurons (Fig. 7, middle). In CL, the proportion of BP neurons (23/33 = 70%) was even larger than in ML; LP and HP neurons were equally distributed at 15% (5/33) each, and there was no BR neuron in either AL or CL (Fig. 7, bottom).

FIG. 7.

Distribution of cells in AL (top), ML (middle), and CL (bottom) in FMR tuning categories in the preferred direction. If the neuron did not have a direction preference, the data in the direction with maximal response were used. Abbreviations as in Fig. 5.

Preferred FMR in different areas of the lateral belt

The preferred FM rate (PFMR) was defined as the FMR at which the response was maximal in a given FM direction (see methods). BR neurons were excluded here because a defined PFMR could not be determined. When the distributions of the PFMRs for the 3 areas were plotted, there was a clear difference between the 3 areas (Fig. 8). AL neurons preferred lower FMRs. More than half of the neurons had their PFMRs below 64 Hz/ms (24/31 = 77% and 17/31 = 55% for upward and downward sweeps, respectively), with their medians at 25 and 50 Hz/ms, respectively. CL neurons, in contrast, preferred higher FMRs. About 70% of CL neurons preferred FMRs above 64 Hz/ms (23/32 = 72% and 22/32 = 69% for upward and downward sweeps, respectively), with the medians at 160 Hz/ms for both directions. ML neurons preferred FMRs in the midrange. About half of the ML neurons (52/95 = 55% and 37/97 = 38% for upward and downward sweeps, respectively) had their PFMR below 64 Hz/ms, with their medians at 60 and 125 Hz/ms, respectively. The difference between the 3 areas was significant (P < 0.05, df = 2, Kruskal–Wallis test). No significant difference was found in PFMRs between upward and downward directions: AL (P = 0.0513, Wilcoxon signed-rank test), ML (P = 0.0731), and CL (P = 0.4552).
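As an illustration of the definition above, the PFMR can be read off a tuning curve as the FMR with the maximal response and then assigned to a one-octave bin for a histogram. The sketch assumes that ties are resolved in favor of the lowest FMR, which the text does not specify; the function names and example values are hypothetical.

```python
import math


def preferred_fmr(fmrs, responses):
    """Return the FMR (Hz/ms) with the maximal response.

    Ties go to the first (lowest) FMR; this tie rule is an assumption.
    """
    best = max(range(len(responses)), key=lambda i: responses[i])
    return fmrs[best]


def octave_bin_exponent(fmr_hz_per_ms):
    """Return k such that 2**k <= FMR < 2**(k + 1), i.e., a one-octave bin."""
    return math.floor(math.log2(fmr_hz_per_ms))


def fraction_below(pfmrs, limit=64.0):
    """Fraction of neurons whose PFMR lies below a limit such as 64 Hz/ms."""
    return sum(1 for p in pfmrs if p < limit) / len(pfmrs)
```

For a hypothetical neuron tested at 9.06 to 290 Hz/ms whose response peaks at 36.25 Hz/ms, the PFMR falls in the bin from 32 to 64 Hz/ms (k = 5).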

FIG. 8.

Distribution of preferred frequency-modulated rates (PFMRs) for all cells (open bars) and cells with a BFc ranging from 3 to 16 kHz (stippled bars) in AL (top panels), ML (middle panels), and CL (bottom panels). Left panels: data for upward sweeps; right panels: data for downward sweeps. Total number of units and the median of PFMRs (in Hz/ms) in each area are also displayed in each plot. Preference for FMR was distributed over the whole range tested. BR neurons were excluded from these plots because they did not have a single PFMR. Bin width of the graphs is one octave, i.e., 2^k (k = 1, 2, …, 18) in Hz/ms.

Because the instantaneous FMR of a linear FM sweep slows down at higher frequencies with regard to the logarithmic cochleotopic organization and there were more neurons in CL with higher BFc values, it is possible that the differences in PFMR in different areas were not as strong as presented for the overall PFMR. To compensate for this potential sampling bias, we have restricted our comparison to neurons with a BFc in the same frequency range. AL neurons had BFc values ranging from 1 to 16 kHz, ML neurons had BFc values from 1.25 to 30 kHz, and CL neurons had BFc values from 3 to 36 kHz. Thus we selected neurons with a BFc ranging from 3 to 16 kHz. Among these neurons, 22, 58, and 19 were in AL, ML, and CL, respectively. There was no difference in BFc in the 3 areas (P = 0.119, df = 2, Kruskal–Wallis test). The PFMR distribution is also plotted in Fig. 8 as shaded bars. The difference in PFMR in the 3 areas was even clearer (P < 0.0001 and P = 0.0011 for upward and downward sweeps, respectively; df = 2, Kruskal–Wallis test). The median PFMR for these neurons in AL was 15.63 and 31.25 Hz/ms for upward and downward sweeps, respectively; the median was 62.5 and 160 Hz/ms in ML, and 160 and 200 Hz/ms in CL for upward and downward sweeps, respectively.

Another way to compensate for the “slow-down” effect at higher frequencies is to compute the instantaneous FMR on a logarithmic scale (octaves/s) at the neuron's BFc. Because the best instantaneous frequency (BIF) differed widely from the BFc in many neurons, it is more appropriate to use the BIF rather than the BFc to determine the instantaneous FMR in octaves/s. Data for each lateral belt area are plotted in Fig. 9 for both the upward and downward directions. Again, AL neurons were distributed at lower values with a median of 5.2 and 6.3 octaves/s for upward and downward directions, respectively; ML neurons were in the midrange with medians of 21.5 and 16.9 octaves/s; and CL neurons had higher values with medians of 31.0 and 30.6 octaves/s. The difference between areas was significant for both directions (P < 0.0385 and P = 0.0055 for upward and downward sweeps, respectively; df = 2, Kruskal–Wallis test). When AL and CL neurons were compared directly, the difference was also significant for both directions (P < 0.0205 and P = 0.0013 for upward and downward sweeps, respectively; Mann–Whitney U test).
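The conversion from Hz/ms to octaves/s follows from the definition of a linear sweep. If the frequency changes at a constant rate r (in Hz/s), the rate of change of log2(f) at frequency f is r/(f ln 2), so the same linear rate corresponds to fewer octaves/s at higher frequencies. A minimal sketch of this conversion is given here; the function name is hypothetical, and the paper's own computation (described in methods) may differ in detail.

```python
import math


def instantaneous_rate_oct_per_s(fmr_hz_per_ms, freq_hz):
    """Instantaneous rate of a linear FM sweep in octaves/s at freq_hz.

    Derivation: d(log2 f)/dt = (df/dt) / (f * ln 2), with df/dt in Hz/s.
    """
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    rate_hz_per_s = fmr_hz_per_ms * 1000.0  # Hz/ms -> Hz/s
    return rate_hz_per_s / (freq_hz * math.log(2))
```

For example, a sweep of 25 Hz/ms evaluated at 4 kHz corresponds to about 9.0 octaves/s, and the same sweep evaluated at 8 kHz to half that value.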

FIG. 9.

Distribution of instantaneous PFMRs obtained at the best instantaneous frequency (BIF) in AL (top panels), ML (middle panels), and CL (bottom panels). Left panels: data for upward sweeps; right panels: data for downward sweeps. Total number of units and median of PFMRs (in octaves/s) in each area are also displayed in each plot. BR neurons were excluded from these plots because they did not have a single PFMR. Bin width of the graphs is one octave, i.e., 2^k (k = 0, 1, …, 10) in octaves/s.

When the distributions of PFMR were plotted for the preferred direction, the same results were obtained as described above (Fig. 10A). AL neurons preferred lower FMRs: 22 of 32 neurons (69%) had their PFMR below 64 Hz/ms, and the median was 50 Hz/ms. ML neurons preferred the middle range of FMRs: their PFMRs were distributed almost equally above and below 64 Hz/ms (50:47 = 52:48%, respectively), with a median at 125 Hz/ms. CL neurons preferred higher FMRs: 24 of 33 CL neurons (73%) preferred FMRs above 64 Hz/ms, and the median was 160 Hz/ms. Again, the difference between AL, ML, and CL was clearly significant (P = 0.0010, df = 2, Kruskal–Wallis test). The distribution of PFMRs of neurons with matched BFc values was again plotted in shaded bars. The difference was also significant among these neurons (P = 0.0002, df = 2, Kruskal–Wallis test).

FIG. 10.

A: distribution of PFMRs in the preferred direction for all cells (open bars) and cells with a BFc ranging from 3 to 16 kHz (stippled bars) in AL (top), ML (middle), and CL (bottom). Total number of units and the median of PFMRs in each area are also displayed in each plot. B: distribution of instantaneous PFMRs at the neurons' BIFs measured in octaves/s in the preferred direction for all cells. Other conventions as in Fig. 8.

As mentioned above, the instantaneous FMR on a logarithmic scale (octaves/s) at the neuron's BIF was also computed for the preferred FM direction. Data for each lateral belt area are plotted in Fig. 10B. Again, AL neurons were distributed at lower values with a median of 6.8 octaves/s, ML neurons were in the middle range with a median of 19.3 octaves/s, and CL neurons had higher values with a median of 30.5 octaves/s. The difference was significant (P = 0.0004, df = 2, Kruskal–Wallis test). When AL and CL neurons were compared directly, the difference was also significant (P < 0.0001, Mann–Whitney U test).

Another way to assess a neuron's FMR tuning preference is to compute the centroid, which is thought to be more stable than the mere PFMR, given that it takes into account the responses at all FMRs tested (Kowalski et al. 1995; Shamma et al. 1993). BR neurons were again excluded from this analysis because they would yield a centroid somewhere in the middle where the real response was low. Using this type of analysis, the same preference was seen for each area in a given FM direction (Fig. 11). AL had lower centroids with medians at 69 and 76 Hz/ms for upward and downward sweeps, respectively, ML in the midrange with medians at 86 and 101 Hz/ms, and CL had higher centroids with medians at 98 and 104 Hz/ms, respectively (P < 0.005, df = 2, Kruskal–Wallis test). When the centroids of upward and downward sweeps were compared within each area, ML neurons had higher centroids for downward sweeps than for upward sweeps (P = 0.002, Wilcoxon signed-rank test), but there was no significant difference in the other 2 areas between upward and downward sweeps (P = 0.0897 and P = 0.6951 for AL and CL, respectively, Wilcoxon signed-rank test). As for the PFMRs, similar results were obtained for centroids in the preferred direction. AL had lower centroids, ML had centroids in the mid-range, and CL had higher centroids, with medians at 73, 97, and 101 Hz/ms for each area, respectively (P = 0.0008, df = 2, Kruskal–Wallis test).
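The centroid can be sketched as a response-weighted mean of the FMRs tested. The sketch assumes a linear FMR axis (consistent with centroids being reported in Hz/ms) and clips negative net responses to zero; both choices are assumptions, because the exact computation is not spelled out in this section. The function name is hypothetical.

```python
def fmr_centroid(fmrs, responses):
    """Response-weighted mean FMR (Hz/ms) over all FMRs tested.

    Assumptions: linear FMR axis; negative net responses count as zero.
    """
    weights = [max(r, 0.0) for r in responses]
    total = sum(weights)
    if total == 0:
        raise ValueError("no positive response at any FMR")
    return sum(f * w for f, w in zip(fmrs, weights)) / total
```

The reason for excluding BR neurons is visible in a toy case: responses of 1, 0, and 1 at 10, 100, and 1,000 Hz/ms give a centroid of 505 Hz/ms, which lies between the 2 response peaks at a rate where the neuron responds poorly.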

FIG. 11.

Distribution of centroids for all cells (open bars) and BP cells only (stippled bars) in AL (top panels), ML (middle panels), and CL (bottom panels). Left panels: data for upward sweeps; right panels: data for downward sweeps. Other conventions as in Fig. 8.

FMR tuning and BF

To investigate whether FMR preference is correlated with the BFc, both PFMR and centroid of a neuron were plotted against its BFc. Such an analysis is useful to exclude the possibility that the differences in FMR tuning between areas are merely the result of a sampling bias in terms of best frequencies. In a regression analysis, there was a weakly positive correlation between PFMR and BFc in ML for downward sweeps (r = 0.34, n = 93, P = 0.0040, ANOVA), but not for upward sweeps (r = 0.25, n = 91, P = 0.1065, ANOVA) (Fig. 12). However, there was no correlation in AL for upward sweeps (r = 0.02, n = 31, P = 0.9518, ANOVA) and even a negative correlation for downward sweeps (r = 0.14, n = 31, P = 0.9413, ANOVA). In CL, no significant correlations were found in the regression analysis either (r = 0.21, n = 29, P = 0.3631, ANOVA and r = 0.16, n = 29, P = 0.1097, ANOVA for upward and downward sweeps, respectively). The same regression analysis was also performed between BFc and the preferred instantaneous FMR in octaves/s, and except for a negative correlation in AL and CL neurons for downward sweeps (r = 0.41, n = 31, P = 0.024, ANOVA and r = 0.38, n = 30, P = 0.039, ANOVA for AL and CL, respectively; Fig. 13) no significant correlations were found between best frequency and PFMR, meaning that the difference in PFMR between belt areas cannot be attributed to a sampling bias.

FIG. 12.

PFMRs as a function of BFc for each unit in AL (top panels), ML (middle panels), and CL (bottom panels) in upward (left panels) and downward (right panels) FM direction. Dashed line depicts the regression line of PFMR and BFc. Insets: histogram at the bottom of each graph displays the distribution of BFc in the sample. Bin width of the graphs is one half octave, i.e., 2^(0.5k) (k = −4, −3, …, 10) in kHz. n.s., not significant.

FIG. 13.

Instantaneous PFMRs in octaves/s as a function of BFc for each unit in AL (top panels), ML (middle panels), and CL (bottom panels) in upward (left panels) and downward (right panels) FM direction. Dashed line depicts the regression line of PFMR and BFc. Insets: histogram at the bottom of each graph displays the distribution of BFc in the sample. Bin width of the graphs is one half octave, i.e., 2^(0.5k) (k = −4, −3, …, 10) in kHz. n.s., not significant.

When using centroids instead of PFMR, the correlation with BFc became stronger in ML (r = 0.56, n = 91, P = 0.0010, ANOVA and r = 0.57, n = 93, P < 0.0001, ANOVA for upward and downward sweeps, respectively). However, there was again no significant correlation in AL (r = 0.14, n = 31, P = 0.2811, ANOVA and r = 0.14, n = 31, P = 0.0539, ANOVA for upward and downward sweeps, respectively) and only a weak correlation in CL for downward sweeps (r = 0.58, n = 29, P = 0.0110, ANOVA), but not for upward sweeps (r = 0.43, n = 29, P = 0.2825, ANOVA).

FM versus PT responses


It has been reported that neurons in the lateral belt areas respond poorly to PT stimuli (Merzenich and Brugge 1973; Morel et al. 1993; Rauschecker et al. 1995). This was also true in the present study. Whenever possible, however, PT responses were also assessed on the basis of PSTHs. Therefore a direct comparison of PT and FM responses could be made in 54 of the lateral belt neurons (Fig. 14). Overall, and in most individual cases, the neuronal response to FM was significantly stronger than to PT (P < 0.001, paired t-test). The same was found in the small sample of neurons from area CM, in which a direct comparison was performed.

FIG. 14.

Direct comparison between responses to FM and pure tone (PT) stimuli in a sample of neurons from the lateral belt. FM peak firing rate is plotted against PT peak firing rate in a scattergram for the same neurons. Dashed line is the diagonal representing equal response to either stimulus type. Most of the points are above the diagonal (dashed line), indicating that the responses to FM stimuli were stronger than those to PT stimuli. n, number of cells; P, significance level in paired t-test.


As mentioned above, the response to an FM sweep was triggered by a particular event during the FM sweep, and it was much shorter than the stimulus itself. Considering that the FM sweep range was chosen large enough to exceed the excitatory tuning range of the neuron tested on both the lower and higher ends, it is not surprising that the response was shorter than the whole stimulus. However, exactly which event triggered the response remains to be determined.

In a linear model, a response of a neuron to an FM sweep will be triggered when the instantaneous frequency (IF) of the FM sweep enters the excitatory tuning range of the neuron; the response reaches its maximum when the IF approaches the BF of the neuron; and the neuron will cease to fire when the IF leaves the excitatory tuning range of the neuron, just like a PT probe stimulus being used to explore the excitatory tuning range. Under linear conditions, this sequence of events should always hold regardless of the time the IF enters or leaves the excitatory tuning range (i.e., independent of FM duration or FMR). Thus in a linear system the response to FM sweeps is directly related to the PT responses; in other words, the response to an FM sweep can be derived from the PT responses. To probe whether the responses to FM sweeps in the lateral belt were indeed linear, rate–IF curves were created, similar to the rate–frequency curves in response to PT stimuli. The PSTHs in response to FM were transformed into rate–IF curves by substituting frequency for time, and the locations of local and global maxima in these curves were determined. Best instantaneous frequency (BIF) was defined as the IF at the global maximum in the rate–IF curve. Both procedures are described in detail in methods.
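The substitution of frequency for time can be sketched for a linear sweep, in which the instantaneous frequency at time t after sweep onset is the start frequency plus FMR times t. The actual procedure is described in methods; this sketch is only an illustration, and the optional latency correction, the function names, and the example values are assumptions. The sketch does not clip bins that fall after the end of the sweep.

```python
def psth_to_rate_if(bin_times_ms, rates, f_start_hz, fmr_hz_per_ms,
                    latency_ms=0.0):
    """Map PSTH bins to (instantaneous frequency, rate) pairs.

    fmr_hz_per_ms is signed: positive for upward, negative for downward
    sweeps. latency_ms is an assumed, optional response-latency correction.
    """
    curve = []
    for t, rate in zip(bin_times_ms, rates):
        t_stim = t - latency_ms
        if t_stim < 0:
            continue  # bin precedes the latency-corrected sweep onset
        curve.append((f_start_hz + fmr_hz_per_ms * t_stim, rate))
    return curve


def best_instantaneous_frequency(curve):
    """IF at the global maximum of the rate-IF curve (the BIF)."""
    return max(curve, key=lambda point: point[1])[0]
```

In a linear system the BIF obtained in this way would be the same at every FMR and would coincide with the BF; a BIF that shifts with FMR, as in the data reported here, indicates a departure from that prediction.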

Figure 15 shows examples of rate–IF curves at 6 different FMRs from the same neuron shown in Fig. 3. The BIFs were generally between about 3 and 4 kHz, but varied with the FMR. Exceptions were found in the rate–IF curves at 290 Hz/ms for upward sweeps and 36.25 Hz/ms for downward sweeps, where the BIFs were considerably higher (by >1 octave). The BIFs also deviated from the BFc obtained with BPN bursts, which was 2 kHz.

FIG. 15.

Comparison of rate–frequency curves for the neuron shown in Fig. 3 for band-pass noise (BPN) stimulation (A) and in response to FM sweeps (B, C) of different rate and direction. Curves in B and C are derived from PSTH responses, as described in methods; net firing rate is plotted as a function of instantaneous frequency (IF). Number above each peak indicates the frequency at the peak center. In all cases stimulus amplitude was 65 dB SPL. A: BPN response. B: responses to upward FM sweeps at 6 different FMRs (increasing from top left to bottom right panel: 9.06, 18.13, 36.25, 72.5, 145, and 290 Hz/ms, indicated in italics in the upper left-hand corner of each plot). C: responses to downward FM sweeps at different FMRs in the same sequence as for upward sweeps. Peaks in the FM response were at IFs that varied with FMR.

When plotted against corresponding FMRs, the BIFs of the lateral belt neurons were found to vary with FMR in most cases (see the 4 examples in Fig. 16). In some neurons the BIFs deviated considerably at the middle FMRs tested (Fig. 16, A and D), in others at lower FMRs (Fig. 16B), and in still others at higher FMRs (Fig. 16C). BIF also behaved differently in different FM directions (Fig. 16, B and D). Overall, no systematic relationship was found between BIF, FMR, and FM direction.

FIG. 16.

Best instantaneous frequency (BIF) plotted against FMR in 4 neurons. Neurons' identification numbers are displayed in the top left-hand corner of each graph. Open symbols: upward sweeps; closed symbols: downward sweeps. Horizontal dashed line indicates the BFc for each neuron determined with BPN bursts.

Comparison between BIF and BFc

As mentioned above, a linear model also predicts that the BIFs of a neuron should be close to its BF (when tested with PT) or its BFc (when tested with BPN). Use of BPN bursts is customary in the lateral belt because neurons are not very responsive to PT. In 159 of the 165 lateral belt neurons tested with FM sweeps it was possible to determine the BFc, so a direct comparison between BFc and BIF could be performed. In Fig. 15, the BFc was at 2 kHz, significantly removed from the BIFs at most FMRs. There were cases when the BIFs were close to the BFc over a substantial portion of the FMR range tested (Fig. 16A, upward sweeps at middle FMRs; C, both upward and downward sweeps at lower FMRs; and D, downward at the 2 extreme FMRs), but in many instances BIFs differed from BFc by a wide margin over most of the FMR range tested (Fig. 16B).

When the difference between BIF and BFc was averaged over the 6 different FMRs for each direction, the result was greater than 1/3 octave in 75% (24/32) and 78% (25/32) of AL neurons for upward and downward sweeps, respectively. The corresponding numbers were 78% (76/97) and 77% (75/97) for ML neurons, and 77% (23/30) and 77% (23/30) for CL neurons (Fig. 17, left panels). The distribution for upward and downward sweeps was almost identical in all 3 areas. More than 90% of lateral belt neurons had a SD larger than 1/3 octave (Fig. 17, right panels), indicating that the BIF varied to a large extent in lateral belt neurons. These data suggest that FM responses in lateral belt neurons are not based on linear mechanisms.
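The comparison in octaves can be sketched as follows. The text does not state whether the averaged difference is signed or absolute, or which SD estimator was used, so the sketch assumes the mean of absolute log2 differences and the sample SD of the signed differences; the function name and example values are hypothetical.

```python
import math
import statistics


def bif_bfc_summary(bifs_hz, bfc_hz):
    """Summarize BIF deviations from BFc across FMRs, in octaves.

    Returns (mean absolute difference, sample SD of signed differences).
    Both summary choices are assumptions, not the paper's stated method.
    """
    diffs = [math.log2(bif / bfc_hz) for bif in bifs_hz]
    mean_abs = statistics.mean([abs(d) for d in diffs])
    sd = statistics.stdev(diffs)  # requires at least 2 FMRs
    return mean_abs, sd
```

For a neuron with a BFc of 2 kHz whose BIF is 4 kHz at all 6 FMRs, the mean difference is 1 octave with an SD of 0; BIFs scattered between 2 and 4 kHz would instead produce a smaller mean difference but a larger SD, which is the kind of variability summarized in the right panels of Fig. 17.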

FIG. 17.

Distribution of averaged difference between BIF and BFc (BIF-BF, left panels) and its SD (right panels). AL, top panels; ML, middle panels; CL, bottom panels. Open blocks: upward sweeps; filled blocks: downward sweeps. Vertical dashed lines indicate the border at one-third octave.


FM versus PT responses: processing hierarchies

Neurons in all 3 lateral belt areas, AL, ML, and CL, were found to respond well to FM sweeps of various rates and direction. The peak firing rate in response to FM sweeps was significantly higher in most cases than to PT bursts of any frequency. This finding alone confirms previous reports that FM sounds are more adequate stimuli than PT bursts for neurons beyond the primary core areas (Tian and Rauschecker 1994, 1998). FM glides occur very commonly in many natural sounds, including those used for communication by many animal species as well as humans. The finding is also consistent with the notion that neurons respond to increasingly complex stimuli as one moves away from the primary sensory areas. FM sweeps with their frequency changing over time certainly have a more complex spectrum than stationary tone bursts.

Because our electrode penetrations were made into the exposed surface of the STG under visual guidance and we used histological track reconstruction, the penetrations were guaranteed not to go into primary auditory core areas, which are situated on the STP inside the LS (Merzenich and Brugge 1973; Morel et al. 1993; Rauschecker et al. 1997). At the same time, we are fairly confident that our recording locations were in the belt and not in the “parabelt” (Hackett et al. 1998; Morel et al. 1993) because they were placed as close as possible to the lateral sulcus. However, this can be decided with more certainty only on the basis of combining electrophysiological recording with histochemical staining (Hackett et al. 1998). The belt/parabelt border varies between studies (and perhaps individual monkeys) (Hackett et al. 1998; Morel et al. 1993). Future work will have to determine the exact borderline between these two regions by systematically exploring neuronal response properties along the mediolateral extent of the STG.

Nonlinearity of FM responses in the lateral belt

Neurons in the lateral belt areas responded to FM sweeps usually with a peak in the PSTH that was much shorter than the FM stimuli, and the onset of the peak changed clearly with FMR. This indicates that the response was triggered by a particular portion of the FM sweep. To determine which portion was responsible for the peak, rate–IF tuning curves were derived from the PSTHs and the BIF values were determined for each FMR. In lateral belt neurons, BIF values varied considerably with FMR, but no systematic relationship could be established between BIF and FMR. BIF values also differed significantly from the BF or BFc. These analyses suggest that responses to FM sweeps in lateral belt neurons are based on nonlinear mechanisms. Linear responses were found in A1 and AAF (Kowalski et al. 1995; Tian and Rauschecker 1994), but not in PAF (Tian and Rauschecker 1998), indicating a hierarchy of information processing in the cat's auditory cortex. Similarly, the nonlinear responses in lateral belt neurons confirm that the lateral belt represents a higher stage than A1 in the information processing hierarchy of the rhesus monkey.

FM rate and direction

Most neurons in the lateral belt areas were highly selective for the rate and direction of an FM sweep. In terms of direction selectivity, the proportions of neurons preferring upward or downward sweeps were roughly equal in ML and CL. More neurons in AL preferred downward than upward sweeps, but this might be attributable to the limited sample size. A higher proportion of neurons preferring downward sweeps was also found in cat A1 (Heil et al. 1992a; Mendelson and Cynader 1985), whereas more neurons in AAF and PAF preferred upward sweeps (Tian and Rauschecker 1994, 1998). The possible behavioral significance of these asymmetries has yet to be explained.

In terms of rate selectivity, lateral belt neurons can display low-, high-, or band-pass behavior. In addition, band-rejection (BR) behavior was found in some lateral belt neurons. Band-rejection is complementary to band-pass, but the function of BR neurons may be quite similar to BP neurons, as they filter out FM sounds at particular rates. In cats, BR was found in PAF but not in AAF neurons.

Specificity for different FM rates in AL and CL

The central finding of the present study is the clear difference in preferred FM rates between areas of the lateral belt. Neurons in AL generally responded better to slower FM sweeps (in the range of tens of Hz/ms), whereas neurons in CL responded best to very fast FM sweeps (in the range of hundreds of Hz/ms). ML neurons covered the whole range of FM rates, which is consistent with the lower processing level of ML compared with AL and CL (Hackett et al. 1998; Rauschecker and Tian 2000). We performed various control analyses to assure ourselves that the difference in preferred FM rates between areas was not the result of a sampling bias, whereby more neurons with low and high BFc would have been sampled in AL and CL, respectively. This was clearly not the case, nor was there a consistent correlation between BFc and PFMR, as might have been expected on the basis of prior data in the cat (Tian and Rauschecker 1994). In fact, there was even an inverse relationship between these 2 parameters in some instances. Importantly, a transformation of the data from a linear to a logarithmic scale (converting Hz/ms to octaves/s) demonstrated that the difference between AL and CL cannot merely be caused by the use of linear FM sweeps.

According to these differences, AL neurons would be very well suited to participate in the decoding of, among others, species-specific vocalizations, which range mostly between 8 and 50 Hz/ms (Table 1) (Hauser 1996; Rauschecker 1998). The various harmonics in the widely occurring “coo” calls fall between 10 and 40 Hz/ms, with higher harmonics displaying higher FM rates. Only some of the “screams” contain FM rates above 100 Hz/ms (tonal scream: 103 Hz/ms; arch scream: 314 and 826 Hz/ms for the downward portion). Some of the neurons in AL do include responses to these faster sweeps. It is noteworthy that screams play an important role as alarm calls, which have to be well localizable by members of the same species.

TABLE 1.

FMRs in some rhesus monkey calls

There was a trend for neurons in AL to be more selective for FM direction, which would appear to be consistent with their role in the decoding of auditory patterns. On the other hand, the fact that fewer neurons with BP-like FMR tuning were found in area AL may at first appear inconsistent with this idea. One might expect neurons involved in the processing of auditory patterns, specifically communication sounds, to be most likely to display BP characteristics. Instead, LP-like FMR tuning predominated in AL. This suggests that we may have underestimated the number of BP neurons in AL, given that the lowest FMR tested did not go low enough to reveal the low end of FMR tuning. Testing lower FMRs was not technically feasible because it would have increased the duration of the stimuli beyond realistic limits.

Fast FM sweeps with their broad spectrum are ideal stimuli for sound localization. The preference of CL neurons for faster FM rates is therefore consistent with a possible role of CL in auditory spatial processing, although it certainly does not exclude other roles of the caudal auditory belt. An involvement of the caudal belt in sound localization processes has previously been proposed on the basis of the significantly sharper spatial tuning of CL neurons compared with neurons in the anterior belt (Tian et al. 2001). In addition, neuronal responses in the caudal belt show a tighter coupling to the behavior of monkeys performing auditory spatial tasks (Recanzone et al. 2000).

Functional specialization of caudal and rostral belt: dorsal and ventral processing streams

Caudal belt and parabelt (areas Tpt and Tpo) send projections to area VIP in the inferior parietal cortex (Lewis and Van Essen 2000), which is also known for its involvement in spatial processing in both monkeys and humans (Bremmer et al. 2001). The inferior parietal lobule has also been implicated in the processing of auditory space and motion on the basis of functional imaging work (Bushara et al. 1999; Griffiths et al. 1996, 2000; Weeks et al. 1999) as well as lesion studies (Griffiths et al. 1997) in humans. Other recent imaging studies have confirmed a role for the human caudal belt in auditory motion processing (Warren et al. 2002) and in the disambiguation of sound sources in space (Zatorre et al. 2002). Conversely, a host of recent human studies, using fMRI, PET imaging, and lesion approaches, have demonstrated a role for rostral STG and STS in the processing of voice and speech sounds (Belin et al. 2000; Binder et al. 2000; Clarke et al. 2000; Scott et al. 2000).

An anatomical study in rhesus monkeys has demonstrated the existence of largely separate pathways originating in the lateral belt and projecting to different target regions in the prefrontal cortex (Romanski et al. 1999). In this study, 3 different fluorescent tracers were injected into matched frequency regions of the 3 belt areas after these had been physiologically mapped. Injections into area AL produced label in ventrolateral and orbital regions of prefrontal cortex, whereas CL injections led to labeling of dorsolateral prefrontal cortex. The latter is known for its involvement in spatial working memory, whereas the former regions are assumed to participate in object working memory (Goldman-Rakic 1996).

These projection patterns conform to the physiological response properties found in the aforementioned study of Tian et al. (2001), which assigned superior selectivity for auditory objects and space to areas AL and CL, respectively. The studies by Tian et al. (2001) and Romanski et al. (1999) thus form the cornerstones of a recent theory, according to which dual-processing streams in nonprimary auditory cortex underlie the perception of auditory objects and auditory space (Rauschecker and Tian 2000): one pathway projecting anteroventrally from A1 through AL and the rostral STG and STS into orbitofrontal cortex forms the main substrate for auditory pattern recognition and object identification. Another pathway projecting caudodorsally into posterior parietal and dorsolateral prefrontal cortex is thought to be involved in auditory spatial processing.

Other species


The present results can be compared with previous studies of nonprimary auditory cortex in the cat using FM stimuli identical to the ones used here (Tian and Rauschecker 1994, 1998). In the cat studies, the anterior auditory field (AAF) showed a preference for faster FM rates, whereas neurons in the posterior auditory field (PAF) preferred slower rates. Although this situation may appear exactly reversed at first sight as compared with the monkey, the results in the two species can be made congruent with each other, if one applies an argument first advanced by Jones (1985). According to this reasoning, the primate forebrain in evolution has added most of its volume in frontal and temporal regions. The growth of the temporal regions supposedly led to a rotation of the temporal lobe by almost 180°, which would also explain the reversal of the tonotopic gradients in A1 (and neighboring areas) in the two species. Thereby high frequencies are represented in anterior portions of A1 in cats but in posterior parts of A1 in primates, whereas the low-frequency region of A1, which lies caudally in cats, becomes represented rostrally in monkeys.


Selectivity for FM stimuli has been described for some time in the auditory cortex of bats (Fuzessery 1994; Fuzessery and Hall 1996; O'Neill 1985; Suga 1965, 1968). However, only recently has it become clear that a similar dichotomy of auditory functions may exist as described here for the rhesus monkey. The original findings of FM-FM and CF-FM neurons pertained clearly and exclusively to ranging behavior during echolocation, a type of auditory processing that belongs in the “where” category. More recently, however, it has become clear that the processing of FM sounds also has relevance for the processing of communication sounds, a type of “what”-processing in the bat (Kanwal et al. 1994; Ohlemiller et al. 1996). Our work in nonhuman primates has emphasized the importance of cortico-cortical processing hierarchies, although it also points to some earlier dichotomies in thalamocortical processing (Rauschecker et al. 1997). Some of the work in bats demonstrates that even earlier processing stages, such as the inferior colliculus, may have great importance for setting up these functional specializations, although they may not be as clearly separated in the bat (Gordon and O'Neill 1998, 2000).


A particularly interesting finding in the context of the present study is that of a hemispheric asymmetry for the processing of FM sounds in humans (Hickok and Poeppel 2000; Poeppel et al. 2004). Faster FM rates are preferentially processed in the left hemisphere, slower rates more frequently in the right hemisphere. A similar hemispheric asymmetry is also found for the processing of temporal variation in tone sequences (Zatorre and Belin 2001). Unfortunately, in the present study, recordings were performed exclusively from the left hemisphere, so that we cannot assess whether the same hemispheric asymmetry holds for monkeys. If so, this would provide further evidence for similar mechanisms regarding the processing of communication sounds in monkeys and humans.

Direction-selective cells in visual and auditory cortex

As has been argued previously, the selectivity of auditory cortical neurons for FM direction is comparable to the selectivity of visual cortical neurons for the direction of a moving light stimulus because the receptor surfaces in the visual and auditory systems are organized in 2-D retinal space and along the cochlear frequency axis, respectively. This arrangement results in the well-known retinotopic and cochleotopic organization of visual and auditory cortex, respectively. Thus a reasonable assumption is that the minicolumns that make up the texture of all cortical fields (Hubel and Wiesel 1977; Mountcastle 1957) contain a uniform machinery that applies the same algorithm to its thalamic input regardless of sensory modality. One of the results may be the existence of units that are asymmetrically selective for the direction of an adequate physical stimulus across the sensory receptor surface.

Direction-selective units in the visual system are known to derive their selectivity from highly nonlinear mechanisms (Barlow and Levick 1965; Reichardt 1987). An elaborate analysis of FM responses compared with PT responses in the present study has demonstrated beyond any doubt that the FM selectivity of lateral belt neurons is also made up of nonlinear mechanisms. This conclusion is not restricted to nonprimary auditory cortex but was also arrived at in most previous studies of FM-direction selectivity in A1 (e.g., Phillips et al. 1985). Further analysis may lead to an abstract formulation of direction-selective mechanisms independent of sensory modality and thus to a more generalized understanding of cortical processing algorithms underlying perception.


This study was supported by National Institute on Deafness and Other Communication Disorders Grant R01-DC-003489 to J. P. Rauschecker.


The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

