It has been hypothesized that the primate auditory cortex is composed of at least two processing streams, one of which is believed to selectively process spatial information. To test whether spatial information is differentially encoded in different auditory cortical fields, we recorded the responses of single neurons in the auditory cortex of alert macaque monkeys to broadband noise stimuli presented from 360° in azimuth at four different absolute intensities. Cortical areas tested were core areas A1 and rostral (R), caudal belt fields caudomedial and caudolateral, and more rostral belt fields middle lateral and middle medial (MM). We found that almost all neurons encountered showed some spatial tuning. However, spatial selectivity measures showed that the caudal belt fields had the sharpest spatial tuning, A1 had intermediate spatial tuning, and areas R and MM had the least spatial tuning. Although most neurons showed their best responses to contralateral space, best azimuths were observed across the entire 360° of tested space. We also noted that although the responses of many neurons were significantly influenced by eye position, eye position did not systematically influence any of the spatially dependent responses that we measured. These data are consistent with the hypothesis that caudal auditory cortical fields in the primate process spatial features more accurately than the core and more rostral belt fields.
The ability to localize sounds is a fundamental acoustic process that is critical for the survival of the individual and the species. A wide variety of studies have shown how binaural and monaural cues interact to generate different levels of sound localization acuity (see Blauert 1997). One important factor in sound localization ability is stimulus intensity: quiet sounds are not localized as well as louder sounds, but there is a large intensity range over which localization ability is very similar (Altshuler and Comalli 1975; Comalli and Altshuler 1976; Recanzone and Beckerman 2004; Sabin et al. 2005; Su and Recanzone 2001).
Earlier studies have shown that the auditory cortex is critical for sound-localization ability (Heffner and Heffner 1990; Jenkins and Merzenich 1984), yet very little is currently understood about how acoustic space is processed at the cortical level. Previous studies in anesthetized cats (e.g., Brugge et al. 1996; Imig et al. 1990; Middlebrooks and Pettigrew 1981; Middlebrooks et al. 1994; Rajan et al. 1990) and alert monkeys (Ahissar et al. 1992; Benson et al. 1981; Recanzone et al. 2000b) have shown that many auditory cortical neurons are spatially selective. This spatial selectivity is generally much broader than the sound localization ability of the animal, indicating that some sort of population coding must occur to give rise to the spatial perception (Eisenman 1974; Fitzpatrick et al. 1997; Furukawa et al. 2000; Jenison et al. 1998; Recanzone et al. 2000b; Skottun 1998; Stecker et al. 2005).
Studies in carnivores that have investigated the spatial tuning as a function of stimulus intensity have noted changes in receptive field size based on firing rate (Imig et al. 1990; Rajan et al. 1990), latency (Reale et al. 2002), and the pattern of action potentials (Middlebrooks et al. 1994, 1998). In these studies, it is commonly observed that the spatial receptive fields increase in size as the stimulus intensity is increased (Brugge et al. 1996; Middlebrooks and Pettigrew 1981; Middlebrooks et al. 1998; Mrsic-Flogel et al. 2005; Rajan et al. 1990). What remains unknown, however, is how the spatial tuning properties of macaque auditory cortical neurons change as a function of stimulus intensity.
The primate auditory cortex is composed of multiple cortical fields organized in a core-belt-parabelt fashion (Kaas and Hackett 2000; Rauschecker and Tian 2000). It has previously been proposed that information is processed in two parallel streams, a caudal pathway that selectively processes spatial information and a rostral pathway that selectively processes nonspatial information (Rauschecker 1998). Evidence in favor of this hypothesis is growing for both non-human primates (e.g., Rauschecker et al. 1995; Recanzone et al. 2000b; Tian et al. 2001) and humans (e.g., Barrett and Hall 2006; Krumbholz et al. 2005a,b; Zimmer and Macaluso 2005), but it is still not clear if this is indeed the case. To continue to test this hypothesis, we recorded the spatial tuning functions of single auditory cortical neurons in several different cortical fields in alert macaque monkeys. Stimuli were presented at four different stimulus intensities to all neurons and were presented from 16 different locations spanning 360° in azimuth at the elevation of the interaural axis. These studies were conducted to determine if there are differences in the spatial tuning functions in azimuth across six different cortical areas. We recorded from two core areas (A1 and the rostral field, R), two belt areas that are believed to form part of the spatial processing stream (the caudomedial and caudolateral fields), and two belt areas that are hypothesized not to contribute to spatial processing, at least to the same extent (the middle medial and middle lateral fields). These results have been presented previously in abstract form (Woods et al. 2001).
Animal and tasks
Detailed methods on the animal preparation have been previously published (Recanzone et al. 2000a) and will be summarized here. Three adult (aged 5–12 yr) male macaque monkeys weighing 7–12 kg over the course of the study were used (monkeys F, G, and L). All monkeys were trained to sit in a primate chair that was custom built to minimize acoustic reflections. Animals were fluid restricted following U.C. Davis guidelines to motivate them to perform a behavioral task for fluid reinforcement.
Monkeys F and L were trained to depress a lever to initiate a trial. Three to seven stimuli of different intensities and locations were presented (S1 stimuli) before the same stimulus intensity and location was presented a second time (S2 stimuli). The interstimulus interval was 800 ms. Immediately after the S2 stimulus offset, the solenoid providing fluid reinforcement would open briefly (and audibly). If the monkey then released the lever within 800 ms, the solenoid would open again for a longer time to provide the fluid reinforcement. If the monkey released the lever prior to this time, no fluid reinforcement was provided and a brief time-out followed. Monkeys were thus attending to the acoustic environment but were not required to discriminate the location of the stimulus to receive a reward. This ensured that the monkeys maintained a steady state of alertness throughout the session. We were unable to train monkey G to perform this task, and therefore the animal simply received a fluid reward after three to seven stimuli as with the other monkeys. Closed circuit video monitoring and the fact that the monkey was receiving fluids every several seconds assured that the animal did not fall asleep and remained alert throughout the session. After the delivery of the fluid reinforcement, there was a time-out of 3–5 s to allow the monkey to finish licking and swallowing before the next trial began, which prevented a consistent self-induced acoustic stimulus.
Monkeys F and L were also trained to perform a fixation task. Eye position was monitored using a video eye-tracking system (Model 501; Applied Science Laboratories, Bedford, MA). A small light-emitting diode (LED) blinked until the monkey fixated this target to within 2°. The monkey then had to maintain fixation as well as depress the lever throughout the trial. For all sessions, the S1 stimulus location and intensity were randomly interleaved such that any particular stimulus could occur at any of the different S1 intervals with equal probability across a session. All animal procedures were approved by the Animal Care and Use Committee at U.C. Davis and followed PHS and Society for Neuroscience guidelines.
Apparatus and stimuli
All experiments were performed in a double-walled acoustic chamber (IAC, New York) measuring 2.4 × 3.0 × 2.0 m (l × w × h; inner dimensions) and lined with 3-in echo-attenuating foam. A speaker array of 16 speakers located 1 m from the center of the interaural axis of the monkey spanned 360° in 22.5° steps and was in the plane of the interaural axis in elevation. Speakers were 2 in in diameter and had a flat frequency profile from 500 to 12,000 Hz with a 6 dB/octave roll off at higher and lower frequencies. Acoustic stimuli were generated using TDT hardware and software (TDT, Gainesville, FL) controlled by a personal computer. Acoustic stimuli were 200-ms duration Gaussian noise with 5-ms linear on-off ramps. Stimuli were generated independently before each presentation and were therefore “unfrozen.” These stimulus intensities were measured at the center of the apparatus in the absence of the monkey using a ½-in microphone oriented directly toward each speaker. Stimulus intensities were set at 25, 35, 55, and 75 dB SPL (average intensity; A-weighted) and on each trial the intensity was randomly varied ± 2 dB in 0.5-dB steps. These two procedures prevent the stimulus from being identifiable given the idiosyncratic frequency response profile of each speaker.
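The stimulus description above can be sketched in a few lines of Python. This is a minimal illustration, not the TDT implementation used in the study: the sample rate, the azimuth labeling (negative values to the left, 0° straight ahead), and all function and constant names are assumptions introduced here.

```python
import random

SAMPLE_RATE_HZ = 50_000  # assumption: the text does not give a sample rate

# Assumed labeling: 16 speakers in 22.5 degree steps, negative = left, 0 = front.
SPEAKER_AZIMUTHS_DEG = [22.5 * i - 157.5 for i in range(16)]
NOMINAL_LEVELS_DB = [25, 35, 55, 75]


def noise_burst(duration_ms=200.0, ramp_ms=5.0, rate_hz=SAMPLE_RATE_HZ, rng=random):
    """Draw a fresh ("unfrozen") Gaussian noise burst with linear on/off ramps."""
    n = int(round(duration_ms * rate_hz / 1000.0))
    n_ramp = int(round(ramp_ms * rate_hz / 1000.0))
    samples = [rng.gauss(0.0, 1.0) for _ in range(n)]
    for i in range(n_ramp):
        gain = i / n_ramp            # rises linearly from 0 toward 1
        samples[i] *= gain           # onset ramp
        samples[n - 1 - i] *= gain   # offset ramp (mirror image)
    return samples


def roved_level_db(nominal_db, rove_db=2.0, step_db=0.5, rng=random):
    """Jitter the nominal level by up to +/- rove_db in step_db increments."""
    n_steps = int(round(rove_db / step_db))
    return nominal_db + step_db * rng.randint(-n_steps, n_steps)
```

Calling `noise_burst()` before every presentation yields a different waveform each time, and `roved_level_db(55)` returns a level between 53 and 57 dB in 0.5-dB steps; together these mimic the two procedures that keep individual speakers from being identifiable.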
Each monkey underwent an MRI session in a 1.5 T scanner where standard 1-mm slices were collected in the coronal (axial) plane. These images were used to target the recording cylinder placement and to aid in reconstructing electrode tracks and assigning neurons to different cortical areas. The animal was then implanted with a head post and recording cylinder over the left auditory cortex in a sterile surgical procedure. The recording cylinder was oriented in the vertical plane to allow the electrode to penetrate the superior temporal gyrus from a roughly orthogonal direction (Pfingst and O'Connor 1980). Recording cylinders were either circular, allowing 1.4 × 1.4-cm access, or elliptical in shape, allowing 3.4 × 1.4-cm access (Crist Instruments, Hagerstown MD). A plastic grid was inserted into the cylinder, and a 28-gauge guide tube (Crist Instruments) was inserted through the dura into the superficial cerebral cortex 1–5 mm above the superior plane of the superior temporal gyrus. Tungsten microelectrodes (FHC, Bowdoinham, ME) were advanced into the cerebral cortex using a hydraulic microdrive. Neuronal signals were filtered, amplified, and displayed on an oscilloscope and audio monitor using conventional techniques. The search stimuli consisted of tone and/or noise bursts, band-passed noise, clicks, and vocalizations presented from 90° to the right or straight ahead. Other azimuths were tested if neuronal activity was encountered. During this procedure, monkeys F and L had to depress the lever to initiate a trial and release it when the stimulus changed location to receive a fluid reward, whereas monkey G sat quietly and received fluid rewards intermittently. Once neuronal activity that was driven by acoustic stimuli was encountered, single-neuron waveforms were isolated using a time-amplitude window discriminator (BAK, Mount Airy, MD).
Action potentials were time stamped on the computer at 1-ms resolution from stimulus onset and continued for 350 ms. All data in this report are from neurons in which the unit isolation was stable, and data were collected for ≥12 randomly interleaved trials of each of the 16 locations and four intensities. In addition, one trial type in which no stimulus was presented was included to measure the spontaneous activity. Once a single session was completed, the electrode was removed and the animal was returned to its home cage.
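As a rough illustration of the stimulus set described above, the following sketch builds a randomly interleaved presentation list from the 16 locations, four intensities, and one no-stimulus condition. It flattens the design into a single list and ignores the grouping of stimuli into behavioral trials; the names and the azimuth labeling are assumptions made for this example.

```python
import random

# Assumed labeling of the 16 speaker locations (negative = left, 0 = front).
AZIMUTHS_DEG = [22.5 * i - 157.5 for i in range(16)]
LEVELS_DB = [25, 35, 55, 75]


def build_schedule(n_repeats=12, rng=random):
    """Return a shuffled list of presentations; None marks a no-stimulus trial."""
    conditions = [(az, db) for az in AZIMUTHS_DEG for db in LEVELS_DB]
    conditions.append(None)  # catch condition used to estimate spontaneous activity
    schedule = conditions * n_repeats
    rng.shuffle(schedule)
    return schedule
```

With the minimum of 12 repeats, this gives 65 × 12 = 780 presentations per neuron.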
In some experiments the eye position was also monitored. Eye position was measured using the ASL video eye-tracker calibrated using a nine-point array that subtended a visual angle of ±20°. The eye position was measured at the offset of each stimulus presentation, and if it was outside of a 2° window around the fixation point, the trial was aborted, no fluid reinforcement was provided, and a brief time-out followed. The eye position at the end of this period was also stored on the computer. For each neuron, a session in which fixation was required was immediately followed by one in which fixation was not required, or the reverse. The order of the two sessions was randomized and roughly equal across neurons.
After all experiments, monkeys G and L were given an overdose of sodium pentobarbital and perfused through the heart with normal saline followed by 4% paraformaldehyde in 0.1 M phosphate buffer, pH 7.2. The brains were removed, blocked, postfixed overnight in 4% paraformaldehyde, infiltrated with 30% sucrose, and frozen in dry ice. Blocks of auditory cortex were cut into 25-μm-thick sections on a freezing microtome, and alternate sections were stained with thionin to reconstruct electrode tracks.
For each neuron, poststimulus time histograms (PSTH) were constructed using the average of a sliding window 5 ms in duration that stepped 1 ms for each average. The period of analysis was from stimulus onset to 150 ms after stimulus offset (350 ms total) to encompass any potential offset response unless otherwise indicated. Statistical tests were either a paired or an unpaired two-tailed t-test unless otherwise indicated in results. A tuning index was determined at each of the four intensities using the formula: 1 − (worst response/best response) where the worst response was the lowest mean number of action potentials per stimulus presentation and the best response was defined as the greatest mean number of action potentials per stimulus presentation for that particular stimulus intensity. Thus the best and worst were not necessarily in opposing (180°) locations. This metric varies from 0 to 1 and takes overall firing rate into account. It has been used successfully in previous studies where stimulus configurations are circular (e.g., Baker et al. 1981). The vector strength was calculated following previous procedures (Goldberg and Brown 1969). In our case, the 360° in azimuth was taken as a single cycle. The Rayleigh statistic (Mardia and Jupp 2000) was calculated as the vector strength squared times two times the number of action potentials used to calculate the vector strength. This metric is inappropriate to use to determine statistical significance as all of the action potentials in a single trial are assigned the same phase and therefore are not distributed across phase (Mardia and Jupp 2000). This value does, however, give a good indication of tuning strength that is unbiased by low firing rates.
Stimulus bandwidth was determined by first determining the maximum firing rate for a given intensity. The greatest number of sequential locations with responses >50% of this value was then determined and multiplied by 22.5 to define the bandwidth for that stimulus intensity. The best azimuth was defined as the location that elicited the greatest mean number of action potentials for each stimulus. First-spike latency was defined as the mean time of the first action potential on a trial for a particular stimulus intensity and location that was ≥10 ms after stimulus onset. This was chosen as neurons with high spontaneous rates commonly had first spike latencies in the unphysiological range (e.g., <10 ms).
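The spatial tuning metrics defined above can be sketched as follows for a single stimulus intensity. Each function takes the mean spike count at each of the 16 azimuths. The function names and azimuth labeling are assumptions, and the bandwidth routine additionally assumes that a run of adjacent locations may wrap around the circular array, which the text does not state explicitly.

```python
import math

# Assumed labeling of the 16 speaker locations (negative = left, 0 = front).
AZIMUTHS_DEG = [22.5 * i - 157.5 for i in range(16)]


def tuning_index(mean_counts):
    """1 - (worst response / best response) across azimuths."""
    best, worst = max(mean_counts), min(mean_counts)
    return 1.0 - worst / best if best > 0 else 0.0


def vector_strength(mean_counts, azimuths_deg=AZIMUTHS_DEG):
    """Length of the response-weighted mean vector; 360 degrees is one cycle."""
    total = sum(mean_counts)
    if total == 0:
        return 0.0
    x = sum(c * math.cos(math.radians(a)) for c, a in zip(mean_counts, azimuths_deg))
    y = sum(c * math.sin(math.radians(a)) for c, a in zip(mean_counts, azimuths_deg))
    return math.hypot(x, y) / total


def rayleigh_statistic(mean_counts, n_spikes, azimuths_deg=AZIMUTHS_DEG):
    """Two times the number of spikes times the vector strength squared."""
    return 2.0 * n_spikes * vector_strength(mean_counts, azimuths_deg) ** 2


def bandwidth_deg(mean_counts, step_deg=22.5, criterion=0.5):
    """Longest run of adjacent locations above criterion * peak, in degrees."""
    peak = max(mean_counts)
    above = [c > criterion * peak for c in mean_counts]
    if all(above):
        return len(mean_counts) * step_deg
    longest = run = 0
    for flag in above + above:  # walk the circle twice to catch wrap-around runs
        run = run + 1 if flag else 0
        longest = max(longest, run)
    return longest * step_deg


def best_azimuth(mean_counts, azimuths_deg=AZIMUTHS_DEG):
    """Location that elicited the greatest mean response."""
    return azimuths_deg[mean_counts.index(max(mean_counts))]
```

A neuron that responds at only one location has a tuning index and vector strength of 1 and a bandwidth of 22.5°, whereas a neuron that responds equally at all locations has a tuning index and vector strength of 0 and a bandwidth of 360°.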
The results of this study are based on the recordings of 1,019 single neurons in the left hemisphere of three monkeys. The cortical area was assigned for each neuron based on the physiological response properties, the location within the recording grid (Fig. 1) (see Recanzone et al. 2000a), and the cytoarchitectonic appearance in Nissl stains and is shown in Table 1. Figure 1 shows the relative position of these six cortical areas as measured in the recording grid. Neurons in the primary auditory cortex (A1; red) and the rostral field (R; purple) were defined based on the short-latency, vigorous response to tone stimuli, narrow frequency bandwidth, and the progression and reversal of characteristic frequency. A1 and R neurons share the low-frequency border (Merzenich and Brugge 1973; Recanzone et al. 2000a), so neurons were assigned to A1 until that border reversed. Thus a small minority of area R neurons in this border region may have been inappropriately classified as A1 neurons. Belt areas (CM; green, CL; dark blue, ML; light blue, and MM; yellow) were similarly defined by the response latency, responses to tone, narrow band and broadband stimuli, broader frequency bandwidth, and progression of best band-passed stimulus (see Rauschecker and Tian 2004; Rauschecker et al. 1995). The borders between core and belt areas were easily defined (see Recanzone et al. 2000a); however, the border between adjacent belt areas was more ambiguous and a few neurons in these border regions may have been inappropriately classified. Histological verification was made in two monkeys (G and L) and was consistent with the physiological definitions. Monkey F is still actively participating in experiments.
Percentage of spatially tuned neurons
The first level of analysis was to determine if the neuron had a statistically significant response to any of the 64 stimuli presented in each session. Although all neurons showed clear responses to auditory stimuli, this could be restricted to tones or other stimuli not presented within the session. We first tested if the response elicited by the stimulus that produced the greatest response was statistically significantly greater than the spontaneous activity (t-test; P < 0.05) to define significant excitatory activity. We also tested if the response elicited by the stimulus that produced the weakest response was statistically significantly different from the spontaneous activity (P < 0.05) to define significant inhibitory activity. For cortical areas in which not all recorded neurons passed one of these two tests, the number of neurons that was significantly responsive is shown parenthetically in Table 1. Overall, ≥95% of neurons tested in each monkey showed statistically significant responses, and >98% of all neurons tested were significantly responsive. All subsequent analysis described in the following text was restricted to the neurons with significant responses.
Figure 2 shows the response of a single neuron to all 12 trials of each of the 64 stimuli tested. Each column represents the stimulus of a given intensity, and each row shows the response for a given stimulus azimuth with negative numbers corresponding to leftward azimuths and 0 corresponding to directly in front of the monkey. The first feature to note is that this neuron showed a response to virtually every stimulus presented, but the magnitude of the response varied by both intensity and azimuth. The weakest response was for azimuths directly in front of the monkey (0°) and the most vigorous response was at 135° to the right. There is also a consistent onset response, followed by a decrease in activity, and then a second burst of activity starting ∼150 ms after stimulus onset. This second burst of activity was strongest for the highest intensity stimuli. Across trial types, particularly at the highest intensity, there was very little variance in the latency, although there was an indication that the latency of the response increased as the magnitude of the response increased, with the latency for 0° being shorter than the latency for azimuths farther behind the animal. This was apparent for all four intensities tested, but least pronounced for the highest intensity stimuli.
A second example neuron is shown in Fig. 3. This neuron showed other features that were characteristic of the rest of our sample, for example the very weak, if any, response to ipsilateral locations. There was also considerably less structure in the response latency or in the pattern of action potentials for stimuli with very weak responses. The appearance of a second peak of activity near the end of the stimulus is also apparent in this neuron.
The clear modulation of the response as a function of intensity and azimuth was characteristic of most neurons, but there was a wide variety of response types within our sample. Figure 4 shows 12 other example neurons that represent a sampling of the types of neuronal responses we encountered. Each panel shows the PSTH taken for the azimuth and intensity that evoked the weakest response (left) and the greatest response (right). The line plot underneath shows the mean spikes per trial measured for 350 ms from stimulus onset for each intensity (blue, red, green, and light blue for 75, 55, 35, and 25 dB SPL, respectively) as a function of azimuth, with the vertical line showing 0°. The dashed line shows the spontaneous firing rate. The type of response that has been commonly observed in previous studies in anesthetized cats (e.g., Imig et al. 1990; Rajan et al. 1990) is shown in Fig. 4A. This neuron was most sharply tuned in azimuth for the lowest intensity stimulus (25 dB SPL; light blue). As the stimulus intensity increased, there was an increase in the overall firing rate but also an increase in the number of azimuths that evoked a response. Although this response profile was often encountered, there were numerous other examples that we observed as well. Figure 4, B–D, shows neurons that had weak or no response to the lowest intensity stimuli, but clear spatial tuning at higher intensities, and this tuning also broadened as the firing rate increased. Figure 4, E–G, shows examples of neurons that had sharp spatial tuning that was consistent across intensities, but only the overall firing rate changed. Figure 4, E and G, shows neurons where the lowest intensity stimulus produced the greatest response, whereas F and J show neurons that had their best response to an intermediate sound intensity. Figure 4H shows a neuron that had a vigorous response but was not tuned to any of the stimulus intensities or azimuths tested.
The bottom row of panels shows examples of neurons with two distinct peaks in the tuning function. These neurons could have similar or different spatial tuning functions depending on stimulus intensity, and could show only excitation (e.g., Fig. 4, J and K) or excitation and inhibition (Fig. 4, I and L). What is also clear from this figure is that the full range of response profiles, from phasic onset responses (Fig. 4C) to sustained tonic responses (Fig. 4I), was observed.
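The PSTHs in these panels follow the sliding-window procedure given in the methods (a 5-ms window stepped in 1-ms increments over the 350-ms analysis period). A minimal sketch of that computation is shown here; the conversion to spikes per second and the function name are assumptions, as the text does not specify the units of the histogram.

```python
def sliding_psth(trials_ms, window_ms=5, step_ms=1, duration_ms=350):
    """Average firing rate in a sliding window.

    trials_ms: one list of spike times (ms from stimulus onset) per trial.
    Returns one value per window start, in spikes/s (assumed unit).
    """
    n_trials = len(trials_ms)
    psth = []
    for start in range(0, duration_ms - window_ms + 1, step_ms):
        count = sum(
            1
            for spikes in trials_ms
            for t in spikes
            if start <= t < start + window_ms
        )
        psth.append(1000.0 * count / (n_trials * window_ms))
    return psth
```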
A second feature of these neural responses, particularly evident from Fig. 2, is that the latency of the response can also vary as a function of stimulus intensity and azimuth. We therefore analyzed the data with respect to the mean first-spike latency (see methods). The resulting tuning functions for the same 12 neurons shown in Fig. 4 are shown in Fig. 5. Overall, although some neurons showed reasonable tuning functions (e.g., Fig. 5A and to a lesser extent D and J), they were clearly more noisy and did not show specificity as clearly as in the firing rate functions. This may be due in large part to the difficulty in defining the first-spike latency given the high spontaneous rate in some neurons and/or the inhibition of the response at some azimuths. We tried several other methods, none of which showed tuning functions that were any more structured than those shown in Fig. 5.
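The mean first-spike latency used for this analysis can be sketched as follows, applying the ≥10 ms criterion from the methods. The function name and the choice to skip trials with no qualifying spike are assumptions made for this example.

```python
def mean_first_spike_latency(trials_ms, min_latency_ms=10.0):
    """Mean time of the first spike at or after min_latency_ms on each trial.

    trials_ms: one list of spike times (ms from stimulus onset) per trial.
    Trials without a qualifying spike are skipped (an assumption); returns
    None when no trial contributes a latency.
    """
    firsts = []
    for spikes in trials_ms:
        valid = [t for t in spikes if t >= min_latency_ms]
        if valid:
            firsts.append(min(valid))
    return sum(firsts) / len(firsts) if firsts else None
```

For example, trials with spike times `[3, 22, 40]` and `[18, 60]` contribute latencies of 22 and 18 ms, because the spontaneous spike at 3 ms is excluded, giving a mean of 20 ms.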
To determine which neurons had statistically significant tuning with respect to firing rate, we compared the spikes collected in each trial for those elicited by the stimulus at the best azimuth to those elicited by the stimulus at the worst azimuth (t-test; P < 0.01). The results are shown in Table 2. For all cortical areas, the majority of neurons showed significant spatial tuning using this criterion at each intensity presented (except area MM at 25 dB SPL). Additionally, the percentage of neurons that were significantly tuned decreased as the stimulus intensity decreased. However, the magnitude of this decrease was least for the caudal and lateral fields (12, 10, and 15% for ML, CM, and CL, respectively, between the percentage of neurons significant at 75 dB SPL compared with 25 dB SPL) compared with A1 (29%), MM (39%), and R (36%). This indicates that the population of neurons in the caudal and lateral fields better retain their spatial selectivity at low intensities compared with the other three fields tested.
Extent of spatial tuning
Given the large percentage of neurons that were spatially tuned, we next addressed the extent of spatial selectivity using three different metrics. The first was a tuning index based on a metric used previously in studies of direction of motion tuning in extrastriate visual cortex (Baker et al. 1981). We compared the results across monkeys and found no significant differences (ANOVA; all P > 0.05) and therefore pooled the data across monkeys (Fig. 6). Across different cortical areas, what was most obvious was that the tuning index within the population of neurons was not affected by the stimulus intensity as indicated by the superposition of the distributions for the four intensities. Area A1 neurons were intermediately tuned, whereas caudal and lateral fields had a higher percentage of neurons with a larger tuning index compared with rostral and medial fields. The same analysis based on the first-spike latency showed a different result. In this case, there was no difference between areas A1, CM, CL, and ML, so the results from these four areas were combined and the distribution is shown in Fig. 7A. The tuning index for these neurons is considerably lower when measured by latency as compared with firing rate with the median on the order of 0.65. Areas MM and R were different from the other cortical fields and are shown separately in Fig. 7, B and C. MM had the lowest tuning indexes, whereas area R had the highest, largely due to a dearth of neurons with a very low tuning index. We conclude that firing rate provides much better spatial information than first-spike latency across these six cortical fields.
The second metric we explored was the azimuth distance where the neuron responded with ≥50% of its greatest response for that stimulus intensity. This commonly resulted in multiple peaks within the spatial tuning curve (e.g., Fig. 4, I–L), although neurons that were weakly tuned could have multiple peaks that met this criterion. Figure 8 shows the medians (symbols) and first and third quartiles (bars) for neurons in each area measured in each monkey. This analysis revealed that neurons in area CL showed the sharpest tuning across all monkeys. There was considerable variability between monkeys in the other cortical areas, however, with monkey G consistently having the largest bandwidths and monkey F generally having the narrowest bandwidths. It should also be noted that the bandwidth did not systematically change as a function of stimulus intensity across areas or monkeys. For example, monkey G showed a clear increase in bandwidth with increasing stimulus intensity for A1 neurons (Fig. 8A, □), but monkeys F and L did not (▪ and □) nor did monkey G show this change with stimulus intensity in other cortical areas (e.g., areas ML and R, Fig. 8, D and F).
The same analysis based on the first-spike latency is shown in Fig. 9. In this case, the number of azimuths that were <50% of the longest latency contributed to this metric. As with the tuning index, the latency measurement results in much broader bandwidths. In contrast, however, there are differences between the different cortical fields with CL and CM showing the smallest bandwidths and area R showing the broadest. This was also true when one considers the smallest bandwidths, which were most often observed in CL and CM (data not shown).
The third metric was to calculate the Rayleigh statistic, which takes into account all tested azimuths as well as the overall firing rate. To get an idea of how the spatial tuning was dependent on the time of the response, this statistic was calculated for time periods starting at stimulus onset and continuing in 10-ms epochs up to 200 ms. The results are shown in Fig. 10. At each stimulus intensity, neurons in area CL showed the sharpest spatial tuning. As the stimulus intensity was decreased, these functions began to show an interesting pattern across cortical areas. At the highest intensity, there appears to be little pattern in which areas have the sharpest tuning except for CL being the sharpest. As the stimulus intensity decreases, areas CM and ML begin to show sharper tuning relative to their original positions, whereas areas MM and R begin to show the broadest tuning. At the lowest intensity tested, CL has the sharpest tuning, CM and ML have the next sharpest, MM and R have the weakest tuning, and A1 has only moderate spatial tuning. The second thing to note is that, regardless of cortical area or intensity, these functions rise sharply, and then tend to flatten out after ∼150 ms. Thus there appears to be little gain in spatial information processing after the initial 150 ms.
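One way to compute such a time-resolved Rayleigh statistic is sketched below. It assumes that each time period is a cumulative window running from stimulus onset to 10, 20, ..., 200 ms, which is one reading of the description above; the function name and data layout are likewise assumptions. Raw spike counts are used in place of mean counts, which gives the same vector strength when every azimuth has the same number of trials.

```python
import math


def rayleigh_over_time(spikes_by_azimuth, step_ms=10, max_ms=200):
    """Rayleigh statistic in cumulative windows from stimulus onset.

    spikes_by_azimuth: {azimuth_deg: [spike-time list per trial, in ms]}.
    Returns a list of (window_end_ms, rayleigh_statistic) pairs.
    """
    results = []
    for end in range(step_ms, max_ms + 1, step_ms):
        x = y = n = 0.0
        for az_deg, trials in spikes_by_azimuth.items():
            count = sum(1 for spikes in trials for t in spikes if 0 <= t < end)
            x += count * math.cos(math.radians(az_deg))
            y += count * math.sin(math.radians(az_deg))
            n += count
        strength = math.hypot(x, y) / n if n else 0.0  # vector strength
        results.append((end, 2.0 * n * strength * strength))
    return results
```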
Effects of azimuth and intensity
The next issue we wished to address was to determine which azimuths were best represented by these neurons. For each cell, we calculated the best azimuth as the azimuth that elicited the greatest response, regardless of the stimulus intensity. Comparisons between monkeys showed that there was no significant difference in these distributions between animals (t-test; all P > 0.01); therefore, the results were pooled and are shown in Fig. 11. Across monkeys and cortical areas, two main features emerged. First, the majority of neurons were tuned to azimuths in the contralateral hemifield (toward the monkey's right) with the peak near 90°. Second, although fewer in number, best azimuths in the ipsilateral hemifield were also noted, such that the entire region of azimuth had some representation. Across cortical fields, however, there was little if any difference in the distributions.
Although most azimuths were represented by at least a few neurons, it remains unclear how strong these signals are for the different azimuths. To address this, we normalized the response of each neuron by the peak response to the 64 different stimuli and pooled these responses across all neurons within a given cortical area and monkey. These results are shown in Fig. 12. In each panel, the four curves represent the mean normalized firing rate at each stimulus intensity (see inset) and the spontaneous activity (dashed line) of the monkey where the most neurons were recorded for that cortical area (see Table 1). We chose to illustrate these data as those found in individual monkeys were similar but could show more noise if the number was small (i.e., monkey F in ML or monkey G in MM). Again, there is a clear bias for azimuths within the contralateral hemifield so that the contralateral hemifield is well represented by these neurons. One consistent feature across all cortical areas and monkeys is that the greatest pooled responses were noted for the loudest stimuli with the lowest intensity stimulus often being barely above the spontaneous rate (e.g., area R, Fig. 12F). This is consistent with the decreased sound localization ability at near threshold sounds in both humans and monkeys (Recanzone and Beckerman 2004; Sabin et al. 2005; Su and Recanzone 2001). What is also apparent is that some cortical areas show steeper functions than others with CL and CM having steeper functions than A1, ML, and MM. Area R had a steep “step”-like function near the midline that was relatively flat in the two hemifields. These data are consistent with caudal fields having better spatial resolution, as a population, compared with core or rostral fields.
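The normalization and pooling step described above can be sketched as follows. Each neuron's responses are divided by its peak response across all 64 stimuli and then averaged across neurons, condition by condition. The data layout (a dictionary keyed by azimuth and intensity) and the function names are assumptions made for this example.

```python
def normalize_neuron(responses):
    """Divide each response by the neuron's peak response across all stimuli.

    responses: {(azimuth_deg, level_db): mean spikes per presentation}.
    """
    peak = max(responses.values())
    if peak <= 0:
        return {key: 0.0 for key in responses}
    return {key: value / peak for key, value in responses.items()}


def pooled_response(neurons):
    """Mean normalized response per stimulus condition across neurons."""
    normalized = [normalize_neuron(r) for r in neurons]
    return {
        key: sum(n[key] for n in normalized) / len(normalized)
        for key in normalized[0]
    }
```

Because every neuron is scaled to a peak of 1 before averaging, neurons with high firing rates do not dominate the pooled curves.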
One clear feature of the data shown in Fig. 12 is that the pooled responses were greatest for the highest intensity stimuli, but this was not necessarily the case for the individual neurons (e.g., Fig. 4, E–G, I, and J). To determine how frequent these nonmonotonic neurons were in our sample, we determined the intensity that elicited the greatest response for each neuron. These intensities are plotted in Fig. 13 for all monkeys and cortical areas. This analysis revealed that the most common best intensity was the greatest intensity tested. The clearest example of a gradual increase in the percentage of neurons having louder best intensities is perhaps shown for area A1, but area CM and to a lesser extent CL show similar behavior. What is interesting is that area MM has many neurons that had best intensities at the middle intensities tested, and area R showed very few neurons that had their best response to stimulus intensities <75 dB SPL.
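The best-intensity tally described above can be sketched as follows, using the same assumed data layout as a dictionary keyed by azimuth and intensity. The function names are hypothetical.

```python
from collections import Counter


def best_intensity(responses):
    """Intensity of the stimulus that elicited the greatest response.

    responses: {(azimuth_deg, level_db): mean spikes per presentation}.
    """
    (_, level_db), _ = max(responses.items(), key=lambda item: item[1])
    return level_db


def best_intensity_histogram(neurons):
    """Count how many neurons have each best intensity."""
    return Counter(best_intensity(r) for r in neurons)
```

A neuron whose best intensity is below the highest level tested would count as nonmonotonic in the sense used here.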
The last consideration is how eye position may have influenced the activity of these neurons. Two different laboratories have shown that eye position does influence the activity of auditory cortical neurons in the alert macaque monkey (Fu et al. 2004; Werner-Reiss et al. 2003). Similar findings have also been noted in the inferior colliculus (Porter et al. 2006). The vast majority of the data of this report (943 neurons) were collected where a fixation point was provided, but eye position was not measured, and the animals were not required to fixate any stimulus. For a subset of these neurons (76), we did measure eye position in two sessions run consecutively. In one session, eye position was monitored but the animal was not required to fixate, i.e., the same conditions as for all data reported so far. In the other session, the animal was required to fixate throughout the trial to receive a reward. The two sessions were counter-balanced, and two monkeys (F and L) participated. If eye position did have an influence on our results, we would expect a greater variance in the neuronal response when the monkey was not required to fixate the central stimulus. We concentrated our efforts on CL, which had shown the sharpest tuning using different metrics, and in A1, which had previously been investigated (Fu et al. 2004; Werner-Reiss et al. 2003). Other cortical areas were sparsely sampled but are included in the following analysis (Table 3).
Figure 14 shows the eye position and neuronal responses for three representative neurons measured on three different experimental days in monkey L. The top row shows the eye position at the end of each stimulus presentation (n = 1,434, 1,475, and 1,414 eye positions for A-C, respectively) when no fixation was required. In general, the monkey tended to look in the upper visual field, but the spread of eye positions was on the order of 30° in both azimuth and elevation. The next four rows show the mean responses (spikes/stimulus) for the session in which fixation was required (red) and when it was not (blue). The horizontal colored line shows the spontaneous rate for each condition, and the vertical line represents 10 spikes/stimulus. For the cell shown in A, there is virtually no difference in the driven activity although there was an increase in the spontaneous activity when the animal fixated. The cell shown in B had a different response, where the nonfixating condition showed a greater driven response and an increase in spontaneous activity compared with the fixation condition. The cell shown in C shows virtually no difference in the spontaneous activity but a clear increase in driven activity under the fixation condition.
These representative cells were commonly encountered in our sample. We therefore compared (paired t-test) the activity across all 64 stimulus azimuths and intensities between the fixating and nonfixating conditions. This analysis revealed that the vast majority of neurons (33/40 and 33/36 neurons in monkeys F and L, respectively) did have a statistically significant difference in activity (P < 0.01). This could be represented as either an overall increase or decrease in activity during the fixation condition. Figure 15 shows the mean firing rate across all 64 stimuli under the two conditions (each symbol represents a single neuron). The □ and ◊ show neurons where there was a significant difference in activity. As can be seen, although most cells did show differences in activity, there was a high correlation between the two measures that was significant (r = 0.87; P < 0.01).
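The two comparisons in this paragraph, a paired t-test within a neuron across stimuli and a correlation of mean firing rates across conditions, can be sketched in plain Python as follows. The sketch computes only the t statistic, its degrees of freedom, and Pearson's r; obtaining a P value would additionally require the t distribution. The function names and the toy firing rates are hypothetical.

```python
import math


def paired_t(x, y):
    """Return (t, degrees_of_freedom) for a paired t-test on equal-length samples."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)  # sample variance
    if var_d == 0:
        raise ValueError("differences have zero variance; t is undefined")
    return mean_d / math.sqrt(var_d / n), n - 1


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Toy example: responses of one neuron to four stimuli (made-up rates).
fixating = [5.0, 7.0, 9.0, 11.0]
nonfixating = [4.0, 5.0, 8.0, 9.0]
t_stat, dof = paired_t(fixating, nonfixating)
r = pearson_r(fixating, nonfixating)
```

In the toy data the fixating responses are consistently larger, giving a large t, yet the two conditions remain highly correlated. This is the same pattern as a neuron whose overall rate changes while the shape of its tuning does not.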
One reason that we observed a much greater proportion of neurons with a significant effect compared with previous studies could be that each neuron was recorded under one condition after being studied during a previous session (≤1 h), and large changes in alertness, attention, satiation, or unit isolation could contribute significantly to the differences in overall activity that were observed. Because the fixation condition was performed first for approximately half of the neurons (41/76; numbers in parentheses in Table 3), we compared the overall mean activity (spikes/stimulus) for neurons in the fixating condition with that in the nonfixating condition. In that case, there was no significant difference (paired t-test; P > 0.01). Further, comparing the overall firing rate for the first session with that for the second, regardless of which type it was, also showed no significant difference (P > 0.01). Therefore there does not appear to be a systematic influence on the overall firing rate across neurons of whether the monkey was fixating or of whether the neuron was recorded early or later in the experimental day.
Further inspection of Fig. 14 reveals that, although the firing rate may change (B and C), there appeared to be very little difference in the spatial tuning properties. We therefore compared the metrics that we have previously shown in this report that are largely independent of overall activity between these two conditions: tuning index, Rayleigh statistic, 50% bandwidth, best azimuth, and best intensity. This analysis (paired t-test) revealed that none of these metrics were significantly different between the two conditions (P < 0.01 with Bonferroni correction). Finally, we compared the variance in the response at each azimuth as a function of whether the monkey was fixating or not. We compared the mean response and variance of the response at each of the 64 azimuths and intensities as well as the spontaneous activity between the fixating and nonfixating conditions. Paired t-test for each of the 64 stimulus azimuths and intensities, as well as the spontaneous activity, across the recorded population of neurons showed that under none of these conditions was the P value significant (smallest P = 0.133, uncorrected), indicating again that fixation did not produce a systematic effect on the neuronal responses. In summary, although we noted differences in activity with respect to eye position that were consistent with previous reports, these differences are unlikely to have made a systematic impact on the bulk of the measures that are reported in this study.
This report describes the spatial tuning properties in azimuth in six different auditory cortical fields in the alert macaque monkey. Our findings were generally consistent with the hypothesis that caudal auditory cortical fields primarily process spatial information (Rauschecker 1998). This was supported by the caudolateral field having, overall, greater azimuth indices and Rayleigh statistic values as well as smaller bandwidths compared with the core area and rostral and medial areas. This was most apparent at the level of the population of neurons, as although rare, single neurons with sharp spatial tuning were encountered in areas R, MM, and ML.
The vast majority of neurons we sampled showed a statistically significant response to ≥1 of the 64 stimuli tested, and the majority of neurons encountered showed some spatial tuning (Table 2). Thus across these six cortical areas ≥70% of the encountered neurons could potentially provide spatial information. This is almost certainly an underestimate as we only recorded responses to locations in space at a single elevation and distance, and many of these neurons could have spatial tuning functions either above or below the interaural axis. The vast majority of these neurons also had their best azimuth in the contralateral field, although all of acoustic space at this elevation was represented in each cortical area. This is consistent with the contralesional, but not ipsilesional, deficits noted after unilateral auditory cortical lesions (Heffner and Heffner 1990; Jenkins and Merzenich 1984; Thompson and Cortez 1983). We also noted neurons with two distinct peaks in their spatial tuning. These two peaks were commonly located in a mirror symmetric fashion, consistent with “front/back” errors noted when localizing tones (e.g., see Blauert 1997).
Parallel processing in primate auditory cortex
One of the main objectives of this study was to determine if there is physiological evidence for parallel processing of spatial versus nonspatial information in the primate auditory cortex (Rauschecker 1998). We had previously shown that neurons in the CM were more sharply spatially tuned in frontal space compared with neurons in A1 (Recanzone et al. 2000b) and that the spatial acuity of the neurons across the population could account for the sound localization ability of that monkey. Similarly, Tian et al. (2001) showed that caudal belt fields were more spatially tuned than rostral belt fields and that rostral belt fields responded more selectively to vocalizations than did caudal belt fields in anesthetized macaques. The data from the present report are largely consistent with this idea as well, as the CL field showed the sharpest tuning of all fields tested with the CM field showing similarly sharply tuned neurons. Neurons in the more rostral belt and core areas generally had much broader spatial tuning.
The most telling analysis that we performed was the population activity across all neurons recorded in a given area (Fig. 12). Here the population of CL neurons had the sharpest spatial tuning with clear inhibition in the ipsilateral field and strong excitation in the contralateral field. In addition, each of the stimulus intensities tested showed a similar pattern, although the functions for the lowest intensity stimuli were not nearly as sharp. In contrast, fields MM and ML had much shallower functions. Area R functions were also very sharp, but there was considerable activity in the ipsilateral field, and very little if any tuning for the quietest stimuli, even though monkeys can localize these low-intensity stimuli (Recanzone and Beckerman 2004). These data are consistent with previous hypotheses that caudal auditory cortical fields are selectively processing spatial information in contrast to more rostral fields.
These results are also consistent with previous studies in both alert (Recanzone et al. 2000a) and anesthetized monkeys (see Rauschecker and Tian 2000) demonstrating the existence of physiologically distinct auditory cortical fields. The differences we noted in spatial selectivity demonstrated that area R neurons were clearly distinct from A1 neurons and other cortical fields. We are also the first to describe responses in MM in the alert animal, and we noted clear differences between this and other cortical fields, consistent with this being an independent field based on anatomical criteria (e.g., Kaas and Hackett 2000). Finally, the spatial tuning was distinct among lateral fields ML, CL, and CM, again consistent with the anatomical results as well as the reversal in best frequency demarcating these as distinct cortical areas (Rauschecker and Tian 2004).
Implications for sound-location perception
The receptive fields measured in this study were very broad, with median 50% bandwidth measurements on the order of 90° for neurons in CL, which had the most sharply tuned neurons. This is considerably greater than the localization ability of monkeys. Using the same acoustic stimuli as in the present study, Recanzone and Beckerman (2004) found that monkeys can localize the louder stimuli to within about 5°, and the quietest stimuli to within ∼10°. One possibility is that spatial tuning is accomplished by a population encoding scheme by which the flanks of the tuning curves provide the most information. These regions show the greatest dynamic range as the peaks of the tuning functions are relatively broad. A similar mechanism has been proposed to account for visual hyperacuity (Bradley et al. 1987). The population tuning shown in Fig. 12 is consistent with this notion, as CL neurons had the steepest flanks and would therefore have the greatest spatial acuity of the fields tested. However, one possible reason that the majority of the best azimuths were clustered in a relatively small region is because the neurons were responding to the acoustic axis of the monkeys (e.g., see Middlebrooks and Pettigrew 1981). This seems less likely to us for several reasons. First, there was no difference in the distribution of best azimuths in the three monkeys, all of which had very different head sizes and pinna shapes. Second, best azimuths were noted across all locations tested. Third, monkeys habitually move their pinnae throughout the recording sessions. We have noted widely varying differences within a session (unpublished observations), and this variance is too great to account for the distribution of best locations that we have noticed.
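The flank-based population scheme raised earlier in this section can be made concrete with a small numerical sketch. It assumes, purely for illustration, that a tuning curve is Gaussian in azimuth with a 50% bandwidth of 90°, the approximate median reported for CL. The Gaussian shape, the function names, and the 1° step are assumptions and are not derived from the recorded data.

```python
import math


def gaussian_tuning(azimuth_deg, best_deg, bandwidth_50_deg):
    """Normalized response of a hypothetical Gaussian tuning curve."""
    # Convert the full width at half maximum into a standard deviation.
    sigma = bandwidth_50_deg / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    return math.exp(-0.5 * ((azimuth_deg - best_deg) / sigma) ** 2)


def steepest_offset(bandwidth_50_deg, step_deg=1.0, max_offset_deg=180.0):
    """Offset from the best azimuth (deg) at which the response changes fastest,
    together with the size of that change per degree."""
    best_offset, best_slope = 0.0, 0.0
    offset = 0.0
    while offset + step_deg <= max_offset_deg:
        r0 = gaussian_tuning(offset, 0.0, bandwidth_50_deg)
        r1 = gaussian_tuning(offset + step_deg, 0.0, bandwidth_50_deg)
        slope = abs(r1 - r0) / step_deg
        if slope > best_slope:
            # Attribute the finite difference to the midpoint of the step.
            best_offset, best_slope = offset + step_deg / 2.0, slope
        offset += step_deg
    return best_offset, best_slope


offset_deg, slope_per_deg = steepest_offset(90.0)
# Change in normalized response over the first degree away from the peak.
slope_at_peak = gaussian_tuning(0.0, 0.0, 90.0) - gaussian_tuning(1.0, 0.0, 90.0)
```

Under these assumptions the response changes fastest roughly 38° away from the best azimuth and hardly changes near the peak. A small shift in source location therefore alters the firing rate most for sources on the flank, which is the intuition behind a flank-based code.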
A second possibility is that some aspect of the response beyond the overall firing rate contributes to sound localization perception. In contrast to previous reports in the cat (e.g., Brugge et al. 1996; Jenison et al. 1998), we find little evidence that first spike latency would contribute a great deal to spatial encoding (Fig. 5). This may be due to a species difference, but it is more likely due to differences in anesthetic state and how the latency measurements were defined. Given the high spontaneous rates of many of the neurons we encountered, and the fact that many neurons showed inhibition to azimuths in ipsilateral space, it was difficult to assign a first-spike latency. We had also attempted other latency metrics, such as the median first-spike latency and the latency to the peak response, but these methods showed even less structure in the spatial tuning profiles. It should also be noted that single-neuron responses in alert primates are commonly sustained (e.g., Recanzone 2000) and these sustained responses are more common when the appropriate stimuli are used (Wang et al. 2005). Thus a rate as opposed to a latency code may better reflect the neuronal processing of acoustic information in the primate auditory cortex.
An alternative possibility is that the pattern of spikes throughout the duration of the stimulus contributes to spatial processing (e.g., Furukawa et al. 2000; Middlebrooks et al. 1994). The individual rasters (Figs. 2 and 3) and the PSTHs shown in Fig. 4 show that the temporal structures within the neural response can vary with spatial location. If neurons can use this temporal information along the auditory cortical hierarchy, then these responses could also contribute to sound location perception.
Comparisons to previous studies in the cat
The most commonly studied animal model for auditory cortical spatial receptive fields is the cat, with numerous studies in anesthetized animals (Brugge et al. 1996; Middlebrooks and Pettigrew 1981; Rajan et al. 1990) and more recently in the alert animals (Mickey and Middlebrooks 2003). Overall the results of this study are largely consistent with those of cats in that spatial tuning was commonly observed and that the bandwidths of these neurons are relatively broad. As in the cat, we observed neurons with spatial receptive fields that were restricted to the contralateral hemifield, but they could also be centered near the midline, in ipsilateral space, or respond to all azimuths tested. There are a few salient differences as well. Most studies in the cat have shown that the spatial selectivity of neurons commonly decreases (receptive field size increases) with increasing stimulus intensity, whether measured by firing rate, latency, or the pattern of response (e.g., Brugge et al. 1996; Middlebrooks and Pettigrew 1981; Middlebrooks et al. 1998; Rajan et al. 1990). We found multiple examples where this was not true, and the overall bandwidth across the population was largely consistent in most areas across the intensity range that we tested (e.g., Figs. 7 and 8). However, there is a fundamental difference in recording techniques between this and previous studies. We used the same four standard intensities regardless of the threshold of the neuron under study, whereas previous investigators set the stimulus intensity relative to the unit's threshold. Therefore very different stimulus intensities were used across the population of tested neurons, and we used a larger dynamic range (50 dB) than many previous studies. Given that a significant proportion of macaque auditory cortical neurons have nonmonotonic rate/level functions (Recanzone et al. 2000a; this study, Fig. 13), increasing the absolute intensity of the stimulus could commonly result in an overall reduction in firing rate. This may have the same effect as a lower-intensity stimulus, resulting in smaller receptive fields.
Influence of eye position
The bulk of our sample was recorded while the monkey sat in a darkened room and its eye position was not under behavioral control. Recent studies in two different laboratories have reported that eye position can influence the activity of auditory cortical neurons (Fu et al. 2004; Werner-Reiss et al. 2003) as well as neurons in the inferior colliculus (Porter et al. 2006). In those studies, the responses of ∼1/3 (Werner-Reiss et al. 2003) or more (Fu et al. 2004) of neurons or cortical locations, respectively, were influenced by the eye position. If there was a systematic effect of eye position on the activity of the neurons we studied, we would expect to see a higher variance in the response compared with when the animal had a constant eye position given that the eye position was random across trials. To test if this might be the case, we compared the neuronal response when the monkey was actively fixating a central target in one session, and no fixation requirement was imposed in the immediately following or preceding session. Similar to previous reports, we also observed significant differences between sessions in which the monkey fixated and those in which it did not. In our hands, the percentage of neurons with fixation effects was much greater than in previous studies. This is likely due to the fact that these responses were measured in separate sessions, and therefore other influences such as the overall alertness, level of satiety, attention, etc. could be in play. We also noted that the differences could be increases or decreases and were not systematic across the population of neurons, similar to previous studies. This resulted in no overall difference in any of the tuning properties that we investigated and therefore eye-position effects are likely not contributing any bias in the results presented here.
The finding that eye position does influence a substantial fraction of neurons in auditory cortical fields, now from three independent laboratories, indicates that these influences are likely underlying a fundamental encoding process that remains elusive.
One potential caveat for these experiments is that two monkeys were performing a behavioral task, whereas one monkey (monkey G) was passively listening to these same sounds. This did not seem to have a substantial influence on the activity of these neurons, as many of the metrics we used were similar between the three individuals, such as the percentage of tuned neurons, the tuning index, and the best azimuth. When differences were noted, for example in the bandwidth and intensity tuning, they were different for all three monkeys. It was not the case that monkey G consistently had the greatest or smallest measures of any of these metrics. This indicates that these metrics show the most individual idiosyncrasies, and interestingly they are the most telling for the acuity in the spatial representations. This may then be an underlying root to the differences observed between individual monkeys in their ability to localize stimuli at these same intensities (Recanzone and Beckerman 2004).
A second issue of concern is in the classification of each neuron into a specific cortical field. We relied heavily on the physiological response properties of both the multiple single-neuron responses before single-cell isolation as well as the properties of the single cells isolated at each location. This has proven to be largely adequate, and these classifications are consistent with the histological material and MRI images when they are available (Recanzone et al. 2000a; current study). In practice, we would “march” the electrode penetrations in the medial–lateral direction, which made defining the borders between A1 and the belt fields relatively straightforward. However, it was more difficult to differentiate the border regions between belt fields as several penetration locations could be ambiguous. This may have resulted in a misclassification of neurons in one location versus another and therefore increased the variance in our measures between different cortical areas. Our findings of differences therefore should be considered robust as misclassifications would only tend to blur any true differences. A second possibility is that we may have oversampled subregions of different cortical areas. This is almost certainly not the case for A1 (Fig. 1) as it was usually sampled nearly in its entirety. The lateral belt areas may have been oversampled in the medial aspects; however, not shown in Fig. 1 are the locations where suspected parabelt fields were encountered (oftentimes deep within the penetration site as the electrode traversed obliquely through the superior temporal gyrus), and therefore we believe that the belt fields were also evenly sampled.
This research was funded in part by National Institute of Deafness and Communication Disorders Grants DC-02371 to G. H. Recanzone and DC-00442 to T. M. Woods and a core grant from the National Eye Institute.
The authors thank the California National Primate Research Center for expert veterinary care and N. Beckerman and D. Seto for participation in these experiments.
Present address of T. M. Woods: The Harker School, 500 Saratoga Ave., San Jose, CA 95129.
Copyright © 2006 by the American Physiological Society