Sound localization in both humans and monkeys is tolerant to changes in sound levels. The underlying neural mechanism, however, is not well understood. This study reports the level dependence of individual neurons' spatial receptive fields (SRFs) in the primary auditory cortex (A1) and the adjacent caudal field in awake marmoset monkeys. We found that most neurons' excitatory SRF components were spatially confined in response to broadband noise stimuli delivered from the upper frontal sound field. Approximately half the recorded neurons exhibited little change in spatial tuning width over a ∼20-dB change in sound level, whereas the remaining neurons showed either expansion or contraction in their tuning widths. Increased sound levels did not alter the percent distribution of tuning width for neurons collected in either cortical field. The population-averaged responses remained tuned between 30- and 80-dB sound pressure levels for neuronal groups preferring contralateral, midline, and ipsilateral locations. We further investigated the spatial extent and level dependence of the suppressive component of SRFs using a pair of sequentially presented stimuli. Forward suppression was observed when the stimuli were delivered from “far” locations, distant to the excitatory center of an SRF. In contrast to spatially confined excitation, the strength of suppression typically increased with stimulus level at both the excitatory center and far regions of an SRF. These findings indicate that although the spatial tuning of individual neurons varied with stimulus levels, their ensemble responses were level tolerant. Widespread spatial suppression may play an important role in limiting the sizes of SRFs at high sound levels in the auditory cortex.
- single neuron
- sound localization
In auditory perception, the accuracy of sound localization shows remarkable tolerance to changes in sound levels (Sabin et al. 2005), with a general degradation toward threshold levels in humans (Su and Recanzone 2001; Vliegen and Van Opstal 2004) and monkeys (Recanzone and Beckerman 2004). Using anesthetized preparations, electrophysiological studies have amassed considerable evidence for the existence of location-sensitive neurons in the auditory cortex (Ahissar et al. 1992; Brugge et al. 1996; Eggermont and Mossop 1998; Imig et al. 1990; Middlebrooks and Pettigrew 1981; Middlebrooks et al. 1998; Mrsic-Flogel et al. 2005; Rajan et al. 1990; Reale et al. 2003). In these studies, the spatial tuning of the majority of cortical neurons was found to be sharpest at near-threshold level, progressively broadening at moderate to high sound levels (e.g., Brugge et al. 1996; Middlebrooks et al. 1998; Mrsic-Flogel et al. 2005; Xu et al. 1998). This trend conflicts with documented perceptual performance in sound localization. Theoretical studies have suggested the use of disparity information between two broadly tuned neural populations to encode azimuth (AZ) (Stecker et al. 2005).
More recently, spatial selectivity has been reevaluated in the primary auditory cortex (A1) and secondary auditory cortex using awake and behaving preparations (King et al. 2007; Lee and Middlebrooks 2010; Mickey and Middlebrooks 2003; Recanzone 2000; Werner-Reiss and Groh 2008; Woods et al. 2006). In contrast to general findings in anesthetized animals, spatial tuning width obtained in awake preparations does not systematically expand with increasing sound levels (Mickey and Middlebrooks 2003; Woods et al. 2006). In fact, it sharpens through the suppression of responses at less preferred locations during behavioral tasks (Lee and Middlebrooks 2010). These studies have suggested that inhibition plays a role in limiting the size of spatial receptive fields (SRFs) with increasing sound levels. In principle, one may infer the presence of inhibition from a reduction in neural firing under proper conditions. However, inhibition is generally difficult to characterize in extracellular recordings due to low spontaneous firing rates, especially in the upper cortical layers (typically <2 spikes/s in awake marmosets) (Barbour and Wang 2003; Wang et al. 2005). The relationship between the strength of inhibition (or response suppression) and sound level across spatial locations has not been systematically examined in the auditory cortex of awake animals.
Another contrasting finding for awake animals is that the AZ tuning widths of single- and multiunit activity could either expand or contract with an average 20-dB increase in sound levels in the auditory cortex of cats (Mickey and Middlebrooks 2003). However, the overall distribution of tuning width appears to be unchanged between moderate and high sound levels in the auditory cortex of macaques (Woods et al. 2006). It should be noted that expansion and contraction in individual SRFs do not warrant a level-invariant distribution of tuning widths. For example, if broad SRFs become broader and narrow SRFs become narrower, the overall distribution of tuning widths may become bimodal at higher sound levels. Conversely, convergence may occur. The nature of level invariance may differ between responses of individual neurons and their population average, depending on the distribution of tuning widths across sound levels. This distinction is important because tuning width directly influences the stimulus information carried by a population of neurons (Kang et al. 2004; Seung and Sompolinsky 1993; Zhang and Sejnowski 1999). To address this issue, systematic analyses are needed to clarify the relationship between the variability found in individual neurons' spatial tuning widths and the overall tuning properties of their ensemble responses in the auditory cortex of awake animals.
In recent years, the marmoset model has offered many insights into the spectral and temporal aspects of sound processing in the auditory cortex of awake animals (e.g., Barbour and Wang 2003; Lu et al. 2001b). However, cortical processing of sound location information remains largely unexplored in the marmoset. To improve the value of this nonhuman primate model, this study investigated neurons' spatial tuning in A1 and adjacent caudal field [caudal medial (CM)/caudal lateral (CL)] of awake marmoset monkeys. Our experiments focused on the two aforementioned issues for spatial coding: 1) the level dependence of spatial responses of individual neurons and their population average and 2) the level dependence of suppression in cortical SRFs. The results show that although the spatial tuning widths of individual neurons could expand or contract with increasing sound levels, averaged population responses remained tuned for neuronal groups preferring contralateral, midline, and ipsilateral directions. We further examined the spatial extent of suppression in cortical SRFs using a pair of sequentially presented test and probe stimuli. Forward suppression was observed when the test stimuli were presented from either the excitatory center or far regions of an SRF. In general, the strength of suppression increased with sound levels, even at far locations, showing much reduced excitatory activity. These results provide insights into the roles of suppression mechanisms in spatial processing in the auditory cortex.
MATERIALS AND METHODS
Animal preparation and electrophysiological procedures.
A chronic recording preparation (e.g., Lu et al. 2001a) was used to record single-neuron activity in the auditory cortex of awake common marmoset monkeys (Callithrix jacchus). Experimental procedures were approved by the Institutional Animal Care and Use Committee of the Johns Hopkins University following National Institutes of Health guidelines.
All subjects were trained to sit in a custom-designed primate chair. After 2–4 wk of behavioral adaptation, two stainless steel headposts were attached to the skull under sterile conditions with the animal deeply anesthetized by isoflurane (0.5–2.0%, mixed with 50% O2 and 50% nitrous oxide). The headposts served to maintain a stable head orientation of the subject during electrophysiological recordings. To access the auditory cortex, small craniotomies (∼1 mm in diameter) were made on the skull over the superior temporal gyrus to allow for the penetration of electrodes (tungsten electrodes, 2- to 5-MΩ impedance, A-M Systems, Carlsborg, WA) via a hydraulic microdrive (Trent-Wells, Los Angeles, CA). Single unit activity was sorted online using a template-based spike-sorting program (MSD, Alpha Omega Engineering) and analyzed using custom programs written in Matlab (Mathworks, Natick, MA).
Spike waveforms were continuously monitored throughout a recording session to ensure stability and isolation quality. The 25th, 50th, and 75th percentiles of the signal-to-noise ratio (SNR) were 16.56, 19.43, and 23.66 dB, respectively, for neurons reported in this study. The SNR was defined as $20\log_{10}(|V_{\max}|/\sigma_n)$ in decibels, where $V_{\max}$ is the maximal deflection of a spike waveform and $\sigma_n$ is the SD of baseline activity. The median variability of the SNR of a neuron was 1.59 dB (SD) across the stimulus sets tested.
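The signal-to-noise definition above can be written out as a short calculation. The following is a minimal sketch in Python, not the analysis code used in this study (which was written in Matlab); the function name, the use of the population SD, and the example numbers are illustrative assumptions.

```python
import math
import statistics


def spike_snr_db(waveform, baseline):
    """Return 20*log10(|Vmax| / sigma_n) in dB.

    waveform: samples of one spike waveform (arbitrary voltage units).
    baseline: samples of baseline activity in the same units.
    """
    v_max = max(abs(v) for v in waveform)   # maximal deflection of the spike
    sigma_n = statistics.pstdev(baseline)   # SD of baseline (population SD is an assumption)
    return 20.0 * math.log10(v_max / sigma_n)


# Illustrative values only, not recorded data.
example_spike = [0.0, 20.0, -100.0, 30.0, 0.0]
example_baseline = [10.0, -10.0, 10.0, -10.0]
snr = spike_snr_db(example_spike, example_baseline)
print(f"SNR = {snr:.1f} dB")
```

With these made-up samples the peak deflection is 10 times the baseline SD, which corresponds to 20 dB.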
Experimental setup and sound delivery.
Experiments were conducted in a dimly lit, double-walled, sound-attenuated chamber (IAC-1024, Industrial Acoustics, 1.9 × 2.2 × 1.9 m). The internal walls and ceiling were lined with ∼3-in. acoustic absorption foam (Sonex, Illbruck), and the recording table and speaker frame were covered with ∼1-in. acoustic absorption foam to reduce acoustic reflections. Figure 1A shows the experimental arrangement. The subject sat in the primate chair centered in the room. Fifteen loudspeakers (FT28D, Dome Tweeter, Fostex) were mounted on a semicircular frame covering the upper-level frontal field at a distance of ∼80 cm from the head of a subject. The position of a loudspeaker was specified in AZ and elevation (EL) angles in spherical coordinates. As shown in Fig. 1B, a set of seven loudspeakers was evenly positioned with 30° horizontal spacing (AZ ±90°, ±60°, ±30°, and 0°) at EL 0° and EL 45°; one loudspeaker was positioned directly above the head of a subject (EL 90°). The loudspeaker directly in front of the animal was at AZ 0° and EL 0°. Positive AZ angles corresponded to speakers ipsilateral to the recording site. During experiments, the subject's head remained fixed and eye position was not controlled. An infrared camera was used to monitor the general behavior of a subject throughout a recording session.
All stimuli were generated digitally at a sampling rate of 100 kHz, converted via a digital-to-analog interface (TDT, DA4, Tucker-Davis Technologies), attenuated (TDT, PA4), amplified (D75A, Crown Amplifier), multiplexed (TDT, PM1, 2 × 8 channels), and delivered to one or two loudspeakers (i.e., single or two-source stimulation) via a 16-channel breakout box (custom built, connected to PM1). For single-source stimulation, one stimulus was generated, and PM1 was set at the “mono” mode (1 × 16) to deliver the stimulus to a designated loudspeaker. For two-source stimulation, two stimuli were generated and operated by separate TDT modules (PA4) or ports (DA4, PM1). In this case, PM1 was set to the “stereo” mode (channels A and B, 2 × 8) to deliver two stimuli (one per channel) simultaneously to two designated loudspeakers. As shown in Fig. 1B, the 15 loudspeakers were wired to PM1 with an interleaved arrangement (black: channel A, gray: channel B), so that the stimuli from the 2 channels both covered the contralateral, midline, and ipsilateral sound field.
The loudspeaker impulse responses were obtained by 2¹⁴-point Golay code stimulation (Zhou et al. 1992) using a free-field microphone (B&K type 4191, 0.5 in.) placed on the top of the primate chair pointing toward the loudspeaker under test. Electrophysiological equipment such as the head holder and electrode manipulator was removed during recording. The maximum deviation in the power spectrum between responses of individual loudspeakers and their group average was within ±6 dB/Hz in the frequency range of 2–32 kHz except for the top speaker, whose maximum deviation was about ±12 dB/Hz due to reflections from the primate chair. Due to the finite frequency-response ranges of loudspeakers in our setup, acoustic stimuli were limited to the frequency range of 2–32 kHz, with a few exceptions for testing frequency tuning of neurons whose maximal responses occurred below 2 kHz or above 32 kHz.
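Golay code stimulation relies on a complementary pair of binary sequences whose autocorrelations sum to a single peak at zero lag, so that sidelobes cancel when the two responses are combined. The sketch below illustrates only this property, using the standard recursive construction and a 16-point pair, far shorter than the codes used for the measurements. It is not the measurement code of this study, and the function names are hypothetical.

```python
def golay_pair(n_doublings):
    """Build a complementary Golay pair of length 2**n_doublings."""
    a, b = [1], [1]
    for _ in range(n_doublings):
        # Standard recursion: a' = a|b, b' = a|(-b); right side uses the old a and b.
        a, b = a + b, a + [-x for x in b]
    return a, b


def autocorr(x, lag):
    """Aperiodic autocorrelation of sequence x at a non-negative lag."""
    return sum(x[i] * x[i + lag] for i in range(len(x) - lag))


a, b = golay_pair(4)  # 2**4 = 16 points, for illustration only
summed = [autocorr(a, k) + autocorr(b, k) for k in range(len(a))]
print(summed[0], summed[1:])  # peak of 2 * length at lag 0, zeros elsewhere
```

Because the summed autocorrelation is an impulse, cross-correlating each recorded response with its own code and adding the two results yields an estimate of the system impulse response.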
Characterization of single-neuron response properties.
After a neuron was isolated, pure tone and noise stimuli (100 ms in duration with 5-ms cosine rise and fall times) were played iteratively to characterize its spatial, frequency, and level selectivity. Spatial selectivity was examined using broadband, frozen Gaussian noise (FIR filtered, flat spectrum between 2 and 32 kHz, at least 10 dB above the threshold). Noise samples were randomly chosen from neuron to neuron. Frequency selectivity was examined using pure tone stimuli (2–32 kHz in 10 steps/octave) played at moderate sound pressure levels (SPLs; 30–60 dB) from at least one driven speaker location. Best frequency (BF) was defined as the pure tone frequency evoking the maximal rate of a neuron across the range of SPLs and locations tested. The average discharge rate was calculated based on the number of spikes occurring 0–150 ms after the stimulus onset.
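The rate measure described above (spike count in the 0–150 ms window after stimulus onset) can be illustrated with a minimal Python sketch. It assumes spike times are stored per trial in seconds relative to stimulus onset; the function name and the example spike times are hypothetical.

```python
def mean_rate(trials, t_start=0.0, t_stop=0.150):
    """Average discharge rate (spikes/s) across trials in [t_start, t_stop).

    trials: list of trials, each a list of spike times in seconds
            relative to stimulus onset (an assumed data layout).
    """
    counts = [sum(1 for t in spikes if t_start <= t < t_stop) for spikes in trials]
    return sum(counts) / (len(trials) * (t_stop - t_start))


# Illustrative spike times only, not recorded data.
example_trials = [[0.012, 0.030, 0.210], [0.015], [], [0.020, 0.149]]
rate = mean_rate(example_trials)
print(f"{rate:.2f} spikes/s")
```

Here 5 spikes fall inside the window over 4 trials, giving 5 / (4 × 0.15 s) ≈ 8.33 spikes/s; the spike at 210 ms is excluded.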
Pure tone and noise intensities were both expressed in terms of the peak-to-peak equivalent decibel SPL. The reference amplitude was set by a 1-kHz tone calibrated at ∼90-dB SPL (re. 20 μPa) with 0-dB peak attenuation. The spectrum level of noise was 40 dB/Hz at 0-dB peak attenuation. In experiments, the interstimulus interval was at least 500 ms for randomly presented stimulus sequences. Ten repetitions were played for each stimulus.
Characterization of spatial selectivity of a neuron.
Multiple metrics were used to characterize the spatial sensitivity of a neuron. Best speaker location corresponded to the location evoking the maximal firing rate. Modulation depth (MD) corresponded to the peak normalized difference between the maximal and minimal firing rates, (Rmax − Rmin)/Rmax, collected across 15 speaker locations. The average firing rate (Ravg) across 15 locations was used to examine the overall excitability of a neuron.
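As an illustration of these three metrics, the following Python sketch computes the best speaker location, MD, and Ravg from a table of mean rates. The dictionary layout and the three-speaker example are assumptions made for brevity; the study used 15 speaker locations.

```python
def spatial_metrics(rates):
    """Return (best_speaker, modulation_depth, average_rate).

    rates: dict mapping (az_deg, el_deg) -> mean firing rate in spikes/s.
    Assumes the maximal rate is greater than zero.
    """
    best_speaker = max(rates, key=rates.get)   # location evoking the maximal rate
    r_max = rates[best_speaker]
    r_min = min(rates.values())
    md = (r_max - r_min) / r_max               # (Rmax - Rmin) / Rmax
    r_avg = sum(rates.values()) / len(rates)   # Ravg across all locations
    return best_speaker, md, r_avg


# Illustrative rates only, not recorded data.
example_rates = {(-60, 0): 12.0, (0, 0): 30.0, (60, 0): 6.0}
best, md, r_avg = spatial_metrics(example_rates)
print(best, md, r_avg)
```

An MD near 1 indicates that the response at the least effective location is close to zero, whereas an MD near 0 indicates nearly uniform responses across locations.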
To facilitate the graphical inspection of neural activity and to quantify the size of tuning area in a two-dimensional space, we constructed SRFs by the methods of interpolation and extrapolation. A similar approach was used in a study of cortical SRFs in the ferret (Mrsic-Flogel et al. 2005). Here, we interpolated average rates over an equal-area grid consisting of an ensemble of squares of 5° resolution (0.087 × 0.087 in radius²) that evenly tiled the spherical surface. Since speaker distance was not a parameter of interest in this study, SRFs were mapped onto the unit sphere. The total surface area of the frontal hemifield (0 to 90° in EL and −90 to 90° in AZ) was 3.417, close to π, a quarter of the surface area of a unit sphere. The difference indicates the quantization error of the grid. To ensure accurate estimations of tuning area, the grid map was further extended 30° laterally and below (−120 to 120° in AZ and −30 to 90° in EL). This allowed a proper estimate of tuning area driven by loudspeakers located at the borders of the speaker array (e.g., EL 0°); otherwise, erroneous smaller tuning areas would be reported compared with those driven by loudspeakers located at the center of the array (e.g., AZ 0° and EL 45°).
In an SRF, activity at a grid position with $\theta$ in EL and $\phi$ in AZ, denoted as $(\theta,\phi)$ in spherical coordinates, was calculated as the weighted sum of responses [$r(\theta,\phi)$] over all speaker locations: $r(\theta,\phi) = \sum_{i=1}^{N} A_i R(\theta_i,\phi_i)$, where $R(\theta_i,\phi_i)$ is the average rate to the $i$th speaker at $(\theta_i,\phi_i)$ and $N$ is the number of speakers. The weight $A_i$ was determined by the exponential function $A_i = \exp(-\alpha^2/2\sigma^2)$, where $\sigma$ = 20°, $\alpha$ is the angular distance (in °) between $(\theta,\phi)$ and $(\theta_i,\phi_i)$, and $N$ = 15. The SRFs plotted the peak normalized $r(\theta,\phi)$ and were flattened into a two-dimensional map for display with the abscissa for AZ and the ordinate for EL.
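The Gaussian-weighted interpolation can be sketched as follows. This is a minimal Python illustration, not the Matlab code used in the study. It assumes an unnormalized weighted sum and a vertical-polar convention in which AZ is measured in the horizontal plane and EL above it. The function names are hypothetical.

```python
import math

SIGMA_DEG = 20.0  # width of the Gaussian weighting function, as stated in the text


def angular_distance_deg(az1, el1, az2, el2):
    """Great-circle angle (deg) between two directions given as AZ/EL in degrees."""
    a1, e1, a2, e2 = (math.radians(v) for v in (az1, el1, az2, el2))
    cos_d = math.sin(e1) * math.sin(e2) + math.cos(e1) * math.cos(e2) * math.cos(a1 - a2)
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_d))))  # clamp rounding error


def srf_value(az, el, speaker_rates, sigma=SIGMA_DEG):
    """Weighted sum of speaker responses at grid position (az, el).

    speaker_rates: dict mapping (az_deg, el_deg) -> mean rate (spikes/s).
    Assumption: the sum is not divided by the total weight.
    """
    total = 0.0
    for (az_i, el_i), rate in speaker_rates.items():
        alpha = angular_distance_deg(az, el, az_i, el_i)
        total += math.exp(-alpha ** 2 / (2.0 * sigma ** 2)) * rate
    return total


# A single driven speaker straight ahead (illustrative rate only).
one_speaker = {(0, 0): 10.0}
print(srf_value(0, 0, one_speaker), srf_value(20, 0, one_speaker))
```

Evaluating `srf_value` over every cell of the grid and dividing by the maximum would give the peak-normalized map described above.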
The accuracy of the interpolation/extrapolation method was evaluated by examining how closely the centroid of a simulated SRF matched the position of a single driving speaker (zero responses were assumed at all other speaker locations). Using the equal-area grid, the error was 0° in EL and <5° in AZ for speakers at EL 0° and 45°, and the error was 0° in AZ and 10° in EL for the top speaker. Since only a quarter of the spherical space was sampled, we quantified the spatial tuning width of a neuron, best area (BA), as the ratio between the area of an SRF with activity ≥62.4% of the peak and the area of the frontal field covered by the speaker array. Using this threshold criterion, a spatially highly selective neuron responding only to one speaker would yield a BA of 0.1, whereas a spatially nonselective neuron responding equally to all 15 speakers would yield a BA of 0.996.
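The single-speaker BA of about 0.1 can be checked with a short calculation. The sketch below assumes that the SRF of a neuron driven by one speaker is a pure Gaussian of angular distance (σ = 20°) and that the suprathreshold region is a spherical cap lying entirely within the mapped grid. It is a rough consistency check under those assumptions, not the grid-based computation used in the study.

```python
import math

SIGMA_DEG = 20.0       # Gaussian width from the text
THRESHOLD = 0.624      # fraction of the peak defining the best area
FRONTAL_AREA = 3.417   # grid area of the frontal field on the unit sphere (from the text)

# Angular radius at which exp(-alpha^2 / (2 sigma^2)) falls to the threshold.
alpha_deg = SIGMA_DEG * math.sqrt(-2.0 * math.log(THRESHOLD))

# Area of a spherical cap of that angular radius on the unit sphere.
cap_area = 2.0 * math.pi * (1.0 - math.cos(math.radians(alpha_deg)))

best_area = cap_area / FRONTAL_AREA
print(f"radius = {alpha_deg:.1f} deg, BA = {best_area:.3f}")
```

Under these assumptions the 62.4% contour lies about 19.4° from the speaker, and the resulting ratio is close to 0.1.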
Methods of two-source stimulation for characterizing suppression in SRFs.
Previous studies have suggested that suppression at less preferred locations sharpens the spatial tuning of a cortical neuron in the awake condition (Lee and Middlebrooks 2010; Mickey and Middlebrooks 2003). However, due to the low spontaneous rate, the strength of suppression is difficult to reveal in extracellular studies. The level dependence of suppression remains uncharacterized in cortical SRFs.
In principle, one may infer the tuning of suppression to a test stimulus (S1) from reductions in neural responses to a probe stimulus (S2). This method has been widely used in studies of frequency tuning of excitation and inhibition in the auditory system [i.e., “two-tone” paradigm (Sachs and Kiang 1968; Suga and Tsuzuki 1985; Sutter et al. 1999)]. However, it is problematic to use a pair of simultaneously presented S1 and S2 to investigate the region of suppression in SRFs because perceptual fusion may occur [i.e., “summing localization” (Leakey 1959; Snow 1954)]. More specifically, merging sound waves at the ears originating from two different directions could change the binaural coherence of the composite signal (Blauert 1997). This process cannot be approximated as simple addition of the probe and test stimuli.
For this reason, we characterized the rate-level tuning of excitation and suppression at multiple spatial locations using a pair of lead-lag sounds (S1 and S2). The experimental protocol was similar to the “forward masking” paradigm for studying excitatory and inhibitory frequency tuning in the auditory cortex (e.g., Brosch and Schreiner 1997; Calford and Semple 1995) except that S1 and S2 were delivered from either the same or different locations. The test stimulus S1 was broadband, frozen Gaussian noise (band limited between 2 and 32 kHz) played at SPLs ranging from −10 to 80 dB in 10-dB steps. Probe sound S2 was played at a fixed SPL immediately after S1. No delay was imposed between the offset of S1 (after 5-ms fall time) and the onset of S2 (before 5-ms rise time). Reductions in S2 responses were used to infer the strength of S1-evoked, forward suppression. To evoke reliable probe responses, S2 was chosen from BF tone, narrowband noise (centered at BF, bandwidth ≤ 0.5 octaves) and broadband noise played at a preferred SPL of a neuron. For broadband noise, identical Gaussian noise tokens were used for S1 and S2. S1 and S2 each had a duration of 100 ms.
In experiments, S1 and S2 were delivered from variable locations. S2 was always delivered from a loudspeaker in channel B (gray, Fig. 1B) at the best speaker location of a neuron or from an adjacent location if the best speaker was not assigned to channel B. The S2 location was denoted as the “center” location. S1 was always delivered from a loudspeaker in channel A (black, Fig. 1B), which included speakers immediately adjacent to the center (denoted as the “near” location, <70° to the center location) and speakers far away from the center (denoted as the “far” location, >70° to the center location, often in the opposite hemifield), where the spatial separation indicates the angular distance between two speaker locations on a great circle. Additionally, S1 and S2 were summed physically (TDT, SM3) and played from the center location.
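The near/far labeling by great-circle separation can be illustrated with a small Python sketch. It assumes that AZ and EL follow a vertical-polar convention (x toward the front, y along increasing AZ, z upward) and that a separation of exactly 70° is treated as far. Both choices, and the function names, are assumptions for illustration.

```python
import math


def unit_vector(az_deg, el_deg):
    """Direction on the unit sphere for AZ/EL in degrees (assumed convention)."""
    az, el = math.radians(az_deg), math.radians(el_deg)
    return (math.cos(el) * math.cos(az), math.cos(el) * math.sin(az), math.sin(el))


def separation_deg(loc_a, loc_b):
    """Angular distance on a great circle between two (az, el) locations."""
    dot = sum(p * q for p, q in zip(unit_vector(*loc_a), unit_vector(*loc_b)))
    return math.degrees(math.acos(max(-1.0, min(1.0, dot))))  # clamp rounding error


def classify_s1(center, s1_location, boundary_deg=70.0):
    """Label an S1 speaker as 'near' or 'far' relative to the center (S2) location."""
    return "near" if separation_deg(center, s1_location) < boundary_deg else "far"


center = (-30, 0)  # hypothetical S2 location (AZ -30 deg, EL 0 deg)
print(classify_s1(center, (0, 0)), classify_s1(center, (60, 0)))
```

In this example a speaker 30° from the center is labeled near, and a speaker 90° away in the opposite hemifield is labeled far.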
In experiments, the exact positions of center, near, and far varied from neuron to neuron, depending on the size and orientation of an SRF. As shown in Fig. 1B, the positions of speakers in channels A and B were interleaved. This arrangement was made to ensure that S2 in channel B (gray) could be played from either contralateral, midline, or ipsilateral locations. Moreover, for each S2 location used, S1 in channel A could be assigned to at least two near locations in the same hemifield as S2 and to at least three far locations in the opposite hemifield. Since the two-stimulus stimulation had to be conducted after the frequency and spatial selectivity of a neuron had been characterized, not every near and far location was tested due to time constraints of single-unit recordings. We reported the results of neurons that were tested at at least one near and one far location along with the center location. When more than one near or far location was surveyed, the averaged results were reported for each spatial configuration. For control analyses, S2-alone responses were measured in each of the three spatial configurations and the S2-alone trials were randomly interwoven into (S1, S2) trials. S1-induced forward suppression was assessed relative to the S2-alone rate obtained in the same spatial configuration. This minimized the effects of potential drift in the overall excitability of a neuron on the observed strength of suppression.
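One way to express suppression relative to the S2-alone control is a normalized rate change. The text does not give an exact formula at this point, so the index below, (R_pair − R_alone)/R_alone, is an illustrative assumption, as are the function names and the example rates. Negative values indicate suppression of the probe response.

```python
def suppression_index(rate_s2_after_s1, rate_s2_alone):
    """Normalized change in the S2 response caused by a preceding S1.

    Hypothetical index: negative values mean the probe response was reduced.
    Assumes rate_s2_alone > 0.
    """
    return (rate_s2_after_s1 - rate_s2_alone) / rate_s2_alone


def mean_suppression(rates_s2_after_s1, rate_s2_alone):
    """Average the index over the S1 locations tested in one configuration."""
    indices = [suppression_index(r, rate_s2_alone) for r in rates_s2_after_s1]
    return sum(indices) / len(indices)


# Illustrative rates (spikes/s), not recorded data: S2 alone, and S2 after
# S1 presented from two different far locations.
far_index = mean_suppression([8.0, 12.0], rate_s2_alone=20.0)
print(far_index)
```

Using the S2-alone rate from the same spatial configuration as the denominator keeps slow drifts in excitability from being misread as changes in suppression.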
Identification of A1 and caudal field.
Single-unit responses were collected from A1 and caudal field in four hemispheres of one male and two female adult marmoset monkeys. In the marmoset, A1 is situated largely ventral to the lateral sulcus on the superior temporal plane and exhibits a low-to-high topographical frequency gradient along the rostral-caudal axis (Aitkin et al. 1986). Similar to other nonhuman primate species, CM and CL fields in the marmoset can be identified by a tonotopic reversal with an abrupt decrease of BFs at the high-frequency border of A1 (Aitkin et al. 1986; Kaas and Hackett 2000; Merzenich and Brugge 1973). In this study, a total of 681 tone-sensitive neurons were collected to construct the tonotopic maps of the 3 subjects.
Due to sampling limitations inherent in chronic recording, topographical mapping was accomplished with variable degrees of detail across subjects. The tonotopic organization in A1 and emergence of low-BF neurons in the caudal field were clearly seen in the auditory cortex of subjects M16s and M79u (Figs. 2, A1 and B1), whereas tonotopic gradients were less continuous in A1 of subject M43s in both hemispheres (Fig. 2C1). To verify the location of the caudal field, we additionally analyzed the frequency selectivity of local field potentials (0.1–300 Hz) at individual recording sites (∼1-mm craniotomy). The recording sites that were assigned to the caudal field all showed low-frequency profiles in local field potentials and the presence of low-BF neurons. Since the medial to lateral division was not investigated in this study, we did not separate neurons further into CM and CL sectors.
Measurement of the acoustic properties of signals in the marmoset ear canal.
After the completion of the electrophysiological experiments, head-related impulse responses were measured in two subjects (subjects M16s and M79u) in the awake condition using 2¹⁴-point Golay code stimulation (Zhou et al. 1992). Subjects sat normally in the primate chair. Pressure waveforms were collected using hearing aid microphones (model FG-23629-P16, Knowles Electronics) inserted into the ear canals at a depth of 5–10 mm. Signals were amplified (40 dB) using a custom-built amplifier and digitized (100-kHz sampling rate) using the TDT system. In data analysis, the impulse responses were truncated by a 512-point Hamming window to obtain the direct responses with the maximal amplitude centered around ∼2.56 ms. The time taken for the sound to travel from the loudspeaker to the position of the monkey head was ∼2.3 ms. The power spectrum of the direct responses was described as the head-related transfer function (HRTF).
Interaural level differences (ILDs) were extracted from frequency-domain signals and expressed as the decibel power difference between the right and left ear HRTFs averaged over a 2-kHz band. In the analysis, the left ear was designated as the ipsilateral ear, and positive ILDs corresponded to contralateral source locations. For controls, the free field signals were collected at the positions of the animal's ears (no monkey, two microphones pointing toward AZ −90 and 90°, respectively). Monaural spectrum and ILD information for the free field signals were compared with those of ear canal signals in the results.
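The band-averaged ILD can be sketched as follows. This minimal Python example takes precomputed power spectra for the two ears and assumes that power is averaged linearly within the 2-kHz band before conversion to decibels; the text does not state whether averaging was done in linear or decibel units, so this is an assumption, as are the function name and the example values.

```python
import math


def band_ild_db(freqs_hz, power_right, power_left, center_hz, bandwidth_hz=2000.0):
    """ILD (right minus left, in dB) averaged over one frequency band.

    freqs_hz: frequency of each spectral bin.
    power_right, power_left: linear power of the right- and left-ear HRTFs.
    Positive values correspond to more power at the right (contralateral) ear.
    """
    lo = center_hz - bandwidth_hz / 2.0
    hi = center_hz + bandwidth_hz / 2.0
    idx = [i for i, f in enumerate(freqs_hz) if lo <= f < hi]
    p_right = sum(power_right[i] for i in idx) / len(idx)
    p_left = sum(power_left[i] for i in idx) / len(idx)
    return 10.0 * math.log10(p_right / p_left)


# Illustrative spectra only, not measured HRTFs.
freqs = [17000.0, 17500.0, 18000.0, 18500.0, 19500.0]
right = [100.0, 100.0, 100.0, 100.0, 5.0]
left = [1.0, 1.0, 1.0, 1.0, 5.0]
ild = band_ild_db(freqs, right, left, center_hz=18000.0)
print(f"ILD = {ild:.1f} dB")
```

In this example the right-ear power is 100 times the left-ear power within 17–19 kHz, giving an ILD of 20 dB; the bin at 19.5 kHz falls outside the band and is ignored.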
Statistical significance tests.
The trend analysis on a given data set was based on a linear regression t-test; the R² and t-statistics of the slope were reported. Wilcoxon rank-sum tests were used to evaluate the population medians, and two-sample Kolmogorov-Smirnov tests were used to evaluate the overall distributions between two groups of data sets. An α-level of 0.05 was used for all statistical tests.
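For reference, the slope, R², and t-statistic of a simple linear regression can be computed as in the sketch below. It uses only the Python standard library and ordinary least squares with n − 2 degrees of freedom; it is not the Matlab code used in the study, and the example data are made up. Converting the t-statistic to a P value additionally requires the t-distribution, which is not included here.

```python
import math


def linregress_t(x, y):
    """Return (slope, r_squared, t_statistic) for the least-squares line y ~ x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    syy = sum((yi - my) ** 2 for yi in y)
    slope = sxy / sxx
    intercept = my - slope * mx
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    r_squared = 1.0 - sse / syy
    se_slope = math.sqrt(sse / (n - 2) / sxx)  # standard error of the slope
    return slope, r_squared, slope / se_slope


# Illustrative data only.
x_vals = [1.0, 2.0, 3.0, 4.0, 5.0]
y_vals = [2.1, 3.9, 6.2, 7.8, 10.1]
slope, r2, t_stat = linregress_t(x_vals, y_vals)
print(slope, r2, t_stat)
```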
RESULTS
Spatial selectivity of A1 and CM/CL neurons to the frontal field locations.
Data were obtained from experiments with 406 single neurons (208 neurons in A1 and 198 neurons in CM/CL, a total of 3 marmosets) that responded significantly to broadband noise stimuli at one or more speaker locations (P < 0.05 by t-test). The 25th, 50th, and 75th percentiles of SPLs tested were 30, 40, and 60 dB in A1 and 40, 50, and 60 dB in CM/CL.
Figure 2, A1–C1, shows the tonotopic mapping in the auditory cortex of three monkeys. The border between areas A1 and CM/CL was determined by the frequency reversal at the high-frequency end of A1. (See materials and methods for details on area identification.) The scatterplots show the best speaker locations of A1 neurons (Fig. 2, A2–C2) and CM/CL neurons (Fig. 2, A3–C3). A wide range of spatial selectivity was observed in both cortical areas. As shown in Table 1, a large proportion of A1 and CM/CL neurons responded maximally to speakers at contralateral locations (∼50%, AZ < 0°, contralateral) and at locations above the horizontal plane (∼60%, EL 45°). Many neurons also showed preferences for ipsilateral (∼25%, AZ > 0°, ipsilateral) or midline (∼25%, AZ = 0°) locations. Space representation is thus not strictly lateralized by hemisphere in the marmoset auditory cortex. Further analyses revealed that the overall distributions of spatial preferences were rather similar among low-, mid-, and high-frequency neurons (Table 2). We found no obvious correlation between the spatial selectivity of a neuron (in AZ or EL) and its BF (R² < 0.01, P > 0.33, in A1; R² < 0.05, P ≥ 0.04, in CM/CL).
Analysis of acoustic reflection and directional cues in the experimental setup.
Since this study used free field sound stimulation, reflections off the apparatus might interfere with the spatial information carried by the direct sound and thus affect a neuron's spatial selectivity. Of particular concern were the near-field acoustic reflections off the top of the Plexiglas primate chair and the stainless steel electrode manipulator (Fig. 1A). To address this concern, we tested free field and ear canal signals in response to Golay code stimulation with and without these two pieces of equipment after the completion of physiological experiments. Two subjects were used.
Figure 3 shows the results associated with two source locations (marked in Fig. 3A). Figure 3B shows a comparison of ear canal signals measured at the left ear (ipsilateral) of one subject (subject M79u) with and without the electrode manipulator. For the ipsilateral source location (Fig. 3B, left), reflected waves were noticed in the received signal plotted in the time domain (arrow a, black curve). This reflection enlarged the notch depth in the corresponding HRTF. Among the 15 locations tested, the results from this ipsilateral source location (EL 45°, AZ 60°) exhibited the greatest influence of reflection. Reflected waves were much weaker in the received signal emitted from the contralateral location, and the overall shape of the HRTF remained relatively unchanged (Fig. 3B, right). At both source locations, the changes in HRTFs were more prominent at frequencies above 12 kHz.
To quantify these results, we measured the SD of the difference between HRTF spectra with and without the manipulator for a given source location. The analyses were separately conducted in three frequency bands (2–12, 12–24, and 24–32 kHz) because they contained, respectively, the resonant peak, first notch (FN), and high-frequency features of marmoset HRTFs (Slee and Young 2010). The results collected by the left ear microphone are shown in Fig. 3C. Reflections caused spectral deviations of 1–6 dB. The most pronounced changes occurred over the FN range at EL 45° (Fig. 3C, blue dashed line). The results for the right ear signals (not shown) exhibited similar patterns with smaller SD magnitudes.
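The band-wise deviation measure can be sketched as follows. This minimal Python example assumes that both HRTF spectra are given in decibels on a common frequency axis, that the sample SD is used, and that band edges are half-open (so a bin at exactly 32 kHz would be excluded). These choices, the function name, and the example spectra are illustrative assumptions.

```python
import statistics

BANDS_KHZ = [(2, 12), (12, 24), (24, 32)]  # resonant peak, first notch, high-frequency features


def band_sd_db(freqs_khz, hrtf_with_db, hrtf_without_db, bands=BANDS_KHZ):
    """SD (dB) of the spectral difference with vs. without the apparatus, per band."""
    result = {}
    for lo, hi in bands:
        diff = [w - wo
                for f, w, wo in zip(freqs_khz, hrtf_with_db, hrtf_without_db)
                if lo <= f < hi]
        result[(lo, hi)] = statistics.stdev(diff)  # sample SD (an assumption)
    return result


# Illustrative spectra in dB, not measured HRTFs.
freqs = [3, 6, 9, 15, 18, 21, 26, 28, 30]
without = [0.0] * 9
with_manip = [0.0, 2.0, 4.0, 1.0, 1.0, 1.0, -3.0, 0.0, 3.0]
sd_by_band = band_sd_db(freqs, with_manip, without)
print(sd_by_band)
```

A constant offset between the two spectra, as in the middle band of this example, yields an SD of zero, so the measure is sensitive to changes in spectral shape rather than overall gain.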
Since the top plate of the primate chair was used to restrain the marmoset, it was impractical to remove the top plate and measure its effect on ear canal signals. We therefore analyzed free field signals, collected at the position of a monkey's ear, with and without the top plate. The results collected by the left ear microphone are shown in Fig. 3D. Similar to the results shown in Fig. 3B, reflected waves were more prominent in the signal received from the ipsilateral speaker at EL 45° than from the contralateral speaker at EL 0°. These reflections (Fig. 3D, left, arrow b, black curve) caused accentuated spectral notches in the signal spectrum at 5–6, 25, and 32 kHz. Figure 3E shows a plot of the SDs of the difference between HRTF spectra (with and without the top plate) collected for a given source location. For results analyzed in the three frequency bands, SDs were larger for sources located at EL 45° than at EL 0°. Because sound waves emitted from EL 45° had a larger angle of incidence than from EL 0°, there was a greater amount of sound energy reflected off the top plate. The results collected by the right ear microphone (not shown) exhibited similar patterns and SD magnitudes.
These observations prompted us to further investigate the effects of the apparatus on the directionality of signals received at the ear canals. We reasoned that directly comparing the free field (without monkey) and ear canal (with monkey) signals would reveal the contributions of room acoustics to the directional filtering of HRTFs. Since the majority of neurons we collected had BFs of >2 kHz (Fig. 2), our analyses focused on two level-related localization cues: monaural spectrum and binaural ILDs. Results for free field and ear canal signals were compared in the frequency range from 2 to 32 kHz (the bandwidth of noise stimuli used in our experiments). The top plate was present in both conditions, and results for two subjects are shown.
Figure 4A shows a plot of the monaural spectra of signals collected by the left ear microphone for source locations at EL 45° (top) and EL 0° (bottom). The color of each grid represents the signal power at a given frequency averaged across a 2-kHz range. Two observations were made. First, free field signals exhibited a spectral notch between 5 and 9 kHz across the AZ angles tested at EL 45° (Fig. 4A, top left, arrow a; also see the example shown in Fig. 3D). This notch region was not consistently observed in the collected ear canal signals (Fig. 4A, middle and right). This discrepancy makes it difficult to identify the sources of notches observed in ear canal signals (see the example shown in Fig. 3B) without knowledge of the reflective and diffractive properties of the monkey's body and head and of the experimental setup. Second, the energies of free field signals at a given frequency showed no clear AZ dependence at either elevation. In comparison, those of ear canal signals were much enhanced at the resonant frequency range (<12 kHz) and became increasingly directional toward midline ipsilateral directions between AZ 0° and AZ 60° at high frequencies (>12 kHz, Fig. 4A, arrow b). The frequency-dependent changes in the directionality of the cochlear microphonic or monaural gain (re. free field responses) have been described as the acoustic axis of the pinna (Middlebrooks and Pettigrew 1981; Phillips et al. 1982). For the two marmosets tested, the maximal monaural energies became more sharply defined toward midline at higher frequencies. Similar observations have been made in other animals [e.g., barn owl (Keller et al. 1998), bat (Jen and Chen 1988), and cat (Middlebrooks and Pettigrew 1981)].
ILDs measured in these two conditions also showed clear differences (Fig. 4B). Ear canal ILDs decreased in an orderly fashion from contralateral to ipsilateral positions at all frequency bands. The dynamic range of ear canal ILDs was larger for high frequencies (e.g., ±20 dB at 17–19 kHz) than for low frequencies (e.g., ±10 dB at 3–5 kHz). This is consistent with the previous study of marmoset HRTFs (see Fig. 7 in Slee and Young 2010). In comparison, free field ILDs at EL 0° showed a weak ∼5-dB gain (<9 kHz) toward the contralateral field due to the directionality of the microphones. At high frequencies, ILDs were weak, showing no systematic AZ dependence at either elevation.
The above analyses indicate that the AZ directionality of monaural spectrum and ILD patterns observed in ear canal signals was associated with HRTF filtering, not the apparatus. However, reflections may influence the elevation selectivity of a neuron by altering the spectral profiles of HRTFs (Fig. 3). In this study, the effects of the acoustic features of the sound field on cortical SRFs were difficult to characterize because recording and restraining devices could not be removed during experiments (e.g., top plate of the primate chair). With this limitation in mind, the following experiments emphasized the relative changes in spatial tuning properties of a neuron as a function of sound level.
Effect of sound level on the spatial tuning of individual neurons in A1 and CM/CL.
Figure 5 shows the spatial responses of four example neurons measured at two SPLs (A1 neurons in A and B and CM/CL neurons in C and D). The raster plot, rate-AZ function, and SRF of a neuron were analyzed at each SPL. The BA of a SRF is outlined in each SRF. Spatial responses of these neurons exhibited rich temporal patterns, including onset/offset (Fig. 5A) and sustained activity (Fig. 5C), consistent with the results reported in awake cats (Mickey and Middlebrooks 2003) and awake macaque monkeys (Woods et al. 2006). When the results obtained at two different SPLs were compared, the example SRFs could either expand (Fig. 5, B and D) or contract (Fig. 5C) with increasing SPLs. These level-induced modulations were not uniformly present in space and time. For example, response enhancement only occurred at the contralateral locations shown in Fig. 5B, and suppression was more evident during the onset phase of responses shown in Fig. 5A. The variability found in these individual SRFs contrasts with the widespread enhancement of onset activity found in anesthetized animals (e.g., Brugge et al. 1996).
To quantify the observed changes in cortical SRFs with increasing SPLs, we analyzed the percent distributions of changes in BA (ΔBA; Fig. 6, A and B), MD (ΔMD; Fig. 6, C and D), Ravg across 15 speaker locations (ΔRavg; Fig. 6, E and F), and AZ and EL angles of the best speaker location (ΔAZ and ΔEL; Fig. 6, G and H). Among these metrics, BA is a local measure of tuning based on responses around the preferred location, whereas MD and Ravg are two global measures of tuning based on responses at both preferred and nonpreferred locations. The measurement compared the response properties of a neuron between the lowest (SPLlow) and highest (SPLhigh) sound levels tested, e.g., ΔBA = BAhigh − BAlow. The majority of neurons (>88%) were tested with SPL increments of 20 dB or more. The 25th, 50th, and 75th percentiles of SPLlow were 30, 30, and 40 dB in A1 and 30, 40, and 50 dB in the CM/CL. The percentiles of SPL increment (SPLhigh − SPLlow) were 20, 20, and 30 dB in both A1 and CM/CL.
A total of 99 A1 neurons and 108 CM/CL neurons were tested. In both cortical areas, ∼50% of neurons showed no variation in tuning acuity (|ΔBA| ≤ 0.1, |ΔMD| ≤ 0.1) and Ravg (|ΔRavg| ≤ 5 spikes/s), whereas the remaining neurons showed either increased or decreased spatial sensitivity and excitability with increasing SPL. Moreover, many A1 and CM/CL neurons retained their spatial preferences (∼30% with |ΔAZ| = 0°, ∼50% with |ΔEL| = 0°), whereas the others showed contralateral/ipsilateral and up/down shifts. These results show that increasing SPLs evoked opposite changes in tuning width, overall excitability, and tuning preference of individual neurons in the auditory cortex of awake marmosets. The bidirectional modulation resulted in near-zero medians for all metrics examined, as shown in Fig. 6 (P > 0.17 in A1 and P > 0.31 in CM/CL by rank-sum test).
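The change metrics and the "no variation" criteria described above can be illustrated with a minimal sketch. Only the difference definition (e.g., ΔBA = BAhigh − BAlow) and the thresholds (|ΔBA| ≤ 0.1, |ΔMD| ≤ 0.1, |ΔRavg| ≤ 5 spikes/s) are taken from the text; the function names, the data layout, and the example values are hypothetical and are not the analysis code used in this study.

```python
# Thresholds quoted in the text; all names and example values are illustrative.
NO_CHANGE_THRESHOLD = {"BA": 0.1, "MD": 0.1, "Ravg": 5.0}

def level_change(metric_low, metric_high):
    """Change in a tuning metric between SPLlow and SPLhigh (high minus low)."""
    return metric_high - metric_low

def classify_change(metric_name, delta):
    """Label a change as 'no change', 'increase', or 'decrease'."""
    if abs(delta) <= NO_CHANGE_THRESHOLD[metric_name]:
        return "no change"
    return "increase" if delta > 0 else "decrease"

# Hypothetical neuron: (value at SPLlow, value at SPLhigh) for each metric.
neuron = {"BA": (0.30, 0.55), "MD": (0.60, 0.65), "Ravg": (22.0, 12.0)}
labels = {
    name: classify_change(name, level_change(low, high))
    for name, (low, high) in neuron.items()
}
```

In this made-up example, the neuron would be counted as expanding in BA, unchanged in MD, and decreasing in Ravg.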
Bidirectional changes in a tuning metric may yield a fixed group mean, but they do not warrant a stable distribution of a tuning metric; divergence and convergence could both occur. We next evaluated the level tolerance of spatial sensitivity of neurons as an ensemble. The data set included neurons shown in Fig. 6 along with neurons tested at only one SPL. Figure 7 shows the overall distributions of BA, MD, and Ravg at four SPL ranges. In A1 (solid circles), the medians of BAs (Fig. 7A) were similar at low, moderate, and high SPLs (P > 0.05 by rank-sum test). As shown by the 25–75th percentiles of the data set (vertical bars), the overall distributions of BAs were also preserved between low and high SPLs (P > 0.14 by a two-sample Kolmogorov-Smirnov test). Notably, narrowly and broadly tuned SRFs were found at each SPL group. Similar observations were made in MD (Fig. 7B) and Ravg (Fig. 7C). Medians and overall distributions of each metric were statistically indistinguishable between low and high SPLs (P > 0.05 by rank-sum test and P > 0.25 by a two-sample Kolmogorov-Smirnov test). At the lowest SPLs, responses showed slightly narrower BAs (P < 0.05) and higher Ravg (P < 0.001) relative to those measured at higher SPLs (by rank-sum test). In CM/CL (open circles), neurons had significantly smaller BAs and larger MD values than A1 neurons at moderate and high SPLs (P < 0.05 by rank-sum test) but not at low SPLs. No between-group differences were detected for any of the three metrics in terms of median (P > 0.09 by rank-sum test) and overall distribution (P > 0.05 by a two-sample Kolmogorov-Smirnov test) across SPLs.
These results indicate that the spatial acuity and excitability of neurons as an ensemble were preserved over a large dynamic range of SPL in A1 and caudal field of auditory cortex. Further analyses revealed that reductions in Ravg at higher SPLs were strongly correlated with increases in MD (R2 = 0.37 in A1 and R2 = 0.25 in CM/CL, P < 10−10) and weakly correlated with decreases in BA (R2 = 0.05 in A1 and R2 = 0.04 in CM/CL, P < 0.05). This indicates that increasing sound intensity could suppress the responses of some cortical neurons while improving their spatial tuning acuity.
While their spatial acuity was level tolerant, A1 and CM/CL neurons showed a wide range of spatial selectivity to frontal field sound locations (Fig. 2). Could the ensemble activities of these neurons also retain their spatial representation across SPLs? In a previous study of the auditory cortex of anesthetized cats, level-tolerant AZ coding was achieved through disparity analysis between the responses of “contralateral” and “ipsilateral” channels (Stecker et al. 2005). Here, we divided the neurons shown in Fig. 7 into three groups: contralateral, midline, and ipsilateral based on the AZ angles of their best speaker locations. For each neuron, we extracted its peak normalized rate-AZ tuning functions collected at EL 0° and EL 45° and then calculated the ensemble average of the tuning functions of neurons within the same AZ and SPL group at these two elevations. Figure 8 shows the results of A1 (A) and CM/CL (B) neurons collected in four SPL ranges.
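The peak normalization and ensemble averaging described above can be sketched as follows. This is a simplified illustration under stated assumptions: each rate-AZ function is a list of firing rates sampled at the same AZ angles, neurons have already been assigned to an AZ and SPL group, and the function names and example rates are hypothetical.

```python
def peak_normalize(rates):
    """Scale a rate-AZ function so that its maximum equals 1."""
    peak = max(rates)
    if peak <= 0:
        raise ValueError("peak rate must be positive to normalize")
    return [r / peak for r in rates]

def ensemble_average(tuning_functions):
    """Average peak-normalized tuning functions point by point across neurons."""
    normalized = [peak_normalize(f) for f in tuning_functions]
    n = len(normalized)
    return [sum(values) / n for values in zip(*normalized)]

# Two hypothetical neurons from one AZ/SPL group, sampled at five AZ angles.
group = [
    [10.0, 20.0, 40.0, 30.0, 20.0],
    [4.0, 8.0, 6.0, 2.0, 2.0],
]
average_curve = ensemble_average(group)
```

Normalizing before averaging keeps neurons with high firing rates from dominating the group curve, so the average reflects tuning shape rather than absolute rate.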
In A1, more midline neurons were observed relative to contralateral and ipsilateral neurons at very low SPLs (≤20 dB), whereas more contralateral neurons were observed at higher SPLs. In CM/CL, contralateral neurons dominated at all SPLs. Due to limited data points at the very low SPLs, we could not reliably assess the significance of this distinction. When the results of the three AZ groups were compared (Fig. 8, left, middle, and bottom), the average tuning curves of midline neurons had a closed shape, whereas those of contralateral and ipsilateral neurons were half open and peaked at lateral AZ angles. On average, AZ tuning curves were modulated at a depth of 30–50% (the difference between the peak to trough of a curve) at both elevations, and their overall shape did not change drastically between low and high SPLs (30–80 dB). These results show that the ensemble responses of neurons stayed tuned across a large dynamic range of SPLs in the auditory cortex of marmoset monkeys. This applied to neurons with contralateral, midline, or ipsilateral preferences in both A1 and CM/CL.
Two-source stimulation revealed broadly distributed suppression in cortical SRFs.
Compared with results collected in the anesthetized condition (e.g., Brugge et al. 1996; Middlebrooks et al. 1998), one notable difference is that many neurons collected in the awake condition decreased their tuning widths and Ravg with increasing SPL (Fig. 6). An intracellular study has shown that the nonmonotonic rate level characteristics of cortical neurons are mediated by increasing strengths of synaptic inhibition at high sound levels (Tan et al. 2007). However, the spatial location selectivity of synaptic inhibition is largely unknown in the literature.
Here, we examined the level tuning of response suppression at the excitatory center and far region of a SRF using a pair of lead-lag sounds (S1 and S2). The strength of suppression was inferred from reductions in neural responses to the lagging S2, i.e., forward suppression. Since S1 and S2 did not overlap in time, we reasoned that forward suppression would be solely attributed to properties of the leading S1 and not to acoustic smearing or an interaction between S1 and S2 at the ear. In experiments, test stimulus S1 was 100-ms broadband noise, similar to those used for characterizing SRFs. Probe stimulus S2 was either a 100-ms BF tone or BF-centered band-pass noise or broadband noise, whichever evoked significant excitatory responses. S1 was played from the center and surround locations of a SRF (denoted as center, near, and far locations) from −10- to 80-dB SPL, and S2 was played immediately after the offset of S1 from the center location at a fixed SPL. (See details of the experimental design in materials and methods.)
Figure 9 shows the results of two example neurons. The first example was collected from CM/CL (the same neuron as shown in Fig. 5D), whose SRF expanded with increasing sound level and exhibited a broad spatial selectivity to frontal field locations at 50 dB (Fig. 9A). Figure 9B shows raster plots of its response to sequentially presented S1 and S2. S1-evoked excitation (0–100 ms, light gray) and forward suppression (100–200 ms, dark gray, relative to the S2-alone response shown at the top) can be seen at all three S1 locations tested (center, near, and far, as marked in Fig. 9A). To examine the level dependence of S1-evoked excitation, we calculated the increase of the discharge rate during S1 relative to the spontaneous rate of the neuron [i.e., R(S1) − spont]. For S1-evoked forward suppression, we calculated the reduction of discharge rates during S2 with and without the preceding S1 [i.e., R(S1,S2) − R(S2)]. Figure 9C shows that the magnitudes of excitation and forward suppression generally increased with the S1 level. Note that excitatory responses were not always followed by suppression (e.g., responses to S1 at 30 dB at the center and near locations; Fig. 9C), suggesting that habituation of the postsynaptic response may not contribute to the observed forward suppression.
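The two response measures defined above, R(S1) − spont for excitation and R(S1,S2) − R(S2) for forward suppression, can be written out as a minimal sketch. The function names and all firing rates below are hypothetical; only the two difference formulas come from the text.

```python
def excitation(rate_s1, spont_rate):
    """S1-evoked excitation: R(S1) - spont, in spikes/s."""
    return rate_s1 - spont_rate

def forward_suppression(rate_s2_after_s1, rate_s2_alone):
    """S1-evoked forward suppression: R(S1,S2) - R(S2).

    Negative values mean the S2 response was reduced by the preceding S1.
    """
    return rate_s2_after_s1 - rate_s2_alone

# Hypothetical rates (spikes/s) at three S1 levels (dB SPL).
spont = 5.0
r_s2_alone = 40.0
r_s1 = {30: 25.0, 50: 45.0, 70: 60.0}
r_s2_after_s1 = {30: 40.0, 50: 28.0, 70: 10.0}

exc = {level: excitation(rate, spont) for level, rate in r_s1.items()}
sup = {level: forward_suppression(rate, r_s2_alone)
       for level, rate in r_s2_after_s1.items()}
```

With these made-up numbers, S1 at 30 dB evokes excitation without any forward suppression, whereas higher S1 levels produce progressively larger reductions of the S2 response.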
Forward suppression was also found in the absence of excitation. This was the case for the second example neuron, which responded exclusively to ipsilateral source locations (Fig. 9D). For the sequential presentation, the spiking activity to S1 peaked at 30-dB SPL and then decreased to zero at higher S1 levels at both center and near locations (Fig. 9E), showing nonmonotonic rate level dependences. Interestingly, forward suppression persisted even when S1 did not elicit spiking activity between 60- and 80-dB SPL. S1-evoked forward suppression was also seen at the far location (Fig. 9E), where S1 evoked only very weak spiking activity at the SPLs tested. Because inhibitory synaptic events are almost exclusively triggered by stimulus onset (Scholl et al. 2010), the nonresponsiveness of this neuron at high SPLs may be ascribed to sustained “silent” suppression evoked by S1, not to a lack of excitation or to separate offset-sensitive inhibitory input. Figure 9F shows the quantitative measurements of these responses. Among the neurons tested, 33% in A1 (23/70 neurons) and 46% in CM/CL (25/54 neurons) exhibited persistent silent suppression at far locations [re. R(S2), P < 0.05 by t-test] at one or more SPLs, despite no significant excitatory responses to S1 (re. spont. rate, P > 0.05 by t-test) between −10- and 80-dB SPL.
The results of these two example neurons show that the strength of forward suppression was more prominent at high S1 levels at both center and far regions of SRFs, despite different spatial extents of excitation in their SRFs and different characteristics of rate level tuning of excitation. To evaluate the generality of these findings, we compared the general trends of level dependences of excitation and suppression across the three spatial configurations based on the responses of a population of neurons. Figure 10 shows the population average of the strengths of excitation and forward suppression measured in the three spatial configurations (n = 70 in A1 and n = 54 in CM/CL). The data show that the magnitudes of excitation (Fig. 10, A and C) at the center and near locations were much higher than those measured at the far locations at SPLs above 20 dB in both A1 and CM/CL. Note that since the results were averaged across neurons, the shape of the rate level tuning curve of individual neurons influences that of their average. In this experiment, the majority of neurons showed nonmonotonic rate level tuning to S1 played from the center location (70%, 49/70 neurons in A1; and 72.2%, 39/54 neurons in CM/CL). For these neurons, their rate level tunings peaked at SPLs less than the highest SPL tested (80 dB). In both cortical areas, the distributions of the so-called best level peaked at 30- to 50-dB SPL (31% in A1 and 44% in CM/CL) and 80-dB SPL (30% in A1 and 28% in CM/CL). The nonuniform distribution of best level explained, to some extent, the two-peak profiles seen in the averaged tuning of excitation shown in Fig. 10, A and C. In contrast to excitation, the tuning of forward suppression increased with S1 level for the three spatial configurations in both A1 and CM/CL (Fig. 10, B and D). The monotonic increment of suppression in population average is consistent with the observations made in individual neurons (Fig. 9, C and F).
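The notions of best level and nonmonotonic rate level tuning used above can be illustrated with a short sketch. It assumes the simplest reading of the text, namely that a neuron is nonmonotonic when its rate peaks at an SPL below the highest SPL tested; any additional criteria used in the actual analysis (e.g., a minimum decline from the peak) are not represented, and the names and rates are hypothetical.

```python
def best_level(rate_by_level):
    """Sound level (dB SPL) at which the firing rate is highest."""
    return max(rate_by_level, key=rate_by_level.get)

def is_nonmonotonic(rate_by_level):
    """True when the rate peaks below the highest level tested."""
    return best_level(rate_by_level) < max(rate_by_level)

# Hypothetical rate level functions (dB SPL -> spikes/s).
nonmonotonic_neuron = {10: 5.0, 30: 42.0, 50: 30.0, 80: 8.0}
monotonic_neuron = {10: 2.0, 30: 10.0, 50: 25.0, 80: 40.0}
```

Averaging a mixture of such neurons, some peaking near 30–50 dB and others at 80 dB, is what would produce a two-peak population profile of the kind described for excitation.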
The contrast in the spatial extent of excitation and suppression also applied to their overall strengths measured based on the absolute firing rates of neurons. In the data analyses, we estimated the overall strength of S1-evoked excitation at a location by summing the firing rates in a tuning curve of a neuron [R(S1) − spont] between −10- and 80-dB SPL. To ensure that only valid response suppression was counted, we estimated the overall strength of forward suppression at a location by summing the rates in a tuning curve [R(S1,S2) − R(S2)], which were significantly lower than zero (P < 0.05 by t-test). The results of all neurons collected at the same spatial configuration (e.g., center) were then averaged. This analysis allowed us to compare the overall levels of excitation and forward suppression across the spatial configurations.
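The summation described above can be sketched as follows. The sketch assumes that the per-level t-test P values have already been computed elsewhere and are passed in; the function names, the dictionary layout, and the example values are hypothetical.

```python
def overall_excitation(exc_by_level):
    """Sum R(S1) - spont across all S1 levels tested at one location."""
    return sum(exc_by_level.values())

def overall_suppression(sup_by_level, p_by_level, alpha=0.05):
    """Sum R(S1,S2) - R(S2) over levels where it is significantly below zero.

    p_by_level holds precomputed P values (assumed to come from a t-test).
    """
    return sum(
        value
        for level, value in sup_by_level.items()
        if value < 0 and p_by_level[level] < alpha
    )

# Hypothetical tuning curves at one location (dB SPL -> spikes/s).
exc_curve = {-10: 0.0, 20: 15.0, 50: 30.0, 80: 10.0}
sup_curve = {-10: -1.0, 20: -4.0, 50: -12.0, 80: -20.0}
p_values = {-10: 0.60, 20: 0.20, 50: 0.01, 80: 0.001}

total_exc = overall_excitation(exc_curve)
total_sup = overall_suppression(sup_curve, p_values)
```

Here only the reductions at 50 and 80 dB pass the significance criterion, so the nonsignificant dips at lower levels do not contribute to the suppression total.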
Figure 10, E and F, shows the population average of results at each of the three spatial configurations (means ± SE). The strengths of excitation were much reduced at the far locations relative to those at the center and near locations (P < 0.001 in A1 and P < 10−7 in CM/CL), whereas the strengths of forward suppression exhibited no significant differences among three configurations (P > 0.3 in A1 and P > 0.5 in CM/CL by rank-sum test). Taken together, these data indicate that the cortical spatial responses are modulated by suppression mechanisms. While the strengths of excitation are much reduced at the far regions of SRFs, the strengths of suppression increase with stimulus level at both the center and far regions of SRFs.
This study investigated the spatial response properties of auditory cortex neurons in the awake marmoset. The results were evaluated in comparative terms across experimental conditions with respect to sound level and sound location. The main findings were threefold. First, space representation is not strictly lateralized by hemisphere in the marmoset auditory cortex. Neurons in A1 and CM/CL fields showed a broad spatial selectivity to frontal field sound locations (Fig. 2). Second, the spatial tuning of individual neurons could expand, contract, or change little with increasing sound levels (Fig. 6), whereas the spatial acuity and spatial tuning of neurons as an ensemble remained level tolerant (Figs. 7 and 8). Finally, although the strength of excitation was much reduced at the far regions of SRFs, the strength of suppression increased with sound level at both center and far regions of SRFs (Fig. 10). Together, these findings suggest that the spatial selectivity of neurons in the auditory cortex of marmosets is modulated by suppression mechanisms, which may play an important role in limiting the sizes of SRFs at high sound levels.
Spatial selectivity of neurons in the marmoset auditory cortex and relation to previous work.
Interaural timing and level differences and monaural spectral cues for sound localization are first extracted by neurons in the brain stem (for a review, see Irvine 1992; Oertel and Young 2004; Yin 2002; Young and Davis 2002). In the auditory cortex, changes in the spatial location of sound greatly modulate neural responses in multiple cortical areas (Middlebrooks et al. 2002). In this study, we used broadband noise stimuli (2–32 kHz) delivered in free field to study the spatial sensitivity of cortical neurons. According to marmoset HRTFs (Slee and Young 2010), the main spatial cues within this frequency range are ILDs for encoding AZ and the spectral shape of the HRTF magnitude for encoding EL. As shown in Fig. 4B, the dynamic ranges of ILDs measured in this study were similar to those previously reported in marmosets (cf. Fig. 5 in Aitkin and Park 1993; Fig. 7 in Slee and Young 2010). Compared with other mammalian species, the percentage of contralateral-preferring neurons found in the awake marmoset (∼50%; Table 1) closely matched those reported in A1 of anesthetized cats (Rajan et al. 1990; Samson et al. 2000) and was lower than those reported in awake cats [∼83% (Mickey and Middlebrooks 2003)] and awake macaque (Woods et al. 2006). It is possible that contralateral neurons that peaked outside the frontal field were not properly counted in this study, and their numbers could be substantial as shown by the two previous awake studies, both of which sampled a full 360° AZ plane (Mickey and Middlebrooks 2003; Woods et al. 2006). We are uncertain whether the directionality of the pinna also contributes to the observed difference. Similar to those observed in the marmoset (Fig. 4), the acoustic axis of the pinna points to the frontal ipsilateral sound field in both cats (Middlebrooks and Pettigrew 1981; Phillips et al. 1982) and macaques (Spezio et al. 2000).
The upward dominance of EL selectivity has been reported in the anterior ectosylvian sulcus and secondary auditory cortex areas of anesthetized cats at near-threshold levels (Fig. 4 in Xu et al. 1998), but relatively few studies have characterized the elevation selectivity of neurons in the auditory cortex of awake animals. In this study, >50% of A1 and CM/CL neurons showed preferences above the horizontal plane in response to broadband stimuli delivered from the upper frontal hemifield. Because reflections from the primate chair and recording instrument we used influenced the amplitude and frequency of spectral notches of ear canal and free field signals to varying degrees (Fig. 3), we are uncertain to what extent the observed EL selectivity was caused by reflections in the apparatus. Previous HRTF measurements in marmosets have shown that FN frequency monotonically increases with EL at lateral source positions with AZ >60–80°. In the frontal field and on the contralateral side, FN frequency shows constant or disordered relationships with AZ and EL (cf. Fig. 9 in Slee and Young 2010). This imposed further difficulties in interpreting the observed EL selectivity in the frontal field based on FN frequency, especially when speculated spectral artifacts caused by acoustic reflections of the apparatus were not consistently observed in ear canal signals (Fig. 4A).
Mechanisms of level-invariant space coding in the auditory cortex.
Although lesion studies have shown that spatially oriented behaviors during sound localization require an intact auditory cortex (Heffner and Heffner 1990; Jenkins and Merzenich 1984; Thompson and Cortez 1983), the reliability of spatial functions of the auditory cortex is not fully understood. One major issue is that focal representations of space by cortical neurons are mostly observed at near-threshold levels in anesthetized preparations (Brugge et al. 1996; Middlebrooks et al. 1998), reminiscent of those observed in the inferior colliculus (Semple et al. 1983). At high stimulus levels, SRFs generally expand in width, and their spatial acuity and boundaries can no longer be reliably defined, in contrast to level-robust perceptual performance (Sabin et al. 2005). Stecker and colleagues (2005) proposed that this level issue could be resolved by using disparity information created between two “opponent” neural populations composed of contralateral and ipsilateral units within each hemisphere. Their analyses showed that although the tuning of individual populations broadened with increases of stimulus level, the difference in tuning between two populations remained unchanged, providing sufficient information about source AZ at both low and high SPLs (Stecker et al. 2005).
Spatial responses measured in the awake condition differed markedly from those measured in the anesthetized condition. We found that spatial tuning widths of individual neurons in A1 and CM/CL of awake marmosets do not obey a fixed relationship with stimulus level: expansion and contraction of SRFs were both observed along with those showing no change (Fig. 6). These results are consistent with findings in the auditory cortex of awake cats (Mickey and Middlebrooks 2003). Additionally, the distribution of spatial tuning width remained unchanged between low and high SPLs (Fig. 7A), consistent with findings in the auditory cortex of awake macaques (cf. Fig. 8, with the exception of one subject, in Woods et al. 2006). Similarly, the Ravg of neurons as a population changed little between low and high SPLs (Fig. 7C), whereas that of individual neurons could either increase or decrease (Fig. 6, E and F).
These results depict an interesting relationship between the variability of responses of individual neurons versus the stability of their ensemble responses in the auditory cortex. In this regime, the changes in the spatial sensitivity of individual neurons are not random; SRF expansion is offset by SRF contraction with a zero net gain (i.e., near-zero medians in Fig. 6). As such, the ratio of narrowly and broadly tuned SRFs remains roughly the same across SPLs (as shown in percentiles in Fig. 7A). In our analysis, the averaged AZ tuning curves of contralateral-, midline-, and ipsilateral-preferring neurons remained tuned between 30- and 80-dB SPLs (Fig. 8), indicating that level-tolerant AZ responses are not limited to hemispherical channels.
One apparent limitation of this study is that the sampling space is not complete. The sampling issue was recently addressed by Kuwada and colleagues (2011) in studying AZ coding in the inferior colliculus of unanesthetized rabbits. They reported that the top 35% of neurons showed level-tolerant tuning within ±150° ranges of AZs. It remains to be tested whether the population activity in the auditory cortex shows level-tolerant tuning in front/back and up/down dimensions and whether the general principles of the opponent channel theory promote level-invariant space coding along specific spatial dimensions.
Suppression mechanisms underlying cortical spatial selectivity.
In contrast to previous findings in the anesthetized condition (e.g., Brugge et al. 1996; Middlebrooks et al. 1998), neurons collected in the awake condition could decrease their tuning widths and excitability with increasing SPL (Fig. 6 in this study and Fig. 13 in Mickey and Middlebrooks 2003), suggesting the involvement of inhibitory/suppressive activity in cortical SRFs. This study investigated the suppression mechanisms using a modified “forward-masking” paradigm (Brosch and Schreiner 1997; Calford and Semple 1995). We did not estimate the strength of suppression during the presentation of the test stimulus (by simultaneously presenting S1 and S2) due to concerns about acoustic interactions of sound waves at the ear canal, as they could evoke the perception of “summing localization” (Blauert 1997). The results showed that although excitatory spiking activity was substantially reduced at the far regions of SRFs, the level dependence and overall strength of forward suppression were similar in results obtained at the center and far regions of SRFs (Fig. 10). This observation suggests that suppressive activity on average is more broadly tuned than excitatory activity in cortical SRFs. Another noteworthy feature of our results is the silent suppression observed at some SPLs and/or locations (Fig. 9D). Because inhibitory synaptic events are almost exclusively triggered by stimulus onset (Scholl et al. 2010), it is unlikely that the observed forward suppression was driven by separate offset-sensitive inhibitory input. One plausible explanation is that the observed silent suppression was governed by inhibitory synaptic activity, which reduced the sizes of SRFs at certain sound levels.
Nevertheless, the strength of suppression measured in extracellular studies is not equivalent to that of synaptic inhibition and could be affected by membrane adaptation mechanisms. A complete survey of the structures of cortical SRFs requires knowledge of the spatial tuning of synaptic excitation and inhibition using intracellular techniques. Although intracellular work has revealed prominent inhibitory activity underlying frequency (Kaur et al. 2004; Tan et al. 2004; Wehr and Zador 2003; Zhang et al. 2003) and intensity (Tan et al. 2007; Wehr and Zador 2003) selectivity of neurons in A1 of anesthetized rats, information about spatial tuning profiles of synaptic inputs is limited in the literature (Chadderton et al. 2009), and synaptic inhibitory activity in cortical SRFs has not been reported. It is unknown whether the feature selectivity of cortical inhibition differs between spatial and nonspatial aspects of sound analyses.
Comparison of spatial processing in A1 and caudal field.
Based on the differential distributions of spatial and spectral tuning widths and monkey call selectivity among cortical areas (Rauschecker et al. 1995; Recanzone 2000; Tian et al. 2001), researchers have suggested that the central auditory system is divided into at least two separate streams: the rostral-ventral (“what”) and caudal-dorsal (“where”) pathways emerging after A1 (Kaas and Hackett 2000; Recanzone and Cohen 2010; Romanski et al. 1999). It has been proposed that spatial information of sound is further analyzed and better represented by neurons in the caudal field in primate species.
In this study, the caudal field neurons exhibited better spatial acuity than A1 neurons (Fig. 7), in agreement with previous findings in awake macaques (Woods et al. 2006). Although these results favor the hypothesis that the caudal field is more suited to space coding, the spatial sensitivity of CM/CL neurons did not differ substantially from that of A1 neurons. In particular, the overall distributions of BA, MD, and Ravg were markedly similar between the two cortical fields (Fig. 7), and strong suppressive modulations were found in SRFs of both A1 and CM/CL neurons (Fig. 10).
Whether spatial sensitivity of neurons in A1 and caudal field would differ in an active listening situation is not addressed in this study. Compelling anatomic and physiological evidence indicates that the caudal field is a site of multisensory convergence, where visual influences are mostly suppressive at the level of single neurons (de la Mothe et al. 2006; Kayser et al. 2009; Schroeder and Foxe 2002; Smiley et al. 2007). It remains to be tested whether suppression as shown in this study serves to improve space coding in a spatial task involving a multisensory experience.
This work was supported by National Institute on Deafness and Other Communication Disorders Grants DC-03180 and DC-005808.
No conflicts of interest, financial or otherwise, are declared by the author(s).
Author contributions: Y.Z. and X.W. conception and design of research; Y.Z. performed experiments; Y.Z. analyzed data; Y.Z. and X.W. interpreted results of experiments; Y.Z. prepared figures; Y.Z. drafted manuscript; Y.Z. and X.W. edited and revised manuscript; Y.Z. and X.W. approved final version of manuscript.
The authors thank J. Estes and N. Sotuyo for assistance with animal care and E. Issa, S. Slee, E. Young, M. Jeschke, D. Gamble, and E. Remington for discussions on results. B. May and S. Slee provided invaluable expertise on HRTF measurements in marmosets. The authors give special gratitude to N. Sotuyo for providing drawings of the experimental setup.
Copyright © 2012 the American Physiological Society