J Neurophysiol 92: 2051-2070, 2004; doi:10.1152/jn.01235.2003
0022-3077/04 $5.00

A Neuronal Correlate of the Precedence Effect Is Associated With Spatial Selectivity in the Barn Owl's Auditory Midbrain

Matthew W. Spitzer, Avinash D. S. Bala and Terry T. Takahashi

Institute of Neuroscience, University of Oregon, Eugene, Oregon 97403

Submitted 19 December 2003; accepted in final form 13 May 2004


    ABSTRACT
Sound localization in echoic conditions depends on a precedence effect (PE), in which the first arriving sound dominates the perceived location of later reflections. Previous studies have demonstrated neurophysiological correlates of the PE in several species, but the underlying mechanisms remain unknown. The present study documents responses of space-specific neurons in the barn owl's inferior colliculus (IC) to stimuli simulating direct sounds and reflections that overlap in time at the listener's ears. Responses to 100-ms noises with lead-lag delays from 1 to 100 ms were recorded from neurons in the space-mapped subdivisions of IC in anesthetized owls (N2O/isoflurane). Responses to a target located at a unit's best location were usually suppressed by a masker located outside the excitatory portion of the spatial receptive field. The least spatially selective units exhibited temporally symmetric effects, in that the amount of suppression was the same whether the masker led or lagged. Such effects mirror the alteration of localization cues caused by acoustic superposition of leading and lagging sounds. In more spatially selective units, the suppression was often temporally asymmetric, being more pronounced when the masker led. The masker often evoked small changes in spatial tuning that were not related to the magnitude of suppressive effects. The association of temporally asymmetric suppression with spatial selectivity suggests that this property emerges within IC, and not at earlier stages of auditory processing. Asymmetric suppression reduces the ability of highly spatially selective neurons to encode the location of lagging sounds, providing a possible basis for the PE.


    INTRODUCTION
Localizing sounds in a natural environment requires an ability to deal with spurious directional information conveyed by acoustic reflections, or echoes. Our ability to do so is thought to depend on a precedence effect (PE), whereby the first arriving sound dominates perception of later arriving reflections (Haas 1951; Hartmann 1983; Wallach et al. 1949). Recent human psychophysical findings (Litovsky and Shinn-Cunningham 2001) suggest that the PE encompasses at least 2 distinct phenomena. At delays of a few ms, subjects experience fusion of leading and lagging sounds into a single perceptual event (reviewed in Blauert 1997). Over a wider range of delays, subjects experience localization dominance, in which leading and lagging sounds are localized to a position near the leading source (Litovsky and Shinn-Cunningham 2001; Wallach et al. 1949). Concurrently, subjects experience an impaired ability to detect changes in the location of the lagging source, and a smaller effect for the leading source (Litovsky and Macmillan 1994; Litovsky and Shinn-Cunningham 2001; Perrott et al. 1989; Zurek 1980), which have been termed discrimination suppression (Litovsky et al. 1999). The difference in time courses suggests that perceptual fusion results from different mechanisms than those responsible for localization dominance and discrimination suppression.

Behavioral studies have provided evidence of localization-dependent precedence phenomena in several animal species, including barn owls. Several animal studies have measured lateralization of sources placed symmetrically about the midline (Cranford 1982; Keller and Takahashi 1996b; Kelly 1974; Wyttenbach and Hoy 1993). At short delays, lateralization judgments correspond to the side of the leading sound. As the delay is increased, judgments become evenly distributed on the 2 sides, suggesting that the lagging sound becomes separately localizable. This conclusion is supported by results of a recent study measuring localization of paired sources using eye movements in cats (Tollin and Yin 2003). At lead-lag delays from 400 µs to 10 ms, subjects oriented toward the leading sound. At longer delays, subjects localized the lagging sound on some trials. Finally, a recent study from our laboratory demonstrated spatial discrimination suppression in barn owls (Spitzer et al. 2003). As in humans, leading sounds had a large effect on the ability to detect changes in the location of lagging sounds, and lagging sounds had a smaller effect on spatial acuity for leading sounds.

The neuronal basis of the PE is not well understood. Studies of central auditory structures in a variety of species have demonstrated that leading sounds suppress responses of spatially sensitive neurons to lagging sounds, providing a neuronal correlate of the PE (e.g., Fitzpatrick et al. 1995; Keller and Takahashi 1996b; Mickey and Middlebrooks 2001; Yin 1994). Although the PE occurs for a wide range of signals, including speech and continuous noise, most previous physiological studies, including that in the barn owl (Keller and Takahashi 1996b), have focused primarily on neuronal responses to clicks or sounds with durations of a few ms. The use of transient stimuli potentially offers 2 major advantages: 1) the ability to separate neuronal responses to leading and lagging sounds, and 2) the ability to clearly visualize neuronal interactions in the absence of confounding effects of acoustic superposition of leading and lagging sounds. (In practice, these advantages may be compromised by the response times of the acoustic transducers and peripheral auditory filters.) In natural listening situations, however, the delay between the primary signal and its reflection will often be shorter than the signal duration, resulting in substantial temporal overlap of leading and lagging waveforms. The resulting acoustic superposition of leading and lagging sounds at the subject's ear causes degradation of the directional cues to each source. Consequently, it is expected that effects of the leading sound on responses of spatially sensitive neurons to the lagging sound will be confounded by additional masking effects when the sounds overlap. Such effects would not have been apparent in previous studies using shorter stimuli.
To be generally applicable to the variety of sounds and reverberant conditions encountered in natural environments, an understanding of the neuronal mechanisms of the PE must therefore extend to situations in which the sound duration is longer than the lag delay. As a step in this direction, the present study documents neuronal responses to pairs of leading and lagging sounds with durations of 50 and 100 ms, and lead-lag delays ranging from 1 to 200 ms.

The physiological mechanisms of lead-evoked suppression remain controversial. Several authors have proposed inhibitory mechanisms to explain both behavioral (Harris et al. 1963; Lindemann 1986; Zurek 1987) and neuronal (Fitzpatrick et al. 1995; Yin 1994) effects. In mammals, the spatial dependence of lead effects suggests that a substantial component of these interactions may be mediated by inhibitory processes in binaural brain stem nuclei (Litovsky and Delgutte 2002). On the other hand, recent modeling studies have demonstrated that, particularly at low frequencies, both behavioral and neurophysiological effects occurring at delays of a few milliseconds could result from interactions of leading and lagging sounds at early stages of auditory processing that do not involve neuronal inhibition (Hartung and Trahiotis 2001; Tollin 1998; Trahiotis and Hartung 2002). If such mechanisms were responsible for the suppression of neuronal responses to lagging sounds, this effect should be evident at the initial site of binaural interaction and at all subsequent processing stages. To elucidate the neuronal mechanisms of lead-evoked suppression it will be necessary to determine the site at which such interactions first become apparent within the ascending auditory pathways.

The lateral subdivisions of the barn owl's IC contain a neuronal map of auditory space. Within these structures, single-peaked auditory spatial receptive fields (SRFs) are generated through the combination of inputs from binaural neurons with highly ambiguous spatial tuning (Konishi 2003). This process seems to involve a gradual series of processing stages (Mazer 1995), resulting in a continuous distribution of spatial selectivity within the lateral shell subdivision of the IC core (ICc-ls) and the external nucleus of IC (ICx). In the present study, we examined the distribution of lead-dependent spatial masking effects, which are functionally analogous to the lead-evoked suppression observed in previous studies, among a population of IC neurons at varying levels of spatial processing. The results demonstrated an association between lead-dependent effects and spatial selectivity, suggesting that a neuronal correlate of the PE is generated in parallel with refinement of spatial selectivity within the barn owl's IC.


    METHODS
Physiological recording

All procedures conformed to National Institutes of Health guidelines for care and use of laboratory animals and were approved by the Institutional Animal Care and Use Committee of the University of Oregon. The subjects were 3 adult barn owls (Tyto alba) from a captive breeding colony at the University of Oregon. Experiments were conducted using a chronic preparation for recording in anesthetized owls that was described previously (Euston and Takahashi 2002). Before use in experiments, each owl had a head plate and 2 recording wells attached to its skull under isoflurane anesthesia. After recovery from surgery, the 3 subjects were returned to a flight cage in an owl colony where they were housed together for the duration of their experimental use. For neurophysiological recording, anesthesia was induced by intramuscular injection of ketamine (22 mg/kg) and valium (5.6 mg/kg), and maintained with N2O/O2 (25 to 40%) supplemented by isoflurane (0.125 to 1%), as needed. Recording sessions had a maximum duration of 12 h. Each subject was used in several recording sessions (Owl 719: 21 sessions; Owl 883: 6 sessions; Owl 916: 8 sessions), with a minimum recovery period of 10 days between sessions.

Neurophysiological recordings were conducted in a sound-attenuating chamber (IAC). The owl's head was stabilized by a holder attached to the chronically implanted headplate and its body supported by a heating pad. Before each recording session, the lid of one recording well was removed and the interior of the well was cleaned with a 0.25% mixture of chlorhexidine in sterile saline. In the first session, the portion of the skull underlying each recording well was excised to permit introduction of recording electrodes. Single-unit recordings were obtained using glass-coated tungsten microelectrodes with impedances at 1 kHz from 1 to 12 MΩ and exposed tip lengths of 5 to 20 µm. Electrodes were introduced through the forebrain and advanced ventrally toward the stereotaxic coordinates of the auditory midbrain using a stepping motor microdrive (µD-500, Power Technologies). The signal recorded by the electrode was fed to an oscilloscope and audio monitor to permit detection of stimulus-evoked activity. An interactive graphical user interface (BCLab running in Matlab v. 5.3, The Mathworks) allowed the experimenters to select virtual stimulus locations while searching for responsive units. Single-unit action potentials were isolated either by level triggering or through the use of a template matching spike sorter (Alpha-Omega MSD). Typical recording sessions involved 1 to 3 electrode penetrations. At the end of the recording session the well was rinsed with sterile saline and the lid was replaced. After recording sessions the owl was kept in an isolated recovery chamber until it recovered from anesthesia, at which point it was returned to its flight cage.

Stimulus presentation

Virtual auditory space (VAS) stimuli were generated as described previously (Keller et al. 1998) using each subject's own head-related transfer functions (HRTFs). HRTFs were band-pass filtered between 2 and 12 kHz, converted from frequency to time domain representations by inverse Fourier transformation (30-kHz sampling rate), and stored digitally as 255-point (8.5-ms) finite impulse response filters. Two sets of binaural HRTF measurements were obtained for each subject. The first set sampled 617 locations spanning the frontal hemifield, at a spacing of 5° in azimuth and elevation in double polar coordinates. The second set sampled the following regions of space in 1° increments: –20 to 20° azimuth at elevations –20, –10, 10, and 20° at 0° elevation; –40 to 40° elevation at 0° azimuth.

During physiological recording, sounds were presented using a dichotic delivery system with foam insert earphones (model ER-1, Etymotic Research, Elk Grove Village, IL). Stimulus waveforms were generated digitally, and typically consisted of broadband noises with flat (±1 dB) spectra from 2 to 12 kHz, durations of 50 or 100 ms, and 2.5-ms cosine ON and OFF ramps. VAS stimuli were generated by real-time convolution of the stimulus waveform with the HRTFs for the appropriate ears and location (PD1 Power DAC, Tucker Davis Technologies, Gainesville, FL). To generate combinations of leading and lagging sounds, a sequence of zeros, with length corresponding to the lag delay, was concatenated to the end (leading sound) or beginning (lagging sound) of a single-noise waveform. The leading and lagging waveforms were then convolved with the HRTFs for the appropriate locations, and the filtered waveforms were added. Digitally processed waveforms were converted to analog voltage at 30-kHz sampling rate (PD1, Tucker Davis), attenuated (PA4, Tucker Davis) and amplified (HB6, Tucker Davis) before earphone presentation. All stimuli were presented at 52 dB SPLA, which was typically 25 to 35 dB above the response threshold of space-specific neurons, measured at their best locations.
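The construction of a lead-lag pair described above (zero padding, filtering, then summation) can be sketched for a single ear as follows. This is only an illustrative sketch, not the authors' stimulus software: the function names are hypothetical, plain Gaussian noise stands in for the 2 to 12 kHz flat-spectrum noise, and the two impulse responses are placeholders for HRTF-derived filters of equal length.

```python
import math
import random


def cosine_ramped_noise(duration_ms, fs_hz=30000, ramp_ms=2.5, seed=0):
    """Gaussian noise burst with cosine onset and offset ramps.

    Assumption: unfiltered Gaussian noise replaces the band-limited
    (2-12 kHz) noise described in the text.
    """
    rng = random.Random(seed)
    n = int(round(duration_ms * fs_hz / 1000))
    n_ramp = min(int(round(ramp_ms * fs_hz / 1000)), n // 2)
    samples = [rng.gauss(0.0, 1.0) for _ in range(n)]
    for k in range(n_ramp):
        gain = 0.5 * (1.0 - math.cos(math.pi * k / n_ramp))
        samples[k] *= gain            # onset ramp
        samples[n - 1 - k] *= gain    # offset ramp
    return samples


def convolve(signal, fir):
    """Direct-form linear convolution (output length len(signal)+len(fir)-1)."""
    out = [0.0] * (len(signal) + len(fir) - 1)
    for i, s in enumerate(signal):
        if s == 0.0:
            continue  # skip the zero padding
        for j, h in enumerate(fir):
            out[i + j] += s * h
    return out


def lead_lag_pair(noise, delay_ms, lead_fir, lag_fir, fs_hz=30000):
    """Sum of a leading and a lagging copy of one noise token (one ear)."""
    if len(lead_fir) != len(lag_fir):
        raise ValueError("this sketch assumes impulse responses of equal length")
    pad = [0.0] * int(round(delay_ms * fs_hz / 1000))
    leading = noise + pad   # zeros appended to the end
    lagging = pad + noise   # zeros prepended to the beginning
    lead_filtered = convolve(leading, lead_fir)
    lag_filtered = convolve(lagging, lag_fir)
    return [a + b for a, b in zip(lead_filtered, lag_filtered)]
```

Because the same noise token is used for both sources, the two copies are fully correlated, which is what makes the overlap segment alter the binaural cues. A binaural stimulus would repeat `lead_lag_pair` with the left-ear and right-ear filters for each location.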

An initial test was performed to characterize the auditory spatial tuning of each isolated unit. Sounds were presented from a set of 292 virtual locations, arranged in a checkerboard pattern to sample the entire frontal hemifield at a spacing of 10° in azimuth and elevation. Noise pips of 50 ms were presented with an interstimulus interval of 250 ms. The stimulus set was presented in 2 to 5 repetitions. In this, and all subsequent tests, the order of stimulus presentation was randomized for each repetition.

"Spatial response profiles" (SRPs) were generated by plotting the response (spikes per stimulus recorded in a time window equal to the stimulus duration, delayed by the unit's response latency), as a function of stimulus azimuth and elevation. Stimulus onset was delayed relative to the start of data collection by an amount equal to the stimulus duration to allow measurement of spontaneous discharge before stimulation. In addition, spike data were recorded during a silent interval equal to the total sound duration plus interstimulus interval at the start of each stimulus set repetition. The set of locations that evoked increases in discharge rate, relative to background firing, will be referred to as the "spatial receptive field" (SRF). The area within the SRF from which ≥75% of the maximal rate was obtained is termed the "best area." For units with SRFs containing a single dominant peak, the location within the best area that elicited the maximal average firing rate is termed the "best location." In off-line analysis (see SPATIAL TUNING INDEX), the best location was determined by calculating the weighted average of locations within the best area, using response magnitude as the weighting factor. (If the best area contained more than one region, the one that contained the most total spikes was used.) For on-line determination of target locations (see INTERACTION INDEX) the best location was estimated as the center of the dominant peak of the SRF.

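The off-line definitions of best area and best location given above can be illustrated with a short sketch. It assumes that the spatial response profile is held as a dictionary from (azimuth, elevation) to mean spikes per stimulus, that the weighted average is a simple arithmetic mean of the coordinates, and that the best area forms a single region; the function names are hypothetical.

```python
def best_area(srp, fraction=0.75):
    """Locations whose response is at least `fraction` of the maximal response.

    srp: dict mapping (azimuth_deg, elevation_deg) -> mean spikes per stimulus.
    """
    r_max = max(srp.values())
    return {loc: r for loc, r in srp.items() if r >= fraction * r_max}


def best_location(srp, fraction=0.75):
    """Response-weighted average of the locations inside the best area.

    Assumption: coordinates are averaged arithmetically and the best area
    is treated as one region (the multi-region rule is not implemented).
    """
    area = best_area(srp, fraction)
    total = sum(area.values())
    azimuth = sum(az * r for (az, el), r in area.items()) / total
    elevation = sum(el * r for (az, el), r in area.items()) / total
    return azimuth, elevation
```

For example, a profile with equal responses at (0°, 0°) and (10°, 0°) and a weak response elsewhere yields a best location midway between the two strong locations.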
Data analysis

SPATIAL TUNING INDEX. Neuronal spatial selectivity was quantified by a spatial tuning index (STI) that measures the spatial dispersion of responses across the frontal hemifield. STI is calculated from the spatial response profile by computing a weighted sum of angles between the sampled locations and the unit's best location, with response magnitude as the weighting factor, normalized to the sum of all angles

$$\mathrm{STI} = \frac{\sum_{i=1}^{n} \alpha_i R_i}{\sum_{i=1}^{n} \alpha_i} \qquad (1)$$
where αi is the absolute value of the angle between location i and the unit's best location, Ri is the magnitude of the response to location i, and n is the number of locations tested. STI has a potential range from 1, if the spatial distribution of responses is uniform, to 0, if a unit responds to only a single location. The range of observed values was 0.003 to 0.471.
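A minimal sketch of the STI computation follows, based on the verbal definition above. One assumption is made explicit here because the text does not state it: responses are divided by the maximal response, which is what makes a uniform response profile give STI = 1. The function name and argument layout are hypothetical.

```python
def spatial_tuning_index(angles_deg, responses):
    """Response-weighted sum of angles, normalized to the sum of all angles.

    angles_deg: absolute angle between each sampled location and the
                unit's best location.
    responses:  response magnitude at each sampled location.
    Assumption: responses are normalized to the maximal response.
    """
    if len(angles_deg) != len(responses):
        raise ValueError("one response is required per sampled location")
    r_max = max(responses)
    angle_sum = sum(angles_deg)
    if r_max <= 0 or angle_sum == 0:
        raise ValueError("STI is undefined for these inputs")
    weighted = sum(a * (r / r_max) for a, r in zip(angles_deg, responses))
    return weighted / angle_sum
```

Under this assumption a unit that responds equally everywhere scores 1, and a unit that responds only at its best location (angle 0) scores 0, matching the limits stated in the text.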

INTERACTION INDEX. The effect of a leading or lagging sound on the response to a sound at a unit's best location was quantified using the interaction index (I; Eq. 2). The sound at best location is termed the target (t) and the other sound the masker (m). I is calculated as follows

$$I = \frac{R_{t+m} - R_t}{R_{t+m} + R_t} \qquad (2)$$
where Rt is the response (spikes/stimulus) to the target alone and Rt+m is the response to the target plus masker. The values of I have a potential range from –1, indicating complete suppression of the response in the target plus masker condition, to numbers approaching +1, indicating strong enhancement of the response. Values of 0 indicate that the masker had no effect.
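The behavior of I at its limits can be checked with a small sketch. It assumes the contrast form (Rt+m − Rt)/(Rt+m + Rt), which reproduces the range stated above (–1 for complete suppression, 0 for no effect, values approaching +1 for strong enhancement). The function name and the handling of the zero-response case are illustrative choices.

```python
def interaction_index(r_target_plus_masker, r_target):
    """Contrast between target-plus-masker and target-alone responses.

    Assumed form: I = (Rt+m - Rt) / (Rt+m + Rt).
    Returns None when both responses are zero, where I is undefined.
    """
    denominator = r_target_plus_masker + r_target
    if denominator == 0:
        return None
    return (r_target_plus_masker - r_target) / denominator
```

With a target-alone response of 5 spikes/stimulus, a masker that abolishes the response gives I = –1, a masker with no effect gives I = 0, and a tripled response gives I = 0.5.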

The time windows used to measure Rt+m and Rt are illustrated in Fig. 1. To quantify the effect of the masker on total spike output, the Rt+m window was set to include all spikes evoked by either the target or the masker. Thus the window began at the onset of the leading sound, delayed by the response latency, and had a duration equal to the lag delay plus either the duration of the response to the target-alone (measured from visual inspection of spike raster displays) or, alternatively, the stimulus duration plus 10 ms (Fig. 1, vertical lines), whichever was longer. The use of the alternative minimum duration ensured inclusion of responses occurring after the offset of leading maskers at short delays in some units that might otherwise be excluded (e.g., Fig. 1A, Masker Leads). The Rt measurement window began at stimulus onset, delayed by the response latency, and had the same duration as that of the Rt+m window.
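The window rule described in this paragraph can be sketched as follows. The function names are hypothetical, times are in milliseconds relative to the start of data collection, the magnitude of the lag delay is used so that the same rule serves both masker-leading and target-leading conditions, and the half-open interval is an assumption.

```python
def rtm_window(lead_onset_ms, latency_ms, lag_delay_ms,
               stimulus_duration_ms, target_response_duration_ms):
    """Start and end of the Rt+m measurement window.

    Duration = |lag delay| + the longer of the target-alone response
    duration or the stimulus duration plus 10 ms.
    """
    start = lead_onset_ms + latency_ms
    duration = abs(lag_delay_ms) + max(target_response_duration_ms,
                                       stimulus_duration_ms + 10.0)
    return start, start + duration


def count_spikes(spike_times_ms, window):
    """Number of spikes with start <= t < end (half-open window assumed)."""
    start, end = window
    return sum(1 for t in spike_times_ms if start <= t < end)
```

For a 100-ms stimulus, a 12-ms latency, a 3-ms lag delay, and a 95-ms target-alone response, the minimum of 110 ms applies and the window spans 12 to 125 ms. The Rt window would use the same duration, starting at target onset plus latency.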



FIG. 1. Time windows used for quantifying effects of leading and lagging maskers on responses to targets at the best location. Responses of units 883CJ and 883CG are shown in A and B, respectively. In both cases the target was positioned at the best location and the masker at a location that evoked inhibition or no response. Black and gray bars indicate the durations of the target and masker, respectively, delayed by each unit's response latency. Responses to target+masker combinations (Rt+m) were measured in a time window (vertical lines) starting at onset of the leading sound, delayed by the response latency, and with a duration equal to the lag delay plus the length of the response to the target-alone (with a minimum response duration of 110 ms). Response to the target-alone (Rt) was measured in a window equivalent to that used to measure Rt+m in the target-leading condition. Spatial receptive fields (SRFs) and additional responses of these units are shown in Figs. 5 (883CJ) and 6 (883CG).

 
RESPONSES TO INDIVIDUAL STIMULUS SEGMENTS. The combination of temporally overlapping leading and lagging sounds generates a stimulus with 3 distinct segments. The leading segment conveys directional cues identical to those of the leading sound, presented in isolation. This is followed by an overlap segment, during which acoustic superposition of leading and lagging sounds degrades the binaural cues for each source location. Finally, during the trailing segment the binaural cues are essentially identical to those of the lagging sound in isolation. To gain further insight into the mechanisms underlying the masking effects, responses to the different stimulus segments were analyzed separately, as illustrated in Fig. 2.



FIG. 2. Time windows used for analysis of responses to leading, trailing, and overlap stimulus segments. Responses in each condition are aligned on target onset. Responses to the stimulus segment, during which masker and target overlap, were measured in a time window (vertical lines) starting 10 ms after onset of the leading sound and ending at offset of the lagging sound, delayed by the response latency. For comparison with the overlap response, Rt was measured in an equivalent window (vertical lines), relative to target onset. Responses to the trailing segment in the target-lagging condition (Rt+m, Masker Leads) were measured in a window (shaded area) starting at masker offset, with duration equal to the lag delay, or the alternative minimum value of 5 ms. Responses to the leading segment in the target-leading condition were measured in a window starting at target onset (shaded area, Rt+m, Target Leads), and with duration equal to that of the trailing segment window. Both trailing- and leading-segment responses were compared with Rt, measured in a window starting at target onset.

 
The masker's effect during the overlap stimulus segment was quantified by calculating I, as before (Eq. 2), but with Rt+m measured in a window starting 10 ms after the onset of the lagging sound (delayed by the response latency), and ending at the offset of the leading sound (Fig. 2, dashed lines). The start of the overlap Rt+m window was delayed by 10 ms to prevent inclusion of responses to the leading segment in the target-leading condition, which often appeared to continue for a few milliseconds past masker onset. Rt was measured in an equivalent time window, relative to target onset. Thus the Rt window starts 10 ms after target onset, in the target-lagging condition, and 10 ms after masker onset, in the target-leading condition. The value of I was considered to be undefined if neither Rt+m nor Rt was significantly >0 (P < 0.05).

To quantify masking effects on responses to trailing segments, I was used to compare the response at masker offset in the target-lagging condition to the onset of the response in the target-alone condition (Fig. 2). Thus Rt+m was measured in a window starting at masker offset (Fig. 2, Masker Leads, shaded area). The duration of the Rt+m window was equal to the lag delay at delays ≥5 ms. At shorter delays, a duration of 5 ms was used because the responses to leading and lagging segments often continued for a few milliseconds beyond the lag delay, and because shorter windows yielded less reliable results. Rt was measured in a window with duration equal to that of the Rt+m window, starting at target onset. The reasons for choosing the onset segment of the target-alone response as a reference for comparison, in preference to the final segment, are detailed in the RESULTS section. For comparison, the response to leading segments of the target were analyzed in a similar manner. In this case, both Rt+m and Rt were measured in windows starting at target onset, with duration equal to the lag delay (Fig. 2, Target Leads).

TARGET DETECTION. Neuronal detection of a target at best location in the presence of a masker was quantified by receiver operating characteristic (ROC) analysis, following methods applied in previous neurophysiological studies (e.g., Bradley et al. 1987; Britten et al. 1992; Mountcastle et al. 1969). Responses were recorded during 20 repetitions of several target-plus-masker combinations with lead-lag delay varied from –200 to 200 ms. By convention, lag delay is measured between the onset of leading and lagging sounds, and is positive when the masker leads the target. The response to the target plus masker (Rt+m) was measured using the same procedures as in the initial I calculation, except that responses were not averaged across repetitions. The masker-alone response (Rm) was measured using the initial 200 ms of the +200 ms delay stimulus, in a window starting at stimulus onset, delayed by the unit's response latency, and with duration equal to the lag delay plus the longer of 110 ms or the duration of the response to the target alone. The minimum effective response duration of 110 ms was again used to prevent exclusion of spike bursts after the offset of leading maskers in units with short (<110 ms) target-alone responses. ROC curves were constructed using a set of response criterion values spanning the range of single-trial Rt+m and Rm values. The ROC curve was generated by plotting the proportion of "hits" (Rt+m > criterion), against the proportion of "false alarms" (Rm > criterion), for each criterion value. The criterion values included the maximum and minimum response values, as well as any value that resulted in a change in the proportion of both hits and false alarms. A criterion value greater than the maximum response and a value of one less than the minimum response were also included to define the endpoints of the curve.
The area under the resulting curve, termed proportion correct [p(c)], provides an unbiased measure of target detection (Green and Swets 1966), representing the performance of an ideal observer, using the neuron's responses as the decision variable. Values of 0.5 and 1 correspond to chance and perfect detection performance, respectively. Values <0.5 may occur if the average response to the masker-alone condition is greater than the response to the target-plus-masker condition.
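The ROC construction described above can be sketched from single-trial spike counts. This is an illustrative version only: it uses every distinct observed response value as a criterion (a superset of the criterion set described in the text), adds the two endpoint criteria, and integrates the curve with the trapezoid rule. The function name is hypothetical.

```python
def roc_area(target_plus_masker, masker_alone):
    """Area under the ROC curve, p(c), from single-trial responses.

    target_plus_masker: single-trial Rt+m values (e.g., spike counts).
    masker_alone:       single-trial Rm values.
    """
    values = sorted(set(target_plus_masker) | set(masker_alone))
    # One criterion below the minimum and one above the maximum
    # define the endpoints (1, 1) and (0, 0) of the curve.
    criteria = [values[0] - 1] + values + [values[-1] + 1]
    points = []
    for criterion in criteria:
        hits = sum(1 for r in target_plus_masker if r > criterion)
        false_alarms = sum(1 for r in masker_alone if r > criterion)
        points.append((false_alarms / len(masker_alone),
                       hits / len(target_plus_masker)))
    points.sort()  # ascending false-alarm rate, then hit rate
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0  # trapezoid rule
    return area
```

Fully separated response distributions give an area of 1, identical distributions give 0.5, and an area below 0.5 arises when the masker-alone responses exceed the target-plus-masker responses.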

HIGH-RESOLUTION AZIMUTH TUNING. High-resolution single-source azimuth tuning curves were obtained by recording neuronal responses to a set of virtual locations spanning the azimuthal extent (or a portion thereof) of the peak of the SRF in 1° increments at an elevation of –20, –10, 0, 10, or 20°. Because most space-specific units cannot reliably detect changes in elevation about their SRF peaks of <5° (unpublished observations), this sampling was sufficient to characterize the azimuth tuning at the SRF peak of units with best elevations between –25 and 25°. Azimuth tuning curves for leading and lagging targets were obtained in the same manner, but with the addition of a masker at a fixed location outside the SRF, at the same elevation as the loci sampled at high resolution. The average azimuthal separation between the masker and the units' best azimuths was 28.0 ± 6.8°. The lag delay was always 3 ms, and sound durations were 100 ms. The peaks of azimuth tuning curves for single, leading, and lagging targets were determined by fitting the tuning-curve data with either a Gaussian curve or, if the tuning-curve was clearly skewed, with a lognormal curve. To ensure that fitted peaks accurately reflected neuronal azimuth tuning, a curve fit was excluded from further analysis if it explained <75% of the response variance, or if the calculated peak was located at an endpoint of the range of sampled azimuths. Best azimuths for single, leading, and lagging targets were determined from the corresponding curve fits. Tuning-curve shifts were calculated by determining the change in best azimuths between the single source and either leading or lagging target conditions relative to the location of the masker. By convention, a positive shift value indicates that the best azimuth in the target+masker condition is further away from the masker than the best azimuth in the single source condition.
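The fit-acceptance rule and the sign convention for tuning-curve shifts can be made concrete with a short sketch. The curve fitting itself is not shown; the functions take a fitted peak and an explained-variance value as inputs. The names are hypothetical, and the signed-shift formula is one reading of the convention stated above (positive when the peak moves away from the masker).

```python
def accept_fit(explained_variance, peak_az, sampled_min_az, sampled_max_az):
    """Keep a fit only if it explains at least 75% of the response variance
    and its peak is not at an endpoint of the sampled azimuth range."""
    if explained_variance < 0.75:
        return False
    return sampled_min_az < peak_az < sampled_max_az


def tuning_shift(best_az_single, best_az_with_masker, masker_az):
    """Signed change in best azimuth relative to the masker location.

    Positive: best azimuth with the masker present lies further from the
    masker than the single-source best azimuth. Assumes the masker is not
    located at the single-source best azimuth.
    """
    away_from_masker = 1.0 if best_az_single > masker_az else -1.0
    return away_from_masker * (best_az_with_masker - best_az_single)
```

For instance, with a single-source best azimuth of 10° and a masker at –20°, a peak at 12° in the masked condition gives a shift of +2°, whereas a peak at 8° gives –2°.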


    RESULTS
Auditory responses were recorded from 241 spatially sensitive single units in the midbrains of 3 barn owls. The sampled SRFs exhibited varying degrees of spatial selectivity, ranging from those containing distinct lateral side peaks and/or vertically elongated central peaks to those with a single, spatially restricted peak. The joint distributions of spatial selectivity and response latency for the total unit sample are illustrated in Fig. 3. There was a significant negative correlation (ρ = –0.23; P = 0.0004) between spatial selectivity, quantified by STI, and unit response latency, indicating that the most spatially selective units had the longest latencies. There was no clear indication, however, that either distribution contained multiple modes. These findings are in agreement with those of a previous study (Mazer 1995) and are consistent with the view that spatial selectivity develops through a gradual process within ICc-ls and ICx. Recordings were also obtained from 8 optic tectum (OT) units that responded to light as well as sound. Tectal units had long latencies and spatial selectivity comparable to that of the most selective IC neurons (Fig. 3, stars). The results below were obtained from a subset of 98 units from which detailed data sets were collected. All units chosen for further analysis had SRFs with a single, dominant peak, although smaller side peaks may have been present. The distribution of STI values for these units is shown on the right in Fig. 3 (shaded). The mean STI value for this subpopulation (0.10) was significantly lower than that for the total unit sample (0.13; P = 0.009, Mann–Whitney U test). The mean response latencies of the 2 populations (12.3 and 12.8 ms) were not significantly different (P = 0.25, t-test). The distributions of spatial selectivity and response latency suggest that the latter unit sample contains a mixture of units from lateral ICc-ls and ICx.



FIG. 3. Distributions of response latency and spatial selectivity [spatial tuning index (STI)] for populations of auditory midbrain units. Joint distributions for a sample of 241 units that were responsive only to auditory stimuli [inferior colliculus (IC) units] illustrate the trend toward greater spatial selectivity at longer latencies. Of units that responded to auditory and visual stimuli [optic tectum (OT) units] 8 had latencies and spatial selectivity comparable to the extreme values of the distributions for IC units. Subpopulation of units selected for further analysis (n = 98, far right) contained a lower proportion of units with low spatial selectivity.

 
Spatial distribution of lead-source effects

A previous study demonstrated that the responses of space-specific neurons to sounds at their best locations were reduced in the presence of leading sounds displaced by 40° in azimuth (Keller and Takahashi 1996b). Thus the neuronal representation of the direction of the lagging sound within the space map is suppressed, providing a potential neuronal correlate of the behavioral PE. We now consider the spatial distribution of such effects in both azimuth and elevation. The spatial dependence of lead-source effects was studied quantitatively in 26 units by recording responses to lagging sounds at the units’ best locations, combined with leading sounds at locations spanning the frontal hemifield. For simplicity, the lagging sound at best location will be referred to as the target (t), and the leading sound as the masker (m). The delay between onset of the masker and the target was either 3 (22 units) or 5 (4 units) ms.

The usual patterns of lead-source effects are illustrated for 3 units with varying degrees of spatial selectivity in Fig. 4. The lead effect was quantified by the interaction index (I, see METHODS, Data analysis). For this analysis, the response to the target plus masker (Rt+m) was measured in a window set to capture all spikes evoked by either the target or masker, and compared with the response to the target-alone (Rt), measured in a window with equivalent duration (Fig. 1). The SRPs are shown in the left column (Fig. 4A) and the spatial distributions of I are shown in the middle column (Fig. 4B). The target locations (crosses) and best areas (dotted lines) are indicated in Fig. 4B to facilitate comparison of the spatial topographies of lead effects with the SRPs. In all 3 units, the lead-source effects ranged from suppression (I < 0) to values close to zero, indicating that the masker had little effect on the net response relative to the target-alone condition. Lead-evoked facilitation (Rt+m > Rt + response to masker-alone) was never observed in any of the units studied. For all 3 units in Fig. 4B, the masking effect was minimal at locations near the best area, or at far peripheral locations. In the most spatially selective unit (unit 883DC, top row), the masker was most suppressive when it was located lateral to or above the SRF. In the other 2 units, the SRF extends vertically, following the contour of locations with the same ITD as that of the best location. In such units, lead effects were usually minimal along the same iso-ITD contour as that of the best location, and maximal at laterally adjacent locations. The masking effect (I) is plotted against the normalized single-source response at each location in Fig. 4C. In all 3 cases, the strongest lead-evoked suppression occurred at locations that produced the weakest responses in the single-source condition. 
By contrast, leading sounds at locations that produced responses >40% of maximum in the single-source condition had minimal suppressive effects.



FIG. 4. Relation between spatial distribution of lead-evoked masking effects and spatial response profile (SRP) features, for 3 IC units. A: SRPs were constructed from responses to sounds at 292 virtual locations spanning the frontal hemifield. Color scale (top) was normalized to each unit's maximum response, and values between sampled locations were filled by linear interpolation. Unit ID number and STI values are indicated on each plot. B: spatial distributions of lead source effects (I) for the same 3 units. Color scales are indicated to the right of each plot. Locations of the target (cross) and the unit's best area (dotted line) are indicated on each plot. C: effect of a leading masker at each location (I) is plotted against the strength of the response to a single source at that location (Rt), normalized to the maximum response to any location.

 
Differential effects of leading and lagging maskers

Because the sounds used in the preceding test were much longer than the lag delay, resulting in considerable temporal overlap of leading and lagging waveforms, the suppression of neuronal responses to the target caused by leading maskers located outside the SRF could reflect either acoustic or neuronal interactions, or a combination of the two. To better understand the cause of response suppression, we next compare the effects of leading and lagging maskers on responses to targets at the best location in a larger sample of IC units exhibiting varying levels of spatial selectivity.

When leading and lagging sounds are presented from different locations at approximately equal levels, some reduction of the response to the best-location sound is expected to result from degradation of the binaural cues caused by the acoustic superposition of waveforms from the 2 sources (Keller and Takahashi 1996a; Takahashi and Keller 1994). Specifically, the spectrum of interaural level difference cues will be altered, and the level of interaural correlation diminished, the latter resulting in a reduction of the effectiveness of ITD cues (Albeck and Konishi 1995; Saberi et al. 1998). Such effects are approximately equal, regardless of whether the masker or target leads. Therefore any difference in suppressive effects caused by leading and lagging maskers cannot be attributed to these acoustic interactions alone. Furthermore, any difference in effectiveness of suppression between leading and lagging maskers resulting from interactions within peripheral filters (Hartung and Trahiotis 2001; Trahiotis and Hartung 2002) or asymmetric temporal weighting at the initial site of binaural interaction (Tollin 1998) should be exhibited by all IC neurons.

Effects of leading and lagging maskers on responses to targets at best location were compared in 59 units, including 6 presumptive OT units. For this test the masker was positioned outside the SRF, at a location that was found, on-line, to suppress responses to the target. In 58/59 units, the masker was positioned at the same elevation as the estimated best location. In one unit, the masker was positioned 30° above the best location. The azimuthal separations between targets and maskers ranged from 14 to 55° (median = 26°). Sounds were 100-ms broadband (2–12 kHz) noise bursts presented with lead-lag onset delays of ±1, 2, 5, 10, 20, 50, 100, and 200 ms, expressed relative to target onset. The target and masker waveforms were identical, before convolution with the head-related impulse responses (HRIRs).

Comparing effects of leading and lagging maskers revealed 2 types of suppressive effects. In many units, the amount of suppression was similar, regardless of whether the masker led or lagged. This type of effect, termed temporally symmetric, is illustrated by responses of unit 883CJ in Fig. 5. The response to a target leading the masker by 200 ms is the same as that evoked by the target alone, and consists of a robust, moderately adapting discharge, sustained throughout the stimulus duration. By contrast, when the masker leads by 200 ms, it elicits a single spike at onset on some trials, followed by suppression of spontaneous firing, suggesting lateral inhibition. As the delay is decreased, causing the sounds to overlap in time, the response to the target is clearly suppressed throughout the duration of the masker. At delays from 1 to 20 ms, the magnitude of suppression is approximately equal, whether the masker leads or lags. This symmetric suppression is consistent with the effects of acoustic superposition on the available binaural cues. It is also possible that the masker exerts an additional inhibitory effect. However, any contribution of lateral inhibition to suppression of the target response appears not to depend on the temporal order of the masker and target. Note that, at delays from 5 to 20 ms, there is a strong, transient burst of spikes at the offset of the masker. At 50 ms, suppression is stronger when the target leads. This effect may reflect an interaction of the degradation of binaural cues with the temporal dynamics of the response to the target. When the target leads, the masker coincides with the weaker, later portion of the response and is thus more effective than in the opposite configuration, when it coincides with the stronger initial portion of the response. At 100 and 200 ms the masker has little to no effect. 
As in this example, symmetric response suppression was most often observed in units with SRFs containing prominent lateral side-peaks and vertically elongated main peaks.



FIG. 5. Responses of a unit exhibiting temporally symmetric masker effects. A: SRP of unit 883CJ with crosshairs indicating the locations of the target (white) and masker (black). B: responses to target/masker combinations with varying lead-lag delays are displayed as raster plots and peristimulus histograms. Left column: responses to combinations in which the target led the masker. Right column: responses to combinations with lagging targets. Center column: delays between onset of leading and lagging sounds. Durations of the 2 sounds, delayed by the unit's response latency (11 ms), are indicated by black (target) and gray (masker) bars.

 
In more spatially selective units, suppression was often greater when the masker led than when it lagged. Such temporally asymmetric suppression is illustrated by responses of unit 883CG in Fig. 6. In this case, the unit responded to a target at –10° azimuth and failed to respond to a masker at 15° azimuth. At delays of 5 ms and below, the response to the target was completely suppressed when the masker led, and only partially suppressed when it lagged. In contrast to the previous example, the response at masker offset was weak or absent at delays below 20 ms in the masker-leading condition. Although a response to the target became apparent at longer delays, an asymmetric suppressive effect was evident at delays ≤50 ms. This type of asymmetric interaction is analogous to the behavioral PE in that the neuronal representation of the location of a lagging sound is more effectively suppressed than is that of a leading sound.



FIG. 6. Responses of a unit exhibiting temporally asymmetric suppressive effects. A: SRP of unit 883CG with crosshairs indicating locations of the target (white) and masker (black). B: responses to target/masker combinations with varying lead-lag delays. All conventions as in Fig. 5.

 
The effects of leading and lagging maskers at all delays are compared for the entire sample of 59 units in Fig. 7. In each plot the interaction index values (I, see METHODS, Data analysis) for lagging maskers are plotted against those for leading maskers. The unity line indicates temporally symmetric effects. At delays from 1 to 50 ms, nearly all points are contained in the lower left quadrant (interaction index values –1 to 0), indicating that the response to the target was suppressed to some extent when it overlapped in time with the masker. At delays from 1 to 10 ms, most points fell on or below the unity line, indicating that, for most units, the extent of suppression was equal or greater when the masker led than when it lagged.



FIG. 7. Masker effects were related to spatial selectivity. Temporal symmetry/asymmetry is illustrated by plotting the interaction index values obtained when the masker led against those obtained when it lagged, separately, for lead-lag delays of 1, 2, 5, 10, 20, 50, and 100 ms. Plot symbols designate IC units with low (STI >0.13, filled squares) and high (STI <0.13, open circles) spatial selectivity, and OT units (stars) that responded to light as well as sound.

 
The different unit subpopulations, designated by plot symbols, exhibited different effects. The scatter of values from the least spatially selective units (STI > 0.13; squares) fell just below the unity line at delays from 1 to 2 ms, and became centered on it at longer delays. Thus responses of these units exhibited mildly asymmetric suppression at the shortest delays, and temporally symmetric suppression at longer delays. By contrast, the scatters of values from the more spatially selective units (STI < 0.13; circles) and OT units included substantial numbers of points falling well below the unity line at delays from 1 to 50 ms. Thus profound temporal asymmetry was apparent only in the most spatially selective units.

The preceding conclusions were confirmed by statistical analysis (Table 1). At each delay value, IC units were classified as either symmetric or asymmetric by comparing responses to the target when the masker led or lagged. A unit was classified as symmetric if its responses with both leading and lagging maskers were significantly lower than the response to the target-alone, but not different from one another. A unit was classified as asymmetric if its response in the masker-leading condition was significantly lower than half the magnitude of the response in the masker-lagging condition (all comparisons: t-test, α = 0.01). At delay values from 1 to 20 ms there was a substantial proportion of symmetric units. At each delay value in this range the mean STI value of asymmetric units was significantly lower than that for symmetric units, indicating the former units were more spatially selective.


TABLE 1. Difference in spatial selectivity between units exhibiting temporally symmetric and asymmetric suppression

 
The relation between response asymmetry and both spatial selectivity and response latency is examined further in Fig. 8. Here, asymmetry is quantified as the difference between I values calculated in the masker-leading and masker-lagging conditions (IM_leads – IM_lags). Thus large negative values indicate greater suppression when the masker leads than when it lags, and values of zero indicate equal masking effects in the 2 conditions. Response asymmetry, thus calculated, is plotted against spatial selectivity (STI, left column) and response latency (right column), for each IC unit for delays from 1 to 50 ms. As expected from the preceding analysis, at delays from 1 to 20 ms, units with low spatial selectivity (STI >0.13) had asymmetry values near zero. Among the more selective units response asymmetry increased systematically as a function of spatial selectivity. Because low STI values indicate high spatial selectivity, this relation is manifest as a significant (P < 0.01) positive correlation between STI and asymmetry at delays below 50 ms. Because spatial selectivity is correlated with latency, it is not unexpected that response asymmetry is also correlated with latency. In this case, response asymmetry tended to increase in a negative direction with increasing latency, resulting in negative correlations. Unlike spatial selectivity, however, latency was significantly correlated with asymmetry at a delay of 50 ms, but not at delays of 1 and 2 ms. Neither latency nor spatial selectivity was significantly correlated with asymmetry at 100-ms delay (P values of 0.55 and 0.31, respectively).
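The asymmetry measure and its correlation with STI can be sketched as follows. This is an illustrative Python sketch only, not the analysis code used in the study: the function names `response_asymmetry` and `pearson_r` are hypothetical, and the interaction index values are taken as inputs because the definition of I is given in METHODS rather than here.

```python
import math

def response_asymmetry(i_masker_leads, i_masker_lags):
    """Asymmetry = I (masker leads) - I (masker lags).

    Negative values mean stronger suppression when the masker leads;
    zero means equal masking in the 2 conditions.
    """
    return i_masker_leads - i_masker_lags

def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)
```

Given one asymmetry value and one STI value per unit at a single delay, `pearson_r(sti_values, asymmetry_values)` would yield a positive coefficient when the more selective (low-STI) units have the more negative asymmetry values, which is the sign of the relation described above.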



FIG. 8. Temporal asymmetry of responses was related to spatial selectivity and latency. Response asymmetry was calculated as the difference between I values measured in the masker-leading and masker-lagging conditions (IM_leads – IM_lags). Asymmetry is plotted as a function of STI and response latency for each IC unit at each lag delay. Values of the correlation coefficient (r) and significance levels are shown at the top of each plot.

 
Masking of responses to individual stimulus segments

The temporal overlap of leading and lagging sounds results in segments of stimuli during which only the leading or lagging waveform is present, flanking a segment in which both are present. The responses to such stimuli were often complex, with distinct components appearing to reflect differences in masking effects within different stimulus segments. Analysis of the segment-specific responses helped to pinpoint the differences between temporally symmetric and asymmetric suppression.
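The segment structure follows directly from the stimulus timing. The sketch below is a minimal Python illustration, assuming two sounds of equal duration (100 ms, as used in these tests) and times measured at the sound sources relative to onset of the leading sound. The function name `stimulus_segments` is hypothetical, and the analysis windows actually used (see METHODS) were additionally shifted by each unit's response latency, which is not modeled here.

```python
def stimulus_segments(delay_ms, duration_ms=100.0):
    """Split a lead/lag stimulus pair into leading, overlap, and trailing segments.

    Returns (start, end) times in ms relative to onset of the leading sound.
    The overlap entry is None when the delay is at least as long as the
    sound duration, so that the 2 sounds do not overlap in time.
    """
    d = float(abs(delay_ms))
    leading = (0.0, min(d, duration_ms))           # only the leading sound present
    overlap = (d, duration_ms) if d < duration_ms else None
    trailing = (max(d, duration_ms), d + duration_ms)  # only the lagging sound present
    return {"leading": leading, "overlap": overlap, "trailing": trailing}
```

At a 5-ms delay, for example, the leading-alone and trailing-alone segments each last 5 ms and flank a 95-ms overlap segment, whereas at delays of 100 ms or more there is no overlap segment.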

During the overlap segment, the mixing of the leading and lagging waveforms in each ear degrades the binaural cues. The degradation is the same, however, regardless of whether the target leads or lags and would not, by itself, contribute to a temporally asymmetric effect. Thus if suppression resulted entirely from these acoustic interactions, it is expected that masking of responses to the overlap segment would be the same, regardless of which sound led. This prediction was evaluated by comparing masking during the overlap segment in the masker-leading and masker lagging conditions (see METHODS, Data analysis).

The I values calculated from responses to the overlap segments (Fig. 2, between dashed lines) are plotted as a function of spatial selectivity for delays from 1 to 50 ms in the left column of Fig. 9. At all delays, responses of most units were moderately to heavily suppressed (I < 0) during the overlap segment, both when the masker led (circles) and when it lagged (triangles). This result is consistent with the expectation that acoustic interactions will have a major effect on responses of all units, regardless of the temporal order of masker and target. To compare effects of leading and lagging maskers, the difference between I values in the masker-leading and masker-lagging conditions (asymmetry = IM_leads – IM_lags) is plotted against spatial selectivity in Fig. 9 (right column). At delays between 1 and 10 ms, most values were close to zero, indicating approximately equal suppression in the 2 conditions. Thus in most units temporally symmetric influences were sufficient to explain the masking effects within the overlap segment at these delays. This result is consistent with the expected acoustic masking effects, as well as with lateral inhibition, provided that the inhibitory mechanism is insensitive to the temporal order of masker and target. The major exceptions, at delays from 1 to 5 ms, were highly spatially selective units that exhibited greater suppression in the masker-leading condition. This form of temporally asymmetric suppression was also evident at 10- to 50-ms delays in responses of several highly selective units, and 2 less-selective units. Such asymmetric suppression is inconsistent with the predictions of acoustic masking, and indicates the contribution of an additional mechanism that is sensitive to the temporal order of masker and target.



FIG. 9. Masking of responses to the overlapping stimulus segments was related to spatial selectivity. Left column: I values calculated for the overlap segment of responses in the masker-leading (circles) and masker-lagging (triangles) conditions are plotted as a function of spatial selectivity (STI) at each lag delay. Right column: asymmetry of masking effects (IM_leads – IM_lags) is plotted as a function of spatial selectivity. In cases where neither Rt nor Rt+m was significantly >0 (P < 0.05), the value of I is undefined and is indicated by points plotted above the plot axes.

 
At delays from 10 to 50 ms, the responses of most units differed from the predictions of a simple acoustic masking effect because suppression was greater in the masker-lagging condition. This effect is probably attributable to adaptation of responses. In the target-leading condition, the overlap segment follows a leading segment, in which only the target is present, which always evokes a strong response (e.g., Figs. 1 and 2, target leads). By contrast, in the masker-leading condition, the overlap segment follows a leading segment that evokes little or no response (e.g., Figs. 1 and 2, masker leads). Consequently, response adaptation would be expected to suppress responses to the overlap segment in the masker-lagging condition, but not in the masker-leading condition. This explanation is supported by the fact that suppression in the masker-lagging condition actually increases in many units, particularly less-selective ones, as the delay is increased from 10 to 50 ms. Thus the masking observed during the overlap segment in many units is consistent with the expectations of acoustic masking at short delays, with an additional contribution of adaptation at longer delays. Neither mechanism, however, can account for the temporally asymmetric masking effects during the overlap segment exhibited by many of the more spatially selective units.

A second prediction of simultaneous acoustical masking is that the suppressive effects should occur only during the overlap segment. Thus in the masker-leading condition, we would expect to see a strong response to the trailing stimulus segment, in which only the target is present. This prediction is consistent with the responses of the symmetric unit shown in Fig. 5, at delays >2 ms, but not with those of the asymmetric unit shown in Fig. 6. At delays from 5 to 50 ms, the symmetric unit fired a burst of spikes after the offset of leading maskers, which resembled the response to stimulus onset in the target-alone condition (Fig. 5, Fig. 2A, Masker Leads, shaded region). Such responses demonstrate a recovery from the masking effect almost immediately after termination of masker-target overlap. In the asymmetric unit (Fig. 6, Fig. 2B), by contrast, an onset-like response to the trailing segment does not emerge until the target delay is increased to 20 ms. In this case, the masker appears to exert a suppressive influence on the response to the trailing segment that persists well beyond masker offset. Such effects are consistent with a long-lasting inhibitory mechanism, but not with the degradation of binaural cues resulting from acoustic superposition.

Suppression beyond masker offset was evaluated in the entire IC unit sample by comparing the responses to the trailing stimulus segment, in the masker-leading condition, to the onset segments of the responses to the target alone (see METHODS, Data analysis and Fig. 2). The resulting I values are plotted as a function of spatial selectivity in Fig. 10 (left column, circles). To demonstrate how the suppression of responses to trailing segments contributes to overall response asymmetry, the responses to leading segments in the masker-lagging condition were analyzed in similar fashion (Fig. 10, left column, triangles). At delays of 1 and 2 ms, most units had little or no response to the trailing target-alone segment, resulting in negative values. As delay was increased, responses to the trailing segment emerged more quickly in less spatially selective units than in the more selective ones. Thus at a delay of 50 ms, I values for responses to trailing segments are close to 0 in the less-selective units (STI >0.13), but well below 0 for many of the more-selective units. The recovery of responses at short delays in the less spatially selective units indicates that masking effects are primarily limited to the overlap segment. This effect is consistent with the expectations for acoustic masking, as well as lateral inhibition with a short time constant. By contrast, the suppression of responses to the trailing segment at delays up to tens of milliseconds in the highly selective units is suggestive of long-acting lead-evoked inhibition. This effect is unlikely to have resulted from response adaptation because the strongest suppression of trailing segment responses usually occurred after overlap segments that evoked little or no response. A peripheral mechanism is equally unlikely because interactions within peripheral filters are limited to a few milliseconds.



FIG. 10. Masking of responses to trailing target-alone stimulus segments depended on spatial selectivity. Left column: I values calculated for responses to trailing segments, relative to responses to the onset portion of responses to the target alone, are plotted against spatial selectivity (circles). I values for responses to leading target-alone segments are plotted for comparison (triangles). All conventions are the same as Fig. 9.

 
The suppression of responses to the trailing target-alone stimulus segments makes a major contribution to the temporal asymmetry of masking effects in the most spatially selective units. Trailing segment masking is, by its very nature, a form of temporal asymmetry because the response to the leading portion of the target in the masker-lagging condition (Fig. 2, Target Leads, shaded areas) is not subject to a similar influence. This is illustrated in Fig. 10 by comparison of I values calculated from responses to the trailing (circles) and leading (triangles) segments. At delays >1 ms, the values for leading segment responses are all near 0, indicating that these responses are similar to the onset portion of the target-alone response. (The negative I values for leading-segment responses in a few units at 1 and 2 ms could indicate either that the Rt+m window was long enough to include a portion of the response to the overlap segment, or that the integration time for neuronal responses was greater than the leading segment duration.) The temporal asymmetry of suppression is illustrated in the right column of Fig. 10 by plotting the difference in I values for trailing and leading segments (asymmetry = Itrailing – Ileading) as a function of spatial selectivity. At delays of 1 and 2 ms nearly all units exhibit strong temporally asymmetric suppression. This effect is not expected based on acoustic masking, but is consistent with either peripheral filter interactions, or short-acting inhibition. As the delay is increased, the asymmetry of responses in the less spatially selective units approaches 0 more quickly than that of the more selective units, as expected from the difference in suppression of responses to the trailing segments. By 50 ms, the values for poorly selective units are all close to 0, whereas many highly selective units still exhibit considerable asymmetry, resulting from the long-lasting suppression of trailing segment responses.

In summary, separate analysis of responses to the overlap and trailing stimulus segments suggests that the observed masking effects reflect a combination of several factors. In the least spatially selective neurons, at delays >2 ms, a combination of temporally symmetric suppression of responses to the overlapping stimulus segment and robust responses to trailing target-alone segments resulted in approximately equal masking effects in the masker-leading and masker-lagging conditions. This type of masking is consistent with the expected effects of acoustic superposition on binaural cues, with possible additional contributions of temporally symmetric lateral inhibition. Such acoustic effects also appear to make a major contribution to the suppression of responses to the overlapping stimulus segment in the more selective units. However, this mechanism cannot account for the temporally asymmetric suppression in the more selective units that resulted from a combination of long-lasting suppression of responses to trailing segments of the target and, in some units, greater suppression during the overlap segment in the masker-leading condition. Such effects are consistent with a lead-evoked inhibitory mechanism. Finally, at delays of 1 and 2 ms nearly all units exhibited some level of temporally asymmetric suppression. In most units, this effect was primarily attributable to the masking of responses to trailing target-alone segments. This effect is consistent with a peripheral mechanism or with short-acting lateral inhibition.

Neuronal detection of leading and lagging sounds

The temporal asymmetry of suppressive effects results in a difference in the ability of the most spatially selective IC neurons to detect leading and lagging targets at their best locations. This effect was quantified by ROC analysis measuring the ability of neurons to signal the presence of the target through changes in discharge rate relative to the masker-alone condition (see METHODS, Data analysis for details). The methods used to generate ROC curves from neuronal spike data are based on those used in previous studies (e.g., Bradley et al. 1987; Britten et al. 1992; Mountcastle et al. 1969) and are illustrated in Fig. 11. In this example, there was a small response to the masker, located on the edge of the SRF (Fig. 11B, masker alone), and a much larger response to the target at the best location (Fig. 11B, target alone). When the masker led the target by 1 ms (Fig. 11B, +1 ms), the response was slightly greater than that to the masker alone. When the target led by 1 ms (Fig. 11B, –1 ms), the response was much greater than that to the masker alone, but still less than that to the target alone. To construct ROC curves, spikes were counted on each stimulus repetition in a time window set to capture all spikes evoked by either sound (Fig. 11B, dashed vertical lines, same windowing as in Figs. 1 and 7). A set of response criteria was adopted that spanned the range of recorded spike counts. ROC curves were generated by plotting the proportion of trials on which Rt+m exceeded the criterion ("hits") against the proportion of trials on which Rm exceeded the criterion ("false alarms") for each criterion value. The areas under the resulting curves [proportion correct, p(c)] are equivalent to the performance of an ideal observer using the neuron's spike counts as a decision variable in a 2-alternative forced-choice task (Green and Swets 1966). ROC curves obtained from the responses in Fig. 11B are shown in Fig. 11C. 
In this example, when the target led the masker by 1 ms (–1 ms, triangles), the response on nearly every trial was greater than the maximum response in the masker-alone condition, resulting in nearly perfect detection performance [0.99 p(c)]. The smaller response when the masker led by the same amount (+1 ms) was reflected in lower target detectability [0.74 p(c)]. Increasing the target delay to +5 ms caused the response to increase (response not shown), improving detection performance to 0.92 p(c).
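The ROC construction described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the analysis code used in the study: the function name `roc_area` is hypothetical, spike counts are assumed to be integers, criteria are stepped in units of one spike, and the area is obtained by trapezoidal integration (the integration method is not specified in this section).

```python
def roc_area(target_plus_masker_counts, masker_alone_counts):
    """Area under the ROC curve, p(c), from per-trial spike counts.

    For each criterion, a "hit" is a target+masker trial whose count exceeds
    the criterion, and a "false alarm" is a masker-alone trial whose count
    exceeds it. Criteria span the full range of recorded counts.
    """
    lowest = min(min(target_plus_masker_counts), min(masker_alone_counts))
    highest = max(max(target_plus_masker_counts), max(masker_alone_counts))

    points = []
    # Sweep from the strictest criterion (no trial exceeds it) down to one
    # below the lowest count (every trial exceeds it): from (0, 0) to (1, 1).
    for criterion in range(highest, lowest - 2, -1):
        hits = sum(c > criterion for c in target_plus_masker_counts)
        false_alarms = sum(c > criterion for c in masker_alone_counts)
        points.append((false_alarms / len(masker_alone_counts),
                       hits / len(target_plus_masker_counts)))

    # Trapezoidal integration of hit rate over false-alarm rate.
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area
```

With this construction, completely non-overlapping count distributions give an area of 1.0 (perfect detection), and identical distributions give 0.5 (chance), matching the interpretation of p(c) as ideal-observer performance in a 2-alternative forced-choice task.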



FIG. 11. Neuronal detection of targets at best location quantified by receiver operating characteristic (ROC) analysis. A: SRP of unit 883AJ with white and black circles indicating the locations of the target and masker, respectively. B: raster plots illustrate responses to 20 repetitions each of the target alone, masker alone, and target/masker combinations in which the target lagged (+1 ms) or led (–1 ms) the masker by 1 ms. Durations of the target (black) and/or masker (gray), delayed by the unit's response latency (11 ms) are shown above each raster. Time windows used for ROC analysis are indicated by vertical lines and markers. C: ROC curves generated from the responses in B and from responses to a target/masker combination with the target lagging by 5 ms. ROC curves were generated by plotting the proportion (P) of responses to the target/masker combination that exceeded criterion ("Hits") against the proportion of responses to the masker alone that exceeded criterion ("False Alarms"), for criterion values spanning the data range. Lag-delay value is illustrated on each ROC curve.

 
The relation between masker-evoked suppression and target detectability is illustrated in Fig. 12 for the 3 units used as examples in Figs. 5, 6, and 11. Detection functions, obtained by plotting p(c) as a function of lag delay, are shown in the left column. For comparison, the effect of the masker on the magnitude of responses to the target is quantified by "recovery" functions, plotted in the right column. Recovery was calculated according to the equation

(3)
with Rt+m and Rt measured in the same windows used in the initial interaction index calculation (Fig. 1). This measure was chosen because it is similar to that used previously to compare neuronal and behavioral echo thresholds (e.g., Fitzpatrick 1999; Yin 1994). For all 3 units, the magnitudes of responses to leading and lagging targets were reduced by 50% or more at the shortest delays. Nevertheless, the responses of each unit to leading targets were sufficient to enable perfect, or near perfect, detection performance. Units 883AJ (A, also Fig. 11) and 883CG (C, also Fig. 6) exhibited temporally asymmetric masker effects. In these units, the greater suppression of responses to lagging targets resulted in an asymmetry of detection performance at the shorter delays. Detection performance for lagging targets improved as the response recovered at longer delays. Unit 883CJ (B, also Fig. 5) exhibited symmetric masker effects. In this case, the suppressed responses to leading and lagging targets were sufficient to support perfect detection performance at all delays. The data from all 3 units demonstrate that even very heavily suppressed responses may be sufficient to reliably indicate the presence of a target at the unit's best location.



FIG. 12. Detection and spike-rate recovery functions for 3 example units. For each unit, detection functions are plotted in the left column and recovery functions in the right column. A: unit 883AJ exhibited temporally asymmetric masker effects at delays <10 ms. Detection ability was nearly perfect for leading targets at all delays, and moderately impaired for lagging targets at the shortest delays. SRF and responses are shown in Fig. 11. B: unit 883CJ (Fig. 5) exhibited symmetric masker effects and perfect detection performance for leading and lagging targets at all delays. C: unit 883CG (Fig. 6) exhibited an asymmetry in detection performance that mirrored the temporal asymmetry of masker effects.

 
Thresholds for target detection and response recovery for the entire unit sample are compared in Fig. 13. Thresholds were calculated by finding the lag delay at which performance exceeded an arbitrary criterion level of 0.75 p(c) or 50% recovery. Threshold values between sampled delays were estimated by linear interpolation. In a few cases, where the detection or recovery function crossed the criterion level more than once, the lag delay at which the function crossed and stayed above threshold was used. Thresholds were determined for both leading and lagging targets in 59 units, and for lagging targets only in 21 additional units. In most units for which at least one threshold fell within the range of lag delays tested (1 to 100 ms), thresholds for target detection were lower than those for response recovery. In the target-leading condition, responses of 20/59 units exceeded both detection and recovery criteria at the shortest delay. Of the 39 units with at least one threshold in the sampled range, 30 had lower thresholds for detection than for response recovery. The differences between thresholds obtained by the 2 measures were often large. Many units with suprathreshold detection performance at 1 ms delay had recovery thresholds on the order of tens of milliseconds. In the target-lagging condition, a higher proportion of units (70/80) had at least one threshold within the sampled range. The great majority of these units (56/70) also had lower thresholds for detection than for recovery. Again, recovery thresholds often exceeded detection thresholds by more than an order of magnitude. Thus thresholds derived from measures that simply reflect the magnitude of neuronal responses may greatly underestimate the ability of neurons to signal the presence of a target through reliable changes in firing.
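The threshold rule described above (linear interpolation between sampled delays, taking the crossing after which the function stays above criterion) might be sketched as follows. This is an illustration only: the function name `threshold_delay`, the return conventions, and the choice to interpolate on a linear (not logarithmic) delay axis are assumptions, not details given in the text.

```python
def threshold_delay(delays, values, criterion):
    """Estimate the lag delay at which `values` crosses and stays above `criterion`.

    delays: sampled lag delays in ascending order (ms).
    values: detection, p(c), or recovery (%) measured at each delay.
    Returns:
      delays[0] if performance is above criterion at every sampled delay
                (threshold at or below the shortest delay tested),
      None      if performance at the longest delay is not above criterion,
      otherwise the linearly interpolated crossing delay.
    """
    # Index of the last sample that does NOT exceed the criterion.
    last_below = None
    for i, v in enumerate(values):
        if v <= criterion:
            last_below = i
    if last_below is None:
        return delays[0]
    if last_below == len(values) - 1:
        return None
    # Interpolate between the last subthreshold sample and the next sample.
    d0, d1 = delays[last_below], delays[last_below + 1]
    v0, v1 = values[last_below], values[last_below + 1]
    return d0 + (criterion - v0) * (d1 - d0) / (v1 - v0)
```

For example, with hypothetical p(c) values of 0.6, 0.7, 0.8, 0.9, and 1.0 at delays of 1, 2, 5, 10, and 20 ms, the 0.75 criterion is crossed between 2 and 5 ms, and the interpolated threshold is 3.5 ms.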



FIG. 13. Comparison of neuronal response thresholds obtained from recovery and detection functions. Thresholds falling outside the range of lag delays tested (1 to 100 ms) are plotted outside the plot frames. Plot symbols are the same as those in Fig. 7.

 
Analysis of detection thresholds revealed an asymmetry of performance for leading and lagging sounds among the more spatially selective units that mirrored the asymmetry of response suppression. The distributions of detection thresholds for all units, grouped according to spatial selectivity (ST cutoff = 0.13), are shown in Fig. 14A. Most units, in both categories, could detect leading targets at all delays. The proportion of broadly tuned units with lead-detection thresholds below 1 ms (14/16) was slightly higher than that for sharply tuned units (27/37). There was a much greater difference in the abilities of the 2 populations to detect lagging targets. A high proportion of broadly tuned units (18/25) could also detect lagging targets at all delays. By contrast, most sharply tuned units were unable to detect targets at short delays. Only 17/53 sharply tuned units had thresholds below 1 ms.



FIG. 14. Asymmetry of detection thresholds for leading and lagging sounds depends on spatial selectivity. Detection threshold was defined as the delay value at which detection performance exceeded 0.75 p(c). A: distributions of detection thresholds for leading (top row) and lagging (bottom row) targets among subpopulations of IC units grouped according to spatial selectivity (ST cutoff = 0.13). Left-most bin in each histogram counts units that had suprathreshold detection performance at the shortest delay tested. Right-most bin counts units that did not achieve threshold detection at any delay ≤50 ms. B: cumulative proportions of units in each population that achieved suprathreshold performance for detection of leading (triangles) or lagging (circles) targets plotted as a function of lag delay.

 
The detection performance of the 53 units tested with both leading and lagging targets is summarized in Fig. 14B. Here, cumulative proportions of suprathreshold units are plotted as a function of lag delay. At a delay of 1 ms, most of the broadly tuned units could detect both leading and lagging targets. As the delay was increased to 5 ms, the proportion of these units that were above threshold increased to 1.0. Among the more sharply tuned units, there were large disparities in detection performance for leading and lagging targets at short delays that diminished as delay was increased. Thus at delays from 1 to 10 ms, there is a substantial difference in the abilities of neurons with differing levels of spatial selectivity to detect a lagging source at best location, and hence to contribute to the neuronal representation of its location. The neuronal representation of the location of a lagging source by responses of the less spatially selective units is robust. By contrast, the representation of the location of a lagging source conveyed by responses of the most spatially selective units is substantially attenuated relative to that of a leading source.
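A cumulative-proportion curve of the kind summarized in Fig. 14B could be computed from per-unit detection thresholds roughly as below. The function name `cumulative_proportion` and the use of `None` for units that never reached threshold are illustrative assumptions, and the example thresholds are invented for demonstration.

```python
def cumulative_proportion(thresholds, delays):
    """Proportion of units whose detection threshold is at or below each delay.

    thresholds: one value per unit (ms); None marks a unit that did not
                reach threshold at any tested delay (assumed convention).
    delays:     lag delays (ms) at which to evaluate the cumulative proportion.
    """
    n_units = len(thresholds)
    proportions = []
    for d in delays:
        # Count units that are suprathreshold by this delay.
        n_above = sum(1 for t in thresholds if t is not None and t <= d)
        proportions.append(n_above / n_units)
    return proportions


# Hypothetical thresholds for 4 units, evaluated at 1, 5, and 10 ms.
example = cumulative_proportion([1, 1, 3.5, None], [1, 5, 10])
```

Applying this separately to leading and lagging thresholds within each spatial-selectivity group would give one curve per condition and population.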

Azimuth tuning of responses to leading and lagging targets

Results of recent modeling studies suggest that suppression of neuronal responses to lagging sounds may result from interactions within peripheral filters that give rise to new effective internal interaural time differences (ITDs