We examined the accuracy and precision with which the barn owl (Tyto alba) turns its head toward sound sources under conditions that evoke the precedence effect (PE) in humans. Stimuli consisted of 25-ms noise bursts emitted from two sources, separated horizontally by 40°, and temporally by 3–50 ms. At delays from 3 to 10 ms, head turns were always directed at the leading source, and were nearly as accurate and precise as turns toward single sources, indicating that the leading source dominates perception. This lead dominance is particularly remarkable, first, because on some trials, the lagging source was significantly higher in amplitude than the lead, arising from the directionality of the owl's ears, and second, because the temporal overlap of the two sounds can degrade the binaural cues with which the owl localizes sounds. With increasing delays, the influence of the lagging source became apparent as the head saccades became increasingly biased toward the lagging source. Furthermore, on some of the trials at delays ≥20 ms, the owl turned its head, first, in the direction of one source, and then the other, suggesting that it was able to resolve two separately localizable sources. At all delays <50 ms, response latencies were longer for paired sources than for single sources. With the possible exception of response latency, these findings demonstrate that the owl exhibits precedence phenomena in sound localization similar to those in humans and cats, and provide a basis for comparison with neurophysiological data.
In a typical, natural environment, a sound arriving directly from a single source is followed by a succession of reflections arriving from different directions. Despite the presence of conflicting directional information, however, the accuracy of sound localization in reverberant environments is only slightly worse than that in anechoic conditions (Hartmann 1983; Rakerd and Hartmann 1985). The ability to localize and comprehend sounds from the direct source amid the clutter of reflections is thought to depend on a set of phenomena, collectively referred to as the precedence effect (PE), and united by the common theme that the first arriving sound dominates the perception of later-arriving reflections (reviewed in Blauert 1997; Litovsky et al. 1999).
Psychophysical studies have often presented leading and lagging sounds from two locations to simulate the sound arriving directly from a source followed by a single reflection. In these conditions, the subject's percept depends on the lead-lag delay. When identical sounds are presented simultaneously from two loci, subjects experience “summing localization,” meaning that a single source is perceived midway between the two speakers (Keller and Takahashi 1996a; Snow 1954). When one sound leads by a few milliseconds, a single sound is localized at a position near the leading source, a phenomenon termed “localization dominance” or “the law of the first wavefront” (e.g., Haas 1951; Litovsky et al. 1999; Wallach et al. 1949). At these delays, subjects also experience a loss in sensitivity to changes in the location of the lagging source, which has been termed “lag discrimination suppression” (Litovsky and Macmillan 1994; Saberi and Perrott 1990; Shinn-Cunningham et al. 1993; Spitzer et al. 2003; Yost and Soderquist 1984; Zurek 1980). At longer delays, the lagging sound becomes perceptible as a separate event, at which point the “echo threshold” is said to have been reached (reviewed in Blauert 1997).
Several studies have demonstrated that responses of spatially sensitive neurons to sounds at their best locations (or binaural configurations) are suppressed when preceded by sounds from different locations, thus paralleling the behavioral observations (Fitzpatrick et al. 1995, 1999; Keller and Takahashi 1996b; Litovsky and Delgutte 2002; Litovsky and Yin 1998a,b; Mickey and Middlebrooks 2001; Spitzer et al. 2004; Tollin et al. 2004; Yin 1994). This form of suppression could provide a basis for localization dominance because the neural representation of the location of the lagging source is heavily attenuated, whereas that of the leading source remains relatively intact. To test the relationship between neuronal and behavioral precedence phenomena, however, it is necessary to obtain comparable measures in the same species, as was recently done for the first time in the cat (Tollin and Yin 2003; Tollin et al. 2004).
The behavioral and neuronal specializations of the barn owl (Tyto alba) for acoustically guided predation make this species an excellent model for studying the neuronal basis of sound localization in complex acoustic environments. Previous studies in this laboratory have established that barn owls exhibit summing localization (Keller and Takahashi 1996a) and lag discrimination suppression similar to humans (Spitzer et al. 2003), and provided qualitative evidence for localization dominance and echo thresholds in a lateralization-type task (Keller and Takahashi 1996b). Physiological studies have demonstrated neuronal correlates of these behavioral phenomena in the space-mapped portion of the owl's auditory midbrain (Keller and Takahashi 1996b; Spitzer et al. 2004; Takahashi et al. 2001). We now use the owl's natural tendency to turn its head in the direction of a sound source to measure the effects of simulated reflections on sound localization. The results have implications concerning the manner in which the neuronal image within the auditory space map is used to select and orient toward multiple acoustic targets.
All experiments, conducted in accord with the Guidelines for the Care and Use of Laboratory Animals of the U.S. National Institutes of Health, were approved by the Institutional Animal Care and Use Committee of the University of Oregon.
Subjects and training
Two hand-reared barn owls (owls “N” and “J”) from our captive breeding colony were trained to make head turns (saccades) to transient auditory and visual stimuli to obtain food rewards. Owls were kept at 85% of normal free-feeding body weight to maintain motivation during behavior testing. Between experimental sessions, the owls were housed together in a flight cage, physically isolated from but in audiovisual contact with the main colony.
Experiments were conducted in a double-walled, sound-attenuating anechoic chamber (Industrial Acoustics), the properties of which were previously described (Spitzer et al. 2003). All trials were conducted in complete darkness. During training and test sessions, the owl was tethered to a perch in the center of the room. Acoustical stimuli were presented from speakers mounted on a pair of hoops that formed arcs of a sphere with a radius of 1.5 m, centered on the approximate location of the owl's head. During test sessions, the orientation of the owl's head in two polar coordinates, azimuth and elevation, was sampled continuously at 468 Hz, using a custom-built magnetic search coil system (Remmel Labs, Ashland, MA). A magnetic search coil was attached to a surgically implanted head post, and used to sample the magnetic field imposed by two orthogonal coils, centered on the owl's perch. Details regarding the surgical implantation of the head post were previously described (Euston and Takahashi 2002). The head coil system was calibrated before each test session to obtain a linear response within the range of ±50° in azimuth and elevation. This calibration compensated for day-to-day variation of the magnetic field and enabled the system to remain accurate to within 1–2° throughout the calibrated range. The owls were monitored throughout test sessions by use of an infrared sensitive video camera and infrared light source. Owls were rewarded for successful head turns (see following text, Behavioral paradigm) by use of a mechanical feeder, triggered remotely by the experimenter sitting outside the sound chamber.
Experiments were controlled using an interactive graphical user interface implemented in Matlab v. 6.5 (The MathWorks, Natick, MA) that enabled the experimenter to initiate trials and to reward behavior based on localization accuracy criteria. Stimuli were digitally generated in Matlab and then converted to analog signals at a 30-kHz sampling rate, attenuated, and amplified using Tucker–Davis Technologies System II hardware (TDT PD1, PA4, HB6). The powered signal was fed through a multiplexer (TDT SS1) that allowed each of two input channels to be fed to one of eight speakers mounted on the hoops. The speakers (Peerless 2-in. cone tweeters) were calibrated at the start of each session and were replaced if their output deviated from a flat response (±2.5 dB) in the range from 3 to 9 kHz.
Stimuli were randomly selected from a library of 30 band-pass noise waveforms of 25-ms duration with flat amplitude spectra from 3 to 9 kHz. Noise waveforms were generated by inverse Fourier transformation of a square-wave amplitude spectrum (3–9 kHz pass-band) and a random phase spectrum, resulting in a digital waveform with >90-dB stop-band attenuation. The spectra of the resulting acoustical stimuli were shaped by transfer functions of the transducers, described above. Stimulus waveforms were ramped on and off with 2.5-ms cosine envelopes. Between trials, stimulus level was roved at random through a range of 30 ± 5 dB sound pressure level (SPL; A-weighted), as measured by a microphone (BK-4176, Bruel & Kjær) positioned at the approximate location of the owls' heads, before the test session. Within this range of SPLs, probabilities of response, latency, and accuracy were shown to achieve asymptotic values (Whitchurch and Takahashi 2003). On paired-source trials, the same waveform was used as the leading and lagging sound, and both were presented at the same SPL as measured at the location that the owl's head would occupy during experiments.
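The stimulus synthesis described above can be sketched in a few lines. The following is a minimal illustration, not the Matlab code used in the experiments: it builds a 25-ms burst by summing one unit-amplitude cosine per Fourier component between 3 and 9 kHz, each with a random phase (which is equivalent to inverse transformation of a square amplitude spectrum with a random phase spectrum), and then applies 2.5-ms raised-cosine ramps. The function names, the seed argument, and the peak normalization are assumptions made for the example.

```python
import math
import random

FS_HZ = 30000  # D/A sampling rate reported in the text


def bandpass_noise(dur_s=0.025, f_lo=3000.0, f_hi=9000.0, ramp_s=0.0025,
                   fs=FS_HZ, seed=0):
    """Return one noise burst as a list of floats, peak-normalized to 1.0."""
    rng = random.Random(seed)              # seed stands in for a library index (assumption)
    n = round(fs * dur_s)                  # 750 samples for a 25-ms burst
    df = fs / n                            # spacing of Fourier components (40 Hz)
    bins = range(math.ceil(f_lo / df), math.floor(f_hi / df) + 1)
    # Square amplitude spectrum: unit amplitude in band, zero elsewhere.
    # Random phase spectrum: one uniformly distributed phase per component.
    phases = [(k, rng.uniform(0.0, 2.0 * math.pi)) for k in bins]
    wave = [sum(math.cos(2.0 * math.pi * k * i / n + ph) for k, ph in phases)
            for i in range(n)]
    n_ramp = round(fs * ramp_s)            # 75 samples for a 2.5-ms ramp
    for i in range(n_ramp):
        gain = 0.5 * (1.0 - math.cos(math.pi * i / n_ramp))
        wave[i] *= gain                    # onset ramp
        wave[n - 1 - i] *= gain            # offset ramp (mirror image)
    peak = max(abs(x) for x in wave)       # normalization is an assumption
    return [x / peak for x in wave]


def lead_lag_pair(wave, delay_s, fs=FS_HZ):
    """Return (lead, lag) channels: the same waveform, the lag delayed by delay_s."""
    pad = [0.0] * round(fs * delay_s)
    return wave + pad, pad + wave
```

For a 3-ms lag delay, `lead_lag_pair(bandpass_noise(), 0.003)` yields two 840-sample channels in which the lagging copy starts 90 samples after the leading one. Level roving and speaker equalization are outside the scope of this sketch.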
Each owl was initially trained to turn its head to within 5° of single acoustic and visual targets to obtain rewards of mouse meat, provided by the experimenter using a remote-controlled feeder. Experimental test sessions began after the owl's performance was reliably better than this criterion level for short head turns (≤20°). Both owls retained a tendency to undershoot the locations of acoustic targets, despite training not to do so, and one owl (J) was reluctant to make turns to targets at elevations above the center of gaze.
An experimental test session consisted of 40 to 65 trials. The feeder was loaded with 40 rewards before the session; and the session was ended when the last reward was given. On the majority of trials, a sound was presented from a single speaker (hereafter referred to as “single-source trials”), and the owl was rewarded if it turned its head to within 5° of the target speaker. The target was selected at random from eight speakers mounted on the hoops in front of the owl's perch. Two pairs of speakers located along the horizontal meridian were used for stimulus presentation on both single- and paired-source trials. Individual speakers within each pair were separated by 40°. Four additional speakers, located along the vertical meridian, were used exclusively on single-source trials. The locations of individual speakers and the centers of speaker pairs were randomly varied throughout a 10° range between test sessions to prevent the owls from memorizing target locations. The owls were allowed to freely scan the environment between trials. Trials were triggered by the experimenter when the owl's head was still, and oriented within the calibrated portion of the magnetic field. In addition, the experimenters avoided triggering turns when the owl's gaze was directed below the acoustic target because one of the owls (owl J) was reluctant to make upward saccades.
Randomly interspersed among the single-source trials were five to 12 “paired-source” trials, in which sounds were presented from two speakers, positioned along an arc at 0° elevation and separated in azimuth by 40°. On paired-source trials, an identical waveform was presented from each speaker and one of the two signals was delayed by 3 to 50 ms, to simulate an acoustic reflection of the sound emitted from the leading speaker. Paired sources presented within a single test session had either a single delay or up to three different delays. Rewards were given for any directed head turn, occurring within 500 ms of sound onset on paired-source trials, to avoid training the owls to behave in any particular manner. Thus the single-source trials within each test session served to reinforce accurate localization behavior and as controls against which behavior on paired-source test trials could be compared. To reinforce accurate sound localization, experimental test sessions were interleaved on successive days with training sessions in which only single sources were presented.
The manual triggering of trials provided a potential avenue for experimenter bias in the selection of starting head position. Therefore two post hoc statistical analyses were performed to determine whether any such bias was present in the data. A one-way ANOVA was performed to test for differences in head azimuth at stimulus onset across all test conditions (single sources and paired sources at all lag delays). The results demonstrated no significant effect of test condition on starting position for either owl (owl J: P = 0.228; owl N: P = 0.785). Furthermore, paired t-tests were performed to test for differences between initial head position in each paired-source condition (lag delay) and the single-source trials within the same test sessions. No significant difference (P > 0.1) was detected for any paired-source condition in either owl. These results provide no evidence of bias in the selection of initial head orientation across test conditions.
On each trial, head orientation data were collected for 3 s, commencing 200 ms before the onset of the leading or single sound. Analysis of saccades used methods similar to those proposed in a previous study of sound localization in cats (Populin and Yin 1998), with one modification necessitated by the idiosyncratic behavior of one of the subjects (owl N). Saccade onset was defined as the point at which absolute angular velocity of the head exceeded 2 SDs of the distribution of angular velocities measured in the 200 ms before stimulus onset. Trials were considered to have “no response,” and excluded from analysis, if no head-turn onset was detected within 600 ms of stimulus onset, although this rarely occurred. Data were also excluded from trials on which the head moved by more than 2° in azimuth or elevation before sound onset, or if the head orientation went beyond the calibrated portion of the magnetic field. The head turn endpoint was defined as the first point at which absolute angular velocity dropped below 5°/s. This criterion was adopted, in place of that used to define saccade onset, because one of the two owls would often slow to a near stop, without actually fixating the target.
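The onset and endpoint criteria can be illustrated with a short sketch. It is written under several assumptions that the text does not specify: angular speed is approximated from sample-to-sample differences in azimuth and elevation, the onset threshold is taken as 2 SDs of the speeds in the 200-ms prestimulus window, and the function names are hypothetical.

```python
import math
import statistics

FS_HZ = 468.0   # head-orientation sampling rate reported in the text
PRE_S = 0.2     # recording starts 200 ms before sound onset


def angular_speed(az, el, fs=FS_HZ):
    """Sample-to-sample angular speed in deg/s (small-angle approximation)."""
    return [math.hypot(az[i + 1] - az[i], el[i + 1] - el[i]) * fs
            for i in range(len(az) - 1)]


def find_saccade(az, el, fs=FS_HZ, pre_s=PRE_S, n_sd=2.0,
                 stop_dps=5.0, max_latency_s=0.6):
    """Return (onset_index, end_index), or None for a "no response" trial."""
    speed = angular_speed(az, el, fs)
    n_pre = int(pre_s * fs)                           # samples before sound onset
    threshold = n_sd * statistics.pstdev(speed[:n_pre])
    last = min(len(speed), n_pre + int(max_latency_s * fs))
    # Onset: first poststimulus sample whose speed exceeds the threshold.
    onset = next((i for i in range(n_pre, last) if speed[i] > threshold), None)
    if onset is None:
        return None
    # Endpoint: first later sample at which speed drops below 5 deg/s
    # (None if that never happens within the record).
    end = next((i for i in range(onset + 1, len(speed)) if speed[i] < stop_dps), None)
    return onset, end
```

Response latency then follows as `(onset - int(PRE_S * FS_HZ)) / FS_HZ` seconds. The exclusion rules for prestimulus head movement and for excursions beyond the calibrated field are not implemented here.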
Because paired sources differed in azimuth, but not elevation, analysis of localization accuracy focused on the azimuthal component of head turns. The precision and accuracy of head turns to single sources were measured by performing a linear regression of the angular extent of head turns (“head-turn angle,” defined as the difference in azimuth between the head orientations at turn onset and endpoint) on “target angle,” defined as the difference in azimuth between the center of gaze and speaker direction at turn onset. One measure of localization accuracy is provided by the slope of this regression because any deviation of the regression slope from 1 is indicative of a systematic localization error. The residuals after regression provide an indication of the dispersion of localization judgments about the regression line and are used to provide two measures of the precision of localization. The residual variance, $s_R^2$ (Eq. 1), measures the dispersion of head-turn angles about the regression line; it is computed from the squared deviations of αHT, the measured head-turn angle, from αp, the head-turn angle predicted by the regression, over the n head turns. The square root of $s_R^2$ is referred to as the residual error, $E_r = \sqrt{s_R^2}$ (Eq. 2). This quantity is equivalent to the SD of head-turn angles relative to their central tendency, as represented by the regression line, and is thus a measure of the precision of localization in azimuth, with units in degrees. If a single-target azimuth had been tested repeatedly, as in previous studies (Knudsen and Konishi 1979; Tollin and Yin 2003), this measure would become identical to the SD of head-turn angles for that location. Er also provided a numerical basis for excluding outlying responses to single sources. Only one head turn in one bird (owl J, 5-ms lag-delay condition) was excluded because the turn angle deviated from the regression prediction by 49.4°, equivalent to a deviation of >10 Er values.
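As an illustration of the regression measures, the sketch below fits head-turn angle on target angle by ordinary least squares and returns the slope, the residual variance, and the residual error. The divisor n − 2 (the usual residual degrees of freedom for a fitted line) is an assumption, as are the function names; the divisor is exposed as a parameter so that other conventions can be substituted.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares slope and intercept of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def residual_error(target_angle, head_turn_angle, dof_lost=2):
    """Return (slope, intercept, residual variance, residual error in deg)."""
    slope, intercept = fit_line(target_angle, head_turn_angle)
    # Sum of squared differences between measured and predicted head-turn angles.
    ss = sum((turn - (slope * tgt + intercept)) ** 2
             for tgt, turn in zip(target_angle, head_turn_angle))
    s_r2 = ss / (len(head_turn_angle) - dof_lost)   # n - 2 is an assumption
    return slope, intercept, s_r2, math.sqrt(s_r2)
```

A slope below 1 in the returned tuple corresponds to systematic undershoot, and the last element is the precision measure in degrees.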
The regression for single-source trials also provided a basis for testing the localization dominance hypothesis (Tollin and Yin 2003), which states that the location of the leading source will exert a dominant influence on the perceived location of paired sources. This hypothesis predicts that, on paired-source trials, head-turn angle will be accurately predicted from the single-source regression, if the leading source is assumed to be the target. Thus when head angle is plotted as a function of azimuth to the leading source, all points should fall within the scatter about the regression line for single-source trials. This hypothesis was evaluated by determining the proportions of paired-source trials, thus plotted, on which the head-turn angle was outside the 99% confidence interval for predictions from the single-source regression (Sokal and Rohlf 1981).
The dominance of the leading source on localization behavior (“Lead Dominance”) was quantified using a metric, c, similar to that used in previous dichotic lateralization studies in human subjects (Shinn-Cunningham et al. 1993). In the present application, c is defined by Eq. 3, in which αp is the head-turn angle predicted from the single-source regression, treating the leading source as the target, αHT is the actual head-turn angle, and ΔAz is the angular separation between leading and lagging sources, which was always 40°. This metric expresses the difference between the observed head-turn angle and that predicted by the location of the leading source on a scale, with 1 indicating no difference (i.e., complete lead dominance) and 0 indicating a difference equal to the azimuthal separation of leading and lagging sources, as would be expected if the owl localized the lagging source. In practice, values sometimes exceeded 1 because the head turns “overshot” the leading source, but never were as low as 0.
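One way to realize a metric with the stated properties is sketched below. The sign convention is an assumption: the difference between observed and predicted head-turn angle is measured in the direction of the lagging source, so that c equals 1 when the turn matches the lead-source prediction, falls to 0 when the turn is displaced by the full 40° toward the lag, and exceeds 1 when the turn deviates away from the lag. The function name and the lag_side argument are hypothetical.

```python
def lead_dominance(head_turn_angle, predicted_angle, lag_side, separation_deg=40.0):
    """Lead-dominance value c for one paired-source trial.

    lag_side is +1 if the lagging source lies to the right of the leading
    source (larger azimuth) and -1 if it lies to the left (assumed convention).
    """
    # Displacement of the observed turn from the lead-source prediction,
    # counted as positive when it points toward the lagging source.
    shift_toward_lag = lag_side * (head_turn_angle - predicted_angle)
    return 1.0 - shift_toward_lag / separation_deg
```

For example, a turn of 30° when the lead-source prediction is 20° and the lag lies to the right gives c = 0.75, whereas the same turn with the lag to the left gives c = 1.25.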
The proportion of paired-source trials on which the saccade trajectories had sudden reversals in azimuth was used as a metric of the owls' abilities to resolve the locations of the two sources. The proportions of reversals on paired-source trials were compared with the proportions of reversals detected on the single-source trials within the same experimental test sessions, which served as a negative control. The significance of the difference between the proportions of reversals on paired-source and matched single-source trials was evaluated using the χ2 test for differences in frequency distributions. To detect reversals, the angular velocity of the azimuthal component of the head turn was first calculated as the first derivative of the azimuth values sampled between the onset and endpoint of the head turn. Reversals in azimuth were defined as any point at which the sign of the angular velocity on five of six preceding sample points was opposite to that on five of the six subsequent points. The 5/6 criterion was used to avoid rejecting trials on which the head became essentially stationary, in azimuth, for a few milliseconds during a reversal. In such cases, angular velocity declines to very small values (≪1°/s), which are subject to apparently random fluctuations around 0°/s, most likely as a result of electrical noise in the recording system. To avoid false positive errors, as indicated by an increase in the proportion of reversals detected on single-source trials, trials on which the total angular extent of the head turn was <6° were excluded from analysis, and reversals were accepted only if they occurred after the angular velocity of the azimuth component initially exceeded 20°/s and before it finally declined to <20°/s. These measures prevented counting as reversals small, but systematic deviations in azimuth that frequently occurred when the head was moving primarily in elevation.
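The reversal criterion can be sketched as follows. This is an illustrative reading of the rules above, not the analysis code itself: the six "subsequent" points are taken to start at the candidate sample, the total angular extent of the turn is supplied by the caller, and the function names are hypothetical.

```python
def count_sign(values, sign):
    """Number of velocity samples whose sign matches `sign` (+1 or -1)."""
    return sum(1 for v in values if v * sign > 0)


def has_reversal(az, total_extent_deg, fs=468.0, window=6, need=5,
                 gate_dps=20.0, min_extent_deg=6.0):
    """True if the azimuth trace (saccade onset to endpoint) contains a reversal."""
    if total_extent_deg < min_extent_deg:
        return False                      # turns smaller than 6 deg are not analyzed
    # Azimuthal angular velocity as the first difference of azimuth samples.
    vel = [(az[i + 1] - az[i]) * fs for i in range(len(az) - 1)]
    fast = [i for i, v in enumerate(vel) if abs(v) > gate_dps]
    if not fast:
        return False
    # Only consider points after velocity first exceeds 20 deg/s and
    # before it finally falls below 20 deg/s.
    first = max(window, fast[0] + 1)
    last = min(len(vel) - window, fast[-1])
    for i in range(first, last + 1):
        before, after = vel[i - window:i], vel[i:i + window]
        for sign in (1, -1):
            # 5-of-6 rule: tolerates one near-zero or noisy sample on each side.
            if count_sign(before, sign) >= need and count_sign(after, -sign) >= need:
                return True
    return False
```

Applying `has_reversal` to every trial and dividing the number of positive trials by the number of analyzed trials gives the proportion used as the resolution metric.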
Head turns to single sources
Single sounds evoked rapid, saccadelike head turns toward the speaker, similar to those described earlier (Knudsen and Konishi 1979). Response percentages were close to 100% and mean response latencies were 63 (n = 976) and 67 ms (n = 669) for the two owls, with no significant difference between the mean latencies. These results are consistent with findings of a recent study in our laboratory, which demonstrated that both maximal response percentages and minimal latencies are reached at levels 25 to 35 dB below those used in these experiments (Whitchurch and Takahashi 2003).
Example trajectories of head turns to single sources are illustrated in Figs. 1 and 2 (top panels) for the two owls. Nearly all trajectories are directed downward from the initial orientation because both owls tended to direct their gaze above the speaker array between trials. In addition, owl J was reluctant to turn to targets located above his center of gaze, so trials were initiated only when his head orientation was level with, or above, the target speaker. Owl N did not share this habit, but upward targets were nonetheless avoided by the experimenter to maintain consistency of testing between subjects. Head turns by owl N (Fig. 1) typically followed a straight trajectory from the initial direction toward the source direction. Owl J often made head turns with slightly curved trajectories (Fig. 2), and with the curvature becoming most pronounced as the head neared its final orientation. The thin lines extending from the endpoints in Figs. 1 and 2 indicate the remaining distance and direction to the target. Both owls tended to undershoot the sound source by an amount proportional to its eccentricity, with the extent of undershoot being greater in elevation than in azimuth. The azimuthal components of owl J's saccades were so accurate that a systematic undershoot is not readily apparent from the small sample in Fig. 2 (but see Fig. 3). The azimuthal undershoot of owl N's head was more pronounced. On rare occasions, each owl would make a turn to a single source in which the trajectory had a sudden reversal (owl J: 10/976 trials; owl N: 3/669 trials). One such head turn is shown in Fig. 2 (arrow, top). Such behavior would be expected if the stimulus occurred in close temporal proximity to the onset of a spontaneous head turn.
Head turns to paired sources
The trajectories of saccades to paired sources changed as the delay between leading and lagging sounds increased. At lead-lag delays of 3 and 5 ms (Figs. 1 and 2, second panel from top), head turns were always directed toward the leading speaker and had trajectories similar to those evoked by single sources. This was true even in cases where the initial gaze direction was close to the lagging source [e.g., Fig. 1, leading targets at −32° azimuth (az.), 0° elevation (el.), and 36° az., 1° el.; Fig. 2 leading targets at −30° az., −14° el. and 41° az., −16° el.]. In these and subsequent figures, turns to paired sources are shown with gray lines and triangles indicating that the lagging source was 40° to the right of the leading source (Azimuthlag > Azimuthlead), and black lines and circles indicating that the lagging source was 40° to the left (Azimuthlag < Azimuthlead). At a lag delay of 10 ms, most turns by owl N still followed a nearly linear trajectory toward the leading source, but there were a few cases with abrupt midcourse changes of direction, one of which is illustrated (Fig. 1, arrow). At this delay, turns by owl J were similar to those at shorter lag delays and to single sources. At delays of ≥20 ms, both owls continued to localize the leading source on some trials, but also made numerous turns with pronounced changes of trajectory. Figures 1 and 2 illustrate that, in such cases, changes in trajectory were always directed toward one of the two sources. Furthermore, the initial trajectory was usually directed toward the leading source, although each owl made some turns with the first component directed toward the lagging source (not shown). In addition, at these longer delays, both owls made occasional turns with simple linear trajectories, but directed toward the lagging source (e.g., Fig. 1, 20 ms, arrow). As the example in Fig. 1 illustrates, such lag-directed turns were most often observed when the initial gaze direction was closer to the lagging source than to the leading one (see also Figs. 7 and 8).
Localization of single and paired sources
The localization in azimuth of single and paired sounds by each of the two owls is summarized in Figs. 3 (owl J) and 4 (owl N). In these figures, the azimuthal extent of the head turns (“Head-Turn Angle”) is plotted as a function of the azimuthal angle of the acoustic target, relative to the owl's center of gaze at turn onset (“Target Angle”). Thus if the owls had localized sound sources with perfect accuracy, all of the points from single-source trials (black dots) would have fallen along the unity line (solid diagonal).
Both owls localized single sources in azimuth with great precision. For both owls, the linear regression of head-turn angle on target angle for single sources, calculated within individual sessions, typically explained >95% of the variance of head-turn angles (owl J: 1 exception/35 sessions; owl N: 0 exceptions/22 sessions). The regression slopes, however, were <1 for both subjects (owl J: mean slope = 0.94; owl N: mean slope = 0.84), reflecting a tendency to undershoot the targets. In Figs. 3 and 4, each panel shows results for all single-source trials, pooled across all test sessions in which paired-sources were tested at the specified lag-delay. Because more than one delay was often tested in a single session, many of the single-source results are repeated in more than one panel. The regression analysis used to compare performance for single sources and paired sources was performed on the pooled results. The 99% confidence intervals for predictions from the single-source regressions are indicated by dashed lines.
Results for paired-source trials are shown with red triangles indicating that the lagging source was 40° to the right of the leading source and blue circles indicating that it was 40° to the left. If an owl had experienced a pronounced PE on a paired-source trial, it would have been expected to turn its head in the direction of the leading source. To test this prediction, the results from paired-source trials are plotted with the leading source being treated as the “target.” In this form of representation, the scatter of data points from single-source trials represents the predicted outcome in cases where the PE was overwhelming, and any systematic deviation from the single-source scatter would represent evidence of an influence of the lagging source on localization behavior. In the extreme case, where there is no PE, the owls would be expected to localize the lagging source on some trials, in which case the head-turn angle would be expected to be 40° to the left or right of the leading source, depending on the location of the lagging source. When the lagging source was displaced to the right, the predicted outcome for lagging source localizations is given by offsetting the single-source regression by 40° along the ordinate (red dashed line), whereas, when the lagging source was displaced to the left, the predicted outcome is given by offsetting the single-source regression by −40° (blue dashed line).
Both owls' localization behavior demonstrated strong precedence effects on paired-source trials with lag delays of 3 and 5 ms. The data points from most trials at these delays fall within the 99% confidence limits of predictions from the single-source regressions, when the leading source is treated as the target. Thus in most cases, the owls behaved as would be expected if they were localizing a single sound emitted from the leading speaker. The few exceptions all occurred when the initial gaze direction was closer to the lagging source than to the leading source: at target angles >20°, when the lagging source was 40° to the left (blue circles), or at target angles < −20°, when the lagging source was 40° to the right (red triangles). In these conditions, on some trials, the head-turn endpoints were slightly shifted, relative to the single-source scatter, in the direction of the lagging source. In all cases, however, the final head orientation was much closer to the direction of the leading source than that of the lagging source.
As the lag delay was increased to 10 ms, most head turns were still well predicted by the location of the leading source. On the few trials in which saccades terminated outside the 99% confidence intervals for the single-source regressions, however, the bias in the direction of the lagging source became more pronounced than at shorter delays. This effect was most pronounced for owl N (Fig. 4), which made two head turns that were approximately equidistant from lead- and lag-source predictions, and one turn that was actually closer to the lag-source prediction.
Localization behavior at lag delays ≥20 ms was less well predicted by the lead-source location. At these longer lag delays, both owls often made turns to locations that were equidistant between the predictions of leading and lagging sources, and to locations that were closer to the lagging source. As at shorter delays, whenever the head-turn angle fell outside the single source scatter, the head-turn angle was shifted in the direction of the lagging source. At the longer delays, however, such lag-directed shifts even occurred on trials in which the initial gaze direction was nearer to the leading source.
The localization results are summarized in Figs. 5–8. In Fig. 5, the proportion of paired-source trials on which localization was influenced by the lagging source (head-turn angle was outside the 99% confidence interval of predictions from the single-source regression) is plotted as a function of lag delay for each owl. Owl J exhibited a sharp increase in the proportion of lag-influenced head turns as delay was increased from 10 to 20 ms, whereas owl N exhibited a more linear increase.
The precision of localization for single sources and paired sources is compared in Fig. 6 by a plot of the residual error (Er, Eq. 2) as a function of lag delay. Er was calculated separately for paired-source trials on which the lagging source was located to the left and right of the leading source to avoid potentially confounding effects of any systematic biases in localization judgments. Filled symbols indicate conditions in which the residual variance (sR2, Eq. 1) for turns to paired sources was significantly greater (P < 0.05, F-test) than that for single sources tested in the same sessions. At delays of 3 and 5 ms, both owls made head turns with equal precision to both single and paired sources. At longer delays, the residual error increased because the lagging sources exerted more pronounced and variable effects on localization.
The effects of lagging sources on localization are illustrated in greater detail in Figs. 7 and 8, in which a metric of lead-source dominance c (see methods, Eq. 3) is plotted against the absolute azimuthal angle between the initial gaze direction and the lagging source (“Lag Azimuth”). Thus 0° indicates that the owl was facing the lagging source at stimulus onset. In these figures, the influence of the lagging source is manifest as a reduction in lead-source dominance (i.e., c approaches 0), which indicates a bias of final head orientation in the direction of the lagging source. At delays ≤10 ms, both owls' head turns exhibited this type of bias when the initial head orientation was nearer to the lagging source (Lag Azimuth <20°). This effect is clearer for owl N (Fig. 7) than for owl J (Fig. 8), but in both cases there was a tendency for the bias to increase with proximity of the initial head orientation to the lagging source. Owl N also showed a clear tendency for the magnitude of lag influence to increase as lag delay was increased from 3 to 10 ms. This effect is less clear for owl J because of the greater variability of head-turn angles. At delays ≥20 ms, owl N exhibited clearly bimodal distributions of lead dominance, indicating that head-turn angle was predominantly influenced by either the leading source or the lagging source. At these longer delays, owl J also made turns that were predominantly influenced by one source or the other, but the distributions of lead dominance were more continuous than those for owl N. At delays ≥20 ms, both owls made turns exhibiting strong lag influences even when the initial head orientation was closest to that of the leading source.
Resolution of the lagging source
The preceding analysis demonstrates that the influence of the lagging source increases with lag delay. The exact nature of that influence, however, is not entirely clear. At short delays, the owls always made head turns directed toward the leading source, with trajectories similar to those of single-source turns, suggesting that they heard a single sound located near the leading speaker. According to this interpretation, the observed biases of head-turn angles would represent shifts in the perceived location of a single perceptual event in the direction of the lagging source. Alternatively, the owls may have perceived the location of the lagging source, but for some reason oriented toward only the leading source. In either case, the leading source appears to dominate localization behavior. As the lag delay was increased, both owls began to make more complex saccades, in which the direction of gaze was initially directed toward one of the two sources and then shifted in the direction of the other source. This behavior suggests that, at longer delays, the owls attempted to localize two sources at different locations. In this case, the simple metric of lead-source dominance would cease to be an effective descriptor of the influence of the lagging source, beyond demonstrating that some influence exists.
We therefore quantified the proportions of head turns in which the azimuthal component exhibited a reversal of direction. Because head turns to single sources occasionally exhibited reversals, it is expected that some reversals would also occur on paired-source trials, even if the owls heard a single source. Therefore the proportions of reversals on paired-source trials were compared with the proportions of reversals on single-source trials within the same test sessions, which served as a negative control. The results of this analysis are shown in Fig. 9. At lag delays of 3 and 5 ms, owl N made no head turns with reversals, and owl J made reversals on only 1.8% of trials at 3 ms. There was no significant difference (P > 0.2, χ² test) between the proportions of head turns with reversals on paired- and single-source trials for either owl at these delays. At lag delays >10 ms, the proportion of reversals was always significantly greater on paired-source trials than on matched single-source trials (P values indicated on figure). At a lag delay of 10 ms, both owls made slightly more reversals on paired-source trials, with the difference between paired- and single-source trials being marginally significant (P = 0.04) for owl N and not significant for owl J.
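A comparison of reversal proportions of this kind can be sketched as a 2 × 2 Pearson χ² test. The version below is a minimal sketch with 1 degree of freedom and no continuity correction; whether the original analysis used a correction is not stated, so that choice is an assumption. The trial counts are hypothetical.

```python
import math

def chi_square_2x2(rev_a, n_a, rev_b, n_b):
    """Pearson chi-square test (1 df, uncorrected) for two proportions.

    rev_a of n_a trials and rev_b of n_b trials showed a reversal.
    Returns (chi2, p). If a whole column of the table is empty (for
    example, no reversals in either condition) the proportions are
    identical and (0.0, 1.0) is returned.
    """
    observed = [[rev_a, n_a - rev_a], [rev_b, n_b - rev_b]]
    row = [n_a, n_b]
    col = [rev_a + rev_b, (n_a - rev_a) + (n_b - rev_b)]
    total = n_a + n_b
    if 0 in col:
        return 0.0, 1.0
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row[i] * col[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected
    # Upper-tail probability of a chi-square variable with 1 df.
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p

# Hypothetical counts: 30 of 100 paired-source turns with reversals
# versus 5 of 100 single-source turns.
chi2, p = chi_square_2x2(30, 100, 5, 100)
print(f"chi2 = {chi2:.2f}, P = {p:.2g}")
```

With small expected counts, as at the shortest delays where reversals were rare, an exact test would be a safer choice than this approximation.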
These results provide no evidence that, at delays ≤5 ms, the owls were able to resolve the location of the lagging source. Saccades were always directed toward the leading source, with no significant indication of deviations toward the lagging source. Thus the small lag-directed bias of localization, observed on trials where the initial gaze direction was close to the lagging source, is consistent with a shift in the perceived location of a single perceptual event. It is not possible, however, to rule out the alternative hypothesis that the owls heard two sources at separate locations, but localized only the leading source. The increase in the proportions of turns with reversals at lag delays ≥20 ms indicates that, at least on some trials at these delays, the owls were able to resolve the locations of leading and lagging sources. Because of the complex trajectories of head turns at the longer delays, it is not possible to determine whether the owls perceived both sources at their actual locations or whether the perceived location of the lagging source was still influenced by the presence of the leading source, as has been reported in humans (Litovsky and Shinn-Cunningham 2001). It is interesting to note, however, that at delays ≥20 ms both owls frequently made turns that ended within the single-source localization scatters in Figs. 3 and 4, consistent with accurate localization of the leading source, although neither bird ever made head turns as large as those predicted for accurate localization of the lagging source. These results are suggestive of a residual influence of the leading source on lagging-source localization persisting beyond the delay threshold for resolution of the lagging source.
Finally, the strength of the PE appears to decline sharply at delays between 10 and 20 ms because it is in this range that the lagging sound began to exert a substantial influence on localization behavior, and both owls gave indications of being able to resolve the lagging sound as a separately localizable event.
The mean latencies of head turns to both paired and single sources within the same test sessions are shown in Fig. 10. One potential confound in comparing latencies between paired- and single-source conditions is the effect of target eccentricity on latency because eccentricity was not explicitly controlled. Previous studies of ocular saccades in humans and other mammals have demonstrated a strong, inverse relationship between latency of saccades to acoustic targets and target eccentricity, defined relative to the direction of gaze (Jay and Sparks 1990; Yao and Peck 1997; Zahn et al. 1979; Zambarbieri et al. 1995). Although the magnitude of this effect appears to be much weaker in barn owls (Wagner 1993), it is nevertheless important to take this potential confound into account. Therefore the significance of differences in latencies between stimulus conditions was tested by performing analysis of covariance of condition (single source, paired source) on latency with target eccentricity as a covariate. This analysis, performed separately for each subject at each lag delay, confirmed that response latency was weakly dependent on target eccentricity in many but not all cases. Nevertheless, after factoring out effects of eccentricity, latencies of responses of both owls to paired sources were significantly greater than those to single sources at lag delays from 3 to 30 ms (P < 0.005, F-test). Owl J was also tested with lag delays of 50 ms and did not exhibit a significant difference in response latencies at this delay (P > 0.5).
One potential explanation for the increased latencies of responses to paired sources at delays <50 ms is that the owls simply waited until the stimuli had ended before responding. If this were the case, it would be expected that, at each lag delay, the response latencies would be increased by an amount equal to the difference in total stimulus duration between the paired-source and single-source conditions. The latencies predicted by this mechanism are indicated by the dashed lines in Fig. 10. At delays from 3 to 20 ms, the mean response latencies in the paired-source conditions were always significantly greater than the predicted values (both owls P < 0.005 at 3- to 10-ms delays; P < 0.05 at 20 ms; t-test). The increase in response time for paired sources is thus more likely to reflect an increase in processing time required for sound localization in echoic conditions.
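The prediction of the "wait until the stimulus ends" hypothesis reduces to simple arithmetic, sketched below. The 25-ms burst duration comes from the study; the single-source latency used in the example is a hypothetical value, and the function name is illustrative.

```python
BURST_MS = 25.0  # noise-burst duration used in the study

def predicted_wait_latency(single_latency_ms, lag_delay_ms, burst_ms=BURST_MS):
    """Latency expected if the owl waits for the whole stimulus to end.

    A paired stimulus lasts burst_ms + lag_delay_ms, whereas a single
    stimulus lasts burst_ms, so the extra waiting time equals the lag delay.
    """
    paired_duration_ms = burst_ms + lag_delay_ms
    extra_ms = paired_duration_ms - burst_ms
    return single_latency_ms + extra_ms

# Hypothetical mean single-source latency of 60 ms (assumed for illustration).
for delay_ms in (3.0, 5.0, 10.0, 20.0):
    print(delay_ms, predicted_wait_latency(60.0, delay_ms))
```

Observed paired-source latencies that exceed these predicted values, as reported for delays from 3 to 20 ms, cannot be explained by waiting alone.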
We examined sound localization in the barn owl under conditions that evoke the PE in humans. For 3- and 5-ms delays, localization dominance was very strong: head turns were always directed at the leading source and were nearly as precise as turns toward single sources. A small bias in the direction of the lagging sound was evident when the initial gaze direction was close to the lagging source. With increasing delays, the directions of head turns were increasingly biased toward the lagging source. At delays >10 ms, owls made frequent double saccades, suggesting that they were able to separately localize both sources and indicating that echo threshold had been reached. These findings demonstrate that sound localization in barn owls exhibits precedence phenomena similar to those demonstrated previously in psychophysical studies of humans and cats, summarized below. One respect in which the owls' behavior differed from that reported previously in cats (Tollin and Yin 2003) is that saccade latencies for paired-source trials were significantly longer than those for single-source trials at all but the longest (50-ms) delay.
This study is the first to quantify the effects of a simulated reflection on sound localization in the barn owl. A previous study demonstrated that naïve owls will turn their heads into the quadrant of space containing the leading source at delays ≤10 ms (Keller and Takahashi 1996a), providing qualitative evidence of localization dominance, analogous to performance in a lateralization task. At longer delays, the owls occasionally made double saccades or turned toward the lagging source, findings that we have now confirmed and quantified. Because the birds in the earlier study were naïve, their behavior suggests that these PE components are innate. The present study extends these findings by providing quantitative measures of the effects of reflections on localization precision and of the relative weighting of the locations of leading and lagging sources during localization dominance, both of which will be useful in future efforts to relate localization behavior to neuronal responses.
Previous studies have used a variety of methods to measure the effects of reflections on the perceived location of a direct sound. Human headphone studies first demonstrated that the perceived intracranial position of leading and lagging binaural click pairs, at a lag delay of 2 ms, reflected a dominant influence of the interaural time difference (ITD) of the leading pair, but was also influenced by the ITD of the lagging pair (Wallach et al. 1949). Subsequent human studies using similar methodologies have demonstrated that the strength of lead dominance is dependent on several factors including the lag delay and the relative levels and difference in ITDs of the leading and lagging pairs (Shinn-Cunningham et al. 1993; Yost and Soderquist 1984; Zurek 1980). In agreement with results of these dichotic studies, source localization studies in humans (Chiang and Freyman 1998; Mickey and Middlebrooks 2001; Stecker and Hafter 2002) and cats (Tollin and Yin 2003) demonstrated that, at lag delays below echo threshold, a single source is localized near the leading speaker, but with a significant bias in the direction of the lagging source. Results from studies using lateralization-based discrimination tasks suggest that similar localization dominance phenomena occur in a wide variety of animal species (Cranford 1982; Dent and Dooling 2004; Kelly 1974; Wyttenbach and Hoy 1993).
The present findings demonstrate that barn owls exhibit similar lead dominance in localization to humans and cats at lag delays from 3 to 10 ms because head turns were directed predominantly toward the leading source, but with a clear bias toward the lagging source that increased with lag delay. Within this range of delays it was also demonstrated that the strength of lead dominance depended on the eccentricity of the lagging source, being weakest on trials when the owl was looking toward the lagging source at stimulus onset. This effect may have resulted from the directional properties of the owl's external ears and facial ruff that give rise to a relative gain in average binaural level of 16 dB between sound sources at the center of gaze and 40° to the periphery (Keller et al. 1998). Therefore when the owl is facing the lagging source, the leading source, and presumably its neural representation, may be significantly attenuated. Given the magnitude of the gain advantage for centrally located sources, it is notable that it was not sufficient to counteract lead dominance at short delays. By comparison, in human stereophonic listening experiments, localization dominance can be completely counteracted by increasing the level of the lagging source by 5–10 dB at lag delays from 2 to 14 ms (Blauert 1997; Leakey and Cherry 1957; Snow 1954). Similarly, in budgerigars, a lagging-source advantage of 3–5 dB was sufficient to negate the PE at delays of 1 and 5 ms in a discrimination-based lateralization task (Dent and Dooling 2003). Results of a recent localization study in cats, however, show a resistance of localization dominance to lag-source level advantages that appears to be more similar to the present results (Dent et al. 2005).
Thus the failure of our owls to make lag-directed turns at delays of 3 and 5 ms (and 10 ms in owl J) when the initial gaze was directed toward the lagging source suggests either that localization dominance is stronger in owls and cats than in humans and budgerigars or that stronger effects are revealed by localization tasks than by lateralization tasks.
In humans the presence of a single lateral reflection has been shown to result in a small decrease in the precision of localization judgments (Rakerd and Hartmann 1985). In cats, on the other hand, in the range of lag delays where localization dominance occurs, the variance of localization responses appears to be comparable to that for single sources (Tollin and Yin 2003). It would appear that the owl's localization behavior is more similar to that of cats than humans in this regard; however, it should be noted that the localization tasks used in the owl and cat were more similar to one another than to the tasks used in human studies.
The precision of leading-source localization at short delays is noteworthy because reflections substantially degrade the available localization cues. With 25-ms stimuli and a 3-ms delay, the waveforms from the leading and lagging sources overlap and sum for 22 ms. The resulting interaural differences in timing and level—which provide the basis for sound localization in owls—fluctuate within this epoch and can attain values that are quite different from those of the individual sources (Best et al. 2004; Carlile and Best 2002; Faller and Merimaa 2004; Rakerd and Hartmann 1985; Roman et al. 2003). The level of interaural correlation is also substantially reduced compared with that for a single source, further diminishing the effectiveness of binaural cues (Egnor 2001; Spitzer et al. 2003). If one assumes that the owl computes binaural cues within a time window of about 6 ms (Wagner 1992), even the initial window would contain degraded cues. Nevertheless, the owls localized leading sources nearly as precisely and accurately as single sources. Such performance may have been achieved by integrating the output of the binaural comparators over a longer duration than in the single source condition, to achieve more reliable estimates of the average time and level differences. If so, the increase in integration time may have contributed to the increase in response latencies in the paired-source conditions at delays <30 ms.
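The overlap arithmetic in this paragraph can be written out as a short sketch. The 25-ms burst and the 6-ms analysis window come from the text; treating the window as starting at stimulus onset is an assumption, and the function names are illustrative.

```python
def overlap_ms(burst_ms, lag_delay_ms):
    """Duration over which equal-length lead and lag bursts are superposed."""
    return max(0.0, burst_ms - lag_delay_ms)

def contaminated_ms_in_first_window(window_ms, lag_delay_ms, burst_ms=25.0):
    """Part of an analysis window starting at lead onset that contains the lag.

    Assumes the window begins at the onset of the leading sound.
    """
    window_end_ms = min(window_ms, burst_ms)
    return max(0.0, window_end_ms - lag_delay_ms)

# Lag delays tested in the study, with the 25-ms burst and a 6-ms window.
for delay_ms in (3.0, 5.0, 10.0, 20.0, 30.0, 50.0):
    print(delay_ms,
          overlap_ms(25.0, delay_ms),
          contaminated_ms_in_first_window(6.0, delay_ms))
```

For the 3-ms delay this gives 22 ms of superposition, matching the value in the text, with half of the assumed 6-ms initial window containing the lagging sound. At delays of 30 and 50 ms the two bursts no longer overlap.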
The delay at which the lagging sound becomes perceptible as a separate event is traditionally defined as the echo threshold. Echo thresholds depend on stimulus duration, and with clicks, the human echo threshold is about 5 ms (Blauert 1997; Litovsky and Shinn-Cunningham 2001; Shinn-Cunningham et al. 1993). In the present study, owls started to make head turns with reversals at 10 ms (owl N) and 20 ms (owl J). Although this might be interpreted to indicate that the owl's echo threshold for the 25-ms stimulus is somewhere between 10 and 20 ms, the situation is probably more complicated. A recent study comparing multiple measures of the PE in the same subjects demonstrated that localization dominance persists at lag delays beyond the echo threshold defined by perceptual fusion (Litovsky and Shinn-Cunningham 2001). Subjects heard a second source, but localized it closer to the leading source than to its actual location, suggesting that localization dominance and fusion may result from separate mechanisms. If similar effects occur in barn owls, the saccades generating values of c between 1 and 0.5 can be construed as evidence that the owls were attempting to localize a lagging source that appeared close to the leading source, and not that the owls were localizing a single source that was biased toward the lagging source. Head-turn reversals may thus overestimate the owl's echo threshold. As in human studies, a precise measurement of echo threshold, thus defined, will require further experiments using location-independent behavioral measures. The present results provide an indication of the range of delays over which the sensory representation of the lagging source's location becomes resolvable from that of the lead. As in humans (Litovsky and Shinn-Cunningham 2001), this may occur at a different delay than that at which the lagging sound becomes perceptible as a separate event. 
Nevertheless, the present findings are of considerable interest because they provide a basis for comparison with the pattern of neuronal activity in the owl's midbrain auditory space map. Such comparisons may provide insight into the neuronal processes by which the spatiotemporal patterns of activity on a topographic map are parsed into representations of separate objects.
In contrast to the present findings, a previous localization study in cats demonstrated no significant difference in latencies of ocular saccades to single sources and paired sources at delays associated with localization dominance (Tollin and Yin 2003). Although such a difference could be indicative of a species difference in localization mechanisms, it is important to note the differences in the stimuli and the sensorimotor responses used in the two studies. The 10-ms noise bursts used by Tollin and coworkers would have resulted in proportionally less acoustic overlap of leading and lagging sounds at delays <10 ms than those used in the present study, and none at longer delays. Consequently, the cats may have required less time to process the localization cues because they were less degraded. A final possible reason for the difference between results in cats and owls is that the mean latencies of responses and the associated variability were both greater in the cat, which would have made it difficult to detect differences in mean latencies between single- and paired-source conditions as small as those exhibited by owls. The greater response latency in cats is likely to reflect use of ocular saccades, which require more processing than head saccades.
Although an increase in integration times might account for the greater response latencies at short lag delays, it is unlikely to account for the greater latencies at longer delays, where acoustic superposition effects are reduced or absent. Another factor likely to contribute is the additional processing time required to decide which source to respond to when echo threshold for localization has been reached. Finally, it is possible that any increase in response latency reflects a greater uncertainty about the location of a more diffuse source (Tollin and Yin 2003). The latter explanation, however, is difficult to reconcile with the finding that the precision of localization was equivalent for single sources and paired sources at short delays.
Orientation and discrimination in auditory space
In an earlier study, we measured thresholds for discriminating differences in source location, or minimum audible angles (MAAs), using similar stimuli (Spitzer et al. 2003). MAAs for leading sources were approximately twice those for single sources, indicating that simulated reflections impeded discrimination. By contrast, the current study demonstrated that the precision of localization of leading and single sources was nearly equal at short delays. Thus reflections appear to have a greater effect on spatial discrimination than on localization. Alternatively, it is possible that different cues were used in the localization and discrimination studies. The latter possibility seems less likely, however, because results from the control conditions in the discrimination study demonstrated that performance was not based on pitch cues, which are the most likely alternatives to binaural localization cues.
Human psychophysical data demonstrate that the strength of the PE is proportional to the spatial separation of leading and lagging sources (Shinn-Cunningham et al. 1993). In the present study leading and lagging sources were always separated by 40°. It is thus possible that the effect of the lagging source on the precision of saccades would have been lower, had the sources been closer together.
The disparate effects might also reflect differences in the way in which the owl uses its space map in different tasks. Space-map neurons have discrete spatial receptive fields (SRFs) and are arrayed as a topographic map. Consequently, sounds are represented as foci of activity or neural images (McIlwain 1975). Lesions of the space map result in scotoma-like deficits of sound localization (Wagner 1993). In a study with a single source, we showed that changes in the firing rates of space-map neurons, measured across the entire neuronal image, can account, quantitatively, for the owl's MAA along the horizon (Bala et al. 2003). The change in firing rate, which may serve as the owl's decision variable in a discrimination task, is substantially corrupted by the presence of simulated reflections (Keller and Takahashi 1996b; Spitzer et al. 2004). Furthermore, among the most spatially selective neurons, which presumably constitute the output stage of the space map, the disruption of firing rate changes is greater when the sound inside the SRF is the reflection than when it is the leading source (Spitzer et al. 2004). Thus the extent to which rate-coded information about changes in source location is corrupted (lagging source > leading source > single source) is consistent with the observed order of behavioral MAAs (Spitzer et al. 2003).
By contrast, most previous studies have considered that sound localization might be based on either the centroid or the peak of activity within a neural map of spatial location (e.g., Knudsen and Konishi 1978; Lindemann 1986). The finding that the precision of lead-source localization at 3- to 5-ms delays is equivalent to that for single sources suggests that these measures, which depend on the spatial distribution of activity within the space map, are more resistant to the corruptive influence of a lagging source than is information encoded in neuronal firing rates. Further neurophysiological studies will be required to test this hypothesis.
Previous neurophysiological experiments using stimuli similar to those of the present study demonstrated that responses of space-map neurons to sounds within their SRFs are suppressed in the presence of a leading or lagging sound outside the SRF (Spitzer et al. 2004). The suppression of responses to leading sources was consistent with the effects of acoustic superposition of leading and lagging sounds on the available binaural cues, whereas the suppression of responses to lagging sounds persisted beyond the offset of the leading sound, suggesting an additional inhibitory contribution. Analogous inhibition of responses to the lagging sound has been demonstrated in numerous studies in mammals (Fitzpatrick et al. 1995, 1999; Litovsky and Delgutte 2002; Litovsky and Yin 1998a; Mickey and Middlebrooks 2001; Tollin et al. 2004; Yin 1994). In the owl, the responses to both leading and lagging sounds were suppressed maximally at the shortest lag delay tested (1 ms) and gradually recovered as the delay was increased. The present behavioral data have implications regarding the manner in which the responses of space-map neurons are used to localize sounds at different stages of this recovery process.
Because responses to lagging sounds are more suppressed than responses to leading sounds at 3- to 10-ms delays, it can be inferred that the neural image of the lagging source within the space map will be more heavily attenuated than that of a leading source. At these delays, the owls appeared to localize the leading source almost exclusively, but with a consistent, delay-dependent bias in the direction of the lagging sound (see Figs. 7 and 8). These findings are consistent with a mechanism in which the source location is computed as a firing-rate–weighted average of activity across the entire space map. One such computation is the centroid of map activity, as proposed in standard binaural models (Colburn 1973, 1977; Stern and Colburn 1978), in which each neuron contributes a vector directed toward its best location with a length proportional to its firing rate. Because neuronal responses to the lagging sound are not completely suppressed in all neurons, a residual focus of activity at the lagging source's map location will bias the centroid in its direction. Furthermore, as the neuronal responses to the lagging source recover with increasing lag delays, the magnitude of the lag-directed bias would be expected to increase, as was observed for head turns at 10 ms. An alternative explanation—that the lag-directed bias reflects a shift in the major peak of activity toward the map location of the lagging source—is not supported by the physiological data, which demonstrated that the peaks of azimuth tuning curves for leading sources were most often shifted away from the location of the lagging source (Spitzer et al. 2004).
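The centroid computation described above can be illustrated with a toy population. This is a minimal sketch, not the cited binaural models: the Gaussian shape of the neural image, its 10° width, the 2° spacing of best azimuths, and the lag gains are all assumptions chosen for illustration. Only the 40° source separation comes from the study.

```python
import math

def gaussian_image(best_azimuths, center_deg, peak_rate, width_deg=10.0):
    """Firing rates of a neural image centered on a source (assumed Gaussian)."""
    return [peak_rate * math.exp(-0.5 * ((az - center_deg) / width_deg) ** 2)
            for az in best_azimuths]

def centroid_deg(best_azimuths, rates):
    """Direction of the summed vectors, each of length equal to a firing rate
    and pointing toward that neuron's best azimuth."""
    x = sum(r * math.cos(math.radians(az)) for az, r in zip(best_azimuths, rates))
    y = sum(r * math.sin(math.radians(az)) for az, r in zip(best_azimuths, rates))
    return math.degrees(math.atan2(y, x))

AZIMUTHS = list(range(-90, 91, 2))  # hypothetical map of best azimuths

def centroid_for_lag_gain(lag_gain, lead_deg=0.0, lag_deg=40.0):
    """Centroid when the lag image is scaled by lag_gain relative to the lead."""
    lead = gaussian_image(AZIMUTHS, lead_deg, 1.0)
    lag = gaussian_image(AZIMUTHS, lag_deg, lag_gain)
    return centroid_deg(AZIMUTHS, [a + b for a, b in zip(lead, lag)])

# A larger lag gain stands in for recovery of lag responses at longer delays.
for gain in (0.0, 0.1, 0.3):
    print(gain, round(centroid_for_lag_gain(gain), 1))
```

In this toy model a fully suppressed lag image leaves the centroid on the leading source, and any residual lag activity pulls the centroid toward the lagging source by an amount that grows with the lag gain, which is the qualitative pattern the paragraph describes.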
A remaining challenge for models of the PE is to account for the ability to resolve the locations of leading and lagging sources at long delays, and the inability to do so at short delays. The neurophysiological data indicate that responses of most space-map neurons have recovered sufficiently to reliably signal the presence of a lagging source in their SRFs at delays from 5 to 10 ms (Spitzer et al. 2004). By contrast, the owls' localization behavior does not provide clear evidence of an ability to resolve the location of the lagging source until the delay is increased to 10 or 20 ms. Together, these findings suggest that, at shorter delays, the diminished representation of the lag source on the map is treated as part of the image of a single source, as discussed above, whereas, at longer delays, the map image is parsed into separate representations of the locations of two sources. Further neurophysiological studies will be required to identify the neuronal processes involved in parsing the space-map image at long delays and merging it at shorter delays. The present data provide the necessary background for such investigations by indicating the range of delays over which the owl's echo threshold in a localization task is reached.
This work was supported by National Institute of Deafness and Communication Disorders Grants F32-DC-00448 (National Research Service Award Postdoctoral Fellowship) to M. W. Spitzer and RO1-DC-03925 to T. T. Takahashi.
We thank K. Keller and E. Whitchurch for assistance in all aspects of the project, S. Yanagihara for training the owls, and D. Irvine for comments on an earlier version of the manuscript.
The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Copyright © 2006 by the American Physiological Society