## Abstract

The neurophonic potential is a synchronized frequency-following extracellular field potential that can be recorded in the nucleus laminaris (NL) in the brainstem of the barn owl. Putative generators of the neurophonic are the afferent axons from the nucleus magnocellularis, synapses onto NL neurons, and spikes of NL neurons. The outputs of NL, i.e., action potentials of NL neurons, are only weakly represented in the neurophonic. Instead, the inputs to NL, i.e., afferent axons and their synaptic potentials, are the predominant origin of the neurophonic (Kuokkanen PT, Wagner H, Ashida G, Carr CE, Kempter R. *J Neurophysiol* 104: 2274–2290, 2010). Thus in NL the monaural inputs from the two brain sides converge and create a binaural neurophonic. If these monaural inputs contribute independently to the extracellular field, the response to binaural stimulation can be predicted from the sum of the responses to ipsi- and contralateral stimulation. We found that a linear summation model explains the dependence of the responses on interaural time difference as measured experimentally with binaural stimulation. The fit between model predictions and data was excellent, even without taking into account the nonlinear responses of NL coincidence detector neurons, although their firing rate and synchrony strongly depend on the interaural time difference. These results are consistent with the view that the afferent axons and their synaptic potentials in NL are the primary origin of the neurophonic.

- sound localization
- phase locking
- auditory coincidence detector
- extracellular field potential
- neurophonic

Barn owls can localize sound sources with outstanding precision. The localization error is about 2–3° (Knudsen et al. 1979; Bala et al. 2003). Localization in azimuth is based on the analysis of the interaural time differences (ITDs) in the nucleus laminaris (NL) by coincidence detector neurons (Moiseff and Konishi 1981; Poganiatz et al. 2001; Bala et al. 2003; Hausmann et al. 2009). The detection of ITDs depends on the phase-locked monaural inputs from the ipsi- and contralateral nuclei magnocellularis (NM) conveyed via axons acting as delay lines (Jeffress 1948; Carr and Konishi 1990). The available data indicate that the firing rate of NL neurons is sharply ITD tuned; for stimulation with tones, peak firing rates at best ITD were typically two times larger than minimum firing rates at the worst ITD (Peña et al. 1996; Carr and Konishi 1990).

Apart from the NL of the owl, ITD-sensitive neurons are found at the first station of binaural convergence of temporal information, such as the NL of reptiles (Carr et al. 2009) and birds (for the chicken, see Overholt et al. 1992; Köppl and Carr 2008), and in the medial superior olive in mammals such as cats (Galambos et al. 1959; Moushegian et al. 1964; Guinan et al. 1972; Yin and Chan 1990), dogs (Goldberg and Brown 1969), kangaroo rats (Moushegian et al. 1975), brown bats (Grothe and Park 1998), gerbils (Spitzer and Semple 1995; Brand et al. 2002), and rabbits (Batra et al. 1997). In addition, ITD-sensitive neurons are abundant in hierarchically higher structures such as the inferior colliculus (Knudsen and Konishi 1978; Carney and Yin 1989; Yin et al. 1990; Wagner 1992; Euston and Takahashi 2002; Joris et al. 2005; Agapiou and McAlpine 2008; Vonderschen and Wagner 2009; Horvath and Lesica 2011). Also, the extracellular field potential in the NL of both chicks and owls shows ITD tuning (Sullivan and Konishi 1986; Schwarz 1992; Wagner et al. 2005; Köppl and Carr 2008).

Intracellular recordings from NL neurons in the owl are difficult to obtain (Sullivan and Konishi 1986; Carr and Konishi 1990; Peña et al. 1996; Funabiki et al. 2011), partially because the intracellular spike amplitudes recorded within NL are small (∼10 mV; Funabiki et al. 2011) and outweighed by hundreds of afferent sources that contribute to the large, extracellularly recorded field within NL (Kuokkanen et al. 2010). The amplitude of the extracellular potential is typically in the range of millivolts. Thus isolation of single-unit activity from extracellular recordings is technically difficult. To characterize the response properties of NL, therefore, either recordings from the NL output axons (Moiseff and Konishi 1983; Carr and Konishi 1990) or the extracellular field potential in NL have been used (Sullivan and Konishi 1986; Wagner et al. 1987, 2005, 2009).

The extracellular field potential in NL is also called “neurophonic” (Weinberger et al. 1970) because it is well correlated with acoustic stimuli. The neurophonic has a temporal precision of ≤20 μs (Wagner et al. 2005, 2009) in the NL region responsive to high frequencies (>2.5 kHz) and shows a prominent oscillation at the stimulus frequency in response to pure-tone stimulation (Sullivan and Konishi 1986), with a signal-to-noise ratio >30 dB (Kuokkanen et al. 2010).

Here we examine how the neurophonic and its ITD tuning are generated. In doing so, we need to know the putative generators of the neurophonic. In owl NL, there is only one type of neuron, with an oval-shaped cell body; a small, nonbipolar dendritic tree; and a thick axon that does not branch within the nucleus (Carr and Boudreau 1993). Therefore, the structure of the NL circuit is simple, making it a good model system to separate and identify putative neuronal contributors of an extracellular field potential. There are three potential sources of the neurophonic: *1*) densely packed axons arriving from the ipsi- and contralateral NMs, *2*) synapses from NM axons onto the NL neurons, and *3*) NL neurons, which are sparsely distributed in NL.

Computational modeling and analysis of monaural neurophonic responses have suggested that the neurophonic in the barn owl NL is generated by hundreds of statistically independent sources (Kuokkanen et al. 2010). In contrast, the neurophonic in the chicken NL and in the mammalian medial superior olive has been estimated to originate from a smaller number of postsynaptic sources, i.e., synaptic currents or the spikes of bipolar neurons (Guinan et al. 1972; Schwarz 1992; Köppl and Carr 2008; Mc Laughlin et al. 2010a,b). Moreover, our previous analyses (Kuokkanen et al. 2010) suggested that the neurophonic in the owl's NL is a population effect dominated by the activity of many sources that are directly related to the input to NL, which originates from the ipsilateral and contralateral sides, and that the output spikes of NL neurons contribute only weakly to the neurophonic.

We here analyze binaural responses in NL to provide an independent test of the Kuokkanen et al. (2010) hypothesis, which was based on an analysis of monaural responses. We have also used our analyses of binaural responses to reexamine the alternative hypothesis, that the neurophonic originates from postsynaptic sources, i.e., from NL action potentials. To begin, we note that the monaural signals arriving from the left and right sides of the brain are statistically independent in the sense that ongoing spiking activity can be described well by inhomogeneous Poisson processes. Second, the extracellular medium, i.e., the paths for extracellular current flow, is linear up to frequencies of 5 kHz (Logothetis et al. 2007). Accordingly, our main hypothesis on the origin of the neurophonic states that the linear sum of monaurally evoked responses predicts binaurally evoked responses; this view (“input hypothesis”) implies that the contribution of the postsynaptic output spikes to ITD tuning is weak or absent. On the other hand, the alternative view (“output hypothesis”) states that mainly the output spikes contribute to the neurophonic and its ITD tuning.

To discriminate between the two hypotheses, we investigate how well a linear model can predict binaural responses at various ITDs. Deviations from the linear prediction could be explained by nonlinearities in the system such as contributions from the spiking activity of the NL neurons, which act as nonlinear coincidence detectors (Peña et al. 1996; Kempter et al. 1998; Funabiki et al. 2011). We shall show that linear summation predicts binaural responses surprisingly well, showing that the input hypothesis is in line with the data.

## MATERIALS AND METHODS

### Experimental Paradigm

Data were collected at the University of Maryland at College Park. Six barn owls (*Tyto alba pratincola*) were used to collect the physiological data presented in this study. The procedures described here conform to National Institutes of Health guidelines for animal research and were approved by the Animal Care and Use Committee of the University of Maryland. Most animals were used in two or three separate physiology experiments, spaced approximately a week apart.

Anesthesia was induced by injections of ketamine hydrochloride (3 mg/kg im Ketavet; Phoenix, St. Joseph, MO) and xylazine (2 mg/kg im Xyla-ject; Phoenix). Supplementary doses of ketamine and xylazine were administered to maintain a suitable plane of anesthesia. Body temperature was measured with a cloacal probe and maintained at 39°C by a feedback-controlled heating blanket (Harvard Instruments, Braintree, MA). Buprenorphine hydrochloride (0.3 mg/kg im Buprenex; Reckitt and Colman Products, Richmond, VA) was administered at the end of each recovery experiment.

#### Surgery and stereotaxis.

Initially, the owl's head was placed in a custom-designed stereotaxic frame and stabilized using ear bars and a beak holder. Then, a metal headplate and a short metal pin marking a standardized zero point were permanently glued to the skull. After this, the ear bars and the beak holder were removed, and the head was held by the headplate alone. An opening was made in the skull around the desired area relative to the zero point, and the dura was cut open. Each electrode was moved to a specific rostrocaudal and mediolateral position with respect to the zero point before penetrating the brain (Carr and Konishi 1990). In some cases, the electrode was angled to facilitate access to extremely medial or lateral regions of the brainstem.

#### Electrodes and recording setup.

Owls were placed on a vibration-isolated table within a sound-attenuating chamber (IAC, New York, NY). Commercial Epoxylite-coated tungsten electrodes (Frederick Haer, Bowdoin, ME) were used, with impedances between 2 and 20 MΩ. A silver chloride pellet, placed under the animal's skin around the incision, served as the reference electrode (WPI, Sarasota, FL). Electrode signals were amplified and band-pass filtered (100–13,000 Hz) by a custom-built headstage and amplifier. The noise floor of the equipment was in the range of 1–10 μV. The recording was then passed in parallel to an oscilloscope and an A/D converter [DD1; Tucker-Davis Technologies (TDT), Gainesville, FL] connected to a personal computer via an optical interface (TDT).

### Stimulation

#### Stimulus generation and calibration.

Acoustic stimuli were digitally generated using custom-made software (Xdphys, developed in the laboratory of Dr. M. Konishi at California Institute of Technology, Pasadena, CA) driving a signal-processing board (DSP2; TDT). After passing a D/A converter (DD1; TDT) and an antialiasing filter (FT6–2, corner frequency: 20 kHz; TDT), the signals were variably attenuated (PA4; TDT), impedance-matched (HB6; TDT), and attenuated by an additional fixed amount before being fed to commercial miniature earphones. Two separate signals could be generated, passing through separate channels of associated hardware and driving two separate earphones. Sounds were calibrated individually at the start of each experiment, using built-in miniature microphones (EM3068; Knowles, Itasca, IL).

#### Stimulation protocol and recording.

While lowering an electrode in the brain towards NL, noise bursts were presented as search stimuli. Once auditory responses were discernible, tonal stimuli were applied from both the ipsi- and the contralateral side and binaurally to measure best frequency (BF) and best ITD and to estimate the position of the electrode (Sullivan and Konishi 1986).

To record a neurophonic, tone bursts of different frequencies (500–10,000 Hz) were presented monaurally and binaurally. The duration of a tone burst was 100 ms with 5-ms rise/fall times and generally a constant starting phase. In all experiments, voltage responses were recorded with a sampling frequency of 48,077 Hz, and analog waveforms were saved for offline analysis. In total, 200 ms of the response were saved per trial (single-trial data in Fig. 1, *A–D*). The stimulus began 5 ms after the recording started and ended at 105 ms. Tone bursts were repeated three to five times. The level of the tones was generally 20–30 dB above threshold. The interstimulus interval was either 500 or 700 ms.

Frequency-tuning curves were recorded in response to both monaural and binaural stimulation. An appropriate stimulus frequency near the BF of the location was selected to record the binaural ITD tuning in the range of −300 to +300 μs with typically 30- or 40-μs step size.

#### Recording locations.

Data were recorded from 171 locations in the NL of 6 anesthetized barn owls. We acquired data in one to three recording sessions in each owl and in each session from only one of the NLs. There were 1–7 penetrations in each session, and each penetration contained 1–28 dorsoventrally distributed recording locations. In a single penetration, the dorsoventral separation between the recording locations was generally ≥100 μm (∼2/3 of the locations). In four penetrations, we used distances of 10 or 15 μm. In 102 locations out of 171, both ipsi- and contralateral frequency-tuning curves as well as binaural ITD-tuning curves approximately at BF were recorded. In other locations, BFs were determined audiovisually.

##### STIMULUS FREQUENCIES.

The set of stimulus frequencies for the monaural responses was always the same for both ears. If a frequency used for the binaural stimulation was not in the set used for monaural stimulation, the response characteristics (variances of the response, signal, and noise) of the monaural responses were interpolated linearly from the closest stimulus frequencies.

The stimulus frequency used for binaural stimulation was approximately the BF for both monaural responses. The binaural stimulus frequencies ranged from 3.0 to 7.4 kHz. The recordings were from the high-BF region of NL; BFs ranged from 2.7 to 7.5 kHz and from 2.9 to 7.4 kHz for ipsi- and contralateral stimulation, respectively. The BF of the contralateral side was highly correlated with the BF of the ipsilateral side (correlation coefficient: ρ_{cc} = 0.83; *P* < 10^{−26}; *n* = 102). The difference between ipsi- and contralateral BFs was small (mean ± SD: 0 ± 200 Hz; range: −320 to +480 Hz) and in the same range as previous measurements (Peña et al. 2001).

### Data Analysis

All data analysis was done with Matlab 7.6 (MathWorks, Natick, MA). To calculate the linear-circular correlation coefficient, the vector strength, and the direction of the circular data, we used the Circular Statistics Toolbox by Dr. P. Berens, which is available at MatlabCentral (Berens 2009).

The mean values of the neurophonic responses were small because responses were band-pass filtered (100–13,000 Hz). For each trial, the 80-ms voltage trace from 10–90 ms after stimulus onset was analyzed. For the response, signal, and noise (see results), the variances were calculated over the same 80-ms periods and averaged over trials.

##### RESAMPLING AND CYCLIC-MEAN SIGNAL.

The “cyclic-mean signal” was defined as follows (Kuokkanen et al. 2010): if acoustic stimuli are periodic, we may average a neurophonic response across stimulus cycles. The resulting cyclic-mean waveform had a length of one stimulus cycle.

To numerically estimate the cyclic-mean signal, we note that we used 80-ms intervals of neurophonic responses that were sampled at 48,077 Hz (for example in Fig. 2*A*). For further analyses, we resampled the 80-ms interval, and the resampling frequency depended on the stimulus frequency. The signal was assumed to be cyclic with period 1/*f*_{stim} where *f*_{stim} is the frequency of a tone burst. To average across cycles of a neurophonic response, we resampled the 80-ms interval to 10, 20, or 40 sampling points per stimulus period; for *f*_{stim} > 5 kHz, we used only 10 or 20 sampling points per period. Resampling enabled us to average the response cycle-by-cycle (as well as across trials because of a constant starting phase), which yielded the “cyclic-mean” waveform. Using this cyclic-mean waveform, we generated a signal waveform of 80-ms length by concatenating identical copies of the cyclic-mean waveform and restoring the original sampling frequency (Fig. 2*B*). The noise was then defined as the difference of the neurophonic response and the 80-ms signal waveform (Fig. 2*C*).
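The separation into signal and noise described above can be illustrated with a minimal Python sketch. It is not the Matlab code used in the study: the function names are hypothetical, linear interpolation stands in for whatever resampling method was actually applied, and trial averaging and phase alignment are left out.

```python
import math

def variance(values):
    """Variance of a list of samples around their mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def cyclic_mean(response, fs, f_stim, n_per_cycle=20):
    """Average a sampled response across stimulus cycles.

    response: voltage samples, fs: sampling rate (Hz), f_stim: tone
    frequency (Hz). Returns n_per_cycle values covering one stimulus period.
    """
    duration = (len(response) - 1) / fs
    n_cycles = int(duration * f_stim)            # use whole cycles only
    dt = 1.0 / (f_stim * n_per_cycle)            # resampled time step
    sums = [0.0] * n_per_cycle
    for i in range(n_cycles * n_per_cycle):
        pos = i * dt * fs                        # fractional sample index
        j = int(pos)
        frac = pos - j
        nxt = response[min(j + 1, len(response) - 1)]
        # linear interpolation between neighboring samples (an assumption)
        sums[i % n_per_cycle] += (1.0 - frac) * response[j] + frac * nxt
    return [s / n_cycles for s in sums]

def signal_and_noise(response, fs, f_stim, n_per_cycle=20):
    """Repeat the cyclic mean periodically at the original sampling rate
    ("signal") and subtract it from the response ("noise")."""
    cm = cyclic_mean(response, fs, f_stim, n_per_cycle)
    signal = []
    for k in range(len(response)):
        phase = (k / fs * f_stim) % 1.0          # position within the cycle
        pos = phase * n_per_cycle
        j = int(pos) % n_per_cycle
        frac = pos - int(pos)
        signal.append((1.0 - frac) * cm[j] + frac * cm[(j + 1) % n_per_cycle])
    noise = [r - s for r, s in zip(response, signal)]
    return signal, noise
```

For a response that is perfectly periodic at the stimulus frequency, nearly all of the response variance ends up in the signal; the residual noise variance then reflects only the interpolation error of this sketch.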

#### Preprocessing and selection of the data.

##### DURATION OF THE RECORDING AT ONE LOCATION.

Neurophonic responses were corrected for instabilities during the recording at one location. The minimum duration of a recording at one location was 2 min. Locations where >40 min passed between the first and the last stimulus condition were excluded from the data set (6 out of 102). After that, the average duration of a recording at one location was 7 ± 5 min (mean ± SD); at only one location the recording time exceeded 15 min.

##### AMPLITUDES OF SPONTANEOUS ACTIVITY.

The variance of the spontaneous activity varied with time (for example, due to changing levels of anesthesia), which implied some variability of response variances across stimulus conditions (ipsilateral, contralateral, and binaural stimulation). To account for nonstationarities that impede comparisons of monaural and binaural responses, the spontaneous noise level (“baseline”) was calculated as the variance of the first and last 5-ms interval of each 200-ms recording ([0, 5] ms and [195, 200] ms). The average over all repetitions for a given stimulus condition served as a baseline noise level. The responses to binaural stimulation were scaled with the factor [mean of monaural baselines]/[binaural baseline]. Measured scaling factors of the amplification were mostly in the range 0.9–1.6, indicating nonstationarities in this range. Recording locations outside of this range were excluded from the data set (5 out of 102). Recordings where the ratio of the ipsi- and contralateral baselines exceeded 1.1 or was below 0.91 were also excluded from the data set (14 out of 102).

In some cases, scaling factors were in the ranges from 0.09 to 0.16 or from 9 to 16. These ranges were due to manual changes in the amplification gain by factors of 0.1 or 10. To include such data sets in the analysis, voltages were corrected by factors 0.1 or 10. This normalization, however, does not account for the enormous spread of response variances (Fig. 1*F*). The voltages reported in results include such scaling factors.

##### PHASE VARIABILITY.

We observed variability in the instantaneous frequency of the cyclic-mean signal of the recordings. This might be due to synchronization errors (in the microsecond range) between the setup and the recording PC and/or natural variability in the responses. These were assumed to be associated with small movements in the recording electrode, for example, due to brain movement. The resulting phase differences between cyclic-mean signals in different trials result in a decreased trial-averaged cyclic-mean signal amplitude and, subsequently, an increased noise variance (see results). However, the phase difference of the mean phase in different trials was typically <5 μs. Data at one recording location for a given stimulation were therefore corrected for phase changes between trials. The mean phases of the responses (80-ms segments) were aligned by shifting the time base, typically by a few microseconds.

##### ITD TUNING.

Tuning to ITD corresponds to a significant variation in response to changing ITDs. Only recording locations showing ITD tuning (69 out of 102, assumed to be within NL, rather than on or outside its borders) were included in the analysis. The vector strength (mean resultant length, Goldberg and Brown 1969) of the ITD tuning of the response variance was required to be >0.02 and significant at *P* < 0.05. To calculate the vector strength of ITD tuning, we used the fact that ITD tuning is periodic with the stimulus frequency (independent of BF). We therefore assigned a specific phase, for example, the interaural phase difference, to each ITD and estimated the response variance. To overcome the problem of nonuniform phase sampling, we binned phases, averaged the response variances for bins with several values, and linearly interpolated values for bins with no values. The vector strength was calculated from the interpolated response variances; finally, vector strengths were averaged across bin sizes of 1, 2,…, 10° to suppress effects related to using a particular bin size. The computed ITD vector strength is equal to the “mean resultant length” as defined in Circular Statistics (Zar 1999). The significance of the vector strength was tested by bootstrapping the variance of each data set 10,000 times to calculate the distribution of possible vector strengths for that data set.
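As an illustration of the vector-strength computation, the following Python sketch weights each interaural phase by the response variance measured at that ITD and returns the mean resultant length together with the mean direction. It is a simplified sketch under stated assumptions: the function name is hypothetical, the phases are assumed to sample the stimulus period uniformly, and the binning, interpolation, averaging over bin sizes, and bootstrapping described above are omitted.

```python
import cmath
import math

def itd_vector_strength(itds_us, variances, f_stim):
    """Vector strength of a variance-versus-ITD tuning curve.

    itds_us: ITDs in microseconds, variances: response variance at each ITD,
    f_stim: stimulus frequency in Hz.
    Returns (vector strength, mean direction in radians).
    """
    resultant = 0j
    for itd, var in zip(itds_us, variances):
        phase = 2.0 * math.pi * f_stim * itd * 1e-6   # interaural phase difference
        resultant += var * cmath.exp(1j * phase)      # variance acts as the weight
    r = abs(resultant) / sum(variances)               # mean resultant length
    return r, cmath.phase(resultant)
```

With ITDs sampled at −300 to +300 μs in 30- or 40-μs steps, the phases are generally not uniform over the stimulus period, which is why the binning and interpolation step is needed before a function of this kind is applied to the recorded data.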

ITD tuning is observed throughout NL (Peña et al. 1996; Sullivan and Konishi 1986). The “best ITD” of a recording location, i.e., the ITD at which the largest response variance is expected, was determined from the variance of the cyclic-mean signal: the best ITD corresponded to the circular-mean direction (mean phase) of the variance as a function of ITD. The worst ITD was the best ITD plus half a period of the stimulus frequency. The periodicity of the ITD tuning was unequivocally determined by the stimulus frequency. To quantify how well the change of the response with the ITD tuning reflected a sine wave, we used the linear-circular correlation coefficient (Zar 1999; Berens 2009):
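$$\rho_{lc} = \sqrt{\frac{r_{cx}^{2} + r_{sx}^{2} - 2\,r_{cx}\,r_{sx}\,r_{cs}}{1 - r_{cs}^{2}}}$$
where *r*_{cx} = *c*(cos α, *x*), *r*_{sx} = *c*(sin α, *x*), *r*_{cs} = *c*(sin α, cos α), *c*(.,.) is the Pearson correlation coefficient, *x* is a linear variable, and α is a phase variable.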
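The linear-circular correlation coefficient combines the three Pearson correlations named above as ρ_{lc} = sqrt[(*r*_{cx}^{2} + *r*_{sx}^{2} − 2*r*_{cx}*r*_{sx}*r*_{cs})/(1 − *r*_{cs}^{2})] (Zar 1999; Berens 2009). A minimal Python sketch of this computation follows; the function names are hypothetical, and the study itself used the Circular Statistics Toolbox for Matlab.

```python
import math

def pearson(u, v):
    """Pearson correlation coefficient c(u, v) of two equally long lists."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    norm_u = math.sqrt(sum((a - mean_u) ** 2 for a in u))
    norm_v = math.sqrt(sum((b - mean_v) ** 2 for b in v))
    return cov / (norm_u * norm_v)

def linear_circular_corr(alpha, x):
    """Correlation between phases alpha (radians) and a linear variable x."""
    cos_a = [math.cos(a) for a in alpha]
    sin_a = [math.sin(a) for a in alpha]
    r_cx = pearson(cos_a, x)
    r_sx = pearson(sin_a, x)
    r_cs = pearson(sin_a, cos_a)
    numerator = r_cx ** 2 + r_sx ** 2 - 2.0 * r_cx * r_sx * r_cs
    return math.sqrt(numerator / (1.0 - r_cs ** 2))
```

A tuning curve that is an exact sinusoid of the phase plus a constant offset yields a coefficient of 1, so values close to 1 indicate nearly sinusoidal ITD tuning.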

For sinusoidal, 1/*f* -periodic ITD modulation of some quantity *x*, for example, the firing rate or the noise variance, we can describe the ITD tuning by *x*(ITD) = *x*_{0} + *x*_{1} cos(2π*f*·ITD + ϕ) where ϕ is some phase, *x*_{0} > 0 is the mean, and *x*_{1} > 0 is the modulation amplitude. For *x*_{0} > *x*_{1}, these two quantities are connected by the equation *x*_{1} = 2*rx*_{0} where *r* is the vector strength.
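The relation *x*_{1} = 2*rx*_{0} can be verified in a few lines. The sketch below assumes that the vector strength is the magnitude of the first Fourier component of *x* over one period divided by the integral of *x* (the continuous analog of the mean resultant length), and it writes θ = 2π*f*·ITD for the interaural phase.

```latex
\begin{align}
r &= \frac{\left|\int_{0}^{2\pi} x(\theta)\,e^{i\theta}\,\mathrm{d}\theta\right|}
          {\int_{0}^{2\pi} x(\theta)\,\mathrm{d}\theta},
\qquad x(\theta) = x_{0} + x_{1}\cos(\theta + \phi),\\
\int_{0}^{2\pi} x(\theta)\,\mathrm{d}\theta &= 2\pi x_{0},
\qquad
\int_{0}^{2\pi} x(\theta)\,e^{i\theta}\,\mathrm{d}\theta = \pi x_{1}\,e^{-i\phi},\\
r &= \frac{\pi x_{1}}{2\pi x_{0}} = \frac{x_{1}}{2x_{0}}
\quad\Longrightarrow\quad x_{1} = 2\,r\,x_{0}.
\end{align}
```

Because *x*_{0} > *x*_{1} keeps *x* positive at every ITD, the vector strength of such a sinusoidal tuning curve stays below 0.5.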

The criteria developed above (*Preprocessing and selection of the data*) left 56 (out of 102) recording locations to be analyzed in more detail (many locations were excluded by several criteria simultaneously and assumed to be at the borders of NL). For the prediction of the binaural phase, we included only recording locations at which the same stimulus frequency was used for both monaural recordings as well as for the binaural recordings (38 out of 56); for this analysis we furthermore excluded those recording locations (4 out of 38) at which either one of the monaural recordings or the binaural recording had a randomized onset phase of the stimulus, leaving 34 recording locations.

##### CONTRIBUTION OF NL NEURONS TO THE NOISE VARIANCE.

We quantitatively estimated the magnitude of the contribution of NL neurons to the noise variance. We therefore extended a result derived in Kuokkanen et al. (2010) on the contribution of the activity of many neurons to the voltage measured at an electrode. Idealized point-like neurons were assumed to be homogeneously distributed in space at density 3/(4π*r*_{S}^{3}) where 2*r*_{S} is a measure for the mean distance between neighbors. The neurons were assumed to generate spike sequences that can be described by Poisson processes with a mean rate λ; a time-dependent modulation of this mean Poisson rate, i.e., phase locking, does not change the result below. A spike of an NL neuron was assumed to contribute to the voltage with a time-dependent kernel *k*(*t*) as measured extracellularly by an electrode at distance *r* = *r*_{S} from the neuron. The kernel amplitude was assumed to decline proportionally to 1/*r*^{2} (dipoles) for distances *r* > *r*_{S}; distances *r* < *r*_{S}, which comprise less than one neuron on average, were neglected.

For this case, Kuokkanen et al. (2010; p. 2278) derived the expected power spectral density *P*(*f*) of noise at frequency *f* for the contribution of neurons within a sphere of radius *R* > *r*_{S} around the electrode,
$$P(f) = 3\lambda\,|\tilde{k}(f)|^{2}\left(1 - \frac{r_{S}}{R}\right), \tag{1}$$
where *k̃*(*f*) is the Fourier transform of the kernel *k*(*t*). Because the noise power saturates with increasing radius *R*, the noise is local in that it is generated mainly by the few neurons closest to the electrode. We note that the above result in *Eq. 1* is independent of the number (or density) of sources, which enters *P*(*f*) only through the specific kernel *k*. This waveform *k* describes the contribution of a spike of a neuron to the voltage measured by the electrode if the centers of the neuron and the electrode are separated by the typical distance *r*_{S} that characterizes the density of neurons.

In NL we have *r*_{S} ≈ 50 μm (Kuokkanen et al. 2010), and NL neurons form a map of ITD, i.e., neighboring neurons are tuned to similar ITDs. We therefore assumed that NL neurons within a sphere of radius *R* ≈ 3*r*_{S} ≈ 150 μm can be described by a single firing rate λ. The 27 neurons inside this 3*r*_{S} sphere then contribute 67% to the total power, which was obtained from the term within parentheses in *Eq. 1*. To simplify the calculations, we neglected a change in firing rate of neurons for *r* > 3*r*_{S} and used the approximation *R* → ∞, which led us to
$$P(f) = 3\lambda\,|\tilde{k}(f)|^{2}. \tag{2}$$

The noise variance σ_{NL}^{2} due to firing of NL neurons is related to the power spectral density *P*(*f*) by
$$\sigma_{NL}^{2} = \int P(f)\,\mathrm{d}f. \tag{3}$$

Inserting *Eq. 2* into *Eq. 3*, we found
$$\sigma_{NL}^{2} = 3\lambda \int |\tilde{k}(f)|^{2}\,\mathrm{d}f. \tag{4}$$

Using Parseval's theorem in *Eq. 4*, i.e., ∫|*k̃*(*f*)|^{2}d*f* = ∫|*k*(*t*)|^{2}d*t*, we obtained
$$\sigma_{NL}^{2} = 3\lambda \int |k(t)|^{2}\,\mathrm{d}t. \tag{5}$$

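For a typical kernel *k*(*t*) that has an amplitude of ∼100 μV and a width of ∼0.2 ms (Kuokkanen et al. 2010), and simply assuming a rectangular shape, we found from *Eq. 5* that the typical contribution from the NL neurons to the noise variance is
$$\sigma_{NL}^{2} \approx 3\lambda \cdot (100\ \mu\mathrm{V})^{2} \cdot 0.2\ \mathrm{ms} = \lambda \cdot 6\ \mu\mathrm{V}^{2}\,\mathrm{s}. \tag{6}$$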
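The numbers in this estimate can be checked with a short Python sketch. It rests on assumptions that follow from the model stated above rather than on code from the study: the neuron count uses the density 3/(4π*r*_{S}^{3}), the power fraction uses the saturation term 1 − *r*_{S}/*R* implied by a 1/*r*^{2} decline of the kernel amplitude, and the prefactor 3λ results from integrating that decline over all distances *r* > *r*_{S}. The function names are hypothetical.

```python
def neurons_in_sphere(radius_in_rs):
    """Expected number of neurons within radius R, with R given in units
    of r_S: density 3/(4 pi r_S^3) times volume (4/3) pi R^3 = (R/r_S)^3."""
    return radius_in_rs ** 3

def power_fraction(radius_in_rs):
    """Fraction of the total noise power generated within radius R
    (assumed saturation term 1 - r_S/R)."""
    return 1.0 - 1.0 / radius_in_rs

def nl_noise_variance(rate_hz, amplitude_uv=100.0, width_s=0.2e-3):
    """Noise variance in uV^2 for a rectangular kernel:
    3 * lambda * integral of k(t)^2 dt = 3 * lambda * amplitude^2 * width."""
    return 3.0 * rate_hz * amplitude_uv ** 2 * width_s
```

Under these assumptions, a sphere of radius 3*r*_{S} contains 27 neurons that generate two-thirds of the power, and each hertz of NL firing rate adds 6 μV^{2} to the noise variance.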

## RESULTS

To clarify the origin of the neurophonic potential in the barn owl's NL, we compared extracellular field potentials in response to monaural and binaural tone bursts, the latter presented with varying ITDs (Fig. 1, *A–D*). Altogether, we analyzed responses at 56 recording locations in the high-frequency region of the NL that showed ITD tuning in their response variance. The stimulus frequency used for each recording location was at or near its BF (see materials and methods for details).

### Quantification of ITD Tuning of the Neurophonic

The neurophonic in NL typically shows sharp ITD tuning (Sullivan and Konishi 1986; Carr and Konishi 1990; Wagner et al. 2005). We quantified ITD tuning using the variance of the ongoing neurophonic response; onset effects were excluded, and we averaged the variance over several trials. The variance of the neurophonic was used as a simple measure to describe how far the neurophonic is spread around its mean. Figure 1*E* depicts an example of the response variance to binaural stimulation at different ITDs. The variance of the binaural response was modulated by ITD (vector strength: *r* = 0.28; *P* = 8·10^{−3}), and this modulation was highly sinusoidal (linear-circular correlation: ρ_{lc} > 0.999; *P* = 4·10^{−3}). The maximum binaural response variances were always larger than the variances of responses resulting from monaural stimulation (Fig. 1*F*). This response behavior was typical for all 56 recording locations (Fig. 1*G*; *r* = 0.16 ± 0.12; mean ± SD; ρ_{lc} = 0.96 ± 0.06; all values were significant at *P* < 0.05). However, the relation between the two monaural response variances and how they interact to give rise to the binaural variances was quite variable (Fig. 1*F*). To better understand how ITD tuning was generated, we investigated neurophonic responses in more detail, which was guided by two basic hypotheses.

### Two Hypotheses on the Origin of the Neurophonic

The input hypothesis states that the binaural neurophonic is the result of a linear waveform summation of the extracellular fields generated by the ipsi- and the contralateral inputs. In this case, the measured field potential *r*_{B}(*t*) (in response to binaural input) is simply the sum of a contribution *r*_{I}(*t*) from the ipsilateral side and a contribution *r*_{C}(*t*) from the contralateral side, that is,
$$r_{B}(t) = r_{I}(t) + r_{C}(t). \tag{7}$$

The alternative output hypothesis states that the ITD tuning reflects the spiking activity of the primary neurons within the NL. These neurons are generally regarded as coincidence detectors that fire at high rates at a particular ITD of the ipsi- and contralateral input. NL neurons convert the bilateral inputs in some complex, possibly nonlinear, way to produce action potentials (Funabiki et al. 2011). If the action potentials of NL neurons in the vicinity of an electrode generate the neurophonic, the properties of the neurophonic should depend on details of the input, the neuronal integration, and the spike generation in some nontrivial way. A mathematical framework to describe these processes depends on many parameters. Kuokkanen et al. (2010), therefore, outlined an approach based on a signal-to-noise ratio analysis for recording with monaural stimulation. However, there is no simple expression (resembling *Eq. 7*) that allows for a quantitative test of predictions of the output hypothesis. We therefore first focus on the input hypothesis and return to a test of the output hypothesis later in the manuscript.

A direct test of the two hypotheses is not possible because monaural and binaural responses cannot be measured simultaneously, and a test using responses in subsequent trials is hindered by trial-to-trial variability. We therefore evaluated the predictions of the hypotheses for averaged quantities of the response. Appropriate averaging reduces noisy fluctuations of the neurophonic and might therefore allow comparisons of the important features of monaural and binaural responses across trials. In particular, we first studied the waveform of the phase-locked component, or “cyclic-mean signal,” of the response to monaural and binaural tone bursts. In subsequent sections, we review the residual noise of the responses, i.e., the parts of the responses that are not phase locked to the periodic stimulus. Finally, we revisit the full responses.
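To illustrate what the input hypothesis implies for the phase-locked component, the following Python sketch adds two sinusoids at the stimulus frequency and returns the variance of their sum. The assumptions are loud ones: each monaural cyclic-mean signal is taken to be a pure sinusoid with hypothetical amplitudes `a_ipsi` and `a_contra` and phases `phi_ipsi` and `phi_contra`, an ITD is taken to shift the contralateral phase by 2π*f*·ITD (the sign convention is arbitrary), and the variance of a sinusoid with amplitude *a* is *a*^{2}/2.

```python
import math

def predicted_binaural_variance(a_ipsi, a_contra, phi_ipsi, phi_contra,
                                f_stim, itd_s):
    """Variance of the sum of two sinusoids of frequency f_stim (Hz).

    itd_s: interaural time difference in seconds, assumed to act as a pure
    phase shift of the contralateral component.
    """
    dphi = phi_contra - phi_ipsi + 2.0 * math.pi * f_stim * itd_s
    # squared amplitude of the summed sinusoid (law of cosines for phasors)
    amp_sq = a_ipsi ** 2 + a_contra ** 2 + 2.0 * a_ipsi * a_contra * math.cos(dphi)
    return amp_sq / 2.0
```

In this sketch the predicted variance varies sinusoidally with ITD at the stimulus period, is largest when the two components are in phase, and can fall below both monaural variances when they are in antiphase.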

### Signal and Noise

The cyclic-mean signal (Kuokkanen et al. 2010) is obtained by averaging the neurophonic response *r*(*t*) across stimulus cycles (for a formal definition see also materials and methods). The resulting cyclic-mean was then repeated periodically for the duration of the response to produce the “signal” 〈*r*(*t*)〉 (Fig. 2, *A–C*). Subtracting the signal from the response we obtain the “noise”
$$n(t) = r(t) - \langle r(t)\rangle. \tag{8}$$
Signal and noise were obtained in this way from the monaural responses *r*_{I} and *r*_{C} as well as from the binaural response *r*_{B}. An example of a binaural response and the resulting signal and noise are shown in Fig. 2, *A–C*. The corresponding power spectral densities in Fig. 2*D* further illustrate this separation. In summary, the cyclic-mean signal captures the phase-locked part of the response whereas the noise includes all (other) broadband components.

For a first test of the linear-summation (or input) hypothesis, we focused on basic properties of the cyclic-mean signal.

### Properties of the Cyclic-Mean Signal

Important properties of the binaural signal 〈*r*_{B}(*t*)〉 are its waveform and its ITD tuning. To describe the ITD tuning, we considered the variance Var[〈*r*_{B}(*t*)〉] of the signal. The example in Fig. 2*E*, circles, shows strong ITD tuning. The magnitude of tuning was quantified by the vector strength, which was *r* = 0.42 (*P* = 0.026, bootstrapping) here. This vector strength of the signal was larger than the vector strength of the full response in Fig. 1*E*, which is partially explained by the different offsets of ITD-tuning curves from baseline. In Fig. 2*E*, the minimum variance of the binaural signal, for example, at ITD ≈ 0 or 20 μs (circles), was below the variances of both monaural signals (dashed lines), which are independent of ITD.

The example shown in Fig. 2*E* was typical for all recording locations, i.e., there was always a sharp ITD tuning of the variance Var[〈*r*_{B}(*t*)〉] (vector strength *r* = 0.41 ± 0.09, range from 0.16 to 0.50, all *P* ≤ 0.03; see also Fig. 4*E*). The signal variances for monaural and binaural stimulations each ranged over five orders of magnitude across recording locations (Fig. 2*F*). At those recording locations where the cyclic mean was relatively small in amplitude, however, the quality of the ITD tuning was not affected: in the population, the maximum variance of the cyclic mean and the strength of the ITD tuning were uncorrelated (correlation coefficient: 0.15; *P* = 0.26 by *t*-test; *n* = 56).

The second important property of the binaural signal 〈*r*_{B}(*t*)〉 is its waveform. We utilized the power spectral density (example in Fig. 2*D*) to infer the shape of the waveform. Because the second harmonic was much smaller than the large peak at the stimulus frequency, the binaural signal was highly sinusoidal, as Fig. 2*B* also indicates; this sinusoidal waveform of the extracellular signal was similar to intracellularly recorded membrane potential oscillations of high-BF NL neurons (Funabiki et al. 2011). At our 56 recording locations, the median level of the second harmonic was 43 dB (interquartile range: 11 dB) below that of the first harmonic at the best ITD. For the worst ITD, the median of the second harmonic was 20 dB (interquartile range: 13 dB) lower. Higher harmonics were typically close to or above the upper edge (13 kHz) of the band-pass filter applied to all electrode signals (see also materials and methods). As a consequence, we could describe the waveform of the binaural signal by the simple model
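$$\langle r_{B}(t) \rangle = a_{B} \sin(2\pi f_{\mathrm{stim}} t + \phi_{B}), \tag{9}$$
where *a*_{B} is the amplitude and ϕ_{B} is the phase of a sine function, and *f*_{stim} is the stimulus frequency. Note that both *a*_{B} and ϕ_{B} may depend on ITD and that the mean value of the signal is zero. The variance and the amplitude of the signal are then related through
$$\mathrm{Var}[\langle r_{B}(t) \rangle] = a_{B}^{2}/2. \tag{10}$$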

Signals in response to monaural stimulation are also highly sinusoidal (Kuokkanen et al. 2010). We therefore can describe signals 〈*r*_{I}〉 and 〈*r*_{C}〉 in response to ipsi- and contralateral stimulation, respectively, by
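$$\langle r_{I}(t) \rangle = a_{I} \sin(2\pi f_{\mathrm{stim}} t + \phi_{I}) \tag{11}$$
and
$$\langle r_{C}(t) \rangle = a_{C} \sin(2\pi f_{\mathrm{stim}} t + \phi_{C}), \tag{12}$$
with amplitudes *a*_{I} and *a*_{C} and phases ϕ_{I} and ϕ_{C} of sinusoids oscillating at the stimulus frequency *f*_{stim}. These descriptions of averaged responses are instrumental in a first test of the linear-summation hypothesis. We note that *a*_{I}, *a*_{C}, ϕ_{I}, and ϕ_{C} are obtained from recordings with monaural stimulation.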

### Prediction of the Binaural Cyclic-Mean Signal from Monaural Signals

The linear-summation hypothesis in *Eq. 7*, which states that the binaural response is the sum of ipsi- and contralateral inputs, can be tested using the cyclic-mean signals. Because the average 〈…〉 is linear, *Eq. 7* leads to
$$\langle r_{B}(t) \rangle = \langle r_{I}(t) \rangle + \langle r_{C}(t) \rangle. \tag{13}$$

Cyclic means are highly sinusoidal and can be approximated by sine functions; see *Eqs. 9*, *11*, and *12*. Using these models in *Eq. 13*, we find relationships for the amplitudes and for the phases. The linear-summation hypothesis therefore predicts that the binaural signal has amplitude
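$$a_{B} = \sqrt{a_{I}^{2} + a_{C}^{2} + 2 a_{I} a_{C} \cos(\phi_{I} - \phi_{C})} \tag{14}$$
and phase
$$\phi_{B} = \arctan\left(\frac{a_{I} \sin\phi_{I} + a_{C} \sin\phi_{C}}{a_{I} \cos\phi_{I} + a_{C} \cos\phi_{C}}\right). \tag{15}$$
Because of the ITD dependence of ϕ_{C}, where ϕ_{C}(ITD) = ϕ_{C,ITD=0} + 2π *f*_{stim}·ITD, *a*_{B} and ϕ_{B} are also functions of the ITD (note that the branch of the arctan needs to be chosen appropriately).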
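Adding two sinusoids of the same frequency is equivalent to adding their complex phasors, which gives a compact way to evaluate the predicted binaural amplitude and phase as a function of ITD. The sketch below assumes the sine convention used for the cyclic-mean models; the function name `predict_binaural` and all parameter values are hypothetical and not taken from the recordings.

```python
import numpy as np

def predict_binaural(a_i, phi_i, a_c, phi_c0, f_stim, itd):
    """Amplitude and phase of the sum of two sinusoids at f_stim.

    phi_c0 is the contralateral phase at ITD = 0; phases are in radians
    and itd is in seconds. np.angle selects the arctan branch from the
    signs of the real and imaginary parts.
    """
    phi_c = phi_c0 + 2 * np.pi * f_stim * np.asarray(itd)
    phasor = a_i * np.exp(1j * phi_i) + a_c * np.exp(1j * phi_c)
    return np.abs(phasor), np.angle(phasor)

# Hypothetical monaural parameters (illustration only)
f_stim = 5_000.0
itd = (np.arange(21) - 10) * 20e-6            # -200 ... 200 microseconds
a_b, phi_b = predict_binaural(0.4, 0.2, 0.3, -0.5, f_stim, itd)
var_b = a_b**2 / 2                            # predicted signal variance
```

The predicted amplitude stays between |*a*_{I} − *a*_{C}| and *a*_{I} + *a*_{C}, so the depth of the ITD tuning depends on how similar the two monaural amplitudes are.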

For a comparison of model and data, we consistently used response variances (units mV^{2}) in this article. In the model, the predicted binaural signal amplitude *a*_{B} (*Eq. 14*) is connected to the predicted variance of the binaural signal through Var[〈*r*_{B}(*t*)〉] = *a*_{B}^{2}/2 (*Eq. 10*). The predicted variances (solid line in Fig. 2*E*) matched the measured ones (circles). The root mean squared error (RMSE) of the prediction was 8.2·10^{−3} mV^{2}, which could be separated into a bias term −4.0·10^{−3} mV^{2} and a deviation term 7.2·10^{−3} mV^{2}. To compare errors across recording locations, we normalized values by the mean variance across stimulus ITDs for each recording location. In the example in Fig. 2*E*, which had a mean binaural variance of 160·10^{−3} mV^{2} for the cyclic-mean signal, the normalized RMSE was 5.2%, the normalized bias was −2.5%, and the normalized deviation was 4.6%. In the population, normalized values were 18 ± 16% (RMSE), −1 ± 20% (bias), and 11 ± 8% (deviation).
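The separation of the RMSE into a bias and a deviation term can be illustrated as follows. The sketch assumes the usual decomposition RMSE² = bias² + deviation², with the bias taken as the mean error and the deviation as the standard deviation of the error; the sign convention (prediction minus data) and the example tuning curves are assumptions.

```python
import numpy as np

def error_terms(predicted, measured):
    """Return RMSE, bias (mean error), and deviation (SD of the error).

    The sign convention (prediction minus data) is an assumption.
    """
    err = np.asarray(predicted, dtype=float) - np.asarray(measured, dtype=float)
    rmse = np.sqrt(np.mean(err**2))
    bias = np.mean(err)
    deviation = np.std(err)      # population SD: rmse^2 = bias^2 + deviation^2
    return rmse, bias, deviation

# Consistency check with the terms quoted for Fig. 2E (units 1e-3 mV^2)
rmse_from_terms = np.hypot(-4.0, 7.2)

# Hypothetical tuning curves in mV^2 (illustration only)
measured = np.array([0.30, 0.22, 0.05, 0.02, 0.11, 0.27])
predicted = np.array([0.29, 0.21, 0.06, 0.01, 0.10, 0.27])
rmse, bias, deviation = error_terms(predicted, measured)
normalized_rmse = rmse / measured.mean()   # normalization by the mean variance
```

With the quoted bias and deviation, `rmse_from_terms` rounds to the quoted RMSE of 8.2·10^{−3} mV^{2}.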

To better interpret the magnitude of the error and to better assess the strength of the model's predictions, we determined another standard error measure, i.e., the correlation coefficient *R* between model prediction and data. The squared correlation coefficient *R*^{2} gives the fraction of the variance of the data explained by the model. In the example in Fig. 2*E*, the model explained 99.7% of the variance of the data. In the population, the median of the variance explained was 99.5% (mean ± SD: 99.1 ± 1.2%; Fig. 2*H*). These values are remarkably high. However, we note that the measure *R*^{2} disregards any bias and scaling between model and data, in contrast to the RMSE.
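The caveat about *R*^{2} can be made concrete with a toy example: a prediction that has the wrong scale and offset still yields *R*^{2} = 1, while its RMSE is clearly nonzero. The tuning curve and the distortion below are hypothetical.

```python
import numpy as np

# Hypothetical ITD tuning curve of the signal variance (mV^2)
itd = np.linspace(0.0, 200e-6, 11)
measured = 0.16 * (1 + 0.8 * np.cos(2 * np.pi * 5_000.0 * (itd - 118e-6)))

# A deliberately poor "prediction": same shape, wrong scale and offset
predicted = 0.5 * measured + 0.05

r2 = np.corrcoef(predicted, measured)[0, 1] ** 2      # insensitive to scale/offset
rmse = np.sqrt(np.mean((predicted - measured) ** 2))  # sensitive to both
print(r2, rmse)
```

This is why both measures are reported: *R*^{2} captures the agreement in shape, whereas the RMSE also penalizes bias and scaling errors.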

Similarly to signal amplitudes (*Eq. 14*), the binaural phases ϕ_{B} are also well described by the linear prediction (*Eq. 15*, solid line in Fig. 2*G*). The predicted phase in Fig. 2*G* has an RMSE of 4.7° with a bias of −2.9° and a deviation of 3.7°. To compare predictions across locations, we divided these values, in cycles, by the stimulus frequency (5.0 kHz in the example), which leads us to a temporal precision. This temporal precision quantifies the accuracy of the horizontal shift of the prediction of the waveform of the binaural signal (Fig. 2*B*). The obtained values of the temporal precision in the example (Fig. 2*G*) were 2.6 μs (RMSE), −1.6 μs (bias), and 2.0 μs (deviation). In the population, the normalized values were 10 ± 8 μs (RMSE, Fig. 2*I*), −8 ± 14 μs (bias), and 7 ± 6 μs (deviation). In the population, there were several typical shapes of the phase-ITD tuning curves, including a sawtooth-like shape (Fig. 2*G*) with an ∼90° jump, monotonically decreasing ones, i.e., ones decreasing ∼360° within one cycle, and ones resembling a sinusoid. As can be inferred from *Eq. 15*, the shape depends on the ratio *a*_{I}/*a*_{C} of monaural amplitudes.
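The conversion from a phase error to a temporal precision is a one-line calculation. The helper name `phase_error_to_time` is hypothetical; the input values are the phase errors quoted for the example in Fig. 2*G*.

```python
def phase_error_to_time(err_deg, f_stim_hz):
    """Convert a phase error in degrees to a time in seconds at f_stim."""
    return err_deg / 360.0 / f_stim_hz

f_stim = 5_000.0                                      # Hz, example in Fig. 2G
rmse_us = phase_error_to_time(4.7, f_stim) * 1e6      # microseconds
bias_us = phase_error_to_time(-2.9, f_stim) * 1e6
dev_us = phase_error_to_time(3.7, f_stim) * 1e6
print(rmse_us, bias_us, dev_us)
```

The results agree with the quoted temporal precisions to within the rounding of the quoted phase errors.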

To summarize, for all applied stimulus frequencies and all applied ITDs, there was good agreement between measured binaural amplitude (RMSE 18 ± 16%) and phase (RMSE 10 ± 8 μs) of the signal and the values predicted by the linear waveform-summation model. The agreement between the parameter-free model (*Eqs. 7* and *13*) and the data is surprisingly good. However, we also note that the alternative output hypothesis, which states that NL spike activity generates the neurophonic, can, in principle, explain the observed linear summation of cyclic mean monaural signals as proposed in *Eq. 13*. Even though each individual NL neuron might combine binaural inputs nonlinearly, it is conceivable that in the summed activity of many neurons (as observed in the neurophonic) such a nonlinearity is much less pronounced. Thus, further analyses are necessary to distinguish between the input hypothesis and the output hypothesis. We therefore study, in what follows, the residual noise of the neurophonic (Fig. 2*C*) to distinguish between the two cases.

### Properties of the Binaural Noise

The variance of the noise (*Eq. 8*) of the neurophonic is a signature of the mean firing rate of the neuronal elements in the vicinity of an electrode (Kuokkanen et al. 2010; Lindén et al. 2011). Within NL there are mainly axons from NM neurons and NL neurons. Neurons in NL form a map of ITD, i.e., NL neurons near the electrode are tuned to similar ITDs. Therefore, a change of the ITD changes the firing rates of the binaurally driven NL neurons close to the electrode in a similar way. On the other hand, the activity of monaurally driven NM neurons, which provide input to NL, does not change with ITD. Therefore, an investigation of the ITD tuning of the variance of the noise allows us to assess to what extent the two sources contribute to the neurophonic. The linear waveform-summation (or input) model of the neurophonic predicts that the noise variance should not exhibit ITD tuning, whereas the alternative output hypothesis predicts that noise should show some ITD tuning.

To discriminate between the two possibilities, we first analyzed the properties of the noise in detail. The noise *n* was defined in *Eq. 8* (Fig. 2, *A–C*). Binaural noise *n*_{B}, for example, is the binaural response *r*_{B} minus the cyclic-mean binaural response 〈*r*_{B}(*t*)〉:
$$n_{B}(t) = r_{B}(t) - \langle r_{B}(t) \rangle. \tag{16}$$
Its variance is
$$\sigma_{B}^{2} = \mathrm{Var}[n_{B}(t)]. \tag{17}$$
The noise variance σ_{B}^{2} was only weakly tuned to ITD (Fig. 3*A*, circles; horizontal bars indicate single trials). The ITD-dependent modulation was 0.003 mV^{2}. This modulation equals the averaged noise variance or “noise floor” (0.095 mV^{2}) multiplied by two times the vector strength (*r* = 0.017) of the ITD tuning of noise (see also materials and methods), which was small but significant (*P* = 0.03, bootstrapping) in this example. In the population, the noise floor was about two orders of magnitude larger than the ITD-dependent modulation (Fig. 3*B*). Nevertheless, noise floor and modulation were strongly correlated (correlation coefficient: 0.73; *P* = 2·10^{−10}; *n* = 56), which may be explained by random fluctuations in the noise floor. Small modulations of noise are consistent with small vector strengths of the ITD tuning of noise (*r* = 0.012 ± 0.008, range from 0.002 to 0.043; *n* = 56; Fig. 3*C*, distance of the circles from the origin; see also Fig. 4, *E* and *F*). In 16 recording locations (out of 56), the vector strength was significant at *P* < 0.05.
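The relation between noise floor, vector strength, and modulation can be checked numerically. The sketch below assumes that the vector strength of a tuning curve is the magnitude of its first Fourier component divided by the sum of its values, sampled over whole stimulus periods; the formal definition is given in materials and methods and may differ in detail. The function name and the sampling grid are hypothetical.

```python
import numpy as np

def tuning_vector_strength(values, itd, f_stim):
    """Vector strength and best ITD of a tuning curve sampled over whole periods.

    Assumed definition: magnitude of the first Fourier component of the
    curve divided by the sum of its values.
    """
    values = np.asarray(values, dtype=float)
    z = np.sum(values * np.exp(2j * np.pi * f_stim * np.asarray(itd)))
    vs = np.abs(z) / np.sum(values)
    best_itd = (np.angle(z) / (2 * np.pi * f_stim)) % (1.0 / f_stim)
    return vs, best_itd

f_stim = 5_000.0
itd = np.arange(20) * 10e-6                    # one full period, 20 samples
floor, vs_quoted = 0.095, 0.017                # values quoted for Fig. 3A
# Synthetic cosine tuning curve built from the quoted numbers
curve = floor * (1 + 2 * vs_quoted * np.cos(2 * np.pi * f_stim * (itd - 134e-6)))

vs, best_itd = tuning_vector_strength(curve, itd, f_stim)
modulation = 2 * vs * floor                    # amplitude of the ITD modulation
print(vs, best_itd, modulation)
```

For a cosine-shaped curve, the modulation amplitude recovered this way is two times the vector strength times the noise floor, about 0.003 mV^{2} for the quoted values.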

This result was quantitatively compared with model predictions. The input hypothesis predicts that the noise should not be at all tuned to ITD. Although in most cases (40 out of 56) the observed tuning was indeed not significant, in some cases (16 out of 56 with significant tuning) the pure input hypothesis can be rejected. However, the ITD tuning of noise was always very weak, which argues for a minor deviation from the input hypothesis. Therefore, the input hypothesis describes the data well.

How large is the expected contribution of NL neurons to the noise variance, as predicted by the output hypothesis? In materials and methods (*Eq. 6*), we show that the noise variance σ_{NL}^{2} due to NL neurons is proportional to the mean firing rate of NL neurons in the vicinity (<150 μm) of the electrode. Peña et al. (1996) found sinusoidal ITD tuning curves with firing rates of NL neurons at 180 ± 101 spikes/s (means ± SD) for unfavorable ITDs and 354 ± 168 spikes/s for favorable ITDs. Using this observed range of mean rates in *Eq. 6*, we find σ_{NL}^{2} ≲ 2·10^{−3} mV^{2}. This predicted magnitude of the contribution of NL neurons to the binaural noise is much smaller than the observed magnitude of σ_{B}^{2}; for example, in Fig. 3*A* we find σ_{B}^{2} ≈ 0.1 mV^{2}, which is two orders of magnitude larger than σ_{NL}^{2}. At most recording locations shown in Fig. 3*B*, the measured noise floor σ_{B}^{2} is larger than 10^{−3} mV^{2}. We note that this estimate of the mean noise level does not depend on ITD, and the effect of pooling across neurons with possibly different mean rates can be neglected here.

On the other hand, the observed change in firing rates of NL neurons between favorable and unfavorable ITDs is in line with the observed range of modulation of the noise variance. The typical change in rate is ≈200 spikes/s (Peña et al. 1996), and *Eq. 6* leads to a change in the noise variance of ∼10^{−3} mV^{2}. This value matches well the modulation amplitude of the noise variance in Fig. 3*A* and is right in the center of the values observed in the population (Fig. 3*B*). In this estimate of the tuning of the noise, the “range” of the electrode is important because the favorable ITD changes as a function of position in NL. In materials and methods we show that 67% of the noise power comes from a range of ∼150 μm around the tip of an electrode, and within such a sphere of diameter of ∼300 μm we find on average ∼27 NL neurons. The NL neurons are arranged in a map of ITD, and the firing rate varies as a function of location along the axonal delay lines. For an axonal conduction velocity of 5 m/s and a stimulus frequency of 5 kHz, the ITD modulation of the firing rate is periodic within NL with a spatial wavelength of ∼0.5 mm, which is comparable to, but considerably larger than, the noise range of the electrode. Moreover, the favorable ITD changes only along one dimension of NL, i.e., along the axonal delay lines. In the other two dimensions, i.e., perpendicular to the axonal delay lines, the favorable ITD does not change. Pooling across neurons therefore reduces the ITD modulation of the noise but does not change the order of magnitude, which is essential here. Thus the above quantitative arguments indicate that the values and strong ITD modulation of firing rates of NL neurons are difficult to reconcile with the average magnitude and weak ITD modulation of the noise variance. Therefore, the pure NL spike output hypothesis can be rejected.
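One way to arrive at the quoted spatial wavelength is the following back-of-the-envelope estimate. It assumes (this is not spelled out in the paragraph above) that the ipsi- and contralateral delay-line axons run through NL in opposite directions with the same conduction velocity *v*, so that a displacement Δ*x* along the delay lines changes the internal delay difference by 2Δ*x*/*v*. The firing rate is periodic in ITD with the stimulus period 1/*f*_{stim}, which gives

$$\frac{2\lambda}{v} = \frac{1}{f_{\mathrm{stim}}} \quad\Rightarrow\quad \lambda = \frac{v}{2 f_{\mathrm{stim}}} = \frac{5\ \mathrm{m/s}}{2 \cdot 5\ \mathrm{kHz}} = 0.5\ \mathrm{mm}.$$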

Although the noise variance was only weakly tuned to ITD, there was nevertheless a preferred phase of tuning, which occurred at ITD = 134 μs in Fig. 3*A* (arrow). This preferred ITD of the noise response was similar to the best ITD of the signal, which was 118 μs in Fig. 2*E*. To further investigate this relationship, the difference Δ = 134 − 118 μs of the two values was converted to a phase *D* = 360°·Δ·*f*_{stim}. In the example we obtained *D* = 29.6° for *f*_{stim} = 5,000 Hz. This conversion allowed us to compare locations with different BFs. The direction of the noise variance in the population was *D* = 14 ± 57° (circular mean ± circular SD, Fig. 3*C*, angle of the circles), while the mean direction of the noise variance, weighted with the respective vector strengths, was 16.1° (Fig. 3*C*, gray arrow). Thus the weak ITD tuning of the noise was related to the ITD tuning of the signal. This result indicates that NL neurons may have a small contribution to the neurophonic and that the overwhelming power of the neurophonic is due to the input to NL.
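Circular statistics of the phase differences *D* can be computed from the mean resultant vector. The sketch below assumes the common definitions (mean direction as the angle of the resultant, circular SD as the square root of −2 ln *R*, with *R* the resultant length); whether exactly these definitions were used here is an assumption, and the data values are hypothetical.

```python
import numpy as np

def circular_stats(angles_deg, weights=None):
    """Circular mean and circular SD (degrees) of a set of angles.

    Assumed definitions: mean direction = angle of the (weighted) mean
    resultant vector; circular SD = sqrt(-2 ln R), R = resultant length.
    """
    ang = np.deg2rad(np.asarray(angles_deg, dtype=float))
    w = np.ones_like(ang) if weights is None else np.asarray(weights, dtype=float)
    z = np.sum(w * np.exp(1j * ang)) / np.sum(w)
    mean_deg = np.rad2deg(np.angle(z))
    sd_deg = np.rad2deg(np.sqrt(-2.0 * np.log(np.abs(z))))
    return mean_deg, sd_deg

# Hypothetical phase differences D and vector strengths (illustration only)
d_deg = np.array([30.0, -10.0, 60.0, 5.0, -40.0])
vs = np.array([0.02, 0.01, 0.03, 0.005, 0.004])
mean_d, sd_d = circular_stats(d_deg)
weighted_mean_d, _ = circular_stats(d_deg, weights=vs)
```

Weighting by the vector strength gives more influence to recording locations with clearer ITD tuning of the noise, which is the idea behind the weighted mean direction.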

To further test the input hypothesis quantitatively, we neglected the small ITD tuning of noise (modulation) and focused on its average magnitude (noise floor, Fig. 3*B*). In Fig. 3*A*, the binaural noise variance (circles) was larger than the variances of the two monaural stimulations (dashed and dot-dashed lines), and those in turn were larger than the variance of the spontaneous activity (dotted line). This ordered relation of the magnitude of noise variances was observed at all recording locations (Fig. 3*D*), which implies that the noise, like the signal, depends on the stimulus even though the binaural noise variance is almost independent of the ITD.

### Prediction of the Average Binaural Noise Variance from Monaural and Spontaneous Noise Variances

Can we predict the measured binaural noise variance σ_{B}^{2} from the monaural noise variances σ_{I}^{2} and σ_{C}^{2} with the linear input-model? To do so, consider that the noise *n*_{B} measured with binaural stimulation may be composed of at least three types of sources: noise due to ipsilaterally driven input, *n*_{driven,I}, noise due to contralaterally driven input, *n*_{driven,C}, and also background noise *n*_{background} that is unrelated to acoustic stimulation and includes spontaneous activity. We tentatively neglect further noise sources, for example that due to the activity of NL neurons. To summarize,
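$$n_{B}(t) = n_{\mathrm{driven,I}}(t) + n_{\mathrm{driven,C}}(t) + n_{\mathrm{background}}(t). \tag{18}$$
For uncorrelated noise sources, the variances add:
$$\sigma_{B}^{2} = \sigma_{\mathrm{driven,I}}^{2} + \sigma_{\mathrm{driven,C}}^{2} + \sigma_{\mathrm{background}}^{2}. \tag{19}$$
The terms on the right-hand side of *Eq. 19* cannot be determined separately. One can measure, instead, the noise *n*_{I}(*t*) for ipsilateral stimulation, the noise *n*_{C}(*t*) for contralateral stimulation, and the noise *n*_{spont}(*t*) for spontaneous activity. Similar to the argument leading to *Eq. 18*, *n*_{I}(*t*) may be composed of three components:
$$n_{I}(t) = n_{\mathrm{driven,I}}(t) + n_{\mathrm{spont,C}}(t) + n_{\mathrm{background}}(t), \tag{20}$$
where *n*_{spont,C} is the noise due to spontaneous activity of the contralateral input to NL. This term corresponds to (but is different from) the term *n*_{driven,C} in *Eq. 18*. For uncorrelated noise sources, *Eq. 20* leads to
$$\sigma_{I}^{2} = \sigma_{\mathrm{driven,I}}^{2} + \sigma_{\mathrm{spont,C}}^{2} + \sigma_{\mathrm{background}}^{2}. \tag{21}$$
Similarly, the noise variance σ_{C}^{2} for contralateral acoustic stimulation can be described by
$$\sigma_{C}^{2} = \sigma_{\mathrm{spont,I}}^{2} + \sigma_{\mathrm{driven,C}}^{2} + \sigma_{\mathrm{background}}^{2}, \tag{22}$$
and the noise variance for spontaneous activity by
$$\sigma_{\mathrm{spont}}^{2} = \sigma_{\mathrm{spont,I}}^{2} + \sigma_{\mathrm{spont,C}}^{2} + \sigma_{\mathrm{background}}^{2}. \tag{23}$$
Using *Eqs. 21*, *22*, and *23* in *Eq. 19*, we arrive at the prediction
$$\sigma_{B}^{2} = \sigma_{I}^{2} + \sigma_{C}^{2} - \sigma_{\mathrm{spont}}^{2}. \tag{24}$$
All terms on the right-hand side of *Eq. 24* can be directly measured. The linear-summation hypothesis therefore predicts how the variance σ_{B}^{2} of the noise for binaural stimulation depends on the measured noise variances σ_{I}^{2} and σ_{C}^{2} for monaural stimulation and on noise σ_{spont}^{2} for spontaneous activity.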
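The bookkeeping behind the prediction σ_{B}^{2} = σ_{I}^{2} + σ_{C}^{2} − σ_{spont}^{2} can be verified with a toy simulation of independent noise sources. Everything in this sketch is an assumption made for illustration: the sources are Gaussian, their variances are invented, and the same realizations are reused across stimulus conditions, which real recordings cannot do.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

def source(var):
    """Zero-mean Gaussian noise source with the given variance (toy model)."""
    return np.sqrt(var) * rng.standard_normal(n)

# Hypothetical component variances in mV^2 (not measured values)
driven_i, driven_c = source(0.04), source(0.03)
spont_i, spont_c = source(0.01), source(0.008)
background = source(0.005)

var_b = np.var(driven_i + driven_c + background)       # binaural stimulation
var_i = np.var(driven_i + spont_c + background)        # ipsilateral only
var_c = np.var(spont_i + driven_c + background)        # contralateral only
var_spont = np.var(spont_i + spont_c + background)     # no stimulus

var_b_pred = var_i + var_c - var_spont                 # linear prediction
rel_error = (var_b_pred - var_b) / var_b
print(var_b, var_b_pred, rel_error)
```

The subtraction of the spontaneous variance removes the background and spontaneous contributions that would otherwise be counted twice when the two monaural variances are added.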

The prediction error of the noise was normalized similarly to the prediction error of the cyclic-mean signal. In the example in Fig. 3*A* (solid line) the prediction had 10.8% RMSE, −10.2% bias, and 3.7% deviation. At the population level, the prediction of the noise variance matched the data well [RMSE: 8 ± 6%; bias: −2 ± 9%; deviation (RMSD): 3 ± 1%]. Here, the fraction of the variance explained is 1 − RMSD^{2}, or 99.9% in the example. The median of the fraction of the variance explained was 99.9% in the population (mean ± SD: 99.9 ± 0.1%; Fig. 3*E*).

This excellent match between prediction and data further supports the hypothesis that the NL neurons' contribution to the neurophonic is small and that the noise mainly reflects the activity of the inputs to the NL. The small difference between the prediction and the data may be explained by the activity of the NL neurons. We again note that the alternative output hypothesis (spikes of NL neurons near the electrode generate the neurophonic) predicts that the noise exhibits sharp ITD tuning.

### Prediction of the Binaural Response Variance

After having considered the binaural signal and noise separately, we revisit the binaural response *r*_{B}(*t*). From *Eq. 16* we find
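$$\mathrm{Var}[r_{B}(t)] = \mathrm{Var}[\langle r_{B}(t) \rangle] + \mathrm{Var}[n_{B}(t)], \tag{25}$$
where we assumed that signal and noise are uncorrelated. *Eqs. 10* and *17* lead us to
$$\mathrm{Var}[r_{B}(t)] = a_{B}^{2}/2 + \sigma_{B}^{2}. \tag{26}$$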

The assumption that signal and noise are uncorrelated is supported by the sharp ITD tuning of the signal and the almost complete absence of ITD tuning in the noise. Figure 4, *A–C*, outlines the predicted (solid lines) and measured (circles) response variances for three examples. Although the recorded response variances spanned five orders of magnitude (Fig. 1*F*), the binaural response variance was predicted well by the spontaneous and monaural responses, with a median fraction of the variance explained of *R*^{2} = 98.4% (mean ± SD: 92 ± 11%; Fig. 4*D*). The normalized RMSE of the prediction was 9 ± 6% with a bias of −3 ± 10% and a deviation of 4 ± 3%.
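Putting the two parts together, the predicted variance of the full binaural response is the ITD-dependent signal variance plus an ITD-independent noise floor. The following sketch combines the phasor sum of the monaural signals with the linear prediction of the noise variance; the function name and all numerical values are hypothetical.

```python
import numpy as np

def predict_response_variance(a_i, phi_i, a_c, phi_c0, f_stim, itd, noise_var):
    """Predicted variance of the binaural response as a function of ITD.

    Signal part: a_B^2 / 2 from the phasor sum of the monaural signals.
    Noise part: an ITD-independent variance supplied by the caller.
    """
    phi_c = phi_c0 + 2 * np.pi * f_stim * np.asarray(itd)
    a_b = np.abs(a_i * np.exp(1j * phi_i) + a_c * np.exp(1j * phi_c))
    return a_b**2 / 2 + noise_var

# Hypothetical monaural and spontaneous measurements (illustration only)
f_stim = 5_000.0
itd = (np.arange(21) - 10) * 20e-6
noise_floor = 0.06 + 0.05 - 0.02          # sigma_I^2 + sigma_C^2 - sigma_spont^2
total = predict_response_variance(0.4, 0.2, 0.3, -0.5, f_stim, itd, noise_floor)
```

When the noise floor is large compared with the signal variance, the relative ITD modulation of `total` is small even though the signal itself is sharply tuned.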

The ITD tuning of the neurophonic potential in response to binaural tones could be fully accounted for by the ITD tuning of the signal, i.e., the phase locked part of the response, whereas the ITD tuning of the noise was largely negligible. The amplitude of the signal always exhibited sharp ITD tuning, inferred from the high values of the vector strengths of the signal (Fig. 4, *E* and *F*), and this tuning of the signal was strong even in recordings in which the ITD tuning of the full response (= signal + noise) was weak (examples far below the diagonal in Fig. 4*G*); in such responses, the sharp ITD tuning of a small signal was masked by the weak ITD tuning of a larger noise.

Although the vector strengths of the ITD tuning of the noise and the signal had nonoverlapping ranges (Fig. 4*E*), the vector strengths were nevertheless correlated (correlation coefficient: 0.31; *P* = 0.018; *n* = 56, squares in Fig. 4*F*). However, for the subset of recording locations (16 out of 56) with significant ITD tuning of noise (squares with crosses in Fig. 4*F*), vector strengths of noise and signal were not significantly correlated (correlation coefficient: −0.04, *P* = 0.88; *n* = 16). Finally, we note that the average variance of binaural noise (noise floor) was correlated with the variance of the binaural signal at best ITD (correlation coefficient: 0.6; *P* = 6·10^{−7}; *n* = 56; Fig. 4*G*).

In summary, these results support the hypothesis that the input to NL is the predominant source of the neurophonic potential in NL of the barn owl.

## DISCUSSION

Predicting the source(s) of extracellular potential(s) is a challenging task. The barn owl NL has very few neuronal elements and a large, ITD-sensitive potential. Our experiments and the computational model have shown that the sum of the neurophonic potentials recorded in response to ipsi- and contralateral stimulation accurately predicts the responses to binaural stimulation at arbitrary ITDs. The fundamental assumption underlying this linear model is that the inputs to NL, i.e., afferent axons within NL and their synaptic input onto NL neurons, are the main sources of the neurophonic potential. Axonal and synaptic contributions are strongly correlated because a spike in an NM axon triggers pre- and postsynaptic currents at its terminals at NL neurons; therefore, we cannot disambiguate the contributions of the two sources in our analysis.

The binaural response in NL can be predicted by a linear model; the nonlinearities generated by NL coincidence detector neurons, which give rise to ITD tuning of their firing rates, were not necessary to explain the sharp ITD tuning of the neurophonic. A similar linearity, in the intracellularly measured membrane potential, has been observed recently in the medial superior olive of the gerbil (van der Heijden et al. 2011). We emphasize that our conclusion on the above nonlinearities depends on potential linearizing effects of pooling across NL neurons, which are arranged in a map of ITD, and this map can span a wide range of ITDs in NL. In general, the pooling effect should weaken any ITD tuning measured by an extracellular electrode in NL, and the weakening should depend on the range of the electrode and the quality of the map. However, the effects of pooling did not abolish the strong ITD modulation of the neurophonic “signal” (Fig. 4*E*), although the range of the electrode is large for phase-locked activity of uniformly distributed sources; this large electrode range for signals has been associated with the term “global” by Kuokkanen et al. (2010). In contrast, the noise was found to be “local” (*Eq. 1* and Kuokkanen et al. 2010), and the electrode range is smaller for the noise than for the signal. Therefore, the potential linearizing effect of pooling should be weaker for the noise than for the signal. We therefore conclude that the relatively weak ITD modulation of the noise (compared to the signal) cannot be explained only by pooling.

Monaural and binaural signals, obtained from averaging extracellular field potentials over many stimulus cycles, were highly sinusoidal; both the amplitude and the phase of the binaural signal were predicted well by the sum of monaural signals. We note that for the binaural phase, the prediction error (RMSE) was only 10 ± 8 μs, which is smaller than the sampling interval of 22.7 μs in the measurements. The prediction error was in the same range as the previously reported temporal precision of <20 μs of the neurophonic (Wagner et al. 2005). The variance of the predicted binaural signal amplitude that the model explained was 99.1 ± 1.2%, which is very high for a parameter-free model based on a single simple assumption, namely the linearity of the system. In general, the small prediction errors might at least partly be due to nonstationarities of recordings because monaural and binaural responses were measured with a temporal delay of typically several minutes.

### Weak ITD Tuning of Noise Implies Small Contributions from NL Neurons to the Neurophonic

Mechanistically, the amplitude of the recorded cyclic-mean signal is proportional to the population firing rate times the population vector strength of the neural elements within NL, most of which fire phase-locked to the stimulus (Kempter et al. 2001; Kuokkanen et al. 2010; Ashida et al. 2012). On the other hand, the residual noise represents the activity of neural elements that contribute to the extracellular field potential independently of their phase locking. The noise variance is therefore a signature of the mean neuronal activity, which was argued to originate from the immediate vicinity (<150 μm) of the recording electrode (Kuokkanen et al. 2010; Lindén et al. 2011). This assumption is essential to further arguments below.

To test whether ITD tuning of the mean spiking activity of the NL neurons can be observed in the extracellular field potential, we studied the ITD tuning of the noise variance. Since NL neurons form a map of ITD, nearby NL neurons are tuned to similar ITDs. The aggregate activity of the NL neurons in the vicinity of an electrode should therefore show considerable ITD tuning. Interestingly, the noise variance of the binaurally stimulated data did not show ITD tuning at most recording locations, and the vector strength was always small, at most 0.043. However, at 16 (out of 56) recording locations, ITD tuning was significant at *P* < 0.05. Overall, the ITDs that gave rise to the largest noise variances were similar to the best ITDs of the binaural signals (difference *D* = 14 ± 56°, normalized by the stimulus frequency). Funabiki et al. (2011) reported a similar difference (41.3 ± 40.3°) between the spike rate ITD tuning of the NL neurons and the neurophonic ITD tuning. As Funabiki et al. (2011) suggested, this difference could be due to the contribution of NL axons to the neurophonic potential, instead of the NL cell bodies. Because the NL axons run along the ITD gradient, orthogonal to the delay line axons within the NL (Sullivan and Konishi 1986; Carr and Boudreau 1993), the ITD tuning of the axon should be phase shifted with respect to the neurophonic. Furthermore, in the NL neurons, the action potential may be generated in the first node of Ranvier (Ashida et al. 2007; Carr and Boudreau 1993; Kuba et al. 2006), giving rise to strong conductances beyond the cell body.

The very weak ITD tuning of the noise variance suggests that the neural elements providing the predominant contribution to the extracellular field potential in NL do not change their mean activity as a function of ITD. We note that the mean activity of the bilateral inputs to NL is independent of ITD, whereas the mean activity of NL neurons strongly varies as a function of ITD (Peña et al. 1996). As a result, the contribution of NL action potentials to the neurophonic can be assumed to be small. Similarly, for example, in chicken NL (Köppl and Carr 2008) and cat medial superior olive (Mc Laughlin et al. 2010a,b), the synaptic inputs to the bipolar cells are assumed to be the primary source of the neurophonic. If our interpretation is correct in that the dependence of the noise variance on ITD reflects NL activity, the method we present here would allow us to separate input activity, as reflected by the signal, and output activity, which is part of the noise variance.

### Linear Prediction of Noise Variance and the Summation Ratio

The conjecture that the contributions of NL action potentials to the noise are negligible allowed us to predict the binaural noise variance σ_{B}^{2} from the ipsilateral (σ_{I}^{2}) and contralateral (σ_{C}^{2}) response variances and the variance σ_{spont}^{2} of the spontaneous activity. The linear model led to the prediction σ_{B}^{2} = σ_{I}^{2} + σ_{C}^{2} − σ_{spont}^{2} (*Eq. 24*), which matched the binaural data well: the variance of the data explained by the model was 99.9 ± 0.1% in the population.

This linear model for noise variances is closely related to the “summation ratio” (SR) as defined by Goldberg and Brown (1969)
$$\mathrm{SR} = \frac{R_{B} - R_{\mathrm{spont}}}{(R_{I} - R_{\mathrm{spont}}) + (R_{C} - R_{\mathrm{spont}})}, \tag{27}$$
where *R*_{B}, *R*_{I}, *R*_{C}, and *R*_{spont} are firing rates in response to binaural, ipsilateral, and contralateral stimulation, and during spontaneous activity, respectively. The summation ratio quantifies the relationship of the activities: SR = 1 represents linear summation of monaural rates whereas SR < 1 indicates “disfacilitation,” i.e., the binaurally driven rate *R*_{B} − *R*_{spont} is smaller than the sum of the two monaurally driven rates, *R*_{I} − *R*_{spont} and *R*_{C} − *R*_{spont}. In contrast, SR > 1 indicates “facilitation” (Goldberg and Brown 1969; Peña et al. 1996).

The summation ratio has been previously applied to NL neurons' firing rates. At the worst ITD, sustained firing rates were smaller than or similar to their monaurally driven rates (Carr and Konishi 1990; Peña et al. 1996), and Peña et al. (1996) reported disfacilitation at SR = 0.50 ± 0.05. On the other hand, at the most favorable ITD, sustained firing rates can exceed the sum of monaural discharge rates (Carr and Konishi 1990; Peña et al. 1996), and Peña et al. (1996) found facilitation at SR = 1.51 ± 0.27. Thus, barn owl NL neurons operate in the regime of nonlinear coincidence detectors extracting the signal amplitude of oscillatory input (Peña et al. 1996; Kempter et al. 1998; Funabiki et al. 2011; Ashida et al. 2012). Low BF chicken NL neurons, however, employ an additional strategy, whereby inhibition suppresses firing at unfavorable ITDs (Nishino et al. 2008).

The noise variance can be associated with the local mean neuronal activity (Kuokkanen et al. 2010), as argued above. Assuming that the noise variance is proportional to the firing rate (*Eq. 6*), i.e., *R*_{B} ∼σ_{B}^{2}, *R*_{I} ∼σ_{I}^{2}, *R*_{C} ∼σ_{C}^{2}, and *R*_{spont} ∼σ_{spont}^{2}, we find from the measured noise variances a summation ratio SR = 1.00 ± 0.03. This value indicates linear summation (Goldberg and Brown 1969), as predicted by the linear model. Indeed, the original expression for the summation ratio at SR = 1 in *Eq. 27* and the linear model for noise variances (*Eq. 24*) are equivalent if firing rates are proportional to noise variances.
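The equivalence between SR = 1 and the linear model for noise variances can be checked directly. The sketch below implements the summation ratio as the binaurally driven activity divided by the sum of the two monaurally driven activities, and applies it to noise variances in place of firing rates; the numerical values are hypothetical.

```python
def summation_ratio(r_b, r_i, r_c, r_spont):
    """Summation ratio after Goldberg and Brown (1969).

    Binaurally driven activity divided by the sum of the two monaurally
    driven activities, each measured relative to spontaneous activity.
    """
    return (r_b - r_spont) / ((r_i - r_spont) + (r_c - r_spont))

# Hypothetical noise variances in mV^2, used in place of firing rates
var_i, var_c, var_spont = 0.06, 0.05, 0.02
var_b_linear = var_i + var_c - var_spont            # linear model for the noise
sr_linear = summation_ratio(var_b_linear, var_i, var_c, var_spont)
sr_facil = summation_ratio(0.12, var_i, var_c, var_spont)    # facilitation
sr_disfac = summation_ratio(0.05, var_i, var_c, var_spont)   # disfacilitation
print(sr_linear, sr_facil, sr_disfac)
```

Any binaural variance that satisfies the linear prediction gives SR = 1, regardless of the particular monaural and spontaneous values.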

### Conclusion

We conclude that the nonlinear behavior of the NL coincidence detector neurons (summation ratio ≈1.5) is only weakly, if at all, represented in the extracellular field potential within NL. This finding provides independent evidence for the contribution of inputs to the neurophonic and strengthens the conclusions drawn by Kuokkanen et al. (2010). Thus, the neurophonic mainly represents the input to NL. Our results therefore strongly support the hypothesis by Sullivan and Konishi (1986) that the binaural neurophonic response is essentially waveform summation of the two monaural neurophonic responses. In addition, theoretical and computational work on the development of temporal feature maps in NL had suggested that the neurophonic reflects the input to NL (Kempter et al. 2001; Leibold et al. 2001, 2002). Understanding the origin of the neurophonic is therefore important for testing these theoretical predictions about how maps of ITD with a sub-millisecond precision may be assembled during development.

## GRANTS

This work was supported by the Bundesministerium für Bildung und Forschung (Bernstein Center for Computational Neuroscience Berlin, 01GQ1001A and Bernstein Focus “Neuronal Foundations of Learning,” 01GQ0972 to R. Kempter), Deutsche Forschungsgemeinschaft [WA 606/12-1 to H. Wagner; Sonderforschungsbereich (SFB) 618 “Theoretical Biology,” TP B3], National Institute on Deafness and Other Communication Disorders Grants DC-00436 (to C. E. Carr) and P30-DC-04664 (to the University of Maryland Center for the Evolutionary Biology of Hearing), and fellowships from the Alexander von Humboldt Foundation and the Hanse-Wissenschaftskolleg (to C. E. Carr and G. Ashida).

## DISCLOSURES

No conflicts of interest, financial or otherwise, are declared by the author(s).

## AUTHOR CONTRIBUTIONS

Author contributions: C.E.C. and H.W. conception and design of research; G.A., C.E.C., and H.W. performed experiments; P.T.K. and R.K. analyzed data; P.T.K., C.E.C., H.W., and R.K. interpreted results of experiments; P.T.K. and R.K. prepared figures; P.T.K., G.A., C.E.C., H.W., and R.K. drafted manuscript; P.T.K., G.A., C.E.C., H.W., and R.K. edited and revised manuscript; P.T.K., G.A., C.E.C., H.W., and R.K. approved final version of manuscript.

## ACKNOWLEDGMENTS

This work profited from the advice of Sandra Brill, Nikolay Chenkov, Jose Donoso, Jorge Jaramillo, Thomas Künzel, Thomas McColgan, Martina Michalikova, Jose Luis Peña, and Roland Schaette. We gratefully acknowledge the technical assistance of Sahil Shah and Kai Yan.

- Copyright © 2013 The American Physiological Society