## Abstract

Responses to broadband Gaussian white noise were recorded in auditory-nerve fibers of deeply anesthetized chinchillas and analyzed by computation of zeroth-, first-, and second-order Wiener kernels. The first-order kernels (similar to reverse correlations or “revcors”) of fibers with characteristic frequency (CF) <2 kHz consisted of lightly damped transient oscillations with frequency equal to CF. Because of the decay of phase locking strength as a function of frequency, the signal-to-noise ratio of first-order kernels of fibers with CFs >2 kHz decreased with increasing CF at a rate of about −18 dB per octave. However, residual first-order kernels could be detected in fibers with CF as high as 12 kHz. Second-order kernels, 2-dimensional matrices, reveal prominent periodicity at the CF frequency, regardless of CF. Thus onset delays, frequency glides, and near-CF group delays could be estimated for auditory-nerve fibers innervating the entire length of the chinchilla cochlea.

## INTRODUCTION

Nonlinear systems cannot be described by transfer functions or their time-domain equivalents, unit-impulse responses, and thus are difficult to analyze. Therefore the prominent nonlinearities of the cochlea have typically been described as perturbations of linear-like relations, using ad hoc stimuli such as tone pairs, for example, to study 2-tone suppression and combination-tone generation. More general and systematic approaches for the study of nonlinear systems, analogous to the determination of transfer functions of linear systems, were first sketched by Volterra and Wiener (Schetzen 1989; Volterra 1959; Wiener 1958) and fleshed out by Lee and Schetzen (Lee and Schetzen 1965; Schetzen 1989). Wiener analysis describes systems as sums of functionals (functions of functions). (For discussion of the applicability of Wiener-kernel analysis to the auditory system, see Eggermont 1993; Eggermont et al. 1983; Johnson 1980a; see also Kim and Young 1994 for a comparison of Wiener-kernel and spectrotemporal analysis.) The 1st-order Wiener functional is given by the convolution of a Gaussian white-noise input with the 1st-order Wiener kernel (h_{1}), the 2nd-order Wiener functional is given by a convolution of the noise with the 2nd-order Wiener kernel (h_{2}), and so on for the higher-order functionals; h_{1} and h_{2}, respectively, are obtained by performing 1st- and 2nd-order cross-correlations between a white-noise stimulus and the system's response (Lee and Schetzen 1965; Marmarelis and Marmarelis 1978). For linear systems, h_{1} is identical to the system's impulse response (i.e., the response to an intense brief click). For a system consisting of a linear filter followed by a quadratic nonlinearity, h_{1} is zero and h_{2} provides information on the quadratic nonlinearity (Marmarelis and Marmarelis 1978).

Wiener-kernel analysis was introduced to auditory physiology by Egbert de Boer (de Boer 1967) and colleagues. They used the *reverse correlation* (or *revcor*; *Eq. 8* in methods), the average noise-stimulus waveform preceding each spike (which is proportional to h_{1}; *Eq. 7* in methods), to describe the responses to noise of cat auditory-nerve fibers (ANFs) (de Boer 1969, 1973; de Boer and de Jongh 1978; de Boer and Jongkees 1968). Revcors have also been applied to the analysis of cochlear nucleus neurons and ANFs in several species (Carney and Yin 1988; Carney et al. 1999; Evans 1977; Joeken et al. 1997; Kim and Young 1994; Lewis et al. 2002; Møller 1977a, 1978; van Dijk et al. 1993, 1994, 1997; Wickesberg et al. 1984). The revcors of ANFs with low characteristic frequency (CF) have frequency tuning consistent with frequency–threshold curves for responses to tones. In the case of high-CF ANFs, revcors are small or insignificant in magnitude, reflecting the weak phase locking to high-frequency tones (Johnson 1980b; Palmer and Russell 1986). This deficiency of h_{1}s can be made up by the computation of h_{2}s, first used for the analysis of auditory neurons by Wickesberg et al. (1984), which can yield information on temporal coding even in the absence of phase locking to near-CF stimuli (Lewis et al. 2002; van Dijk et al. 1993, 1994, 1997; Yamada and Lewis 1999; Yamada et al. 1997).

We undertook a 2nd-order Wiener-kernel analysis of ANFs in chinchilla as a complement of work in our laboratory seeking to relate the responses of ANFs to the underlying vibrations of the basilar membrane (e.g., Narayan et al. 1998; Ruggero and Rich 1987; Ruggero et al. 1996, 2000). We especially wanted to obtain timing information for near-CF responses of ANFs innervating basal regions of the chinchilla cochlea: in chinchilla, high-quality basilar membrane data are available for the cochlear base (e.g., Recio et al. 1998; Rhode and Recio 2000; Ruggero et al. 1997) but comparable timing data do not exist for high-CF ANFs. We describe here h_{1}s and h_{2}s of chinchilla ANFs as a function of stimulus level and CF. A companion paper (Temchin et al. 2005) provides quantitative estimates of the ability of the Wiener kernels to predict ANF responses to tones, clicks, and frozen noise and relates the Wiener kernels to basilar membrane vibrations in the chinchilla cochlea. Preliminary results of this investigation were published in abstract form (Temchin et al. 1995).

## METHODS

### Animal preparation

Most of the techniques used here for animal preparation were published previously (Ruggero and Rich 1983, 1987). Adult male chinchillas were anesthetized with an initial injection of ketamine (100 mg/kg, subcutaneous) and with sodium pentobarbital (65 mg/kg, intraperitoneal), supplemented with additional doses of pentobarbital to maintain a complete absence of limb-withdrawal reflexes. Rectal temperature of the animals was maintained near 38°C with a servocontrolled electrical heating pad. Tracheotomy and tracheal intubation allowed for forced respiration, which was used only as necessitated by apnea or labored breathing. The pinna was resected and part of the bony external ear canal was chipped away to permit visualization of the umbo of the tympanic membrane and insertion of the earphone-coupling speculum. After opening the bulla widely, the tendon of the tensor tympani muscle was severed and the stapedius muscle was detached from its bony anchoring to prevent possible effects of muscle contraction evoked by high-level acoustic stimuli. A silver-ball electrode was placed on the round window to monitor cochlear health by measuring compound action potential thresholds.

The auditory nerve was approached superiorly after craniotomy and aspiration of part of the cerebellum. Capillary-glass microelectrodes (filled with 3 M NaCl or KCl solutions, impedance 20–70 MΩ) were positioned under visual control through an operation microscope and were advanced into the nerve by means of a remotely controlled hydraulic micropositioner.

ANF frequency–threshold tuning curves were obtained for responses to tone pips using an automated procedure (Liberman 1978; Ruggero and Rich 1983). Spontaneous rate was measured over a 10-s interval. CF and CF threshold were estimated from 3rd-order polynomial functions fitted to the tuning-curve tips.

### Acoustic stimulation

Acoustic stimuli were presented using a Beyer DT-48 earphone. Electrical tone pips were produced by a custom-built digital waveform generator under computer control (Ruggero and Rich 1983): a sinusoidal waveform, stored as 8,192 16-bit words of read-only memory, was sampled at selectable rates and converted into analog electrical signals. The sound pressure level (SPL) and phase of the acoustic tones were controlled by attenuating the electrical signals and by adjusting their starting phases with reference to a calibration table (SPL and phase as a function of constant-level and constant-phase electrical tones) generated in situ at the beginning of the experiment using a miniature Knowles microphone with its opening near the eardrum. On average, calibration SPLs were flat within ±3.4 dB at all frequencies <15 kHz.

Gaussian white-noise electrical waveforms were usually produced with an analog device (General Radio 1381). Less often, a hardware digital generator (Tucker-Davis Technologies TDT-WG1) or, exceptionally, a software-generated 8,192-word digital table, was also used. In all cases, the noise stimulus was low-pass filtered (15-kHz corner frequency, 48 dB/octave). Acoustic-noise spectral levels are expressed throughout this and the companion paper using units of dB SPL/Hz (i.e., re 20 μPa/Hz). [The unattenuated spectral level of the (electrical) noise stimulus was −48 dBV_{rms}/Hz. The root-mean-square (rms) voltage of the unattenuated tones exceeded the total rms voltage of the noise by 9 dB. Therefore for an attenuation of *x* dB, the spectral level at CF of an acoustic noise stimulus (SPL/Hz) = (SPL of unattenuated tones with frequency ≈ CF) − 57 dB − *x* dB.] An equivalent rectangular bandwidth (ERB) was computed for each ANF, in accordance with its frequency tuning (see Fig. 16, *inset*). Thus for any given ERB and noise spectrum level, an ERB total pressure (expressed in dB SPL) could be determined. ERB SPL is more directly comparable to tone SPL than to noise spectral level (see Fig. 16).
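The level arithmetic described above can be sketched in a few lines. This is only an illustration: the function names and the numerical inputs are hypothetical, and the ERB conversion assumes an ideal rectangular band with a flat noise spectrum across it, so that total level equals spectral level plus 10·log10(bandwidth in Hz).

```python
import math

def noise_spectral_level(tone_spl_at_cf, attenuation_db):
    """Spectral level at CF (dB SPL/Hz), using the relation stated in the text:
    (SPL of unattenuated tone near CF) - 57 dB - attenuation."""
    return tone_spl_at_cf - 57.0 - attenuation_db

def erb_spl(spectral_level_db, erb_hz):
    """Total level (dB SPL) within a rectangular band erb_hz wide,
    assuming the noise spectrum is flat across the band."""
    return spectral_level_db + 10.0 * math.log10(erb_hz)

# Hypothetical values, for illustration only (not data from the paper).
level = noise_spectral_level(tone_spl_at_cf=100.0, attenuation_db=40.0)
band_level = erb_spl(level, erb_hz=100.0)
```

With these made-up inputs the spectral level is 3 dB SPL/Hz and a 100-Hz ERB raises it by 20 dB.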

The noise was almost always delivered continuously (duration: 42–1,070 s). Both the stimulus and the amplified microelectrode output were recorded on a stereo digital audiotape recorder (Sony DTC-690; 16 bits, sampling rate 44.1 or 48 kHz) for later analysis. Exceptionally (to obtain the 4 highest-level responses of Fig. 13), the noise was gated (20-ms duration, 80-ms repetition period) with an electronic switch (TDT SW2), digitized with 16-bit resolution (TDT AD1) and stored in the computer. When using the latter paradigm, neural responses were stored on-line as a series of spike times with accuracy of 1 μs (measured at their initial edge with a TDT ET1). With either paradigm, spike times were defined as the instants when their rising edges just exceeded the baseline noise level.

The noise produced by the TDT-WG1 digital generator contained spurious periodicities, evident in its autocorrelation function [φ_{xx}(τ) in *Eq. 11*]. Such periodicities sometimes contaminated the 2nd-order kernels (Figs. 1–4) with 2 artifactual stripes, parallel to the diagonals (not shown) and unrelated to the CF period. In every other respect, the analog and digital noises produced identical results and were therefore pooled for analysis.

### Computation and processing of the Wiener kernels

Zeroth-order kernels (h_{0}s), h_{1}s, and h_{2}s were computed from the spike and the digitized noise waveforms. Let *x*(*t*) represent the Gaussian white-noise waveform and *y*(*t*) be the spike train measured in the auditory nerve. On the assumption that all of the information carried by the spike train is in the time of occurrence of each individual spike, *y*(*t*) can be expressed as

$$y(t) = \sum_{i=1}^{N} \delta(t - t_i) \tag{1}$$

where δ(*t*) is the Dirac delta function (infinite height and area = 1), *N* is the total number of spikes evoked by the noise stimulus, and *t*_{i} represents the spike times.

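The zeroth-order Wiener kernel represents the average output of the system (Schetzen 1989)

$$h_0 = \langle y(t) \rangle \tag{2}$$

where 〈*y*(*t*)〉 is the time average of *y*(*t*)

$$\langle y(t) \rangle = \frac{1}{T} \int_0^T y(t)\,dt \tag{3}$$

where *T* is the stimulus duration, and therefore

$$h_0 = \frac{1}{T} \int_0^T \sum_{i=1}^{N} \delta(t - t_i)\,dt = \frac{N}{T} \tag{4}$$

that is, *h*_{0} represents the average firing rate *N*_{0} = *N*/*T*.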

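The 1st-order Wiener kernel h_{1} was obtained by cross-correlating (Schetzen 1989) the input *x*(*t*) to the output *y*(*t*)

$$h_1(\tau_1) = \frac{1}{A}\,\langle y(t)\,x(t - \tau_1) \rangle \tag{5}$$

$$h_1(\tau_1) = \frac{1}{A}\,\frac{1}{T} \int_0^T \sum_{i=1}^{N} \delta(t - t_i)\,x(t - \tau_1)\,dt \tag{6}$$

$$h_1(\tau_1) = \frac{N_0}{A}\,R_1(\tau_1) \tag{7}$$

where *A* is the power spectral density of the noise stimulus and

$$R_1(\tau_1) = \frac{1}{N} \sum_{i=1}^{N} x(t_i - \tau_1) \tag{8}$$

is the reverse-correlation function (de Boer 1967).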

Using the previous equations, one can interpret the 1st-order Wiener kernel as the average value of the stimulus *x*(*t*) at a time τ_{1} before the occurrence of a spike, normalized to the stimulus power spectral density. For linear systems, *h*_{1}(*t*) is identical to the impulse response. For nonlinear systems, h_{1} is only a component of the impulse response, which contains contributions from higher kernels. In general, h_{1} for a nonlinear system differs from the linear part of the system and may contain some of the system's nonlinearities (Marmarelis and Marmarelis 1978).
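The interpretation above (h_{1} as the spike-triggered average of the stimulus, scaled by the firing rate and divided by the stimulus power spectral density) can be sketched in discrete time. The code below is an illustrative sketch, not the authors' MATLAB or C implementation. The function name, the use of sample indices for spike times, and the toy input are all assumptions, and the power spectral density *A* is simply passed in as a known number.

```python
def first_order_kernel(x, spike_idx, m, dt, psd):
    """Return (h0, revcor, h1) from sampled noise and spike sample indices.

    x         : noise samples (list of floats)
    spike_idx : sample indices at which spikes occurred (hypothetical format)
    m         : number of kernel lags; lag k corresponds to tau = k * dt
    dt        : sampling period in seconds
    psd       : power spectral density A of the noise, assumed known
    """
    # Zeroth-order kernel: average firing rate N / T.
    h0 = len(spike_idx) / (len(x) * dt)
    # Spikes within m - 1 samples of the start lack a full pre-spike window.
    usable = [i for i in spike_idx if i >= m - 1]
    # Revcor: mean stimulus value k samples before each usable spike.
    revcor = [sum(x[i - k] for i in usable) / len(usable) for k in range(m)]
    # First-order kernel: revcor scaled by firing rate over PSD.
    h1 = [h0 / psd * r for r in revcor]
    return h0, revcor, h1

# Toy input chosen so the averages are easy to check by hand.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
h0, revcor, h1 = first_order_kernel(x, spike_idx=[3, 5], m=3, dt=0.5, psd=2.0)
```

Here the two spikes are preceded by stimulus values (3, 2, 1) and (5, 4, 3), so the revcor is their element-wise mean.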

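The 2nd-order Wiener kernel h_{2} is obtained by 2nd-order cross-correlation between *x*(*t*) and *y*(*t*) − *h*_{0} (Schetzen 1989)

$$h_2(\tau_1, \tau_2) = \frac{1}{2A^2}\,\langle [y(t) - h_0]\,x(t - \tau_1)\,x(t - \tau_2) \rangle \tag{9}$$

$$h_2(\tau_1, \tau_2) = \frac{1}{2A^2} \left[ \frac{1}{T} \sum_{i=1}^{N} x(t_i - \tau_1)\,x(t_i - \tau_2) - h_0\,\langle x(t - \tau_1)\,x(t - \tau_2) \rangle \right] \tag{10}$$

$$h_2(\tau_1, \tau_2) = \frac{N_0}{2A^2} \left[ R_2(\tau_1, \tau_2) - \phi_{xx}(\tau_2 - \tau_1) \right] \tag{11}$$

where φ_{xx}(τ) is the autocorrelation function of the input signal *x*(*t*), and

$$R_2(\tau_1, \tau_2) = \frac{1}{N} \sum_{i=1}^{N} x(t_i - \tau_1)\,x(t_i - \tau_2) \tag{12}$$

is the 2nd-order reverse-correlation function.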

All h_{2}s presented in this paper are *m* × *m* square matrices, where *m* is the sample length of h_{1}. The value of *m* was chosen in each case so that it substantially exceeded the duration of the h_{1} and h_{2} signals. Each element of the matrix, h_{2}(τ_{i}, τ_{j}) (with τ_{i} ≠ τ_{j}), is proportional to the firing rate and the 2nd-order reverse correlation, which can be interpreted as the mean of the product of the value of the stimulus *x*(*t*) at 2 times, τ_{i} and τ_{j}, before the occurrence of a spike. Thus the h_{2}s give a measure of the nonlinear interaction, or “cross talk,” between the responses to 2 impulses (Marmarelis and Marmarelis 1978). In other words, h_{2}s measure the deviation from superposition arising from the nonlinearity of the system.
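A discrete-time sketch of the 2nd-order cross-correlation may help make the matrix concrete. It assumes the standard Lee-Schetzen form, in which each element is the firing rate divided by 2*A*², multiplied by the difference between the 2nd-order reverse correlation and the stimulus autocorrelation at lag τ_{i} − τ_{j}. The function name, the spike-index format, and the toy data are hypothetical, and the power spectral density is passed in as a known value.

```python
def second_order_kernel(x, spike_idx, m, dt, psd):
    """Return the m x m second-order kernel estimate as a list of lists."""
    n = len(x)
    h0 = len(spike_idx) / (n * dt)                 # average firing rate
    # Keep only spikes with a full pre-spike window of m samples.
    usable = [i for i in spike_idx if i >= m - 1]
    # Stimulus autocorrelation estimates at lags 0 .. m-1.
    phi = [sum(x[t] * x[t - lag] for t in range(lag, n)) / (n - lag)
           for lag in range(m)]
    h2 = [[0.0] * m for _ in range(m)]
    for j in range(m):
        for k in range(j, m):
            # Mean product of the stimulus at two delays before each spike.
            r2 = sum(x[i - j] * x[i - k] for i in usable) / len(usable)
            value = h0 / (2.0 * psd ** 2) * (r2 - phi[k - j])
            h2[j][k] = h2[k][j] = value            # kernel is symmetric
    return h2

# Toy data, small enough to verify by hand.
x = [1.0, -1.0, 2.0, 0.0, -2.0, 1.0, 3.0, -1.0]
h2 = second_order_kernel(x, spike_idx=[2, 6], m=2, dt=1.0, psd=1.0)
```

Only the upper triangle is computed explicitly because the kernel is symmetric in its two delays.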

Cross-correlations (according to *Eqs. 8* and *12*) were carried out in the time domain, using either MATLAB functions or ad hoc programs coded in the C programming language. h_{1}s were computed with sampling period 22.68 or 20.83 μs and length (*m*) = 256, 512, 1,024, or 2,048 (depending on CF), chosen to fully encompass the response duration.

All h_{1}s and h_{2}s presented in this paper were zero-phase filtered with a low-pass function with corner frequency 1.25 octave higher than the frequency at which the magnitude of the first-rank singular vector of the 2nd-order kernel (h_{2} FSV; see following text) exceeded the noise floor by 6 dB (see Fig. 2*F*). h_{1}s were filtered using the MATLAB function *filtfilt*. h_{2}s were filtered using a MATLAB implementation of a 2-dimensional (2-D) zero-phase filter. The h_{2}s were subjected to singular value decomposition (SVD), using the MATLAB function *svd*

$$h_2 = \mathbf{U}\,\mathbf{S}\,\mathbf{V}^{\mathbf{T}} \tag{13}$$

where **U**, **S**, and **V** are square matrices of the same size as *h*_{2}, and **T** is the transpose operator. The columns of **U** and the columns of **V** (i.e., the rows of **V**^{**T**}) are known as the *left* and *right singular vectors*, respectively, and are dimensionless. The nonzero diagonal of **S** represents the weights of the corresponding (same-rank) *singular vectors*. The weights have the same dimensions as the h_{2}s. For symmetric matrices with real elements, such as the h_{2}s in the present work, singular vectors are the same as eigenvectors. Thus *h*_{2} can be decomposed as follows

$$h_2 = \sum_{i=1}^{N} d_i\,\mathbf{u}_i\,\mathbf{u}_i^{\mathbf{T}} \tag{14}$$

where the eigenvector **u**_{i} is a column element of **U** (i.e., **U** = [**u**_{1} **u**_{2} … **u**_{N}]) and *d*_{i} represents the corresponding eigenvalue. The eigenvector **u**_{i} associated with the largest eigenvalue (i.e., the first singular vector or h_{2} FSV) is used extensively in this and the companion paper (Temchin et al. 2005). The weighted FSVs [FSV × **S**(1, 1); e.g., Fig. 14*B*] have the same units as the h_{2}s. In attempting to give a physical meaning to the h_{2} FSV it is important to note that *Eqs. 12* and *14* imply that its sign (polarity) is ambiguous.
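The sign ambiguity of the FSV can be illustrated with a small numerical sketch. Note that power iteration is swapped in here for MATLAB's *svd*, purely to keep the example self-contained. It recovers only the dominant vector of a real symmetric matrix, and the function name, iteration count, and test matrix are assumptions.

```python
import math
import random

def first_singular_vector(matrix, iterations=500, seed=0):
    """Dominant singular vector and weight of a real symmetric matrix,
    estimated by power iteration (a stand-in for a full SVD)."""
    m = len(matrix)
    rng = random.Random(seed)
    v = [rng.uniform(-1.0, 1.0) for _ in range(m)]   # arbitrary start vector
    weight = 0.0
    for _ in range(iterations):
        w = [sum(row[j] * v[j] for j in range(m)) for row in matrix]
        weight = math.sqrt(sum(c * c for c in w))    # converges to |eigenvalue|
        if weight == 0.0:
            break
        v = [c / weight for c in w]
    return weight, v

def rank_one(weight, vec):
    """Weighted outer product: weight * vec * vec^T."""
    return [[weight * a * b for b in vec] for a in vec]

# Hypothetical rank-one "kernel" built from a known unit vector u.
u = [0.6, 0.8]
kernel = rank_one(3.0, u)
weight, fsv = first_singular_vector(kernel)
overlap = abs(sum(a * b for a, b in zip(fsv, u)))
flipped = [-c for c in fsv]
same_kernel = rank_one(weight, fsv) == rank_one(weight, flipped)
```

Because the outer product is unchanged when the vector is negated, the kernel alone cannot determine the polarity of its FSV; `same_kernel` evaluates to `True`.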

The instantaneous frequency of the h_{2} FSV was estimated by means of the analytic signal representation, as described in Recio et al. (1998), within the time interval in which the h_{2} FSV envelope magnitude exceeded the noise floor by ≥12 dB. The magnitude and phase spectra of h_{1}s and h_{2} FSVs (padded with zeroes to a length of 4,096 samples, from their original lengths, *m*) were obtained by Fourier transformation with the MATLAB function *fft*. h_{2}s (size: *m* × *m*) were Fourier transformed using the MATLAB function *fft2*. The statistical significance of phase locking was tested by computation of the number 2*nVS*^{2}, where *n* is the number of spikes and *VS* is the vector strength (Goldberg and Brown 1969). Phase locking was considered statistically significant when 2*nVS*^{2} > 10.6, in which case *P* < 0.01 (Mardia and Jupp 2000).
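The phase-locking criterion can be sketched as follows. Vector strength is computed in the usual way, as the length of the mean unit phasor of the spike phases, and the 10.6 criterion is taken from the text. The function names and the two synthetic spike trains are hypothetical.

```python
import math

def vector_strength(spike_times, freq_hz):
    """Length of the mean resultant vector of spike phases at freq_hz."""
    n = len(spike_times)
    c = sum(math.cos(2.0 * math.pi * freq_hz * t) for t in spike_times)
    s = sum(math.sin(2.0 * math.pi * freq_hz * t) for t in spike_times)
    return math.hypot(c, s) / n

def is_phase_locked(spike_times, freq_hz, criterion=10.6):
    """Apply the 2*n*VS^2 > criterion test described in the text."""
    n = len(spike_times)
    vs = vector_strength(spike_times, freq_hz)
    return 2.0 * n * vs ** 2 > criterion

freq = 500.0
# Synthetic train with one spike per cycle, always at the same phase.
locked = [k / freq for k in range(20)]
# Synthetic train whose phases are spread evenly around the cycle.
spread = [(k + k / 20.0) / freq for k in range(20)]

vs_locked = vector_strength(locked, freq)
vs_spread = vector_strength(spread, freq)
```

For the perfectly locked train, *VS* = 1 and 2*nVS*^{2} = 40, which exceeds the criterion. For the evenly spread train, *VS* is essentially zero.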

## RESULTS

Responses to noise were obtained from 137 ANFs, with CFs between 100 Hz and 14 kHz, recorded in 16 chinchillas. After isolation of each ANF, a frequency–threshold tuning curve was measured from its responses to tone pips and the level of gated white noise, which elicited a just-audible increase of the rate of firing above the spontaneous rate, was determined. Long-duration (42–1,070 s) white noise was then presented at that level and also at higher levels if recording time allowed. h_{0}s, h_{1}s, and h_{2}s were computed off-line from 193 responses of 137 ANFs. Additionally, h_{0}s and revcors (*R*_{1}, *Eq. 8*) were computed from 44 responses of 40 other ANFs from 4 other chinchillas.

### General features of 1st- and 2nd-order Wiener kernels of ANFs

Figure 1 illustrates the 1st- and 2nd-order Wiener kernels (h_{1}s and h_{2}s, respectively) for responses to noise of a representative low-CF ANF. The time-domain waveform of the h_{1} (*blue trace*, Fig. 1*A*) is a transient but relatively undamped oscillation, indicative of a well-tuned band-pass system. Fourier transformation of h_{1} (blue, Fig. 1*D*) reveals a close match between its best frequency (BF) and the CF of responses to tones (arrow). [We distinguish between CF, the frequency that yields the most sensitive responses at threshold levels in normal adult cochleae, and BF, the frequency of peak sensitivity. Each ANF has a unique CF but has various BFs that depend on cochlear maturity (Overstreet et al. 2003) and health, as well as stimulus level.]

Figure 1, *B* and *E* present h_{2}(τ_{1}, τ_{2}), computed from the same responses to noise represented in Fig. 1, *A* and *D*. The h_{2} is depicted in Fig. 1*E* as a 3-dimensional (3-D) object and in Fig. 1*B* as its color-coded projection onto the 2-D plane. Hues indicate magnitudes: red for extreme positive values, blue for extreme negative values, and green hues indicating the featureless, near-zero, background. Over a well-localized region, the h_{2} consists of the intersection of 2 waves moving in directions parallel and perpendicular to the diagonal, thus creating a checkerboard pattern of peaks and troughs.

“Slices” through h_{2} are shown in Fig. 1*C*. The thin black line represents the values of the kernel at a fixed τ_{1} (3.27 ms), when this kernel reaches its maximum. The magenta line represents the kernel diagonal, h_{2}(τ, τ), which is the contribution of h_{2} to the impulse response (Marmarelis and Marmarelis 1978). The oscillation of this diagonal is “stretched out” in time (by a factor of √2) relative to h_{2} (3.27 ms, τ_{2}) and its apparent periodicity is 1/√2CF [instead of 1/CF, as for h_{1} or h_{2} (3.27 ms, τ_{2})]. [Both the “stretched out” duration and the 1/√2CF periodicity represent geometrical artifacts of the time-domain h_{2}, without counterpart in the frequency domain (panel *F*); see p. 154–155 of Marmarelis and Marmarelis 1978.] The nonzero diagonal indicates the presence of amplitude-dependent nonlinearities. Specifically, it indicates that the summation of 2 simultaneous identical impulses did not produce a response twice as large as the response to a single impulse. The pattern of the h_{2}s of low-CF ANFs is consistent with that of a linear system followed by a square-law device (Marmarelis and Marmarelis 1978).

Figure 1*F* presents the magnitudes of the Fourier transform of the 2nd-order kernel. (Note that magnitudes for negative frequencies in Fig. 1*D* and quadrants III and IV in Fig. 1*F* are not shown because they are redundant.) The spectral magnitudes of h_{2} contain peaks around (CF, CF) in quadrant I, and around (−CF, CF) in quadrant II. The peak in quadrant I corresponds to the wave in the direction of the diagonal (magenta line in Fig. 1*B*), and indicates even-order distortion components phase locked to CF. The peak in quadrant II corresponds to the wave in a direction perpendicular to the diagonal and indicates responses that follow the *envelopes* of the even-order nonlinearities. In low-CF ANFs, the spectral peaks in quadrants I and II had similar amplitudes.
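The link between the two intersecting waves and the two spectral peaks can be illustrated with a short derivation. It is only a sketch under assumptions not made explicit in the original: the system is idealized as a linear filter followed by a pure square-law device, and the filter's impulse response is written as *g*(τ) = *e*(τ) cos(2π CF τ), where *e*(τ) is a hypothetical slowly varying envelope. For such a cascade the 2nd-order kernel is the outer product of the impulse response with itself, so that

$$h_2(\tau_1, \tau_2) = g(\tau_1)\,g(\tau_2) = \tfrac{1}{2}\,e(\tau_1)\,e(\tau_2) \left[ \cos 2\pi\,\mathrm{CF}\,(\tau_1 + \tau_2) + \cos 2\pi\,\mathrm{CF}\,(\tau_1 - \tau_2) \right]$$

The first cosine oscillates along the diagonal and contributes spectral energy near (CF, CF) in quadrant I. The second oscillates perpendicular to the diagonal and contributes energy near (−CF, CF) in quadrant II. In this idealization the two terms have equal amplitude, which matches the comparable quadrant I and II peaks of low-CF ANFs. If the first term is removed, what remains depends only on τ_{1} − τ_{2} (within the envelope), that is, stripes parallel to the diagonal.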

As illustrated in Fig. 2, h_{2}s were routinely low-pass filtered to attenuate the masking effects of high-frequency noise. Additionally, Fig. 2 suggests that the features of h_{2}s are more effectively displayed as 2-D projections with color-coded amplitudes than as 3-D objects (compare Fig. 2, *A* and *B* or *D* and *E*).

Figure 3*A* illustrates kernels for a mid-CF ANF. Whereas h_{1} (*blue trace* in Fig. 3*A*) resembles its counterparts for lower-CF ANFs (e.g., Fig. 1*A*), h_{2} (Fig. 3*B*) does not. In particular, the checkerboard pattern prominent in Fig. 1*B* is only faintly discernible (in the early part of h_{2}) in Fig. 3*B*. Later parts consist of a ridge at the diagonal, flanked by parallel ridges and troughs spaced at regular intervals about equal to the CF period (measured along lines parallel to either time axis). The diagonals of h_{2}s of low-CF and mid-CF ANFs are also different. For low-CF ANFs, the diagonal (*magenta trace* in Fig. 1*C*) consists of a prominent AC component that rides on a DC shift. For mid-CF ANFs (*magenta trace*, Fig. 3*C*), the DC shift remains but the AC component is substantially attenuated. The h_{2}s of low- and mid-CF ANFs also differ in the frequency domain: for low-CF ANFs (Fig. 1*F*) the magnitude peaks in quadrants I and II are of comparable size, whereas the quadrant II peak is much larger than the quadrant I peak for mid-CF ANFs (Fig. 3*F*). The small size of the quadrant I peak indicates that AC responses are poorly phase locked to CF. The large peak in quadrant II indicates that the response consists principally of a transient DC shift (or envelope) synchronized to the occurrence of CF components in the stimulus.

Figure 4 presents the h_{1}s and h_{2}s of a representative high-CF ANF. The h_{1} (blue in Fig. 4*A*) is nearly insignificant and barely discernible within the baseline noise. This is consistent with the poor phase locking of ANF responses to high-frequency tones (Johnson 1980b; Palmer and Russell 1986; Woolf et al. 1981). However, the frequency tuning of high-CF ANFs is clearly revealed in their h_{2}s, which consist solely of ridges and troughs at, and parallel to, the diagonal. This striped pattern contrasts with the checkerboard pattern of h_{2}s of low-CF ANFs (e.g., Fig. 1*B*).

Slices through the h_{2} are shown in Fig. 4*C*. The diagonal of h_{2}(τ, τ) resembles the envelope function of h_{2}. At a fixed τ_{1}, the periodicity of h_{2} [e.g., h_{2}(1.94 ms, τ_{2})] approximates 1/CF and its shape is that of a band-pass system. The spectrum of h_{2} (Fig. 4*F*) contains only a single peak, around (−CF, CF) in quadrant II. This contrasts with the h_{2}s of low- and mid-CF ANFs (Figs. 1*F* and 3*F*) and indicates that near-CF spectral stimulus components are almost exclusively signaled by envelope (i.e., DC) responses.

All kernels presented in the figures of this paper (except those in Fig. 2, *A* and *B* and the *blue trace* in 2*C*) were subjected to low-pass filtering. However, we also closely studied unfiltered frequency–domain versions of h_{2}s (similar to those of Figs. 1*F*, 3*F*, and 4*F*, but with greater resolution), searching for correlates of simple summation (f_{1} + f_{2}) and difference (f_{2} − f_{1}) tones. If present, distortion at summation and difference frequencies would appear as ridges parallel to the (CF/CF and CF/−CF) diagonals in quadrants I and II. At first glance, one might expect to find distortion at (f_{2} − f_{1}) in the 2nd-order kernels because such simple difference tones are often associated with square-law nonlinearity and are prominent features of ANF responses to tone pairs (Siegel et al. 1982). In fact, energy at distortion summation or difference frequencies was never evident. We are uncertain of whether this indicates that the poor signal-to-noise ratio of the Wiener kernels caused summation and/or difference tones to be buried in the baseline noise of the kernels or, alternatively, that a band-pass filter centered at CF is interposed between the site of origin of the even-order nonlinearities and the site of spike generation.

A simple “sandwich” model system (Fig. 5) helps to explain the results described above, in particular the relation between phase locking and the striped patterns of 2nd-order kernels in the time domain (as well as the corresponding spectral features). The system consists of a band-pass linear filter, a zero-memory nonlinearity (ZNL), and a low-pass linear filter. The band-pass (gammatone) filter is tuned to “CF.” The ZNL is a half-wave rectifier. The low-pass filter has a cutoff frequency of 2.5 kHz. The input to the system is a Gaussian white noise sampled at 48 kHz. The *middle column* shows 40-ms segments of the input and the outputs at various stages when the CF is 500 Hz (i.e., a “low” CF). Because the CF is far below the cutoff frequency of the low-pass filter, its output (“‡” in the block diagram) is virtually identical to that of the ZNL (“†”). The *right column* shows 4-ms segments of the input and the outputs when the CF is 7 kHz (i.e., a “high” CF). Because the CF is far above the cutoff frequency of the low-pass filter, its output consists of a slow-varying envelope with frequency components lower than the cutoff frequency.
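The behavior of the sandwich model can be sketched numerically. The text specifies only a gammatone band-pass filter, a half-wave rectifier, a 2.5-kHz low-pass filter, and a 48-kHz sampling rate. Everything else below is an assumption made for illustration: the gammatone order and bandwidth, the use of four cascaded one-pole stages for the low-pass filter (which lowers the effective −3 dB point), the noise duration, and all function names.

```python
import math
import random

FS = 48_000          # sampling rate (Hz), from the text
LOWPASS_HZ = 2_500   # low-pass cutoff (Hz), from the text

def gammatone_ir(cf, order=4, rel_bandwidth=0.15):
    """Gammatone impulse response; order and bandwidth are assumptions."""
    b = rel_bandwidth * cf
    return [(n / FS) ** (order - 1) * math.exp(-2 * math.pi * b * n / FS)
            * math.cos(2 * math.pi * cf * n / FS)
            for n in range(int(2.0 * FS / b))]

def convolve(x, h):
    """Causal FIR filtering of x with impulse response h."""
    return [sum(h[k] * x[n - k] for k in range(min(n + 1, len(h))))
            for n in range(len(x))]

def half_wave(x):
    """Zero-memory nonlinearity: half-wave rectifier."""
    return [max(v, 0.0) for v in x]

def low_pass(x, stages=4):
    """Cascade of one-pole low-pass stages (assumed filter structure)."""
    alpha = 1.0 - math.exp(-2 * math.pi * LOWPASS_HZ / FS)
    y = list(x)
    for _ in range(stages):
        state, smoothed = 0.0, []
        for v in y:
            state += alpha * (v - state)
            smoothed.append(state)
        y = smoothed
    return y

def band_magnitude(signal, center_hz, half_width_bins=5):
    """Root summed power of DFT bins around center_hz."""
    n = len(signal)
    center = round(center_hz * n / FS)
    total = 0.0
    for k in range(center - half_width_bins, center + half_width_bins + 1):
        w = 2 * math.pi * k / n
        re = sum(s * math.cos(w * i) for i, s in enumerate(signal))
        im = sum(s * math.sin(w * i) for i, s in enumerate(signal))
        total += re * re + im * im
    return math.sqrt(total)

def cf_transmission(cf, n_samples=2400, seed=1):
    """Near-CF magnitude after the low-pass filter relative to before it."""
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    rectified = half_wave(convolve(noise, gammatone_ir(cf)))
    return band_magnitude(low_pass(rectified), cf) / band_magnitude(rectified, cf)

ratio_low = cf_transmission(500.0)     # "low" CF, below the cutoff
ratio_high = cf_transmission(7000.0)   # "high" CF, above the cutoff
```

Under these assumptions most of the near-CF fine structure survives the low-pass stage when CF is 500 Hz, whereas it is strongly attenuated when CF is 7 kHz, leaving mainly the slowly varying envelope.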

Figure 6 presents 2nd-order Wiener kernels of the model system and their corresponding 2-D Fourier transforms. The *left column* shows the time-domain kernels and the *right column* shows their Fourier transforms (actually, only sections of the 1st and the 2nd quadrants of the complete 2-D Fourier transforms). The *1st row* shows the 2nd-order Wiener kernel of the system when the CF is 500 Hz. For this CF the input to the low-pass filter and the output of the low-pass filter are virtually identical and thus the 2nd-order Wiener kernels computed pre- and post-low-pass filtering are the same. Note that the Fourier transform (*right-hand side*) contains CF components in both spectral quadrants, corresponding to the orthogonally intersecting waves in the time domain (which yield a checkerboard pattern). The *2nd* and *3rd rows* show a similar analysis for the case of CF = 7 kHz. The 2nd-order Wiener kernel computed before low-pass filtering also exhibits a checkerboard pattern in the time domain and, as for the low-CF case (*top row*), its frequency–domain counterpart contains components in both quadrants at the CF. In contrast, the 2nd-order Wiener kernel computed after low-pass filtering has a striped pattern in the time domain and only a quadrant II component in the frequency domain.

### Singular value decomposition of the 2nd-order kernels

Useful representations of matrices can be obtained by *singular value decomposition* (Lewis et al. 2002; Yamada and Lewis 1999; Yamada et al. 1997; see methods). For low-CF ANFs, the Fourier magnitudes of the h_{1}s and the 1st-rank singular vectors of the h_{2}s (h_{2} FSVs) were essentially identical (compare *blue* and *red traces* in Figs. 1*D* and 8). However, the *polarity* of the h_{2} FSVs was ambiguous because of the inherent polarity ambiguity of the 2nd-order kernels. Thus the h_{2} FSV waveforms were either identical to those of the h_{1}s or, alternatively, had opposite polarities. Throughout this paper, h_{2} FSVs are plotted with polarities corresponding to those of the h_{1}s (procedures are illustrated in Fig. 10).

In the case of mid-CF ANFs, the match between the h_{1}s and the h_{2} FSVs (Fig. 3, *A* and *D*) was poorer than that for ANFs with lower CF (Fig. 1, *A* and *D*). This resulted from the deterioration of phase locking, which generated h_{1}s with poor signal-to-noise ratio. This is evident by comparing the magnitude spectra: in the case of the low-CF ANF (Fig. 1*D*), the peak-to-noise ratio of the h_{1} exceeds 40 dB at low frequencies; in the case of the mid-CF ANF (Fig. 3*D*), the peak-to-noise ratio of the h_{1} is only 20 dB at the same frequencies. With even weaker phase locking in high-CF ANFs, the h_{1} was nearly indistinguishable from the noise baseline (Fig. 4*A*), whereas the h_{2} FSVs retained sharp frequency tuning. [Note that the *blue trace* of Fig. 4*D*, where a peak is detectable at BF, is the magnitude of a *windowed* version of h_{1} (see Fig. 10); the time window, derived from the h_{2} envelope, extended in this case from about 1.23 to 2.35 ms.]

For some matrices, SVD yields very efficient compression of information, so that, for example, a recognizable version of an image with size *n*^{2} can be reconstructed from merely a few vectors of length *n*. Figure 7 shows that this is also the case for the h_{2}s of chinchilla ANFs. Figure 7*A* and its *inset* show that the weights of the h_{2} FSVs of chinchilla low-CF ANFs were much larger than the weights of all other vectors: the weights of the rank-2 vector amounted to <30% of the h_{2} FSV weight, on average (filled circles in Fig. 7*A* and *inset*) and the other vectors (ranks 3, 4, etc.) were of course smaller (*inset*). Thus the h_{2} FSVs sufficed to describe most features of the h_{2}s of low-CF ANFs. In the case of high-CF ANFs, the h_{2} FSVs and the 2nd-rank vectors had the same (positive) sign and similar weights (filled squares in Fig. 7*A*; see also upward triangles in *inset*) and differed by only a 90° shift. Thus for high-CF ANFs, the 2 singular vectors of highest rank (1 and 2) jointly contained most of the features of the h_{2}s. The abrupt change at 2–3 kHz in the relative weights of the 2nd-rank singular vectors (trend line in Fig. 7*A*) coincides with the transition in the appearance of h_{2}s, from the checkerboard pattern (low CFs) to the stripe pattern (high CFs).

The scatter diagram of Fig. 7*B* reveals a negative correlation between the relative weights of the 3rd-rank singular vectors (open symbols) and the signal-to-noise ratio of the 2nd-order kernel waveforms. The negative correlation suggests that singular vectors other than those with ranks 1 and 2 (i.e., vectors of rank 3, 4, and so on) did not represent bona fide cochlear response properties but merely “noisiness” attributed to insufficient sampling time. This idea is supported by the fact that singular vectors with ranks other than 1 and 2 were generally untuned. In the case of low-CF ANFs, the h_{2} 2nd-rank singular vectors (filled circles) also were always untuned and had small relative weights. In the case of high-CF ANFs (filled squares), however, the weights of the 2nd-rank singular vectors were prominent (i.e., similar to the FSV weights) and did not correlate with signal-to-noise ratio.

### The Wiener kernels as a function of CF

Figure 8 shows the h_{1}s and h_{2}s of several ANFs with low CFs (109 Hz to 2.5 kHz). The *left column* of Fig. 8 shows that the h_{1}s (*blue trace*) and the h_{2} FSVs (*red trace*) were nearly identical. The *right-hand column* illustrates the corresponding h_{2}s (as color-coded projections), exhibiting the checkerboard pattern. Figure 9 shows the h_{1}s (*blue trace*, *left column*) and the h_{2} FSVs (*red trace*, *left column*) for several high-CF ANFs. In the case of the ANF with CF 3.65 kHz (*top*), the h_{2} FSVs closely resembled the corresponding h_{1}s. In the case of ANFs with higher CFs, however, the signal-to-noise ratios of the h_{1}s diminished systematically, eventually becoming nearly, *but not completely*, buried in the baseline noise (Fig. 9, *left column*; see also *blue trace* in Fig. 4*A*). The h_{2}s of high-CF ANFs (*right column*) did not exhibit checkerboard patterns but rather consisted of ridges and troughs parallel to the diagonal.

For mammalian ANFs, phase locking to tones decays rapidly as a function of frequency and is often undetectable in responses to high-frequency tones (Johnson 1980b; Palmer and Russell 1986; Ruggero 1992; for a contrary view, see Teich et al. 1993). Nevertheless, the h_{1}s of ANFs with CFs as high as 12 kHz contained small but measurable near-CF oscillations at delays matching those of the corresponding h_{2}s (e.g., compare the *blue* and *red traces* for the ANFs with CFs of 9.3 and 12.1 kHz in Fig. 9). To obtain spectral information from such residual oscillations, each raw h_{1} (dashed line in Fig. 10*A*) was windowed in the time domain (thick solid line in Fig. 10*A*) according to the normalized h_{2} FSV. [Note that the long duration of the time windows guaranteed that the h_{1} waveforms were not altered (Fig. 10*A*).] In ANFs with BF as high as 12.1 kHz, the windowed h_{1}s (e.g., thick line in Fig. 10*B*) often exhibited clear periodicity similar to that of the h_{2} FSVs (thin line in Fig. 10*B*) and the frequencies of the peak magnitudes of their Fourier transforms closely matched the BFs of the h_{2} FSVs (Fig. 10*C*; see also Fig. 11). Except for the above-noted π ambiguity, the phases of the windowed h_{1}s also closely matched those of the h_{2} FSVs (Fig. 10*D*).

To study the dependency of phase locking on CF, we devised a measure analogous to the vector strength used in quantifying phase locking in responses to tones (Goldberg and Brown 1969), that is, the ratio between the amplitudes at BF of the 1st-order cross-correlations (revcors; see *Eq. 8*) and the (unattenuated) stimulus-noise waveforms

$$VS_{\mathrm{noise}} = \frac{V_{\mathrm{revcor}}}{V_{\mathrm{noise}}} \tag{15}$$

*V*_{revcor} is the Fourier magnitude of the windowed revcor at BF (thick line in Fig. 10*C*). *V*_{noise} is the average of the Fourier magnitudes of the noise stimulus at BF, measured in consecutive time intervals (identical to revcor times), which together spanned the full duration of the stimulus. The same time functions (e.g., thick solid line in Fig. 10*A*) were used to window the revcors and the noise stimuli.

*VS*_{noise} is a dimensionless quantity that varies from zero to one. It equals one when all spikes are preceded by effective waveforms with the same, fixed latency. *VS*_{noise} is plotted in Fig. 12 as a function of h_{1} BF. *VS*_{noise} is relatively constant, 0.68 on average, in the frequency range 100 Hz to 2 kHz, and decays precipitously at higher frequencies, at a rate of about −18 dB/octave. Because *VS*_{noise} is based on 1st-order cross-correlation, its decay with BF reflects the extent to which frequency-tuned auditory signals (such as basilar membrane vibrations) are low-pass filtered by more central cochlear processes (e.g., the generation of receptor potentials at the inner hair cells).
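The ratio described above can be sketched numerically. The following Python example (NumPy only) is a minimal illustration under stated assumptions, not the procedure used for the recordings: the revcor is taken as the plain average of pre-spike stimulus segments, a Hann window stands in for the h_{2}-FSV-derived window, and `magnitude_at` and `vs_noise` are hypothetical helper names.

```python
import numpy as np

def magnitude_at(x, freq_hz, fs_hz):
    """Magnitude of the discrete Fourier sum of x evaluated at freq_hz."""
    n = np.arange(len(x))
    return np.abs(np.sum(x * np.exp(-2j * np.pi * freq_hz * n / fs_hz)))

def vs_noise(stimulus, spike_idx, bf_hz, fs_hz, window):
    """Ratio of revcor to noise Fourier magnitudes at BF (hypothetical helper)."""
    seg_len = len(window)
    spike_idx = np.asarray(spike_idx)
    spike_idx = spike_idx[spike_idx >= seg_len]   # need a full pre-spike segment
    # Revcor: average of the stimulus segments that preceded each spike.
    segments = np.stack([stimulus[i - seg_len:i] for i in spike_idx])
    revcor = segments.mean(axis=0)
    v_revcor = magnitude_at(revcor * window, bf_hz, fs_hz)
    # Noise reference: mean magnitude over consecutive, equally long segments.
    n_blocks = len(stimulus) // seg_len
    blocks = stimulus[:n_blocks * seg_len].reshape(n_blocks, seg_len)
    v_noise = np.mean([magnitude_at(b * window, bf_hz, fs_hz) for b in blocks])
    return v_revcor / v_noise

rng = np.random.default_rng(0)
fs, bf, seg_len = 20_000.0, 1_000.0, 256          # assumed values
window = np.hanning(seg_len)                      # stand-in window (assumption)

# Limiting case: every spike is preceded by the same waveform.
segment = rng.standard_normal(seg_len)
locked_stim = np.tile(segment, 50)
locked_spikes = np.arange(1, 51) * seg_len
vs_locked = vs_noise(locked_stim, locked_spikes, bf, fs, window)

# Opposite case: spike times unrelated to the noise waveform.
random_stim = rng.standard_normal(200_000)
random_spikes = rng.integers(seg_len, len(random_stim), size=2_000)
vs_random = vs_noise(random_stim, random_spikes, bf, fs, window)
print(vs_locked, vs_random)
```

The first case reproduces the upper bound of one stated in the text; the second shows that spikes unrelated to the stimulus give a value near zero.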

### Wiener kernels as a function of stimulus level

Figure 13 depicts h_{1}s for responses of a representative low-CF ANF to noise stimuli presented at 7 different levels. With increasing stimulus intensity, the number of detectable oscillations in the h_{1}s decreased (indicating deterioration of frequency selectivity) and the “center of gravity” (group delay) shifted to earlier times (Fig. 13*A*). The onset time of the h_{1}s, however, remained independent of the intensity of the stimulus.

Figure 13, *B* and *C* display the amplitude- and phase-frequency spectra of the h_{1}s of Fig. 13*A*. As the level of stimulation increased, response sensitivity decreased and the sharpness of frequency tuning (as reflected by Q_{10dB}, the ratio of CF to bandwidth at 10 dB re peak value; not shown) was also reduced: e.g., Q_{10dB} (43 dB SPL/Hz) = 1.02; Q_{10dB} (−17 dB SPL/Hz) = 2.25. Similar changes were observed in other low-CF ANFs.
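As an illustration of the Q_{10dB} definition, the following Python sketch (NumPy only) measures the bandwidth 10 dB below the peak of a magnitude spectrum by linear interpolation in dB. The function name `q10db` and the synthetic spectrum are assumptions made for the example; the values are not taken from the recordings.

```python
import numpy as np

def q10db(freqs_hz, magnitude):
    """Peak frequency divided by the bandwidth measured 10 dB below the peak."""
    magnitude = np.asarray(magnitude, dtype=float)
    db = 20.0 * np.log10(magnitude / np.max(magnitude))
    k = int(np.argmax(db))

    def crossing(step):
        # Walk away from the peak until the next point is at or below -10 dB.
        i = k
        while 0 <= i + step < len(db) and db[i + step] > -10.0:
            i += step
        j = i + step
        if j < 0 or j >= len(db):
            raise ValueError("spectrum does not fall 10 dB below its peak")
        # Linear interpolation (in dB) between the two bracketing points.
        frac = (db[i] + 10.0) / (db[i] - db[j])
        return freqs_hz[i] + frac * (freqs_hz[j] - freqs_hz[i])

    return freqs_hz[k] / (crossing(+1) - crossing(-1))

# Synthetic spectrum: parabolic in dB, 10 dB down at 1000 +/- 250 Hz.
freqs = np.linspace(0.0, 3000.0, 3001)
spectrum_db = -10.0 * ((freqs - 1000.0) / 250.0) ** 2
q = q10db(freqs, 10.0 ** (spectrum_db / 20.0))
print(round(q, 3))
```

For this synthetic spectrum the bandwidth is 500 Hz around a 1-kHz peak, so the ratio is close to 2.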

Plots of phases as a function of frequency (e.g., Fig. 13*C*) often showed a dependency on stimulus intensity. For frequencies below CF, phase lags increased with intensity. Near CF, phases did not change; above CF, phase lags decreased with increases of stimulus intensity. The phase of the kernel obtained using the most intense noise (43 dB SPL/Hz) typically lagged responses to lower-level stimuli at all frequencies.

Figure 14 shows h_{2} FSVs of a representative high-CF ANF computed from responses to noise stimuli presented at several intensities. In the time domain, the h_{2} FSVs shifted to earlier times systematically as a function of increasing stimulus intensity (*solid color traces* in Fig. 14*A*). In the frequency domain, these time shifts corresponded to decreasing slopes of the phase-versus-frequency curves with increasing stimulus level (Fig. 14*C*), concomitant with reductions in near-BF group delay. These changes were typically accompanied by decreases in BF and sharpness of frequency tuning (Fig. 14*B*): e.g., Q_{10dB} (37 dB SPL/Hz) = 4.73; Q_{10dB} (2 dB SPL/Hz) = 6.46.
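The relation between a time shift and the slope of the phase-versus-frequency curve can be checked with a short Python sketch (NumPy only). Group delay is computed here as the negative slope of the unwrapped phase divided by 2π; the Gaussian-windowed tone, the ±200-Hz fitting band, and the helper name `near_bf_group_delay` are assumptions for illustration only.

```python
import numpy as np

def near_bf_group_delay(x, fs_hz, bf_hz, half_band_hz=200.0):
    """Group delay (s) from the slope of the unwrapped phase near BF."""
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs_hz)
    band = np.abs(freqs - bf_hz) <= half_band_hz
    phase = np.unwrap(np.angle(spectrum[band]))     # radians
    slope = np.polyfit(freqs[band], phase, 1)[0]    # radians per Hz
    return -slope / (2.0 * np.pi)

fs = 50_000.0                                       # sampling rate (assumed)
t = np.arange(2000) / fs
t0, sigma, bf = 5e-3, 1e-3, 2_000.0                 # envelope centered at 5 ms
x = np.exp(-0.5 * ((t - t0) / sigma) ** 2) * np.cos(2 * np.pi * bf * (t - t0))
gd = near_bf_group_delay(x, fs, bf)
print(f"group delay: {gd * 1e3:.3f} ms")
```

Shifting the envelope of such a waveform to earlier times reduces the phase slope and hence the estimated group delay, which is the relation invoked for Fig. 14.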

### Timing features of 2nd-order Wiener-kernels

The h_{2} FSVs were well described by a function of BF and three additional parameters (16), where *A* is related to the latency of the h_{2} FSV envelope peak, *B* is related to the width of the h_{2} FSV envelope, and φ is the response phase at BF.

The onset times of the h_{2} FSVs, defined as the times when the (envelope) fit curves first surpassed 5% of their peak amplitudes, are plotted in Fig. 15*A*. For BF <2.7 kHz, the onset times varied roughly as a linear function of log BF. At a BF of 2.7 kHz, there was a discontinuity in the dependency of onset time on BF. For BF >2.7 kHz, the onset times also varied roughly linearly as a function of log BF but with a shallower slope than that for lower BFs. We attribute the 2.7-kHz discontinuity to the inability of the h_{2}s of high-CF ANFs to detect the earliest responses evoked by low-level noise. The response onset corresponds to the (linear) tail components of the basilar-membrane frequency response, which are typically undetectable in basilar-membrane responses to low-level clicks (Recio et al. 1998) or in 1st-order Wiener kernels of basilar-membrane responses to low-level noise (Recio et al. 1997). Presumably, the response onsets are even further buried in the noise in the case of ANF Wiener kernels, which have very restricted (20–30 dB) dynamic range. Therefore the onset times of Fig. 15*A* cannot be equated with “signal-front delay” as defined elsewhere (Goldstein et al. 1971; Ruggero 1980). The discontinuity occurs at a BF around 3 kHz, coinciding with (but not necessarily causally related to) the BF at which the low-frequency flanks of tuning curves undergo a drastic slope transition in chinchilla (Temchin et al. 1997). [Such slope transitions have also been described for ANF tuning curves in gerbil (Ohlemiller and Echteler 1990; Schmiedt 1989) and cat (Liberman 1978).]
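The 5% onset criterion can be written compactly. In the Python sketch below (NumPy only), the gamma-like envelope is an assumed stand-in and is not the fitted function referred to as *Eq. 16*; `onset_time` is a hypothetical helper.

```python
import numpy as np

def onset_time(t, envelope, fraction=0.05):
    """First time at which the envelope exceeds fraction * its peak value."""
    above = np.flatnonzero(envelope > fraction * np.max(envelope))
    return t[above[0]]

t_ms = np.arange(0.0, 10.0, 0.01)
delay_ms, tau_ms = 2.0, 0.4                            # assumed values
shifted = np.clip(t_ms - delay_ms, 0.0, None)
envelope = shifted ** 3 * np.exp(-shifted / tau_ms)    # assumed envelope shape
t_on = onset_time(t_ms, envelope)
print(f"onset: {t_on:.2f} ms")
```

Because the criterion is relative to the peak, any early response component smaller than 5% of the peak is ignored, which is why such onset times can overestimate the true signal-front delay.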

For most ANFs, the value of parameter *A* as a function of BF (expressed in kHz) is well described (*r*^{2} = 0.95; *n* = 237) by the following equation (17). The relationship between parameters *B* and BF can be roughly specified (*r*^{2} = 0.69; *n* = 237) by the following equation (18).

The near-BF group delays of the h_{2} FSVs, computed from phase-versus-frequency curves, are plotted in Fig. 15*B* as a function of BF. The group delays lie along a locus well described (*r*^{2} = 0.95; *n* = 161) by the following equation (19), where BF is expressed in kHz.

Because both the detectable onset latencies and the near-BF group delays varied with stimulus level (e.g., Fig. 14), it is of interest to establish the relative levels of the stimuli that evoked the responses represented in Fig. 15. As indicated above, the lowest level of noise stimulation was chosen to just exceed thresholds. Such stimulus levels evoked a mean discharge rate (h_{0}) exceeding the spontaneous rate by 29.3 ± 21.5 spikes/s on average (*n* = 123). To put that number in perspective, recall that tone thresholds correspond to levels that elicited rates of 20 spikes/s higher than the spontaneous rates. To gain further insight into the levels of the noise stimuli relative to tone thresholds, we roughly estimated the effective SPLs of the noise stimuli on the basis of their spectral levels and the bandwidths of the ANF responses. Equivalent rectangular bandwidths (ERBs) were computed from the magnitude-versus-frequency curves of the h_{2} FSVs (*inset* of Fig. 16). The main part of Fig. 16 indicates the total noise pressures in the ERBs relative to tone CF threshold as a function of h_{2} FSV BF. On average, total noise pressure in the ERBs exceeded CF thresholds by 8.6 ± 9.7 dB (*n* = 122). Such relative levels, combined with the fact that the evoked discharge rates exceeded spontaneous activity by only 29.3 spikes/s, suggest that the data represented in Fig. 15 were obtained at levels that exceeded noise thresholds by not much more than 10 dB.
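The level estimate described above can be sketched as follows. This Python example (NumPy only) assumes the standard definition of the ERB (area under the power spectrum divided by its peak) and assumes that the total level in the ERB equals the spectral level plus 10·log10(ERB); the helper names and all numerical values, including the 25 dB SPL tone threshold, are hypothetical.

```python
import numpy as np

def erb_hz(freqs_hz, magnitude):
    """Equivalent rectangular bandwidth of a magnitude spectrum, in Hz."""
    power = np.asarray(magnitude, dtype=float) ** 2
    # Trapezoidal integration of the power spectrum over frequency.
    area = np.sum(0.5 * (power[1:] + power[:-1]) * np.diff(freqs_hz))
    return area / np.max(power)

def level_in_erb_db(spectral_level_db_per_hz, erb):
    """Total level (dB SPL) of flat-spectrum noise within a band of width erb."""
    return spectral_level_db_per_hz + 10.0 * np.log10(erb)

freqs = np.arange(0.0, 4000.0, 1.0)
sigma = 100.0
magnitude = np.exp(-0.5 * ((freqs - 2000.0) / sigma) ** 2)   # toy tuning curve
erb = erb_hz(freqs, magnitude)

noise_level = level_in_erb_db(10.0, erb)     # 10 dB SPL/Hz is a made-up value
tone_threshold_db = 25.0                     # made-up CF tone threshold
print(f"ERB = {erb:.1f} Hz, level re threshold = {noise_level - tone_threshold_db:.1f} dB")
```

For the Gaussian toy spectrum the ERB has the closed form σ√π, which provides a simple check of the integration.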

### Frequency glides in Wiener kernels

In one important respect, *Eq. 16* does not adequately describe the h_{2} FSVs: their instantaneous frequencies were not constant, but rather increased or decreased monotonically immediately after response onset, depending on BF. These “frequency glides” [first described by Møller (Møller 1977b; Møller and Nilsson 1979) for the revcors of low-BF ANFs] were quantified by Hilbert transformation of the h_{2} FSVs. The magnitudes of the resulting analytic signals give the envelopes of the oscillations, and the time derivatives of their phases give the instantaneous frequencies. Figure 17, *A* and *B*, respectively, show the h_{2} FSVs of representative low- and high-BF ANFs (continuous traces), together with their instantaneous frequencies (dashed lines). The dotted lines indicate regression lines computed over the ranges where the frequency glides were largest (as estimated by visual inspection). Figure 17*C* summarizes the variation as a function of BF of the magnitudes and directions of the frequency glides of h_{1}s and h_{2}s. The frequency glides are expressed in dimensionless units obtained by dividing the regression slope (kHz/ms) by CF^{2} (Shera 2001). Negative values (mostly for BF <900 Hz) correspond to frequency decreases, positive values (mostly for BFs >900 Hz) correspond to frequency increases, and zero (dotted line in Fig. 17*C*) indicates no changes of the instantaneous response frequency. We did not detect a clear frequency change in 90, mostly high-BF, ANFs. This may be explained by the fact that the response onsets of high-BF ANFs were often buried in the noise, as discussed above in the context of Fig. 15*A*. An arbitrary fit line (*r*^{2} = 0.799; *n* = 143) crosses 0 at 900 Hz and has a maximum around 3.5 kHz.
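The glide measurement can be sketched in Python (NumPy only). The example forms the analytic signal with an FFT-based Hilbert transform, differentiates the unwrapped phase to obtain instantaneous frequency, fits a regression line over a chosen time range, and normalizes the slope by CF^{2}. The synthetic chirp, the fitting range, the CF value, and the helper names are assumptions for illustration; in the study the fitting range was chosen by visual inspection.

```python
import numpy as np

def analytic_signal(x):
    """Analytic signal of a real sequence via the FFT (Hilbert transform)."""
    n = len(x)
    spectrum = np.fft.fft(x)
    gain = np.zeros(n)
    gain[0] = 1.0
    if n % 2 == 0:
        gain[n // 2] = 1.0
        gain[1:n // 2] = 2.0
    else:
        gain[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * gain)

def glide_slope(x, fs_hz, t_start_s, t_stop_s):
    """Regression slope of the instantaneous frequency, in kHz/ms."""
    phase = np.unwrap(np.angle(analytic_signal(x)))
    inst_freq_hz = np.diff(phase) * fs_hz / (2.0 * np.pi)
    t_mid = (np.arange(len(x) - 1) + 0.5) / fs_hz
    use = (t_mid >= t_start_s) & (t_mid <= t_stop_s)
    slope_hz_per_s = np.polyfit(t_mid[use], inst_freq_hz[use], 1)[0]
    return slope_hz_per_s * 1e-6          # 1 Hz/s equals 1e-6 kHz/ms

fs = 100_000.0
t = np.arange(1000) / fs                  # 10 ms
f0_hz, k_hz_per_s = 2_000.0, 1.0e5        # upward glide of 0.1 kHz/ms (assumed)
x = np.hanning(len(t)) * np.cos(2 * np.pi * (f0_hz * t + 0.5 * k_hz_per_s * t ** 2))

slope = glide_slope(x, fs, 3e-3, 7e-3)
cf_khz = 2.5                              # assumed CF for the normalization
normalized_glide = slope / cf_khz ** 2    # dimensionless
print(slope, normalized_glide)
```

A positive slope corresponds to an instantaneous frequency that rises with time and a negative slope to one that falls; because kHz/ms has the units of kHz^{2}, dividing by CF^{2} yields a dimensionless quantity.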

## DISCUSSION

### First-order Wiener kernels of ANFs in chinchilla and other mammalian species

The h_{1}s of low-CF chinchilla ANFs resemble the h_{1}s or revcors previously described in other mammalian species in several respects.

*1)* h_{1}s consist of oscillations tuned to CF (chinchilla: Figs. 1*A*–4*A*, 8, 9; cat: Carney and Yin 1988; Carney et al. 1999; de Boer 1967; de Boer and de Jongh 1978; de Boer and Jongkees 1968; Evans 1977; Kim and Young 1994; guinea pig: Evans 1977; Harrison and Evans 1982; gerbil: Lewis et al. 2002; rat: Møller 1977a, b, 1978; Møller and Nilsson 1979).

*2)* h_{1}s exhibit an FM (“frequency glide”) at their onsets: from high to low in low-CF ANFs and from low to high in ANFs with higher CFs (chinchilla: Fig. 17; cat: Carney et al. 1999; Evans 1977; rat: Møller 1977a, 1978; Møller and Nilsson 1979; guinea pig: Cooper 1989).

*3)* h_{1}s become more broadly tuned with increases in stimulus intensity (chinchilla: Fig. 13*B*; cat: Evans 1977; guinea pig: Harrison and Evans 1982; rat: Møller 1977b).

*4)* h_{1}s have onset latencies that generally decrease as a function of CF (chinchilla: Fig. 15*A*; cat: Carney and Yin 1988; Kim and Young 1994). In the present measurements in chinchilla, this trend was interrupted by a discontinuity at about 3 kHz. This discontinuity appeared to be a result of the use of noise levels that were only slightly higher than tip threshold and thus failed to stimulate the tuning-curve tails, responsible for the earliest responses to broadband stimuli (e.g., Recio et al. 1998).

*5)* h_{1}s have near-BF group delays (the negative slopes of the phase-vs.-frequency curves around BF), which become shorter with increases in stimulus intensity (chinchilla: Fig. 13*C*; cat: Carney and Yin 1988).

Additionally, the present results show that the h_{1}s of ANFs with CFs as high as 12 kHz retain significant (albeit weak) timing information at near-CF frequencies (Figs. 10–12; see also Figs. 4*A*, 9, and 14*A*). This finding suggests that residual phase locking exists in the responses to CF tones of high-CF ANFs. This is confirmed in the companion paper (Temchin et al. 2005).

### Second-order Wiener kernels of ANFs and cochlear nucleus neurons in mammalian species

The application of h_{2}s in auditory physiology was pioneered by Wickesberg et al. (1984) in a study of the cochlear nucleus. That study, which found that 2nd-order Wiener-kernel analysis did not predict very well the responses of low-CF neurons, reached the conclusion that “Wiener's … theory has only limited usefulness in the analysis of the peripheral auditory system” (Wickesberg et al. 1984; see also Johnson 1980a). With the benefit of hindsight, it is now clear that such a conclusion was unduly pessimistic, partly reflecting incomplete analysis of the neural data attributed to the use of slow computing hardware (which hampered adequate visualization of the h_{2}s and discouraged testing of adequate filtering and/or imaging schemes) and partly a result of not including high-CF neurons in their sample. Evidence for the first point is presented in Fig. 2, which shows that, if unaided by low-pass filtering and the use of color, perspective views of 2nd-order kernels of low-CF neurons (such as used by Wickesberg et al. 1984) are uninformative.

h_{2}s have been published for gerbil ANFs (Lewis et al. 2002; Yamada et al. 1997). In gerbil and chinchilla, h_{2}s are similar in the following ways:

*1)* they exhibit substantial energy well tuned to CF, regardless of CF (Figs. 1–4, 8, 9, and 14);

*2)* for low-CF ANFs, they consist of positive and negative peaks arranged in checkerboard patterns with periodicity restricted to the region near CF (Figs. 1 and 8); and

*3)* for high-CF ANFs, they consist of ridges and troughs parallel to the diagonal, also with periodicity restricted to the region near CF (Figs. 4 and 9).

Two other findings in chinchilla have not been reported in gerbil.

*1)* The h_{2} FSVs for chinchilla ANFs vary as a function of increasing stimulus intensity: near-CF frequency tuning deteriorates and near-CF group delays decrease (Fig. 14).

*2)* The onsets of h_{2} FSVs for chinchilla ANFs exhibit frequency glides (Fig. 17). We suspect that larger glides (i.e., spanning wider frequency ranges) should be demonstrable in high-BF ANFs using stimuli more intense than those in the present investigation.

Second-order kernels for gerbil and chinchilla may differ in the following respects.

*1)* Low-frequency troughs and ridges parallel to the time axes are apparently common in 2nd-order kernels of low- to mid-CF ANFs in gerbil (Figs. 4 and 7 of Lewis et al. 2002) but counterparts were found only exceptionally in chinchilla (in 2 ANFs with very low CF: see Fig. 8, *top panel*).

*2)* In gerbil low-CF ANFs, FSVs and 2nd-rank vectors had similar weights (see Fig. 5 of Lewis et al. 2002). In contrast, low-CF ANFs of chinchilla yielded FSVs that were severalfold larger than the 2nd-rank singular vectors (Fig. 7*A*).

*3)* In gerbil low-CF ANFs, the weights of the 3rd-rank singular vectors amounted to a large percentage (50–80%) of the weights of the FSVs (Fig. 5 of Lewis et al. 2002). In contrast, the weights of the 3rd-rank vectors almost never exceeded 30% of the weight of the FSVs in chinchilla low-CF ANFs (Fig. 7*A*).

On balance, we suspect that the apparent differences between gerbil and chinchilla do not represent genuine species differences. Rather, the clear negative correlation between the weights of the 3rd- (and higher-) rank singular vectors and the signal-to-noise ratio of the h_{2}s of chinchilla ANFs (Fig. 7*B*) suggests that the h_{2}s reported for gerbil were contaminated by noise. This interpretation implies that separation of h_{2}s into “excitatory” and “inhibitory” subkernels according to the sign of a singular vector, as proposed in Lewis et al. (2002), may not be justified in mammals.

### Timing information in responses of high-CF ANFs and cochlear nucleus neurons studied with paired clicks and tonal complexes

The main advantage of 2nd-order Wiener analysis for the study of high-CF ANFs is its ability to extract high-frequency timing information from even-order cochlear nonlinearities encoded in the (low-frequency) response envelopes even in the absence of phase locking to the high-frequency stimuli. This ability, however, is not exclusive to Wiener-kernel analysis.

Long ago, Møller showed that the magnitude (spike rate) of responses to click pairs of cochlear-nucleus neurons in rats varied as a function of interclick delay with periodicity corresponding to CF even for CFs as high as 30.5 kHz (Møller 1970). In light of the present results, the rate sensitivity to interclick delay in high-CF neurons can be seen as a necessary result of the presence of even-order nonlinear interactions in the spike-generation process. Such nonlinearities result in nonzero h_{2} values outside the diagonal, which may be viewed as “cross talk” between responses to impulses separated in time. In the absence of such nonlinearities, periodicities would be absent from both h_{2}s of responses to noise and rate-versus-interclick delay functions of responses to paired clicks.
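The argument can be illustrated with a toy model in Python (NumPy only): a linear bandpass filter followed by either a squaring stage or no nonlinearity, with the "rate" taken as the summed output. The gammatone-like impulse response, the 10-kHz CF, and the use of summed output as a stand-in for spike rate are all assumptions; the sketch is not a model of the cochlear nucleus neurons studied by Møller.

```python
import numpy as np

fs = 200_000.0                               # sampling rate in Hz (assumed)
cf_hz = 10_000.0                             # filter center frequency (assumed)
t = np.arange(1000) / fs                     # 5-ms impulse response
# Assumed gammatone-like impulse response of the linear bandpass stage.
h = t ** 3 * np.exp(-2 * np.pi * 1_000.0 * t) * np.cos(2 * np.pi * cf_hz * t)
h /= np.max(np.abs(h))

max_delay = 200                              # interclick delays of 0 to 1 ms

def pair_response(delay):
    """Filter output for two unit clicks separated by `delay` samples."""
    y = np.zeros(len(h) + max_delay)
    y[:len(h)] += h
    y[delay:delay + len(h)] += h
    return y

delays = np.arange(max_delay)
# Quadratic (even-order) stage: the summed output depends on interclick delay.
rate_quadratic = np.array([np.sum(pair_response(d) ** 2) for d in delays])
# No nonlinearity: the summed output is the same for every delay.
rate_linear = np.array([np.sum(pair_response(d)) for d in delays])

spectrum = np.abs(np.fft.rfft(rate_quadratic - rate_quadratic.mean()))
freqs = np.fft.rfftfreq(len(delays), d=1.0 / fs)
peak_hz = freqs[np.argmax(spectrum)]
print(f"periodicity of rate vs. delay: about {peak_hz / 1e3:.0f} kHz")
```

With the squaring stage, the summed output contains the cross term between the two click responses, which is the autocorrelation of the impulse response and therefore oscillates at CF as a function of delay; without the nonlinearity the cross term is absent and the delay dependence vanishes.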

Recently, another method to obtain high-frequency timing information from high-CF ANFs was presented by van der Heijden and Joris (2003). The method of van der Heijden and Joris extracts periodicities using tonal complexes instead of white-noise stimuli. Group delays for cat ANFs, computed from the phase-versus-frequency curves of Fig. 4, *B* and *C* of van der Heijden and Joris (2003), are indicated by the filled circles in Fig. 15*B*. [The sum of a 1-ms synaptic/neural delay and 0.225 ms of acoustic delay (i.e., 1.225 ms) has been added to the cat group delays to make them comparable to the chinchilla data.] Generalizing a previous observation that the near-BF group delays of cat and chinchilla are very similar for BFs lower than 2 kHz (see Fig. 2.8 of Ruggero 1992), Fig. 15*B* shows that the similarity of group delays in the 2 species extends to high BFs.

## GRANTS

This work was supported by National Institute on Deafness and Other Communication Disorders Grant DC-00419. P. Van Dijk was supported by the Royal Netherlands Society for Sciences and Arts and the Heinsius Houbolt Foundation.

## Acknowledgments

We thank T. Lewis for helping us to understand singular value decomposition.

## Footnotes

The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked “*advertisement*” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

- Copyright © 2005 by the American Physiological Society