Journal of Neurophysiology

Reverse Cochlear Propagation in the Intact Cochlea of the Gerbil: Evidence for Slow Traveling Waves

Sebastiaan W. F. Meenderink, Marcel van der Heijden

This article has a correction.


The inner ear can produce sounds, but how these otoacoustic emissions back-propagate through the cochlea is currently debated. Two opposing views exist: fast pressure waves in the cochlear fluids and slow traveling waves involving the basilar membrane. Resolving this issue requires measuring the travel times of emissions from their cochlear origin to the ear canal. This is problematic because the exact intracochlear location of emission generation is unknown and because the cochlea is vulnerable to invasive measurements. We employed a multi-tone stimulus optimized to measure reverse travel times. By exploiting the dispersive nature of the cochlea and by combining acoustic measurements in the ear canal with recordings of the cochlear-microphonic potential, we were able to determine the group delay between the intracochlear generation of the emissions and their recording in the ear canal. These delays remained significant after compensating for middle-ear delay. The results contradict the hypothesis that the reverse propagation of emissions is exclusively by direct pressure waves.


It is commonly accepted that acoustic energy entering the cochlea sets up transverse traveling waves that propagate from the base of the cochlea to the location where they evoke the largest response (“best site”). These waves are mediated by the basilar membrane (BM) and the cochlear fluids (von Békésy 1960). Their speed varies with both frequency and cochlear location, but they propagate much slower than longitudinal pressure waves. Owing to the nonlinear mechanical properties of the cochlea, frequency components not present in the stimulus are generated. These components arise within the cochlea and may travel back to the ear canal, where they can be measured as otoacoustic emissions (OAEs) with a sensitive microphone (Kemp 1978).

There is disagreement on the reverse cochlear propagation of OAEs: it is suggested by some that they travel back to the middle ear as fast pressure waves (He et al. 2008; Ren 2004; Ren and Nuttall 2006; Vetesnik et al. 2006), whereas others claim that reverse propagation, like forward propagation, is mediated by a slow traveling wave along the BM (Dong and Olson 2008; Shera and Guinan 1999; Talmadge et al. 1998). Establishing what mechanism is involved in this reverse propagation is important for both practical and theoretical reasons. OAE phase delays have been used to assess sharpness of tuning (Shera et al. 2002), which directly reflects the fundamental ability of the cochlea to segregate spectral components. Such analysis requires a proper understanding of the different contributions of forward and reverse intracochlear propagation. Also most models of cochlear mechanics (de Boer 1996; Talmadge et al. 1998; Zweig and Shera 1995) predict that the BM supports both forward and reverse traveling waves. The absence of the latter would indicate a preferred, inward, directionality of wave propagation that would call for a reformulation or at least a refinement of current models.

Several circumstances impede a straightforward empirical determination of OAE travel times. Without opening the cochlea, it is only possible to measure “round-trip” times, that is, the delay between the stimulus entering the ear and the OAEs leaving it. Additional information is needed to distinguish in such round-trip times the separate contributions of forward and reverse propagation. Regardless of whether this additional information is provided by cochlear models, theoretical considerations or empirical data, the estimates of reverse travel times are hampered by uncertainties about the extent, the number, and exact location of emission-generating sites in the cochlea. In fact, when determining OAE group delays from frequency sweeps, the site(s) of their generation most likely changes with stimulus frequency due to the cochlear tonotopic organization. Such uncertainties about the origin of OAEs render their travel times somewhat poorly defined. Nonetheless, several studies have determined the delay of the stimulus to its best site on the BM and interpreted this as the forward delay (Ruggero 2004; Siegel et al. 2005). The most direct approach to do this is to open the cochlea and measure intracochlear motion. This has additional complications and limitations. The very processes that generate the emissions are vulnerable to trauma, and the measurements are typically restricted to a narrow region at the basal, high-frequency end of the cochlea, where travel delays are <500 μs for all stimulus frequencies (Robles and Ruggero 2001).

This study introduces a novel approach to overcome these problems. Instead of the customary two-tone stimulus to evoke otoacoustic emissions, we employed a stimulus in which the lower tone is replaced by a group of closely spaced tones. This stimulus has two advantages in the determination of reverse cochlear propagation delays. First, the multi-tone character of the stimulus results in a group of emissions that consists of several simultaneous frequency components. This allows the calculation of group delays from a single recording, thus avoiding interrecording variability that may occur when using two-tone frequency sweeps. Second, the multitone stimulus results in an additional group of emissions around the single tone primary (Meenderink and van der Heijden 2009). These OAEs can be evoked using a wide frequency separation (several octaves) between the lower and upper stimulus components. As we will show, this results in a “skewed composition” of the round-trip times of these OAEs: the forward travel time of the stimulus components is much smaller than the reverse time of the OAEs. This makes it easier to test different return delay hypotheses (Ruggero 2004).

In addition to the OAEs in the ear canal sound pressure, we also measured the cochlear-microphonic potential (CM) at the round window. The CM is a collective response reflecting the transduction currents of (predominantly outer) hair cells (Patuzzi et al. 1989). We provide evidence that the distortion products in the CM and in the ear canal sound pressure arise from the same cochlear region. Moreover, the specific stimulus design causes the distortion products in the CM to reflect transduction currents of hair cells near the region of OAE generation. Combining the simultaneous ear canal and CM recordings allowed the dissection of the round-trip travel times of the distortion products into their constituents. We find a significant delay between the intracochlear generation of the emissions and their recording in the ear canal and interpret this to indicate that the reverse cochlear propagation, like forward propagation, is slow.



Recordings were made from adult Mongolian gerbils (Meriones unguiculatus; female, n = 6). Gerbils were anesthetized by intraperitoneal injection of ketamine/xylazine solution (effective dosage: 80 and 12 μg/g body wt, respectively). Supplementary anesthesia was administered subcutaneously as required. A small metal rod attached to the dorsal surface of the skull was used to fix the head of the animal during the experiment. The animal was placed on a heating pad, and its body temperature was maintained at 37°C. On one side, the pinna and cartilaginous ear canal were removed, and a probe was positioned within 4 mm of the tympanic membrane. It contained a 1/2-in pressure-microphone (GRAS 40AG) to record ear-canal sound pressure, and two drivers (TDT CF1) for stimulus delivery. To record CM potentials, the basal turn of the cochlea was exposed by removing the skin, muscles and a small amount of bone overlying the inferior posterior mastoid chamber of the ipsilateral bulla. The recording electrode was a Teflon-coated silver wire terminating in an uninsulated bulb. It was placed within the round window antrum under visual control and referenced to an electrode placed within the muscles at the edge of the surgical field. Animal procedures were in accordance with guidelines provided by the animal committee of the Erasmus MC.

Stimulus generation and data acquisition

The acoustic stimuli consisted of multiple frequency components, each one with a random starting phase. For convenience of description, the components are separated into different constituents: 1) the f1 low-frequency tone complex, 2) the single f2 at higher frequency, and 3) the suppressor tone. The latter was present only in a limited subset of measurements. Nonlinear interaction between any combinations of these stimulus components can give rise to third-order distortion products (DPs). To unambiguously identify the origin (in terms of the evoking stimulus frequencies) of each of these DPs, the stimulus frequencies were chosen such that all possible difference and sum frequencies were unique (van der Heijden and Joris 2003, 2006; Victor 1979). This way of irregularly spacing the components guarantees that all third-order DPs of the type f1a ± f1b ± f2 are also unique, allowing highly accurate estimates of stimulus-to-response group delays of these DPs (van der Heijden and Joris 2005). All stimulus frequencies were commensurate with a fixed number of integer sample points. This allowed for averaging of a single recording by breaking it down into segments of exact periodicity. The periodicity of the stimulus components also ensured that all DPs have an integer number of cycles, thus eliminating any spectral leakage (Papoulis 1962). The f1 tone complex was generated from a separate D/A channel (TDT RP2.1), while the f2 component and suppressor (if present) were generated via a second D/A channel. The output of each channel was fed through a stereo power amplifier (TDT SA1) and broadcast from one of the drivers within the probe. The correct stimulus levels were attained by calibrating the drivers in situ at the level of the tympanic membrane using the microphone housed in the probe. The transfer characteristics of the probe were taken into account. All stimuli were generated at a rate of 48.8 kHz.
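The uniqueness requirement on the stimulus frequencies can be checked mechanically. The following is a minimal sketch in Python, not the software used in this study. It assumes that each frequency is expressed as an integer number of cycles per analysis block, so that every component is exactly periodic within the block. The function names and the frequency values are hypothetical and serve only as illustration. Sum frequencies could be checked in the same way.

```python
from itertools import combinations

def pairwise_differences(bins):
    """Return all positive pairwise differences of integer frequency bins."""
    return [b - a for a, b in combinations(sorted(bins), 2)]

def has_unique_differences(bins):
    """True if no two pairs of components share the same difference frequency."""
    diffs = pairwise_differences(bins)
    return len(diffs) == len(set(diffs))

def dsfoae_bins(f1_bins, f2_bin):
    """Map each DP bin f2 + f1i - f1j (i != j) to the ordered pairs producing it."""
    out = {}
    for fi in f1_bins:
        for fj in f1_bins:
            if fi != fj:
                out.setdefault(f2_bin + fi - fj, []).append((fi, fj))
    return out

# Hypothetical values, in cycles per block (not the frequencies of the study).
regular = [1300, 1310, 1320, 1330]    # equal spacing: difference frequencies collide
irregular = [1300, 1307, 1318, 1333]  # irregular spacing: differences 7, 11, 15, 18, 26, 33
f2 = 6200

sidebands = dsfoae_bins(irregular, f2)
print(has_unique_differences(regular), has_unique_differences(irregular))
print(len(sidebands), "sidebands, each from a single (f1i, f1j) pair:",
      all(len(pairs) == 1 for pairs in sidebands.values()))
```

With irregular spacing, each sideband can be traced back to exactly one ordered pair of f1 components, which is what makes the phase of each DP interpretable.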

Synchronously with stimulus delivery, the ear canal sound pressure and CM potentials were acquired via separate A/D channels, each at a rate of 48.8 kHz (TDT RP2.1). Prior to acquisition, the CM potential was band-pass filtered (0.03–30 kHz) and amplified (20 times, SRS SR560). The microphone signal was also band-pass filtered (0.02–100 kHz, NEXUS 2690). Signals were stored on computer disk for off-line analysis.

The f1 tone complex was always set to low frequencies (1–1.5 kHz). Stimulus levels were chosen as low as possible, excluding any condition that led to measurable system distortion, while still generating sufficient DPs in both the CM potential and the ear canal sound pressure. These constraints led to a range of f2 values of 5–7 kHz. For the low-frequency tone complex, the level of each component was kept between 50 and 65 dB SPL, while the f2 stimulus tone never exceeded 68 dB SPL. Throughout the experiments, CM responses to pure tones between 1 and 12 kHz were used to monitor the health of the cochleae. Only those animals were considered for which these responses changed by <10 dB at any frequency.

Data analysis

After discarding the first periodic block of each recording (to exclude transient phenomena), recorded signals were averaged over the aforementioned periodic blocks. Their magnitude and phase spectra were calculated via Fourier analysis. Noise was estimated from the frequency components that were incommensurate with the stimulus and third-order DPs. Only recordings for which all DPs under scrutiny exceeded the noise floor were included in subsequent analysis. Additional experiments were undertaken in an artificial ear to check for distortion in the hardware. For the stimuli used in this study, these system distortions never exceeded the noise floor. In suppression experiments, the f1 tone complex and f2 tone were fixed, while both the frequency and level of the suppressor tone were varied systematically but presented in random order. The amount of suppression was calculated by comparing these recordings against unsuppressed ones obtained without a suppressor tone. The latter were measured at random intervals during the suppression experiment, and showed little variation (<2 dB) of DP level across recordings.
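The block averaging and spectral estimation described above can be sketched as follows. This is an illustrative Python sketch under simplifying assumptions, not the analysis code of the study: the recording is a plain list of samples, the block length is known, and each component of interest has an integer number of cycles per block. The function names are hypothetical.

```python
import cmath
import math

def average_blocks(signal, block_len, discard_first=True):
    """Average a recording over consecutive blocks of block_len samples."""
    n_blocks = len(signal) // block_len
    start = 1 if discard_first else 0
    used = range(start, n_blocks)
    return [
        sum(signal[b * block_len + k] for b in used) / len(used)
        for k in range(block_len)
    ]

def component(block, cycles):
    """Complex amplitude of the component with `cycles` periods per block."""
    n = len(block)
    acc = sum(x * cmath.exp(-2j * math.pi * cycles * k / n)
              for k, x in enumerate(block))
    return 2 * acc / n  # scaled so that a unit-amplitude cosine has magnitude 1

# Synthetic check: a tone with exactly 5 cycles per 64-sample block, 4 blocks long.
block_len, cycles, phase = 64, 5, 0.7
recording = [math.cos(2 * math.pi * cycles * k / block_len + phase)
             for k in range(4 * block_len)]
avg = average_blocks(recording, block_len)
amp = component(avg, cycles)
leak = component(avg, cycles + 1)  # neighbouring bin, expected to be empty
print(abs(amp), cmath.phase(amp), abs(leak))
```

Because the tone completes an integer number of cycles per block, its energy falls in a single bin and the neighbouring bin stays empty, which is the absence of spectral leakage referred to in the stimulus description.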

For each spectrum, the phase of the recorded frequency components (primaries and DPs) was referenced to the phase of the stimulus components driving the loudspeakers. The reported delays are all group delays, which were calculated by fitting a straight line to the phase versus frequency curves using least squares minimization. Errors in the group delay estimates indicate 95% confidence intervals. The middle ear delay at the DP frequencies was determined by a separate recording using a tone complex centered on f2 as the stimulus. All stimulus generation, data acquisition, and off-line analysis were done via custom software in MATLAB (The MathWorks, Natick, MA).
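The group delay calculation amounts to an ordinary least-squares slope. A minimal Python sketch follows, assuming unwrapped phases expressed in cycles; the function name, the component frequencies, and the 500-μs delay are hypothetical, and the confidence interval computation is omitted.

```python
def group_delay(freqs_hz, phases_cycles):
    """Least-squares slope of phase (cycles) vs. frequency (Hz), returned as a delay in s.

    A pure delay tau gives phase = -f * tau (in cycles), so the delay is minus the slope.
    Phases must already be unwrapped.
    """
    n = len(freqs_hz)
    f_mean = sum(freqs_hz) / n
    p_mean = sum(phases_cycles) / n
    sxx = sum((f - f_mean) ** 2 for f in freqs_hz)
    sxy = sum((f - f_mean) * (p - p_mean)
              for f, p in zip(freqs_hz, phases_cycles))
    return -sxy / sxx

# Synthetic example: hypothetical component frequencies, a 500-us delay, and an
# arbitrary constant phase offset of 0.3 cycles.
freqs = [6050.0, 6120.0, 6185.0, 6260.0, 6330.0]
tau = 500e-6
phases = [0.3 - f * tau for f in freqs]
estimate = group_delay(freqs, phases)
print(round(estimate * 1e6, 3), "us")
```

The constant offset shifts the whole phase curve but leaves the slope, and hence the group delay, unchanged.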


A customary two-tone stimulus with frequencies f1 and f2 (f2 > f1) gives rise to two well-known third-order DPs: the lower cubic distortion tone (CDT) at 2f1 − f2 and the upper CDT at 2f2 − f1. Replacing the lower tone f1 by a group of N closely spaced tones (f11, f12, …, f1N) turns both of these CDTs into tone complexes themselves (Fig. 1A) (see also Meenderink and van der Heijden 2009). In addition to this spectral line splitting, a group of third-order DPs arises near the f2 tone (Fig. 1A) at frequencies f2 + f1i − f1j with i, j = 1, …, N. The emergence of this DP group is closely linked to stimulus-frequency OAEs (SFOAEs) (Kemp 1980) in the following way (Neely et al. 2005). The SFOAE at frequency f2 is partially suppressed by the simultaneous presence of the f1 complex. The fluctuating amplitude of the f1 complex causes the amount of suppression to fluctuate as well. The resultant modulation of the SFOAE at f2 is observed as sidebands around f2 at frequencies f2 + f1i − f1j, constituting this group of third-order DPs. We have named this DP group dynamic SFOAEs or dSFOAEs.

Fig. 1.

Power spectra illustrating the use of multitone stimuli in the generation of distortion products (DPs). A: recording in the ear canal showing 3 groups of DPs: 2f1 − f2; 2f2 − f1 and the novel group of dynamic stimulus-frequency otoacoustic emissions (dSFOAEs). They exceed the noise floor (—) and the separately recorded system distortion (○), indicating their biological origin. Upper stimulus tone: f2 = 6.6 kHz; lower components: 7-tone complex around 5 kHz. B: choosing f1 ≪ f2 does not affect the generation and frequencies of the dSFOAEs. C: simultaneous recording of the cochlear-microphonic potentials (CM) shows DPs at the same dSFOAE-frequencies.

An important and useful property of the dSFOAEs is that their frequencies only depend on the relative frequencies (i.e., frequency differences) within the f1 complex; a collective frequency shift of the f1 tone-complex does not change the dSFOAE frequencies. As a consequence, and in contrast to the other third-order DPs, they can readily be evoked while having a large frequency separation between f1 and f2 (Fig. 1, B and C). Such a large frequency separation between the stimulus tones simplifies the analysis of cochlear travel times. This can be seen by considering that the f2-evoked SFOAE originates from a region close to the f2 best site (Shera and Guinan 1999). Because f1 is more than two octaves below f2, the f1 components reach this region very fast (Rhode 2007); only when approaching their own best region, much more apical than the f2 site, will they slow down. This means that the fluctuating f1 stimulus, which modulates the SFOAE to create the dSFOAEs, reaches the generation site with minimal forward delay. Thus should significant round-trip delays be found, these must be attributed to the reverse travel of the dSFOAEs. The only other contributions to the ear canal to ear canal round-trip delays are middle ear delays, which are addressed later on.
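That the dSFOAE frequencies depend only on the spacing within the f1 complex follows directly from the sideband formula f2 + f1i − f1j. A short Python sketch illustrates this; the function name and the frequency values are hypothetical.

```python
def dsfoae_frequencies(f1_complex, f2):
    """Sorted set of sideband frequencies f2 + f1i - f1j for i != j."""
    return sorted({f2 + fi - fj
                   for fi in f1_complex for fj in f1_complex if fi != fj})

# Hypothetical integer frequencies in Hz (illustrative values only).
f2 = 6600
near = [5000, 5040, 5110]       # f1 complex close to f2
far = [f - 3700 for f in near]  # same spacing, shifted down to about 1.3 kHz

same_sidebands = dsfoae_frequencies(near, f2) == dsfoae_frequencies(far, f2)
print(dsfoae_frequencies(far, f2), same_sidebands)
```

A collective shift adds the same amount to f1i and f1j, so it cancels in the difference. In contrast, the cubic distortion tones at 2f1 − f2 and 2f2 − f1 move with f1.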

Note that the phase of the high-frequency (f2) primary has no effect on the dSFOAE group delay and is ignored throughout this study. The forward propagation of the f2 primary only affects the absolute phase of the entire dSFOAE-group but not the slope of their phase versus frequency curve (i.e., the group delay). Similarly, the exact type of modulation that generates the dSFOAEs (amplitude-, phase-, or a combination of these) only affects the absolute phases of the dSFOAEs not the phase gradient.

The analysis of travel times is illustrated in Fig. 3A. The ear canal to ear canal round-trip consists of a forward travel of the f1 stimulus to the region of dSFOAE generation, followed by a reverse travel of the resulting dSFOAEs back to the ear canal. Both forward and reverse paths consist of a middle ear part and a cochlear part.

The total round-trip group delay for a stimulus with f1 centered at 1.3 kHz and f2 = 6.2 kHz was determined from the phase spectrum of the dSFOAEs (Fig. 3B, ●). Importantly, the four tones of the f1 complex were irregularly spaced in such a way that each dSFOAE component in the response is associated with a unique pair of f1 components in the stimulus (see methods). This mathematical technique (van der Heijden and Joris 2003; Victor 1979) allows a highly accurate assessment of the group delay between the f1 components in the stimulus and the dSFOAEs (van der Heijden and Joris 2005). Linear regression of the phase data yielded an estimate of 722 ± 54 μs for the round-trip group delay. Given the particular stimulus design, the major contributor to this delay is expected to be reverse propagation of the dSFOAEs. To validate this interpretation—and determine whether reverse propagation is fast or slow—we need to evaluate the various contributions to the total round-trip delay.

The cochlear region of dSFOAE-generation was determined using a suppression paradigm. An extra tone (suppressor) was added to the stimulus and its effect on the dSFOAE magnitude was assessed as a function of suppressor frequency and amplitude (Brown and Kemp 1984). Suppression of the dSFOAE was found to be maximal for suppressor frequencies above the f2 tone by a factor of ∼1.4 (Fig. 2B), indicating that the dSFOAEs arise from a cochlear location that is slightly basal of the f2 best site. Given the large frequency separation between f1 and f2, this observation confirms that the low-frequency f1 complex reaches the cochlear region where the dSFOAE is generated with minimal cochlear travel delay. Furthermore, it helps the interpretation of the reverse travel time of the dSFOAEs under the assumption of slow wave propagation (see discussion).

Fig. 2.

Suppression of the dSFOAEs by the addition of an extra tone to the stimulus. A: cartoon illustrating the alleged region of origin of the dSFOAEs and their electrical counterparts, the dSF-CMs. The dSFOAEs are assumed to be produced in the shaded region, basal to the f2 region. The “lightning bolts” in this region symbolize dSFOAE-generators. B: effect of a suppressor tone of varying levels and frequencies on the summed power of the dSFOAEs (unsuppressed = −1.1 dB SPL). The changes are color-coded. C: suppression of the simultaneously recorded dSF-CMs (unsuppressed = −7.2 dB μV). D: iso-suppression curves extracted from B and C using a 6-dB DP reduction as criterion. Stimulus parameters: 3 f1 components around 1.3 kHz; f2 = 6.8 kHz (indicated by - - - and ▼); the level of each stimulus component was adjusted to give 30 dB μV in the CM response.

So far, the analysis was restricted to dSFOAEs recorded in the ear canal sound pressure. We simultaneously recorded the cochlear-microphonic potential (CM), which shows similar DPs (Fig. 1C). To analyze the ear canal to ear canal delays using these latter recordings, it is necessary to establish that these particular DPs in the CM result from nonlinearities in the same cochlear region as do the dSFOAEs recorded in the ear canal. The paired recordings in the suppression paradigm show that they do: tuning of suppression is similar in CM and in the ear canal sound pressure (Fig. 2, B and C), with the latter being slightly more biased toward higher suppressor frequencies (D). The similarity of tuning strongly suggests that the DPs, recorded in the CM and in the ear canal sound pressure, originate from identical cochlear sources. To indicate the DPs in the CM, and to acknowledge their similarity with the dSFOAEs in the ear canal, we will refer to them as dynamic SF-CM or dSF-CM.

To extract cochlear travel times from the total round-trip time, one needs to know the delays imposed by the middle ear. The middle ear delay of the f1 stimulus from the ear canal into the cochlear base was estimated by comparing the phases of the f1 components between the acoustic signal, recorded in the ear canal and the simultaneous CM recording (Fig. 3B, ◀). This yielded an estimated group delay of 166 ± 54 μs. The interpretation of this delay as middle ear contribution is based on the premise that linear components in the CM recorded at the round window originate from basal hair cells (Dallos 1971; Patuzzi et al. 1989).

Fig. 3.

Analysis of various travel delays from phase data. A: cartoon illustrating the total ear canal to ear canal round-trip (black line), and its decomposition into forward and reverse paths in the cochlea and through the middle ear. Different symbols at the end of each path refer to symbols in B. B: phase data and group delays for these paths. ●, ear canal (EC) to ear canal (f1 to dSFOAE); ◀, f1, ear canal to CM; ◆, f2, ear canal to CM; ★, CM-f1 to dSF-CM; □, dSF-CM to dSFOAE. Stimulus parameters: 4 f1 components around 1.3 kHz; each component at 55 dB SPL; f2 = 6.2 kHz; L2 = 60 dB SPL.

The second contribution of middle ear transfer to the total delay is that of the dSFOAEs on their way back to the ear canal. Importantly, the previous estimate of middle ear delay of the f1 components cannot be used for this contribution because the dSFOAE frequencies (∼6.2 kHz) are much higher than f1 (∼1.3 kHz) and because middle ear delays may be frequency dependent. Additional recordings were made using a stimulus consisting of a narrow-band tone complex around 6.2 kHz. This yielded an estimated middle ear group delay at f2 of 73 ± 22 μs (Fig. 3B, ◆). After subtracting the two middle ear contributions, the base-to-base, intracochlear round-trip travel time amounts to 483 μs.

To further dissect the intracochlear round trip, the CM counterpart of the dSFOAEs was subjected to further analysis. From Fig. 2 it is clear that these dSF-CMs result from the same mechanical vibration as the dSFOAEs. This does not, however, tell us where the transduction of this mechanical vibration into a CM signal takes place. After generation, the dSFOAEs travel back to the ear canal, and in principle all hair cells along this reverse path (from the generation site to the stapes) may be involved in their transduction into dSF-CMs. Figure 4 illustrates the two extreme cases: this transduction is either by hair cells at the generation site of the dSFOAEs (A) or by hair cells near the round window (B). In Fig. 4C, reconstructed time signals for the dSF-CMs and dSFOAEs are shown. These were obtained from the recorded signals by applying a comb filter that only kept the frequencies of the emission components. A cross-correlation between these signals revealed that the dSFOAEs are delayed relative to the dSF-CMs by 527 μs. Such a long delay between the signals was confirmed by a phase analysis (Fig. 3B, □) yielding a group delay of 472 ± 59 μs, a value large enough to exclude the possibility that dSF-CMs originate from hair cells in the extreme base. The 472-μs delay is in fact more consistent with the dSF-CMs originating close to the region of intracochlear dSFOAE generation, slightly basal of the f2 best site. After compensating for the 73-μs middle ear contribution, the lower boundary for the travel time of reverse cochlear propagation of the dSFOAEs amounts to 399 μs. This lower boundary corresponds to the situation in which the dSF-CM delay comes from the forward propagation of the f1 components to the region of dSFOAE generation; it includes no delay from reverse propagation of the mechanical dSFOAE vibration within the cochlea (Fig. 4A).
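The comb filtering and cross-correlation step can be illustrated with synthetic signals. The Python sketch below is an assumption-laden illustration rather than the analysis used here: the comb filter is represented by synthesizing a waveform from a chosen set of components only, the cross-correlation is circular, and all component values and the 6-sample delay are hypothetical. Only the 48.8-kHz sampling rate is taken from the methods.

```python
import math

def synthesize(components, n):
    """Waveform built only from the listed components (the comb-filtered signal).

    components maps cycles-per-block -> (amplitude, phase in radians).
    """
    return [
        sum(a * math.cos(2 * math.pi * c * k / n + ph)
            for c, (a, ph) in components.items())
        for k in range(n)
    ]

def delayed(components, lag, n):
    """The same components after a pure delay of `lag` samples."""
    return {c: (a, ph - 2 * math.pi * c * lag / n)
            for c, (a, ph) in components.items()}

def best_lag(x, y):
    """Lag in samples maximizing the circular cross-correlation; positive: y lags x."""
    n = len(x)

    def corr(lag):
        return sum(x[k] * y[(k + lag) % n] for k in range(n))

    return max(range(-n // 2, n // 2), key=corr)

n = 512
cm_like = {50: (1.0, 0.2), 53: (0.8, 1.1), 57: (0.6, -0.4), 62: (0.7, 2.0)}
x = synthesize(cm_like, n)                  # stands in for the dSF-CM waveform
y = synthesize(delayed(cm_like, 6, n), n)   # stands in for the dSFOAE waveform
lag = best_lag(x, y)
fs = 48.8e3                                 # sampling rate from the methods, in Hz
print(lag, "samples =", round(lag / fs * 1e6, 1), "us")
```

For a narrow group of components, the correlation function oscillates at roughly the carrier period, so the global maximum rather than a neighbouring peak must be taken. This is one reason to cross-check the result against the slope of the phase data.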

Fig. 4.

The potential location of transduction underlying the dSF-CMs. A: dSF-CMs may be generated by hair cells near the f2 region, close to where the dSFOAEs are generated. B: dSF-CMs may be generated by hair cells at the base that are excited by the reverse dSFOAE wave just before it leaves the cochlea. C: time signals for the dSF-CMs and dSFOAEs reveal a large group delay (527 μs) between the two. The time signals were obtained from the recordings by applying a comb filter that only kept the frequencies of the emission components. The large delay is evidence in favor of the situation displayed in A.

The final phase analysis concerns the delay between the f1 components in the CM and the occurrence of dSF-CMs (Fig. 3B, ★). This yielded a group delay of 84 ± 14 μs. This delay includes the forward propagation of the f1 components from the base of the cochlea toward the region where the dSFOAEs originate, and additional delay from intracochlear dSFOAE propagation from this region to the hair cells that transduce them into dSF-CMs. The measured 84 μs is therefore an upper boundary of the intracochlear forward travel time. Reports of cochlear-mechanical data (Ren and Nuttall 2001; Ruggero et al. 1997) indicate group delays of 65–80 μs for low-frequency components to travel from the base to the mid-frequency (9–12 kHz) turns. Thus the 84-μs group delay is likely to consist primarily of cochlear forward delay. This also supports the interpretation of dSF-CMs as the earliest possible manifestation of the DPs and confirms that the round-trip delays are dominated by the reverse delay.

This completes the dissection of the round-trip into its various constituents. The results displayed in Fig. 3 are summarized as follows. The total ear canal to ear canal round-trip (722 μs; 4.4 cycles) consists of a forward f1 stimulus delay through the middle ear (166 μs; 0.2 cycles), and the cochlea (at most 84 μs; 0.5 cycles); a reverse cochlear delay (≥399 μs; 2.5 cycles); and a reverse middle ear delay (73 μs; 0.4 cycles). The same analysis was applied to 25 recordings from three gerbils using f2 values of 5–7 kHz (see methods). This yielded an average reverse cochlear travel time of 376 μs with a SD of 129 μs.
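The arithmetic behind this delay budget can be written out explicitly. The Python sketch below uses only the group delays reported above for the example recording; the variable names are hypothetical, and the conversion to cycles assumes the nominal frequency of 6.2 kHz.

```python
# Group delays in microseconds, as reported in the text for the example recording.
round_trip = 722.0        # ear canal to ear canal (f1 to dSFOAE)
me_forward = 166.0        # middle ear, f1 complex (about 1.3 kHz)
me_reverse = 73.0         # middle ear, narrow band around f2 (6.2 kHz)
cm_to_oae = 472.0         # dSF-CM to dSFOAE
fwd_cochlear_max = 84.0   # CM-f1 to dSF-CM, upper bound on forward cochlear travel

# Base-to-base intracochlear round trip: remove both middle ear contributions.
intracochlear_round_trip = round_trip - me_forward - me_reverse

# Lower bound on reverse cochlear travel: remove the reverse middle ear delay.
reverse_cochlear_min = cm_to_oae - me_reverse

def cycles(delay_us, freq_hz):
    """Express a delay as a number of periods at the given frequency."""
    return delay_us * 1e-6 * freq_hz

total = me_forward + fwd_cochlear_max + reverse_cochlear_min + me_reverse
print(intracochlear_round_trip, reverse_cochlear_min, total)
print(round(cycles(reverse_cochlear_min, 6200.0), 1), "cycles at 6.2 kHz")
```

The four constituents (166 + 84 + 399 + 73 μs) add up to the measured 722-μs round trip, so the decomposition is internally consistent for this recording.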


We determined cochlear reverse delays by using a multitone stimulus paradigm and by combining acoustic recordings in the ear canal with simultaneous recordings of the CM at the round window. Our main finding is the occurrence of long delays between the intracochlear generation of the emissions and their recording in the ear canal sound pressure.

Middle ear delays

The combined CM and ear-canal sound pressure recordings also allowed us to calculate the group delays associated with forward and reverse propagation in the middle ear (166 ± 54 and 73 ± 22 μs, respectively). These different delays can be attributed to the large frequency difference (>2 octaves) between the forward propagating f1 tone complex, and the reverse propagation of the dSFOAEs. Gerbil middle ear delays have been reported to be relatively constant over a broad frequency range [25 μs (Olson 1998); 30 μs (Overstreet and Ruggero 2002); 32–38 μs (Dong and Olson 2006); ∼110 μs (Ren et al. 2006); and 20–29 μs (Ravicz et al. 2008)], but these wide-band estimates are not valid in the low-frequency region (< ∼1.5 kHz). We calculated low-frequency middle ear delays from Fig. 6a in Ravicz et al. 2008 and from Fig. 4 in Rosowski et al. 1999. This yielded delays of 120–180 μs (0.2–2 kHz) and 120 μs (1–3 kHz), respectively, which are similar to the delay we found for the f1 tone complex.

Our estimate of forward middle ear delay for mid-frequency sounds exceeds most values reported in the literature. Most likely, this results from the different bandwidths used in the calculation of the group delays. We calculated the forward middle ear delays over the exact, narrow, frequency band spanned by the dSFOAEs (i.e., 528 Hz around 6 kHz). In contrast, the delays reported in the literature were calculated over much larger frequency ranges (several octaves). In effect, these values represent averages, smoothing out any fine structure in the phase versus frequency curves. When calculating middle ear delays from our own data over an extended frequency range (f = 4–11 kHz), we found an average group delay for middle ear transmission of 44 ± 6 μs (L = 75 dB SPL). Had we used these wideband estimates for the middle ear delays, the calculated group delays for reverse propagation would have been even larger, strengthening the rejection of instantaneous intracochlear back-propagation.

Origin of dSF-CMs

The temporal comparison between dSF-CMs and dSFOAEs (Figs. 3B and 4C) provided a direct measurement of reverse propagation time, circumventing uncertainties about cochlear mechanics (e.g., the extent, location, and number of sites of dSFOAE origin). This direct measurement was possible because the dSF-CMs 1) are transduced by hair cells that are proximate to the region from which the dSFOAEs originate and from there 2) propagate nearly instantaneously to the CM recording electrode. The transduction site contrasts with CM components evoked by pure tones, which are dominated by hair cells in the basal turn (Patuzzi et al. 1989). It also contrasts with 2f1 − f2 DPs, for which it has been suggested that their transduction by basal hair cells dominates in the CM response (Brown and Kemp 1985). The different origin of the dSF-CMs in the present study may be explained by considering the markedly different stimulus design (viz., the large frequency separation between the f1 tone complex and f2) employed here. This is illustrated in Fig. 5. The dSFOAEs originate from adjacent “generators” along the BM, where two components f1a and f1b together interact with f2 to generate a contribution to the (f2 + f1a − f1b) component having a phase of (ϕ2 + ϕ1a − ϕ1b). The propagation of each f1 component is fast in the region of dSFOAE generation, implying that ϕ1a − ϕ1b ≈ 0. The resulting phase profile of the distributed contributions will therefore closely follow that of the inward traveling f2 component (compare the lowest 2 curves in Fig. 5). Thus the dSFOAE components themselves have a strong directional preference for inward propagation, similar to the f2 component. This beam forming effect (Shera and Guinan 2008) will cause a bias of the propagating DPs toward the apical part of the cochlea, thereby reducing the relative contribution of basal hair cells to the dSF-CMs.
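The directional preference can be illustrated with a deliberately idealized toy calculation in Python. The sketch assumes equal-strength generators spread uniformly over the generation region, a source phase that follows an inward traveling wave with a constant wavenumber k, and wavelets that propagate with that same wavenumber in either direction. None of these assumptions, nor the function name or the numbers, come from the measurements reported here.

```python
import cmath
import math

def summed_amplitude(positions, k, direction):
    """Magnitude of the sum of wavelets from generators with source phase -k*x.

    direction=+1: wavelets travel inward to a point at the last generator position.
    direction=-1: wavelets travel outward to x = 0 (the base).
    Each wavelet accumulates a phase of -k times the distance travelled.
    """
    x_end = max(positions)
    total = 0j
    for x in positions:
        distance = (x_end - x) if direction > 0 else x
        total += cmath.exp(1j * (-k * x - k * distance))
    return abs(total)

# Hypothetical region: 50 generators spread over 2 wavelengths (arbitrary units).
wavelength = 1.0
k = 2 * math.pi / wavelength
positions = [2 * wavelength * i / 50 for i in range(50)]
inward = summed_amplitude(positions, k, +1)
outward = summed_amplitude(positions, k, -1)
print(inward, outward)
```

In the inward direction, all wavelets arrive with the same phase and add coherently. In the outward direction, the phase varies as −2kx across the generators and the contributions largely cancel. The complete cancellation obtained here is a consequence of the idealization (uniform generators spanning an integer number of wavelengths).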

Fig. 5.

Schematic diagram illustrating expected strong inward directionality of dSFOAEs. Bottom: an extended view of the cochlear region of dSFOAE-generation. The phases, ϕ1a, ϕ1b, and ϕ2, of 3 interacting stimulus tones, are shown as a function of cochlear location. From these primary phases, the phase of the dSFOAE (ϕdp) is derived (bottom curve). The dSFOAE arises over a region just basal of the f2 best site. Because f1 ≪ f2, the propagation of each f1 component is fast in that region, and the phase vs. cochlear position of each component is expected to be identical. Thus ϕ1a − ϕ1b ≈ 0. Because the phase of the dSFOAE is given by: ϕdp = ϕ2 + ϕ1a − ϕ1b, the resulting phase profile of the distributed contributions will closely follow that of the inward traveling f2 component. This steep phase profile of the multiple, adjacent dSFOAE-generators causes beamforming, and results in the strong preference for inward propagation of the dSFOAE.

Interpretation of reverse cochlear delays

Several hypotheses related to the expected reverse delays have been formulated (see Ruggero 2004) that describe two opposing views: otoacoustic emissions suffer a significant delay during reverse intracochlear propagation or this delay is absent or negligibly small. In the former view, retrograde propagation involves transverse traveling waves along the basilar membrane. In the latter view, reverse propagation is via longitudinal compression waves within the cochlear fluids. This “zero reverse delay” hypothesis has found renewed support by recent experiments (He et al. 2008; Ren 2004).

Our methods enabled the decomposition of round-trip delays into their separate constituents. In this way, we calculated delays of 376 ± 129 μs (n = 25) for the reverse intracochlear propagation. Such delays are considerably longer than expected from direct reverse propagation via longitudinal pressure waves (He et al. 2008; Ren et al. 2004, 2006; Siegel et al. 2005). The most straightforward interpretation of this long delay is in terms of a slow reverse traveling wave. However, it cannot be excluded that part of the delay originates from forward propagation of the dSFOAEs from their place of generation to their own best site before being reflected. In either case, the long delay excludes the possibility of a negligible delay between generation and emission of these otoacoustic emissions from the cochlea.
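The arithmetic behind such a decomposition can be sketched as follows. This Python example is illustrative only: it assumes that group delay is estimated as the negative slope of phase (in cycles) versus frequency (in Hz), and that the reverse intracochlear delay is the dSFOAE delay minus the dSF-CM delay minus a middle-ear delay. The phase data are synthetic, and the delay values (1,500 μs, 1,050 μs, and 50 μs) are hypothetical placeholders, not measurements from this study.

```python
def group_delay_us(freqs_hz, phases_cycles):
    """Group delay in microseconds from a least-squares fit of phase vs. frequency.

    The delay is the negative slope of phase (cycles) against frequency (Hz).
    """
    n = len(freqs_hz)
    mean_f = sum(freqs_hz) / n
    mean_p = sum(phases_cycles) / n
    sxy = sum((f - mean_f) * (p - mean_p) for f, p in zip(freqs_hz, phases_cycles))
    sxx = sum((f - mean_f) ** 2 for f in freqs_hz)
    return -(sxy / sxx) * 1e6

def synthetic_phase(freqs_hz, delay_us):
    """Unwrapped phase (cycles) of a pure delay; used here only to create test data."""
    return [-f * delay_us * 1e-6 for f in freqs_hz]

freqs = [5800.0 + 50.0 * i for i in range(9)]    # hypothetical frequencies near 6 kHz
oae_delay = group_delay_us(freqs, synthetic_phase(freqs, 1500.0))  # placeholder value
cm_delay = group_delay_us(freqs, synthetic_phase(freqs, 1050.0))   # placeholder value
MIDDLE_EAR_DELAY_US = 50.0                       # placeholder, not a measured value

reverse_delay = oae_delay - cm_delay - MIDDLE_EAR_DELAY_US
print(round(reverse_delay, 1))
```

With real recordings the phase would first have to be unwrapped, and the fit would be restricted to frequencies with an adequate signal-to-noise ratio.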

Several previous studies on reverse OAE propagation have compared OAE round-trip delays with forward delays of the evoking stimulus tones to the surmised location of OAE generation. Forward delays were derived from measurements of basilar membrane vibrations (e.g., Cooper and Rhode 1992; Overstreet et al. 2002) or auditory nerve responses (Shera et al. 2008; Siegel et al. 2005), and reverse delays were calculated by subtracting the delays thus obtained from OAE round-trip delays. As an alternative to this comparison of forward and round-trip delays, DPs have also been recorded in the intracochlear fluid pressure (Dong and Olson 2008) or along a section of BM (He et al. 2008; Ren 2004), and were then compared with the DPOAEs in the ear-canal sound pressure.

The data from these studies have not resulted in a consensus about the mechanism for reverse intracochlear propagation of otoacoustic emissions. For instance, Siegel et al. (2005) calculated forward delays using Wiener-kernel analysis of auditory nerve fiber responses to noise and compared these delays with SFOAE group delays. Although the data did not allow them to distinguish between fast reverse traveling waves (“signal front hypothesis” in Ruggero 2004) and the zero reverse delay hypothesis, they appear to exclude slow reverse traveling waves. The same data were re-analyzed by Shera et al. (2008) using a cochlear model that supports slow, reverse traveling waves. They found that the ratios of SFOAE delays and the auditory nerve fiber delays are in close agreement with the model predictions, indicating the dominant role of slow, reverse traveling waves. Dong and Olson (2008) recorded DPOAEs in the ear canal, and their equivalent DPs in the intracochlear fluid pressure near the base of the cochlea. These data, and their agreement with the dominant role of slow reverse propagation, show a marked dependence on the (absolute and relative) frequencies of the stimulus tones and their relation to the location of the intracochlear recording probe. Interestingly, particular stimulus configurations resulted in DP phase versus frequency curves that were shallow or even sloped upward; these curves are almost mirror images of the forward traveling wave phase as if the DPOAE in the ear canal preceded the intracochlear DP. Exactly these types of phase versus frequency curves formed the basis for the zero delay hypothesis (Ren 2004).

In recent years, the discussion on reverse propagation seems to have focused on the specific prediction of cochlear models based on the theory of coherent reflection filtering (Shera and Guinan 1999; Zweig and Shera 1995) that reverse propagation delay equals the forward propagation delay. Although potentially useful, testing this prediction has proven to be difficult. First of all, the theory applies to reflection-source otoacoustic emissions, such as SFOAEs; interpreting DPOAEs within this theoretical framework is problematic. Also, testing the predicted ratio of forward to round-trip delays requires measurements at the cochlear location of emission generation. The generation, however, occurs over an extended cochlear region, making it difficult to define “the location of OAE generation.” Another complicating factor is that delays derived from BM measurements do not necessarily correspond to the forward delay as defined by the theory of coherent reflection filtering (Shera et al. 2008).

Cochlear location of OAE generation

Both the advocates and opponents of slow reverse waves seem to agree that the round-trip (ear canal to ear canal) delays are not exactly equal to twice the forward delay of the primaries to their best site. Our data also reflect this: the reverse delays are smaller than the forward travel of the f2 tone to its best site. For example, a 6-kHz tone takes ∼900 μs to reach its best site (Rhode 2007), whereas 6-kHz dSFOAEs take ∼400 μs for reverse cochlear propagation.

A possible explanation for the apparent discrepancy between forward and round-trip delays lies in the presumed location of OAE generation (Ren et al. 2006; Siegel et al. 2005; Zhang and Mountain 2009). If OAEs are generated basal to the f2 best site, travel times will be appreciably shorter because travel speed is lowest near the best site. The observation of maximum suppression at 1.4 × f2 (Fig. 2) also indicates that most of the dSFOAEs are generated basal to the f2 best site, provided that low stimulus levels are employed (see also Brass and Kemp 1993 for a similar observation on SFOAEs).
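The effect of a more basal generation site on travel time can be illustrated with a toy calculation. The Python sketch below assumes a hypothetical wave speed that falls off exponentially with distance from the base; the speed at the base, the length constant, and the two cochlear positions are invented for illustration and are not derived from gerbil data. Travel time is obtained by integrating the reciprocal of the speed along the path.

```python
import math

def travel_time_us(x_end_mm, v0_mm_per_us=0.1, length_const_mm=1.0, steps=10000):
    """Midpoint-rule integral of dx / v(x) from the base (x = 0) to x_end_mm.

    Assumption: v(x) = v0 * exp(-x / length_const), a purely hypothetical profile
    in which the wave slows down progressively toward its best site.
    """
    dx = x_end_mm / steps
    total = 0.0
    for i in range(steps):
        x_mid = (i + 0.5) * dx
        speed = v0_mm_per_us * math.exp(-x_mid / length_const_mm)
        total += dx / speed
    return total

best_site_mm = 4.0        # hypothetical position of the f2 best site
generation_site_mm = 3.5  # hypothetical generation site, 0.5 mm more basal

t_best = travel_time_us(best_site_mm)
t_generation = travel_time_us(generation_site_mm)
print(t_best, t_generation)
```

Because the wave is slowest over the last stretch before the best site, moving the generation site only slightly toward the base removes a disproportionately large share of the travel time in this toy profile.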

The hypothesis that OAEs are generated at a location somewhat basal to the best region of the high-frequency stimulus tone is attractive for another reason. Intracochlear measurements (He et al. 2008; Ren et al. 2004, 2006), which were aimed at directly measuring reverse OAEs and which led to the hypothesis of reverse pressure waves, might in fact have probed the very region where the OAEs are generated (see also Dong and Olson 2008). In this region, inward waves coexist with reverse waves; the former may well obscure the latter. The hypothesis of more basal OAE generation reconciles seemingly contradicting observations on reverse propagation in the cochlea and offers an explanation why reverse propagation takes less time than forward propagation of the stimulus to its best site.

Concluding statement

In conclusion, the present study illustrates that many of the difficulties of determining reverse travel times can be overcome by the use of an innovative stimulus paradigm in combination with paired recordings. This approach enabled us to measure travel times of reverse inner ear propagation without opening the cochlea or making untested assumptions on OAE generation. The results point to slow traveling waves as the mechanism for reverse propagation within the mammalian inner ear, although alternative, more circuitous intracochlear paths cannot be excluded.


This work was supported by NWO-VENI Grant 863.08.003 to S.W.F. Meenderink.


We thank Dr. David Kemp, Dr. Chris Shera, and one anonymous referee for helpful comments on an earlier version of this manuscript.

