Department of Physiology and University College London Centre for Auditory Research, London WC1E 6BT, United Kingdom
Submitted 12 January 2004; accepted in final form 30 April 2004
| ABSTRACT |
|---|
|
|
|---|
| INTRODUCTION |
|---|
|
|
|---|
Of the two theories, the temporal theory of pitch has gained most credence, particularly over the past decade, in part due to its ability to explain a greater range of observed pitch phenomena. Mechanisms that account for temporal pitch have been the subject of numerous psychophysical (Krumbholz et al. 2000; Plack and White 2000; Pressnitzer et al. 2001; Shackleton and Carlyon 1994; Wiegrebe et al. 1998) and modeling (de Cheveigne 1998; Meddis and O'Mard 1997) studies, and electrophysiological investigations in experimental animals (Biebel and Langner 2002; Cariani and Delgutte 1996a, b; Wiegrebe and Winter 2001). A number of models posit mechanisms that could account for the processing of temporal pitch cues, in particular the implementation of auto-correlation on the stimulus waveform (de Cheveigne 1998; Licklider 1951). However, electrophysiological studies have largely been unsuccessful in revealing potential neural candidates for such a role.
A critical feature of temporal pitch theories is their unique ability to explain the pitch of unresolved harmonics, where all of the spectral components of a complex sound pass through the same auditory filter. In such cases, pitch information is confined to the temporal pattern of neural activity within auditory filters. A recent neural theory proposed to account for the pitch of unresolved harmonics (Langner 1997) suggests that maps of periodicity [quantified by sensitivity to amplitude-modulated (AM) tones] run orthogonal to the main tonotopic axis in the inferior colliculus (IC). Neurons with tuning for similar best modulation frequencies (BMFs) project along iso-modulation contours to synapse on neurons with low characteristic frequency (CF; <2.0 kHz) in the IC dorsal laminae. These neurons act as modulation extractors, responding to high-frequency AM tones outside their traditional frequency-versus-level response area, with preferred BMFs matching their CFs (Biebel and Langner 2002). Evidence from primary auditory cortex (Schulze and Langner 1997, 1999; Schulze et al. 2002) appears to confirm the existence of periodicity maps in response to high-frequency AM tones, including one overlying the low-frequency region of primary auditory cortex. Note that this proposed mechanism can account only for the pitch of unresolved harmonics, since resolved harmonics will not evoke temporally modulated responses within a single neural filter. However, unlike psychophysics experiments that use low-pass masking noise to reduce or abolish the contribution of distortions in temporal pitch mechanisms (Bernstein and Oxenham 2003; Grimault et al. 2002; Plack and White 2000), these studies were not able to exclude the contribution to neural responses of low-frequency combination tones (distortions) generated by the nonlinear mechanics of the basilar membrane in response to AM tones, which are spectrally complex compared with pure tones. Given the potential interest in central auditory mechanisms that extract modulation and/or pitch information, it is imperative that the contributions of peripheral mechanisms to neural responses be discounted before central mechanisms are assumed. Combination tones (CTs) were suggested by Helmholtz (1885) more than a century ago as a possible means of accounting for the pitch of the missing fundamental, and although several lines of evidence indicate that CTs cannot account for all of the pitch percept of periodic signals, recent psychophysical evidence from Pressnitzer and Patterson (2001) indicates that CTs likely contribute to the perception of complex pitches.
This study shows that low-frequency IC neurons readily respond to high-frequency AM tones, as previously reported (Biebel and Langner 2002), but that a likely explanation for this responsiveness is the generation of distortion on the basilar membrane. The data indicate that a significant distortion is generated when high-frequency AM tones of moderate sound level [<70 dB sound pressure level (SPL)] are presented. This distortion interacts on the basilar membrane with tones presented to the same ear to produce monaural beating in the responses of low-frequency IC neurons, and it interacts in the binaural brain stem nuclei with neural activity generated by tones presented to the opposite ear to generate binaural beating in interaural delay-sensitive IC neurons. These data suggest that recent reports of central neural mechanisms for temporal pitch extraction may, instead, be demonstrations of the contribution of cochlear-generated distortions to the neural representation of complex sounds.
| METHODS |
|---|
|
|
|---|
All experiments were carried out in accordance with the guidelines of the UK Home Office, under the control of the Animals (Scientific Procedures) Act 1986. Pigmented guinea pigs (Cavia porcellus, 315–505 g) were anesthetized with intraperitoneal urethane (1.0–1.5 g/kg in 20% solution). Additional analgesia was administered as required using 0.1 ml im injections of fentanyl citrate/fluanisone (Hypnorm, Janssen, Beerse, Belgium). Atropine sulfate (0.06 mg; Animalcare, York, UK) was administered subcutaneously to reduce bronchial secretions. All animals had a tracheal cannula inserted. Body temperature was maintained using a thermostatically controlled heating blanket and rectal probe (Harvard Instruments, Edenbridge, UK). Most animals breathed spontaneously, but some were respired artificially with air or 95% O2-5% CO2. All animals were clear of any signs of infection in the ear canals and tympanic membranes. Any obstructing particles were carefully removed from the ear canals before proceeding. Animals were mounted in a modified Kopf Instruments (Bilaney Consultants, Sevenoaks, UK) stereotaxic frame situated inside a sound-attenuating booth (IAC, Winchester, UK). Hollow ear specula allowed the insertion of custom-made earphones and probe tube microphones to form a sealed pressure-field sound delivery system. Pressure equalization of the middle ear was achieved by sealing high acoustic impedance cannulae into the bulla via small holes drilled on both sides. Following subcutaneous injection of lignocaine (2%; Astra, Kings Langley, UK) into the scalp, skin and muscle were retracted, and a craniotomy was performed, extending 2–3 mm rostral and caudal of the interaural axis and 1–4 mm lateral from the midline on the right side. The dura overlying the cortex was removed, allowing microelectrode access through the cortex to the right inferior colliculus, and the cranium was sealed with 2% agar (Oxoid, Basingstoke, UK).
Single unit recording
Recordings were made from single neurons using parylene-coated tungsten microelectrodes (1–5 MΩ impedance; WPI, Stevenage, UK), mounted on a piezo-electric stepper motor and positioned stereotaxically into the inferior colliculus. Electrical activity from the microelectrode was filtered (300 Hz–3 kHz) and amplified (variable gain) using a DAM-80 ac differential amplifier (WPI, Stevenage, UK) and a PC1 spike conditioner (Tucker Davis Technologies, Gainesville, FL). Units were isolated using variable frequency and intensity diotic tone probe stimuli. Single spikes were discriminated from background noise using an SD1 spike discriminator (Tucker Davis Technologies), linked to the computer system to allow accurate time stamping (1 µs) of the spike events via an ET1 event timer (Tucker Davis Technologies). Single unit isolation was confirmed by the consistency of the discriminated spike waveform displayed on a Tektronix TDS-210 digital oscilloscope.
Stimulus presentation and data analysis
Acoustic stimuli were produced and presented under computer control, using software developed at the Medical Research Council Institute of Hearing Research (by Prof. Alan Palmer and Dr. Trevor Shackleton) and Tucker Davis Technologies System II hardware. Digitally generated dichotic stimuli (AP2 digital signal processor; at 100- or 48-kHz sampling rate; Tucker Davis Technologies) were converted to analogue signals (DA3-2, Tucker Davis Technologies). The signals were filtered (FT6; corner frequency = 40 kHz; Tucker Davis Technologies) and attenuated (PA4, Tucker Davis Technologies), before being amplified (RB-971, Rotel, Tokyo, Japan) and delivered to Beyerdynamic DT-48 (Burgess Hill, UK) loudspeakers fitted with brass tube attachments sealed into the hollow ear specula supporting the animal. The sound field inside the sealed system was sampled using FG3452 (Knowles Electronics, Burgess Hill, UK) microphones via a probe tube inserted to within a few millimeters of the tympanic membrane. The probe microphones had been previously calibrated against a type 4136 1/8-in microphone (Bruel and Kjaer, Stevenage, UK). The sound systems for each ear were flat to within ±5 dB from 50 to 12,000 Hz and were matched to within ±5 dB for this range. All sounds were generated using the maximum range of the DSP to ensure a high signal-to-noise ratio. A fixed, in-line end attenuation of 60 dB was applied to each signal following signal generation, digital attenuation, and amplification, to give a maximum output at 1 kHz of 106 dB SPL for 0 dB digital attenuation. For several experiments, end attenuation was set to 40 dB, and the output of the system was reduced by 20 dB. For the response areas shown in the RESULTS section, the ordinate decibel scale reflects the maximum 106 dB SPL output at 1 kHz.
The f0 distortion in the sound delivery/recording system was measured to ensure that any such distortion was lower than the level of the presumed cochlear distortions generated by the AM signals. Possible sources of distortion are the sound generation system (unlikely, given that appropriate filtering was applied to the D/A signal conversion), the sound delivery system, or the in-line sound calibration system. The sum total of all measured distortions was significantly lower than would be required if the neural responses were to be accounted for by distortions in the stimulus generation/delivery system. For example, f0 distortion, measured by the on-line spectrum analyser, was 51 dB lower than a 3-kHz carrier AM at 363 Hz (100% depth) and presented at 70 dB SPL, 45 dB lower when the carrier level was increased to 75 dB SPL, and within the noise floor (65 dB lower than the 3-kHz carrier at 70 dB SPL) when the carrier level was reduced to 65 dB SPL. All of the presumed cochlear-generated distortions were significantly higher in level than the measured system distortions; on average, the threshold cochlear-generated distortion was 25 dB lower than the carrier level used, and all but two recordings were made with maximum carrier levels of 70 dB SPL or below. Thus the distortions generated by the cochlea were 26 dB above the level of distortions generated in the combined signal generation, calibration, and measurement system. The noise floor of this combined system was 65 dB below the level of the 3-kHz carrier tone. The site of generation of distortions in the sound generation/delivery/calibration system was not determined. Although it is possible that the speakers generate some distortion, it is also likely that the Knowles microphones used for on-line analysis of the sound signals account for at least some of this distortion. The speakers and the probe tubes used to measure sound at the eardrum, and to which the Knowles microphones were attached throughout recordings, were calibrated against a 1/8-in Bruel and Kjaer microphone. However, although these microphones show a very flat transfer function, they have a high noise floor [low signal-to-noise ratio (SNR)] and cannot be used to measure distortions for low sound level signals.
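The level bookkeeping in this comparison can be written out as a short sketch. It uses only the figures quoted above (a 70 dB SPL carrier, system distortion 51 dB below the carrier, threshold cochlear distortion 25 dB below the carrier on average); the function name and argument layout are illustrative assumptions, not part of the original analysis software.

```python
def distortion_margin_db(carrier_db_spl, system_below_carrier_db, cochlear_below_carrier_db):
    """Return (system distortion level, cochlear distortion level, margin), all in dB.

    Both distortions are specified as the number of dB by which they fall
    below the carrier level, as in the text.
    """
    system_level = carrier_db_spl - system_below_carrier_db
    cochlear_level = carrier_db_spl - cochlear_below_carrier_db
    # Margin by which the presumed cochlear distortion exceeds the system distortion.
    return system_level, cochlear_level, cochlear_level - system_level


# Values quoted in the text for a 3-kHz carrier at 70 dB SPL.
system, cochlear, margin = distortion_margin_db(70.0, 51.0, 25.0)
print(system, cochlear, margin)  # 19.0 45.0 26.0
```

The margin is independent of the carrier level chosen here; it depends only on the two "dB below carrier" figures.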
Once a single neuron was isolated, its CF was estimated audio-visually. In most cases, CF was confirmed by the generation of frequency-versus-level response areas, extending two octaves above and four octaves below the estimated CF, using diotic 50-ms tones covering a range of 60–100 dB, presented at a rate of 5 Hz in randomized order. Response areas were also obtained over the range of four octaves above to two octaves below CF to confirm that neurons were unresponsive to pure-tone stimulation at the frequencies used as carriers for the AM stimuli.
Additionally, neurons were characterized by the generation of peristimulus time histograms (PSTHs) to 50-ms CF tones presented diotically and monaurally to both ears, at 20 dB above threshold for 150 repetitions at 5 Hz. The neuron's binaural sensitivity to interaural phase disparity (IPD) cues was assessed using 3-s duration binaural beat stimuli, with a 1-Hz difference between the ears. The 3-s beat contains two full sweeps of IPD during the middle 2 s and sweeps of 0.5 cycles of IPD during the first and last 500 ms of the stimulus. The initial and final 500-ms periods of the response were omitted from analysis to exclude contamination of IPD-sensitive responses by the often large, but rapidly adapting, discharge rates evoked at the onset of the stimulus. PSTHs of the two complete cycles of IPD were averaged and plotted on-line with respect to the IPD of the stimulus to form period histograms. Using the methodology of Goldberg and Brown (1969), the vector strength (R) of the response was calculated from the period histogram. A vector strength of 1.0 reports perfect locking to the phase (IPD) of the stimulus, with all spikes occurring in 1 bin (bin width in this study was 20 ms), and 0.0 reflects an even distribution of spikes across all bins of the IPD phase plot. The average best phase was calculated as a vector average of the response magnitudes at each point in the cyclic IPD phase histograms. Vector strengths were assessed for their statistical significance by measuring the Rayleigh coefficient, 2nR², where n is the total spike count and R is the vector strength. Responses were considered significantly modulated with IPD for Rayleigh coefficients > 13.815 (Yin and Kuwada 1983), i.e., P < 0.001.
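The analysis described above can be sketched in a few lines of Python. This is a minimal illustration, not the on-line software used in the study: it works from individual spike times rather than from 20-ms binned period histograms, and the function names, the `start_ipd_cycles` parameter, and the example spike times are assumptions introduced here. The 13.815 criterion is the one quoted in the text; it corresponds to P = 0.001 under the usual large-n approximation P ≈ exp(−nR²), since exp(−13.815/2) ≈ 0.001.

```python
import math


def ipd_phases(spike_times_s, beat_hz=1.0, window=(0.5, 2.5), start_ipd_cycles=0.0):
    """Map spike times to IPD (in cycles, 0..1) for a binaural beat.

    Spikes outside the analysis window are dropped, mirroring the exclusion
    of the first and last 500 ms of the 3-s stimulus.
    start_ipd_cycles (IPD at stimulus onset) is a hypothetical parameter.
    """
    t0, t1 = window
    return [(start_ipd_cycles + beat_hz * t) % 1.0
            for t in spike_times_s if t0 <= t < t1]


def vector_strength(phases_cycles):
    """Return (R, mean best phase in cycles) after Goldberg and Brown (1969)."""
    n = len(phases_cycles)
    if n == 0:
        raise ValueError("vector strength is undefined for zero spikes")
    c = sum(math.cos(2 * math.pi * p) for p in phases_cycles)
    s = sum(math.sin(2 * math.pi * p) for p in phases_cycles)
    r = math.hypot(c, s) / n
    best = (math.atan2(s, c) / (2 * math.pi)) % 1.0
    return r, best


def rayleigh_coefficient(n, r):
    """Rayleigh coefficient 2nR^2 for n spikes and vector strength r."""
    return 2.0 * n * r * r


def is_ipd_sensitive(phases_cycles, criterion=13.815):
    """True when the Rayleigh coefficient exceeds the P < 0.001 criterion."""
    r, _ = vector_strength(phases_cycles)
    return rayleigh_coefficient(len(phases_cycles), r) > criterion


# Hypothetical spike times (s) within one 3-s binaural beat presentation.
phases = ipd_phases([0.1, 0.75, 1.75, 2.9])
print(phases, vector_strength(phases))
```

Note that with very few spikes even perfect locking cannot reach the criterion (two spikes give at most 2nR² = 4), so spike count matters as much as vector strength.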
| RESULTS |
|---|
|
|
|---|
Evidence for the contribution of combination tones in AM-generated neural responses
Figure 1 shows the responses of an IC neuron to low-frequency pure tones and high-frequency AM tones. A frequency-versus-level response area (Fig. 1A) obtained over six octaves (2 below CF and 4 above), using diotic, 50-ms tone bursts, confirms the audio-visually determined CF of 292 Hz. The top PSTH in Fig. 1B shows the response to a 3-s, 2-kHz pure tone presented to the contralateral ear. This frequency lay well outside (above) the pure-tone response area (Fig. 1A, large white circle), and no activity was evoked above the spontaneous discharge rate (indicated by the arrow to the right of the panel). The second from top PSTH in Fig. 1B shows the response of the same neuron to a 2-kHz tone AM at 292 Hz, the neuron's CF. This AM produced sidebands in the spectrum at 2.292 and 1.708 kHz (Fig. 1A, small white circles), 6 dB lower than the level of the carrier. Despite all spectral components, including the lower sideband, lying above the response area, evoked discharge rates were higher than the spontaneous discharge rate. This is consistent with previous reports (Biebel and Langner 2002) in which IC neurons with CFs below 2 kHz responded to AM tones that lay outside their pure-tone response areas. As might be predicted from Fig. 1A, a 291-Hz tone presented to the ipsilateral ear alone (Fig. 1B, 2nd from bottom) also evoked activity above spontaneous discharge rates.
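The spectral composition of the AM stimulus follows directly from the standard expression for a sinusoidally amplitude-modulated tone. The sketch below assumes sinusoidal modulation at 100% depth, which is consistent with the sidebands lying 6 dB below the carrier; the function name and return format are illustrative.

```python
import math


def am_spectrum(carrier_hz, mod_hz, depth=1.0):
    """Return [(frequency in Hz, level in dB re carrier)] for a sinusoidal AM tone.

    For s(t) = [1 + m*cos(2*pi*fm*t)] * cos(2*pi*fc*t), each sideband at
    fc +/- fm has amplitude m/2 relative to the carrier.
    """
    if not 0.0 < depth <= 1.0:
        raise ValueError("depth must be in (0, 1]")
    sideband_db = 20.0 * math.log10(depth / 2.0)
    return [
        (carrier_hz - mod_hz, sideband_db),
        (carrier_hz, 0.0),
        (carrier_hz + mod_hz, sideband_db),
    ]


# The 2-kHz carrier modulated at the neuron's CF of 292 Hz.
for freq, level in am_spectrum(2000.0, 292.0):
    print(f"{freq:.0f} Hz at {level:.1f} dB re carrier")
```

The stimulus itself therefore contains no energy at 292 Hz; the lowest component is the lower sideband at 1,708 Hz.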
Figure 2 shows the responses of another IC neuron, with an audio-visually determined CF of 364 Hz, to low-frequency pure tones and high-frequency AM tones. Frequencies >1.5 kHz evoked no activity, even at levels >90 dB SPL. The neuron's response to binaural beats (Fig. 2B), where the contralateral stimulus was a CF tone (364 Hz) and the ipsilateral stimulus was a CF − 1 Hz tone (363 Hz), indicates that the neuron was strongly sensitive to IPD cues: responses were significantly modulated (2nR² > 13.815; P < 0.001) with IPD over a wide range of sound levels. For the highest sound level of 42 dB SPL (Fig. 2B, top row), approximately 15 dB above threshold at CF, the mean best IPD was 0.44 cycles, i.e., the neuron responded maximally when the stimulus at the right ear led the left ear by 0.44 cycles of the stimulus period. For the lowest sound level of 32 dB SPL (5 dB above threshold), the best IPD was 0.40 cycles (Fig. 2B, bottom).
65 dB SPL (Fig. 2A, top arrow) and the lowest was 45 dB SPL (Fig. 2A, bottom arrow; intermediate levels in 5-dB steps). Despite the large spectral differences between the ears, however, the neural response was significantly modulated on the period of the difference between the CF in one ear and the AM rate in the other, at least for the three higher sound levels (Fig. 2B, top 3 rows), with a mean best IPD ranging from 0.25 to 0.35 cycles at CF. A final example of this behavior is shown in Fig. 3 for a neuron with a CF of 888 Hz (Fig. 3A). As for the neuron in Fig. 2, this neuron was sensitive to the IPDs of 1-Hz binaural beats generated by a CF tone in the contralateral ear and an 887-Hz tone in the ipsilateral ear, showing a best IPD of 0.31 cycles (Fig. 3B), with significant IPD sensitivity (P < 0.001) observed for the two highest levels (51 and 46 dB SPL) of the contralateral pure tone (Fig. 3B, top 2 PSTHs). In Fig. 3C, the CF tone is replaced with a 5-kHz tone, AM at 888 Hz. This stimulus configuration also evoked a response that, when binned on the period of the difference between the AM rate in the contralateral ear (at CF) and the CF − 1 Hz tone in the ipsilateral ear, indicated that the neuron was sensitive to the IPD between these components. Significant IPD sensitivity (P < 0.001) was obtained for the four highest levels of AM tones shown (Fig. 3C, top 4 PSTHs).
What explanation might be posited for the sensitivity of IC neurons to AM high-frequency tones? For the examples above, one can hypothesize that the combination of AM tones in one ear and CF − 1 Hz tones in the other produces a binaural distortion beat; action potentials generated by vibration of the contralateral basilar membrane at the place tuned to CF, equal to the AM rate, and by vibration of the ipsilateral basilar membrane at the place tuned to CF − 1 Hz, propagate via low-frequency, phase-locking neurons in the cochlear nucleus to converge on delay-sensitive neurons in the superior olivary complex (Fig. 4, left). Here, the quadratic nonlinearity of the basilar membrane produces a traveling wave (horizontal red arrow) that is resolved by the basilar membrane at the (low-frequency) place tuned to the AM frequency. Since there is no contralateral stimulus component at 364 Hz for the neuron in Fig. 2, or at 888 Hz for the neuron in Fig. 3, a parsimonious explanation is that the response modulation arises from a low-frequency (364 or 888 Hz) component, the difference tone between the carrier (fc) and the sidebands (fc ± fm), generated by the nonlinear response of the basilar membrane to the AM complex. Phase-locked activity generated by this difference tone is propagated through fibers tuned to these low frequencies to binaural coincidence detectors in the brain stem nuclei of the superior olive, where it interacts with phase-locked activity generated by the tone 1 Hz lower in the other ear. This activation propagates to the low-frequency laminae of the IC, where it is recorded as IPD sensitivity in low-CF neurons.
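The way a quadratic nonlinearity creates energy at the modulation frequency can be illustrated numerically. The sketch below is a toy model, not a model of basilar membrane mechanics: the memoryless nonlinearity y = x + a·x², the coefficient a = 0.1, and all function names are assumptions chosen for illustration. It uses the 5-kHz carrier and 888-Hz modulation rate from the Fig. 3 example and a 48-kHz sampling rate.

```python
import math


def am_tone(carrier_hz, mod_hz, depth, fs, n):
    """Sinusoidally amplitude-modulated tone with unit carrier amplitude."""
    return [(1.0 + depth * math.cos(2 * math.pi * mod_hz * k / fs))
            * math.cos(2 * math.pi * carrier_hz * k / fs) for k in range(n)]


def quadratic(x, a):
    """Hypothetical memoryless nonlinearity: y = x + a * x**2."""
    return [v + a * v * v for v in x]


def amplitude_at(x, freq_hz, fs):
    """Amplitude of the sinusoidal component at freq_hz (single-bin DFT).

    Exact when the record holds a whole number of cycles of every component.
    """
    n = len(x)
    re = sum(v * math.cos(2 * math.pi * freq_hz * k / fs) for k, v in enumerate(x))
    im = sum(v * math.sin(2 * math.pi * freq_hz * k / fs) for k, v in enumerate(x))
    return 2.0 * math.hypot(re, im) / n


FS = 48000   # sampling rate (Hz)
N = 12000    # 0.25 s: whole cycles of both 888 Hz and 5 kHz

x = am_tone(5000.0, 888.0, 1.0, FS, N)
y = quadratic(x, 0.1)

print("888-Hz amplitude in stimulus:  ", amplitude_at(x, 888.0, FS))
print("888-Hz amplitude after x + a*x^2:", amplitude_at(y, 888.0, FS))
```

Expanding the squared term shows why: x² contains the product of the carrier with each sideband, which yields a component at fm of amplitude m (here scaled by a), even though the stimulus itself has no energy there.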
Sensitivity to the periodicity of AM sounds below the IC
The argument that the generation site for the sensitivity of auditory neurons to high-frequency AM tones is lower than the level of the IC is supported by the response of the neuron in Fig. 5, which differs from the previous examples in two respects. First, although sensitive to IPDs (Fig. 5A, top) and to high-frequency tones (3 kHz) modulated at CF (208 Hz; Fig. 5A, middle), and also showing a modulated response to binaural stimulation on the period of the difference between the AM rate in the contralateral ear (208 Hz) and the pure-tone frequency in the ipsilateral ear (207 Hz; Fig. 5A, bottom), this neuron was recorded in the dorsal nucleus of the lateral lemniscus (DNLL) rather than the IC. The DNLL was located using the stereotaxic coordinates of Medvedeva (1977); the neuron in Fig. 5 was recorded 6.6 mm below the cortical surface and 3.5 mm lateral to the midline. This recording position was 3.1 mm deeper than (ventral to) the location in the same electrode penetration at which a low-frequency excitatory drive was recorded as the electrode passed the lateral edge of the IC. Histological reconstruction of the midbrain and brain stem of this animal confirmed the location of an electrolytic lesion at this neuron's recording site to be in the DNLL. Neurons in the DNLL, which is a GABAergic projection nucleus to the IC (Adams and Mugnaini 1984; Batra and Fitzpatrick 2002; Kelly and Li 1997; Zhang et al. 1998), show a high degree of phase-locking and high, sustained discharge rates (Aitkin et al. 1970; Brugge et al. 1970) compared with IC neurons, as Fig. 5B suggests. The second difference lies in the form of binaural interaction, which appears to be based on phase-locked inhibition as well as excitation. Diotic presentation of 50-ms tone bursts evoked instantaneous discharge rates peaking around 400 spikes/s (Fig. 5B, left). However, contralateral stimulation alone evoked significantly higher discharge rates (Fig. 5B, right top). Conversely, ipsilateral stimulation alone reduced discharge rates below the low spontaneous rate (Fig. 5B, right bottom; note change of scale). This suggests input from neurons in the lateral superior olive (LSO) that show evidence of an exquisitely timed (on the microsecond scale) glycinergic input from the medial nucleus of the trapezoid body (MNTB) (Smith et al. 1991, 1998; Tsuchitani 1997). The LSO sends excitatory projections to the DNLL (Huffman and Covey 1995; Oliver 2000). Consistent with this, the influence of the 207-Hz tone presented to the ipsilateral ear (Fig. 5A, bottom) is to reduce, in an IPD-sensitive manner, the response to the contralateral ear alone (cf. Fig. 5A, middle). The 3-kHz AM tone presented to the contralateral ear evokes 30 spikes/s, after an initial onset response over the first 200–300 ms of 40 spikes/s. Simultaneously presenting the 207-Hz tone to the ipsilateral ear modulates the response between a maximum of 40 and 5 spikes/s.
In their study of the contribution of CTs to human pitch perception, Pressnitzer and Patterson (2001) showed, using the cancellation of beats method (Goldstein 1967), the presence of distortions 15–20 dB lower than the level of the primary tones, depending on the number of harmonic components. In this study, a pure-tone probe with frequency equal to CF − 1 Hz was introduced to the left ear to beat monaurally with the presumed CF distortion tone, producing a 1-Hz monaural distortion beat. The phase and amplitude of the difference tone were estimated by varying the level of the probe tone required to modulate maximally the neural response.
Figure 7A shows the response area of an IC neuron (CF = 185 Hz). The panels in Fig. 7B indicate the response of this neuron to a 2.5-kHz tone in the contralateral ear AM at 185 Hz with, from bottom to top, increasing levels of a 184-Hz tone presented to the same ear. None of the spectral components of the AM signal (Fig. 7A, white circles) evoked a response when presented in isolation. For low levels of the 184-Hz tone (Fig. 7B, bottom; 31 dB SPL), the discharge rate was essentially that evoked by the high-frequency AM signal. As the level of the 184-Hz tone was increased (Fig. 7B, bottom to top), however, the neural response was modulated above and below the rate evoked by the AM tone alone. For a tone level of 48 dB SPL (Fig. 7B, 3rd PSTH from top), the response was maximally modulated. Further increasing the level of the tone also increased the discharge rate, but the response was less modulated. For the highest level used (59 dB SPL; Fig. 7B, top PSTH), the response was completely unmodulated. The explanation for the modulation in the neural response, which is at the period of the difference between the modulation rate of the high-frequency AM tone and the frequency of the low-frequency pure tone, is the same as for the binaural responses described above: the AM tone generates a distortion at the frequency of the difference tone (185 Hz in this case). The addition of a 184-Hz tone into the same ear causes a beating on the basilar membrane (a monaural distortion beat) at the 1-Hz rate observed in the neural response. The two basilar membrane waves add when in phase and cancel when out of phase, producing modulated neural activity at 1 Hz in the auditory nerve fibers innervating this region of the cochlea. When the levels of the two waves differ, i.e., when the pure tone is lower or higher in level than the distortion, one frequency component dominates the basilar membrane output, and the IC neural response is less modulated as a result. When the level of the 184-Hz tone is equal to that of the 185-Hz distortion, the two basilar membrane responses cancel completely when out of phase (3 times over the 3 s of the stimulus), and the response is maximally modulated. The level at which this occurred (48 dB SPL) was 17 dB lower than the level of the 2.5-kHz carrier.
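The dependence of beat depth on the relative levels of the probe and the distortion can be sketched with a simple linear model of two summed sinusoids. This is an idealization under stated assumptions: it treats the distortion as a fixed-amplitude 185-Hz tone at the estimated level of 48 dB SPL and ignores cochlear compression and neural rate saturation, so it does not reproduce the complete loss of modulation reported at the highest probe level. Function names are illustrative.

```python
def db_to_amplitude(level_db):
    """Convert a level in dB to linear amplitude (arbitrary reference)."""
    return 10.0 ** (level_db / 20.0)


def beat_depth(probe_db, distortion_db):
    """Envelope modulation depth of two summed tones of slightly different frequency.

    The envelope swings between |a - d| (out of phase) and a + d (in phase),
    so depth = (max - min) / (max + min), which is 1.0 only for equal amplitudes.
    """
    a = db_to_amplitude(probe_db)
    d = db_to_amplitude(distortion_db)
    env_min, env_max = abs(a - d), a + d
    return (env_max - env_min) / (env_max + env_min)


def beat_nulls(f1_hz, f2_hz, duration_s):
    """Number of complete beat cycles (one envelope minimum each) in the stimulus."""
    return int(abs(f1_hz - f2_hz) * duration_s)


DISTORTION_DB = 48.0  # estimated level of the 185-Hz distortion (dB SPL)
for probe_db in (31.0, 48.0, 59.0):  # probe levels quoted in the text
    print(probe_db, round(beat_depth(probe_db, DISTORTION_DB), 3))
print("nulls in 3 s:", beat_nulls(185.0, 184.0, 3.0))
```

In this model the depth peaks at exactly the probe level that matches the distortion, which is the logic used to read the distortion level from the most strongly modulated response.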
A prediction from the cochlear distortion hypothesis is that the distortion shows high specificity to the level and phase of a masker tone presented to the same ear. As shown above, neural activity is modulated in a manner consistent with monaural beating when a tone 1 Hz lower than the AM rate is presented to the same ear. Here, by adjusting the level and phase of a cancellation tone of equal frequency to the AM rate presented to the same ear, I show that the neural response is altered in a highly specific, and predictable, manner that suggests the interaction occurs at a peripheral stage in auditory processing. Such specificity is considerably less likely if the generation site for the neural response to the high-frequency AM signal is central, i.e., if it occurs after the response to the signal has been converted into a train of action potentials. Figure 9B plots the binaural phase of the distortion (equal to the difference between the best IPD to binaural beats and the best IPD to binaural distortion beats) as a function of the phase obtained from the monaural beats method. The values obtained by each method are clearly very similar (the line indicates unity) and argue for a peripheral, rather than a central, generation site for the phenomenon.
The specificity of the level and phase of the presumed distortion product was further examined using a CF "masker" tone presented to the same ear as the AM complex modulated at CF, while a pure tone at CF − 1 Hz was presented to the opposite ear. The level and phase of the CF masker were altered systematically, and the neural response was assessed. Figure 10 shows the responses of an IC neuron with a CF of 437 Hz to a 2.5-kHz tone in the contralateral ear AM at 437 Hz, a pure-tone "masker" of 437 Hz in the same ear, and a pure tone of 436 Hz in the ipsilateral ear. From top to bottom along the central spine of Fig. 10, responses are shown for decreasing levels of the CF masker tone in the contralateral ear. In each case, the masker starting phase is 0.1 cycles with respect to the tone in the ipsilateral ear, which is also the cancellation phase obtained from the monaural distortion beats experiment above. At high levels, the 437-Hz masker dominates the contralateral-evoked response on the basilar membrane and beats binaurally with the neural response generated by the 436-Hz tone in the ipsilateral ear to produce the observed IPD sensitivity. The phase of the binaural response at the highest masker level (7 dB lower than the AM carrier level; top panel) is +0.2 cycles, almost exactly the difference between the cancellation phase (0.1 cycles) and the response to binaural distortion beats (0.31 cycles) for this neuron. As the level of the 437-Hz masker is reduced, however, the response magnitude decreases systematically, although the phase of the response remains constant (Fig. 10, 2nd and 3rd from top; mean best IPDs of +0.17 and +0.18 cycles, respectively), until, for a masker level 19 dB lower than the AM carrier level (Fig. 10, middle), the IPD-modulated response was abolished. When the level of the 437-Hz tone is reduced further (Fig. 10, bottom 3 panels), however, the response increases again, and it is once more significantly modulated with IPD for a masker level of 43 dB SPL (Fig. 10, 3rd panel from bottom) but with a different best IPD (0.29 cycles in each of the bottom 2 panels) from that for high levels of the 437-Hz tone. At these low masker levels, the distortion tone, which, like the masker, is also 437 Hz, presumably dominates the response in the contralateral ear and beats binaurally with the 436-Hz tone in the ipsilateral ear to produce interaural phase characteristics that reflect the phase of the distortion, not the phase of the masker.
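The masker-level dependence described above is what one would expect if the masker and the distortion, being at the same frequency, add as phasors on the basilar membrane. The sketch below illustrates that reasoning only. The masker levels (7 and 19 dB below the carrier) and the 0.1-cycle masker phase come from the text; the distortion phase of 0.6 cycles (exactly half a cycle from the masker) and the lowest masker level are hypothetical choices, and the model says nothing about how the resultant maps onto the measured best IPD.

```python
import cmath
import math


def resultant(masker_db, masker_phase_cyc, distortion_db, distortion_phase_cyc):
    """Sum two same-frequency components; return (level in dB, phase in cycles).

    Levels are in dB re the AM carrier; phases are in cycles re a common reference.
    """
    m = cmath.rect(10.0 ** (masker_db / 20.0), 2 * math.pi * masker_phase_cyc)
    d = cmath.rect(10.0 ** (distortion_db / 20.0), 2 * math.pi * distortion_phase_cyc)
    r = m + d
    amp = abs(r)
    level_db = 20.0 * math.log10(amp) if amp > 0.0 else float("-inf")
    phase_cyc = (cmath.phase(r) / (2 * math.pi)) % 1.0
    return level_db, phase_cyc


DISTORTION_DB = -19.0      # distortion level re carrier (matches the cancelling masker)
DISTORTION_PHASE = 0.6     # hypothetical: half a cycle away from the masker
MASKER_PHASE = 0.1         # masker starting phase quoted in the text

for masker_db in (-7.0, -19.0, -40.0):  # -40 dB is an illustrative low level
    level, phase = resultant(masker_db, MASKER_PHASE, DISTORTION_DB, DISTORTION_PHASE)
    print(f"masker {masker_db:6.1f} dB -> resultant {level:8.1f} dB, phase {phase:.2f} cycles")
```

In this idealization the resultant follows the masker phase at high masker levels, vanishes when the two components are equal and opposite, and follows the distortion phase at low masker levels, which is the qualitative pattern of disappearance and reappearance with an altered best IPD.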
A second example of this behavior is shown in Fig. 11 (the same DNLL neuron shown previously in Fig. 5). Once more, only a narrow range of CF masker levels and phases abolishes the modulated neural response. This is consistent with the hypothesis that a distortion tone at f0, generated on the basilar membrane, is responsible for generating the responses of low-frequency IC neurons to high-frequency AM sounds.
A total of 10 IC neurons and 1 DNLL neuron were systematically examined using different levels and phases of the cancellation tone in the binaural condition. All showed similar sensitivity to the level and phase of the tonal masker as the neurons in Figs. 10 and 11, with the response modulation gradually disappearing as masker level was reduced, only to reappear with further reductions in masker level.
Figure 12 shows the response of an IC neuron to three different levels of AM tone (3 kHz modulated at 263 Hz) in the contralateral ear and a pure tone (262 Hz) in the ipsilateral ear. The carrier level of the AM tone in the left column (Fig. 12A) was 10 dB higher than that in the middle column (Fig. 12B) and 20 dB higher than that in the right column (Fig. 12C). In the top row, a high-level CF masker (263 Hz) beats binaurally with the 262-Hz tone in the ipsilateral ear, producing modulated discharge patterns with similar best IPDs. The middle row shows the response of the same neuron when the level of the 263-Hz masker is sufficient to offset the distortion produced by the AM tone, which was 19 dB lower than the level of the carrier in Fig. 12A. For AM levels 10 (Fig. 12B) or 20 dB (Fig. 12C) lower than in Fig. 12A, the masker level required to abolish the response was also reduced by 10 dB (Fig. 12, A and B, middle). In each case, the phase of the masker lagged the tone in the ipsilateral ear by 0.3 cycles. When the level of the masker was reduced further, the IPD-modulated response in Fig. 12, A and B, but not Fig. 12C, reappeared with altered best IPD, reflecting the relative phases of the contralateral distortion tone and the ipsilateral pure tone. In Fig. 12C, the AM carrier level was below the level required to generate a distortion that could interact binaurally with a tone in the other ear.
| DISCUSSION |
|---|
|
|
|---|
This study also examines claims made in recent studies suggesting a central neural mechanism of temporal pitch extraction (Biebel and Langner 2002; Schulze and Langner 1997, 1999; Schulze et al. 2002). The data show that low-CF neurons respond to high-frequency AM tones in which all spectral components lie outside the pure-tone response area, confirming the basic observation of Biebel and Langner (2002), and apparently consistent with the conclusion of both studies that periodicity pitch is mapped in the CNS. Given the importance of pitch in acoustic processing, particularly in auditory grouping and stream segregation, such findings potentially have immense importance in the field. The interpretation of the data in this study, however, suggests that these claims should be treated cautiously. None of the studies cited above tested the possibility that cochlear-generated distortions contributed to neural responses of low-CF neurons to high-frequency AM tones. Thus such distortions cannot be excluded in these studies. Psychophysical experiments investigating temporal pitch mechanisms employ low-frequency masking noise specifically designed to attenuate or remove the contribution of low-frequency spectral components at the fundamental frequency that are generated by distortion (Carlyon et al. 2002; Moore and Sek 2000; Plack and White 2000). Given the importance to psychophysical studies of masking the potential contribution of cochlear-generated distortions, it is imperative that the contribution of such distortions be excluded when examining potential neural mechanisms that generate sensitivity to temporal pitch. Obviously, low-frequency masking noise cannot be used to mask cochlear-generated distortions in single-neuron recordings, as the masking noise will directly excite the neuron through the vibration of the low-frequency end of the basilar membrane. However, low-frequency sounds can be used to assess the presence of cochlear-generated distortions, as was done in this study. None of the cited studies (Biebel and Langner 2002; Schulze and Langner 1997, 1999; Schulze et al. 2002) purporting to examine the existence of periodotopic representations in the brain examined this possibility.
It was found that all of the data that indicate sensitivity of low-CF neurons to high-frequency periodic stimuli can readily be explained by the generation of CTs on the basilar membrane at a frequency corresponding to the AM rate. Pure tones of frequency 1 Hz lower than the frequency of the presumed CT produced a beating pattern in the neural response. This response could be abolished using a pure tone of identical frequency to the presumed CT with appropriate phase and amplitude, and the response interacted binaurally, being sensitive to interaural phase differences between the presumed CT and a tone presented to the other ear. A parsimonious explanation for these observations is that low-CF neurons respond to high-frequency AM complexes by being activated through low-frequency auditory channels from the level of the cochlear nerve to the level of the IC. Consequently, a central mechanism of periodicity, or pitch, extraction is not required to explain these data.
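The logic of this argument can be illustrated numerically. The sketch below passes a 3-kHz tone, amplitude modulated at 263 Hz, through a weak quadratic nonlinearity and inspects the spectrum at the modulation frequency. The quadratic term, its strength `eps`, the sample rate, and the helper `level_at` are assumptions chosen for illustration; the cochlear nonlinearity is more complex than this, so the example shows only the principle that a nonlinearity places energy at the AM rate, and that a tone of matched amplitude and opposite phase removes it.

```python
import numpy as np

fs = 48000                 # sample rate in Hz (assumed)
dur = 1.0                  # 1 s of signal gives 1-Hz spectral resolution
fc, fm = 3000.0, 263.0     # carrier and modulation frequencies from the text
t = np.arange(int(fs * dur)) / fs

# 100% sinusoidal AM: carrier at fc, sidebands at fc - fm and fc + fm only.
am = (1 + np.cos(2 * np.pi * fm * t)) * np.cos(2 * np.pi * fc * t)

def level_at(signal, freq):
    """Amplitude of the spectral component of `signal` at `freq` (Hz)."""
    spectrum = np.abs(np.fft.rfft(signal)) / (len(signal) / 2)
    return spectrum[int(round(freq * dur))]

eps = 0.05                               # hypothetical strength of the quadratic term
distorted = am + eps * am ** 2           # weak quadratic distortion

# A tone at fm with matched amplitude and opposite phase cancels the distortion.
cancelled = distorted - eps * np.cos(2 * np.pi * fm * t)

print("component at fm, undistorted:", level_at(am, fm))
print("component at fm, distorted:  ", level_at(distorted, fm))
print("component at fm, cancelled:  ", level_at(cancelled, fm))
```

In this toy case the undistorted stimulus has no energy at 263 Hz, the distorted one has a component of amplitude `eps`, and adding the cancellation tone returns it to zero. A probe tone at 262 Hz added to `distorted` would beat with the 263-Hz component at 1 Hz.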
Are neurons tuned for periodicity?
In the study by Biebel and Langner (2002), the tuning of low-CF IC neurons for AM rates was remarkably similar to their tuning for low-frequency pure tones. Since tuning for CF is determined at the level of the cochlea and imposed on central auditory neurons by means of axonal connections (a labeled-line code), there is no intrinsic requirement that such neurons favor an AM rate similar to their CF. The majority of low-CF IC neurons do not show phase-locked responses to monaural inputs, and there is no evidence that IC neurons show any intrinsic temporal rate preference of inputs that would match their CF. As such, a rate code for periodicity extracted from high-frequency laminae in the IC does not require, nor does it have any apparent template on which to map, a preference for similar tuning to carrier and envelope modulations. A rate (i.e., discharge rate) map of AM preferences could just as well be arranged orthogonal to the tonotopic organization in the IC.
In those studies in primary auditory cortex where the relationship between CF tuning and tuning for periodicity was not as clear as in the IC (Schulze and Langner 1997, 1999), the authors suggested that this indicated a complex pattern of across-frequency integration. However, a simpler explanation for these data, but one that does not appear to have been accounted for in their studies, is the contribution of the head-related transfer function to the spectral shaping of the sound at the eardrum. Stimuli in these studies were presented under free-field listening conditions, in which the frequency-dependent gain function of the outer ear potentially alters sound levels at the eardrum by 20 dB, depending on the exact frequency and the location of the sound source relative to the head (May and Huang 1996). This would inevitably have consequences for the sound level of each frequency component that reaches the eardrum and could alter the apparent tuning of the neuron when presented with complex sounds compared with its tuning to simple sounds.
Comparison with previous studies: free-field stimulation and binaural sensitivity
There is no categorical way of determining whether responses recorded in this study and those reported previously in the IC (Biebel and Langner 2002) are derived from the same or a similar population of neurons, apart from the phenomenological observation that low-CF neurons respond to high-frequency AM tones. In this sense, this study concurs completely with the phenomenon observed by Biebel and Langner, but goes further by showing that it occurs in low-frequency, ITD-sensitive neurons that might be considered to be specialized for spatial, rather than spectral, processing. However, before valid comparisons can be made between this study and that of Biebel and Langner, or between their own study and psychophysical studies, it is important to understand the significant differences in methodology that potentially contribute to their data and their interpretation of them.
In particular, the method of sound stimulation in Biebel and Langner's study is less controlled for sound level at either ear compared with this study, in which all sounds were presented by placement of speakers within a few millimeters of the eardrum, and a calibrated probe tube was used to record the sound level at this point. Biebel and Langner reported responses to free-field stimulation, with a speaker positioned some 4 cm external to one ear, and sound levels measured 1 cm from the ear canal. In at least two important ways this constitutes an uncontrolled stimulus and does not allow for comparison with sound levels in this study or for comparison with sound levels used in the carefully controlled psychophysical experiments cited by them. First, the contribution of the head-related transfer function (HRTF) to sound levels at the eardrum is not considered in their study. This is despite the fact that the pinna boosts sound pressure at the eardrum in a frequency- and position-dependent manner, by 10 dB or more at 4 kHz in the chinchilla (Murphy and Davis 1998). Additionally, any complex tuning, with respect to AM rate, of the response to high-frequency AM tones is potentially confounded by the frequency-dependent HRTF. Related to the frequency dependence of the HRTF is that the relative levels of the individual spectral components, and thus the form and depth of the AM stimulus, are not well controlled in Biebel and Langner's study. Although speaker output was reported as flat to within 6 dB over the range tested, the HRTF is likely to alter this considerably, depending on the frequency of the carrier and sidebands. This potentially affects the modulated waveform arriving at the eardrum. For example, 100% AM depth occurs when the sidebands (in appropriate phase relationship to the carrier) are 6 dB lower than the carrier. A frequency-dependent HRTF in which one or both of the sidebands are relatively boosted or attenuated compared with the carrier could significantly alter the modulation waveform. This is also relevant to their observation that strong neural responses could be elicited by very low modulation depths, since the relative levels of the spectral components in the AM complex, and thus the AM depth, at the eardrum are potentially very different from those output by the speaker. It therefore also bears on the reported responsiveness of IC neurons to low modulation depths of high-frequency carriers, which was taken as evidence for a nondistortion-related phenomenon.
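The relationship between sideband level and AM depth noted above can be checked with a short calculation. The sketch below computes the envelope of a carrier of unit amplitude plus a lower and an upper sideband, and reports the depth as (max - min)/(max + min). The function `am_depth`, the sideband phases (those of sinusoidal AM), and the example of a 6-dB attenuation of one sideband are illustrative assumptions, not values taken from either study.

```python
import math
import numpy as np

def am_depth(lower_db, upper_db, n=4096):
    """Envelope modulation depth for a unit carrier plus two sidebands.

    lower_db, upper_db: sideband levels in dB relative to the carrier.
    Sideband phases are those of sinusoidal AM (assumed).
    """
    a_lower = 10 ** (lower_db / 20)
    a_upper = 10 ** (upper_db / 20)
    theta = np.linspace(0, 2 * np.pi, n, endpoint=False)  # one modulation cycle
    # Complex envelope: carrier phasor plus two counter-rotating sideband phasors.
    envelope = np.abs(1 + a_lower * np.exp(-1j * theta) + a_upper * np.exp(1j * theta))
    return (envelope.max() - envelope.min()) / (envelope.max() + envelope.min())

ref = 20 * math.log10(0.5)   # about -6.02 dB: each sideband at half the carrier amplitude

print("both sidebands 6 dB below carrier:", round(am_depth(ref, ref), 3))
print("upper sideband a further 6 dB down:", round(am_depth(ref, ref - 6.0), 3))
```

With both sidebands at half the carrier amplitude the depth is 100%. Attenuating a single sideband by a further 6 dB, a change well within the range attributed to the outer ear in the text, reduces the depth to roughly three-quarters and makes the envelope no longer purely sinusoidal.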
A second issue concerns the binaural nature of sound stimulation in Biebel and Langner's study, where all recordings were made in response to free-field stimulation. All signals were subject to an ITD created by the sound arriving at one ear earlier than the other ear (which was not blocked in their experiments) in both the low-frequency carriers and high-frequency envelopes of the modulated sounds. In addition, all high-frequency signals would have been subject to a frequency-dependent interaural level difference. The sensitivity to these interaural cues and their influence on responses were unknown and untested, whereas this study examined monaural and binaural influences separately. The inference from their study is that they studied monaural effects. However, it is only possible to state categorically that Biebel and Langner studied uncontrolled binaural influences and did not address monaural responses at all, since all recordings were made essentially with free-field stimulation. Such binaural stimulation could also account for the reported inhibitory influences of high-frequency AM signals, which they took to be inconsistent with the distortion hypothesis. Since stimulation of the ear ipsilateral to the IC is well documented to provide significant inhibitory drive to IC neurons, the source of inhibition may have been sound-evoked activity arising from the ipsilateral ear.
Contribution of cochlear-generated distortions to pitch processing
Several lines of evidence have been taken to indicate that distortions cannot account for all of the pitch percept of spectrally complex sounds. For example, they cannot explain the pitch shift that occurs when the fundamental frequency is changed but the spacing of the partials is held constant (Schouten et al. 1962), and the perception of the residue pitch is maintained even in the presence of low-pass masking noise designed to eliminate low-frequency spectral cues for the pitch of the missing fundamental. Although this is often taken as evidence against the role of cochlear distortions in the perception of the pitch of complex sounds, recent modeling studies suggest that even the pitch shift may be explicable in terms of the nonlinear dynamics of the cochlea itself (Cartwright et al. 1999). Further evidence for an important role for CTs in processing complex pitches was obtained by Pressnitzer and Patterson (2001) by examining the lower limit of melodic pitch (LLMP). When low-frequency masking noise was added to the stimulus, as is common to many psychophysical investigations of temporal pitch, the lowest pitched note that could contribute to a melody was reduced, and the (high) frequency region at which harmonics contributed to the experience of melodic pitch was increased. This suggests that cochlear-generated combination tones provide an important contribution to the pitch perception of complex sounds under natural listening conditions. In response to higher-frequency harmonic series at moderate intensities (55–65 dB SPL), these authors found a significant low-frequency distortion spectrum: approximately -10 to -15 dB relative to the level of the pure-tone components in the 11-component harmonic series in cosine phase. In this study, AM signals with just three spectral components appear to produce sufficient distortion to evoke a strong neural response. Thus, in the absence of the low-frequency masking noise usually provided in psychophysical studies examining purely temporal mechanisms of pitch, the contribution of cochlear-generated distortions to pitch processing could be substantial, independent of whether harmonics are resolved or not.
| GRANT |
|---|
| FOOTNOTES |
|---|
Address for reprint requests and other correspondence: D. McAlpine, Dept. of Physiology, Univ. College London, Gower St., London WC1E 6BT, UK (E-mail: d.mcalpine{at}ucl.ac.uk).
| REFERENCES |
|---|
Agmon-Snir H, Carr CE, and Rinzel J. The role of dendrites in auditory coincidence detection. Nature 393: 268–272, 1998.
Aitkin LM, Anderson DJ, and Brugge JF. Tonotopic organization and discharge characteristics of single neurons in nuclei of the lateral lemniscus of the cat. J Neurophysiol 33: 421–440, 1970.
Batra R and Fitzpatrick DC. Monaural and binaural processing in the ventral nucleus of the lateral lemniscus: a major source of inhibition to the inferior colliculus. Hear Res 168: 90–97, 2002.
Bernstein JG and Oxenham AJ. Pitch discrimination of diotic and dichotic tone complexes: harmonic resolvability or harmonic number? J Acoust Soc Am 113: 3323–3334, 2003.
Biebel UW and Langner G. Evidence for interactions across frequency channels in the inferior colliculus of awake chinchilla. Hear Res 169: 151–168, 2002.
Brugge JF, Anderson DJ, and Aitkin LM. Responses of neurons in the dorsal nucleus of the lateral lemniscus of cat to binaural tonal stimulation. J Neurophysiol 33: 441–458, 1970.
Cariani PA and Delgutte B. Neural correlates of the pitch of complex tones. I. Pitch and pitch salience. J Neurophysiol 76: 1698–1716, 1996a.
Cariani PA and Delgutte B. Neural correlates of the pitch of complex tones. II. Pitch shift, pitch ambiguity, phase invariance, pitch circularity, rate pitch, and the dominance region for pitch. J Neurophysiol 76: 1717–1734, 1996b.
Carlyon RP, van Wieringen A, Long CJ, Deeks JM, and Wouters J. Temporal pitch mechanisms in acoustic and electric hearing. J Acoust Soc Am 112: 621–633, 2002.
Cartwright JHE, Gonzalez DL, and Piro O. Nonlinear dynamics of the perceived pitch of complex sounds. Phys Rev Lett 82: 5389–5392, 1999.
de Cheveigne A. Cancellation model of pitch perception. J Acoust Soc Am 103: 1261–1271, 1998.