1Department of Speech and Hearing Science, Arizona State University, Tempe, Arizona; and 2Department of Psychology, Utah State University, Logan, Utah
Submitted 19 November 2004; accepted in final form 29 July 2005
| ABSTRACT |
|---|
| INTRODUCTION |
|---|
A simple stimulus for studying spectral segregation based on harmonicity is a complex tone in which one component in a harmonic series is shifted in frequency, or "mistuned" (Hartmann 1988). A component mistuned by 1–2% or more can be heard as a second sound source, perceived simultaneously with the original harmonic tone but having a different pitch (Hartmann et al. 1990; Moore et al. 1986). Smaller amounts of mistuning that do not lead to the perception of a second tone may still be detectable (Lee and Green 1994; Moore et al. 1985). The mistuned tone appears to engage the same neural mechanisms that underlie the processing of simultaneous competing sounds in more natural listening situations. Sinex et al. (2002b) described the responses of inferior colliculus (IC) neurons to mistuned tones. They varied stimulus parameters including the amount of mistuning and the harmonic number that was mistuned, and found that these stimulus manipulations could produce dramatic changes in the temporal discharge patterns of IC neurons. In contrast, comparable stimulus changes produced only modest quantitative changes in the discharge patterns of auditory nerve fibers (Sinex et al. 2003a). They concluded that a significant transformation in the processing of complex sounds occurred between the auditory nerve and the IC, and that information related to the presence of multiple sound sources might be conveyed by the temporal discharge patterns of IC neurons.
This study extends the findings of Sinex et al. (2002b), who focused on the representation of many different mistuned tones. In the present study, the representation of one particular tone was examined in detail, with the goal of determining how the same stimulus was processed by different neurons. Although the amount of mistuning was not systematically varied, effects of changing additional stimulus parameters including level and component phases were examined. Observations of responses to stimuli synthesized with different component phases and presented at different levels provided a control against the possibility that the distinctive temporally patterned responses observed previously were a fortuitous consequence of the use of sine-phase multicomponent tones. Effects of component phase may also provide evidence about mechanisms that underlie the representation of mistuned tones. Hypotheses about those potential mechanisms were tested by presenting the same stimuli to a simple computational model that simulated the responses of brain stem and IC neurons. The model explored the consequences of combining the responses of simulated peripheral and brain stem neurons in various ways, with the goal of identifying integrative strategies capable of generating discharge patterns like those typically exhibited by IC neurons. Portions of these results were presented as abstracts (Sinex 2004; Sinex and Li 2004; Sinex et al. 2003a).
| METHODS |
|---|
Responses were obtained from single neurons in the IC of the chinchilla. The procedures were similar to those described previously (Sinex et al. 2002b), although several significant changes were made. Different anesthetics and different recording electrodes were used. In addition, complex tones were synthesized with individual components in different phase relations, component amplitudes were normalized to be equal at the eardrum, and different earphones were used. None of these changes produced results that were inconsistent with those of Sinex et al. (2002b). All procedures were approved by the Institutional Animal Care and Use Committee at Arizona State University.
Animals were anesthetized by injection of a mixture of 36 mg/kg ketamine and 4 mg/kg xylazine. Supplemental injections of ketamine or ketamine-xylazine were given as required to maintain a surgical level of anesthesia. Animals were placed in a stereotaxic instrument (David Kopf Instruments) in a sound-attenuating booth (Acoustic Systems, Austin, TX). The skull was opened and a portion of nonauditory cerebral cortex directly above the IC on the right side was aspirated to expose the dorsal surface of the IC. Recording electrodes were placed above the IC at predetermined stereotaxic coordinates referenced to visual landmarks (Nuding et al. 1999), then advanced from outside the sound booth with a hydraulic microdrive (Trent-Wells, Coulterville, CA). The electrode trajectory passed through the central nucleus of the IC, in a parasagittal plane.
In early experiments, tungsten electrodes insulated with Parylene C were used. In those experiments, the locations of selected recording sites were marked with electrolytic lesions made by passing 5 µA of current through the electrode tip for 10 s, and their locations within the IC were confirmed histologically, using standard methods. In later experiments, recordings were obtained with carbon-fiber electrodes (Kation Scientific, Minneapolis, MN). These electrodes were directed toward the same stereotaxic coordinates as in earlier experiments, but recording sites were not marked or confirmed histologically. Most neurons recorded with either electrode type exhibited secure, short-latency responses to tone bursts, and had sharp tuning with characteristic frequencies (CFs) that increased with increasing depth. These properties are consistent with those reported in previous investigations of the central nucleus of the IC (Langner and Schreiner 1988; Merzenich and Reid 1974; Nuding et al. 1999). Based on these physiological properties and the histological observations, it is likely, but not certain, that the data presented here were obtained from neurons in the central nucleus.
Stimuli
The stimuli were complex tones consisting of eight sinusoidal components. Tones were 500 ms long, presented once per second at levels between 10 and 70 dB SPL per component. The amplitudes of individual components were adjusted during synthesis so that they would have equal SPL at the eardrum. The overall SPL was always 9 dB higher than the level of each individual component. In the harmonic tones, the components were the first eight harmonics of a 250-Hz fundamental. In the mistuned tones, the frequency of component 4 was increased by 12%, from 1.000 to 1.120 kHz. In the previous study (Sinex et al. 2002b), this stimulus often elicited responses with a distinctive modulated temporal pattern that was useful as a standard against which changes across stimulus conditions and neurons could be evaluated. That pattern is described in more detail below and is referred to in this report as the "stereotypical pattern."
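The 9-dB difference between overall and per-component level follows from summing the power of eight equal-level components at different frequencies. A minimal sketch of that arithmetic is given here; the function name is hypothetical and is not taken from the study.

```python
import math

def overall_level_db(component_level_db: float, n_components: int = 8) -> float:
    """Overall SPL of n equal-level components at different frequencies.

    Assumes the components add in power, so the level rises by 10*log10(n).
    """
    return component_level_db + 10.0 * math.log10(n_components)

# Eight components: the increment is 10*log10(8), about 9.03 dB.
increment_db = overall_level_db(0.0)

# A 12% upward shift of the 1.000-kHz component gives 1.120 kHz.
mistuned_khz = 1.000 * 1.12
```

For example, a tone presented at 40 dB SPL per component would have an overall level of about 49 dB SPL.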
The starting phase relation between individual components in the complex tones was also varied. In the "sine" condition, each individual component in the complex tone started in sine phase. In the "cosine" condition, each individual component started in cosine phase. In the "random" conditions, each component had a different, randomly chosen starting phase. As many as three different sets of randomly chosen phases were used. Varying component starting phase changes the waveform and amplitude envelope of the stimulus; however, in informal listening tests there were no audible differences between stimuli synthesized with different starting phases. Using a restricted set of possible starting phases allowed responses to a common stimulus to be compared across neurons. Using more than one set of phases, and more than one randomization, reduced the possibility that the observed responses reflected some idiosyncratic property of one set of phases.
The waveforms of the harmonic tones were perfectly periodic, with a flat envelope and a period of 4 ms (Fig. 1, A–C). Mistuning changed the envelope and also the fine structure of each waveform (Fig. 1, D–F). With 12% mistuning of component 4, the frequencies of the eight spectral components were integer multiples of 10 Hz, which became the actual fundamental frequency of the mistuned tone. The waveforms exhibited shallow modulation with a period of 100 ms, the reciprocal of 10 Hz. Mistuning did not change the 4-ms spacing of waveform peaks, although the amplitudes of successive peaks were no longer identical as they were in the harmonic tones (Fig. 1, A–C). Mistuning did not change the overall level of the tone.
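The stimulus properties described above (a 4-ms period for the harmonic tone, a 10-Hz fundamental and 100-ms period for the mistuned tone, and unchanged overall level) can be checked with a short synthesis sketch. The sampling rate `FS`, the function names, and the random-number seed are assumptions made for illustration; they do not describe the synthesis software used in the study.

```python
import math
from functools import reduce
import numpy as np

FS = 20000  # hypothetical sampling rate in Hz (not stated in the source)

def component_freqs(f0=250.0, n=8, mistuned=None, shift=0.12):
    """Frequencies of the first n harmonics; optionally mistune one component."""
    freqs = [f0 * k for k in range(1, n + 1)]
    if mistuned is not None:
        freqs[mistuned - 1] *= 1.0 + shift
    return freqs

def synthesize(freqs, phases, dur=0.5, fs=FS):
    """Sum of equal-amplitude sinusoids with the given starting phases."""
    t = np.arange(int(round(dur * fs))) / fs
    return sum(np.sin(2 * np.pi * f * t + p) for f, p in zip(freqs, phases))

def fundamental_hz(freqs):
    """Greatest common divisor of the component frequencies (rounded to Hz)."""
    return reduce(math.gcd, [int(round(f)) for f in freqs])

harmonic_freqs = component_freqs()
mistuned_freqs = component_freqs(mistuned=4)   # component 4: 1000 -> 1120 Hz

sine_phases = [0.0] * 8                        # "sine" condition
cosine_phases = [math.pi / 2] * 8              # "cosine" condition
random_phases = np.random.default_rng(0).uniform(0, 2 * math.pi, 8)  # one "random" set

harmonic = synthesize(harmonic_freqs, sine_phases)
mistuned = synthesize(mistuned_freqs, sine_phases)
```

With these definitions, `fundamental_hz(mistuned_freqs)` evaluates to 10, so the mistuned waveform repeats every 100 ms, whereas the harmonic waveform repeats every 4 ms.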
Stimulus generation, stimulus presentation, and data collection were controlled by computer. Waveforms were digitally synthesized, passed through digital-analog converters, programmable attenuators, and antialiasing filters (all from TDT, Alachua, FL), then delivered to a closed acoustic system incorporating ER2A insert earphones and an ER7 probe-tube microphone (Etymotic Research, Elk Grove Village, IL). The acoustic system was calibrated for each experiment. Tones were nearly always delivered monaurally to the contralateral ear. Binaural presentation was used as a last resort if neurons were unresponsive to monaural stimuli; one example of binaural responses is included in the RESULTS below.
When a neuron was isolated, estimates of CF and (if necessary) binaural sensitivity were obtained manually. After this initial characterization of the neuron, a detailed frequency-response map was obtained with an automated procedure that presented tones at multiple frequencies and levels (Nuding et al. 1999). These data were used to make a more precise estimate of CF and threshold at CF. In most cases the frequency-response map was remeasured in the presence of a fixed-level CF tone. The fixed tone elicited a consistent response that made it possible to estimate the frequency-SPL regions that provided inhibitory input to the neuron.
Responses to 500-ms tone bursts at CF were also obtained and used to classify each neuron according to the shape of its peristimulus time (PST) histogram and first-spike latency (Nuding et al. 1999). From the PST histogram obtained at 20–30 dB above threshold, the ratio of discharge rate in the first 100 ms to the rate during the final 400 ms was calculated. Sustained units responded throughout the CF tone, although the PST histogram often exhibited a peak at tone onset. First-spike latencies were usually <20 ms, and the ratio of the onset discharge rate to the steady-state discharge rate typically ranged between 1 and 5. Pauser units exhibited a short-latency transient response and a second sustained response, separated by a silent interval lasting several milliseconds; for pauser units, first-spike latencies were usually <20 ms, but the ratio of discharge rates was often <1. Some sustained neurons exhibited pauser responses at higher SPLs, so the distinction between these categories should be considered somewhat arbitrary.
Transient units exhibited spikes just after stimulus onset but were silent or nearly silent afterward. First-spike latencies were usually <20 ms, and discharge-rate ratios varied from 5 to infinite. Long-latency units exhibited sustained or transient discharge patterns, but their first-spike latencies were >30 ms. These units also tended to respond with lower discharge rates, and the possibility that at least some of these were located outside the central nucleus cannot be ruled out. Neurons could be assigned to the transient and long-latency categories with more confidence than was the case for the other categories.
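The discharge-rate ratio used in this classification can be sketched as follows. The function name and its arguments are hypothetical, and the sketch computes only the ratio; it does not attempt to reproduce the category boundaries, which the text describes as typical ranges rather than fixed criteria.

```python
import numpy as np

def onset_to_sustained_ratio(spike_times_ms, n_trials, onset_ms=100.0, dur_ms=500.0):
    """Ratio of discharge rate in the first 100 ms to the rate in the final 400 ms.

    spike_times_ms: spike times pooled across trials, relative to tone onset.
    Returns infinity when no spikes occur after the onset window.
    """
    spikes = np.asarray(spike_times_ms, dtype=float)
    onset_count = np.sum((spikes >= 0.0) & (spikes < onset_ms))
    late_count = np.sum((spikes >= onset_ms) & (spikes < dur_ms))
    if late_count == 0:
        return float("inf")  # the "infinite" ratio of a purely transient unit
    onset_rate = onset_count / (n_trials * onset_ms / 1000.0)
    late_rate = late_count / (n_trials * (dur_ms - onset_ms) / 1000.0)
    return float(onset_rate / late_rate)

# Two onset spikes and four later spikes in one trial: 20 spikes/s vs. 10 spikes/s.
example_ratio = onset_to_sustained_ratio([10, 20, 150, 300, 450, 490], n_trials=1)
```

In this example the ratio is 2, which falls in the range described as typical of sustained units.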
The data presented below are representative of the responses of 101 IC neurons. Of these, 95 neurons were studied with at least one pair of complex tones (a harmonic tone and a mistuned tone that was equivalent in every respect except for the frequency of the mistuned component). Forty-eight of those units were classified as "sustained," 18 units as pauser, 11 as transient, and 18 as long-latency.
Spikes were displayed as PST or cycle histograms with 1-ms bins. Cycle histograms were constructed from spikes occurring between 100 and 500 ms after tone onset, with a 5-ms correction for response latency. A correction was made to include as many spikes as possible in the analysis. A constant correction was used, rather than a unique correction for each neuron, so that time shifts illustrated in the figures below can always be attributed to properties of the stimuli or the neurons, rather than to the latency corrections. The length of the cycle histograms, 200 ms, included two cycles of the fundamental period of the mistuned tone. For some analyses, the discrete Fourier transforms (DFTs) of cycle histograms were calculated; for this calculation, the histograms were regenerated with 256 bins per 200 ms.
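A cycle histogram of the kind described above can be sketched as follows. The function name is hypothetical, and folding the latency-corrected spike times relative to stimulus onset is an assumption; the source does not state the reference point used for folding.

```python
import numpy as np

def cycle_histogram(spike_times_ms, cycle_ms=200.0, n_bins=200,
                    start_ms=100.0, stop_ms=500.0, latency_ms=5.0):
    """Fold spike times into a cycle histogram.

    A constant latency correction is subtracted, spikes falling 100-500 ms
    after (corrected) tone onset are kept, and times are folded modulo the
    cycle length. n_bins=200 gives 1-ms bins; n_bins=256 matches the
    histograms regenerated for the DFT.
    """
    t = np.asarray(spike_times_ms, dtype=float) - latency_ms
    t = t[(t >= start_ms) & (t < stop_ms)]
    folded = np.mod(t, cycle_ms)  # assumption: fold relative to stimulus onset
    counts, edges = np.histogram(folded, bins=n_bins, range=(0.0, cycle_ms))
    return counts, edges

# Spikes at 105.5 and 305.5 ms fold to the same bin; the other two fall outside the window.
counts, edges = cycle_histogram([50.0, 105.5, 305.5, 505.2])
```

Because the same 5-ms correction is applied to every neuron, any time shift between histograms produced this way reflects the stimulus or the neuron rather than the correction.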
Complex tones were presented in a quasi-random order intended to counterbalance harmonic and mistuned conditions, and the various phase conditions, as well as cover a range of SPLs. Typically, each complex tone was presented 100 times.
Computational model
The model incorporated three sequentially ordered stages of processing, which loosely correspond to the cochlear, brain stem, and midbrain levels of the auditory pathway. The model simplified or omitted many details of brain stem processing and it did not attempt to simulate membrane conductances, membrane voltages, or spike discharges, as has been done in more sophisticated biophysical models (Borisyuk et al. 2002; Cai et al. 1998a,b; Hewitt and Meddis 1994; Nelson and Carney 2004). Gammatone filters (Patterson et al. 1992), as implemented by Slaney (1994), were used to simulate the frequency selectivity of Stage 1 channels. Stage 2 channels represent neurons in nuclei of the lower auditory brain stem, including those that provide either excitatory or inhibitory input to the IC (Burger and Pollak 2001; Li and Kelly 1992; Oliver and Shneiderman 1991). Stage 2 channels derived their frequency selectivity directly from a single Stage 1 input; no lateral interactions between Stage 2 channels were simulated. However, the output of each Stage 2 channel was low-pass filtered at 400 Hz to extract the response envelope produced by a restricted frequency region of the stimulus. Stage 3 explicitly represented the IC. The key features of Stage 3 were that each channel received input from Stage 2 channels with different CFs, and that excitatory and also inhibitory inputs were combined. The Stage 2 inputs were weighted and summed; the CFs, weights, and number of these inputs are the parameters that were varied in this study. The summed response was low-pass filtered at 200 Hz, to simulate the decreased ability of IC neurons to follow AM, compared with neurons in the lower brain stem (Krishna and Semple 2000; Langner and Schreiner 1988; Rees and Møller 1983). Thus for a Stage 3 channel that received input from "N" Stage 2 frequency bands, the response at Stage 3 (before half-wave rectification and low-pass filtering) was
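$$R_3(t) = \sum_{i=1}^{N} w_i \, R_{2,i}(t)$$

where $R_{2,i}(t)$ is the output of the $i$th Stage 2 input channel and $w_i$ is the weight applied to that input.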
At each stage of the model, the output of a channel was an analog waveform; spikes or spike trains were not generated. Temporal details of these output waveforms can be compared with the temporal discharge patterns of actual neurons.
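The three stages described above can be sketched in a few lines of NumPy. This is an illustrative approximation under stated assumptions, not the authors' implementation: the sampling rate, the FIR gammatone approximation (the study used Slaney's implementation), the windowed-sinc low-pass filters, and the use of a negative weight to represent inhibition are all choices made for this sketch.

```python
import numpy as np

FS = 20000  # hypothetical sampling rate (Hz)

def gammatone_ir(cf, fs=FS, order=4, dur=0.05):
    """FIR approximation of a gammatone impulse response (ERB-scaled bandwidth)."""
    t = np.arange(int(dur * fs)) / fs
    erb = 24.7 * (4.37 * cf / 1000.0 + 1.0)
    b = 1.019 * erb
    ir = t ** (order - 1) * np.exp(-2 * np.pi * b * t) * np.cos(2 * np.pi * cf * t)
    return ir / np.sum(np.abs(ir))

def lowpass(x, cutoff, fs=FS, n_taps=401):
    """Zero-phase windowed-sinc low-pass filter (stand-in for the unspecified filter)."""
    n = np.arange(n_taps) - (n_taps - 1) / 2
    h = np.sinc(2 * cutoff / fs * n) * np.hamming(n_taps)
    h /= h.sum()
    return np.convolve(x, h, mode="same")

def stage1(x, cf):
    """Band-pass filter, then half-wave rectify."""
    return np.maximum(np.convolve(x, gammatone_ir(cf))[: len(x)], 0.0)

def stage2(x, cf):
    """Envelope of one Stage 1 channel, low-pass filtered at 400 Hz."""
    return lowpass(stage1(x, cf), 400.0)

def stage3(x, cfs, weights):
    """Weighted sum of Stage 2 channels, half-wave rectified, low-passed at 200 Hz."""
    summed = sum(w * stage2(x, cf) for cf, w in zip(cfs, weights))
    return lowpass(np.maximum(summed, 0.0), 200.0)

# Sine-phase mistuned tone: harmonics of 250 Hz with component 4 moved to 1120 Hz.
t = np.arange(int(0.5 * FS)) / FS
freqs = [250.0 * k for k in range(1, 9)]
freqs[3] = 1120.0
x = sum(np.sin(2 * np.pi * f * t) for f in freqs)

# One excitatory and one inhibitory input; inhibition is written as a negative weight.
out = stage3(x, cfs=[2089.0, 1242.0], weights=[1.0, -1.15])
```

As in the model itself, each stage produces an analog waveform rather than spikes, so `out` can be compared only with the shape of a cycle histogram, not with spike counts.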
| RESULTS |
|---|
The stereotypical discharge pattern described by Sinex et al. (2002b) was observed in response to mistuned tones in many, but not all, IC neurons. The pattern was most likely to occur, and to be most distinctive, in neurons with CFs within a broad range between about 0.25 and 3 kHz that exhibited short-latency sustained or pauser responses to pure tones.
Responses of one representative sustained neuron to simple and complex tones are shown in Fig. 2. The neuron's sustained response to CF tones included two to three synchronized spikes at response onset, accounting for the peak in the PST histogram in Fig. 2A; later spikes occurred with more variable latencies. Except for the onset peak, the PST histogram exhibited no stimulus-related temporal pattern. The response to the harmonic complex tone (Fig. 2B) was similar, except that the discharge rate was lower. As was the case for the pure-tone stimulus, no obvious temporal pattern could be seen during the sustained response. However, synchrony indices (SIs; Johnson 1980) obtained by Fourier analysis revealed that weak but statistically significant discharge synchrony did occur at 250 Hz (SI = 0.19, P < 0.01, Rayleigh statistic; Rhode 1976). The periodicity of the response most likely reflected the envelope frequency of the harmonic tone. In contrast, the mistuned tone elicited a larger sustained response with a distinctive modulated temporal pattern, which was clearly different from the response to the harmonic tone (Fig. 2C). After the onset response, the discharge pattern was characterized by peaks separated by about 8 ms, and slow modulation with a period of 100 ms. This is the pattern described in detail in the preliminary report (Sinex et al. 2002b) and referred to in the METHODS as the "stereotypical pattern" against which changes associated with stimulus parameters or characteristics of neurons will be evaluated. As noted previously, this discharge pattern bore no simple relation to the stimulus waveform, in which peaks occurred every 4 ms. Although the waveforms of mistuned stimuli were modulated with the same 100-ms period, the depth of modulation was much greater in this response than it was in the stimulus. Fourier analysis of this discharge pattern indicated that the response included large components synchronized to several frequencies of interest. In this example, the SIs were 0.19 at 10 Hz, 0.63 at 120 Hz, 0.38 at 130 Hz, and 0.39 at 250 Hz; each value was statistically significant (P < 0.01, Rayleigh statistic). The pronounced 10-Hz modulation most likely reflected beating between the two larger response components at 120 and 130 Hz (the term "beating" as used here refers to an interaction between phase-locked responses, not beating between acoustic stimulus components). The origin of response components at those particular frequencies is discussed in the following text and in Sinex et al. (2002b). By contrast, in the response to the harmonic tone, the SIs at the same four frequencies were lower, and as noted only the component at 250 Hz reached statistical significance.
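A synchrony index at a given frequency can be computed directly from spike times as the normalized magnitude of the Fourier component at that frequency. The sketch below assumes that the SI is equivalent to vector strength and uses the common Rayleigh statistic 2nR²; the function names are hypothetical, and the spike train is synthetic rather than recorded data.

```python
import numpy as np

def synchrony_index(spike_times_s, freq_hz):
    """Magnitude of the mean unit phasor of the spike times at freq_hz (0 to 1)."""
    t = np.asarray(spike_times_s, dtype=float)
    phasors = np.exp(-2j * np.pi * freq_hz * t)
    return float(np.abs(phasors.mean()))

def rayleigh_statistic(si, n_spikes):
    """Rayleigh statistic 2*n*R^2; larger values indicate more reliable synchrony."""
    return 2.0 * n_spikes * si ** 2

# Synthetic spike train locked perfectly to 250 Hz (one spike every 4 ms for 1 s).
locked = np.arange(250) * 0.004
si_250 = synchrony_index(locked, 250.0)  # close to 1
si_130 = synchrony_index(locked, 130.0)  # close to 0
```

Under the usual large-sample approximation, the Rayleigh statistic is compared with a chi-square distribution with 2 degrees of freedom, so values above about 9.2 correspond to P < 0.01.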
< 0.01). Unlike the sustained neuron, there was only a small peak at the onset of the response to the harmonic tone. The duration of the pause was reduced, relative to the response to the CF tone. In response to the mistuned tone, the temporal discharge pattern of this pauser neuron exhibited some of the same features, although the pattern was less apparent than that for the sustained neuron shown in Fig. 2. As in the data shown in Fig. 2, statistically significant response components occurred at 10, 120, 130, and 250 Hz; however, the SIs were lower: 0.19, 0.25, 0.35, and 0.22, respectively.
Temporal discharge patterns that resembled the stereotypical pattern were observed in neurons with a range of CFs. Each panel in Fig. 7 presents the cycle histogram of responses from a different neuron to the same mistuned tone. The first 5 neurons (Fig. 7, A–E) were recorded from one animal and their CFs covered a range of 3.4 octaves, from 0.25 to 2.7 kHz. A variant of the stereotypical pattern was observed in each example. Consistent features of the response included the presence of peaks separated by 8 ms and a response envelope modulated with a period of 100 ms. The figure emphasizes the similarity of the discharge patterns across a broad range of CFs. The responses of neurons with CFs falling within a narrow range, sampled from different animals, would also be similar to one another. That is, there was no systematic dependency of the discharge pattern on CF.
The stereotypical discharge pattern differed across neurons in one other way. In some responses, the response fine structure at envelope minima differed from the fine structure at the envelope maximum. In Fig. 7, this can be seen most clearly in the discharge pattern of the neuron with the lowest CF, 0.25 kHz (Fig. 7A). At envelope maxima, peaks in the PST histogram were separated by about 8 ms, as noted. At the envelope valley, the spacing of histogram peaks changed to 4 ms. The 4-ms spacing obviously corresponds to the period of f0, 1/250 Hz. For this low-CF neuron, that spacing could arise directly as a phase-locked response to the fundamental component of the stimulus. Similar patterns were observed in other higher-CF neurons not illustrated in this figure; in those cases, 4-ms periodicity would more likely arise from beating between adjacent tuned harmonics of the stimulus.
Figure 7F presents the responses of one additional neuron to the mistuned tone. This response is noteworthy because of the neuron's high CF, 5.9 kHz. Although the discharge rate was extremely low, the stereotypical pattern was unmistakable. Neurons with CFs >5.9 kHz were not studied, so it is not possible to say whether this CF represents the upper limit of the range over which the stereotypical pattern can be observed.
Effect of stimulus SPL
In general, in neurons that exhibited the stereotypical pattern, the pattern was preserved with changes in stimulus level. Exceptions occurred at levels near threshold, where responses typically exhibited no pattern, or no pattern that could be detected in the absence of large numbers of spikes. Examples are shown in Fig. 8, which presents cycle histograms obtained from one representative neuron at levels from 20 to 60 dB SPL per component. The cycle histograms were similar but not identical. The previously mentioned change in periodicity at the minimum of the response envelope can also be seen, especially at 40 dB SPL. However, the most obvious change was a shift in the phase of the response envelope, similar to the shifts observed across CFs in Fig. 7. The envelope shift could occur with little change in the underlying temporal discharge pattern, as illustrated in Fig. 9, for which the data shown in Fig. 8, B and E were reanalyzed and plotted on the same axes. For the reanalysis, the beginning of the analysis window for the histogram plotted as a dashed line was delayed by 35.2 ms. A cycle length of 100 ms was also used, in part to accommodate the change in analysis window and in part to allow the fine structure of the histogram to be seen more clearly in Fig. 9. The change in analysis times was derived from the difference in response envelope phase (REP) for the two cycle histograms. REP was defined as the phase of the 10-Hz component of the DFT of the cycle histogram. REP was calculated for each histogram, then the phase difference, 2.21 radians at 10 Hz, was converted into a delay in ms. The figure indicates that except for the time shift, the responses obtained at 30 and 60 dB SPL were quite similar, especially for the larger histogram peaks. Across all the levels shown in Fig. 8, the largest envelope time shift was 48 ms, approaching one-half of the modulation cycle (one-half cycle is the maximum possible shift if the direction is ignored). 
Shifts in REP of this magnitude were observed in many other units.
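The conversion from a difference in response envelope phase to a time shift can be sketched as follows. The function names are hypothetical; the sketch assumes that REP is read from the DFT bin corresponding to 10 Hz, which is bin 2 for a 200-ms cycle histogram.

```python
import numpy as np

def response_envelope_phase(cycle_hist, cycle_ms=200.0, mod_hz=10.0):
    """Phase (radians) of the modulation-frequency component of a cycle histogram."""
    spectrum = np.fft.fft(np.asarray(cycle_hist, dtype=float))
    k = int(round(mod_hz * cycle_ms / 1000.0))  # DFT bin of the modulation frequency
    return float(np.angle(spectrum[k]))

def phase_to_delay_ms(phase_rad, mod_hz=10.0):
    """Convert a phase difference at mod_hz into a time shift in ms."""
    return phase_rad / (2 * np.pi) * 1000.0 / mod_hz

# The 2.21-radian difference at 10 Hz corresponds to roughly 35.2 ms.
delay_ms = phase_to_delay_ms(2.21)

# A synthetic 256-bin histogram whose 10-Hz envelope lags by 1 radian.
n = np.arange(256)
synthetic_hist = 1.0 + np.cos(2 * np.pi * 2 * n / 256 - 1.0)
rep = response_envelope_phase(synthetic_hist)  # about -1.0 rad
```

Because one modulation cycle lasts 100 ms, a phase difference of π radians corresponds to the maximum possible shift of 50 ms when direction is ignored.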
In the previous study (Sinex et al. 2002b), each individual sine wave component in the complex tones was synthesized in sine phase. In the present study, the phases of individual components in the complex tones were varied, in part to determine whether the distinctive temporal pattern observed in many IC neurons occurred fortuitously as a result of the use of sine-phase stimuli. Responses of one neuron to mistuned tones synthesized with various component starting phases are shown in Fig. 10. The neuron exhibited the stereotypical temporal pattern for some but not all stimulus-phase conditions. Details of the discharge patterns varied with component phase, but the most obvious change was a shift in REP. The magnitude of the shift was comparable to the shifts seen across CF or across SPL in previous figures. The changes were large, but not obviously systematic or predictable, either within or across neurons. Varying the phases of stimulus components is likely to produce changes in the relative response latencies of neurons, but the changes cannot be larger than the period of the stimulus component(s) that elicit the responses, in the range 0.5 to 4 ms for the component frequencies used here. The way in which these small latency shifts could produce such large envelope changes is considered further in "Output of the computational model compared with IC discharge patterns" below.
The emphasis so far has been on changes in the temporal discharge pattern produced by mistuning. Mistuned tones also tended to evoke higher average discharge rates from IC neurons, compared with harmonic complex tones. An example of a large rate increase is shown in Fig. 11. The neuron's response to pure tones at CF adapted over the first 100–200 ms of the 500-ms stimulus, and although a few sustained spikes did occur, the neuron met the definition for the transient category (Fig. 11A). The average response to a harmonic complex tone was larger, 41 spikes/s compared with 15 spikes/s for the CF tone, and discharge synchrony with the periodicity of f0 was apparent in the 4-ms spacing of peaks in the histogram (Fig. 11B). As in the response to the pure tone, adaptation occurred during the first 100–200 ms of the 500-ms tone. In contrast, the mistuned tone elicited a discharge rate of 95 spikes/s, an increase of more than a factor of 2 over the response to the harmonic tone and a factor of 6 over the response to a CF tone presented at the same SPL as the individual components of the complex tones (Fig. 11C). The rate increase was not simply a result of the higher overall SPL of the mistuned tone; the mistuned tone presented at 40 dB SPL per component, an overall level 1 dB below that of the pure tone that elicited the response in Fig. 11A, elicited a rate of 49 spikes/s (Fig. 11D). Another effect of mistuning was that the magnitude of the response to the mistuned tone showed no sign of adapting over time. In conjunction with the rate increases, clear examples of the stereotypical temporal pattern also emerged.
10 spikes/s. Mistuning led to a rate decrease of
10 spikes/s in only 42 cases (9%). Although the increases produced by mistuning were small, they were statistically significant (Wilcoxon signed-rank test, n = 480 pairs, P < 0.0001).
As noted, tones were nearly always presented monaurally to the contralateral ear, but exceptions were made for neurons that responded poorly to monaural stimuli but vigorously to binaural tones. Binaural complex tones were presented to five neurons, and responses were obtained for a total of 24 phase and SPL conditions. Although only a few examples were collected, there was no indication in those data that responses elicited with binaural stimulus presentation differed in any significant way from responses to monaural contralateral tones. One representative binaural discharge pattern is presented in Fig. 14. Figure 14, A and C (top row) presents cycle histograms of the responses obtained with binaural stimulation. The temporal discharge pattern obtained in response to the harmonic tone exhibited particularly strong synchrony at 250 Hz; because of the neuron's low CF, this pattern could reflect synchrony to the 250-Hz component in the stimulus or to the envelope. The response to the mistuned tone exhibited the features described previously for contralateral stimulation: the response envelope modulated with a 100-ms period, and fine structure consisting largely of peaks separated by 8 ms. As in some previous examples, this neuron also exhibited the temporal pattern associated with f0 during minima in the response envelope. Responses to monaural contralateral tones are illustrated in Fig. 14, B and D. These tones generated only a few spikes, and no temporal pattern could possibly be detected.
As noted, the same stimuli were presented to a computational model incorporating three sequential processing stages. Figure 15 illustrates the outputs of channels from all three stages of the model in response to the sine-phase mistuned-tone stimulus. This simulation included channels with two CFs, 1.242 and 2.089 kHz. The higher CF closely matched the CF of the IC neuron whose discharge pattern is included in D of the figure. At Stage 1 (Fig. 15A), the output of each channel was simply a band-pass-filtered, half-wave-rectified version of the stimulus waveform. Neither channel was able to resolve individual components of the spectrum of the complex tone, resulting in a response with a modulated envelope whose frequency was determined by the spacing between the unresolved components. For the channel with the lower CF, the largest unresolved components were at 1.120 kHz (the mistuned component) and 1.250 kHz (the next-higher unmodified harmonic), and the response waveform had a beat or envelope frequency of 130 Hz. Response modulation with a period of 100 ms can also be seen, but that modulation was quite shallow. For the higher-CF channel, the unresolved components were at 1.750 and 2.000 kHz, resulting in an envelope frequency of 250 Hz. Each Stage 1 output also exhibited fine structure determined by the frequencies passed by the gammatone filter; individual cycles of the fine structure were too short to be visible in the figure. The modulation and fine structure observed in Stage 1 outputs generally reproduced the characteristics of auditory nerve fiber responses to similar mistuned tones reported by Sinex et al. (2003a).
At Stage 3 (Fig. 15C), the outputs originating in the two separate Stage 2 channels were summed. For this example, the channel with CF = 2.089 kHz provided excitatory input and the channel with CF = 1.242 kHz provided inhibition. The weight of the inhibitory input was set to 1.15, a value chosen empirically to produce a waveform whose shape closely approximated the discharge pattern of the neuron shown in D. The match between the discharge pattern of the IC neuron and the output of model Stage 3 was quite good, even though none of the waveforms generated at Stages 1 or 2 bore much resemblance to the neuron's response.
Stage 3 outputs that resembled the stereotypical discharge patterns of IC neurons to mistuned tones could be achieved with many different parameter sets. The examples shown in Fig. 16 were all produced by integrating across two channels, one of excitation and one of inhibition. In Fig. 16, A–C, the excitatory CF was held constant at 1.242 kHz. The waveform of Stage 2 output for that CF was dominated by a 130-Hz envelope, as was shown in Fig. 15B. The CF of the inhibitory channel was varied, as was the weight of the inhibitory input. Similar but not identical Stage 3 outputs could be generated with inhibitory inputs originating in different frequency regions and, in each case, the stereotypical modulated response was obtained. The similarity in Stage 3 outputs reflected the fact that the waveforms of these Stage 2 outputs were not strongly affected by the variation in inhibitory CF because in each case the response envelope was produced by beating between adjacent tuned harmonics. Although the particular harmonics varied with CF, the frequency difference was always 250 Hz, producing a response envelope modulated with 4-ms periodicity.
One other difference across the examples in Fig. 16 was that the fine structure of the Stage 3 waveform could change during envelope minima. In Fig. 16C, the spacing between waveform peaks changed from about 8 ms around the maximum of the envelope, to about 4 ms at the envelope minimum. In Fig. 16, A and B, the periodicity did not change. Both of these patterns were observed in the responses of IC neurons (e.g., Fig. 7).
In the examples shown in Fig. 16, D–F, the CF of inhibition was fixed and the CF of excitation was varied; the weight of inhibition was also allowed to vary to achieve a match to the stereotypical pattern. As before, similar output waveforms could be obtained for each pair of input channels and, as before, the major effect of varying the excitatory CF was a shift in REP.
The effects of varying temporal stimulus parameters are examined in Fig. 17. The phases of individual components in the mistuned tone were varied for the simulations shown in Fig. 17, A–C. Parameters of the model were held constant across the three stimulus conditions in the figure. Varying the component phase had little effect on the fine structure of the Stage 3 response. Model responses to sine- and cosine-phase stimuli were similar, except for a small difference in overall amplitude. For the random-phase condition, a shift in REP of nearly half a modulation cycle was observed, analogous to the effects of component phase seen in the responses of IC neurons (Fig. 10).
| DISCUSSION |
|---|
|
|
|---|
As noted above and in Sinex et al. (2002b), a 130-Hz input could easily be produced by beating between responses synchronized to the mistuned fourth harmonic and the fifth harmonic in the stimulus used here. That is, a hypothetical neuron receiving two excitatory inputs, one synchronized to 1.120 kHz and another synchronized to 1.250 kHz, would be more likely to produce spikes when the two inputs were in phase, and less likely to produce spikes when the two inputs were out of phase, each phase relation occurring 130 times/s. In the same way, beating between any two exact harmonics would produce a response synchronized to f0, in this case 250 Hz. Second-order beats, between inputs locked to 130 and 250 Hz, would occur at 120 Hz. The computational model was used to evaluate hypotheses about the kind of mechanisms that could produce and integrate inputs at those two frequencies. These are considered further in Integrative mechanisms that produce the stereotypical discharge pattern below.
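The beat frequencies quoted above follow from the sum-to-product identity for two sinusoids. The derivation below is a textbook simplification (equal amplitudes, zero starting phases) and is not meant as a description of the actual inputs to IC neurons:

```latex
\begin{align}
\cos(2\pi f_1 t) + \cos(2\pi f_2 t)
  &= 2\cos\bigl(\pi (f_2 - f_1)\,t\bigr)\,\cos\bigl(\pi (f_1 + f_2)\,t\bigr), \\
\text{envelope: } \bigl|\,2\cos\bigl(\pi (f_2 - f_1)\,t\bigr)\bigr|
  &\quad\text{repeats at the rate } f_2 - f_1 . \\
f_1 = 1120~\text{Hz},\ f_2 = 1250~\text{Hz}
  &\;\Rightarrow\; f_2 - f_1 = 130~\text{Hz}, \\
f_1 = n f_0,\ f_2 = (n+1) f_0,\ f_0 = 250~\text{Hz}
  &\;\Rightarrow\; f_2 - f_1 = 250~\text{Hz}, \\
\text{second-order beat: } 250~\text{Hz} - 130~\text{Hz}
  &= 120~\text{Hz}.
\end{align}
```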
Responses across neurons
With modest variation, the stereotypical pattern was observed in neurons with different pure-tone discharge patterns and with CFs as low as 0.25 kHz and as high as 5.9 kHz. Neurons that responded to pure tones with sustained or pauser discharge patterns were most likely to exhibit the stereotypical discharge pattern. In general, sustained neurons exhibited the most highly modulated patterns. Exceptions were observed in neurons with long latencies, even if they responded in a sustained manner to other tones. Neurons with pauser discharge patterns generally exhibited a recognizable temporal pattern in response to mistuned tones. The responses of pauser neurons exhibited the 8-ms periodicity that was part of the stereotypical pattern, but they tended to exhibit less-prominent slow modulation than the responses of sustained neurons. Neurons that exhibited transient responses to pure tones generally responded poorly to complex tones, although there were exceptions (e.g., Fig. 11). In addition, the temporal discharge patterns of these neurons to mistuned tones included the 120- and 130-Hz components that distinguished them from responses to harmonic tones in fewer than half of the tested conditions (Fig. 6C). Sinex et al. (2002a) previously reported that transient neurons in the IC of the chinchilla responded poorly to sinusoidally amplitude modulated (SAM) tones. As noted above, the stereotypical pattern is interpreted to be a response that results from integration of inputs slowly modulated at two different frequencies. Responses to harmonic tones also appeared to reflect beating inputs; in that case, the beats occurred at a single frequency, equal to f0. The present results are consistent with the results of Sinex et al. (2002a), which suggest that harmonic and mistuned tones should not be effective stimuli for transient neurons.
Mistuned tones would not have evoked such similar discharge patterns from neurons with widely varying CFs if each neuron were processing only a restricted region of the spectrum, which is what the typical narrow excitatory tuning of IC neurons (and auditory neurons in general) implies. Sharply tuned neurons may receive broadband inhibition (LeBeau et al. 2001; Li et al. 2002) and, for many neurons in this study, pure tones an octave or more from CF were capable of inhibiting responses at CF. In addition, Li et al. (2002) reported that inhibition originating at frequencies remote from CF preserves the temporal pattern of the eliciting stimulus. Patterned inhibitory input appears to be critically important for producing the distinctive modulated discharge patterns observed in the present study, and the results presented here, showing similar discharge patterns evoked by the same complex stimulus across a range of CFs, may suggest an important functional role for that inhibition. This unusual result can be accounted for if it is assumed that individual IC neurons receive excitatory input that reflects two or more components of the complex tones, and patterned inhibitory input that reflects two or more components from a different part of the spectrum. This view is explained in greater detail in Integrative mechanisms that produce the stereotypical discharge pattern.
Shifts in response envelope phase
Similar discharge patterns could be elicited over at least a 40-dB range of levels, and by stimuli in which the phases of individual stimulus components were varied. In each case, the most obvious difference in discharge patterns was a shift in the phase of the modulated response envelope. Shifts of nearly one-half a modulation cycle, approximately 50 ms in time, were common. These shifts accompanied stimulus changes that had little or no effect on the ability of listeners to identify the mistuned component as a separate sound. For example, Hartmann (2004) reported that, although the detectability of a mistuned component was a nonmonotonic function of level, performance was relatively consistent within the range of levels most often used in the present study. We interpret this to mean that envelope phase is not useful for conveying information about the stimulus. However, it may be useful as an indicator of the integrative mechanisms that produce the observed discharge patterns.
It is likely that large shifts in the response envelope originate in smaller latency or phase shifts in the responses of neurons that provide direct or indirect input to the IC. Support for this possibility was obtained from simulated responses, which confirmed that fixed time delays on the order of 1–2 ms added to one Stage 1 input produced much larger shifts in Stage 3 REP. For pure tones, the first-spike latency of many auditory neurons decreases with increases in SPL (auditory nerve: Heil and Irvine 1997; Liberman 1978; Siegel et al. 1982; cochlear nucleus: Kitzes et al. 1978; Winter and Palmer 1995; Young et al. 1988; IC: Nuding et al. 1999; Semple and Kitzes 1985). The latencies of later-occurring spikes are usually not discussed, and in the absence of sustained phase-locked discharges cannot be related to the stimulus as easily as the first spike can. The phases of auditory-nerve fiber discharges to CF tones are relatively stable with changes in level (Kiang 1965; Rose et al. 1967). However, for tones away from CF, systematic phase changes occur with level (Anderson et al. 1971) and, for brain stem neurons that do not exhibit pronounced phase-locking, it is at least plausible to assume that a shift in first-spike latency will be reflected in the timing of later spikes. Thus a given change in stimulus SPL can easily produce differential changes in the response latencies of neurons providing input to a particular IC neuron; recruitment of new neurons into the responding population would also amount to a change in input phases. In either case, a shift in REP would occur. Similarly, a shift in stimulus phase would also shift the latencies of synchronized spikes; for example, a shift of one-half cycle at 1 kHz would be expected to produce latency shifts of, on the average, 0.5 ms. These latency changes, of a few milliseconds or less, could produce shifts on the order of 50 ms in the envelope of the response to the mistuned tone, if that envelope was produced by beating between phase-locked inputs as is suggested here. The effect on the IC response envelope would be essentially the same, whether the shifts arose from changes in the phases of stimulus components or changes in stimulus SPL, as was observed in the data.
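The way a sub-millisecond latency change can turn into a much larger envelope shift can be illustrated with simple arithmetic. The sketch below treats the response envelope as a single beat between phase-locked inputs, which is a simplification of the mechanism proposed here; the function name is hypothetical, and the 100-ms envelope period (10 Hz) is an assumption inferred from the half-cycle shifts of about 50 ms described above.

```python
def envelope_shift_s(input_hz, delay_s, envelope_hz):
    """Time shift of a beat envelope when one phase-locked input is delayed.

    Delaying an input at input_hz by delay_s changes its phase by
    input_hz * delay_s cycles. The beat envelope shifts by the same
    fraction of a cycle, and one envelope cycle lasts 1 / envelope_hz.
    """
    phase_cycles = (input_hz * delay_s) % 1.0
    return phase_cycles / envelope_hz

# One-half cycle at 1 kHz is a 0.5-ms latency shift; for an envelope with a
# 100-ms period (assumed), the same half cycle corresponds to about 50 ms.
slow_shift = envelope_shift_s(1000.0, 0.0005, 10.0)

# The same reasoning for a 130-Hz beat with the 1.120-kHz input delayed 0.25 ms.
beat_shift = envelope_shift_s(1120.0, 0.00025, 130.0)
```

Because the phase shift is preserved in cycles rather than in seconds, the size of the shift in time grows in proportion to the ratio of the input frequency to the envelope frequency.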
Response fine structure
Most of the discussion has focused on the envelope of the responses of IC neurons to mistuned tones. This is because the existence of the response envelope and the way stimulus manipulations change the response envelope appear to provide clues to the integrative mechanisms that affect the representation of mistuned tones. However, as noted in the previous section, because the phase of this envelope was not preserved across neurons or stimulus manipulations, it may not be particularly useful for conveying information about the stimulus.
Harmonic and mistuned tones also elicited a stereotypical fine structure from IC neurons, which may be even more important for conveying information about harmonic structure that can be used for perceptual segregation. For harmonic tones, response fine structure, if there was any, was consistent; the periodicity matched the 4-ms period of the 250-Hz fundamental of the complex tone, as can be seen in Figs. 5C, 11B, and 14A (although the latter is an example of a binaural response, it is representative of other neurons' monaural responses to the same stimulus, and the periodicity is easier to see in the cycle histogram). For mistuned tones, the fine structure was dominated by the periodicity produced by interactions between the mistuned component and the next-higher harmonic, although response components at 250 Hz remained. For the tone used here, the stimulus components that accounted for the change in temporal pattern were at 1.120 and 1.250 kHz, resulting in an envelope with a frequency of 130 Hz. Response components at this difference frequency largely accounted for the discharge-rate peaks that occurred approximately every 8 ms. As noted above, this component and another at 120 Hz dominated the responses of IC neurons with CFs near the actual stimulus components that produced the beats. What is especially interesting is that the same components also dominated the responses of IC neurons with other CFs.
It is tempting to conclude that the appearance of additional components in the temporal discharge pattern somehow signals the presence of a stimulus component that is not part of the original harmonic series. Although mistuning led to new response components in a widely distributed population of neurons, it did not eliminate responses whose temporal pattern reflected the fundamental frequency of the original harmonic tone. Response components at 250 Hz persisted, as indicated by DFT analysis, and as noted in Figs. 7A, 8C, and 14C, responses to mistuned tones could exhibit the fine structure associated with the mistuned tones during envelope maxima and the fine structure associated with the harmonic tone during envelope minima. That alternating pattern raises the possibility that information about two simultaneous sound sources could be conveyed by individual neurons by a form of multiplexing.
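The appearance of a 130-Hz response component can be reproduced with a toy calculation. The sketch below sums two sinusoids at 1.120 and 1.250 kHz, half-wave rectifies the sum as a crude stand-in for spike generation (an assumption, not the analysis used in this study), and evaluates single DFT components at 130 and 250 Hz. The sampling rate and the helper name `dft_magnitude` are hypothetical.

```python
import math

def dft_magnitude(samples, fs, freq_hz):
    """Magnitude of one DFT component, normalized by the number of samples."""
    re = im = 0.0
    for n, x in enumerate(samples):
        angle = 2.0 * math.pi * freq_hz * n / fs
        re += x * math.cos(angle)
        im -= x * math.sin(angle)
    return math.hypot(re, im) / len(samples)

fs = 50_000              # sampling rate in Hz (assumed)
n_samples = fs // 10     # 100 ms: a whole number of cycles of both components
summed = [
    math.cos(2.0 * math.pi * 1120.0 * n / fs)
    + math.cos(2.0 * math.pi * 1250.0 * n / fs)
    for n in range(n_samples)
]
rectified = [max(0.0, x) for x in summed]   # half-wave rectification

mag_130 = dft_magnitude(rectified, fs, 130.0)
mag_250 = dft_magnitude(rectified, fs, 250.0)
```

With only these two components as input, the rectified waveform carries a strong component at the 130-Hz difference frequency and essentially none at 250 Hz; in this toy setting a 250-Hz component would have to come from pairs of exact harmonics, which are omitted here.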
Integrative mechanisms that produce the stereotypical discharge pattern
As noted in several places above, the working assumption is that the stereotypical pattern produced by mistuning of the tone used in this study reflects the interaction between spike trains synchronized to beat frequencies produced by adjacent spectral components in the stimulus. The assumption is usually stated as if the interaction occurs locally at the level of the IC, but of course it could occur at any level (or at more than one level) between the auditory nerve and the IC. As part of this project, a computational model that incorporates this view of integrative processing was developed and used to evaluate these assumptions in a more quantitative way. For the mistuned tone used here, the most important beat frequencies are 130 Hz for model channels with CFs near the mistuned component and 250 Hz in all other CF regions. The output of the IC stage reflects the difference between the envelope frequencies present in the second stage. An additional key feature of the model is that IC neurons are assumed to receive inhibitory inputs from input neurons tuned to a broad range of frequencies. With suitable parameter values, the model successfully accounted for the temporal discharge patterns observed in IC neurons, for the mistuned tones used here, and for other mistuned stimuli used in the previous study (Sinex et al. 2002b). The success of the model suggests that the model's components might have counterparts in the circuitry and integrative processing actually occurring in the lower brain stem and IC. The model makes predictions about that processing that can be tested directly in future experiments.
| GRANTS |
|---|
|
|
|---|
| ACKNOWLEDGMENTS |
|---|
|
|
|---|
Present addresses: H.-Z. Li, University of Washington, Radiation Oncology, Box 356069, Seattle, WA 98195-6069; D. S. Velenovsky, Arizona Health Sciences Center, Department of Cell Biology and Anatomy, 1501 N. Campbell Ave., Tucson, AZ 85724.
| FOOTNOTES |
|---|
Address for reprint requests and other correspondence: D. G. Sinex, Utah State University, Department of Psychology, 2810 Old Main Hill, Logan, UT 84322-2810 (E-mail: don.sinex{at}usu.edu)
| REFERENCES |
|---|
|
|
|---|
Borisyuk A, Semple MN, and Rinzel J. Adaptation and inhibition underlie responses to time-varying interaural phase cues in a model of inferior colliculus neurons. J Neurophysiol 88: 2134–2146, 2002.
Bregman AS. Auditory Scene Analysis. Cambridge, MA: MIT Press, 1990.
Burger RM and Pollak GD. Reversible inactivation of the dorsal nucleus of the lateral lemniscus reveals its role in the processing of multiple sound sources in the inferior colliculus of bats. J Neurosci 21: 4830–4843, 2001.
Cai H, Carney LH, and Colburn HS. A model for binaural response properties of inferior colliculus neurons. I. A model with interaural time difference-sensitive excitatory and inhibitory inputs. J Acoust Soc Am 103: 475–493, 1998a.
Cai H, Carney LH, and Colburn HS. A model for binaural response properties of inferior colliculus neurons. II. A model with interaural time difference-sensitive excitatory and inhibitory inputs and an adaptation mechanism. J Acoust Soc Am 103: 494–506, 1998b.
Darwin CJ and Carlyon RP. Auditory grouping. In: Hearing, edited by Moore BC. San Diego, CA: Academic Press, 1995, p. 387–424.
Hartmann WM. Pitch perception and the segregation and integration of auditory entities. In: Auditory Function: Neurobiological Bases of Hearing, edited by Edelman G, Gall W, and Cowan W. New York: Wiley, 1988, p. 623–645.
Hartmann WM. Detecting a Mistuned Harmonic. Michigan State Psychoacoustics Report 129. East Lansing, MI: Michigan State University, 2004.
Hartmann WM, McAdams S, and Smith BK. Hearing a mistuned harmonic in an otherwise periodic complex tone. J Acoust Soc Am 88: 1712–1724, 1990.
Heil P and Irvine DRF. First-spike timing of auditory-nerve fibers and comparison with auditory cortex. J Neurophysiol 78: 2438–2454, 1997.
Hewitt MJ and Meddis R. A computer model of amplitude-modulation sensitivity of single units in the inferior colliculus. J Acoust Soc Am 95: 2145–2159, 1994.
Johnson DH. The relationship between spike rate and synchrony in responses of auditory-nerve fibers to single tones. J Acoust Soc Am 68: 1115–1122, 1980.
Kiang NYS. Discharge Patterns of Single Fibers in the Cat Auditory Nerve. Cambridge, MA: MIT Press, 1965.
Kitzes LM, Gibson MM, Rose JE, and Hind JE. Initial discharge latency and threshold considerations for some neurons in cochlear nuclear complex of the cat. J Neurophysiol 41: 1165–1182, 1978.
Krishna BS and Semple MN. Auditory temporal processing: responses to sinusoidally amplitude-modulated tones in the inferior colliculus. J Neurophysiol 84: 255–273, 2000.
Langner G and Schreiner CE. Periodicity coding in the inferior colliculus of the cat. I. Neuronal mechanisms. J Neurophysiol 60: 1799–1822, 1988.
LeBeau FE, Malmierca MS, and Rees A. Iontophoresis in vivo demonstrates a key role for GABA(A) and glycinergic inhibition in shaping frequency response areas in the inferior colliculus of guinea pig. J Neurosci 21: 7303–7312, 2001.
Lee J and Green DM. Detection of a mistuned component in a harmonic complex. J Acoust Soc Am 96: 716–725, 1994.
Li H, Henderson J, and Sinex DG. Responses of inferior colliculus neurons to SAM tones located in inhibitory response area. Abstr Assoc Res Otolaryngol 2002.
Li L and Kelly JB. Inhibitory influence of the dorsal nucleus of the lateral lemniscus on binaural responses in the rat's inferior colliculus. J Neurosci 12: 4530–4539, 1992.
Liberman MC. Auditory nerve responses from cats raised in a low-noise chamber. J Acoust Soc Am 63: 442–455, 1978.
Merzenich MM and Reid MD. Representation of the cochlea within the inferior colliculus of the cat. Brain Res 77: 397–415, 1974.
Moore BC, Glasberg BR, and Peters RW. Thresholds for hearing mistuned partials as separate tones in harmonic complexes. J Acoust Soc Am 80: 479–483, 1986.
Moore BC, Peters RW, and Glasberg BR. Thresholds for the detection of inharmonicity in complex tones. J Acoust Soc Am 77: 1861–1867, 1985.
Nelson PC and Carney LH. A phenomenological model of peripheral and central neural responses to amplitude-modulated tones. J Acoust Soc Am 116: 2173–2186, 2004.
Nuding SC, Chen G-D, and Sinex DG. Monaural response properties of single neurons in the chinchilla inferior colliculus. Hear Res 131: 89–106, 1999.
Oliver DL and Shneiderman A. The anatomy of the inferior colliculus: a cellular basis for integration of monaural and binaural information. In: Neurobiology of Hearing: The Central Auditory System, edited by Altschuler RA, Bobbin RP, Clopton BM, and Hoffman DW. New York: Raven Press, 1991, p. 195–222.
Patterson RD, Holdsworth J, and Allerhand M. Auditory models as preprocessors for speech recognition. In: The Auditory Processing of Speech: From the Auditory Periphery to Words, edited by Schouten MEH. Berlin: de Gruyter, 1992, p. 67–83.
Rees A and Møller AR. Responses of neurons in the inferior colliculus of the rat to AM and FM tones. Hear Res 10: 301–330, 1983.
Rhode WS. A Test for the Significance of the Mean Direction and the Concentration Parameter of a Circular Distribution. Technical Report. Madison, WI: University of Wisconsin, Dept. of Neurophysiology, 1976.
Rose JE, Brugge JF, Anderson DJ, and Hind JE. Phase-locked response to low-frequency tones in single auditory nerve fibers of the squirrel monkey. J Neurophysiol 30: 769–793, 1967.
Semple MN and Kitzes LM. Single-unit responses in the inferior colliculus: different consequences of contralateral and ipsilateral auditory stimulation. J Neurophysiol 53: 1467–1482, 1985.
Siegel JH, Kim DO, and Molnar CE. Effects of altering organ of Corti on cochlear distortion products f2-f1 and 2f1-f2. J Neurophysiol 47: 303–328, 1982.
Sinex DG. Neural correlates of spectral segregation. J Acoust Soc Am 115: 2573, 2004.
Sinex DG, Guzik H, Li H, and Henderson Sabes J. Responses of auditory nerve fibers to harmonic and mistuned complex tones. Hear Res 182: 130–139, 2003.
Sinex DG, Henderson J, Li H, and Chen GD. Responses of chinchilla inferior colliculus neurons to amplitude-modulated tones with different envelopes. J Assoc Res Otolaryngol 3: 390–402, 2002a.
Sinex DG and Li H. Effect of component phase on responses of inferior colliculus neurons to harmonic and mistuned complex tones. Abstr Assoc Res Otolaryngol 2004.
Sinex DG, Li H, and Sabes JH. Modeling spectral integration that underlies IC responses to complex tones. Abstr Assoc Res Otolaryngol 2003.
Sinex DG, Sabes JH, and Li H. Responses of inferior colliculus neurons to harmonic and mistuned complex tones. Hear Res 168: 150–162, 2002b.
Slaney M. Auditory Toolbox. Technical Report 45. Cupertino, CA: Apple Computer, 1994.
Winter IM and Palmer AR. Level dependence of cochlear nucleus onset unit responses and facilitation by second tones or broadband noise. J Neurophysiol 73: 141–159, 1995.
Yost WA. Auditory image perception and analysis: the basis for hearing. Hear Res 56: 8–18, 1991.
Young ED, Robert J-M, and Shofner WP. Regularity and latency of units in ventral cochlear nucleus: implications for unit classification and generation of response properties. J Neurophysiol 60: 1–29, 1988.