J Neurophysiol 96: 252-258, 2006. First published March 29, 2006; doi:10.1152/jn.01257.2005
0022-3077/06 $8.00

Distinct Time Scales in Cortical Discrimination of Natural Sounds in Songbirds

Rajiv Narayan, Gilberto Graña and Kamal Sen

Hearing Research Center, Department of Biomedical Engineering, Center for Biodynamics, Program in Mathematical and Computational Neuroscience, Boston University, Boston, Massachusetts

Submitted 30 November 2005; accepted in final form 13 March 2006


    ABSTRACT
 
Understanding how single cortical neurons discriminate between sensory stimuli is fundamental to providing a link between cortical neural responses and perception. The discrimination of sensory stimuli by cortical neurons has been intensively investigated in the visual and somatosensory systems. However, relatively little is known about discrimination of sounds by auditory cortical neurons. Auditory cortex plays a particularly important role in the discrimination of complex sounds, e.g., vocal communication sounds. The rich dynamic structure of such complex sounds on multiple time scales motivates two questions regarding cortical discrimination. How does discrimination depend on the temporal resolution of the cortical response? How does discrimination accuracy evolve over time? Here we investigate these questions in field L, the analogue of primary auditory cortex in zebra finches, analyzing temporal resolution and temporal integration in the discrimination of conspecific songs (songs of the bird's own species) for both anesthetized and awake subjects. We demonstrate the existence of distinct time scales for temporal resolution and temporal integration and explain how they arise from cortical neural responses to complex dynamic sounds.


    INTRODUCTION
 
How accurately do single cortical neurons discriminate between different sensory stimuli? Addressing this fundamental question is an important step toward understanding the relationship between cortical neural responses and perception (Parker and Newsome 1998). This question has been investigated intensively in the visual and somatosensory cortices, e.g., motion-direction discrimination in area MT (Parker and Newsome 1998) and flutter discrimination in somatosensory cortex (Romo and Salinas 2003). In the auditory system, this question has been probed extensively in the auditory periphery, e.g., intensity and frequency discrimination in the auditory nerve (Delgutte 1995). By contrast, little is known about neural discrimination in auditory cortex.

Auditory cortex plays an important role in the perception of complex sounds, e.g., vocal communication sounds and speech (Fitch et al. 1997; Nelken 2004; Rauschecker 1998; Wang 2000). Lesions of auditory cortex cause a deficit in the perception of speech in humans and vocal communication sounds in animals (Heffner and Heffner 1986; Penfield and Roberts 1959), suggesting an important role for auditory cortex in the discrimination of complex sounds. Yet remarkably little is known about the discrimination of such sounds by auditory cortical neurons. Many natural sounds including vocal communication sounds display striking time-varying structure over multiple time scales (Attias and Schreiner 1998; Escabi et al. 2003; Lewicki 2002; Nelken et al. 1999; Singh and Theunissen 2003). This rich dynamic structure motivates two questions regarding the time scales underlying cortical discrimination of complex stimuli. How does discrimination depend on the temporal resolution of cortical neural responses? How does discrimination evolve over time?

Here we address these questions in songbirds, a model system that offers unique advantages for studying the neural discrimination of complex sounds with particular relevance to human speech (Doupe and Kuhl 1999). We investigate neural discrimination of conspecific songs (songs of the bird's own species) in field L, the avian analogue of primary auditory cortex (ACx), which is likely to play a critical role in the discrimination of conspecific songs (Grace et al. 2003; Sen et al. 2001; Theunissen et al. 2000). Specifically, we quantify the temporal resolution and temporal integration in the discrimination of complex sounds.


    METHODS
 
Surgical procedures

We recorded from the field L region of anesthetized and awake adult male zebra finches (Taeniopygia guttata). All procedures were in strict accordance with the National Institutes of Health guidelines as approved by the Boston University Charles River Campus Institutional Animal Care and Use Committee. Two days prior to the electrophysiological recording, the bird was anesthetized (0.1–4% isoflurane in 0.5–2.5 l/min O2) for a preparatory surgical procedure to mark the locations of electrode penetrations and fix a head-support pin. A reference point for electrode penetrations was marked with ink 1.5 mm lateral and 1.2 mm anterior to the bifurcation point of the midsagittal sinus, and a steel support pin was glued on the skull. The bird was allowed to recover for 2 days before the experiment was performed.

All surgical procedures done for the awake recordings were similar to those performed for the anesthetized recordings except where noted as follows. After the ink mark was made at the estimated location of field L, a lightweight microdrive containing two extracellular tungsten electrodes (impedance: 2–4 MΩ, FHC, Bowdoinham, ME) was positioned above the marked dot. The microdrive was a slightly modified version of a previous microdrive used to record from awake zebra finches (Hessler and Doupe 1999). The skull and dura beneath the dot were removed, and the implant was positioned such that the electrodes just entered the brain. Then the implant was secured to the skull with epoxy. A reference ground electrode was inserted into the brain on the opposite hemisphere from the location of the implant. Finally, the steel support pin was fixed as described in the preceding text.

Stimuli

The stimulus ensemble consisted of 20 undirected, conspecific zebra finch songs recorded in a sound-attenuated chamber (Acoustic Systems, Austin, TX). The songs were sampled at 32 kHz, band-pass filtered to retain frequencies between 250 Hz and 8 kHz, and stored in data files for playback (Sen et al. 2001).

Electrophysiology

ANESTHETIZED RECORDINGS. The methods were similar to those previously described (Sen et al. 2001; Theunissen et al. 2000). On the day of the experiment, the bird was anesthetized with three intramuscular injections of 20% urethan administered at half-hour intervals (75–90 µl total). The bird was then placed in a double-walled sound-attenuated chamber (Industrial Acoustics, Bronx, NY), facing the loudspeaker that was used for stimulus presentation. The speaker was located 20 cm away from the beak, and the bird was elevated to be at the same level as the center of the speaker cone. The bird's head position was fixed by attaching the steel pin to a frame located on the stereotaxic assembly. A reference electrode (Teflon-coated silver wire) was implanted in the brain close to the recording site. Extracellular tungsten electrodes (impedance: 2–4 MΩ) were lowered into the brain using a micromanipulator. Neural responses were recorded at depth intervals of ~100 µm. The conspecific song stimuli were played at a peak intensity of 75 dB SPL and randomly interleaved to obtain 10 trials of responses to each song. The electrophysiological signal was amplified, filtered, digitized, and stored on disk for further analysis. In some experiments, the electrode was repositioned for multiple passes. After each recording pass, small lesions were made along the electrode track for later reconstruction of the recording sites. At the end of the recording, the bird was killed with isoflurane, and the brain was preserved in 3.7% formalin fixative for histology.

AWAKE RESTRAINED RECORDINGS. On the day of the experiment, the bird was restrained in a small cloth jacket to restrict movement and reduce motion artifacts and was then placed into the stereotaxic assembly as described in the preceding text. Different single- and multiunit complexes in the same adult zebra finch were probed by manually advancing the two tungsten electrodes via the microdrive in ~150 µm intervals. Prior to the creation of the implants, the microdrive was calibrated to determine the amount by which the electrodes advance with each turn of the screw; in this way, the depth of the electrodes relative to the surface of the brain could be estimated. The bird was given a 30-min break after each recording session, lasting 2–3 h, and released from its jacket into its cage. Data from a particular site was obtained within the same recording session. Different sites were sampled over several days to weeks. After the experiment was concluded, the bird was killed, and the brain was stored for histology as in the preceding text.

Histology and classification of recording sites

Prior to sectioning, the brains were stored in 30% sucrose buffer overnight. Parasagittal 50 µm sections of the brain were prepared using a cryo-microtome and stained with cresyl violet (Nissl stain). Electrode placement was verified by comparing electrode tracks and electrolytic lesions to histological markers that define the boundaries of field L (Fortune and Margoliash 1992). Sites were classified as field L sites based on a combination of histology, medial-lateral coordinates, and depth of the site. The estimated spectrotemporal receptive fields (STRFs) at each site were also consistent with their location in field L (Sen et al. 2001).

Data analysis

Of all the sites we probed, those that exhibited an average firing rate significantly different (P < 0.01, paired t-test) from the average spontaneous firing rate for at least one song stimulus were included in the analysis (n = 38). Of these, single- and multiunit activity was recorded at 24 sites in field L from 11 anesthetized birds (6 single units, 18 multiunits). Multiunit activity was obtained from 14 sites in five awake-restrained birds. Spike event times were obtained from the spike waveforms using a window discriminator. Classification of sites into single and multiunits followed the scheme used in Sen et al. (2001). Cases where the spike waveform had a single reliable and stereotyped shape were classified as single units and confirmed using a custom-made spike-sorting algorithm. Multiunits consisted of spike waveforms that could be easily distinguished from the background noise but not from each other and contained small clusters of approximately two to five neurons.

We quantified the dissimilarity between pairs of spike trains using a recently proposed Spike Distance Metric (SDM) (van Rossum 2001). First, spike trains were filtered using a decaying exponential kernel with time constant τ

$$f(t) = \sum_{i=1}^{M} H(t - t_i)\, e^{-(t - t_i)/\tau} \qquad (1)$$

where t_i is the ith spike time, M is the total number of spikes, and H(t) is the Heaviside step function. The spike distance was then computed as the Euclidean distance between a pair of filtered spike trains, f and g

$$D^2(f, g)_\tau = \frac{1}{\tau} \int_0^{\infty} \left[ f(t) - g(t) \right]^2 dt \qquad (2)$$

A significant advantage of the SDM is that, by varying τ, we could quantify discrimination over different time scales of the neural response. At small time scales the metric acts like a "coincidence detector" with small differences in spike timing contributing to the distance, whereas at long time scales the metric acts like a "rate difference counter" where average firing rates contribute to the distance.
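The filtering and distance steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the analysis code used in the study. It assumes the normalized van Rossum form of the distance, in which the squared difference of the filtered trains is integrated over an unbounded window and divided by τ; under that assumption the integral reduces to a closed-form sum over spike pairs, so no time discretization is needed. The function names and the example spike times are hypothetical.

```python
import math


def _kernel_overlap(train_x, train_y, tau):
    """Sum of exp(-|s - t| / tau) over all spike pairs (s in x, t in y)."""
    return sum(math.exp(-abs(s - t) / tau) for s in train_x for t in train_y)


def van_rossum_distance_sq(train_a, train_b, tau):
    """Squared spike distance between two lists of spike times.

    Assumes D^2 = (1/tau) * integral of (f - g)^2 dt with causal exponential
    kernels and an unbounded integration window. Spike times and tau must be
    expressed in the same units.
    """
    if tau <= 0:
        raise ValueError("tau must be positive")
    d2 = 0.5 * (_kernel_overlap(train_a, train_a, tau)
                + _kernel_overlap(train_b, train_b, tau)
                - 2.0 * _kernel_overlap(train_a, train_b, tau))
    return max(d2, 0.0)  # guard against tiny negative rounding errors


# Hypothetical spike times in seconds (illustrative only, not recorded data).
train_a = [0.100, 0.250, 0.400]
train_b = [0.105, 0.600]

for tau in (1e-6, 0.010, 1e6):
    print(f"tau = {tau:g} s -> D^2 = {van_rossum_distance_sq(train_a, train_b, tau):.4f}")
```

With M spikes in one train and N in the other, this closed form tends to (M + N)/2 as τ becomes very small (every non-coincident spike counts) and to (M − N)²/2 as τ becomes very large (only the difference in spike counts matters), which matches the two regimes described in the text.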

We then used a classification scheme based on the SDM to quantify the neural discrimination of songs (Machens et al. 2003). Ten trials of spike trains were obtained at each site for each of the 20 songs. A template spike train was chosen for each of the songs, and remaining spike trains were assigned to the song with the closest template based on the spike distance measure. This procedure was repeated 1,000 times for different templates. The percentage of correctly classified songs (% correct) was used as a measure of discrimination. The chance level for classification was 5% because a spike train could be assigned to 1 of 20 songs. The data points for percent correct versus spike train length (Fig. 3) were fit with single exponentials by minimizing the least-squares error. The resulting fits had mean rms error percentages of 6.7% (anesthetized multiunit sites), 3.4% (anesthetized single unit sites), and 5.9% (awake multiunit sites).
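The template-matching procedure can be sketched as follows. This is an illustrative reading of the description above, not the authors' implementation. It assumes that templates are drawn at random on each repetition and that template trials are excluded from the trains being classified. The names `spike_distance_sq`, `percent_correct`, `n_repeats`, and `seed` are hypothetical, and the demo spike trains are synthetic.

```python
import math
import random


def spike_distance_sq(train_a, train_b, tau):
    """Closed-form squared spike distance for exponential kernels (assumed form)."""
    def overlap(x, y):
        return sum(math.exp(-abs(s - t) / tau) for s in x for t in y)
    return 0.5 * (overlap(train_a, train_a) + overlap(train_b, train_b)
                  - 2.0 * overlap(train_a, train_b))


def percent_correct(responses, tau, n_repeats=200, seed=0):
    """responses[s] is the list of trials (lists of spike times) for song s."""
    rng = random.Random(seed)
    n_songs = len(responses)
    correct = 0
    total = 0
    for _ in range(n_repeats):
        # Draw one template trial per song (random choice is an assumption).
        template_idx = [rng.randrange(len(trials)) for trials in responses]
        templates = [responses[s][template_idx[s]] for s in range(n_songs)]
        for song, trials in enumerate(responses):
            for k, train in enumerate(trials):
                if k == template_idx[song]:
                    continue  # a template is not classified against itself
                dists = [spike_distance_sq(train, tpl, tau) for tpl in templates]
                predicted = min(range(n_songs), key=dists.__getitem__)
                if predicted == song:
                    correct += 1
                total += 1
    return 100.0 * correct / total


# Synthetic demo (made-up spike times, not data from the study): 4 "songs",
# 5 trials each, two spikes per trial with +/-1 ms timing jitter.
demo_rng = random.Random(1)
demo_responses = [
    [[0.05 + 0.2 * song + demo_rng.uniform(-0.001, 0.001),
      0.10 + 0.2 * song + demo_rng.uniform(-0.001, 0.001)]
     for _trial in range(5)]
    for song in range(4)
]
demo_accuracy = percent_correct(demo_responses, tau=0.010)
print(f"percent correct at tau = 10 ms: {demo_accuracy:.1f}%")
```

With 20 songs, as in the study, a classifier of this kind that assigns trains at random would score 5%, which is the chance level quoted above.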


 
FIG. 3. Neural discrimination analysis as a function of spike train length for the same unit as Fig. 2. The value of the temporal resolution parameter τ in the SDM was set to 10 ms. τC is the time constant calculated from the single-exponential function fit to the data (solid line). The chance level is indicated by the horizontal dashed line. The vertical dashed line indicates the value of τC for this site.

 

    RESULTS
 
Quantifying discrimination

We recorded neural responses to an ensemble of conspecific songs in field L of anesthetized and awake-restrained adult male zebra finches. Figure 1 shows the neural responses to two songs from three different sites, illustrating different types of responses in the data. A striking feature of some of the responses is the precision of firing as seen in the alignment of spikes in the spike raster, and the sharp peaks in the peristimulus time histogram (PSTH; e.g., Fig. 1, site 1). Precise spiking is a prominent feature of auditory cortical responses (Elhilali et al. 2004; Heil 1997; Sen et al. 2001; Wehr and Zador 2003); however, no study has directly evaluated the contribution of precise timing in the cortical discrimination of complex sounds.


 
FIG. 1. Neural responses to 2 zebra finch songs recorded from field L. Top 2 panels: sound pressure waveforms of 2 different songs from the ensemble played to the zebra finches as well as their spectrograms (frequency range 250 Hz to 8 kHz). Middle 2 panels: response of site 1, an anesthetized, single-unit recording to the 2 songs; the spike raster plot for 10 repetitions of the song, and the peristimulus time histogram (bin size = 20 ms) are shown. Bottom left panel: (site 2) response of an anesthetized multiunit site to song 1. Bottom right panel: (site 3) response of an awake multiunit site to song 2.

 
Optimal temporal resolution for discrimination

We quantified the discriminability of spike trains using the SDM (see METHODS). The distributions of spike distances within songs (a measure of response variability) and across songs (a cue for discrimination) vary with the temporal resolution (Fig. 2A). As a result, discrimination accuracy depends critically on the temporal resolution at which spike trains are evaluated (Fig. 2B): it reaches its optimal level at ~10 ms and falls significantly below this level at both 1 and 1,000 ms.
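The dependence of the intra-song and inter-song distance distributions on τ can be illustrated with synthetic spike trains. The sketch below is a toy example under stated assumptions, not a reproduction of Fig. 2: the three spike patterns, the 2-ms Gaussian timing jitter, and the τ values are all made up, and the distance uses the closed-form expression that holds for exponential kernels with an unbounded integration window.

```python
import math
import random
from itertools import combinations


def spike_distance_sq(train_a, train_b, tau):
    """Closed-form squared spike distance for exponential kernels (assumed form)."""
    def overlap(x, y):
        return sum(math.exp(-abs(s - t) / tau) for s in x for t in y)
    return 0.5 * (overlap(train_a, train_a) + overlap(train_b, train_b)
                  - 2.0 * overlap(train_a, train_b))


def mean_distances(responses, tau):
    """Return (mean intra-song, mean inter-song) squared distance."""
    labeled = [(song, train)
               for song, trials in enumerate(responses) for train in trials]
    intra, inter = [], []
    for (song_x, x), (song_y, y) in combinations(labeled, 2):
        d2 = spike_distance_sq(x, y, tau)
        if song_x == song_y:
            intra.append(d2)
        else:
            inter.append(d2)
    return sum(intra) / len(intra), sum(inter) / len(inter)


# Made-up spike patterns (seconds) for three hypothetical songs, each with
# the same spike count so that firing rate alone carries no information.
patterns = [
    [0.10, 0.30, 0.50, 0.70, 0.90],
    [0.15, 0.37, 0.52, 0.78, 0.95],
    [0.04, 0.22, 0.44, 0.63, 0.85],
]
rng = random.Random(0)
responses = [[[t + rng.gauss(0.0, 0.002) for t in pattern] for _trial in range(6)]
             for pattern in patterns]

separation = {}
for tau in (1e-5, 1e-3, 1e-2, 1e-1, 100.0):
    intra, inter = mean_distances(responses, tau)
    separation[tau] = inter - intra
    print(f"tau = {tau:g} s: intra = {intra:.3f}, inter = {inter:.3f}")
```

In this toy setting the gap between the two means is largest at an intermediate τ: at very fine resolution, jittered repeats of the same song look as different as responses to different songs, and at very coarse resolution only spike counts matter, which are identical here by construction.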


 
FIG. 2. Neural discrimination analysis as a function of the temporal resolution parameter τ in the Spike Distance Metric (SDM; see METHODS) for a single unit in our data set. A: distributions of inter-song (dashed line) and intra-song (solid line) spike distance at τ = 1 ms (top), 8 ms (middle), and 115 ms (bottom). The 2 distributions are most separable at the optimal fine temporal resolution (8 ms, middle). B: discrimination accuracy plotted vs. τ. The value of the spike train length was set to 1,000 ms. τopt is the temporal resolution that yields the maximum accuracy. The chance level is indicated by the horizontal dashed line.

 
Temporal dynamics of discrimination

To characterize the evolution of discrimination as the neural response accumulates, we computed the percentage of songs correctly identified as a function of increasingly longer time windows of the spike train (Fig. 3). Discrimination begins at chance level, improves steadily after the onset of the songs, and reaches a plateau for long durations.
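A single-exponential fit of accuracy against spike train length can be sketched as below. The exact functional form of the fit is not stated in the text, so this example assumes a saturating exponential that starts at the 5% chance level, p(T) = chance + A(1 − exp(−T/τC)). The fit uses a grid search over τC with a closed-form least-squares amplitude at each grid point, which is a simple substitute for a general nonlinear optimizer. All names and the synthetic accuracy curve are hypothetical.

```python
import math

CHANCE = 5.0  # percent correct expected by chance with 20 songs


def saturating_exp(t, amplitude, tau_c, chance=CHANCE):
    """Assumed model: accuracy rises from chance toward chance + amplitude."""
    return chance + amplitude * (1.0 - math.exp(-t / tau_c))


def fit_single_exponential(durations, accuracy, tau_grid, chance=CHANCE):
    """Least-squares fit: grid search over tau_c, closed-form amplitude."""
    best = None
    for tau_c in tau_grid:
        basis = [1.0 - math.exp(-t / tau_c) for t in durations]
        denom = sum(b * b for b in basis)
        if denom == 0.0:
            continue
        amplitude = sum(b * (y - chance) for b, y in zip(basis, accuracy)) / denom
        sse = sum((chance + amplitude * b - y) ** 2
                  for b, y in zip(basis, accuracy))
        if best is None or sse < best[0]:
            best = (sse, amplitude, tau_c)
    if best is None:
        raise ValueError("tau_grid produced no usable candidates")
    sse, amplitude, tau_c = best
    rms_error = math.sqrt(sse / len(durations))
    return amplitude, tau_c, rms_error


# Synthetic accuracy curve generated from the assumed model itself
# (illustrative values, not measurements from the study).
durations_ms = [50.0 * k for k in range(1, 21)]      # 50 ... 1000 ms
synthetic = [saturating_exp(t, amplitude=60.0, tau_c=600.0) for t in durations_ms]
tau_grid_ms = [10.0 * k for k in range(1, 301)]      # 10 ... 3000 ms

amp_hat, tau_hat, rms = fit_single_exponential(durations_ms, synthetic, tau_grid_ms)
print(f"amplitude = {amp_hat:.2f}%, tau_C = {tau_hat:.0f} ms, rms error = {rms:.2e}")
```

The grid resolution bounds the precision of the recovered time constant, so a finer grid or a subsequent local refinement would be needed for real data.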

Time scales of temporal resolution and temporal integration

To quantify the temporal resolution and time course of integration, we calculated the optimal time resolution for discrimination, τopt (Fig. 2B), and the time constant for single-exponential fits to the temporal integration data, τC (Fig. 3), respectively. Figure 4 shows the distribution of these two time scales in our data. The median values for τopt and τC were 13 and 597 ms, respectively. The difference in these distributions is highly significant (P << 0.001).


 
FIG. 4. Distributions of dual time scales for all sites. The distribution of values for τopt and τC is represented using a box and whisker plot. The box has lines at the lower quartile, median, and upper quartile values, the upper and lower whiskers represent the range of data within 1.5 times the interquartile range, and individual points show outliers outside the range of the whiskers. The median values for τopt and τC were 13 and 597 ms, respectively. This difference was highly significant (P << 0.001). No significant correlation was detected between these two parameters (r = 0.13, P = 0.43).

 
We compared three subgroups in the dataset: anesthetized multiunit, anesthetized single unit, and awake multiunit (see METHODS). Median values and interquartile ranges for the firing rates of the three groups, relative to the background firing rates, were 21.6 spikes/s (12.3–33.0), 12.9 spikes/s (10.5–24.9), and 21.8 spikes/s (10.7–31.3), respectively, and were not significantly different [Kruskal-Wallis (KW) test, P = 0.67]. These rates are higher than average firing rates reported in previous studies (Grace et al. 2003; Sen et al. 2001; Theunissen et al. 2000). However, those studies included sites from the caudal mesopallium, which would have decreased the overall mean rates. Moreover, in a diverse region such as field L, random differences in sampling could lead to substantial variability across different data sets. The median values of τopt for the anesthetized multiunit, anesthetized single-unit, and awake multiunit recordings were 13 ms (10–16), 16 ms (13–16), and 10 ms (8–20), respectively, and were not significantly different (KW test, P = 0.52). The median values of τC for the anesthetized multiunit, anesthetized single-unit, and awake multiunit recordings were 602 ms (550–732), 597 ms (511–724), and 558 ms (484–678), respectively. These values were not significantly different (KW test, P = 0.65). The median maximal accuracies for the anesthetized multiunit, anesthetized single-unit, and awake multiunit recordings were also not significantly different (KW test, P > 0.05).
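For readers unfamiliar with the Kruskal-Wallis test used in these three-group comparisons, the statistic can be computed from rank sums as in the sketch below. It is a minimal illustration rather than the software used in the study. It relies on the chi-square approximation to the null distribution, and it returns a P value only for three groups, where the chi-square distribution has 2 degrees of freedom and its upper tail is exactly exp(−H/2). The function names and the demo numbers are hypothetical.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0  # mean of ranks i+1 ... j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def kruskal_wallis(groups):
    """Return (H, df, p); p uses the chi-square approximation, df == 2 only."""
    pooled = [v for group in groups for v in group]
    n = len(pooled)
    ranks = average_ranks(pooled)
    weighted = 0.0
    start = 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        weighted += rank_sum ** 2 / len(group)
        start += len(group)
    h = 12.0 / (n * (n + 1)) * weighted - 3.0 * (n + 1)
    # Standard correction for tied observations.
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    correction = 1.0 - sum(c ** 3 - c for c in counts.values()) / (n ** 3 - n)
    if correction > 0.0:
        h /= correction
    df = len(groups) - 1
    p = math.exp(-h / 2.0) if df == 2 else None
    return h, df, p


# Made-up values for three hypothetical groups (not data from the study).
demo_groups = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
h_stat, dof, p_value = kruskal_wallis(demo_groups)
print(f"H = {h_stat:.3f}, df = {dof}, P = {p_value:.4f}")
```

Because the test operates on ranks, it compares the groups without assuming normally distributed values, which suits quantities summarized by medians and interquartile ranges as above.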


    DISCUSSION
 
Cortical discrimination of complex sounds

Although discrimination by single neurons has been probed extensively in the visual and somatosensory cortices (Parker and Newsome 1998) and the auditory periphery (Delgutte 1995), surprisingly little is known about neural discrimination of sounds in auditory cortex. The critical role of auditory cortex in the perception of complex sounds suggests that auditory cortical neurons may play an important role in discriminating between complex sounds. Yet to our knowledge, only one previous study has examined neural discrimination of complex sounds based on the average cortical population activity in primary auditory cortex (Orduña et al. 2005). In songbirds, previous studies at the cortical level have quantified information transmitted by single neurons (Hsu et al. 2004; Wright et al. 2002), discriminability between different categories of natural and synthetic sounds based on mean firing rates (Amin et al. 2004; Grace et al. 2003), and discriminability between different categories of natural sounds by ensembles of neurons (Woolley et al. 2005). This is the first study to quantify the contribution of spike timing to the discrimination of complex sounds by single neurons at the cortical level and the time scales underlying such discrimination. Although the number of single units in this study was relatively small, we obtained similar results from the multiunits, which consisted of small clusters of neurons dominated by a single unit (see METHODS).

Spike timing versus firing rates in cortical discrimination

The issue of whether spike timing information can contribute to cortical discrimination of sensory stimuli has been controversial in the visual and somatosensory cortices (Parker and Newsome 1998; Romo and Salinas 2003). An important distinction between most previous studies of discrimination in the visual cortex and this study is that the previous studies employed stimuli with constant amplitude, e.g., motion stimuli with a fixed velocity over time, whereas this study employed stimuli with time-varying structure. Although our results do not rule out or support rate-based neural codes for discrimination of constant stimuli, they raise the possibility that the use of dynamic stimuli, e.g., movies of natural scenes, may reveal an important contribution of spike timing in cortical discrimination. Indeed, previous studies demonstrating an increase in the reliability of cortical neural responses in response to dynamic stimuli (de Ruyter van Steveninck et al. 1997; Mainen and Sejnowski 1995) are consistent with this idea because an increase in reliability of cortical responses would be expected to improve discrimination. Previous studies in auditory cortex have demonstrated the high degree of temporal precision in auditory cortical responses (DeWeese et al. 2003; Elhilali et al. 2004; Heil 1997; Heil and Irvine 1997) and the contribution of spike timing in the information transmitted by cortical neurons about time-varying stimuli (Lu and Wang 2004; Nelken et al. 2005; Wright et al. 2002) and sound location (Furukawa and Middlebrooks 2002). This study demonstrates the significant contribution of spike timing to the discrimination of complex sounds by single cortical neurons.

Although initial studies of flutter discrimination in the somatosensory system appeared to suggest a neural code based on fine temporal structure, i.e., differences in the periodicity of neural responses, subsequent studies have challenged this view, reporting cortical neurons that strongly modulate their firing rates in response to the different stimuli and could thereby mediate accurate discrimination (Romo and Salinas 2003). Previous studies in the auditory cortex of awake animals have revealed a class of neurons that appear to use a rate code to encode time-varying stimuli (Lu and Wang 2004; Lu et al. 2001). We also recorded neural responses in awake birds in this study. However, we did not find neurons where the discrimination performance based on average rates was comparable to the performance at the optimal temporal resolution. Although we cannot rule out the possibility that we did not sample such neurons, there is an alternative explanation. The auditory cortical neurons employing a rate code in the studies by Lu et al. encoded amplitude modulations at relatively high frequencies, whereas vocal communication sounds, such as those used in this study, typically contain relatively slow modulations. As pointed out in the studies by Lu et al., coding based on spike timing and firing rates can complement each other in different stimulus regimes. Vocal communication sounds with relatively slow modulations may be particularly well suited to exploit the temporal precision of cortical responses.

Optimal temporal resolution for discrimination

Much of the debate on neural coding has focused on the question of whether neurons encode stimuli in their average firing rates or precise spike times (Abeles 1991; Rieke et al. 1997; Shadlen and Newsome 1994; Softky and Koch 1993). However, a more unified view of neural coding may emerge if we pose a more general question: what is the optimal temporal resolution for the neural code? Despite its fundamental nature, only a handful of studies have addressed this question (Chichilnisky and Kalmar 2003; Machens et al. 2003; Rieke et al. 1997). These studies have been performed at the sensory periphery. Using a classification method based on the SDM, we found that the optimal temporal resolution for cortical discrimination was ~10 ms (Fig. 2). This time scale, which is intermediate to the "coincidence detector" and "rate-difference counter" regimes, permits averaging of the neural response over a time window that is sufficiently large to reduce noise due to neuronal jitter but is small enough to avoid loss of significant temporal structure. Surprisingly, the optimal temporal resolution we found at the cortical level was comparable to the optimal temporal resolution at the sensory periphery. Previous studies in auditory cortex have demonstrated the high degree of temporal precision in auditory cortical responses (DeWeese et al. 2003; Elhilali et al. 2004; Heil 1997; Heil and Irvine 1997), and the contribution of spike timing in the information transmitted by cortical neurons about time-varying stimuli (Lu and Wang 2004; Nelken et al. 2005) and sound location (Furukawa and Middlebrooks 2002). It is important to emphasize that the notion of the optimal temporal resolution for a detection or discrimination task is distinct from the temporal precision of neural responses, and these two quantities need not be the same for a neural code (Chichilnisky and Kalmar 2003). Our study extends and complements previous work on cortical coding by quantifying the optimal temporal resolution for cortical discrimination.

Temporal dynamics of discrimination and time scale of integration

Although previous studies have typically examined discrimination performance averaged over the entire stimulus duration, there is growing interest in the temporal dynamics of cortical detection, discrimination, and information transmission (Cook and Maunsell 2002; Gold and Shadlen 2001; Osborne et al. 2004). Such analyses provide information about the speed with which neural performance accuracy accumulates and can ultimately be related to measures of the speed of perception such as reaction times. Our analysis of the temporal dynamics of discrimination revealed a distinct range for the time scale of integration. This range was relatively long, on the order of hundreds of milliseconds, and significantly different from the optimal temporal resolution for discrimination of ~10 ms (Fig. 4).

Neural computations underlying distinct time scales

Our results suggest two classes of neural computations underlying cortical discrimination of complex sounds: neural computations that provide precise temporal information and neural computations that extract information accumulated over hundreds of milliseconds, while maintaining the information present in precise timing. Primary auditory areas are likely to contribute to the temporal precision of responses through mechanisms such as delayed inhibition (Wehr and Zador 2003) and/or synaptic depression (Wehr and Zador 2005). Less is known about the second class of computations, which may occur in downstream areas. These computations may involve integration of sensory information over long time scales and could potentially contribute to cortical decision making (Gold and Shadlen 2001). Alternatively, such computations may extract information from a vector of multiple samples or "looks" of sensory information stored in working memory, without explicit integration of sensory signals (Viemeister and Wakefield 1991).

Distinct time scales in auditory perception

A long-standing puzzle in auditory psychophysics is the "resolution-integration" paradox, which refers to the discrepancy in the time scales underlying two types of perceptual tasks (de Boer 1985; Green 1985; Viemeister and Wakefield 1991). Tasks probing temporal integration of sounds have typically found relatively long time scales of a few hundred milliseconds, whereas tasks probing temporal resolution have uncovered a much finer time scale around tens of milliseconds. Cognitive theoretical proposals for distinct cortical time scales underlying the perception of complex sounds, e.g., speech and music, have recently received experimental support from functional magnetic resonance imaging experiments (Boemio et al. 2005; Poeppel 2003; Zatorre et al. 2002).

In this study we also found distinct time scales of temporal resolution and temporal integration in discrimination of birdsongs by single cortical neurons. It is difficult to directly compare our results to the perceptual studies for several reasons. First, our study focused on the performance of single cortical neurons in a discrimination task. The analysis of discrimination by neural populations will allow a more complete assessment of the relationship between cortical responses and perception of complex sounds. However, a rigorous quantitative analysis of single neuron performance is critical to understanding the link between neural and behavioral levels, and our results provide constraints for candidate neural codes underlying perception based on single neurons or pooling models based on the most sensitive neurons, e.g., the lower envelope principle (Parker and Newsome 1998). Second, previous studies used different stimuli, which could lead to quantitative differences in the time scales for temporal resolution and temporal integration. Nevertheless, our analysis reveals how distinct time scales of temporal resolution and temporal integration can arise from cortical neural responses to complex dynamic sounds. These disparate time scales provide a cortical analogue of the resolution-integration paradox, i.e., the apparent contradiction that systems integrating information over long time scales nevertheless appear to maintain sensitivity to fine temporal resolutions.


    GRANTS
 
This work was supported by National Institute on Deafness and Other Communication Disorders Grant 1R01 DC-007610-01A1.


    ACKNOWLEDGMENTS
 
We thank L. Abbott and J. Fritz for comments on the manuscript and S. Colburn, R. Gütig, M. Shamir, and H. Sompolinsky for discussions.


    FOOTNOTES
 
The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

Address for reprint requests and other correspondence: K. Sen, Hearing Research Center, Department of Biomedical Engineering, Boston University, 44 Cummington St., RM 414B, Boston, MA 02215 (E-mail: kamalsen{at}bu.edu).


    REFERENCES
 
Abeles M. Corticonics: Neural Circuits of the Cerebral Cortex. New York: Cambridge Univ. Press, 1991.

Amin N, Grace JA, and Theunissen FE. Neural response to bird's own song and tutor song in the zebra finch field L and caudal mesopallium. J Comp Physiol [A] 190: 469–489, 2004.

Attias H and Schreiner CE. Coding of naturalistic stimuli by auditory midbrain neurons. In: Advances in Neural Information Processing 10. Cambridge, MA: MIT Press, 1998, p. 103–109.

Boemio A, Fromm S, Braun A, and Poeppel D. Hierarchical and asymmetric temporal sensitivity in human auditory cortices. Nat Neurosci 8: 389–395, 2005.

Chichilnisky EJ and Kalmar RS. Temporal resolution of ensemble visual motion signals in primate retina. J Neurosci 23: 6681–6689, 2003.

Cook EP and Maunsell JH. Dynamics of neuronal responses in macaque MT and VIP during motion detection. Nat Neurosci 5: 985–994, 2002.

de Ruyter van Steveninck RR, Lewen GD, Strong SP, Koberle R, and Bialek W. Reproducibility and variability in neural spike trains. Science 275: 1805–1808, 1997.

de Boer E. Auditory time constants: a paradox? In: Time Resolution in Auditory Systems, edited by Michelsen A. Berlin: Springer, 1985, p. 141–158.

Delgutte B. Physiological models for basic auditory percepts. In: Auditory Computation, edited by Hawkins HL, McMullen TA, Popper AN, and Fay RR. New York: Springer, 1995.

DeWeese MR, Wehr M, and Zador AM. Binary spiking in auditory cortex. J Neurosci 23: 7940–7949, 2003.

Doupe AJ and Kuhl PK. Birdsong and human speech: common themes and mechanisms. Annu Rev Neurosci 22: 567–631, 1999.

Elhilali M, Fritz JB, Klein DJ, Simon JZ, and Shamma SA. Dynamics of precise spike timing in primary auditory cortex. J Neurosci 24: 1159–1172, 2004.

Escabi MA, Miller LM, Read HL, and Schreiner CE. Naturalistic auditory contrast improves spectrotemporal coding in the cat inferior colliculus. J Neurosci 23: 11489–11504, 2003.

Fitch RH, Miller S, and Tallal P. Neurobiology of speech perception. Annu Rev Neurosci 20: 331–353, 1997.

Fortune ES and Margoliash D. Cytoarchitectonic organization and morphology of cells of the field L complex in male zebra finches (Taenopygia guttata). J Comp Neurol 325: 388–404, 1992.

Furukawa S and Middlebrooks JC. Cortical representation of auditory space: information-bearing features of spike patterns. J Neurophysiol 87: 1749–1762, 2002.

Gold JI and Shadlen MN. Neural computations that underlie decisions about sensory stimuli. Trends Cogn Sci 5: 10–16, 2001.

Grace JA, Amin N, Singh NC, and Theunissen FE. Selectivity for conspecific song in the zebra finch auditory forebrain. J Neurophysiol 89: 472–487, 2003.

Green DM. Temporal factors in psychoacoustics. In: Time Resolution in Auditory Systems, edited by Michelsen A. Berlin: Springer, 1985, p. 122–140.

Heffner HE and Heffner RS. Hearing loss in Japanese macaques following bilateral auditory cortex lesions. J Neurophysiol 55: 256–271, 1986.

Heil P. Auditory cortical onset responses revisited. I. First-spike timing. J Neurophysiol 77: 2616–2641, 1997.

Heil P and Irvine DR. First-spike timing of auditory-nerve fibers and comparison with auditory cortex. J Neurophysiol 78: 2438–2454, 1997.

Hessler NA and Doupe AJ. Singing-related neural activity in a dorsal forebrain-basal ganglia circuit of adult zebra finches. J Neurosci 19: 10461–10481, 1999.

Hsu A, Woolley SM, Fremouw TE, and Theunissen FE. Modulation power and phase spectrum of natural sounds enhance neural encoding performed by single auditory neurons. J Neurosci 24: 9201–9211, 2004.

Lewicki MS. Efficient coding of natural sounds. Nat Neurosci 5: 356–363, 2002.

Lu T, Liang L, and Wang X. Temporal and rate representations of time-varying signals in the auditory cortex of awake primates. Nat Neurosci 4: 1131–1138, 2001.

Lu T and Wang X. Information content of auditory cortical responses to time-varying acoustic stimuli. J Neurophysiol 91: 301–313, 2004.

Machens CK, Schutze H, Franz A, Kolesnikova O, Stemmler MB, Ronacher B, and Herz AV. Single auditory neurons rapidly discriminate conspecific communication signals. Nat Neurosci 6: 341–342, 2003.

Mainen ZF and Sejnowski TJ. Reliability of spike timing in neocortical neurons. Science 268: 1503–1506, 1995.

Nelken I. Processing of complex stimuli and natural scenes in the auditory cortex. Curr Opin Neurobiol 14: 474–480, 2004.

Nelken I, Chechik G, Mrsic-Flogel TD, King AJ, and Schnupp JW. Encoding stimulus information by spike numbers and mean response time in primary auditory cortex. J Comput Neurosci 19: 199–221, 2005.

Nelken I, Rotman Y, and Bar Yosef O. Responses of auditory-cortex neurons to structural features of natural sounds. Nature 397: 154–157, 1999.

Orduña I, Mercado E 3rd, Gluck MA, and Merzenich MM. Cortical responses in rats predict perceptual sensitivities to complex sounds. Behav Neurosci 119: 256–264, 2005.

Osborne LC, Bialek W, and Lisberger SG. Time course of information about motion direction in visual area MT of macaque monkeys. J Neurosci 24: 3210–3222, 2004.

Parker AJ and Newsome WT. Sense and the single neuron: probing the physiology of perception. Annu Rev Neurosci 21: 227–277, 1998.

Penfield W and Roberts L. Speech and Brain Mechanisms. Princeton, NJ: Princeton University Press, 1959.

Poeppel D. The analysis of speech in different temporal integration windows: cerebral lateralization as "asymmetric sampling in time." Speech Commun 41: 245–255, 2003.

Rauschecker JP. Cortical processing of complex sounds. Curr Opin Neurobiol 8: 516–521, 1998.

Rieke F, Warland D, de Ruyter van Steveninck R, and Bialek W. Spikes: Exploring the Neural Code. Cambridge, MA: MIT Press, 1997.

Romo R and Salinas E. Flutter discrimination: neural codes, perception, memory and decision making. Nat Rev Neurosci 4: 203–218, 2003.

Sen K, Theunissen FE, and Doupe AJ. Feature analysis of natural sounds in the songbird auditory forebrain. J Neurophysiol 86: 1445–1458, 2001.

Shadlen MN and Newsome WT. Noise, neural codes and cortical organization. Curr Opin Neurobiol 4: 569–579, 1994.

Singh NC and Theunissen FE. Modulation spectra of natural sounds and ethological theories of auditory processing. J Acoust Soc Am 114: 3394–3411, 2003.

Softky WR and Koch C. The highly irregular firing of cortical cells is inconsistent with temporal integration of random EPSPs. J Neurosci 13: 334–350, 1993.

Theunissen FE, Sen K, and Doupe AJ. Spectral-temporal receptive fields of nonlinear auditory neurons obtained using natural sounds. J Neurosci 20: 2315–2331, 2000.

van Rossum MC. A novel spike distance. Neural Comput 13: 751–763, 2001.

Viemeister NF and Wakefield GH. Temporal integration and multiple looks. J Acoust Soc Am 90: 858–865, 1991.

Wang X. On cortical coding of vocal communication sounds in primates. Proc Natl Acad Sci USA 97: 11843–11849, 2000.

Wehr M and Zador AM. Balanced inhibition underlies tuning and sharpens spike timing in auditory cortex. Nature 426: 442–446, 2003.

Wehr M and Zador AM. Synaptic mechanisms of forward suppression in rat auditory cortex. Neuron 47: 437–445, 2005.

Woolley SM, Fremouw TE, Hsu A, and Theunissen FE. Tuning for spectro-temporal modulations as a mechanism for auditory discrimination of natural sounds. Nat Neurosci 8: 1371–1379, 2005.

Wright BD, Sen K, Bialek W, and Doupe AJ. Spike timing and the coding of naturalistic sounds in a central auditory area of songbirds. In: Advances in Neural Information Processing. Cambridge, MA: MIT Press, 2002.

Zatorre RJ, Belin P, and Penhune VB. Structure and function of auditory cortex: music and speech. Trends Cogn Sci 6: 37–46, 2002.




Copyright © 2006 by the American Physiological Society.