Receptive fields have been characterized independently in the lemniscal auditory thalamus and cortex, usually with spectrotemporally simple sounds tailored to a specific task. No studies have employed naturalistic stimuli to investigate the thalamocortical transformation in temporal, spectral, and aural domains simultaneously and under identical conditions. We recorded simultaneously in the ventral division of the medial geniculate body (MGBv) and in primary auditory cortex (AI) of the ketamine-anesthetized cat. Spectrotemporal receptive fields (STRFs) of single units (n = 387) were derived by reverse-correlation with a broadband and dynamically varying stimulus, the dynamic ripple. Spectral integration, as measured by excitatory bandwidth and spectral modulation preference, was similar across both stations (mean Q1/e thalamus = 5.8, cortex = 5.4; upper cutoff of spectral modulation transfer function, thalamus = 1.30 cycles/octave, cortex = 1.37 cycles/octave). Temporal modulation rates slowed by a factor of two from thalamus to cortex (mean preferred rate, thalamus = 32.4 Hz, cortex = 16.6 Hz; upper cutoff of temporal modulation transfer function, thalamus = 62.9 Hz, cortex = 37.4 Hz). We found no correlation between spectral and temporal integration properties, suggesting that the excitatory-inhibitory interactions underlying preference in each domain are largely independent. A small number of neurons in each station had highly asymmetric STRFs, evidence of frequency sweep selectivity, but the population showed no directional bias. Binaural preferences differed in their relative proportions, most notably an increased prevalence of excitatory contralateral-only cells in cortex (40%) versus thalamus (23%), indicating a reorganization of this parameter. By comparing simultaneously along multiple stimulus dimensions in both stations, these observations establish the global characteristics of the thalamocortical receptive field transformation.
The thalamus and cortex are highly interconnected structures whose response properties are intimately related. In the auditory system, many studies have described receptive fields in thalamus or in cortex with a large variety of experimental protocols (for reviews, see Clarey et al. 1994; de Ribaupierre 1997), but very few have characterized both stations simultaneously (Creutzfeldt et al. 1980; Zhang and Suga 1997) or even endeavored to draw thalamocortical comparisons from nonsimultaneous recordings (Barone et al. 1996; Clarey et al. 1995; Pelleg-Toiba and Wollberg 1989; Samson et al. 2000). Because differences in animal model, anesthesia, stimuli, and measured response parameters could affect results, the literature cannot support a direct comparison of multiple receptive field dimensions between thalamus and cortex.
The choice of experimental stimulus is of particular importance because it constrains the sort of knowledge we can gain from a neural system. Traditional, spectrotemporally simple sounds have the advantage of being easily parameterized and manipulated. They suffer, however, from their task specificity: a given stimulus usually reveals a very limited aspect of neural response preference, whether temporal (e.g., clicks), spectral (e.g., tones of varying frequency), or aural (e.g., best-frequency tones at varying interaural delay/level). Natural sounds such as vocalizations, in contrast, tend to be spectrotemporally complex (Nelken et al. 1999; Smolders et al. 1979) and may change along all these dimensions simultaneously. Yet except for certain highly stereotyped animal vocalizations (Suga and Jen 1976), the complexity of natural sounds resists systematic manipulation along multiple, well-defined parameters. Although numerous studies have revealed how neurons respond to particular aspects of natural sounds (Bieser 1998; Creutzfeldt et al. 1980; Doupe and Konishi 1991; Glass and Wollberg 1983; Müller-Pruess 1986; Rauschecker 1998; Steinschneider et al. 1994; Symmes et al. 1980; Wang et al. 1995), few have used the sounds to explore the general processing capabilities of thalamic or cortical cells (but see Theunissen and Doupe 1998; Theunissen et al. 2000).
While this variety of investigative methods presents a rich and multifaceted view of thalamic and cortical responses, it thereby renders inaccessible any global characterization of the thalamocortical transformation of sensory representations. We recorded in both stations simultaneously, allowing a direct comparison of thalamic and cortical receptive field properties under identical experimental conditions. Our synthetic and spectrotemporally complex stimulus, the dynamic ripple, was designed to share many properties with natural sounds (Escabí et al. 1999) and to satisfy the formal requirements for deriving receptive fields with reverse correlation. The dynamic ripple therefore enables a unified description of temporal, spectral, spectrotemporal, and aural neural response preferences to well-controlled and naturalistic sensory stimulation.
Electrophysiological methods and stimulus design have been described in a previous report (Miller and Schreiner 2000). Essential details are repeated in the following text.
Young adult cats (n = 4) were given an initial dose of ketamine (22 mg/kg) and acepromazine (0.11 mg/kg), then anesthetized with pentobarbital sodium (Nembutal, 15–30 mg/kg) during the surgical procedure. The animal's temperature was maintained with a thermostatic heating pad. Bupivacaine was applied to incisions and pressure points. Surgery consisted of a tracheotomy, reflection of the soft tissues of the scalp, craniotomy over AI and the suprasylvian gyrus (for the thalamic approach), and durotomy. After surgery, the animal was maintained in an unreflexive state with a continuous infusion of ketamine/diazepam (10 mg/kg ketamine, 0.5 mg/kg diazepam in lactated Ringer solution). All procedures were in strict accordance with the University of California, San Francisco Committee for Animal Research and the guidelines of the Society for Neuroscience.
All recordings were made with the animal in a sound-shielded anechoic chamber (IAC, Bronx, NY), with stimuli delivered via a closed, binaural speaker system (diaphragms from Stax, Japan). Simultaneous extracellular recordings were made in the thalamorecipient layers (IIIb/IV) of the primary auditory cortex (AI) and in the ventral division of the medial geniculate body (MGBv). For the purposes of a parallel study, we targeted thalamic and cortical neurons with similar best frequency. Except for this constraint, recording locations in thalamus were randomly distributed within the MGBv, and those in AI spanned the central narrowly tuned and flanking broadly tuned regions (Read et al. 2001; Schreiner and Mendelson 1990). The best frequencies of thalamic and cortical neurons covered a nearly identical range (thalamus 523 Hz to 19.7 kHz, cortex 548 Hz to 19.7 kHz) and differed in mean by only 1.3 kHz (mean thalamus, 9.7 kHz; mean cortex, 11.0 kHz; 2-sample t-test, P = 0.02). Electrodes were parylene-coated tungsten (Microprobe, Potomac, MD) with impedances of 1–2 MΩ or 3–5 MΩ tungsten electrodes plated with platinum black. One or two electrodes were placed in each station with hydraulic microdrives on mechanical manipulators (Narishige, Tokyo, Japan), mounted on a stereotaxic frame (David Kopf Instruments, Tujunga, CA) or on supplementary supports. Localization of thalamic electrodes, which were stereotaxically advanced along the vertical, was confirmed with Nissl-stained sections. Spike trains were amplified and band-pass filtered (500–10,000 Hz), recorded on a Cygnus Technology (Delaware Water Gap, PA) CDAT-16 recorder with 24-kHz sampling rate, and sorted off-line with a Bayesian spike sorting algorithm (Lewicki 1994). Each electrode location yielded an average of 1.9 well-isolated single units. Stimulus-driven neural activity was recorded for ∼20 min at each location.
The dynamic ripple stimulus (Escabí et al. 1998; Kowalski et al. 1996; Miller and Schreiner 2000; Schreiner and Calhoun 1994) is a temporally varying broadband sound composed of 230 sinusoidal carriers (500–20,000 Hz) with randomized phase. The magnitude of any carrier at any time is modulated by the spectrotemporal envelope, consisting of sinusoidal amplitude peaks (“ripples”) on a logarithmic frequency axis which change through time. Two parameters define the envelope: a spectral and a temporal modulation parameter. Spectral modulation rate is defined by the number of spectral peaks per octave, or ripple density. Temporal modulations are defined by the speed and direction of the peaks' change. Both the spectral and temporal modulation parameters were varied randomly and independently during the 20-min, nonrepeating stimulus. Spectral modulation rate varied slowly (max. rate of change 1 Hz) between 0 and 4 cycles per octave; the temporal modulation rate varied between −100 Hz (downward sweep) and 100 Hz (upward sweep), with a maximum 3-Hz rate of change. Both parameters were statistically independent and unbiased within those ranges. In one experiment, however, the temporal modulation spectrum decayed slightly; all evidence of this mild bias was readily abolished while thresholding the STRFs (see Analysis). Maximum modulation depth of the spectrotemporal envelope was 45 dB. Mean intensity was set 20–30 dB above the neuron's pure-tone threshold. An independent dynamic ripple sound was presented simultaneously to each ear.
Data analysis was carried out in MATLAB (Mathworks, Natick, MA). For each neuron, the reverse correlation method was used to derive the spectrotemporal receptive field (STRF), which is the average spectrotemporal stimulus envelope immediately preceding each spike (Aertsen and Johannesma 1980; deCharms et al. 1998; Escabí et al. 1998; Klein et al. 2000; Theunissen et al. 2000). Positive regions of the STRF indicate that stimulus energy at that frequency and time tends to increase the neuron's firing rate, and negative regions indicate where the stimulus envelope induces a decrease in firing rate (Fig. 1 A). In all locations, the STRF procedure was performed on the typically dominant, contralateral ear; in 75% of the recordings, we performed an independent STRF calculation for the ipsilateral ear. Only units with robust STRF features from one or both ears were analyzed further. The presence of robust features was determined by the largest contiguous deviation in the significant STRF, where threshold was set at P < 0.002. Because spectral and temporal modulations in the stimulus are low-pass (≤4 cycles/octave and 100 Hz, respectively), the smallest possible STRF feature is as large as the fastest modulation half-cycle. In the absence of noise, the smallest feature would thus be 1/8 octaves by 5 ms. If we require that the peak rise at least twice as high as the noise threshold (a conservative criterion), then the minimum size for a robust STRF feature is 0.095 octaves by 3.8 ms. By this measure, 223 of 240 (93%) thalamic single units and 164 of 267 (61%) cortical single units were analyzed further. Of the units lacking STRF features, over half occurred at a recording location where another unit had an STRF (65% thalamus, 61% cortex). Thus 98% of thalamic and 85% of cortical locations yielded at least one well-isolated single unit with an STRF.
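To make the reverse-correlation step concrete, the following is a minimal sketch in Python (the original analysis was carried out in MATLAB; this is not the authors' code). It assumes a hypothetical discretized envelope stored as a list of time bins, each holding one value per frequency channel, and a list of spike bin indices. The function name `spike_triggered_average` and the toy data are illustrative only, and significance thresholding is omitted.

```python
def spike_triggered_average(envelope, spike_bins, n_lags):
    """Average the stimulus envelope preceding each spike.

    envelope   : list of time bins; each bin is a list of per-channel values
                 (assumed here to be mean-subtracted already).
    spike_bins : indices of the time bins that contain a spike.
    n_lags     : number of time bins of stimulus history to keep.

    Returns strf[lag][channel], where lag 0 is the spike bin itself and
    lag k is the envelope k bins before the spike.
    """
    n_chan = len(envelope[0])
    strf = [[0.0] * n_chan for _ in range(n_lags)]
    used = 0
    for t in spike_bins:
        if t - (n_lags - 1) < 0:
            continue  # skip spikes without a full stimulus history
        used += 1
        for lag in range(n_lags):
            row = envelope[t - lag]
            for ch in range(n_chan):
                strf[lag][ch] += row[ch]
    if used == 0:
        raise ValueError("no spike has a complete stimulus history")
    return [[value / used for value in row] for row in strf]


# Toy example: channel 1 is active one bin before each spike.
envelope = [[0.0] * 3 for _ in range(8)]
envelope[2][1] = 1.0
envelope[5][1] = 1.0
strf = spike_triggered_average(envelope, spike_bins=[3, 6], n_lags=3)
```

In the toy example the only nonzero entry of `strf` is at lag 1, channel 1, which is the kind of excitatory peak from which best frequency and peak latency would be read.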
Modulation properties were derived by performing a two-dimensional Fourier transform of each significant STRF, smoothed with a two-dimensional (2-D) Gaussian of SD 2 pixels, to give the ripple transfer function (RTF; Fig. 1 B). The RTF is thus a signal in the parameter space of temporal modulation rate versus spectral modulation rate, or ripple density. Preferred temporal and spectral modulation rates are analogous to moving grating speed and spatial frequency, respectively, in the visual system. RTF energy at low (high) ripple densities indicates preference for broadly (narrowly) spaced spectral contours. Energy at large positive (negative) temporal modulations indicates a preference for fast up- (down-) frequency sweeps; energy near zero temporal modulation means the cell has little preference for sweeps. As a Fourier transform is sensitive to periodicities in the STRF envelope, the RTF depends heavily on the relationship of excitatory and inhibitory STRF subfields. For instance, if the sole STRF feature is an excitatory peak, the RTF will tend to be low-pass in both temporal and spectral modulation domains. Strong flanking inhibition in frequency or in time will tend to produce an RTF signal that is band-pass in the spectral or temporal domain, respectively (see Fig. 1, A–L). The more an STRF resembles a sinusoidal alternation of excitatory and inhibitory domains in time or frequency, the more strictly band-pass its RTF will be in the corresponding dimension.
Most response parameters were derived from the feature with maximum deviation in the STRF or RTF. Best frequency (Hz) and peak latency (ms) are the spectral and temporal coordinates of the maximum deviation in the STRF (Fig. 1 A, gray arrows). The spectrotemporal boundaries of an STRF feature were defined by a contour at 1/e times the maximum value. This threshold is largely empirical, as it typically circumscribed ∼90% of the feature's energy; it is also analytically simple, in that an idealized Gaussian feature would have a contour with spectrotemporal extent √2 times its SD. Bandwidth is the width of this contour in frequency (Fig. 1 A, black arrows). The sharpness-of-tuning measure Q1/e is defined as the best frequency divided by the bandwidth; thus higher Q1/e is sharper tuning. Best spectral modulation (BSM, in cycles/octave) and best temporal modulation (BTM, in Hz) are the spectral and temporal coordinates of the maximum deviation in the RTF (Fig. 1 B, arrows). A sharpness-of-tuning measure can also be derived for spectral and temporal modulation preference, identically to that described for the STRF but from the main RTF feature. The temporal modulation transfer function (tMTF) and spectral modulation transfer function (sMTF) were constructed by first folding the RTF along the vertical midline (temporal modulation = 0), thereby ignoring sweep direction. This RTF was then collapsed or summed along the complementary (spectral/temporal) dimension. For instance, the 2-D RTF was collapsed along the dimension of spectral modulation to yield a one-dimensional signal in the temporal modulation domain. A composite, or population RTF was constructed by averaging the RTFs from all units in thalamus or cortex, where each unit's RTF was weighted equally. Composite tMTFs and sMTFs were then derived from the composite RTF.
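The folding and collapsing steps that produce the tMTF and sMTF can be sketched as follows. This is an illustrative Python sketch, not the original MATLAB analysis. It assumes the RTF magnitude is stored as a hypothetical 2-D list `rtf[spectral][temporal]` with an odd number of temporal bins centered on temporal modulation = 0, and that folding means summing the mirrored halves (the text does not say whether the halves are summed or averaged).

```python
def fold_rtf(rtf):
    """Fold the RTF about temporal modulation = 0, discarding sweep direction.

    rtf[s][t] is the RTF magnitude at spectral rate index s and temporal
    rate index t; the middle column is temporal modulation = 0.
    Assumption: mirrored bins are summed.
    """
    n_t = len(rtf[0])
    if n_t % 2 == 0:
        raise ValueError("expected an odd number of temporal bins")
    mid = n_t // 2
    return [
        [row[mid]] + [row[mid + k] + row[mid - k] for k in range(1, mid + 1)]
        for row in rtf
    ]


def collapse(folded):
    """Sum the folded RTF along each dimension.

    Returns (tmtf, smtf): tmtf sums over spectral rates, leaving a function
    of |temporal modulation|; smtf sums over temporal rates, leaving a
    function of spectral modulation.
    """
    tmtf = [sum(column) for column in zip(*folded)]
    smtf = [sum(row) for row in folded]
    return tmtf, smtf


# Toy RTF: 2 spectral rates x 5 temporal rates (-2, -1, 0, +1, +2).
rtf = [[1, 2, 3, 4, 5],
       [0, 1, 0, 1, 0]]
folded = fold_rtf(rtf)
tmtf, smtf = collapse(folded)
```

A composite tMTF or sMTF would be obtained the same way after averaging the RTFs of all units in a station with equal weight.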
Spectrotemporal asymmetry or nonseparability in the STRF, such as frequency sweep selectivity, was measured in the RTF domain. Nonseparability is a special case of spectrotemporal asymmetry, indicative of oblique STRF features: a separable STRF can be constructed from the outer multiplication of a single signal in time and a single signal in frequency; a nonseparable STRF cannot. At each spectral modulation rate, the imbalance of RTF energy to either side of the vertical midline (temporal modulation = 0) was defined as the difference between positive and negative energies divided by the sum of energies. These values were combined by a weighted sum across all spectral modulation rates, with weights equal to the proportion of total RTF energy at each modulation rate, to give an asymmetry measure. Spectrotemporal asymmetry thus has value −1 for strong down-sweep preference, 0 for no mean asymmetry, and +1 for strong up-sweep preference.
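The asymmetry measure described above can be sketched in a few lines of Python. This is a hedged illustration rather than the original implementation. It assumes a hypothetical array `rtf_energy[spectral][temporal]` with an odd number of temporal bins centered on temporal modulation = 0. It also assumes that energy in the midline column is excluded from both the per-row imbalance and the weights, a detail the text does not specify.

```python
def spectrotemporal_asymmetry(rtf_energy):
    """Weighted up/down-sweep imbalance of RTF energy, in [-1, +1].

    For each spectral modulation rate (row), the imbalance is
    (positive - negative) / (positive + negative) temporal-modulation energy.
    Rows are combined with weights equal to their share of total energy.
    Assumption: the temporal modulation = 0 column is ignored.
    """
    mid = len(rtf_energy[0]) // 2
    halves = []
    for row in rtf_energy:
        negative = sum(row[:mid])       # down-sweep side
        positive = sum(row[mid + 1:])   # up-sweep side
        halves.append((positive, negative))

    total = sum(p + n for p, n in halves)
    if total == 0:
        return 0.0  # no off-midline energy, hence no measurable asymmetry

    asymmetry = 0.0
    for positive, negative in halves:
        row_energy = positive + negative
        if row_energy == 0:
            continue
        imbalance = (positive - negative) / row_energy
        asymmetry += (row_energy / total) * imbalance
    return asymmetry


up_only = spectrotemporal_asymmetry([[0, 0, 1, 3, 1]])
symmetric = spectrotemporal_asymmetry([[2, 1, 5, 1, 2]])
mixed = spectrotemporal_asymmetry([[0, 0, 1, 2, 2],
                                   [1, 1, 5, 0, 0]])
```

Under these assumptions the weighted sum reduces to the total positive-side energy minus the total negative-side energy, divided by their sum.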
Contralateral and ipsilateral STRFs were compared in two complementary ways. The first was a contra:ipsi peak measure, an ordered pair [contra, ipsi] with the value of greatest contralateral and ipsilateral STRF extremes in SDs above the noise. Excitatory extremes are positive numbers and inhibitory extremes are negative numbers. The second binaural measure was a similarity index (DeAngelis et al. 1999) related to a correlation coefficient. The two significant STRFs were treated as vectors rather than arrays in time and frequency. The binaural similarity index (BSI) is then the inner product of the vectorized contralateral and ipsilateral STRFs, divided by both of their vector norms. A vector norm is the square root of the inner product of a vector with itself. A BSI greater than zero indicates binaural agreement of the STRFs in frequency, time, and sign; a BSI less than zero means binaural inputs are, on average, antagonistic; and a BSI equal to zero indicates no correlation between binaural STRF shapes.
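The BSI definition translates directly into code. The sketch below is illustrative Python, not the authors' MATLAB. It assumes both STRFs are 2-D lists of equal shape that have already been thresholded for significance, and it returns 0 when either STRF is entirely zero, an assumption consistent with the Results, where units lacking an ipsilateral STRF have BSI = 0.

```python
import math


def binaural_similarity_index(contra, ipsi):
    """Inner product of the vectorized STRFs divided by both vector norms."""
    c = [value for row in contra for value in row]
    i = [value for row in ipsi for value in row]
    if len(c) != len(i):
        raise ValueError("STRFs must have the same shape")

    dot = sum(a * b for a, b in zip(c, i))
    norm_c = math.sqrt(sum(a * a for a in c))
    norm_i = math.sqrt(sum(b * b for b in i))
    if norm_c == 0 or norm_i == 0:
        return 0.0  # assumption: a missing STRF gives zero similarity
    return dot / (norm_c * norm_i)


contra = [[0.0, 1.0], [2.0, 0.0]]
matched = binaural_similarity_index(contra, contra)
opposed = binaural_similarity_index(contra, [[0.0, -1.0], [-2.0, 0.0]])
absent = binaural_similarity_index(contra, [[0.0, 0.0], [0.0, 0.0]])
```

Identical STRFs give a value of +1 (up to rounding), sign-inverted STRFs give −1, and an empty ipsilateral STRF gives 0.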
Typical STRFs appear in Fig. 1. The nearly ubiquitous excitatory peak in the contralateral STRF may be flanked on upper and/or lower frequencies by an inhibitory region (Fig. 1 A). Also common is an inhibitory region at the best excitatory frequency but at a longer latency; this indicates a preference for stimulus energy onsets (i.e., transitions from low to high energy). Because the RTF (Fig. 1 B) is a Fourier transform of the STRF, it depends heavily on the relationship between excitatory and inhibitory STRF features. Consequently so do the sMTFs and tMTFs, which are derived from the RTF. Neighboring excitatory-inhibitory regions impart a band-pass preference for modulations in time or frequency, while a sole excitatory or inhibitory feature indicates low-pass preference for modulations. For instance, if a neuron's STRF has both flanking inhibition in frequency and preceding (i.e., longer latency) inhibition in time (Fig. 1 A), then its RTF tends to be band-pass in both spectral and temporal domains (Fig. 1 B). The sMTF (Fig. 1 C) and tMTF (Fig. 1 D) likewise reflect these preferences. If a neuron's STRF has flanking inhibition in frequency but lacks preceding inhibition in time (Fig. 1 E), it tends to be band-pass for spectral and low-pass for temporal modulations (Fig. 1, F–H). If instead the neuron's STRF lacks flanking inhibition but shows strong preceding inhibition in time (Fig. 1 I), it tends to be low-pass in the spectral and band-pass in the temporal domain (Fig. 1, J–L).
Examples of less common STRF structures are shown in Fig. 2. Occasionally, an inhibitory domain forms an uninterrupted swath through time and frequency around the main excitatory peak (Fig. 2 A). Other neurons have very strong frequency-modulated (FM) sweep selectivity (Fig. 2 B); this cortical neuron prefers a particular speed of upward sweep (upward, because the time axis in an STRF is time-preceding-spike). Very rarely, a well-isolated single unit has multiple excitatory and inhibitory domains in a complex arrangement (Fig. 2 C) or a single, almost solely inhibitory STRF (Fig. 2 D).
When the ipsilateral ear produces an STRF, the features may vary widely in their binaural spectrotemporal agreement or antagonism. The binaural STRFs in Fig. 3 A, for instance, have regions of similar spectrotemporal extent. While the contralateral STRF shows weak subfields that are not duplicated in the ipsilateral STRF, the main excitatory features in both match extremely well. In contrast, strong binaural overlap of subfields with opposite sign occurs in Fig. 3 B. Although less common, binaural features may show more complex contra versus ipsi relationships, such as a single subfield for one ear aligned with multiple cooperative or antagonistic subfields in the other ear (Fig. 3 C). Not all cells in thalamus or cortex have an ipsilateral STRF, but most show a contralateral STRF. Therefore much of the following analysis will describe features for the typically dominant, contralateral ear.
Temporal response preferences
Peak latencies for thalamic and cortical responses are defined by the moment of maximum deviation (usually excitation) in the contralateral STRF. First-spike latencies cannot be unambiguously assessed with a continuous stimulus such as the dynamic ripple. Both thalamus and cortex show a unimodal peak-latency distribution with median 10.5 and 13.0 ms (mean, 13.2 and 17.9 ms), respectively (Fig. 4). The distributions are highly overlapping, with most of the thalamic latencies occurring after the earliest cortical ones. That is, the thalamic and cortical populations are simultaneously active for most of their response durations. Nevertheless, the tail of the cortical distribution extends to longer latencies (∼45 ms) than the thalamic tail (∼30 ms).
The tMTF measures a cell's preference for stimulus energy to fluctuate in time. It is constructed by collapsing the energy in the 2-D ripple transfer function into the one-dimensional temporal modulation domain (see methods). Because the RTF depends heavily on the relationship of excitatory and inhibitory subfields in the STRF, so do the modulation preferences of the tMTF. Preceding or following the typical excitatory STRF peak, weak or absent STRF inhibition in time tends to produce a low-pass modulation function. Strong inhibition in time usually results in a band-pass function for temporal selectivity.
Thalamic and cortical tMTFs differ in several respects. Thalamic cells tend to prefer higher temporal modulation rates than cortical cells, as suggested by the representative tMTFs in Fig. 5, A and B, respectively. For band-pass neurons (BTM ≠ 0), moreover, thalamic neurons prefer a significantly narrower relative range of modulation frequencies (temporal modulation Q1/e: mean, 0.64 thalamus, 0.45 cortex; median, 0.66 thalamus, 0.39 cortex; 2-sample t-test, P < 0.001). That is, thalamic responses are more sharply tuned than cortical responses to temporal modulation rate. Distributions of the absolute value of BTM for thalamic and cortical cells are plotted in Fig. 5 C. The histograms from both stations are bimodal, with a small but significant proportion of low-pass units, and most neurons band-pass at higher rates (mean: 32.4 Hz thalamus, 16.6 Hz cortex; median: 27.4 Hz thalamus, 14.4 Hz cortex; 2-sample t-test, P < 0.001). While cortical neurons cover a similar range as thalamic units, most (90%) thalamic best modulation rates fall <63.6 Hz, whereas 90% of cortical rates fall <33.5 Hz. BTM histograms reveal the maximum preferred rates for the neurons, but they do not incorporate the overall filter properties of the neural population. By averaging all tMTFs for thalamic and cortical units separately, we approximate the temporal modulation filters of these two stations (Fig. 5 D). The composite thalamic tMTF has proportionally more energy at higher modulation rates than cortex (peak: 21.9 Hz thalamus, 12.8 Hz cortex; upper 6-dB cutoff: 62.9 Hz thalamus, 37.4 Hz cortex). Whereas the energy at high rates in thalamus is due primarily to neurons with high BTMs, the high-frequency tail in cortex is also a result of neurons with low BTMs but broad modulation tuning. Therefore, whether based on preferred rates alone or the overall filter properties, both stations are effectively band-pass temporal modulation filters with cortical rates about half those in thalamus.
Spectral response preferences
The sMTF measures a neuron's preference for the spacing of spectral contours. Complementary to the tMTF, the sMTF is constructed by collapsing the ripple transfer function into the spectral modulation domain. It thus depends on the relationship of excitatory and inhibitory STRF subfields across frequency: weak or absent STRF sideband inhibition produces a low-pass modulation function, and strong sideband inhibition results in a sharp band-pass function. Spectral modulation rate or ripple density is expressed as the number of cycles of the ripple stimulus envelope per octave frequency.
Representative examples of thalamic and cortical sMTFs are shown in Fig. 6, A and B, respectively. The distribution of best spectral modulation rates (BSM) for all units shows a heavy bias toward low values, or broad spectral preferences (Fig. 6 C; mean: 0.58 cycles/octave thalamus, 0.46 cycles/octave cortex; median: 0.42 cycles/octave thalamus, 0.25 cycles/octave cortex). There are proportionally more low-pass cortical cells, which accounts for the significant difference between the means (2-sample t-test, P = 0.029), but the distributions are otherwise very similar. As in the temporal domain, the BSM histograms obscure the filter properties of the neural populations. For instance, similar to temporal modulations, the sharpness of tuning for spectral modulations is lower in cortex than thalamus (for cells with BSM ≠ 0, spectral modulation Q1/e: mean 0.46 thalamus, 0.29 cortex; median 0.40 thalamus, 0.22 cortex; 2-sample t-test, P < 0.001). We obtain approximations to the overall spectral transfer functions of thalamus and cortex by summing all the individual unit sMTFs in each station (Fig. 6 D). Unlike the temporal transfer functions, the composite spectral transfer functions are very similar between thalamus and cortex, with a peak of 0 cycles/octave in both thalamus and cortex, and upper 6-dB cutoff values of 1.30 cycles/octave in thalamus and 1.37 cycles/octave in cortex. In general, both thalamic and cortical spectral filters are therefore low-pass and almost perfectly overlapping.
In addition to BSM, we also employ a simpler traditional measure of spectral selectivity, Q. For the main contralateral STRF feature (almost always excitatory), Q1/e is the best frequency divided by the bandwidth at 1/e of the peak magnitude (see Fig. 1 A). Distributions of excitatory frequency Q1/e for thalamus and cortex are highly overlapping (Fig. 7 A), with a mean of 5.8 in thalamus and 5.4 in cortex. The thalamus, however, has proportionally more neurons with very sharp tuning (Q1/e > 10). Because BSM depends on the relationship between excitatory and inhibitory subfields and Q1/e does not, comparing the BSM and Q1/e for each unit gives a rough measure of how great a role sideband inhibition plays in spectral integration properties (Fig. 7, B and C). The diagonal line indicates a perfect match between Q1/e and BSM, where sideband inhibition is very strong and of the same spectral envelope periodicity as the main excitatory peak. Units near the bottom of the scatterplot lack strong sideband inhibition and therefore tend toward low-pass transfer functions. In both thalamus and cortex, some neurons have strong sideband inhibition, many have none at all, and most fall somewhere in between. The fact that very few symbols appear above the diagonals indicates that strong sideband inhibition is almost never of higher spectral periodicity than one would infer from the excitatory bandwidth. Whether considering traditional Q values or modulation transfer functions, therefore, thalamus and cortex exhibit very similar spectral response profiles.
Spectrotemporal response preferences
One of the strengths of the STRF is that it describes spectral and temporal response properties under identical stimulus conditions. Overall thalamic and cortical spectrotemporal modulation preferences are summarized in composite RTFs (Fig. 8, A and B). The joint distribution of BTMs and BSMs (superimposed symbols), moreover, reveals whether spectral and temporal properties are correlated for individual cells. Unlike previous sections where the absolute value of BTM was used, thus ignoring frequency sweep direction preference, this analysis includes the sign of the BTM (see methods for details). If modulation properties in time and frequency domains depend on one another, the joint distribution should have an informative structure. For instance, if the same excitatory-inhibitory processes underlie both spectral and temporal modulation preferences, then BSM and BTM will be positively correlated. Or if the cells behave as filters with limited time-frequency resolution, one would expect an anti-correlated limit between temporal and spectral modulation sensitivities. Such a time-frequency tradeoff would require that neurons with high BTMs have low BSMs, and neurons with high BSMs have low BTMs. In fact, neither thalamic nor cortical cells show any conspicuous correlation structure between spectral and temporal modulation rates. Both domains, of course, reflect the biases described in the preceding text and evident in the composite RTFs. This globally band-pass temporal and low-pass spectral structure results in a weak but significant correlation coefficient between |BTM| and BSM (thalamus, r = 0.266, P < 0.001; cortex, r = 0.300, P < 0.001). Except for these biases, however, BTM–BSM combinations fill the space probed by the stimulus and are relatively independent of one another, each accounting for less than 1/10th of the variability in the other (coefficient of determination: thalamus r² = 0.071; cortex r² = 0.090). Nor is there correlation between the sharpness of modulation tuning, or Q1/e, in frequency versus time (P > 0.2).
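As a quick arithmetic check on the values reported above, the coefficient of determination is the square of the correlation coefficient:

```latex
r^2_{\text{thalamus}} = (0.266)^2 \approx 0.071, \qquad
r^2_{\text{cortex}} = (0.300)^2 = 0.090
```

Both values are below 0.1, which is the sense in which each modulation parameter accounts for less than one tenth of the variability in the other.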
An asymmetry in the magnitude of the RTF about the vertical midline (temporal modulation = 0) indicates a spectrotemporal asymmetry or nonseparability of the spectral and temporal aspects in the STRF. Asymmetries in the STRF can take many forms, but the strongest and most paradigmatic example is frequency sweep preference (e.g., Fig. 2 B), with up-sweep preference resulting in higher RTF energy at positive temporal modulation values, and down-sweep preference resulting in higher energy at negative values. Since the scatterplots overlying Fig. 8, A and B, only give the individual RTF maxima rather than the distributions of energy and since the underlying composite RTFs obscure the filter properties of the individual cells, they cannot adequately reveal the RTF asymmetries across the population. To directly measure frequency sweep bias for each cell, we compared the RTF energies on either side of the vertical midline (temporal modulation = 0) at each spectral modulation rate, then combined values from all spectral modulation rates by a weighted sum to derive a spectrotemporal asymmetry measure. The spectrotemporal asymmetry has large negative value for strong down-sweep preference, 0 for no asymmetry, and large positive value for strong up-sweep preference. Most cells in both stations have little preference for sweeps in either direction (Fig. 8, C and D), i.e., their STRFs are spectrotemporally rather symmetric. The STRF in Fig. 2 A, for instance, has an asymmetry measure near zero (−0.07). A small proportion of cells, however, show strong preference for sweeps of the up or down directions, with no clear population bias toward either. The STRF in Fig. 2 B, for example, has a large positive asymmetry of 0.54; most of its RTF energy would extend into the right half-plane. The proportion of neurons sensitive to sweep direction, moreover, is similar in MGBv and AI.
Binaural response preferences
The STRF from stimulation of the contralateral ear is typically dominant, having greater magnitude than the ipsilateral side, and thus serves as the basis for most of the preceding analysis. As illustrated in Fig. 3, however, many thalamic and cortical cells also show significant STRFs to independent, simultaneous stimulation of the ipsilateral ear. The STRFs of a given neuron may differ substantially for contra- and ipsilateral stimulation. These differences can be expressed in terms of excitatory or inhibitory dominance as well as in terms of the STRF shapes. Accordingly, we quantify the binaural STRF differences by two methods. The first is a contra:ipsi peak measure, where the maximum STRF deviation is measured for each ear in SDs above the noise. The values in this ordered pair [contra, ipsi] are positive for excitatory and negative for inhibitory deviations. For instance, the neuron in Fig. 3 B has contra:ipsi peak measure [+9, −8]. The contra:ipsi peak measure considers only the single maximum deviation for each STRF, regardless of other weaker features and their binaural spectrotemporal alignment. This measure is reminiscent of the traditional classification of binaural interaction types in terms of excitatory (E) and inhibitory (I) contributions, where our contra:ipsi peak of [+9, −8] might be considered EIpk. However, the STRF-based measure is not categorical but continuous and quantifies the binaural input more than the explicit binaural interaction. Our second assay, the BSI, is a correlation coefficient measuring how similar the entire contra and ipsi STRFs are in frequency and time. A BSI of +1 means contra and ipsi STRFs are perfectly matched in frequency, time, and sign. A BSI of −1 occurs when the STRFs are matched in frequency and time, but are of antagonistic sign. Finally, a BSI of zero indicates no correlation between the shapes of the two STRFs. The BSI for the EEpk neuron in Fig. 3 A, for instance, is 0.75. The EIpk neuron of Fig. 3 B, on the other hand, has BSI = −0.31. STRFs with competing subregions, some binaurally similar and others opposite, tend to have BSIs closer to zero (e.g., Fig. 3 C, BSI = 0.14).
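The two binaural measures can be illustrated with a minimal sketch. It rests on several assumptions that are not specified in this section: each STRF is a small frequency-by-time grid sampled identically for both ears, the noise SD is already known, and the BSI is computed as a normalized inner product without mean subtraction (a Pearson-style variant would subtract each STRF's mean first). The function names `peak_measure` and `binaural_similarity_index` are hypothetical, and returning 0 when one STRF is empty is an illustrative convention for a neuron lacking a significant STRF in one ear.

```python
import math


def peak_measure(strf, noise_sd):
    """Signed largest-magnitude STRF deviation, in units of the noise SD.

    Positive values indicate an excitatory peak, negative values an
    inhibitory peak. `strf` is a list of rows (frequency x time).
    """
    values = [v for row in strf for v in row]
    peak = max(values, key=abs)  # keeps the sign of the largest deviation
    return peak / noise_sd


def binaural_similarity_index(contra, ipsi):
    """Normalized inner product of two STRFs on the same grid (assumed form).

    Returns +1 for identical shapes, -1 for sign-inverted shapes, and 0
    when the shapes are uncorrelated or one STRF is entirely zero.
    """
    x = [v for row in contra for v in row]
    y = [v for row in ipsi for v in row]
    if len(x) != len(y):
        raise ValueError("contra and ipsi STRFs must share the same grid")
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    syy = sum(b * b for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # convention assumed here: no STRF in one ear gives BSI = 0
    return sxy / math.sqrt(sxx * syy)


# Toy STRFs (2 frequency bins x 2 time bins), purely illustrative values.
contra = [[0.0, 4.5], [-1.0, 2.0]]
ipsi = [[0.0, -4.0], [1.0, -2.0]]
pair = [peak_measure(contra, 0.5), peak_measure(ipsi, 0.5)]
bsi = binaural_similarity_index(contra, ipsi)
```

With these toy values the ordered pair has an excitatory contralateral and an inhibitory ipsilateral peak, and the BSI is strongly negative because the two STRFs overlap in frequency and time with opposite sign.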
Both binaural measures may be plotted on the same figure, where symbol location indicates the contra:ipsi peak value and symbol type and size indicate BSI for each neuron (Fig. 9). Considering for now only the contra:ipsi peak values (symbol location), thalamus and cortex show the same response types but different proportions of each. The vast majority of units have an excitatory maximum peak in the contralateral STRF (90% of total in thalamus, 82% cortex). Many of these have no ipsilateral STRF (EOpk = 23% of total in thalamus, 40% cortex). Of those that do, more than twice as many ipsilateral STRFs have an excitatory peak rather than an inhibitory peak (EEpk = 45% thalamus, 30% cortex; EIpk = 21% thalamus, 12% cortex). Binaural similarity indices are indicated in Fig. 9 by symbol type and size. Neurons with BSI = 0 are represented by ○. Positive BSIs are +, and negative indices are ⋄; for these nonzero values, symbol size scales with the absolute magnitude of the BSI. BSIs show a bias toward well-matched or unmatched, as opposed to antagonistic, binaural STRFs in both thalamus and cortex (BSI = 0 for 30% thalamus, 58% cortex; BSI > 0 for 56% thalamus, 34% cortex; BSI < 0 for 14% thalamus, 7% cortex; BSI range −0.76 to 0.91 for thalamus, −0.38 to 0.87 cortex). Values for BSI are highly correlated with the contra:ipsi peak measures. The neurons with BSI = 0 necessarily fall mostly along the axes of Fig. 9, almost always the horizontal axis indicating a lack of ipsilateral STRF. Most of the neurons with positive BSI are EEpk (upper-right: 78% thalamus, 84% cortex), and most with negative BSI are EIpk (lower-right: 78% thalamus, 89% cortex). Overall, thalamic and cortical cells have similar binaural response types, but they differ in the relative proportions of each.
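As a rough illustration of how the peak-based response types and their proportions could be tabulated, the sketch below labels each ear as excitatory (E), inhibitory (I), or absent (O) from its signed peak value. The significance threshold of 3 SDs and the function names `classify_peak_type` and `type_proportions` are assumptions made for illustration only; the actual criteria for STRF significance are those given in methods.

```python
from collections import Counter


def classify_peak_type(contra_peak, ipsi_peak, threshold=3.0):
    """Label a [contra, ipsi] peak pair, e.g. 'EEpk', 'EIpk', or 'EOpk'.

    `threshold` (in noise SDs) is a hypothetical significance criterion;
    peaks smaller in magnitude are treated as "no STRF" for that ear.
    """
    def label(peak):
        if peak >= threshold:
            return "E"
        if peak <= -threshold:
            return "I"
        return "O"

    return label(contra_peak) + label(ipsi_peak) + "pk"


def type_proportions(peak_pairs, threshold=3.0):
    """Fraction of neurons falling in each peak-based response type."""
    counts = Counter(
        classify_peak_type(contra, ipsi, threshold) for contra, ipsi in peak_pairs
    )
    total = sum(counts.values())
    return {kind: n / total for kind, n in counts.items()}


# Toy population of [contra, ipsi] peak pairs, purely illustrative values.
pairs = [(9.0, -8.0), (7.0, 5.0), (6.0, 0.4), (12.0, 1.0)]
proportions = type_proportions(pairs)
```

Because the underlying measure is continuous, the categorical labels depend on the threshold chosen, which is one reason proportions obtained this way need not match those from traditional binaural classifications.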
We recorded simultaneously in auditory thalamus and cortex under identical conditions to compare directly the receptive field properties between stations. Our naturalistic and strictly parameterized stimulus characterized spectral, temporal, spectrotemporal, and aural response domains. This multidimensional and internally consistent assay of neural responses enables a unified description of the lemniscal thalamocortical transformation.
Of all parameters measured, temporal response properties differ most systematically between thalamus and cortex. Median peak response latency is 2.5 ms longer in cortex, a reasonable approximation to axonal and synaptic delay. This value agrees with studies of correlated thalamocortical cell pairs in several modalities (Creutzfeldt et al. 1980; Johnson and Alloway 1996; Miller et al. 2001a; Reid and Alonso 1995; Swadlow 1995; Tanaka 1983). While the maximum cortical peak latencies extend to longer values than in thalamus, the distributions are highly overlapping. That is, although thalamus is unsurprisingly activated before cortex, the two stations are simultaneously active for the greater part of their response. Tens of milliseconds of coincident activity would allow ample opportunity for corticothalamic, or even cortico-colliculo-thalamic feedback to shape responses in both stations (Ghazanfar and Nicolelis 1997; He 1997; Murphy et al. 1999; Villa et al. 1991; Zhang and Suga 1997, 2000).
A characteristic difference between thalamic and cortical responses is that cortical cells tend to respond to slower temporal modulation rates (Creutzfeldt et al. 1980). Unlike previous studies, we quantify the degree of this temporal slowing using strictly parameterized, wideband and dynamic stimuli. The composite temporal MTFs for thalamus and cortex show that the effective filter in cortex is half as fast as thalamus, in both peak and upper cutoff values. While the range of BTMs is similar between the two stations, many more thalamic cells have best responses to higher rates, so that the median value in cortex is also half that in thalamus. We further demonstrate that for band-pass neurons, temporal modulation tuning is narrower in thalamus than in cortex.
The profound transformation of temporal response properties from thalamus to cortex suggests that temporal modulations in the sound waveform are represented differently at the two stages. Mechanistically, one could suppose that each thalamic input is slowed by the same factor within the cortical network. A study of functionally connected thalamocortical cell pairs in the same system, however, shows no rank correlation between BTMs of thalamic inputs and cortical targets: fast (slow) thalamic cells do not contribute preferentially to fast (slow) cortical cells (Miller et al. 2001a). Therefore the thalamocortical temporal transformation is not a straightforward reduction in rate. We should reiterate the fact that the analyses in this report are sensitive only to neural responses phase-locked to the stimulus envelope. Although periodicities well above our stimulus limit of 100 Hz are important for an animal, phase-locked responses above this rate are very rare in thalamus and cortex, especially in the anesthetized preparation. The question thus remains of how higher periodicities are represented at this level. Some investigators provide evidence that the progressive decrease of phase-locked temporal modulation following rates could be accompanied by a recoding of modulations into a topographic, rate-based response (Langner et al. 1997; Pantev et al. 1989; but see Fishman et al. 1998). Others emphasize instead the perceptual saliency of the lower range of modulations (Arai and Greenberg 1998; Drullman et al. 1994); such psychophysical observations imply that we might expect to find a preferred representation of lower modulation frequencies in auditory cortex.
While temporal response properties differ significantly between thalamus and cortex, spectral properties are very similar. Although the cortex has somewhat more low-pass cells, the range of best spectral modulations is the same, as is the range of excitatory frequency integration or Q1/e values. The composite spectral MTFs in both stations are strongly low-pass with similar upper cutoffs. Best spectral modulation rates for cortex are similar to those found with static ripple stimuli (Schreiner and Calhoun 1994) except for a greater percentage of lower values found in this study. Our analysis reveals that many neurons' spectral modulation preferences are determined by strong sideband inhibition, a conclusion reached less directly using static ripples, pure-tone, or multi-tone stimuli (Calhoun and Schreiner 1998; Nelken et al. 1994; Versnel and Shamma 1998). One related factor not addressed in this report is how stimulus intensity affects spectral bandwidth. Although the dynamic ripple covers a large intensity range of 45 dB, because we always presented it at a mean of 20–30 dB above the pure-tone threshold, we cannot draw firm conclusions about intensity-bandwidth dependence. Our preliminary data agree with previous studies, however, suggesting that changes in excitatory spectral integration with increased stimulus intensity are much less pronounced with wideband stimuli than with pure tones (Ehret and Merzenich 1988; Ehret and Schreiner 1997). Plausibly, active engagement of inhibitory sidebands would tend to limit excitatory spread at high intensities.
Composite RTFs describe the overall spectrotemporal modulation transfer functions for thalamus and for cortex, both of which show the band-pass temporal and low-pass spectral properties described above. Interestingly, the cortical composite RTF bears a striking resemblance to psychophysically derived detection thresholds for moving ripple stimuli (Chi et al. 1999). Individual neuronal preferences are preserved in joint distributions of BSMs and BTMs, which reveal that best modulation rates are uncorrelated between spectral and temporal domains in thalamus and cortex. For instance, a neuron with high spectral modulation preference may have any given temporal preference and vice versa. Not only are best rates uncorrelated, but the sharpness of tuning or Q values for spectral modulations are also uncorrelated with those for temporal modulations. This lack of correlation demonstrates that the excitatory-inhibitory processes underlying modulation preferences are independent for frequency and time. Moreover, the fact that best modulation rates are not constrained by an inverse relationship (e.g., high BSM implying low BTM) indicates an absence of time-frequency trade-off. This is in contrast to the inferior colliculus, the input station to thalamus, where many neurons appear to respect a maximum time-frequency resolution (Escabí et al. 1998). Another response characteristic not captured by separate temporal and spectral analyses is the asymmetry of the STRF or RTF, in its extreme an indication of frequency sweep selectivity. The proportion of cells showing highly asymmetric RTFs is significant yet low. This suggests that the high proportion of frequency sweep-selective cells found in some previous studies (Mendelson and Cynader 1985; Nelken and Versnel 2000; Phillips et al. 1985) is determined by cells with a relatively symmetrical arrangement of excitatory and inhibitory subfields in frequency and time, not by cells whose STRFs show distinct sweep-like features. Our results agree more closely with early studies that showed only a small proportion of cells in MGB (Whitfield and Purser 1972) and AI (Evans and Whitfield 1964) that respond exclusively to FM tones. Compared to traditional stimuli, the dynamic ripple thus gives a much fuller description of a neuron's joint spectrotemporal response preferences.
Binaural response properties are similar in type but differ in their relative proportions between thalamus and cortex. Using the contra:ipsi peak measure, EEpk cells outnumber EIpk cells by greater than a factor of two. Cortex, however, has proportionally more EOpk cells than thalamus. Similarly with the BSI measure, the contra- and ipsilateral STRFs are positively correlated for most thalamic cells, followed by no correlation and finally binaural antagonism; most cortical cells show no binaural correlation, followed by positive correlation, then antagonism. Although our binaural analysis cannot be cast directly in the traditional categories (EE, EI, etc.) (Aitkin and Webster 1972; Calford and Webster 1981; Imig and Adrián 1977; Middlebrooks et al. 1980; Phillips and Irvine 1983; Semple and Aitkin 1979), we have attempted to provide continuity by classifying binaural types in an analogous fashion (an excitatory:excitatory contra:ipsi peak measure is EEpk). The most important difference between our results and traditional measures is that STRFs were derived with simultaneously presented but uncorrelated sounds to each ear, so no explicit binaural interactions were tested. Our contra:ipsi peak measure depends on the dominance of excitation or inhibition in each ear rather than the explicit interaction between them. The combination of the contra:ipsi peak measure with the BSI is nevertheless highly suggestive of the binaural interaction type. The proportions of binaural response types in the present report differ somewhat from previous studies (Aitkin and Webster 1972; Calford and Webster 1981; Middlebrooks and Zook 1983; Middlebrooks et al. 1980), most noticeably in the small percentage of EIpk or binaurally antagonistic (BSI < 0) cells and in the large percentage of EOpk cells, especially in cortex (but see Phillips and Irvine 1983). Proportions differ considerably among many previous studies as well, however, probably due to idiosyncrasies in method. The differences in our study, therefore, may be attributable to our use of a spectrotemporally complex stimulus, to our strict criteria for determining the presence and significance of an STRF (see methods), or to the fact that our analysis did not permit a parametric manipulation of binaural disparities. By using the same stimulus to evaluate simultaneously many neural response dimensions, we have developed a binaural assay that is internally consistent with monaural spectrotemporal preferences.
There are several promising directions for future studies. Although we recorded over large regions in the MGBv and AI, we made no attempt to systematically map either structure. Many of our response parameters could be spatially organized in both locations (Read et al. 2001; Rodrigues-Dagaeff et al. 1989; Schreiner 1998). We are nonetheless confident that we recorded from comparable thalamic and cortical populations, because many pairs showed monosynaptic-like functional connectivity (discussed in a separate report: Miller et al. 2001a). Second, many neurons fail to yield STRFs. Surely for some, the 20-min stimulus presentation time was insufficient to overcome low firing rate and/or high response variability. Other neurons without STRFs may respond to the stimulus envelope selectively but in a nonlinear, phase-invariant way, as seen in the inferior colliculus (Escabí et al. 1998) and reminiscent of complex cells in V1. Because anesthesia may affect some of our response parameters (Edeline et al. 1999), it will be important to verify our conclusions in the unanesthetized animal. Temporal modulation preferences in particular could vary with anesthesia (Kenmochi and Eggermont 1997), although recordings in unanesthetized guinea pig show a thalamic-cortical reduction in preferred rate similar to our results (Creutzfeldt et al. 1980). Finally, much work remains to determine how individual cells in thalamus and cortex actually implement the transformations described in the preceding text (Miller et al. 2001a,b).
In this report, we compare receptive field attributes of thalamic and cortical populations recorded simultaneously in the same preparation. Receptive fields were derived with dynamically changing, spectrotemporally complex stimuli that share many properties with natural sounds (Escabí et al. 1999). These methods enabled a unified description of temporal, spectral, spectrotemporal, and aural response properties under identical stimulus conditions (Miller and Schreiner 2000). Our observations thus lay the groundwork for further study of the thalamocortical transformation of responses to complex auditory stimuli.
This work was supported by the National Institutes of Health (DC-02260, NS-34835), the National Science Foundation (NSF97203398), and the Whitaker Foundation.
Address for reprint requests: L. M. Miller, Dept. of Psychology, University of California, 3210 Tolman Hall, #1650, Berkeley, CA 94720-1650.
Copyright © 2002 The American Physiological Society