For primary auditory cortex (AI) laminae, there is little evidence of functional specificity despite clearly expressed cellular and connectional differences. Natural sounds are dominated by dynamic temporal and spectral modulations and we used these properties to evaluate local functional differences or constancies across laminae. To examine the layer-specific processing of acoustic modulation information, we simultaneously recorded from multiple AI laminae in the anesthetized cat. Neurons were challenged with dynamic moving ripple stimuli and we subsequently computed spectrotemporal receptive fields (STRFs). From the STRFs, temporal and spectral modulation transfer functions (tMTFs, sMTFs) were calculated and compared across layers. Temporal and spectral modulation properties often differed between layers. On average, layer II/III and VI neurons responded to lower temporal modulations than those in layer IV. tMTFs were mainly band-pass in granular layer IV and became more low-pass in infragranular layers. Compared with layer IV, spectral MTFs were broader and their upper cutoff frequencies higher in layers V and VI. In individual penetrations, temporal modulation preference was similar across layers for roughly 70% of the penetrations, suggesting a common, columnar functional characteristic. By contrast, only about 30% of penetrations showed consistent spectral modulation preferences across layers, indicative of functional laminar diversity or specialization. Since local laminar differences in stimulus preference do not always parallel the main flow of information in the columnar cortical microcircuit, this indicates the influence of additional horizontal or thalamocortical inputs. AI layers that express differing modulation properties may serve distinct roles in the extraction of dynamic sound information, with the differing information specific to the targeted stations of each layer.
Primary auditory cortex (AI) contains six distinct layers, each with a unique set of input and output projections and with clearly differentiated cellular compositions (Rouiller et al. 1991; Winer 1992). In AI, cells are vertically arranged in a more conspicuous manner than in other sensory systems (Jones 2000; Winer 1984a). The vertical arrangement of AI cells is accompanied by highly specific interlaminar connections (Barbour and Callaway 2008; Mitani et al. 1985). This vertical microcircuitry has been considered a key element of cortical processing (Mountcastle 1997). These connections follow a precise and characteristic pattern that offers the opportunity to compare the function of specific components of the cortical microcircuit (Martinez et al. 2005). Here, we quantified laminar response patterns to dynamic temporal and spectral modulations to address the question of what transformations or constancies of spectrotemporal properties are evident within auditory cortical columns.
Previous work in AI showed that modulation information may undergo a transformation between thalamus and cortex (Miller et al. 2002). Compared with thalamic cells, neurons in thalamorecipient layers IIIb/IV follow slower modulations. Additionally, neurons in layers IIIb/IV contain spatial topographies, or local organizations, for characteristic frequency, latency, threshold, as well as spectral and binaural integration (Middlebrooks et al. 1980; Schreiner 1998; Schreiner and Sutter 1992). However, after this initial stage of processing, there is a paucity of information regarding how the vertical AI microcircuit further shapes and transforms elemental acoustic information (Linden and Schreiner 2003).
The situation in AI contrasts with that in the visual and somatosensory systems. In the visual system, the first stage of cortical integration—the thalamic input layer—creates simple cells, with cortical output stages dominated by complex cells. This laminar differentiation with regard to the manner of processing allowed testing of hypotheses concerning how these functional cell types were developed and constructed (Alonso and Martinez 1998; Ferster et al. 1996; Hubel and Wiesel 1962). Some properties related to stimulus content, such as retinal location of the receptive field and binocularity, are fairly constant across cortical laminae. By contrast, orientation and spatial modulation frequency can vary significantly with layer (DeBruyn et al. 1993; Heimel et al. 2005; Martinez et al. 2002), indicating distinct laminar functional transformations. In the whisker portion of the somatosensory system, some physiological properties can be fairly constant and others vary with layer and cell type (e.g., Ahissar et al. 2001). In thalamic recipient layers, afferents contact excitatory or inhibitory neurons on a sublaminar basis and constituent neurons are functionally dominated by a single whisker (Bruno and Simons 2002; Zhang and Alloway 2004). Cells in supragranular and infragranular layers usually have multiwhisker receptive fields, which integrate the layer IV single-whisker responses (Brumberg et al. 1999; Simons 1978). Thus cell responses in the early visual and somatosensory cortices are precisely shaped and organized according to their position in cortical layers.
Temporal and spectral modulations are fundamental properties of natural sounds that undergo substantial transformations in their representation along the auditory neuraxis (Joris et al. 2004). This raises multiple possibilities for the representation of these preferences in cortex. The preferences may be organized with little change across layers, whereas differences in horizontal location within AI may convey diversity in modulation preferences. Alternatively, modulation processing may be layer dependent, with changes that are systematic regardless of spatial position in AI. Third, there may be location-specific laminar transformations that are evident only at specific locations and they may not uniformly follow the main, monotonic, or progressive transformation suggested by the canonical columnar microcircuit (Linden and Schreiner 2003; Mitani et al. 1985). Thus averaging across locations may obscure the nature of the transformations. To examine these possibilities, we used multichannel probes to record from AI layers while presenting a dynamic noise stimulus. We then applied the reverse-correlation technique and calculated spectrotemporal receptive fields (STRFs). From the STRFs we obtained temporal and spectral modulation transfer functions (tMTFs, sMTFs) (Escabí and Schreiner 2002). MTFs characterize the sensitivity of a neuron to stimulus envelope modulations and reflect the relationship of excitatory and suppressive aspects of neural processing.
Our work significantly extends previous reports. Previous studies have presented STRFs from neurons recorded serially, not simultaneously. Additionally, STRFs have not been related to functional anatomy. Also, relatively few studies have described how temporal modulation processing changes with cortical depth and even fewer have described layer-dependent changes in spectral modulation processing. Our findings indicate that the processing of spectral modulation information in AI is predominantly layer dependent with distinct intra- and interlaminar diversity. Temporal modulation properties are more often layer independent but show an overall trend, where preferred modulation rates decrease in supra- and infragranular output layers compared with thalamic input layers, indicative of a general columnar transformation scheme. Combined, the differences in spectral and temporal modulation processing suggest independent spectral and temporal sequential processing schemes in the interlaminar or columnar domain.
The electrophysiological recording methods and stimulus design used in this study were previously described in detail (Miller and Schreiner 2000; Miller et al. 2001a,b, 2002). A brief description follows.
Seven young adult cats with clean and otoscopically normal outer and middle ears were sedated with an initial dose of ketamine (22 mg/kg) and acepromazine (0.11 mg/kg) and then anesthetized with pentobarbital sodium (Nembutal, 15–30 mg/kg) for the surgical procedure. The animal's temperature was maintained with a thermostatic heating pad. Bupivacaine was applied to incisions and pressure points. Surgery consisted of a tracheotomy, reflection of the soft tissues of the scalp, craniotomy over AI, and durotomy. After surgery, to maintain an areflexive state, the animal received a continuous infusion of ketamine/diazepam (2–5 mg·kg−1·h−1 ketamine, 0.2–0.5 mg·kg−1·h−1 diazepam in lactated Ringer solution). All procedures were in compliance with the University of California, San Francisco Committee for Animal Research.
With the animal placed inside a sound-shielded anechoic chamber (IAC, Bronx, NY), stimuli were delivered via a closed speaker system to the ear contralateral to the exposed cortex (diaphragms from STAX, Iruma-Gun, Japan). Extracellular recordings were made using multichannel silicon recording probes, which were provided by the University of Michigan Center for Neural Communication Technology (Wise 2005). The probes contained 16 linearly spaced recording channels, with each channel separated by 150 μm. The impedance of each channel was 2–3 MΩ. Probes were carefully positioned orthogonally to the cortical surface and lowered to depths between 2,300 and 2,400 μm using a microdrive (David Kopf Instruments, Tujunga, CA).
To obtain single-neuron responses, neural traces were band-pass filtered between 600 and 6,000 Hz and were digitally recorded with a Cheetah32 A/D system (Neuralynx, Bozeman, MT), at sampling rates between 18,000 and 27,000 Hz. Stimulus-driven neural activity was recorded for about 75 min at each location. After each experiment, the traces were sorted off-line with a Bayesian spike-sorting algorithm (Lewicki 1994). Most channels of the probe yielded one to two well-isolated single units. The spike shape of all neurons was inspected to distinguish between regular- and fast-spiking neurons (Atencio and Schreiner 2008). Here we report exclusively on regular-spiking neurons, which are presumptive pyramidal cells. All recording locations were in AI, as verified through initial multiunit mapping and determined by the layout of the tonotopic gradient and bandwidth modules on the crest of the ectosylvian gyrus (Imaizumi and Schreiner 2007).
For current source density (CSD) analyses, the A/D filter settings were adjusted to have a passband from 1 to 400 Hz, which allowed us to record local field potentials (LFPs). For LFP recordings, we used 1,000 to 2,000 monophasic clicks with a pulse width of 100 μs and an interstimulus interval of 500 ms. After each experiment, the LFPs were downsampled to 1,000 Hz using a polyphase implementation that applied antialiasing filtering and then compensated for the delay introduced by the filtering (Proakis and Manolakis 1995). For each electrode channel, the LFPs were then averaged over each click stimulus trial. To obtain one-dimensional CSDs, we used the laminar profiles of the LFPs to estimate the second spatial derivative of the LFPs from three neighboring points via D = −(V(r − h) − 2V(r) + V(r + h))/h², where D is the CSD estimate, V represents an averaged LFP recording, r is the depth (in μm) at which D is calculated, and h is the spacing of the recording sites (150 μm) (Nicholson and Freeman 1975; Steinschneider et al. 1998).
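The three-point CSD estimate can be illustrated with a minimal Python sketch. The function name `csd_second_difference`, the list-based input, and the choice to return values only for interior channels are assumptions made for illustration; they are not taken from the original analysis code, which was written in MATLAB.

```python
def csd_second_difference(lfp, h_um=150.0):
    """Estimate a one-dimensional CSD profile at a single time point.

    lfp  : averaged LFP values, one per channel, ordered by cortical depth
           (hypothetical input layout).
    h_um : spacing between recording sites in micrometers.

    Returns D = -(V(r - h) - 2 V(r) + V(r + h)) / h^2 for every interior
    channel, so the result has len(lfp) - 2 entries.
    """
    if len(lfp) < 3:
        raise ValueError("need at least three depth-ordered channels")
    return [
        -(lfp[i - 1] - 2.0 * lfp[i] + lfp[i + 1]) / (h_um ** 2)
        for i in range(1, len(lfp) - 1)
    ]
```

With a 16-channel probe this would give 14 CSD values per time point, because the shallowest and deepest channels lack one neighbor. A potential profile that is linear in depth yields a CSD of zero, as expected for a second spatial derivative.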
Penetrations with the linear recording array were orthogonal to the cortical surface and spanned all cortical layers. We operationally refer to this recording approach as “columnar.” By this we mean that the activity of recorded neurons represents processes that span the full vertical thickness of the cortical laminae, but may include more interactions than represented by the extent of anatomical microcolumns and less than the extent of functional modules (Linden and Schreiner 2003). An example of a depth recording, including a Nissl picture of the anatomy with electrolytic lesions, CSD, and minimum response latency profile, is given in Fig. 1 (Freeman and Nicholson 1975; Mitzdorf 1985). Locations of cortical neurons were assigned to layers based on a combination of depth estimate of the recording electrode relative to the cortical surface, first spike latency profile, and, if available, CSD. The depth ranges were used as a predominant criterion after verification with latency and CSD measures in several penetrations and were always in accord with established AI laminar boundaries (Mitani et al. 1985; Rouiller et al. 1991; Wallace et al. 1991; Winer 1984a). Assignment differences affected by local changes in cortical thickness were minimized, since depth readings were aligned with a functional estimate of the granular layer position.
Neurons were probed with pure tones, then with one or two presentations of a 15- or 20-min dynamic moving ripple stimulus. The level and frequency of each pure tone were chosen randomly from 15 different levels (5-dB spacing) and 45 different frequencies. Each pure tone was presented five times at a given level and frequency. The ripple stimulus was a temporally varying broadband sound (500–20,000 or 40,000 Hz) composed of about 50 sinusoidal carriers per octave, each with randomized phase (Escabí and Schreiner 2002). The carrier magnitude was modulated by the spectrotemporal envelope. At any given time, the envelope was defined by one spectral and one temporal modulation rate. Spectral modulation rate is defined by the number of spectral peaks per octave. Temporal modulations are defined as the number of peaks per second. Both the spectral and temporal modulation parameters varied randomly. Spectral modulation rate varied between 0 and 4 cycles per octave. The temporal modulation rate varied between −40 Hz (upward sweep) and 40 Hz (downward sweep). Both parameters were statistically independent and unbiased within these ranges. Maximum modulation depth of the spectrotemporal envelope was 40 dB. The mean intensity was set 30–50 dB above the average pure-tone threshold in a penetration.
Data analysis was carried out in MATLAB (The MathWorks, Natick, MA). For each neuron, frequency response areas (FRAs) were computed from the pure-tone responses, whereas the reverse correlation method was used to derive the STRF (Aertsen and Johannesma 1980; deCharms et al. 1998; Escabí and Schreiner 2002; Klein et al. 2000; Theunissen et al. 2000). STRFs were thresholded so that only significant features (P < 0.01) were included in the analysis (Escabí and Schreiner 2002).
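In its simplest form, reverse correlation amounts to averaging the stimulus envelope that preceded each spike. The following Python sketch shows that averaging step only. The function name, the input layout, and the handling of early spikes are illustrative assumptions, and the significance thresholding (P < 0.01) described above is not included.

```python
def spike_triggered_average(envelope, spike_bins, n_lags):
    """Average the stimulus envelope preceding each spike.

    envelope   : envelope[f][t], stimulus envelope per frequency channel and
                 time bin, ideally with the mean removed (assumed layout).
    spike_bins : time-bin index of each spike.
    n_lags     : number of time bins of stimulus history to keep.

    Returns strf[f][lag], the mean envelope value `lag` bins before a spike.
    Spikes without a full stimulus history are skipped (an assumption).
    """
    n_bins = len(envelope[0])
    usable = [t for t in spike_bins if n_lags - 1 <= t < n_bins]
    if not usable:
        raise ValueError("no spikes with a full stimulus history")
    return [
        [sum(chan[t - lag] for t in usable) / len(usable) for lag in range(n_lags)]
        for chan in envelope
    ]
```

In this layout, lag 0 is the bin containing the spike and larger lags reach further back in time, which matches the convention of plotting STRFs against time preceding the spike.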
Modulation properties were derived by computing the two-dimensional Fourier transform of each STRF. The resulting transform, computed with the fast Fourier transform (FFT), is a function of temporal (cycles/s, Hz) and spectral modulation rate (cycles/octave). The magnitude of this function was folded along the vertical midline (temporal modulation frequency = 0) to obtain the ripple transfer function (RTF). Since the Fourier transform is sensitive to periodicities in the STRF, the RTF reflects the relationship of excitatory (on) and suppressive (off) STRF subfields. Thus if the sole STRF feature is an excitatory peak, the RTF will tend to be low-pass in both the temporal and the spectral modulation domains. Strong flanking suppression in frequency and/or in time will tend to produce RTFs that are band-pass in the spectral and/or temporal domain.
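The folding step can be sketched in Python as follows. This is a minimal illustration, not the original implementation: it uses a naive discrete Fourier transform to stay dependency-free (an FFT routine would be used in practice), and it assumes that folding means summing the magnitudes at positive and negative temporal modulation rates. The function names are hypothetical.

```python
import cmath
import math


def dft2(x):
    """Naive two-dimensional DFT of a rectangular list of lists (rows x columns)."""
    n_rows, n_cols = len(x), len(x[0])
    out = [[0j] * n_cols for _ in range(n_rows)]
    for u in range(n_rows):
        for v in range(n_cols):
            acc = 0j
            for m in range(n_rows):
                for n in range(n_cols):
                    phase = -2j * math.pi * (u * m / n_rows + v * n / n_cols)
                    acc += x[m][n] * cmath.exp(phase)
            out[u][v] = acc
    return out


def ripple_transfer_function(strf):
    """Fold |DFT2(strf)| about temporal modulation rate = 0.

    strf rows index frequency (octaves) and columns index time, so DFT rows
    index spectral modulation and DFT columns index temporal modulation.
    Returns rtf[u][v] for u = 0..rows//2 and v = 0..cols//2.
    Assumption: folding sums the magnitudes at +v and -v.
    """
    n_rows, n_cols = len(strf), len(strf[0])
    mag = [[abs(c) for c in row] for row in dft2(strf)]
    rtf = []
    for u in range(n_rows // 2 + 1):
        row = []
        for v in range(n_cols // 2 + 1):
            mirrored = (n_cols - v) % n_cols  # index of the negative temporal rate
            # Zero and Nyquist rates are their own mirror image; do not double them.
            row.append(mag[u][v] if mirrored == v else mag[u][v] + mag[u][mirrored])
        rtf.append(row)
    return rtf
```

An STRF whose only structure is an alternation of on and off values along the time axis concentrates its RTF energy at a nonzero temporal modulation rate, which is the band-pass behavior described above.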
RTFs were used to obtain modulation transfer functions (MTFs). Summing the RTF along the spectral modulation axis yields the temporal modulation transfer function (tMTF) and summing along the temporal modulation axis yields the spectral modulation transfer function (sMTF). MTFs were classified as band-pass if, after identifying the peak in the MTF, values at lower and higher modulation rates decreased by ≥3 dB. If there was no such decrease for low modulation rates the MTF was classified as low-pass. High-pass MTFs were not encountered. Best modulation rate for band-pass MTFs was the rate corresponding to the peak value in the MTF, whereas for low-pass MTFs it was the mean between the zero modulation frequency value and the 3-dB high-side cutoff. MTF width for band-pass MTFs was the difference between the high and low 3-dB cutoff values, whereas for low-pass MTFs the width was the difference between the high-side 3-dB cutoff rate and the zero modulation rate.
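The marginal sums and the classification rule can be sketched in Python as below. Several details are assumptions made for illustration rather than part of the described method: the 3-dB criterion is applied to amplitude (a factor of 10^(−3/20)), cutoffs are taken at the nearest sampled rate without interpolation, and the highest sampled rate is used if the high side never falls by 3 dB. All names are hypothetical.

```python
def mtfs_from_rtf(rtf):
    """rtf[u][v]: rows index spectral modulation, columns temporal modulation.

    Returns (tmtf, smtf): the RTF summed along the spectral axis and along the
    temporal axis, respectively.
    """
    tmtf = [sum(row[v] for row in rtf) for v in range(len(rtf[0]))]
    smtf = [sum(row) for row in rtf]
    return tmtf, smtf


def classify_mtf(rates, mtf, drop_db=3.0):
    """Classify an MTF and return its shape, best modulation rate, and width.

    rates : ascending modulation rates, starting at 0.
    mtf   : MTF value at each rate.
    """
    peak = max(range(len(mtf)), key=lambda i: mtf[i])
    cutoff = mtf[peak] * 10 ** (-drop_db / 20.0)  # amplitude convention (assumed)
    low = next((i for i in range(peak - 1, -1, -1) if mtf[i] <= cutoff), None)
    high = next((i for i in range(peak + 1, len(mtf)) if mtf[i] <= cutoff), None)
    high_rate = rates[high] if high is not None else rates[-1]  # fallback (assumed)
    if low is None:
        # No 3-dB decrease at low rates: low-pass.
        return {"shape": "low-pass", "best_rate": high_rate / 2.0, "width": high_rate}
    if high is None:
        # Not encountered in the reported data; flagged rather than forced.
        return {"shape": "high-pass", "best_rate": rates[peak], "width": None}
    return {
        "shape": "band-pass",
        "best_rate": rates[peak],
        "width": high_rate - rates[low],
    }
```

For example, with rates of 0, 10, 20, 30, and 40 Hz and MTF values of 0.2, 0.6, 1.0, 0.5, and 0.1, this sketch returns a band-pass shape with a best rate of 20 Hz and a width of 20 Hz.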
We analyzed the responses of 696 single units recorded from 43 multichannel probe penetrations to determine the dynamic sound processing properties of neurons across the vertical AI microcircuit. For six animals, the number of obtained and analyzed penetrations was between 4 and 14 (4, 4, 5, 5, 9, and 14). In one animal only two penetrations were collected. We constructed STRFs from the neural responses to dynamic moving ripple stimuli. From the STRFs, we quantified the modulation processing capacity of each neuron. The analysis method is outlined in Fig. 2. The first row of the figure shows the STRFs of two neurons. Each column contains different depictions of their response properties.
Figure 2A shows an STRF with temporal on and off subfields. The on region (red) corresponds to a higher than average (green) stimulus energy at that time before the spike occurrence. The off region (blue) has lower than average stimulus energy. The relationship of on and off subfields determines the response to stimulus envelope modulations. In the temporal domain, the sequence of on and off regions suggests a band-pass property for modulations. The absence of off regions in the spectral domain indicates a low-pass behavior for spectral modulations. We characterized the temporal and spectral modulation processing of the neuron by calculating the magnitude of the two-dimensional FFT of the STRF, which is the RTF (Fig. 2B). Since the STRF in Fig. 2A contains no spectral off regions, or suppressive sidebands, the RTF contains most of its energy near 0 cycle/octave, with decreasing energy at higher spectral modulation rates. The temporal on–off pattern of the STRF results in a band-pass behavior along the temporal modulation axis, with a maximum at about 20 cycles/s.
We derived temporal and spectral MTFs (tMTFs and sMTFs) by summing the RTF along either the spectral or temporal modulation axes, respectively. The tMTF (black) has a band-pass structure, whereas the sMTF (red) is low-pass (Fig. 2C).
A second example neuron is shown in Fig. 2, D–F. The STRF for this neuron has on–off subfields along both the temporal and the spectral axes (Fig. 2D). The RTF of the neuron contains peak energy at 10 cycles/s temporally and at 2 cycles/octave spectrally (Fig. 2E). The on–off patterns—and thus the modulation selectivity of this neuron—are reflected in the MTFs (Fig. 2F), which have a band-pass shape for the sMTF (red) and the tMTF (black).
STRF structure differs between cortical laminae
Receptive field properties can vary widely across the extent of AI, as demonstrated by functional organizations such as tonotopy, bandwidth modules, or binaural bands, where each of these topographies was derived by recording from neurons in thalamorecipient granular layers. Since different sites within AI may vary greatly in their response parameters, across-layer differences have to be assessed within single penetrations. We therefore aligned electrode penetrations orthogonal to the cortical surface and simultaneously recorded from single neurons in different layers to determine how stimulus envelope modulations are processed in the vertical AI microcircuit. An example of STRFs and RTFs from such a penetration is shown in Fig. 3. Each row represents STRFs and RTFs from one neuron. The cortical depth estimate for each single unit is indicated to the left of the STRFs. The main excitatory, or on response (red), reveals a gradual latency progression, proceeding from longer response latencies at shallower depths (500–650 μm) to shorter latencies near layers that receive thalamic input (950–1,100 μm). Latencies remained relatively constant between 950 and 1,600 μm and increased at deeper locations (see Fig. 4J for latency analysis of this penetration).
In addition to latency variations, the example penetration demonstrates layer-dependent changes in STRF structure. From 500 to 1,550 μm, STRFs had a single dominant on subregion, which was temporally followed by off subregions (Fig. 3). From 1,100 to 1,550 μm, secondary excitatory and suppressive features become more prominent alongside the main on–off patterns. STRF features become less consistent at depths >1,550 μm. At these positions, excitatory and suppressive features take more complex shapes, such as elongated subfields, or repeating bands of excitation or suppression. Similar layer-dependent STRF changes were observed across all penetrations.
We quantified the modulation preferences of each neuron by obtaining the RTF from the STRF. The RTFs for the example penetration are depicted next to the STRFs (Fig. 3, right column). Since the STRFs change with depth, the RTFs also evince depth-dependent changes. Along the temporal modulation axis, band-pass (at 650–1,550, 1,700, 2,000 μm) and low-pass (at 500, 1,700, 2,000 μm) behavior can be seen. Variability at single recording sites is also present, since two neurons were recorded at 1,700 μm and two at 2,000 μm, and each site exhibits both low-pass and band-pass profiles. The range of temporal modulation frequencies that elicited a strong response varied between 10 Hz (2,000 μm) and 40 Hz (500, 2,000 μm) in this penetration. Temporal preferences with a single peak (500–1,100 μm) or multiple peaks (1,250, 1,850, 2,000 μm) were encountered in the same penetration. Finally, temporal modulation behavior varies with spectral modulation properties (1,100, 1,250, 1,850, 2,000 μm), with different ranges and shapes of the temporal response profile corresponding to different spectral modulation ranges.
Variations similar to the temporal domain were evident for spectral modulation processing, including band-pass and low-pass responses, single and multipeaked preferred parameter regions, and narrowband and broadband filters. In this penetration, the highest consistency among the RTFs is seen between 650 and 1,550 μm, approximately corresponding to layers IIIb/IV and V. Layer VI shows the greatest shape and location variety of RTF profiles.
RTFs simultaneously display response preferences along spectral and temporal stimulus dimensions. To isolate each dimension, from the RTFs of each neuron we calculated temporal and spectral MTFs (tMTFs, sMTFs). The MTFs isolate each modulation axis and quantitatively characterize how envelope modulation processing is expressed at different cortical depths. For two penetrations, the MTF behavior across cortical depth is shown in more detail (Figs. 4 and 5). The examples serve to illustrate that depth-dependent changes of modulation properties are common and do not follow a stereotyped pattern. For the penetration shown in Fig. 3, temporal and spectral MTFs were collected into matrices, with a color-coded response magnitude (Fig. 4, A and E; red-brown: high value; blue: low value). Each matrix row shows either a tMTF or sMTF, with the cortical depth position annotated to the left. The MTF of each neuron was normalized by its maximum value. In this example, the majority of tMTFs were broadly tuned with 10 neurons responding well to modulation frequencies >30 Hz. The tMTFs maintained a substantial width until a depth of 1,700 μm, at which point the filter function narrowed (Fig. 4A). In this penetration, neurons located at depths between 650 and 1,100 μm (layers III and IV) expressed tMTFs with the highest modulation frequencies.
Best temporal modulation frequency (bTMF), tMTF bandwidth, and tMTF shape were determined for each neuron. bTMF varied with depth, with the lowest values in infragranular layers (Fig. 4B). tMTF width changed from wider to narrower with increasing recording depth (Fig. 4C). However, the widest tMTFs were at depths of 500 and 2,000 μm, indicating a nonmonotonic progression of temporal processing aspects. The shape of the tMTFs also changed with depth. Whereas most neurons from 500 to 1,550 μm had band-pass tMTFs, more low-pass tMTFs were present in layer VI (Fig. 4D). This trend of a higher proportion of band-pass tMTFs in middle layers was present throughout the majority of penetrations (see Fig. 7) and represents a general aspect of interlaminar processing that is reflected in the modulation processing domain.
Spectral modulation properties showed different and more varied patterns with cortical depth (Fig. 4E). sMTFs were narrower at the shallowest and deepest depths and significantly broadened between 1,100 and 1,700 μm (∼layer V). bSMFs and tuning width followed this trend, with the highest values between 1,100 and 1,700 μm (Fig. 4, F and G). Finally, most sMTFs were low-pass, with only two neurons in this penetration having a band-pass characteristic (Fig. 4H).
We also examined the depth distribution of characteristic frequency (CF) and latency (Fig. 4, I and J). How these parameters varied with depth provides two pieces of evidence for the orthogonality of the penetration: 1) the nearly orthogonal orientation of the penetration to the cortical surface is indicated by a nearly constant frequency preference of the neurons (with the occasional exception for the deepest locations) and 2) the latency profile matches the commonly observed interlaminar response time pattern, with the shortest values found around thalamic input layer IV. Across all 43 penetrations in this study, the average CF disparity was 0.1 ± 0.1 octave (mean ± SD) in a penetration, confirming that penetrations were columnar (i.e., of constant cochlear position specificity) in the frequency domain.
The preceding example clearly demonstrates that MTFs are not independent of cortical depth. Although these changes could be similar in different penetrations, they usually did not follow a common transformation scheme and were thus not strongly predictable from one penetration to another. A second example penetration shows a different depth profile, or functional organization, especially for spectral MTFs. Ten neurons between 400 and 1,900 μm were simultaneously recorded (Fig. 5). In this example, bTMFs were quite low and slightly increased with depth (Fig. 5B). In contrast to the previous example, tMTF widths were relatively constant from supragranular to infragranular positions (Fig. 5C). The shapes of the tMTFs were all band-pass (Fig. 5D). Combined, the temporal MTF properties of this penetration make a case for preference constancy within the columnar organization, i.e., with little functional change across layer. The spectral MTFs, however, present a different picture. sMTFs clearly changed with recording position. They were very broadly tuned at superficial locations, and most selective from 850 to 1,200 μm, and again <1,600 μm (Fig. 5G). bSMFs showed a very similar profile (Fig. 5F). With one exception at 864 μm, sMTF shapes were all low-pass (Fig. 5H). For the two example penetrations (Figs. 4 and 5), the temporal modulation depth profile showed strong variations in one case and little variation in the other. The spectral modulation properties changed across laminae in both examples, but were essentially anticorrelated to each other with respect to the direction of sMTF changes. This suggests a diversity in interlaminar spectral modulation processing that may not follow the main flow of information in the vertical cortical microcircuit. Accordingly, recordings in one layer may not predict the spectral modulation properties of other layers.
Global laminar variations of modulation parameters
The previous examples showed that temporal and spectral modulation properties can follow various cortical depth profiles, ranging from near constancy to highly varied, laminar-specific properties. Thus the general principles for how the AI microcircuit transforms modulations remain unclear. To determine whether there is a global transformation scheme across laminae, we first tabulated the mean population values for temporal and spectral modulation properties for different layers (Table 1). To reduce measurement noise due to electrode placement and local functional or anatomical variations, in the following analysis we defined laminar boundaries to be: layer II/III (200–725 μm); layer IV (800–1,100 μm); layer V (1,175–1,475 μm); and layer VI (1,550–2,000 μm). Neurons that fell into the 75-μm intervals between these layer ranges were considered to be of ambiguous designation and were not considered. We found that, on average, best temporal modulation frequency was greatest in layer IV and lowest in layer II/III (P < 0.05, t-test with Bonferroni correction). Mean best spectral modulation frequency (bSMF) was not significantly different across layers. tMTF width was broadest in layers V and VI and smallest in layer II/III (P < 0.05, t-test with Bonferroni correction). sMTF bandwidths were not significantly different across layers. The range of values for each layer was fairly broad, since the SD of the observed bTMFs and bSMFs in each layer was larger than the mean SD across cortical depth for all 43 penetrations (3.98 cycles/s for bTMFs and 0.40 cycle/octave for bSMFs). Both values were significantly smaller than the SD within each layer (P < 0.05, t-test with Bonferroni correction; see Table 1).
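The depth-based layer assignment used for the population analysis can be written as a short Python lookup. The depth ranges are those stated above; the names `LAYER_RANGES_UM` and `assign_layer` and the use of inclusive boundaries are assumptions made for illustration.

```python
# Depth ranges in micrometers, as defined for the population analysis.
LAYER_RANGES_UM = {
    "II/III": (200, 725),
    "IV": (800, 1100),
    "V": (1175, 1475),
    "VI": (1550, 2000),
}


def assign_layer(depth_um):
    """Return the layer label for a recording depth, or None if ambiguous.

    Depths in the 75-um gaps between ranges, or outside all ranges, return
    None and would be excluded from the layer comparison.
    Boundaries are treated as inclusive (an assumption).
    """
    for layer, (shallow, deep) in LAYER_RANGES_UM.items():
        if shallow <= depth_um <= deep:
            return layer
    return None
```

For instance, `assign_layer(650)` returns `"II/III"`, whereas `assign_layer(750)` returns `None` because 750 μm lies in the gap between layers II/III and IV.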
The results in this report necessarily focus on regular-spiking neurons, since they are the predominant cell type in auditory cortex. Another cell type, fast-spiking neurons, may also be identified under specific methodological conditions (Atencio and Schreiner 2008). Fast-spiking cells, which are putative inhibitory neurons, have modulation preferences similar to those of regular-spiking cells. Specifically, not only bTMFs and bSMFs, but also tMTF and sMTF bandwidths, have been shown to be similar (Atencio and Schreiner 2008). We separately analyzed the recorded subset of fast-spiking units to determine whether their modulation preferences varied with layer (Table 1).
In a given penetration, we usually could not identify more than two fast-spiking units. Thus comparing fast-spiking cell laminar properties is challenging, since columnar transformations, if they occur, are most effectively assessed across layers within individual penetrations. By calculating population averages, using group data across penetrations, these changes may be obscured. Since our data set of fast-spiking cells was not large (n = 104, corresponding to 14.9% of all recorded neurons), with only a few fast-spiking cells per penetration, we could not compare within-penetration laminar variations for fast-spiking cells alone. Group comparisons between layers for fast-spiking neurons did not show any significant differences, nor were they different from regular-spiking neurons (Table 1). To determine whether the fast-spiking data could influence the results in this report, we grouped all data together and recomputed the overall statistics. This did not affect the nature of the observed laminar differences or the statistical significance of any result. Thus for the remainder of this report, we focus exclusively on regular-spiking cells.
Pairwise analysis of laminar variations of modulation parameters
Statistically, few layer differences for mean MTF values were evident. This may suggest that either modulation properties are constant in AI columns or that modulation processing may vary between different columns and that pooling data across penetrations obscures this fact, as indicated by the example penetrations (Figs. 4 and 5). To determine whether modulation information follows a general transformation scheme across layers, we compared modulation property changes between layers in individual penetrations. On a neuron-by-neuron basis, MTF values in each layer were compared with those located in other layers. For each possible neuronal pair, we tallied the difference between the MTF values and the resulting difference values were then pooled across all penetrations. An example of this procedure for layer IV and layer VI bTMF values is shown in Fig. 6A. Since this procedure is an explicit signed difference test, we performed Wilcoxon signed-rank tests to determine statistical significance (Bain and Engelhardt 1992).
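The pairwise procedure can be sketched in Python as follows. The data layout (each penetration as a list of layer/value tuples) and the function name are hypothetical, and the sketch covers only the pooling of signed differences, not the significance test.

```python
from itertools import product
from statistics import median


def pooled_pairwise_differences(penetrations, layer_a, layer_b):
    """Pool signed differences (layer_a minus layer_b) across penetrations.

    penetrations : list of penetrations, each a list of (layer, value) tuples,
                   where value is an MTF parameter such as bTMF (assumed layout).
    Only neurons from the same penetration are paired with each other.
    """
    diffs = []
    for neurons in penetrations:
        a_vals = [value for layer, value in neurons if layer == layer_a]
        b_vals = [value for layer, value in neurons if layer == layer_b]
        # Every neuron in layer_a is paired with every neuron in layer_b.
        diffs.extend(a - b for a, b in product(a_vals, b_vals))
    return diffs


# Made-up values for illustration only; these are not data from the study.
example = [
    [("IV", 12.0), ("IV", 10.0), ("VI", 8.0)],
    [("IV", 9.0), ("VI", 11.0), ("II/III", 5.0)],
]
example_diffs = pooled_pairwise_differences(example, "IV", "VI")
example_median = median(example_diffs)
```

The pooled list can then be passed to a signed-rank test, for example `scipy.stats.wilcoxon`, to ask whether the differences are centered on zero. Because pairs are formed only within a penetration, differences between recording sites do not contribute to the layer comparison.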
The pairwise analysis shows that bTMFs systematically differ between some AI layers. Layer IV contained neurons with slightly higher bTMF values than those of neurons in layers II/III and VI (Fig. 6; P < 0.05, signed-rank test). The largest bTMF difference was between layer IV and layer VI (Fig. 6A). Layer V contained neurons with higher bTMF values than those in layer VI. These results indicate that best temporal modulation frequency follows a general progression within the AI microcircuit, with a gradual, although modest, reduction from layer IV to II/III and to VI. However, even for the largest difference in bTMF between layers IV and VI (median: 2.74 cycles/s difference), the overall effect was modest and the difference distribution was wide, including positive and negative differentials.
tMTF bandwidths showed only two significant differences (Fig. 6B): both layer V and layer VI had significantly broader temporal tuning than that of layer II/III. These differences were likely due to the more variable structure of the STRF in lower layers (see Figs. 3, 4, and 5). As noted earlier, excitation is usually followed by suppression, although in deeper layers the relationship is less stereotypic, resulting in a variety of STRF structures and, consequently, modulation filters. Since layer II/III projects to layers V and VI in the AI microcircuit, the broader tuning of neurons in layers V and VI implies that these layers must receive noncolumnar inputs from other locations in AI, inputs from other cortical fields, and/or from additional thalamic sources (see Fig. 9 for further analysis).
The difference procedure revealed that bSMFs also varied with cortical position. Layer IV had significantly lower bSMFs than those of layers V and VI (Fig. 6G; P < 0.05, signed-rank test). Layer V neurons had the highest bSMFs. Thus in contrast to bTMFs, the highest bSMFs occur outside thalamic input layers.
sMTF bandwidth analysis showed that bandwidths were significantly narrower in layer IV than those in layers V and VI (Fig. 6, D and G; P < 0.05, signed-rank test). Thus for spectral modulation, lower layers process a greater range of frequencies than those in thalamic input layer IV. This reflects differences in spectral sideband strength. Strong sideband inhibition (spectral off subfields) was often seen in layer IV (Fig. 3), although it was less prevalent in deeper layers. A decrease in sideband suppression leads to more broadly tuned, low-pass sMTFs (Fig. 2). sMTF broadening and the decrease in spectral sideband suppression are consistent with previous results showing that spectral integration increases in infragranular layers (Atencio and Schreiner 2005; Volkov and Galazyuk 1989).
MTF shape variation with cortical layer
MTF shape changed with layer (Figs. 4 and 5). The majority of tMTFs were band-pass, whereas most sMTFs were low-pass. The laminar distribution of tMTF shape is shown in Fig. 7A. The number above each histogram bar lists the number of neurons in the specified bin. The distribution was biased toward band-pass tMTFs, which result from temporal on–off STRF patterns. tMTF shapes were not homogeneously distributed across layers. Band-pass tMTFs were most concentrated in layers III and IV, regions that contain lemniscal thalamic input. Infragranular layers showed an increase in the proportion of low-pass tMTFs, which suggests a disjunction between processing in thalamic input and predominantly corticocortical output layers, versus infragranular corticothalamic and corticocollicular output layers.
The majority of sMTFs were low-pass and, compared with tMTFs, sMTF shapes were distributed more evenly across layers, perhaps with the exception of layer II, which showed a lower proportion of band-pass neurons (Fig. 7B). The low proportion of spectral modulation band-pass filters is a direct consequence of the prevalence and strength distribution of sideband inhibition (see Fig. 3).
Local schemes for temporal and spectral modulation processing
The results revealed that there are some global differences across laminae for best modulation frequency, MTF bandwidth, and MTF shape. The distribution of these MTF parameters is wide, as indicated by the range in the difference histograms (Fig. 6). This raised the question: Is there evidence of a stereotypical laminar transformation of modulation information? We addressed this by examining best modulation frequency (BMF) on a penetration-by-penetration basis. If different sites in AI process modulations similarly, then only a few transformations of best modulation frequency should be present. For example, if bTMFs are systematically and progressively lowered between thalamic input layers and output layers, then the majority of our penetrations should reveal this.
To determine whether stereotypical transformations exist, we calculated the mean BMF for each layer in a penetration (Fig. 8, A and B: individual penetrations in gray; mean value across all penetrations in black). The mean BMF trajectory follows the global variation of bTMF and bSMF (Fig. 8, A and B, Table 1). When the individual trajectories are examined, their diversity is apparent. Normalizing the individual trajectories to the BMFs in layer IV yields a similarly complex picture that expresses considerable variability across layers. Individual trajectory curves follow many different patterns in each penetration, showing that there may not be a stereotypical transformation of modulation properties across AI layers.
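A per-penetration BMF trajectory of this kind could be computed roughly as follows. The sketch is illustrative only: the layer grouping, the function names, and the example values are hypothetical, and normalization is assumed here to mean division by the layer IV mean, which the text does not state explicitly.

```python
from statistics import mean

# Assumed layer grouping for the trajectory (hypothetical choice).
LAYERS = ("II/III", "IV", "V", "VI")


def bmf_trajectory(penetration):
    """Mean BMF per layer; None where no neuron was recorded in a layer."""
    return {
        layer: mean(penetration[layer]) if penetration.get(layer) else None
        for layer in LAYERS
    }


def normalize_to_layer4(trajectory):
    """Express each layer's mean BMF relative to layer IV (assumed: a ratio)."""
    reference = trajectory["IV"]
    if not reference:
        raise ValueError("layer IV mean is missing or zero")
    return {
        layer: (value / reference if value is not None else None)
        for layer, value in trajectory.items()
    }


# Hypothetical BMF values for one penetration.
penetration = {"II/III": [8.0, 10.0], "IV": [12.0], "V": [9.0, 15.0], "VI": [6.0]}
trajectory = bmf_trajectory(penetration)
normalized = normalize_to_layer4(trajectory)
```

Plotting one such normalized trajectory per penetration gives the kind of overlay in which diverging patterns become visible.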
To quantify modulation differences between the main thalamic input layer, layer IV, and the various output layers, we calculated the proportion of penetrations that showed the same modulation preference or exhibited an increase or decrease relative to layer IV. The mean best modulation value of a layer was considered the same as the value for layer IV when it differed by <1 SD of the layer IV distribution. Temporal BMFs in layers II/III, V, and VI were the same as those in layer IV in 69% of the penetrations (Fig. 8E). This suggests that the lack of cumulative differences between layers (Fig. 8A) is also reflected in most of the individual penetrations and may be considered a sign of a columnar organization of temporal modulation properties. However, output layers in >30% of the penetrations showed substantial increases or decreases in bTMF relative to layer IV, indicative of substantial, perhaps location-specific, interlaminar processing of temporal modulation information.
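The same/increase/decrease criterion can be illustrated with a short Python sketch. The names and data are hypothetical, and the SD of the layer IV distribution is passed in as a parameter because the text does not specify exactly which set of neurons it was computed from.

```python
from statistics import mean


def classify_layer(layer_values, layer4_values, layer4_sd):
    """Label one output layer in one penetration relative to layer IV.

    'same' when the layer means differ by less than one layer IV SD;
    otherwise 'increase' or 'decrease' according to the sign."""
    delta = mean(layer_values) - mean(layer4_values)
    if abs(delta) < layer4_sd:
        return "same"
    return "increase" if delta > 0 else "decrease"


def proportions(labels):
    """Fraction of penetrations falling into each category."""
    return {k: labels.count(k) / len(labels) for k in ("same", "increase", "decrease")}


# Hypothetical bTMF values (cycles/s) and an assumed layer IV SD of 3.0.
layer4_sd = 3.0
penetrations = [
    {"IV": [10.0, 12.0], "VI": [9.0]},
    {"IV": [14.0], "VI": [8.0, 9.0]},
    {"IV": [6.0], "VI": [10.0]},
    {"IV": [11.0], "VI": [12.0]},
]
labels = [classify_layer(p["VI"], p["IV"], layer4_sd) for p in penetrations]
summary = proportions(labels)
```

In this toy example, half of the penetrations are labeled as unchanged between layer IV and layer VI.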
By contrast, best spectral modulation frequencies in output layers differed from those in layer IV in 73% of the columnar penetrations (Fig. 8F). The majority of sites in output layers showed an increase in bSMF compared with that in layer IV (mean proportion: 43%), with the highest values for layer VI. This suggests that the lack of cumulative differences between layers for spectral modulation preference (Fig. 8B) is not reflected in the majority of individual penetrations and may be considered as evidence of substantial interlaminar processing for spectral modulation, resulting in a noncolumnar functional organization. This analysis ruled out a stereotypical transformation of response preference for envelope modulation frequencies in the AI microcircuit.
Latency and electrical stimulation studies have identified a basic ordinal progression of interlaminar processing steps, leading from the lemniscal input layers (IIIb and IV) to the supragranular layers (II and IIIa), and from there to infragranular layer V (Mitani et al. 1985). Furthermore, studies of the transformation of spectrotemporal receptive fields between the auditory thalamus and cortex have indicated that layer IV excitatory receptive fields may be built by three schemes of functional convergence. First, cortical excitatory receptive fields can be directly inherited from thalamic neurons. They may also be constructed from several thalamic inputs, which combine to create the total excitatory area of the cortical cell. Last, they may be built from an ensemble of thalamic inputs, with the common receptive field components creating the cortical receptive field (Miller et al. 2001a,b, 2002). In analogy to the analysis of sets, the inheritance scheme is an “Identity” operation on the receptive field components of the thalamic inputs and thus the cortical receptive field is essentially identical to the thalamic receptive fields; the construction scheme is a “Union” operation and the ensemble scheme implements an “Intersection” operation. To test whether similar principles of convergence hold for progressive interlaminar modulation processing, we calculated the congruity, or relative overlap, of modulation transfer functions between the main consecutive stations in the laminar processing sequence, i.e., between layers IV and II/III and between layers II/III and V. Here, for a connection between layers IV and II/III, the source was presumed to be layer IV and the target was layer II/III. Congruity between the MTFs in different layers was assessed by first calculating the area of intersection between MTFs. We then divided this value by either the extent of the presumed source MTF (“overlap”) or the presumed target MTF (“coverage”; see diagrams in Fig. 9). 
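The overlap and coverage measures can be sketched as below. This is one possible reading, not the study's implementation: it assumes both MTFs are nonnegative and sampled on the same uniformly spaced modulation-frequency axis, and it takes the area of intersection to be the sum of the pointwise minimum. The function name and example values are hypothetical.

```python
def congruity(source_mtf, target_mtf):
    """Return (overlap, coverage) for two MTFs on a shared frequency axis.

    overlap  = intersection area / area of the presumed source MTF
    coverage = intersection area / area of the presumed target MTF
    Assumption: intersection area is the sum of the pointwise minimum."""
    if len(source_mtf) != len(target_mtf):
        raise ValueError("MTFs must be sampled on the same axis")
    source_area = sum(source_mtf)
    target_area = sum(target_mtf)
    if source_area <= 0 or target_area <= 0:
        raise ValueError("MTFs must have positive area")
    intersection = sum(min(s, t) for s, t in zip(source_mtf, target_mtf))
    return intersection / source_area, intersection / target_area


# Hypothetical peak-normalized MTFs: a narrow source (e.g., layer IV)
# and a broader target (e.g., layer II/III).
source = [0.0, 0.5, 1.0, 0.5, 0.0]
target = [0.5, 1.0, 1.0, 0.5, 0.0]
overlap, coverage = congruity(source, target)
```

Here the source MTF lies entirely within the broader target MTF, so overlap is 1 while coverage is about 0.67.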
Among the 32 simultaneously recorded layer IV and II/III neuronal pairs and the 30 layer II/III and V pairs, the majority showed high congruity. This is reflected in the high concentration of data points in the upper right quadrant of the congruity plots (Fig. 9, A–D). This quadrant corresponds to conditions of high overlap, high coverage, and high alignment. The dotted lines in Fig. 9 indicate that the amount of MTF overlap and/or coverage between the two layers is generally >60%. Conceptually, this region also reflects conditions characteristic of the inheritance principle of receptive field generation. Points above the diagonal tend toward the ensemble convergence principle, whereas points below the diagonal have elements of constructive convergence, i.e., they require superposition of several smaller source subfields.
For the layer IV to layer II/III progression, the data are equally distributed above and below the diagonal. For the progression from layer II/III to V, about twice as many congruity estimates are below the diagonal, suggesting a larger contribution of constructive convergence for that processing step.
Overall, the relatively large values for overlap and coverage between the main consecutive processing stages in the interlaminar microcircuit suggest that MTF modifications between layers are fairly modest. However, MTFs do change between layers, either by shifts in BMFs or by changes in width, as indicated by the high percentage of off-diagonal positions in Fig. 9.
Evidence in support of a precise anatomical lamination of auditory cortex is manifold and compelling (Kelly and Wong 1981; Mitani et al. 1985; Winer 1984a,b; Winguth and Winer 1986). Laminar borders, defined by cell structure, connections, or chemical anatomy, are precise to within a few micrometers, as is the spatial segregation of afferents. Each layer differs in its neuronal architecture and cytoarchitecture, GABAergic organization, thalamic input, commissural input and output, corticocortical input and output, and corticofugal projections to the telencephalon and brain stem (Winer 1992). A basic feature of AI is that certain response parameters, especially best frequency, are conserved across cortical depth (Abeles and Goldstein Jr 1970; Phillips and Irvine 1981). Few physiological studies have found systematic laminar differences in auditory-receptive field properties. It is possible that some differences may have been obscured by limiting measurements to granular layers or by pooling data across laminae or from different functional modules. Previous studies of laminar differences and columnar processing in auditory cortex have failed to reach a consensus on whether auditory-receptive properties systematically differ across layers (see Linden and Schreiner 2003). Studies of cat auditory cortex find layer-dependent variations in minimum response latency, with the shortest latencies in the thalamorecipient middle layers (Mendelson et al. 1997; Phillips and Irvine 1981), but investigations of rodent auditory cortex find the shortest response latencies in deeper layers (Shen et al. 1999; Sugimoto et al. 1997). Laminar differences in frequency tuning bandwidths, intensity thresholds, and other response properties are seen in some studies of cat, bat, and rodent auditory cortex (Dear et al. 1993; Eggermont 1996; Norena and Eggermont 2002; Sugimoto et al. 1997), but not in others in the same species (Abeles and Goldstein Jr 1970; Clarey et al. 1994; Foeller et al. 2001; Jen et al. 1989; Phillips and Irvine 1981).
Ambiguities regarding columnar organization warrant a reevaluation to determine both whether there is a functional columnar organization in AI besides the alignment with regard to the receptor surface and how local circuit deviations relate to the spectral and temporal properties in layers IIIb/IV. Previous studies of STRFs in the thalamic input layers of AI revealed that the spectral and temporal modulation properties undergo distinct transformations that are not highly predictable from thalamic inputs (Miller et al. 2001a,b, 2002; Winer et al. 2005). The local shaping of these properties makes them ideal candidates to explore whether they vary systematically within an auditory cortical column and in what way they might be a substrate of columnar computation.
Our approach with radially oriented multielectrode arrays had several key advantages and significantly extended previous research. First, we simultaneously recorded from single neurons across the full cortical depth, i.e., all neurons were recorded under the same physiological conditions. Second, we focused on an analysis that was penetration centered. The heterogeneous distributions of physiological response properties across AI, reflected in parameter gradients and multiple, functionally diverse patches (Schreiner and Winer 2007), can strongly confound any population correlations between receptive field types and laminar position. Third, we used a complex, dynamic stimulus that allowed us to simultaneously probe temporal and spectral modulation processing via STRFs and MTFs. This was a significant extension, since previous reports have not obtained STRFs from multiple simultaneously recorded neurons and they have not related the spectral or temporal modulation properties to functional anatomy. Fourth, we inspected the action potential waveform of each neuron and separated regular-spiking from fast-spiking neurons to reduce potential receptive field variability due to different cell types (Atencio and Schreiner 2008).
One important issue that our work does not resolve is how modulation properties are represented topographically within AI. The methodologies required for such an analysis conflict with the aims of this study, since broad cortical mapping and detailed single-unit recording from all laminae are not very compatible methods. Cortical mapping requires short recordings at many locations, whereas detailed single-unit analysis necessitates quite long recordings (hours) per location to fully capture the response properties, thus severely limiting the number of sites within a given experiment. Given that topographic variations in response latency (Mendelson et al. 1997) and spectral bandwidth (Schreiner and Mendelson 1990) have been documented, it is likely that temporal and spectral modulation properties are not uniformly distributed in AI. The impact of such nonuniform distributions on the laminar analysis offered here is an important issue that will require more attention in future work.
Temporal modulation processing
In roughly 70% of the penetrations, the preferred temporal modulation frequencies did not significantly differ between thalamorecipient and supra- or infragranular output layers. This suggests a functionally fairly uniform columnar organization for temporal modulation information with limited intracortical columnar processing. However, about 30% of the penetrations did show laminar differences for preferred modulation, indicating that such processing can take place. The global trend was that layer IV had the highest preferred modulation frequencies and layers II/III and VI had the lowest values. This trend is in line with the generally observed reduction in temporal modulation-following capacity from auditory nerve to cortex (Joris et al. 2004). The observation of laminar differences in temporal frequency tuning has also been made in the somatosensory barrel cortex (Ahissar et al. 2001) and in visual cortex of squirrels (Heimel et al. 2005) and cats (Hawken et al. 1996). Although the temporal modulation progression in the layers of cat primary visual cortex (V1) is similar to the progression seen here in AI, in squirrel V1 the highest temporal tuning was found in layer II/III, suggestive of species- and/or receptor-specific processing schemes.
The >60% overlap between the temporal modulation ranges of AI source and target layers for most neurons is quite high and suggests a high proportion of inheritance of the modulation parameters with some smaller, approximately equal contributions from ensemble convergence and constructive convergence (Fig. 9). The observation that band-pass MTFs were more common in thalamorecipient layers than in output layers points to significant contributions of additional corticocortical and/or nonlemniscal thalamic inputs to the shaping, and transformation, of temporal modulation capacity outside layers IIIb/IV, in accord with general connectivity schemes (Lee and Winer 2005). At this stage, it remains unclear whether and how physiological, contextual, task-related, or behavioral conditions or constellations may influence the degree of columnar response uniformity for temporal modulation.
Spectral modulation processing
The analysis of spectral modulation preferences revealed a somewhat different picture. Differences in preferred modulation range were much more common for spectral modulation (in ∼70% of the penetrations) than for temporal modulation. This suggests a significantly weaker organization principle of columnar constancy for spectral modulation than that for temporal modulation. The direction of change of spectral modulation preference between different laminae was more variable, resulting in a large interlaminar variance. Broadening and increase in preferred spectral modulation frequencies can be accounted for by the shift of the strength and location of inhibitory sidebands in infragranular layers. However, target layer responsiveness to higher modulations that are not contained in the source layer clearly requires additional inputs not provided by a simple columnar feedforward stream from thalamorecipient layers. Another difference from temporal modulation processing was that both main laminar transitions, from layer IV to layer II/III and from layer II/III to layer V, showed a larger proportion of ensemble convergence than constructive convergence. Thus it appears that temporal and spectral modulation processing may follow different general principles in their interlaminar modifications or transformations.
Hierarchical modulation processing?
In a simple framework, receptive fields in cortical input layers may be created via three general schemes: they are predominantly built by thalamic inheritance, constructive convergence of different narrow thalamic and cortical inputs, and/or by ensemble convergence of combined, broader thalamic and cortical inputs (Miller et al. 2001b). After this initial integration stage, further transformations are likely to occur and will be related to the primary interlaminar flow of information from thalamocortical input layers to supragranular, and then to infragranular, output layers (Mitani and Shimokouchi 1985; Mitani et al. 1985; Wallace et al. 1991). Some of the observed transitions and laminar differences in modulation preferences support this sequential or hierarchical scheme, such as a stepwise reduction in temporal-following ability. However, other laminar differences, such as the relative reduction in the proportion of band-pass temporal filters in the output layers or the increase and broadening of preferred spectral modulation properties in infragranular layers, are less easily explained by a purely hierarchical or sequential processing scheme. It may be more parsimonious to view AI as composed of many functional microcircuits, some clustered within modules, with globally similar connection patterns, but highly specific input and processing attributes that vary depending on the tasks of the targeted subcortical and cortical circuits (Schreiner and Winer 2007).
Columnar organization in auditory cortex
The current study is set against considerable debate over the function of cortical columns in sensory systems. The canonical view is that larger, functionally defined cortical columns, or domains, are composed of many individual microcolumns, with each microcolumn containing a constant representation of a receptive field parameter (Mountcastle 1997). For example, in striate cortex, cells with the same eye preference, a fundamental feature of visual processing, are grouped into ocular dominance columns. In contrast to this canonical view, ocular dominance columns are not present in all species with well-developed stereoscopic vision (Horton and Adams 2005). Even more challenging is that within a species, and even within the same field, these columns may not be uniformly expressed (Adams and Horton 2003). Recent results have shown that when simple stimuli are used, columns for stimulus orientation are present in V1, although changing the stimulus class changes the location of these columns (Basole et al. 2003). This indicates that these simple stimuli probed only a restricted region of the stimulus space to which the neuron responds and did not adequately probe the complete neural receptive field. Thus the location of these columns is not fixed, but stimulus dependent. Our study is consistent with many of these findings, since a stereotypic transformation of modulation information within a column is not present across different regions within auditory cortex. Our results are not likely hampered by stimulus complexity, since we used a rich stimulus set that featured dynamic elements present in natural sounds and that was distributed across the full range of relevant stimulus constellations (Escabí et al. 2003). The results from visual cortex, as well as ours (i.e., that stimulus constancy and/or transformational uniformity are not universally expressed), are consistent with the highly detailed, though variable, structure of cortical microcircuits.
Although a stereotyped excitatory circuit appears to be present in sensory cortex (Douglas and Martin 2004), different proportions of neuron classes, between and even within layers at different locations within the same cortical field, make it challenging to uncover a stereotyped and universal columnar transformation of acoustic information (Silberberg et al. 2002; Winer 1992). Further, neuronal parameter sensitivity and selectivity may change with adaptation, task conditions, context, and attention, consistent with nonlinear and recurrent processing, but not necessarily with a columnar organizing principle (Basole et al. 2003). Combined, these considerations reinforce the impetus for further work in auditory cortex so that possible columnar stimulus transformations may be identified or excluded (Linden and Schreiner 2003).
Our study is consistent with the idea that the AI microcircuit should be understood according to the different tasks and requirements of the auditory system and how cortical connection patterns subserve these tasks. Simple stereotypical maps across all laminae, which are repeated across the spatial extent of AI, can be excluded as a dominant computational principle. Although this tentative conclusion may be correct, other possible interpretations of our results may also need to be considered. First, it is conceivable that the main functions of AI circuits may remain hidden when applying simple, stimulus-based parameter analyses. For the processing rules to emerge fully, a more task-dependent analysis, including determining more complete and higher-order receptive field properties, may have to be performed (Fritz et al. 2007; Nelken 2008). Second, the manner in which stimulus information is processed may be a more relevant organizing principle for AI than the encoding of acoustic content itself. In this framework, increased nonlinear dynamics may emerge as information moves from input to output layers (Ahmed et al. 2006), analogous to the different nonlinearities inherent in simple and complex cell processing in the cat primary visual cortex (Hubel and Wiesel 1962; Martinez and Alonso 2003). This may be particularly germane for the present report, since here we focused on acoustic content, although, as we have demonstrated previously, how the acoustic elements are computationally processed contributes substantially to AI coding and likely also undergoes systematic changes within and across columns (Atencio et al. 2008).
We have demonstrated that spectral and temporal modulation parameters, unlike characteristic frequency, can vary significantly across AI layers. The behavior of temporal and spectral modulation processing is dissimilar in that temporal modulation has a stronger tendency for columnar constancy, i.e., layer-independent behavior. The direction of parameter changes is not tightly linked to a simple interlaminar information flow pattern from thalamic input layers to supra- and infragranular output layers. This suggests an important and novel insight into the laminar processing of auditory cortex, since it is not assured that stimulus parameters will vary in parallel with position in the AI microcircuit. This implies that studying fully defined receptive fields is a more effective and efficient approach when evaluating columnar processing, since studies that rely on single-parameter variations may miss important columnar transformations. Thus STRFs make it more feasible to dissect laminar-specific, module-specific, and field-specific variations in the cortical processing regime and can help to determine whether common functional patterns pertain to cortical or subcortical inputs and how they reflect local, lamina-specific circuitry.
This work was supported by National Institutes of Health Grants DC-02260 and MH-077970 and by Hearing Research Inc. (San Francisco, CA).
We thank A. Tan, M. Heiser, K. Imaizumi, B. Philibert, J. Shih, and K. Yuan for experimental assistance; M. Kvale for use of the SpikeSort 1.3 Bayesian spike-sorting software; and the late J. A. Winer for comments on the manuscript and for assistance with histological processing.
- Copyright © 2010 the American Physiological Society