Adaptive Changes in Cortical Receptive Fields Induced by Attention to Complex Sounds

Jonathan B. Fritz, Mounya Elhilali, Shihab A. Shamma


Receptive fields in primary auditory cortex (A1) can be rapidly and adaptively reshaped to enhance responses to salient frequency cues when using single tones as targets. To explore receptive field changes to more complex spectral patterns, we trained ferrets to detect variable, multitone targets in the context of background, rippled noise. Recordings from A1 of behaving ferrets showed a consistent pattern of plasticity, at both the single-neuron level and the population level, with enhancement for each component tone frequency and suppression for intertone frequencies. Plasticity was strongest near neuronal best frequency, rapid in onset, and slow to fade. Although attention may trigger cortical plasticity, the receptive field changes persisted after the behavioral task was completed. The observed comb filter plasticity is an example of an adaptive contrast matched filter, which may generally improve discriminability between foreground and background sounds and, we conjecture, may predict A1 cortical plasticity for any complex spectral target.


When an animal engages in an auditory behavior and focuses attention on salient acoustic cues, its cortical receptive fields can adaptively reshape their properties in a manner consistent with enhancing salient, task-relevant auditory object perception and thus improving task performance. Such receptive field changes have been shown in the primary auditory cortex after extended training over a period of days or weeks on a variety of tasks such as discriminating changes in tone frequency (Blake et al. 2002, 2006; Galvan and Weinberger 2002; Polley et al. 2006; Recanzone et al. 1993; Rutkowski and Weinberger 2005), tone loudness (Polley et al. 2004, 2006), tone sequence direction (Brosch et al. 2005; Selezneva et al. 2006), and temporal modulation (Bao et al. 2004; Beitel et al. 2004). Extended training, however, is not essential to achieve behaviorally driven receptive field adaptation in the auditory cortex. Receptive fields have been shown to adapt after a brief behavioral session of classical conditioning lasting only 5 min (Edeline and Weinberger 1993).

Recently, rapid, dynamic changes in receptive field shape have also been observed over a time course of minutes in ferret primary auditory cortex (A1) during a variety of auditory behavioral tasks such as single-tone detection and two-tone-frequency discrimination (Fritz et al. 2003, 2005a,b). For example, when an animal detected a single foreground (target) tone amid a variable sequence of background (reference) broadband noise bursts (Fritz et al. 2003), many spectrotemporal receptive fields (STRFs) rapidly changed their shape at the target frequency. Specifically, STRFs developed an increased excitatory (or weakened inhibitory) sensitivity in a narrow zone centered at the frequency of the target tone and, in contrast, suppressed responses immediately outside this zone (Fritz et al. 2003). Furthermore, in many cells, this plasticity persisted for minutes to hours after completion of the task.

The current research was motivated by some of the questions arising from these earlier studies. 1) What profile of STRF changes would be predicted in tasks with more complex, broadband, spectral targets such as a multitone target, compared with a single-tone target? 2) Are plastic changes equally likely throughout the entire receptive field, or do they occur primarily in the vicinity of the center of the STRF? 3) What is the time course of such STRF plasticity after completion of the task? Do such persistent changes represent a form of sensory memory for the previous task? 4) What is the role of attention in mediating the onset of task-related receptive field plasticity? Does it reflect global attention to the overall task and gestalt contrast between auditory stimuli, or selective attention to salient, specific acoustic features of the auditory target?

In the current work we describe rapid plasticity induced in A1 cells during performance of a detection task where the target was a chord, consisting of a complex of tones of different spacing. We report findings at both the single-unit level and the population level that address the four questions just raised. 1) Regions with enhanced response emerged in the STRFs at the frequencies of the target tones, whereas fields with suppressed response were induced between the tones. 2) Receptive field changes were largest in the frequency regions closest to the center of the initial STRF. 3) In many cells, plasticity persisted immediately after performance of the multitone detection task and, in some cells, it continued to build up for at least 1 h. Such persistent receptive field changes were reflected at the population level for the multitone detection task and were also found in other behavioral tasks.

In our current behavioral physiology experiments, the target chord was fixed throughout individual behavioral sessions. We wondered whether the animals, during task performance discriminating the target from reference, attended to a global difference in the spectral structure of the two categories of sounds or, instead, attended selectively to a specific individual tone component in the target chord. Based on the results of further experiments in which the animal had no difficulty in performing the task when random-, rather than fixed-tone chord targets were used during behavioral sessions, we infer that 4) during task performance, animals were likely to be attending to the global differences between target and reference stimuli, sound categories that were learned during training as part of the acquired “task rules.” Furthermore, we hypothesize that the exact form of STRF plasticity is automatically driven by the acoustic structure of these two stimulus categories. In particular, we propose that the STRFs are reshaped so as to enhance the contrast between the foreground (target) and background (reference) stimuli—a form of an adaptive contrast matched filter (Fritz et al. 2007).


Four adult female ferrets were used in these experiments. Three were behaviorally trained and one was a behaviorally naïve control. All experimental procedures used in this study were approved by the University of Maryland Animal Care and Use Committee and were in accord with National Institutes of Health Guidelines.

Behavioral training

Three adult ferrets were trained on a multitone-detection task using a conditioned-avoidance procedure (for further details of training procedure see Fritz et al. 2003, 2005a; Heffner and Heffner 1995). Ferrets licked water from a spout while listening to a sequence of reference stimuli [drawn from a set of 30 temporally orthogonal ripple combinations (TORCs) (Klein et al. 2000)] until they heard a complex-tone target. When presented with a target, the animals were trained to stop licking, to avoid a mild shock. A trial consisted of a sequence of reference stimuli (randomly ranging from one to six TORCs) followed by a multitonal target (except on catch trials in which seven reference stimuli were presented with no target). The ferrets were trained daily (∼100 trials/session) in a sound-attenuated test box (IAC) until they reached criterion, defined as consistent performance on the detection task for different multitonal targets (each multitonal complex had at least three components, in the range from 125 Hz to 16 kHz) for two sessions with >80% hit rate accuracy and >80% safe rate for a discrimination rate >0.65. Initial training to criterion in the free-running test box took about 6 wk for each ferret. After recovery from head-post implantation, the ferrets were habituated to head restraint in a customized Lucite horizontal holder over a period of 1–2 wk, and then retrained on the task for an additional 1–2 wk while restrained in the holder (further details in Fritz et al. 2005a). The task-naïve control ferret received no behavioral training but, like the other head-post–implanted ferrets, also received gradual habituation to head restraint in the holder, before physiological recording commenced.
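The trial structure described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `make_trial`, the uniform choice of the number of references, and the drawing of TORCs with replacement are assumptions, because the text states only the ranges (one to six references before a target, seven references on a catch trial, 30 TORCs in the set).

```python
import random

N_TORCS = 30      # size of the reference set stated in the text
MAX_REFS = 6      # a target trial contains one to six reference TORCs
CATCH_REFS = 7    # a catch trial contains seven references and no target


def make_trial(rng, catch=False):
    """Return (list of reference TORC indices, has_target) for one trial.

    Assumption: the number of references is uniform on 1..6 and TORCs
    are drawn with replacement; the text does not specify either detail.
    """
    n_refs = CATCH_REFS if catch else rng.randint(1, MAX_REFS)
    refs = [rng.randrange(N_TORCS) for _ in range(n_refs)]
    return refs, not catch


rng = random.Random(0)
refs, has_target = make_trial(rng)
print(len(refs), has_target)
```

Because the number of references varies from trial to trial, the animal cannot predict target onset from elapsed time alone, and the catch trials provide a way to estimate responses in the absence of a target.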


To secure stability for electrophysiological recording, a stainless steel head post was surgically implanted on the skull. The ferrets were anesthetized with a combination of Nembutal (40 mg/kg) for induction and halothane (1–2%) for maintenance of deep anesthesia throughout the surgery. In a sterile procedure, the skull was surgically exposed and the head post was mounted using dental cement, leaving clear access to primary auditory cortex in both hemispheres. Antibiotics and postsurgery analgesics were administered as needed after surgery.

Neurophysiological recording

Experiments were conducted in a double-walled, sound-attenuation chamber (IAC). Small craniotomies (<1 mm in diameter) were made over primary auditory cortex before recording sessions that lasted 6–8 h. We recorded with 3- to 8-MΩ tungsten electrodes (FHC). Responses from each microelectrode were recorded and then stored, filtered, and spike-sorted off-line. Electrode location in A1 was based on the presence of distinctive A1 physiological characteristics (such as latency and tuning) and the position of the neural recording relative to the cortical tonotopic map in A1 (Bizley et al. 2005; Nelken et al. 2004; Shamma et al. 1993). As in previous studies (Fritz et al. 2003, 2005a), we used multiple criteria for acceptable single-unit recordings from A1: 1) clear, short-latency auditory responses to pure-tone stimuli; 2) rapidly measurable on-line multiunit STRFs with only a few (one to five) TORC repetitions (suggesting a large linear component in the neuronal responses); 3) at least one unit in the multiunit cluster whose spike waveform had an amplitude more than fivefold the baseline noise level; 4) stability of the recording (persistence of the same waveform throughout the recordings for at least one unit); 5) distance of >150 microns in depth from any previous recordings. Single units (typically one to two neurons per electrode, but up to four neurons per electrode in a few cases of clear waveform separability) were isolated using off-line spike-sorting techniques with customized MATLAB software. In our spike-sorting program, user-defined templates were constructed with multiple amplitude time windows, to isolate each spike shape. The window thresholds were chosen carefully such that variances from the different sorted spike classes did not overlap at multiple chosen points. The variance of each sorted class of units was always within the threshold windows chosen in the sorting. In addition, we always used two other criteria for the sorted spike classes: 6) the interspike intervals for each class were exponential with a minimum 1-ms spike latency and the distribution peak was always >2 ms, and 7) for each spike class, the spike rate remained stable throughout the recording time. Further methodological details of the neurophysiological recordings are available in earlier publications (Fritz et al. 2003, 2005a).

Unlike our previous behavioral physiology experiments that exclusively used single electrode recordings (Fritz et al. 2003, 2005a,b), in the currently reported experiments, we also used multiple independently movable electrodes (in a multielectrode drive made by Alpha Omega). In our multielectrode configuration, four recording electrodes were used, each separated by about 500 microns from its nearest neighbor. To position all four electrodes, a craniotomy of approximately 2-mm diameter was required. In our recording protocol, we initially measured the frequency tuning curve and STRF for all recording sites. We then chose our target tones based on this multisite spectral tuning. One consequence of our change to multielectrode recording was that the choice of target tones relative to the best frequencies (BFs) on the different electrodes could no longer be custom-tailored to each receptive field, as in earlier single-electrode experiments (Fritz et al. 2003, 2005a). An advantage of this more dispersed distribution of target tones relative to receptive field best frequency, however, was that we could simultaneously test the effectiveness of the same target tones at many different distances from the best frequencies of the STRFs of multiple A1 neurons during performance of the behavioral task.


During training and active physiological measurements, the acoustic stimuli were 1.5 s in duration, 60–75 dB SPL (fixed in amplitude for a given behavioral session) and consisted of tone complexes (chords) and TORCs. However, all passive STRF measurements used TORC stimuli that were longer (3 s) in duration, which allowed for more rapid measurements. In control studies, we have shown that the change in TORC duration (from 1.5 to 3 s) does not affect the resultant STRF. Each of the 30 TORCs was a broadband noise with a dynamic spectral profile that was the superposition of the envelopes of six ripples. A single ripple has a sinusoidal spectral profile, with peaks equally spaced at 0 (flat) to 1.4 peaks/octave; the envelope drifted temporally up or down the logarithmic frequency axis at a constant velocity of ≤48 Hz (Depireux et al. 2001; Klein et al. 2000; Kowalski et al. 1996; Miller et al. 2002). During physiological recording, the computer-generated stimuli were delivered through inserted earphones (Etymotic) that were calibrated in situ at the beginning of each experiment. The amplitude of tone and TORC stimuli was set at a value in the range between 60 and 75 dB during physiological recording.

Targets consisted of 1.5-s chords with frequencies chosen based on the BFs of some of the isolated units. The overall distribution of all target tone stimuli used in these behavioral physiology experiments is shown in Fig. 1B. The chords typically consisted of one to six tones, most often three to four tones spaced 1 octave apart, but with nearest-neighbor intertone spacing that varied between 0.5 and 2 octaves (Fig. 1A).
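As a minimal illustration of the chord construction, the component frequencies of a target with equal spacing on a logarithmic (octave) axis can be computed as follows. The function name `chord_frequencies` and its parameters are hypothetical; in the experiments the component frequencies were chosen by hand from the measured BFs, and the spacing was not always uniform.

```python
def chord_frequencies(base_hz, n_tones, spacing_octaves=1.0):
    """Frequencies of n_tones starting at base_hz, equally spaced in octaves."""
    return [base_hz * 2.0 ** (k * spacing_octaves) for k in range(n_tones)]


# Four tones, 1 octave apart, starting at 1 kHz.
print(chord_frequencies(1000.0, 4))  # [1000.0, 2000.0, 4000.0, 8000.0]
```

With `spacing_octaves=0.5` or `2.0`, the same function yields the narrowest and widest nearest-neighbor spacings that were used.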

FIG. 1.

Acoustic stimuli used in behavioral tasks and physiological recordings. A: examples of the spectrotemporal structure of a representative reference temporally orthogonal ripple combination (TORC) stimulus and a target tone chord. B: target chords typically consisted of 1–6 tones with variable intertone spacing. Distribution of the number of target tones (left) and intertone spacing (right) across all experiments reveals that the most common target chord configuration consisted of 3–4 tones and 1-octave spacing.

STRF analysis

STRFs were measured using the reverse-correlation method (Klein et al. 2000). Response variance (σ) was estimated using a bootstrap procedure (Depireux et al. 2001; Efron and Tibshirani 1998) and an overall signal-to-noise ratio (SNR) was computed for each STRF. (A detailed description of the SNR analysis is provided in Klein et al. 2006.) Most SNRs were >1. Those STRFs with an SNR <0.2 were excluded from further analysis. Each STRF plot was therefore associated with a particular variance (σ). Excitatory (positive) and inhibitory (negative) fluctuations from the (zero) mean of the STRF were deemed significant only if they exceeded a level of 2.5σ. Contours were drawn at this level to demarcate significant excitatory and inhibitory features. This analysis and these criteria were also applied in determining the significant changes between two STRFs. Thus a significant STRF change refers to a suppressive or facilitative region in the difference receptive field (difference STRF) that exceeded the 2.5σ criterion.
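The significance criterion can be sketched as follows, treating an STRF as a nested list indexed by frequency and time lag. This is an illustrative sketch under stated assumptions: the function names are hypothetical, the difference is taken as the later STRF minus the earlier one so that enhancement is positive, and σ is passed in as a single number because the text does not specify how the variances of the two STRFs were combined for the difference STRF.

```python
def difference_strf(strf_before, strf_after):
    """Bin-by-bin difference (after minus before); the sign convention is an assumption."""
    return [[b - a for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(strf_before, strf_after)]


def significant_mask(strf, sigma, k=2.5):
    """Label each bin +1 (excitatory), -1 (inhibitory), or 0 (below k * sigma)."""
    thr = k * sigma
    return [[1 if v > thr else -1 if v < -thr else 0 for v in row]
            for row in strf]


pre = [[0.0, 1.0], [-1.0, 0.2]]
active = [[3.0, 1.0], [-4.0, 0.3]]
diff = difference_strf(pre, active)
print(significant_mask(diff, sigma=1.0))  # [[1, 0], [-1, 0]]
```

The contours drawn in the STRF figures correspond to the boundaries of the nonzero regions of such a mask.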

To measure the population effect of the target tones, we labeled the tones by their distance away from the center of the receptive field (defined as the peak of the magnitude of the Hilbert transform of a spectral cross section of the receptive field). We note that this center often, but not always, coincided with the cell's BF. Each difference STRF (STRFdiff) was then spectrally shifted up or down to align all receptive fields by the closest (or furthest) tone from the center of the receptive field. Only STRFdiff values with a significant effect (>2.5σ) at the close (or far) tone were included in the population figure. The effect was measured in a band ±0.25 octave around the tone being analyzed, over the first 1–40 ms of the STRFdiff. The final population figure was then scaled by the total number of units, thus reflecting the percentage of units that contributed with a significant effect to the entire population. In cases where only a few cells were available for the population analysis (Fig. 5C), we analyzed all cells without taking into account any significant effect at the octave-chord locations.
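The alignment procedure can be sketched on a one-dimensional spectral cross section. The sketch below is not the analysis code used in the study: the function names are hypothetical, the receptive field center is supplied as a parameter rather than computed from a Hilbert transform, bins shifted past the edge are discarded and replaced with zeros, and dividing the summed profiles by the total number of units is one possible reading of the scaling step.

```python
import math


def octave_distance(tone_hz, center_hz):
    """Signed distance of a tone from the receptive field center, in octaves."""
    return math.log2(tone_hz / center_hz)


def nearest_and_furthest(tones_hz, center_hz):
    """Return the target tones closest to and furthest from the center."""
    ranked = sorted(tones_hz, key=lambda f: abs(octave_distance(f, center_hz)))
    return ranked[0], ranked[-1]


def shift_profile(profile, shift_bins):
    """Shift a spectral profile along the frequency axis, zero-padding the edges."""
    n = len(profile)
    out = [0.0] * n
    for i, v in enumerate(profile):
        j = i + shift_bins
        if 0 <= j < n:
            out[j] = v
    return out


def population_average(aligned_profiles, n_total_units):
    """Sum the aligned profiles and scale by the total number of units (assumed reading)."""
    n = len(aligned_profiles[0])
    return [sum(p[i] for p in aligned_profiles) / n_total_units for i in range(n)]


# Tones at 1, 2, 4, and 8 kHz with a receptive field centered near 1.5 kHz.
print(nearest_and_furthest([1000.0, 2000.0, 4000.0, 8000.0], 1500.0))  # (2000.0, 8000.0)
```

Scaling by the total number of units, rather than by the number of units that passed the significance criterion, means that an effect present in only a few cells contributes little to the population figure.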

FIG. 2.

Examples of single-unit spectrotemporal receptive field (STRF) plasticity during and after behavioral tasks. A and B: responses of 2 single units recorded simultaneously from 2 electrodes. For each cell, the Pre- and Active-STRFs are shown in sequence. To the right of each pair are the difference STRFs (PreActive-STRF), which highlight the changes that occurred during the task. In all plots, the color scheme ranges from red (enhanced) to blue (suppressed) responses, around a baseline of green. All regions of significant change from the baseline are delineated by contours (see methods). For display purposes, we scaled down (by a factor of 2) the PreActive-STRF regions that are below the significance criterion (2.5σ). This scaling was not used for any population analysis and was performed only to visually clarify the effects of target tones.

FIG. 3.

Examples of single-unit STRFs that changed most after completion of the behavioral task. A: STRF of this cell did not change significantly during the task (no significant regions in PreActive-STRF). Instead, it was facilitated at the 2 target tones only after the end of the behavior (PrePost-STRF). All details of plots are as in previous examples. B: another example of some STRF changes that occurred primarily after task completion.

FIG. 4.

Longer-term persistence of plasticity after completion of the task. A: 3 Post-STRFs for this single unit reveal that changes continued to build up for over 1 h after the behavioral task. B: 2 Post-STRFs illustrate the slow decay of the changes after the behavior. All details of the STRF plots are as in previous figures.

FIG. 5.

Average induced plasticity in a neuronal population. A: average change due to target tones located nearest to the center of the STRFs. Left: significant changes at tone locations occurred in 58/105 units. Average of these units indicates a facilitatory effect at the center of the alignment (dashed line). Middle: significant changes between tones occurred in 61/105 units. Average of these units indicates a strong suppression, followed by a facilitatory region. Right: distribution of distances between the center of the STRFs and the nearest target. Histogram displays the absolute value of the distance (i.e., targets below and above the center of the receptive field). B: significant changes at and between the furthest targets occurred in 37/105 and 31/105 cells, respectively, and were relatively small compared with the changes for nearest targets. C: average STRF changes (left) and their predicted counterpart (right) for a 1-octave-spaced target chord of 3 tones (symbolized by the 3 arrows and dashed lines). Average changes included results from 49 cells. Predicted pattern was computed by aligning the results due to the one-tone pattern (A, left) at the 3 1-octave-spaced locations and then summing them. The two patterns share similar, broad features, in particular the dominant intertone suppression. D: recordings from 31 cells in primary auditory cortex (A1) of a naïve ferret. Stimuli and analysis were identical to plots in A, except that the animal was not trained on the task, and there were no behavioral contingencies of positive or negative reward. Significant STRF changes occurred in 14/31 cells at the nearest tones, and 12/31 cells at the nearest intertone locations, but the average plasticity in either case was weak and inconsistent.


The design of these experiments, animal training, and data recording and analysis were similar to those already described in detail in Fritz et al. (2003) and are reviewed in methods. Three ferrets were trained to lick from a spout during the presentation of reference noise sounds, called “temporally orthogonal ripple combinations” (TORCs) (described in detail in methods), and to cease licking on hearing a target tone chord. TORCs were specially constructed to allow for measurement of the isolated units' STRFs both during passive and task-performance conditions. In the passive condition, the animals passively listened to the reference TORCs in the absence of target sounds. Such passive STRF measurements were conducted both before and after performance of the tasks (designated as Pre- and Post-STRFs). In between, identical STRF measurements were conducted during the active task condition, while the animal performed the target-detection task (Active-STRF), and thus was aroused and actively “anticipating” the multitone target that varied between sessions, but was constant throughout a given behavioral session.

In a typical recording session, a sequence of STRFs was measured pre-, during-, and postbehavior from up to four independently movable electrodes, each separated by about 500 microns from the nearest neighbor. Sometimes, several postbehavior STRFs were measured over 1–2 h or until the cells were lost. This sequence of behavioral/physiological measurements was often repeated twice or more at different electrode depths for different behavioral sessions on a given recording day, for as long as the animal was thirsty and sufficiently motivated to perform the task (although it varied from animal to animal, typically we could elicit 40–120 trials per day, sufficient for one to three behavioral sessions). Here we report results from a total of 105 single units with passive/active pairs from three animals and from 31 “passive/active” pairs in the naïve animal.

Behavior with fixed versus random multitone chord targets

In each behavioral session during physiological recordings, the ferrets performed the task with fixed multitone target chords with component frequencies selected based on the initial measured STRFs of the units. Thus although the target chords chosen varied appreciably from behavioral session to session, in a given experimental recording session the target chord was fixed, to accumulate enough spikes to measure the cumulative effects of the specifically chosen tones in the target chord. Although the animals may never have previously encountered this specific target tone complex, or indeed any of its individual tones, they could nevertheless perform this task based (presumably) on the substantial difference between the timbres of the reference (TORC) and target (multitone) stimuli. Our current physiological results are based on this behavioral design.

We conjectured that the animals likely attended to the global quality of the broadband target multitone stimuli to distinguish this acoustic category from the broadband reference TORCs, as opposed to selectively attending to any particular tone within a specific chord to detect the chord in the presence of background TORCs. However, it is also possible that the animal's behavioral strategy was not global, but instead was to attend selectively to a single tone for a few trials, and then to switch attention from one tone to another, effectively attending to all of the individual component tones in the multitone complex sequentially throughout the experiment. To examine this possibility, we tested the behavior in the three trained ferrets used in the physiological experiments on a variant of the task in which the target multitonal chords randomly varied in frequency composition from trial to trial [i.e., each successive target was generated with a random number of three to six harmonic components (each separated by 1 octave) based on a randomly chosen fundamental from 125 to 500 Hz]. The animals performed well on this variable multitone task variant without any additional training beyond their original “fixed multitone detect” training. Specifically, in the first three test sessions, the behavioral performance, as measured by average discrimination ratio [DR = hit rate × (1 − false positive rate)] of the three animals was 0.78, surpassing the average performance with fixed multitone targets for the three behavioral sessions just before the variable multitone tests (DR = 0.71). There was no apparent learning curve for the random multitone task—in fact, all animals performed superbly from the onset in their very first behavioral session with variable multitone targets (with individual first session performances for the three ferrets of DR values of 0.70, 0.78, and 0.84).
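The discrimination ratio and the construction of the variable targets can both be written compactly. The following is a sketch only: the function names are hypothetical, and sampling the fundamental uniformly in Hz and the number of components uniformly on three to six are assumptions, because the text gives the ranges but not the distributions.

```python
import random


def discrimination_ratio(hit_rate, false_positive_rate):
    """DR = hit rate x (1 - false positive rate), as defined in the text."""
    return hit_rate * (1.0 - false_positive_rate)


def random_octave_chord(rng, f0_range_hz=(125.0, 500.0), n_range=(3, 6)):
    """Draw one variable target: three to six components, each 1 octave apart.

    Assumption: the fundamental is uniform in Hz over the stated range.
    """
    f0 = rng.uniform(*f0_range_hz)
    n_tones = rng.randint(*n_range)
    return [f0 * 2.0 ** k for k in range(n_tones)]


rng = random.Random(0)
chord = random_octave_chord(rng)
print(len(chord), [round(f) for f in chord])
print(round(discrimination_ratio(0.9, 0.1), 2))
```

With these ranges the highest possible component is 500 Hz × 2^5 = 16 kHz, which coincides with the upper limit of the frequency range used in training.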

These behavioral results, showing sterling performance and a seamless behavioral transition between task variants, suggest that the ferrets were unlikely to have been using a “single-tone” strategy of selectively attending to a particular anchor tone, because such a strategy would not transfer immediately to the variable multitone task, in which the component tone frequencies of the target were random and thus unpredictable from trial to trial. Instead, it seems more likely that the animals performed both the random and the fixed multitone-detection tasks by attending to the change in overall timbre that distinguishes the multitone chords from TORCs. These behavioral results set the stage for the physiological results described next.

Examples of STRF changes in single units

Figure 2 provides examples that illustrate the general pattern of STRF changes during task performance observed in 105 single units recorded in three trained ferrets. The two cells (Fig. 2, A and B) were recorded simultaneously on two neighboring electrodes; the recordings were therefore behaviorally matched, because both cells were recorded while the animal performed the same task in the same session.

Initially, the two units in Fig. 2, A and B were tuned at approximately 8 and 1.5 kHz (leftmost panels). The target chord was composed of four tonal components that were spaced 1 octave apart (at 1, 2, 4, and 8 kHz, as indicated by the four dashed lines drawn through Active-STRFs). In the first cell (Fig. 2A), one of the tones was aligned with the BF (8 kHz), whereas the two lowest tones were far from the strongest regions of the original, prebehavior STRF. In the second cell (Fig. 2B), the three lowest tones flanked the BF and a secondary excitatory area nearby (1.5 and 3 kHz). During performance of the task, the shape of the two STRFs changed rapidly in a pattern illustrative of the average plasticity observed in the population (as we discuss later).

In the STRF of Fig. 2A, the main excitatory region around the BF (∼8 kHz) was considerably enhanced during behavior, and a new smaller excitatory region at about 4 kHz split off. This change is explicitly shown in the rightmost panel that illustrates the difference between the Pre- and Active-STRFs (labeled PreActive-STRF). Here the enhancement of the excitatory region near the location of the target tones (4 and 8 kHz) is clearly seen, as well as a suppression introduced between the target tones (at 6 kHz), which was responsible for the splitting of the original Pre-STRF.

The Pre-STRF in Fig. 2B initially had low-frequency excitatory regions (at 1.5 and 3 kHz) that fell between the three lowest tones of the target complex (1, 2, and 4 kHz). During behavior (Active-STRF), the excitatory regions between the tones contracted in size and were more delayed in temporal onset relative to their shape in the initial passive STRF. The difference STRF (PreActive-STRF) in the rightmost panel illustrates the suppression effectively introduced between the lower target tones during the behavior, as well as an accompanying delayed excitation.

These two STRFs (Fig. 2, A and B) highlight two important findings that will be further quantified later: 1) Significant changes in the STRFs tended to occur near or within their initial boundaries. Thus in the first STRF, only the two highest target tones induced large changes during behavior, whereas in the second STRF, the lowest target tones were more effective. This weighted form of rapid plasticity was consistently seen across the population of isolated units. 2) In a given behavioral session, some or all tones of the target chord may simultaneously induce changes in several neighboring STRFs.

STRF changes evident post behavior

Although most STRFs changed during the task, and often quickly reverted toward their original shape afterward (Fritz et al. 2003, 2005a,b), many STRFs continued to evolve in a variety of different ways following behavior. Figure 3 shows two examples of these postbehavioral buildup trends. In both, most plastic changes became measurable only after the behavior was completed. For instance, no significant changes occurred in the Active-STRF in Fig. 3A, but significant changes were seen in the Post-STRF where two excitatory effects, with a suppressive intermediate band, were observed at the two target tones. Similarly in Fig. 3B, extensive reduction of the inhibitory regions occurs mostly in the Post-STRF. Out of 92 cells in which pre-, active-, and post-STRFs were measured, over a quarter (26/92 cells) exhibited a buildup of changes post behavior, whereas just under a quarter (22/92 cells) reverted (sometimes gradually) to their original shape. In the remainder (44/92), the Post-STRF was similar to the Active-STRF. These proportions were determined by measuring the maximum significant changes ± 0.25 octave around the target tones and intertones.

Persistence of STRF changes post behavior

In 31 cells that exhibited persistent plasticity after completion of the behavioral task, we were able to measure two or more passive STRFs over a period of 1–2 h, and thus observe the evolution of the plasticity over time. Two single-unit examples are shown in Fig. 4. In Fig. 4A, the Pre-STRF had excitatory and inhibitory regions (at 1.4 and 2.8 kHz, respectively) that were aligned with the two upper target tones. After the behavior, the first Post-STRF (Post-STRF1) exhibited a broadening of the STRF regions, with a significant weakening of the inhibitory area relative to the Pre-STRF. Two subsequent measurements (Post-STRF2, Post-STRF3) illustrate that the induced changes persisted, and may even have increased in strength, for an extended period afterward. Specifically, the inhibitory area continued to weaken (Post-STRF3), whereas the excitation remained relatively stable and stronger relative to the Pre-STRF.

In the second example of Fig. 4B, the pattern was broadly similar in that the excitatory area at 12 kHz was enhanced following the behavior (Post-STRF1), and then persisted but gradually weakened (Post-STRF2). By contrast, the inhibitory area (coincident with target tone at 6 kHz) was reduced after the task (Post-STRF1), but remained relatively stable afterward (Post-STRF2).

Population patterns of plasticity

Despite the heterogeneity of STRF plasticity in different cells, there was a consistent pattern of changes that emerged when we averaged the individual task-related STRF changes across the entire population as shown in Fig. 5. Specifically, consistent with the single-unit examples in previous figures: 1) Target tones were most effective in inducing plasticity when they fell near the center of the STRFs and 2) the effects induced were facilitatory (excitatory) at the target tone frequencies and suppressive between them.

These two effects are demonstrated in Fig. 5A where only difference STRFs that showed significant changes (according to the criteria detailed in methods) were included in the averages. In the left panel, the PreActive-STRF of each cell was aligned at the target tone nearest to the center of its STRF (mean distance = 0.37 octave), and then averaged across all cells. A histogram of the distances of the nearest tones to the centers of the STRF used in this average (from all 105 units) is shown in Fig. 5A, right panel (see methods for further details). The resulting pattern illustrates that the induced plasticity was localized and facilitatory at the target tones and was surrounded by strong inhibitory sidebands that were followed by strong and delayed facilitatory regions. By contrast, when the PreActive-STRFs were aligned between the target tones nearest to the STRFs, the resulting pattern (middle panel) was almost purely suppressive, followed by a facilitatory region. Both the facilitatory and suppressive plasticities were significantly weaker or less organized when the PreActive-STRF of each cell was aligned at the target tones furthest from the STRFs (Fig. 5B), confirming the findings in the single-unit examples in Figs. 2–4. A histogram of the distances of the furthest tones from the STRF centers is shown with mean distance = 2 octaves.

The chord targets were usually heterogeneous in acoustic structure and varied considerably from one experimental session to another. However, in a subset of experiments (49 cells), we used standardized target chords consisting of only three or more tones, with 1-octave spacing. Because of the uniform intertone distance, it was readily possible to align and average the results from this target stimulus set, and thus obtain a global view of the changes to the whole target pattern. The averaged plot for this subset of tests is shown in Fig. 5C (left). The pattern of change exhibits alternating excitatory and suppressive peaks roughly registered with the 1-octave spacing of the target tones. This pattern can be approximately predicted (Fig. 5C, right) from a superposition of three shifted copies of the average pattern (shown in Fig. 5A, left). Each copy is aligned around one of the tones in the three-tone complex. However, because the prediction is derived from the average pattern based on a different data set [a population of 58 units (see Fig. 5A, left panel) with a different tone-BF distance profile, and component tone spacing than the 49 units presented with 1-octave spacing in the measured plasticity], we did not expect to find perfect equivalence between the predicted and measured plasticities. The major difference between the two is found in the weakening of the facilitatory effects at the target tones, presumably due to the cumulative effects of the strong suppressive influences between the tones, and possibly also because the average distance to BF was greater for the tones in the data set with 1-octave spacing.
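The superposition prediction can be sketched as follows. This is a toy illustration under stated assumptions, not the procedure used to generate Fig. 5C: the helper names, the zero filling at the edges of the frequency axis, and the toy single-tone pattern (a facilitatory peak flanked by suppressive sidebands) are hypothetical.

```python
import numpy as np

def shift_rows(pattern, shift):
    """Shift a (frequency x time) pattern along the frequency axis by
    `shift` bins, filling vacated rows with zeros."""
    pattern = np.asarray(pattern, dtype=float)
    out = np.zeros_like(pattern)
    n = pattern.shape[0]
    if abs(shift) >= n:
        return out  # shifted entirely out of range
    if shift >= 0:
        out[shift:] = pattern[:n - shift]
    else:
        out[:shift] = pattern[-shift:]
    return out

def predict_chord_pattern(single_tone_pattern, tone_offsets_bins):
    """Sum shifted copies of a single-tone average pattern, one copy
    per tone of the chord (offsets given in frequency bins)."""
    total = np.zeros_like(np.asarray(single_tone_pattern, dtype=float))
    for offset in tone_offsets_bins:
        total += shift_rows(single_tone_pattern, offset)
    return total

# Toy single-tone pattern: facilitation at the tone, suppressive sidebands
single = np.zeros((9, 1))
single[4, 0] = 1.0
single[[3, 5], 0] = -0.5

# Three tones at 1-octave spacing, assuming 2 frequency bins per octave
predicted = predict_chord_pattern(single, [-2, 0, 2])
```

In this toy case the sidebands of neighboring copies add at the intertone bins, so the summed suppression between tones is deeper than in the single-tone pattern, while the facilitatory peaks keep their 1-octave spacing.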

Finally, we also computed the average STRF changes from a population of 31 units in a naïve animal using exactly the same stimuli, presented in the same sequence as to the behaving animals (except that there were no task contingencies for shock, nor did the naïve animal receive water during the behavioral session). The results are shown in Fig. 5D, both at and between target tones that are nearest to the STRFs (i.e., corresponding to Fig. 5A). There was no significant and meaningful change at the center of these plots, confirming that behavior is necessary to induce plasticity.

Population patterns of persistence

As discussed earlier, STRF changes persisted to varying degrees immediately after completion of the behavior. Figure 6 illustrates these effects averaged from all measurements. Each panel was computed the same way as for the during-behavior changes shown earlier in Fig. 5, except that PrePost-STRFs were used instead of the PreActive-STRFs. The top panel in Fig. 6A reveals that the facilitation induced postbehavior at target tones broadly resembles the facilitation measured during the behavioral task (Fig. 5A), except for being somewhat more diffuse and lacking inhibitory sidebands. Midway between tones (bottom panel), induced suppression is the predominant change, although it is weaker and more delayed compared with that observed during the task (Fig. 5A). In summary, in both PrePost-STRF average plots in Fig. 6A, suppressive changes apparently fade away significantly faster than the facilitatory changes.

FIG. 6.

Average persistent STRF changes in different behavioral tasks. A: persistent plasticity in multitone detection. Top: average PrePost-STRFs at the nearest targets computed from significant changes in 63/92 cells. Bottom: persistent plasticity measured at the nearest intertone in 51/92 cells. B: similarity of average STRF changes that occurred at the target tone during (left) and post (right) performance of a single-tone detection task. C: similarity of average STRF changes that occurred at the reference tone during (left) and post (right) performance of a tone frequency discrimination task.

To put these persistent plasticity patterns, for the multitone-detection task, in the broader perspective of long-lasting, task-related plasticity in A1 in other spectral tasks that we have previously reported (Fritz et al. 2003, 2005a,b), we reanalyzed those data, as well as new additional data, and computed the equivalent averaged (PrePost-STRF) plots from recordings with animals detecting a single tone in the presence of background TORCs (Fig. 6B, right) or discriminating two tones (Fig. 6C, right). In both of these previous tasks, population-level, average STRF changes induced at the target tone during detection (Fig. 6B, left) and at the reference tone during discrimination (Fig. 6C, left), persist in roughly the same form afterward (right panels), providing further population evidence for sensory memory in all of these tasks.

Finally, Fig. 7 illustrates the persistence of plasticity in a population of 31 cells (a subset of cells used in Fig. 6A) in which at least two postbehavioral STRF measurements were conducted (during the hour after completion of the behavioral task). Each measurement lasted about 15–20 min. The first postbehavioral STRF measurement began immediately after task completion and the second postbehavioral STRF measurement began about 30 min later. On average, STRF changes relative to the Pre-STRF (PrePost-STRF1 and PrePost-STRF2), centered at target tones (Fig. 7A) and midway between tones (Fig. 7B), remained stable with no statistically significant buildup or decay (despite the apparent buildup in the figure). Note also that this pattern differs from that in the PreActive-STRF plots of Fig. 5A in that the suppression fades relatively quickly, apparently unmasking facilitatory effects surrounding it; the pattern, however, remains generally similar to that seen earlier in Fig. 6A. These results provide a population view of persistent receptive field modulation.

FIG. 7.

Time course of persistent plasticity at and between tones of a target chord. A: average STRF changes at target tones persisted and continued to build up after task completion for at least 1 h. B: average STRF changes between target tones exhibited strong suppression, and a preceding weaker and diffuse facilitation. Both effects continue to strengthen slightly after task completion.


Receptive fields in ferret primary auditory cortex can undergo rapid and persistent changes in their spectral selectivity when the animal engages in an auditory spectral task and attends to salient acoustic cues. The pattern of overall STRF changes reflects the spectral cues of the task-relevant acoustic stimuli, and the presence of such spectral plasticity is contingent on attention to these cues, as reflected by the performance on the task. As we have shown for the multitone targets, these adaptive changes persist long after the task is over, in largely the same form as that observed during the behavior. Furthermore, in a subset of cells, the changes persist over a relatively long time (>30–45 min) after the approximately 15- to 20-min task was completed. These results confirm and extend our previously reported findings in animals performing different tasks, such as single-tone detection or two-tone discrimination (Fritz et al. 2003, 2005a).

There are several novel experimental findings reported here that provide new insights into the nature of this rapid, task-related, receptive field plasticity.

First, our results suggest that the pattern of plasticity observed for a complex target, consisting of multiple, simultaneous tones, may be roughly viewed as a superposition of changes induced by the individual tones (as in Fig. 5C), this despite the inhibitory, facilitatory, and often nonlinear interactions often reported in the responses to complex stimuli. The implication of these findings is that the predicted pattern of plasticity for any arbitrary combination of discriminated stimuli in a spectral auditory-discrimination task should approximately reflect the spectral difference between the spectral shapes of the target and the reference stimuli. In this sense, each stimulus combination in a given auditory-discrimination task, should yield a characteristic, spectral plasticity signature. In the experiments described in the present study, the relevant spectral difference was between the narrow spectral peaks of the multitone target and the flat, broadband noise spectrum of the reference TORCs, which may explain why suppressive effects were induced between the target tones. This is consistent with the pattern of plasticity we found earlier (Fritz et al. 2005a) in animals discriminating between two tones—a target tone (which potentiated the STRF at target frequency) and a reference tone (which suppressed the STRF at reference frequency).
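A toy numerical sketch may help make the spectral-difference idea concrete. It is not the formal model referred to in this article: normalizing each spectrum to unit sum before subtracting is an assumption made here so that the two spectra are comparable, and the function name, bin count, and tone positions are hypothetical.

```python
import numpy as np

def contrast_filter(target_spectrum, reference_spectrum):
    """Difference between target and reference spectra, each normalized
    to unit sum (the normalization is an illustrative assumption)."""
    t = np.asarray(target_spectrum, dtype=float)
    r = np.asarray(reference_spectrum, dtype=float)
    return t / t.sum() - r / r.sum()

n_bins = 9
target = np.zeros(n_bins)
target[[2, 4, 6]] = 1.0       # three narrow spectral peaks (multitone target)
reference = np.ones(n_bins)   # flat, broadband reference spectrum

predicted_change = contrast_filter(target, reference)
predicted_sign = np.sign(predicted_change)
```

Under these assumptions the difference is positive at the three tone bins and negative at every other bin, including those between the tones, which matches the qualitative pattern of facilitation at target frequencies and suppression between them.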

Second, we found that plasticity was strongest near the center of the prebehavior STRF of the cell (Fig. 5), suggesting that task-related adaptation may occur by changing the strengths of preexisting functional synapses, often leading to lasting plasticity (Figs. 6 and 7). In light of previous studies showing receptive field plasticity in the auditory thalamus (Edeline et al. 1991) and our observations of the precise frequency specificity of task-related plasticity, it is likely that changes in the efficacy of thalamocortical inputs play an important role in reshaping the STRF, although it is unlikely that A1 merely “inherits” plasticity that occurred in the ventral partition of the medial geniculate body (MGBv). A recent computational model has been proposed to explain these observations (Soto et al. 2006).

Third, the persistent, postbehavioral receptive field changes closely resembled the plastic STRF changes measured during the multitone-detection task, in effect providing a form of short-term sensory memory of the task and its associated stimuli in primary auditory cortex. We have also presented additional data confirming this result at a population level for two additional tasks (tone detection and two-tone discrimination) by analyzing postbehavioral physiological data from previous studies in our laboratory (Fritz et al. 2003, 2005a). It should be noted, however, that the suppression induced between the tones in the chord-detection task apparently fades rather quickly after the behavior, compared with the induced facilitation at the tones. Further studies are necessary to fully understand the differential time course of these events.

Finally, our results may have implications for the role of selective versus global attention in mediating rapid plasticity. We asked whether the experimental animals attended selectively to individual tone components in the multitone chord, or attended globally to the entire chord. Our data suggest that several (if not all) tonal components of a chord target may simultaneously induce plasticity in different cells. Because the multitone chord targets in the physiological experiments were fixed in their composition throughout a particular behavioral session, one could thus argue that the STRF changes at any one frequency, observed in any one cell, might be due to the animal extracting a single tone from the target tone complex and then attending to that one specific tone. However, because it is challenging for even trained musicians to extract all of the individual frequency components of random chords, it is unlikely that the ferrets in this study were able to extract and attend to single tones in complex chords. Moreover, because different STRF changes occurred simultaneously in different neurons at multiple frequencies matching the target tones during the same experiment, one has to conclude that either the animal was still selectively, but successively, attending to multiple individual tones throughout the experiment or, instead, that the animal attended to the global feature differences between the two classes of broadband sounds, the target (chords) and reference (TORCs) stimuli. This latter explanation is more parsimonious and is more consistent with the demonstrated behavioral ability of the animals to readily perform the task variant in which target multitones were randomized between trials.

To comprehend the possible role of attention in inducing rapid task-related receptive field plasticity in our experiments (Fritz et al. 2007), it may also be necessary to understand the prior neural changes that occurred when the animal originally learned the task. During the training phase to learn the fixed multitone-detection task (which typically lasted about 1 mo), the ferrets first learned the random-target-frequency version of the single-tone detection task (Fritz et al. 2003). In the course of this training, they learned the basic task rules and formed representations and categorical distinctions (perhaps encoded in auditory association cortex and/or in the prefrontal cortex) between the background, reference TORCs (30 distinct exemplars in this set) and multiple exemplars of the foreground, target multitone chords (a virtually limitless set, usually with one new chord for each behavioral session–thus ≥15–20 exemplars by the end of the first month of training). During the training period, not only did the ferrets learn to recognize the acoustic category of TORC sounds, but they also learned the associated “meaning” of the sounds in this category—which was that these were “safe” sounds during which they could safely drink water. In contrast, the ferrets also learned to recognize the acoustic category of multitone chords, which signified a category of “warning” sounds, following which they should refrain from drinking to avoid possible shock.

When the animals subsequently performed the multitone task during physiological experiments (with novel exemplars of the target), we conjecture that whenever the animal detected a target stimulus (category), a top-down signal was triggered that induced plasticity in auditory cortical neurons (such as those in A1 studied here) that adaptively reshaped their receptive fields to facilitate responses to the current foreground target stimulus. We also propose that the occurrence of reference background stimuli (in this experiment, the TORCs) induced an opposite (or suppressive) effect on the STRFs. The net consequence of these dual “push–pull” receptive field changes is a contrast filter, which enhances a target relative to the reference. In our current study of task-related plasticity arising from multitone detection, and also in previous studies involving different auditory tasks (Fritz et al. 2003, 2005a), we have observed that receptive field plasticity precisely follows the specific acoustic details of the target and reference stimuli. Thus we suggest that, once triggered by attention, the details of the salient contrasting acoustic stimuli in the current acoustic discrimination task are “automatically imprinted” on the malleable receptive fields in A1. This process may be mediated by neuromodulators (discussed in Fritz et al. 2005, 2007) such as acetylcholine or noradrenaline (Bakin and Weinberger 1996; Kilgard and Merzenich 1998; Manunta and Edeline 2004). In this view, attention can play a pivotal role by triggering receptive field plasticity on the stage of task knowledge and stimulus representation.

In summary, we propose that during task performance, the trained animals attend to the difference between the acoustic categories of target and reference sounds, which results in the rapid transformation of cortical receptive fields. The transformation of A1 STRFs can be described, to a first approximation, as convolution with a matched contrast filter (we note that these effects are often most clearly seen at the population level). The present results provide support for this contrast-filter hypothesis (Fritz et al. 2007), which has recently been formally modeled (unpublished observations, N. Mesgarani et al.) and can be used to predict A1 STRF plasticity changes for any arbitrary combination of spectral target and background stimuli in an auditory-discrimination task. Although we believe that this is the first explicit formulation of the general matched contrast-filter hypothesis for auditory cortical plasticity (see also Fritz et al. 2007), we note our approach is completely consistent with earlier results (such as those of Blake et al. 2002; Edeline and Weinberger 1993), which emphasized the development of A1 receptive field changes in a two-tone discrimination task, leading to greater neural discriminability. The contrast-filter hypothesis can be even more rigorously tested in future studies that explore the interrelation between auditory feature attention and receptive field plasticity.


This work was supported by National Institute on Deafness and Other Communication Disorders (NIDCD) Grant R01 DC-005779, NIDCD Training Grant DC-00046-01, Collaborative Research in Computational Neuroscience Grant RO1 AG-02757301, and a Defense University Research Instrumentation Program equipment grant from the Air Force Office of Scientific Research.


We thank all members of the Neural Systems Laboratory at the University of Maryland for valuable assistance throughout the course of these experiments, particularly for help in animal training from K. Donaldson; implant surgery from Dr. Pingbo Yin, Dr. Stephen David, and K. Donaldson; for help with naïve recordings from Dr. Pingbo Yin and Dr. Stephen David; and for help with software design (development of customized MATLAB programs for stimulus generation, task execution, and physiological and behavioral data analysis) by N. Mesgarani and Dr. Stephen David.


  • The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

