In a rest period immediately after a task, neurons in the hippocampus, neocortex, and striatum exhibit spatiotemporal correlation patterns resembling those observed during the task. This reactivation has been proposed as a neurophysiological substrate for memory consolidation. We provide new evidence that rodent ventral tegmental area (VTA) neurons are selective for different types of food stimuli and that stimulus-sensitive neurons strongly reactivate during the rest period following a task that involved those stimuli. Reactivation occurred primarily during slow wave sleep and during quiet awakeness. In these experiments, VTA reactivation patterns were uncompressed and occurred at the firing rate level, rather than on a spike-to-spike basis. Mildly aversive stimuli were reactivated more often than positive ones. The VTA is a pivotal structure involved in the perception and prediction of reward and stimulus salience and is a key neuromodulatory system involved in synaptic plasticity. These results suggest new ways in which dopaminergic signals could contribute to the biophysical mechanisms of selective, system-wide, memory consolidation, and reconsolidation during sleep.
It has been proposed that the hippocampus stores information during the acquisition of new memory episodes and that these memories are replayed during sleep as part of a memory consolidation process (Marr 1971; Stickgold and Walker 2007). Consolidation is believed to involve synaptic changes in the neocortex reflecting the integration and refinement of memory representations (McClelland et al. 1995; Schwindel and McNaughton 2011). This replay involves neural populations that were active during a task immediately preceding the sleep period. Reactivations of specific neural activity patterns have been observed in several brain areas including the hippocampus, amygdala, neocortex, and striatum (Bendor and Wilson 2012; Carr et al. 2012; Euston et al. 2007; Foster and Wilson 2006; Hoffman and McNaughton 2002; Ji and Wilson 2007; Karlsson and Frank 2009; Kudrimoti et al. 1999; Lee and Wilson 2002; Nadasdy et al. 1999; Pennartz et al. 2004; Peyrache et al. 2009; Popa et al. 2010; Qin et al. 1997; Ribeiro et al. 2004; Sutherland and McNaughton 2000; Tatsuno et al. 2006). Recent evidence suggests that replay may also indicate the planning of behaviors yet to be performed (Carr et al. 2011; Davidson et al. 2009; Diba and Buzsaki 2007; Gupta et al. 2010) and that the presence of rewards may increase reactivations in hippocampus and ventral striatum (Lansink et al. 2008; Singer and Frank 2009). The computational mechanisms underlying these reactivations and their possible consequences for learning have been investigated in hippocampus (Hasselmo 2008; Johnson and Redish 2005). One hypothesis is that reactivation results from local attractor dynamics within the structure in which it occurs (Shen and McNaughton 1996). Another, nonexclusive, possibility is that reactivation in a given area is, at least in part, inherited from or modulated by one or more other structures that project to it.
Theoretical and experimental research on reinforcement learning has led to several proposals regarding ways in which neural activity can be modulated by the value associated with a stimulus (Samson et al. 2010). Increases in extracellular levels of dopamine in regions such as the striatum, frontal cortex, or nucleus accumbens have been observed during the exploration of novel environments and during visual or auditory stimulation that predicts reward or punishment (food, drugs, foot shock), suggesting that the ventral tegmental area (VTA) responds to stimulus salience (Schultz 2007). Intracerebral injection of D1 and D2 agonists immediately after a radial arm learning task improved memory retention (Packard and White 1991; White et al. 1993). However, no mechanisms were proposed. VTA dopaminergic neurons retain a significant level of activity during slow wave sleep (SWS) (Dahan et al. 2007; Lee et al. 2001) and REM sleep, with a possible change in firing patterns rather than in firing rate compared with the awake state (Monti and Monti 2007). The relationship between this activity and that during the awake state has, however, not previously been studied.
It is still unclear why some specific memory items persist and/or become incorporated into the general knowledge base, while others do not (Lisman and Grace 2005; Uncapher and Rugg 2009). How does the nervous system determine what new information to memorize and what to discard? One possibility is that, as with memory acquisition, the consolidation process is modulated by the value associated with a specific memory item (Singer and Frank 2009). Research has shown that this value may, at least in part, be encoded by subcortical structures such as the VTA (Schultz 2004; Ungless 2004; Waelti et al. 2001). VTA activity increases at the time of unpredicted reward or at the time of a cue predicting the imminent receipt of a reward, and VTA is in an ideal anatomical relationship to influence the brain areas in which reactivation has been observed so far, whether reactivation is supporting memory consolidation or online planning. Studies in humans have shown that the enhancement of memory consolidation of highly rewarded stimuli was blocked if a D2-receptor agonist was administered during sleep (Feld et al. 2014). This impairment was hypothesized to be due to a general nonspecific increase in consolidation, whether the stimuli were rewarded or not during learning. Whether the effect was due to a change in sleep pattern or a physiological change at the neuronal level was, however, unclear. Recent research using optogenetic techniques reported that individual VTA dopaminergic cell activity was more selective than previously thought and contributed to behaviors beyond classical reinforcement learning (Chaudhury et al. 2013; Cohen et al. 2012; Lammel et al. 2012; Stamatakis and Stuber 2012; Tye et al. 2013). We show here that populations of neurons of the VTA indeed show posttask, offline reactivation and that this reactivation involves neuronal patterns similar to those elicited by specific stimuli during the task.
MATERIALS AND METHODS
Four Brown Norway/Fisher 344 Hybrid adult (8–10 mo old) male rats were kept individually in Plexiglas home cages, with regular chow and water ad libitum, on a 12:12-h reversed dark-light cycle. During the experimental period the animals were food deprived to 85% of their body weight and the experiments were conducted during the dark phase of the cycle. All surgical and behavioral procedures described below were approved by the Institutional Animal Care and Use Committee of the University of Arizona and conformed to National Institutes of Health Guide for the Care and Use of Laboratory Animals (NIH Publications No. 80-23, revised 1996).
Surgical procedure and implants.
Rats were anesthetized with 1.5–2.5% isoflurane in oxygen at a flow rate of ∼1.5 l/min. The animals were fixed in a stereotaxic frame and implanted with a Hyperdrive consisting of 14 independently movable tetrodes, 2 of which were used as reference electrodes (Fig. 1A). The 12 recording tetrodes were loaded inside a silica tube (65-μm inner diameter, 125-μm outer diameter; Polymicro Technologies, Phoenix, AZ), which penetrated ∼4 mm inside the brain to facilitate targeting of the peribrachial pigmented (PBP) division of VTA at a depth of 8.5 mm. The Hyperdrive was implanted at −6 mm from bregma, 2.6 mm lateral, with an angle of 14.5° from midline, following rat atlas coordinates (Paxinos and Watson 1998) (Fig. 1A). The drive was anchored to the skull with eight anchor screws and dental acrylic, and one of these screws was used as an animal ground. Additionally, two EEG electrodes (Teflon-insulated stainless steel wire, 0.0045 in.) were implanted in CA1 contralateral to the Hyperdrive position, at −3.1 mm from bregma, 2 mm lateral, and 3 mm depth. Two EMG electrodes were implanted in the neck muscles of the rat. The EEG and EMG signals were used to assess sleep phases.
Behavior and apparatus.
Before the task started, the rat was allowed to rest in a towel-lined flower pot for a period of 20–30 min (“Rest-1”), alone in the experimental room. The task consisted of the delivery of flavored food pellets with a pair of tweezers. Care was taken to minimize the required movements of the animal (e.g., we did not use a Skinner box) to limit the extent to which dopaminergic activity was related to movement. The events were empty tweezers (“e”), regular 25-mg food pellets (“f”), 25-mg sugar pellets (“s”), or 25-mg food pellets containing 2% quinine (“q”). The empty tweezer condition acted both as a control (no food delivered) and as a mild negative experience (rats were food restricted). As with quinine, empty tweezers were not negative enough to demotivate the animal. The choice of several valences (positive, negative) and types of stimuli was motivated by our goal to establish a “tuning curve” for each of the neurons. Pilot work showed that 15 trials per stimulus were necessary to assess stimulus preference and that 4 types of stimuli were optimal to avoid satiation. Food pellets were delivered at random intervals of at least 20 s. The animal could not perceive the pair of tweezers earlier than ∼3 s before the food was available for consumption (i.e., expectation of food was limited to <3 s before the delivery). All pellets were obtained from Research Diets, Noyes precision pellets (New Brunswick, NJ). Pellets differed by their color (“f” = green, “s” = white, and “q” = gray) and presumably by smell and taste. The time of first contact between the tweezers and the rat's mouth was recorded by the data acquisition system. Because of the random nature of the delivery, rats typically made contact with the tweezers even if the pellet was quinine flavored or if the tweezers were empty. Trials for which no contact was made were discarded. Only one session was given per day.
After the task was completed, an additional rest period in the towel-lined flower pot was conducted for 1 h (“Rest-2”), in the same conditions as Rest-1. Control sessions, on separate days, were also conducted in all animals. In these sessions, the VTA neuronal activity was recorded during 2 h without any task. After the end of these control sessions, an additional task session was given to characterize the stimulus selectivity of all cells recorded.
Electrophysiological signals were recorded on each of the 12 × 4 = 48 wires simultaneously. The tetrodes (Wilson and McNaughton 1993) were made of four twisted 12-μm nichrome wires (H. P. Reid, Palm Coast, FL), gold plated to an impedance of 0.5–1 MΩ. The tetrode configuration greatly facilitated the reliability of spike identification. Each tetrode was independently lowered to the target area, at a speed of no more than 600 μm each day, until appropriate signals could be recorded; it usually took 2 wk to reach the VTA. The leads of the tetrodes were connected to a unity-gain head stage, and all the data were collected using a Digital Lynx System (Neuralynx, Bozeman, MT). Single unit data from each tetrode were amplified, band pass filtered (600–6,000 Hz), and digitized at a rate of 32 kHz. In some sessions the upper cutoff of the band pass was also opened to 8 kHz to compare spike shape with the traditional 6-kHz filtering. No significant differences were noted. The EEG, EMG, and local field potential signals were acquired with the same system, filtered between 0.5 and 450 Hz, and digitized at 2 kHz. Single neurons were isolated offline using automatic cell sorting software (KlustaKwik by K. Harris, and MClust by D. Redish). All spike clusters for which >1% of interspike intervals were <3 ms were discarded. Of the 12 tetrodes targeted to VTA, per session, we had an average of 9.6 ± 1.8 tetrodes with usable cells and 6.5 ± 1.9 tetrodes containing at least one stimulus-sensitive cell (n = 18 sessions, see below). Electrodes were moved at the end of each session to ensure that different cells were recorded from session to session.
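The interspike-interval criterion described above can be illustrated with a short sketch. This is a minimal example and not the sorting software used in the study; the function names and the synthetic spike trains are hypothetical, and only the thresholds (3 ms, 1%) come from the text.

```python
def refractory_violation_fraction(spike_times_s, refractory_s=0.003):
    """Fraction of interspike intervals (ISIs) shorter than refractory_s."""
    times = sorted(spike_times_s)
    isis = [b - a for a, b in zip(times, times[1:])]
    if not isis:
        return 0.0
    return sum(1 for isi in isis if isi < refractory_s) / len(isis)


def keep_cluster(spike_times_s, max_fraction=0.01, refractory_s=0.003):
    """Keep a cluster unless more than max_fraction of its ISIs are too short."""
    return refractory_violation_fraction(spike_times_s, refractory_s) <= max_fraction


# Hypothetical clusters (times in seconds): a regular 20-Hz unit, and the same
# unit contaminated by extra spikes that follow every 10th spike by 1 ms.
clean = [0.05 * i for i in range(200)]
contaminated = sorted(clean + [0.05 * i + 0.001 for i in range(0, 200, 10)])

print(keep_cluster(clean), keep_cluster(contaminated))
```

In this toy case the contaminated cluster has 20 short intervals out of 219, well above the 1% limit, so it would be discarded.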
Classification of the stimulus sensitivity of VTA neurons.
The statistical analyses were based on the recording of 468 VTA neurons in 18 sessions from 4 animals. Cells were considered stimulus sensitive if their firing rate around at least one of the tweezers events was significantly different (t-test, P < 0.05) from baseline. The activity was computed 3 s before and after the stimulus delivery, and baseline activity was assessed at least 6 s before each event. Of the total number of cells recorded, 44.2% (207/468) were stimulus sensitive, and 36.5% (171/468) were not stimulus sensitive (for the stimuli chosen here, see Fig. 1D). Cells firing at 20 Hz or more during at least 10 min in Rest-1 and with a peak-to-trough amplitude ratio close to 1 (putative GABAergic cells) were excluded from the analyses (12.8%, 60/468, Fig. 1D). Cells with firing rates <0.5 Hz were also excluded (6.4%, 30/468).
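As an illustration of the sensitivity criterion, the sketch below compares per-trial firing rates around an event with per-trial baseline rates. It assumes a paired, two-sided t-test across 15 trials; the text states only "t-test, P < 0.05", so the pairing, the hard-coded critical value for 14 degrees of freedom, the function names, and the example rates are all assumptions made for this example.

```python
import math
import statistics

# Two-sided critical t value for P < 0.05 with df = 14 (15 paired trials).
T_CRIT_DF14 = 2.145


def paired_t_statistic(event_rates, baseline_rates):
    """Paired t statistic for per-trial event vs. baseline firing rates (Hz)."""
    diffs = [e - b for e, b in zip(event_rates, baseline_rates)]
    mean = statistics.fmean(diffs)
    sd = statistics.stdev(diffs)
    if sd == 0:
        return math.copysign(math.inf, mean) if mean != 0 else 0.0
    return mean / (sd / math.sqrt(len(diffs)))


def is_stimulus_sensitive(event_rates, baseline_rates, t_crit=T_CRIT_DF14):
    """True if the event response differs from baseline in either direction."""
    return abs(paired_t_statistic(event_rates, baseline_rates)) > t_crit


# Hypothetical per-trial rates (Hz) for one cell and one stimulus, 15 trials.
baseline = [4.0, 5.0, 4.5, 5.5, 4.0, 5.0, 4.5, 5.5, 4.0, 5.0, 4.5, 5.5, 4.0, 5.0, 4.5]
responsive = [b + 3.0 + 0.5 * (i % 3) for i, b in enumerate(baseline)]
flat = [b + (0.5 if i % 2 else -0.5) for i, b in enumerate(baseline)]

print(is_stimulus_sensitive(responsive, baseline), is_stimulus_sensitive(flat, baseline))
```

Under the criterion in the text, a cell would be labeled stimulus sensitive if this test were significant for at least one of the stimuli; with a different number of trials the critical value would need to change accordingly.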
Sixty of the 207 stimulus-sensitive neurons previously identified were also tested with a quinine-flavored pellet. However, the classification of stimulus sensitivity was performed on the basis of the cell responses to three events (food, sugar, and empty) to prevent the emergence of aversive behaviors of the rat to food obtained from the tweezers. Among the neurons that were tested with quinine, we found that 15% (9/60) responded to only one of the four events (food, empty, sugar, and quinine): 44% (4/9) of them were sensitive only to empty, 33% (3/9) only to sugar, and 22% (2/9) only to regular food. We did not detect any neurons selective to quinine only. The remaining 85% (51/60) of stimulus-sensitive neurons showed more complex responses to the different stimuli: 35% (18/51) were sensitive to all four stimuli (but with possibly different signs), and 18% (9/51) were sensitive to the two negative stimuli, quinine and empty. Other neurons were sensitive to three of the four conditions: 10% (5/51) to all but sugar; 10% (5/51) to all but empty; 8% (4/51) to all but regular food; and 6% (3/51) to all but quinine. The remaining cells were sensitive to other pairs of stimuli: 6% (3/51) to empty and food; 4% (2/51) to food and quinine; 2% (1/51) to food and sugar; and 2% (1/51) to sugar and quinine. A graphical representation of the proportion of cells responsive to one, two, three, and four of the stimuli is given in Fig. 1D.
In sum, our analyses were performed on 207 stimulus-sensitive neurons and 171 stimulus-nonsensitive neurons in 4 animals.
The physiological characterization of dopaminergic cells in vivo is debated (Grace and Bunney 1983; Margolis et al. 2006, 2008; Roesch et al. 2007; Ungless and Grace 2012). Figure 2A shows a typical dopamine neuron with triphasic waveform recorded with our technique. For comparison, we recorded under chloral hydrate anesthesia in conditions similar to previously published work (Corral-Frias et al. 2013; Grace and Bunney 1983) (Fig. 2, B and C). While our recordings showed typical long duration dopaminergic triphasic waveforms, not all putative dopamine VTA stimulus-sensitive cells fell in this category (Fig. 2D). In two sessions where 45 VTA neurons were recorded, the animals received an intraperitoneal injection of apomorphine (0.75 mg/kg). Out of these 45 cells, 33 were putative dopaminergic cells and were analyzed further. These neurons were classified as apomorphine responsive by comparing (t-test, P < 0.01) their average firing rate during 10 min (120 bins, 5 s each) before and 30 min (360 bins, 5 s each) after the apomorphine injection. In these sessions, we found that 18 out of 19 stimulus-sensitive neurons were apomorphine sensitive. Fifteen showed a significant decrease in firing rate in response to apomorphine injection and only three of them increased their firing rates, in agreement with a previous report (Roesch et al. 2007) (Fig. 2E). Out of 14 stimulus-nonsensitive cells, 5 neurons showed a significant decrease in firing rates, 2 showed an increase in firing rate and 7 were not responsive. In sum, 79% (15/19) of stimulus-sensitive cells were inhibited by apomorphine and only 35% (5/14) of the stimulus-nonsensitive cells were inhibited by apomorphine. The apomorphine effect started in the stimulus-sensitive neurons ∼10 min after injection and lasted up to 30 min (Fig. 2F). 
Note that it was not possible to systematically inject the animals with apomorphine after each session because this drug might have interfered with the behavioral and cognitive state of the animals in subsequent days.
GABA-ergic cells were readily identifiable by their baseline firing rate and spike shape during the Rest-1 period (above 20 Hz in resting condition) and peak-to-trough ratio close to 1 (Fig. 2, G and H). The study of these cells is left for further work.
In sum, while it is clear that our recordings contained typical VTA DA neuron waveforms and typical VTA neurons responsive to apomorphine, not all putative DA cells could be unequivocally classified using these criteria. In addition, these criteria did not significantly correlate with stimulus sensitivity (Fig. 2D). We also note that recent work using optogenetic phototagging and Cre-driver lines has provided important tools for the identification of VTA DA cells (Cohen et al. 2012; Witten et al. 2011). These tools were beyond the scope of this study. Taken together, the electrophysiological characteristics, pharmacological responses, histology, and known stereological estimates of the PBP nucleus of the VTA [70% DA, 25% GABA (Nair-Roberts et al. 2008)] suggest that the non-GABAergic cells recorded were putative dopaminergic cells.
The correct position of the electrode tips was confirmed in all animals by electrolytic lesions on each of the electrodes (5–10 μA, 10-s positive to electrode, negative to ground). As indicated above, of the 12 tetrodes targeted to VTA, per session, we had an average of 9.6 ± 1.8 tetrodes with usable cells and 6.5 ± 1.9 tetrodes containing at least one stimulus-sensitive cell (n = 18 sessions). In one additional animal, the tetrodes did not reach the VTA as assessed by histological analyses (200 μm away from the VTA border); this animal did not show any cells that could be classified as stimulus sensitive according to our criteria, and these data were not included in the analysis.
At the end of each experiment, animals were deeply anesthetized with Nembutal (100 mg/kg) and perfused through the left ventricle with a saline flush (200 ml) followed by 250 ml of 4% paraformaldehyde in 0.1 M phosphate buffer (pH 7.4). After the brain was removed, it was postfixed in the same fixative for 2 h and then transferred to a solution of 30% sucrose in PBS (phosphate buffer 0.01 M, NaCl 0.9%) with 0.02% sodium azide for 2 days or until it sank to the bottom of the tube. Brains were then blocked in the coronal plane and subsequently cut with a cryostat set for a thickness of 50 μm. The tissue was processed for Nissl staining and tyrosine hydroxylase (TH) immunohistochemistry. For TH immunohistochemistry, free-floating sections were incubated in 0.3% H2O2 in PBS for 30 min, rinsed in PBS, and transferred to the blocking solution (0.4% Triton X-100, 0.02% sodium azide, and 3% normal goat serum in PBS) for 1 h. The sections were then transferred to the primary antibody incubation solution overnight at room temperature. This incubation solution contained an anti-TH rabbit polyclonal antibody, from Chemicon International, which was diluted 1:10,000 in the blocking solution. The sections were then rinsed in PBS for 1 h before being incubated in the secondary antibody solution [Biotin-SP-conjugated AffiniPure goat anti-rabbit IgG (H + L) from Jackson ImmunoResearch; diluted 1:1,000 in 0.4% Triton X-100 and 1.5% normal goat serum in PBS].
After being rinsed for 40 min, the sections were incubated for 1 h in Vectastain ABC Elite kit (Vector Laboratories) diluted 1:500 in PBS, rinsed, and incubated in a 0.05% 3,3'-diaminobenzidine hydrochloride (DAB) solution containing 0.003% H2O2 in PBS for 8 min. Finally, the slices were rinsed for 20 min in PBS, mounted, and dehydrated for further analyses under regular light microscopy. An example of Nissl and TH staining and electrode localizations can be found in Fig. 1B, left and right, respectively.
Calculation of reactivation.
The derivation of the measure of reactivation was previously described (Kudrimoti et al. 1999). We used the explained variance (EV) method and computed the partial correlation coefficient between a task (Task) and the following period of resting (Rest-2), accounting for any pretask correlations (Rest-1). The following steps were taken:
1) Select cells that do not belong to the same tetrodes to prevent any possible artifactual cross correlation due to spike sorting errors. Fast spiking cells (>20 Hz) were excluded.
2) For task, Rest-1 and Rest-2 compute the correlation matrix:
For each cell, compute the firing histogram (bin size of 100 ms unless otherwise noted) in the epoch.
Compute the cross correlation matrix (N × N) of the firing histograms. This step is modified below using a smoothed similarity function rather than cross correlation (Kruskal et al. 2007). This formulation gives a more stable and more reliable estimate of the EV measure and is further detailed below.
3) Select the upper triangle values (excluding diagonal) of all three matrices and vectorize (R1, T, R2).
4) Compute EV/reverse EV (REV).
Compute the correlations: c1T = Cor(R1,T), c2T = Cor(R2,T), and c12 = Cor(R1,R2).
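Compute EV and REV as:

$$\mathrm{EV} = \left( \frac{c_{2T} - c_{1T}\, c_{12}}{\sqrt{\left(1 - c_{1T}^{2}\right)\left(1 - c_{12}^{2}\right)}} \right)^{2}, \qquad \mathrm{REV} = \left( \frac{c_{1T} - c_{2T}\, c_{12}}{\sqrt{\left(1 - c_{2T}^{2}\right)\left(1 - c_{12}^{2}\right)}} \right)^{2}$$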
The task epoch included all spikes. Rest-2 encompassed the first 10 min spent in the holding flower pot. The data extracted from Rest-1 were always of the same duration as in Rest-2 and were taken immediately before the task was started, while the rat was in the towel-lined pot, before the experimenter entered the room. To assess the significance of the measure, the Rest-1 epoch was extracted by translating a 10-min window backward in time by steps of 2 min at least 10 times to generate 10 values for EV and REV, which were then used to compute the mean and standard deviation of EV and REV values.
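Steps 2–4 of the procedure can be sketched as follows. This is a minimal illustration that assumes the standard partial-correlation form of EV and REV from Kudrimoti et al. (1999); the function names and the toy vectors are hypothetical, and the correlation vectors R1, T, and R2 are supplied directly rather than derived from recorded spikes.

```python
import math


def bin_spikes(spike_times_s, start_s, stop_s, bin_s=0.1):
    """Firing histogram (spike counts per bin) for one cell in one epoch."""
    n_bins = int(round((stop_s - start_s) / bin_s))
    counts = [0] * n_bins
    for t in spike_times_s:
        if start_s <= t < stop_s:
            counts[min(int((t - start_s) / bin_s), n_bins - 1)] += 1
    return counts


def pearson(x, y):
    """Pearson correlation coefficient (assumes neither input is constant)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def pairwise_vector(histograms):
    """Upper triangle (diagonal excluded) of the cell-by-cell correlation matrix."""
    n = len(histograms)
    return [pearson(histograms[i], histograms[j])
            for i in range(n) for j in range(i + 1, n)]


def ev_rev(r1, t, r2):
    """EV and REV from the vectorized Rest-1, Task, and Rest-2 matrices."""
    c1t, c2t, c12 = pearson(r1, t), pearson(r2, t), pearson(r1, r2)
    ev = ((c2t - c1t * c12) / math.sqrt((1 - c1t ** 2) * (1 - c12 ** 2))) ** 2
    rev = ((c1t - c2t * c12) / math.sqrt((1 - c2t ** 2) * (1 - c12 ** 2))) ** 2
    return ev, rev


# Toy pairwise-correlation vectors: Rest-2 partly resembles Task, Rest-1 does not.
T = [0.5, -0.5, 0.5, -0.5]
R1 = [0.5, 0.5, -0.5, -0.5]
R2 = [0.8, -0.8, 0.0, 0.0]
ev, rev = ev_rev(R1, T, R2)
print(round(ev, 3), round(rev, 3))
```

In this toy case Rest-1 is uncorrelated with both other epochs, so EV reduces to the squared Task/Rest-2 correlation (0.5) and REV is zero.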
As controls, the interspike intervals of all cells were randomly shuffled 50 times and 50 EV/REV values were generated. We confirmed that the EV and REV values thus obtained were statistically identical (t-test, P > 0.05) and very low (<0.1). The calculations above were similar to those performed in other studies, although variation in bin size is rarely done (data not shown). This result indicates that, using the traditional methods, care needs to be taken in choosing the bin size of the analysis and several shifts of Rest-1 are needed to derive a meaningful EV/REV value.
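The shuffling control can be sketched as below. The sketch assumes that each surrogate train keeps the first spike time and the full set of interspike intervals while randomizing their order; the text does not specify these details, and the function names and the example spike train are hypothetical.

```python
import random


def shuffle_isis(spike_times_s, rng):
    """Surrogate train with the same ISIs in random order, same first spike."""
    times = sorted(spike_times_s)
    if len(times) < 2:
        return list(times)
    isis = [b - a for a, b in zip(times, times[1:])]
    rng.shuffle(isis)
    surrogate = [times[0]]
    for isi in isis:
        surrogate.append(surrogate[-1] + isi)
    return surrogate


def make_surrogates(spike_times_s, n_shuffles=50, seed=0):
    """Generate n_shuffles surrogate trains (50 in the control described above)."""
    rng = random.Random(seed)
    return [shuffle_isis(spike_times_s, rng) for _ in range(n_shuffles)]


# Hypothetical spike train (seconds).
train = [0.0, 0.1, 0.15, 0.4, 0.42, 0.9]
surrogates = make_surrogates(train)
print(len(surrogates), len(surrogates[0]))
```

Each surrogate preserves the firing rate and ISI distribution of the cell but removes the temporal relationship with other cells, so EV and REV computed on surrogates estimate the chance level.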
We systematically varied the time of onset and the length of the Rest-2 period on nine datasets. EV values were strongest when Rest-2 started immediately after the task and was 6–9 min in duration (data not shown).
An alternative method for EV computation has been proposed (Kruskal et al. 2007). This method is similar to that described above, except that step 2 involves a binless similarity measure (rather than a correlation). A similarity matrix (values between −1 and 1) is computed. Briefly, spike trains in each epoch are convolved with a Gaussian kernel of width sigma (note that for efficiency, the convolution is achieved analytically). The similarity between two spike trains is computed as the normalized dot product of the curves thus obtained (i.e., the cosine of the vectors constituted by the continuous curves). The sole parameter of this measure is sigma, the width of the Gaussian kernel. We computed the sensitivity of the EV measure to variation in the width of the kernel [the width has been divided by sqrt(12) to allow for direct comparison with the “traditional” EV computation (Kruskal et al. 2007)]. We found that the variability of the measure was greatly reduced and that it converged rapidly to a steady-state value (not shown). With this measure an effective width of 100 ms was sufficient. The classic and smooth EV/REV values were computed across nine sessions of the task, using the population of stimulus-sensitive cells. No statistically significant differences were detected (t-test, P > 0.05) between the two methods (average ± SE of smooth EV = 0.34 ± 0.06, classic EV = 0.33 ± 0.05; average ± SE of smooth REV = 0.07 ± 0.03, classic REV = 0.06 ± 0.01). The use of a similarity measure rather than a correlation to measure reactivation eliminated the problem of time bin-size sensitivity and did not significantly change the absolute value of the reactivation computed using the traditional method with 50 shufflings; however, it saved a significant amount of computational time.
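The binless similarity can be illustrated with a short sketch. It relies on the fact that the inner product of two Gaussian-smoothed spike trains has a closed form: the integral of two Gaussians of width sigma centered on spikes s and t is proportional to exp(−(s − t)²/(4 sigma²)), and the proportionality constant cancels in the normalized dot product. This is an uncentered version that yields values in [0, 1]; the −1 to 1 range quoted in the text presumably reflects details of the published method that are not reproduced here. Function names and spike times are hypothetical.

```python
import math


def smoothed_inner(a, b, sigma):
    """Inner product of two Gaussian-smoothed spike trains, up to a constant."""
    return sum(math.exp(-((s - t) ** 2) / (4.0 * sigma ** 2)) for s in a for t in b)


def similarity(a, b, sigma):
    """Normalized dot product (cosine) of the two smoothed spike trains."""
    denom = math.sqrt(smoothed_inner(a, a, sigma) * smoothed_inner(b, b, sigma))
    return smoothed_inner(a, b, sigma) / denom if denom > 0 else 0.0


def sigma_from_bin_width(width_s):
    """Kernel SD matching a rectangular bin of the given width (width / sqrt(12))."""
    return width_s / math.sqrt(12.0)


sigma = sigma_from_bin_width(0.1)        # effective width of 100 ms
a = [0.10, 0.50, 0.90]                   # hypothetical spike times (s)
jittered = [0.11, 0.51, 0.91]            # same train shifted by 10 ms
distant = [10.10, 10.50, 10.90]          # same pattern 10 s later

print(round(similarity(a, jittered, sigma), 3), similarity(a, distant, sigma))
```

The double loop over spike pairs is quadratic in the number of spikes; for long recordings one would restrict the sum to pairs closer than a few sigma, since more distant pairs contribute essentially nothing.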
Characterization of sleep and awake epochs.
To determine whether rats were awake or asleep, we analyzed the EMG signals from the neck muscle electrodes and the EEG signals from the electrodes implanted in the contralateral hippocampus. To determine whether specific epochs corresponded to rapid eye movement (REM) or non-REM (NREM) sleep, the power spectral density (PSD) of the EEG signals was computed between 0.5 and 30 Hz on sliding 15-s epochs, using the Neuroexplorer software. Epochs in which the peak of the PSD was <4 Hz were considered NREM sleep, and epochs in which the peak of the PSD was around 8 Hz (±2) were considered REM sleep.
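The PSD-peak rule for separating NREM from REM epochs can be sketched as follows. This is not the Neuroexplorer computation: the sketch substitutes a direct Fourier projection on a 0.5-Hz frequency grid, ignores the EMG-based separation of sleep from wakefulness, and uses synthetic sine waves in place of EEG. The function names, the sampling rate, and the "unclassified" label are assumptions made for this example.

```python
import math


def peak_frequency(signal, fs_hz, f_min=0.5, f_max=30.0, f_step=0.5):
    """Frequency (Hz) with the largest spectral power on a coarse grid."""
    best_f, best_power = None, -1.0
    n_steps = int(round((f_max - f_min) / f_step))
    for k in range(n_steps + 1):
        f = f_min + k * f_step
        re = sum(x * math.cos(2 * math.pi * f * i / fs_hz) for i, x in enumerate(signal))
        im = sum(x * math.sin(2 * math.pi * f * i / fs_hz) for i, x in enumerate(signal))
        power = re * re + im * im
        if power > best_power:
            best_f, best_power = f, power
    return best_f


def classify_sleep_epoch(peak_hz):
    """NREM if the PSD peak is below 4 Hz, REM if it lies within 8 +/- 2 Hz."""
    if peak_hz < 4.0:
        return "NREM"
    if 6.0 <= peak_hz <= 10.0:
        return "REM"
    return "unclassified"


# Synthetic 15-s epochs sampled at 100 Hz: a 2-Hz (delta) and an 8-Hz (theta) wave.
fs = 100.0
n = int(15 * fs)
delta_epoch = [math.sin(2 * math.pi * 2.0 * i / fs) for i in range(n)]
theta_epoch = [math.sin(2 * math.pi * 8.0 * i / fs) for i in range(n)]

print(classify_sleep_epoch(peak_frequency(delta_epoch, fs)),
      classify_sleep_epoch(peak_frequency(theta_epoch, fs)))
```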
To interpret the ensemble activity in response to a stimulus delivery, we built a matrix of the pattern of neural activity displayed by all stimulus-sensitive cells recorded simultaneously in one session in response to each of the four stimuli previously described (“f,” “s,” “q,” and “e”). The matrix consisted of the average firing rate 1 s before and 1 s after the stimulus delivery, across all trials used in the session. This 2-s matrix was divided into 40 bins of 50 ms and vectorized, generating a template for each stimulus (see Fig. 5). The correlation coefficient was then computed between this template and sliding windows of the same time span before, during, and after the task. The confidence interval was built by shuffling the position of each neuron within the template 25 times and recomputing the correlation coefficient between each of the 25 shuffled templates and the current activity of those cells. The mean ± 2 SD of the values obtained with the 25 shuffled templates defined the confidence interval. Peaks that crossed these values were considered significant correlations between the template and the patterns of activity observed in the data. We also systematically varied the time scale of the patterns (multiplicative factors: 0.5, 1, 2, 3, 4, 5, 8, 16) and collected the correlation values during Rest-2. The highest values were obtained for 1, the no-compression case (data not shown).
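The template-matching procedure can be sketched on toy data as follows. The sketch assumes that the template and the ongoing activity are both cell-by-bin matrices of firing rates, that each window is vectorized before correlation, and that the shuffle permutes whole cells (rows) of the template; the function names, matrix sizes, and values are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy) if sxx > 0 and syy > 0 else 0.0


def flatten(matrix):
    return [v for row in matrix for v in row]


def template_match(template, activity):
    """Correlation between the template and every sliding window of activity."""
    width, total = len(template[0]), len(activity[0])
    tvec = flatten(template)
    return [pearson(tvec, flatten([row[s:s + width] for row in activity]))
            for s in range(total - width + 1)]


def shuffle_interval(template, activity, n_shuffles=25, seed=0):
    """Mean +/- 2 SD of the match values from templates with shuffled cell order."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_shuffles):
        rows = template[:]
        rng.shuffle(rows)
        values.extend(template_match(rows, activity))
    mean = sum(values) / len(values)
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (len(values) - 1))
    return mean - 2 * sd, mean + 2 * sd


# Toy data: 3 cells, a 4-bin template embedded at bin 6 of a flat background.
template = [[0, 1, 2, 1], [2, 1, 0, 0], [0, 0, 1, 3]]
activity = [[1] * 6 + row + [1] * 2 for row in template]
curve = template_match(template, activity)
low, high = shuffle_interval(template, activity)
print(curve.index(max(curve)), round(max(curve), 3))
```

To probe temporal compression, the template would be resampled by each multiplicative factor before matching; that step is omitted here.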
Dopaminergic cells are broadly tuned to taste stimuli.
Simultaneous single unit recordings in the VTA were obtained using tetrodes in four rats in a task that consisted of the delivery of food pellets (CS) with a pair of tweezers (US) in a towel-lined flowerpot. Two stimuli were of positive valence (25-mg food or sugar pellets) and two stimuli were of mild negative valence (25-mg quinine-flavored food pellets or empty tweezers, see materials and methods).
Cells were classified as stimulus-sensitive if they responded to at least one of the four stimuli by altering their firing rates compared with baseline across 15 trials (Fig. 1, C and D). Additional classification details are given in materials and methods.
To measure the reactivation of VTA activity patterns, simultaneous recordings were obtained from 12 tetrodes independently targeted to the PBP division of VTA (Nair-Roberts et al. 2008) (Fig. 1). With the techniques used here, it was physiologically possible to identify typical dopaminergic triphasic waveforms (Fig. 2, A–C) (Corral-Frias et al. 2013; Ungless and Grace 2012); however, not all putative dopamine stimulus-sensitive cells fell into this classical category (Grace and Bunney 1983; Margolis et al. 2006, 2008) (Fig. 2D). Putative interneurons were clearly identified by their symmetric waveforms and high firing rates (>20 Hz; Fig. 2, G and H). Here we present data on putative dopaminergic VTA cells only. Because the PBP division of the VTA contains ∼70% dopaminergic neurons and 25% GABAergic neurons, non-GABAergic cells are likely to be dopaminergic cells (Nair-Roberts et al. 2008). The location of the tetrodes was marked by current injection, and the brains were processed using Nissl stain and TH immunohistochemistry to confirm the positions of the electrodes (Fig. 1B). In a few sessions, intraperitoneal injections of apomorphine were given. As shown by others (Roesch et al. 2007), apomorphine decreased the firing of the putative dopamine stimulus-sensitive cells, increasing the confidence of their dopaminergic nature (Fig. 2, E and F; additional details on the physiological characterization of all cells are given in materials and methods) (Corral-Frias et al. 2013).
Putative dopaminergic neurons reactivate during posttask rest periods.
Spike trains from 10 simultaneously recorded, stimulus-sensitive, putative dopamine neurons are shown in Fig. 3A before, during, and after the task. During the pretask resting period (Rest-1), neurons did not show any clear pattern of coordinated activity; however, during the task period, when different stimuli were given to the animal, we observed clear patterns of activity throughout the entire population. In this representative dataset, different population-wide activity patterns were observed in response to the delivery of different stimuli, as a consequence of the selectivity of the individual cells [i.e., compare the 2 patterns resulting from regular food “f” delivery (arrows) to the 2 patterns resulting from quinine “q” or sugar “s” pellets]. These qualitative observations are quantified below (see Figs. 5 and 6). The occurrence of patterns resembling those elicited during the task continued during the rest period immediately after the task (Rest-2), even though no experimenter was in the room and no stimulus was delivered (compare with Rest-1). These data suggest that VTA stimulus-sensitive cells reactivate during the posttask rest period. The overall firing rates of these neurons during the Rest-2 and Task periods were similar (Fig. 3C) but significantly different from that during Rest-1, even though the amount and architecture of sleep were similar between Rest-1 and Rest-2 (Fig. 3B). Further analyses of the patterns of spiking revealed that the increase in firing rate was due to an increase in the incidence of bursts but not in the number of spikes per burst (data not shown), which was similar to that found in other studies (Dahan et al. 2007) and compatible with optogenetic studies (Tsai et al. 2009).
Reactivation involves the coordinated population-wide replay of activity.
To quantify these reactivation episodes, we computed the pairwise cross correlation between all putative dopamine stimulus-sensitive cells simultaneously recorded. Most of these correlations were broad, occurred near zero lag, and persisted in Rest-2 (Fig. 4A). A graphical representation of the cross correlograms of multiple pairs of stimulus-sensitive neurons is shown in Fig. 4B (for clarity, only the 45 cross correlograms with the strongest overall correlation peaks during the task are shown). In this graph, each row is a color-coded representation of the cross correlogram between two cells recorded simultaneously (as in Fig. 4A). The pairs are ordered according to the time of occurrence of the peak of their cross correlograms (red) during the task (Euston et al. 2007). The same pairs, plotted in the same order, are used to display the correlations during the rest periods before and after the task. A curve is drawn through all the peaks of all the correlograms (black dot-dash curve in Fig. 4B, top, middle, and bottom). This analysis shows that there is no clear similarity between the pattern of cross correlations before and during the task (left and middle); however, the population-wide pattern of cross correlations induced by the task is clearly present in the second rest period, around zero lag, in a subset of cell pairs (arrow, cell pairs 17–35). A similar analysis for stimulus-nonsensitive putative dopaminergic cells showed that the pattern of correlation during the task was not reproduced in Rest-2 (data not shown).
The patterns of firing during reactivation episodes broadly match those during the task and occur mainly during SWS and quiet awakeness.
To assess the specificity of each reactivation episode, we next studied the time of occurrence of the population-wide patterns of activity using a template-matching technique (Tatsuno et al. 2006). We observed rich dynamics in the population response from trial to trial. In some cases, and for a subpopulation of cells, the activity shifted from being highest poststimulus delivery in the initial trials to being slightly predictive in the later trials (Fig. 5A). To achieve a robust quantification of reactivation, we first computed the average pattern of activity of the population of putative dopamine stimulus-sensitive cells recorded for a given stimulus during the task (e.g., “f,” 15 trials, in Fig. 5A). This template, temporally uncompressed, was then applied to the entire recording, and the instantaneous similarity between the template and the data was plotted as a continuous curve. The SD of the similarity was measured across trials. Figure 5B shows the result of this procedure before and during the task. In this example, the template yielded strong matches (greater than 2× SD) during the task each time a food pellet “f” or a sugar pellet “s” was delivered but not when the empty “e” or quinine “q” stimuli were presented. This result indicated that the “f” pattern, while broad enough to signal both an “f” and, to a lesser extent, an “s” stimulus, was selective enough not to be triggered by the “e” and “q” events (see also Fig. 6B). The peaks of the correlation between the template and events during the task cannot be explained by a pure motor component, because when a nonstimulus-driven spontaneous movement of the animal occurred, the correlation between the template and the current neural activity dropped to values close to zero (circle on Fig. 5B, task, bottom), and characteristic EMG and EEG traces could be identified (Fig. 5C).
No clear template similarity episode was evident in the rest period before the task (the curve mostly stays within ±2× SD, Fig. 5B, top). Rest-2, however, showed a complex temporal pattern of “f”-specific reactivation episodes evidenced by multiple crossings of the +2× SD line (Fig. 5B, bottom). The time scale of the template was systematically varied by a factor of 0.5 to 16, and the highest similarity values were obtained for a factor of 1, indicating that there was no temporal compression (data not shown).
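The template-matching procedure can be sketched as follows. This is an illustrative sketch under stated assumptions, not the implementation used here: the similarity is taken to be the Pearson correlation between the flattened template and each same-size window of a binned firing-rate matrix, the threshold is supplied by the caller (for example 2× SD), and all function names, as well as the convention that a factor above 1 stretches the template, are hypothetical.

```python
import numpy as np

def build_template(rates, onsets, width):
    """Average the (n_cells, width) windows starting at each stimulus onset bin."""
    return np.mean([rates[:, t:t + width] for t in onsets], axis=0)

def template_similarity(rates, template):
    """Pearson correlation between the template and every same-size window of rates."""
    width = template.shape[1]
    t = template.ravel()
    t = t - t.mean()
    t_norm = np.linalg.norm(t)
    n_windows = rates.shape[1] - width + 1
    sim = np.zeros(n_windows)
    for k in range(n_windows):
        w = rates[:, k:k + width].ravel()
        w = w - w.mean()
        denom = t_norm * np.linalg.norm(w)
        # A flat window or template has no defined correlation; report 0.
        sim[k] = np.dot(t, w) / denom if denom > 0 else 0.0
    return sim

def upward_crossings(sim, threshold):
    """Indices at which the similarity curve rises above the threshold."""
    above = sim > threshold
    return np.flatnonzero(above[1:] & ~above[:-1]) + 1

def rescale_template(template, factor):
    """Stretch (factor > 1) or compress (factor < 1) the template in time.

    Uses linear interpolation along the time axis; the direction convention
    is an assumption made for this sketch.
    """
    width = template.shape[1]
    new_width = max(2, int(round(width * factor)))
    old_t = np.linspace(0.0, 1.0, width)
    new_t = np.linspace(0.0, 1.0, new_width)
    return np.vstack([np.interp(new_t, old_t, row) for row in template])
```

A test for temporal compression would then repeat `template_similarity` with templates produced by `rescale_template` for each factor and compare the peak similarity values across factors.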
Figure 6A shows a further analysis of the Rest-2 period in Fig. 5B, bottom. For clarity, each crossing is depicted by an open circle and the sleep stages are indicated. An analysis of the EEG and EMG records showed that almost all reactivation episodes occurred during SWS or quiet awake states (Fig. 6, A and C; 58.03 ± 8.27% and 26.71 ± 10.71%, respectively; means ± SD, computed with respect to the total amount of time in these epochs). Across experiments and rats, the quinine-flavored stimulus (see also neural responses on Fig. 3) displayed the largest number of specific reactivation episodes relative to any other stimulus (Fig. 6D).
Figure 6B shows the average correlation coefficients between the “f” matrix template and all four stimulus-delivery events during the task. The correlation exhibits the highest values for “f,” as expected, and lower values for “s,” “e,” and “q,” in that order. Additional analyses were conducted taking the “s,” “q,” and “e” average matrices as templates to fully validate the procedure (data not shown). In our dataset, the number of stimulus-selective dopaminergic neurons considered per template was 10 ± 3.1.
Reactivation is mainly driven by stimulus-sensitive cells.
To quantify these results further across all experiments and across all animals, we used a smoothed, binless version of the explained variance (EV) method (Kruskal et al. 2007; Kudrimoti et al. 1999). EV is a measure of the extent to which a pattern of zero-lag pairwise correlations (as in Fig. 4B) in the Rest-2 epoch is similar to that in the task epoch, factoring out preexisting correlation patterns (Rest-1). This smoothed version is more robust and more stable than the original measure (data not shown). Reactivation is significant when EV is significantly different from the reverse explained variance (REV), calculated when the Rest-2 and Rest-1 periods are swapped (Kudrimoti et al. 1999; Pennartz et al. 2004). EV was first calculated using the entire population of putative dopamine VTA neurons, consisting of the stimulus-sensitive and stimulus-nonsensitive neurons, across multiple rats and sessions (18 sessions). Control sessions consisting of 2 h of rest were conducted to obtain estimates of the explained variance values expected when no task was given to the animals (6 sessions). These control sessions were analyzed using the same amount of time for Rest-1, Rest-2, and Task as in the actual experiments. Nine additional shufflings of these three epochs per control session were conducted to yield 54 control datasets. Figure 7, A–C, shows the average ± SE of EV values during the task. The EV for all putative dopamine VTA cells was significantly different from REV when both stimulus-sensitive and stimulus-nonsensitive cells were considered together (Fig. 7A). A systematic variation of the duration and time of onset of the Rest-2 period used in the data analyses showed that a duration of 7–8 min and a 0-min delay were the most effective at yielding a stable EV measurement (data not shown).
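For readers unfamiliar with the measure, the following Python sketch shows the classic binned form of EV and REV as a squared partial correlation between epoch-wise correlation patterns (after Kudrimoti et al. 1999). It is not the smoothed, binless variant used in this study, and the input layout (cells × time bins of binned spike counts, with nonzero variance for every cell) and function names are assumptions made for illustration.

```python
import numpy as np

def pairwise_correlations(counts):
    """Zero-lag Pearson correlations for all cell pairs (upper triangle).

    counts: array of shape (n_cells, n_bins) of binned spike counts.
    """
    c = np.corrcoef(counts)
    iu = np.triu_indices_from(c, k=1)
    return c[iu]

def explained_variance(task, control, target):
    """Squared partial correlation between the task and target correlation
    patterns, factoring out the control pattern."""
    r_task_target = np.corrcoef(task, target)[0, 1]
    r_task_control = np.corrcoef(task, control)[0, 1]
    r_target_control = np.corrcoef(target, control)[0, 1]
    partial = (r_task_target - r_task_control * r_target_control) / np.sqrt(
        (1 - r_task_control ** 2) * (1 - r_target_control ** 2)
    )
    return partial ** 2

def ev_rev(rest1, task, rest2):
    """Return (EV, REV) for one session.

    EV factors the Rest-1 pattern out of the Task/Rest-2 similarity;
    REV swaps the roles of Rest-1 and Rest-2.
    """
    c1, ct, c2 = (pairwise_correlations(x) for x in (rest1, task, rest2))
    ev = explained_variance(ct, control=c1, target=c2)
    rev = explained_variance(ct, control=c2, target=c1)
    return ev, rev
```

Under this formulation both values lie between 0 and 1, and reactivation would be inferred when EV exceeds REV across sessions; the statistical comparison itself is not part of the sketch.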
To determine whether reactivation is driven by the correlations involving the stimulus-sensitive neural population, and not just a general property of all VTA neurons, the analysis was repeated for stimulus-sensitive cells only (SS-cells, Fig. 7B), and for stimulus-nonsensitive cells separately (SnS-cells, Fig. 7C). Most of the reactivation was due to correlations within the stimulus-sensitive cell population (Fig. 7B). These data show no significant reactivation during the control sessions (no difference between EV and REV; n = 54), indicating that, in the time frame of our experiment, no VTA reactivation occurred if no stimuli were given to the animals. These results thus show that patterns of activity within putative dopamine stimulus-sensitive cells are reactivated after a period of wakefulness, but only if effective stimuli are delivered.
Recent evidence indicates that VTA firing may be modulated by velocity and acceleration (Puryear et al. 2010). Although our task was conducted within a flower pot to minimize possible influences of rat movement on reactivation, we nevertheless calculated the percentage of time spent motionless during the Rest-1, Task, and Rest-2 periods (Fig. 7D). There was no significant difference in the amount of motionless behavior between the two Rest epochs, but there was much more motionless behavior in these periods than during Task, as expected. We recomputed the EV and REV in the same sessions but separated movement and nonmovement periods and verified that the EV/REV values with or without movement were not statistically different and that the relative differences between EV and REV were preserved and remained statistically different (Fig. 7E).
The results show that putative dopamine VTA neurons in the rat reactivated stimulus-induced activity patterns during rest periods immediately following the task. Most VTA neurons were responsive to stimuli, with a large population of neurons being sensitive to specific stimuli, whether they were positively or mildly negatively valued. In this task, in which motor activity was intentionally minimized, reactivation was primarily driven by the correlations between stimulus-sensitive neurons. Reactivation episodes consisted of a coordinated near-zero-lag transient increase in firing rate within specific subsets of cells, at the same time scale at which the coactivation occurred during the task. This period of reactivation lasted at least 8 min after the task was completed and smoothly declined to control levels in the following 10–15 min. Reactivation occurred during SWS but not during REM sleep and was strongest for stimuli that were more salient for the animal (quinine-flavored pellets). Unlike reactivation in other areas such as the prefrontal cortex, reactivation in the VTA appeared on an uncompressed and longer time scale and emphasized correlations of firing rate changes rather than correlations between single spikes. Such a time scale is compatible with that of the dynamics of dopamine release and synaptic clearance.
VTA has widespread connections to multiple memory-related areas of the brain, including the hippocampus and neocortex (Gasbarri et al. 1997; Oades and Halliday 1987). Although explicit reward delivery is not necessary for memory reactivation (Tatsuno et al. 2006), most experiments demonstrating strong memory trace reactivations were based on tasks that involved rewards in animals or humans (Feld et al. 2014; Shohamy and Adcock 2010). Our results therefore suggest the possibility that these reactivating episodes may be at least in part triggered by or coordinated with VTA reactivations. Dual recordings in VTA and other areas would be required to test this hypothesis.
It has been well established that reactivation in hippocampus and cortex is a repetitive process. For example, neurons in hippocampus reactivate many times during multiple sharp waves, some patterns reactivating more often than others. The fact that VTA neurons reactivate during SWS and the fact that dopamine is well known to increase excitability and modulate synaptic plasticity suggest that this selective reactivation may influence how often specific neurons in the hippocampus and prefrontal cortex reactivate. It is possible, for example, that a dopaminergic signal could bias the local reactivations of some subpopulations of cells in the target areas, so as to progressively increase their mutual synaptic strengths and consequently increase the strengths of specific memory traces.
The selective reactivation of putative VTA dopamine cells that we report here suggests that reactivations in hippocampus and cortex could be modulated by the VTA and by the now well-documented effects of dopamine on synaptic plasticity (Edelmann and Lessmann 2013; Lisman et al. 2011). Indeed, reactivation as quantified by the EV/REV measure has been found to occur primarily within the first 15 min of sleep in hippocampus and prefrontal cortex, a time scale compatible with that of VTA reactivation (data not shown). The finding that stimulus-sensitive neurons primarily reactivated when stimuli were present during the task suggests a mechanism by which hippocampal and neocortical reactivations per se, or the consequences of these reactivations on synaptic plasticity, could be modulated by the saliency of the memory items they encode. In turn, the neocortex and hippocampus could modulate VTA activity as part of a loop in which memory content and memory saliency are dynamically established. This suggests that the phenomenon of reactivation can occur outside of the well-documented hippocampus-neocortex interaction loop and may involve neuromodulatory systems such as the VTA and its targets such as the ventral striatum (Pennartz et al. 2004).
The previous findings that VTA neurons fire for both negatively and positively valenced stimuli and that they exhibit stimulus selectivity beyond what was previously thought are compatible with our results (Cohen et al. 2012; Lammel et al. 2012; Roeper 2013). Our findings also show a bias toward negatively valued stimuli, with quinine being the event that is the most reactivated. This result suggests that reactivation was the strongest for aversive stimuli, possibly explaining recent studies showing aversive bias of overnight memory retention in the monkey amygdala, a recipient of VTA inputs (Livneh and Paz 2012).
The mechanisms by which dopamine, or neuromodulation in general, affects neural computation and plasticity are still poorly understood (Fellous et al. 2015; Fellous and Linster 1998). It is clear, however, that future conceptual and computational models of reinforcement learning and memory consolidation should take into account the type of selectivity and activation dynamics of VTA neural firing described here.
This work was supported by Pew Latin-American Fellows Program in the Biomedical Science, Iniciativa Cientifica Milenio Grant ICM P10-001-F (to J. L. Valdés), National Science Foundation Grant CRCNS 1010172 and Office of Naval Research Grant MURI-N000141310672 (to J. M. Fellous), and an Alberta Innovates-Health Solutions Polaris award (to B. L. McNaughton).
At the request of the author(s), readers are herein alerted to the fact that additional materials related to this manuscript may be found at the institutional website of one of the authors, which at the time of publication they indicate is: http://www.u.arizona.edu/~fellous/. These materials are not a part of this manuscript, and have not undergone peer review by the American Physiological Society (APS). APS and the journal editors take no responsibility for these materials, for the website address, or for any links to or from it.
No conflicts of interest, financial or otherwise, are declared by the author(s).
Author contributions: J.L.V. and J.-M.F. conception and design of research; J.L.V. performed experiments; J.L.V. and J.-M.F. interpreted results of experiments; J.L.V. prepared figures; J.L.V., B.L.M., and J.-M.F. edited and revised manuscript; J.L.V., B.L.M., and J.-M.F. approved final version of manuscript; J.L.V. and J.-M.F. analyzed data.
Copyright © 2015 the American Physiological Society