Department of Neuroscience, University of Minnesota, Minneapolis, Minnesota 55455
Submitted 17 July 2003; accepted in final form 15 January 2004
| ABSTRACT |
|---|
| INTRODUCTION |
|---|
While striatal neurons have been examined in primates during the performance of long chains of sequential movements, studies in the rodent have tended to use tasks in which rats make a single response (such as turning on a T maze, lever pressing, or head movements). Little is known about how neurons in the rodent dorsal striatum respond as rats perform more complex tasks, and an understanding of such responses may shed light on how striatal activity could be used to perform long chains of automated behavior. Many tasks that have been used in striatal recordings have strong temporal components: subjects pay attention to instruction cues and make reactions after fixed or variable delays to receive reward at some later time point. These tasks have shown that striatal neurons have responses that cover the temporal interval between the start of a trial and the receipt of reward (summarized by Schultz et al. 1995). However, less is known of what type of striatal representation will be obtained in a task that emphasizes spatial information.
In this experiment, we examined neurons in the dorsal striatum of rats during the performance of arbitrary navigational sequences. Rats ran through a complex route formed by four T choices arranged sequentially and were rewarded for correctly navigating the maze with food delivered at two different locations. The use of multiple T choices and multiple food-delivery sites allowed us to examine neural responses when rats made the same action (turning left, turning right, food consumption) in different parts of the task sequence. We have previously reported that rats running on a multiple T maze quickly eliminated errors from their path, demonstrating that rats could perform well in a single session even when presented with a sequence of T choices that they had never experienced (Schmitzer-Torbert and Redish 2002). Here, we present behavioral data from rats running novel and familiar multiple T mazes and data from neurons recorded in the dorsal striatum as rats ran the multiple T maze. Some of this work has been previously presented in abstract form (Redish et al. 2002a; Schmitzer-Torbert et al. 2002).
| METHODS |
|---|
Five male Brown-Norway-Cross rats obtained from the National Institute on Aging were used (aged 13–15 mo at the beginning of recordings). During behavioral training, rats were food-restricted, and in most cases, rats received all of their food during the experimental task. Additional food was provided following the experimental task when required to maintain a rat's weight >80% of its free-feed weight. Two of the rats had prior experience running for food on an operant conditioning task, whereas the other three rats did not. Before and during experiments, rats were handled for 15 min each day. All procedures were in accordance with National Institutes of Health guidelines for animal care and approved by the IACUC at the University of Minnesota.
Task
Rats were trained to run an elevated linear multiple T task which consisted of four T choices arranged sequentially to form a turn sequence (see Fig. 1). On either side of the turn sequence, return rails led from the end of the maze back to the beginning, so that rats ran the maze as a continuous loop. On each return rail, two automatic food dispensers (Med-Associates, St. Albans, VT) delivered pellets to locations on the track separated by 45 cm. On completion of each trial, the rat received two 45-mg pellets (Research Diets, New Brunswick, NJ) at each food-delivery location, for a total of four pellets per trial. If the rat made an incorrect turn on the final T and ran back along the wrong return rail (thus passing the incorrect pair of pellet dispensers), no pellets were delivered, and the rat had to repeat the turn sequence to finish the trial and receive food. Throughout the task, rats were blocked from moving backward on the maze but were allowed to make incorrect choices. In practice, rats tended not to turn around and were rarely blocked.
A trial was defined as the interval between successive arrivals at the second food-delivery site on the rewarded return rail. A lap was defined as each time the rat passed through the turn sequence; thus multiple laps could occur during a single trial if the rat made an incorrect choice on the final T. Rats were not removed from the track between trials or laps; rats ran the task as a continuous loop. Trials and laps were defined for analysis purposes only.
Initial training was conducted using a shortened three T maze with only three T choices. When rats were able to run the task, they began training on a five T maze, which used five T choices. They were trained on the five T maze for 1 wk. For both three T and five T maze training, the sequence of turns used for each rat was changed daily. Rats were taken off the task 2 days before surgery and given ad lib access to food. Beginning 2 days after surgery, rats ran three T mazes until they were able to run the task while connected to the recording system and tetrodes had been advanced into the striatum. Rats were then moved to a novel-1/familiar/novel-2 protocol in which they ran one four-T maze per day for 3 wk (7 mazes per condition). In the first and third week (novel maze conditions), rats were presented with a different sequence of turns each day, whereas in the second week (familiar maze condition), rats were presented with the same sequence of turns each day. The mazes in the familiar condition used the last sequence of turns presented in the novel-1 condition. To control for odor cues in the familiar maze condition, specific T choices were swapped daily, but the position of the turn sequence relative to the experimental room remained constant from day to day. T choices were also swapped daily in novel maze conditions. Sessions lasted for 40 min.
Two rats were implanted with hyperdrives before learning the multiple T task. Training for these animals began with the three T version of the task and then moved directly to the 3-wk novel-1/familiar/novel-2 protocol using four T mazes.
Surgery
Rats were implanted with 14-tetrode hyperdrives (David Kopf Instruments, Tujunga, CA) targeting the striatum. Twelve tetrodes were used to record neural activity, and two electrodes were used as references for common noise rejection. Tetrodes were constructed from four lengths of 0.0127-mm wire insulated with polyamide (Kanthal Precision Wire, Palm Coast, FL). Rats were anesthetized with Nembutal (pentobarbital sodium, 40–50 mg/kg, Abbott Laboratories, North Chicago, IL), and the area of the implantation was shaved. Rats were then placed on a stereotaxic apparatus (Kopf) and 0.1 ml Penicillin G benzathine and penicillin G procaine (dual-cillin, Phoenix Pharmaceutical, St. Joseph, MO) was injected intramuscularly into each hindlimb. During surgery, anesthesia was maintained using isoflurane (0.5–2% isoflurane vaporized in medical grade O2). The scalp was then disinfected with alcohol and swabbed with Betadine (Purdue Frederick, Norwalk, CT). The skin overlying the skull was incised and retracted, and the underlying fascia was cleared from the surface of the skull. Excess bleeding was stopped by application of hydrogen peroxide followed by cautery of the retracted fascia. Anchor screws and one ground screw were placed in the skull, and a 1.8-mm diam craniotomy was opened using a surgical trephine (Fine Science Tools, Foster City, CA). The hyperdrive was positioned over the striatum (Bregma +0.5 mm AP, 3.0 mm ML) (Paxinos and Watson 1998) and lowered to 1 mm below the surface of the skull. The craniotomy was protected using silastic (Dow Corning 3140), and the hyperdrive was secured in place with dental acrylic (Perm Reline and Repair Resin, The Hygenic Corp, Akron, OH). After surgery, 10 ml sterile saline (0.9%) was administered subcutaneously, and all tetrodes were advanced 1 mm. Animals were allowed to recover in an incubator until they were ambulatory, which was usually 12 h after surgery. Once animals were ambulatory, 0.8 ml acetaminophen (children's Tylenol) was administered orally. For 2 days after surgery, rats received water containing children's Tylenol (25 ml in 275 ml of water). Rats were allowed 2 days to recover from surgery before resuming experiments. Three animals received right-side implants, and two animals received left-side implants.
Histology
After the completion of all experiments, the location of each tetrode was marked by passing a small amount of anodal current through each tetrode (5 µA for 5 s), which causes a small lesion to form in the region of the tetrode tip. At least 48 h after gliosis, rats were killed with 1.0 ml Nembutal and perfused intracardially with saline followed by 10% formalin. Brains were removed and placed in 10% formalin overnight, then transferred to a 30% sucrose/formalin mixture for several days. Brains were sliced on a freezing microtome into 40-µm coronal sections (3 animals) or horizontal sections (2 animals) through the region of the hyperdrive implantation. Slices were stored in formalin at 4°C until staining. Sections were then mounted on gelatin-coated slides, dipped in ethidium bromide (Sigma Aldrich, St. Louis, MO) for 15 s, rinsed in dH2O, dehydrated, and coverslipped with DPX (Fluka Chemical, Ronkonkoma, NY).
Neurophysiology
RECORDING. After surgery, the 12 recording tetrodes were advanced 160–640 µm per day until reaching the striatum. Arrival in the striatum was determined on the basis of the estimated depth of each tetrode and the passing of corpus callosum (which is quiet relative to the overlying cortex and underlying striatum). The striatum was further identified by the presence of slow firing cells, consistent with known properties of medium-spiny GABAergic neurons. After reaching the striatum, each tetrode was moved no more than 40 µm per day. The two reference electrodes were advanced in a similar manner to the corpus callosum.
Neural activity was recorded using a 64-channel Cheetah recording system (Neuralynx, Tucson, AZ). During recording sessions, a 72-channel motorized commutator (AirFlyte, Bayonne, NJ; Dragonfly, Ridgeley, WV; Neuralynx) allowed the rats to run the task without twisting the tether cables that connected the hyperdrive to the recording system. Tetrode channels were sampled at 32 kHz and filtered between 600 Hz and 6 kHz. When the voltage on any of the four channels of a single tetrode exceeded a threshold set by the experimenter, a 1-ms window of the spike waveform on each of the four channels on the tetrode was recorded to disk and timestamped with microsecond resolution. Spikes were clustered off-line into putative cells on the basis of their waveform properties using MClust 3.0 (Redish et al. 2002b), with automatic preclustering using KlustaKwik 1.0 (Harris 2002).
CLUSTER QUALITY. In each session, spike waveforms recorded on a single tetrode represented a mixture of spikes obtained from multiple sources including both neurons and noise events (chewing artifact, mechanical artifacts). After clustering spikes into putative cells, it was important to ensure that spikes assigned to one cluster were well separated from other spikes recorded simultaneously. A quantitative measure of cluster quality, Lratio, was applied to measure how well each cluster (i.e., each putative cell) was separated from other clusters and noise events recorded on the same tetrode (Jackson et al. 2003).
For each spike, four features of the waveform recorded on each tetrode channel were calculated (energy and the 1st 3 principal components of the energy normalized waveform). Energy was calculated using the square root of the sum of squares of each sample in the waveform. For the principal-components analysis, a set of principal components based on a sample of striatal neurons were applied to the waveforms after normalization by energy.
The 16 feature quantities (4 tetrode channels × 4 features) defined each spike as a point in 16-dimensional space. For each cluster, the squared Mahalanobis distance (D²) from the center of the cluster was calculated to every spike in the data set using the covariance matrix based on spikes assigned to the cluster. The Mahalanobis distance had the effect of scaling the spikes in the cluster to unit variance in all 16 dimensions. Under the assumption that the spikes in the cluster distribute normally in each dimension, D² for spikes in a cluster will distribute as χ² with 16 degrees of freedom (D'Agostino and Stephens 1986).
The amount of contamination of a given cluster is denoted by L and is calculated as the sum of the probabilities that each spike that is not a cluster member should actually be a part of the cluster. The probability of cluster membership for each spike is taken to be the inverse of the cumulative distribution function (i.e., 1 − CDF) of the χ² distribution with 16 degrees of freedom. Spikes that are close to the center of the cluster will have probabilities approaching 1, whereas spikes far from the center of the cluster will have probabilities approaching 0. For each cluster, L was calculated by

$$L(C) = \sum_{i \notin C} \left[ 1 - \mathrm{CDF}_{\chi^2_{df}}\left(D^2_{i,C}\right) \right] \qquad (1)$$

where $i \notin C$ is the set of spikes that are not members of the cluster, $\mathrm{CDF}_{\chi^2_{df}}$ is the cumulative distribution function of the χ² distribution with df = 16, and $D^2_{i,C}$ is the squared Mahalanobis distance of spike i from the center of cluster C. Spikes from other clusters or noise events that are close to the center of cluster C will have high probabilities and contribute strongly to this sum, whereas spikes far from the center of cluster C will contribute little. A low value of L indicates that the cluster has a good "moat" and is well separated from other spikes recorded on the same tetrode. In contrast, a high value of L indicates that the cluster is not well separated and is likely both to include spikes that are not part of the cluster and to exclude spikes that are part of the cluster. The cluster quality measure, Lratio, was defined as L divided by the total number of spikes in the cluster. Using a criterion based on Lratio rather than L allows clusters with larger numbers of spikes to tolerate more contamination. Examples of representative clusters and their Lratio values are shown in Fig. 4. For the analyses described in the following text, clusters with values of Lratio < 0.05 were used. The results described in the following text were consistent with more strict thresholds of Lratio.
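The contamination sum and its normalization can be illustrated with a short Python sketch. It is not the authors' implementation: the function names are hypothetical, the squared Mahalanobis distances of the non-member spikes are assumed to have been computed already, and the χ² survival function is written with the closed form that holds only for an even number of degrees of freedom (such as the 16 used here).

```python
import math

def chi2_survival(x, df=16):
    """Return 1 - CDF of the chi-square distribution at x.

    Uses the closed form exp(-x/2) * sum_{j < df/2} (x/2)^j / j!,
    which is valid only when df is even.
    """
    if df % 2 != 0:
        raise ValueError("this closed form assumes an even number of degrees of freedom")
    half_x = x / 2.0
    term = 1.0   # j = 0 term of the series
    total = 1.0
    for j in range(1, df // 2):
        term *= half_x / j
        total += term
    return math.exp(-half_x) * total

def l_value(d2_noncluster, df=16):
    """Sum of membership probabilities over spikes NOT assigned to the cluster."""
    return sum(chi2_survival(d2, df) for d2 in d2_noncluster)

def l_ratio(d2_noncluster, n_cluster_spikes, df=16):
    """L divided by the number of spikes assigned to the cluster."""
    return l_value(d2_noncluster, df) / n_cluster_spikes
```

A non-member spike sitting at the cluster center (D² = 0) contributes a full 1 to L, while a spike with a large D² contributes almost nothing, which matches the "moat" interpretation given in the text.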
CELL TYPE CATEGORIZATION. On the basis of the extracellularly recorded spike trains, striatal cells were grouped into two categories: phasic-firing neurons (PFNs) and tonic-firing neurons (TFNs). PFNs were distinguished from TFNs on the basis of the proportion of time spent in interspike intervals (ISIs) >5 s. This long-ISI proportion was calculated by finding all ISIs >5 s, summing these ISIs and dividing by the total session time. PFNs were then defined as cells with values of the long-ISI proportion >0.4, which indicated that these cells spent >40% of the session in ISIs >5 s. TFNs were defined as cells with values of the long-ISI proportion <0.4. The choice of 5 s as the definition of a long-ISI was based on a preliminary consideration of the data and definitions using 2- or 10-s ISIs yielded similar separations of PFNs and TFNs. Analyses described in the following text were applied to the set of phasic-firing neurons.
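As an illustration of the long-ISI proportion, the following Python sketch sums the interspike intervals longer than 5 s and divides by the session length. The function names are hypothetical, and two details are assumptions not stated in the text: only intervals between consecutive spikes are counted (not the time before the first or after the last spike), and a cell with a proportion of exactly 0.4 is labeled a TFN.

```python
def long_isi_proportion(spike_times, session_s, long_isi_s=5.0):
    """Fraction of the session spent in interspike intervals longer than long_isi_s."""
    times = sorted(spike_times)
    long_total = 0.0
    for earlier, later in zip(times, times[1:]):
        isi = later - earlier
        if isi > long_isi_s:
            long_total += isi
    return long_total / session_s

def classify_cell(spike_times, session_s, cutoff=0.4):
    """Label a spike train 'PFN' (proportion > cutoff) or 'TFN' otherwise.

    The source leaves a proportion of exactly 0.4 undefined; treating it as
    'TFN' is an assumption of this sketch.
    """
    proportion = long_isi_proportion(spike_times, session_s)
    return "PFN" if proportion > cutoff else "TFN"
```

For example, spikes at 0, 1, 8, 9, and 20 s in a 20-s session give long ISIs of 7 s and 11 s, a proportion of 0.9, and therefore a PFN label.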
Behavioral data collection
Position of the rats during the task was determined using a battery-operated LED backpack constructed in the lab. The LED was secured in an elastic wrap and fastened together with Velcro, which allowed for snug fitting to the rat. Wearing the backpacks, rats were able to move without obvious difficulty, and the LEDs appeared to maintain stable positions on the rat over the course of a session. After implantation, additional LEDs were also present on the headstage, which plugged into the hyperdrive.
LED position was monitored by video tracking input to the Cheetah recording system sampled at 60 Hz. Food delivery was controlled with signals generated by in-house software running in Matlab (MathWorks, Natick, MA). Delivery of food pellets was recorded and timestamped by the Cheetah recording system.
Data analysis
IDEALIZED PATH. For the purpose of identifying errors and constructing linearized spatial rastergrams and histograms (described in the following text), an idealized path was created for each session by selecting a set of points that followed the path that the rat traveled through on a typical lap (one without errors). This set of points was interpolated linearly so that the distance between points was 1 pixel (0.4 × 0.4 cm). A set of eight spatial landmarks on the idealized path was also selected: one for each food-delivery site, one for each turn, and two that marked where the rat turned to enter and exit the turn sequence (see Fig. 1).
ERROR QUANTIFICATION. Errors were defined as deviations from the idealized path at any of the choices, which occurred when rats made incorrect turns. At each turn, a 28-cm segment of the path centered on the location of the turn was examined. An error was scored on a lap if the rat deviated by >5 cm from the idealized path in this region for a total of at least 10 position samples (166 ms).
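The error criterion can be sketched in Python as follows. The function names are hypothetical, and the sketch makes three assumptions: positions and path points are already expressed in centimeters, the positions passed in are those sampled within the 28-cm segment around one turn, and the 10-sample criterion is cumulative ("a total of") and read as a minimum rather than requiring consecutive samples.

```python
import math

def deviation_cm(position, path_points):
    """Distance from one tracked position to the nearest idealized-path point."""
    x, y = position
    return min(math.hypot(x - px, y - py) for px, py in path_points)

def lap_has_error(positions, path_points, max_dev_cm=5.0, min_samples=10):
    """Score an error when enough samples lie farther than max_dev_cm from the path.

    Samples are counted cumulatively, not as one consecutive run; this is an
    assumption of the sketch.
    """
    n_off_path = sum(
        1 for p in positions if deviation_cm(p, path_points) > max_dev_cm
    )
    return n_off_path >= min_samples
```

At the 60-Hz tracking rate, 10 samples correspond to roughly 166 ms spent off the idealized path.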
LINEARIZED SPATIAL PLOT CONSTRUCTION. To examine the responses of phasic-firing neurons as rats ran on the maze, a linearized spatial rastergram and histogram were developed. The location of the rat at the time of each spike was mapped to the nearest point on the idealized path, and the location of the spike relative to the idealized path was used to construct linearized rastergrams, and average firing across laps was used to construct linearized histograms (see Fig. 1).
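The nearest-point mapping used for linearization might look like the Python sketch below. The function names and the `bin_points` parameter are hypothetical. The sketch only counts spikes per path bin; converting the counts to the firing rates shown in the histograms would additionally require the time the rat spent in each bin on each lap, which is not shown.

```python
import math

def nearest_path_index(position, path_points):
    """Index of the idealized-path point closest to a tracked position."""
    x, y = position
    best_index, best_dist = 0, float("inf")
    for i, (px, py) in enumerate(path_points):
        dist = math.hypot(x - px, y - py)
        if dist < best_dist:
            best_index, best_dist = i, dist
    return best_index

def linearized_spike_counts(spike_positions, path_points, bin_points=1):
    """Count spikes per linearized bin; each bin spans bin_points path points."""
    n_bins = math.ceil(len(path_points) / bin_points)
    counts = [0] * n_bins
    for position in spike_positions:
        counts[nearest_path_index(position, path_points) // bin_points] += 1
    return counts
```

Applying the same mapping lap by lap gives one row of the linearized rastergram per lap.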
WARPED LINEARIZED SPATIAL PLOTS. When rats ran a new maze each day, the spatial layout of the turn sequence varied from session to session. For presenting data from a set of PFNs recorded in different sessions, the linearized spatial histograms were warped to 20 bins between each of the eight spatial landmarks. The 20 bins surrounding each spatial landmark (the 10 bins before and the 10 bins after) were taken to represent activity of the PFN near the landmark.
FIRING RATE. To better estimate the instantaneous firing rate of each cell, a measure of continuous firing rate was calculated for each spike train by dividing the session into 10-ms bins and assigning each spike to one bin. The binned spike train was convolved with a Gaussian (σ = 100 ms) to create a continuous function of estimated firing rate sampled at discrete intervals of 10 ms. This firing rate estimate was used in calculations of task responsiveness, in the creation of phasic firing fields, and in the temporal versus spatial encoding analysis (described in the following text).
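A minimal Python sketch of this smoothing step is given below. The function name is hypothetical, and several choices are assumptions rather than details from the text: the Gaussian kernel is truncated at ±4 standard deviations, it is normalized to sum to 1 so that the smoothed trace keeps the total spike count, and contributions that would fall outside the session are simply dropped.

```python
import math

def smoothed_rate(spike_times, duration_s, bin_s=0.010, sigma_s=0.100, n_sigmas=4):
    """Estimate firing rate (Hz) in bin_s steps by Gaussian smoothing of binned spikes."""
    n_bins = int(math.ceil(duration_s / bin_s))
    counts = [0] * n_bins
    for t in spike_times:
        idx = int(t / bin_s)
        if 0 <= idx < n_bins:
            counts[idx] += 1

    # Truncated Gaussian kernel, normalized so its weights sum to 1.
    half = int(round(n_sigmas * sigma_s / bin_s))
    kernel = [math.exp(-0.5 * (k * bin_s / sigma_s) ** 2) for k in range(-half, half + 1)]
    norm = sum(kernel)
    kernel = [w / norm for w in kernel]

    rate = [0.0] * n_bins
    for i, count in enumerate(counts):
        if count == 0:
            continue
        for k, weight in enumerate(kernel):
            j = i + k - half
            if 0 <= j < n_bins:
                rate[j] += count * weight / bin_s  # spikes per bin -> spikes per second
    return rate
```

With this normalization, a single isolated spike far from the session edges produces a Gaussian bump whose area (the sum of the trace times the bin width) equals one spike.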
TASK RESPONSIVENESS. The responses of PFNs to task parameters were classified using firing rate relative to the time of arrival at each food-delivery location and position on the maze. Two-sample Kolmogorov-Smirnov tests were applied to determine if the population of PFNs was modulated by these task parameters. The null hypothesis was that PFN firing rates were not modulated by either location or food delivery. Under this hypothesis, the distribution of firing rates with respect to either task parameter would not differ from the distribution of average PFN firing rates (calculated across the entire recording session). A significant Kolmogorov-Smirnov test would indicate that these task parameters (food delivery and location on the maze) did modulate PFN firing rates.
After determining if PFNs as a population were responsive to task parameters, individual PFNs were tested for task responsiveness. PFNs were classified as reward-responsive if they showed a significant increase in firing rate during the 5 s after arrival at either food-delivery site. PFNs were classified as maze-responsive if they showed a significant increase in firing rate when the rat was running on the maze. To determine if a firing rate was significantly elevated, the mean firing rate of each PFN relative to task events was compared with a distribution of expected mean firing rates created from the same spike train using shuffled event times (i.e., a bootstrap) (see Efron 1982).
With a large number of expected mean firing rates, the mean ± SD of the distributions of expected mean firing rates can be used as estimates of what the cell's firing rate should be if the cell is not responsive to task parameters. The distributions of expected mean firing rates that were obtained for PFNs frequently exhibited a skew toward positive values, and a square-root transform was applied to normalize the distribution (Sokal and Rohlf 1995). Under the assumption of normality, estimates of µ and σ from the expected mean firing rate distributions were used to calculate the probability of observing the cell's actual mean firing rate using the inverse of the normal cumulative distribution function. An α of 0.05 was adopted, and a Bonferroni correction for multiple comparisons was applied.
PFNs were classified as reward-responsive if the mean firing rate of the PFN in the 5 s after arrival at either food-delivery location was significantly larger than the distribution of expected mean firing rates created from 5-s time segments selected randomly from the session. PFNs were classified as maze-responsive if the mean firing rate at any location on the maze was significantly larger than the distribution of expected mean firing rates created from similar length time segments selected randomly from the session. For determining maze-responsiveness, the idealized path in each session was divided into eight regions using the spatial landmarks described in the preceding text. If any of these regions was >1.5 times the average distance between successive turns on the maze, these regions were divided in half.
PFNs that fired very infrequently during the session tended to produce quantized distributions of expected mean firing rates. Such quantized distributions were not normal following the square-root transform. Therefore all PFNs which fired <100 spikes were not considered any further in our analyses because not enough spikes were observed in the session to accurately estimate the cell's responsiveness to task parameters.
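The logic of the responsiveness test (expected rates from randomly placed time segments, a square-root transform, and a normal approximation with a Bonferroni-corrected threshold) can be sketched in Python. This is a simplified illustration, not the authors' procedure: the function names are hypothetical, each expected rate comes from a single random segment rather than from an average over a shuffled set of event times, and the upper-tail probability is taken as 1 minus the normal CDF.

```python
import math
import random
import statistics

def random_segment_rates(spike_times, session_s, window_s, n_draws, rng):
    """Firing rates (Hz) in n_draws windows placed uniformly at random in the session."""
    rates = []
    for _ in range(n_draws):
        start = rng.uniform(0.0, session_s - window_s)
        n_spikes = sum(1 for t in spike_times if start <= t < start + window_s)
        rates.append(n_spikes / window_s)
    return rates

def upper_tail_p(observed_rate, expected_rates):
    """P(rate >= observed) under a normal fit to square-root-transformed expected rates."""
    transformed = [math.sqrt(r) for r in expected_rates]
    mu = statistics.mean(transformed)
    sigma = statistics.stdev(transformed)
    z = (math.sqrt(observed_rate) - mu) / sigma
    return 0.5 * math.erfc(z / math.sqrt(2.0))  # 1 - normal CDF at z

def is_responsive(p_value, alpha=0.05, n_comparisons=1):
    """Bonferroni-corrected decision for one of n_comparisons tests."""
    return p_value < alpha / n_comparisons
```

A cell with very few spikes yields expected rates that take only a handful of distinct values, which is the quantization problem that motivated excluding PFNs with <100 spikes.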
PHASIC FIRING FIELDS. To examine the size and distribution of maze responses in PFNs, a quantification of each maze-responsive PFN's activity on the maze was defined. For each maze-responsive PFN, phasic firing fields (PFFs) were defined as each set of continuous 5-cm bins on the linearized spatial histogram that exceeded 50% of the PFN's maximum firing rate in any bin.
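The field definition amounts to finding runs of adjacent bins above half of the peak rate, as in the Python sketch below. The function name is hypothetical, "exceeded" is read as strictly greater than the threshold, and fields that wrap around the end of the linearized loop are not merged; both are assumptions of the sketch.

```python
def phasic_firing_fields(bin_rates, fraction=0.5):
    """Return (first_bin, last_bin) for each run of bins above fraction * max rate."""
    threshold = fraction * max(bin_rates)
    fields = []
    start = None
    for i, rate in enumerate(bin_rates):
        if rate > threshold:
            if start is None:
                start = i          # a new field begins here
        elif start is not None:
            fields.append((start, i - 1))
            start = None
    if start is not None:          # field running to the last bin
        fields.append((start, len(bin_rates) - 1))
    return fields
```

With 5-cm bins, a field spanning bins 2 through 4 is 3 bins, or 15 cm, wide.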
SPATIAL VERSUS TEMPORAL ENCODING. A correlation analysis was performed to determine if maze responses were better related to the location of the rat or to temporal events. For each maze-responsive PFN, the average firing rate at each position along the maze was determined for each lap using the continuous firing rate measure. The correlation of the firing rate as a function of position was calculated between every pair of laps, and the average correlation was taken to represent how well related the PFN's activity was to the location of the animal on the maze.
For temporal measures, the firing rate of each maze-responsive PFN was calculated for each lap over two temporal windows: the 20 s preceding the arrival at the first food-delivery site and the 20 s after departure from the second food-delivery site. For each measure, the correlation of the PFN's firing rate relative to either arrival or departure was calculated for every pair of laps, and the average correlation served to describe how well related a cell's activity was to the time of arrival at the first food-delivery site and the time of departure from the second food-delivery site.
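Both the spatial and the temporal measures reduce to an average of pairwise correlations between laps, which can be sketched in Python as follows. The function names are hypothetical; each lap is assumed to be a list of firing rates sampled on a common axis (position bins for the spatial measure, time bins relative to arrival or departure for the temporal measures), and laps with constant firing rate, for which the correlation is undefined, are assumed to have been excluded beforehand.

```python
import math
from itertools import combinations

def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences (undefined if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def mean_pairwise_lap_correlation(lap_rates):
    """Average Pearson correlation over every pair of laps."""
    correlations = [pearson_r(a, b) for a, b in combinations(lap_rates, 2)]
    return sum(correlations) / len(correlations)
```

Comparing the value obtained on the position axis with the values obtained on the two time axes indicates whether a cell's activity is better aligned to space or to time.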
| RESULTS |
|---|
Over 105 sessions, rats ran an average of 52.4 ± 8.3 laps per session. Figure 2 shows the average probability of making an error on each lap over the 76 sessions (23 from novel-1, 26 from familiar, 27 from novel-2) in which rats ran at least 40 laps. Sessions in which rats ran novel and familiar mazes are plotted separately. For convenience, weeks 1 and 3 (novel-1 and novel-2, in which rats ran novel mazes) have been combined. Rats performed at chance on the first lap when running novel mazes and eliminated errors over the first five laps of the session.
α = 0.05) revealed that in each week, more errors were made in the first 10 laps than in the rest of the session. In the first 10 laps, the number of errors made in the two novel maze conditions did not differ, and significantly fewer errors were made in the familiar maze condition than either novel maze condition.
Cell-type categorization
All final tetrode locations were histologically verified to be in the dorsal striatum. Tetrode tracks were observed in a region extending approximately -0.5 to 1.5 mm anterior/posterior and 1.6–3.6 mm medial/lateral with respect to bregma (see Fig. 3).
From the set of unique spike trains, 589 (68%) were classified as phasic-firing neurons (PFNs). The remaining 278 (32%) were classified as tonic-firing neurons (TFNs). As shown in Fig. 5, PFNs were well separated from TFNs.
10 Hz and a quiescent mode), whereas TFNs tended to fire in one tonic mode. Figure 5 (A and D) shows the average ISI histogram for PFNs and TFNs, constructed using normalized ISI histograms. The average TFN ISI histogram was unimodal (Fig. 5D) as was the ISI histogram of a typical TFN (Fig. 5E). This indicates that TFNs tended to fire regularly at a single rate (see also Fig. 5F for an example). The average PFN ISI histogram had a peak at short ISIs of 100 ms, corresponding to a firing rate of 10 Hz, and a shoulder at longer ISIs. This pattern was also observed in individual PFNs (Fig. 5B). PFNs tended to have their spikes organized into bouts of activity, separated by periods of quiescence (see Fig. 5C for an example). As described in METHODS, PFNs with <100 spikes had too few spikes to estimate responsiveness to task parameters. Of the 589 PFNs in the unique spike train sample, 194 contained <100 spikes and were eliminated from further analysis.
Task responses
To determine if PFN firing rates were modulated by two task parameters (the location of the rat on the maze and the delivery of food) Kolmogorov-Smirnov goodness-of-fit tests for two samples were applied. The null hypothesis was that PFN firing rates were not modulated by either location or food delivery and that these distributions of firing rates would not differ from the distribution of average PFN firing rates (calculated across the entire recording session). The distribution of average firing rates of PFNs on the maze was significantly different from the distribution of average PFN firing rates (P < 0.0001). Also the distribution of PFN firing rates in the 5 s after arrival at either food-delivery site was significantly different from the distribution of average PFN firing rates (P < 0.0001). These results indicate that the firing rates of at least some PFNs were modulated by the location of the animal and by the delivery of food. Posthoc tests (described in TASK RESPONSIVENESS) were conducted to identify which PFNs were task-modulated. In general, PFNs had low average firing rates and thus only increases in activity were tested. Of 395 PFNs, 108 (22 ± 9 cells per rat) PFNs were responsive in one or more regions on the maze and 81 (16 ± 7 cells per rat) PFNs were responsive during the 5 s after arrival at one or both food-delivery sites.
Maze-responsive PFNs
Of the PFNs that fired at least 100 spikes a session, 27% were classified as maze-responsive. Figure 6 shows an example of a maze-responsive PFN that was active at one location on the maze (as the rat ran between the last turn on the maze and the 1st food-delivery location). Maze-responsive PFNs had between one and six phasic firing fields (PFFs, median number of PFFs per cell = 1), with a median PFF width of 3 bins (15 cm).
MAZE RESPONSES DISTRIBUTE EVENLY ON THE TURN SEQUENCE. To determine if the PFN maze responses favored specific locations on the maze, the distribution of phasic firing fields (PFFs) over the maze was examined. Figure 8 shows the PFFs obtained from maze-responsive PFNs sorted according to the center of each PFF. Maze-responsive PFNs responded at every location along the length of the turn sequence. The R² from a linear regression on the sorted PFF centers was 0.98, indicating that the centers of the PFFs were well described by a linear fit. This implies that maze responses were uniformly distributed on the maze and did not concentrate at specific landmarks.
40 Hz) and a weak response at the fourth turn (10 Hz). This PFN was not equally responsive to all left turns, however. It was silent at two other left turns, where the rat was entering and exiting the turn sequence. To examine how the activity of maze-responsive PFNs as a group depended on the actions that the rats were making, the firing rate in the phasic-firing fields of maze-responsive PFNs was compared with the firing rate of the same PFN at other regions of the turn sequence where the shape of the rat's path was highly similar or dissimilar. Similar and dissimilar paths were defined as locations where the rat's path was well and poorly correlated, respectively. In cases where the shapes of the paths were very similar, the rat was likely to be making similar motor actions, such as turning in the same direction. In cases where the shapes of the paths were dissimilar, the rat was likely to be making different motor actions, such as turning in opposite directions.
Of the 108 maze-responsive PFNs, 58 cells (from 5 animals, 11.6 ± 6.0 cells per rat) had at least one PFF on the turn sequence. The firing rate and path of the rat in each PFF was compared with other locations on the turn sequence using a sliding comparison window the size of which was equal to that of the PFF. For each comparison window, the firing rate in the PFF as a function of position along the idealized path was correlated with the firing rate as a function of position in the comparison window. The rat's path in the PFF was also correlated with the rat's path in the comparison window to quantify how similar the route taken by the rat in the comparison window was to the route taken in the PFF. If maze-responsive PFNs encoded general actions, then the firing pattern in the PFF should be highly similar to the firing pattern observed at other locations where the rat's route was similar to the route taken through the PFF. We would also expect that the firing pattern in the PFF should be poorly correlated with the firing pattern observed at other locations where the rat's route was dissimilar to the route taken through the PFF. In these analyses, a similar route was defined as having a path correlation >0.85, and a dissimilar route was defined as having a path correlation less than -0.85. In Fig. 9, we can see that there was a significant bias toward similar firing patterns in other regions on the maze where the rats' paths were similar compared with other regions on the maze where the rats' paths were dissimilar [Kruskal-Wallis χ²(1, n = 116) = 9.98, P = 0.0016]. However, neither group of correlations was biased toward positive correlations, which indicates that even in the same route group, there was no tendency for a maze-responsive PFN to fire similarly at other locations on the maze in which the rat's path was similar to that taken through the PFF. To the extent that the shape of the rat's path is an indication of the motor activity of the rat, these results indicate that maze-responsive PFNs did not purely encode movements.
0.85; and different route: cases where the rat ran a different sequence of turns in each session, but the path taken through the phasic-firing field and the path taken through the same region of the matching session were correlated by less than -0.85. To correct for multiple observations of the same cell, the set of correlations in each group obtained from one maze-responsive PFN and its matching spike trains were averaged.
In the group of same-maze pairs, correlations from 14 cells were included (from 4 rats, 3.5 ± 1.5 cells per rat). In the group of same-route pairs, correlations from four cells were included (from 3 rats, 1.3 ± 0.4 cells per rat). In the group of different-route pairs, correlations from 10 cells were included (from 5 rats, 2.0 ± 0.5 cells per rat). Shown in Fig. 10 are the firing rate correlations for each group. There was a significant difference between groups [Kruskal-Wallis χ²(2, n = 28) = 12.85, P = 0.0016], and pairwise comparisons using Wilcoxon rank-sum tests (α = 0.05/3) revealed that firing rate correlations in the same-maze group were significantly higher than those in the different-route group (P = 0.0006) but did not differ from those in the same-route group (P = 0.574). The firing rate correlations in the same-route group were higher than those of the different-route group, but this difference was not significant at the corrected threshold (P = 0.036). These results indicate that maze responses were highly similar when the same turn sequence was presented and were not similar when rats took a different route through the same physical location in the environment. Based on the results of the same-route group, our data also indicate that maze responses did not purely encode sensory-context/action relationships: when rats ran through similar paths in the same physical locations in the environment, but ran a different turn sequence, correlations were intermediate between those of the same-maze and different-route groups. The same-route group contained data from a small number of cells, and it could be the case that some PFNs were responding to purely sensory-context/action relationships while other PFNs were further modulated by the specific sequence of turns presented. Maze responses may thus have reflected a combination of information related to the specific actions performed, the sensory environment those actions were performed in, and in some cases the specific turn sequence presented.
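As a worked check of the pairwise comparisons, the short sketch below applies the 0.05/3 threshold to the three reported P values. Reading that threshold as a Bonferroni-style correction for three comparisons is an interpretation of the text, and the function and label names are illustrative only.

```python
def corrected_significance(p_values, family_alpha=0.05):
    """Divide the family-wise alpha by the number of comparisons and flag each P value."""
    alpha = family_alpha / len(p_values)
    return alpha, {name: p < alpha for name, p in p_values.items()}


# P values as reported in the text for the three pairwise rank-sum tests.
reported = {
    "same-maze vs different-route": 0.0006,
    "same-maze vs same-route": 0.574,
    "same-route vs different-route": 0.036,
}
alpha, significant = corrected_significance(reported)
```

With three comparisons the per-test threshold is about 0.0167, which is why P = 0.036 is reported as not significant even though it falls below 0.05.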
Of the PFNs that fired ≥100 spikes a session, 21% were classified as reward responsive. Figure 11 shows an example of a reward-responsive PFN that was active after arrival at the first food-delivery site but not at the second food-delivery site. Of reward-responsive PFNs, 31 (38%) had a significant response only at the first food-delivery site, 33 (41%) had a significant response only at the second food-delivery site, and 17 (21%) had a significant response at both food-delivery sites. Because 79% of the reward-responsive PFNs were responsive at only one of the food-delivery sites, these cells did not encode general aspects of food retrieval or consumption (e.g., chewing), which occurred at both food-delivery sites.
In Fig. 6, the maze-responsive PFN was not active after arrival at either food-delivery location, and in Fig. 11, the reward-responsive PFN was not active while the rat was running on the maze. This segregation of maze and reward responses was a characteristic of the entire task-responsive PFN population. Based on the proportions of PFNs that were maze responsive (27.3%) or reward responsive (20.5%), we would expect that 5.6% of the PFNs (approximately 22 cells) would have been responsive to both maze and reward if the probabilities of being a maze-responsive PFN and a reward-responsive PFN were independent. This was not the case: only 1/395 PFNs (0.3%) was classified as both reward and maze responsive. This is significantly less than we would expect by chance [χ²(3) = 35.0, P < 0.001].
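The independence argument can be worked through numerically. The sketch below computes the expected number of dual-responsive cells and a Pearson chi-square statistic on the 2 × 2 table of maze by reward responsiveness. The counts come from the text (108 maze-responsive PFNs, 31 + 33 + 17 = 81 reward-responsive PFNs, 395 PFNs in total, 1 responsive to both). Treating the test as a plain 2 × 2 Pearson chi-square is an assumption: the text reports 3 degrees of freedom and does not spell out how the statistic was computed, so this is one possible reading rather than the authors' procedure.

```python
def independence_check(n_total, n_maze, n_reward, n_both):
    """Expected overlap under independence and Pearson chi-square on the 2 x 2 table."""
    expected_both = n_maze * n_reward / n_total

    # Observed cells: both, maze only, reward only, neither.
    observed = [
        n_both,
        n_maze - n_both,
        n_reward - n_both,
        n_total - n_maze - n_reward + n_both,
    ]
    # Expected cells from the row and column margins, in the same order.
    rows = [n_maze, n_total - n_maze]
    cols = [n_reward, n_total - n_reward]
    expected = [rows[i] * cols[j] / n_total for i in range(2) for j in range(2)]

    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return expected_both, chi2


expected_both, chi2 = independence_check(n_total=395, n_maze=108, n_reward=81, n_both=1)
```

Under these assumptions the expected overlap is about 22 cells, and the statistic comes out close to the reported value of 35.0.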
Figure 12 shows the average normalized firing rate of PFNs after the arrival of the rat at the food-delivery sites (top) and on the turn sequence (bottom). Reward-responsive PFNs were more active after food delivery than the entire population of PFNs, whereas maze-responsive PFNs were more active on the maze than the entire population of PFNs. These results follow from the definitions of reward and maze responsiveness. However, reward-responsive PFNs were also less active on the maze than either maze-responsive PFNs or the entire PFN population. Likewise, maze-responsive PFNs were less active after food delivery than either reward-responsive PFNs or the entire PFN population. Our analyses allowed PFNs to be classified as both reward and maze responsive, but cells predominantly responded to one or the other parameter, not both. As such, these results further indicate that reward- and maze-responsive PFNs were separate populations of cells.
α = 0.05) revealed that the activity of maze-responsive PFNs was significantly lower than that of the entire population of PFNs. Similarly, on the maze, there was a significant effect of group [all PFNs, reward-responsive, maze-responsive, F(2,12) = 48.6, P < 0.001]. Post hoc comparisons (Tukey-Kramer HSD, α = 0.05) revealed that the activity of reward-responsive PFNs was significantly lower than that of the entire population of PFNs. These results indicate a separation of information processing such that PFNs that responded while the rats ran on the maze did not respond during food receipt and PFNs that responded during food receipt did not respond while the rats ran on the maze.
| DISCUSSION |
|---|
|
|
|---|
Striatal representation
Lesions and inactivations of the dorsal striatum in rodents impair performance of habitual, stimulus-response (S-R) tasks (Kesner et al. 1993; McDonald and White 1993; Packard 1999; Packard and McGaugh 1992; Packard et al. 1989) as well as longer chains of sequential behavior (Berridge and Whishaw 1992; Cromwell and Berridge 1996; Matsumoto et al. 1999; Miyachi et al. 1997). One theory of how the dorsal striatum learns and produces S-R behavior is that striatal projection neurons with connections to motor centers encode S-R relationships by responding specifically to complex cortical inputs (Graybiel 1998; Graybiel et al. 1994). A number of studies have shown that striatal neurons have highly specific responses to task parameters that could encode stimulus-response relationships. Studies in the rat (Carelli et al. 1997; Gardiner and Kitai 1992) and primate (Kermadi and Joseph 1995; Kermadi et al. 1993; Kimura 1986, 1990; Tremblay et al. 1998) have found that the responses of striatal cells often depend on behavioral context. In the rat, for example, Gardiner and Kitai (1992) report that cells in the dorsal striatum that responded to an auditory cue during a movement task usually did not respond to the same cue presented outside of the task, and some cells that responded during head movements during the task did not respond when rats made similar movements outside of the task. Carelli et al. (1997) report that in rats that have learned to barpress for food, dorsolateral striatal cells that responded to movement of the forelimb outside of the instrumental task were not active during lever pressing.
On the multiple T maze, neurons in the dorsal striatum responded as rats navigated the maze and during the delivery of food. Neither maze nor reward responses were described by general motor behavior. Less than one-third of reward-responsive PFNs responded at similar levels at both food-delivery sites, indicating that reward responses did not simply encode the action of chewing. Maze-responsive PFNs that responded at one location on the maze were not biased to respond similarly at other regions where rats took similar paths, indicating that maze responses did not simply encode motor activity during navigation. These results are consistent with the studies cited in the preceding text, which indicated that dorsal striatal neurons correlated with a movement or stimulus during a task are often not active during the same movement or stimulus presentation in a different behavioral context. Within a session, maze responses were well related to the spatial location of the animal. However, maze responses did not encode the spatial position of the animal independent of the animal's actions. First, maze responses were poorly correlated across sessions when rats took a different path through the same two-dimensional location in the environment. Second, maze responses were highly correlated across sessions when animals ran the same sequence of turns. Finally, maze responses were biased toward positive correlations across sessions when animals ran a different sequence of turns but took a similar path through the same two-dimensional location in the environment. These data indicate that maze-responsive cells were modulated by the location of the animal, what the animal was doing at that location, and, to some extent, by the specific sequence of actions the rat was performing.
This type of striatal sequence specificity is consistent with work done in primates and rats. In primates, Kermadi and colleagues (Kermadi and Joseph 1995; Kermadi et al. 1993) have shown that striatal neurons preferred specific visuomotor sequences. In rats, Aldridge and Berridge (1998) have shown that dorsal striatal neurons that were active during sequenced grooming were often not active during similar movements occurring outside of grooming sequences. To our knowledge, the data presented here from the multiple T task are the first evidence for sequence-specific striatal activity in rodents performing an arbitrary sequencing task.
Maze responses were also uniformly distributed over the turn sequence on the maze. If maze responses encode what actions need to be performed at a particular location/sensory context, then a uniform distribution of maze responses indicates that the striatal representation is rich enough to specify an action to perform at any point of the task. Such a result has important implications for theories of striatal function. Recent proposals have suggested that the striatum may implement a temporal-difference reinforcement-learning algorithm (Barto 1995; Sutton and Barto 1998) in which striatal neurons select an action to perform based on a policy that is modified to maximize the receipt of reward over time (Daw 2003; Doya 2000; Houk et al. 1995; Schultz et al. 1997; Suri and Schultz 1999; Suri et al. 2001).
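To make the temporal-difference idea concrete, the sketch below trains a small tabular actor-critic on a chain of T choices. It is a textbook-style illustration, not a model proposed or fitted in this study. Several simplifications are assumptions made for the example: each choice point is a state, a wrong turn leaves the agent at the same choice point, a single reward is delivered after the last correct turn, and the learning rates and discount factor are arbitrary.

```python
import math
import random


def train_actor_critic(correct_turns, episodes=2000, gamma=0.9,
                       alpha=0.1, beta=0.1, max_steps=200, seed=0):
    """Tabular actor-critic on a chain of binary T choices (0 = left, 1 = right).

    The critic V holds a value per choice point; the actor H holds a preference
    per (choice point, turn). Both are updated from the same TD error.
    Returns the preferred turn at each choice point after training.
    """
    rng = random.Random(seed)
    n = len(correct_turns)
    V = [0.0] * (n + 1)                 # V[n] is the terminal state and stays 0
    H = [[0.0, 0.0] for _ in range(n)]  # action preferences

    for _episode in range(episodes):
        s = 0
        for _step in range(max_steps):
            # Softmax over two preferences reduces to a logistic function.
            p_right = 1.0 / (1.0 + math.exp(H[s][0] - H[s][1]))
            a = 1 if rng.random() < p_right else 0

            # Assumed task dynamics: a correct turn advances, a wrong turn does not.
            s_next = s + 1 if a == correct_turns[s] else s
            reward = 1.0 if s_next == n else 0.0

            # TD error: the reward-prediction error shared by critic and actor.
            delta = reward + gamma * V[s_next] - V[s]
            V[s] += alpha * delta       # critic update
            H[s][a] += beta * delta     # actor update

            s = s_next
            if s == n:
                break

    return [0 if H[s][0] >= H[s][1] else 1 for s in range(n)]


# Hypothetical turn sequence for four T choices: left, right, right, left.
learned = train_actor_critic([0, 1, 1, 0])
```

In this sketch the learned preferences specify a turn at every choice point, which is the sense in which a representation covering the whole sequence could support action selection throughout the task.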
Segregation of maze and reward responses
Maze-responsive PFNs often responded at multiple locations on the maze, and reward-responsive PFNs often responded after arrival at both food-delivery sites. However, maze-responsive PFNs did not respond after arrival at either food-delivery site, and reward-responsive PFNs did not respond while rats were running on the maze. A segregation of maze- and reward-responsive PFNs implies a segregation of information processing in the striatum and raises two questions: what is the functional consequence of segregation, and what properties of the striatum produce segregation?
FUNCTIONAL SIGNIFICANCE OF SEGREGATION. A segregation of maze and reward responses may shed light on the computational functions of the striatum. Recent proposals of basal ganglia function suggest that the striatum is involved in selecting appropriate actions in a task by implementing a reinforcement-learning algorithm (Barto 1995; Brown and Sharp 1995; Daw 2003; Daw and Touretzky 2000; Doya 1999, 2000; Foster et al. 2000; Houk et al. 1995; Montague et al. 1996; Schultz et al. 1997; Sutton and Barto 1998). In reinforcement-learning models of the striatum, the nigrostriatal dopaminergic system provides a reward-prediction error signal and the striatum implements an actor-critic archite