Many neurons show anticipatory activity in learned tasks. This phenomenon appears to reflect the brain's ability to predict future events. However, what is actually predicted is unknown. Using a memory-guided saccade task, in which only one out of four directions was rewarded in each block of trials, we found that a group of neurons in the monkey caudate nucleus (CD) showed activity before presentation of an instruction cue stimulus. Among 329 CD neurons that were related to memory-guided saccade tasks, 156 showed the precue activity and 91 of them were examined fully. Remarkably, the magnitude of the precue activity varied across the four blocks of the one-direction-rewarded (1DR) condition, depending on which direction was rewarded. A majority of neurons with precue activity (83/91, 91%) showed significant directional preference. The best and worst directions were usually in the contralateral and ipsilateral directions, respectively. Within a block, the precue activity increased rapidly for the best direction in 1DR and decreased gradually for the worst direction in 1DR and in the all-directions-rewarded (ADR) condition. The precue activity was weak in ADR. The precue activity did not reflect the likelihood of a particular cue stimulus, because the probability of the cue appearing in each direction was the same regardless of the rewarded direction. These results suggest that each CD neuron indicates a particular position–reward association prospectively, usually with contralateral preference. Assuming that the CD neurons have access to saccadic motor outputs, the precue activity would create a motivational bias toward the contralateral space, even before an instruction is given by the cue stimulus.
Many neurons in the cerebral cortex and basal ganglia show anticipatory activity preceding a task-related event (Hikosaka et al. 1989c; Mackay and Crammond 1987; Mauritz and Wise 1986; Sakagami and Niki 1994; Schultz et al. 1992; Watanabe 1996). Common to these studies was that the event occurred in a highly predictable manner in a well-learned task. It was thus assumed or indicated (Mauritz and Wise 1986) that neurons showed such anticipatory activity because the event was highly predictable.
However, the events that these neurons anticipated were always imperative for the animal to obtain reward. It was therefore unknown which was crucial for the anticipatory activity: the likelihood of the event or the reward value attached to the event. Do these neurons show the anticipatory activity because the event is likely to occur? Or do they show the anticipatory activity because the event leads to reward? Most of the previous studies were not designed to differentiate between these possibilities, since the task-related events were typically behaviorally significant in that they were followed by reward. Although some recent studies have provided an experimental condition in which different events (stimuli) are followed by different reward states (Leon and Shadlen 1999; Tremblay and Schultz 1999; Watanabe 1996), the nature of the anticipatory activity has not been examined. Therefore, whether the anticipatory neural activity reflects the predictability of the event or the reward value of the event remains an open question.
We now address the question in relation to the function of the basal ganglia, specifically the caudate nucleus (CD). The CD plays a pivotal role in the basal ganglia control of saccadic eye movement (Hikosaka et al. 2000). It receives inputs from association cortices, including the frontal and supplementary eye fields, and sends outputs, directly and indirectly, to the substantia nigra pars reticulata (SNr), which in turn inhibits the superior colliculus. In addition to neurons showing visual or saccade-related activities (Hikosaka et al. 1989a,b), many CD neurons show anticipatory activities before different task-specific events (Hikosaka et al. 1989c). Using a modified memory-guided saccade task (called 1DR) in which reward was given only for one particular direction out of four directions (Kawagoe et al. 1998), we found that the precue activity was remarkably dependent on the direction for which reward was to be given.
We used three male Japanese monkeys (Macaca fuscata). The monkeys were kept in individual primate cages in an air-conditioned room where food was always available. At the beginning of each experimental session, they were moved to the experimental room in a primate chair. The monkeys were given restricted amounts of fluid during periods of training and recording. Their body weight and appetite were checked daily. Supplementary water and fruit were provided daily. All surgical and experimental protocols were approved by the Juntendo University Animal Care and Use Committee and are in accordance with the National Institutes of Health Guide for the Care and Use of Animals.
The experiments were carried out while the monkey's head was fixed and its eye movements were recorded. For this purpose, a head holder, a chamber for unit recording, and an eye coil were implanted under surgical procedures. The monkey was sedated by intramuscular injections of ketamine (4.0–5.0 mg/kg) and xylazine (1.0–2.0 mg/kg). General anesthesia was then induced by intravenous injection of pentobarbital sodium (5 mg/kg/h). Surgical procedures were conducted under aseptic conditions. After exposing the skull, 15–20 acrylic screws were bolted into it and fixed with dental acrylic resin. The screws served as anchors by which a head holder and a recording chamber, both made of delrin, were fixed to the skull. A scleral eye coil was implanted in one eye for monitoring eye position (Judge et al. 1980; Robinson 1963). The recording chamber, which was rectangular (anteroposterior: 42 mm; lateral: 30 mm; depth: 10 mm), was placed over the frontoparietal cortices, tilted laterally by 35°. The monkey received antibiotics (sodium ampicillin 25–40 mg/kg im each day) after the operation.
The monkey sat in a primate chair in a dimly lit and sound-attenuated room with his head fixed. In front of him was a tangent screen (30 cm from his face) onto which small red spots of light (diameter: 0.2°) were backprojected using two LED projectors. The first projector was used for a fixation point, and the second for an instruction-cue stimulus. The position of the cue stimulus was controlled by reflecting the light via two orthogonal (horizontal and vertical) galvanomirrors.
The monkeys were trained to perform the memory-guided saccade task in two different reward conditions: all-directions-rewarded (ADR) condition and one-direction-rewarded (1DR) condition (Kawagoe et al. 1998) (Fig. 1). A task trial started with the onset of a central fixation point on which the monkeys had to fixate. A cue stimulus (spot of light) came on 1 s after onset of the fixation point (duration: 100 ms), and the monkeys had to remember its location. After 1–1.5 s, the fixation point turned off, and the monkeys were required to make a saccade to the previously cued location. The target came on 400 ms later for 150 ms at the cued location. The saccade was judged to be correct if the eye position was within a window around the target (usually within ±3°) when the target turned off. The monkeys made the saccade usually before the target onset based on memory, because, otherwise, the eyes could rarely reach the target window within the 150-ms target-on period. The next trial started after an intertrial interval of 3.5–4 s. The cue was chosen pseudorandomly, such that the four directions were randomized in every sub-block of four trials; thus, one block of experiment (60 trials) contained 15 trials for each direction.
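The pseudorandom cue schedule described above can be illustrated with a minimal Python sketch. It is not the original task-control program; the direction labels, the function name `make_cue_schedule`, and the use of Python's `random` module are assumptions made for illustration, and the repetition of failed trials is ignored.

```python
import random

# Hypothetical labels for the four cue directions (an assumption).
DIRECTIONS = ("right", "up", "left", "down")


def make_cue_schedule(n_trials=60, directions=DIRECTIONS, rng=None):
    """Build a cue sequence in which every sub-block of len(directions)
    trials contains each direction exactly once, in shuffled order."""
    rng = rng or random.Random()
    n_dirs = len(directions)
    if n_trials % n_dirs != 0:
        raise ValueError("n_trials must be a multiple of the number of directions")
    schedule = []
    for _ in range(n_trials // n_dirs):
        sub_block = list(directions)
        rng.shuffle(sub_block)  # randomize order within this sub-block
        schedule.extend(sub_block)
    return schedule


if __name__ == "__main__":
    schedule = make_cue_schedule(rng=random.Random(0))
    # Each direction appears 15 times in a 60-trial block.
    print({d: schedule.count(d) for d in DIRECTIONS})
```

Because each direction appears once per sub-block, the cue probability for any direction stays at 1/4 regardless of which direction is rewarded, which is the property the later argument about predictability relies on.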
In ADR, every correct saccade was rewarded with the liquid reward together with the tone stimulus. In 1DR, an asymmetric reward schedule was used in that only one of the four directions was rewarded while the other directions were either not rewarded (exclusive 1DR) or rewarded with a smaller amount (about 1/5; relative 1DR). We used the exclusive 1DR for two out of three monkeys; the third monkey had difficulty in performing the exclusive 1DR, so that we used the relative 1DR. The highly rewarded direction was fixed in a block of trials (including 60 successful trials). Even for the nonrewarded or less-rewarded direction, the monkeys had to make a correct saccade, because otherwise the same trial was repeated. The correct saccade was indicated by the tone stimulus. The amount of reward per trial was set approximately the same between 1DR and ADR. Other than the actual reward, no indication was given to the monkeys as to which direction was currently rewarded. 1DR was performed in four blocks, in each of which a different direction was rewarded highly. The order of the rewarded direction in four blocks of 1DR was randomized. The behavioral tasks as well as storage and display of data were controlled by a computer (PC 9801RA; NEC, Tokyo, Japan).
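The requirement that the mean reward per trial be about the same in 1DR and ADR constrains the reward sizes. The following Python sketch works through that arithmetic under simplifying assumptions that are not stated in this form in the text: the four directions are equiprobable, the means are exactly equal (the text says "approximately"), and the small reward in relative 1DR is exactly 1/5 of the large one (the text says "about 1/5"). The function name `reward_sizes` and the condition labels are hypothetical.

```python
from fractions import Fraction


def reward_sizes(condition, adr_amount=Fraction(1), n_directions=4,
                 small_ratio=Fraction(1, 5)):
    """Return (large, small) reward per correct trial so that the mean over
    equiprobable directions equals adr_amount (an idealized assumption)."""
    if condition == "ADR":
        return adr_amount, adr_amount
    if condition == "exclusive_1DR":
        ratio = Fraction(0)      # non-rewarded directions get nothing
    elif condition == "relative_1DR":
        ratio = small_ratio      # other directions get a smaller reward
    else:
        raise ValueError(f"unknown condition: {condition}")
    # Solve (large + (n - 1) * ratio * large) / n == adr_amount for large.
    large = adr_amount * n_directions / (1 + (n_directions - 1) * ratio)
    return large, large * ratio


if __name__ == "__main__":
    for cond in ("ADR", "exclusive_1DR", "relative_1DR"):
        print(cond, reward_sizes(cond))
```

Under these assumptions, the rewarded direction in exclusive 1DR would carry four times the ADR amount, and in relative 1DR the large and small rewards would be 2.5 and 0.5 times the ADR amount, respectively.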
Eye movements were recorded using the search coil method (Enzanshi Kogyo MEL-20U) (Judge et al. 1980; Matsumura et al. 1992; Robinson 1963). Eye positions were digitized at 500 Hz and stored into an analog file continuously during each block of trials.
Before the single-unit recording experiment, we obtained MR images (AIRIS, 0.3 T; Hitachi, Tokyo, Japan) such that they were perpendicular to the recording chamber. We then determined the recording sites in the CD based on the chamber-based coordinates (Kawagoe et al. 1998). The recording sites were further verified by MR imaging in a plastic guide tube through which the electrodes were inserted.
Single-unit recordings were performed using tungsten electrodes (diameter: 0.25 mm, 1–5 MΩ, measured at 1 kHz; Frederick Haer). A hydraulic microdrive (MO95-S; Narishige, Japan) was used to advance the electrode into the brain. We recorded extracellular spike activity of presumed projection neurons, which showed very low spontaneous activity (Hikosaka et al. 1989a), but not of presumed interneurons, which showed irregular tonic discharge (Aosaki et al. 1994).
To find CD projection neurons, we let the monkeys perform 1DR continuously. If a CD neuron was found, we let the monkeys perform some blocks of 1DR with different rewarded directions, each for several trials. Depending on the neuron's preferred direction, we chose a set of four target locations of equal eccentricity, arranged in either normal or oblique angles. The target eccentricity was usually set to either 10° or 20°. We then asked the monkeys to perform at least one block of ADR and four blocks of 1DR (i.e., four different rewarded directions). In addition, we sometimes repeated 1DR blocks to confirm the reproducibility of the neuron's behavior.
We changed the experimental procedures in some experiments, by arranging the targets linearly, not concentrically (Fig. 6) or using only two targets out of four (Fig. 7). The averaged amount of reward per trial was set approximately the same between the four-target version and the two-target version of 1DR, as well as ADR.
This study focused on the precue activity that started after onset of the fixation point and ended soon after (<150 ms) the cue presentation. We first determined the duration of the precue activity for each neuron (test duration) and calculated the spike frequency during the test duration. We did not set any control period because the neurons we analyzed showed very low spontaneous activity. To test whether the precue activity was different among the rewarded directions, we performed the following analyses.
Selectivity of the precue activity for the rewarded direction: A one-way ANOVA was performed for the magnitudes of the precue activities in four blocks of 1DR.
Polar diagram: The selectivity of a CD neuron for the rewarded direction was also expressed by four vectors that represented the precue activities in the four blocks of 1DR. The direction of each vector corresponded to the rewarded direction and its amplitude corresponded to the magnitude of the precue activity.
Direction vector (DV): The four vectors constituting the polar diagram were summed (ΣV). The summed vector was then divided by the sum of the amplitudes of the four vectors (Σ|V|), so that DV = ΣV / Σ|V|. The direction of the DV indicates the preferred direction, and its length, which ranges from 0 (no tuning) to 1 (activity in one block only), indicates the sharpness of the directional tuning. See Fig. 5 C.
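The ANOVA and direction-vector analyses listed above can be sketched in Python as follows. This is an illustrative sketch, not the analysis code used in the study: the function names, the dictionary input keyed by rewarded-direction angle in degrees, and the example values are all assumptions. To stay within the standard library, the ANOVA function returns only the F statistic and its degrees of freedom; the P value would be obtained from the F distribution in a separate step.

```python
import math


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.
    `groups` is a list of lists, e.g. per-trial precue firing rates
    for each of the four 1DR blocks. Requires nonzero within-group variance."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


def direction_vector(rate_by_angle):
    """Return (length, angle_deg) of the vector sum of block vectors divided
    by the sum of their amplitudes. `rate_by_angle` maps the rewarded
    direction (degrees) to the mean precue firing rate (non-negative)."""
    sx = sum(r * math.cos(math.radians(a)) for a, r in rate_by_angle.items())
    sy = sum(r * math.sin(math.radians(a)) for a, r in rate_by_angle.items())
    total = sum(rate_by_angle.values())
    if total == 0:
        return 0.0, None  # no activity in any block: direction undefined
    length = math.hypot(sx, sy) / total
    angle = math.degrees(math.atan2(sy, sx)) % 360.0
    return length, angle


if __name__ == "__main__":
    # Made-up rates (spikes/s) for blocks rewarded at 0, 90, 180, 270 degrees.
    print(direction_vector({0: 12.0, 90: 4.0, 180: 1.0, 270: 3.0}))
```

A length near 1 corresponds to a neuron active in essentially one rewarded-direction block, and a length near 0 to a neuron whose precue activity is similar across blocks.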
Selectivity of precue activity for rewarded location
We recorded single-unit activities of neurons in the CD of three monkeys. We selected neurons that showed low spontaneous activity and are presumed to be GABAergic projection neurons; we did not record from tonically active neurons (TANs), which are presumed to be interneurons (Aosaki et al. 1994). We examined each neuron by performing one block of ADR and four blocks of 1DR (Fig. 1). We found several types of activity in CD neurons: activity preceding the cue stimulus (precue activity); responses to the instruction cue stimulus (postcue activity) (Kawagoe et al. 1998); and activity preceding a saccade (presaccadic activity) (Takikawa et al. 2000).
In the present study we focus on the precue activity. Figure 2 shows a typical example of the precue activity recorded in the left CD. In ADR, the ordinary memory-guided saccade task, the neuron showed weak precue activity initially, which disappeared toward the end of the ADR block; the neuron was otherwise nearly silent. In contrast, the neuron showed strong precue activity in 1DR, especially in the block when the right direction was rewarded (left column). The precue activities in the other blocks of 1DR were much weaker, weakest in the left-rewarded block (third column from left).
We emphasize that the precue activity was not selective for the direction of the cue stimulus (Fig. 3; one-way ANOVA, P > 0.05). This is not surprising because the cue stimulus was presented pseudorandomly in the same four directions in every block of 1DR. Instead, the selectivity of the precue activity was observed across 1DR blocks among which different directions were rewarded.
One might argue that the recording condition may change across the blocks. We excluded this possibility most carefully by repeating the same 1DR blocks. For example, the neuron shown in Fig. 2 was examined repeatedly, part of which is shown in Fig. 7.
The reward-direction-selectivity of the precue activity was a common feature (Fig. 4). Neurons A–C showed similar discharge patterns, such that their activity reached a peak at or just after cue onset, whereas neuron D was most active some time before cue onset. Neuron C showed some postcue activity as well, unlike the others. The preferred reward-direction varied among these neurons, but mostly toward the contralateral side. Among 329 neurons related to the memory-guided saccade tasks (ADR and 1DR), 156 showed precue activity. We examined 91 out of the 156 neurons using four blocks of 1DR and one block of ADR. Among them, 83 (91%) showed clear spatial selectivity (one-way ANOVA, P < 0.01).
Among 91 neurons with the precue activity, 62 neurons showed a response to the cue stimulus (postcue activity) as well and the remaining 29 neurons showed only precue activity. Figure 5 A shows the population activity of neurons with precue activity only, separately for the best and worst reward-directions of 1DR and ADR. The precue activity, on average, grew gradually after the onset of the fixation point, and then declined sharply at about 100 ms after the cue onset. The precue activity in the 1DR-worst condition was similar to that in ADR. The other 62 neurons, which also had postcue activity, showed very similar patterns of precue activity (not shown). In most neurons, the “best” reward-directions were in the contralateral field, while the “worst” reward-directions were in the ipsilateral field (Fig. 5 B). Although some neurons had preferred reward-directions in the ipsilateral direction (dots in the ipsilateral hemifield in Fig. 5 C), they were less sharply tuned compared with contralateral preferring neurons (ipsilateral dots closer to the center than contralateral dots in Fig. 5 C).
The results shown in Fig. 5 raised the question of whether the precue activity, rather than being merely selective for the rewarded direction, has a spatial field for the rewarded location. To test this possibility, we performed an experiment as shown in Fig. 6. Using a concentric set of targets, we first found that the precue activity of the neuron was selective for the leftward direction (not shown). We then used a set of four target locations in the horizontal meridian, instead of the four concentric locations (Fig. 6). The precue activity was strongest when the leftmost location (left 20°) was rewarded, decreasing monotonically toward the rightmost location. Among 8 neurons examined in the same procedure, 5 showed the same tendency, in that the precue activity was strongest for the contralateral, most eccentric rewarded location; 3 neurons showed a preference for an intermediate location.
Relativity of precue activity
In 1DR so far described, the rewarded trials were less common than the nonrewarded trials. It was possible that the precue activity was stronger when a less-common event occurred in a particular direction (tentatively called an “uncommon” theory). To test this possibility we used a two-target (not the four-target) schedule: the cue stimulus was presented in one of two possible directions while the rewarded direction was fixed to one of them in a particular block. Figure 7 shows the results of the two-target 1DR for the same neuron as in Fig. 2. We examined all six target combinations, each containing two blocks with different rewarded directions. The precue activity was very strong whenever the right (contralateral) direction was rewarded, no matter which direction it was paired with as the nonrewarded direction (column R in Fig. 7 A; R in Fig. 7 B). In contrast, the precue activity was very weak whenever the left direction was rewarded (column L in Fig. 7 A; L in Fig. 7 B). The results excluded the “uncommon” theory, and instead indicated that the precue activity is related to the rewarded direction (“reward” theory). The same result was obtained in 5 out of 6 neurons using the two-target 1DR.
However, the magnitude of the precue activity for a particular rewarded direction varied, depending on the paired nonrewarded direction. For example, the precue activity for the upward rewarded direction was weakest when it was paired with the rightward direction, stronger when paired with the downward direction, and strongest when paired with the leftward direction (column U in Fig. 7 A; U in Fig. 7 B). The other 5 neurons examined also showed the same tendency. Thus the magnitude of the precue activity for a particular rewarded direction was inversely correlated with the magnitude of the precue activity for the paired rewarded direction. Nonetheless, the reward-direction tuning of the precue activity averaged across different pairs in the two-target 1DR was very similar to that obtained in the four-target 1DR (Fig. 7 C).
Emergence of precue activity
The reward-direction selectivity of the precue activity became evident gradually within one block of 1DR, as illustrated in Fig. 2, which we call within-block change. Figure 8 A shows, for another neuron, the within-block changes of precue activity in the best and worst reward-directions of 1DR and ADR. Similar changes in the precue activity were observed in other neurons, as summarized in Fig. 8 B. The precue activity was usually low at the beginning of the block, but increased within several trials in the 1DR block in which the reward was given for the neuron's best reward-direction; it decreased more gradually in the 1DR blocks in which the reward was given for the worst reward-direction. The precue activity in ADR showed a change very similar to that of the 1DR worst condition.
A novel type of spatial selectivity in caudate neurons with anticipatory activity
Using the one-direction-rewarded version of a memory-guided saccade task (1DR), we found that about half of task-related CD neurons showed precue anticipatory activity. The magnitude of the precue activity varied remarkably across the 1DR blocks in which different directions were rewarded. This could be regarded as a kind of spatial selectivity, but of a type that has never been reported.
The precue activity in CD neurons was found in a previous study (Hikosaka et al. 1989c), in which correct memory-guided saccades were always rewarded. It was thought to reflect the monkey's expectation or prediction of the cue stimulus, because the cue stimulus was presented in a highly predictable manner and the activity grew larger gradually and then stopped immediately after cue presentation. However, the results obtained in the present study using 1DR do not support this idea. The precue activity could not simply be related to the predictability of the event, because the probability of the cue to be presented in a particular direction was equally 1/4 across the four blocks of 1DR and yet the precue activity was present selectively in the block when one particular direction was rewarded.
The spatial or direction selectivity is a common feature of sensory neurons, which is usually called receptive field. A similar spatial selectivity has been shown for neurons that encode spatial working memory, which would be called memory field (Funahashi et al. 1989; Rainer et al. 1998; Sawaguchi and Goldman-Rakic 1994). Here, for the first time, we demonstrate that anticipatory activity could also be spatially selective. Whereas the sensory or memory field is contingent on a sensory stimulus that has already been presented (i.e., retrospective), the spatial selectivity of the precue activity is contingent on a reward that has not yet been presented but is expected (i.e., prospective). The precue activity would represent a particular position–reward association prospectively.
Relation to reinforcement learning
The reward-direction-selective precue activity may also be considered in the framework of reinforcement learning (Barto 1994; Houk et al. 1995; Schultz 1998; Wickens and Kötter 1995). The theory of reinforcement learning states that, if an action yields a reward, the action is subsequently reinforced. This may be difficult if the reward is given long after the action so that the neural mechanism for the action may not be identified and therefore may not be reinforced (frequently called credit assignment problem). Since the precue activity would indicate a particular position–reward association before an action, it may help solve the credit assignment problem.
Interestingly, neural activities in the basal ganglia have provided evidence that this problem could be solved adequately. Notably, dopaminergic neurons are activated by the sensory event that indicates the future reward (Schultz et al. 1993, 1997). Visual responses of CD neurons are profoundly enhanced (or depressed) if the stimulus indicates the future reward (Kawagoe et al. 1998). The precue activity of CD neurons might facilitate the postcue visual response of the same CD neurons to yield the reward-predicting feature (Kawagoe et al. 1998). This may in turn modulate the activity of dopaminergic neurons, since CD neurons would connect to dopaminergic neurons in the substantia nigra pars compacta, directly or indirectly through GABAergic neurons in the substantia nigra pars reticulata (Grofova et al. 1982; Hajós and Greenfield 1994; Tepper et al. 1995; Van den Pol et al. 1985). Alternatively, the dopaminergic neurons, once they acquire the reward-predicting feature, would condition the activity of CD neurons (Calabresi et al. 1997; Cepeda et al. 1993; Reynolds and Wickens 2000). The mutual relationship between the CD and the substantia nigra would provide the key to understanding the neural mechanism of reinforcement learning.
Origin and destination of precue anticipatory activity
Neurons that anticipate task-specific events have been found in the prefrontal (Sakagami and Niki 1994; Watanabe 1996), premotor (Mauritz and Wise 1986), and parietal (Mackay and Crammond 1987) cortices; basal ganglia (Hikosaka et al. 1989c; Schultz et al. 1992); and even in the superior colliculus (Basso and Wurtz 1998; Dorris and Munoz 1995). A simple idea is that the precue activities of CD neurons are caused by the inputs from some of these areas. The most likely source among them is probably the prefrontal cortex, which has heavy projections to the central part of the CD (Selemon and Goldman-Rakic 1985; Yeterian and Pandya 1991), where most of the neurons were recorded in this study.
Preliminary studies from our laboratory using 1DR and ADR (Kobayashi et al. 2000) indeed have indicated that some neurons in the dorsolateral prefrontal cortex showed precue activities. However, the prefrontal neurons were different from CD neurons, in that the precue activities were not dependent on the rewarded direction. There are at least two explanations for the discrepancy. First, the precue activities of CD neurons may not be derived from the dorsolateral prefrontal cortex. However, the source of the anticipatory signal may not be found in the basal ganglia, because neither presumed cholinergic interneurons in the CD (Shimo et al. 2001) nor presumed dopaminergic neurons in and around the substantia nigra (Kawagoe et al. 1999) show anticipatory activities. Second, the precue activities of CD neurons may indeed be derived from the dorsolateral prefrontal cortex, but the cortical signal is somehow conditioned to be selective for the rewarded direction. Dopaminergic inputs may play a role in the conditioning, because dopamine neurons show a short postcue burst only in the rewarded trial (Kawagoe et al. 1999).
We thank H. Nakahara, H. Itoh, and J. Lauwereyns for helpful comments; M. Kato and B. Coe for designing the computer programs; and M. Koizumi for technical support.
This work was supported by Grant-in-Aid for Scientific Research on Priority Areas (C) of Ministry of Education, Culture, Sports, Science and Technology; Core Research for Evolutional Science and Technology of Japan Science and Technology Corporation; and Japan Society for the Promotion of Science Research for the Future program.
Address for reprint requests: O. Hikosaka, Dept. of Physiology, Juntendo University School of Medicine, 2-1-1 Hongo, Bunkyo-ku, Tokyo 113-8421, Japan.
Copyright © 2002 The American Physiological Society