Journal of Neurophysiology

Reward-Predicting Activity of Dopamine and Caudate Neurons—A Possible Mechanism of Motivational Control of Saccadic Eye Movement

Reiko Kawagoe, Yoriko Takikawa, Okihide Hikosaka

Abstract

Recent studies have suggested that the basal ganglia are related to motivational control of behavior. To study how motivational signals modulate motor signals in the basal ganglia, we examined activity of midbrain dopamine (DA) neurons and caudate (CD) projection neurons while monkeys were performing a one-direction-rewarded version (1DR) of a memory-guided saccade task. The cue stimulus indicated the goal position for an upcoming saccade and the presence or absence of reward after the trial. Among the four monkeys we studied, three were sensitive to reward such that saccade velocity was significantly higher in the rewarded trials than in the nonrewarded trials; one monkey was insensitive to reward. In the reward-sensitive monkeys, both DA and CD neurons responded differentially to reward-indicating and no-reward-indicating cues. Thus DA neurons responded with excitation to a reward-indicating cue and with inhibition to a no-reward-indicating cue. A group of CD neurons responded to the cue in their response fields (mostly contralateral) and the cue response was usually enhanced when it indicated reward. In the reward-insensitive monkey, DA neurons showed no response to the cue, and the cue responses of CD neurons were not modulated by reward. Many CD neurons in the reward-sensitive monkeys, but not in the reward-insensitive monkey, showed precue activity. These results suggest that DA neurons, with their connection to CD neurons, modulate the spatially selective signals in CD neurons in a reward-predicting manner and that CD neurons in turn modulate saccade parameters with their polysynaptic connections to the oculomotor brain stem.

INTRODUCTION

The basal ganglia play an important role in voluntary saccadic eye movements (Hikosaka et al. 2000). This is done mainly by serial inhibitory connections: one from the caudate (CD) to the substantia nigra pars reticulata (SNr) and the other from the SNr to the superior colliculus (SC). Spatial information necessary for saccades may be provided by the inputs from the frontal eye field and the surrounding prefrontal cortices (Selemon and Goldman-Rakic 1985; Yeterian and Pandya 1991; Parthasarathy et al. 1992). Recent studies from our laboratory indicated that this basal ganglia mechanism modifies a saccade depending on whether the saccade is followed by reward (Kawagoe et al. 1998; Sato and Hikosaka 2002; Takikawa et al. 2002a,b). An important question is: Where does the reward-related information originate? A candidate for the input would be midbrain dopaminergic neurons that are located in and around the substantia nigra pars compacta (SNc) (Andén et al. 1964). Indeed, animal experiments provided some hints that DA neurons are related to reward and motivation (Robbins and Everitt 1996). DA neurons in monkeys respond to reward and to sensory stimuli that predict reward (Schultz 1998). Specifically, DA neurons respond to unexpected reward and show depression of activity when reward is unexpectedly omitted (Schultz et al. 1997) or shifted (Hollerman and Schultz 1998). Other lines of evidence suggest a common role of DA neurons in animals and humans. Thus reinforcement of brain self-stimulation in animals and drug addiction in humans seem to be mediated by DA neurons (Wise 1996). Human imaging studies have suggested that rewarding experiences are associated with changes in neural activity (Delgado et al. 2000; Knutson et al. 2000) or increases in DA transmission in the striatum (Koepp et al. 1998). These observations provided experimental evidence for recent theories of reinforcement learning (Barto 1994; Houk et al. 1995; Montague et al. 1996).

Despite a large body of data on DA neurons, it is still unclear how their signals are used to modify motor outputs. A likely site where DA neurons may influence motor signals is the dorsal striatum (including the CD and putamen), since a majority of DA neurons project to the striatum (Lynd-Balta and Haber 1994) and its dorsal part is thought to have motor functions (Alexander and Crutcher 1990). Inside the striatum are cellular and molecular mechanisms with which DA neurons can exert strong influences on striatal projection neurons (Calabresi et al. 1997; Nicola et al. 2000). DA depletion in the unilateral CD induces deficits in initiation of saccades to the contralateral side (Kato et al. 1995; Kori et al. 1995) or contralateral hemineglect (Miyashita et al. 1995). Since CD projection neurons are a major origin of eye movement signals in the basal ganglia (Hikosaka et al. 2000), DA neurons are in a good position to induce the reward-contingent modulation of activity of CD neurons and other basal ganglia neurons.

Using a modified memory-guided saccade task (called 1DR task) (Kawagoe et al. 1998), in which reward was given for one particular direction out of four targets, we found that activity of CD neurons is strongly modulated by the presence and absence of an upcoming reward (Kawagoe et al. 1998) and the position at which reward is given (Lauwereyns et al. 2002b; Takikawa et al. 2002a). To study the function of midbrain DA neurons in reward-oriented saccades, we recorded from CD neurons and DA neurons in the same monkeys using 1DR. We found that activity of DA neurons was even more clearly modulated by an upcoming reward. Interestingly, one monkey showed no reward-dependent modulation of saccades, unlike the other monkeys. In this monkey neither CD neurons nor DA neurons exhibited reward-dependent modulation of visual cue responses. Such individual differences were consistent with the hypothesis that the reward dependencies of DA neurons and CD neurons are causally related such that saccades are modulated depending on an upcoming reward (Takikawa et al. 2002b).

METHODS

General

We used four male Japanese monkeys (Macaca fuscata): monkeys G, H, K, and M. Under general anesthesia, we implanted a head holder, chambers for unit recording, and a scleral search coil (Hikosaka et al. 1993). All surgical and experimental protocols were approved by the Juntendo University Animal Care and Use Committee and are in accordance with the National Institutes of Health Guide for the Care and Use of Animals. The monkeys were trained to perform saccade tasks, especially a memory-guided saccade task (Hikosaka and Wurtz 1983b). Eye movements were recorded using the search coil method. We recorded extracellular spike activity of presumed projection neurons in the CD in all four monkeys (G, H, K, and M) and presumed DA neurons in and around the SNc in three monkeys (G, H, and M).

Task procedures

The monkeys performed the memory-guided saccade task in two different reward conditions: all-directions-rewarded condition (ADR) and one-direction-rewarded condition (1DR). We used a set of four target locations of equal eccentricity (either 10° or 20°), arranged in either normal or oblique angles, depending on the neuron's response field. For every neuron recorded, we required the monkeys to perform one block of ADR and four blocks of 1DR (i.e., four different rewarded directions).

A task trial started with the onset of a central fixation point that the monkeys had to fixate (see Fig. 1). A cue (spot of light, duration: 100 ms) came on 1 s after onset of the fixation point at one of the four target locations, and the monkeys had to remember its location. After a random period (1-1.5 s), the fixation point turned off, and the monkeys were required to make a saccade to the previously cued location. The target came on 400 ms later for 150 ms at the cued location. The saccade was judged to be correct if the eye position was within a window around the target (usually within 3°) when the target turned off. The correct saccade was indicated by a tone stimulus. The next trial started after an intertrial interval of 3.5-4 s.

fig. 1.

Memory-guided saccade task in 1-direction-rewarded condition (1DR). A: durations of stimuli and other task-related events. B: four blocks of 1DR. Only 1 direction was rewarded throughout a block of 60 trials. Different directions were rewarded in different blocks. See methods for details.

In ADR, every correct saccade was rewarded with a liquid reward together with the tone. In 1DR only one of the four locations was rewarded while the other directions were not rewarded (as illustrated in Fig. 1). The rewarded direction was fixed in each block of experiments which consisted of 60 successful trials. Even for the nonrewarded direction, the monkeys had to make a correct saccade. If the saccade was incorrect, the same direction was repeated. The target cue was chosen pseudorandomly such that the four directions were randomized in every subblock of four trials; thus one block of experiment (60 trials) contained 15 trials for each direction. 1DR was performed in four blocks, in each of which a different direction was rewarded (highly rewarded for monkey K). Other than the actual reward, no indication was given to the monkeys as to which direction was currently rewarded. The average amount of reward in one block of experiment was set approximately the same between 1DR and ADR; in other words, the amount of reward in a rewarded trial of 1DR was approximately four times larger than the amount of reward in a trial of ADR.
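The pseudorandom cue schedule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the original task-control code: the function name `make_1dr_block`, the direction labels, and the seed are hypothetical, and error trials (which were repeated in the actual task) are not modeled.

```python
import random

# Hypothetical labels for the four target locations (oblique arrangement).
DIRECTIONS = ["RU", "LU", "LD", "RD"]


def make_1dr_block(rewarded, n_subblocks=15, seed=None):
    """Return a list of (cue_direction, is_rewarded) pairs for one 1DR block.

    Each subblock of four trials contains every direction exactly once in
    shuffled order, so 15 subblocks give 60 trials with 15 per direction.
    """
    rng = random.Random(seed)
    trials = []
    for _ in range(n_subblocks):
        order = DIRECTIONS[:]
        rng.shuffle(order)
        trials.extend((d, d == rewarded) for d in order)
    return trials


block = make_1dr_block("RU", seed=0)
```

Because only one trial in four is rewarded in such a block, giving about four times the ADR amount on each rewarded trial keeps the average reward per block approximately equal between the two conditions.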

The monkeys were first trained to perform the ADR task. They were then trained to perform the 1DR task. Before the recording experiment started, the monkeys had been trained on the 1DR task extensively for more than 300 blocks.

Recording procedures

Eye movements were recorded using the search coil method (Enzanshi Kogyo MEL-20U) (Judge et al. 1980; Matsumura et al. 1992; Robinson 1963). Eye positions were sampled at 500 Hz. The behavioral tasks as well as storage and display of data were controlled by a computer (PC 9801RA, NEC, Tokyo, Japan). The unitary action potentials were passed through a window discriminator (Bak INC, Model DDIS-1), and the times of their occurrences were stored with a resolution of 1 ms.

Before the single-unit recording experiment, we obtained MR images (Hitachi, AIRIS, 0.3T) such that they were perpendicular to the recording chamber. We then determined the recording sites in the substantia nigra based on the chamber-based coordinates (Kawagoe et al. 1998). The recording sites were further verified by MR-imaging a plastic guide tube through which the electrodes were inserted.

Single-unit recordings were performed using tungsten electrodes (diameter: 0.25 mm, 1-5 MΩ, measured at 1 kHz, Frederick Haer). To introduce the electrode into the brain, we first inserted a stainless steel guide tube (OD 0.85 mm, ID 0.60 mm) containing the electrode. A hydraulic microdrive (Narishige, MO95-S) was used both to insert the guide tube and subsequently to advance the electrode into the brain. We sometimes implanted a plastic guide tube (OD 1.1 mm, ID 0.8 mm) semipermanently at the location where DA neurons were concentrated. The location of the guide tube was visualized on MRIs and was confirmed to be directed to the SNc.

Identification of DA neurons and CD projection neurons

We searched for DA neurons in and around the SNc (François et al. 1985). Guided by MRIs, we first searched for the SNr, where neurons showed high-frequency spontaneous firing (Hikosaka and Wurtz 1983a). The identity of the SNr was assured if such rapid firing stopped in relation to saccade tasks (Sato and Hikosaka 2002). Often intermingled with or just medial to SNr neurons (usually ≤2 mm) were presumed DA neurons that fired irregularly and tonically at about 5 spikes/s with broad spike potentials. Extracellular spikes may have an initial positive component or may be followed by a prolonged positive component (Schultz and Romo 1987). A neuron with these features was thus determined to be a DA neuron. We then examined whether it responded to the delivery of reward. A drop of water as a reward was given with a random time interval (4-9 s) while the monkey was sitting in the dim experimental room. If the neuron responded to reward, we had the monkeys perform at least one block of ADR and four blocks of 1DR (i.e., four different rewarded directions). In addition, we sometimes repeated 1DR blocks to confirm the reproducibility of the neuron's behavior.

We recorded extracellular spike activity of presumed projection neurons, which showed very low spontaneous activity (<1 Hz) (Hikosaka et al. 1989a), not of presumed interneurons, which showed irregular tonic discharge (Aosaki et al. 1994). To find CD projection neurons, we let the monkeys perform 1DR continuously because CD neurons were often very quiet if the monkey was not performing a task. If a CD projection neuron was found and it fired at any period while the monkey was performing 1DR, we had the monkeys perform at least one block of ADR and four blocks of 1DR.

Data analysis

determination of a response. We analyzed the postcue response of CD neurons and DA neurons and the postreward response of DA neurons. For each response, we determined its presence by comparing the spike frequency in the control period with the spike frequency in the test period using the Wilcoxon signed-rank test (P < 0.001), separately for rewarded and nonrewarded trials. For CD neurons, two test periods after the cue stimulus were set for each neuron: 200-500 ms for the visual response and 500-1000 ms for the sustained response. If the neuron showed a significant postcue response at 200-500 and/or 500-1000 ms, we determined that a postcue response was present. We focused on the reward-enhanced type, whose activity in the test period was higher in rewarded trials than in nonrewarded trials. For DA neurons, the test period for the postcue response was 100-300 ms (for excitatory responses) and 200-500 ms (for inhibitory responses). The control period was set just before the onset of the fixation point for both CD and DA neurons, its duration being the same as that of the test period. The test period for the response to reward was set to be 100-400 ms after the time when reward would be delivered. We then determined the presence of the reward response in the same way as the cue response (Wilcoxon signed-rank test, P < 0.001).
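The response criterion can be illustrated with the following sketch. It assumes spike times in milliseconds relative to an alignment event and uses a normal approximation to the signed-rank distribution without tie or continuity correction; the text does not state which variant of the test was used, and the function names `firing_rate` and `wilcoxon_signed_rank` are hypothetical.

```python
import math


def firing_rate(spike_times_ms, start_ms, end_ms):
    """Spike frequency (spikes/s) in the half-open window [start_ms, end_ms)."""
    n = sum(1 for t in spike_times_ms if start_ms <= t < end_ms)
    return 1000.0 * n / (end_ms - start_ms)


def wilcoxon_signed_rank(control, test):
    """Two-sided Wilcoxon signed-rank test on paired per-trial rates.

    Returns (W_plus, p). Zero differences are dropped and tied absolute
    differences receive average ranks. The P value comes from a normal
    approximation (an assumption of this sketch).
    """
    diffs = [t - c for c, t in zip(control, test) if t != c]
    n = len(diffs)
    if n == 0:
        return 0.0, 1.0
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg = (i + j) / 2.0 + 1.0  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4.0
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24.0)
    z = (w_plus - mean) / sd
    return w_plus, math.erfc(abs(z) / math.sqrt(2.0))
```

In use, `control` would hold the per-trial rates in the prefixation control window and `test` the rates in the matching test window (for example 200-500 ms after cue onset for a CD visual response), and a response would be counted as present when the returned P value is below 0.001.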

statistical analysis for the reward and spatial selectivity. A normalized cue response was calculated, for each trial, as the spike frequency during the test period minus the spike frequency during the control period. We performed nonparametric analyses of reward and spatial selectivity on the neuron's normalized responses across the four blocks of 1DR. The first eight trials in each block were excluded from the analysis. Reward selectivity of the postcue response in 1DR was tested by the Mann-Whitney U test (P < 0.01) by comparing the responses in the rewarded trials with the responses in the nonrewarded trials. Spatial selectivity of the postcue response was determined by the Kruskal-Wallis test (P < 0.01) by comparing the responses in the four cued directions in 1DR. We used nonparametric tests because in many trials there was no spike in a test or control period and hence the sampled spike frequencies were not normally distributed.
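A minimal sketch of the normalized response and the reward-selectivity comparison is given below. The function names are hypothetical, the P value uses a normal approximation without tie correction (an assumption, since the exact procedure is not specified), and the Kruskal-Wallis test for spatial selectivity is not shown.

```python
import math


def normalized_responses(test_rates, control_rates, n_exclude=8):
    """Per-trial test-minus-control spike frequency for one block,
    skipping the first n_exclude trials of the block."""
    return [t - c for t, c in zip(test_rates, control_rates)][n_exclude:]


def mann_whitney_u(x, y):
    """Return (U, p) for sample x versus sample y.

    U counts the pairs in which a value of x exceeds a value of y,
    with ties counted as 0.5. The two-sided P value is a normal
    approximation (an assumption of this sketch).
    """
    u = 0.0
    for a in x:
        for b in y:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    n1, n2 = len(x), len(y)
    mean = n1 * n2 / 2.0
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mean) / sd
    return u, math.erfc(abs(z) / math.sqrt(2.0))
```

Here `x` would hold the normalized responses from rewarded trials and `y` those from nonrewarded trials, pooled over the four 1DR blocks, with reward selectivity accepted at P < 0.01.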

reward/nonreward discrimination of saccade parameters (see fig. 2 and table 1). For each experiment including the four blocks of 1DR, we obtained mean values of saccade parameters (velocity, latency, and amplitude) separately for rewarded trials (52 trials: 13 trials × 4 blocks) and nonrewarded trials (156 trials: 39 trials × 4 blocks). The first eight trials in each block were excluded from the analysis. We then performed a statistical comparison for the entire set of experiments between the rewarded and nonrewarded conditions. For this purpose, we used the Wilcoxon signed-rank test (P < 0.0001) by pairing the mean value of rewarded trials and the mean value of nonrewarded trials for each experiment.
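The trial counts quoted above follow from the block structure, as the short calculation below shows. It assumes that the eight excluded trials are the first two complete subblocks of four successful trials, so that each direction loses exactly two trials; the variable names are illustrative only.

```python
TRIALS_PER_BLOCK = 60   # successful trials in one 1DR block
N_DIRECTIONS = 4        # one rewarded, three nonrewarded
N_EXCLUDED = 8          # first eight trials of each block
N_BLOCKS = 4            # one block per rewarded direction

per_direction = TRIALS_PER_BLOCK // N_DIRECTIONS            # 15
excluded_per_direction = N_EXCLUDED // N_DIRECTIONS         # 2
kept_per_direction = per_direction - excluded_per_direction  # 13

rewarded_per_block = kept_per_direction * 1                 # 13
nonrewarded_per_block = kept_per_direction * 3              # 39

rewarded_total = rewarded_per_block * N_BLOCKS              # 52
nonrewarded_total = nonrewarded_per_block * N_BLOCKS        # 156
```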

fig. 2.

Effects of an upcoming reward on peak saccadic velocity in reward-sensitive monkeys (A) and a reward-insensitive monkey (B). The mean peak saccadic velocity in the rewarded condition (ordinate) is plotted against that in the nonrewarded condition (abscissa). The mean saccade velocities for each neuron (indicated by a dot) were obtained by averaging across saccades separately for the rewarded and nonrewarded trials. Vertical and horizontal bars at each dot indicate SEs. The data were obtained from monkeys H, G, K, and M based on 37, 40, 35, and 28 experiments, respectively, in which target eccentricity of 20° was used.

table 1.

Effects of an upcoming reward on saccade parameters and error rate in 1DR task

reward/nonreward discrimination of error rate (table 1). For each experiment including the four blocks of 1DR, we obtained the number of errors separately for rewarded and nonrewarded trials. We then calculated total error rate across experiments.

Histology

After recording was completed, monkeys H, G, and M were anesthetized with an overdose of pentobarbital sodium and perfused through the heart with 4% paraformaldehyde. The brain was blocked and equilibrated with 20% sucrose. Frozen sections were cut every 50 μm in the planes parallel to the electrode penetrations so that complete tracks were visible in single sections. The sections were stained with cresyl violet. Reconstruction of the recording sites was based on microlesions (5 μA for 180 s) made at the end of some recording experiments. Other recording sites were estimated based on these microlesions. To examine the dopaminergic activity in the brain of monkeys G and M, we used tyrosine hydroxylase (TH) immunohistochemistry to visualize dopaminergic neurons in and around the SNc and axon terminals in the CD, according to the procedure described previously (Kato et al. 1995). One of every five sections was subject to TH immunohistochemistry and this was done for the anterior-posterior dimension including the entire substantia nigra.

RESULTS

Behavioral sensitivity to reward

We trained four monkeys to perform 1DR of the memory-guided saccade task. Before the recording experiment started, all monkeys had been trained on the 1DR task for more than 300 blocks. As shown in Fig. 2A, in three monkeys (H, G, and K), the peak saccadic velocity was significantly higher when the saccade was followed by reward (ordinate) than when the saccade was not followed by reward (abscissa) (Wilcoxon signed-rank test, P < 0.0001, also see Table 1). This phenomenon has been reported previously (Takikawa et al. 2002b). In monkey M (Fig. 2B), however, the peak saccadic velocity was not different between the two conditions (Wilcoxon signed-rank test, P > 0.01, Table 1). In monkeys H and K, the saccade latency was significantly shorter in rewarded than in nonrewarded trials (Wilcoxon signed-rank test, P < 0.0001, Table 1). Furthermore, the coefficients of variation of saccade parameters were significantly lower in monkeys H, G, and K in rewarded trials than in nonrewarded trials (Wilcoxon signed-rank test, P < 0.0001, parentheses in Table 1). The error rate was significantly lower in monkeys H, G, and K in rewarded trials than in nonrewarded trials (Wilcoxon signed-rank test, P < 0.0001, Table 1). These tendencies were absent in monkey M. On the other hand, the saccade amplitude was not significantly changed (Table 1). Therefore we judged that three monkeys (H, G, and K) were sensitive and one monkey (M) was insensitive to an upcoming reward. The saccade velocity was generally higher in monkey M than in the other monkeys regardless of reward condition (Table 1).

Search for DA neurons and CD projection neurons

A total of 368 neurons in three monkeys (H, G, and M) were determined to be DA neurons (see methods). Among them, 203 (55%) responded to the delivery of reward by phasically increasing their discharge rate. We further examined these reward-responsive DA neurons using the ADR and 1DR. The other reward-unresponsive neurons were not examined further, although we cannot exclude the possibility that they also were DA neurons. The data presented in this paper are based on 71 DA neurons for which we were able to examine at least one block of ADR and four blocks of 1DR. They were recorded from monkeys H (n = 16), G (n = 32), and M (n = 23).

A total of 397 neurons in four monkeys (H, G, K, and M) were determined to be CD projection neurons (see methods) and were related to the performance of the memory-guided saccade task. Among them, 293 CD neurons responded to the visual cue stimulus in the memory-guided saccade task (Kawagoe et al. 1998); other types of neurons (Lauwereyns et al. 2002a; Takikawa et al. 2002a) are not included if they showed no visual response. The data presented in this paper are based on 206 CD neurons for which we were able to examine at least one block of ADR and four blocks of 1DR. They were recorded from monkeys H (n = 84), G (n = 59), K (n = 36), and M (n = 27).

The behavioral sensitivity (indicated by the saccade velocity) to reward was well correlated with the neuronal sensitivity to reward. In the three reward-sensitive monkeys (H, G, and K), a majority of both CD and DA neurons tended to be sensitive to an upcoming reward. In the reward-insensitive monkey (M) a majority of CD and DA neurons were insensitive to reward. In the following we mainly concentrate on three monkeys in which we recorded from both DA neurons and CD neurons (monkeys H, G, and M).

Comparison of neuronal activity between reward-sensitive and reward-insensitive monkeys

Figure 3 shows typical examples of a DA neuron (top) and a CD neuron (bottom) recorded in reward-sensitive monkey H. The data represent the neuronal sensitivity to an upcoming reward. The DA neuron responded to reward with a burst of spikes when the reward was delivered while the monkey was not engaged in a task (data not shown). However, when the same reward was delivered in response to the correct performance of the monkey in ADR or 1DR, the DA neuron showed no discernible response except for the first few rewarded trials (Fig. 3, top, shaded parts). Instead, it responded to the cue stimulus in a selective manner in 1DR. In the first block of 1DR in which the right-up (RU) direction was rewarded (RU column), the DA neuron exhibited a burst of spikes after the RU cue that indicated an upcoming reward. In the second block in which the left-down (LD) direction was rewarded (LD column), the neuron responded to the LD cue with a similar burst of spikes, but no longer responded to the RU cue. In this way, the DA neuron changed its response completely such that it always responded to the cue that indicated the upcoming reward (Wilcoxon signed-rank test, P < 0.0001). The magnitude of the cue response did not vary with which cue was rewarded (Kruskal-Wallis test, P = 0.6105). For the other three directions that indicated no reward, the neuron showed some suppression in activity (Wilcoxon signed-rank test, P < 0.0001). These results indicate that DA neurons acquired a particular position-reward association very quickly and flexibly. In contrast, the neuron showed no sign of response in ADR in which reward was delivered in every trial (ALL column).

fig. 3.

Comparison between a dopamine (DA) neuron (top) and a caudate (CD) neuron (bottom) in reward-sensitive monkey H. Neurons were recorded from the left CD and the left substantia nigra pars compacta (SNc). The data obtained in 1 block of ADR (right) and 4 blocks of 1DR (left) are shown in columns. In the histogram/raster display (binwidth: 20 ms), the neuronal discharge aligned on cue onset and reward onset is shown separately for different cue directions (RU, right-up; LU, left-up; LD, left-down; RD, right-down); the cue directions were pseudorandomized at the time of experiment. The rewarded direction is indicated by a dotted circle. Target eccentricity was 20° and 10° for DA and CD neurons, respectively. The order of the rewarded directions in the 1DR blocks was RU-LD-LU-RD (DA) and RU-LD-LU-RD (CD). Both neurons were recorded after the monkey experienced 600 blocks of 1DR. The DA neuron showed no response in ADR, but responded to whichever cue indicated an upcoming reward in 1DR, but not to reward itself. The CD neuron responded to RU cue most strongly in ADR, but the response was greatly enhanced and depressed in 1DR when the cue indicated reward and no reward, respectively.

The CD neuron in this monkey (Fig. 3, bottom) also showed a remarkable flexibility. In ADR the neuron responded strongly to the contralateral (RU and RD) cue stimuli. The neuron changed its preferred direction completely in 1DR. It responded most strongly to whichever cue stimulus indicated an upcoming reward. This kind of reward contingency was a common feature among CD projection neurons (Kawagoe et al. 1998). On the other hand, the CD neuron showed no reward response (Fig. 3, bottom, shaded parts).

We found that neuronal activity was remarkably different in the reward-insensitive monkey M (Fig. 4). The DA neuron shown in Fig. 4, top, did not respond to the cue stimulus in either 1DR or ADR, but responded with a burst to reward itself. This pattern was the opposite of that of the DA neuron in the reward-sensitive monkey (Fig. 3, top).

fig. 4.

Comparison between a DA neuron (top) and a CD neuron (bottom) in a reward-insensitive monkey M. The same format as in Fig. 3. Neurons were recorded from the right CD and the right SNc after the monkey experienced 600 blocks of 1DR. Target eccentricity was 20°. The order of the rewarded directions in the 1DR blocks was U-L-R-D (DA) and RU-LD-RD-LU (CD). The DA neuron showed no response to any cue in ADR or 1DR, but responded to reward itself. The CD neuron continued to respond to RU cue regardless of the reward condition, although the response was slightly larger when the RU cue indicated reward.

The CD neuron (Fig. 4, bottom) responded selectively and vigorously to the RU cue stimulus in ADR and 1DR (RU row). Unlike the CD neuron in the reward-sensitive monkey H, the postcue response did not appear to change with the rewarded direction. An analysis using the Mann-Whitney U test showed no significant difference between the rewarded trials (RU column) and the unrewarded trials (LU, LD, and RD columns) (P = 0.9505) for the RU cue direction in 1DR. Also unlike in the reward-sensitive monkey H, the neuron's preferred direction RU was ipsilateral to the recorded neuron.

The difference in neuronal activity between the reward-sensitive and reward-insensitive monkeys is summarized as population histograms in Fig. 5. DA neurons in the two reward-sensitive monkeys (H and G) were very similar (Fig. 5A, left). While the response to reward was negligible (not shown), the response to the cue stimulus was robust and selective for the expected outcome of reward. The response to the reward-indicating cue (thick black line) appeared as a single positive peak that reached about 16 spikes/s in both monkeys. The response to the nonreward-indicating cue (thick gray line) consisted of an initial small positive peak that was followed by a delayed negative peak, again in both monkeys. On the other hand, DA neurons in reward-insensitive monkey M were completely different as shown in Fig. 5B, left. The cue response was very weak and did not discriminate between the reward-indicating and nonreward-indicating cues.

fig. 5.

Population activity of DA neurons (left) and CD neurons (right) in reward-sensitive monkeys (A) (monkey H, DA: n = 16, CD: n = 39; monkey G, DA: n = 32, CD: n = 24) and a reward-insensitive monkey (B) (monkey M, DA: n = 23, CD: n = 10). The activity is aligned on cue onset (binwidth: 10 ms) and is shown separately for 1DR-rewarded trials (thick black), 1DR-nonrewarded trials (gray), and ADR trials (thin black). The data for CD neurons were chosen from the reward-enhanced type. Data were smoothed with 3-point moving average.
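The binning and smoothing mentioned in this caption can be sketched as follows. This is an illustration only: the function names are hypothetical, spike trains are assumed to be lists of spike times in milliseconds relative to cue onset (one list per trial, pooled across neurons), and the treatment of the two end bins in the moving average is an assumption because the caption does not specify it.

```python
def population_histogram(trials_spike_times_ms, t_start_ms, t_end_ms, bin_ms=10):
    """Mean firing rate (spikes/s) per bin across trials, aligned on cue onset (t = 0)."""
    n_bins = (t_end_ms - t_start_ms) // bin_ms
    counts = [0] * n_bins
    for spikes in trials_spike_times_ms:
        for t in spikes:
            idx = int((t - t_start_ms) // bin_ms)
            if 0 <= idx < n_bins:
                counts[idx] += 1
    n_trials = len(trials_spike_times_ms)
    return [1000.0 * c / (bin_ms * n_trials) for c in counts]


def moving_average_3(values):
    """3-point moving average; the first and last points are left unsmoothed."""
    if len(values) < 3:
        return list(values)
    out = list(values)
    for i in range(1, len(values) - 1):
        out[i] = (values[i - 1] + values[i] + values[i + 1]) / 3.0
    return out
```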

Among 48 DA neurons in the reward-sensitive monkeys, 42 neurons showed significant cue responses (Wilcoxon signed-rank test, P < 0.001, Table 2). All cue-responsive DA neurons showed reward-dependent modulation (Mann-Whitney U test, P < 0.01, Table 2). In contrast, among 23 DA neurons in the reward-insensitive monkey M, only 1 showed a significant cue response, and even that response was not modulated by an upcoming reward (Table 2). Instead, most DA neurons (17/23) in monkey M continued to respond to reward delivery during 1DR, whereas reward responses were less frequent in the reward-sensitive monkeys (Table 2). Thus the DA neuronal responses during 1DR in the reward-sensitive monkeys were quite different from those in the reward-insensitive monkey. On the other hand, basic characteristics of DA neurons were similar. The spontaneous discharge rate was 4.5 ± 1.2 (monkey H), 4.9 ± 1.2 (monkey G), and 4.7 ± 1.1 (monkey M) spikes/s (mean ± SD). More than half of DA neurons responded to reward when it was delivered without warning: 21/33 (64%) (monkey H), 125/237 (53%) (monkey G), and 57/98 (58%) (monkey M).

table 2.

Characteristics of DA neurons in 1DR

CD neurons also showed a clear contrast. Population activities of CD neurons of the reward-enhanced type are shown in Fig. 5. Anticipatory precue activity was evident in the reward-sensitive monkeys (H and G) (Fig. 5A, right), but not in the reward-insensitive monkey M (Fig. 5B, right). The postcue response was equally frequent (Table 3) and was spatially selective in both types of monkeys (Kruskal-Wallis test, P < 0.01, Table 3). As reported previously (Kawagoe et al. 1998), reward selectivity occurred in the reward-sensitive monkeys (Fig. 5A, right), but not in the reward-insensitive monkey M (Fig. 5B, right). A majority of postcue-responsive CD neurons in the reward-sensitive monkeys (104/127, 82%) showed significant reward-dependent modulation of the postcue response (Mann-Whitney U test, P < 0.01, Table 3). Among them, 76 neurons showed stronger responses to the reward-indicating cue than to the no-reward-indicating cue (positive reward modulation) as shown in Fig. 5A, whereas in 28 neurons the response was stronger to the no-reward-indicating cue (negative reward modulation). In the reward-insensitive monkey M, only 2 of 20 neurons showed reward-dependent modulation (both positive) (Table 3).

table 3.

Characteristics of CD neurons in 1DR

Comparison of DA and CD neuronal activity in reward-sensitive monkeys

The data shown so far indicate that the reward-dependent activity in DA neurons and CD neurons exists in the reward-sensitive monkeys, but not in the reward-insensitive monkey. This raises the possibility that DA and CD neuronal responses are either causally related or share common causes. To study the relationship between DA and CD neurons in the reward-sensitive monkeys, we examined spatial and temporal factors of the neuronal responses (Fig. 6). In Fig. 6A the DA and CD cue responses in the reward-sensitive monkey H are shown separately for the contralateral cue (left) and the ipsilateral cue (right). Only 4 of 42 DA neurons showed significant contralateral versus ipsilateral spatial selectivity (Kruskal-Wallis test, P < 0.01, Table 2). In contrast, CD neurons responded better to the contralateral cue than to the ipsilateral cue provided that it indicated an upcoming reward (Fig. 6). A majority of CD neurons (82/127 in the reward-sensitive monkeys and 16/20 in the reward-insensitive monkey) showed significant spatial selectivity (Table 3). These data indicate that CD neurons receive spatially selective inputs that do not originate from DA neurons.

fig. 6.

Comparison of cue responses between DA neurons and CD neurons in reward-sensitive monkey H. A: population cue responses in 1DR-rewarded (thick black), 1DR-nonrewarded (gray), and ADR (thin black) conditions are shown separately for contralateral (left) and ipsilateral (right) cues. B: within-block changes in population cue responses of DA neurons and CD neurons are shown separately for contralateral (left) and ipsilateral (right) cues. The responses in 1DR-rewarded trials (thick black), 1DR-nonrewarded trials (gray), and ADR trials (thin black) are plotted against the subblock number in 1DR and ADR. Shaded regions indicate the range of spontaneous activity (mean ± SD). The data for CD neurons were chosen from the reward-enhanced type.

Figure 6B shows changes in DA and CD neuronal activity within a block of the 1DR task. In the first subblock of four trials (value 1 in the abscissa), in which the cue was presented in all four directions, both DA and CD neurons showed no differential responses to the reward-indicating cue (thick black) and the no-reward-indicating cues (gray). This was natural because the monkey knew the rewarded direction only by experiencing the trials in different directions. DA neurons and CD neurons were different in this initial cue response: while DA neurons showed little response to the contralateral or ipsilateral cue (little departure from the background activity), CD neurons already showed full responses that were stronger to the contralateral cues than to the ipsilateral cues. This again indicates that the spatially selective responses of CD neurons originate from neurons other than DA neurons.

In the second subblock or thereafter, both DA and CD neurons differentiated their cue responses depending on the presence and absence of the upcoming reward. In DA neurons, the response to the reward-indicating cue became positive (i.e., increase from the background activity) while the response to the no-reward-indicating cue became negative (i.e., decrease from the background activity). In CD neurons, the response to the reward-indicating cue increased slightly while the response to the no-reward-indicating cue decreased monotonically.

The data in Fig. 6B suggested that the activity of DA neurons and the activity of CD neurons covaried within one block of 1DR. To illustrate their relationship we replotted the time-course data in Fig. 6B for monkey H as a scatterplot in Fig. 7, and added the corresponding data for monkeys G and M. Each data point indicates the mean cue response of DA neurons (abscissa) and the mean cue response of CD neurons (ordinate). Only the responses to the contralateral cue are shown. In the reward-sensitive monkeys (H and G) there was a clear positive correlation between the mean cue responses of DA and CD neurons (Spearman ρ = 0.869, P < 0.0001 and ρ = 0.840, P < 0.0001, respectively). If we consider the CD cue response as a function of the DA cue response, any positive response of DA neurons (compared with the background activity) would result in a positive shift of CD responses (compared with their initial responses in the first subblock); any negative response of DA neurons would result in a negative shift of CD responses. No such relationships were found in the reward-insensitive monkey M (Spearman ρ = -0.317, P > 0.03), possibly because the dynamic range of DA cue responses was negligible.
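The rank correlation reported above can be sketched as follows. This is an illustrative implementation of Spearman's ρ (the Pearson correlation of the ranks), not the code used for the analysis. The function names are hypothetical and the response values are invented for the example; they are not the data plotted in Fig. 7.

```python
import math


def average_ranks(values):
    """Rank values from 1 to N, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation computed on ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x = sum(rx) / len(rx)
    mean_y = sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical mean cue responses per subblock (spikes/s relative to
# background for DA neurons, absolute rate for CD neurons).
da_response = [-2.1, -1.0, 0.2, 1.5, 3.0, 4.2]
cd_response = [8.0, 9.5, 9.0, 12.0, 15.5, 14.0]

print(f"Spearman rho = {spearman_rho(da_response, cd_response):.3f}")
```

Because the measure depends only on ranks, it captures the monotonic covariation between DA and CD responses without assuming that the relationship is linear.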

fig. 7.

Correlated within-block changes in cue responses of DA neurons and CD neurons in reward-sensitive monkeys (A) and a reward-insensitive monkey (B). Each data point represents the mean cue response of DA neurons (abscissa) and the mean cue response of CD neurons (ordinate) at each subblock in 1DR. This scatterplot is basically a replot of the time-course data as shown in Fig. 6B, but is shown separately for monkeys H, G, and M. The responses to the contralateral cue are shown for 1DR-rewarded trials (black circle), 1DR-nonrewarded trials (gray circle), and ADR trials (black dot). Data for the first, second, and third subblocks are connected to visualize the initial within-block change in cue responses.

General behavior

All four monkeys were healthy and had a good appetite throughout the experiments. However, monkey M was remarkably different from the other three monkeys. When humans showed up in front of their home cages, most monkeys would show some signs of affection, aggression, obedience, or withdrawal. Monkey M showed no such sign. When an investigator tried to touch them, most monkeys would avoid the touch. Only when an intimate relationship had been established would monkeys invite tactile contact by presenting themselves toward a human mate. Monkey M made no such sign of avoidance or invitation, remaining expressionless. When presented with a favorite food such as banana, most monkeys would grab it and eat it hastily. Monkey M showed no sign of an immediate interest; he would not reach for the food for a while, although he would eat it eventually. A monkey usually discriminated among humans, showing emotional attachment to the investigator who worked with it. Monkey M showed no sign of such social discrimination. Although monkey M was usually indifferent to human observers, he was more sensitive to a sudden sensory stimulus, especially sound, orienting to the source of the stimulus.

Interestingly, monkey M was able to learn saccade tasks, including the memory-guided saccade task, as quickly and accurately as the other monkeys. He even had no difficulty in learning 1DR. Most monkeys were reluctant to, and occasionally refused to, make a saccade to the nonrewarded direction. Monkey M showed no sign of reluctance. However, monkey M tended to get drowsy during experiments, a tendency so robust as to be unusual.

Recording sites of DA neurons

After making more than 20 electrode penetrations to the region around the SNc for each monkey, we could determine the region where DA-like neurons were recorded, on which the subsequent recordings were centered. Near the end of the recording session, we selected representative locations for electrode penetration and, when DA neurons were recorded, we made electrolytic microlesions at the recording sites. Histological examination revealed that the microlesions were located in the region inside or just mediodorsal to the SNc. This was true for both the reward-sensitive monkeys, G (shown in Fig. 8, A-D) and H (not shown), and the reward-insensitive monkey M (Fig. 8, F and H-J). By aligning the sections that contained the microlesions on the adjacent TH-immunostained sections, we found that the microlesions and reconstructed recording sites were among TH-positive neurons that were most likely to be DA neurons for both monkey G (Fig. 8E) and monkey M (Fig. 8G). The lack of behavioral and neuronal sensitivity to the upcoming reward in monkey M raised the possibility that the DA mechanism of monkey M may be abnormal. However, TH immunohistochemistry showed no gross abnormalities.

fig. 8.

Recording sites of DA neurons in monkey G (left) and monkey M (right). Three coronal sections are shown for monkey G (A-C) and for monkey M (H-J), rostrocaudally with 1-mm intervals. DA neurons that were and were not fully examined are shown by filled and open circles, respectively. Horizontal bars indicate electrolytic marks. Photomicrographs in D and F (Nissl-stained sections) represent parts of C (monkey G) and I (monkey M), respectively (indicated by dashed circles). TH-stained sections in E and G were adjacent to the sections in D and F, respectively, and their positions correspond to the dashed rectangles in D and F. Aggregations of TH-positive cell bodies and dendrites are visible. SNc, substantia nigra pars compacta; SNr, substantia nigra pars reticulata; STN, subthalamic nucleus.

DISCUSSION

Relationship between CD neuronal activity and saccade behavior

In the memory-guided saccade task, a saccade-motor instruction is given by a positional cue stimulus. Many CD neurons respond preferentially to the contralateral cue (Hikosaka et al. 1989b). By using a one-direction-rewarded version of the memory-guided saccade task (1DR), we previously found that the CD postcue response was strongly modulated by reward condition (Kawagoe et al. 1998). Typically, the postcue response was much larger when the cue indicated an upcoming reward. The 1DR task also gave rise to a robust behavioral outcome: saccades to the rewarded direction were faster and earlier than saccades to the nonrewarded directions (Takikawa et al. 2002b). These observations suggested that CD neurons create the reward-dependent modulation of saccade parameters. This idea was supported by a recent study indicating that CD neuronal activity, including the postcue response, is positively correlated with the velocity of contralateral saccades (Itoh et al. 2003). How then might DA neurons contribute to the neuronal and behavioral outcomes of reward?

Relation between DA neurons and CD projection neurons

There were similarities and differences between DA neurons and CD projection neurons. They were similar in that their cue responses were stronger when the cue indicated an upcoming reward in 1DR than when the cue indicated no reward (positive reward modulation). Whereas this rule applies to all DA neurons, a small fraction of CD neurons showed the opposite reward dependency (negative reward modulation, data not shown). Distinct differences were found in their task specificity, spatial selectivity, and precue activity. DA neurons showed postcue responses in 1DR exclusively, whereas CD neurons responded in both ADR and 1DR. DA neurons showed no spatial preference, whereas CD projection neurons usually responded to the contralateral cue rather than to the ipsilateral cue. The spatial preference of CD neurons was also present in ADR. Precue activity was evident in CD neurons (Takikawa et al. 2002a), but not in DA neurons, while postcue activity was evident in both CD and DA neurons. To summarize, the cue response of DA neurons seems to be determined exclusively by the reward contingency (not by the spatial location), whereas that of CD projection neurons is determined by both the reward contingency and the spatial location.

If DA neurons project to the CD and make synaptic connections, the reward contingency for the cue response of CD neurons may, at least partly, be derived from DA neurons. Indeed, the within-block development of DA cue responses was similar to that of CD cue responses.

These results imply that the spatial factor of CD cue responses is derived from non-DA neurons. One plausible candidate would be the inputs from the cerebral cortex, although other inputs such as those from the thalamus (Matsumoto et al. 2001) could also contribute. Cortical inputs to the CD that carry spatial attention or working memory would include the frontal eye field, supplementary eye field, intraparietal sulcus area, and dorsolateral prefrontal cortex (Parthasarathy et al. 1992; Selemon and Goldman-Rakic 1985; Yeterian and Pandya 1991). These cortical signals carrying spatial information may be modulated by DA signals carrying reward information. The modulation would usually be positive because in a majority of CD neurons the cue response increases when reward is expected (Kawagoe et al. 1998). Such a facilitatory effect of DA neurons may be mediated by D1 receptors (Hernández-López et al. 1997; Reynolds et al. 2001). In the small number of CD neurons for which the cue response is depressed if reward is expected (Kawagoe et al. 1998), the inhibitory effect of DA neurons may be mediated by D2 receptors (Hernández-López et al. 2000; West and Grace 2002). In any case, it is plausible to think that the interaction, in CD projection neurons, of the spatial signals from the cerebral cortex and the reward-related signals from DA neurons is the reason why the monkeys behaved (or made saccades) in a reward-sensitive manner. In monkey M, DA neurons showed no response to the cue stimulus in 1DR, and under our hypothesis CD neurons could not acquire the ability to discriminate between the reward-indicating and no-reward-indicating cues.

Although we favor the hypothesis above, we cannot rule out the possibility that the relationship is reversed: from CD to DA. A main target of CD neurons is the SNr (François et al. 1994; Hedreen and DeLong 1991; Parent and Hazrati 1994). Many SNr neurons send recurrent collateral axons to DA neurons in and around the SNc (Tepper et al. 1995). Since CD projection neurons and SNr projection neurons are both GABAergic, spike activity of CD neurons would lead to disinhibition of DA neurons. Under this alternative, postcue activity of CD neurons might lead to or facilitate postcue activity of DA neurons.
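The disinhibition argument above amounts to multiplying the signs of the synapses along the pathway. The following toy sketch makes that bookkeeping explicit. It is a deliberately oversimplified illustration, not a circuit model: the names are hypothetical, each connection is reduced to a single sign, and baseline firing rates, synaptic strengths, and timing are all ignored.

```python
INHIBITORY = -1  # e.g., a GABAergic projection
EXCITATORY = +1


def net_effect(synapse_signs):
    """Multiply synapse signs along a serial chain.

    Returns +1 if increased activity at the source tends to increase
    activity at the final target (net facilitation or disinhibition),
    and -1 if it tends to decrease it (net suppression).
    """
    effect = 1
    for sign in synapse_signs:
        if sign not in (INHIBITORY, EXCITATORY):
            raise ValueError("each synapse sign must be +1 or -1")
        effect *= sign
    return effect


# CD -> SNr (inhibitory), then SNr collaterals -> DA neurons (inhibitory).
cd_to_da = [INHIBITORY, INHIBITORY]

# Two serial inhibitory links give a net positive sign: CD activity
# would disinhibit DA neurons under this simplified scheme.
print(net_effect(cd_to_da))
```

The same sign logic applies to the serial inhibitory pathway from the CD through the SNr to the SC described in the Introduction.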

Precue anticipatory activity

The precue anticipatory activity was present in many CD neurons in the reward-sensitive monkeys, but not in the reward-insensitive monkey M. It is unlikely that the precue activity affected saccade parameters directly, because the activity occurred before the motor instruction. Nonetheless, the precue activity might influence the saccade parameters indirectly. As illustrated in previous studies from our laboratory (Lauwereyns et al. 2002a; Takikawa et al. 2002a), the precue activity was present selectively when a particular direction, especially a direction contralateral to the recorded neuron, was rewarded. Many of the CD neurons with precue activity also showed postcue responses with similar direction selectivity. Note, however, the meaning of the direction selectivity was different: selectivity for rewarded direction for precue activity; selectivity for instructed direction for postcue response. This combination of precue activity and postcue activity gave rise to selective boosting of the postcue response when the preferred direction was rewarded. The enhancement appeared as a positive bias or boosting rather than a gain change of the postcue response (Lauwereyns et al. 2002a). The enhanced postcue response would then facilitate saccades.

Our recent studies have suggested that the precue activity could indeed contribute to changes in saccade parameters. When the monkey was instructed to make a saccade to the cue stimulus on its appearance, the precue activity was followed by a saccade immediately (Lauwereyns et al. 2002b). When a contralateral side was rewarded, the precue activity was higher and the saccade latency was shorter. The result is consistent with the hypothesis that stronger precue activity leads to facilitation of a saccade to the contralateral direction. Monkey M would certainly lack the hypothetical facilitatory effects of CD precue activity on contralateral saccades. This might partly account for the lack of reward-dependent modulation of saccades in monkey M.

Possible pathological mechanisms

Monkey M's behavior was clearly different from that of other monkeys that we have tested. He rarely showed motivated behavior, emotional reactions, or any sign of affect. And yet, monkey M was more sensitive to sudden sensory stimuli than other monkeys. We, the authors, agreed that these behaviors were pathological and could be likened to human psychiatric or neurological disorders. These behavioral symptoms were somewhat similar to parkinsonism, although monkey M showed no obvious sign of motor deficits. Parkinsonian patients often have poor expression (called “mask-like expression”), like monkey M; they are often reluctant to initiate an action spontaneously (hypokinesia), like monkey M; they often show reflex blink hyperexcitability (see Basso et al. 1996), just as monkey M continued to respond to sudden sensory stimuli. The similarities suggested that the basal ganglia of monkey M did not function normally. In this monkey, DA neurons showed almost no response to the cue stimulus while CD neurons were not influenced by expected reward, unlike the other monkeys. One obvious difference between parkinsonian patients and monkey M was that DA neurons become scarce in parkinsonian patients, but were present and active in monkey M. Histological examination suggested that there were plenty of TH-immunoreactive neurons in the substantia nigra, which probably correspond to DA neurons. Despite the apparently normal chemical feature, these DA neurons did not function normally. It is possible that CD neurons recorded in monkey M may not have acquired the ability to predict an upcoming reward, perhaps due to the lack of a reward-predicting signal in DA inputs. Alternatively, we may have recorded from different types of CD neurons in these monkeys. However, the latter explanation seems unlikely because we rarely found reward-predicting responses in CD neurons in monkey M.

On the other hand, monkey M's behavior was somewhat similar to symptoms of schizophrenia or autism. Schizophrenic and autistic patients are often alienated and lack emotional or social contact with other people, like monkey M. In fact, it has long been suspected that DA dysfunctions are involved in schizophrenic symptoms (Weinberger 1987). However, the pathophysiology of schizophrenia remains largely elusive. This is largely because there has been practically no primate model of schizophrenia. One way to solve this problem would be to rely on the natural incidence of behavioral disorders, but the probability of finding such behavioral disorders would be very low. One way to circumvent this difficulty would be to collect animals with behavioral disorders from research institutions worldwide. Researchers would probably be willing to give away such “psychiatric” animals since they are usually inappropriate for most experiments. The collection would be a precious database for research on disordered mind and behavior.

Acknowledgments

We thank K. Nakamura, H. Nakahara, J. Lauwereyns, and B. Richmond for helpful comments, H. Itoh for analyzing data, and M. Kato for designing the computer programs.

GRANTS

This work was supported by a Grant-in-Aid for Scientific Research on Priority Areas (C) of Ministry of Education, Culture, Sports, Science, and Technology; Core Research for Evolutional Science and Technology of Japan Science and Technology Corporation; and Japan Society for the Promotion of Science Research for the Future program.

Footnotes

  • The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

References
