The Journal of Neurophysiology Vol. 87 No. 3 March 2002, pp. 1488-1498
Copyright ©2002 by the American Physiological Society
1Department of Neurology, University of Tokyo School of Medicine, Tokyo 113-8655; 2Department of Physiology, Juntendo University School of Medicine, Tokyo 113-0033; and 3Brain Science Research Center, Tamagawa University Research Institute, Tokyo 194-8610, Japan
| |
ABSTRACT |
|---|
|
|
|---|
Kobayashi, Shunsuke,
Johan Lauwereyns,
Masashi Koizumi,
Masamichi Sakagami, and
Okihide Hikosaka.
Influence of Reward Expectation on Visuospatial Processing in
Macaque Lateral Prefrontal Cortex.
J. Neurophysiol. 87: 1488-1498, 2002.
The lateral prefrontal cortex (LPFC)
has been implicated in visuospatial processing, especially when it is
required to hold spatial information during a delay period. It has also
been reported that the LPFC receives information about expected reward
outcome. However, the interaction between visuospatial processing and
reward processing is still unclear because the two types of processing could not be dissociated in conventional delayed response tasks. To
examine this, we used a memory-guided saccade task with an asymmetric
reward schedule and recorded 228 LPFC neurons. The position of the
target cue indicated the spatial location for the following saccade and
the color of the target cue indicated the reward outcome for a correct
saccade. Activity of LPFC was classified into three main types: S-type
activity carried only spatial signals, R-type activity carried only
reward signals, and SR-type activity carried both. Therefore only
SR-type cells were potentially involved in both visuospatial processing
and reward processing. SR-type activity was enhanced (SR+) or depressed (SR−) by the reward expectation. The spatial discriminability as
expressed by the transmitted information was improved by reward expectation in SR+ type. In contrast, when reward information was coded
by an increase of activity in the reward-absent condition (SR− type),
it did not improve the spatial representation. This activity appeared
to be involved in gaze fixation. These results extend previous findings
suggesting that the LPFC exerts dual influences based on predicted
reward outcome: improvement of memory-guided saccades (when reward is
expected) and suppression of inappropriate behavior (when reward is not expected).
| |
INTRODUCTION |
|---|
|
|
|---|
Prediction of future events, such as the presence of food, is critically important for an animal's survival. Single-unit studies revealed neuronal activity related to stimulus-reinforcement association in various brain areas including substantia nigra (Schultz 1998), amygdala (Fukuda and Ono 1993; Schoenbaum et al. 1998), hypothalamus (Fukuda and Ono 1993), orbitofrontal cortex (Schoenbaum et al. 1998; Thorpe et al. 1983; Tremblay and Schultz 1999), anterior cingulate cortex (Niki and Watanabe 1976), and ventral striatum (Apicella et al. 1991; Shidara et al. 1998). Together with other lines of research, it is suggested that an animal's ability of reward prediction relies on these brain areas (Jones and Mishkin 1972; Olds and Milner 1954; Phillips et al. 1983).
In addition, animals, especially primates, have developed cognitive abilities to flexibly respond to the environment. Spatial working memory is an example that allows motor behavior to be guided by visuospatial representations that are stored in memory. Using a memory-guided saccade task (Hikosaka and Wurtz 1983), it has been demonstrated that spatial information is maintained by individual neurons in the lateral prefrontal cortex (LPFC) (Funahashi et al. 1989-1991), parietal cortex (Barash et al. 1991), basal ganglia (Hikosaka and Wurtz 1983; Hikosaka et al. 1989a), and superior colliculus (Kojima et al. 1996). Together with other lines of evidence, the LPFC is now thought to be a key structure for spatial working memory (Jacobsen 1935; Jonides et al. 1993; Milner et al. 1985).
These two functions, reward prediction and spatial working memory, have been examined in separate studies, with some exceptions (Leon and Shadlen 1999; Watanabe 1996), and the relationship between these two types of processing remains unknown. They may not be independent of each other, considering that cognitive behavior may be based on the expectation of reward outcome. A recent study from our laboratory elucidated such a situation. During a memory-guided saccade task in which a correct response was rewarded for only one of four directions (1DR task), the behavioral performance of monkeys was better and more precise in the reward-present trials than in the reward-absent trials (Kawagoe et al. 1998). This behavioral result indicates that reward processing does interact with spatio-motor processing in the brain.
It is likely that the LPFC is the place where the interaction takes place because the brain areas primarily engaged in the reward processing, such as the midbrain dopamine area and orbitofrontal cortex, project to the LPFC (Barbas and Mesulam 1985; Ilinsky et al. 1985). Further, reward-related activity has been recorded from the LPFC during the performance of instrumental behavior (Leon and Shadlen 1999; Rosenkilde et al. 1981; Watanabe 1996). Additional support comes from the fact that administration or blocking of dopamine, which is regarded as a carrier of reward information (Schultz 1998; Yokel and Wise 1975), affects the function of spatial working memory: microinjection of dopamine and dopamine antagonists changes the spatial discriminability of working-memory neurons in the LPFC (Sawaguchi 1988, 1990; Williams and Goldman-Rakic 1995). Clinical studies on patients with deficits of the dopaminergic system, such as Parkinsonian and schizophrenic patients, reported deterioration in working-memory tasks (Freedman and Oscar-Berman 1986; Owen et al. 1997; Pantelis et al. 1997; Sahakian et al. 1988). At present, however, it remains unclear how the LPFC integrates spatial and reward information and to what extent such integration correlates with behavior.
We hypothesized that reward information and spatial information are
integrated in the LPFC so that spatial information becomes more
accurate when reward outcome is expected; more accurate representations of spatial information would, in turn, lead to more accurate behavior. To test this hypothesis, we devised a memory-guided saccade task with
an asymmetric reward schedule, in which the subject performed a
spatiomotor behavior with different reward outcomes. The task is the same as the one used in our laboratory to study basal ganglia neurons (1DR) (Kawagoe et al. 1998), except that the reward condition was indicated by the color of the target cue, not its position. We call it the "1CR" (1-color rewarded) task.
| |
METHODS |
|---|
|
|
|---|
The study was conducted on two male Macaca fuscata monkeys (monkeys H and Z, both weighed 5.0-6.0 kg). They performed a behavioral task under computer control. A stimulus generator (VSG, Cambridge Research, UK) was used to generate visual stimuli on a 20-in. computer display (GDM-F500, Sony, Tokyo, Japan). Activity of single neurons was recorded with moveable electrodes, whereas eye movements were monitored using the magnetic search-coil technique. All surgical and experimental protocols were approved by the Juntendo University Animal Care and Use Committee and were in accordance with National Institutes of Health's Guide for Care and Use of Laboratory Animals.
Behavioral procedures
The monkeys were seated with their head fixed in a primate chair inside a completely enclosed sound-attenuated room. A CRT was set 70.5 cm in front of the monkey to present visual stimuli. They were trained on a memory-guided saccade task with an asymmetric reward schedule (Fig. 1). A trial started with the onset of a central fixation point (0.21° in visual angle). Five hundred milliseconds after the onset of the fixation point, a target cue (0.53°) was presented for 200 ms randomly at one of two positions. The target cue was colored and the color predicted a reward. Two diagonally opponent positions were selected out of four candidates (left, right, top, bottom) at a 6.5° distance from the center of the monitor. When the neuron had a spatial preference, one of the two positions was selected to be within the neuron's response field. The subject had to remember the cue position during the delay period that was randomized between 0.9 and 2.1 s. The disappearance of the fixation point after the delay period was the signal to make a saccade to the previously cued location. The saccade was judged to be correct if the eye position was within a virtual window around the target cue (within 5.2 × 5.2° and within 400 ms). The target cue came on 400 ms later for 100 ms. An auditory tone of 900-Hz rectangular waveform was presented if the subject made a correct saccade. If the subject made any error, the same trial was repeated. The sound served as an indication of correct task performance both in reward-present and -absent conditions. The color of the target cue indicated whether a reward would follow. Two colors were paired out of four prepared colors (red, yellow, green, and blue, luminance was 5.51, 25.6, 20.1, and 1.6 cd/m2, respectively). One of the two colors was associated with reward (a drop of water about 0.15-0.20 ml) and the other with no reward. Both color and position of the cue were alternated randomly.
The color-reward contingency was fixed within a block consisting of at least 60 trials, and it was reversed in the following block. A rest break (about 30 s) was given between the blocks.
Surgical procedures
Surgery was performed to implant a head-holder, a delrin chamber (30 × 42 mm) and a scleral magnetic search coil (Robinson 1963). All surgical procedures were performed with aseptic technique under ketamine (4.6-6.0 mg/kg im) and pentobarbital (4.5-6.0 mg/kg/h iv) anesthesia. The monkey received antibiotics (sodium ampicillin, 25-40 mg/kg im) after the operation.
Data acquisition
During recording sessions, action potentials of single neurons were recorded extracellularly with tungsten electrodes (FHC, Bowdoinham, ME; shank diameter: 250 µm, taper angle: 20-15°, impedance: 1.5-3 MΩ). Microelectrodes were advanced vertically to the cortical surface, using an oil-driven micro-manipulator (MO-95, Narishige, Tokyo, Japan) and a grid system (holes 0.6 mm wide and 1.0 mm apart from center to center; Nakazawa, Tokyo, Japan). The action potentials were amplified, filtered (500 Hz to 2 kHz) and processed by a window discriminator (MDA-4 and DDIS-1, BAK Electronics, Germantown, MD). Neuronal discharges were converted into standard digital pulses by means of an adjustable Schmitt-trigger, the output of which was continuously monitored on a digital oscilloscope together with the waveform. A PC generated raster displays of neuronal activity. Eye movements were recorded using the magnetic search-coil technique (MEL-25, Enzanshi-Kogyo, Tokyo, Japan). The data of neuronal discharges and eye movements were also collected on a digital tape recorder at 20 and 2 kHz, respectively (GX-1, Teac, Tokyo, Japan).
Data analysis
Off-line analysis was carried out on a PC using MATLAB release 12 for Windows. All spike analyses were based on trials in which the subject made a correct response. We defined the cue period as the period from 100 to 300 ms after cue onset, the delay period as the period from 300 to 900 ms after cue onset, and the saccade period as the period from 300 ms before the saccade onset to the saccade onset time. We tested the effect of the spatial and reward conditions on the neuronal activity during these three periods separately by two-way ANOVA (position × reward, P < 0.01). The reward condition gave rise to subtle but significant differences in saccade latency, peak velocity, and amplitude (Table 1). Therefore it was possible that reward-differential neuronal activity was confounded by these oculomotor parameters. To examine this possibility, data were tested by two-way analysis of covariance (ANCOVA, position × reward, P < 0.01) with saccade latency, peak velocity, amplitude, and precision as covariates. The ANOVA results were consistent with those using ANCOVA except for a few cases (among 52, 88, and 27 cases with a significant main effect of reward according to ANOVA in cue, delay, and saccade periods, respectively, 5, 8, and 14 cases were not significant according to ANCOVA). We treated these cases as nonsignificant with respect to the reward factor in the neuronal database (Table 2) so that reward-related activity in our population analysis is not explained by systematic changes in the oculomotor behavior caused by the reward condition.
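The three analysis periods defined above can be illustrated with a minimal sketch that counts spikes per period for one trial. The window boundaries come from the text; the function names, the half-open interval convention, and the example spike times are assumptions made for illustration and do not reproduce the original MATLAB analysis.

```python
from bisect import bisect_left

# Analysis windows in ms, as defined in the text. The cue and delay windows
# are relative to cue onset; the saccade window is relative to saccade onset.
CUE_WINDOW = (100, 300)
DELAY_WINDOW = (300, 900)
SACCADE_WINDOW = (-300, 0)


def count_spikes(spike_times, align_time, window):
    """Count spikes in [align_time + start, align_time + end).

    The half-open interval is an assumption; the text does not say how
    spikes falling exactly on a boundary were treated.
    """
    start = align_time + window[0]
    end = align_time + window[1]
    times = sorted(spike_times)
    return bisect_left(times, end) - bisect_left(times, start)


def epoch_counts(spike_times, cue_onset, saccade_onset):
    """Return spike counts for the cue, delay, and saccade periods."""
    return {
        "cue": count_spikes(spike_times, cue_onset, CUE_WINDOW),
        "delay": count_spikes(spike_times, cue_onset, DELAY_WINDOW),
        "saccade": count_spikes(spike_times, saccade_onset, SACCADE_WINDOW),
    }


# Hypothetical trial: all times in ms from trial start.
trial_spikes = [520, 650, 720, 790, 900, 1300, 1900, 2050]
counts = epoch_counts(trial_spikes, cue_onset=500, saccade_onset=2100)
print(counts)
```

Per-period counts of this kind, collected over correct trials, would be the dependent variable for the position × reward ANOVA.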
We evaluated how well neuronal activity discriminates the position of the cue. The activity of SR-type cells (see RESULTS for detail) changed not only with the spatial condition but also with the reward condition. In other words, the discharge rate, i.e., the range of activity in which these neurons encoded spatial information, changed systematically with the reward condition. Therefore we had to use a measure that evaluates spatial discriminability independently of the range of the activity. Our approach was to ask how well the neural responses could tell us about the cue position if we consider the neurons to be transmission channels that carry spatial information via their spike rates in response to the cue presentation. Spatial discriminability represented by a neuron was expressed by transmitted information. Predictable information of spatial condition associated with the neuronal responses (Is) was quantified as the decrease in entropy of the stimulus occurrence H(S)
|
(1) |
|
|
(2) |
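As a rough illustration of the transmitted-information measure described above, the following sketch computes a plug-in estimate of Is = H(S) − H(S|R) in bits from paired samples of cue position and neuronal response. This identity is the standard definition of mutual information and is assumed here rather than taken from the original equations. The use of raw spike counts as response categories, the absence of any bias correction, and all function names are likewise assumptions.

```python
import math
from collections import Counter


def entropy(probabilities):
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


def transmitted_information(positions, responses):
    """Plug-in estimate of Is = H(S) - H(S|R) in bits.

    positions: cue position label for each trial (e.g. "left"/"right").
    responses: discrete response for each trial (here, a spike count).
    """
    n = len(positions)
    if n == 0 or n != len(responses):
        raise ValueError("positions and responses must be non-empty and paired")

    position_counts = Counter(positions)
    response_counts = Counter(responses)
    joint_counts = Counter(zip(positions, responses))

    # Entropy of the stimulus (cue position) before observing the response.
    h_s = entropy(c / n for c in position_counts.values())

    # Remaining entropy of the stimulus once the response is known,
    # averaged over the responses.
    h_s_given_r = 0.0
    for response, n_r in response_counts.items():
        conditional = [joint_counts[(s, response)] / n_r for s in position_counts]
        h_s_given_r += (n_r / n) * entropy(conditional)

    return h_s - h_s_given_r


# Hypothetical data: two equiprobable positions.
perfect = transmitted_information(["L", "R", "L", "R"], [0, 5, 0, 5])
useless = transmitted_information(["L", "R", "L", "R"], [3, 3, 3, 3])
print(perfect, useless)
```

With two equiprobable cue positions, the estimate is bounded by 1 bit, which is reached only when the response identifies the position without error.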


Anatomical location
We explored a wide area in the prefrontal cortex for single-unit recording in two monkeys. To identify recording sites, brain magnetic resonance images were taken (AIRIS, 0.3T, Hitachi) (Saunders et al. 1990). In addition, frontal eye field (FEF) was identified as an area where saccades were elicited by micro-stimulation (a train of 20 cathodal pulses of 0.2-ms duration and 3-ms interval) lower than 50 µA (Bruce et al. 1985). Histological confirmation of recording sites is not available because both monkeys are alive and participating in other studies.
| |
RESULTS |
|---|
|
|
|---|
Behavioral performance
The animals' behavior was analyzed based on 36,903 trials
(monkey Z: 21,193 trials, monkey H: 15,710 trials) during the course of single-unit recordings. There were a total
of 3,250 error trials (Z: 1,844 trials, H: 1,406 trials) that consisted of fixation errors before the cue onset (280 trials, Z: 212 trials, H: 68 trials), fixation
errors after the cue onset (1,929 trials, Z: 1,053 trials,
H: 876 trials), and errors of saccade direction (1,041 trials, Z: 579 trials, H: 462 trials). Although
the animals were free to ignore the color of the target cue that
indicated the reward condition, their behavior was systematically
influenced by the reward condition: correct performance rate after the
cue presentation was significantly higher in the reward-present
condition (94.6%, Z: 94.6%, H: 94.6%) than in
the reward-absent condition (88.8%, Z: 89.0%,
H: 88.4%, P < 0.001 for both monkeys by χ² test).
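A comparison of correct-performance rates between two conditions can be sketched as a Pearson χ² test on a 2 × 2 table (condition × outcome). The trial counts below are hypothetical and merely chosen to match the reported pooled rates of 94.6% and 88.8%; the per-condition counts and the exact variant of the test (e.g., with or without continuity correction) are not stated in the text, so this is an assumption-laden illustration rather than a reproduction of the reported analysis.

```python
import math


def chi_square_2x2(correct_a, error_a, correct_b, error_b):
    """Pearson chi-square test (df = 1, no continuity correction).

    Returns (chi2, p_value) for the 2x2 table
        [[correct_a, error_a],
         [correct_b, error_b]].
    """
    table = [[correct_a, error_a], [correct_b, error_b]]
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    total = sum(row_totals)
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column needs at least one trial")

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected

    # For one degree of freedom the chi-square survival function reduces
    # to the complementary error function.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: 1,000 trials per condition at the reported rates.
chi2, p = chi_square_2x2(946, 54, 888, 112)
print(round(chi2, 2), p)
```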
Because we changed the stimulus-reward contingency in every block, it may have taken a while for the animals to understand the new contingency. To examine the within-block change of animal behavior, we plotted the correct performance rate as a function of trial count from the start of the block: probability to make a correct response at the nth reward-present or reward-absent trial in the block was calculated based on behavioral data from 296 blocks (monkey Z) and 202 blocks (monkey H). A block consisted of at least 60 trials, 30 reward-present trials and 30 reward-absent trials, hence trial count n ranged from 1 to 30. Behavioral performance became clearly worse in the reward-absent condition than in the reward-present condition in the later part of the block in both monkeys (Fig. 2, A and B). We also plotted saccade precision as a function of trial count (Fig. 2, C and D). Saccade precision was determined by the distance in visual angle between target position and saccade end point. Although they were regarded as correct responses, saccades became less precise in the later part of the block when absence of reward was indicated. Other saccade parameters also systematically changed with the reward condition in the later part of the block; saccade latency was shorter (both monkeys), peak velocity was higher (monkey Z), and saccade amplitude was smaller (monkey H) in the reward-present condition than in the reward-absent condition (Table 1).
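The within-block analysis described above amounts to averaging, across blocks, the outcome of the nth trial of a given reward condition. A minimal sketch follows; the data layout (one list of booleans per block and condition, in presentation order) and the function name are assumptions, and the handling of repeated error trials in the original analysis is not specified in the text.

```python
def correct_rate_by_trial_count(blocks, max_n=30):
    """Probability of a correct response at the nth trial of a block.

    blocks: one list per block holding True (correct) or False (error) for
            the successive trials of a single reward condition.
    Returns a list whose element n-1 is the correct rate at trial count n,
    or None when no block contains an nth trial.
    """
    rates = []
    for n in range(max_n):
        outcomes = [block[n] for block in blocks if len(block) > n]
        rates.append(sum(outcomes) / len(outcomes) if outcomes else None)
    return rates


# Hypothetical reward-absent outcomes from two short blocks.
example_blocks = [[True, True, False], [True, False, False]]
rates = correct_rate_by_trial_count(example_blocks, max_n=4)
print(rates)
```

Computing one such curve for reward-present trials and one for reward-absent trials gives the two traces compared per monkey.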
Neuronal database
We recorded from 228 well-isolated LPFC neurons that were related to the 1CR task. Significant color-discriminative activity was found in 25, 14, and 7 neurons during the cue, delay, and saccade periods, respectively (color discriminability was tested by t-test, P < 0.01, using data from trials with the target cue in the cell's preferred position). Color, or luminance, was not commonly coded in 1CR probably because once translated into a reward-predicting signal, the color feature was behaviorally irrelevant. This result is consistent with the view that the LPFC encodes behaviorally relevant information (Rainer et al. 1998; Sakagami and Niki 1994; Watanabe 1986). In the following analysis, we focus on neuronal activity related to position and reward information.
For every recorded neuron, we screened its spatial discriminability with respect to four positions (left, right, top, and bottom) by visual inspection. During the recording session, two diagonally opponent positions were used including the neuron's preferred position, if it had any. We tested the effect of the spatial and reward conditions on the neuronal activity during the cue, delay, and saccade periods by two-way ANOVA (position × reward, P < 0.01; Table 2). Neuronal activity with only a position main effect was called S type and that with only a reward main effect was called R type. If neuronal activity had both position and reward main effects or an interaction between position and reward, it was called SR type. Neurons that had a significant response compared with the baseline activity but no main effect or interaction were called nondifferential type (ND type).
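The classification rule stated above can be written as a small decision function over the three ANOVA P values. The rule itself follows the text; the function name, the argument layout, and the "unclassified" label for neurons without a significant response are assumptions added for illustration.

```python
def classify_activity(p_position, p_reward, p_interaction, responsive, alpha=0.01):
    """Assign S, R, SR, or ND type from two-way ANOVA results.

    p_position, p_reward: P values of the two main effects.
    p_interaction: P value of the position x reward interaction.
    responsive: True if activity differed significantly from baseline.
    """
    position = p_position < alpha
    reward = p_reward < alpha
    interaction = p_interaction < alpha

    # Both main effects, or an interaction, make the activity SR type.
    if (position and reward) or interaction:
        return "SR"
    if position:
        return "S"
    if reward:
        return "R"
    # No main effect or interaction: nondifferential if responsive at all.
    return "ND" if responsive else "unclassified"


print(classify_activity(0.001, 0.40, 0.70, responsive=True))
```

Because the rule is applied separately to the cue, delay, and saccade periods, the same neuron can receive different labels in different periods.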
LPFC activity coding the spatial condition: S type
S-type activity, which changed with the cue position but not with the reward condition, was found in 28.8, 21.2, and 27.6% of responsive neurons in the cue, delay, and saccade periods, respectively.
A typical S-type cell is shown in Fig. 3. The target cue was presented at the top or bottom in red or yellow. In the first block, the red cue indicated the upcoming reward (block red); in the second block, the yellow indicated reward (block yellow). The neuron responded to the cue when it was presented at the top, whereas it was mildly suppressed in response to a cue at the bottom. The activity was affected by neither the color of the cue nor the reward condition as illustrated by the nearly identical histograms for the reward-present condition (red line) and reward-absent condition (gray line).
LPFC activity coding the reward condition: R type
R-type cells changed their activity depending on the reward condition but were unaffected by the cue position. This type of activity was further classified into R+ type (higher activity in the reward-present condition) and R− type (higher activity in the reward-absent condition). A typical R+-type cell is shown in Fig. 4. In the first block, red indicated the
upcoming reward and yellow indicated no reward (block red). The target
cue was presented randomly on either the left or right. The neuron
showed sustained activity during the delay period after a red cue,
whereas it was nearly silent after a yellow cue. The color-reward
contingency was reversed in the second block, with yellow indicating a
reward (block yellow). This neuron showed sustained activity to the
previously rewarded red cue for several trials, but soon stopped firing
for the red color. The activity to the yellow cue, which was associated with reward, increased, although not so much as in the reward trials of
the previous block. This neuron did not significantly change its
activity by the position of the target cue. To further confirm the
reward effect and to exclude color and position effects, we used other
positions, top and bottom, and other colors, green and blue, in the
next two blocks. The reward-selective activity was reproduced;
sustained delay period activity appeared to the green cue in block
green and to the blue cue in block blue independent of the cue
position. The extinctive tendency in the neuronal activity was also
reproduced; in block blue, the activity to the previously rewarded
green cue remained for a couple of trials and the activity to the
rewarded blue cue did not reach the same level as in the reward-present
trials of block green. In sum, this neuron showed sustained activity in
the reward-present condition regardless of the position of the cue.
This is in contrast with R−-type cells, as demonstrated in Fig. 5. This neuron was phasically activated in the reward-absent condition regardless of the color or position of the cue.
LPFC activity coding the spatial condition and reward condition: SR type
Our main interest in this study was SR-type cells, which showed
activity related to both spatial and reward conditions because they
were potentially involved in both visuospatial and reward processing.
Like R-type cells, there were two kinds of reward dependency among SR-type cells: higher activity in the reward-present condition (SR+) or higher activity in the reward-absent condition (SR−). A typical SR+-type cell is shown in Fig. 6. The
neuron was spatially discriminative in that its sustained activity
started earlier and was stronger after the right cue than after the
left cue. The neuron was reward dependent in that its activity was stronger in the reward-present than reward-absent condition, regardless of the cue color (block green and block blue). Interestingly, the
reward dependency was clear for the preferred position (right) but not
for the nonpreferred position (left). This phenomenon is illustrated by
the clear difference between the reward-present (red) and reward-absent
(gray) histograms for the preferred position but not for the
nonpreferred position.
Figure 7 shows an example of SR−-type cells. Let us focus on the phasic activity after the cue. The activity
was stronger for the bottom position (spatially discriminative). In
addition, it was stronger in the reward-absent condition than in the
reward-present condition. Unlike the SR+-type cell shown in Fig. 6, the
enhancement effect in the reward-absent condition was present both for
the preferred position (bottom) and for the nonpreferred position (top), the latter being even clearer in this particular cell.
Spatial discriminability in the population of SR+ and SR− types
The difference between SR+- and SR−-type activity represented in Figs. 6 and 7 turned out to be a general rule as visualized by the population histograms of mean firing rates (Fig. 8, A, B, D, and E) and those of transmitted information (Fig. 8, C and F). When the cue was at the
preferred position (Fig. 8A), the mean SR+-type activity was
higher by approximately 45% in the reward-present condition (black)
than in the reward-absent condition (gray). Note, however, the activity
started at 80 ms after cue onset in both conditions similarly, but
diverged at 110 ms after cue onset depending on the reward condition.
In contrast, when the cue was at the nonpreferred position (Fig.
8B), the SR+-type activity was not different between the
reward-present condition (black) and the reward-absent condition
(gray). The results suggest that the spatial discriminability of
SR+-type neurons was improved in the reward-present condition. To test
this more quantitatively, we calculated the transmitted spatial
condition information (hereafter denoted simply as Is) of
each neuron. Is was calculated based on spike count in a
50-ms time window moved by 10-ms step and plotted at the center of the
window. Figure 8C shows the mean Is of all SR+
neurons. The mean Is in the reward-present condition (black)
was nearly twice as large as that in the reward-absent condition (gray)
in all three periods. Is based on total spike count in each period was 0.39 bits (reward+) and 0.24 bits (reward−) in the cue period, 0.38 bits (reward+) and 0.20 bits (reward−) in the delay period, and 0.44 bits (reward+) and 0.25 bits (reward−) in the saccade period (cue period: P < 0.01, delay period: P < 0.01, saccade period: P = 0.05; paired t-test). Note that Is based on spike count in each period was smaller than the value achieved by integrating the curves in Fig. 8C because information does not necessarily accumulate over different bins.
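The time-resolved analysis above relies on spike counts taken in a 50-ms window advanced in 10-ms steps, with each value plotted at the window center. A minimal sketch of that windowing step is given below; the half-open window convention, the requirement that windows lie fully inside the analyzed interval, and the function name are assumptions.

```python
def sliding_window_counts(spike_times, t_start, t_end, width=50, step=10):
    """Spike counts in a sliding window.

    Returns a list of (window_center, count) pairs for every window
    [t, t + width) that lies fully inside [t_start, t_end], with t
    advanced by `step`. All times are in ms.
    """
    result = []
    t = t_start
    while t + width <= t_end:
        count = sum(1 for s in spike_times if t <= s < t + width)
        result.append((t + width / 2, count))
        t += step
    return result


# Hypothetical spike times (ms after cue onset) for a single trial.
windows = sliding_window_counts([5, 55, 60], t_start=0, t_end=100)
print(windows)
```

Because consecutive windows overlap by 40 ms, the same spikes enter several windows. This is one reason why values computed per window cannot simply be summed to obtain the value for a whole period.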
The results for the SR− type were completely different. The mean SR−-type activity was higher in the reward-absent condition (gray) than in the reward-present condition (black). This was true whether the cue was at the preferred position (Fig. 8D) or at the nonpreferred position (Fig. 8E). Again, the SR− activity started at 80 ms after cue onset, but the reward-dependent modulation started at 120 ms. The mean Is was strikingly similar between the reward-present (black) and reward-absent (gray) conditions (Fig. 8F). There was no statistical difference between the conditions in any of the three periods.
Activity related to gaze fixation in SR−-type cells
We noticed other features that were different between the SR+ and SR− activities. The SR+ activity was sustained during the delay period and tended to increase toward the saccade period (Fig. 8A), whereas SR−-type activity made a phasic peak in the cue period and tended to decrease toward the saccade period (Fig. 8D). SR−-type cells defined in the cue period not only responded phasically to the cue presentation but also increased their activity in the precue period at almost the same level (Fig. 8, D and E). In contrast, the activity of SR+-type cells in the precue period stayed low at around 10 Hz (Fig. 8, A and B).
Change of reward effect within a block
We found that the effect of the reward condition on behavioral
performance became clearer in the late part of the block (Fig. 2),
suggesting that the animal's motivational contrast between the
reward-present and absent conditions became clearer in the late part of
the block. To examine whether the within-block change in behavior
correlates with spatial representation of LPFC neurons, we calculated
Is in both the early and late part of the block. Figure
9 demonstrates Is in the
reward-present condition (black lines and dots) and in the
reward-absent condition (gray lines and dots) in the early (1-10th
trial) and late (11-30th trial) part of the block. Is of
SR+ activity (Fig. 9A) showed changes similar to those in
saccadic behavior (Fig. 2). In the reward-absent condition, the mean Is of SR+ activity decreased significantly from the early part to the late part of the block, both in the cue period (paired t-test, P < 0.05) and the delay period (P < 0.01). On the other hand, the mean Is of SR− activity was independent of the reward condition and showed no significant change between the early and late parts (Fig. 9B). Within-block changes of Is in
the saccade period are not shown because reward information was not
commonly coded in the saccade period.
Recording locations
We recorded from a wide area in the right LPFC of two monkeys (Fig. 10A). Confirmation of recording locations is based on magnetic resonance imaging and electric stimulation of frontal eye field. Anatomical estimates from these two methods were consistent: the tracks where saccades were elicited by electric stimulation below 50 µA corresponded to the rostral bank of the arcuate sulcus. Distribution of each type of neuron showed some tendency depending on the time period. To show this tendency in two monkeys, the brain maps obtained from two monkeys were superimposed so that the brain landmarks (principal sulcus and arcuate sulcus) overlapped best. During the cue period (Fig. 10B), S-type activity (blue circle) tended to appear in the ventrocaudal part of LPFC, while R-type activity (red triangle and red inverted triangle) appeared in and ventral to the principal sulcus area. SR-type activity (green triangle and green inverted triangle) was distributed between the zones of the S and R types. During the delay period (Fig. 10C), the number of neurons showing SR-type activity increased, and there was no clear anatomical separation of the different types of activity. During the saccade period (Fig. 10D), the number of neurons showing SR- and R-type activity decreased; S-type activity appeared in the caudal half of LPFC while most of the activity in the rostral half was of the ND type.
| |
DISCUSSION |
|---|
|
|
|---|
The present study demonstrated that the monkey LPFC showed various subsets of neurons in terms of spatial coding and reward coding during the performance of 1CR task; S-type cells coded only the spatial condition, R-type cells coded only the reward condition, and SR-type cells coded both of these conditions. This may support the idea that the LPFC receives both spatial and reward information and their integration takes place in the LPFC to generate behavioral responses adjusted to the expected reward outcome. From this viewpoint, SR-type cells are the postintegration cells, and indeed they showed a correlation with saccade behavior adjusted to the expected reward outcome.
Effects of the reward conditions on behavior
We found that the monkeys changed their behavior depending on the expected reward outcome during the 1CR task; saccades were less precise in the reward-absent condition than in the reward-present condition, and this effect became clearer in the latter part of the block (Fig. 2, Table 1). These results indicate that the level of motivation changed according to the reward condition and the trial count in the block.
The monkeys made more errors during the delay period by breaking the fixation, particularly in the reward-absent condition. These behavioral results suggest that behavioral error was due to the failure of the spatial working memory mechanism, the failure of the eye fixation mechanism, or both; to perform 1CR correctly, mechanisms for spatial working memory and suppression of fixation break might be engaged.
Effects of the reward conditions on LPFC neurons
It has been reported that LPFC neurons show activity that correlates with reward expectation (Leon and Shadlen 1999; Watanabe 1996). The tasks in these studies
changed the kinds or magnitude of reward while the animal performed a
spatial working memory task. We also found reward-related activity in
the LPFC consistent with the previous studies. We further dissociated
neurons that process purely reward signals (R type) from neurons that
process both reward and spatial signals (SR type). In contrast to the fact that there were many S-type cells in the cue period and their visual onset was sharp (Fig. 3), R-type cells were most common in the
delay period and their onset was generally not sharp. In case of
SR-type cells, space-coding activity made a steep rise in the histogram
followed by reward-differential activity with delay (Fig. 8). This is
especially interesting because we presented spatial and reward
information simultaneously by the target cue. The different temporal
course between spatial signal and reward signal suggests different
routes before arriving at the LPFC and the delay of the reward signal
may correspond to the time needed to process the color-reward association.
Effects of reward condition on spatial processing in LPFC
Neurophysiological studies reported activity of the LPFC neurons
related to visuospatial processing during the visual response period,
delay period, and motor response period (Funahashi et al. 1989-1991; Suzuki and Azuma 1983). In particular,
the delay period activity of LPFC neurons has been emphasized as a
neural correlate of spatial working memory. Consistent with previous
studies, we found that activity related to visuospatial processing is
common in the LPFC throughout the task period (S and SR types: 43.9, 44.8, and 38.7% of responsive neurons in the cue, delay, and saccade periods, respectively). Our study reveals that, at least in some subsets
of neurons, the activity is not purely cognitive but mixed with reward expectation.
Effects of reward condition on spatial discriminability of SR+ and SR− types
The SR-type activity was classified into two groups, SR+ and SR−,
depending on whether the activity was enhanced or depressed by the
predicted presence of a reward. Interestingly, they were different in
several respects and probably have different, not opposite, functions.
In the SR+ type, the spatial discriminability was higher in the reward-present condition than in the reward-absent condition, and the effect was maintained until the time of saccade execution (Fig. 8C). The enhanced spatial discriminability corresponded to better performance in the reward-present condition (Table 1). Furthermore, the spatial discriminability of the SR+ type and the monkey's performance co-varied within a block of the 1CR task (Fig. 9A). SR+-type cells, then, may be directly responsible for the adjustment of saccadic behavior to the expected reward outcome.
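As an illustration of what "spatial discriminability expressed by the transmitted information" can mean, the following minimal sketch computes the mutual information (in bits) between cue position and a discretized neuronal response. It is not the procedure used in this study: the function name, the response binning (for example, low versus high spike count), and the toy data are assumptions made only for this example.

```python
import math
from collections import Counter


def transmitted_information(positions, responses):
    """Mutual information (bits) between cue position and discretized response.

    positions: one cue-position label per trial (hypothetical labels).
    responses: one discretized response per trial (binning is an assumption).
    """
    n = len(positions)
    joint = Counter(zip(positions, responses))   # joint counts n(s, r)
    pos_counts = Counter(positions)              # marginal counts n(s)
    resp_counts = Counter(responses)             # marginal counts n(r)
    info = 0.0
    for (s, r), count in joint.items():
        p_sr = count / n
        # p(s, r) / (p(s) p(r)) simplifies to count * n / (n(s) * n(r))
        info += p_sr * math.log2(count * n / (pos_counts[s] * resp_counts[r]))
    return info


# Toy data: two cue positions, response binned as 0 (low) or 1 (high).
positions = ["left", "left", "right", "right"]
selective = [1, 1, 0, 0]      # response fully determined by position
unselective = [1, 0, 1, 0]    # response independent of position

info_selective = transmitted_information(positions, selective)
info_unselective = transmitted_information(positions, unselective)
```

Under this measure, a higher value in the reward-present condition than in the reward-absent condition would correspond to the improved spatial discriminability described for the SR+ type. With few trials, a plug-in estimate of this kind tends to be biased upward, so a real analysis would normally apply a correction.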
In contrast, the spatial discriminability of the SR−-type activity was
the same, on average, between the reward-present and reward-absent
conditions, despite the significant difference in the magnitude of the
activity (Fig. 8, D-F). Thus SR−-type cells do not appear
to determine the quality of the saccadic behavior. SR−-type cells may
represent spatial information unaffected by reward expectation.
Another possibility is that SR−-type cells may be related to some other
function, such as gaze fixation or suppressing inappropriate eye
movements away from fixation. This hypothesis is supported by the
behavioral finding that the monkeys made more errors by breaking fixation
in the reward-absent condition; hence more effort was required for
successful central fixation in the reward-absent condition than in the
reward-present condition. Further support for a gaze-related function of
the SR− type is that these cells were phasically activated by the onset of the
fixation point and the target cue when the monkey gazed at the fixation
point (Fig. 8, D and E). SR−-type activity could
serve to strengthen fixation as a reaction against demotivating
information carried by the peripheral cue. In this sense, the spatial
selectivity of the SR− type would imply that SR−-type activity suppresses
a saccade to the position in the neuron's response field. This idea is
consistent with the long-standing hypothesis that the prefrontal cortex
plays a key role in protecting the temporal structure of behavior from
interference and distraction and in controlling instinctual or
emotional behavior (Fuster 1997; Sakagami et al. 2001).
Bivalency hypothesis on the reward-coding activity in LPFC
The critical functional difference between SR+ and SR− types
originates from the direction of the reward effect (R+ or R−). One
interpretation is that the bivalent reward outcome (presence or absence of
reward) in the 1CR task was represented in distinct populations of LPFC
neurons; R+- and SR+-type cells represent positive motivation and R−-
and SR−-type cells represent negative motivation. Leon and Shadlen
reported reward-related enhancement of LPFC neurons restricted to the
visual field (Leon and Shadlen 1999), which might
correspond to our SR+-type activity. However, they did not find
SR−-type neurons. The different result for the SR− type might be explained
by the bivalence hypothesis: our all-or-none type of reward schedule
may cause bivalent motivational states in the monkey, whereas their
large-or-small type of reward schedule may cause positive motivation to
a large or small degree.
Multi-valent representation of reinforcement was shown not only in the
LPFC (Watanabe 1996) but also in other brain areas (Elliot et al. 2000; O'Doherty et al. 2001; Thorpe et al. 1983), and it has been
proposed that there are parallel neural systems for the representation
of appetitive and aversive reinforcers. The current study suggests that
the limbic projection to the LPFC modulates cognitive functions in a
valence-specific manner: R+ signals serve to enhance working-memory
processes and R− signals suppress inappropriate behavior due to distraction.
Relationship among the subsets of LPFC neurons
The effects of reward condition on behavior imply that spatial signals and reward-predicting signals are integrated prior to motor output. The current results of single-unit recording suggest that such integration takes place in single neurons in the LPFC. S- and R-type cells independently encoded the spatial and reward-predicting signals, whereas SR-type cells encoded both. Furthermore, we found an interesting temporal and anatomical pattern of activity in these cells (Fig. 10). S-type activity occurred in the caudal LPFC close to FEF and stayed there, whereas R-type activity occurred in the rostral LPFC and shifted caudally as if joining S-type activity. Correspondingly, SR-type activity, which occurred in the caudal and ventral LPFC, became more prevalent in the delay period. These data raise the possibility that the spatial signals and reward-predicting signals were integrated in SR-type neurons generating saccadic behavior, which adjusted to the expected reward outcome.
Functional difference between LPFC and basal ganglia
Spatial information and reward information also converge in the
basal ganglia (Hikosaka et al. 1989b; Kawagoe et al. 1998). The functional comparison between the LPFC and basal
ganglia is important because the projection of the LPFC to the dorsal
striatum forms a part of the basal ganglia-thalamocortical circuit
(Alexander et al. 1986). Taken together with previous
studies in our laboratory using a similar 1DR task (Kawagoe et al. 1998), we found that SR-type neurons were more common in
the caudate nucleus (about 60% of task-related neurons) than in the
LPFC (8-24% of task-related neurons). However, SR-type neurons in the
caudate generally increased their activity in the reward-present
condition independent of the cue position, so much so that the original
spatial discriminability often deteriorated. Therefore in these
neurons, the spatial representation may not be improved when the
subject expects the presence of reward. Caudate activity correlated
with saccade parameters such as saccade latency and peak velocity
rather than with the probability of making an error (Itoh et al. 2000). In contrast, the quality of the spatial code of SR+-type
neurons in the LPFC was adjusted by the expected reward outcome. To
generalize this comparison, we propose that the prefrontal cortex
changes the cognitive aspects of behavior based on expected reward
outcome, whereas the basal ganglia change the motor aspects of behavior based on expected reward outcome.
| |
ACKNOWLEDGMENTS |
|---|
We thank B. Coe, H. Itoh, and H. Wakabayashi for comments on the manuscript and I. Kanazawa for comments and encouragement.
This work was supported by Core Research for Evolutional Science and Technology of Japan Science and Technology Corporation, Japan Society for the Promotion of Science (JSPS) Research for the Future Program, and JSPS Research Fellowships for Young Scientists.
| |
FOOTNOTES |
|---|
Address for reprint requests: M. Sakagami, Brain Science Research Center, Tamagawa University Research Institute, Machida, Tokyo 194-8610, Japan (E-mail: sakagami{at}lab.tamagawa.ac.jp).
Received 8 June 2001; accepted in final form 25 October 2001.
| |
REFERENCES |
|---|
|
|
|---|