J Neurophysiol 87: 1488-1498, 2002;
0022-3077/02 $5.00

The Journal of Neurophysiology Vol. 87 No. 3 March 2002, pp. 1488-1498
Copyright ©2002 by the American Physiological Society

Influence of Reward Expectation on Visuospatial Processing in Macaque Lateral Prefrontal Cortex

Shunsuke Kobayashi,1,2 Johan Lauwereyns,2 Masashi Koizumi,2 Masamichi Sakagami,2,3 and Okihide Hikosaka2

 1Department of Neurology, University of Tokyo School of Medicine, Tokyo 113-8655;  2Department of Physiology, Juntendo University School of Medicine, Tokyo 113-0033; and  3Brain Science Research Center, Tamagawa University Research Institute, Tokyo 194-8610, Japan


    ABSTRACT

Kobayashi, Shunsuke, Johan Lauwereyns, Masashi Koizumi, Masamichi Sakagami, and Okihide Hikosaka. Influence of Reward Expectation on Visuospatial Processing in Macaque Lateral Prefrontal Cortex. J. Neurophysiol. 87: 1488-1498, 2002. The lateral prefrontal cortex (LPFC) has been implicated in visuospatial processing, especially when it is required to hold spatial information during a delay period. It has also been reported that the LPFC receives information about expected reward outcome. However, the interaction between visuospatial processing and reward processing is still unclear because the two types of processing could not be dissociated in conventional delayed response tasks. To examine this, we used a memory-guided saccade task with an asymmetric reward schedule and recorded 228 LPFC neurons. The position of the target cue indicated the spatial location for the following saccade and the color of the target cue indicated the reward outcome for a correct saccade. Activity of LPFC was classified into three main types: S-type activity carried only spatial signals, R-type activity carried only reward signals, and SR-type activity carried both. Therefore only SR-type cells were potentially involved in both visuospatial processing and reward processing. SR-type activity was enhanced (SR+) or depressed (SR-) by the reward expectation. The spatial discriminability as expressed by the transmitted information was improved by reward expectation in SR+ type. In contrast, when reward information was coded by an increase of activity in the reward-absent condition (SR- type), it did not improve the spatial representation. This activity appeared to be involved in gaze fixation. These results extend previous findings suggesting that the LPFC exerts dual influences based on predicted reward outcome: improvement of memory-guided saccades (when reward is expected) and suppression of inappropriate behavior (when reward is not expected).


    INTRODUCTION

Prediction of future events, such as the presence of food, is critically important for an animal's survival. Single-unit studies have revealed neuronal activity related to stimulus-reinforcement association in various brain areas including substantia nigra (Schultz 1998), amygdala (Fukuda and Ono 1993; Schoenbaum et al. 1998), hypothalamus (Fukuda and Ono 1993), orbitofrontal cortex (Schoenbaum et al. 1998; Thorpe et al. 1983; Tremblay and Schultz 1999), anterior cingulate cortex (Niki and Watanabe 1976), and ventral striatum (Apicella et al. 1991; Shidara et al. 1998). Together with other lines of research, these findings suggest that an animal's ability to predict reward relies on these brain areas (Jones and Mishkin 1972; Olds and Milner 1954; Phillips et al. 1983).

In addition, animals, especially primates, have developed cognitive abilities to flexibly respond to the environment. Spatial working memory is an example that allows motor behavior to be guided by visuospatial representations that are stored in memory. Using a memory-guided saccade task (Hikosaka and Wurtz 1983), it has been demonstrated that spatial information is maintained by individual neurons in the lateral prefrontal cortex (LPFC) (Funahashi et al. 1989-1991), parietal cortex (Barash et al. 1991), basal ganglia (Hikosaka and Wurtz 1983; Hikosaka et al. 1989a), and superior colliculus (Kojima et al. 1996). Together with other lines of evidence, the LPFC is now thought to be a key structure for spatial working memory (Jacobsen 1935; Jonides et al. 1993; Milner et al. 1985).

These two functions, reward prediction and spatial working memory, have been examined in separate studies, with some exceptions (Leon and Shadlen 1999; Watanabe 1996), and the relationship between these two types of processing remains unknown. They may not be independent of each other, considering that cognitive behavior may be based on the expectation of reward outcome. A recent study from our laboratory elucidated such a situation. During a memory-guided saccade task in which a correct response was rewarded for only one of four directions (1DR task), the behavioral performance of monkeys was better and more precise in the reward-present trials than in the reward-absent trials (Kawagoe et al. 1998). This behavioral result indicates that reward processing does interact with spatio-motor processing in the brain.

It is likely that the LPFC is the place where the interaction takes place because the brain areas primarily engaged in the reward processing, such as the midbrain dopamine area and orbitofrontal cortex, project to the LPFC (Barbas and Mesulam 1985; Ilinsky et al. 1985). Further, reward-related activity has been recorded from the LPFC during the performance of instrumental behavior (Leon and Shadlen 1999; Rosenkilde et al. 1981; Watanabe 1996). Additional support comes from the fact that administration or blocking of dopamine, which is regarded as a carrier of reward information (Schultz 1998; Yokel and Wise 1975), affects the function of spatial working memory: microinjection of dopamine and dopamine antagonists changes the spatial discriminability of working-memory neurons in the LPFC (Sawaguchi 1988, 1990; Williams and Goldman-Rakic 1995). Clinical studies on patients with deficits of the dopaminergic system, such as Parkinsonian and schizophrenic patients, reported deterioration in working-memory tasks (Freedman and Oscar-Berman 1986; Owen et al. 1997; Pantelis et al. 1997; Sahakian et al. 1988). At present, however, it remains unclear how the LPFC integrates spatial and reward information and to what extent such integration correlates with behavior.

We hypothesized that reward information and spatial information are integrated in the LPFC so that spatial information becomes more accurate when reward outcome is expected; more accurate representations of spatial information would, in turn, lead to more accurate behavior. To test this hypothesis, we devised a memory-guided saccade task with an asymmetric reward schedule, in which the subject performed a spatiomotor behavior with different reward outcomes. The task is the same as the one used in our laboratory to study basal ganglia neurons (1DR) (Kawagoe et al. 1998), except that the reward condition was indicated by the color of the target cue, not its position. We call it the "1CR" (1-color rewarded) task.


    METHODS

The study was conducted on two male Macaca fuscata monkeys (monkeys H and Z, both weighed 5.0-6.0 kg). They performed a behavioral task under computer control. A stimulus generator (VSG, Cambridge Research, UK) was used to generate visual stimuli on a 20-in. computer display (GDM-F500, Sony, Tokyo, Japan). Activity of single neurons was recorded with moveable electrodes, whereas eye movements were monitored using the magnetic search-coil technique. All surgical and experimental protocols were approved by the Juntendo University Animal Care and Use Committee and were in accordance with National Institutes of Health's Guide for Care and Use of Laboratory Animals.

Behavioral procedures

The monkeys were seated with their head fixed in a primate chair inside a completely enclosed sound-attenuated room. A CRT was set 70.5 cm in front of the monkey to present visual stimuli. They were trained on a memory-guided saccade task with an asymmetric reward schedule (Fig. 1). A trial started with the onset of a central fixation point (0.21° in visual angle). Five hundred milliseconds after the onset of the fixation point, a target cue (0.53°) was presented for 200 ms randomly at one of two positions. The target cue was colored and the color predicted a reward. Two diametrically opposite positions were selected from four candidates (left, right, top, bottom) at a 6.5° distance from the center of the monitor. When the neuron had a spatial preference, one of the two positions was selected to be within the neuron's response field. The subject had to remember the cue position during the delay period, which was randomized between 0.9 and 2.1 s. The disappearance of the fixation point after the delay period was the signal to make a saccade to the previously cued location. The saccade was judged to be correct if the eye position was within a virtual window around the target cue (within 5.2 × 5.2° and within 400 ms). The target cue came on 400 ms later for 100 ms. An auditory tone of 900-Hz rectangular waveform was presented if the subject made a correct saccade. If the subject made any error, the same trial was repeated. The sound served as an indication of correct task performance both in reward-present and -absent conditions. The color of the target cue indicated whether a reward would follow. Two colors were paired out of four prepared colors (red, yellow, green, and blue; luminance was 5.51, 25.6, 20.1, and 1.6 cd/m², respectively). One of the two colors was associated with reward (a drop of water about 0.15-0.20 ml) and the other with no reward. Both color and position of the cue were alternated randomly. The color-reward contingency was fixed within a block consisting of at least 60 trials, and it was reversed in the following block. A rest break (about 30 s) was given between the blocks.
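The trial structure described above (2 positions × 2 colors, a fixed color-reward contingency within a block, and reversal in the next block) can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' task-control software: the function names `make_block` and `make_session` are hypothetical, the balanced shuffling of the four conditions and the uniform delay distribution are assumptions, and the repetition of error trials is not modeled.

```python
import random

def make_block(positions, colors, rewarded_color, n_trials=60, rng=None):
    """Build one block of trials with a fixed color-reward contingency.

    Assumption: the four position x color conditions are balanced and
    shuffled; the source only states that both were alternated randomly.
    If n_trials is not a multiple of 4, the block is rounded down.
    """
    rng = rng or random.Random()
    conditions = [(p, c) for p in positions for c in colors]
    trials = conditions * (n_trials // len(conditions))
    rng.shuffle(trials)
    return [
        {
            "position": p,
            "color": c,
            "rewarded": c == rewarded_color,
            # Delay randomized between 0.9 and 2.1 s (uniform is an assumption).
            "delay_s": rng.uniform(0.9, 2.1),
        }
        for p, c in trials
    ]

def make_session(positions, colors, n_blocks=2, n_trials=60, seed=0):
    """Build consecutive blocks, reversing the rewarded color each block."""
    rng = random.Random(seed)
    rewarded = colors[0]
    blocks = []
    for _ in range(n_blocks):
        blocks.append(make_block(positions, colors, rewarded, n_trials, rng))
        # Reverse the color-reward contingency for the following block.
        rewarded = colors[1] if rewarded == colors[0] else colors[0]
    return blocks

session = make_session(("left", "right"), ("red", "yellow"))
```

With this layout, each block contains 30 reward-present and 30 reward-absent trials, which matches the block composition reported in RESULTS.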



Fig. 1. The memory-guided saccade task with an asymmetric reward schedule. A: the subject fixated on a central fixation point. A peripheral target cue briefly appeared, indicating the target position for the saccade and the reward condition. After a variable delay period, the fixation point disappeared, signaling the subject to make a saccade to the remembered position. In the reward-present condition, the subject received a liquid reward and a beep tone; but in the reward-absent condition, it received only a beep tone. B: stimulus-reward associations in each type of block. Four kinds of stimuli were used (2 positions × 2 colors). One of the 2 colors was associated with presence of reward and the other with absence of reward. The association was fixed within a block and reversed in the next block.

Surgical procedures

Surgery was performed to implant a head-holder, a delrin chamber (30 × 42 mm) and a scleral magnetic search coil (Robinson 1963). All surgical procedures were performed with aseptic technique under ketamine (4.6-6.0 mg/kg im) and pentobarbital (4.5-6.0 mg/kg/h iv) anesthesia. The monkey received antibiotics (sodium ampicillin, 25-40 mg/kg im) after the operation.

Data acquisition

During recording sessions, action potentials of single neurons were recorded extracellularly with tungsten electrodes (FHC, Bowdoinham, ME; shank diameter: 250 µm, taper angle: 20-15°, impedance: 1.5-3 MΩ). Microelectrodes were advanced vertically to the cortical surface, using an oil-driven micro-manipulator (MO-95, Narishige, Tokyo, Japan) and a grid system (holes 0.6 mm wide and 1.0 mm apart from center to center; Nakazawa, Tokyo, Japan). The action potentials were amplified, filtered (500 Hz to 2 kHz) and processed by a window discriminator (MDA-4 and DDIS-1, BAK Electronics, Germantown, MD). Neuronal discharges were converted into standard digital pulses by means of an adjustable Schmitt-trigger, the output of which was continuously monitored on a digital oscilloscope together with the waveform. A PC generated raster displays of neuronal activity. Eye movements were recorded using the magnetic search-coil technique (MEL-25, Enzanshi-Kogyo, Tokyo, Japan). The data of neuronal discharges and eye movements were also collected on a digital tape recorder at 20 and 2 kHz, respectively (GX-1, Teac, Tokyo, Japan).

Data analysis

Off-line analysis was carried out on a PC using MATLAB release 12 for Windows. All spike analyses were based on trials in which the subject made a correct response. We defined the cue period as the period from 100 to 300 ms after cue onset, the delay period as the period from 300 to 900 ms after cue onset, and the saccade period as the period from 300 ms before the saccade onset to the saccade onset time. We tested the effect of the spatial and reward conditions on the neuronal activity during these three periods separately by two-way ANOVA (position × reward, P < 0.01). The reward condition gave rise to subtle but significant differences in saccade latency, peak velocity, and amplitude (Table 1). Therefore it was possible that reward-differential neuronal activity was confounded by these oculomotor parameters. To examine this possibility, data were tested by two-way analysis of covariance (ANCOVA, position × reward, P < 0.01) with saccade latency, peak velocity, amplitude, and precision as covariates. The ANOVA results were consistent with those using ANCOVA except for a few cases (among 52, 88, and 27 cases with a significant main effect of reward according to ANOVA in cue, delay, and saccade periods, respectively, 5, 8, and 14 cases were not significant according to ANCOVA). We treated these cases as nonsignificant with respect to the reward factor in the neuronal database (Table 2) so that reward-related activity in our population analysis is not explained by systematic changes in the oculomotor behavior caused by the reward condition.
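The three analysis periods defined above can be made concrete with a short Python sketch that counts spikes per period for one trial. The authors performed their analysis in MATLAB; the function names here are hypothetical, spike times are assumed to be in milliseconds relative to cue onset, and the half-open window convention (start included, end excluded) is an assumption.

```python
def count_spikes(spike_times_ms, start_ms, end_ms):
    """Count spikes in the half-open window [start_ms, end_ms)."""
    return sum(1 for t in spike_times_ms if start_ms <= t < end_ms)

def period_counts(spike_times_ms, saccade_onset_ms):
    """Spike counts in the cue, delay, and saccade periods of one trial.

    All times are relative to cue onset (ms).
    """
    return {
        # Cue period: 100 to 300 ms after cue onset.
        "cue": count_spikes(spike_times_ms, 100, 300),
        # Delay period: 300 to 900 ms after cue onset.
        "delay": count_spikes(spike_times_ms, 300, 900),
        # Saccade period: the 300 ms preceding saccade onset.
        "saccade": count_spikes(
            spike_times_ms, saccade_onset_ms - 300, saccade_onset_ms
        ),
    }

example = period_counts(
    [50, 120, 250, 300, 450, 899, 900, 1500, 1700], saccade_onset_ms=1800
)
```

Counts of this kind, one per correct trial and period, are the inputs one would pass to a two-way ANOVA (position × reward).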


                              
Table 1. Behavioral results


                              
Table 2. Numbers of neurons in each type

We evaluated how well neuronal activity discriminated the position of the cue. The activity of SR-type cells (see RESULTS for detail) changed not only with the spatial condition but also with the reward condition. In other words, the discharge rate, i.e., the range of activity within which these neurons encoded spatial information, changed systematically with the reward condition. Therefore we needed a measure that evaluates spatial discriminability independently of the range of activity. Our approach was to ask how much the neural responses could tell us about the cue position if we consider the neurons to be transmission channels that carry spatial information via their spike rates in response to the cue presentation. The spatial discriminability of a neuron was expressed as transmitted information. The predictable information about the spatial condition associated with the neuronal responses (I_S) was quantified as the decrease in entropy of the stimulus occurrence H(S)
$$I_S = I(S; X) = H(S) - H(S \mid X) \tag{1}$$

$$= \sum_{s} -p(s) \log p(s) - \left\langle \sum_{s} -p(s \mid x) \log p(s \mid x) \right\rangle_{x}$$
where S is the set of spatial conditions of the cue s, X is the set of neuronal responses x, p(s|x) is the conditional probability of spatial condition s given an observed spike count x, and p(s) is the a priori probability of spatial condition s. The brackets indicate the average over the response distribution p(x). Uneven distribution of the data samples across the bins causes systematic bias in the values of information. For example, if the number of bins exceeds the number of data samples, the estimate never yields zero information. On the other hand, if the number of bins is too small, the true values of information are underestimated (Golomb et al. 1997). To settle these problems, we binned the responses with unequal binning intervals so that at least two responses were present in each bin. After achieving a relatively even distribution of the data in each bin, X's dimensionality (number of bins) could still differ across the two reward conditions or across neurons. The transmitted information is biased upward when X's dimensionality increases. To correct for this bias, we subtracted the first-order correction term (C1) from the value calculated using Eq. 1 (Treves and Panzeri 1995).
$$C_1 = \frac{1}{2N \ln 2} \left[ \sum_{s} \tilde{X}_s - \tilde{X} - (S - 1) \right] \tag{2}$$
[$\tilde{X}_s$ denotes the number of response bins where p(x|s) is nonzero at a given s, $\tilde{X}$ denotes the number of response bins where p(x) is nonzero, S in Eq. 2 is the number of spatial conditions, and N is the number of trials. Bins in X, but not Xs, were all nonzero as a result of the binning procedure mentioned above.] Thus Is = I(S; X) - C1. After this correction, no correlation was observed between the dimensionality of X and Is.
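Equations 1 and 2 can be illustrated with a small Python sketch that estimates the transmitted information from per-trial spike counts and subtracts the first-order correction term. It is a sketch under stated assumptions, not the authors' MATLAB code: the function names are hypothetical, the greedy equal-population binning in `assign_bins` is only one possible way to obtain unequal bin widths with at least two responses per bin, the default of four bins is arbitrary, and the logarithm is taken to base 2 so that the result is in bits.

```python
import math
from collections import Counter

def assign_bins(responses, n_bins=4, min_per_bin=2):
    """Map each spike count to a bin index using unequal-width bins.

    Assumption: bins are filled greedily in order of spike count until
    each holds roughly len(responses) / n_bins responses (never fewer
    than min_per_bin); identical counts always share a bin.
    """
    freq = Counter(responses)
    target = max(min_per_bin, math.ceil(len(responses) / n_bins))
    mapping, bin_idx, in_bin = {}, 0, 0
    for value in sorted(freq):
        mapping[value] = bin_idx
        in_bin += freq[value]
        if in_bin >= target:
            bin_idx += 1
            in_bin = 0
    if 0 < in_bin < min_per_bin and bin_idx > 0:
        # Fold an under-filled last bin into the previous one.
        for value, b in mapping.items():
            if b == bin_idx:
                mapping[value] = bin_idx - 1
    return mapping

def transmitted_information(stimuli, responses, n_bins=4):
    """Return (corrected, raw, c1) in bits, following Eqs. 1 and 2."""
    mapping = assign_bins(responses, n_bins)
    binned = [mapping[r] for r in responses]
    n = len(stimuli)
    n_s = Counter(stimuli)                # trials per spatial condition
    n_x = Counter(binned)                 # trials per response bin
    n_sx = Counter(zip(stimuli, binned))  # trials per (condition, bin) cell

    # H(S) = sum_s -p(s) log p(s)
    h_s = -sum((c / n) * math.log2(c / n) for c in n_s.values())
    # H(S|X) = < sum_s -p(s|x) log p(s|x) >_x = -sum_{s,x} p(s,x) log p(s|x)
    h_s_given_x = -sum(
        (c / n) * math.log2(c / n_x[x]) for (s, x), c in n_sx.items()
    )
    raw = h_s - h_s_given_x

    # Eq. 2: the number of nonzero (s, x) cells equals sum_s X~_s, and the
    # number of occupied bins equals X~.
    c1 = (len(n_sx) - len(n_x) - (len(n_s) - 1)) / (2 * n * math.log(2))
    return raw - c1, raw, c1

# A neuron whose spike counts separate the two cue positions completely.
stimuli = ["preferred"] * 10 + ["nonpreferred"] * 10
responses = [10, 11, 12, 13, 14] * 2 + [0, 1, 2, 3, 4] * 2
corrected, raw, c1 = transmitted_information(stimuli, responses, n_bins=2)
```

Note that the correction term can be negative, as in this fully separable toy case, so the corrected value is not bounded by H(S); the correction matters mainly when comparing conditions whose numbers of occupied bins differ.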

Anatomical location

We explored a wide area in the prefrontal cortex for single-unit recording in two monkeys. To identify recording sites, brain magnetic resonance images were taken (AIRIS, 0.3T, Hitachi) (Saunders et al. 1990). In addition, frontal eye field (FEF) was identified as an area where saccades were elicited by micro-stimulation (a train of 20 cathodal pulses of 0.2-ms duration and 3-ms interval) lower than 50 µA (Bruce et al. 1985). Histological confirmation of recording sites is not available because both monkeys are alive and participating in other studies.


    RESULTS

Behavioral performance

The animals' behavior was analyzed based on 36,903 trials (monkey Z: 21,193 trials, monkey H: 15,710 trials) during the course of single-unit recordings. There were a total of 3,250 error trials (Z: 1,844 trials, H: 1,406 trials) that consisted of fixation errors before the cue onset (280 trials, Z: 212 trials, H: 68 trials), fixation errors after the cue onset (1,929 trials, Z: 1,053 trials, H: 876 trials), and errors of saccade direction (1,041 trials, Z: 579 trials, H: 462 trials). Although the animals were free to ignore the color of the target cue that indicated the reward condition, their behavior was systematically influenced by the reward condition: correct performance rate after the cue presentation was significantly higher in the reward-present condition (94.6%, Z: 94.6%, H: 94.6%) than in the reward-absent condition (88.8%, Z: 89.0%, H: 88.4%, P < 0.001 for both monkeys by χ² test).
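As a quick consistency check, the trial and error counts reported above can be tallied in Python. The numbers are copied from the text; the dictionary layout and the `summarize` helper are hypothetical and serve only to show that the per-monkey and per-category totals agree.

```python
# Trial and error counts as reported in the text (monkeys Z and H).
total_trials = {"Z": 21193, "H": 15710}
errors = {
    "fixation_before_cue": {"Z": 212, "H": 68},
    "fixation_after_cue": {"Z": 1053, "H": 876},
    "saccade_direction": {"Z": 579, "H": 462},
}

def summarize(total_trials, errors):
    """Sum the error counts per monkey and per error category."""
    per_monkey = {
        m: sum(kind[m] for kind in errors.values()) for m in total_trials
    }
    per_kind = {name: sum(kind.values()) for name, kind in errors.items()}
    n_trials = sum(total_trials.values())
    n_errors = sum(per_monkey.values())
    return {
        "per_monkey": per_monkey,
        "per_kind": per_kind,
        "n_trials": n_trials,
        "n_errors": n_errors,
        "error_rate": n_errors / n_trials,
    }

summary = summarize(total_trials, errors)
```

The category totals (280, 1,929, and 1,041) and the per-monkey totals (1,844 and 1,406) both sum to the reported 3,250 error trials.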

Because we changed the stimulus-reward contingency in every block, it may have taken a while for the animals to understand the new contingency. To examine the within-block change of animal behavior, we plotted the correct performance rate as a function of trial count from the start of the block: probability to make a correct response at the nth reward-present or reward-absent trial in the block was calculated based on behavioral data from 296 blocks (monkey Z) and 202 blocks (monkey H). A block consisted of at least 60 trials, 30 reward-present trials and 30 reward-absent trials, hence trial count n ranged from 1 to 30. Behavioral performance became clearly worse in the reward-absent condition than in the reward-present condition in the later part of the block in both monkeys (Fig. 2, A and B). We also plotted saccade precision as a function of trial count (Fig. 2, C and D). Saccade precision was determined by the distance in visual angle between target position and saccade end point. Although they were regarded as correct responses, saccades became less precise in the later part of the block when absence of reward was indicated. Other saccade parameters also systematically changed with the reward condition in the later part of the block; saccade latency was shorter (both monkeys), peak velocity was higher (monkey Z), and saccade amplitude was smaller (monkey H) in the reward-present condition than in the reward-absent condition (Table 1).
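The within-block analysis described above, the probability of a correct response at the nth reward-present or reward-absent trial of a block, can be sketched as follows. The function name and the input layout (one list of `(rewarded, correct)` pairs per block, in trial order) are hypothetical, and how repeated error trials enter the trial count n is not specified in the text, so counting every listed trial is an assumption.

```python
from collections import defaultdict

def within_block_correct_rate(blocks, max_n=30):
    """Correct rate at the nth reward-present / reward-absent trial.

    blocks: list of blocks; each block is a list of (rewarded, correct)
    boolean pairs in trial order. Returns {(rewarded, n): rate}.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for block in blocks:
        n_seen = {True: 0, False: 0}  # separate counters per reward condition
        for rewarded, correct in block:
            n_seen[rewarded] += 1
            n = n_seen[rewarded]
            if n > max_n:
                continue
            totals[(rewarded, n)] += 1
            hits[(rewarded, n)] += int(correct)
    return {key: hits[key] / totals[key] for key in totals}

# Two toy blocks: performance drops on the second reward-absent trial.
toy_blocks = [
    [(True, True), (False, True), (True, True), (False, False)],
    [(True, True), (False, True), (True, False), (False, False)],
]
rates = within_block_correct_rate(toy_blocks)
```

Plotting `rates[(True, n)]` and `rates[(False, n)]` against n gives curves of the kind shown in Fig. 2, A and B.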



Fig. 2. Within-block change of animal behavior. A and B: the correct performance rate was plotted as a function of trial count from the start of the block. The correct performance rate at the nth reward-present or reward-absent trial in the block is shown in the reward-present condition (black lines) and in the reward-absent condition (gray lines) separately. In the later part of the block, the correct performance rate decreased in the reward-absent condition in both monkeys. C and D: saccade precision in correct trials was measured by the distance in visual angle between a target position and a saccade endpoint and plotted as a function of trial count. In the reward-absent condition, saccades became less precise in the later part of the block in both monkeys. Conventions are the same as in A and B.

Neuronal database

We recorded from 228 well-isolated LPFC neurons that were related to the 1CR task. Significant color-discriminative activity was found in 25, 14, and 7 neurons during the cue, delay, and saccade periods, respectively (color discriminability was tested by t-test, P < 0.01, using data from trials with the target cue in the cell's preferred position). Color, or luminance, was not commonly coded in 1CR probably because once translated into a reward-predicting signal, the color feature was behaviorally irrelevant. This result is consistent with the view that the LPFC encodes behaviorally relevant information (Rainer et al. 1998; Sakagami and Niki 1994; Watanabe 1986). In the following analysis, we focus on neuronal activity related to position and reward information.

For every recorded neuron, we screened its spatial discriminability with respect to four positions (left, right, top, and bottom) by visual inspection. During the recording session, two diametrically opposite positions were used, including the neuron's preferred position if it had one. We tested the effect of the spatial and reward conditions on the neuronal activity during the cue, delay, and saccade periods by two-way ANOVA (position × reward, P < 0.01; Table 2). Neuronal activity with only a position main effect was called S type and that with only a reward main effect was called R type. If neuronal activity had both position and reward main effects or an interaction between position and reward, it was called SR type. Neurons that had a significant response compared with the baseline activity but no main effect or interaction were called nondifferential type (ND type).
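The classification rule above can be written as a small decision function over the three P values of the two-way ANOVA. This is a sketch: the function name, the argument names, and the `"unclassified"` label for activity that is neither differential nor responsive are hypothetical; the P values themselves would come from whatever ANOVA routine is used.

```python
def classify_activity(p_position, p_reward, p_interaction,
                      responsive, alpha=0.01):
    """Classify activity in one task period as S, R, SR, or ND type.

    p_position, p_reward, p_interaction: P values from a two-way ANOVA
    (position x reward). responsive: whether the activity differs
    significantly from baseline.
    """
    position = p_position < alpha
    reward = p_reward < alpha
    interaction = p_interaction < alpha
    if (position and reward) or interaction:
        return "SR"  # both main effects, or a position x reward interaction
    if position:
        return "S"   # position main effect only
    if reward:
        return "R"   # reward main effect only
    # No main effect or interaction: nondifferential if responsive at all.
    return "ND" if responsive else "unclassified"

labels = [
    classify_activity(0.001, 0.5, 0.3, True),    # position only
    classify_activity(0.4, 0.002, 0.6, True),    # reward only
    classify_activity(0.001, 0.003, 0.7, True),  # both main effects
    classify_activity(0.2, 0.3, 0.004, True),    # interaction only
    classify_activity(0.2, 0.3, 0.4, True),      # responsive, no effect
]
```

Because an interaction alone is sufficient for the SR label, a neuron whose reward dependency appears only at its preferred position is still classified as SR type.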

LPFC activity coding the spatial condition: S type

S-type activity, which changed with the cue position but not with the reward condition, was found in 28.8, 21.2, and 27.6% of responsive neurons in the cue, delay, and saccade periods, respectively.

A typical S-type cell is shown in Fig. 3. The target cue was presented at the top or bottom in red or yellow. In the first block, the red cue indicated the upcoming reward (block red); in the second block, the yellow indicated reward (block yellow). The neuron responded to the cue when it was presented at the top, whereas it was mildly suppressed in response to a cue at the bottom. The activity was affected by neither the color of the cue nor the reward condition as illustrated by the nearly identical histograms for the reward-present condition (red line) and reward-absent condition (gray line).



Fig. 3. A neuron of S type. The raster histograms were aligned with cue onset (left vertical line, "cue") and saccade onset (right vertical line, "sac on"). The horizontal bar indicates cue duration. The rastergram was plotted separately for different cue positions (top and bottom) and for different cue colors (red and yellow). A bull's eye mark represents the reward-present condition and solid line represents the reward-absent condition. The top 2 rows of the rastergram are from "block red" in which the red cue was associated with reward. The bottom 2 rows of the rastergram are from "block yellow," in which the yellow cue was associated with reward. The color of the circles on the right side indicates the color condition in which the rastergram was obtained. For each rastergram, the sequence of trials was from top to bottom. A separate histogram was made for each cue position and each block. A red line represents a histogram in the reward-present condition, and a gray line represents a histogram in the reward-absent condition. The neuron's response was higher for top presentation during the cue period, and it was not influenced by the reward condition.

LPFC activity coding the reward condition: R type

R-type cells changed their activity depending on the reward condition but were unaffected by the cue position. This type of activity was further classified into R+ type (higher activity in the reward-present condition) and R- type (higher activity in the reward-absent condition). A typical R+-type cell is shown in Fig. 4. In the first block, red indicated the upcoming reward and yellow indicated no reward (block red). The target cue was presented randomly on either the left or right. The neuron showed sustained activity during the delay period after a red cue, whereas it was nearly silent after a yellow cue. The color-reward contingency was reversed in the second block, with yellow indicating a reward (block yellow). This neuron showed sustained activity to the previously rewarded red cue for several trials, but soon stopped firing for the red color. The activity to the yellow cue, which was associated with reward, increased, although not so much as in the reward trials of the previous block. This neuron did not significantly change its activity with the position of the target cue. To further confirm the reward effect and to exclude color and position effects, we used other positions, top and bottom, and other colors, green and blue, in the next two blocks. The reward-selective activity was reproduced; sustained delay period activity appeared to the green cue in block green and to the blue cue in block blue independent of the cue position. The extinctive tendency in the neuronal activity was also reproduced; in block blue, the activity to the previously rewarded green cue remained for a couple of trials and the activity to the rewarded blue cue did not reach the same level as in the reward-present trials of block green. In sum, this neuron showed sustained activity in the reward-present condition regardless of the position of the cue. This is in contrast with R--type cells, as demonstrated in Fig. 5. The neuron in Fig. 5 was phasically activated in the reward-absent condition regardless of the color or position of the cue.



Fig. 4. A neuron of R+ type. See Fig. 3 for legend format. This neuron showed sustained activity in the reward-present condition in block red. After the reversal of the color-reward contingency, i.e., in block yellow, this neuron showed sustained activity to the previously rewarded red cue for several trials, but it soon stopped firing for this color, which was no longer associated with reward. When other colors (green and blue) and positions (top and bottom) were used, this neuron consistently showed sustained activity in the reward-present condition, and its activity did not change with the color or position of the target cue.



Fig. 5. A neuron of R- type. See Fig. 3 for legend format. This neuron showed phasic activity in the cue period only in the reward-absent condition, and there was no significant spatial discriminability.

LPFC activity coding the spatial condition and reward condition: SR type

Our main interest in this study was SR-type cells, which showed activity related to both spatial and reward conditions because they were potentially involved in both visuospatial and reward processing. Like R-type cells, there were two kinds of reward dependency among SR-type cells: higher activity in the reward-present condition (SR+) or higher activity in the reward-absent condition (SR-). A typical SR+-type cell is shown in Fig. 6. The neuron was spatially discriminative in that its sustained activity started earlier and was stronger after the right cue than after the left cue. The neuron was reward dependent in that its activity was stronger in the reward-present than reward-absent condition, regardless of the cue color (block green and block blue). Interestingly, the reward dependency was clear for the preferred position (right) but not for the nonpreferred position (left). This phenomenon is illustrated by the clear difference between the reward-present (red) and reward-absent (gray) histograms for the preferred position but not for the nonpreferred position.



Fig. 6. An example of an SR+-type cell. See Fig. 3 for legend format. The cue and delay period activity had a preferred position at the right side. In addition, the activity for cue presentation at the preferred position was even higher in the reward-present condition. The response to the left-side presentation, which was a nonpreferred position, did not change significantly with the reward condition. As a result, the spatial discriminability was clearer in the reward-present condition than in the reward-absent condition.

Figure 7 shows an example of SR--type cells. Let us focus on the phasic activity after the cue. The activity was stronger for the bottom position (spatially discriminative). In addition, it was stronger in the reward-absent condition than in the reward-present condition. Unlike the SR+-type cell shown in Fig. 6, the enhancement effect in the reward-absent condition was present both for the preferred position (bottom) and for the nonpreferred position (top), the latter being even clearer in this particular cell.



Fig. 7. An example of an SR--type cell. See Fig. 3 for legend format. In the reward-present condition, the activity during the cue period was higher for the bottom presentation. When the cue color indicated absence of reward, the activity was generally enhanced irrespective of the cue position. The activity of this neuron in the delay and saccade periods did not have spatial discriminability, but it was higher in the reward-absent condition; hence it was classified as R- type in those periods.

Spatial discriminability in the population of SR+ and SR- types

The difference between SR+- and SR--type activity represented in Figs. 6 and 7 turned out to be a general rule as visualized by the population histograms of mean firing rates (Fig. 8, A, B, D, and E) and those of transmitted information (Fig. 8, C and F). When the cue was at the preferred position (Fig. 8A), the mean SR+-type activity was higher by approximately 45% in the reward-present condition (black) than in the reward-absent condition (gray). Note, however, that the activity started 80 ms after cue onset in both conditions and diverged only at 110 ms after cue onset, depending on the reward condition. In contrast, when the cue was at the nonpreferred position (Fig. 8B), the SR+-type activity did not differ between the reward-present condition (black) and the reward-absent condition (gray). The results suggest that the spatial discriminability of SR+-type neurons was improved in the reward-present condition. To test this more quantitatively, we calculated the transmitted spatial condition information (hereafter denoted simply as Is) of each neuron. Is was calculated based on spike count in a 100-ms time window moved in 10-ms steps and plotted at the center of the window. Figure 8C shows the mean Is of all SR+ neurons. The mean Is in the reward-present condition (black) was nearly twice as large as that in the reward-absent condition (gray) in all three periods. Is based on total spike count in each period was 0.39 bits (reward+) and 0.24 bits (reward-) in the cue period, 0.38 bits (reward+) and 0.20 bits (reward-) in the delay period, and 0.44 bits (reward+) and 0.25 bits (reward-) in the saccade period (cue period: P < 0.01, delay period: P < 0.01, saccade period: P = 0.05; paired t-test). Note that Is based on spike count in each period was smaller than the value achieved by integrating the curves in Fig. 8C because information does not necessarily accumulate over different bins.
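The sliding-window computation of Is can be illustrated with a minimal sketch. It assumes that Is is the mutual information (in bits) between cue position and spike count, estimated directly from the observed frequencies; the function names, the argument layout, and the absence of any bias correction are assumptions made for illustration, not a description of the analysis code used in the study.

```python
from collections import Counter
from math import log2

def transmitted_information(conditions, counts):
    """Plug-in estimate (bits) of the mutual information between
    condition labels (e.g. cue position) and spike counts."""
    n = len(conditions)
    joint = Counter(zip(conditions, counts))   # n(s, r)
    n_s = Counter(conditions)                  # n(s)
    n_r = Counter(counts)                      # n(r)
    info = 0.0
    for (s, r), n_sr in joint.items():
        # p(s, r) * log2( p(s, r) / (p(s) * p(r)) )
        info += (n_sr / n) * log2(n_sr * n / (n_s[s] * n_r[r]))
    return info

def sliding_information(conditions, spike_times, t_start, t_end, width, step):
    """Information in a window of `width` ms moved in `step` ms steps.
    `spike_times` holds one list of spike times (ms) per trial.
    Returns (window_center, information) pairs."""
    result = []
    t = t_start
    while t + width <= t_end:
        # spike count of every trial inside the current window
        counts = [sum(1 for x in trial if t <= x < t + width)
                  for trial in spike_times]
        result.append((t + width / 2, transmitted_information(conditions, counts)))
        t += step
    return result

# Hypothetical toy data: two left-cue trials without spikes,
# two right-cue trials with three spikes each.
conditions = ["L", "L", "R", "R"]
spike_times = [[], [], [120, 150, 180], [110, 130, 170]]
curve = sliding_information(conditions, spike_times, 0, 200, width=100, step=10)
```

Because each window is evaluated separately, the values along such a curve are not additive, which is in line with the remark that information does not necessarily accumulate over bins.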



Fig. 8. Population histograms of mean firing rates and transmitted information of spatial condition (Is). A, B, D, and E: peristimulus and perisaccadic time histograms of SR+ type (A and B) and SR- type (D and E) when the target cue was presented in the preferred position (A and D) and in the nonpreferred position (B and E). Each population was defined independently for the cue, delay, and saccade periods. Numbers of neurons are listed in Table 2. The abscissa is broken at the time when the populations change, and the histogram is aligned on cue onset (left vertical line) and on saccade onset (right vertical line). Black lines represent activity in the reward-present condition and gray lines represent activity in the reward-absent condition. Error bars indicate SE in each bin. C and F: transmitted information of spatial condition (Is) averaged among the population of SR+ type (C) and SR- type (F). Is computed in a 100-ms-wide window moved in 10-ms steps was averaged among the same type of neurons and plotted at the center of the window. The basic layout of the graph is the same as the histograms of mean firing rate except that the ordinate is the Is axis. Error bars indicate SE in each bin.

The results for the SR- type were completely different. The mean SR--type activity was higher in the reward-absent condition (gray) than in the reward-present condition (black). This was true whether the cue was at the preferred position (Fig. 8D) or at the nonpreferred position (Fig. 8E). Again, the SR- activity started at 80 ms after cue onset, but the reward-dependent modulation started at 120 ms. The mean Is was strikingly similar between the reward-present (black) and reward-absent (gray) conditions (Fig. 8F). There was no statistical difference between the conditions in any of the three periods.

Activity related to gaze fixation in SR--type cells

We noticed other features that were different between the SR+ and SR- activities. The SR+ activity was sustained during the delay period and tended to increase toward the saccade period (Fig. 8A), whereas SR--type activity made a phasic peak in the cue period and tended to decrease toward the saccade period (Fig. 8D). SR--type cells defined in the cue period not only responded phasically to the cue presentation but also increased their activity in the precue period to almost the same level (Fig. 8, D and E). In contrast, the activity of SR+-type cells in the precue period stayed low at around 10 Hz (Fig. 8, A and B).

Change of reward effect within a block

We found that the effect of the reward condition on behavioral performance became clearer in the late part of the block (Fig. 2), suggesting that the animal's motivational contrast between the reward-present and reward-absent conditions became clearer in the late part of the block. To examine whether the within-block change in behavior correlates with spatial representation of LPFC neurons, we calculated Is in both the early and late part of the block. Figure 9 demonstrates Is in the reward-present condition (black lines and dots) and in the reward-absent condition (gray lines and dots) in the early (1-20th trial) and late (21-60th trial) part of the block. Is of SR+ activity (Fig. 9A) showed changes similar to those in saccadic behavior (Fig. 2). In the reward-absent condition, the mean Is of SR+ activity decreased significantly from the early part to the late part of the block, both in the cue period (paired t-test, P < 0.05) and the delay period (P < 0.01). On the other hand, the mean Is of SR- activity was independent of the reward condition and showed no significant change between the early and late parts (Fig. 9B). Within-block changes of Is in the saccade period are not shown because reward information was not commonly coded in the saccade period.
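The early-versus-late comparison is a paired design: each neuron contributes one Is value for the early part and one for the late part of the block. A minimal sketch of the paired t statistic follows; the function name and the example values are hypothetical and serve only to show the arithmetic.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(early, late):
    """Paired t statistic and degrees of freedom for per-neuron
    values measured in the early and late part of a block."""
    diffs = [l - e for e, l in zip(early, late)]   # late minus early, per neuron
    n = len(diffs)
    # mean difference divided by its standard error
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1

# Hypothetical Is values (bits) of four neurons, not data from the study.
early = [0.30, 0.25, 0.40, 0.35]
late = [0.20, 0.20, 0.25, 0.30]
t_value, dof = paired_t(early, late)
```

A negative t value here means that Is decreased from the early to the late part. The P value follows from the t distribution with the returned degrees of freedom, for example from a statistical table or from a routine such as `scipy.stats.ttest_rel`.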



Fig. 9. Change of Is in SR-type cells within the block. Mean transmitted information of spatial condition (Is) was computed in the early (1-20th trial) and late (21-60th) part of the block. The black line and dots indicate Is in the reward-present condition, and gray line and dots indicate Is in the reward-absent condition. Within-block change of Is encoded by SR+-type cells (A) was in accordance with behavioral performance (Fig. 2); in the reward-absent condition, Is of SR+-type cells deteriorated in the late part of the block. On the other hand, such tendency was not clear in SR--type cells (B). * and ** indicate statistical significance of 5 and 1%, respectively. Error bars indicate SE.

Recording locations

We recorded from a wide area in the right LPFC of two monkeys (Fig. 10A). Confirmation of recording locations was based on magnetic resonance imaging and electric stimulation of the frontal eye field. Anatomical estimates by these two methods were consistent: the tracks where saccades were elicited by electric stimulation below 50 µA corresponded to the rostral bank of the arcuate sulcus. The distribution of each type of neuron showed some tendency depending on the time period. To show this tendency, the brain maps obtained from the two monkeys were superimposed so that the brain landmarks (principal sulcus and arcuate sulcus) overlapped best. During the cue period (Fig. 10B), S-type activity (blue circle) tended to appear in the ventrocaudal part of LPFC, while R-type activity (red triangle and red inverted triangle) appeared in and ventral to the principal sulcus area. SR-type activity (green triangle and green inverted triangle) was distributed between the zones of the S and R types. During the delay period (Fig. 10C), the number of neurons showing SR-type activity increased, and there was no clear anatomical separation of the different types of activity. During the saccade period (Fig. 10D), the number of neurons showing SR- and R-type activity decreased; S-type activity appeared in the caudal half of LPFC while most of the activity in the rostral half was of the ND type.



Fig. 10. Anatomical location. A: tested tracks. Two right hemispheres from 2 monkeys were superimposed. Each dot indicates 1 penetration. Frontal eye field tracks identified by electric micro-stimulation are marked by ×. Numbers 1-5 indicate the locations of the sample neurons in Figs. 3, 4, 5, 6, and 7, respectively. B-D: distribution of each type of cells in the cue (B), delay (C), and saccade (D) periods. Blue circle, S-type cell; red triangle, R+-type cell; red inverted triangle, R--type cell; green triangle, SR+-type cell; green inverted triangle, SR--type cell; black circle, ND-type cell. When more than one neuron was recorded in one track, the symbols were drawn 0.75 mm apart on the figure scale to avoid complete overlapping. 1SD distribution contours of the S type, the R type, and the SR type are indicated by blue, red, and green ellipses, respectively.
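The legend does not state how the 1SD contours were constructed. One common construction, sketched here purely as an assumption, takes the mean of the recording coordinates as the center and the square roots of the eigenvalues of their 2 × 2 sample covariance matrix as the semi-axes. The function name and the coordinates are hypothetical.

```python
from math import atan2, degrees, sqrt
from statistics import mean

def sd_ellipse(points):
    """1SD ellipse of 2-D points: center, semi-major axis,
    semi-minor axis, and orientation of the major axis (degrees)."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mx, my = mean(xs), mean(ys)
    n = len(points)
    # sample covariance matrix [[sxx, sxy], [sxy, syy]]
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    # closed-form eigenvalues of a symmetric 2 x 2 matrix
    half_trace = (sxx + syy) / 2
    root = sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    major = sqrt(half_trace + root)
    minor = sqrt(max(half_trace - root, 0.0))
    angle = degrees(0.5 * atan2(2 * sxy, sxx - syy))
    return (mx, my), major, minor, angle

# Hypothetical track coordinates (mm) on the superimposed brain map.
center, major, minor, angle = sd_ellipse([(0, 0), (2, 0), (0, 1), (2, 1)])
```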


    DISCUSSION

The present study demonstrated that the monkey LPFC contains distinct subsets of neurons in terms of spatial coding and reward coding during the performance of the 1CR task: S-type cells coded only the spatial condition, R-type cells coded only the reward condition, and SR-type cells coded both of these conditions. This may support the idea that the LPFC receives both spatial and reward information and that their integration takes place in the LPFC to generate behavioral responses adjusted to the expected reward outcome. From this viewpoint, SR-type cells are the postintegration cells, and indeed they showed a correlation with saccade behavior adjusted to the expected reward outcome.

Effects of the reward conditions on behavior

We found that the monkeys changed their behavior depending on the expected reward outcome during the 1CR task; saccades were less precise in the reward-absent condition than in the reward-present condition, and this effect became clearer in the latter part of the block (Fig. 2, Table 1). These results indicate that the level of motivation changed according to the reward condition and the trial count in the block.

The monkeys made more errors during the delay period by breaking the fixation, particularly in the reward-absent condition. These behavioral results suggest that behavioral error was due to the failure of the spatial working memory mechanism, the failure of the eye fixation mechanism, or both; to perform 1CR correctly, mechanisms for spatial working memory and suppression of fixation break might be engaged.

Effects of the reward conditions on LPFC neurons

It has been reported that LPFC neurons show activity that correlates with reward expectation (Leon and Shadlen 1999; Watanabe 1996). The tasks in these studies changed the kinds or magnitude of reward while the animal performed a spatial working memory task. We also found reward-related activity in the LPFC consistent with the previous studies. We further dissociated neurons that process purely reward signals (R type) from neurons that process both reward and spatial signals (SR type). In contrast to the fact that there were many S-type cells in the cue period and their visual onset was sharp (Fig. 3), R-type cells were most common in the delay period and their onset was generally not sharp. In case of SR-type cells, space-coding activity made a steep rise in the histogram followed by reward-differential activity with delay (Fig. 8). This is especially interesting because we presented spatial and reward information simultaneously by the target cue. The different temporal course between spatial signal and reward signal suggests different routes before arriving at the LPFC and the delay of the reward signal may correspond to the time needed to process the color-reward association.

Effects of reward condition on spatial processing in LPFC

Neurophysiological studies reported activity of the LPFC neurons related to visuospatial processing during the visual response period, delay period, and motor response period (Funahashi et al. 1989-1991; Suzuki and Azuma 1983). Especially the delay period activity of LPFC neurons has been emphasized to be a neural correlate of spatial working memory. Consistent with previous studies, we found that activity related to visuospatial processing is common in the LPFC throughout the task period (S and SR types: 43.9, 44.8, and 38.7% of responsive neurons in the cue, delay, and saccade period respectively). Our study reveals that, at least in some subsets of neurons, the activity is not purely cognitive but mixed with reward expectation.

Effects of reward condition on spatial discriminability of SR+ and SR- types

The SR-type activity was classified into two groups, SR+ and SR-, depending on whether the activity was enhanced or depressed by the predicted presence of a reward. Interestingly, they were different in several respects and probably have different, not opposite, functions.

In the SR+ type, the spatial discriminability was higher in the reward-present condition than in the reward-absent condition, and the effect was maintained until the time of saccade execution (Fig. 8C). The enhanced spatial discriminability corresponded to a better performance in the reward-present condition (Table 1). Furthermore, the spatial discriminability of the SR+ type and the monkey's performance co-varied within a block of the 1CR task (Fig. 9A). SR+-type cells, then, may be directly responsible for the adjustment of saccadic behavior to the expected reward outcome.

In contrast, the spatial discriminability of the SR--type activity was the same, on average, between the reward-present and reward-absent conditions, despite the significant difference in the magnitude of the activity (Fig. 8, D-F). Thus SR--type cells do not appear to determine the quality of the saccadic behavior. SR--type cells may represent spatial information unaffected by reward expectation.

Another possibility is that SR--type cells may be related to some other function, such as gaze fixation or suppressing inappropriate eye movements away from fixation. This hypothesis is supported by the behavioral result that in the reward-absent condition the monkeys made more errors by breaking fixation; hence more effort was required for successful central fixation in the reward-absent condition than in the reward-present condition. Further support for a gaze-related function of the SR- type is that these cells were phasically activated by the onset of the fixation point and the target cue when the monkey gazed at the fixation point (Fig. 8, D and E). SR--type activity could serve to strengthen fixation as a reaction against demotivating information carried by the peripheral cue. In this sense, the spatial selectivity of the SR- type would imply that SR--type activity suppresses a saccade to the position in the neuron's response field. This idea is consistent with the long-standing hypothesis that the prefrontal cortex plays a key role in protecting the temporal structure of behavior from interference and distraction and in controlling instinctual or emotional behavior (Fuster 1997; Sakagami et al. 2001).

Bivalency hypothesis on the reward-coding activity in LPFC

The critical functional difference between SR+ and SR- types originates from the direction of the reward effect (R+ or R-). One interpretation is that the bivalent reward outcome (presence or absence of reward) in the 1CR task was represented in distinct populations of LPFC neurons; R+- and SR+-type cells represent positive motivation and R-- and SR--type cells represent negative motivation. Leon and Shadlen reported reward-related enhancement of LPFC neurons restricted to the visual field (Leon and Shadlen 1999), which might correspond to our SR+-type activity. However, they did not find SR--type neurons. The different result on the SR- type might be explained by the bivalence hypothesis: our all-or-none type of reward schedule may cause bivalent motivational states in the monkey, whereas their large-or-small type of reward schedule may cause positive motivation to a large or small degree.

Multi-valent representation of reinforcement was shown not only in the LPFC (Watanabe 1996) but also in other brain areas (Elliot et al. 2000; O'Doherty et al. 2001; Thorpe et al. 1983), and it has been proposed that there are parallel neural systems for the representation of appetitive and aversive reinforcers. The current study suggests that the limbic projection to the LPFC modulates cognitive functions in a valence-specific manner: R+ signals serve to enhance working-memory processes and R- signals suppress inappropriate behavior due to distraction.

Relationship among the subsets of LPFC neurons

The effects of reward condition on behavior imply that spatial signals and reward-predicting signals are integrated prior to motor output. The current results of single-unit recording suggest that such integration takes place in single neurons in the LPFC. S- and R-type cells independently encoded the spatial and reward-predicting signals, whereas SR-type cells encoded both. Furthermore, we found an interesting temporal and anatomical pattern of activity in these cells (Fig. 10). S-type activity occurred in the caudal LPFC close to FEF and stayed there, whereas R-type activity occurred in the rostral LPFC and shifted caudally as if joining S-type activity. Correspondingly, SR-type activity, which occurred in the caudal and ventral LPFC, became more prevalent in the delay period. These data raise the possibility that the spatial signals and reward-predicting signals were integrated in SR-type neurons generating saccadic behavior, which adjusted to the expected reward outcome.

Functional difference between LPFC and basal ganglia

Spatial information and reward information also converge in the basal ganglia (Hikosaka et al. 1989b; Kawagoe et al. 1998). The functional comparison between the LPFC and basal ganglia is important because projection of the LPFC to the dorsal striatum forms a part of the basal ganglia-thalamocortical circuit (Alexander et al. 1986). Comparing the present results with previous studies in our laboratory using a similar 1DR task (Kawagoe et al. 1998), we found that SR-type neurons were more common in the caudate nucleus (about 60% of task-related neurons) than in the LPFC (8-24% of task-related neurons). However, SR-type neurons in the caudate generally increased their activity in the reward-present condition independent of the cue position, so much so that the original spatial discriminability often deteriorated. Therefore, in these neurons, the spatial representation may not be improved when the subject expects the presence of reward. Caudate activity correlated with saccade parameters such as saccade latency and peak velocity rather than with the probability of making an error (Itoh et al. 2000). In contrast, the quality of the spatial code of SR+-type neurons in the LPFC was adjusted by the expected reward outcome. To generalize this comparison, we propose that the prefrontal cortex changes the cognitive aspects of behavior based on expected reward outcome, whereas the basal ganglia change the motor aspects of behavior based on expected reward outcome.


    ACKNOWLEDGMENTS

We thank B. Coe, H. Itoh, and H. Wakabayashi for comments on the manuscript and I. Kanazawa for comments and encouragement.

This work was supported by Core Research for Evolutional Science and Technology of Japan Science and Technology Corporation, Japan Society for the Promotion of Science (JSPS) Research for the Future Program, and JSPS Research Fellowships for Young Scientists.


    FOOTNOTES

Address for reprint requests: M. Sakagami, Brain Science Research Center, Tamagawa University Research Institute, Machida, Tokyo 194-8610, Japan (E-mail: sakagami{at}lab.tamagawa.ac.jp).

Received 8 June 2001; accepted in final form 25 October 2001.


    REFERENCES