Center for the Neural Basis of Cognition, Mellon Institute, Pittsburgh, Pennsylvania 15213-2683
Submitted 10 January 2003; accepted in final form 21 May 2003
| ABSTRACT |
|---|
| INTRODUCTION |
|---|
The transition from evaluating an anticipated reward to initiating action in pursuit of it is thought to depend on structures interposed functionally between the limbic system and motor cortex. The dorsal striatum and the dorsolateral prefrontal cortex are frequently thought of in connection with this mediating role. They differ from limbic structures in that neuronal activity signals both the nature of action the monkey is planning to perform and the value of the expected reward (Hassani et al. 2001; Hikosaka and Watanabe 2000; Hikosaka et al. 1989; Hollerman et al. 1998; Kawagoe et al. 1998; Kobayashi et al. 2002; Lauwereyns et al. 2002a,b; Leon and Shadlen 1999; Takikawa et al. 2002a; Tremblay et al. 1998; Watanabe 1990, 1992, 1996; Watanabe et al. 2002). Thus it makes sense to think of them as at the watershed between limbic evaluative functions and motor output. However, the exact nature of the transitional role played by these areas is not clear. In particular, the functional significance of their reward-related neuronal activity remains uncertain.
On the one hand, it is widely assumed that reward-related activity in the dorsal striatum and dorsolateral prefrontal cortex corresponds to an internal representation of the reinforcement value of the expected reward (Hassani et al. 2001; Hikosaka and Watanabe 2000; Schultz 2000; Watanabe 1998). For example, Hassani et al. (2001) speculate that reward-dependent neuronal signals in the dorsal striatum contribute to "the representation of goals before and during the execution of actions," while Watanabe (1998) proposes that dorsolateral prefrontal neurons with reward-related activity have as their function "monitoring the goal (in this case, reward)" for which the monkey is working. This view is reasonable 1) because goal-directed behavior requires that the value of the outcome associated with an action be explicitly represented at the time when the action is being planned (Dickinson and Balleine 1994) and 2) because there must exist somewhere in the brain neurons whose activity underlies emotional states experienced subjectively (Papez 1937).
On the other hand, reward-related activity in the dorsal striatum and dorsolateral prefrontal cortex might arise from motivational modulation of control signals for motor preparation and motor output. It is a central feature of motivation, considered as a psychological construct, that "the motivational state serves to prime, facilitate, or potentiate a response mechanism that leads to the appetitive or consummatory behavior" (Stellar and Stellar 1985, p. 73). A commonly cited example is the tendency of rats to run faster down an alleyway in pursuit of a more valued reward (Stellar 1982). In a monkey waiting during a delay period to perform an operant response for a known reward, there are at least 4 ways in which the size of the reward might exert motivational control over neuronal activity related to motor planning and motor output. 1) Neurons representing the plan for the instructed response might fire more strongly when a more valued reward is at stake. 2) Neuronal activity sensitive to arousal or responsible for generalized motor readiness might be enhanced on trials involving a more valued reward. 3) Neuronal activity governing overt behaviors that automatically accompany response planning, such as increases of axial tonus, might be enhanced when a more valued reward is at stake. 4) Neurons involved in preparing ingestive movements might be more active before delivery of a more valued reward. These are not mere speculative possibilities. The fact that speed and accuracy are enhanced when a more valued reward is expected (Hassani et al. 2001; Hollerman et al. 1998; Kawagoe et al. 1998; Kobayashi et al. 2002; Lauwereyns et al. 2002a,b; Leon and Shadlen 1999; Takikawa et al. 2002b; Tremblay et al. 1998; Watanabe 1990; Watanabe et al. 2001) indicates, in accordance with scenarios 1 and 2, that the representation of the planned action is enhanced, or generalized readiness to respond is greater, during the delay period leading up to the response. The occurrence, in some contexts, of anticipatory licking (Hassani et al. 2001) indicates, in accordance with scenario 4, that ingestive movements tend to be programmed during the delay period. The distinction between neuronal activity representing the value of an expected reward and neuronal activity reflecting motivational modulation of motor planning and performance has been acknowledged in principle by previous authors. However, little consideration has been given to the question of how to distinguish between them in practice.
As a step toward resolving this issue, we have extended the analysis of reward-related activity beyond the prefrontal cortex (PFC) into adjacent premotor areas involved directly in oculomotor and skeletomotor control. Lesions and inactivation of the frontal eye field (FEF) (Dias and Segraves 1999; Sommer and Tehovnik 1997), supplementary eye field (SEF) (Sommer and Tehovnik 1999), premotor cortex (PM) (Kurata and Hoffman 1994), and supplementary motor area (SMA) (Brinkman 1984) result in impairments of motor control but do not interfere with the evaluation of rewards or with motivation. Thus reward-related activity, if observed in them, could reasonably be ascribed to changes in motor outflow or motor readiness rather than to the internal representation of rewards as goals. The results indicate not only that activity modulated by reward is present in premotor areas but that it is strikingly more prominent in them than in the PFC. This finding raises an important issue with reference to areas other than premotor cortex in which the value of an expected reward influences neuronal activity. Reward-dependent activity in these areas 1) might reflect the motivational modulation of motor signals as it presumably does in the premotor cortex, 2) might represent the value of the expected reward as a basis for emotional experience and goal selection, or 3) might do both. To distinguish among these possibilities will require the use of nonstandard behavioral paradigms in which reward value varies independently of motivation.
| METHODS |
|---|
Four adult male rhesus monkeys were used (Macaca mulatta; laboratory designations N, P, A, and F). Experimental procedures were approved by the Carnegie Mellon University Animal Care and Use Committee and were in compliance with the guidelines set forth in the United States Public Health Service Guide for the Care and Use of Laboratory Animals.
Preparatory surgery
At the outset of the training period, each monkey underwent sterile surgery under general anesthesia maintained with isoflurane inhalation. The top of the skull was exposed, bone screws were inserted around the perimeter of the exposed area, a continuous cap of rapidly hardening acrylic was laid down so as to cover the skull and embed the heads of the screws, a head-restraint bar was embedded in the cap, and scleral search coils were implanted on the eyes, with the leads directed subcutaneously to plugs on the acrylic cap (Robinson 1963). After initial training, recording chambers were implanted into the acrylic. At each selected site, a 2-cm-diameter disk of acrylic and skull was removed. A cylindrical recording chamber was cemented into the hole with its base just above the exposed dural membrane. The medial chambers placed over SEF and SMAr were centered on the midline of the brain approximately 21 mm anterior to the Horsley–Clarke interaural plane. The lateral chambers placed over PFC, FEF, and PM were centered approximately at anterior 23 mm and lateral 23 mm.
Memory-guided saccade task
The aim of this task was to characterize the spatial selectivity of each neuron. The monkeys performed memory-guided saccades (MGS) to 6 targets forming a hexagonal array at an eccentricity of 10° (Fig. 1A). Each trial began with the monkey fixating a central spot. At 500 ms after attainment of fixation, the 6 targets appeared. After an additional 300 ms, a cue was flashed on one of the targets for 250 ms. After a random delay in the range of 500 to 1,000 ms, the fixation spot was extinguished, whereupon the monkey had to make a saccade directly to the previously cued target. Trials involving the 6 targets were interleaved in pseudo-random order. Testing continued until it was possible to identify the target eliciting maximal activity. Subsequent testing in the Variable Reward task involved this target and the one diametrically opposite with respect to the fixation point (Fig. 1A, 1 and 1' or 2 and 2' or 3 and 3').
Variable reward task
The monkeys performed an MGS task in which a cue presented early in the trial predicted a big (0.3 ml) or a small (0.1 ml) juice reward. Essential features of the task are summarized in Fig. 1. Each trial began with the onset of a central fixation spot (Fig. 1B). At a delay of 50 ms after attainment of fixation, the spot was transformed to a cue whose shape and color signified the magnitude of upcoming reward (Fig. 1C). After 400 ms, 2 targets appeared (Fig. 1D) at diametrically opposed locations. A directional cue identical to the fixation cue except in size was then presented for 250 ms in superimposition on one of the targets (Fig. 1E). After a 1,500-ms delay period (Fig. 1F), the fixation spot was extinguished (Fig. 1G), whereupon the monkey was required to make a saccade directly to the previously cued target (Fig. 1H) and to maintain fixation on it for 300–450 ms after saccade completion until delivery of a reward of the predicted magnitude (Fig. 1I). Big and small rewards were unambiguously identifiable because the solenoid valve clicked 3 times (at intervals of 100 ms) in the former case and once in the latter. There were 4 possible conditions representing all possible combinations of reward (big or small) and direction (preferred or antipreferred). The conditions were interleaved in pseudo-random order. To prevent confounding reward selectivity with selectivity for the visual properties of the cues, the cue convention was reversed after each block of 40 successful trials. The collection of data from a given neuron commonly continued until 80 trials had been completed successfully.
Stimuli
The fixation spot was a 0.38° white square presented at the center of the screen. Targets were 0.38° white squares presented 10° from central fixation. The central reward cues, which spanned 0.96°, were an orange + and a green x. The pairing of the cues with large reward and small reward was reversed after each block of 40 successful trials. The directional cue shared all of the properties of the foveal reward cue with the exception that it spanned 1.32°. The background of the display had a luminance of 1.5 cd/m² and CIE x and y chromaticity coefficients of 0.26 and 0.26, respectively. White stimuli had a luminance of 126.5 cd/m² and CIE x and y chromaticity coefficients of 0.28 and 0.32, respectively. Orange stimuli had a luminance of 110.7 cd/m² and CIE x and y chromaticity coefficients of 0.47 and 0.51, respectively. Green stimuli had a luminance of 116.5 cd/m² and CIE x and y chromaticity coefficients of 0.25 and 0.66, respectively.
Single-neuron recording
At the beginning of each day's session, a varnish-coated tungsten microelectrode with an initial impedance of several megohms at 1 kHz (Frederick Haer, Bowdoinham, ME) was advanced vertically through the dura into the immediately underlying cortex. The electrode could be placed reproducibly at points forming a square grid with 1 mm spacing (Crist et al. 1988). The action potentials of a single neuron were isolated from the multineuronal trace by means of an on-line spike-sorting system using a template-matching algorithm (Signal Processing Systems, Prospect, Australia). The spike-sorting system, on detection of an action potential, generated a pulse whose time was stored with 1-ms resolution.
Electromyographic measurements
Adhesive surface electrodes were placed on the shaved skin overlying the right splenius capitis and masseter muscles. The voltage threshold was set as low as possible, subject to the constraint that the voltage did not cross threshold at rest. Muscle activity was stored as time-marked records of threshold crossings. From these, histograms were constructed off-line, representing the mean instantaneous threshold-crossing rate as a function of time during the trial under each of 4 trial conditions defined by size of reward (big or small) and direction of response (preferred = ipsiversive or antipreferred = contraversive as defined relative to the anatomical location of the muscle).
Experimental control and data collection
All aspects of the behavioral experiment, including presentation of stimuli, monitoring of eye movements, monitoring of neuronal activity, and delivery of reward, were under the control of a Pentium-based computer running Cortex software provided by R. Desimone, Laboratory of Neuropsychology, National Institute of Mental Health. Eye position was monitored by means of a scleral search coil system (Riverbend Instruments, Birmingham, AL). The X and Y coordinates of eye position were stored with 4-ms resolution. Stimuli generated by an active matrix LCD projector were rear-projected onto a frontoparallel screen 25 cm from the monkey's eyes. Reward in the form of 0.1 or 0.3 ml of water or juice was delivered through a spigot under control of a solenoid valve.
Analysis of the dependency of behavior on predicted reward
We used paired t-tests to compare, across sessions, the session means of the following measures obtained on big-reward versus small-reward trials: reaction time, error rate, and fixation-break rate. Reaction time was defined as the delay from offset of the fixation spot to the moment when the eye left the central fixation window. Error rate was defined as the number of trials on which a saccade was directed to the wrong target expressed as a percentage of all trials on which a saccade was directed to either target. Fixation-break rate was defined as the percentage of all trials on which the eye left the central fixation window before offset of the fixation spot.
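The paired comparison described above can be sketched as follows. This is a minimal illustration in plain Python, not the authors' analysis code: the function name `paired_t` and the session values are hypothetical, and the sketch returns only the t statistic and degrees of freedom, leaving the P value to a statistics package.

```python
import math
from statistics import mean, stdev

def paired_t(big, small):
    """Return (t, df) for a paired t-test on per-session means.

    big[i] and small[i] are the means of one behavioral measure
    (e.g., reaction time) from the same session i.
    """
    if len(big) != len(small) or len(big) < 2:
        raise ValueError("need two equal-length samples with at least 2 sessions")
    diffs = [b - s for b, s in zip(big, small)]
    n = len(diffs)
    se = stdev(diffs) / math.sqrt(n)  # standard error of the mean difference
    return mean(diffs) / se, n - 1

# Hypothetical session-mean reaction times in ms (not data from the study).
rt_big = [230.0, 228.0, 235.0, 231.0, 233.0]
rt_small = [238.0, 236.0, 240.0, 239.0, 242.0]
t_stat, df = paired_t(rt_big, rt_small)
print(f"t({df}) = {t_stat:.2f}")
```

A negative t here corresponds to shorter reaction times on big-reward trials.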
Analysis of the dependency of firing rate on task factors
We employed 2-factor ANOVAs to analyze the impact of reward size and response direction on the firing rate of each neuron. We independently analyzed data from 7 trial epochs: 1) from reward cue onset to directional cue onset (700 ms), 2) from onset to offset of the directional cue (250 ms), 3) 250 ms beginning with directional cue offset, 4) 250 ms before fixation spot offset, 5) 200 ms before saccade initiation, 6) from saccade onset to 100 ms after saccade completion, and 7) 100 ms before to 100 ms after initiation of reward delivery. In all tests, the criterion for statistical significance was taken as P < 0.05.
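As an illustration of the 2-factor design, the sketch below computes the F statistics for reward size, response direction, and their interaction in a balanced 2 × 2 layout. It is an assumption-laden example rather than the analysis actually run: the function name `two_way_anova`, the dictionary layout, and the firing rates are hypothetical, and P values would still have to be obtained from the F distribution.

```python
from statistics import mean

def two_way_anova(cells):
    """Balanced two-factor ANOVA.

    cells maps (reward, direction) -> list of firing rates (spikes/s),
    with the same number of trials in every cell. Returns the F
    statistics for both main effects and the interaction, plus the
    error degrees of freedom.
    """
    rewards = sorted({r for r, _ in cells})
    directions = sorted({d for _, d in cells})
    n = len(next(iter(cells.values())))  # trials per cell
    if any(len(v) != n for v in cells.values()):
        raise ValueError("design must be balanced")

    grand = mean(x for v in cells.values() for x in v)
    cell_mean = {k: mean(v) for k, v in cells.items()}
    r_mean = {r: mean(cell_mean[(r, d)] for d in directions) for r in rewards}
    d_mean = {d: mean(cell_mean[(r, d)] for r in rewards) for d in directions}

    ss_r = n * len(directions) * sum((r_mean[r] - grand) ** 2 for r in rewards)
    ss_d = n * len(rewards) * sum((d_mean[d] - grand) ** 2 for d in directions)
    ss_int = n * sum(
        (cell_mean[(r, d)] - r_mean[r] - d_mean[d] + grand) ** 2
        for r in rewards for d in directions
    )
    ss_err = sum((x - cell_mean[k]) ** 2 for k, v in cells.items() for x in v)

    df_r, df_d = len(rewards) - 1, len(directions) - 1
    df_err = len(cells) * (n - 1)
    ms_err = ss_err / df_err
    return {
        "F_reward": (ss_r / df_r) / ms_err,
        "F_direction": (ss_d / df_d) / ms_err,
        "F_interaction": (ss_int / (df_r * df_d)) / ms_err,
        "df_error": df_err,
    }

# Hypothetical firing rates for one neuron in one epoch (not study data).
cells = {
    ("big", "pref"): [22.0, 24.0, 26.0],
    ("big", "anti"): [12.0, 14.0, 16.0],
    ("small", "pref"): [16.0, 18.0, 20.0],
    ("small", "anti"): [10.0, 12.0, 14.0],
}
result = two_way_anova(cells)
print(result)
```

In the study this kind of test would be repeated for each neuron and each of the 7 epochs.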
Assessing contribution of behavioral reaction time to reward-related activity
To determine whether neuronal activity continued to depend on reward size when the effects of behavioral reaction time were factored out, we performed a multivariate regression analysis, fitting three models:
$$Y = a_0 + a_1\,\mathrm{RT} \qquad (1)$$
$$Y = a_0 + a_2\,\mathrm{REWARD} \qquad (2)$$
$$Y = a_0 + a_1\,\mathrm{RT} + a_2\,\mathrm{REWARD} \qquad (3)$$
where Y is the firing rate measured from onset of the reward cue to offset of the fixation spot and RT is the behavioral reaction time. The variable REWARD was set to 1 or 0 for trials with large or small rewards, respectively. To determine whether adding the variable REWARD produced a significant improvement in performance, we compared model 3 to model 1. To determine whether adding the variable RT produced a significant improvement, we compared model 3 to model 2. Significance was assessed with an F-test using a significance level of 0.05.
Localization of recording sites
To characterize the location of the recording sites relative to gross anatomical landmarks, we projected the sites onto structural MR images. The images were collected by use of a Bruker 4.7 T magnet in which the anesthetized monkey was supported by an MR-compatible stereotaxic device. Fiducial marks made visible by means of a contrast agent included the centers of the ear bars and selected locations inside the recording chamber. Frontoparallel slices of 2 mm thickness spanning the entire brain were collected. In addition, slices of 2 mm thickness were collected parallel to the cortical surface underlying each lateral chamber. To determine the location of recording sites relative to functional divisions of cortex, we mapped out regions under each chamber from which motoric responses (eye, face, and limb movements) could be elicited at low threshold (<40 µA) by electrical microstimulation (1.65-ms biphasic pulses delivered through the recording microelectrode at a frequency of 300 Hz in trains 200 ms long).
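Returning to the regression analysis described under "Assessing contribution of behavioral reaction time to reward-related activity," the nested-model comparison can be sketched as below. This is a hedged illustration only: the helper names `fit_rss` and `nested_f` and the trial values are hypothetical, ordinary least squares with an intercept is assumed, and the resulting F statistic would still need to be referred to the F distribution to obtain a P value.

```python
def fit_rss(columns, y):
    """Least-squares fit of y on the predictor columns plus an intercept.

    Returns the residual sum of squares.
    """
    X = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    p = len(X[0])
    # Augmented normal equations (X'X | X'y), solved by Gauss-Jordan
    # elimination with partial pivoting.
    A = [[sum(row[j] * row[k] for row in X) for k in range(p)]
         + [sum(row[j] * yi for row, yi in zip(X, y))] for j in range(p)]
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(p):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    beta = [A[j][p] / A[j][j] for j in range(p)]
    return sum((yi - sum(b * x for b, x in zip(beta, row))) ** 2
               for row, yi in zip(X, y))

def nested_f(rss_reduced, rss_full, n, p_full, extra=1):
    """F statistic for adding `extra` predictors to the reduced model."""
    return ((rss_reduced - rss_full) / extra) / (rss_full / (n - p_full))

# Hypothetical trials (not study data): reaction time in ms, reward flag
# (1 = large, 0 = small), firing rate in spikes/s.
rt = [230.0, 235.0, 228.0, 240.0, 238.0, 245.0, 236.0, 242.0]
reward = [1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0]
rate = [21.0, 19.0, 22.0, 18.0, 14.0, 12.0, 15.0, 13.0]

rss_rt = fit_rss([rt], rate)              # model with RT only
rss_reward = fit_rss([reward], rate)      # model with REWARD only
rss_full = fit_rss([rt, reward], rate)    # model with RT and REWARD

n = len(rate)
f_add_reward = nested_f(rss_rt, rss_full, n, p_full=3)   # full vs. RT only
f_add_rt = nested_f(rss_reward, rss_full, n, p_full=3)   # full vs. REWARD only
print(f_add_reward, f_add_rt)
```

Each F statistic has 1 and n − 3 degrees of freedom, because the full model has three fitted coefficients and each comparison adds one predictor.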
| RESULTS |
|---|
After completion of training, during the period when neuronal data were being collected, all monkeys performed stably at a high level, selecting the correct target on >98% of all trials in which one target or the other was selected. Furthermore, all monkeys were sensitive to the cues indicating whether a large or small reward would be delivered, exhibiting greater engagement with the task on big-reward trials. This was evident in 3 behavioral measures computed for every neuronal data collection session. The error rate (percentage of trials when the incorrect target was selected relative to all trials when one target or the other was selected) was lower on big-reward (0.6%) than on small-reward (1.3%) trials (Fig. 2A). This trend was present in data from every monkey, achieving significance (2-tailed paired t-test, P < 0.05) in 3 out of 4 cases (Fig. 2D). Furthermore, it was highly significant in data collapsed across monkeys (P < 0.0001). The average behavioral reaction time was shorter on big-reward (232 ms) than on small-reward (239 ms) trials (Fig. 2B). This trend achieved significance (2-tailed paired t-test, P < 0.05) in every monkey (Fig. 2D). Furthermore, it was highly significant in data collapsed across monkeys (P < 0.0001). To analyze the impact of reward on the frequency of fixation breaks, we first asked how fixation breaks were distributed across time during the trial under both big- and small-reward conditions. The results, shown in Fig. 2C, indicate 1) that fixation breaks were more frequent under small- than under big-reward conditions and 2) that the tendency to break fixation declined over the course of the trial under both conditions. To determine whether the effect was significant, we compared the fixation-break frequencies (number of trials prematurely terminated by cessation of fixation expressed as a percentage of all trials) observed under small- and big-reward conditions in each monkey (Fig. 2D). 
The tendency for fixation breaks to occur more often under small-reward conditions was present and significant in every monkey (2-tailed paired t-test, P < 0.05) and was highly significant in data collapsed across monkeys (P < 0.001).
We recorded from superficial cortex underlying 3 chambers placed over the lateral frontal cortex of the right hemisphere (monkeys P, N, and F) and 3 chambers centered over midline frontal cortex (monkeys A, N, and F). To assign recording sites to specific areas, we noted the location of the chamber relative to gyri and sulci visible in MR images, using slices parallel to the cortical surface in the case of the lateral chambers and coronal slices in the case of midline chambers (Fig. 3A). We also mapped out the cortex under each chamber by means of microstimulation, noting whether movements of the eyes, face, or limbs were elicited at currents <40 µA (Fig. 3, B and C). We assigned recording sites to 6 areas according to the following criteria. Dorsolateral prefrontal cortex (PFC): a region in front of the arcuate sulcus and surrounding and within the principal sulcus in which microstimulation did not elicit movements. Frontal eye field (FEF): a region rostral to and in the anterior bank of the arcuate sulcus in which microstimulation elicited saccades and not movements of the face or limbs. Recording sites in the FEF were all within 4 mm of the cortical surface at locations where microstimulation at the corresponding depth elicited eye movements. Premotor cortex (PM): a region caudal to the arcuate sulcus in which microstimulation elicited face or limb movements and not saccades. Transitional cortex (FEF/PM): a region caudal to the pure saccade zone and rostral to the pure face/limb zone in which electrical stimulation elicited both saccades and movements of the face or limbs. Recording sites in FEF/PM were all within 4 mm of the cortical surface at locations where microstimulation at the corresponding depth elicited eye and face/limb movements. On the grounds of its location behind the arcuate sulcus, this cortex belongs to the premotor area. 
However, we have designated it as an independent zone with the possibility in mind that its distinct traits, as revealed by electrical stimulation, might be accompanied by some differential form of sensitivity to reward-predicting cues. The finding of an oculomotor representation in PM is not without precedent (Fujii et al. 1998, 2000). The supplementary eye field (SEF) is a region located rostral to the genu of the arcuate sulcus and extending 2–5 mm from the hemispheric midline, in which microstimulation elicited saccades. The rostral supplementary motor area (SMAr) is a region immediately caudal to the SEF in which microstimulation elicited movements of the face and limbs.
Selection of neurons for study
Neuronal activity was first monitored in the context of the MGS task with reward size fixed and with targets at 6 locations spaced at 60° intervals around fixation (Fig. 1A: locations 1, 1', 2, 2', 3 and 3'). Any neuron appearing to exhibit task-related activity in the MGS task was selected for study in the Variable Reward task (Fig. 1, B–I). Out of the 6 targets used in the MGS task, 2 targets were selected for use in the Variable Reward task: the target associated with strongest neuronal activity and the target diametrically opposite this one. Thus the pair of targets employed in the Variable Reward task could be either 1 and 1', 2 and 2', or 3 and 3' (Fig. 1A). Neuronal activity was monitored while the monkey performed 20 successful trials under each of 4 conditions (2 directions × 2 reward levels) in the Variable Reward task.
Organization of single-neuron results
Neurons in many frontal areas exhibited reward-related activity. This activity commonly took the form of a main effect (with net firing rate higher or lower on big-reward trials) and less frequently took the form of an interaction effect (with the strength of the directional signal stronger or weaker on large-reward trials). Both forms of effect were present and significant in the neuron of Fig. 4. Its net firing rate was clearly higher when a large reward was expected (top row vs. bottom row). In addition, its directional signal, the difference in firing rate between trials requiring a leftward response (left column) and those requiring a rightward response (right column), was greater under the big-reward condition. For each cortical area, we analyze the nature and rate of incidence of such effects by proceeding through 3 steps. 1) Population histograms. The aim of this step is to indicate qualitatively how the magnitude of expected reward affected the population firing rate and the population directional signal. 2) Individual neurons by epoch. The aim of this step is to indicate whether effects evident at the level of the population were statistically demonstrable at the level of individual neurons. For each of 7 epochs spanning the duration of the trial, we indicate how many neurons showed significant increases or decreases in firing rate on big-reward as compared with small-reward trials, and how many showed significant increases or decreases in the strength of the directional signal. 3) Individual neurons across trial. To complement the statistically insensitive analysis by epoch, we also describe a more robust analysis based on the trial as a whole, indicating in how many neurons the firing rate and directional signal were significantly affected by the size of the expected reward. 
The details of each step of analysis and the conventions for presenting the results are laid out in the course of the next section, on the PFC, which will thus serve as a guide to later sections.
Prefrontal cortex (PFC)
POPULATION. We collected data from 201 neurons in the PFC of 2 monkeys (Table 1). As a basis for qualitative assessment of the effect of anticipated reward on the activity of these neurons, we constructed population curves representing firing rate as a function of time under the 4 trial conditions (Fig. 5A). In this display, thick and thin lines represent population activity on trials requiring responses in the preferred and antipreferred directions, respectively. Neuronal activity was strongly affected by response direction as indicated by the consistent elevation of thick above thin lines. Subtle effects of the size of the predicted reward are manifest as differences in firing rate between trials in which the response direction (indicated by line thickness) was the same but reward level (indicated by color) was different. To characterize the time course of reward-related activity, we computed independently the impact of reward on net firing rate independent of direction and on the directional signal. The effect on net firing rate was represented by an index corresponding to the average amount by which the firing rate increased under the big-reward condition. It was computed as (BP + BA − SP − SA)/2, where BP is the firing rate under the big-reward, preferred-direction condition, SA is the firing rate under the small-reward, antipreferred-direction condition, and so on. Over the course of the trial from onset of fixation to initiation of response, this index was more often positive than negative (Fig. 5B), indicating a subtle tendency for firing to be stronger when a big reward was expected. The effect on the directional signal was represented by an index that corresponded to the average amount by which big-reward caused the firing rate to increase on preferred-direction trials and to decrease on antipreferred-direction trials. It was computed as (BP − BA − SP + SA)/2.
This index was consistently greater than zero during the period between presentation of the directional cue and execution of the saccade (Fig. 5D).
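The two population indices follow directly from their verbal definitions: the net index averages the big-minus-small difference over the two directions, and the directional index takes half the difference between the big-minus-small change on preferred trials and that on antipreferred trials. A minimal sketch, with a hypothetical function name and hypothetical firing rates:

```python
def reward_indices(bp, ba, sp, sa):
    """Return (net, directional) reward indices in spikes/s.

    bp, ba: firing rates on big-reward trials, preferred and
            antipreferred direction.
    sp, sa: firing rates on small-reward trials, preferred and
            antipreferred direction.
    """
    # Average of (bp - sp) and (ba - sa): mean increase under big reward.
    net = (bp + ba - sp - sa) / 2
    # Half of (bp - sp) - (ba - sa): growth of the directional signal.
    directional = (bp - ba - sp + sa) / 2
    return net, directional

# Hypothetical rates for one time bin (not study data).
net, directional = reward_indices(bp=24.0, ba=14.0, sp=18.0, sa=12.0)
print(net, directional)  # 4.0 2.0
```

A positive net index means stronger firing when a big reward is expected; a positive directional index means a larger preferred-versus-antipreferred difference under big reward.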
INDIVIDUAL NEURONS BY EPOCH. To determine whether effects present in the population were also observable at the level of individual neurons, we analyzed data from each neuron during 7 trial epochs (I–VII) defined in METHODS and depicted along the time line at the base of Fig. 5. For each epoch, we carried out an ANOVA with firing rate as the dependent variable and with reward size and response direction as factors. Counts of neurons exhibiting significant main effects of reward size on firing rate are shown in Fig. 5C, where blue (or red) symbols represent the percentage of cases in which firing was increased (or decreased) for big compared with small reward. The counts of significant effects were only slightly elevated above the 5% expected by chance in light of the criterion employed for judging statistical significance (P < 0.05). It is true that during preresponse epochs (I–V), the count of neurons firing more strongly for big reward (blue) was consistently higher than the count of neurons showing the opposite effect (red), but the difference in counts achieved significance only during epoch IV (χ² test, P < 0.05). Counts of neurons exhibiting a significant interaction between reward size and direction are shown in Fig. 5E, where blue (or red) symbols represent the percentage of cases in which the directional signal was stronger (or weaker) for big reward. Note that the counts indicated during epoch I must represent type 1 errors because it was only after this epoch that the directional instruction was delivered. With the epoch I counts as a basis for comparison, it is clear that reward × direction interaction effects were rare or absent in all epochs. There was an apparent slight elevation, during epochs IIIV, in the number of neurons showing enhanced direction selectivity under big reward (blue) relative to those showing reduced direction selectivity (red). In no epoch, however, did this difference achieve significance. Thus analyses based on brief epochs in single neurons lack the sensitivity to detect effects of predicted reward evident in PFC population activity.
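One way to compare the count of neurons firing more strongly for big reward with the count showing the opposite effect is a chi-square goodness-of-fit test against an even split. The text does not specify the exact form of the test, so the sketch below is an assumption; the function name and the counts are hypothetical.

```python
import math

def chi_square_equal_split(n_increase, n_decrease):
    """Chi-square test (1 df) of two counts against a 50/50 split.

    Returns (chi2, p), where p is the upper-tail probability.
    """
    total = n_increase + n_decrease
    if total == 0:
        raise ValueError("at least one neuron must show a significant effect")
    expected = total / 2
    chi2 = ((n_increase - expected) ** 2
            + (n_decrease - expected) ** 2) / expected
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

# Hypothetical counts (not study data): 15 neurons with increased firing
# for big reward, 5 with decreased firing.
chi2, p = chi_square_equal_split(15, 5)
print(f"chi2 = {chi2:.2f}, P = {p:.4f}")
```

With these hypothetical counts the asymmetry would pass the P < 0.05 criterion used throughout.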
INDIVIDUAL NEURONS ACROSS TRIAL. To achieve a more robust statistical measure of the impact of predicted reward size on the neuronal firing rate, we carried out an ANOVA using, as the dependent variable, the mean firing rate across the entire period from onset of the reward cue to execution of the saccade, and employing as factors both reward size and response direction. This analysis revealed a significant main effect of predicted reward size in 15% of PFC neurons. The majority of these neurons fired more strongly under big-reward conditions (Table 1). This effect was not significantly different between monkeys and was significant in the data from the 2 monkeys combined (χ² test, P < 0.01). To assess the impact of reward on the directional signal, we carried out an identical procedure with the exception that the measurement period began with onset of the direction cue. This analysis revealed a significant interaction between the size of the predicted reward and the direction of the response in 5% of PFC neurons. Among neurons exhibiting a significant interaction effect, those carrying stronger directional signals under the big-reward condition outnumbered those carrying a weaker signal (Table 1). This effect was not significantly different between monkeys and was significant in the data from the 2 monkeys combined (χ² test, P < 0.05).
SUMMARY. Among PFC neurons, expectation of a large reward led to a subtle elevation in firing rate and a subtle enhancement of directional signals. These effects were evident both in population activity and in counts of neurons showing statistically significant effects over an epoch spanning the full extent of the trial.
Frontal eye field (FEF)
POPULATION. We collected data from 122 neurons in the FEF of 3 monkeys (Table 1). Curves representing the activity of this population as a function of time during the trial indicate that the level of anticipated reward exerted a moderate effect on neuronal activity (Fig. 6A). The net firing rate was elevated from shortly after presentation of the reward cue until onset of the saccade (Fig. 6B). The directional signal was elevated from presentation of the directional cue until onset of the saccade (Fig. 6D).
INDIVIDUAL NEURONS BY EPOCH. Neurons firing significantly more strongly under the big-reward condition (blue symbols) outnumbered neurons firing more weakly (red symbols) during epochs I–V (Fig. 6C). This trend was significant during epoch III (χ² test, P < 0.05). In few neurons was there a significant interaction between reward size and response direction (Fig. 6E). In no epoch did the difference in counts between neurons showing stronger and weaker direction selectivity under the big reward condition (blue and red symbols in Fig. 6E) achieve significance.
INDIVIDUAL NEURONS ACROSS TRIAL. In 16% of FEF neurons, the net firing rate averaged across the entire trial was significantly dependent on reward size. Of these neurons, a majority fired more strongly under big-reward conditions (Table 1). This effect was not significantly different between monkeys and was significant in the data from the 3 monkeys combined (χ² test, P < 0.0001). Only 2 neurons (2%) exhibited a significant interaction of reward size and directional signal. Both carried stronger direction signals under the big-reward condition (Table 1).
SUMMARY. The size of the anticipated reward exerted a moderate effect on neuronal activity in the FEF. Anticipation of a big reward led to enhanced firing, as revealed both by population measures and by counts of neurons showing significant effects of reward size. It led also to an enhancement of directional activity evident in population measures but too small to emerge from statistical tests on the activity of individual neurons.
Transitional cortex (FEF/PM)
POPULATION. We collected data from 46 neurons in the FEF/PM transition zone of two monkeys (Table 1). Curves representing the activity of this population as a function of time during the trial indicate that the level of anticipated reward exerted a strong effect on neuronal activity (Fig. 7A). The net firing rate was sharply elevated from shortly after presentation of the reward cue until completion of the saccade (Fig. 7B). The directional signal was moderately elevated from presentation of the directional cue until completion of the saccade (Fig. 7D).
INDIVIDUAL NEURONS BY EPOCH. Neurons firing significantly more strongly under the big-reward condition (blue symbols) dramatically outnumbered neurons firing more weakly (red symbols) during epochs I–VI (Fig. 7C). The trend was significant (by a χ² test) during epochs I (P < 0.05), III (P < 0.01), IV (P < 0.001), V (P < 0.01), and VI (P < 0.01). Neurons in which direction selectivity was significantly greater under the big-reward condition markedly outnumbered those in which it was weaker during epochs III–IV (Fig. 7E). The trend was significant (by a χ² test) in epoch III (P < 0.01) and epoch IV (P < 0.05).
INDIVIDUAL NEURONS ACROSS TRIAL. The firing rate averaged across the entire trial was significantly dependent on reward size in 43% of FEF/PM neurons. Of these, a majority fired more strongly under the big-reward condition (Table 1). This effect was not significantly different between monkeys and was significant in the data from the 2 monkeys combined (χ² test, P < 0.0001). Five neurons (11% of the sample) exhibited a significant interaction of reward size and directional signal; 4 of these carried stronger direction signals under the big-reward condition (Table 1).
SUMMARY. Increasing the size of the anticipated reward markedly enhanced the mean firing rate and moderately enhanced the strength of the directional signal in the FEF/PM transition zone. The firing rate increase was evident both in measures of population activity and in counts of neurons showing statistically significant effects. The increase of direction selectivity was evident primarily in population measures.
Premotor cortex (PM)
POPULATION. We collected data from 83 neurons in the PM of 2 monkeys (Table 1). Curves representing the activity of this population as a function of time during the trial indicate that the level of anticipated reward exerted a strong effect on neuronal activity (Fig. 8A). The net firing rate was sharply elevated from shortly after presentation of the reward cue until completion of the saccade (Fig. 8B). The directional signal was moderately elevated from presentation of the directional cue until completion of the saccade (Fig. 8D).
INDIVIDUAL NEURONS BY EPOCH. Neurons firing significantly more strongly under the big-reward condition (blue symbols) dramatically outnumbered those firing more weakly (red symbols) during epochs I–V. The trend was significant (by a χ² test) during epochs I (P << 0.0001), II (P < 0.001), III (P << 0.0001), IV (P < 0.001), and V (P < 0.01). In contrast, few neurons exhibited an interaction between reward size and direction (Fig. 8E). In no epoch was there a significant difference between the counts of neurons exhibiting enhanced direction selectivity when a big reward was expected (blue symbols in Fig. 8E) and those showing the opposite effect (red symbols).
INDIVIDUAL NEURONS ACROSS TRIAL. The firing rate averaged across the entire trial was significantly dependent on reward size in 52% of sampled PM neurons. Of these, a large majority fired more strongly under big-reward conditions (Table 1). This effect was not significantly different between monkeys and was highly significant in the data from the 2 monkeys combined (χ² test, P << 0.0001). Out of 6 neurons exhibiting a significant interaction of reward size and direction (Table 1), 3 showed stronger and 3 weaker directional signals under the big-reward condition.
SUMMARY. Increasing the size of the anticipated reward strongly enhanced the mean firing rate of neurons in PM. This was evident in the population data and in counts of neurons showing significant effects. Expectation of a large reward led to only a weak enhancement of the strength of the directional signal.
Supplementary eye field (SEF)
POPULATION. We collected data from 164 neurons in the SEF of 2 monkeys (Table 1). Curves representing the activity of this population as a function of time during the trial indicate that the level of anticipated reward exerted only a very small effect on neuronal activity (Fig. 9A). The net firing rate tended to be lower under the big-reward conditions (Fig. 9B), whereas the directional signal tended to be stronger (Fig. 9D).
INDIVIDUAL NEURONS BY EPOCH. The number of neurons exhibiting significant effects of reward size on firing rate (Fig. 9C) was very small. In no epoch was there a significant tendency for neurons firing significantly more under the big-reward conditions to predominate over those firing less or vice versa. During epoch II, neurons in which direction selectivity was significantly stronger under the big-reward condition significantly outnumbered those in which it was weaker (χ² test, P < 0.05).
INDIVIDUAL NEURONS ACROSS TRIAL. The firing rate averaged across the entire trial was significantly dependent on reward size in 11% of sampled neurons. Among these, 8 fired more under the big-reward condition and 10 fired less. Of two neurons (1% of the sample) exhibiting an interaction between reward and direction, one showed enhanced and the other reduced direction selectivity under big reward.
SUMMARY. Increasing the size of the anticipated reward exerted a very subtle effect on neuronal activity in the SEF. This took the form of a slight reduction of firing rate and a slight enhancement of the directional signal clearly evident only at the population level.
Rostral supplementary motor area (SMAr)
POPULATION. We collected data from 88 neurons in the SMAr of one monkey (Table 1). Curves representing the activity of this population as a function of time during the trial indicate that the level of anticipated reward exerted marked effects on neuronal activity (Fig. 10A). There was a substantial increase of firing rate from shortly after the reward signal until execution of the saccade on big-reward trials (Fig. 10B). Furthermore, under big-reward conditions, the directional signal was moderately stronger (Fig. 10D).
INDIVIDUAL NEURONS BY EPOCH. Neurons firing significantly more strongly under the big-reward condition (blue symbols) outnumbered those firing more weakly (red symbols) during epochs I–V (Fig. 10C). This trend achieved significance during epoch IV (χ² test, P < 0.05). During no epoch was there a significant difference between the counts of neurons in which directional signals were enhanced or reduced under the big-reward condition (Fig. 10E).
INDIVIDUAL NEURONS ACROSS TRIAL. The firing rate averaged across the entire trial was significantly dependent on reward size in 32% of neurons. Among these, 24 fired more strongly under the big-reward condition and 4 fired more weakly. The difference in these counts was significant (χ² test, P < 0.001). Of 5 neurons exhibiting an interaction between reward and direction, 3 showed enhanced direction selectivity under the big-reward condition and 2 showed reduced selectivity.
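The count comparison reported above (24 neurons firing more strongly vs. 4 firing more weakly) can be illustrated with a short calculation. The following is a minimal sketch, not the authors' analysis code: it assumes a one-degree-of-freedom χ² goodness-of-fit test against an equal split and no continuity correction, neither of which is specified in the text. The function name `chi_square_equal_split` is hypothetical.

```python
import math


def chi_square_equal_split(count_a: int, count_b: int) -> tuple[float, float]:
    """Chi-square goodness-of-fit test of two counts against a 50/50 split.

    Assumption (not stated in the paper): expected counts are equal and no
    continuity correction is applied.
    """
    total = count_a + count_b
    expected = total / 2.0
    statistic = ((count_a - expected) ** 2 + (count_b - expected) ** 2) / expected
    # With 1 degree of freedom, the chi-square survival function reduces to
    # erfc(sqrt(x / 2)), so no external statistics library is needed.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Counts taken from the text: 24 neurons fired more, 4 fired less.
stat, p = chi_square_equal_split(24, 4)
print(f"chi2 = {stat:.2f}, p = {p:.5f}")
```

Under these assumptions the statistic is 100/7 (about 14.3) and the P value falls below 0.001, in line with the reported significance level.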
SUMMARY. Increasing the size of the anticipated reward induced a strong enhancement of firing rate among SMAr neurons. This was evident in both population measures and counts of neurons showing significant effects of reward size. A weak enhancement of directional signals was evident only at the population level.
Comparison among areas
To determine whether the impact of anticipated reward varied systematically across areas, we carried out area-to-area comparisons based on counts of neurons in which firing depended significantly on reward, direction, and a reward × direction interaction (Table 1; Fig. 11). Several systematic interareal differences were clearly evident on comparison.
REWARD ENHANCEMENT. Neurons firing significantly more strongly under the big-reward condition became steadily more frequent with progress in a posteriorward direction across the cortical surface (Fig. 11A, blue bars). This was true in both the lateral frontal lobe (PFC, FEF, FEF/PM, and PM) and the medial frontal lobe (SEF and SMAr). Most pairwise interareal differences in the frequency of neurons exhibiting stronger firing on big-reward trials were highly significant (Table 2). Even among neurons selected on the basis of firing significantly more strongly under the big-reward condition, the strength of the signal tended to increase in a posteriorward direction (4.5, 5.1, 5.3, and 5.2 spikes/s in PFC, FEF, FEF/PM, and PM, respectively, and 3.6 and 4.3 spikes/s in SEF and SMAr, respectively).
REWARD SUPPRESSION. Neurons firing significantly more strongly under the small-reward condition were observed infrequently and in a pattern that did not seem to reflect systematic differences between areas (Fig. 11A, red bars). Only one pairwise interareal comparison (FEF vs. SEF) revealed a statistically significant difference in the frequency with which such neurons were observed (χ² test, P < 0.05).
DIRECTION SELECTIVITY. Neurons exhibiting a statistically significant dependency of firing rate on response direction (blue and red bars in Fig. 11B) were more numerous in the eye fields (FEF and SEF) than in other areas. The frequency of such neurons (as measured by a χ² test) was significantly higher in the FEF than in the PFC (P < 0.01) and PM (P < 0.05) and was significantly higher in the SEF than in the PFC (P < 0.0001), PM (P << 0.0001), and SMAr (P < 0.05). The distribution across areas of direction-selective neurons was highly significantly different from the distribution across areas of neurons firing more strongly under big-reward conditions (χ² test, P << 0.0001).
IMPACT OF REWARD ON DIRECTIONAL SIGNALS. Neurons exhibiting a statistically significant dependency of firing rate on the interaction between reward size and response direction were rare in all areas (Fig. 11C). The frequency with which these neurons were observed in the total sample (0.043) was in fact no greater than the frequency of type 1 errors expected by chance (0.05). Thus statistical tests carried out on single neuron data do not afford independent support for the observation based on the population histograms (Figs. 5, 6, 7, 8, 9, 10) that the directional signal tended to be stronger under big-reward conditions and that the strength of this effect varied to some degree across areas.
Location of reward sites relative to gross morphological landmarks
To analyze the fine distribution of neurons exhibiting reward effects, we projected recording sites onto the cortical surface. For 3 lateral chambers (in monkeys P, N, and F), we collected MR images parallel to the cortical surface. Using a set of slices that contained both the cortex and fiducial markers at known locations relative to the recording grid, we determined the locations of recording sites relative to the arcuate (AS) and principal (PS) sulci. The results are shown in Fig. 12, A–C. In this figure, the size of each symbol indicates the proportion of neurons at the corresponding site in which there was a significant enhancement of firing rate on big-reward trials. The general tendency for reward enhancement to occur at posterior sites, and, in particular, at sites behind the arcuate sulcus, is clear. In the vicinity of the FEF and FEF/PM, neurons represented by a single symbol occupied depths ranging from 0 to 4 mm (mean = 2.23 mm; median = 2.00 mm). However, there was no consistent trend with respect to depth. Neurons showing a significant effect of reward size were observed throughout the range of recording depths (mean = 2.34 mm; median = 2.25 mm). For 3 midline chambers (in monkeys A, N, and F), we collected frontoparallel MR images showing the cortex and fiducial markers at known locations relative to the recording grid. These images were used to determine the locations of recording sites relative to the interhemispheric midline and the frontal plane containing the genu of the arcuate sulcus (AS genu). The results are shown in Fig. 12, D–F. The general tendency for reward enhancement to occur at relatively posterior sites (in monkey F) as contrasted to relatively anterior sites (in monkeys A and N) is clear.
Impact on reward-related activity of reversing the cue-reward associations
At the end of every 40 successful trials in the Variable Reward task, the cue previously associated with big reward became associated with small reward and vice versa. Consequently, in each 80-trial data collection session, there was one block conforming to each cue convention. This manipulation possessed the virtue of allowing us to consider the influence of reward size on neuronal activity independently of any selectivity neurons may have possessed for the visual attributes of the stimuli. However, it may have resulted in an attenuation of reward-related activity. This would be true if it took monkeys many trials to adjust their expectations after each switch. We addressed this concern by asking how long it took monkeys to adjust to the new cue-reward contingencies after a switch.
To do so, we assessed, as a function of trial number relative to the time of the switch, the effect of predicted reward on behavioral reaction time. To achieve adequate power in this analysis, we combined reaction time measures across all data collection sessions in all monkeys, considering only those trials that the monkey completed successfully, receiving a reward and thus acquiring information about the current relation between the cues and the reward sizes. The results are shown in Fig. 13A. This plots, as a function of trial number, the reaction time to the preswitch big-reward cue minus the reaction time to the preswitch small-reward cue. The values were negative before the switch because the monkeys responded more swiftly on trials when big reward was predicted. The values became positive after the switch for the same reason. The shift to reliable positivity was established by the third postswitch trial. Thus on average, it took the monkeys only 2 trials to register in their performance the switched cue-reward contingencies.
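The switch-aligned measure described above can be sketched as follows. This is a hedged illustration only: the tuple layout, the cue labels `"pre_big"` and `"pre_small"`, the function name `rt_difference_by_offset`, and the toy reaction times are all assumptions introduced here and are not data or code from the study.

```python
from collections import defaultdict
from statistics import mean


def rt_difference_by_offset(trials):
    """Mean RT to the preswitch big-reward cue minus mean RT to the
    preswitch small-reward cue, per trial offset relative to the switch.

    trials: iterable of (offset, cue, rt_ms), where cue is "pre_big" or
    "pre_small" (hypothetical labels). Offsets lacking either cue are skipped.
    """
    by_offset = defaultdict(lambda: {"pre_big": [], "pre_small": []})
    for offset, cue, rt_ms in trials:
        by_offset[offset][cue].append(rt_ms)

    differences = {}
    for offset in sorted(by_offset):
        groups = by_offset[offset]
        if groups["pre_big"] and groups["pre_small"]:
            differences[offset] = mean(groups["pre_big"]) - mean(groups["pre_small"])
    return differences


# Invented toy values, pooled across sessions, for illustration only.
toy_trials = [
    (-1, "pre_big", 200.0), (-1, "pre_small", 220.0),  # before the switch
    (3, "pre_big", 225.0), (3, "pre_small", 205.0),    # after the switch
]
print(rt_difference_by_offset(toy_trials))
```

With values of this kind, the difference is negative before the switch (faster responses to the cue that then predicts big reward) and positive afterward, which is the sign change the analysis looks for.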
We also assessed how quickly neuronal reward-related activity was reestablished after the switch. To do so, we combined data from all neurons with a significant tendency to fire more strongly during big-reward trials, collapsing the data across areas and monkeys. We restricted consideration to trials that the monkey completed successfully, receiving reward and thus receiving information about the cue-reward contingencies. The results are shown in Fig. 13B. This plots, as a function of trial number, the mean firing rate on trials involving the preswitch big-reward cue minus the mean firing rate on trials involving the preswitch small-reward cue. The values were positive before the switch because the neurons fired more strongly on big-reward trials and negative after the switch for the same reason. The shift to reliable negativity was established by the third trial after the switch.
It is conceivable that neuronal activity after the switch might have adjusted at different rates in different areas. To assess this possibility, we analyzed data from each area separately. To compensate for the smaller sample sizes, we carried out a coarser analysis, analyzing firing rates on blocks of 4 consecutive correct trials, with blocks demarcated so that the time of the switch fell at a between-block boundary. The results are shown in Fig. 13, C–G. These plots indicate that neuronal activity adjusted quickly to the new rule regardless of area. Furthermore, insofar as there were interareal differences in the rate of adjustment, these could not explain interareal differences in the frequency of reward-related activity. On the contrary, area PM, in which neuronal activity was maximally affected by reward (Fig. 11A), was apparently the slowest to adjust to the switch (Fig. 13F). We conclude that the rule-switching design did not lead to a major attenuation of reward-related neuronal activity.
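One way to form blocks of 4 consecutive correct trials so that the switch falls on a block boundary is sketched below. This is an assumption-laden illustration rather than the authors' procedure: it takes offset 0 to be the first correct trial after the switch, and the names `block_index` and `mean_by_block` are hypothetical.

```python
from collections import defaultdict
from statistics import mean

BLOCK_SIZE = 4  # consecutive correct trials per block, as described in the text


def block_index(offset: int) -> int:
    """Map a correct-trial offset to a block index.

    Assumed convention: offset 0 is the first correct trial after the switch,
    so offsets -4..-1 form block -1 and offsets 0..3 form block 0. Floor
    division keeps the switch exactly on the boundary between the two.
    """
    return offset // BLOCK_SIZE


def mean_by_block(samples):
    """Average a per-trial measure (e.g., a firing-rate difference) per block.

    samples: iterable of (offset, value) pairs for correct trials.
    """
    grouped = defaultdict(list)
    for offset, value in samples:
        grouped[block_index(offset)].append(value)
    return {block: mean(values) for block, values in sorted(grouped.items())}


# Invented toy values for illustration only.
toy_samples = [(-2, 4.0), (-1, 6.0), (0, -1.0), (1, -3.0)]
print(mean_by_block(toy_samples))
```

Floor division is used instead of truncation so that negative offsets are grouped symmetrically with positive ones; truncating toward zero would merge trials on both sides of the switch into a single block.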
Relation of reward-related activity to other functional properties of neurons
It might be the case that reward-related activity is correlated with the presence of certain other functional properties in neurons, for example, visual responsiveness, delay period activity, or perisaccadic firing. To assess this possibility, we compared the patterns of task-related activity of neurons exhibiting a significant increase in firing rate under the big-reward condition (Table 1, B > S) and of all other neurons. The classification of neurons was effective as reflected by the fact that neurons in the reward-sensitive category exhibited dramatically enhanced firing on big-reward trials (Fig. 14A, blue vs. red curves), whereas neurons in the reward-insensitive category (Fig. 14B) did not. To compare the functional properties of neurons in <