1Center for the Neural Basis of Cognition, Mellon Institute; and 2Department of Neuroscience, University of Pittsburgh, Pittsburgh, Pennsylvania
Submitted 12 April 2005; accepted in final form 11 June 2005
ABSTRACT
INTRODUCTION
In light of these observations, it is reasonable to speculate that the delay expected to elapse before delivery of a reward might also affect neuronal activity in OF. Across a wide range of species, extending from pigeons to humans, value judgments are subject to time discounting. A reward of a given size is perceived as having greater or lesser value according to whether delivery is anticipated after a shorter or longer delay (Cardinal et al. 2001; Evenden and Ryan 1996; Herrnstein 1961; Ho et al. 1999; Lowenstein 1992; Mobini et al. 2002; Montague and Berns 2002; Thaler 1981). That monkeys engage in time discounting was demonstrated in a recent study from our laboratory (Roesch and Olson 2005). The monkeys in this study performed a variable-delay version of the memory-guided saccade task. A cue presented early in each trial indicated whether the delay intervening before the monkey could make a saccade and receive a reward would be long (2,500 ms) or short (500 ms). The essential behavioral finding was that monkeys were more motivated when working for a reward at short delay, as indicated by a reduction in the frequency with which they aborted trials by breaking fixation. This indicates that they placed higher value on the reward when it was expected sooner. That monkeys engage in time discounting is also suggested by the fact that their performance improves as they progress through multiple trials in anticipation of receiving a reward at the end of the sequence (Liu et al. 2004; Shidara and Richmond 2002, 2004).
No previous experiment has posed the question whether neuronal activity in OF represents the time-discounted value of an expected reward. However, results obtained in other cortical areas suggest that it might. Neuronal activity in dorsolateral prefrontal cortex, the frontal and supplementary eye fields, premotor cortex, and the supplementary motor area, when monitored in the context of the variable-delay task, was found to depend on the length of the anticipated delay (Roesch and Olson 2005). Firing tended to be stronger in anticipation of a short delay just as it tended to be stronger in anticipation of a large reward; moreover, the tendency for a neuron to fire more strongly in anticipation of a short delay was positively correlated with its tendency to fire more strongly in anticipation of a large reward. Delay-dependent activity in these areas may simply have reflected motivational modulation of the monkey's state of motor preparation and need not have represented the time-discounted value of the reward (Roesch and Olson 2003, 2004). Nevertheless, the fact that neuronal activity increased in anticipation of a short delay, like the observation that monkeys performed better in anticipation of a short delay, encourages the view that neural representations of value somewhere in the brain (including, perhaps, OF) were enhanced by the prediction of a short delay.
METHODS
Two adult male rhesus monkeys were used (Macaca mulatta; laboratory designations P and F). Experimental procedures were approved by the Carnegie Mellon University Animal Care and Use Committee and were in compliance with the guidelines set forth in the United States Public Health Service Guide for the Care and Use of Laboratory Animals.
Preparatory surgery
At the outset of the training period, each monkey underwent sterile surgery under general anesthesia maintained with isoflurane inhalation. The top of the skull was exposed, bone screws were inserted around the perimeter of the exposed area, a continuous cap of rapidly hardening acrylic was laid down so as to cover the skull and embed the heads of the screws, a head-restraint bar was embedded in the cap, and scleral search coils were implanted on the eyes, with the leads directed subcutaneously to plugs on the acrylic cap (Robinson 1963). After initial training, recording chambers were implanted into the acrylic. For this purpose, a 2-cm-diameter disk of acrylic and skull overlying the left hemisphere was removed. A cylindrical recording chamber was cemented into the hole with its base flush to the exposed dural membrane. The chamber was centered at approximately anterior 23 mm and lateral 23 mm with respect to the Horsley-Clarke reference frame.
Variable-delay task
The monkeys performed a memory-guided saccade task in which a cue presented early in each trial predicted a short (500-ms) or a long (2,500-ms) delay period. Essential features of the task are summarized in Fig. 1A. Each trial began with onset of a central fixation spot. At a point in time 50 ms after attainment of fixation, the spot was transformed to a cue the shape and color of which signified the length of the upcoming delay period. After 400 ms two potential targets appeared at diametrically opposed locations to the right and left of fixation. A directional cue identical to the fixation cue except in size was then presented for 250 ms in superimposition on one of the targets. After a 500-ms (or 2,500-ms) delay period, the fixation spot was extinguished, whereupon the monkey was required to make a saccade directly to the previously cued target and to maintain fixation on it for 300-450 ms after saccade completion, at which time a juice reward was delivered. There were four possible conditions representing all possible combinations of delay length (short or long) and direction (right or left). The targets were always placed at standard locations directly to the right and left of fixation because neurons in OF are not reported to possess well-localized response fields and, indeed, only rarely exhibit selectivity for response direction (Wallis and Miller 2003). The conditions were interleaved in pseudorandom order according to the rule that one trial conforming to each condition had to be completed successfully before initiation of the next block of four trials. To prevent confounding activity related to delay length with selectivity for the visual properties of the cues, the cue convention was reversed after each block of 40 successful trials. The collection of data from a given neuron commonly continued until 80 trials had been completed successfully. Further details of the task and stimuli are described in a previous publication (Roesch and Olson 2005).
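The interleaving rule and the cue reversal described above can be illustrated with a short sketch. It is not the Cortex code used in the experiment. It models only successfully completed trials, and the function name `schedule_trials`, the seed, and the 0/1 coding of the cue convention are assumptions introduced for illustration.

```python
import random

# The four conditions: every combination of delay length and response direction.
CONDITIONS = [(delay, direction)
              for delay in ("short", "long")
              for direction in ("left", "right")]


def schedule_trials(n_trials=80, reversal_every=40, seed=0):
    """Return (delay, direction, cue_convention) for each successful trial.

    Hypothetical helper: each block of four trials contains one trial per
    condition in shuffled order, and the cue convention (coded 0 or 1)
    flips after every `reversal_every` successful trials.
    """
    rng = random.Random(seed)
    trials = []
    while len(trials) < n_trials:
        block = CONDITIONS[:]
        rng.shuffle(block)  # pseudorandom order within the block of four
        for delay, direction in block:
            convention = (len(trials) // reversal_every) % 2
            trials.append((delay, direction, convention))
    return trials[:n_trials]


trials = schedule_trials()
```

With the default arguments, the first 40 entries carry one cue convention and the last 40 carry the other, matching the 80-trial sessions described in the text. Aborted trials, which would be repeated in the real task, are not modeled.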
To determine whether neurons sensitive to variable delay were also sensitive to variable reward size, we had the monkeys perform the variable-reward task. Task order was random across sessions, but alternated within a recording session. Essential features of the task are summarized in Fig. 1B. It was similar to the variable-delay task except that 1) the delay was fixed at 1,500 ms and 2) the cue at the beginning of the trial predicted a big (0.3 ml) or small (0.1 ml) juice reward. Further details of the task and stimuli are described in a previous publication (Roesch and Olson 2003).
Single-neuron recording
At the beginning of each day's session, a vertically oriented transdural guide tube was advanced to a depth such that its tip was approximately 1 cm above OF. A varnish-coated tungsten microelectrode with an initial impedance of several megohms at 1 kHz (FHC, Bowdoinham, ME) was then advanced through the guide tube into the brain. The guide tube could be placed reproducibly at anterior-posterior and mediolateral points forming a square grid with 1-mm spacing (Crist et al. 1988). The action potentials of a single neuron were isolated from the multineuronal trace by means of an on-line spike-sorting system using a template-matching algorithm (Signal Processing Systems, Prospect, Australia). The spike-sorting system, on detection of an action potential, generated a pulse the time of which was stored with 1-ms resolution.
Experimental control and data collection
All aspects of the behavioral experiment, including presentation of stimuli, monitoring of eye movements, monitoring of neuronal activity, and delivery of reward, were under the control of a Pentium-based computer running Cortex software provided by R. Desimone, Laboratory of Neuropsychology, National Institute of Mental Health. Eye position was monitored by means of a scleral search coil system (Riverbend Instruments, Birmingham, AL). The X- and Y-coordinates of eye position were stored with 4-ms resolution. Stimuli generated by an active matrix LCD projector were rear-projected onto a frontoparallel screen 25 cm from the monkey's eyes.
Analysis of the dependence of behavior on delay length and reward size
We used paired t-tests to compare, across sessions, the session means of the following measures obtained on short- versus long-delay trials (or big- vs. small-reward trials): reaction time, error rate, and fixation-break rate. Reaction time was defined as the delay from offset of the fixation spot to the moment when the eye left the central fixation window. Error rate was defined as the number of trials on which a saccade was directed to the wrong target expressed as a percentage of all trials on which a saccade was directed to either target. Fixation-break rate was defined as the percentage of all trials on which the eye left the central fixation window before offset of the fixation spot. In all tests, the criterion for statistical significance was taken as P < 0.05. In all tests, the distribution of the pairwise differences did not deviate from normality.
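As a minimal sketch of the behavioral measures and the paired comparison defined above, the following computes the two percentage measures and a paired t statistic from session means. The function names and the example reaction times are hypothetical and are not data from the study. The P value is left out because it would require the t distribution, which is not in the standard library.

```python
import math


def error_rate(n_wrong_target, n_saccades_to_either_target):
    """Wrong-target saccades as a percentage of all target-directed saccades."""
    return 100.0 * n_wrong_target / n_saccades_to_either_target


def fixation_break_rate(n_breaks, n_trials):
    """Trials with a premature fixation break as a percentage of all trials."""
    return 100.0 * n_breaks / n_trials


def paired_t(x, y):
    """Paired t statistic and degrees of freedom for matched session means."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)  # sample variance
    return mean / math.sqrt(var / n), n - 1


# Hypothetical session-mean reaction times (ms), one pair per session.
short_rt = [210.0, 205.0, 220.0, 215.0]
long_rt = [220.0, 218.0, 226.0, 230.0]
t_stat, df = paired_t(short_rt, long_rt)
```

A negative `t_stat` here indicates shorter reaction times on short-delay trials in the made-up example.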
Analysis of the dependence of firing rate on task factors
We used two-factor ANOVAs to analyze the impact of delay length (or reward size) and response direction on the firing rate of each neuron. We independently analyzed data from seven trial epochs: 1) from delay cue onset to directional cue onset (700 ms), 2) from onset to offset of the directional cue (250 ms), 3) 250 ms beginning with directional cue offset, 4) 250 ms before fixation spot offset, 5) 200 ms before saccade initiation, 6) from saccade onset to 100 ms after saccade completion, and 7) 100 ms before to 100 ms after initiation of reward delivery. These correspond to the boxes labeled I-VII at the base of Fig. 5. In all tests, the criterion for statistical significance was taken as P < 0.05.
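The per-epoch firing rates that enter these ANOVAs can be obtained by counting spikes inside each epoch window. The sketch below assumes spike times in milliseconds aligned to directional cue onset. The function name and the spike train are hypothetical, and treating the window as half-open is a choice made here, not one stated in the text.

```python
def firing_rate(spike_times_ms, start_ms, end_ms):
    """Mean firing rate (spikes/s) in the half-open window [start_ms, end_ms)."""
    n_spikes = sum(1 for t in spike_times_ms if start_ms <= t < end_ms)
    return 1000.0 * n_spikes / (end_ms - start_ms)


# Hypothetical spike times (ms) relative to directional cue onset.
spikes = [-120, 10, 40, 95, 180, 240, 260, 400]

rate_epoch_2 = firing_rate(spikes, 0, 250)    # directional cue on (250 ms)
rate_epoch_3 = firing_rate(spikes, 250, 500)  # 250 ms after cue offset
```

Epochs aligned to other events (fixation spot offset, saccade onset, reward delivery) would use spike times re-aligned to those events.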
To characterize the location of the recording sites relative to gross anatomical landmarks, we projected the sites onto structural magnetic resonance (MR) images. The images were collected by use of a Bruker 4.7-T magnet in which the anesthetized monkey was supported by an MR-compatible stereotaxic device. Frontoparallel slices of 2 mm thickness spanning the entire brain were collected. Projection of recording sites onto the MR images was accomplished by reference to the image of an electrode inserted into the brain near the center of the recording zone and at known coordinates relative to the recording grid.
RESULTS
We recorded neuronal activity in OF during performance of both the variable-delay task (Fig. 1A) and the variable-reward task (Fig. 1B). All recording sites in each monkey were within 2 mm of the point indicated in the corresponding frontoparallel image in Fig. 2. The indicated zone corresponds closely to a region shown in previous studies to contain neurons sensitive to the value of a predicted reward (Roesch and Olson 2004; Rolls 2000; Thorpe et al. 1983; Tremblay et al. 1999). Neurons were selected for study if, on preliminary testing, they seemed to exhibit any form of phasic activity in conjunction with task performance. In the context of the variable-delay task, we recorded from 154 neurons (83 in monkey F and 71 in monkey P). In the context of the variable-reward task, we recorded from 152 neurons (83 in monkey F and 69 in monkey P). Whenever possible (148 cases) the same neuron was studied in both tasks and is represented in both databases.
LENGTH OF ANTICIPATED DELAY. As an index of the impact of anticipated delay on motivation, we measured the tendency for the monkeys to abort a trial by prematurely breaking fixation. The results indicate 1) that the tendency to break fixation declined over the course of the trial, as if the monkeys became more invested in completing the trial as it proceeded; 2) that the tendency to break fixation was reduced on short- compared with long-delay trials; and 3) that this effect was particularly strong during the first second of the trial, before the variable segment of the delay period, when the only difference between trials lay in the monkey's anticipation of a short or long delay. These points are supported by Fig. 3C, which represents the number of fixation breaks occurring at each time during the trial (in 500-ms bins) and under each condition (short or long delay), as a fraction of all fixation breaks committed by the monkeys. Overall, cases in which a trial was aborted by breaking fixation, even during the first second, were rare. Out of all long-delay trials, 2.6% were terminated by a fixation break during this period (5.6 and 0.03% in monkeys P and F, respectively). Out of all short-delay trials, 0.03% were terminated by a fixation break during this period (0.06 and 0.00% in monkeys P and F, respectively). The difference in these counts was highly significant in the data collapsed across monkeys (t-test, P < 0.001) and in monkey P (t-test, P < 0.0001). The number of fixation breaks in monkey F was so low (this monkey literally never broke fixation on a short-delay trial) that the trend fell short of significance (t-test, P = 0.15). We conclude that monkey P was more motivated to complete the trial if the anticipated delay was short and infer that the same was true of monkey F as well.
SIZE OF ANTICIPATED REWARD.
In data from the variable-reward task, the same behavioral measures revealed the following results. 1) The error rate was lower on big- than on small-reward trials (Fig. 3D). This effect was significant in data collapsed across monkeys (t-test, P < 0.05) and in a post hoc test of monkey P (P < 0.05). 2) The behavioral reaction time was inconsistently related to anticipated reward size (Fig. 3E). It was significantly lower on big-reward trials in monkey P (t-test, P < 0.0001) and on small-reward trials in monkey F (t-test, P < 0.0001). 3) Fixation breaks summed across the entire 2,500-ms period after presentation of the delay cue were more frequent under the small- than under big-reward condition (Fig. 3F). This effect was significant in the data collapsed across monkeys (t-test, P < 0.05) and in a post hoc test on data from monkey P (t-test, P < 0.01). The error-rate and fixation-break measures suggest that the monkeys were more motivated on big-reward trials. However, for unknown reasons, the observed effects were not as robust as in a previous study (Olson and Roesch 2003). In light of this fact, it is important to note that the interpretation of neuronal data does not hinge critically on there being a strong relation between the size of the predicted reward and these particular behavioral measures. It hinges only on the ability of cues predicting large and small rewards to elicit responses of different strength in OF.
Early activity correlated with anticipated delay length
EXAMPLE. In some individual neurons, the firing rate was clearly dependent on the length of the anticipated delay. An example is shown in Fig. 4, A and B. During the period after presentation of the delay-predicting cue, this neuron fired more strongly when the predicted delay was short than when it was long (Fig. 4, A vs. B, leftmost panels). Data in this figure are collapsed across response direction because the firing rate of the neuron was unaffected by direction.
DIFFERENCE HISTOGRAMS. As a graphic representation of the tendency for the population to fire more strongly on short- than on long-delay trials, we constructed a difference histogram representing, as a function of time during the trial, the index (SP + SA - LP - LA)/2, where SP is the firing rate under the short-delay, preferred-direction condition, LA is the firing rate under the long-delay, antipreferred-direction condition, and so on (Fig. 5B1). This shows that the strongest delay-related activity occurred during the period immediately after the delay-predicting cue. As a graphic representation of the tendency for the population directional signal (firing rate on preferred-direction trials minus firing rate on antipreferred-direction trials) to be stronger on short- than on long-delay trials, we constructed a difference histogram representing, as a function of time during the trial, the index (SP - SA - LP + LA)/2 (Fig. 5D1). This shows that there was a tendency, but only an extremely weak one, for the directional signal to be stronger on short-delay trials.
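The two difference-histogram indices can be written out explicitly. The sketch below applies them bin by bin to four population firing-rate curves. The function names and the example rates are hypothetical, and the signs follow the definitions in the text (short minus long for the main effect, and the directional signal on short-delay trials minus the directional signal on long-delay trials for the interaction).

```python
def delay_main_effect(sp, sa, lp, la):
    """(SP + SA - LP - LA)/2: short-delay minus long-delay firing rate."""
    return (sp + sa - lp - la) / 2.0


def delay_direction_interaction(sp, sa, lp, la):
    """(SP - SA - LP + LA)/2: directional signal, short minus long delay."""
    return (sp - sa - lp + la) / 2.0


# Hypothetical population rates (spikes/s) in three consecutive time bins.
SP = [10.0, 20.0, 14.0]  # short delay, preferred direction
SA = [8.0, 16.0, 12.0]   # short delay, antipreferred direction
LP = [9.0, 12.0, 13.0]   # long delay, preferred direction
LA = [7.0, 10.0, 12.0]   # long delay, antipreferred direction

main = [delay_main_effect(*rates) for rates in zip(SP, SA, LP, LA)]
interaction = [delay_direction_interaction(*rates)
               for rates in zip(SP, SA, LP, LA)]
```

Positive values of `main` correspond to stronger firing on short-delay trials in that bin.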
STATISTICAL ANALYSIS OF DATA FROM INDIVIDUAL NEURONS. Although the population and difference histograms indicate tendencies that were present across the neuronal population as a whole, they do not indicate how consistent or statistically significant these tendencies were. To determine in how many neurons the length of the anticipated delay significantly affected the firing rate, we carried out an ANOVA with firing rate as the dependent variable and with delay length and response direction as independent variables. This analysis focused on three epochs marked at the base of Fig. 5E1: epoch I (700 ms beginning with delay cue onset and ending with directional cue onset), epoch II (250 ms beginning with onset and ending with offset of the directional cue), and epoch III (250 ms beginning with directional cue offset). Direction was included as a factor even in the analysis of data from epoch I, before display of the directional cue, so as to maintain a parallel to the analysis of data from later epochs. Because the directional cue was not displayed until after epoch I, any main effects of direction or interaction effects involving direction could have reflected only type I errors and accordingly were not considered.
AFTER THE DELAY CUE.
Out of 154 recorded neurons, 28 (18%) exhibited a significant main effect of delay during epoch I. Out of these 28 neurons, those firing significantly more strongly under the short-delay condition (n = 27; blue symbols in Fig. 5C1) dramatically outnumbered those firing more weakly under the short-delay condition (n = 1; red symbol in Fig. 5C1). This effect was not significantly different between monkeys and was highly significant in the data from the two monkeys combined (χ² test, P < 0.0001).
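One way to check a count comparison of this kind is a one-degree-of-freedom chi-square test against an equal split. The text does not spell out the exact form of the test that was used, so the sketch below is an assumption, and the function name is hypothetical. For one degree of freedom the P value can be obtained from the complementary error function.

```python
import math


def chi_square_equal_split(n_a, n_b):
    """Chi-square statistic and P value (1 df) against a 50/50 split.

    Assumed form of the test: observed counts n_a and n_b are compared
    with an expected count of (n_a + n_b)/2 in each category.
    """
    expected = (n_a + n_b) / 2.0
    chi2 = ((n_a - expected) ** 2 + (n_b - expected) ** 2) / expected
    # For 1 df, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# 27 neurons firing more strongly for the short delay versus 1 for the long.
chi2, p_value = chi_square_equal_split(27, 1)
```

Under this assumed form of the test, the 27 versus 1 split gives a P value well below 0.0001, in line with the reported significance level.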
AFTER THE DIRECTIONAL CUE.
During epochs II and III, the effects observed in the two monkeys were inconsistent. In monkey P, a significant majority of neurons fired more strongly in anticipation of a short delay (χ² test, P < 0.001). Furthermore, in monkey P, the strength of the delay-related signal during this period was positively and significantly correlated with its strength during the epoch after the delay cue (r² = 0.42, P < 0.0001). In contrast, in monkey F, the number of neurons firing more strongly in anticipation of a short delay did not exceed the number expected by chance. Rather, a significant majority of neurons fired more strongly in anticipation of a contralateral response (χ² test, P < 0.001). In neither monkey did neurons in which the directional signal was stronger under the short-delay condition (blue symbols in Fig. 5E1) significantly exceed in number neurons showing the opposite effect (red symbols in Fig. 5E1).
SUMMARY. A single robust and consistent effect emerged from this analysis: the phasic response to the delay cue was stronger when it predicted a short delay than when it predicted a long delay.
Late activity correlated with elapsed delay length
POPULATION AND DIFFERENCE HISTOGRAMS. In the population histograms (Fig. 5A2), it is evident that the mean firing rate began to increase around the time of the imperative cue (offset of the fixation spot) and peaked at the time of saccade initiation around 200 ms later; then it continued to climb and peaked a second time before reward delivery. Throughout the period after saccade initiation, population activity was substantially higher after a long delay (red curves) than after a short delay (blue curves). This tendency is evident in the difference histogram of Fig. 5B2, where the downward-pointing red region indicates the period of time during the trial when firing was stronger after a long delay.
BEFORE THE SACCADE.
Results obtained from statistical analysis of data from epoch IV (a 250-ms period immediately preceding offset of the fixation spot) were inconsistent. In monkey F, neurons firing significantly more strongly after a short delay (n = 12) significantly outnumbered those firing significantly more strongly after a long delay (n = 2) (χ² test, P < 0.01). In monkey P, the effect was reversed (n = 2 vs. 13) and was also significant (P < 0.01). During epoch V (a 200-ms period immediately preceding saccade initiation), the difference in number between neurons in the two categories achieved significance neither in the combined data nor in either monkey considered individually. During these epochs, effects involving an interaction between delay and direction did not significantly exceed the rate expected from type I errors.
AFTER THE SACCADE.
During epoch VI (extending from saccade onset to 100 ms after saccade completion) and epoch VII (extending from 100 ms before to 100 ms after reward delivery), the number of neurons exhibiting a significant main effect of delay length was significantly higher than expected by chance (22/154 = 14% during epoch VI and 37/154 = 24% during epoch VII). Moreover, neurons firing more strongly after a long delay consistently outnumbered those firing more strongly after a short delay. This effect was not significantly different between monkeys and was significant in the data from the two monkeys combined (χ² test, P < 0.05 and P < 0.0001 during epochs VI and VII, respectively). Effects involving an interaction between delay and direction did not significantly exceed the rate expected from type I errors during either epoch.
SUMMARY. A single robust and consistent effect emerged from this analysis: beginning with onset of the saccade, neuronal activity was markedly stronger after a long than a short delay.
Relation between early and late effects dependent on delay length
To determine whether effects occurring at the end of the delay period (elapsed-delay effects) were correlated with effects occurring at the beginning of the trial (anticipated-delay effects), we computed, for each neuron, indices reflecting the dependency of its firing rate on delay length during 1) a predelay epoch extending from delay-cue onset to directional-cue onset and 2) a postdelay epoch extending from saccade initiation to reward delivery. The delay index, (S - L)/(S + L), where S and L are the firing rates on short-delay and long-delay trials, respectively, was positive in the case of any neuron firing more strongly when the delay length was short and negative in the case of any neuron firing more strongly when the delay was long. We also characterized each neuron as exhibiting or not exhibiting a significant main effect of delay length during the epoch in question (ANOVA with firing rate as dependent variable and delay length and direction as independent variables; Table 1).
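The delay index is a normalized contrast between two firing rates. A minimal sketch follows. The function name is hypothetical, and returning zero when both rates are zero is an assumption, since the text does not say how silent neurons were handled.

```python
def contrast_index(rate_a, rate_b):
    """Normalized contrast (A - B)/(A + B) between two firing rates.

    With A = short-delay rate and B = long-delay rate this is the delay
    index: positive when firing is stronger for the short delay and
    negative when it is stronger for the long delay.
    """
    total = rate_a + rate_b
    if total == 0:
        return 0.0  # assumption: a silent neuron shows no preference
    return (rate_a - rate_b) / total


short_preferring = contrast_index(12.0, 8.0)  # S = 12, L = 8 spikes/s
long_preferring = contrast_index(5.0, 15.0)   # S = 5, L = 15 spikes/s
```

Because firing rates are nonnegative, the index is bounded between -1 and +1, which makes neurons with very different overall rates comparable.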
RATIONALE. It would be reasonable to speculate that OF neurons responded more strongly to the cue predicting a shorter delay because, with time discounting taken into account, the anticipated reward held greater value. To test this idea required assessing how the same neurons responded to a manipulation of reward value achieved by a means other than varying the delay. For this purpose, we manipulated the size of the predicted reward.
EXAMPLE. Manipulating anticipated reward size had a clear effect on the firing of some OF neurons. An example is shown in Fig. 4, C and D. This neuron fired more strongly in anticipation of a big reward (Fig. 4C, leftmost panel) than in anticipation of a small reward (Fig. 4D, leftmost panel).
POPULATION HISTOGRAMS. To characterize reward-dependent effects at the population level, we constructed population curves representing mean firing rate as a function of time under the four trial conditions (Fig. 8A1). These revealed that, after presentation of the cue predicting reward size, there was a sharp phasic increase in the population activity in OF. The mean firing rate peaked at a higher level after a cue predicting a large reward (blue curves) than after a cue predicting a small reward (red curves).
STATISTICAL ANALYSIS OF DATA FROM INDIVIDUAL NEURONS.
To determine whether effects present in the population were also observable at the level of individual neurons, we analyzed data from each neuron during seven trial epochs (I-VII) defined in METHODS and depicted along the time line at the base of Fig. 8E. On data from each epoch, we carried out an ANOVA with firing rate as the dependent variable and with reward size and response direction as factors. Counts of neurons exhibiting significant main effects of reward size on firing rate are shown in Fig. 8C, where blue (or red) symbols represent the percentage of cases in which firing was significantly increased (or decreased) for big compared to small reward. Neurons firing significantly more strongly under the big-reward condition (blue symbols) significantly outnumbered those firing more weakly (red symbols) only in epoch I (χ² test, P < 0.0001). This effect was not significantly different across monkeys. Counts of neurons exhibiting a significant interaction between reward size and direction are shown in Fig. 8E, where blue (or red) symbols represent the percentage of cases in which the directional signal was stronger (or weaker) for big reward. Counts indicated during epoch I must represent type I errors because it was only after this epoch that the directional instruction was delivered. With the counts of epoch I as a basis for comparison, it is clear that interaction effects did not exceed the frequency expected by chance in any epoch.
SUMMARY.
A single robust effect emerged from this analysis: the phasic response to the reward cue was stronger when it predicted a big reward than when it predicted a small reward. This effect was commensurate with the one observed previously in OF by use of the same manipulation of reward size (Roesch and Olson 2004). However, because the behavioral signs of enhanced task engagement were small in this study (Fig. 3, D-F), we must allow for the possibility that the monkeys were not as intensely aware of the cue-reward contingencies as in the previous experiment and that, had they been, neuronal sensitivity to anticipated reward size would have been even greater than observed here.
Late activity correlated with reward size
A very late effect of reward size on the population firing rate was evident in the population histograms of Fig. 8A2. From immediately before delivery of the reward until the end of the trial, neuronal activity was stronger on small-reward trials (red curves) than on big-reward trials (blue curves). This effect lay outside the period of the trial to which planned comparisons were applied. However, because of its possible relation to a similar effect occurring in the variable delay task (see Late activity correlated with elapsed delay length), we decided to analyze it further.
Relation between early and late effects dependent on reward size
To determine whether the effect occurring at the end of the delay period (an enhancement of activity on small-reward trials) was correlated with the effect occurring at the beginning of the trial (an enhancement of activity on big-reward trials), we computed, for each neuron, indices reflecting the dependency of its firing rate on reward size during 1) a predelay epoch extending from delay-cue onset to directional-cue onset and 2) a postreward epoch extending for 500 ms after the initiation of reward delivery. The reward index, (B - S)/(B + S), where B and S are the firing rates on big- and small-reward trials, respectively, was positive in the case of any neuron firing more strongly when the reward was large and negative in the case of any neuron firing more strongly when the reward was small. We also characterized each neuron as exhibiting or not exhibiting a significant main effect of reward size during the epoch in question (ANOVA with firing rate as dependent variable and reward size and direction as independent variables).
The distribution of reward indices for the predelay epoch (Fig. 9A) was shifted significantly above zero (t-test, P < 0.0001). The distribution of reward indices during the postreward epoch (Fig. 9B) was shifted significantly below zero (t-test, P < 0.0001). The same trends were present and significant when consideration was confined to neurons exhibiting a significant dependency on reward size (shaded bars in Fig. 9, A and B). To determine whether the tendency for a neuron to exhibit postreward modulation was correlated with its tendency to exhibit predelay modulation, we plotted the reward indices obtained during the postreward epoch against the reward indices obtained during the predelay epoch (Fig. 9C). There was no significant correlation. We conclude that the tendency to fire more strongly in anticipation of a big reward was not correlated with the tendency to fire more strongly at the time of delivery of a small reward.
EARLY ACTIVITY. To determine whether neurons that fired more strongly (or weakly) in response to the big-reward cue also fired more strongly (or weakly) in response to the short-delay cue, we plotted the delay index computed during the predelay epoch (Fig. 6A) against the reward index computed during the predelay epoch (Fig. 9A) for all 148 neurons studied in the context of both tasks. The results (Fig. 10A) revealed a significant positive correlation (r² = 0.116, P < 0.0001). We conclude that OF neurons responded similarly to cues predicting a more desirable event either in the form of a short delay or in the form of a large reward.
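The across-neuron comparison of delay and reward indices amounts to a Pearson correlation between two lists with one entry per neuron. The sketch below shows the computation of r and r². The function name and the five index pairs are hypothetical and are not values from the 148 recorded neurons.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical predelay indices for five neurons studied in both tasks.
delay_index = [0.10, 0.30, -0.20, 0.00, 0.40]
reward_index = [0.20, 0.10, -0.10, 0.05, 0.30]

r = pearson_r(delay_index, reward_index)
r_squared = r ** 2  # proportion of variance shared by the two indices
```

A positive `r` means that neurons preferring the short-delay cue also tend to prefer the big-reward cue.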
Impact of reversing the cue convention
At the end of every 40 successful trials in both the variable-delay task and variable-reward task, the cues previously associated with short-delay or big-reward became associated with long-delay or small-reward and vice versa. Consequently, in each data collection session of 80 trials, there was one block conforming to each cue convention. This manipulation possessed the virtue of allowing us to consider the influence of anticipated delay length and reward size on neuronal activity independently of any selectivity neurons may have possessed for the visual attributes of the stimuli. However, it may have resulted in an attenuation of activity related to the anticipated delay and reward. This would be true if it took monkeys many trials to adjust their expectations after each switch. We addressed these concerns by asking how long it took monkeys and neuronal activity to adjust to the new contingencies after a switch.
To do so, we assessed, as a function of trial number relative to the time of the switch, the effect of reward and delay on behavioral reaction time and neuronal firing rate. It would be of interest to know how each neuron responded to the reversal, although because there was only one reversal per recording session, this was not possible. Instead, to achieve adequate analytic power, we combined reaction time measures and firing rates across all data collection sessions in both monkeys, considering only those trials that the monkey completed successfully. Analysis was performed on blocks of four consecutive correct trials, with blocks demarcated so that the time of the switch fell at a between-block boundary. Data in Fig. 12 are plotted as a function of trial number relative to the point in time at which the cue convention was reversed.
|
The firing-rate index for the variable-delay task was computed as the firing rate on trials involving the cue that initially signaled a short delay minus the firing rate on trials involving the cue that initially signaled a long delay (Fig. 12B). The index was positive before the switch because the neuronal population fired more strongly on short-delay trials and was negative after the switch for the same reason. Inspection of the figure makes clear that the transition from positive to negative values occurred within a few trials after the switch.
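The switch-aligned index described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: trials pooled across sessions are assumed to arrive as `(rel_trial, cue, rate)` tuples, where `rel_trial` counts correct trials relative to the switch (−1 is the last trial before it, 0 the first after it), cue `"A"` is the cue that initially signaled a short delay, and cue `"B"` is the cue that initially signaled a long delay. The function name, labels, and toy firing rates are hypothetical.

```python
from collections import defaultdict

BLOCK_SIZE = 4  # blocks of four consecutive correct trials, as in the text


def switch_aligned_index(trials):
    """Return {block: mean rate on cue-A trials minus mean rate on cue-B trials}.

    Floor division places relative trials -4..-1 in block -1 and 0..3 in
    block 0, so the switch falls on a boundary between blocks.
    """
    # For each block, accumulate [sum of rates, trial count] per cue.
    sums = defaultdict(lambda: {"A": [0.0, 0], "B": [0.0, 0]})
    for rel_trial, cue, rate in trials:
        acc = sums[rel_trial // BLOCK_SIZE][cue]
        acc[0] += rate
        acc[1] += 1

    index = {}
    for block in sorted(sums):
        a_sum, a_n = sums[block]["A"]
        b_sum, b_n = sums[block]["B"]
        if a_n and b_n:  # skip blocks lacking one of the two cue types
            index[block] = a_sum / a_n - b_sum / b_n
    return index


# Toy data (spikes/s): cue A means "short" before the switch, "long" after it.
toy_trials = [
    (-4, "A", 12.0), (-3, "B", 8.0), (-2, "A", 12.0), (-1, "B", 8.0),
    (0, "A", 8.0), (1, "B", 12.0), (2, "B", 12.0), (3, "A", 8.0),
]
toy_index = switch_aligned_index(toy_trials)
print(toy_index)
```

With the toy data the index is positive in the last block before the switch and negative in the first block after it, mirroring the sign change described for Fig. 12B.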
The reaction-time and firing-rate indices for the variable-reward task (Fig. 12, C and D) were computed in an exactly analogous way with the following exception. Because monkey F was slower to respond on big-reward trials, we inverted the sign of the reaction-time index in considering data from this monkey, computing it as the reaction time on trials involving the cue that initially signaled a small reward minus the reaction time on trials involving the cue that initially signaled a big reward. Inspection of the figure makes clear that both the transition in behavior and the transition in firing rate occurred within a few trials of the reversal in the cue convention. An intriguing further observation is that adjustment to reversal of the significance of the reward cues occurred slightly earlier (as judged both by behavioral and neural measures) than adjustment to reversal of the significance of the delay cues (Fig. 12, C and D vs. Fig. 12, A and B).
We conclude that reversing the cue convention at the midpoint of the data collection session did not lead to a major attenuation of neuronal activity dependent on delay length or reward size.
Neuronal sensitivity to the visual properties of the cue
Some neurons might conceivably have been selective for the visual properties of the cues. To determine whether this was so, we carried out an ANOVA with the firing rate during the epoch extending from onset of the delay cue (or reward cue) to onset of the directional cue as the dependent variable and with cue identity and delay duration (or reward size) as factors. Among 154 neurons studied in the context of the variable-delay task, only ten exhibited a significant (P < 0.05) main effect of cue identity and only six exhibited an interaction effect involving cue identity. These counts were no greater than expected by chance from type I errors (χ² test, P > 0.05). Among 152 neurons studied in the context of the variable-reward task, 16 exhibited a significant main effect of cue identity and eight exhibited an interaction effect involving cue identity. The number exhibiting a significant main effect was significantly in excess of the number expected by chance (χ² test, P = 0.0033). We conclude that a few neurons may have been selective for the visual properties of the cues used in the reward task. However, because the number was small and because cue identity was counterbalanced against reward size by reversing the convention at the midpoint of each data collection session, this should not have affected the outcome of the main analysis of neuronal activity dependent on reward size.
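The comparison of observed counts against chance can be illustrated with a short sketch. The text does not specify the exact form of the test, so this example assumes a one-degree-of-freedom goodness-of-fit test of the observed number of significant neurons against the number expected at a type I error rate of 0.05, with Yates continuity correction. The function name `chi2_vs_chance` and the choice of correction are assumptions made for illustration.

```python
import math


def chi2_vs_chance(n_sig, n_total, alpha=0.05, yates=True):
    """Chi-square test (1 df) of an observed count of significant cases
    against the count expected by chance at rate `alpha`.

    Returns (chi2, p). The Yates correction is an assumption of this sketch.
    """
    expected_sig = alpha * n_total
    expected_nonsig = n_total - expected_sig
    # The deviation has the same magnitude in both cells (sig / non-sig).
    diff = abs(n_sig - expected_sig)
    if yates:
        diff = max(diff - 0.5, 0.0)
    chi2 = diff ** 2 / expected_sig + diff ** 2 / expected_nonsig
    # Survival function of the chi-square distribution with 1 df.
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p


# Counts taken from the text: main effect of cue identity in each task.
chi2_delay, p_delay = chi2_vs_chance(10, 154)    # variable-delay task
chi2_reward, p_reward = chi2_vs_chance(16, 152)  # variable-reward task
print(f"delay task:  chi2 = {chi2_delay:.3f}, P = {p_delay:.4f}")
print(f"reward task: chi2 = {chi2_reward:.3f}, P = {p_reward:.4f}")
```

Under these assumptions the delay-task count is not distinguishable from chance (P > 0.05), and the reward-task count gives a P value close to the reported 0.0033. That agreement is consistent with, but does not establish, the use of a continuity-corrected test in the original analysis.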
|
|
DISCUSSION |
|---|
|
We monitored single-neuron activity in OF of monkeys performing a variant of the ocular delayed response task in which a cue presented early in each trial predicted whether the ensuing delay would be short or long. We found that a cue predicting a short delay commonly elicited a stronger neuronal response. The strength of the response presumably represented the time-discounted value of the anticipated reward because neurons firing more strongly in response to a short-delay cue also tended to fire more strongly in response to a big-reward cue. We observed an additional incidental effect of uncertain significance at the end of the delay period: neurons fired more strongly if the preceding delay had been longer.
Activity of OF neurons in relation to anticipated time-discounted value
The fact that OF neurons respond more strongly to cues predicting a short delay than to those predicting a long delay might, in principle, arise from the involvement of OF in at least three different sets of processes: 1) affective processes embodying the monkey's emotional response to the display and representing the subjective value of the predicted outcome, 2) motivational processes underlying value-dependent modulation of the monkey's degree of engagement with the demands of the task, and 3) predictive processes associated with the monkey's preparing to respond to the imminent imperative cue on short-delay trials but deferring preparation on long-delay trials. Having considered these possibilities at length in connection with delay-related activity in other frontal areas (Roesch and Olson 2005), we will confine ourselves here to noting that the first interpretation, based on affective representations of value, is the one most likely to apply to OF. The predictive interpretation, involving timed preparation, can be ruled out on two grounds. First, it cannot explain the correlation demonstrated here between the tendency for a neuron to fire more strongly in response to a cue predicting a short delay and the tendency to fire more strongly in response to a cue predicting a big reward. In the variable-reward paradigm, activity related to timed preparation should be identical on big- and small-reward trials because the behavioral response occurs at the same delay after the cue on both kinds of trial. The motivational interpretation can be ruled out on the basis of the observation that when the monkey's motivational state and the value of the predicted outcome are dissociated (through manipulating motivation with threatened penalties as well as promised rewards) then neuronal responses to the outcome-predicting display are correlated with value rather than motivation (Roesch and Olson 2003). Thus we are left with the conclusion, consonant with a large body of literature summarized in the INTRODUCTION, that neuronal activity in OF represents the value of the predicted outcome.
If we grant that delay-related activity in OF was related to the monkey's affective response to the predictive cue and was related to the value conveyed by the cue, it still does not follow with certainty that the value was determined by time discounting. The fact that the anticipated reward had greater value on short-delay trials might be explained in any of three ways: 1) the time-discounted value of the reward was greater, 2) the probability-discounted value was greater, or 3) the effort-discounted value was greater. A probability-based account must be considered because, on trials with longer delays, there was a greater likelihood of the monkey's aborting the trial by breaking fixation (Fig. 3C) or of making a saccade to the wrong target (Fig. 3A), with the result that the probability of earning a reward was lower. The probability of receiving a reward is known to exert an impact on both behavior and neural activity related to the anticipation of reward (Fiorillo et al. 2003; Mobini et al. 2002; Platt and Glimcher 1999). However, it would seem that a probability-based interpretation is ruled out by the fact that monkey F, although exhibiting a neural effect whereby the response to the cue was stronger on short-delay trials, exhibited no significant difference between short- and long-delay trials with respect to either error rate or fixation break rate (t-test, P > 0.05). We must still consider whether the evaluation of the cue could have been effort based. According to this interpretation, a reward anticipated at short delay had greater value because the monkey anticipated spending less effort (measured in terms of the duration of central fixation) to obtain it. The design of our task does not allow distinguishing between effects arising from time discounting and those arising from effort discounting. To test for time discounting without contamination from effort discounting would require imposing a delay between completion of the trial and delivery of the reward rather than, as in our task, between the beginning and end of the trial. Although it is entirely plausible that the enhancement of OF activity that we observed on short-delay trials reflected the monkey's time-based evaluation of the reward, we cannot altogether rule out an effect arising from an effort-based evaluation.
Activity of OF neurons at the end of the delay period
Although characterizing neuronal activity at the end of the delay period was not a central aim of this experiment, we did make a perplexing incidental observation. Delay-dependent neuronal activity in OF underwent an inversion of sign after initiation of the saccade, with firing becoming stronger after a long than after a short delay (Fig. 5, A2 and B2). We can only speculate about the possible functional significance of this observation.
On one hand, it might be explained in terms of the idea that neuronal activity in OF is stronger under circumstances associated with more positive affect. Termination of a long delay could have been perceived as a positive event for two reasons: first, escape from the long delay must have been rewarding insofar as the delay itself was aversive; second, because of the nature of the algorithm governing the sequencing of trials (as described in METHODS), completion of a trial involving a long delay indicated with a probability of 0.625 that the next trial would involve a short delay. This line of reasoning unfortunately does not provide a clear account of why the inversion should have occurred exactly at the moment of saccade initiation. Nor does it agree particularly well with an observation made in the context of the variable-reward task. In that task, an inversion of activity dependent on the size of the reward occurred at the time of reward delivery, with firing becoming stronger on delivery of a small reward (Fig. 8, A2 and B2). Although delivery of a small reward indicated with a probability of 0.625 that a big reward would be delivered on the next trial, the small reward certainly was not, in itself, a more positive event than a big reward.
The late inversion effect might, on the other hand, have been related to the monkeys' state of arousal, attention, or behavioral preparation, which was clearly affected by the duration of the antecedent delay. In this study, as in a previous one (Roesch and Olson 2005), speed and accuracy were enhanced if the preceding delay had been short. However, interpretation along these lines is hindered by the fact that, unlike in the previous study, signs of behavioral enhancement were weak and mixed on big-reward as compared with small-reward trials. Thus we remain unable, given data currently in hand, to provide a conclusive unitary account of the late inversion of delay-related and reward-related activity.
Comparison of delay-related activity in OF and other frontal areas
We recently characterized neuronal activity accompanying performance of the variable-delay task in several frontal areas other than OF, including dorsolateral prefrontal cortex (PFC), the frontal and supplementary eye fields (FEF and SEF), and the premotor and supplementary motor areas (PM and SMA) (Roesch and Olson 2005). On comparison of the results, it is clear that there are significant differences between delay-related activity in OF and in these areas. In OF, delay-related activity took the form of a marked enhancement of the strength of the phasic response to the delay-predicting cue when this cue signaled a short delay. This result stands in contrast to results obtained in other areas. In PFC, FEF, and SEF, there was little or no enhancement of firing during the period immediately after the delay-predicting cue. In PM and SMA, firing was enhanced not only immediately after the cue but throughout the delay period leading up to the response. These results fit well within a simple explanatory framework. The early phasic response in OF occurred during the period when the monkey was evaluating the significance of the delay-predicting cue; this activity could well be related to the evaluation process. The prolonged activity in PM and SMA spanned the time during which the monkey was in a state of heightened readiness on short-delay trials; this activity could well reflect motivational modulation of neural processes underlying engagement with the demands of the task including response preparation. It might be the case that early phasic delay-related activity in OF (representing time-discounted value) drove prolonged delay-related activity in PM and SMA (reflecting value-dependent motivational modulation). Because the areas are not directly connected, this would necessitate their communicating through intermediaries. Our finding that delay-related signals in PFC, FEF, and SEF are weak suggests that these areas, although interposed topologically and connectionally between OF and premotor cortex, are not likely to relay delay-related signals between them and thus leaves unresolved the identity of the intermediary structures.
The view that neuronal activity in OF represents the time-discounted value of the anticipated reward, whereas neuronal activity in PM and SMA reflects value-based motivational modulation of the monkey's preparatory state, is consonant with much that is currently known about these areas. OF is not directly involved in oculomotor and skeletomotor control. Lesions and inactivation of OF do not result in impairments of motor control but do interfere with the evaluation of rewards (Baylis and Gaffan 1991; Butter and Snyder 1972; Butter et al. 1969, 1970; Dias et al. 1996; Gaffan and Murray 1990; Iversen and Mishkin 1970; Izquierdo and Murray 2004; Jones and Mishkin 1972; Meunier et al. 1997). Furthermore, microelectrode recording studies have demonstrated the presence of neurons influenced by the value of an experienced or expected reward in OF but have demonstrated only weak activity related to the locations of stimuli or the directions of responses (Critchley and Rolls 1996; Hikosaka and Watanabe 2000; Rolls and Baylis 1994; Rolls et al. 1996; Thorpe et al. 1983; Tremblay and Schultz 1999, 2000). In contrast, lesions and inactivation of the PFC (Dias et al. 1996; Wallis et al. 2001), FEF (Dias and Segraves 1999; Sommer and Tehovnik 1997), SEF (Sommer and Tehovnik 1999), PM (Kurata and Hoffman 1994), and SMA (Brinkman 1984) result in impairments of cognitive, attentional, and motor control but are not known to interfere with the evaluation of rewards or with motivation. Furthermore, insofar as neurons in these areas are sensitive to the size of an anticipated reward, their reward-related activity is prolonged throughout the delay period as if reflecting motivational modulation of the monkey's preparatory state (Roesch and Olson 2003).
|
|
GRANTS |
|---|
|
|
|
|
ACKNOWLEDGMENTS |
|---|
|
|
|
FOOTNOTES |
|---|
Address for reprint requests and other correspondence: M. R. Roesch, University of Maryland School of Medicine, Department of Anatomy and Neurobiology, HSF-2, Room S251, 20 Penn St., Baltimore, MD 21201 (E-mail: mroes001{at}umaryland.edu)
|
|
REFERENCES |
|---|
|
Brinkman C. Supplementary motor area of the monkey's cerebral cortex: short- and long-term deficits after unilateral ablation and the effects of subsequent callosal section. J Neurosci 4: 918-929, 1984.
Butter CM, McDonald JA, and Snyder DR. Orality, preference behavior, and reinforcement value of nonfood object in monkeys with orbital frontal lesions. Science 164: 1306-1307, 1969.
Butter CM and Snyder DR. Alterations in aversive and aggressive behaviors following orbital frontal lesions in rhesus monkeys. Acta Neurobiol Exp (Warsz) 32: 525-565, 1972.
Butter CM, Snyder DR, and McDonald JA. Effects of orbital frontal lesions on aversive and aggressive behaviors in rhesus monkeys. J Comp Physiol Psychol 72: 132-144, 1970.
Cardinal RN, Pennicott DR, Sugathapala CL, Robbins TW, and Everitt BJ. Impulsive choice induced in rats by lesions of the nucleus accumbens core. Science 292: 2499-2501, 2001.
Crist CF, Yamasaki DSG, Komatsu H, and Wurtz RH. A grid system and a microsyringe for single cell recording. J Neurosci Meth 26: 117-122, 1988.
Critchley HD and Rolls ET. Hunger and satiety modify the responses of olfactory and visual neurons in the primate orbitofrontal cortex. J Neurophysiol 75: 1673-1686, 1996.
Dias EC and Segraves MA. Muscimol-induced inactivation of monkey frontal eye field: effects on visually and memory-guided saccades. J Neurophysiol 81: 2191-2214, 1999.
Dias R, Robbins TW, and Roberts AC. Dissociation in prefrontal cortex of affective and attentional shifts. Nature 380: 69-72, 1996.
Evenden JL and Ryan CN. The pharmacology of impulsive behaviour in rats: the effects of drugs on response choice with varying delays of reinforcement. Psychopharmacology (Berl) 128: 161-170, 1996.
Fiorillo CD, Tobler PN, and Schultz W. Discrete coding of reward probability and uncertainty by dopamine neurons. Science 299: 1898-1902, 2003.
Gaffan D and Murray EA. Amygdalar interaction with the mediodorsal nucleus of the thalamus and the ventromedial prefrontal cortex in stimulus-reward associative learning in the monkey. J Neurosci 10: 3479-3493, 1990.
Herrnstein R. Relative and absolute strength of response as a function of frequency of reinforcement. J Exp Anal Behav 4: 267-272, 1961.
Hikosaka K and Watanabe M. Delay activity of orbital and lateral prefrontal neurons of the monkey varying with different rewards. Cereb Cortex 10: 263-271, 2000.
Ho MY, Mobini S, Chiang TJ, Bradshaw CM, and Szabadi E. Theory and method in the quantitative analysis of "impulsive choice" behaviour: implications for psychopharmacology. Psychopharmacology (Berl) 146: 362-372, 1999.
Iversen SD and Mishkin M. Perseverative interference in monkeys following selective lesions of the inferior prefrontal convexity. Exp Brain Res 11: 376-386, 1970.
Izquierdo A and Murray EA. Combined unilateral lesions of the amygdala and orbital prefrontal cortex impair affective processing in rhesus monkeys. J Neurophysiol 91: 2023-2039, 2004.
Jones B and Mishkin M. Limbic lesions and the problem of stimulus-reinforcement associations. Exp Neurol 36: 362-377, 1972.
Kurata K and Hoffman DS. Differential effects of muscimol microinjection into dorsal and ventral aspects of the premotor cortex of monkeys. J Neurophysiol 71: 1151-1164, 1994.
Liu Z, Richmond BJ, Murray EA, Saunders RC, Steenrod S, Stubblefield BK, Montague DM, and Ginns EI. DNA targeting of rhinal cortex D2 receptor protein reversibly blocks learning of cues that predict reward. Proc Natl Acad Sci USA 101: 12336-12341, 2004.
Lowenstein G and Elster J. Choice Over Time. New York: Russell Sage Foundation, 1992.
Meunier M, Bachevalier J, and Mishkin M. Effects of orbital frontal and anterior cingulate lesions on object and spatial memory in rhesus monkeys. Neuropsychologia 35: 999-1015, 1997.
Mobini S, Body S, Ho MY, Bradshaw CM, Szabadi E, Deakin JF, and Anderson IM. Effects of lesions of the orbitofrontal cortex on sensitivity to delayed and probabilistic reinforcement. Psychopharmacology (Berl) 160: 290-298, 2002.
Montague PR and Berns GS. Neural economics and the biological substrates of valuation. Neuron 36: 265-284, 2002.
Platt ML and Glimcher PW. Neural correlates of decision variables in parietal cortex. Nature 400: 233-238, 1999.
Robinson DA. A method of measuring eye movements using a scleral search coil in a magnetic field. IEEE Trans Biomed Eng 10: 137-145, 1963.
Roesch MR and Olson CR. Impact of expected reward on neuronal activity in prefrontal cortex, frontal and supplementary eye fields and premotor cortex. J Neurophysiol 90: 1766-1789, 2003.
Roesch MR and Olson CR. Neuronal activity related to reward value and motivation in primate frontal cortex. Science 304: 307-310, 2004.
Roesch MR and Olson CR. Neuronal activity dependent on anticipated and elapsed delay in macaque prefrontal cortex, frontal and supplementary eye fields and premotor cortex. J Neurophysiol 94: 1469-1497, 2005.
Rolls ET. The orbitofrontal cortex. Philos Trans R Soc Lond B Biol Sci 351: 1433-1444, 1996.
Rolls ET. The orbitofrontal cortex and reward. Cereb Cortex 10: 284-294, 2000.
Rolls ET and Baylis LL. Gustatory, olfactory, and visual convergence within the primate orbitofrontal cortex. J Neurosci 14: 5437-5452, 1994.
Schoenbaum G, Chiba AA, and Gallagher M. Orbitofrontal cortex and basolateral amygdala encode expected outcomes during learning. Nat Neurosci 1: 155-159, 1998.
Schoenbaum G, Chiba AA, and Gallagher M. Neural encoding in orbitofrontal cortex and basolateral amygdala during olfactory discrimination learning. J Neurosci 19: 1876-1884, 1999.
Schultz W, Tremblay L, and Hollerman JR. Reward processing in primate orbitofrontal cortex and basal ganglia. Cereb Cortex 10: 272-284, 2000.
Shidara M and Richmond BJ. Anterior cingulate: single neuronal signals related to degree of reward expectancy. Science 296: 1709-1711, 2002.
Shidara M and Richmond BJ. Differential encoding of information about progress through multi-trial reward schedules by three groups of ventral striatal neurons. Neurosci Res 49: 307-314, 2004.
Sommer MA and Tehovnik EJ. Reversible inactivation of macaque frontal eye field. Exp Brain Res 116: 229-249, 1997.
Sommer MA and Tehovnik EJ. Reversible inactivation of macaque dorsomedial frontal cortex: effects on saccades and fixations. Exp Brain Res 124: 429-446, 1999.
Thaler R. Some empirical evidence on dynamic inconsistency. Econ Lett 8: 201-207, 1981.
Thorpe SJ, Rolls ET, and Maddison S. The orbitofrontal cortex: neuronal activity in the behaving monkey. Exp Brain Res 49: 93-115, 1983.
Tremblay L and Schultz W. Relative reward preference in primate orbitofrontal cortex. Nature 398: 704-708, 1999.
Tremblay L and Schultz W. Reward-related neuronal activity during go-nogo task performance in primate orbitofrontal cortex. J Neurophysiol 83: 1864-1876, 2000.
Tsujimoto S and Toshiyuki S. Neuronal activity representing temporal prediction of reward in the primate prefrontal cortex. J Neurophysiol 2005 Jan 5; [Epub ahead of print].
Wallis JD, Dias R, Robbins TW, and Roberts AC. Dissociable contributions of the orbitofrontal and lateral prefrontal cortex of the marmoset to performance on a detour reaching task. Eur J Neurosci 13: 1797-1808, 2001.
Wallis JD and Miller EK. Neuronal activity in primate dorsolateral and orbital prefrontal cortex during performance of a reward preference task. Eur J Neurosci 18: 2069-2081, 2003.