Laboratory of Sensorimotor Research, National Eye Institute, National Institutes of Health, Bethesda, Maryland
Submitted 24 August 2006; accepted in final form 26 September 2006
| ABSTRACT |
|---|
|
|
|---|
| INTRODUCTION |
|---|
|
|
|---|
In an asymmetric reward saccade task (Fig. 1A), monkeys make saccades to targets associated with preferred reward with shorter latency than to targets associated with nonpreferred reward (Fig. 1B) (Lauwereyns et al. 2002b; Watanabe et al. 2003a,b). In this task, monkeys were trained to maintain fixation of a central point for a constant waiting period. At the end of the waiting period, a target was presented, and monkeys were required to make saccades to the target. The expected reward associated with each target is stable over a block of trials, i.e., the reward asymmetry is known to the monkeys throughout a block of trials, but the actual reward in a given trial is unknown to the monkeys until target presentation. This design raises several possibilities about how the reward asymmetry information is used over time to produce the final behavioral bias (Fig. 1C). In the first scenario, because the reward asymmetry is always known, behavioral bias toward the preferred target may be present at the beginning of the trial and constant throughout. In the second scenario, the subjective value of reward decreases with time, following a hyperbolic function that is multiplicative to the actual reward magnitude (Fig. 1C, inset) (Mazur 1984). Because the temporal discounting function is independent of reward magnitude (Green et al. 2004; Richards et al. 1997), the difference in the subjective values of the asymmetric rewards decreases in time, giving rise to a decreasing bias. In the third scenario, although the reward asymmetry information is always available, it is only gradually taken into account as time approaches the expected time of target presentation (i.e., the end of the waiting period). Accordingly, an increasing bias may be expected, especially in light of previous results of gradually increasing anticipatory neural activities in asymmetric reward tasks (Coe et al. 2002; Ikeda and Hikosaka 2003; Lauwereyns et al. 2002a,b; Sato and Hikosaka 2002; Takikawa et al. 2002), temporal modulation of attention allocation-related neural signals (Ghose and Maunsell 2002), and effects of time estimation on motor preparation (Janssen and Shadlen 2005).
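The second scenario can be made concrete with a minimal sketch, assuming the standard hyperbolic form V = A / (1 + kD), where A is the reward amount and D the delay. The discounting rate `k` and the delay values are hypothetical and chosen only for illustration; the two reward amounts are the small and large volumes given in METHODS. Because the same discount factor multiplies both amounts, the difference between the two subjective values shrinks as the delay grows.

```python
def hyperbolic_value(amount_ml, delay_s, k=1.0):
    """Subjective value under hyperbolic discounting: V = A / (1 + k * D).

    k is a hypothetical discounting rate (per second), not a fitted value.
    """
    return amount_ml / (1.0 + k * delay_s)


LARGE_ML, SMALL_ML = 0.3, 0.075  # reward volumes stated in METHODS


def value_difference(delay_s, k=1.0):
    """Difference in subjective value between the large and small reward."""
    return (hyperbolic_value(LARGE_ML, delay_s, k)
            - hyperbolic_value(SMALL_ML, delay_s, k))


# Illustrative delays in seconds (assumed, not taken from the experiment).
delays = [0.0, 0.5, 1.0, 2.0]
differences = [value_difference(d) for d in delays]
```

With any positive `k`, `differences` is strictly decreasing, which is the decreasing-bias prediction described for this scenario.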
|
| METHODS |
|---|
|
|
|---|
Behavioral tasks
In the visually guided saccade task (Fig. 1A), a trial began with the onset of a central point (diameter: 0.6°). Once the monkey's eye entered the fixation window (3° for monkey D and 4° for monkey L), an auditory click signaled the beginning of the waiting period. The monkey was required to maintain its eye within the fixation window until the end of the waiting period, when the fixation point was turned off and a peripheral target was turned on simultaneously. Targets were presented 20° to the left or right of the fixation point. The monkey made saccades to the target to obtain water reward. In regular trials (85% of all trials), the waiting period was 2.1 s. In probe trials (15% of all trials), the waiting period was one of four possible values: 200, 500, 950, and 1,600 ms. Whether a trial was a regular or probe trial was determined randomly except for the following constraints: 1) the first three trials after a block change were always regular trials; 2) probe trials were not presented consecutively; and 3) if an error occurred, regular trials, but not probe trials, were repeated. Error trials consisted mainly of fixation breaks (eyes leaving the fixation window before fixation point disappearance), occasionally of two-step saccades (incorrect initial saccades toward the large reward target followed by corrective saccades to the small reward target), and rarely of premature saccades (eyes leaving the fixation window within 100 ms after fixation point disappearance). Error trials were followed by an auditory buzz and penalized by a 1- to 1.5-s extension of the intertrial interval. The intertrial interval after a correct trial was 1.5 s.
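The first two scheduling constraints above can be sketched as follows. This is an illustrative reconstruction, not the authors' task-control code: the function name, its parameters, and the use of a per-trial probe probability are assumptions, and constraint 3 (repetition of regular trials after errors) is omitted because it depends on behavior during the session.

```python
import random


def schedule_block(n_trials, probe_prob=0.15, n_initial_regular=3, rng=None):
    """Return a list of 'regular'/'probe' labels for one block of trials.

    Hypothetical sketch: enforces only (1) regular trials at block start
    and (2) no two consecutive probe trials.
    """
    rng = rng or random.Random()
    trials = []
    for i in range(n_trials):
        previous_was_probe = bool(trials) and trials[-1] == "probe"
        if i < n_initial_regular or previous_was_probe:
            trials.append("regular")  # forced by constraint 1 or 2
        elif rng.random() < probe_prob:
            trials.append("probe")
        else:
            trials.append("regular")
    return trials


# Example: one 100-trial block with a fixed seed for reproducibility.
block = schedule_block(100, rng=random.Random(0))
```

Note that forcing regular trials in this way makes the realized probe fraction somewhat lower than `probe_prob`, so a real implementation aiming for a specific overall proportion would need to compensate.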
We explored two schemes of selecting the probe waiting period duration. In the first scheme, probe waiting period duration was pseudorandomly chosen from the four possible values. This value was used in all probe trials for the entire session. An example session consisted of 85% of trials with a 2.1-s waiting period and 15% of trials with a 500-ms waiting period. In the second scheme, probe waiting period duration was randomly chosen for every probe trial. An example session consisted of 84% of trials with a 2.1-s waiting period and 4% each with a 200-, 500-, 950-, and 1,600-ms waiting period. The first scheme has the advantage of generating a larger number of probe trials in every target-reward combination, thus facilitating estimation of saccade latency distribution. However, it has the drawback of potential variability in baseline motivation. The second scheme offers uniform baseline motivation for comparison among different probe waiting period durations, but with smaller numbers of saccades for every condition. Preliminary data suggest that saccade latency was modulated by time in a similar fashion in the two schemes. For this study, we primarily used the first scheme and presented example data using the second scheme in Fig. 2C for comparison. Note that, although the first scheme used two discrete periods, it was fundamentally different from the previously used bimodal timing distributions (Ghose and Maunsell 2002; Janssen and Shadlen 2005). In these previous studies, monkeys were extensively exposed to the bimodal distribution to form a stable expectation of events at different timings. In our task, on the other hand, monkeys had extensive experience only with the fixed regular waiting period before this study. In addition, during experiments, probe trials served as "catch trials" and constituted only 15% of all trials, thus limiting the possibility that monkeys might form stable expectations of bimodal event timings. Finally, the direction of the required saccade, one of two possibilities, was not known to the monkey until target onset.
|
100 trials) and could be small (0.075 ml) or large (0.3 ml). Block changes were indicated to monkeys by a prolonged intertrial interval (>5 s).
Data analysis
For analysis, only correct trials were included. Saccade onset latency was determined using standard velocity and acceleration threshold-crossing algorithms. Examples of saccade latency distribution in regular trials in one experiment session are shown in Fig. 1B. Median saccade latency was calculated for every target-reward combination in both regular and probe trials. Behavioral bias was calculated as the difference in median latency between small and large reward trials with the same target. Relative bias was calculated as the ratio of probe trial bias to regular trial bias in the same experimental session. Linear regression and statistical tests were performed on median saccade latency, before averaging for figure presentations, using internal functions in GraphPad Prism 4.01 (GraphPad Software, San Diego, CA).
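The bias measures defined above can be illustrated with a short sketch. The function names and the latency values are hypothetical and serve only to show the arithmetic; the actual analysis was carried out in GraphPad Prism.

```python
from statistics import median


def behavioral_bias(latencies_small, latencies_large):
    """Median latency in small reward trials minus that in large reward
    trials, for saccades to the same target (same units as the input)."""
    return median(latencies_small) - median(latencies_large)


def relative_bias(probe_bias, regular_bias):
    """Ratio of probe trial bias to regular trial bias in one session."""
    return probe_bias / regular_bias


# Hypothetical saccade latencies (ms) for one target in one session.
regular_small = [250, 260, 270]
regular_large = [200, 210, 220]
probe_small = [225, 230, 235]
probe_large = [205, 210, 215]

regular = behavioral_bias(regular_small, regular_large)  # 260 - 210 = 50
probe = behavioral_bias(probe_small, probe_large)        # 230 - 210 = 20
ratio = relative_bias(probe, regular)                    # 20 / 50 = 0.4
```

A relative bias below 1, as in this made-up example, would indicate that the bias at the probe waiting period had not yet reached the level seen at the regular 2.1-s waiting period.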
| RESULTS |
|---|
|
|
|---|
The temporal build-up of behavioral bias resulted from a constant saccade latency in large reward trials and an increasing saccade latency in small reward trials. However, some studies on human subjects have shown that reaction time in simple detection tasks increases with waiting period duration (Foley 1959; Green and Luce 1971; Karlin 1959; Nickerson and Burnham 1969; Sanders and Wertheim 1973). Therefore an alternative hypothesis is conceivable. In the alternative scenario, there is a reward-independent increase in saccade latency as a function of the waiting period duration. In large reward trials, this increase is compensated for by large reward-induced facilitation of saccades, thereby giving rise to apparently stable saccade latency. In small reward trials, this increase is unchanged or augmented by additional small reward-induced suppression of saccades, thereby giving rise to increasing saccade latency. To test the validity of this alternative hypothesis, we characterized the relationship between saccade latencies and waiting period duration in the same monkeys in an equal reward task, with identical timing arrangements as the asymmetric reward task.
As seen in Fig. 3, two main observations argued against a significant reward-independent increase in saccade latency. The first observation was that the median latency in equal reward trials (black data points and lines) followed that in large reward trials in the asymmetric reward task (dashed lines). The estimated slopes were not different between the two conditions (P > 0.3 for both saccades in both monkeys). The estimated intercepts were also not different between the two conditions in three of four cases (P = 0.2131 for left saccades in monkey D; P > 0.6 for both saccades in monkey L). The only exception was for right saccades in monkey D. However, even in this exception, median latency in the equal reward task was closer in value to large reward than to small reward trials in the asymmetric reward task. The second observation was that the data points and regression lines for trials with different reward magnitudes were almost identical in the equal reward task (Fig. 3, squares and circles). The estimates of both slopes and intercepts were not significantly different between two reward conditions (slope: P > 0.35 for both saccades in both monkeys; intercepts: P > 0.18 for both saccades in both monkeys). This indicates that, when there is no reward bias, as in the equal reward task, reward magnitude has little effect on saccade latency. Collectively, these observations indicate that the temporal build-up of the bias seen in the asymmetric reward task resulted mainly from gradually increased suppression of saccades to the small reward target.
|
| DISCUSSION |
|---|
|
|
|---|
The gradual suppression of small reward saccades without substantial facilitation of large reward saccades suggests that the behavioral bias cannot be fully accounted for by preferential saccade preparation toward large reward targets or by preferential attention allocation to large reward target positions. It has been shown that when the saccade target is known and the waiting period is variable, saccade latency is inversely correlated to the hazard function based on the subjective estimation of the waiting period distribution (Janssen and Shadlen 2005). In our experiment, because of the overwhelming exposure to a 2.1-s waiting period compared with the infrequent exposure to shorter probe waiting periods, the hazard function was likely monotonically increasing during the 2.1-s waiting period. Alternatively, monkeys might have learned the underlying timing structure despite the limited experience of short probe waiting periods, in which case the hazard function should follow a multiphasic time-course. If monkeys were merely preparing for saccades toward the known, large reward target position, their saccade latency in large reward trials would mirror the hazard function in time. However, our data indicate that the saccade latency in large reward trials shows little dependence on the probe waiting period, suggesting that saccade preparation toward large reward targets cannot fully account for the behavioral bias.
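For reference, the objective hazard of a discrete waiting period distribution, i.e., the probability that the target appears at a given time given that it has not yet appeared, can be sketched as below. This is only the hazard of the programmed distribution; the argument above concerns the subjective hazard, which also depends on the animal's timing uncertainty and learning history and is not modeled here. The function name is hypothetical, and the two distributions use the example session proportions given in METHODS.

```python
def discrete_hazard(waiting_period_probs):
    """Hazard of a discrete distribution: h(t_i) = P(T = t_i) / P(T >= t_i).

    waiting_period_probs maps waiting period (s) to its probability;
    the probabilities are assumed to sum to 1.
    """
    hazard = {}
    remaining = 1.0  # probability that the target has not yet appeared
    for t in sorted(waiting_period_probs):
        p = waiting_period_probs[t]
        hazard[t] = p / remaining
        remaining -= p
    return hazard


# First scheme, example session: one probe duration (500 ms) per session.
single_probe = discrete_hazard({0.5: 0.15, 2.1: 0.85})

# Second scheme, example session: probe duration drawn on every probe trial.
mixed_probe = discrete_hazard(
    {0.2: 0.04, 0.5: 0.04, 0.95: 0.04, 1.6: 0.04, 2.1: 0.84}
)
```

In both cases the hazard is low at the probe times and reaches 1 at 2.1 s, since the target must appear by then if it has not appeared earlier.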
Because attention and motivation are closely linked, another hypothesis suggests that the reward asymmetry-induced bias may be accounted for by differential attention allocation to target locations associated with large and small rewards (Maunsell 2004). Based on results in attention tasks (Muller and Findlay 1988; Posner 1980), attentional modulation is expected to facilitate saccades toward large reward targets and suppress saccades toward small reward targets. The suppression of saccades to nonpreferred (unattended) targets is similar to what we observed. However, the predicted facilitation of saccades to preferred (attended) targets was not observed, suggesting that attentional modulation cannot fully account for behavioral bias induced by reward asymmetry in our task. Interestingly, in a more complicated asymmetric reward task involving four target locations, facilitation and suppression of rewarded and unrewarded saccades, respectively, were indeed observed (Watanabe et al. 2003a). This discrepancy raises the possibility that attentional mechanisms may have an enhanced contribution to the overall behavioral bias in more difficult tasks. It remains to be determined how saccade facilitation and suppression develop in time in the more complex task and whether a similar time-course is followed in attention tasks.
In addition to these high-level implications, our results also point to specific directions in which to search for the neuronal underpinnings of reward-driven bias. Previous research in our laboratory and others has emphasized reward-modulated neural activity that correlates with facilitation of preferred motor responses. For example, using asymmetric reward tasks, our laboratory has shown reward asymmetry-modulated anticipatory activity in the basal ganglia, superior colliculus, and cortical eye fields (Coe et al. 2002; Ding and Hikosaka 2006; Ikeda and Hikosaka 2003; Kobayashi et al. 2006; Lauwereyns et al. 2002b; Sato and Hikosaka 2002; Takikawa et al. 2002). Such anticipatory activity emerged before target presentation and had a tendency to increase gradually in time until target presentation. It is selective for one reward asymmetry condition (e.g., left target-large reward and right target-small reward). In the cortical eye fields and superior colliculus, most instances showed enhanced activity when the contralateral target is rewarded or associated with larger reward and the ipsilateral target is either unrewarded or associated with smaller reward. Similar, but weaker, laterality was observed in the basal ganglia. The observed laterality has led to the parsimonious hypothesis that the anticipatory activity mediates the behavioral bias by facilitating desired actions (Hikosaka et al. 2006). Our behavioral data, however, showed that saccade latency changed in time only in the small reward trials in a fashion reminiscent of the time-course of the anticipatory activity. This suggests that the dominant role of the anticipatory activity may be suppression of undesired actions, in addition to its possible role in facilitation of desired actions.
In conclusion, we showed temporal build-up of reward asymmetry-induced behavioral bias in a nonhuman primate model. Because temporal factors are ubiquitous and reward-driven tasks dominate in behaving animals, the interactions between time and reward are crucial aspects of decision-making and deserve further examination.
| GRANTS |
|---|
|
|
|---|
| ACKNOWLEDGMENTS |
|---|
|
|
|---|
| FOOTNOTES |
|---|
Address for reprint requests and other correspondence: L. Ding, Lab. of Sensorimotor Research, National Eye Inst., National Inst. of Health, Bldg. 49, Rm. 2A50, Bethesda, MD 20892 (E-mail: dingl{at}nei.nih.gov)
| REFERENCES |
|---|
|
|
|---|
Ding L, Hikosaka O. Comparison of reward modulation in the frontal eye field and caudate. J Neurosci 26: 6695–6703, 2006.
Foley PJ. The foreperiod and simple reaction time. Can J Psychol 13: 20–22, 1959.
Ghose GM, Maunsell JH. Attentional modulation in visual cortex depends on task timing. Nature 419: 616–620, 2002.
Green DM, Luce RD. Detection of auditory signals presented at random time: III. Percept Psychophysiol 9: 257–268, 1971.
Green L, Myerson J. A discounting framework for choice with delayed and probabilistic rewards. Psychol Bull 130: 769–792, 2004.
Green L, Myerson J, Holt DD, Slevin JR, Estle SJ. Discounting of delayed food rewards in pigeons and rats: is there a magnitude effect? J Exp Anal Behav 81: 39–50, 2004.
Hikosaka O, Nakamura K, Nakahara H. Basal ganglia orient eyes to reward. J Neurophysiol 95: 567–584, 2006.
Ikeda T, Hikosaka O. Reward-dependent gain and bias of visual responses in primate superior colliculus. Neuron 39: 693–700, 2003.
Janssen P, Shadlen MN. A representation of the hazard rate of elapsed time in macaque area LIP. Nat Neurosci 8: 234–241, 2005.
Jiang H, Stein BE, McHaffie JG. Opposing basal ganglia processes shape midbrain visuomotor activity bilaterally. Nature 424: 982–986, 2003.
Karlin L. Reaction times as a function of foreperiod duration and variability. J Exp Psychol 58: 185–191, 1959.
Kawagoe R, Takikawa Y, Hikosaka O. Expectation of reward modulates cognitive signals in the basal ganglia. Nat Neurosci 1: 411–416, 1998.
Kobayashi S, Kawagoe R, Takikawa Y, Koizumi M, Sakagami M, Hikosaka O. Functional differences between macaque prefrontal cortex and caudate nucleus during eye movements with and without reward. Exp Brain Res In press.
Kobayashi S, Lauwereyns J, Koizumi M, Sakagami M, Hikosaka O. Influence of reward expectation on visuospatial processing in macaque lateral prefrontal cortex. J Neurophysiol 87: 1488–1498, 2002.
Lauwereyns J, Takikawa Y, Kawagoe R, Kobayashi S, Koizumi M, Coe B, Sakagami M, Hikosaka O. Feature-based anticipation of cues that predict reward in monkey caudate nucleus. Neuron 33: 463–473, 2002a.
Lauwereyns J, Watanabe K, Coe B, Hikosaka O. A neural correlate of response bias in monkey caudate nucleus. Nature 418: 413–417, 2002b.
Leon MI, Shadlen MN. Effect of expected reward magnitude on the response of neurons in the dorsolateral prefrontal cortex of the macaque. Neuron 24: 415–425, 1999.
Maunsell JH. Neuronal representations of cognitive state: reward or attention? Trends Cogn Sci 8: 261–265, 2004.
Mazur JE. Tests of an equivalence rule for fixed and variable reinforcer delays. J Exp Psychol Anim Behav Process 10: 426–436, 1984.
Muller HJ, Findlay JM. The effect of visual attention on peripheral discrimination thresholds in single and multiple element displays. Acta Psychol (Amst) 69: 129–155, 1988.
Munoz DP, Istvan PJ. Lateral inhibitory interactions in the intermediate layers of the monkey superior colliculus. J Neurophysiol 79: 1193–1209, 1998.
Nickerson RS, Burnham DW. Response times with nonaging foreperiods. J Exp Psychol 79: 452–457, 1969.
Platt ML, Glimcher PW. Neural correlates of decision variables in parietal cortex. Nature 400: 233–238, 1999.
Posner MI. Orienting of attention. Q J Exp Psychol 32: 3–25, 1980.
Richards JB, Mitchell SH, de Wit H, Seiden LS. Determination of discount functions in rats with an adjusting-amount procedure. J Exp Anal Behav 67: 353–366, 1997.
Rodriguez ML, Logue AW. Adjusting delay to reinforcement: comparing choice in pigeons and humans. J Exp Psychol Anim Behav Process 14: 105–117, 1988.
Roesch MR, Olson CR. Impact of expected reward on neuronal activity in prefrontal cortex, frontal and supplementary eye fields and premotor cortex. J Neurophysiol 90: 1766–1789, 2003.
Sanders AF, Wertheim AH. The relation between physical stimulus properties and the effect of foreperiod duration on reaction time. Q J Exp Psychol 25: 201–206, 1973.
Sato M, Hikosaka O. Role of primate substantia nigra pars reticulata in reward-oriented saccadic eye movement. J Neurosci 22: 2363–2373, 2002.
Takahashi M, Sugiuchi Y, Izawa Y, Shinoda Y. Commissural excitation and inhibition by the superior colliculus in tectoreticular neurons projecting to omnipause neuron and inhibitory burst neuron regions. J Neurophysiol 94: 1707–1726, 2005.
Takikawa Y, Kawagoe R, Hikosaka O. Reward-dependent spatial selectivity of anticipatory activity in monkey caudate neurons. J Neurophysiol 87: 508–515, 2002.
Watanabe K, Lauwereyns J, Hikosaka O. Effects of motivational conflicts on visually elicited saccades in monkeys. Exp Brain Res 152: 361–367, 2003a.
Watanabe K, Lauwereyns J, Hikosaka O. Neural correlates of rewarded and unrewarded eye movements in the primate caudate nucleus. J Neurosci 23: 10052–10057, 2003b.