Department of Neuroscience, University of Pennsylvania, Philadelphia, Pennsylvania
Submitted 2 June 2008; accepted in final form 27 August 2008
|
|
ABSTRACT |
|---|
|
|
|
INTRODUCTION |
|---|
|
We trained monkeys on a one-interval, two-alternative direction-discrimination task in which they decided the direction of motion and indicated their decision with an eye movement to a visual target located in the chosen direction. After learning the visuomotor association, the monkeys became increasingly able to discriminate weaker motion signals as training progressed over many months, a form of perceptual learning (Law and Gold 2008). This improvement in sensitivity was quantified by fitting performance data from individual sessions to psychometric functions describing the relationship between the motion stimulus and choice, assuming the independence of choices across trials. In the present study we extended these analyses to examine dependencies in the trial-by-trial sequences of choices. Our approach included adapting, for use with perceptual data, a form of Wiener kernel analysis that has been useful for describing sequential choice behavior in nonperceptual or "free-choice" tasks (Corrado et al. 2005; Dayan and Abbott 2001).
We show that the monkeys' strategies for distributing choices, at least early in training on the direction-discrimination task, were similar to strategies used in free-choice tasks, taking into account both the choices and outcomes from recent trials (Corrado et al. 2005; Kennerley et al. 2006; Lau and Glimcher 2005; Lee et al. 2004). However, for our task, information from past trials did not provide a reliable cue for the rewarded alternative on the present trial, which was determined exclusively by the direction of motion of the visual stimulus. Accordingly, as training progressed over many months, the monkeys learned to use information from the motion stimulus more effectively to govern their choices and, with a similar time course, to suppress the sequential dependencies. We present electrophysiological data suggesting that this interplay between prior history and sensory evidence does not directly affect the representation or interpretation of the stimulus in the brain, but is evident in commands that generate the appropriate oculomotor response. The results imply that training can help to calibrate the relative contributions of sensory and nonsensory factors to select and prepare actions.
|
|
METHODS |
|---|
|
Preparation for experiments
All four monkeys were naïve to behavioral and neurophysiological experiments prior to this study. In preparation for these experiments, each monkey was, in chronological order: 1) trained to sit comfortably in a custom-built primate chair; 2) surgically implanted with a head-holding device, recording cylinder(s) (Crist Instrument, Damascus, MD), and (for monkeys Av and At) eye coil, and given time to recover; 3) imaged using magnetic resonance imaging (MRI) to visualize the three-dimensional trajectories of the surgically implanted recording cylinders relative to the underlying cortical targets to help guide and confirm electrode placement (Kalwani et al. 2008); and 4) trained for several weeks to perform simple visually guided saccade tasks.
Behavioral task
The monkeys performed a one-interval, two-alternative forced-choice task that required them to decide the direction of random-dot motion and indicate their decision with an eye movement (Fig. 1). For two of the monkeys (At and Av), behavioral testing was paired with a technique for assessing ongoing oculomotor activity via electrical microstimulation-evoked eye movements (Gold and Shadlen 2000, 2003). For the other two monkeys (Cy and ZZ), behavioral testing was paired with electrophysiological recordings in the middle temporal visual (MT) and lateral intraparietal (LIP) areas. In both cases, the electrophysiological technique both limited the time spent performing the task, which occurred only when a microstimulation or recording site was found, and affected the geometry of the stimulus used, which was typically designed to match certain properties of the microstimulation or recording site (Law and Gold 2008).
|
Task timing and feedback were customized for each monkey to maximize its motivation and productivity. Each trial began with onset of the fixation point, which would remain on for 10 s, either fixed (monkeys Cy and ZZ) or pseudorandomly changing color and diameter every 2 s (monkeys At and Av), until the monkey attained fixation. Failure to attain fixation or broken fixation during a trial would result in a "time-out" period of about 2 s. Correct responses were rewarded with an audible tone paired with 0–5 drops of apple juice (median value = 3 drops per correct trial for each monkey) chosen at random. The volume of juice per drop was adjusted by trial-and-error to maximize each monkey's motivation. Correct trials were followed by a brief intertrial interval of 1–2 s. Erroneous responses were followed by an additional time-out period of 1–3 s.
Psychometric function
We fit behavioral choice data from each session to a psychometric function describing the relationship between the strength, duration, and direction of the motion stimulus and the monkey's choices. The function is based on a drift-diffusion process with a drift term that decays exponentially as a function of viewing time and has been shown to provide good fits to the behavioral data (see Eckhoff et al. 2008, especially "ddExp3a" of Eq. 37, for more detailed descriptions of the behavioral models and fitting procedures). Briefly, choice is based on a decision variable x, the value of which evolves as a function of a time-varying drift rate A(t) and a noise term $c\,dW \sim N(0, c^2 dt)$

$$dx = A(t)\,dt + c\,dW \qquad (1a)$$
)
![]() | (1b) |
with the average drift rate (with free parameters
and r0)
![]() | (1c) |
Thus the decision variable x is a normally distributed random variable with a mean and variance that both scale with several factors including motion strength and viewing time. Choice depends on the value of x at the end of motion viewing: the correct choice is made when x >0, an error otherwise. The psychometric function describing accuracy as a function of coherence and time is therefore
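$$P(C, T) = \Phi\!\left(\frac{\mu(T)}{\sqrt{v(T)}}\right) \qquad (2)$$

where $\Phi$ is the standard normal cumulative distribution function, $\mu(T) = \int_0^T A(s)\,ds$, and $v(T) = c^2 T$. Finally, lapses (errors at the highest motion strengths) are accounted for by scaling the entire function to match its measured upper asymptote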
![]() | (3) |
We fit session-by-session behavioral data to P(C, T) using three free parameters a,
, and
. The remaining parameters were set to values used previously (m = 1.25,
= 0.3, and r0 = 10 spikes/s; see Eckhoff et al. 2008; Gold and Shadlen 2000, 2003).
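The relationship between the diffusion process and predicted accuracy can be illustrated with a minimal sketch. It assumes, for illustration only, a drift of the form a·C^m·exp(−αt) (the text states only that the drift decays exponentially with viewing time), a constant noise scale `c`, and a lapse scaling of the form 0.5 + (1 − 2·lapse)·(P − 0.5). The function names, the parameter name `alpha`, and the default values other than m = 1.25 are hypothetical and are not taken from the fitted models.

```python
import numpy as np
from math import erf, sqrt


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def drift(t, coh, a=15.0, m=1.25, alpha=1.0):
    """ASSUMED drift form: a * C^m * exp(-alpha * t); a and alpha are illustrative."""
    return a * coh ** m * np.exp(-alpha * t)


def p_correct(coh, T, c=1.0, lapse=0.0, n_steps=1000):
    """Probability that x(T) > 0 for dx = A(t) dt + c dW with x(0) = 0."""
    t = np.linspace(0.0, T, n_steps + 1)
    A = drift(t, coh)
    # Mean of x(T): trapezoidal integral of the drift over viewing time.
    mu = np.sum((A[1:] + A[:-1]) / 2.0 * np.diff(t))
    # Variance of x(T) for a constant noise scale c.
    v = c ** 2 * T
    p = normal_cdf(mu / sqrt(v))
    # ASSUMED lapse scaling: the upper asymptote becomes 1 - lapse.
    return 0.5 + (1.0 - 2.0 * lapse) * (p - 0.5)
```

With these illustrative settings, accuracy is at chance for zero coherence and saturates at 1 − lapse for strong motion and long viewing times.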
We measured sequential choice dependencies by analyzing the trial-by-trial residuals from the fits to the psychometric function (Fig. 2). The residuals can be thought of as the portion of the monkeys' trial-by-trial choices that were not accounted for by the average effects of the motion stimulus. The psychometric function provided a predicted outcome for each trial, expressed as a value between 0 and 1 describing the probability of making a rightward choice for the given stimulus. The choice residuals were the differences between these predictions and the actual, binary choices (0 for a leftward choice, 1 for a rightward choice). Thus the values of the residuals spanned the range of –1 (a leftward choice on a trial in which a rightward choice was predicted) to +1 (a rightward choice on a trial in which a leftward choice was predicted). For example, a correct leftward choice on a low-coherence, short-viewing-time trial with a predicted proportion correct of 0.62 would correspond to a choice residual of –0.38. Note that the results were not affected by instead computing the residual deviance, based on the log-likelihood ratio of the best-fitting and saturated models (Wichmann and Hill 2001).
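The residual computation described above amounts to subtracting the predicted probability of a rightward choice from the binary choice. A minimal sketch follows; the function name `choice_residuals` is hypothetical, and the worked example reproduces the case given in the text (leftward motion, predicted proportion correct 0.62, so the predicted probability of a rightward choice is 0.38).

```python
import numpy as np


def choice_residuals(choices_right, p_right_pred):
    """Actual choice (0 = left, 1 = right) minus predicted P(rightward choice)."""
    choices = np.asarray(choices_right, dtype=float)
    predicted = np.asarray(p_right_pred, dtype=float)
    return choices - predicted


# Worked example from the text: a correct leftward choice (coded 0) on a
# leftward-motion trial with predicted proportion correct 0.62.
example = choice_residuals([0], [1.0 - 0.62])
```

Here `example[0]` is approximately –0.38, matching the value stated in the text.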
|
![]() | (4) |
|
, where t is the trial lag and
is –log (β0)–1. Model 4 is described in the following text (see in particular Eq. 6). Because the decision variable x can be thought of as an accumulation of evidence in units of spikes/s (from r0 in Eq. 1c), the units of B can be thought of as a change in spikes/s that can be either positive (a bias toward rightward choices) or negative (a bias toward leftward choices). For the average magnitude of bias within a given session, we report the mean ± SE of the absolute value of B in Eq. 4.
Sequential analysis
The goal of the sequential analysis of behavior was to determine the extent to which past trials could predict the current choice residual. The primary assumption we made was that the (output) sequence of residuals y(t) was related to the (input) sequence of choices x(t – τ) via a causal, linear filter g(τ). We computed g(τ) as the first-order kernel of the Wiener expansion of the functional G relating the input and output sequences, y(t) = G[x(t – τ)] (Rieke et al. 1999), using the Wiener–Hopf equation in matrix form
![]() | (5) |
Each computed kernel was fit using least-squares fitting to the following double-exponential equation
![]() | (6) |
1, and
2 are free parameters; and n1 and n2 are normalization constants such that
i=1N (1/na) e–i/
a. The kernel Wfit was truncated at lags between 1 and 801 trials in 10-trial increments and then used to filter the input sequence. The final version of Wfit used was the shortest truncated version that corresponded to the maximum correlation coefficient between the filtered input sequence and the actual output sequence (the choice residuals). Two versions of the fit kernel Wfit were computed for each session. For the first kernel, which measured the effect of past correct choices on the sequence of residuals, the input sequence was encoded as –1 for correct leftward choices, 0 for errors, and 1 for correct rightward choices (e.g., Fig. 5A). For the second kernel, which measured the effect of past error choices on the sequence of residuals, the input sequence was encoded as –1 for incorrect leftward choices, 0 for correct trials, and 1 for incorrect rightward choices (e.g., Fig. 5B). The final predicted sequence of residuals—the "sequential bias" (St for trial t in Eq. 4, models 4 and 5)—was computed as the sum of the filtered outputs from the two kernels.
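The input encoding and the raw kernel estimate described above can be sketched as follows. This is an illustration under stated assumptions rather than the analysis code used in the study: it solves the standard Wiener–Hopf normal equations (input autocorrelation matrix times kernel equals input-output cross-correlation) for lags of 1 to `n_lags` trials, and all function names are hypothetical. The double-exponential fit and the truncation search are not included.

```python
import numpy as np


def encode_inputs(choice_right, correct):
    """Return (correct_seq, error_seq), each coded -1 (left), 0, or +1 (right)."""
    choice_right = np.asarray(choice_right, dtype=bool)
    correct = np.asarray(correct, dtype=bool)
    signed = np.where(choice_right, 1.0, -1.0)
    # Correct-trial sequence is zero on errors; error-trial sequence is zero on correct trials.
    return signed * correct, signed * (~correct)


def wiener_kernel(x, y, n_lags):
    """First-order causal kernel g(1..n_lags) solving R_xx g = r_xy."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    # Input autocorrelation at lags 0..n_lags-1.
    r_xx = np.array([np.dot(x[:n - k], x[k:]) / n for k in range(n_lags)])
    # Cross-correlation between x(t - k) and y(t) for k = 1..n_lags.
    r_xy = np.array([np.dot(x[:n - k], y[k:]) / n for k in range(1, n_lags + 1)])
    idx = np.arange(n_lags)
    R = r_xx[np.abs(np.subtract.outer(idx, idx))]  # Toeplitz autocorrelation matrix
    return np.linalg.solve(R, r_xy)


def apply_kernel(x, g):
    """Filter the input: y(t) = sum over k of g(k) * x(t - k), k = 1..len(g)."""
    x = np.asarray(x, dtype=float)
    y = np.zeros(len(x))
    for k, gk in enumerate(g, start=1):
        y[k:] += gk * x[:-k]
    return y
```

In this sketch the sequential bias for a session would be the sum of `apply_kernel` outputs for the correct-trial and error-trial sequences, each filtered with its own kernel.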
|
For monkeys At and Av, eye position was monitored using a scleral search coil system (CNC Engineering, Seattle, WA) sampled at 1,000 Hz. For monkeys Cy and ZZ, eye position was monitored using a video-based system (Applied Science Laboratories, Bedford, MA) sampled at 240 Hz. While the fixation point was illuminated (e.g., throughout motion viewing), fixation was enforced to within <3°. Following fixation-point offset, choice was determined by comparing the endpoint of the first voluntary saccade (required to occur between 80 and 500 ms following fixation-point offset) to the locations of the two choice targets. Trials with broken fixations or saccadic endpoints located >3.5° from either target were excluded from further analysis.
For monkeys At and Av, behavioral testing was combined with a technique for assessing oculomotor preparation (Fig. 1A; for more details, see Gold and Shadlen 2000, 2003). A single, glass-covered tungsten microelectrode (Alpha Omega USA, Atlanta, GA) was advanced into the frontal eye field (FEF) using a NAN microdrive (Plexon, Dallas, TX) until a site was found where electrical microstimulation could elicit saccadic eye movements with a consistent trajectory using <50 µA of current (0.25-ms-long biphasic pulses applied at a rate of about 350 Hz for 60 ms) applied in darkness. Once a site was found, the task geometry was adjusted for that session such that the axis of motion of a foveally presented stimulus was roughly perpendicular to the trajectory of the evoked saccade. Eye movements were evoked on 10–90% of trials in a given session, chosen at random. Microstimulation pulses started at the simultaneous offset of the motion stimulus and fixation point and typically evoked a saccade with a latency of about 40 ms, which was followed within about 100 ms by a second, voluntary saccade to one of the two choice targets. Evoked saccade endpoints were measured from the stable eye position between the evoked and voluntary saccades.
Evoked-saccade trajectories were quantified as the magnitude of deviation, in degrees of visual angle, of their endpoints along the axis of motion. For most, but not all, sessions (159/162 for At, 158/213 for Av), the mean evoked saccade deviated in the same direction as the subsequent voluntary saccade. For these sessions, deviations toward the chosen target were assigned positive values; the rest were assigned negative values. For the remaining sessions, deviations toward the chosen target were assigned negative values; the rest were assigned positive values. Thus in all cases a positive deviation implied the same direction as the average deviation measured for that session.
We measured the relationship between the evoked-saccade deviations and sequential choice dependencies by computing the Spearman's (partial) rank correlation between the trial-by-trial magnitudes of deviation and the choice dependencies computed using the Wiener kernel analyses (St). St was signed according to the actual choice made on the given trial: a positive value for biases in the direction of the actual choice made, a negative value for biases in the other direction. We computed this correlation separately for left and right choices for each session, to account for possible differences in deviation magnitude for the two choices and avoid the confounding influence of sequential dependencies on choice behavior. We used rank correlations to standardize across different average magnitudes of both variables across sessions. We used partial correlations to account for effects of the strength and duration of the motion stimulus on the evoked-saccade deviations (Gold and Shadlen 2000, 2003). Specifically, we computed the correlation coefficient after controlling for the effects of both viewing time alone and the multiplicative interaction between motion strength and viewing time (this multiplicative interaction is consistent with an accumulation of motion information over time, as in the psychometric functions; Eckhoff et al. 2008).
Saccade latency, velocity, and accuracy were measured from the first voluntary saccade for all trials from Cy and ZZ and only for trials without electrical microstimulation from At and Av. Spearman (partial) rank correlations were computed to describe the trial-by-trial relationships between these parameters and sequential bias (St), using the same procedures as the deviation data, described earlier.
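A partial rank correlation of the kind described above can be sketched by rank-transforming every variable, regressing the ranked covariates out of both ranked variables of interest, and correlating the residuals. This is one standard way to compute it and is shown as an assumption, not as the exact procedure used in the study; the function names are hypothetical. In the present context the covariates would be viewing time and the product of motion strength and viewing time.

```python
import numpy as np


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    values = np.asarray(values, dtype=float)
    _, inverse, counts = np.unique(values, return_inverse=True, return_counts=True)
    upper = np.cumsum(counts)  # rank of the last member of each tie group
    return (upper - (counts - 1) / 2.0)[inverse]


def partial_spearman(x, y, covariates):
    """Spearman correlation of x and y after removing ranked covariates."""
    rx = average_ranks(x)
    ry = average_ranks(y)
    # Design matrix: intercept plus one ranked column per covariate.
    Z = np.column_stack([np.ones(len(rx))] + [average_ranks(c) for c in covariates])
    res_x = rx - Z @ np.linalg.lstsq(Z, rx, rcond=None)[0]
    res_y = ry - Z @ np.linalg.lstsq(Z, ry, rcond=None)[0]
    return float(np.corrcoef(res_x, res_y)[0, 1])
```

With an empty covariate list the function reduces to the ordinary Spearman rank correlation.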
Electrophysiological measurements and analysis
For monkeys Cy and ZZ, behavioral testing was combined with recordings of neural activity in areas MT and LIP. To begin each session quartz-coated platinum–tungsten microelectrodes were advanced into MT and LIP via a pair of Mini Matrix microdrive systems (Thomas Recording, Giessen, Germany). Extracellular action potential waveforms were stored and sorted off-line (Plexon). If a direction-tuned MT neuron was found, the motion stimulus was placed in its receptive field and shown at the neuron's preferred direction (and 180° opposite) and speed. If no MT neuron was found, the modal location, direction, and speed from previous sessions were used. If an LIP neuron with spatially tuned activity during the delay period of a delayed saccade task was found, one of the two choice targets was placed in its response field. If no LIP neuron was found, the targets were placed at their modal locations from previous sessions. The monkeys performed the task only while at least one MT or LIP neuron was recorded. Also, unlike in the version of the task used in the microstimulation experiments, there was a delay period of 0.3–0.8 s between offset of the motion stimulus and offset of the fixation point.
We quantified the relationship between MT and LIP activity and sequential choice dependencies (St) using Spearman's (partial) rank correlations. These correlations were computed separately for each choice, using correct trials only. For responses measured during motion viewing, partial correlations were computed by first controlling for the effects of motion strength. For LIP responses during motion viewing, responses were quantified not using raw spike rates but instead using the rate of rise (parameter
1 in Eq. 7) of the responses as a function of viewing time (T), as determined from a trial-by-trial fit to a simple piecewise linear model
![]() | (7) |
0,
1,
2, and
are fitted parameters (
0 and
1 were fit to spike-rate data smoothed using an alpha function from each trial;
and
2 were fit using average spike-rate data from each coherence for the given session). Choice indices were computed as the area under the receiver operating characteristic (ROC) curve obtained from the two distributions of spike rate from correct trials corresponding to saccade choices made into and away from the neuron's response field (Green and Swets 1966).
|
|
|
RESULTS |
|---|
|
Analysis of choice behavior
Behavioral choices tended to exhibit biases within individual sessions. Figure 3 illustrates data from a single session. Overall for this session, the monkey tended to choose rightward more often than leftward (57.1% rightward choices), despite nearly balanced stimulus presentations (49.5% rightward stimuli). This effect is seen in smoothed versions of both the trial-by-trial choices and trial-by-trial choice residuals from the unbiased model (Eqs. 1–3) plotted in chronological order, which tended toward positive values (Fig. 3A). The residuals are particularly informative because they have taken into account the average effects of the stimulus shown on the current trial and thus are a more sensitive measure of the relationship between choices across trials. Autocorrelation functions of both the choice and choice-residual sequences also reflect similar patterns, in both cases peaking at a lag of one trial and then declining slightly but remaining in general positive, implying a tendency to make the same (in this case, rightward) choice (Fig. 3B).
|
|
Two example kernels computed from a single session are depicted in Fig. 5, A and B. One was computed using data only from past correct choices (that is, sequences consisting of +1 for correct rightward choices, –1 for correct leftward choices and 0 for incorrect choices; Fig. 5A), the other using data only from past error choices (sequences consisting of 1 for incorrect rightward choices, –1 for incorrect leftward choices and 0 for correct choices; Fig. 5B). In both cases, the kernel coefficient was relatively large at a lag of one trial and then tended to be noisy but on average declined steadily toward zero. Such a kernel with mostly positive values suggests that the current choice reflects a running, weighted average of past choices. Figure 6 summarizes the session-specific kernels computed separately using correct (left) or error (right) trials for each of the four monkeys. On average, the kernel coefficients tended to be largest at the shortest lag and then steadily approach zero over lags of tens of trials, although individual kernel coefficients could take a fairly wide range of values at all lags.
|
To overcome the problem of overfitting while still providing session-specific estimates of the kernel and capturing its shape at both short (1 trial) and longer lags, we fit each raw kernel with a double-exponential function (e.g., dashed lines in Fig. 5, A and B). This function typically provided better fits than either a single-exponential (F-test, P < 0.05 for 505/588 sessions from all four monkeys) or power-law (the evidence ratio of Akaike's information criterion [AIC] was >20 for 367/588 sessions; Burnham and Anderson 2004) function and was similar in shape to the raw kernels averaged across sessions (compare green and black curves in Fig. 6). The fit kernels produced outputs that correlated with the actual choice residuals in a manner that rose with kernel size initially but, in contrast to the raw kernels, not indefinitely: the correlation (rseq) between filtered input and actual choice residuals reached a plateau for kernels using lags of <600 trials for all monkeys and all sessions, with a median [interquartile range, or IQR] value of the minimum lag providing the maximum rseq of 61 [80] trials for monkey At, 81 [140] trials for Av, 41 [70] trials for Cy, and 31 [40] trials for ZZ (Fig. 5, C and D).
The double-exponential kernels accounted for at least some of the sequential structure in the choice residuals. The value of rseq was higher than would be expected by chance from randomly ordered sequences of the same trials for 554 of 693 (80%) total sessions from all four monkeys (Monte Carlo simulations, P < 0.05), with a median [95% CIs] value across sessions of 0.16 [0.05–0.40] for monkey At, 0.11 [0.03–0.33] for Av, 0.14 [0.03–0.34] for Cy, and 0.12 [0.03–0.25] for ZZ. These values of rseq, which were computed using kernels determined separately from correct and error trials, were higher than when correct and error trials were combined together (that is, using a single sequence of choices, encoded as –1 for a leftward choice and +1 for a rightward choice; Wilcoxon test using the distributions of rseq computed from individual sessions, P < 0.05 for Av, Cy, and ZZ, P = 0.23 for At). This result implies that, at least for three of the four monkeys, the sequential dependencies tended to differ following correct versus error trials, an effect that can be seen by comparing the shapes of the average kernels for the two conditions (Fig. 6, compare left and right columns; the kernel coefficient at a lag of one trial had a different sign when computed using correct vs. error trials in 18% of sessions for At, 28% of sessions for Av, 51% of sessions for Cy, and 46% of sessions for ZZ). The kernels were not affected by using choices scaled by the amount of reward received on correct trials or the difficulty (coherence) of each trial (P > 0.05 for both cases, all four monkeys). Thus kernels computed separately using choice data from past correct and error trials appeared to be the best predictors of future choices.
Inclusion of choice bias in the psychometric function
We incorporated the sequential dependencies identified by the Wiener kernel analysis as a bias in a drift-diffusion model of performance (Eqs. 1–4). This procedure allowed us to analyze the relative contributions of the sequential choice dependencies and incoming sensory information on the monkeys' behavioral performance and to determine how these contributions change with training.
The magnitude of the bias in the drift-diffusion model was determined by two fit terms that together accounted for the behavioral biases (model 5 in Fig. 7). One term (βs) was fixed for an entire session and accounted primarily for the overall asymmetry between left and right choices in a given session (compare models 1 and 5 in Fig. 7). The other term (βw) scaled the trial-by-trial sequence of choices filtered by the Wiener kernels and accounted for sequential dependencies in the choice residuals and, to a lesser extent, the overall asymmetry between left and right choices (compare models 4 and 5 in Fig. 7).
We fit the behavioral data using two other models to provide further intuition into the nature of the bias term. For one model, the bias was based on the choice and outcome of the previous trial only (model 2 in Fig. 7). The fits to this model tended to show the least improvement over the unbiased model. Thus despite the fact that the Wiener kernel coefficients tended to be strongest at a lag of one trial (Fig. 6), information from that trial alone did not appear to be sufficient to account for the behavioral biases. The second model was based on an RL algorithm in which the bias term was updated on each trial based on the choice and outcome of the most recent trial (model 3 in Fig. 7). This model appeared to perform approximately as well as a model using only the Wiener kernels to compute biases (model 4), reflecting the fact that the bias terms in both models are consistent with weighted averages of recent correct and error choices (see METHODS for details). In fact, the trial-by-trial biases estimated by the two models were highly correlated (the median [IQR] correlation coefficient between the trial-by-trial values of B in Eq. 4 using the two models fit to data from individual sessions was 0.98 [0.04] for At, 0.98 [0.03] for Av, 0.95 [0.06] for Cy, and 0.90 [0.08] for ZZ). This analysis suggests that the biases might arise, at least in part, from a relatively simple updating process that takes into account the choice and outcome of the previous trial.
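The similarity between a trial-by-trial updating rule and a kernel-based weighted average can be illustrated with a generic delta rule. This sketch is an assumption for illustration: the specific RL algorithm of model 3 is not given in this section, and the function names and the `learning_rate` parameter are hypothetical. It shows that updating a bias from the most recent trial only is equivalent to filtering the whole input history with exponentially decaying weights.

```python
import numpy as np


def delta_rule_bias(inputs, learning_rate):
    """Bias available on each trial, updated afterward from that trial's input."""
    bias = 0.0
    out = np.zeros(len(inputs))
    for t, u in enumerate(inputs):
        out[t] = bias                         # bias in effect on trial t
        bias += learning_rate * (u - bias)    # update from the most recent trial
    return out


def exponential_filter_bias(inputs, learning_rate):
    """Same bias written as a weighted sum over all past trials."""
    inputs = np.asarray(inputs, dtype=float)
    n = len(inputs)
    out = np.zeros(n)
    for lag in range(1, n):
        # Weight on the trial `lag` trials back decays geometrically with lag.
        weight = learning_rate * (1.0 - learning_rate) ** (lag - 1)
        out[lag:] += weight * inputs[:-lag]
    return out
```

The two functions return the same sequence for any input, which is one way to see why a previous-trial updating rule and an exponential-like kernel can produce highly correlated trial-by-trial biases.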
We also considered two ways of integrating the bias term in the model. The first caused an offset in the initial (or, equivalently, final) position of the decision variable (Eq. 4), which is consistent with previous models of choice bias (Ashby 1983; Carpenter and Williams 1995; Link 1992; Ratcliff et al. 1999). The second method caused an offset in the drift rate (a in Eq. 1b) that governs stimulus sensitivity, which might be expected if the choice dependencies arose from a process, like attention, that directly affected the representation of sensory evidence used as input to the decision variable (see Fig. 6 in Gold and Shadlen 2007). The first method provided a consistently better fit to the behavioral data than the second (the evidence ratio of the AIC was >1, indicating that the first model was more likely than the second, for 574 of 652 total sessions from all four monkeys). Thus the sequential choice dependencies were consistent with a process that provided a dynamically modified offset to the decision variable that converts sensory evidence into the categorical choice.
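The AIC evidence ratio used for this comparison can be sketched as the ratio of Akaike weights, which depends only on the difference in AIC between the two models. The function names are hypothetical, and the numbers in the usage note are illustrative rather than values from the fits.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike's information criterion for a fitted model."""
    return 2.0 * n_params - 2.0 * log_likelihood


def evidence_ratio(aic_1, aic_2):
    """Relative likelihood of model 1 versus model 2; >1 favors model 1."""
    return math.exp((aic_2 - aic_1) / 2.0)
```

For example, two models with equal AIC give a ratio of 1, and an AIC difference of 6 in favor of model 1 gives a ratio of about 20.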
Changes in bias with training
The magnitude of the choice bias tended to decline as sensitivity to the motion stimulus improved with training. Two example sessions from early and late in training are shown in Fig. 8. The psychometric functions depict the percentage of rightward choices as a function of signed coherence (negative values indicate leftward motion, positive values rightward motion) at long viewing times. This function is steeper for the later versus the earlier session, indicating increased sensitivity to the motion stimulus. The trial-by-trial choice biases (computed here and in subsequent analyses from B in Eq. 4, model 5) show decreased fluctuations in the later versus the earlier session.
|
|
|
For monkeys performing the direction-discrimination task, there appears to be a close relationship between forming the direction decision and preparing the oculomotor response. For example, the temporal accumulation of motion information used to form the direction decision is reflected in neural activity in several brain regions linked to oculomotor preparation, including LIP, FEF, and the superior colliculus (Horwitz and Newsome 1999, 2001; Kim and Shadlen 1999; Roitman and Shadlen 2002; Shadlen and Newsome 2001). We tested whether the sequential dependencies are also reflected in oculomotor-related signals.
To assess oculomotor preparatory activity, we measured the trajectories of saccadic eye movements evoked with electrical microstimulation of the FEF at the end of motion viewing. These trajectories are determined primarily by the site of microstimulation but are sensitive to ongoing oculomotor activity, for example, deviating in the direction of a planned saccade (Mays and Sparks 1980; Schlag et al. 1989). Saccades evoked at the end of motion viewing during the direction-discrimination task reflect the link between decision formation and oculomotor preparation, deviating in the direction of the monkey's impending saccadic choice (Fig. 11, A and B) with a magnitude that depends on the strength and duration of the motion evidence used to arrive at that choice (Fig. 11C; Gold and Shadlen 2000, 2003). To test whether these evoked-saccade deviations also reflect the choice bias inferred from behavior, we computed partial rank correlations between the trial-by-trial magnitudes of deviation and bias (see METHODS).
|
|
Neural correlates
We looked for correlates of choice bias in areas MT and LIP. Neurons in area MT are tuned for the direction of visual motion and are thought to provide the sensory evidence for the direction decision (Gold and Shadlen 2007). In principle, trial-by-trial shifts in the responsiveness of particular subsets of MT neurons could affect choices in a manner analogous to the effects of electrical microstimulation, which biases choices in the preferred direction of the stimulated neurons (Salzman et al. 1990, 1992). Neurons in area LIP have been associated with sensory, motor, and cognitive functions and in decision tasks are thought to represent the accumulation of sensory evidence into a decision variable that governs the monkey's behavioral choice (Roitman and Shadlen 2002; Shadlen and Newsome 2001). In principle, choice biases could result from trial-by-trial changes in the value of this decision variable, such as an additive offset predicted by the diffusion model analysis.
MT activity was not correlated with sequential choice dependencies in a systematic manner before, during, or after motion viewing. Figure 12, A and B shows an example MT neuron. Its responses were modulated strongly by the motion stimulus during motion viewing, with increasingly strong leftward motion eliciting increasingly strong responses and increasingly strong rightward motion eliciting increasingly weak responses. We computed rank correlations between MT activity and sequential biases, using a partial correlation for data from the stimulus-viewing epoch that accounted for the (linear) relationship between response magnitude and motion strength. The value of this correlation did not differ significantly from zero in any of the three epochs (P > 0.05).
|
Likewise, LIP activity was not correlated with choice bias in a systematic manner. Figure 13, A and B shows an example LIP neuron. Its responses were modulated by the motion stimulus during motion viewing and remained separated by choice until the saccadic response. After accounting for these task-related modulations, the partial rank correlations between LIP activity and sequential bias were not significantly different from zero in any epoch before, during, or after motion viewing (P > 0.05).
|
Because LIP activity during motion viewing has been shown to relate closely to task performance (Roitman and Shadlen 2002; Shadlen and Newsome 2001), we examined in more detail how LIP responses during this epoch related to sensory input, sequential bias, and choice behavior. For each neuron, we computed an ROC-based choice index that quantifies the degree to which the LIP responses are separate for choices into versus out of the neuron's response field (Shadlen and Newsome 2001). A value of 0.5 implies that the distributions of responses were completely overlapping; a value of 1.0 implies that the distributions were completely nonoverlapping. To compare the effects of motion evidence and choice bias on the responses, we computed the difference in choice indices between trials with high versus low motion strengths and between trials that resulted in a choice toward versus away from the direction of the choice bias (Fig. 14). Consistent with previous reports, the coherence dependence grew over the first several hundred milliseconds of viewing time then tended to remain positive throughout motion viewing, particularly later in training (Kiani et al. 2008; Law and Gold 2008; Roitman and Shadlen 2002; Shadlen and Newsome 2001). In contrast, there was no consistent dependence on choice bias at any time in either monkey, even early in training and early in motion viewing when an initial, additive offset would be expected to dominate the value of the decision variable.
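The ROC-based choice index described above equals the probability that a randomly drawn response from the "into the response field" distribution exceeds one drawn from the "away" distribution, with ties counted as one half. A minimal sketch follows; the function name and argument names are hypothetical.

```python
import numpy as np


def choice_index(rates_in, rates_out):
    """Area under the ROC curve for two samples of spike rates."""
    rates_in = np.asarray(rates_in, dtype=float)
    rates_out = np.asarray(rates_out, dtype=float)
    # Compare every "in" response with every "out" response.
    greater = rates_in[:, None] > rates_out[None, :]
    ties = rates_in[:, None] == rates_out[None, :]
    return float(np.mean(greater) + 0.5 * np.mean(ties))
```

Identical samples give 0.5 (complete overlap), and samples in which every "in" response exceeds every "out" response give 1.0 (no overlap), consistent with the interpretation given in the text.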
|
|
DISCUSSION |
|---|
|
Behavioral choice dependencies
The behavioral choice dependencies persisted throughout months of training even though they provided no benefit to the monkeys in terms of obtaining reward. In fact, because reward depended only on whether the monkey chose the correct direction of motion for the given trial, and motion directions were chosen at random, biasing choices based on past trials could only hinder performance. However, because the sequential dependencies in most cases reflected only a fraction of the influence of the given stimulus on choice (Fig. 9, I–L), their effect on the amount of reward received was perhaps not large enough to motivate a more rapid change in strategy (assuming no biases improved the percentage of correct responses in individual sessions, estimated from the behavioral fits to Eq. 4, by a median [IQR] of only 1 [3]% for At, 2 [4]% for Av, 1 [3]% for Cy, and 1 [2]% for ZZ, even when considering only the first 20 sessions for each monkey, when the biases were greatest). Instead, the dependencies appeared to reflect a default strategy that did not require reinforcement to persist and thus seems likely to be present under a wide variety of behavioral conditions.
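Why a modest bias costs little reward can be illustrated with a toy calculation. The sketch below assumes a simple logistic choice function with an additive bias term; this is a stand-in chosen for illustration and is not the paper's Eq. 4. The coherence values, `sensitivity`, and `bias` are hypothetical.

```python
from math import exp


def p_right(signed_coh, sensitivity, bias):
    """Probability of a rightward choice under an assumed logistic model.

    signed_coh > 0 means rightward motion; bias > 0 favors rightward choices.
    """
    return 1.0 / (1.0 + exp(-(sensitivity * signed_coh + bias)))


def expected_accuracy(cohs, sensitivity, bias):
    """Expected fraction correct when each strength and direction is equally likely."""
    total = 0.0
    for c in cohs:
        correct_if_right = p_right(c, sensitivity, bias)        # motion right
        correct_if_left = 1.0 - p_right(-c, sensitivity, bias)  # motion left
        total += 0.5 * (correct_if_right + correct_if_left)
    return total / len(cohs)


# Hypothetical unsigned motion strengths (fraction coherence) and parameters.
cohs = [0.032, 0.064, 0.128, 0.256, 0.512]
unbiased = expected_accuracy(cohs, sensitivity=10.0, bias=0.0)
biased = expected_accuracy(cohs, sensitivity=10.0, bias=0.5)
cost_of_bias = unbiased - biased
```

Under these assumptions `cost_of_bias` is positive but small, because the bias helps on trials where it happens to match the motion direction and hurts on the others, and the two effects nearly cancel.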
Consistent with this idea, the sequential choice dependencies we measured for the motion-discrimination task were similar to those measured on free-choice tasks. The most striking similarity is the form of the weighting function describing the relationship between past events and the current choice, with the strongest weights corresponding to one or two trials in the past and decaying weights moving further into the past (Figs. 5 and 6; Corrado et al. 2005; Lau and Glimcher 2005). Such a function, when all the weights are positive and applied to past events, generates a running average that can be used to estimate recent rates of particular choices or rewards (Killeen 1994). The temporal decay might reflect limitations in memory, an innate emphasis on recent events that has evolved in a dynamic world, and possibly a common mechanism for discounting both past and future rewards (Corrado et al. 2005; Cowie 1977). The decay also highlights the local nature of the underlying computations that can drive behavior (Herrnstein and Vaughan 1980; Vaughan 1981).
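The running-average interpretation of a decaying weighting function can be sketched as follows. The exponential form, the time constant `tau`, and the function names are assumptions made for illustration; the weights reported in the paper were estimated from behavior and were not constrained to this shape (and in some cases were negative, reflecting a tendency to switch).

```python
from math import exp


def exponential_kernel(n_back, tau):
    """Positive weights for trials 1..n_back in the past, normalized to sum to 1.

    The first weight applies to the most recent trial; tau is the decay
    constant in units of trials.
    """
    raw = [exp(-(k - 1) / tau) for k in range(1, n_back + 1)]
    total = sum(raw)
    return [w / total for w in raw]


def history_bias(past_choices, weights):
    """Weighted running average of past choices.

    past_choices is ordered oldest first, coded +1 (right) or -1 (left).
    """
    most_recent_first = past_choices[::-1]
    return sum(w * c for w, c in zip(weights, most_recent_first))


weights = exponential_kernel(n_back=3, tau=1.0)
# Two leftward choices followed by a rightward choice on the last trial.
bias = history_bias([-1, -1, 1], weights)
```

Because the most recent trial carries the largest weight, `bias` here is positive (rightward) even though two of the three past choices were leftward, which is the "local" character of the computation noted in the text.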
There were also differences between our results and previous reports using free-choice tasks, including the source of past information used to guide choices. For free-choice tasks, behavior is typically analyzed with respect to various forms of reward rate (e.g., Corrado et al. 2005; Gallistel et al. 2001; Sternberg 2001). However, a recent study highlighted the usefulness of accounting separately for the history of rewards and choices, demonstrating that monkeys performing a free-choice task tended to make choices biased toward recently reinforced choices but biased away from recent choices in general (Lau and Glimcher 2005). For our task, behavior was most strongly predicted by a function of past choices, reflecting in some cases a tendency to repeat and in other cases a tendency to switch choices (see Fig. 6). Taking into account whether the past choices were rewarded (i.e., were correct or incorrect) improved the predictions, but taking into account the magnitudes of previous rewards did not.
Our modeling results are consistent with the idea that these dependencies correspond to an additive offset to a quantity, known as a decision variable, that accumulates motion information to arrive at a decision (Ashby 1983; Carpenter and Williams 1995; Link 1992; Ratcliff et al. 1999; this mechanism is also closely related to criterion changes in signal detection theory: Green and Swets 1966; Maddox 2002). In our decision model (Eqs. 1–4), the accumulated motion information is thought to represent the logarithm of the likelihood ratio describing the relative probabilities of obtaining the current motion evidence given the possible directions of motion (e.g., left vs. right; Gold and Shadlen 2001). An additive offset to the decision variable that is related to the relative frequencies of recent occurrences of each alternative implies an ongoing estimate of prior probability used in the context of Bayesian inference (Kersten et al. 2004; Rao 1999; Tassinari et al. 2006; Tenenbaum and Griffiths 2001). In this framework, the sequential dependencies might be thought of as a necessary component of a Bayesian decision process and not simply a reflection of a suboptimal or lazy strategy, thus explaining their prevalence in perceptual and cognitive tasks (Botvinick et al. 2001; Cho et al. 2002; Kirby 1976; Laming 1968; Remington 1969; Soetens et al. 1985).
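The Bayesian reading of the additive offset rests on a standard identity: the posterior log odds equal the log-likelihood ratio plus the log prior odds. The notation below is illustrative and is not taken from Eqs. 1–4; $e$ denotes the motion evidence on the current trial, and $R$ and $L$ denote rightward and leftward motion.

```latex
\log\frac{P(R \mid e)}{P(L \mid e)}
  = \underbrace{\log\frac{P(e \mid R)}{P(e \mid L)}}_{\text{accumulated motion evidence}}
  + \underbrace{\log\frac{P(R)}{P(L)}}_{\text{additive offset (prior)}}
```

For example, if recent trials led to an estimate of $P(R) = 0.6$, the offset would be $\log(0.6/0.4) \approx 0.41$ in natural-log units, favoring rightward choices; with equal priors the offset is zero and the decision depends on the motion evidence alone.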
Oculomotor correlates
Our search for oculomotor correlates of the sequential choice dependencies was motivated by the notion that high-order cognitive and perceptual functions are intimately linked to behavioral output (Clark 1997; Merleau-Ponty 1962; O'Regan and Noe 2001). For example, both spatial attention and perceptual decision making have been shown to be reflected in signals related to the preparation of eye movements (Gold and Shadlen 2000, 2003; Kustov and Robinson 1996; Moore et al. 2003). Even more relevant to the present study, saccadic latencies on a simple visually guided saccade task can reflect prior probabilities in a manner roughly consistent with our decision model, causing an additive offset to the value of a decision variable that builds up over time (Carpenter and Williams 1995).
The present results provide further support for these ideas, demonstrating that not just sensory but also sequential information used to select a saccadic response is reflected in signals related to the preparation and execution of that response. The effects were quite weak, which is consistent with the idea that the sequential effects played a minor role in determining where and when to plan the saccadic response relative to the influence of stimulus information from the current trial. Nevertheless, the presence of these sequential effects implies that the oculomotor system has at least some access to the constellation of factors that govern the monkey's saccadic choices.
Neural correlates
We targeted areas MT and LIP because of previous studies linking their activity to performance on the direction-discrimination task (reviewed in Gold and Shadlen 2007). MT represents motion evidence used to form the direction decision (Britten et al. 1992; Newsome and Paré 1988; Pasternak and Merigan 1994; Salzman et al. 1990). LIP is thought to represent the decision variable that converts sensory evidence into a plan to generate the appropriate oculomotor response (Roitman and Shadlen 2002; Shadlen and Newsome 2001). LIP has also been implicated in a host of other perceptual, oculomotor, and cognitive functions, including the valuation of recent rewards on a free-choice task that further suggested it might reflect the sequential choice dependencies we measured from behavior (Platt 2002; Snyder et al. 1997; Sugrue et al. 2004). Moreover, motion-driven responses in LIP, but not MT, accompany improvements in perceptual sensitivity on the direction-discrimination task (Law and Gold 2008).
However, we found no systematic effects in either MT or LIP. The lack of effects in LIP was particularly striking, given its close relationship to task performance both during and after training (Hanks et al. 2006; Law and Gold 2008; Roitman and Shadlen 2002; Shadlen and Newsome 2001). It is possible that the effects were too small or noisy to measure from individual neurons or reflected more complex influences than monotonic changes in spike rates in either area. Alternatively, the lack of effects in MT and LIP might imply that the sequential dependencies are processed separately from the judgment about motion direction. That is, task-related activity in LIP might represent only part of the decision variable that ultimately governs saccadic choices for this task, especially early in training when the sequential effects are strongest.
We do not know where or how the sequential aspects of the decision variable are computed. The oculomotor results suggest that activity in other structures that prepare and execute the saccadic response, possibly the superior colliculus or FEF, is at least influenced by the sequential choice dependencies. The fact that the dependencies typically involve multiple trials in the past suggests memory and strategic requirements often attributed to the dorsolateral prefrontal cortex (Barraclough et al. 2004; Funahashi et al. 1989; Fuster 2000; Goldman-Rakic 1995; Tanji and Hoshi 2001). The differences in the effects of previous rewarded versus unrewarded trials suggest the involvement of brain regions that predict and evaluate rewards or punishments, such as the orbitofrontal and anterior cingulate cortex or midbrain dopamine neurons (Nakahara et al. 2004; Rolls 2004; Seo and Lee 2007; Shidara and Richmond 2002; Tremblay and Schultz 1999). Imaging or multisite recording studies will ultimately be needed to understand the roles that these different brain areas play in computing and expressing sequential biases on choice behavior.
|
|
GRANTS |
|---|
|
|
|
ACKNOWLEDGMENTS |
|---|
|
|
|
FOOTNOTES |
|---|
Address for reprint requests and other correspondence: J. I. Gold, University of Pennsylvania, Department of Neuroscience, 116 Johnson Pavilion, 3610 Hamilton Walk, Philadelphia, PA 19104-6074 (E-mail: jigold@mail.med.upenn.edu).
|
|
REFERENCES |
|---|
|
Barraclough DJ, Conroy ML, Lee D. Prefrontal cortex and decision making in a mixed-strategy game. Nat Neurosci 7: 404–410, 2004.
Behrens TE, Woolrich MW, Walton ME, Rushworth MF. Learning the value of information in an uncertain world. Nat Neurosci 10: 1214–1221, 2007.
Bonett DG, Wright TA. Sample size requirements for estimating Pearson, Kendall and Spearman correlations. Psychometrika 65: 23–28, 2000.
Botvinick MM, Braver TS, Barch DM, Carter CS, Cohen JD. Conflict monitoring and cognitive control. Psychol Rev 108: 624–652, 2001.
Brainard DH. The Psychophysics Toolbox. Spat Vis 10: 433–436, 1997.
Britten KH, Shadlen MN, Newsome WT, Movshon JA. The analysis of visual motion: a comparison of neuronal and psychophysical performance. J Neurosci 12: 4745–4765, 1992.
Burnham KP, Anderson DR. Model Selection and Multi-Model Inference. New York: Springer, 2004.
Carpenter RH, Williams ML. Neural computation of log likelihood in control of saccadic eye movements. Nature 377: 59–62, 1995.
Cho RY, Nystrom LE, Brown ET, Jones AD, Braver TS, Holmes PJ, Cohen JD. Mechanisms underlying dependencies of performance on stimulus history in a two-alternative forced-choice task. Cogn Affect Behav Neurosci 2: 283–299, 2002.
Clark A. Being There: Putting Brain, Body, and World Together Again. Cambridge, MA: MIT Press, 1997.
Corrado GS, Sugrue LP, Seung HS, Newsome WT. Linear-nonlinear-Poisson models of primate choice dynamics. J Exp Anal Behav 84: 581–617, 2005.
Cowie RJ. Optimal foraging in great tits (Parus major). Nature 268: 137–139, 1977.
Davison M, McCarthy D. The Matching Law: A Research Review. Hillsdale, NJ: Erlbaum, 1988.
Dayan P, Abbott LF. Theoretical Neuroscience: Computational and Mathematical Modeling of Neural Systems. Cambridge, MA: MIT Press, 2001.
Eckhoff P, Holmes P, Law C-T, Connolly PM, Gold JI. On diffusion processes with variable drift rates as models for decision making during learning. New J Physics (January 31, 2008). doi:10.1088/1367-2630/10/1/015006.
Funahashi S, Bruce CJ, Goldman-Rakic PS. Mnemonic coding of visual space in the monkey's dorsolateral prefrontal cortex. J Neurophysiol 61: 331–349, 1989.
Fuster JM. Executive frontal functions. Exp Brain Res 133: 66–70, 2000.
Gallistel CR, Mark TA, King AP, Latham PE. The rat approximates an ideal detector of changes in rates of reward: implications for the law of effect. J Exp Psychol Anim Behav Process 27: 354–372, 2001.
Gold JI, Shadlen MN. Representation of a perceptual decision in developing oculomotor commands. Nature 404: 390–394, 2000.
Gold JI, Shadlen MN. Neural computations that underlie decisions about sensory stimuli. Trends Cogn Sci 5: 10–16, 2001.
Gold JI, Shadlen MN. The influence of behavioral context on the representation of a perceptual decision in developing oculomotor commands. J Neurosci 23: 632–651, 2003.
Gold JI, Shadlen MN. The neural basis of decision making. Annu Rev Neurosci 30: 535–574, 2007.
Goldman-Rakic PS. Cellular basis of working memory. Neuron 14: 477–485, 1995.
Green DM, Swets JA. Signal Detection Theory and Psychophysics. New York: Wiley, 1966.
Hanks TD, Ditterich J, Shadlen MN. Microstimulation of macaque area LIP affects decision-making in a motion discrimination task. Nat Neurosci 9: 682–689, 2006.
Hays AV, Richmond BJ, Optican LM. Unix-based multiple-process system for real-time data acquisition and control. WESCON Proc Conf 1–10, 1982.
Herrnstein RJ. Relative and absolute strength of response as a function of frequency of reinforcement. J Exp Anal Behav 4: 267–272, 1961.
Herrnstein RJ, Vaughan W. Melioration and behavioral allocation. In: Limits to Action: The Allocation of Individual Behavior, edited by Staddon JE. New York: Academic Press, 1980, p. 143.
Horwitz GD, Newsome WT. Separate signals for target selection and movement specification in the superior colliculus. Science 284: 1158–1161, 1999.
Horwitz GD, Newsome WT. Target selection for saccadic eye movements: prelude activity in the superior colliculus during a direction-discrimination task. J Neurophysiol 86: 2543–2558, 2001.
Kalwani RM, Bloy L, Elliott MA, Gold JI. A method for localizing microelectrode trajectories in the macaque brain using MRI. J Neurosci Methods In press.
Kennerley SW, Walton ME, Behrens TE, Buckley MJ, Rushworth MF. Optimal decision making and the anterior cingulate cortex. Nat Neurosci 9: 940–947, 2006.
Kersten D, Mamassian P, Yuille AL. Object perception as Bayesian inference. Annu Rev Psychol 55: 271–304, 2004.
Kiani R, Hanks TD, Shadlen MN. Bounded integration in parietal cortex underlies decisions even when viewing duration is dictated by the environment. J Neurosci 28: 3017–3029, 2008.
Killeen PR. Mathematical principles of reinforcement. Behav Brain Sci 17: 105–172, 1994.
Kim JN, Shadlen MN. Neural correlates of a decision in the dorsolateral prefrontal cortex of the macaque. Nat Neurosci 2: 176–185, 1999.
Kirby NH. Sequential effects in two-choice reaction time: automatic facilitation or subjective expectancy? J Exp Psychol Hum Percept Perform 2: 567–577, 1976.
Kustov AA, Robinson DL. Shared neural control of attentional shifts and eye movements. Nature 384: 74–77, 1996.
Laming DRJ. Information Theory of Choice Reaction Time. New York: Wiley, 1968.
Laming DRJ. Choice reaction performance following an error. Acta Psychol (Amst) 43: 199–224, 1979.
Lau B, Glimcher PW. Dynamic response-by-response models of matching behavior in rhesus monkeys. J Exp Anal Behav 84: 555–579, 2005.
Law CT, Gold JI. Neural correlates of perceptual learning in a sensory-motor, but not a sensory, cortical area. Nat Neurosci 11: 505–513, 2008.
Lee D, Conroy ML, McGreevy BP, Barraclough DJ. Reinforcement learning and decision making in monkeys during a competitive game. Brain Res Cogn Brain Res 22: 45–58, 2004.
Link SW. The Wave Theory of Difference and Similarity. Hillsdale, NJ: Erlbaum, 1992.
Luce RD. Response Times: Their Role in Inferring Elementary Mental Organization. New York: Oxford Univ. Press, 1986.
Maddox WT. Toward a unified theory of decision criterion learning in perceptual categorization. J Exp Anal Behav 78: 567–595, 2002.
Mays LE, Sparks DL. Saccades are spatially, not retinocentrically, coded. Science 208: 1163–1165, 1980.
Merleau-Ponty M. Phenomenology of Perception. London: Routledge & Kegan Paul, 1962.
Moore T, Armstrong KM, Fallah M. Visuomotor origins of covert spatial attention. Neuron 40: 671–683, 2003.
Nakahara H, Itoh H, Kawagoe R, Takikawa Y, Hikosaka O. Dopamine neurons can represent context-dependent prediction error. Neuron 41: 269–280, 2004.
Newsome WT, Paré EB. A selective impairment of motion perception following lesions of the middle temporal visual area (MT). J Neurosci 8: 2201–2211, 1988.
O'Regan JK, Noe A. A sensorimotor account of vision and visual consciousness. Behav Brain Sci 24: 939–973, 2001.
Pasternak T, Merigan WH. Motion perception following lesions of the superior temporal sulcus in the monkey. Cereb Cortex 4: 247–259, 1994.
Pelli DG. The VideoToolbox software for visual psychophysics: transforming numbers into movies. Spat Vis 10: 437–442, 1997.
Platt ML. Neural correlates of decisions. Curr Opin Neurobiol 12: 141–148, 2002.
Rao RP. An optimal estimation approach to visual perception and learning. Vision Res 39: 1963–1989, 1999.
Ratcliff R, Van Zandt T, McKoon G. Connectionist and diffusion models of reaction time. Psychol Rev 106: 261–300, 1999.
Remington RJ. Analysis of sequential effects in choice reaction times. J Exp Psychol 82: 250–257, 1969.
Rieke F, Warland D, de Ruyter van Steveninck RR, Bialek W. Spikes: Exploring the Neural Code. Cambridge, MA: MIT Press, 1999.
Roitman JD, Shadlen MN. Response of neurons in the lateral intraparietal area during a combined visual discrimination reaction time task. J Neurosci 22: 9475–9489, 2002.
Rolls ET. The functions of the orbitofrontal cortex. Brain Cogn 55: 11–29, 2004.
Salzman CD, Britten KH, Newsome WT. Cortical microstimulation influences perceptual judgements of motion direction. Nature 346: 174–177, 1990.
Salzman CD, Murasugi CM, Britten KH, Newsome WT. Microstimulation in visual area MT: effects on direction discrimination performance. J Neurosci 12: 2331–2355, 1992.
Schlag J, Schlag-Rey M, Dassonville P. Interactions between natural and electrically evoked saccades. II. At what time is eye position sampled as a reference for the localization of a target? Exp Brain Res 76: 548–558, 1989.
Seo H, Lee D. Temporal filtering of reward signals in the dorsal anterior cingulate cortex during a mixed-strategy game. J Neurosci 27: 8366–8377, 2007.
Shadlen MN, Newsome WT. Neural basis of a perceptual decision in the parietal cortex (area LIP) of the rhesus monkey. J Neurophysiol 86: 1916–1936, 2001.
Shidara M, Richmond BJ. Anterior cingulate: single neuronal signals related to degree of reward expectancy. Science 296: 1709–1711, 2002.
Snyder LH, Batista AP, Andersen RA. Coding of intention in the posterior parietal cortex. Nature 386: 167–170, 1997.
Soetens E, Boer LC, Hueting JE. Expectancy or automatic facilitation? Separating sequential effects in two-choice reaction time. J Exp Psychol 11: 598–616, 1985.
Sternberg S. Separate modifiability, mental modules, and the use of pure and composite measures to reveal them. Acta Psychol (Amst) 106: 147–246, 2001.
Sugrue LP, Corrado GS, Newsome WT. Matching behavior and the representation of value in the parietal cortex. Science 304: 1782–1787, 2004.
Sutton RS, Barto AG. Reinforcement Learning: An Introduction. Cambridge, MA: MIT Press, 1998.
Tanji J, Hoshi E. Behavioral planning in the prefrontal cortex. Curr Opin Neurobiol 11: 164–170, 2001.
Tassinari H, Hudson TE, Landy MS. Combining priors and noisy visual cues in a rapid pointing task. J Neurosci 26: 10154–10163, 2006.
Tenenbaum JB, Griffiths TL. Generalization, similarity, and Bayesian inference. Behav Brain Sci 24: 629–640, 2001.
Tremblay L, Schultz W. Relative reward preference in primate orbitofrontal cortex. Nature 398: 704–708, 1999.
Vaughan W. Melioration, matching, and maximization. J Exp Anal Behav 36: 141–149, 1981.
Wichmann FA, Hill NJ. The psychometric function: I. Fitting, sampling, and goodness of fit. Percept Psychophys 63: 1293–1313, 2001.
Williams BA. Reinforcement, choice, and response strength. In: Stevens' Handbook of Experimental Psychology, edited by Atkinson RC, Herrnstein RJ, Lindzey G, Luce RD. New York: Wiley, 1988, p. 167–244.