|
|
||||||||
Laboratory for Computational Motor Control, Department of Biomedical Engineering, Johns Hopkins School of Medicine, Baltimore, Maryland
Submitted 10 March 2004; accepted in final form 19 September 2004
| ABSTRACT |
|---|
|
|
|---|
| INTRODUCTION |
|---|
|
|
|---|
(Georgopoulos et al. 1982
), and a weighted combination provides an estimate of direction. This "internal model" simply produces an identity map,
, but with different weighting functions any other function can be modeled (Pouget et al. 2000
In control of limb movements, it has been suggested that the brain computes a function that approximates inverse dynamics of the limb (
,
,
)
, estimating the forces
that are necessary to achieve a particular desired limb state
,
,
(Conditt and Mussa-Ivaldi 1999
; Shadmehr and Mussa-Ivaldi 1994). Assuming that the approximation of inverse dynamics is linear, one can represent this computation with the expression
= Wg(
,
,
), where g = [g1, g2, ..., gn]T is a vector of bases that encodes state of the limb and W = [p1|p2||pn] associates each basis with a preferred force vector pi. In theory, movement errors produce a change in the preferred force vector of each basis, which in turn produces a performance change in the subsequent movement. If the subsequent movement explores a different part of the state space than the effect of the error on this new part of the state space is a measure of generalization.
Thus, theory predicts that there is a relationship between trial-by-trial change in performance and the shape of the bases (Donchin et al. 2003
; Thoroughman and Shadmehr 2000
). For example, patterns of generalization have been used to show that bases elements are tuned to movement direction (Donchin et al. 2003
); that directional tuning is multiplicatively modulated by a linear function of limb position (Hwang et al. 2003
); and that the encoding of limb position and velocity is in terms of the intrinsic coordinates of muscles and joints (Malfait et al. 2002
; Shadmehr and Moussavi 2000
). These are also properties of some cells in the primary motor cortex (Georgopoulos et al. 1984
) and the cerebellum (Bosco et al. 1996
; Coltz et al. 1999
). However, there has been very little exploration of the representation of cues other than the current state of the limb. For example, if one needs to reach a target while holding a tennis racquet in one instance and a badminton racquet in another instance, a contextual cue must predict different forces for similar states of the limb (Osu et al. 2004
; Wada et al. 2003
). Our ability to interact with various objects in our environment predicts that contextual cues must have a significant influence on the activity fields of the bases. How are "high level" contextual cues and "low level" information about the state of the limb combined in the bases that compute internal models of dynamics? Here we use a system identifi-cation technique to quantify generalization patterns across a multidimensional space that includes both contextual cues and state of the limb.
This work is part of an MS thesis submitted by S. K. Wainscatt to Johns Hopkins Biomedical Engineering Department.
| THEORY |
|---|
|
|
|---|
)
, where s is a contextual cue,
is limb state (its velocity), and
is an estimate of force on the limb. The cue is the ordinal number of a movement within a sequence. We assume that the map is computed with a collection of bases gi, each a scalar function representing the activity of the basis element in different parts of the input space (s,
). Each basis is associated with a preferred force vector pi. Adaptation results in changes in these preferred force vectors. We have
= Wg(s,
), where g = [g1, g2, ..., gm]T and W = [p1, p2, ..., pm].
and pi are 2 x 1 vectors. If in trial n a reaching movement is made, the force error in that trial is
![]() |
(n)
[s(n),
(n)], where
(n) is evaluated at maximum velocity during that trial. Error in trial n results in a change in the preferred force vector associated with each basis. If we define scalar quantity
, minimizing this cost function describes how error in trial n should produce a change in pi
![]() |
![]() |
![]() | (1) |
is a learning rate. Following Thoroughman and Shadmehr (2000)
(n+1)) and note that gT(
(n))g(
(n+1)) is a scalar quantity and arrive at
![]() | (2) |
(n) and trial n + 1 occurred in state
(n+1), the error experienced in trial n produces the following observable change in the system when the system is evaluated at
(n+1)
![]() | (3) |
(n))g(
(n+1)) is a generalization function. This scalar function determines how much of the error experienced in one state affects any other state.
Our discussion assumed that in a given trial (a movement), the internal model computed a single vector of force. This is a gross simplification, and elsewhere we have derived Eq. 3 with the assumption that a trial is a trajectory of forces and states (Donchin et al. 2003
). In that case, the generalization function takes the form of an integral. The result is that generalization from the state in which error was experienced to another state is a correlation between the basis elements as evaluated along the 2 states. When the basis elements are broad, this correlation is high and there is large generalization. When the basis elements are narrow, the correlation quickly drops off as a function of distance between the states.
To give a simplified example, ignore the fact that there may be contextual cues in the task and imagine that our internal model consisted of 6 states, each a direction of movement
. On any given trial, the particular values of pi (i.e., preferred force vector) associated with each gi (i.e., basis function) constitute the motor system's memory of the task. If we evaluate this memory for all states, the result is a field of forces
, as shown in Fig. 1A1. Therefore, at trial n,
(n) is a field (represented as a 12 x 1 vector) that contains a force
(n) (a 2 x 1 vector) for each of the 6 states. The experimenter provides a target and the subject evaluates
(n) along that direction and predicts
(n). We write this as
![]() |
. Because the movement is made to that direction, the experimenter has the robot produce a force f(n). The robot force interacts with force produced by the subject, resulting in hand trajectory. Suppose among the points along this trajectory we pick hand position at peak velocity and label it y(n). If there was a difference between f(n) and
(n), there will be a movement error
![]() |
![]() |
(n) depends on
(n) and the actual forces in that trial f(n). Because errors are generally small (typically < 1.5 cm), we assume that this dependency is linear
![]() |
|
(n) was experienced along state
(n) = 90°, but because of generalization, it affected all states i = 1 · · · 6. Generalization is a scalar function b that depends on the distance between the state in which the error was experienced
(n) and all other states
(i)
![]() |
![]() |
![]() |
![]() |
![]() |
In an ideal adaptive system, transition from trial to trial is dictated by the error in the previous trial and the generalization function. However, other factors may also affect the system. For example, memory may decay from one trial to the next. We can represent this effect with another parameter a
![]() |
(n), resulting in trajectory error
(n).
(n) updates contents of both states by a generalization function b. Because we have 2 states, we assume that b can be represented with 2 parameters: distance between movements of same ordinal number (i.e., 0) and distance between movements of different ordinal number (i.e., 1).
To compute a map that depends on both the direction of movement and the order of that movement in a sequence, the bases must be sensitive to both variables. Therefore, state of a trial is
(n)
[s(n),
(n)]. Distances in "sequence" space are not equal to distances in "direction" space. As a result the generalization function will be parameterized as
![]() | (4) |
In summary, an internal model that learns to compute a function (s,
)
may be thought of as a dynamical system with many hidden states. At any given time, the experimenter can indirectly examine the contents of only one of those states by having the subject make a movement. The result is a movement that may have some error. The error is broadcast to all states. The effect of the broadcast is observed as a generalization function that depends on the tuning of the bases with respect to those states. We represent this process with the following dynamical system
![]() |
![]() | (5) |
(n) is the model's estimate of hand position in trial n. In that trial, the robot produced a force f(n), and we measure the resulting hand position y(n). For that trial we also know the "state" of the task [i.e., L(n)] and K(n). The rest of the variables are unknown and will need to be estimated by fitting the
(n) to the measured data y(n). | HYPOTHESIS |
|---|
|
|
|---|
. We considered 3 ways of encoding this information: a model with separate basis functions for direction and serial order that interacted additively, a model that combined coding of direction and serial order multiplicatively, and a mixture of 2 models resembling a mixture of experts. Here we show the patterns of generalization that each type of encoding predicts.
First consider an additive model that uses separate bases for encoding ordinal and velocity information. A weighted sum of the bases approximates the function of interest
![]() |
(n) with respect to the weights and write an expression like Eq. 3 (i.e., an estimate of forces in the next trial as a function of the errors in the previous trial). If movement n was in state [s(n),
(n)], then generalization from that state to the next movement in the same direction will be b(1, 0) = 
i g1,i(s(n)) g1,i(s(n+1)) + g2,i2(
(n)). We note that the term b(0, 0) b(1, 0) is independent of velocity. Therefore, in this model the differences in the serial order of movements in the sequence will have an additive effect on the generalization function. A second point to note is that with an additive model, it will not be possible to compute a function where movement velocity and serial order are combined nonlinearly in the production of force. We will be presenting our subjects with such a nonlinear function.
To combine both cues in each basis functions, we can have one variable act as a gain to modulate sensitivity of the other variable
![]() |
. Generalization to the same movement number in the sequence and the same movement direction will be:
. Unlike the additive model, in this model the term b(0, 0) b(1, 0) is not independent of velocity. Rather, the change in the contextual cue multiplicatively modulates the generalization function with respect to direction, and so we can call this the multiplicative model.
In a mixture of experts (Ghahramani and Wolpert 1997
), for each possible ordinal position in the sequence there is a separate basis (or expert) that encodes velocity during those movements. A separate model selector chooses which of the experts to use based on the serial order of the current movement. For example, if there are 2 movements in a sequence, we have
![]() |
For example, imagine that each basis has a preferred movement number si and a preferred velocity
i. The coding of movement number may be with a Gaussian
![]() | (6) |
![]() | (7) |
i uniformly in the 2D space (with maximum values at ±0.7 m/s). We assumed that movement n was in direction
(n) = 90° with ordinal value s(n) = 1. We evaluated the bases at peak hand velocity typical for a reaching movement (0.3 m/s), and then computed the generalization function b[|s(i) s(n)|,
(j)
(n)] for all possible movement directions
(j) = 30, 90, ..., 330°, and ordinal values s(i) = 1 and 2.
Figure 1C shows the generalization functions for the additive and multiplicative models. The free parameters in the simulations were c, k,
q, and
s. The values for this simulation were c = 0.06, k = 3.5,
q = 0.15, and
s = 0.8. The solid line is b(0, 
) (i.e., generalization to the same ordinal number but potentially different direction). The dashed line is b(1, 
). In both models, generalization as a function of direction is bimodal, which is a property of the bases (Eq. 7). However, note that in the additive model, when ordinal distance is 1, the generalization function is shifted down by a constant value, whereas in the multiplicative model, the generalization function is scaled. Therefore, in the multiplicative encoding of sequence and direction, a gain modulates the generalization function. This property is not observed in the additive model.
| METHODS |
|---|
|
|
|---|
![]() |
|
The task began with presentation of a green target. This instructed the subject to reach. After completion of the reach, the next target was immediately presented in red. This instructed the subject to wait until the target turned green, and then the movement was performed. The trial was discarded if movement began before the target turned green. Upon completion of each reach, feedback was given to indicate movement success. The target would "explode" if the period of the movement was 0.5 ± 0.05 s. If the movement was either too long or too short, the target turned blue or red, respectively.
In our main groups of subjects, a pattern was imposed on the intertrial interval so that movements were temporally but not spatially organized into pairs: 12, 34, 56, 78, and so on. That is, after completion of an odd-numbered movement, there was a short time delay
1 before the next target turned green. However, after completion of an even-numbered movement, there was a longer time delay
2. The long delay
2 served as a cue that signaled start of the 2-movement sequence. In these groups, the subjects were coached to associate the movement sequence as pairs of movements. The coaching took the form of observing the experimenter perform movements in a null field while she sounded out "movement 1," "movement 2," "movement 1," and so forth. Therefore, the subject had ordinal information about the movements. However, the direction of each movement was randomly chosen and could not be predicted.
Training took place in 2 consecutive days and was divided into target sets. Each set contained 240 trials, divided into 2 subsets of 120 trials. Subjects rested for a minute between the subsets. On Day 1, training began in a null field (robot in a passive mode) with 240 trials, and was then followed by 960 trials in the field. The field training continued on Day 2, where subjects trained in another 960 trials. For any given target, there was a 1/6 probability of a catch trial. Catch trials are movements for which the field is temporarily turned off.
Previous work on a similar task (Karniel and Mussa-Ivaldi 2002
) had found that when
1 =
2 (i.e., no temporal cues were present to provide information about the sequential order of the force fields), subjects could not learn this task (as measured over a few hundred training trials). We hypothesized that if the temporal cues were to be a useful aide in adaptation, their effectiveness might be a function of their disparity: that is, learning may improve as a function of
2
1. To test this, we considered 3 conditions (Fig. 2C). In our control task, termed no-cue group (NC) (n = 10 subjects), the delays were equal (i.e.,
1
2). This group was neither coached nor had temporal cues to identify ordinal structure of the task. Our main group was divided into 2 subgroups. In the short-delay group (SD) (n = 10), we had
1 <
2. In the long-delay group (LD) (n = 10), we had
1 <<
2. The values for the intertrial intervals were as follows:
1 for all 3 groups was 0.5 s;
2 values for the no-cue, short-delay, and long-delay groups were 0.5, 1.0, and 3.0 s, respectively.
Pilot studies indicated that this was a difficult task that required many hundreds of trials. To encourage the participants, in addition to the target explosions we also provided a score that was continuously displayed at the bottom right of the video monitor. The score R was computed as
![]() |
t(n) is movement time recorded for trial n. The score was for the participant's encouragement and entertainment. Our performance measures did not consider it. Performance measures
We focused on the errors that were caused by the force field and the errors that were produced in catch trials. For each movement we computed hand position at maximum hand velocity and found its distance from a straight line to the target. The result was labeled as PD, hand displacement perpendicular to the direction of target (PD). With learning, PD in field trials should decrease, whereas PD in catch trials should increase. Furthermore, the direction of PD in catch trials should be a function of the field appropriate for that trial. For example, for a given target direction, catch trial PDs measured on odd-numbered trials should be in the opposite direction of PDs for even-numbered catch trials. We labeled average PD in odd field trials as
fo, in even field trials as
co, in odd catch trials as
co, and in even catch trials as
co. Because errors in both field and catch trials are important indicators of adaptation, we used a learning index (LI) that considered relative changes in these values (Criscimagna-Hemminger et al. 2003
). We used the following equations to calculate an LI for odd- and even-numbered movements, as well as an overall learning index
![]() |
![]() |
![]() | (8) |
System identification procedure
We fit the system in Eq. 5 to runs of 240 trials of y(n). In our initial analysis, y(n) was generated by averaging the data across subjects for each movement. However, we also fit the model to y(n) generated by each subject. We report the resulting generalization function for both approaches.
For our 2 main groups, where movements consisted of 6 directions and 2 ordinal numbers within the sequence, Eq. 4 produced a generalization function b with 12 parameters. For the control group (NC), where no information about sequencing was available, a 6-parameter generalization function was sufficient but we continued to use the 12-parameter model as a sanity check, expecting that sequence information should be irrelevant for the b in the NC group.
The only observable quantities are y(n) (i.e., hand position at maximum velocity recorded for trial n) and f(n), the force produced by the robot at maximum velocity. For each run of 240 trials, the unknown parameters are: a, b, D,
(1), and Y*, where a is a scalar quantity and represents trial-to-trial memory loss in the system; b is the generalization function and contains 12 unknown parameters; D is the gain of online feedback during each trial and contains 4 unknown parameters; and
(1) is the internal model's field of forces at the very first trial (i.e., initial condition of the dynamical system). Because there are 12 states (6 directions and 2 ordinal numbers),
(1) contains 12 vectors, each a 2 x 1, for a total of 24 parameters. Y* is the reference trajectory for each movement. Because there are 10 different kinds of movements (Fig. 2B), Y* has 10 vectors each representing the reference position (a 2 x 1 vector) for each movement. Therefore, the total number of unknown parameters is 61. However, it is not necessary to estimate
(1). Rather,
(1) for a given set is simply
(240) for the previous set. In case of the first field set,
(240) comes from the previous null set and for the null set
(1) is simply zero. This reduced the number of unknown parameters in the system to 37.
To estimate these parameters, we minimized the sum of squared errors between the model and the data
![]() |
a
![]() |
![]() |
![]() |
![]() |
![]() |
Measuring the goodness of fit
Once system parameters were estimated, Eq. 5 was evaluated by providing it a sequence of inputs f(n). The output of the system was a sequence
(n). Therefore, the model was evaluated in an autonomous mode without access to the measured data y(n). To evaluate goodness of fit we followed a procedure for testing significance of nonlinear models (Glantz and Slinker 2001
). We computed the sum of squares explained by the model over the sequence of trials
![]() |
is a 20 x 1 vector that contains the mean of hand positions (over that set of trials) for each of the 10 types of movement (Fig. 2B). The total sum of square was
![]() |
![]() |
![]() |
The number of parameters in any model has a strong influence on the goodness of fit. As the size increases, the model will fit the data better. To balance this improved fit versus risk of overfitting, we used the Akaike information criterion (AIC) to estimate how the size of the model affected the fit (Akaike 1974
). If there are m unknown parameters and N trials, then
![]() |
| RESULTS |
|---|
|
|
|---|
Psychophysical results
Figure 3 shows representative trajectories in catch and field trials early and late in training from a single subject in each group. As the training in the field began, the robot produced a counterclockwise field on odd trials, and a clockwise field on even trials. Early in training (trials 1120), there were large errors in field trials but little or no errors in catch trials. Near the end of training (trials 1,8001,920), the subject in the NC group showed only a slight improvement on field trials and no consistent errors in catch trials. The lack of errors in catch trials is our most reliable indicator that there were little or no changes in the ability of this subject to predict the forces on a given trial.
|
Figure 4A displays average performance measures for each group. All groups reached to the same sequence of targets and except for the null set in day 1, target sequence was identical in days 1 and 2. Perpendicular displacement (PD) at max velocity was calculated for each trial for each subject and then averaged within the target set and then across subjects. In the NC group, training produced significant changes in the field trials. The average change from set 1 to set 8 for odd trials was +1.3 mm (P = 0.01) and for even trials was +2.9 mm (P < 0.001). Therefore, performance improved in the even field trials but worsened in the odd field trials. By the end of training (set 8), errors in catch trials were not significantly different from zero (odd trials, P = 0.47; even trial, P = 0.06) but the specific sequence of targets in each set appeared to be responsible for some of the variability in the catch trials. To account for this, we compared within-subject change in catch trials from day 1 to day 2 for each target set. We found that although there was a small change in catch trial errors (+0.6 mm for odd trials, P = 0.07; +0.96 mm for even trials, P = 0.06), the direction of change was not opposite of the field trials. Consequently, in the NC group the learning indices for the odd and even movements (LIo and LIe), as well as the total learning index LI, remained near zero.
|
In the LD group, field trials changed from set 1 to set 8 for odd trials by 3.95 mm (P < 0.001) and for even trials by +3.4 mm (P < 0.001). This improvement in performance (errors in set 8 vs. set 1) was somewhat larger than that seen in the SD group, although the change was only marginally significant (odd trials, P = 0.006; even trials, P = 0.05). By set 8, errors in odd catch trials were on average 6.5 mm (significantly different from zero, P < 0.001), and for even catch trials +1.9 mm (significantly different from zero, P = 0.001). With respect to the catch trial in set 1, by set 8 odd catch trials had changed by 4.7 mm (P < 0.001) and even catch trials had changed by +3.3 mm (P < 0.001).
To compare performance between groups, we estimated an average learning index for each subject for each day of training (Fig. 4B), and then performed a 2-way ANOVA (main effects of time and group) and found a significant difference between the groups (P < 0.01). Both SD and LD groups had significantly larger learning indices (LI) than the NC group on days 1 and 2 (all cases P < 0.001). However, although there was a trend for the LD group to have a larger learning index than the SD group, this trend was not significant (day 1, P = 0.20; day 2, P = 0.09).
Goodness of model fits
We averaged hand positions (at maximum velocity) across subjects within a group and arrived at a 240-element series of hand positions y(n) for each set. Figure 5A shows the 2 components of the vector y(n) for the LD group in the last target set (blank lines). We fit Eq. 5 to these trials. Figure 5A displays the model's fit
(n) (gray lines). Among all the sets, this was one of the worst fits (Fig. 5B): the model accounted for 72.9% of the variance in the data [F(37,203) = 14.8, P < 0.0001]. On average the model accounted for 82.4% of variance in the NC group, 75.5% of the variance in the SD group, and 76% of the variance in the LD group.
|
(1) for the test set to be equal to the estimate for the last movement in the previous set. We found that, on average, the r2 values declined by 6.8%. The decline was largest in the LD group (8.7%) and smallest in the NC group (4.3%). In all cases the fit remained highly significant (P < 0.001).
We analyzed the sequence of
(n) produced by the model using the same procedure that we used to compute PD and learning indices in Fig. 4. The plots in Fig. 6 show output of the model in terms of PD in field and catch trials and learning indices. The solid lines are the measured data from subjects (same data as in Fig. 4) and the dotted lines are computed from the model. In field trials, the model tracked the measured data so closely that the dotted and solid lines are exactly on top of each other. In catch trials, the model tracked the data well but appeared to slightly underestimate the size of the catch trials in the SD and LD groups. This produced slightly smaller learning indices for the model versus the measured data.
|
Figure 7A plots the generalization function estimated by the model for each set and each group. In the NC group, the model allowed for the generalization function b to have 12 degrees of freedom (DoF). The result of the fit was a b value that was consistently positive and often tuned to angular distance, but was insensitive to ordinal position of movements, i.e., b(0, 
)
b(1, 
). Average b across the sets is plotted in Fig. 7B. The peak of b is near 0.14, which implies that the internal model's estimate of force
for any given direction of movement changed by 14% of error experienced in that direction. The function is bimodal, which means that error experienced in a given direction affected the opposite direction of movement nearly as much as the same direction of movement. Therefore, in the NC group, error in a given movement produced a consistent adaptation that was reflected as a change in the subsequent movement. However, because the generalization function could not differentiate between the movements in the sequence, the result was trial-by-trial adaptation but no net learning.
|
In the NC group, the insensitivity of the generalization function to movement order suggested that a 12 DoF b was unnecessary. Figure 7B shows the average generalization function when b had only 6 DoF. To quantify the effect of the number of parameters in b with respect to goodness of fit, we used the Akaike information criterion, which indicated that a 12-parameter model was an overfitting of the NC group but not the SD and LD groups. In the NC group, AIC increased from 7.49 to 7.57 when the number of hidden states increased from 6 to 12. This increase indicated that the cost of increasing the number of parameters was not balanced by an improvement in fitting the data. However, in the SD and LD groups, AIC decreased from 7.48 to 7.40 and 7.59 to 7.13, respectively, when the number of hidden states was increased from 6 to 12. This agrees with the intuition that sensitivity to movement order is necessary to account for the performance in the SD and LD groups.
However, the plots in Fig. 7A for the SD and LD groups suggest that that, whereas b is directionally tuned, b(0, 
) is very similar to b(1, 
). The level of similarity between these 2 functions is a measure of the interference between 2 movements of the sequence. Across-set average of the generalization functions is plotted in Fig. 7B. Because b(0, 
) and b(1, 
) are very similar, the brain is generalizing the errors across the movements almost irrespective of their numerical order in the sequence. This explains why the task is so hard to learn.
In the SD and LD groups, the only significant within-group difference between b(0, 
) and b(1, 
) is at 
= 0° (P < 0.01 for both groups). At this angular distance, movement of a given ordinal number had a significantly larger effect on the contents of the internal model for that same movement number than for the subsequent movement. This implies that contextual information about the movement affected the directional tuning of each basis only along its preferred direction.
In our hypothesis, we formulated 3 theoretical scenarios in which a system might learn to compute a function that depends on direction of movement and context. In the multiplicative model and in the mixture of experts, the ordinal number multiplicatively influences the directional tuning of the bases. Both scenarios predict that the difference between b(0, 
) and b(1, 
) should be largest at 
= 0°. This prediction is in agreement with our data. In an additive model, this difference should be equal for all angular differences. Therefore, the data in Fig. 7B clearly reject the additive model. However, the results are not entirely consistent with a multiplicative model. Because b is bimodal, the multiplicative hypothesis predicts a nonzero difference between the generalization functions both at 
= 0° and at 
= 180° (the 2 main peaks of the generalization function). We observed a significant nonzero difference only at 
= 0°.
Figure 7C plots the within-group quantity b(0, 
) b(1, 
) at 
= 0°. The measure is marginally larger in LD versus SD (P = 0.045), but much larger in SD versus NC (P < 0.001). The small difference between the LD and SD groups is consistent with the small performance differences in the 2 groups (Fig. 4). This suggests that the contextual cue in both LD and SD groups increased output of the bases at the preferred direction of movement. However, in the LD group the increase was marginally larger.
When we compare the generalization functions between groups, we do not find that b(1, 
) in the LD or SD groups is smaller than b(0, 
) in the NC group. Therefore, availability of the contextual cue did not result in a reduction of generalization between incompatible movements. This would have been a reasonable way to learn the task. Rather, the contextual cue preferably increased generalization between compatible movements. Therefore, the presence of the contextual cue (in comparison to its absence or irrelevance) appeared to produce increased activity among some of the bases without greatly affecting the remaining bases.
These results were arrived at by fitting the model to the across-subject averaged time series of hand positions. We also fit the model to the time series produced by each subject in each set. The fits produced 8 estimates of b for each subject (one for each set). We averaged these 8 estimates to produce a mean b for each subject and plotted the across-subject distribution (mean ± SE) in Fig. 7D. The NC group again showed no difference between b(0, 
) and b(1, 
), whereas the SD and LD groups showed modulation at 
= 0°. The within-subject modulation b(0, 
) b(1, 
) at 
= 0° is plotted in Fig. 7E. This gain was significantly larger in the LD group versus the SD group (P = 0.01), and significantly larger in the SD group versus the NC group (P < 0.001).
In the NC group, model fits to individual subjects produced an average b that was somewhat flatter (less directionally tuned) than that observed in the fits to the mean of across-subject time series (Fig. 7D vs. B). The reason for this is unclear to us but we note that, in general, our ability to fit data from single subjects was poor. Average r2 values for single-subject fits were 0.44. Although this is still a highly significant fit [F(37,203) = 4.54, P < 0.001], it explains only half of the variance that the model could account for when the times series was an across-subject average.
Gain of online feedback and other model parameters
Other than generalization, there were 4 parameters in the model: a, D, Y*, and
(1). Here we report on the values of these parameters when the model was fit to individual subjects. The parameter a represented trial-to-trial changes in
that could not be accounted for by error in the previous trial. For example, if there was memory decay in the system from trial to trial, then a should be <1. We found that in all groups and all sets, a was extremely close to 1: in the NC group, a = 1.00 ± 0.002 (mean ± SD); in the SD group, a = 0.99 ± 0.011; in the LD group a = 0.99 ± 0.009. Therefore, we could not observe any decay in the state transitions.
The D matrix represented the inverse of the gain of an online feedback control system that corrected errors during a movement. In Fig. 8A, the between-subject averaged D matrix in each set is plotted by multiplying it by a unit vector as it rotated about a circle. The shape of the matrix is very similar to the compliance of the arm (Mussa-Ivaldi et al. 1985
). We examined the main effect of group and set number on the orientation and size of D. Matrix size (determinant) was not different among the groups [F(2,23) = 0.15, P > 0.8]. With training, the determinant tended to become smaller [F(7,23) = 0.44, P = 0.03], which implies that the gain of the online feedback system appeared to increase. The D matrix was oriented at a mean angle of 18.3, 29.2, and 14.5° for the NC, SD, and LD groups, respectively [F(2,23) = 15.4, P < 0.01]. Training did not produce a significant change in this orientation [F(7,23) = 0.09, P > 0.8]. Post hoc analysis of orientations did not find a significant difference between the NC and the LD groups [F(1,15) = 2.16, P = 0.2], although the SD group had a D with an orientation significantly different from that of the other groups.
|
The vector
(1) represents the initial condition of the internal model (i.e., expected force) at the start of the target set (trial 1). Because the internal model has 12 hidden states (6 directions of movement and 2 ordinal numbers), there is a force predicted for each one of these states. We did not fit
(1) explicitly but set it equal to the estimate provided from the final movement in the preceding target set. Figure 8C plots
(1) for set 1 in each movement direction for each group. For each direction there are 2 vectors. The gray and black vectors represent the components of
(1) expected for an odd- and an even-numbered trial, respectively. At the onset of training, the gray and black vectors are generally small and point to the same direction in all groups. By the start of the final set of training (Fig. 8D),
(1) values in the SD and LD groups point nearly perpendicular to each direction of movement. In these 2 groups
(1) is clockwise for even trials and counterclockwise for odd trials, i.e., opposite the force field produced by the robot in these trials (Fig. 2B). By contrast,
(1) shows no obvious pattern for the NC group.
Estimating the effects of model assumptions and simplifications
For the purpose of computing a generalization function, we collapsed the 10 movements that were actually produced into 6 movement directions as if they were performed from a single start location. This assumption produces 2 sources of error in fitting the model to the data. First, limb compliance is a function of hand position (Mussa-Ivaldi et al. 1985
). The assumption of a constant compliance is a violation of this observation. Second, the generalization function may be smaller when the 2 movements have different origins as compared to when they have the same origin. Hwang et al. (2003)
demonstrated that generalization as a function of direction was modulated with a gain that depended on hand position.
To quantify the effects of these sources of error on our model, we performed a set of simulations. We began with a model of the human arm biomechanics with position-dependent stiffness and inertial parameters (Shadmehr and Mussa-Ivaldi 1994
), coupled to the dynamics of our robot arm. We added to this model a mechanism of adaptation by basis functions. The bases encoded both limb position and velocity (in joint coordinates). The encoding of limb velocity was with bimodal Gaussians (Donchin et al. 2003
) that were modulated linearly (as in a gain field) with changes in limb position (Hwang et al 2003
). We then performed 2 simulations. In one simulation, the 10-cm movements were along the diamond pattern, i.e., movements to the same direction could start from different start positions (as in the current experiment). In another simulation, all movements were from the same start location at the center. In both simulations, there were 6 possible directions of movement. The idea was to determine to what extent our assumptions regarding position-invariant generalization and compliance degraded our ability to fit the data.
In the simulation, 6 sets of 192 movements were performed in a field that randomly changed from null, to a clockwise curl, to a counterclockwise curl field, producing a performance comparable to that of the no-cue group. In the first 3 sets, all movements began from the center. In the remaining 3 sets, movements had origins that fell on a diamond pattern, identical to our experiment. Values of r2 for the fits declined from 0.843 to 0.825. That is, the model could account for about 2% less of the variance in the data in the diamond pattern versus in the common-origin pattern. Therefore, the 2 assumptions clearly influenced the fit in a negative way. However, the effect did not appear to be very large. Indeed, this level of fit is quite consistent with the model's performance on actual data (Fig. 5B).
| DISCUSSION |
|---|
|
|
|---|
Effect of withholding the contextual cue
We observed little or no learning of the force fields in the no-cue (NC) group where no temporal cues were available to identify an odd or even movement. The results show that the repeating patterns of error and force are not sufficient for the brain to learn the sequence-dependent field. An explicit cue may be necessary. This lack of adaptation without an explicit cue confirms earlier observations (Karniel and Mussa-Ivaldi 2002
, 2003
). However, one of our key findings is that, although the errors did not converge in this group of subjects, the generalization function was consistently nonzero, implying that subjects continued to respond to the errors in a given trial by altering their motor commands in the subsequent trial. Thus trial-to-trial adaptation continued to take place despite the fact that errors did not converge.
This result, and similar results from Scheidt and colleagues (2001)
, may have implications for brain imaging paradigms where a random trial is offered as a condition for which no learning is thought to occur. The state-space model suggests that despite behavioral measures that show no consistent improvements in performance, the brain continues to adapt to errors much the same way as in a nonrandom task.
Coding of a contextual cue
With the availability of serial order information, subjects learned to produce motor commands that were sensitive to both movement direction and serial order. A number of recent studies have found that certain cues can be effective in training subjects to produce different motor commands for the same direction of movement (Krouchev and Kalaska 2003
; Osu et al., 2004
; Wada et al. 2003
). How might the contextual cue and movement direction be encoded by the neural system that learns to represent the internal model? If neural activity fields acted as bases with which the map (s,
)
was approximated (where s is a contextual cue and
is target direction), then encoding will be related to generalization; thus our objective was to quantify how the information about serial order affected patterns of generalization.
We found that the generalization function was largest at directional distance of 0° but had a smaller second peak at 180°. This implies that the bases are directionally tuned but the tuning is possibly bimodal. This kind of tuning is consistent with activity of some Purkinje cells in the cerebellum (Coltz et al. 1999
). However, to learn the task, the bases must also be sensitive to the ordinal number of the movement. We found that movement order modulated generalization at 0° but not significantly at other angular distances. To account for this effect, we began with the assumption that the information about movement order was coded as a Gaussian with a center located at a preferred movement number within the sequence. The choice of a Gaussian arose from the observations of Georgopoulos and colleagues regarding the discharge of M1 cells in a task where the animal kept track of potential reach targets as they appeared in a sequence (Carpenter et al. 1999
). They found that during the delay period, cells preferred a specific target number within a sequence independent of target location; that is, the neuronal responses were phasic and specific to the preferred target number. The choice of a Gaussian is also attributed to observations of Clower and Alexander (1998)
, where neurons in SMA discharged maximally for a particular target when it appeared at a particular ordinal number within a sequence.
How might this coding of movement number be combined with directional tuning? Using a model, we contrasted generalization of bases that incorporated the effect of movement order and directional tuning with either an additive or a multiplicative interaction. The generalization patterns in our subjects were clearly inconsistent with the additive model. The multiplicative model predicted that the largest modulation should occur at 0°, which agreed with our data. The multiplicative model also predicted that there should be a smaller but significant modulation at the second peak of the generalization function (i.e., at 180°). We did not observe this in the data. The discrepancy between prediction and observation may be explained by a number of factors; one is the possibility that the bimodality of the generalization function is not an inherent property of a single basis. Rather, the bimodal generalization may reflect the contribution of 2 distinct bases, only one of which is modulated by contextual cues. For example, in the computation of an internal model, the state of the arm may be encoded in both visual and proprioceptive coordinates with separate bases (Hwang et al. Soc Neurosci Abstr 822.17, 2003). Multiplicative interaction with one but not the other would produce a generalization pattern similar to the data presented here.
A key characteristic of multiplicative interaction is a change in the depth of tuning. For example, in the posterior parietal cortex there is preliminary evidence that cells that encode a reach plan (i.e., target location with respect to fixation) have a receptive field that is modulated multiplicatively with reward expectancy (Corneil et al. 2003
). It is possible that multiplicative modulation of activity fields is a common mechanism by which information from disparate systems is combined by neurons that take part in computing a transformation (Pouget and Sejnowski 1997
). A recent model has found that simulated neurons that multiplicatively code contextual and noncontextual information can allow a single network to switch behavior to respond to changing environmental demands (Salinas 2004
). This kind of coding of diverse sources of information is a plausible method with which a single network of neurons can effectively store multiple, context-specific internal models.
In contrast to using multiplicative coding within a single network, an alternate approach is to assume existence of multiple modules, called a MOSAIC, which is a collection of competing internal models (Wolpert and Kawato 1998
). In MOSAIC, internal models are controllers that receive the goal (or desired trajectory) of the movement. Each controller predicts a different motor command, and a switching mechanism assigns likelihood of success to each potential command. Contextual cues are priors that influence the probability of selection of these weights. MOSAIC predicts that neurons that form each internal model are themselves not modulated by the contextual cue. We found that a simple formulation of this mixture of experts produces the same patterns of generalization found with multiplicative encoding of a single network; thus generalization patterns appear to lack the power to differentiate between these 2 different architectures of adaptation.
Salience of the contextual cue
Our conjecture that ordinal number multiplicatively modulates directional tuning runs into trouble when we consider that in the long-delay (LD) group, performance was slightly better than that in the short-delay (SD) group. If one is aware of a particular movement's ordinal number within a sequence, why should the temporal delay between the 2 sequences affect learning? One explanation is that in the short-delay group, subjects were simply more fatigued because of the closer proximity of the trials. However, if fatigue has a major role in this task, then in all groups the generalization function should gradually decline during the day of training. The data do not agree with the fatigue hypothesis.
We do not know whether the SD and LD groups could equally identify the 2 movements in the sequence; we quizzed them verbally only during the null field training trials. Our verbal measure suggested equal cognitive awareness, although this was not documented. However, the generalization function in the LD group was significantly more modulated by the contextual cue than in the SD group.
Implicit in the structure of a sequence is its beginning and its end. In our task, the longer delay interval between the sequences was the cue that one sequence had ended and another was about to begin. Fujii and Graybiel (2003)
found that in the prefrontal cortex of monkeys, neurons had a phasic peak of activity that signaled the beginning and end of a sequence of movements. For example, the burst peaked about 250 ms after the last movement in the sequence (in this case, a saccade), lasted about 500 ms, and was independent of sequence length, pace, or reward condition, even when the sequence was only 2 movements long. Separate bursts signaled beginning and completion of a sequence.
Imagine that such signals are responsible for starting and resetting a counter that indicates ordinal number of a movement within a sequence. It is possible that in the SD group, where the delay between termination of one sequence and start of the next was 1.0 s; neuronal activities that signal the 2 events interfered temporally, causing an occasional misestimation of movement number within the sequence. There was less chance for this temporal overlap when the delay was 3.0 s (LD group). If we measure the degree of separation of 2 movements in the sequence by the amount of modulation of the generalization function, then in the LD group the 2 movements were functionally better separated. In the LD group, errors in one movement had a smaller interference on the subsequent movement. Given the time course of the prefrontal bursts that identified start and end of a sequence (Fujii and Graybiel 2003
), the hypothesis predicts that there should be no further gains in performance if the sequence is separated by longer intervals.
Limitations of the theory
The theory that we developed here is a step toward the goal of representing learning and memory as the accumulation of small trial-to-trial changes. The data that we analyzed were limited to approximately the first 250 ms of movement, a period when motor commands are thought to be dominated by a feed-forward internal model that relies little on online feedback (i.e., from the current movement). However, our earlier work (Bhushan and Shadmehr 1999
; Wang et al. 2001
) and those of our colleagues (Burdet et al. 2001
; Franklin et al. 2003
) had found that adaptation in this task involved changes in both the feed-forward inverse model and a feedback-dependent forward model of dynamics. Indeed, we did observe a small but significant increase in the gain of the online feedback system with the progression of training. These changes are probably better measured in later periods of a trajectory, but require random perturbation to the limb. Identification of online mechanisms of control remains an unexplored area for our theory.
The formulation of the theory was in terms of a deterministic system and did not consider noise. In particular, the model assumed that estimates of force were one and the same as motor commands to the limb. If signal-dependent noise (Harris and Wolpert 1998
) is an important contributor to generation of motor commands in the range of forces that are typical for reaching, then estimate of force and its production are not equal. To account for this, we would need to introduce a noise term in the output equation. The fact that this term is missing from our formulation may explain why we could fit the across-subject averaged time series almost twice as well as the within-subject time series. Averaging across subjects reduces the variance of the output noise.
Our measure of distance between 2 ordinal numbers was symmetric; that is, the effect of movement 1 on movement 1 was assumed to be the same as the effect of movement 2 on movement 2 (the 2 states were the same). The effect of movement 1 on movement 2 was assumed the same as the effect of movement 2 on movement 1 (the 2 states were different). This resulted in a generalization function that measured distance in sequence space as being either 0 or 1. The rationale for this was to keep the number of unknown parameters in the generalization function to a minimum. In a longer sequence, "same" and "different" will be insufficient measures of distance between ordinal numbers.
Both adaptation and learning in this task involve more than what we have modeled. However, the behavior that we observed is largely accounted for by a theory where movements are attributed to a population coding where cues regarding state of the limb and context are transformed into a forcelike motor command. We predict that this transformation is computed with neurons that are directionally tuned, and the cue associated with movement order weakly modulates this tuning at the preferred direction. This hints at a general solution to the question of how the brain learns multiple internal models. During adaptation, information from disparate sources may be combined multiplicatively in every neuron. Attention, reward, or other cognitive factors serve to highlight specific channels of information, increasing the gain with which the particular information modulates directional tuning.
| GRANTS |
|---|
|
|
|---|
| ACKNOWLEDGMENTS |
|---|
|
|
|---|
Present address of O. Donchin: Department of Biomedical Engineering, Ben-Gurion University of the Negev, Be'er Sheva, Israel.
| FOOTNOTES |
|---|
Address for reprint requests and other correspondence: R. Shadmehr, Johns Hopkins School of Medicine, 419 Traylor Bldg., 720 Rutland Ave., Baltimore, MD 21205 (E-mail: reza{at}bme.jhu.edu)
| REFERENCES |
|---|
|
|
|---|
Bhushan N and Shadmehr R. Computational architecture of human adaptive control during learning of reaching movements in force fields. Biol Cybern 81: 3960, 1999.[CrossRef][Web of Science][Medline]
Bosco G, Rankin A, and Poppele RE. Representation of passive hindlimb postures in cat spinocerebellar activity. J Neurophysiol 76: 715726, 1996.
Burdet E, Osu R, Franklin DW, Milner TE, and Kawato M. The central nervous system stabilizes unstable dynamics by learning optimal impedance. Nature 414: 446449, 2001.[CrossRef][Medline]
Carpenter AF, Georgopoulos AP, and Pellizzer G. Motor cortical encoding of serial order in a context-recall task. Science 283: 17521757, 1999.
Clower WT and Alexander GE. Movement sequence-related activity reflecting numerical order of components in supplementary and presupplementary motor areas. J Neurophysiol 80: 15621566, 1998.
Coltz JD, Johnson MTV, and Ebner TJ. Cerebellar Purkinje cell simple spike discharge encodes movement velocity in primates during visuomotor armtracking. J Neurosci 19: 17821803, 1999.
Conditt MA and Mussa-Ivaldi FA. Central representation of time during motor learning. Proc Natl Acad Sci USA 96: 1162511630, 1999.
Corneil BD, Musallam S, and Andersen RA. Representation of reward expectancy in the medial bank of the intraparietal sulcus: implications for neural prosthetics. Soc Neurosci Abstr 607.8, 2003.
Criscimagna-Hemminger SE, Donchin O, Gazzaniga MS, and Shadmehr R. Learned dynamics of reaching movements generalize from dominant to nondominant arm. J Neurophysiol 89: 168176, 2003.
Donchin O, Francis JT, and Shadmehr R. Quantifying generalization from trial-by-trial behavior of adaptive systems that learn with basis functions: theory and experiments in human motor control. J Neurosci 23: 90329045, 2003.
Franklin DW, Osu R, Burdet E, Kawato M, and Milner TE. Adaptation to stable and unstable dynamics achieved by combined impedance control and inverse dynamics model. J Neurophysiol 90: 32703282, 2003.
Fujii N and Graybiel AM. Representation of action sequence boundaries by macaque prefrontal cortical neurons. Science 301: 12461249, 2003.
Georgopoulos AP, Caminiti, R, and Kalaska JF. Static spatial effects in motor cortex and area 5: quantitative relations in a two-dimensional space. Exp Brain Res 54: 446454, 1984.[Web of Science][Medline]
Georgopoulos AP, Kalaska JF, Caminiti R, and Massey JT. On the relations between the direction of two-dimensional arm movements and cell discharge in primate motor cortex. J Neurosci 2: 15271537, 1982.[Abstract]
Georgopoulos AP, Schwartz AB, and Kettner RE. Neural population coding of movement direction. Science 233: 14161419, 1986.
Ghahramani Z and Wolpert DM. Modular decomposition in visuomotor learning. Nature 386: 392395, 1997.[CrossRef][Medline]
Glantz S and Slinker BK. Primer of Applied Regression and Analysis of Variance. New York: McGraw-Hill, 2001.
Harris CM and Wolpert DM. Signal-dependent noise determines motor planning. Nature 394: 780784, 1998.[CrossRef][Medline]
Hwang EJ, Donchin O, Smith MA, and Shadmehr R. A gain-field encoding of limb position and velocity in the internal model of arm dynamics. PLoS Biol 1: 209220, 2003.[CrossRef][Web of Science]
Karniel A and Mussa-Ivaldi FA. Does the motor control system use multiple models and context switching to cope with a variable environment? Exp Brain Res 143: 520524, 2002.[CrossRef][Web of Science][Medline]
Karniel A and Mussa-Ivaldi FA. Sequence, time, or state representation: how does the motor control system adapt to variable environments? Biol Cybern 89: 1021, 2003.[Web of Science][Medline]
Krouchev NI and Kalaska JF. Context-dependent anticipation of different task dynamics: rapid recall of appropriate motor skills using visual cues. J Neurophysiol 89: 11651175, 2003.
Malfait N, Shiller DM, and Ostry DJ. Transfer of motor learning across arm configurations. J Neurosci 22: 96569660, 2002.
Mussa-Ivaldi FA, Hogan N, and Bizzi E. Neural, mechanical and geometric factors subserving arm posture in humans. J Neurosci 5: 27322743, 1985.[Abstract]
Osu R, Hirai S, Yoshioka T, and Kawato M. Random presentation enables subjects to adapt to two opposing forces on the hand. Nat Neurosci 7: 111112, 2004.[CrossRef][Web of Science][Medline]
Poggio T, Fahle M, and Edelman S. Fast perceptual learning in visual hyperacuity. Science 256: 10181021, 1992.
Pouget A, Dayan P, and Zemel R. Information processing with population codes. Nat Rev Neurosci 1: 125132, 2000.[CrossRef][Web of Science][Medline]
Pouget A and Sejnowski TJ. Spatial transformations in the parietal cortex using basis functions. J Cogn Neurosci 9: 222237, 1997.[Web of Science]
Salinas E. Fast remapping of sensory stimuli onto motor actions on the basis of contextual modulation. J Neurosci 24: 11131118, 2004.
Scheidt RA, Dingwell JB, and Mussa-Ivaldi FA. Learning to move amid uncertainty. J Neurophysiol 86: 971985, 2001.
Shadmehr R and Moussavi ZMK. Spatial generalization from learning dynamics of reaching movements. J Neurosci 20: 78077815, 2000.
Shadmehr R and Mussa-Ivaldi FA. Adaptive representation of dynamics during learning of a motor task. J Neurosci 14: 32083224, 1994.[Abstract]
Thoroughman KA and Shadmehr R. Learning of action through adaptive combination of motor primitives. Nature 407: 742747, 2000.[CrossRef][Web of Science][Medline]
Wada Y, Kawabata Y, Kotosaka S, Yamamoto K, Kitazawa S, and Kawato M. Acquisition and contextual switching of multiple models for different viscous force fields. Neurosci Res 46: 319331, 2003.[CrossRef][Web of Science][Medline]
Wang T, Dordevic GS, and Shadmehr R. Learning the dynamics of reaching movements results in the modification of arm impedance and long-latency perturbation responses. Biol Cybern 85: 437448, 2001.[CrossRef][Web of Science][Medline]
Wolpert DM and Kawato M. Multiple paired forward and inverse models for motor control. Neural Networks 11: 13171329, 1998.[CrossRef][Web of Science][Medline]
This article has been cited by other articles:
![]() |
E. Salinas Rank-Order-Selective Neurons Form a Temporal Basis Set for the Generation of Motor Sequences J. Neurosci., April 8, 2009; 29(14): 4369 - 4380. [Abstract] [Full Text] [PDF] |
||||
![]() |
J. Izawa, T. Rane, O. Donchin, and R. Shadmehr Motor Adaptation as a Process of Reoptimization J. Neurosci., March 12, 2008; 28(11): 2883 - 2891. [Abstract] [Full Text] [PDF] |
||||
![]() |
S. Tremblay, G. Houle, and D. J. Ostry Specificity of Speech Motor Learning J. Neurosci., March 5, 2008; 28(10): 2426 - 2434. [Abstract] [Full Text] [PDF] |
||||
![]() |
J. Cotti, A. Guillaume, N. Alahyane, D. Pelisson, and J.-L. Vercher Adaptation of Voluntary Saccades, But Not of Reactive Saccades, Transfers to Hand Pointing Movements J Neurophysiol, August 1, 2007; 98(2): 602 - 612. [Abstract] [Full Text] [PDF] |
||||
![]() |
V. S. Huang and R. Shadmehr Evolution of Motor Memory During the Seconds After Observation of Motor Error J Neurophysiol, June 1, 2007; 97(6): 3976 - 3985. [Abstract] [Full Text] [PDF] |
||||
![]() |
J. Diedrichsen, Y. Hashambhoy, T. Rane, and R. Shadmehr Neural Correlates of Reach Errors J. Neurosci., October 26, 2005; 25(43): 9919 - 9931. [Abstract] [Full Text] [PDF] |
||||
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| HOME | HELP | FEEDBACK | SUBSCRIPTIONS | ARCHIVE | SEARCH | TABLE OF CONTENTS |
| Visit Other APS Journals Online |