The relationship between neural activity in motor cortex and movement is highly debated. Although many studies have examined the spatial tuning (e.g., for direction) of cortical responses, less attention has been paid to the temporal properties of individual neuron responses. We developed a novel task, employing two instructed speeds, that allows meaningful averaging of neural responses across reaches with nearly identical velocity profiles. Doing so preserves fine temporal structure and reveals considerable complexity and heterogeneity of response patterns in primary motor and premotor cortex. Tuning for direction was prominent, but the preferred direction was frequently inconstant with respect to time, instructed-speed, and/or reach distance. Response patterns were often temporally complex and multiphasic, and varied with direction and instructed speed in idiosyncratic ways. A wide variety of patterns was observed, and it was not uncommon for a neuron to exhibit a pattern shared by no other neuron in our dataset. Response patterns of individual neurons rarely, if ever, matched those of individual muscles. Indeed, the set of recorded responses spanned a much higher dimensional space than would be expected for a model in which neural responses relate to a moderate number of factors—dynamic, kinematic, or otherwise. Complex responses may provide a basis-set representing many parameters. Alternately, it may be necessary to discard the notion that responses exist to “represent” movement parameters. It has been argued that complex and heterogeneous responses are expected of a recurrent network that produces temporally patterned outputs, and the present results would seem to support this view.
A central question regarding the neural control of movement has been how neural responses relate to movement parameters. On the one hand, it is not clear that there need be any straightforward relationship. Even for a network that directly generates movement, individual neuron responses may bear no straightforward relationship to any movement parameter, even those they “control” (Fetz 1992; Robinson 1992; Todorov 2000). On the other hand, a large body of work has stressed the simple relationship between the measured responses of motor cortex neurons and kinematic parameters such as reach direction and speed. In this view, neural responses “code” movement parameters and therefore do have a straightforward relationship with them. Support comes from studies using a center-outreaching task, during which cortical responses typically show cosine tuning for reach direction (Georgopoulos et al. 1982; Schwartz et al. 1988), perhaps implying that movement direction is coded (although see Mussa-Ivaldi 1988; Sanger 1994; Zhang and Sejnowski 1999). Perhaps surprisingly, most such studies have focused on the spatial, rather than temporal, structure of responses. For example, many studies have examined the spatial reference frame (hand-centered, joint-centered, muscle-centered) of direction tuning (Ajemian et al. 2000; Caminiti et al. 1990; Kakei et al. 1999; Scott and Kalaska 1997; Wu and Hatsopoulos 2006). Temporal structure has been addressed less directly, by constructing time-varying population vectors (Georgopoulos et al. 1988; Schwartz 1993, 1994) and/or by regressing neural activity against time-varying movement parameters (Ashe and Georgopoulos 1994; Fu et al. 1995; see Paninski et al. 2004b for a nonlinear approach). Population vector–based decode methods can be remarkably successful (Moran and Schwartz 1999a; Schwartz and Moran 1999), but combine potentially heterogeneous responses, and do little to elucidate the temporal profiles of single neuron activity. Regression coefficients also yield a very indirect view of temporal structure, and results depend on including the right movement parameters and on their degree of correlation with one another (Stark et al. 2006; Todorov 2000).
The controversy surrounding this topic (Scott 2000a,b) stresses the necessity of viewing, as directly as possible, temporal patterns of activity at the level of single neurons. This has been done in the context of the center-out reaching task (Moran and Schwartz 1999b), but rarely extensively. An exception is Sergio et al. (2005), in which multiphasic patterns of single-neuron activity were observed when a weighted manipulandum was used. Under these conditions, the population vector diverged sharply from the measured kinematics, in contrast to prior studies of unencumbered reaching movements. A key question, which we address here, is whether similar multiphasic responses are displayed when movements are executed in the absence of imposed loads.
Observing temporal structure requires satisfying some methodological constraints. First, one wishes to obtain reasonably many nearly identical repeats of the same movement. Averaging responses across differing movements can produce an unrepresentative mean, and a paucity of trials will ensure insufficient power to resolve temporal detail. Second, it is desirable that movement kinematics be dissociated (in the temporal domain) from the patterns of muscle activity. One can then ask which of the two more closely resembles neural responses. Third, it is desirable to have a task in which movements with different time-courses are executed to the same endpoint/goal. One can then compare how neural responses vary with movement time-course. Fourth, averages across neurons can be quite unrepresentative, such that it is essential to view large quantities of data at the level of single-neuron responses.
To these ends, we recorded from M1 and dorsal premotor cortex (PMd) of monkeys heavily trained on a task that enforces peak reach speed (Churchland et al. 2006a,b). Repeated reaches of the same type had very similar hand paths, durations, and velocity profiles. There was also a clear dissociation between kinematics and time-varying muscle activity. Finally, we could evoke reaches to the same target but with different time-courses. Moderately large numbers of trials per neuron were collected (mean of 426, up to 1,027), so that average responses would be meaningful at the level of single neurons. Inspection of such responses reveals that neurons in M1 and PMd frequently display temporally complex patterns of activity during movement. Responses were also surprisingly heterogeneous: often the pattern displayed by a given neuron was observed for no other neuron in the dataset. As a result, neural responses spanned a much larger space than would be expected if they represented a modest number of movement parameters. Thus the complexity observed by Sergio et al. (2005) is not unique to the use of a weighted manipulandum. Rather, those features are even more striking for rapid unencumbered reaches.
Animal protocols were approved by the Stanford University Institutional Animal Care and Use Committee. Our basic methods and the details of this task have been described previously (Churchland et al. 2006b,c). Briefly, two adult male monkeys (Macaca mulatta; ∼10 kg) sat in a customized chair with a head restraint and performed a reaching task on a fronto-parallel screen. Figure 1, A and B, shows the task structure and typical behavior. Trials began with the appearance of a 12-mm-diam central spot. After this spot was touched and held for 400–500 ms (randomized), the target appeared. During the subsequent delay-period, the target jittered randomly (2 mm SD). Trials were aborted if the hand moved during this time. At the end of the delay period, target jitter ceased and the central spot disappeared, providing the go cue. At this point, the monkey was required to reach to the target. Reaches were rewarded if they were accurate, with reaction times between 150 and 500 ms. A juice reward was delivered after the target was held for 300 ms, with the next trial beginning a few hundred milliseconds later. Delay period durations varied randomly from 400 to 800 ms, and most experiments also included occasional short-delay catch trials. These details were important when studying delay-period responses (Churchland et al. 2006c) but are largely irrelevant to this study, in which only data after the go cue are analyzed. Monkeys were trained to reach at different speeds depending on the color of the target. Briefly, green targets instructed slow reaches, whereas red targets instructed fast reaches. Success was based on peak speed falling within a window, which depended on target color and distance.
Trial types and datasets
All trials were presented using a randomized-block design, with any failed trials re-presented at a random time within the current block before proceeding to the next block. We collected two types of datasets. The direction series used two target distances (7 and 12 cm for monkey A; 5 and 12 cm for monkey B) and seven target directions (5, 50, 95, 140, 185, 230, and 320° for monkey A; 10, 55, 100, 145, 190, 235, and 325° for monkey B). Seven (rather than 8) directions were used because there was always one target location that the monkey could not see through his arm. The entire target array was rotated so that the missing (roughly downward) direction was aligned with the position obscured by the arm. The distance series (monkey B only) used five distances (3, 4.2, 6, 8.5, and 12 cm) in the preferred and antipreferred direction of each neuron, estimated using an initial cursory direction series. If the antipreferred direction was near the location obscured by the arm, a compromise was made, and the antipreferred target was moved slightly to one side or the other. Some neurons did not have a clear preferred direction, and we simply chose an axis that produced modulation.
Neural and EMG recordings
Neural recordings were made using single electrodes as described previously (Churchland et al. 2006b,c). All recordings are of single units (manually isolated during the experiment) that were judged, based on stored waveforms and interspike intervals, to have no significant contamination from nearby units. Recordings were made from M1 and caudal PMd. Surface locations of all penetrations are shown in Fig. 1C. Note, however, that this figure does not adequately represent the fact that a number of recordings that began on the precentral gyrus progressed deep into the gray matter of the central sulcus below.
EMG activity was recorded using acutely placed hookwire electrodes during separate dedicated sessions as described in Churchland et al. (2006c). Recordings were made from the deltoid, biceps brachii, triceps brachii, trapezius, latissimus dorsi, pectoralis, and brachioradialis. Multiple recordings were often made from different sites in the same muscle, for a total of 17 recordings across both monkeys. When averaging across trials, EMG traces were differentiated to remove any baseline, rectified, and smoothed (25 ms SD Gaussian). The median (across repeated reaches of the same type) was taken as a function of time. The median was typically very similar to the mean but was less influenced by outliers.
Criteria for inclusion of neural data
We recorded responses from 189 neurons: 138 for the direction series task and 51 for the distance series task. Essentially all recorded neurons exhibited responses either during the delay-period or around the time of the movement. We wished to concentrate our analysis on activity around the time of the movement and to exclude neurons whose principal response was during the delay. To do so, we created two indices of neural response: a delay index and a peri-movement index. The delay index was equal to the SD, across target conditions, of the mean delay-period firing rate. For the peri-movement index, we also computed the SD of firing rates across target conditions, but as a function of time from 100 ms before movement onset until 300 ms after (this was done after filtering). The peri-movement index was the maximum value across time. To be included in our analysis, we insisted that 1) the peri-movement index be at least twice the delay index and 2) that peri-movement activity be reliably task-related (ANOVA, P < 0.0001 for a main effect of target condition) at the time the peri-movement index was measured (i.e., at the time when firing rates varied most across conditions). A total of 137/189 neurons satisfied these criteria and were included in our analysis. These criteria do not insist on any particular form of tuning. For example, a neuron with bimodal direction tuning would be included in our analysis even though it would not have satisfied criteria that insisted on cosine tuning. The above criteria were designed in part to exclude neurons that exhibited primarily delay-period activity, with little additional peri-movement activity. Such neurons often showed strong changes in firing rate around the time of the movement, but only because firing rates were returning to baseline levels. We did not wish such neurons to be classified as having peri-movement responses. Such neurons are successfully excluded by the first criterion, because the peri-movement index typically did not exceed the delay index. In principal, our criteria would also exclude neurons with temporally modulated responses but no tuning for target direction, distance, or speed. However, no convincing examples of such responses were present in our dataset.
Roughly defining cylinder zero (approximately the middle of the precentral dimple) as the PMd/M1 border, 55% (45%) of analyzed neurons were recorded from M1 (PMd). Note that, although the total number of recorded neurons (189) is somewhat smaller than is typical for studies of M1/PMd, the total quantity of data is not. Rather than recording from as many neurons as possible, we chose to instead characterize the activity of each neuron as completely as possible, using a moderately large number of conditions and trials per condition. The mean number of trials per neuron was 426. For comparison, Moran and Schwartz (1999b) recorded approximately the same number of trials, but from more neurons (1,066) with fewer trials per neuron (40).
Trials were included in the analysis if the target was hit accurately and held until the time of reward (∼98% of saved trials). Trials aborted because the hand moved during the delay (or never moved) were not saved, but these comprised at most a few percent of all trials. Failures with regard to peak reach speed were more common (6–8%) but usually involved peak reach speed being only slightly too fast or too slow. We saw no reason to exclude such trials from analysis, as they formed a continuum with normal behavior and presumably did not reflect a lack of effort on the part of the monkey, but rather the challenging nature of the task. In principle, not rejecting such trials might slightly reduce the difference in response between the two instructed speeds. In practice, this seems not to be a concern: large differences were typical.
Plots of mean firing rate as a function of time were made by convolving spike trains with a Gaussian filter (25 ms SD) and averaging across trials (Fig. 1D). Means were computed after aligning individual trial data to the onset of movement, defined as the time when hand velocity reached 15% of its peak. Mean hand velocity, measured in the direction of the target, was similarly computed with single-trial data aligned to the onset of movement.
Some analyses and presentations were restricted to neurons whose responses had the highest signal-to-noise ratios (SNRs). For each neuron, we measured the maximum and minimum firing rates, FRmax and FRmin, each taken across both time and conditions. We also computed SEmax, the maximum (across time and conditions) standard error of the mean. The SNR was defined as (FRmax –FRmin)/(SEmax + ε), where ε is a small value (3 spikes/s) that prevents division by zero and to some degree biases the SNR in favor of neurons with high firing rates. As would be expected, neurons with high SNRs tended to be those with high firing rates, high trial counts, and strong tuning across time and conditions. A number of other reasonable definitions are possible; our intent was simply to use a sensible objective measure.
Computing the preferred direction
The preferred direction (PD) of each neuron was computed as a function of time. For each time, the mean firing rate was plotted against target direction (7 data points). Data were fit with a cosine (free parameters were phase, amplitude, and DC offset) whose peak was taken as the PD. PDs were computed separately for the two target distances and instructed speeds. Note that, for a cosine fit, there is no bias created by the lack of the eighth (downward) direction, which was omitted because the monkey's outstretched arm obscured that target location. A bootstrap procedure was used to compute the sampling distribution: the expected PD distribution, were it remeasured, given the observed measurement variability. For each target direction, we resampled (with replacement) the original distribution of firing rates and computed a new PD. This process was repeated (100 or 1,000 times depending on the analysis), and 95% CIs were computed from the resulting distribution.
Linear regression and kinematic models
The responses of individual neurons were regressed against two linear models. The velocity-tuned model assumes that firing rates are sensitive to reach velocity and speed (1) where x is hand position, bo is the baseline firing rate, b1 and b2 capture the horizontal and vertical sensitivities to hand velocity (and together determine the preferred direction), and b3 captures any nondirectional sensitivity to speed. The complex model assumes that firing rates are also sensitive to hand position and acceleration (2)
It is frequently assumed that neurons can have variable leads, or even lags, relative to the parameters they represent. Thus for each neuron we tested a range of Δt, from 200 (firing rate leading kinematics) to −100 ms (firing rate lagging kinematics), in increments of 10 ms, and took the maximum R2. We also constructed versions of these models where the firing rate was not allowed to drop below zero. In these cases, fits to the neural data were found by numerical optimization, which was initialized using the parameters obtained by the regression.
We computed an approximate correction for the R2 values to account for the fact that some of the data variance is caused by sampling noise and is not expected to be accounted for by the model. This was done only for the complex model. For each neuron, we used the model fits as the underlying firing rate and generated simulated trials in which spikes were generated according to a Poisson process. We generated the same number of simulated trials as we had recorded real trials for that neuron and computed the mean simulated firing rate (after filtering, just as was done for the real data). We refit the model to these noisy simulated responses and computed the variance unaccounted for, which we termed the expected residual variance. We recomputed the R2 of the model fits, factoring out the expected residual variance. (3)
where σtotal2 is the total data variance to be accounted for, σresidual2 is the actual residual variance after fitting, and σexpected2 is the estimated expected residual variance caused by sampling error. When applied to simulated recordings, the above correction was successful. Simulated recordings had Poisson spiking statistics, with the underlying firing rate determined by the complex model. We simulated a population of 108 neurons for which trial-counts, baseline firing rates, and firing rate modulations matched the 108 recorded neurons. When fit with the complex model, the mean R2 was 0.82, and the mean corrected R2 was 1.02, very close to the ideal value of unity.
Principal component analysis
Principal component analysis (PCA) was applied to the neural responses collected using the direction series, 108 of which passed the criteria for inclusion. The response pattern of each neuron was considered to be a single observation. We considered times from 150 ms before until 300 ms after movement onset, with the mean firing rate sampled every 10 ms, for a total of 46 data points. Each neuron thus contributed a single 1,288-dimensional observation (46 time-points and 28 target conditions). Each of these vectors had its mean subtracted and was normalized so that the maximum value minus the minimum value was one. PCA was performed on the population of 108 such observations, resulting in 107 PCs. Each of these is itself a 1,288-dimensional vector, embodying a component response pattern. For a given set of PCs (e.g., the 1st 10), the proportion of variance accounted for was computed as the cumulative sum of the eigenvalues associated with those PCs.
PCA was also applied to simulated populations of neural responses, generated according to different models. Simulated populations were generated by drawing random parameters (e.g., the PD) for each neuron. However, the baseline firing rate of each neuron and its degree of modulation were set to approximate those of one of the recorded neurons. Simulated populations were always of 108 neurons, with each matched to one of the recorded neurons. The underlying firing rate of each neuron determined the mean of a Poisson spiking process. We collected simulated trials, matching the number of these to the number collected for the matched recorded neuron. Mean firing rates were computed by filtering and averaging, just as was done for the recorded responses.
For each PC, we computed an “internal consistency” metric. This was based on the correlation between the subcomponents of the PC at different target locations. For each target location, we took the correlation between its subcomponent and the subcomponent at the three nearest neighbors (the other distance in the same direction and the 2 other directions at the same distance). We define the internal consistency as the mean correlation. The decision to use the three nearest neighbors is reasonable if somewhat arbitrary. Virtually identical results were obtained if we used the nearest five neighbors.
Linear fits to EMG and kinematic responses
For each muscle, we fit the pattern of EMG activity using a linear sum of neural activity, in which each neuron was assigned a weight. The same weights were used to fit EMG for all target conditions; different weights were used for different muscles. To allow for varying conduction delays, each neuron was allowed to contribute at one or more delays: 20, 40, and 60 ms (effectively, the response of each neuron was convolved with a simple linear filter). This method was considered preferable to attempting to assign one delay to each neuron, because it allowed fits to be made by linear regression (rather than numerical optimization). This is not only faster, but aids comparison of the relative fit quality: linear regression has a single solution, whereas numerical optimization is not guaranteed to find the global minimum. Fits were made only for the direction series, and for a given EMG recording were made using only neural activity recorded from that same monkey.
Fits were made to EMG activity from 170 ms before movement onset until 350 ms after movement onset. Data from all 28 conditions were concatenated to produce a single vector, y, of EMG values (14,588 data points long). The same was done for the neural data, but for each neuron, the data were sampled three times: starting 20, 40, and 60 ms before the time when the EMG activity was taken. Thus for a sample of 100 neurons, the resulting matrix, X, would be 14,588 by 301 (the 2nd dimension is 301, rather than 300, because of the addition of a column of ones to allow a nonzero intercept). A linear regression was used to find the vector of weights, b, such that Xb provided the best fit to y.
For analyses regarding generalization, an initial regression used data from a subset of targets (training). The resulting weights, b, were used to provide a prediction for the remaining targets (generalization). For each muscle recording, this procedure was repeated four times, using different subsets of targets for training and generalization. For each of these repeats, training excluded one of the four direction/speed combinations (e.g., far/fast), and generalization was measured for that combination (7 targets total).
We compared fit and generalization performance to that when EMG data were “mirrored” relative to the neural data. For this procedure, we treated EMG responses for rightward reaches as if they had been recorded for leftward reaches (i.e., they were fit with neural responses during leftward reaches). The full mapping was to take EMG responses for the three responses to the right and exchange them with the three responses to the left. Vertical alignment was not disrupted (e.g., top right mapped to top left), and the direction nearest upward remained unaltered. All these procedures were also applied when fitting kinematic measurements (horizontal and vertical hand velocity and acceleration). To allow the best comparison, kinematic values were extracted from the same datasets as the EMG recordings (note that this yields 4 times as many kinematic recordings as muscle recordings).
Neural network simulation
We simulated a very simple artificial neural network of nine units. Connections were recurrent, with each unit connecting to all others. Each unit's activity ranged from −1 to 1. A 10th “bias” unit, connected to all other units and of constant activity one, was also included. At each time-step, the activity of the network, a, was updated using the following equations (4) (5) (6) (7)
Equations 4 and 5 derive the synaptic input, s. That input was assumed to be driven by both the magnitude of the weighted presynaptic activity and by its rate of change (sensitivity to the latter being determined by c in Eq. 4). A low-pass filter was applied (Eq. 5). Activity was limited to the range −1 to 1 by a sigmoidal transfer function, T, where T(x) = −1 + 2/(1 + e−x). A second low-pass filter was applied (Eq. 7) to produce the updated activity, a(t + Δt). The 10th element of a (the activity of the bias unit) was not updated, but was clamped at 1. Simulations used a 5-ms time step. The network output was the sum of three of the nine units (chosen arbitrarily and not allowed to change during optimization). The connection weights, W, and the initial state of the network, a(to), were numerically optimized to produce the desired output: the activity of a given muscle for a given target location.
The parameters controlling the dynamics of individual units, c, τ1, and τ2, were set to 0.04, 47 ms, and 25 ms. These values resulted in simple dynamics at the level of individual units but with the network as a whole being capable of complex behavior. These particular values were not critical—similar results were obtained using different parameters.
Behavior on this task has been documented previously (Churchland et al. 2006a,b). Briefly, fast reaches (red targets) had durations of ∼100–200 ms (depending on distance). Even slow reaches (green targets) were fairly swift, with durations of ∼200–300 ms. Repeated reaches for the same target/instructed speed were typically very similar. This was true both for reach path and for the velocity profile, allowing the mean firing rate to be computed across nearly identical movements. Figure 1D gives an example (17 slow reaches), for which the mean firing rate (black trace) displays considerable temporal structure. Such structure can be lost when the mean is computed across trials with different velocity profiles. This highlights an important feature of our task: it not only allows comparison of responses across instructed speeds, it also ensures that averages are made across very similar behavioral responses. To allow averages that could reveal high-frequency components of the neural responses, we typically endeavored to collect as many trials as possible. The mean was 16/condition and was occasionally >40/condition. Per neuron, the average was 426 and was occasionally as high as 1,000.
Example responses: direction series
Of the neurons tested using the direction series, 108 of 138 showed robust peri-movement responses (see methods) and were included in our analysis. We observed a variety of peri-movement patterns, some simple and some complex. Perhaps most strikingly, there was profound heterogeneity in response patterns across neurons. Figures 2 and 3 show responses of 12 example neurons, selected to show the range of observed neural behavior. Each panel (A–F) corresponds to one neuron; each subpanel to a target location. Mean firing rates for slow and fast reaches (green and red traces, respectively; SE given by trace width) are plotted on top of mean hand velocity traces (gray). The example neurons in Fig. 2, A and B, display simple patterns of activity that are literally textbook (Fig. 14.13 of Bear et al. 1996; Fig. 32.1 of Georgopoulos 1995; Fig. 38–13 of Krakauer and Ghez 2000). Activity is roughly cosine-tuned for direction and scales in magnitude with velocity. These patterns adhere closely to a model in which a neuron's activity is determined by the dot product of its preferred direction with the velocity of the reach (Georgopoulos et al. 1982; Moran and Schwartz 1999b). Conversely, the activity of a population of such neurons could be effectively decoded by multiplying each neuron's activity by its preferred vector and then summing to decode velocity (Georgopoulos et al. 1988; Moran and Schwartz 1999b; Schwartz 1993).
Such canonical patterns of activity were not uncommon, but neither were they typical. The presence of noncanonical features was in large part revealed by 1) the ability to compute averages across many nearly identical reaches and 2) the use of two instructed speeds. Departures from the canonical took a variety of forms. One example is shown in Fig. 2C. This neuron showed a response during slow reaches that was both larger and more broadly tuned than that during fast reaches. This response cannot be explained by positing a negative sensitivity to hand velocity, because the neuron is excited during the reaches. It also cannot be explained by a nonmonotonic sensitivity; hand velocity for fast reaches passes through the range of magnitudes that is occupied for slow reaches. Of course, other explanations might be suggested: perhaps a relation to co-contraction or a combination of complex sensitivities to velocity and acceleration. For the moment, let us simply note that the pattern of activity displayed by this neuron is rather different from that of the neurons in Fig. 2, A and B.
Another intriguing response pattern is seen in Fig. 2D. For this neuron, the phase of the responses is advanced for slow versus fast reaches. Less obviously, the phase of the response for fast reaches differs with direction: leading the movement for 230° reaches and lagging for 95° reaches. This effect is unlikely to be caused by sampling noise: SE (width of trace) is modest, and the phase change is evident for both distances. Figure 2E shows that responses could also be temporally complex. In this case, responses are either bi- or triphasic depending on reach direction. For many directions, there was surprisingly little difference in the response for the two reach speeds. Figure 2F shows responses for a neuron whose preferred direction shifted ∼90° between the two reach speeds (from up for fast to left for slow). Thus the neurons in Fig. 2, C–F, all diverge from the canonical, but in different ways.
Further examples are shown in Fig. 3. The responses in Fig. 3A show a temporally complex pattern, especially for leftward reaches. The responses in Fig. 3B are greater for slow reaches and last considerably longer than the movement itself. The responses in Fig. 3, C and D, show temporally bimodal patterns for some reach directions (55 and 100° for C, 235° for D). The neuron in Fig. 3E roughly obeys straightforward expectations (cosine tuning, larger responses for fast speeds). However, the lag between neural response and movement differs for the two reach speeds (especially for 320°), and for some target locations (230°) the response is larger for the slower reaches. Finally, the responses in Fig. 3F exhibit bimodal direction tuning: large responses are seen for 50 and 230°, with little response for 5 and 185°. This pattern is much more pronounced for fast reaches—tuning for slow reaches is roughly cosine.
The responses in Figs. 2 and 3 show that a range of response patterns was observed. Yet those examples do not span the entire range of observed patterns. Frequently, a recorded neuron would exhibit a response pattern that was not observed for any other recorded neuron (even allowing for rotations). Sampling error cannot account for the complexity of the response patterns, their heterogeneity across neurons, or the frequent departures from the canonical. Trial counts were reasonably high, and SE (trace width) was usually quite modest.
Given that response heterogeneity is such a striking feature of the recorded responses, we have taken the unorthodox step of presenting a plethora of further examples. To avoid selection bias, we ranked all neurons by the SNR of their responses (see methods). We selected the top 24 neurons (skipping those shown previously: Figs. 2, B and E, and 3, A and E). Figures 4 and 5 show the responses of these 24 neurons. We provide below brief summaries of some (although by no means all) of the notable response features for each of these neurons.
In Fig. 4, neuron A44 has a roughly canonical response pattern, although the instructed-fast response is often broader than the instructed-slow response (rather than narrower, in parallel with the speed trace). Neuron A13 has an odd temporal response profile, in which two peaks are sometimes visible, especially for the instructed slow reaches. Neuron B101 also has a multiphasic response pattern, which differs between instructed speeds for some directions (up-right) but not others (down-left). Neuron A40 has temporally broad responses with little direction tuning, although the impact of instructed speed varies by direction. Neuron B72 exhibits roughly canonical responses, although for some directions (e.g., up-left) there is little difference between instructed speeds, and for others (right) there are weakly multiphasic responses. Neuron B94 also exhibits roughly canonical responses, although with some minor departures, including a “rebound” response for instructed-slow down-right reaches. Neuron A50 has temporally broad responses that are larger for the instructed-slow condition. Neuron A45 has very high firing rates, but appears noisy and is not easily characterized. Neuron A57 is well tuned for speed, although the responses are broader temporally than the profile of hand speed. Neuron B127 has a response that leads the reaches for some directions (down-right) and lags slightly for others (up-left). Neuron B130 exhibits responses that are longer-lasting, although no higher in peak, for instructed-fast reaches. Neuron B126 exhibits roughly canonical responses, although firing rates are no higher for the instructed-fast reaches. In Fig. 5, neuron A14 exhibits responses that are roughly canonical, if somewhat noisy. Neuron A24 exhibits responses that are in some cases larger for the instructed-slow reaches and whose phase changes with direction (leading for down-left, lagging for up-left), and in some cases with instructed-speed (usually with more lag for slow). Neuron B79 exhibits responses that slightly lag upward movements, but lead right and down-right movements, at least for the larger distance. Neuron B123 exhibits omni-directional suppression that leads the reach. Neuron B121 exhibits a temporally complex response pattern, especially for upwards reaches. Neuron B85 exhibits a roughly canonical response pattern, with a preferred direction up and right. Neuron A21 exhibits a roughly canonical response pattern with a preferred direction up and left (although for both this and the previous neuron, responses are no broader in time for the slower movements). Neuron A26 exhibits two response peaks for up and up-right instructed fast reaches. Neuron A51 exhibits broader direction tuning for instructed-slow reaches and also some changes in response phase with direction and instructed-speed. Neuron B67 exhibits two positive response peaks for left and down-left targets, but only for instructed-fast reaches. Neuron A34 exhibits temporally broad responses, with weakly biphasic responses for up-right reaches. Neuron B115 exhibits roughly canonical responses.
We recognize that the presentation of this much raw data is somewhat unorthodox. However, in our view, the best way to truly appreciate the complexity and heterogeneity of the neural responses is to inspect many examples.
Example responses: distance series
We also tested a modest number of neurons using a distance series. Of 51 neurons tested, 29 showed robust peri-movement responses and were included in our analysis. The responses of five of these are shown in Fig. 6. As was the case for the direction series, response patterns were often temporally complex and were heterogeneous across neurons. Even the response pattern in Fig. 6A is not as simple as it first appears. The neural response leads the movement for 10° targets and occurs in phase with the movement for 190° targets. Responses are greater for slow reaches for 190° targets but not for 10° targets. The response patterns in Fig. 6, B–E, are even more complex, often with multiple phases, and with changes across distances/directions that would not have been readily predicted. Nominally, these neurons were tested with the two target directions aligned with and against their preferred directions. However, for some neurons, it is unclear that there is a well-defined preferred direction. The direction that evokes the largest response can depend on distance, instructed speed, and when the response is observed. Again, note that trace width gives the SE, and thus even some very fine features that might be mistaken as measurement noise can be seen to be reliable, especially when shared across multiple conditions.
Firing rates and movement kinematics
It is typically reported that firing rates in M1 and PMd correlate with a variety of kinematic parameters, especially velocity and speed. This was also true of the responses we recorded, despite the wide variety of observed response patterns. We regressed each neuron's firing rate against three kinematic parameters: horizontal velocity, vertical velocity, and speed (Moran and Schwartz 1999b). We assumed that firing rates could either lead or lag kinematics and tested leads up to 200 ms and lags up to 100 ms (in 10-ms increments), taking the value that maximized R2. The black bars in Fig. 7 plot the distributions, across the 108 neurons recorded using the direction series, of R2 values (A) and best leads/lags (B). Mean values were 0.39 (R2) and 88 ms (firing rate leading the movement).
The above analysis assumes what we will refer to as the velocity-tuned model of kinematic tuning, in which firing rates are determined by a cosine-tuned velocity sensitivity and a nondirectional speed sensitivity (Moran and Schwartz 1999b). We also used a complex kinematic tuning model, with additional sensitivities to position and acceleration. In this case, we regress against seven kinematic variables: horizontal and vertical position, horizontal and vertical velocity, horizontal and vertical acceleration, and speed (allowing the PD to differ for position, velocity, and acceleration; see methods). The gray bars in Fig. 7 plot the results for this regression. Mean values were 0.46 (R2) and 101 ms (lead).
The above results are consistent with prior reports: firing rates correlate with a variety of kinematic parameters, and the sensitivity to velocity/speed seems particularly large: mean R2 values rise only modestly, from 0.39 to 0.46, after the addition of position and acceleration to the regression. Nevertheless, the values of R2 we obtained were somewhat lower than reported in some prior work. In particular, Moran and Schwartz (1999b) performed an analysis nearly identical to that in Fig. 7A, in which firing rates were regressed against velocity (horizontal and vertical components) and speed. The median R2 obtained for that analysis was 0.65, considerably higher than for our parallel analysis (median R2 = 0.37 for the velocity-tuned model). Even more strikingly, they obtained an R2 of 0.99 when correlating mean hand velocity with the ensemble nondirectional neural response in M1. These discrepancies prompt concern that our dataset may be fundamentally different from theirs. Perhaps the recorded population of neurons was rather different, given that neurons in PMd were included. Or perhaps the tasks differ in some fundamental way. To address these concerns, we repeated their analysis of the ensemble nondirectional response. Their task used a single distance, so we likewise considered data for one distance (12 cm). Their task did not involve instructed speeds, and considerable trial-to-trial variability was present in reach velocity. To emulate this, we collapsed data across the two instructed speeds. The ensemble nondirectional response was thus the mean firing rate across all neurons, directions, and speeds. Mean reach speed was computed in the same way. Figure 8A plots both the ensemble neural response (gray) and mean reach speed (black), and can be compared directly to Figs. 3 and 5 of Moran and Schwartz (1999b). To emulate their original analysis, the speed trace is truncated shortly after reaching its peak, and the neural response is similarly truncated. The rising edge of the neural response leads the rising edge of the speed trace by ∼150 ms, in close agreement with the value of 145 ms obtained in their original analysis. Indeed, time-shifting the neural response by 145 ms resulted in a very high correlation with reach speed (R2 = 0.95), only slightly less than was obtained in the original study (0.99).
Thus using the methods of prior work, our dataset yields results similar to those previously obtained. However, somewhat different results were obtained when we did not truncate the speed trace and did not collapse across instructed speeds. Doing so allowed the correlation to compare the neural response with speed not only during the rising phase, but also during the falling phase and across the two instructed speeds. Figure 8B plots reach speed (thin traces) and the ensemble nondirectional response (thick traces) for the entire reach trajectory and for both instructed speeds. For these data, the correlation is considerably weaker: R2 = 0.57. In accounting for this reduction, the longer analysis window is the more critical factor. If we consider the entire velocity trajectory, but do not separate the two instructed speeds (raw traces not shown), R2 falls almost as far: from 0.95 to 0.61. This reduction is largely caused by the ensemble neural response being broader than the velocity trace (many neurons showed response patterns that lasted longer than the movement itself). Separation of the instructed speeds results in a further small correlation reduction (to the final R2 of 0.57). This reduction is caused by the fact that, while the ensemble response scales in magnitude with speed, it does not scale appropriately in time (the response is not briefer for the faster reaches). In summary, we were able to replicate prior results using our dataset, but only if the analysis was restricted. For the more complete analysis, the ensemble nondirectional response only roughly tracks reach speed. Furthermore, it should be stressed that the ensemble response is often quite unrepresentative of the responses of individual neurons. These had temporal profiles that sometimes differed sharply from that of hand speed (Figs. 2–6) and could even exhibit a preference for the instructed-slow condition.
A critical and very general point is that averaging, whether across neurons or across reaches of different speeds, tends to inflate the apparent correlation between neural responses and movement kinematics. In particular, the average neural response tends to appear more and more Gaussian (and thus more and more like the velocity profile) when it is made across more neurons and/or conditions. The analysis shown in Fig. 8 shows one manifestation of this effect. We also repeated the analysis in Fig. 7, after collapsing data (both firing rates and kinematics) across the two reach speeds. This emulates a typical situation, in which target location, but not reach speed, is experimenter controlled. After this manipulation, the mean R2 value for the velocity-tuned model rose from 0.39 to 0.52 (0.55 if we included only the greater distance). Thus the averaging of data across similar but nonidentical behavioral responses can make the relationship between neural activity and kinematics seem rather stronger than it actually is.
Inspection suggests that the responses of many individual neurons would be poorly fit by the canonical velocity-tuned model (cosine-tuned for velocity with a nondirectional speed component) or even by the more complex model (also sensitive to position and acceleration). The analysis in Fig. 7A supports this impression. Linear regression is equivalent to fitting the data with a linear model (i.e., a model with linear sensitivities to the different kinematic parameters). The resulting R2 values, which were typically well below 1 (means of 0.39 and 0.46 for the 2 models), argue that the data are not well explained by either model. However, interpretation is complicated by two factors. First, the linear model has an obvious flaw: its firing rate can drop below zero, whereas real firing rates cannot. Second, some of the variance we are asking the model to account for is merely sampling error, caused by the finite number of trials recorded from neurons with noisy spiking statistics. Thus even an accurate model would not be expected to account for all the data variance. Inspection of individual examples indicates that the above factors cannot be the sole reason that the kinematics-based models fail to provide good fits. For example, the mean firing rates of neuron B68 (Fig. 2E) possess a tight SE (trace widths), and, although they fall to near-zero briefly, are not strongly truncated at zero. Concerns regarding the above factors should thus be minimized, yet R2 values of the model fits were still modest for this neuron (0.49 and 0.51 for the velocity-tuned and complex models, respectively).
However, at the level of the population, it seems likely that linear regression underestimates the quality of the fits that could be provided by the kinematic models. To assess how large this underestimation might be, we constructed a version of the complex model where the firing rate could not drop below zero. This (mildly) nonlinear model was fit to the data. R2 values for the model fit were slightly higher than for the linear regression: the mean was 0.48, up from 0.46. We applied an approximate correction to account for the fact that some of the variance is caused by sampling noise and is not expected to be accounted for by the model (see methods). The mean corrected R2 was 0.56 (for the complex model). This is higher than the uncorrected value of 0.48 but only modestly so. Thus most of the variance unaccounted for by the model is related to real aspects of the neural responses rather than sampling noise.
An important note is that the way in which the model fails is different for each neuron. The responses of neuron A46 (Fig. 2C) are poorly captured because they are larger for the slower reaches (corrected R2 = 0.37 for the complex model). The responses of neuron A56 (Fig. 2D) are poorly captured because of the phase difference between the instructed speeds (corrected R2 = 0.28 for the complex model). The responses of neuron B68 (Fig. 2E) are poorly captured because of their multiple phases (corrected R2 = 0.55 for the complex model). The responses of neuron B107 (Fig. 2F) are poorly captured because of the change in preferred direction between instructed speeds (corrected R2 = 0.36 for the complex model). Thus the model of kinematic tuning does not have any single obvious failing. Rather, the variability of neural responses makes them difficult to capture with a simple model. This issue is addressed further in the next section.
As a final point of comparison, we fit the recorded EMG responses using the complex model. The mean R2 was 0.49 (uncorrected), which was not terribly different from the mean uncorrected R2 for the neural data (0.48). Thus the ability of the complex model to partially fit the neural responses does not necessarily imply that the neural responses code those kinematic features. EMG activity presumably does not code kinematics, even if it can be roughly fit by the complex model.
Response heterogeneity and dimensionality
Not only did the recorded response patterns often differ from the canonical, they often differed sharply from one another. This suggests it may be difficult to adequately capture neural responses using any simple model (kinematics-based or otherwise). However, perhaps this is not the case—perhaps there are only a few basic response patterns, with the apparent heterogeneity arising when these are combined in differing proportions for different neurons. This issue can be addressed by studying the dimensionality of the neural responses using PCA. This approach asks whether we can reconstruct the response pattern of a given neuron from a linear sum of basic response patterns (the PCs). As an example, consider an idealized population of neurons that are cosine tuned for velocity. The response of any neuron could be reconstructed as a linear combination of two PCs: one capturing sensitivity to horizontal velocity and one capturing sensitivity to vertical velocity (a neuron tuned to 45° would use both equally). Thus all the variability in the data would be accounted for by only two PCs.
Figure 9 shows a more realistic version of this scenario. Shown in Fig. 9A are the responses (light traces) of one simulated neuron, taken from a population of 108 simulated neurons. Underlying firing rates were generated by the velocity-tuned model: cosine tuning for velocity plus a nondirectional component related to speed. PDs and gains were randomly assigned for each neuron, but the baseline firing rate and overall modulation were set to match that of 1 of the 108 recorded neurons (Δt, the lead between neural responses and kinematics, was fixed at 100 ms). We recorded the same number of trials as for the matched real neuron (using a Poisson spiking model). Responses were filtered and means taken, as for the actual neural responses. We extracted 107 PCs, rank ordered by the amount of data variance accounted for. Of those 107 PCs, the first 10 capture 82% of the variance. The first 10 PCs do not capture all of the variance, despite the low dimensionality of the underlying model, because the conversion of underlying rates to spike trains introduces sampling noise and a threshold at zero. However, reconstructions (Fig. 9A, dark traces) based on the first 10 PCs were typically reasonable and largely relied on the first few PCs, which captured the underlying sensitivities built into the model. For example, the reconstruction in Fig. 9A relies almost entirely on the first five PCs (see black histogram). These PCs are shown in Fig. 9B (ordered given their importance for this simulated neuron). PCs 1, 2, and 3 capture, respectively, sensitivity to vertical velocity, horizontal velocity, and speed (the 3 sensitivities of the model itself). These simulations show that PCA can extract small numbers of patterns on which diverse responses are based.
Figure 10 shows a typical PCA-based reconstruction (same format as Fig. 9A) for one of the real neurons (B68, same neuron as in Fig. 2E), based on the first 10 PCs. The reconstruction relies heavily on the higher numbered PCs: PCs 6–10 are used to roughly the same degree as PCs 1–5. This contrasts with the reconstruction of simulated responses in Fig. 9, which relied almost entirely on PCs 1–5. It was in general the case that moderately large numbers of PCs were needed to adequately reconstruct the neural responses. Figure 11A plots results at the level of the population, and shows the residual variance (black circles) after reconstruction using a given number of PCs. The rather shallow decline of this curve indicates that neural responses were reasonably high dimensional. For example, capturing 80% of the variance in the neural responses required using 18 PCs. Of course, our measurements of mean firing rate contain sampling noise, which will increase the measured dimensionality of the data. Small fluctuations in behavior from recording to recording could also result in a slight increase in the measured dimensionality. The contribution of such factors can be estimated by including them in simulations. As described above, we simulated a population of neurons, each of which had its baseline firing rate and degree of modulation matched to one of the actual recorded neurons. If we eliminated spiking noise from the model, the responses of this simulated population were low dimensional (light gray diamonds), with three PCs capturing 95% of the variance (this number would have been 100%, except that responses were based on the actual velocity/speed profiles, which varied slightly from recording to recording; also note that the underlying model has a fourth free parameter—baseline firing rate—but that this does not contribute to the dimensionality because means are removed before the PCA). When we assumed Poisson spiking statistics and matched the number of sampled trials to the number of recorded trials (the method that produced the simulated responses in Fig. 9A), there was indeed an increase in the measured dimensionality (gray squares). However, although sampling noise was slightly greater for the simulated population (mean SE of 3.9 vs. 2.9), its dimensionality was still considerably less than for the neural population. For example, accounting for 80% of the variance required only 8 PCs for the simulations, but required 18 for the neural responses. We also simulated a population of responses using the complex model. This model was tuned not only to velocity and speed but also to acceleration and position (with independent PDs for each). To further increase dimensionality, each simulated neuron was assigned a variable lead of 0–100 ms relative to the movement. However, the dimensionality of the simulated population (including sampling noise) was still much less than that of the recorded population: 80% of the variance could be accounted for using nine PCs.
Internal consistency of the principal components
The above analysis assesses dimensionality in terms of the percent of variance accounted for by a given number of PCs. In doing, so, we assume (probably reasonably) that those PCs that account for little of the measured variance (the higher-numbered PCs with lower eigenvalues) are primarily capturing sampling noise. However, if we consider the form of the PCs themselves, there is another way in which we can attempt to gauge which PCs are capturing “real” variance and which are capturing only sampling noise. Consider the PCs shown in Fig. 9B, which were extracted from the population of simulated responses. Given that we know the model was constructed to be sensitive to vertical velocity, horizontal velocity, and speed, it is easy to recognize that these features are roughly captured by PCs 1, 2, and 3. However, even if we did not know the form of the model, we would suspect that PCs 1–3 capture real response features. Whatever our prior expectation regarding the features for which the model is tuned (kinematic variables, dynamic variables, muscle activity, etc.), we expect that those features do not change arbitrarily quickly, either across time or across target locations. Thus PCs 1–3 seem quite reliable: values change smoothly with time, and subcomponents at nearby target locations look similar to one another. In contrast, PCs 4 and 5 seem somewhat marginal. There is still some smoothness with time (unsurprising, because mean firing rates were filtered), but subcomponents at different target locations resemble one another only weakly. Thus, for the velocity-tuned model only the first three PCs appear reliable, in agreement with the construction of the model itself, which had three basic sensitivities (horizontal velocity, vertical velocity, and speed). However, for the neural responses, even the higher-number PCs seem reliable (e.g., PCs 7 and 9 in Fig. 10B).
To capture the above impressions quantitatively, we computed the internal consistency of each PC: the correlation between subcomponents at neighboring target locations (see methods). Figure 11B plots this metric for each PC for the neural responses (black circles) and for the velocity-tuned and complex versions of the model (light squares and dark triangles, populations simulated as above). In all three cases, internal consistency was high for the lower number PCs (those that accounted for large amounts of the data variance). However, for higher number PCs, internal consistency was much greater for the neural responses than for either model. For example, the mean internal consistency of PCs 20–30 was 0.41 for the neural data, 0.04 for the complex model, and 0.00 for the velocity-tuned model. Thus for the neural responses, these high-number PCs seem to be capturing some real component of the responses that varies smoothly across target locations. For the models, these high-number PCs seem to be capturing primarily sampling noise, which is uncorrelated across target locations.
As a reference, the mean internal consistency of the raw neural responses themselves was 0.66. Choosing an arbitrary but reasonable threshold of one half that value, we can roughly assess dimensionality by asking how many PCs have values higher than the threshold. For the velocity-tuned model, three PCs had values exceeding threshold, in agreement with the three basic sensitivities of the underlying model. For the complex model, 10 PCs had values exceeding threshold, also in rough agreement with the underlying model (the model was sensitive to only 7 kinematic parameters, but also had a variable latency that cannot be captured by a single linear dimension). For the neural responses, there were 29 PCs with values exceeding threshold.
The dimensionality of the model responses cannot be made to match that of the neural responses by adding more measurement noise. We have tested this by sampling fewer simulated trials and reducing the model gain, both of which reduce the SNR (data not shown). For the analysis in Fig. 11A, this merely elevates the later baseline for higher-number PCs; the initial rapid decline is preserved. This manipulation also does not increase the dimensionality of the data when assessed as in Fig. 11B. If anything, dimensionality is slightly reduced (as there is now more noise and less signal to be captured).
The analyses in Fig. 11 reveal that the dimensionality of the neural responses is quite high. Not only are the responses of individual neurons heterogeneous (something readily observed by inspecting individual examples), but they cannot be captured as the sum of some small number of component sensitivities. A caveat is that, although dimensionality is high in a linear space, a nonlinear model could in principle capture the range of observed responses using only a small number of free parameters. Whether this could actually be accomplished using a plausible model is unclear, and is certainly beyond the scope of this study. What can be concluded from this analysis is that any model that considers responses to be driven by the sum of sensitivities to individual movement parameters (kinematic, dynamic, or otherwise) will fail to capture the observed heterogeneity of neural responses unless a rather large number of such parameters is included.
A corollary is that, given the high dimensionality of the neural responses, the space spanned by those responses will likely overlap (at least partially) with that spanned by many other parameters. For example, the space spanned by the first 20 neural PCs contains much of the smaller spaces spanned by both the velocity-tuned and complex models. For these two models, the first 10 PCs had an average length of 0.42 (velocity model) and 0.64 (complex model) when projected into the space spanned by the first 20 PCs from the neural data. We also extracted PCs from the EMG responses, the first 10 of which had an average length of 0.55 when projected into the neural data space. It is thus to be expected that individual neuron responses will correlate with both kinematic parameters and with EMG.
Stability of the preferred direction
It is well known that the activity of M1 and PMd neurons relates statistically to multiple factors (Ashe and Georgopoulos 1994; Paninski et al. 2004a; Scott 2003), a finding that agrees with the high dimensionality of our recorded population. It is also known that neural preferred directions can depend on factors such as location in the workspace (Caminiti et al. 1990; Sergio and Kalaska 2003; Wu and Hatsopoulos 2006) and arm posture (Kakei et al. 1999; Scott and Kalaska 1995). Nevertheless, it is typically assumed, if only for practical purposes, that a neuron's preferred direction is in some way fundamental. Perhaps it might even be invariant if we could only express it relative to the right reference frame (intrinsic, extrinsic, shoulder-centered, joint-based, etc.). That there exists considerable debate regarding the right reference frame only underscores that the preferred direction is thought to succinctly capture a basic feature of the neural response. However, inspection of the responses of individual neurons makes it unclear whether there is anything fundamental or invariant about the preferred direction. The basic directionality of responses is not in doubt: at any point in time, roughly cosine tuning was certainly the rule. However, although some neurons (e.g., Fig. 2, A and B) had canonical responses to which a preferred direction could be meaningfully assigned, the complex response patterns of others (e.g., Figs. 2, E and F, 3, C, D, and F, and 6, B and D) make it difficult to assign a unique preferred direction.
As an example, the preferred direction of neuron B68 (Fig. 2E) is rightward during the period just before movement onset and leftward during the movement. This is further illustrated by Fig. 12. For fast reaches to the greater distance, mean firing rate was measured at three different times: 30 ms before movement onset, 70 ms after movement onset (roughly the velocity peak), and 170 ms after movement onset (just as the movement was ending). Mean rate was plotted as a function of target direction and fit with a cosine (gray trace), the peak of which (vertical gray line) was defined to be the measured in the PD. Just before movement onset (Fig. 12A), the PD was almost directly rightward. However, 100 ms later (Fig. 12B), the PD was almost directly leftward. After a further 100 ms (Fig. 12C), the PD had returned to rightward.
To ask whether such effects were prevalent, we performed a similar analysis at the population level. To insure that any observed PD changes were real, and not a consequence of noisy responses, we conservatively limited our analysis to the 50 neurons whose responses had the highest SNR (see methods). We computed both the PD and the strength of tuning (the amplitude of the cosine fit) as a function of time. For each neuron, we found the time of strongest tuning (the median time was 3 ms before movement onset, ∼75 ms before peak reach speed). We measured the PD (and its SE) every 10 ms for the 150 ms before and after that time. Results are plotted in Fig. 13A (fast reaches, greater distance). The bottom panel plots the average strength of tuning (across neurons) at different times. By definition, tuning is strongest at time 0. To provide a reference time scale, the top panel plots mean hand velocity (dashed trace), with its peak aligned to zero. Neurons are, on average, directionally tuned over a period of time greater than the movement itself. However, PDs were not necessarily constant over that time. The colored traces plot PDs, as a function of time, for the 12 neurons with the highest SNR. For each, the PD at time 0 (when tuning was strongest) was set to zero. Thus if PDs remain constant, all data would plot along the zero line. However, this was true for only a few neurons. For most, the PD changed with time. Often such changes were modest, and this was particularly true during the period immediately surrounding the time of best tuning. However, even if we consider only time-points within 100 ms of the time of best tuning, changes >45° were not uncommon, and very large changes were occasionally observed.
The solid black traces at the top show |ΔPD|, the mean absolute PD change (across all 50 neurons) as a function of time, relative to the time of best tuning. The mean |ΔPD| for a given time-point was computed only across neurons where the PD could be estimated with a 95% CI <45°. As a result, effects caused by sampling error (i.e., measurement noise) alone are expected to be small (gray trace). The actual PDs changed much more than expected given sampling error. PDs are reasonably stable at time-points near the time of the best tuning. However, by 70 ms before and after the time of best tuning (gray vertical lines), PDs have changed by an average of 38° in each case. This is not a small change: if PDs at two times bear no relationship, the expected |ΔPD| is 90°. Results were similar for the slow reaches (data not shown) although changes were less rapid: by −/+70 ms, the PD changed an average of 26 and 32°. For both speeds, strong directional tuning was still common 100 ms or more before and after the time of the best tuning. However, |ΔPD| was nearly 90°, almost as much as would be expected if there were no consistency. As indicated by the SD (outer flanking traces), the measurement of |ΔPD| includes neurons with little change and others with large (near 180°) changes.
The above analysis asks, as we travel away from the time of best tuning, to what degree do PDs change? We can also ask to what degree PDs differ between times flanking the time of best tuning. For the fast reaches, if we consider times 50 ms before and after, the average PD difference was 29°. If we consider times 70 ms before and after, the average PD difference was 48°. For slow reaches, respective average differences were 24 and 32°.
In summary, PDs are frequently inconstant with time. Such changes are occasionally large and are readily observable even in raw data. The neurons shown in Figs. 2E and 3, C and D, all show large, nearly 180° PD changes during their peri-movement response. Might such shifts be related to departures of the hand from a straight path (fast reaches often exhibited some curvature, see Churchland et al. 2006b)? The “true” tuning might be for instantaneous hand direction, whereas we are computing the PD with respect to the target direction. In fact, it is quite unlikely that PD changes are related to reach curvature, as they were also observed for slow reaches, which were typically quite straight (Churchland et al. 2006b). We also reanalyzed some of the more profound examples of PD changes, recomputing the PD relative to instantaneous hand velocity, and this did not cause the PD to become constant (see Supplementary Fig. 1). 1 Of course, it is not possible to rule out all possibilities along these lines: perhaps the PD would be constant if expressed relative to a weighted combination of velocity and acceleration, each with its own lag relative to firing rate (and with weights and lags varying across neurons). What we can say is that, when expressed relative to the direction of the target or hand, the PD was frequently not invariant with time.
Much the same was true of EMG activity, which also exhibited directional tuning. Figure 13B repeats the analysis in Fig. 13A, but for EMG responses. Effects are qualitatively similar to those for the neural population: PDs changed modestly but significantly around the time of best tuning and more dramatically at more distant times.
Decoding a heterogeneous population code
Prior studies have shown that kinematics (Humphrey et al. 1970; Reina et al. 2001; Schwartz 1994; Serruya et al. 2002; Taylor et al. 2002; Wessberg et al. 2000) and EMG activity (Morrow and Miller 2003) can be decoded from a population of M1 responses, often using linear methods. This might seem at odds with the response complexity we report, yet the reverse is true. The high dimensionality of the neural responses virtually ensures that they span a space that overlaps heavily with those spanned by kinematic parameters and EMG responses. As a result, kinematics/EMG can be extracted from our population using linear methods, provided we are allowed our choice of weights. Example fits to EMG responses are shown in Fig. 14. Much like the neural responses, EMG responses (thick traces) were often temporally complex, sometimes in ways that would not have been trivially predicted. This complexity is presumably partly due to the rather rapid reaches in this task. At high speeds, the dynamics of the plant and muscles are probably poorly captured by simple approximations. As a result, muscle activity bears little resemblance to the profile of either velocity or acceleration (indeed, this dissociation was a desired feature of the task). It was probably also important that means were computed across nearly identical reaches, which prevents fine temporal features from being lost to averaging.
The fits in Fig. 14 (thin dark lines) were linear combinations of the neural responses, recorded from that monkey for those same target locations (see methods). Ideally, all neurons and all muscles would have been recorded on the same trials. Although this was not the case, behavior was typically quite stable over time, and this procedure is therefore reasonable. Muscle activity could be approximated fairly well by weighted sums of neural responses. This is not entirely trivial—for a given muscle, the fit was constrained to use the same weights for all 28 target conditions. Yet, given the diversity of neural response patterns, it is expected that combinations of those patterns can linearly approximate many other patterns, even arbitrary ones. One would like to know whether the obtained fits are any better than expected given that trivial possibility. In answering this question, we first note that we never found a neuron/muscle pair with near-identical response patterns. However, we did occasionally observe pairs where there was a distinct resemblance (Fig. 15A) such that the neural response aids in providing a good fit. In this example, the useful similarity depends on comparing neural and muscle activity at the same target locations; neural activity for rightward target locations does not resemble muscle activity for leftward locations. We can therefore ask whether our ability to decode EMG is weakened if the pattern of EMG is mirrored relative to the neural patterns of activity, by exchanging rightward responses for leftward ones (see methods). If fits are purely spurious, mirroring should have no effect. However, if the fits originally benefited from real similarities between neural and muscle activity, mirroring should result in poorer fits. Figure 15B plots the results of this analysis. The R2 value of the fits (mean R2 across all 17 muscles) is lower following mirroring. This was a statistically significant effect, with the error (1 − R2) of the fit increasing 34% (P < 0.001).
A similar analysis assesses generalization. Weights were selected by fitting responses for a subset of targets, and generalization was assessed for the remaining targets (see methods and Supplementary Fig. 3). Figure 16 A shows one example. Generalization (short-dashed trace) was somewhat worse than a direct fit (solid trace). However, generalization was much worse when using mirrored EMG data (long-dashed trace), something that was generally true (Fig. 16B; P < 10−6, t-test).
We also applied the above analyses to kinematic variables: horizontal and vertical hand velocity and acceleration. These could also be fit fairly well by a linear sum of neural activity (Supplementary Fig. 2). However, mirroring the kinematic patterns had much less of an effect on fit quality (Fig. 15B) and generalization (Fig. 16B). In both cases, the impact of mirroring was significantly larger for EMG than for kinematics (342 and 814%, P < 0.0001 for each). It was also the case that the curve relating fit quality to neuron count rose more quickly for EMG than for kinematics (Fig. 15C).
Our methods allowed us to observe not only the spatial pattern of tuning, but to estimate the temporal pattern of activity by averaging across nearly identical repetitions of the same movement. This allowed us to examine, with high fidelity, the temporal patterns of activity during whole arm reaching at a natural, rapid pace. Under such conditions, individual neurons display temporally complex patterns of activity. Some neurons display multiphasic patterns of activity. Others show phase changes as a function of reach direction or speed. The preferred direction can depend on time and/or reach speed. Activity does not always scale as expected with reach velocity. The preferred speed can depend on reach direction. Bimodal or otherwise noncosine tuning is occasionally observed. Perhaps most strikingly, activity patterns were heterogeneous across neurons. Many of the recorded neurons exhibited patterns of activity that were unique in our dataset. As a consequence, dimensionality at the population level was quite high, with ∼20–30 PCs making a contribution beyond just capturing sampling error.
These results agree with prior reports that have stressed the complexity and heterogeneity of motor cortical responses. For example, Fetz (1992; Fetz et al. 1989) stressed the diversity of temporal patterns recorded during a ramp-and-hold single-joint flexion/extension task (also see Thach 1978) and the variety of spike-triggered EMG effects observed even among neurons with similar tuning. In contrast, studies of multijoint center-out reaching have typically stressed simple relationships between neural responses and movement kinematics. However, when data are collected and analyzed to preserve temporal features at the level of single neurons, it can be seen that responses are complex and heterogeneous. If anything, response heterogeneity is more noticeable than for the ramp and hold task, given the ability to test different directions, distances, and (in this study) reach speeds.
Relevance to the canonical model of movement-related activity
Our results do not disagree with the finding that cosine tuning is a prevalent feature of motor cortex responses (Georgopoulos et al. 1982). Roughly cosine tuning was observed for most cells, even if the preferred direction was not always constant. Nor do our results disagree with the finding that responses scale with speed (Moran and Schwartz 1999b)—on average they roughly did. However, our results do question the idea that there exists a canonical response in motor cortex—cosine tuned and scaling roughly with velocity—an idea that has been highly influential. In many studies, data analysis assumes such a canonical response (e.g., the preferred direction may be computed and compared across conditions, an action that presumes that the preferred direction is a fundamental quantity). The canonical form is also often assumed when attempting to decode neural responses (Reina et al. 2001; Taylor et al. 2002). Perhaps, most importantly, the notion of a canonical form leads naturally to the idea that neural responses provide a representation of movement in some (still-debated) reference frame. Such a framework is concordant with the proposal that reach generation involves transformations between reference frames (Kakei et al. 2003; Kalaska and Crammond 1992; Kalaska et al. 1997; Pesaran et al. 2006; Reina et al. 2001; Soechting and Flanders 1989, 1992; van Beers et al. 2004; Zipser and Andersen 1988).
Nevertheless, it has always been appreciated that individual neurons can exhibit significant departures from the canonical. Ashe and Georgopoulos (1994) found that activity regressed significantly against a variety of parameters and that the relative strengths could vary from neuron to neuron. Scott and Kalaska (1997) and Kakei et al. (1999) found that the preferred direction could change with hand posture to differing degrees for different neurons. A number of groups (Caminiti et al. 1990; Sergio and Kalaska 2003; Wu and Hatsopoulos 2006) have found that preferred directions were often inconstant across the workspace. Aflalo and Graziano (2006) found direction tuning to be strongest locally, with end-posture tuning being more prominent globally. Schwartz and colleagues (Moran and Schwartz 1999a; Schwartz 1994) found that the temporal lag between the population vector and the movement depends on curvature, implying that the preferred direction is not constant (Todorov 2000). Reina et al. (2001) found that hand movements were reconstructed equally well using preferred directions in Cartesian and joint space. Sergio et al. (2005), using a weighted manipulandum, found complex temporal patterns of activity in which the preferred direction could change with time.
However, it has never been entirely clear to what degree the above results invalidate the canonical model. Some of the above findings simply argue that we must reconsider the reference frame of the canonical model (e.g., joint vs. Cartesian). Other findings potentially depend on the use of a manipulandum and/or imposed loads, and it could be argued that those results may not generalize to more natural unencumbered reaches. Finally, it may be that departures from the canonical cancel out at the population level (Reina et al. 2001). This latter argument carries particular weight because of the success of decode algorithms based on assumption of cosine tuning for velocity (Schwartz 1994; Taylor et al. 2002). Thus issues of exact coordinate frame aside, it has generally been thought that cosine tuning for reach velocity provides a sensible description of neural activity during unencumbered reaching movements.
In this study, the canonical model provides a poor description of the activity of many, perhaps most, neurons. Response complexity and heterogeneity were, if anything, more pronounced than in Sergio et al. (2005), arguing that those results were not peculiar to the use of a weighted manipulandum. In combination with the results mentioned above, this would seem to necessitate a reconception of the role of cortical activity in representing movement.
Do heterogeneous responses provide a basis set?
From a physiological perspective, it has long been observed that many movement parameters influence movement-related activity. From a theoretical perspective, it has recently been pointed out that a population of neurons can provide a basis set for the extraction of useful information, even when the activity of individual neurons fails to obey any clear reference frame (Deneve et al. 2001; Pouget et al. 2002). Indeed, that very failure can allow information to be decoded in different formats depending on what is needed downstream. We found that we could decode, with reasonable accuracy, multiple movement parameters from the recorded responses. Different linear combinations of neural activity provided good fits to the activity of the different arm muscles and to hand velocity and acceleration. One presumes that any relevant reach parameter could be approximated well using a linear combination of neural responses. This suggests that the diversity of neural responses may have the function of providing a basis set, allowing other parts of the nervous system to extract whatever signals they need. For example, other cortical areas might extract signals related to hand velocity, whereas the spinal cord might extract signals related to muscle activity. A related hypothesis is that responses with unclear reference frames serve as a transition between more sensible reference frames (Kakei et al. 2003). The basis set hypothesis simply extends this idea to the extraction of multiple types of information.
Needless to say, it is difficult to know whether the observed response complexity/heterogeneity exists for this purpose, and there is little evidence that simple signals are ever actually decoded downstream. The ability to reconstruct different parameters (EMG, hand velocity) using linear combinations of neural activity shows the plausibility of the basis set hypothesis. However, that ability is hardly surprising, given purely mathematical considerations. Still, there does seem to be at least some real component to the reconstruction of EMG. Both fits and generalization were impaired if EMG data were mirrored relative to the neural data. This implies that there is some shared anisotropy between neural and muscle activity. This is unlikely to be secondary to a kinematic anisotropy (e.g., overall faster reaches in 1 direction). Kinematic anisotropies were small, and kinematic fits/generalization were harmed very little by mirroring.
In summary, M1 responses are sufficiently varied and high dimensional as to plausibly provide a basis set allowing the extraction of a wide variety of parameters. However, the impact of mirroring shows that the responses do not provide a universal/arbitrary basis set. Some parameters (such as the actual recorded EMG) are more readily decoded than others (such as the mirrored EMG).
Representation versus causation
An alternate hypothesis is that neural responses should be understood not in representational terms, but in terms of their causal role in generating movement. This conception is reflected in studies of spike-triggered EMG activity (Fetz and Cheney 1980; Jackson et al. 2003), in the use of neural activity to reconstruct EMG (Morrow and Miller 2003), and in detailed models of cortico-spinal circuitry (Maier et al. 2005). In general, a network that causes and controls movement need not have the goal of representing anything, so long as the right patterns of activity are created at the level of the spinal cord (Scott 2003; Wu and Hatsopoulos 2006). In this view, correlations between activity and movement parameters are just that: correlations; nothing is represented except in the loosest possible sense (Fetz 1992; Todorov 2000). Along similar lines, we have previously questioned whether preparatory activity in PMd/M1 should be understood in representational terms (Churchland et al. 2006b; Cisek 2006).
The idea that the goal of motor cortex activity is to cause the forces driving movement (Evarts 1968; Hatsopoulos 2005) might seem to imply that cortical responses ought to resemble force, acceleration, or muscle responses. This is in fact not true for a variety of reasons. Not the least of these is that, for a large recurrent network, the responses of individual units typically do not match the ensemble output. This point is explicitly made by simulations performed by Fetz (1992), in which the goal of the network was to reproduce EMG activity recorded during a ramp-and-hold task. Figure 17 shows a similar but somewhat more modest simulation, in which the goal of the network was to reproduce EMG activity recorded during our task. Nine units were fully connected to one another, but only three projected to the muscle. The connection weights and the initial state of the network were numerically optimized to produce the desired output, in this case the EMG recorded from the bicep (recording for monkey B for fast reaches to a distant, upward target; Fig. 14B, top subpanel). The network makes little attempt at realism, yet is sufficient to make two general points. First, no individual unit need resemble the output that the network as a whole creates. Second, the activity patterns of different units can be quite heterogeneous. These features exist because there are more output neurons than muscles, and because the primary role of many units is not the direct production of EMG, but rather the enforcement of the right state-space trajectory at the level of the network. Thus the most basic features of our dataset—complex and heterogeneous responses unlike those of individual muscles—are expected even if a network exerts direct control over the muscles. Such features are expected to an even greater degree when we include the assumption that the spinal cord further transforms the cortical output.
Finally, it is worth noting that the responses of such a network can seem to represent things even when they do not. For example, the responses of neuron 1 and 2 respectively resemble (with lead) hand velocity and position. Neuron 7 might be said to “multiplex” representations of muscle activity and hand position. A variety of movement parameters could be read out from such a network, including many that are in no way central to its function. This illustrates a third general point: that a neuron's activity correlates with some factor need not imply that it is involved in representing it or that it exerts control over it (Fetz 1992).
During rapid, unencumbered, whole arm reaches, the response patterns of M1/PMd neurons are both temporally complex and heterogeneous. One possibility is that these features allow the simultaneous representation of multiple movement parameters in a basis set. Alternately, it may be that such responses are best understood not in terms of representation, but in terms of the causal role played by the recorded neurons in a larger network. Response complexity and heterogeneity are expected of individual neurons in a recurrent network that generates temporally patterned outputs.
This research was supported by the following sources for M. M. Churchland: an National Institutes of Health postdoctoral training fellowship, a Helen Hay Whitney Foundation fellowship, and a Burroughs Wellcome Fund Career Award in the Biomedical Sciences. This research was supported by the following sources for K. V. Shenoy: Burroughs Wellcome Fund Career Award in the Biomedical Sciences, the National Science Foundation Center for Neuromorphic Systems Engineering at Caltech, the Office of Naval Research, the Sloan Foundation, the Stanford Center for Integrated Systems, and the Whitaker Foundation.
We thank R. Schafer for pilot analyses fitting EMG with neural responses; M. Howard for animal care; S. Eisensee for administrative assistance; and B. Yu for discussions regarding analysis.
↵1 The online version of this article contains supplemental data.
- Copyright © 2007 by the American Physiological Society