## Abstract

Neural responses are commonly studied in terms of “tuning curves,” characterizing changes in neuronal response as a function of a continuous stimulus parameter. In the motor system, neural responses to movement direction often follow a bell-shaped tuning curve for which the exact shape determines the properties of neuronal movement coding. Estimating the shape of that tuning curve robustly is hard, especially when directions are sampled unevenly and at a coarse resolution. Here, we describe a Bayesian estimation procedure that improves the accuracy of curve-shape estimation even when the curve is sampled unevenly and at a very coarse resolution. Using this approach, we characterize the movement direction tuning curves in the supplementary motor area (SMA) of behaving monkeys. We compare the SMA tuning curves to tuning curves of neurons from the primary motor cortex (M1) of the same monkeys, showing that the tuning curves of the SMA neurons tend to be narrower and shallower. We also show that these characteristics do not depend on the specific location in each region.

- tuning curve
- von Mises distribution
- Bayesian estimation
- supplementary motor area
- motor cortex

tuning curves are commonly used to characterize neuronal responses to continuous features of the world. In the visual cortex, orientation tuning curves are a landmark of visual processing, and in the early auditory system, neurons respond strongly to tones close to their “best frequency.” In the primary motor area (M1) of the mammals, many cortical neurons exhibit a strong unimodal relation between the direction of movement and their firing rates (Georgopoulos et al. 1982). The properties of these M1 direction tuning curves were intensively studied, yielding a characterization of curve shapes and preferred directions (Ben-Shaul et al. 2003; Gribble and Scott 2002; Johnson and Ebner 2000; Naselaris et al. 2006; Vaadia et al. 2003). Neurons in the supplementary motor area (SMA) are also tuned to movement direction (Alexander and Crutcher 1990; Crutcher and Alexander 1990; Nachev et al. 2008; Sohn and Lee 2007), but the effect of movement direction on SMA activity is far less understood.

Comparing the shape of tuning curves across motor regions could reveal different regional roles for movement planning and execution. SMA neurons are believed to participate in preparation and planning of movements (Nachev et al. 2008), particularly of complex movements (Tanji and Mushiake 1996). The SMA motor plans could interplay in various ways with movement control in M1. For instance, the SMA could be raising coarse plans that are further refined by downstream areas, or it could dictate already-refined control plans. These alternative strategies could be reflected in the widths of movement-direction SMA tuning curves. Seung and Sompolinsky (1993) showed that in the population vector model for direction estimation, the performance depends mainly on the width of the population tuning curves and the background noise. Seriès et al. (2004) showed that the effect of tuning curve sharpening on the quality of the code depends on the sharpening mechanism. When tuning curves are sharpened through cortical lateral interactions, strong pairwise correlations are created, leading to severe information loss and complicated codes that rely on pairwise correlations. However, if sharpening is achieved while keeping the noise distribution independent among neurons and independent of the tuning curve width, then sharpening the tuning curves can improve the population code.

Tuning curves are commonly modeled using parametric functions such as the cosine function (Georgopoulos et al. 1982), but cosines provide poor fit when the widths of the curves vary substantially (Amirikian and Georgopoulos 2000). The richer family of von Mises (VM) models (von Mises 1918) captures better the widths of orientation tuning curves in the cat visual cortex (Swindale 1998) and in the direction tuning curves in M1 (Amirikian and Georgopoulos 2000). However, for many other motor areas including the SMA, the refined characteristics of tuning curves have not been studied. Moreover, although many techniques were proposed for estimating the shapes of tuning curve (Cronin et al. 2010; Georgopoulos et al. 1986, 1988; Gribble and Scott 2002; Kass et al. 2005; Simoncelli et al. 2004; Stark and Abeles 2005), robust estimation is hard to obtain with the little data that are typically collected in motor experiments. The challenges become even greater when activity is measured at a small number of unevenly spaced directions, as commonly collected in brain-machine-interface applications.

Here, we describe a simple Bayesian approach to model tuning curves using VM functions and show that it estimates accurately their width and modulation depth. We then analyze direction tuning curves of neurons recorded in the SMA and M1 of the same behaving macaque monkeys. We find that the tuning widths in the SMA tend to be narrower than in the M1 and that the modulation depths and the dynamic range values both tend to be smaller in the SMA. This suggests that SMA neurons that are involved in movement planning code the direction of movement in a more specific way than the movement controlling neurons in M1.

## MATERIALS AND METHODS

#### Data.

We used electrophysiological recordings to evaluate the effect of the Bayesian approach on properties of direction tuning curves in the SMA. We compared the tuning curve properties using electrophysiological recordings from the SMA and the primary motor areas recorded in the same monkeys. For both data sets, the experimental setup, behavioral paradigm, and detailed behavior were the same as described in previous studies (Paz et al. 2003, 2005).

Two female rhesus (*Macaca mulatta*, ∼4.5 kg) monkeys were trained in a delayed go-signal paradigm to move a manipulandum to control the movements of a cursor on a video screen located 50 cm from their eyes. Each trial began when the monkey centered the cursor on the origin (central circle) for ≥1 s. After a random delay of 1–1.5 s, the target appeared: a circle of diameter 0.8 cm at one of eight possible positions 4 cm from the origin. Target appearance was followed by another variable hold period of 1–1.5 s until the origin circle disappeared (go signal), at which time the monkey moved her arm to bring the cursor to the target in <2 s. After another hold of 750 ms, a liquid reward was delivered. The monkey also performed a learning paradigm, but the data analyzed here used only control trials.

The monkeys were implanted with recording chambers (27 × 27 mm) above both the right and left hemispheres. Each day, eight glass-coated tungsten microelectrodes (impedance, 0.2–1 MΩ at 1 kHz) were inserted into each hemisphere. The signals were amplified, filtered, and sorted (MCP-Plus; Alpha-Omega, Nazareth, Israel), and all spike shapes were sampled at 24 kHz. We used a template-based method for real-time isolation of spike shapes (MSD; Alpha-Omega). Penetration locations and angles were verified by magnetic resonance imaging (BioSpec 4.7 T; Bruker BioSpin, Ettlingen, Germany) before and after recordings. Penetrations into the SMA proper were angled at 30° to the sagittal plane and advanced from the point of insertion until they reached the medial cortex. Animal care and surgical procedures complied with the National Institutes of Health *Guide for the Care and Use of Laboratory Animals* (7th ed. Washington, DC: Nat. Acad. Press, 1996) and with the Hebrew University guidelines approved by the Institutional Committee for Animal Care and Use.

Analysis was performed over a window −200 to +700 ms around initiation of movement, an epoch that contains most of the neuronal responses that generate the movement. Movement initiation was calculated offline from the velocity profile by extrapolating a line through the one-third and two-thirds of the peak and selecting the zero-velocity crossing of that line as the movement onset. This approach yielded very reasonable solutions as validated by visual inspection.

Overall, a total of 448 SMA cells and 1,495 M1 cells were recorded. All M1 cells and 35 of the SMA cells had recordings with 8 movement directions. Fifteen SMA cells were recorded with movements at six directions and the remaining SMA cells with movements at five directions.

#### Previous approaches to estimate tuning curve shape.

Estimation of tuning curves properties has been previously explored using various methods ranging from parametric (Amirikian and Georgopoulos 2000; Georgopoulos et al. 1982, 1986, 1988; Stark and Abeles 2005) to reverse correlation (Simoncelli et al. 2004) to nonparametric methods (Kass et al. 2005). Bayesian estimation complements and extends these previous approaches and is particularly helpful when data are scarce relative to the number of free parameters in the model. Since neurons often exhibit complex response properties and electrophysiology experiments are typically limited by their overall duration, this sparse-data scenario is common. Cronin et al. (2010) presented a full Bayesian method based on Markov chain Monte Carlo (MCMC) sampling that allows one to obtain the posterior probability distribution over tuning curves and test hypotheses over the parameters. Implementing MCMC sampling and optimizing may require a considerable effort and would not be necessary in the common case where one is interested in a single, most likely, tuning curve.

#### Modeling tuning curves using the VM function.

The cosine function *f*(*x*) = *b* + *m* cos (*x* − μ), which is commonly used to model direction tuning curves, predicts a firing rate *f* given a movement direction *x*. It can be tuned to capture the preferred direction μ, the baseline firing rate *b*, and the modulation depth *m* but is constrained to a predefined half-peak width of 180°. Although cosines are easy to tune from data, they provide poor fits when the tuning curve widths vary substantially as was observed in the macaque primary motor cortex population (Amirikian and Georgopoulos 2000). Swindale (1998) has discussed a richer family of tuning functions based on the VM distribution (von Mises 1918). In circular statistics, the VM distribution, which is also known as the circular normal distribution, plays a prominent role similar to that of the Gaussian distribution on the line (Batschelet 1981; Mardia 1972). The VM tuning function is defined as:
*b* determines the baseline firing rate, *m* the modulation depth, and μ the preferred direction. VM contains one additional parameter, κ, which determines the width of the function (Fig. 1*A*). As a result, the VM model can describe more accurately cells with narrower or wider tuning curves than the standard cosine function. Indeed, VM models provide superior fits for direction tuning curves in the macaque primary motor cortex (Amirikian and Georgopoulos 2000).

#### Least-squares and Bayesian function estimation.

We first review here the relation between least-squares and maximum likelihood (ML) estimation. The common approach to fit a function to a measured response profile is to use a least-squares minimization. Under this approach, we assume that a neuron is sampled repeatedly in multiple trials, and its responses produce a firing rate *r*_{i} for the trial *i* where movement was at direction *x*_{i}. We also assume that the firing rate *r*_{i} is determined by a function *f*_{θ}(*x*_{i}) of the direction *x*_{i}, possibly with added noise ξ, yielding *r*_{i} = *f*_{θ}(*x*_{i}) + ξ, where θ is a vector of parameters that determines the shape of the response profile. For instance, if *f*_{θ}(*x*) is a cosine, then θ could be a vector holding the baseline firing rate *b*, the modulation depth *m*, and the preferred direction μ yielding θ = [*b*,*m*,μ]. According to the minimum least-squares approach, we search for the set of parameters that minimizes the average-squares prediction error:
*B*, where a bell-shaped function is being fit to a set of firing rate measurements collected from a macaque cortex. Here, we first used all of the measured data to create the blue solid curve and then left most of the data aside and fitted a VM function (dashed red) to 20% of the data using minimum squared error. The red curve strongly overshoots the measured data, producing a curve that is too peaked and very different from the blue curve. Here, even though the VM function achieves a lower squared error, the resulting curve fails to capture the correct tuning curve shape.

To address this overfitting issue, we take a probabilistic Bayesian approach to estimate the parameters and add a prior on the width of the tuning curves. To explain how this is done, we first note that the minimum least-squares problem is equivalent to a ML estimation problem when the noise ξ has a normal distribution with zero mean, ξ ∼ *N*(0, σ^{2}). With such noise, the probability of observing the response *r*_{i}, also known as the likelihood, is given by:

This view of minimum least-squares as finding the parameters that maximize the likelihood of the data is useful in that it allows us to take a Bayesian approach to estimate the parameter θ. In Bayesian estimation, we seek the set of parameters that maximize the posterior probability Pr(θ|*r*_{1}, …, *r _{n}*) ∝ Pr(

*r*, …,

_{i}*r*|θ)

_{n}*P*

_{0}(θ), where P

_{0}(θ) is a prior distribution that expresses our prior belief over the possible values of θ. This prior can be used to penalize models with very narrow curves, for instance by taking a probability density function that decays with the width, like

*P*

_{0}(κ) ∝ α exp(−ακ) (we discuss the considerations for selecting the form of the prior further below). For a given observed data

*r*

_{1}, …,

*r*

_{n}, the posterior is proportional to the joint distribution Pr(θ|

*r*

_{1}, …,

*r*

_{n}) ∝ Pr(

*r*

_{1}, …,

*r*

_{n}, θ).

Finding the parameter θ that maximizes the posterior is equivalent to minimizing the following:
*Eq. 3* can be solved using standard tools. Specifically, in the experiments below, we used MATLAB function fmincon, which uses a trust-region-reflective method. We repeated optimization with 20 random restarts to increase the probability that the optimization converges to global minima. Using even more restarts only yielded an insignificant improvement to the objective.

What should be the form of the prior distribution? In this paper, we selected the exponential distribution and for several reasons. First, for any given mean, the exponential distribution is the maximum entropy distribution, which, loosely speaking, means that it embodies minimal further assumptions about the prior. Indeed, exponential priors have been shown to lead to superior modeling in various systems (e.g., Goodman 2004). Second, for exponential distributions, *P*_{0}(θ) = α exp(−αθ), the regularization term λlog [*P*_{0}(θ)] in *Eq. 3* has only one term that depends on θ and equals to −λαθ. This suggests that instead of tuning both λ and α, one can equivalently treat α as a constant, making optimization simpler and faster.

#### Converting κ to width.

After estimating the VM parameters, we can use the κ value to calculate the estimated width in half-height. For that, we first calculate the *y*-value in half-height. The VM function has maximum value when cos(*x* − μ) = 1, thus *y*_{max} = *b* + *m e*^{κ}. The VM minimum value is for cos(*x* − μ) = −1, thus *y*_{min} = *b* + *m e*^{−κ}. Combining both equations, we have:
*y*-axis and the width in half-height is twice the *x* in half-height. Thus we look for *x* that satisfies the following equation:

## RESULTS

We studied electrophysiological recordings from the SMA and the M1 of macaque monkeys while the monkey performed an eight-direction center-out task and used the Bayesian approach before study the shape of their VM tuning curves. We started by evaluating the quality of the inferred curves in two aspects. First, we evaluate the accuracy of predicting firing rates for directions that were hidden during fitting. Second, we evaluate the accuracy of estimating shape parameters and specifically the width parameter κ. Our approach for evaluating the quality of fit is to “hide” some of the collected data, keep it aside during the fitting procedure, and use it for validation. Specifically, we either leave some trials out or leave all of the trials of one or more directions hidden during fitting.

Typically, the experimental design for estimating tuning curve shape involves using eight directions or less. It has been argued (Amirikian and Georgopoulos 2000) that using more directions is necessary to estimate robustly the width of a tuning curve. We turn to test whether the Bayesian approach can yield good estimates even with a small number of directions.

#### Predicting the firing rates for unseen direction.

We fitted a VM function to every eight-direction response profile using a predefined subset of five directions. This specific subset of directions was selected according to other data sets with five directions only. They followed a pattern of the form “1-1-1-0-1-0-1-0” where a “1” corresponds to a direction that was used during fitting and “0” to a direction that was hidden and all of its trials kept for validation. Since there are eight different possible cyclic shifts of this pattern, there are also eight options for subsets of directions following this pattern, and we selected for each cell the subset of directions that maximize its firing rate. For each cell, we found the VM parameters that minimize the objective function of *Eq.* 3 when using only five directions out of eight. We then computed the absolute error between the measured firing rates of the remaining three directions and the VM predictions for these directions.

The weight of the prior λ is a hyperparameter for which the value is not optimized in *Eq. 3*. We tested two strategies for selecting the weight of the prior λ. In the first approach, which we call Shared-λ, we used a single λ value for all cells. That common λ value was selected using a leave-one-trial-out (LOTO) approach: we further hid another trial, minimized *Eq. 3* with λ values in the range [0.5, 1, 1.5, 2, 2.5, 3, 3.5], and selected the λ value that led to the smallest absolute error over the hidden trial. In a second approach, Per-Cell-λ, we selected a separate λ for each cell.

Figure 2 compares four fitting strategies to SMA neurons: *1*) ML, which is equivalent to selecting λ = 0; *2*) maximum a posteriori (MAP) with a shared λ; *3*) MAP with λ adjusted for each cell separately (Per-Cell-λ); and *4*) ML initialized with MAP solutions. For each of the four strategies, we compared the median of the absolute errors, since the errors do not follow a normal distribution. The median error of the ML approach [median = 1.38 Hz, mean absolute deviation (m.a.d.) = 0.93 Hz] was not significantly different (Wilcoxon test, *P* < 0.94) from with MAP (median = 1.46 Hz, m.a.d. = 0.96 Hz), but selecting an individual λ per cell achieved significantly lower (Wilcoxon test, *P* < 0.02) median error (median = 1.03 Hz, m.a.d. = 0.83 Hz). For Shared-λ, accuracy was largely insensitive to λ, hence we show below results with λ = 2.5.

The improved accuracy obtained with MAP over ML could result from two different causes. First, as discussed above, a prior that favors wide tuning curves could help reduce overfitting in ML. Second, the likelihood surface in the parameter space has multiple maxima, and gradient-based approaches often get stuck in a local, inferior maximum. Using MAP could alleviate this problem since parameter regularization operates to smooth the error surface and remove local maxima corresponding to narrow curves (large κ). To test which of these two effects contributes more to the inferior accuracy of ML, we also tested the following fourth approach, ML-initialized-by-MAP. In this approach, we first find good VM parameters using Per-Cell-λ MAP and then use these parameters to initialize a ML procedure. If local maxima are the main issue of the ML procedure, then initializing with the MAP optimal parameters is expected to yield better fits than MAP.

The rightmost bar in Fig. 2 depicts the absolute error of the ML-initialized-by-MAP approach. It achieves significantly (Wilcoxon test, *P* < 0.05) higher median error (median = 1.25 Hz, m.a.d. = 0.86 Hz) than Per-Cell-λ MAP, and its error is not significantly lower than that of standard ML (Wilcoxon test, *P* < 0.61). We conclude that poor initialization and convergence to poor local minima is not the main cause of poor accuracy obtained with ML.

We further evaluated the quality of the fit in terms of the parametric representation of each curve. Once again, we fitted a VM function to every response profile using a predefined subset of five directions and computed the estimated κ value (denoted κ_{5}). We compared these estimates with κ_{8}, the κ value that is estimated using the full eight direction profiles. The prior weight λ for the κ_{8} estimation was again selected using a LOTO method. Results using leave-one-direction-out (LODO) were comparable and are not shown here. Figure 2*B* compares the estimation error of κ for the four methods, showing that the Bayesian estimator with an individual λ value per cell (median = 0.5, m.a.d. = 0.5) outperforms the method ML-initialized-with-MAP (median = 0.61, m.a.d. = 0.61), and its error is significantly (Wilcoxon test, *P* < 0.1) lower than with ML (median = 1.75, m.a.d. = 0.61). Per-Cell-λ reduces the estimation error significantly.

To explore more deeply the effect of MAP estimation on an individual cell basis, Fig. 3 compares the estimates of both firing rates and shape obtained using a ML with the Per-Cell-λ approach.

Figure 3*A* shows that the firing rates of three held-out directions are predicted significantly better by using the Per-Cell-λ approach than by using plain ML. This is apparent by the fact that errors of the MAP estimates (squares) are in general much closer to the zero line than the errors of the ML estimates (open circles). This effect holds for the complete wide range of firing rates. Figure 3*B* compares the κ estimates obtained with five directions (κ_{5}) with the ones obtained with eight directions (κ_{8}) for each cell. The errors of the Per-Cell-λ estimates of κ_{5} (squares) are again much closer to the zero line than the errors of the ML estimates of κ_{5} (open circles). In this case, too, the effect holds across the whole range of κ values.

We further looked at individual cases where the Per-Cell-λ estimates of κ_{5} deviates significantly from the ML estimate. Figure 4 shows the tuning curves estimates for three cells as obtained using various methods. For Fig. 4, *A* and *B*, the MAP fit of five directions is much closer to the eight-directions fit than the ML fit of the five directions. In contrast, Fig. 4*C* shows a cell for which the ML estimator is more accurate than a Bayesian estimator. As Fig. 3 shows, there are many more cells for which the Per-Cell-λ estimator is superior.

#### Movement direction tuning curves of macaque SMA and M1 neurons.

The above analyses convinced us that the Bayesian estimation procedure provides an accurate method to estimate the shape of the direction tuning curves. We now turned to analyze recording from a monkey performing a center-out task collected in 2 motor areas: the M1 and the SMA. Recordings were made in the SMA in 24 sessions with a total of 448 cells and in M1 in 70 sessions with a total of 1,495 cells. In the SMA recordings, only 2 sessions included recordings from 8 directions, and 1 session from 6 directions and the remaining 21 sessions had 5 directions only. To compare SMA and M1 tuning curve on the same basis, we used for every cell only 5 directions, which were selected to match the same “missingness” pattern 1-1-1-0-1-0-1-0 that sessions with 5 directions already followed. This left 1 degree of freedom for selecting which physical direction matches the 1st direction in the missingness pattern. We choose for each cell the subset of directions that maximize the cell firing rate. Then, for each cell, we used cross-validation to choose the best λ. Each κ estimate was converted into a width parameter, which corresponds to the curve width in its half-height.

Figure 5*A* shows the distribution of the widths estimated from 319 SMA cells and 1,112 M1 cells for which the response curve could be well-fitted with a VM model (means *r*^{2} > 0.7, namely, the model captures >70% of the variance across the mean responses per direction). The median width value for SMA units is ∼119°, whereas the median width value for M1 units is 130°; both correspond to curves that are narrower than the cosine function, for which width is 180°. The difference between the distributions is significant (χ^{2} test, *P* < 0.01, Wilcoxon test, *P* < 3e−4). The error bars show the variation of the count across recording sessions. Cells in this analysis were selected based on how well the VM model captures the shape of the mean response for each direction, as measured using the ratio of explained variance *r*^{2}. Since the trial-to-trial variability of spike counts within a single direction was typically very high, the trial-to-trial *r*^{2} in the same cells was significantly lower than the means *r*^{2}.

Figure 5*B* shows the distribution of the modulation depths estimated in the SMA and M1 populations. The two distributions have slightly different shapes (χ^{2} test, *P* < 0.01), and their medians differ significantly (Wilcoxon test for equal medians, *P* < 2e−5). The median modulation depth for SMA units is 0.95 spikes, whereas the median modulation depth for M1 units is 1.91 spikes.

The neurons of the different areas seem to encode the preferred direction in a different way. The SMA tuning curves are narrower than the M1 neurons, hence more specific to their preferred direction. The M1 tuning curves are deeper, namely the M1 neurons modify their activity for their preferred direction in a more prominent way.

Figure 5*C* shows the distribution of the dynamic range calculated for the same SMA and M1 cells. The dynamic range is calculated by subtracting the minimal average firing rate across all directions from the maximum firing rate. In a VM function, the dynamic range depends on two parameters, κ and *m*, and can be viewed as the magnitude of the response to the preferred direction relative to a baseline. In our data, the distribution of the dynamic range in M1 and in the SMA are very close, but their medians are significantly different (Wilcoxon, *P* < 0.006): the dynamic range in the SMA (mean = 5.25 spikes) is lower than in the M1 (mean = 6.14 spikes).

Figure 6 shows the dispersion of the widths and the modulation depth values across the electrode locations along the rostrocaudal and the medial-lateral axis. We found no significant relation between the electrode location and the curve width or the modulation depth. Specifically, splitting the SMA cells into one group with wide tuning curves (width ≥ 120°, *n* = 154) and a second group with narrow curves (*n* = 165), the Euclidean distance between the centers of these two groups (0.3 mm) is smaller than the average standard deviation (SD) of each group (2 mm). The Euclidean distance between the centers of the wide tuning curves (*n* = 635) and the narrow tuning curves (*n* = 477) of the M1 cells (0.1 mm) is also smaller than the average SD of each group (3.50 mm). It is also true for the modulation depth values of the SMA neurons: mean distance between cells with deep tuning curves (depth ≥2, *n* = 109) and shallow tuning curves (*n* = 210) is 0.45 mm, which is smaller than the average SD of the group, which is 1.93 mm; and for the M1 neurons: mean distance between cells with deep tuning curves (*n* = 541) and shallow tuning curves (*n* = 571) is 0.28 mm, whereas the average SD is 3.55 mm.

## DISCUSSION

Cells in the SMA often exhibit tuned responses to the direction of simple movements. The shapes of their tuning curves may vary in their preferred direction, the modulation depth (response strength), and the width of the curve around the preferred direction. Characterizing these properties of the curves requires that we obtain accurate estimates of their shapes even when sampled at an irregular small number of movement directions.

We described a simple Bayesian procedure for estimating the parameters of unimodal tuning curves that is based on VM functions. Using measurements from SMA neurons and hiding part of the data, we found that adding simple, exponentially decaying priors improves the accuracy of the estimation of the hidden measurements even when only five directions were used to estimate the shape of the curve and when these directions were not evenly spaced. Using this Bayesian tuning curve estimation procedure, we characterized the shape movement direction response curve recorded from the SMA of behaving monkeys. We found that the widths of tuning curves in the SMA were often quite narrow, with a median width of ∼110° as opposed to 180° assumed when using the standard cosine function.

We further used the Bayesian estimation procedure to compare the direction tuning curves of neurons in the SMA with those of neurons in M1 of the same monkeys.

The current study focused on coding of movement direction, but cells in the primary and the premotor cortex also show that the discharge of single neurons is correlated with other aspects of movement, including movement speed, distance, and target position (Fu et al. 1993, 1995). In the current context, it is possible that the slightly smaller modulation depth for direction observed in the SMA, compared with M1, could indicate that a larger involvement of the SMA in other movement parameters besides direction.

Some intuition about the meaning of tuning curves widths can be gained by considering stimulus representation in sensory systems. In the auditory and the visual systems, the representation of stimuli often becomes coarser and coarser when advancing from the periphery to higher processing areas. For example, in the auditory system, frequency tuning curves in the auditory cortex tend to be wider than in the inferior colliculus (IC) and in the auditory thalamus (MGB), and neurons in MGB follow more faithfully the amplitude modulations of natural sounds than neurons in A1 (Creutzfeldt et al. 1980). In the visual system, the receptive fields of individual cells grow substantially from the retina through the primary visual cortex to higher areas. This view of sensory systems as extracting more complex semantic entities, but becoming more invariant to their physical properties (like the frequency of sound stimulus or the position of visual stimuli), could call for a counterpart trend in the motor system. Although motor systems are very different from sensory systems, one could hypothesize that motor control areas that are more remote from the periphery may operate at a coarser resolution when controlling movements and, as a result, may exhibit wider tuning curves than in the periphery.

When considering this curve-narrowing hypothesis in the context of the SMA, there are several opposing considerations. The SMA is involved in motor planning, possibly using a different coordinate space (Kalaska et al. 1997). For instance, SMA neurons fire before arm movement (Tanji and Kurata 1982), and many SMA neurons project to M1. One may expect that SMA tuning curves would therefore be wider than the curves in M1. At the same time, the SMA also contributes a substantial fraction of the corticospinal tract, with up to 18% of the neurons in the SMA being corticospinal neurons (Dum and Strick 1991). Also, for simple motor tasks, activity in SMA is similar to that in M1 (Chen et al. 1991; Fried et al. 1991; Kollias et al. 2001; Tanji and Mushiake 1996). As such, the SMA could be viewed as parallel to M1.

To test these alternative views, we applied the Bayesian estimation procedure to both SMA neurons and M1 neurons from the same monkeys and computed the distribution of tuning curve widths in each brain region. This analysis shows that the hypothesis that curves become narrower when moving from SMA to M1 does not hold since the widths of the SMA tuning curves are actually significantly narrower than the M1 curves (a median of 128°; Fig. 5*A*).

Narrow direction tuning curves were also shown in other brain areas that are linked to the motor cortex. Andersen and colleagues (1985) described neurons in the posterior parietal cortex that exhibited narrow tuning curves for coding eye position. Tanaka et al. (2009) used a population coding model to argue that neurons involved in visuomotor rotation adaptation have a relatively narrow directional tuning width and identified visually selective neurons with narrow Gaussian tuning curves in the posterior parietal cortex, which is a possible site for adaptation to visuomotor rotation. Narrow tuning curves were also observed in analysis of local field potentials (LFP). O'Leary and Hatsopoulos (2006) described neurons both in the primary motor cortex and the premotor cortex that exhibited a range of tuning curve widths at various LFP frequency bands, with many premotor neurons exhibiting sharp tuning curves (10–40°) in the γ-band.

Estimating tuning curves from spike counts is often hard to do in a robust manner. The most common approach is to fit the responses with a curve using least-squares (Amirikian and Georgopoulos 2000), where the model could also depend on additional aspects of the movement besides the directions, like movement speed (Moran and Schwartz 1999). Unfortunately, the least-squares approach is sensitive to outliers and to non-Gaussian noise in general (Etzold et al. 2004). An even harder challenge is that typical experiments probe tuning curves at a coarse resolution, using five to eight directions only. Furthermore, these directions are not necessarily spaced homogeneously but rather probe some directions more finely than others. Intuitively, coarse and inhomogeneous sampling strongly limits the resolution by which the width of the tuning curves can be estimated accurately.

We find that adding priors on the distribution of widths actually allows an accurate estimation for a large range of width values, including the widths that are found experimentally in the high-resolution experiments of Amirikian and Georgopoulos (2000). Handling movement directions that are sampled in a nonhomogeneous way is particularly useful in the case of neural prosthesis, where movements are often collected based on the distribution of natural movements, rather than in a controlled experiment. The above results suggest that including priors in the estimation process offers benefits for estimating movement directions in brain-machine interface (Hatsopoulos and Donoghue 2009; Nicolelis and Lebedev 2009; Taylor et al. 2002). Finally, the Bayesian approach is not limited to neurons in the motor system and can become useful for analysis of sensory responses. For example, tuning curves are a common way of representing responses of the sensitivity of visual neurons to orientation or the sensitivity of auditory neurons to the frequency of a pure-tone stimulus.

The Bayesian approach of adding priors when estimating parametric models is widely used in probabilistic modeling. It can be seen as a form of regularization, which prevents models from overfitting to the data, namely, yielding a model that fits well the data at hand but is likely not to generalize well to new data (Bishop 2006). In practice, the specific form of the prior often has only a limited overall effect and is selected based on mathematical convenience. Cronin and colleagues (2010) have described a more elaborate Bayesian model of tuning curves. In their approach, a hierarchy of priors is used. One level of priors is used when estimating the shape of the tuning curve. A second level of priors is used to determine the distribution of the first-level priors themselves. This approach allows making “softer” decisions with respect to the distributions but comes at a steep computational price. Here, we chose to use a simpler model, with (1-layer) exponential priors, for which the main effect is to penalize very narrow curves. The exponential form of the prior was chosen because it is the maximum entropy distribution with a single constraint and because it can be tuned efficiently. This approach is easy to implement, and our evaluations show that, although simple, it yields a significant improvement in estimation accuracy.

Our analysis focused on the tuning curves of individual neurons. When considering a population of neurons, the direction of a movement can be coded accurately to a high resolution using the activity of the population, even when building on wide-tuned neurons (Seriès et al. 2004). Although it is often assumed that a neuron codes for the preferred direction where its response peaks, the parts of the tuning curves with the steepest slope allow the best discrimination between nearby directions (Butts and Goldman 2006). In the general case, it is not possible to tell a priori if it is easier to code direction using a group of cells with narrow tuning curves or wide tuning curves since the coding accuracy depends strongly on the correlation between cells (Averbeck et al. 2006; Pouget et al. 1999). Specifically, sharpening of tuning curves under fixed noise can increase the Fisher information that activity conveys about the direction, but the information can also be decreased if sharpening is the result of interaction between neurons, due to stronger correlated noise (Seriès et al. 2004). In the SMA, there is evidence for pairwise correlations that affect coding accuracy (Averbeck and Lee 2003), but a full characterization of the effect of correlations on coding directions and their comparisons with correlations in M1 awaits further research.

## GRANTS

G. Chechik was supported by Israeli Science Foundation Grants 1001/08 and 1090/12 and Marie Curie Reintegration Grant PIRG06-GA-2009-256566.

## DISCLOSURES

The authors declare that they have no competing financial interests.

## AUTHOR CONTRIBUTIONS

H.T., E.V., R.P., and G.C. conception and design of research; H.T., E.V., and R.P. performed experiments; H.T. and G.C. analyzed data; H.T., R.P., and G.C. interpreted results of experiments; H.T. and G.C. prepared figures; H.T. and G.C. drafted manuscript; H.T. and G.C. edited and revised manuscript; H.T., E.V., R.P., and G.C. approved final version of manuscript.

- Copyright © 2013 the American Physiological Society