## Abstract

The repeated presentation of an identical visual stimulus in the receptive field of a neuron may evoke different spiking patterns at each trial. Probabilistic methods are essential to understand the functional role of this variance within the neural activity. In that case, a Poisson process is the most common model of trial-to-trial variability. For a Poisson process, the variance of the spike count is constrained to be equal to the mean, irrespective of the duration of measurements. Numerous studies have shown that this relationship does not generally hold. Specifically, a majority of electrophysiological recordings show an “overdispersion” effect: responses that exhibit more intertrial variability than expected from a Poisson process alone. A model that is particularly well suited to quantify overdispersion is the Negative-Binomial distribution model. This model is well-studied and widely used but has only recently been applied to neuroscience. In this article, we address three main issues. First, we describe how the Negative-Binomial distribution provides a model apt to account for overdispersed spike counts. Second, we quantify the significance of this model for any neurophysiological data by proposing a statistical test, which quantifies the odds that overdispersion could be due to the limited number of repetitions (trials). We apply this test to three neurophysiological data sets along the visual pathway. Finally, we compare the performance of this model to the Poisson model on a population decoding task. We show that the decoding accuracy is improved when accounting for overdispersion, especially under the hypothesis of tuned overdispersion.

- spike counts
- overdispersion
- negative-binomial distribution
- decoding
- tuning function.

one of the most common ways to study the central nervous system is to use a stimulus-response paradigm. A system, which could be a neuron or a population of neurons, is provided with an input (a stimulus) and the resulting state change (response) of the system is quantified to better characterize functional associations. A main problem with this approach is that the system's response to repeated presentations of an identical visual stimulus often exhibits high variability, particularly in the spike count statistics (see Fig. 1). As a consequence, statistical methods are essential to characterize the information carried by neural populations.

The Fano factor (FF; Fano 1947) is commonly used to quantify the variability of spike counts (Kara et al. 2000; Alitto et al. 2011; Zhuang et al. 2013; Tolhurst et al. 1983; Softky and Koch 1993; Shadlen and Newsome 1998; Koch and Schutter 1999). It is defined by FF(Δ*t*) = , where μ and σ^{2} are the average and the variance over multiple trials of the number of spikes in some time window of length Δ*t*, respectively. This variability is most often characterized by the Poisson spiking model (PM). By the definition of this model, the spike count follows a homogeneous Poisson process with a rate parameter λ (the expected number of spikes that occur per unit of time). If *k* is the observed number of spikes fired by a neuron in response to a stimulus in a time window Δ*t*, the probability of such an observation is given by a Poisson distribution of parameter *f* = λΔ*t*, which gives the expected number of spikes per sample in the considered time window:
(1)

It follows that μ = *f* and σ^{2} = *f* and thus that the FF under the PM has a value of 1, regardless of λ or Δ*t*. Such a property is in accordance with a wide range of observations of variability in cortical responses to visual stimuli (Shadlen and Newsome 1998; Softky and Koch 1993; Britten et al. 1993; Buracas et al. 1998; McAdams and Maunsell 1999) showing variance-to-mean ratios of spike counts that equal or just exceed unity. However, for a large majority of reported evoked responses, the variability is higher than expected from a PM, especially in mammalian visual cortex (Heggelund and Albus 1978; Dean 1981; Schiller et al. 1976; Tolhurst et al. 1983; Vogels et al. 1989), whereas Sub-Poisson variability has been observed only under specific conditions (Kara et al. 2000).

In fact, the variability of neural responses depends on different parameters. Some studies reported that the FF values can vary depending on the brain region (Kara et al. 2000; Gur and Snodderly 2006; Kayser et al. 2010). Indeed, neural responses in higher cortical areas (Maimon and Assad 2009) or in other noncortical sensory areas such as the retina (Berry et al. 1997) and the thalamus (Kara et al. 2000) show substantially lower variability. It has been also shown that some neural features such as the refractory period (Kara et al. 2000; Berry et al. 1997) lead to a decrease of variability. Moreover, the nature of the stimulus itself could contribute to the variability of neuronal responses. It has been suggested that increasing the complexity of the stimuli by using natural-like statistics increases the reliability (Baudot et al. 2013; Haider et al. 2010; Borst and Theunissen 1999). In addition, it has been reported that the variability of evoked responses is not stationary even during a trial time course and decreases after the presentation onset of most sensory stimuli (Churchland et al. 2010). Finally, a recent study (Nienborg et al. 2012) on behaving animals explored the role of decision and other cognitive processes in shaping variability.

In this article, we mainly explore overdispersion, that is, cases where there is greater variance in the spike count than might be expected by the PM. Our hypothesis is that the Poisson process only accounts for the intrinsic noise induced by the spiking mechanism itself. Indeed, under the Poisson process hypothesis, the trial-to-trial variability only results from the stochastic properties of the neuron (Carandini 2004; Mainen and Sejnowski 1995; Schneidman et al. 1998). Thus the PM cannot account for other significant sources of variability including hidden contextual variables such as attention, varying cortical states (Arieli et al. 1996; Buracas et al. 1998; Tsodyks et al. 1999; Kenet et al. 2003), perceptual effects (Ress and Heeger 2003), and overt behaviors (such as the precise eye position) (Gur et al. 1997). Thus we consider an alternative compound distribution, called Negative-Binomial, that generalizes the Poisson distribution with a dispersion parameter that directly controls the ratio between the mean and variance. Such a doubly stochastic model was recently proposed to model the variability of neural dynamics showing more complex variability than what is expected from a PM (Churchland et al. 2011; Ponce-Alvarez et al. 2013, 2010; Goris et al. 2014). Our aim is to further explore the potential suitability of this doubly stochastic model to account for response variability in neurophysiological data.

The article is organized as follows. First, we describe the Negative-Binomial encoding model (NBM) and two possible implementations of overdispersion sources. Second, we propose a statistical test for evaluating whether the observed overdispersion is significantly better described by the NBM than the PM. To assess the generality of this model, we then apply this test to three data sets of spiking responses in different animals, different states, and different visual stimuli: in the lateral geniculate nucleus (LGN) neurons of anesthetized mice presented with drifting gratings, in the primary visual cortex (V1) of awake macaque monkeys viewing oriented moving bars, and in the middle temporal cortex (MT) of anesthetized macaque monkeys presented with drifting gratings. Finally, we compared the efficiency of PM and NBM on a population decoder using the most dispersed data set (MT).

## MATERIALS AND METHODS

In our analyses, we used three different data sets from neurophysiological recordings.

### LGN Data Set

This data set (Scholl et al. 2013b) contains the spiking responses of 72 LGN neurons in the anesthetized mouse to drifting grating [square wave (*n* = 24) and sinusoidal (*n* = 48)]. Many of these neurons are already contained in the data set published in Scholl et al. (2013a). Contrarily to the two other data sets, it is an openly available data set.

### V1 Data Set

Experiments were conducted at the INT on two adult male rhesus macaque monkeys (*Macaca mulatta*). Experimental protocols have been approved by the Marseille Ethical Committee in Neuroscience (Approval No. A10/01/13; Official National Registration No. 7–French Ministry of Research). All procedures complied with the French and European regulations for animal research, as well as the guidelines from the Society for Neuroscience.

#### Surgical preparation.

Monkeys were chronically implanted with a head-holder and a recording chamber located above the V1 and V2 cortical areas. In a second surgery, a search coil was inserted below the ocular sclera to record eye movements (Robinson 1963).

#### Behavioral task and training.

Monkeys were trained to fixate, within a window of 1 to 2° of diameter, on a red target presented at the center of the screen during the entire duration of the trial (1.5 s). The eye position was monitored by the scleral search coil technique (Collewijn et al. 1975; Robinson 1963) and the animal's behavior was controlled using the REX package (Hays 1982).

#### Unit recording.

The two monkeys were chronically implanted with a head-holder and a recording chamber located above cortical area V1. A computer-controlled microdrive (MT-EPS; Alpha Omega) was used to transdurally insert a microelectrode (FHC, 0.5–1.2 MΩ at 1 kHz) in the right hemisphere. Spikes were sorted online using a template matching algorithm (MSD; Alpha Omega).

#### Visual stimulation.

Visual stimulation protocols have been produced using in-house software (developed by Gérard Sadoc, Acquis1-Elphy, Biologic CNRS-UNIC/ANVAR). Stimuli were back-projected on a translucent screen covering 80 × 80° of the monkey's visual field, at a distance of 1 m, using a retro-projector (resolution: 1,280 × 1,024 pixels at 60 Hz). The mean luminance of the motion stimulus was 22.2 cd/m^{2} and the background was kept constant to ∼2.24 cd/m^{2}. The display was gamma calibrated by means of a lookup table. Once a cell was isolated, a sparse noise stimulus was first used to quantitatively map out the receptive field (a grid of 10 × 10 squares with a side length of 0.6°, 10.9 cd/m^{2} darker or brighter than the background of 11.1 cd/m^{2}, ∼15 trials). Once the receptive field was properly located and estimated, a direction tuning paradigm was launched. In this paradigm, a moving bar (0.5 × 4°) translated across the receptive field (the displacement covers 3° on either side of the receptive filed center) in 12 possible directions (spaced by 30° of polar angle, at a speed of 6.6°/s).

### MT Data Set

Experiments were conducted at New York University Center for Neural Science on three anesthetized, paralyzed, adult macaque monkeys (*M. nemestrina*) of either sex. All procedures were conducted in compliance with the National Institutes of Health *Guide for the Care and Use of Laboratory Animal*s and with the approval of the New York University Animal Welfare Committee.

#### Surgical preparation.

The standard procedures for the surgical preparation of animals and single-unit recordings have been reported in detail previously (Cavanaugh et al. 2002). Briefly, experiments typically lasted 5 to 6 days, during which we maintained anesthesia with infusion of sufentanil citrate (6–30 μg·kg^{−1}·h^{−1}) and paralysis with infusion of vecuronium bromide (Norcuron; 0.1 mg·kg·h^{−1}) in isotonic dextrose-normosol solution. We monitored vital signs (heart rate, lung pressure, end-tidal Pco_{2}, EEG, body temperature, urine flow, and osmolarity) and maintained them within appropriate physiological ranges. Pupils were dilated with topical atropine. The eyes were protected with gas-permeable contact lenses, and refracted with supplementary lenses chosen through direct ophthalmoscopy. At the conclusion of data collection, the animal was killed with an overdose of sodium pentobarbital.

#### Unit recording.

Extracellular recordings were made with quartz-platinum-tungsten microelectrodes (Thomas Recording), advanced mechanically into the brain through a craniotomy and small durotomy. To record from MT we passed microelectrodes through a small durotomy centered roughly 16 mm lateral to the midline and 3 mm posterior to the lip of the lunate sulcus at an angle of 20° from horizontal in a ventro-anterior direction. Area MT was identified by the brisk direction-selective responses of isolated neurons. We made recordings from every single unit with a spike waveform that rose sufficiently above noise to be isolated. Stimuli were presented in random order.

#### Visual stimulation.

We presented visual stimuli on a gamma-corrected CRT monitor (Eizo T966; mean luminance: 33 cd/m^{2}) at a resolution of 1.280 × 960 with a refresh rate of 120 Hz. Stimuli were presented using Expo software (http://corevision.cns.nyu.edu) on an Apple Macintosh computer. For each isolated unit, we first determined its ocular dominance and occluded the nonpreferred eye. We presented circularly windowed sinusoidal grating stimuli (16 directions around the clock, 22.5° apart) to map each cell's receptive field, determined its preferred size and speed, and then measured selectivity for orientation or spatial frequency.

The time window considered to calculate the spike counts was the duration of stimulation for the grating stimuli (LGN and MT data sets) and 400 ms covering approximately the time when the bar stimulus crossed the cell's receptive field for V1 data set. All data were analyzed using Python and Matlab (The MathWorks, Natick, MA).

## THE NBM: ENCODING OF INTERTRIAL OVERDISPERSION

Several models have been developed to account for overdispersion such as generalized Poisson (Consul and Jain 1973), zero-inflated Poisson (Lambert 1992; Miaou 1994; Shankar et al. 1997), or quasi-Poisson (McCullagh and Nelder 1989; Wedderburn 1974) models. However, only a few such as the NBM are well known and widely available. The NBM has been studied and tested in different fields such as epidemiology (Byers et al. 2003), accident statistics (Poch and Mannering 1996), biology (Anscombe 1950; Bliss and Fisher 1953), etc. However, it has only recently been applied to evoked neural responses (Scott and Pillow 2012; Goris et al. 2014) where the variability was traditionally supposed to be well modeled by the PM.

An NBM depends on a distributional form equivalent to a compound stochastic process. It has recently been represented as a Polya-Gamma mixture of normals (Polson and Scott 2012), but its standard representation is the Gamma-Poisson mixture. It corresponds to a doubly stochastic process where the Poisson process has a probability distribution with a parameter λ, which is itself a random variable generated from a Gamma distribution (ϕ, β) with a shape parameter ϕ and a scale parameter β. When we consider a random variable *x* generated by this compound process [i.e., *x* ∼ (λ) with λ ∼ (ϕ, β)], the resulting distribution of this variable corresponds to a Negative-Binomial distribution (NB), which is usually defined by its shape parameter ϕ and its scale parameter *p* =
(2)

With the use of the NBM, the probability of getting *k* spikes from a given cell with parameters ϕ and *p* in response to a specific stimulus is given by:
(3)

We deduce the mean (μ), the variance (σ), and the FF as: (4)

This distribution approaches the Poisson probability of parameter μ for large values of ϕ, but allows to reach larger-than-Poisson variances for small ϕ values. We end up with a multivariate distribution that is, in practice, driven by two parameters: the mean spike count (μ) and the parameter (ϕ) that we call the “inverse dispersion parameter.”^{1}

Under the NBM hypothesis, the distribution of interspike intervals (ISIs) is given by a Loamax distribution (equivalent to an exponential distribution with a random parameter generated from a Gamma distribution). The mean and the variance of such a distribution are given by the following expressions: mean(ISIs) = defined for ϕ > 1, variance(ISIs) = defined for ϕ > 2. The resulting squared coefficient of variation is given by: (5)

defined for ϕ > 2. CV^{2} is higher than 1, which implies than a Negative-Binomial process is more irregular than a Poisson process and approaches Poisson regularity for large values of ϕ. However, unlike under the Gamma distribution hypothesis made in Nawrot et al. (2008), the NBM do not allow us to systematically relate overdispersion to the positive deviation from the null-hypothesis FF = CV^{2}. Thus, to better qualify this overdispersion parameter, we need to focus on how to generate the Gamma input drive that feeds the Poisson generator.

### Overdispersion May Result from Extrinsic Sensory Noise: NBM as a mixture of Poisson and Log-Normal Distributions

Many studies assume a Log-Normal or a Gamma distribution when analyzing spike count distributions as it most often produces similar results (McCullagh and Nelder 1989; Atkinson 1982). Indeed, a Log-Normal process might be a good approximative model for a Gamma process. The underlying spiking process could result from a simple encoding model: let us assume that a noisy sensory input is transformed to a response by a linear correlation measure with a cell's receptive field (for instance, in the early visual system, selective to a given orientation) and then through a nonlinearity into a stochastic spike count (Fig. 2). If we consider additive extrinsic Gaussian noise as the sensory noise and an exponential function as the nonlinearity, the resulting input drive follows a Log-Normal distribution that is closely fitted by a Gamma distribution. With the use of such a model, Fig. 2 shows that overdispersion could simply result from an additive normal sensory noise and the nonlinearity of the spiking response.

However, even if Log-Normal and Gamma distributions are often used to model the same phenomena, their log functions, in general, show clear differences in their respective skewness. Furthermore, there is no exact morphing between the two distributions that allows to link their respective parameters analytically. As such, this approximation does not allow a precise quantification of the relationship between the noise source and the overdispersion parameter.

### Overdispersion May Result from Redundancy Within a Neural Population: NBM as a Mixture of Poisson and Exponential Distributions

The Gamma distribution could also result from a mixture of distributions. Indeed, it is well known that the sum of *n* independent and identically distributed exponential random variables (the parameter of an exponential distribution is by definition the inverse of its mean, here called λ) follows a Gamma distribution (*n*, λ), where *n* and λ are its shape and scale parameters respectively. As such, exponential distributions have previously been used as a simple model of variability in formal neural networks (Treves and Rolls 1991; Levy and Baxter 1996; Baddeley 1996). They correspond to a model of the distribution of time between two Poisson events. In addition, such a distribution maximizes the entropy of spike counts (knowing the average count) and thus maximizes the carried information (Shannon 1948).

From a biological point of view, to drive a Gamma input, with an inverse-dispersion parameter ϕ equal to *n*, to the output neuron considered as a Poisson generator, this neuron should receive as input the sum of n exponential variables having the same mean λ (see Fig. 3). Thus ϕ reflects the amount of convergence to a neuron, which is, for instance, consistent with a model of the convergent connectivity from V1 to area V5/MT (Simoncelli and Heeger 1998; Rust et al. 2006; Wang et al. 2012). In addition, it was observed that the simulation of networks based on biologically realistic parameters (Voges and Perrinet 2010) could lead to complex dynamics, showing in particular an excess of variability (Voges and Perrinet 2012). Thus we could expect an overdispersion in V5/MT resulting from dimensionality reduction [i.e., projecting inputs from a high-dimensional space to a lower dimensional one (Haykin 1999)].

To summarize, we have focused on two main possible sources of overdispersion in spike counts. The first, is an extrinsic source that accounts for sensory noise, conveyed along the thalamo-cortical pathway. The second is a source of noise that emerges from convergent inputs to a population, including feedforward, lateral, and feedback. We have shown that the convergence of inputs between two data spaces may both result in an overdispersion. In the next section, we will focus on quantitatively testing evidence for this overdispersion.

## THE FANO GAMMA TEST: HOW TO QUANTIFY OVERDISPERSION EVIDENCE?

### The FF Test Is Misleading

Most studies concerned with the characterization of spike count variability using different methods and formal tests start with the FF. However, most of the methodological studies measuring trial-to-trial variability in firing rates (Teich et al. 1997; Koch and Schutter 1999; Dayan and Abbott 2001) do not take the estimation uncertainty introduced by the limited number of trials into account. Each trial may be considered as a realization of the random process and when accumulating evidence from *N* trials, the average and the variance converge with an expected error of 1/. Instead, the computed mean and variance are often treated as real parameters of the distribution whereas they actually correspond to estimations given the (necessarily) limited set of measurements. Thus considering only the FF value may induce errors in conclusions regarding the assumption of probability distribution. In particular, when the observed FF deviates from 1, is this the result of the limited number of trials or due to an overdispersion? Or when a value is measured close to 1, can it be safely attributed to a PM?

### The χ^{2}-Test Is Problematic

To evaluate how likely the observed data was generated by a PM, the most common approach is to use a χ^{2}-goodness of fit test. This statistical tool can be used to estimate how closely an expected distribution matches an observed distribution of a discrete quantitative variable (having only finite possible values). However, it has been shown that small expected frequencies may result in a loss of power, that is, a tendency to not reject a false null hypothesis (or to reject a true null hypothesis) (Cochran 1954; Yates et al. 1999; Campbell 2007). This is a potential problem when we deal with small sample sizes and its outcome also depends on how data is pooled into categories. Indeed, it often makes sense to pool the small frequencies together. Moreover, as this test is one tailed, it does not distinguish evidence for overdispersion from evidence for underdispersion.

### The Fano Gamma Test

To better quantify the overdispersion while accounting for sample size and without dealing with pooling problems, we propose a statistical test based on an empirical sampling distribution of the FF under the Negative-Binomial assumption. Here, we opt for an analytic procedure that we call Fano Gamma test (FG). It allows for a simple computation (or tabulation) of probability bounds for specific distributions hypotheses. That is not the case for other existing procedures (Amarasingham et al. 2006; Jackson and Redish 2007; Nawrot et al. 2008). This procedure characterizes the probability distribution of the estimated FF. Thus it allows one to compute FF bounds with respect to *P*-value precision limits. Starting from previous analyses (Hoel 1943; Kathirgamatamby 1953; Eden and Kramer 2010) about the distribution of the Fisher dispersion index, we propose here that, for a Negative-Binomial distribution, the FF asymptotically follows a Gamma distribution with a shape parameter and a scale parameter
(6)

where *n* is the number of spike count observations. We tested the accuracy of this approximation through simulation by comparing it to the empirical estimates of the true distribution of the FF for any (μ, ϕ, *n*) triplet. We found that the *P*-value upper bounds resulting from the approximative formula are slightly overconservative. Thus it is worthwhile mentioning that using it to reject the NBM hypothesis is based on a strong assumption (i.e., a rejected realization could be a good one because its *P* value is between the approximation and the empirical upper bound), but we do not detail this analysis here. For large values of ϕ, the Negative-Binomial distribution approaches a Poisson distribution and the previous formula gives:
(7)

Eden and Kramer (2010) studied the FF distribution under the Poisson distribution hypothesis and have shown that the convergence to this asymptotic distribution is fast. Thus this analysis provides a formal test for both hypotheses (PM and NBM) given the observed variability in the spiking.

We first compared the power of this test to the χ^{2}-test for the general NBM hypothesis and for the specific PM hypothesis on surrogate data of spike counts. Figure 4 shows that the FG test did better than a χ^{2}-goodness of fit test at a significance level of 0.025. This latter test has a larger probability to make a type II error for both distributions (dashed red lines in Fig. 4*C* and dashed blue lines in Fig. 4*D*).

### Overdispersion Evidence from Neurophysiological Data

We applied these tests to evaluate evidence against the PM on three data sets.^{2}

The first data set results from recording signals from mouse LGN neurons (72 cells) in response to moving gratings. The stimulus was presented several times in 12 different directions. The second data set comes from extracellular recordings in area V1 (67 cells) of two awake macaque monkeys in response to an oriented moving bar. The task was repeated several times for 12 different directions (orthogonal to the bar's orientation). The third data set results from recording in area MT (40 cells) of three anaesthetized macaque monkeys in response to moving gratings. The stimulus was presented several times in 16 different directions. See materials and methods for more details on the data collection.

We show the results in Fig. 5. For the three data sets, we found that the FG test gives lower evidence against the PM and the NBM compared with the FF and χ^{2}-tests (as expected given our discussion above). The FG test shows, as do the other tests, that the evidence of overdispersion is only significant in MT (with a rate of 64%) among cell/condition pairs, compared with LGN and V1 (with rates of 19 and 26%, respectively). Similarly, it shows that evidence against the NBM is significantly lower in the MT data set (15%) compared with LGN and V1 data sets (52 and 41%, respectively).

For these particular data sets where the number of trials is limited (mostly less than 20 trials per condition), we have shown that in the LGN and V1 data sets the high values of FF observed could be simply related to the sampling of the stochastic process itself. Indeed, looking only at the FF values induces a false belief that these data sets exhibit a significant degree of overdispersion, while the FG test reliably rejects this hypothesis. The χ^{2}-test shows slightly more evidence against the PM than the FG test, which could be explained by categorization issues. It shows also very strong evidence against the NBM known to be suitable for overdispersed data. Finally, we have shown that the situation is different for the MT data set where we could indeed show evidence of a higher variability than expected from a PM and less evidence against an NBM. It is worth mentioning that our statistical test (as validated on surrogate data) allows us to make this conclusion, while other tests fail to do so. Since we mostly observe overdispersion in the MT data set, we propose to compare the performance of an NBM-based to a PM-based direction decoder applied to this data set.

## NBM-BASED DIRECTION DECODER: WHAT IF THE OVERDISPERSION IS TUNED?

To evaluate the gain in decoding when accounting for the overdispersion using NBM compared with PM, we extended the classical probabilistic decoding approach proposed by Jazayeri and Movshon (2006) and tested the population decoder on the MT data set (presenting the highest overdispersion probability). We based it on a PM of trial-to-trial variability applied to single neurons and a model of tuning for a population coded variable (direction θ). This decoding approach allowed us to infer the stimulus (the hidden variable θ) given a particular neuronal response (an observed spike raster *Y*).

The decoding algorithm consists of maximizing the posterior probability *P*(θ|*Y*) function of the estimate direction θ given a distribution hypothesis. Bayes' rule gives: *P*(θ|*Y*) = , where the evidence term *P*(*Y*) is a normalization term independent of θ. Therefore, if we assume that there is no prior knowledge on θ(*P*(θ)constant), maximizing the posterior under the PM hypothesis is equivalent to maximizing the following log-likelihood function *LP*(θ):
(8)

where *f*_{i} is a function of the stimulus parameter θ and corresponds to the mean response of each cell *i*, *k*_{i} is the number of trials, and *N* is the number of cells. Similarly, under the NBM hypothesis, we maximize the following log-likelihood function LNB(θ) to estimate θ, where *p*_{i}(θ) = and ϕ_{i} corresponds to the inverse-dispersion parameter of each cell *i* as a function of the stimulus parameter θ:
(9)

Given that objective, let us first define the tuning functions of the cells. These correspond to the mean spike count (the mean parameter of the NBM) with respect to direction. Thus we defined for the MT data set a generative tuning function as von Mises functions peaking on the preferred direction. The parameters corresponding to the scaling, maximum and minimum values (κ_{o}, κ_{d}, *R*_{0}, *R*_{0} + *R*_{m}) are free to vary. The main tuning function is given by:
(10)

Figure 6*B* shows the estimated tunings of *f* for 16 different cells.

Then, we should consider the estimation of inverse-dispersion parameter ϕ. We tested two hypotheses. First, having no prior on the tuning of ϕ as a function of the stimuli direction, we supposed that this parameter is stimulus independent (constant chosen as the average value of its estimates over the different conditions) and varying between cells. Second, we tested the hypothesis of a stimulus dependent inverse-dispersion parameter. We considered a von Mises tuning of ϕ centered on the preferred direction for each cell. Figure 6*D* shows the estimated tunings of ϕ for 16 different cells. Please note that our aim is not to have an optimal fit but to show that a simple tuning hypothesis of ϕ can improve the quality of decoding, even if the estimation of ϕ values is difficult (see discussion).

Finally, to perform population decoding, we used a leave-one-out (LOO) cross validation method. In this method, we randomly pick a single trial from the data set for each neuron and each condition. Then, we apply the decoding approach previously described: we infer an estimate of the encoded direction (stimulus direction) that maximizes the log-likelihood functions based on the PM (*Eq. 8*) or the NBM (*Eq. 9*) using the spike count of each cell given by the picked trial (as *k*_{i}) and the estimated values of mean and inverse-dispersion using the remaining trials(as ϕ_{i} and *f*_{i}). Then, we compare the estimated direction angle to the stimulus direction for the PM model and the NBM model under tuned and untuned dispersion hypotheses.

The resulting decoding error values for both models applied to MT data show that the precision of decoding of the PM and the NBM under the nontuned dispersion hypothesis are very similar (Fig. 7*A*). However, we found that there is a significant gain in decoding when using the NBM under the tuned dispersion hypothesis (Fig. 7*C*). To test if this is related to some data or estimation bias, we performed the same test on surrogate data. We generated Negative-Binomial data using the estimated parameter and applied the two decoding schemes. Similarly, the results of decoding on surrogate data show that the precision of decoding is not improved when using nontuned dispersion (Fig. 7*B*). However, when considering additive knowledge about overdispersion tuning, there is a clear gain in accuracy and precision (mean and variance Fig. 7*D*), which seems to follow decoding performance in real data.

Moreover, comparing the goodness of fit of these models to MT data favors strongly the tuned NBM. Indeed, we used the Bayesian information criterion (BIC) that penalizes models with additional parameters (Box et al. 1994) to determine which model best explained our data. With the BIC measure, 37 (respectively 28) over the 40 cells were best fit with the tuned NBM (respectively nontuned NBM) and the remaining were best fit with the PM.

In summary, we have shown two main results in this section. First, the PM performs as well as the NBM in the decoding task most of the time but not for good reasons as the data set is not well described by a Poisson variability model. Second, the NBM with tuned overdispersion performs better and could be a generic model of neural computation.

## DISCUSSION

In this article, we explored overdispersion in neural spike counts using a doubly stochastic model, namely the NBM, that could disentangle different sources of overdispersion. We have first explained that some commonly used procedures to quantify overdispersion (the FF and χ^{2}-goodness of fit tests) could be problematic. Then, we have shown that the alternative analytical procedure that we propose (the FG test) better characterizes overdispersion in three different data sets (different animals, tasks, conditions, and areas). Using the FG test we show that there is weak evidence of overdispersion in LGN and V1 data sets and low evidence against the NBM in MT data set, especially for high mean spiking values, a trend expected from (Goris et al. 2014).

In the last section, we compared the performances of the traditional PM to the more recent NBM on a population decoding algorithm (direction decoding from spike recording in MT cells of an anesthetized macaque monkey). Under the Poisson hypothesis, a linear decoding model is optimal, as proposed by Jazayeri and Movshon (2006). We show here that this decoder is still efficient (mean decoding error <10°) even though there is strong evidence against the Poisson hypothesis. This is partly due to the tuning profile of the likelihood functions: the direction corresponding to the maximum likelihood will converge toward the stimulus' direction even if the noise model is not optimal.

Intuitively, one could expect that using the compound model that accounts for the observed overdispersion, one should obtain better decoding results. We first considered the hypothesis of nontuned cells dispersion (i.e., each cell dispersion parameter is constant). The PM and the NBM with nontuned dispersion show similar direction decoding results, not only on real data, but also on surrogate data. However, note that, similarly to what was reported before (Goris et al. 2014), we found that the likelihood values of the estimated directions are always higher for the Negative-Binomial-based decoding compared with the Poisson-based decoding (not shown), but with similar bandwidths. This leads to similar decoding accuracies, despite the fact that the NBM fits better to the data. Indeed, the spike counts corresponding to the response of the cell to its preferred direction are more likely to be generated by an NBM with higher inverse-dispersion values, i.e., Poisson like. This could explain why PM performance could be comparable to NBM performance in decoding.

In addition, the estimation of the inverse-dispersion parameter is known to be difficult. Indeed, the estimation of this parameter has been widely studied in different fields (Gourieroux et al. 1984; Lawless 1987; McCullagh and Nelder 1989; Dai et al. 2013) and several estimation methods were provided such as the maximum likelihood method (Piegorsch 1990), or the method of moment and maximum extended quasilikelihood method (Clark and Perry 1989). A wide range of studies have shown that the estimation of this parameter is very challenging and can be significantly biased, especially when dealing with small data sets. It is worthwhile to point out this problem here: since neurophysiological experiments are costly and time consuming, we often deal with small data sets. Thus having more knowledge and control about the dispersion tuning is crucial.

Therefore, we tested a simple possible form of dispersion tuning, that is, the tuning of the inverse dispersion having similar preferred direction as the tuning of the spike count (using a von Mises function). We found that under the hypothesis of a stimulus-dependent inverse-dispersion, the NBM performs qualitatively better.

Moreover, such a tuning function could explain complex FF tuning profiles that were recently reported (Ponce-Alvarez et al. 2013). Our results lead us to propose that the reported trial-by-trial variability among MT neurons, showing a directional tuning that is not trivially explained by firing rate variations alone, may simply be explained by a bell-shaped tuning of the overdispersion (see Fig. 8). However, contrary to what they expect, we did not observe a decrease of overdispersion with respect to the spontaneous level. Indeed, applying the FG test to MT spontaneous spiking data (results not shown) yields lower evidence against a PM (47.5%) compared with that for evoked spiking (64.1%, Fig. 5*D*) but similar evidence against NBM (15%) compared with evoked spiking (16.3%, Fig. 5*D*). This is more in line with an increase of underdispersion but is still in accordance with the known fact that FF is decreased when a stimulus is applied.

Where could such a tuning of the overdispersion come from? One first hypothesis could be that overdispersion results from the sensory noise. Then, as suggested by Goris et al. (2014), overdispersion could also reflect an excitability state of the cell. However, both of these hypotheses cannot account for stimulus-dependent variability. Alternatively, we suppose that this property is better apprehended at the population scale. Going back to the encoding paradigm described earlier (Fig. 3), an alternative hypothesis would be that the overdispersion arise from a stimulus-dependent degree of network convergence (within MT or from V1). Such a scheme could result from a center-surround connectivity profile in the direction domain: more excitatory input for a stimulus close to the preferred direction of the recipient cell.

This study calls for an extension over a wider range of experimental observations. First, we would like to extend this work to more data sets with more trials to explore if our predictions are still valid, both for teasing out sources of variability but also to evaluate population decoding. It would be mostly interesting to compare this extra variability along the sensory pathways to associative areas to confirm the tendency that we observed with a gradual increase of overdispersion from the lower to associative areas. A second important perspective is to consider the role of experimental conditions. Indeed, it is likely that results would greatly vary depending on the arousal state of the animal, the nature of the stimulation, or the behavioral task. In particular, behaviorally more relevant stimulations (such as the model-based synthesis of textures; Sanz-Leon et al. 2012) may lead to a more precise response (Baudot et al. 2013).

## GRANTS

W. Taouali, F. Chavane, and L. U. Perrinet were supported by EC FP7-269921, “BrainScaleS.” F. Chavane and L. U. Perrinet were also supported by ANR project “BalaV1” (ANR-13-BSV4-0014-02). G. Benvenuti was supported by by FACETS ITN Project (European Union funding, Grant 237955) and “Marie-Curie Initial Training Network.” P. Wallisch was supported by National Eye Institute Grant F32-EY-019833. We also thank the Philippe Foundation for a travel grant (to F. Chavane).

## DISCLOSURES

No conflicts of interest, financial or otherwise, are declared by the author(s).

## AUTHOR CONTRIBUTIONS

Author contributions: W.T., F.C., and L.U.P. conception and design of research; W.T., G.B., P.W., F.C., and L.U.P. analyzed data; W.T., G.B., P.W., F.C., and L.U.P. interpreted results of experiments; W.T. and L.U.P. prepared figures; W.T. and L.U.P. drafted manuscript; W.T., P.W., F.C., and L.U.P. edited and revised manuscript; W.T., G.B., P.W., F.C., and L.U.P. approved final version of manuscript; G.B., P.W., and F.C. performed experiments.

## ACKNOWLEDGMENTS

We are indebted to T. Movshon and his laboratory for giving the opportunity to F. Chavane to perform MT experiments at the Center for Neural Science, New York University. We thank I. Balansard and M. Martin for invaluable help with animal experimentation and X. Giovanni, A. de Moya, and J. Baurberg for important technical support.

Present address of G. Benvenuti: Center for Perceptual Systems, Univ. of Texas, Austin, TX.

## Footnotes

↵1 The inverse of the parameter ϕ, usually denoted by α, has been often called in the literature as the “dispersion parameter.” It has been occasionally named the “overdispersion parameter” because it does not account for FF values smaller than one.

↵2 Indeed, an hypothesis test is a statistical analogy to proof by contradiction. Thus we used the “evidence against” term that is the result of the statistical test and that means the rejection plausibility. Importantly, the evidence against a claim does not imply the evidence (plausibility) of the opposite claim , i.e., this is not the same as a likelihood-based test that allow explicitly to quantify the weight of evidence for one hypothesis.

- Copyright © 2016 the American Physiological Society