

     


J Neurophysiol 95: 3633-3644, 2006. First published March 22, 2006; doi:10.1152/jn.00919.2005
0022-3077/06 $8.00

Effects of Noise Correlations on Information Encoding and Decoding

Bruno B. Averbeck and Daeyeol Lee

Department of Brain and Cognitive Sciences, Center for Visual Science, University of Rochester, Rochester, New York

Submitted 1 September 2005; accepted in final form 13 March 2006


 ABSTRACT
 
 
Response variability is often correlated across populations of neurons, and these noise correlations may play a role in information coding. In previous studies, this possibility has been examined from the encoding and decoding perspectives. Here we used d prime and related information measures to examine how studies of noise correlations from these two perspectives are related. We found that for a pair of neurons, the effect of noise correlations on information decoding can be zero when the effect of noise correlations on the information encoded obtains its largest positive or negative values. Furthermore, there can be no effect of noise correlations on the information encoded when it has an effect on information decoding. We also measured the effect of noise correlations on information encoding and decoding in simultaneously recorded neurons in the supplementary motor area to see how well d prime accounted for the information actually present in the neural responses and to see how noise correlations affected encoding and decoding in real data. These analyses showed that d prime provides an accurate measure of information encoding and decoding in our population of neurons. We also found that the effect of noise correlations on information encoding was somewhat larger than the effect of noise correlations on information decoding, but both were relatively small. Finally, as predicted theoretically, the effects of correlations were slightly greater for larger ensembles (3–8 neurons) than for pairs of neurons.


 INTRODUCTION
 
 
The possibility that patterns of activity across neurons are important features of the neural code has led to their study from a number of perspectives. Many of these studies have focused on noise correlation, which is the between-neuron correlation in the variability of the neural response to a fixed stimulus, with the response often measured as a spike count (Gawne and Richmond 1993; Lee et al. 1998). Noise correlations should be distinguished from signal correlations, which are correlations in average spike count, and from coherent oscillations, which decompose noise correlations into multiple temporal or frequency points (Averbeck and Lee 2004). Here we examine the role of noise correlations in information encoding and decoding. Studying information encoding involves studying the mapping between the stimulus and the population neural response, and estimating the total amount of information present in the neural responses. To evaluate the effect of noise correlations on information encoding, we can determine whether neurons with correlated noise encode more or less information relative to those without correlated noise. Studying information decoding involves studying the mapping from the population neural response to a prediction of the stimulus. When we assess the effects of noise correlations on information decoding, we calculate the amount of information lost when a decoding algorithm derived by ignoring correlations is applied to the neural responses with noise correlation.

Theoretical studies using Fisher Information have examined the effect of correlations on the information encoded by populations of neurons. Fisher Information bounds the variance with which a parameter encoded by a population of neurons can be estimated (Casella and Berger 1990). These studies have found that noise correlations can increase or decrease the information encoded with respect to an uncorrelated population, depending on their relationship with signal correlation (Johnson 1980; Snippe and Koenderink 1992). They have also found that information either grows with the number of neurons in a population (Abbott and Dayan 1999; Shamir and Sompolinsky 2004; Wilke and Eurich 2002) or saturates as the number of neurons goes to infinity (Sompolinsky et al. 2001; Zohary et al. 1994), depending on the structure of the correlations in the population. Theoretical work has also been done on the effect of noise correlations on information decoding (Shamir and Sompolinsky 2004; Wu et al. 2001). However, how the effects of noise correlation on information encoding and decoding are related has not been systematically investigated.

Empirical studies have focused on whether more or less information can be extracted from neural responses when trials are shuffled, destroying correlations (Averbeck et al. 2003; Gawne and Richmond 1993; Gawne et al. 1996; Golledge et al. 2003; Panzeri and Schultz 2001; Panzeri et al. 1999, 2002; Petersen et al. 2001, 2002; Pola et al. 2003; Rolls et al. 2003; Romo et al. 2003), or whether correlations could be ignored by decoding algorithms without a loss of information (Averbeck and Lee 2003; Dan et al. 1998; Maynard et al. 1999; Nirenberg et al. 2001; Oram et al. 2001). These studies have generally found that noise correlations have little impact on information coding (Averbeck and Lee 2004). However, they have only analyzed interactions at the level of pairs of neurons. Finally, a number of other studies have considered neural coding at the ensemble level, but they have not directly addressed the effects of noise correlations on encoding or decoding (Brown et al. 1998, 2004; Nicolelis et al. 1997; Truccolo et al. 2005).

The effects of noise correlations on neural coding have been assessed using information measures and decoding algorithms. In general, there are many different ways to quantify information (Arndt 2001). In this study, we measured information using the square of d prime (d²) and used d² to estimate the fraction or percent correct we obtained in corresponding decoding analyses. This allowed us to link directly the results of using an information measure (d²) and using a decoding analysis to study the effects of noise correlations. Although measures of the fraction correct are often closely correlated with Shannon information (Averbeck et al. 2003), it is theoretically possible to dissociate them (Thomson and Kristan 2005). The fraction correct is more directly related to behavior in experiments in which stimuli must be discriminated or movements must be produced. Because d² is the discrete analog of Fisher Information, the use of d² on our experimental data provides an assessment of the effect of correlations similar to that used in the theoretical studies cited above. Finally, d² is a simple measure, and therefore its interpretation is straightforward. We exploit this simplicity to examine how the effects of noise correlations on information encoding and decoding are related because both encoding and decoding have been studied in the literature, but they have not been linked.

In our results, we show extensively that d² provides an accurate measure of the amount of information in the neural activity recorded from the supplementary motor area of monkeys. In this study, information refers to the discriminability of the targets toward which the monkey reached. The analyses also showed that the decoding performance predicted by d² agreed closely with the results from actually carrying out the corresponding linear decoding analyses and that nonlinear decoding algorithms do not extract more information than a linear decoding algorithm. Thus both d² and the corresponding linear decoding algorithms provide an accurate measure of the information in the neural activity.


 METHODS
 
 
General

The data set analyzed in this study has been described previously (Lee and Quessy 2003). All the procedures used in this study were approved by the University of Rochester Committee on Animal Research and conformed to the principles outlined in the Guide for the Care and Use of Laboratory Animals (National Institutes of Health Publications No. 85–23, revised 1996). Neurons analyzed in the present study were recorded from the left caudal supplementary motor area (SMA-proper or F3) in two rhesus macaques producing sequences of visually guided reaching movements.

Behavioral task

Two animals were trained on the serial reaction time task shown in Fig. 1. They sat facing a computer monitor on which a series of targets was presented in a 4 x 4 grid. The animals acquired each target by reaching toward the corresponding location on a touch screen placed horizontally in front of the animal. After acquisition of a target, the subsequent target was presented after a 250-ms delay. A correct trial consisted of a sequence of 10 target acquisitions after which a juice reward was given. All data analyzed in this study were obtained from the task condition in which the monkey repeated a deterministic sequence of three movements three times (i.e., a single trial was 3 repeats of a 3 target sequence, for example, ABCABCABCA as shown in Fig. 1), with the first target of the sequence repeated at the end of the sequence. A new target sequence was selected randomly for each recording session. The minimum number of trials analyzed was 152, and the average was 267. Because each movement was repeated three times in each trial, ≥456 repetitions of each movement were available for analysis, with an average of 801.


 
FIG. 1. Task. Left: time sequence of events in a trial. Right: movements for an example sequence. The trial starts when 1 of the targets is illuminated, for example, A. After acquisition of a target (i.e., when the monkey touches the corresponding location on the touch-screen), there is a 250-ms delay, and then the next target, in this case B, is illuminated, and the monkey has to move to B. This pattern continues until the monkey has completed the sequence of 3 movements 3 times.

 
Data preprocessing

We analyzed the responses of 193 pairs and 19 ensembles of simultaneously recorded neurons. The data for each trial were split into epochs corresponding to each of the ten movements of the sequence. Data from the first movement were not considered because they followed the inter-trial interval and varied from trial to trial. Neural activity in the period from 0 to 200 ms after target onset was used to predict the target toward which the animal was about to reach. Previously, we found that the optimal classification accuracy was obtained when we split the 200-ms epoch into 3 bins of 66-ms duration (Averbeck and Lee 2003). Thus the same three bins were used for the analyses shown in Figs. 4 and 5. For the analyses in which a single 66-ms window was considered (Figs. 7 and 8), the final 66 ms of the 200-ms epoch (i.e., from 134 to 200 ms) was used because this tended to contain the most information.


 
FIG. 4. Measured vs. predicted accuracy for pairs of neurons. Values in the top left corner of scatter plots are Spearman rank order correlation coefficients (c), and those in histograms are means (m) of distributions. The asterisk indicates that the mean was significantly different from 0 (P < 0.05). Top: measured (abscissa) and predicted (ordinate) accuracy values for A, A_shuffled, and A_diag, respectively. Bottom: measured and predicted accuracies for ΔA_shuffled and ΔA_diag. Histograms of marginal distribution of decoded (measured) values are shown at the top of each plot. Data points associated with covariance matrices with negative eigenvalues, caused by numerical errors, were removed from the plots as these tended to be outliers. The outlined data points indicated by the arrows show a single pair of neurons for which the effect of noise correlations on information encoding (left) was relatively large, but the effect on decoding (right) was essentially 0.

 

 
FIG. 5. Measured vs. predicted accuracy for ensembles of neurons. Conventions are as in Fig. 4.

 

 
FIG. 7. A: Bhattacharyya distance (BD) vs. d². We have plotted 8 × BD because the 1st term of the BD is d²/8. B: comparison of linear and quadratic classifiers. These comparisons are for the corresponding accuracy. Inset: histogram of differences between linear and quadratic classifiers. C: linear and quadratic decision boundaries (green) for a case in which BD predicted a large benefit of covariances, but the actual classification performance was the same for linear and quadratic decoders. The covariance ellipse for target 1 is indicated in blue and for target 2 in red; the mean of each distribution is indicated in black. The blue and red dots indicate individual responses for 2 targets with the blue dots shifted slightly rightward. In the linear case, the covariances are forced to be identical. In the quadratic case, the ellipse for target 1 is smaller than the response marker, and so is not visible.

 

 
FIG. 8. Performance of multinomial classifier. A: decoding accuracy for linear vs. multinomial classifier. Inset: histogram of the difference between linear and multinomial classifiers. B: comparison of differences in decoding accuracy for linear vs. quadratic and for linear vs. multinomial classifiers. C: effect of noise correlations on information encoded, ΔA_shuffled, assessed with the multinomial classifier. D: comparison of ΔA_shuffled assessed with the multinomial and linear classifiers. E: effect of noise correlations on information decoding, ΔA_diag, assessed with the multinomial classifier. F: comparison of ΔA_diag for multinomial and linear classifiers.

 
In our previous work (Averbeck and Lee 2003), the neural responses were used to predict one of the three possible movement targets (e.g., A, B, or C in Fig. 1). In this paper, we decoded targets in pairs because this is the case described by d prime, as discussed in the following text. Pairwise analyses of the three targets resulted in three separate analyses for each set of neural responses considered. For most of the analyses in the present study, we averaged the results from the three different pairs of targets. For the decoding analysis carried out on ensembles of neurons (Fig. 5), we show the results separately for each target pair because a relatively small number of ensembles were available for that analysis.

Analysis of d prime, encoding, and decoding

In the present study, we used d² to measure information. Figure 2A shows the components of d², which is defined as

$$d^2 = \frac{(\mu_2 - \mu_1)^2}{\sigma^2} \tag{1}$$
where µ_i indicates the mean spike count of a neuron to target i and σ² is the variance of the spike count. This plot shows that d² is a measure of the discriminability of samples from two Gaussian response distributions, characterized by the same variance and different means. In our case, these response distributions correspond to the spike counts of a single neuron for two different movement directions. In RESULTS, because we always considered the responses of more than one neuron, we used a multivariate generalization (Poor 1994), given by

$$d^2 = \Delta\mu^{T} Q^{-1} \Delta\mu \tag{2}$$
where Δµ is the vector difference in mean responses to the pair of targets and Q is the pooled or average covariance matrix with the average taken across targets. The dimensionality of Δµ and Q depends on the specific analysis. For example, when pairs of neurons and three time bins are analyzed, the dimensionality is 6. Multiplying by the inverse covariance matrix is analogous to dividing by the variance in Eq. 1. We also used two other measures that, when combined with d², provide an estimate of the effect of noise correlations on information encoding and decoding. The first is a measure of the information that would be contained in the neural responses if they were uncorrelated. We call this d²_shuffled because shuffling trials is often used to destroy correlations in neurophysiological data. It is defined as

$$d^2_{\mathrm{shuffled}} = \Delta\mu^{T} Q_d^{-1} \Delta\mu \tag{3}$$
where Q_d is the diagonal covariance matrix obtained by setting the off-diagonal elements corresponding to correlations between neurons to 0. Finally, we defined d²_diag as

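$$d^2_{\mathrm{diag}} = \frac{\left(\Delta\mu^{T} Q_d^{-1} \Delta\mu\right)^2}{\Delta\mu^{T} Q_d^{-1} Q\, Q_d^{-1} \Delta\mu} \tag{4}$$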


 
FIG. 2. Illustration of d². A: d² measures the separability of 2 Gaussian distributions. The portion of the left distribution that is shaded in yellow would be incorrectly classified using the optimal decision boundary indicated at the midpoint between the distributions. Therefore an estimate of the amount of the left distribution lying to the right of the midpoint provides an estimate of the percentage of movements which would be misclassified. B: projection of multivariate response distributions onto a linear discriminant line (solid line). The dotted lines represent corresponding decision boundaries. The green lines correspond to the optimal decision boundary and the red lines to the sub-optimal decision boundary obtained by ignoring correlations. Classification is carried out by projecting responses (example r) onto the discriminant line perpendicular to the decision boundary.

 
This measures the amount of information that would be extracted by using a decoding algorithm that ignored correlations on the original unshuffled dataset. We refer to this as d²_diag because it amounts to assuming a diagonal covariance matrix for the neural responses when deriving the decoding algorithm. In this case, the decoding algorithm is suboptimal. This quantity can be derived by computing the variance of the linear decoder obtained by ignoring correlations with respect to the real response distribution. It was derived for Fisher Information by Wu et al. (2001) as a local linear approximation. In our case, the formula is exact because the difference in the mean responses is necessarily linear.

To understand what is being measured by d²_diag, we can consider the process of linear decoding. When carrying out linear decoding analyses, one projects the neural response vector onto a discriminant line that is perpendicular to the linear decision boundary (see Fig. 2B). This results in a scalar decision variable that is compared with a threshold to make the classification decision. In Fig. 2B, we show an example of a response r, projected onto both an optimal and a suboptimal discriminant line. The projection onto both lines is shown at the bottom of the plot. If this decision variable is compared with the decision boundary that bisects the distributions, the response will be properly classified in the optimal case because it was actually generated by target 2 but misclassified in the suboptimal case. If we project the entire distribution of responses onto the corresponding discriminant lines, we get the decision variable distributions shown at the bottom of Fig. 2B. The overlap in these distributions can be related to d². Because the optimal decision boundary is defined as the one that maximally separates the distributions, projection onto a suboptimal decision boundary, for example one derived by ignoring correlations, always results in worse classification performance, in theory. Correspondingly, d²_diag, as it would be measured in the distribution on the left, cannot be larger than d², as it would be measured in the distribution on the right, and decoding algorithms that ignore correlations cannot lead to fewer misclassifications.
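The projection argument can be written out as a short derivation. This is a sketch under stated assumptions rather than the authors' own derivation: the weight vector $w_{\mathrm{diag}}$ and the decision variable $y$ are notation introduced here, and $Q$ is taken to be the same pooled covariance for both targets.

```latex
\begin{aligned}
w_{\mathrm{diag}} &= Q_d^{-1}\,\Delta\mu, \qquad y = w_{\mathrm{diag}}^{T}\, r \\
E[y \mid t = 2] - E[y \mid t = 1] &= w_{\mathrm{diag}}^{T}\,\Delta\mu
   = \Delta\mu^{T} Q_d^{-1} \Delta\mu \\
\operatorname{Var}[y \mid t] &= w_{\mathrm{diag}}^{T}\, Q\, w_{\mathrm{diag}}
   = \Delta\mu^{T} Q_d^{-1} Q\, Q_d^{-1} \Delta\mu \\
d^2_{\mathrm{diag}} &= \frac{\left(E[y \mid t = 2] - E[y \mid t = 1]\right)^2}{\operatorname{Var}[y \mid t]}
   = \frac{\left(\Delta\mu^{T} Q_d^{-1} \Delta\mu\right)^2}{\Delta\mu^{T} Q_d^{-1} Q\, Q_d^{-1} \Delta\mu}
   \;\le\; \Delta\mu^{T} Q^{-1} \Delta\mu = d^2
\end{aligned}
```

The final inequality follows from the Cauchy-Schwarz inequality applied to $Q^{1/2} w_{\mathrm{diag}}$ and $Q^{-1/2}\Delta\mu$, with equality when $w_{\mathrm{diag}}$ is proportional to the optimal weights $Q^{-1}\Delta\mu$.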

Given d², d²_shuffled, and d²_diag, we can estimate the effects of noise correlations on information encoding and decoding. The effect of noise correlations on decoding, which we will refer to as Δd²_diag, is given by

$$\Delta d^2_{\mathrm{diag}} = d^2 - d^2_{\mathrm{diag}} \tag{5}$$

This quantity estimates the difference between the total amount of information that could be extracted from the neural responses using an optimal decoder and the amount of information that would be extracted by a decoding algorithm that ignored correlations. Because information can only be lost by a decoding algorithm that ignores correlations, Δd²_diag is never negative. Similarly, the effect of noise correlations on the information encoded, Δd²_shuffled, is given by

$$\Delta d^2_{\mathrm{shuffled}} = d^2 - d^2_{\mathrm{shuffled}} \tag{6}$$

This quantity measures the difference between the information in the correlated neural responses and the information that would be in a fictitious dataset of uncorrelated neural responses. Δd²_shuffled can be positive or negative.
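The three measures and the two differences can be sketched numerically for a pair of neurons. This is a minimal illustration, not the authors' analysis code. It assumes d² = ΔµᵀQ⁻¹Δµ, d²_shuffled = ΔµᵀQ_d⁻¹Δµ, and d²_diag = (ΔµᵀQ_d⁻¹Δµ)² / (ΔµᵀQ_d⁻¹QQ_d⁻¹Δµ), it is restricted to 2 × 2 covariance matrices so that it needs no external libraries, and all function names are hypothetical.

```python
def inv2(m):
    """Inverse of a 2x2 matrix given as ((a, b), (c, d))."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))


def matvec(m, v):
    """Product of a 2x2 matrix and a length-2 vector."""
    return (m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1])


def dot(u, v):
    """Inner product of two length-2 vectors."""
    return u[0] * v[0] + u[1] * v[1]


def d2_measures(dmu, q):
    """Information measures for one neuron pair (hypothetical helper).

    dmu: difference in mean spike counts between the two targets.
    q:   pooled 2x2 noise covariance matrix.
    """
    qd = ((q[0][0], 0.0), (0.0, q[1][1]))       # Q_d: correlations set to 0
    w = matvec(inv2(qd), dmu)                    # weights of the decoder that ignores correlations
    d2 = dot(dmu, matvec(inv2(q), dmu))          # optimal decoder, correlated data
    d2_shuffled = dot(dmu, w)                    # uncorrelated (shuffled) data
    d2_diag = dot(dmu, w) ** 2 / dot(w, matvec(q, w))  # suboptimal decoder, correlated data
    return {
        "d2": d2,
        "d2_shuffled": d2_shuffled,
        "d2_diag": d2_diag,
        "delta_shuffled": d2 - d2_shuffled,      # effect on encoding
        "delta_diag": d2 - d2_diag,              # effect on decoding
    }


q = ((1.0, 0.6), (0.6, 1.0))                     # unit variances, correlation 0.6
aligned = d2_measures((1.0, 1.0), q)             # signal along the shared-noise axis
oblique = d2_measures((1.0, 0.0), q)             # signal carried by one neuron only
```

With these illustrative inputs, `aligned` has a negative `delta_shuffled` and a `delta_diag` of essentially zero, whereas `oblique` has both differences positive.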

The measures d², d²_shuffled, and d²_diag can each be converted to percent correct classification performance for the corresponding case (Poor 1994). In Fig. 2, the portion of the distribution for target 1 to the right of the classification boundary would be misclassified as having come from the distribution for target 2. Therefore we can write the probability of misclassification using the error function as

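$$p(\hat{t} = 2 \mid t = 1) = \int_{\bar{x}}^{\infty} \frac{1}{\sqrt{2\pi}\,\sigma} \exp\!\left[-\frac{(x - \mu_1)^2}{2\sigma^2}\right] dx \tag{7}$$
where x̄ refers to the decision boundary, t̂ is the predicted target, and t is the actual target. If we make the change of variables z = (x – µ_1)/σ, we get

$$p(\hat{t} = 2 \mid t = 1) = \int_{(\bar{x} - \mu_1)/\sigma}^{\infty} \frac{1}{\sqrt{2\pi}} \exp\!\left(-\frac{z^2}{2}\right) dz \tag{8}$$

Finally, noting that x̄ – µ_1 = (µ_2 – µ_1)/2, we can rewrite Eq. 8 as the normalized error function, with a lower integration limit of d/2

$$p(\hat{t} = 2 \mid t = 1) = \int_{d/2}^{\infty} \frac{1}{\sqrt{2\pi}} \exp\!\left(-\frac{z^2}{2}\right) dz = H\!\left(\frac{d}{2}\right) \tag{9}$$
where H is the complementary error function. This shows that the fraction correct is only a function of d prime. From Eq. 9 we can calculate the percent correct performance or accuracy as

$$A = 1 - H\!\left(\frac{\sqrt{d^2}}{2}\right) \tag{10}$$

Equation 10 shows that the percent correct is only a function of d².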

Although Eq. 9 was derived for the univariate case, given by Eq. 1, it is the same for Eq. 2 as discussed in the preceding text. The predicted classification performance shown in RESULTS was obtained by first calculating d², d²_shuffled, and d²_diag per Eqs. 2–4 for each pair or ensemble of neurons. Then Eq. 10 was used to convert each of these to the corresponding decoding accuracy, denoted as A, A_shuffled, and A_diag. We then refer to the changes in accuracy related to the effect of noise correlations on encoding and decoding as ΔA_shuffled and ΔA_diag, respectively, and these were calculated as follows

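$$\Delta A_{\mathrm{shuffled}} = A - A_{\mathrm{shuffled}} \tag{11}$$

$$\Delta A_{\mathrm{diag}} = A - A_{\mathrm{diag}} \tag{12}$$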
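The conversion from a d² value to a predicted accuracy can be sketched in a few lines. The example below is illustrative only: it assumes two equiprobable targets, takes H to be the upper-tail probability of a standard normal distribution (computed from the standard library's `math.erfc`), and uses hypothetical function names.

```python
import math


def tail_prob(x):
    """H(x): upper-tail probability of a standard normal variable."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def accuracy_from_d2(d2):
    """Predicted fraction correct for a given d-prime squared: A = 1 - H(d / 2)."""
    return 1.0 - tail_prob(math.sqrt(d2) / 2.0)


def accuracy_change(d2, d2_reference):
    """Change in predicted accuracy, e.g. A - A_shuffled or A - A_diag."""
    return accuracy_from_d2(d2) - accuracy_from_d2(d2_reference)
```

Because the mapping is monotonic in d², the sign of a difference in d² carries over to the corresponding difference in predicted accuracy, but the mapping saturates, so equal differences in d² do not give equal differences in accuracy.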

Bhattacharyya distance

We also used the Bhattacharyya distance (BD) as an information measure (Basseville 1989) because it does not make the assumption of equal covariance matrices implicit in d² and thus provides a measure of the information in the differential variance and covariance of the neural response to different targets. The BD is a special case of the Chernoff distance, which was recently used in the analysis of V1 responses (Kang et al. 2004). The BD is given by

$$BD = -\ln \int \sqrt{p(r \mid t = 1)\, p(r \mid t = 2)}\; dr \tag{13}$$
where the integral is over all possible responses, r. If we assume additive Gaussian noise, the response distributions are given by

$$p(r \mid t = i) = \frac{1}{(2\pi)^{n/2}\, |Q_i|^{1/2}} \exp\!\left[-\frac{1}{2} (r - \mu_i)^{T} Q_i^{-1} (r - \mu_i)\right] \tag{14}$$
where r is a vector of spike counts for a given movement, n is the dimensionality of r, µ_i is the vector of mean spike counts for target i, the superscript T indicates transpose, Q_i is the noise covariance matrix for target i, and |·| indicates the determinant of the matrix. Substituting Eq. 14 into Eq. 13 leads to the BD for Gaussian distributions (Basseville 1989)

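$$BD = \frac{1}{8} (\mu_2 - \mu_1)^{T} \left[\frac{Q_1 + Q_2}{2}\right]^{-1} (\mu_2 - \mu_1) + \frac{1}{2} \ln \frac{\left|\frac{Q_1 + Q_2}{2}\right|}{\sqrt{|Q_1|\,|Q_2|}} \tag{15}$$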

The first term of Eq. 15 is equal to d²/8. Thus the second term indicates contributions due to the difference in covariance for different targets.
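The split of the Gaussian BD into a mean term and a covariance term can be illustrated for a pair of neurons. This is a hedged sketch that assumes the standard closed form of the Bhattacharyya distance between two Gaussians, handles only 2 × 2 covariance matrices, and uses hypothetical names throughout.

```python
import math


def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def bhattacharyya_gauss2(mu1, mu2, q1, q2):
    """Return (mean_term, cov_term) of the BD between two bivariate Gaussians."""
    # Average covariance across the two targets.
    qa = tuple(tuple((q1[i][j] + q2[i][j]) / 2.0 for j in range(2))
               for i in range(2))
    dmu = (mu2[0] - mu1[0], mu2[1] - mu1[1])
    det_a = det2(qa)
    # Quadratic form dmu^T qa^{-1} dmu, written out for the 2x2 case.
    quad = (qa[1][1] * dmu[0] ** 2
            - (qa[0][1] + qa[1][0]) * dmu[0] * dmu[1]
            + qa[0][0] * dmu[1] ** 2) / det_a
    mean_term = quad / 8.0                        # equals d^2 / 8
    cov_term = 0.5 * math.log(det_a / math.sqrt(det2(q1) * det2(q2)))
    return mean_term, cov_term


pooled = ((1.0, 0.6), (0.6, 1.0))
same_cov = bhattacharyya_gauss2((0.0, 0.0), (1.0, 1.0), pooled, pooled)
diff_cov = bhattacharyya_gauss2((0.0, 0.0), (1.0, 1.0),
                                ((1.0, 0.0), (0.0, 1.0)),
                                ((4.0, 0.0), (0.0, 4.0)))
```

When the two covariance matrices are identical, as in `same_cov`, the covariance term vanishes and the BD reduces to d²/8; in `diff_cov` the covariance term is positive.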

If covariance matrices vary between targets, the maximum likelihood estimator for the target is a quadratic Gaussian classifier (Johnson and Wichern 1998). This is referred to as a quadratic classifier because it contains terms that are products of firing rates between pairs of neurons. If the covariance matrices are the same across targets, the maximum likelihood estimator is a linear classifier, which does not contain interaction or product terms of the responses of individual neurons.

Decoding analyses

We compared the predicted percent correct classification performance based on d² (Eq. 10) to the results of carrying out decoding analyses and classifying the data movement by movement. Although it may seem counter-intuitive to use decoding analyses to estimate the effects of noise correlations on information encoding, an optimal decoder will extract all of the information available in the neural responses, and in our encoding analyses, we are trying to determine how much information is encoded. This was the approach adopted originally by Bialek and his colleagues (Bialek et al. 1991; Rieke et al. 1997). The effects of correlations on information decoding were examined using a sub-optimal decoder, specifically one that assumed that there were no noise correlations. In this case, the question is how much information is lost when the suboptimal decoder is used.

The Gaussian decoding analyses have been described in detail previously (Averbeck and Lee 2003). Two-fold cross validation was used whenever different decoding algorithms were being compared. In general, the target was predicted by selecting the target with the maximum probability from the conditional distribution of targets given the neural activity. This can be formalized as

$$\hat{t} = \arg\max_{t}\; p(t \mid r) \tag{16}$$
where t̂ is the estimated target for the subsequent movement and p(t|r) is the conditional probability distribution of a target, t, given the response vector, r, that represents the response of one or more neurons across a given number of bins. The conditional probability of t is given by Bayes' rule

$$p(t \mid r) = \frac{p(r \mid t)\, p(t)}{p(r)} \tag{17}$$
where p(t) is the prior probability of a given target, and p(r) is a normalizing constant calculated as

$$p(r) = \sum_{t} p(r \mid t)\, p(t) \tag{18}$$

The likelihood for the Gaussian model is given by Eq. 14. We fit linear models by estimating a single, pooled covariance matrix for both targets, and we fit quadratic models by estimating separate covariance matrices for each target. These covariance matrices were the same as those used to calculate the predicted accuracy described in the preceding text, using d².
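The maximum a posteriori rule with a Gaussian likelihood can be sketched as follows. This is an illustrative two-neuron version with hypothetical names, not the authors' implementation: it assumes the means and covariance matrices have already been estimated, and it compares log posteriors up to the constant p(r), which is the same for every target.

```python
import math


def log_gauss2(r, mu, q):
    """Log density of a bivariate Gaussian with mean mu and covariance q."""
    det = q[0][0] * q[1][1] - q[0][1] * q[1][0]
    x0, x1 = r[0] - mu[0], r[1] - mu[1]
    # Quadratic form (r - mu)^T q^{-1} (r - mu), written out for the 2x2 case.
    quad = (q[1][1] * x0 * x0
            - (q[0][1] + q[1][0]) * x0 * x1
            + q[0][0] * x1 * x1) / det
    return -math.log(2.0 * math.pi) - 0.5 * math.log(det) - 0.5 * quad


def decode(r, mus, qs, priors=(0.5, 0.5)):
    """Return the index of the target with the largest posterior p(t | r)."""
    scores = [log_gauss2(r, mu, q) + math.log(p)
              for mu, q, p in zip(mus, qs, priors)]
    return max(range(len(scores)), key=scores.__getitem__)
```

Passing the same pooled matrix for both targets gives the linear classifier, passing a separate matrix per target gives the quadratic classifier, and zeroing the off-diagonal entries of the pooled matrix gives a decoder that ignores noise correlations.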

The decoding analysis based on the linear Gaussian model was carried out under three conditions to derive values for the measured accuracy shown in Figs. 4 and 5. These conditions were in correspondence with the predicted accuracy, derived from d², d²_shuffled, and d²_diag, given by Eq. 10. In this study, we only analyzed the effect of noise correlations, i.e., correlations between neurons, not autocorrelations, which were analyzed in a previous study (Averbeck and Lee 2003). In the first condition, which was used as an estimate of A, the decoding analysis was carried out on the original dataset using a covariance matrix that used the measured values for the noise correlations. In the second condition, which was used as an estimate of A_shuffled, the analysis was carried out on a trial-shuffled dataset. The shuffling effectively destroyed the correlations between neurons. The shuffling analysis was carried out five times, and the average of these five analyses was used as the estimate. In the final analysis, a decoding model was used in which all off-diagonal elements of the covariance matrix that correspond to inter-neuronal correlations were set to zero. This model was then applied to the original unshuffled dataset. This was used as an estimate of A_diag and corresponds to the independent model from our previous work (Averbeck and Lee 2003). Thus the decoding model was essentially the same for A_shuffled and A_diag because in both cases the off-diagonal elements of the covariance matrix were zero, but the decoders were applied to the shuffled trials and the original datasets, respectively. From these analyses, ΔA_shuffled and ΔA_diag were also calculated, per Eqs. 11 and 12.
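Trial shuffling of this kind can be sketched in a few lines. The example is an assumption-laden illustration rather than the procedure actually used: it treats each trial as one spike count per neuron (with several time bins per neuron, the bins of a neuron would need to be kept together), it permutes trials independently for each neuron within each target, and the function name and `seed` parameter are hypothetical.

```python
import random


def shuffle_trials(counts_by_target, seed=0):
    """Permute trial order independently for each neuron within each target.

    counts_by_target maps a target label to a list of trials; each trial is a
    tuple of spike counts, one per simultaneously recorded neuron.
    """
    rng = random.Random(seed)
    shuffled = {}
    for target, trials in counts_by_target.items():
        columns = [list(col) for col in zip(*trials)]   # one list per neuron
        for col in columns:
            rng.shuffle(col)                             # breaks across-neuron pairing
        shuffled[target] = [tuple(trial) for trial in zip(*columns)]
    return shuffled
```

Each neuron's distribution of spike counts for a given target is unchanged, so the signal is preserved while the trial-by-trial pairing between neurons is destroyed. Repeating the call with different seeds and averaging the resulting accuracies would mirror the repeated shuffling described in the text.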

We also used a multinomial decoding algorithm to determine how the results from the above analyses are influenced by the assumption of a Gaussian distribution for the neural responses. Because spike counts are discrete quantities, it was possible to tabulate the different responses for a pair of neurons, and generate a probability mass function for each target direction. For example, for target direction 1, how often did neuron 1 fire two spikes and neuron 2 fire one spike? A complete table of these probabilities specified the probability mass function for each target. In this case, the likelihood is estimated by

$$p(r_i \mid t = j) = \frac{n_{ij}}{N_j} \tag{19}$$
where n_ij is the number of times response r_i occurred when target j was presented and N_j is the number of times target j was presented. This is the same characterization of neural responses used in the direct method of estimating mutual information between neural responses and stimuli (Strong et al. 1998). These models provide the most detailed description of responses, at a given bin size, that is possible. It is important to note, however, that each different response type that occurs is a parameter of the model, and estimating these models for several response bins or multiple neurons is not possible without an extremely large dataset because the number of parameters necessary to estimate the models grows quickly. Therefore we estimated this model using a single 66-ms response bin for a pair of neurons. When we compared this model to the linear and quadratic models, those models were also fit on the same bin.
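The tabulation behind the multinomial model can be sketched with a counter over joint responses. This is a minimal illustration with hypothetical names and made-up data; it assumes equal priors by default and assigns probability 0 to any response not seen in training, which a real analysis would need to handle more carefully.

```python
from collections import Counter


def fit_multinomial(trials_by_target):
    """Estimate p(r | t) by tabulating joint spike-count responses per target."""
    model = {}
    for target, trials in trials_by_target.items():
        counts = Counter(trials)                  # occurrences of each response
        total = len(trials)                       # presentations of this target
        model[target] = {resp: n / total for resp, n in counts.items()}
    return model


def decode_multinomial(response, model, priors=None):
    """Pick the target maximizing p(r | t) p(t)."""
    targets = list(model)
    if priors is None:
        priors = {t: 1.0 / len(targets) for t in targets}
    return max(targets, key=lambda t: model[t].get(response, 0.0) * priors[t])


# Illustrative spike counts (neuron 1, neuron 2) for two targets.
train = {
    1: [(2, 1), (2, 1), (0, 0), (1, 1)],
    2: [(0, 0), (0, 0), (0, 1), (2, 1)],
}
model = fit_multinomial(train)
```

Every distinct response is a separate parameter, which is why the table grows quickly once more neurons or time bins are added.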


 RESULTS
 
 
Noise correlations and information coding

The impact of noise correlations on information encoding and decoding can be compared by looking at Δd²_shuffled and Δd²_diag, which measure the effect of noise correlations on information encoding and decoding, respectively (see METHODS). Δd²_shuffled is defined as

$$\Delta d^2_{\mathrm{shuffled}} = d^2 - d^2_{\mathrm{shuffled}}$$
where d² is the information in the correlated neural responses, and d²_shuffled is the information in the uncorrelated (shuffled) neural responses. The effects of noise correlations on information decoding have been studied by examining how much information is lost by a decoding algorithm that ignores correlations. Therefore Δd²_diag is defined as

$$\Delta d^2_{\mathrm{diag}} = d^2 - d^2_{\mathrm{diag}}$$
where d²_diag is the information that would be extracted by a decoding algorithm that ignores correlations, from the original correlated dataset. Thus Δd²_diag measures the amount of information lost by a decoding algorithm that ignores correlations.

The relation between Δd²_shuffled and Δd²_diag is shown in Fig. 3 (d², d²_shuffled, and d²_diag are also shown for reference). In this figure, we used a relatively large value for the correlation coefficient between neurons (0.6), so the effects of noise correlations can be clearly visualized. When noise correlations are smaller, the effects have the same periodicity, and are simply scaled down. Figure 3, B–D, shows three illustrative examples of the responses of two fictitious neurons to two different movement directions. The mean response vector for the corresponding movement direction is indicated as µ_i, and the ellipse represents the response distribution, or the variability of the responses, for each movement direction. The ellipse also indicates the covariance of the pair of neurons because the orientation of the ellipse indicates whether the covariance is positive or negative. If the covariance were zero, i.e., no noise correlation, the ellipses would be circles because these neurons have the same variance. The primary axis of each ellipse is given by the eigenvector, e_1 (shown in Fig. 3D), which corresponds to the largest eigenvalue of the covariance matrix. The covariance in each panel of Fig. 3, B–D, is the same, as is the length of Δµ, which defines the difference in the mean responses. However, α, the angle between e_1 and Δµ, is different. As is shown in Fig. 3A, this is the key parameter for relating Δd²_shuffled and Δd²_diag.


FIG. 3. Effects of noise correlations on encoding and decoding. A: components of d² vs. the angle α between e_1, the eigenvector associated with the largest eigenvalue of the covariance matrix, and Δµ, the vector of the difference in mean responses. This shows how the elements of the information breakdown change as a function of α. For all calculations, the variance of both neurons was 1, and the correlation coefficient was 0.6. B–D: examples of covariance/signal structures. Horizontal and vertical axes represent the responses of 2 hypothetical neurons. Green lines represent optimal decision boundaries; red lines represent decision boundaries that would be obtained if correlations were ignored, referred to as diagonal. When the decision boundaries are the same (B and D), a dashed line is shown.

 
The effects of noise correlations on information encoding and decoding are different, and depend on the value of α. For example, for values of α near 0 or π, Δd²_shuffled is negative, whereas Δd²_diag is zero. Thus when α is near zero, assessing the effect of correlations on the information encoded would suggest that they have a negative effect, whereas assessing the effect of correlations on information decoding would suggest no effect. Between approximately π/7 (a value that depends on the size of the correlation) and π/2, both Δd²_shuffled and Δd²_diag are positive, whereas for α = π/2, Δd²_shuffled takes on its largest value and Δd²_diag is again zero. The effect of noise correlations on Δd²_shuffled can be seen by looking at the overlap of the response distributions. The more the distributions overlap, the less information is encoded. To gain an intuition for the effect of noise correlations on Δd²_diag, we have plotted the optimal decision boundaries in green, and the decision boundaries derived by a decoding algorithm that ignored correlations (a diagonal decoding algorithm) in red. In Fig. 3, B and D, the optimal and the suboptimal (i.e., derived under the assumption of no correlation) decision boundaries are the same. Because the decision boundaries are the same in these cases, no information is lost by ignoring correlations, and Δd²_diag is zero. In Fig. 3C, however, the optimal and suboptimal decision boundaries are not the same, and so if the suboptimal decision boundary is used for classification, information will be lost, as indicated by the positive value of Δd²_diag at π/4.
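The dependence on α described above can be checked numerically. The sketch below sweeps α for two neurons with unit variance and a correlation coefficient of 0.6, as in Fig. 3A. It assumes the Gaussian forms d² = Δµᵀ Q⁻¹ Δµ, d²_shuffled = Δµᵀ Q_d⁻¹ Δµ, and d²_diag = (Δµᵀ Q_d⁻¹ Δµ)² / (Δµᵀ Q_d⁻¹ Q Q_d⁻¹ Δµ), with Q_d the diagonal part of Q; these forms, the unit length of Δµ, and the variable names are illustrative assumptions, not a reproduction of the published figure.

```python
import numpy as np

r = 0.6                                      # correlation coefficient used in Fig. 3
Q = np.array([[1.0, r], [r, 1.0]])
Qd_inv = np.diag(1.0 / np.diag(Q))
e1 = np.array([1.0, 1.0]) / np.sqrt(2.0)     # eigenvector with eigenvalue 1 + r
e2 = np.array([-1.0, 1.0]) / np.sqrt(2.0)    # eigenvector with eigenvalue 1 - r

def deltas(alpha):
    """Return (delta_d2_shuffled, delta_d2_diag) for a unit-length dmu at angle alpha to e1."""
    dmu = np.cos(alpha) * e1 + np.sin(alpha) * e2
    w_diag = Qd_inv @ dmu
    d2 = dmu @ np.linalg.solve(Q, dmu)
    d2_shuffled = dmu @ w_diag
    d2_diag = (w_diag @ dmu) ** 2 / (w_diag @ Q @ w_diag)
    return d2 - d2_shuffled, d2 - d2_diag

alphas = np.linspace(0.0, np.pi, 181)
curve = np.array([deltas(a) for a in alphas])   # columns: shuffled, diag

# For equal unit variances, delta_d2_shuffled changes sign where
# sin^2(alpha) = (1 - r) / 2, about 0.46 rad (close to pi/7) for r = 0.6.
alpha_zero = np.arcsin(np.sqrt((1.0 - r) / 2.0))
```

Under these assumptions the sweep gives a negative Δd²_shuffled with zero Δd²_diag at α = 0, equal positive values at α = π/4, and the largest Δd²_shuffled with zero Δd²_diag at α = π/2.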

Comparison of predicted and measured information

Implicit in the use of d² as an information measure is the assumption that the variance and covariance of the neural responses are the same for both targets. In this case, all of the information available in the neural responses can be extracted by computing the dot-product between the vector of spike counts of individual neurons, in time bins of the appropriate size (Averbeck and Lee 2003), and an appropriate weight vector. As shown in Fig. 3C, the weight vector can be affected by the correlations, i.e., it is not the same for correlated and uncorrelated neural responses. However, explicitly taking into account correlations between neurons by computing products or interactions between neural responses will not extract more information. Thus all of the information is available in the spike counts of neurons considered individually. This is a strong assumption about how information is encoded in neural responses. In the DISCUSSION, we consider the biophysical implications of this code.
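The dot-product readout can be illustrated with a short sketch. It assumes the standard equal-covariance Gaussian rule, in which the weight vector is Q⁻¹(µ_2 − µ_1) and the threshold lies at the midpoint between the two means (equal priors). The function names and the example means are hypothetical.

```python
import numpy as np

def linear_decoder(mu1, mu2, Q):
    """Weights and threshold of the linear rule for two equal-covariance classes."""
    mu1, mu2 = np.asarray(mu1, float), np.asarray(mu2, float)
    w = np.linalg.solve(Q, mu2 - mu1)    # weight vector, Q^-1 (mu2 - mu1)
    c = w @ (mu1 + mu2) / 2.0            # threshold at the midpoint (equal priors)
    return w, c

def classify(counts, w, c):
    """Return 2 where the weighted sum of spike counts exceeds the threshold, else 1."""
    return np.where(np.asarray(counts, float) @ w > c, 2, 1)

mu1, mu2 = [4.0, 6.0], [5.0, 6.0]              # hypothetical mean spike counts
Q_corr = np.array([[1.0, 0.6], [0.6, 1.0]])    # correlated responses
Q_ind = np.diag(np.diag(Q_corr))               # same variances, no correlation

w_corr, c_corr = linear_decoder(mu1, mu2, Q_corr)
w_ind, c_ind = linear_decoder(mu1, mu2, Q_ind)
labels = classify([[3.0, 6.0], [6.0, 6.0]], w_corr, c_corr)
```

In this example the second neuron carries no signal, so the uncorrelated weight vector ignores it, whereas the correlated weight vector gives it a negative weight that subtracts shared noise. The readout remains a single dot product in both cases.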

As a first step toward examining how accurately d² predicts the information in neural responses, we compared the decoding accuracy predicted by d² to the results of actually carrying out decoding analyses and classifying individual movements. Figure 4 shows the predicted and the measured values of the decoding accuracies for pairs of simultaneously recorded neurons in the supplementary motor area for each of the information measures shown in Fig. 3A. The predicted values of the accuracy, A, A_shuffled, and A_diag, were calculated by estimating d², d²_shuffled, and d²_diag, based on the covariance, Q, and mean response vectors, Δµ_i, estimated for each pair of neurons, and then converting these to estimates of percent correct classification performance or accuracy (see METHODS, Eqs. 9 and 10). The values plotted along the axis labeled measured accuracy were derived by explicitly classifying every movement using a linear decoding algorithm applied to the spike counts of pairs of neurons. Although d² is only a function of the covariance matrix and the difference in the mean response vectors, these measures accurately predicted the outcome of actually carrying out the decoding analyses (Fig. 4). The quantities ΔA_shuffled and ΔA_diag, which measure the effects of correlations on the classification performance, were also well approximated by the predicted values.
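The comparison of predicted and measured accuracy can be sketched on simulated data. The conversion used here, A = Φ(d/2) for two equiprobable targets with Gaussian responses, is an assumption standing in for Eqs. 9 and 10 of METHODS, and the simulated responses, trial counts, and function names are hypothetical.

```python
import math
import numpy as np

def predicted_accuracy(d2):
    """Fraction correct for two equiprobable Gaussian classes, Phi(d / 2)."""
    return 0.5 * (1.0 + math.erf(math.sqrt(d2) / (2.0 * math.sqrt(2.0))))

def measured_accuracy(mu1, mu2, Q, n_trials, rng):
    """Fit a linear rule on simulated training trials and score it on separate test trials."""
    train1 = rng.multivariate_normal(mu1, Q, n_trials)
    train2 = rng.multivariate_normal(mu2, Q, n_trials)
    test1 = rng.multivariate_normal(mu1, Q, n_trials)
    test2 = rng.multivariate_normal(mu2, Q, n_trials)
    m1, m2 = train1.mean(axis=0), train2.mean(axis=0)
    # Pooled covariance estimate, assuming equal covariance for both targets.
    Q_hat = (np.cov(train1, rowvar=False) + np.cov(train2, rowvar=False)) / 2.0
    w = np.linalg.solve(Q_hat, m2 - m1)
    c = w @ (m1 + m2) / 2.0
    correct = np.sum(test1 @ w <= c) + np.sum(test2 @ w > c)
    return correct / (2.0 * n_trials)

rng = np.random.default_rng(0)
mu1, mu2 = np.array([0.0, 0.0]), np.array([1.0, 0.0])
Q = np.array([[1.0, 0.6], [0.6, 1.0]])
dmu = mu2 - mu1

A_pred = predicted_accuracy(dmu @ np.linalg.solve(Q, dmu))
A_meas = measured_accuracy(mu1, mu2, Q, 20000, rng)
```

With truly Gaussian simulated data the two values agree closely by construction. The question addressed in the text is whether the same agreement holds for recorded spike counts, which are discrete and not exactly Gaussian.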

The histograms at the top of the scatter plots (Fig. 4, bottom) show the distribution of the corresponding measured accuracies. These plots show that, on average, correlations had almost no effect on the information encoded and only a small effect on decoding performance when correlations were ignored. The only distribution with a mean that deviated significantly from zero was ΔA_diag (t-test, P < 0.01). Although the distribution was centered near zero, correlations did affect the information encoded in some cases, increasing it or decreasing it by up to a few percent. The negative values of measured accuracy for ΔA_diag are due to finite sampling and mismatches between the linear model and the actual distribution of the data. As shown in Fig. 3, there can be a large effect of noise correlations on information encoding when the effect on decoding is minimal. Similarly, the outlined data points indicated by the arrows in the plots of ΔA_shuffled and ΔA_diag in Fig. 4 show a pair of neurons for which the effect of noise correlations on information encoding was relatively large and the effect on information decoding was essentially zero.

To examine the role of noise correlations in ensembles of more than two neurons, the analyses were applied to groups of three to eight simultaneously recorded neurons (Fig. 5). As with the pair-wise analyses, the predicted and measured decoding performances again agreed closely. However, the largest effects of correlations at the ensemble level were larger than the effects for pairs of neurons (see histograms in Fig. 5, bottom). To ensure that this increase in the size of the effects of correlations was not due to the fact that we were fitting more complex models, we also re-ran the analysis using only half the data. The results were essentially the same (data not shown). Overall, correlations reduced the information encoded slightly, but the effect was small and the mean of the distribution was not significantly different from zero. Again the only distribution with a mean that deviated significantly from zero was ΔA_diag (t-test, P < 0.01). These analyses show that noise correlations have a relatively small effect on either information encoding or decoding. The size of the effects will depend on the size of the noise correlations in the population. On average, noise correlations in our data are almost zero, although a few pairs do have correlations >0.2 (Fig. 6A). The signal correlations are somewhat larger, but close to zero on average (Fig. 6B).


FIG. 6. Distributions of noise correlation (A) and signal correlation (B). The mean of the distribution is shown in the top left of each plot.

 
Contribution of target-related changes in covariance to encoded information

As described in the preceding text, even correlations that are the same for both targets can affect information encoding and decoding. In this case, the correlations themselves cannot be used directly to predict which target was presented. The correlations can, however, still affect information decoding by changing the optimal decision boundary used to estimate the target. For all the analyses described so far, it was assumed that the covariance of neural activity was identical for both targets. However, it is often observed that the variance of spike counts scales with the mean, and the covariances may change as well (Averbeck and Lee 2003; Tolhurst et al. 1981). This raises the possibility that additional information can be carried in the variances and the covariances of the neural responses. A classifier that explicitly computes interactions, a nonlinear operation, must be used to extract this information (Shamir and Sompolinsky 2004).

We examined the possibility that changes in the covariances for different targets might carry additional information by computing the BD and comparing it to d². The BD has one term that is proportional to d², and a second term that measures the amount of information in the covariances (see METHODS). Thus by comparing the BD to d², we obtained an estimate of the additional information available in the covariances that cannot be extracted by a linear decoding algorithm. The results of this analysis suggested that there was additional information available in the covariances for many pairs of neurons (Fig. 7A). To test this further, we compared the classification performance of a nonlinear, quadratic classifier, which is a decoding model that can extract information from the target-dependent variances and covariances of the responses, to that of the linear classifier that uses only information available in the individual spike counts (Fig. 7B). In this analysis, only the accuracy in the unshuffled neural responses (A) was considered, and the analysis was applied to pairs of neurons to control model complexity. In contrast to the increase in information predicted by the BD, the performance of linear and nonlinear classifiers was essentially equivalent in most cases, with the performance of the quadratic classifier being slightly worse on average than that of the linear classifier. We investigated possible reasons for the discrepancy between the information predicted by the BD and the actual performance of the decoding algorithms by looking at the decision boundaries produced by both classifiers (e.g., Fig. 7C). As expected, the classification boundaries of both classifiers separated the data equally well despite the fact that the BD predicted suboptimal performance for the linear classifier. This is due to at least two factors. The first is that neural responses are discrete quantities (i.e., spike counts), so the exact position of the decision boundary does not affect the classification performance, i.e., whether the decision boundary is at 1.2 or 1.4 spikes does not matter because the responses never take any values between 1 and 2. Second, negative firing rates may not be properly classified by the linear model (see Fig. 7C), but they do not occur. Thus the BD does not predict the actual decoding performance as accurately as d² does.
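The two-term structure of the BD can be made concrete with a short sketch. It assumes that BD denotes the Bhattacharyya distance between two Gaussian response distributions, in its standard form: a mean term equal to (1/8) Δµᵀ Q̄⁻¹ Δµ, with Q̄ the average of the two covariance matrices, plus a covariance term equal to (1/2) ln(det Q̄ / sqrt(det Q_1 det Q_2)). The function name and example values are hypothetical, and any scaling used in METHODS may differ.

```python
import numpy as np

def bhattacharyya_terms(mu1, Q1, mu2, Q2):
    """Return (mean term, covariance term) of the Gaussian Bhattacharyya distance."""
    dmu = np.asarray(mu2, float) - np.asarray(mu1, float)
    Q1, Q2 = np.asarray(Q1, float), np.asarray(Q2, float)
    Qbar = (Q1 + Q2) / 2.0
    # Mean term: equals d2 / 8 when both targets share the same covariance.
    mean_term = dmu @ np.linalg.solve(Qbar, dmu) / 8.0
    # Covariance term: zero when Q1 == Q2, positive otherwise.
    _, logdet_bar = np.linalg.slogdet(Qbar)
    _, logdet_1 = np.linalg.slogdet(Q1)
    _, logdet_2 = np.linalg.slogdet(Q2)
    cov_term = 0.5 * (logdet_bar - 0.5 * (logdet_1 + logdet_2))
    return mean_term, cov_term

Q1 = np.array([[1.0, 0.6], [0.6, 1.0]])
same = bhattacharyya_terms([0.0, 0.0], Q1, [1.0, 0.0], Q1)
different = bhattacharyya_terms([0.0, 0.0], Q1, [1.0, 0.0], 2.0 * Q1)
```

When the covariance changes with the target, the second term becomes positive. That is the additional information a linear decoder cannot use, and which, according to the text, was not realized in actual classification performance.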

Non-Gaussian, multinomial decoding algorithm

To further validate the results we obtained using the linear Gaussian decoding algorithm, we used a multinomial decoding algorithm. The multinomial decoding algorithm provides a general, assumption-free description of the neural responses, and as such allowed us to re-examine the effects of noise correlations on information encoding and decoding, without making the Gaussian assumption. We fit the multinomial model to the joint spike count distributions for pairs of neurons and a single 66-ms bin of neural responses. Comparison of the performance of the multinomial model to that of the linear model showed that the performance was similar, with a slight but not statistically significant advantage for the multinomial model (mean = 0.013%, paired t-test, P = 0.4; Fig. 8A). However, because the performance of these models was assessed with cross validation, the cases in which the multinomial performs better than the linear model are individually relevant. Interestingly, when the quadratic model outperformed the linear model for a given neuron pair, the multinomial model was also more likely to outperform the linear model (Fig. 8B). Thus in a small number of cases, extra information does appear to be available, beyond the spike counts of individual neurons. We also compared estimates of ΔA_shuffled and ΔA_diag obtained with the linear and multinomial decoding algorithms. As can be seen by the histograms in Fig. 8, C and E, the size of the effects of noise correlations on encoding and decoding assessed with the multinomial algorithm is similar to that assessed with the linear decoding algorithm (histograms in Fig. 4). Additionally, Fig. 8D shows that there is a fairly strong correspondence, on a pair-by-pair basis, between the multinomial and linear decoding algorithms for ΔA_shuffled. However, the correspondence is poor for ΔA_diag. Thus the size of the effect of noise correlations on encoding and decoding is quite similar, independent of the algorithm with which it is assessed, but on a pair-by-pair basis, the estimates of the two algorithms differed somewhat for the effects on decoding.
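A multinomial decoder of this kind can be sketched as a lookup table of joint spike-count probabilities for each target, with test trials assigned to the target whose table gives the higher probability. The sketch below is an illustration under stated assumptions rather than the procedure used in the study: the additive smoothing for unobserved count pairs, the single train/test split standing in for cross validation, the simulated Poisson counts, and all names are hypothetical.

```python
import numpy as np

def fit_multinomial(counts, max_count, pseudo=1.0):
    """Joint probability table over (n1, n2) spike-count pairs, with additive smoothing."""
    table = np.full((max_count + 1, max_count + 1), pseudo)
    np.add.at(table, (counts[:, 0], counts[:, 1]), 1.0)   # histogram of observed pairs
    return table / table.sum()

def multinomial_accuracy(train1, train2, test1, test2):
    """Fraction of held-out trials assigned to the correct target."""
    max_count = int(max(a.max() for a in (train1, train2, test1, test2)))
    p1 = fit_multinomial(train1, max_count)
    p2 = fit_multinomial(train2, max_count)

    def choose_2(x):
        # True where target 2 is more probable than target 1 for the count pair.
        return p2[x[:, 0], x[:, 1]] > p1[x[:, 0], x[:, 1]]

    correct = np.sum(~choose_2(test1)) + np.sum(choose_2(test2))
    return correct / (len(test1) + len(test2))

rng = np.random.default_rng(1)

def simulate(rates, n):
    # Independent Poisson counts are a stand-in for recorded spike counts.
    return rng.poisson(rates, size=(n, 2))

train1, test1 = simulate([1.0, 2.0], 2000), simulate([1.0, 2.0], 2000)
train2, test2 = simulate([3.0, 2.0], 2000), simulate([3.0, 2.0], 2000)
acc = multinomial_accuracy(train1, train2, test1, test2)
```

Because the table has one free parameter per count pair, it needs many more trials than the linear model to be estimated reliably, which is one reason held-out performance is the relevant comparison.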

Possibility of learning-related changes in noise correlation

The dataset used in this study was generated using a serial reaction-time task (Lee and Quessy 2003). This raises the possibility that learning-related changes in noise correlations could affect our decoding analyses. To examine this possibility, we divided each of our datasets into four parts, and compared the noise correlations in the first and last quarter of the data (Fig. 9). We found that there was a strong correlation between the noise correlations in the first and last blocks (r = 0.933) but that the slope of the best fit line was not unity (95% confidence interval: 0.908–0.967). Thus there is a very small but significant difference in the noise correlations between the first and the last quarter of the data. To test how the effect of noise correlation on information coding is influenced by these small changes, we also computed Δd²_shuffled and Δd²_diag in the first and second half of the dataset for pairs of neurons. For this analysis, we used halves of the dataset to ensure a sufficient number of trials. The mean ± SE (n = 132) for Δd²_shuffled were 0.336 ± 0.092 and 0.414 ± 0.086% for the first and second halves of the data, respectively, and the corresponding values for Δd²_diag were 0.074 ± 0.021 and 0.094 ± 0.023%. The means for Δd²_shuffled were significantly different from zero, but the overall size of the effect was still quite small, with 95 and 93% of the distributions confined within ±2% for the first and second halves, respectively. Although this is slightly broader than the distribution shown in Fig. 4, reducing the sample size by dividing the dataset in half would be expected to broaden the distribution. To examine this quantitatively, we generated a dataset with half as many trials by sampling randomly with replacement from the original dataset and calculated Δd²_shuffled in this bootstrapped dataset. The resulting distribution was somewhat broader with only 89% of the data between –2 and +2%. Thus there is a slight shift in the mean of the distribution for Δd²_shuffled due to learning-related changes in neural activity or to nonstationarities in the neural responses not related to learning, but the overall width of the distribution, which is a measure of the largest positive and negative effects, appears to be a relatively stable feature of our dataset.
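The half-size bootstrap control can be sketched as follows. The code draws half as many trials with replacement for each target and recomputes the effect of correlations on encoding. It works in d² units rather than the percentages reported above, obtains the shuffled value analytically from the diagonal of the pooled covariance instead of shuffling trials, and uses simulated responses; all of these choices, and the function names, are assumptions made for illustration.

```python
import numpy as np

def delta_d2_shuffled(resp1, resp2):
    """Estimate d2 - d2_shuffled from trials-by-neurons responses to two targets."""
    dmu = resp2.mean(axis=0) - resp1.mean(axis=0)
    Q = (np.cov(resp1, rowvar=False) + np.cov(resp2, rowvar=False)) / 2.0
    d2 = dmu @ np.linalg.solve(Q, dmu)
    d2_shuffled = np.sum(dmu ** 2 / np.diag(Q))   # correlations removed
    return d2 - d2_shuffled

def bootstrap_half(resp, rng):
    """Draw half as many trials as in resp, sampling with replacement."""
    idx = rng.integers(0, len(resp), size=len(resp) // 2)
    return resp[idx]

rng = np.random.default_rng(2)
Q_true = np.array([[1.0, 0.2], [0.2, 1.0]])       # hypothetical weak noise correlation
resp1 = rng.multivariate_normal([4.0, 6.0], Q_true, 200)
resp2 = rng.multivariate_normal([5.0, 6.5], Q_true, 200)

full = delta_d2_shuffled(resp1, resp2)
boot = np.array([
    delta_d2_shuffled(bootstrap_half(resp1, rng), bootstrap_half(resp2, rng))
    for _ in range(500)
])
spread = boot.std()
```

Comparing the spread of the bootstrapped values with the spread obtained from the actual first and second halves indicates how much of the broadening is expected from the smaller sample alone.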


FIG. 9. Comparison of noise correlation in 1st and last block of data. The best fit line is shown (dashed line; slope = 0.94) as well as the line with unit slope (solid line).

 

 DISCUSSION
 
 
Noise correlations have been studied both empirically and theoretically using a variety of methods. One approach to the empirical study of noise correlations is to consider how much information is lost when neural responses are decoded using algorithms that assume that the responses of individual neurons are uncorrelated. Theoretically, decoding algorithms can only do worse when correlations are ignored. Most of these studies have found that little information is lost when correlations are ignored (Averbeck and Lee 2003; Nirenberg et al. 2001; Oram et al. 2001), although some studies have shown that the effect of ignoring correlations can be larger (Dan et al. 1998; Maynard et al. 1999). Empirical analyses have also assessed the impact of noise correlations on information encoding. Romo and his colleagues (2003) have shown that, for a subpopulation of their neurons, correlations increase the amount of information encoded. As shown in this study (Fig. 3A), this occurs when the difference in the response vectors for different movements is nearly orthogonal to the longest axis of the noise covariance matrix, which was the case with the subset of the data considered by Romo et al. (2003). Another series of studies has examined the encoding effects of correlations using a decomposition of the Shannon information (Panzeri et al. 1999), which can be related to the decomposition used in this study. Their approach also considered the total effect of correlations, as well as splitting the correlation into two terms, one of which is related to Δd²_diag. Similar to our finding, which is based on neural activity recorded in the supplementary motor area, these studies have shown that correlations in pairs of neurons carry relatively little information in V1 (Golledge et al. 2003), rat barrel cortex (Petersen et al. 2002), and inferior-temporal cortex (Rolls et al. 2003). Thus studies based on the analyses of pairs of neurons have consistently demonstrated that the role of correlations in information coding is limited (Averbeck and Lee 2004).

Studying correlations from both the encoding and the decoding perspectives is useful. Assessing the effects of noise correlation on information encoding is valuable for at least two reasons. The first is to check predictions for the amount of information contained in the responses of populations of neurons, based on the recording of single neurons. If neurons are indeed independent, then extrapolations from single-cell recording studies, which by definition cannot estimate the effects of noise correlations, are valid. If, however, neurons are not independent, then these extrapolations are not valid. Second, information maximization models of information coding in the cortex often ignore correlations (Bell and Sejnowski 1995; Hyvarinen et al. 2001; Olshausen and Field 1996). However, maximizing the information contained in the responses of a population of neurons is not simply a matter of optimizing the mean responses of a population of neurons. It also requires the optimization of the distribution of the noise in the neural responses. Assessing the effects of noise on information encoding in small ensembles of neurons is a first step toward assessing the effects of noise in larger populations.

Consideration of the effect of noise correlations on decoding, in addition to the effects on encoding, is also valuable for several reasons. First, understanding the impact of noise correlations on information decoding is important in the design of algorithms for driving neural prosthetic devices (Musallam et al. 2004; Taylor et al. 2002). Second, insights into the biophysical or network mechanisms that would be necessary to extract all of the information from spike trains of upstream neurons can be gained from studying Δd²_diag. To carry out computation, the brain has to solve the same computational problem faced by our decoding algorithm. If Δd²_diag is small, the neural responses can be decoded reliably by assuming the upstream neurons are conditionally independent. This simplifies the computational task of defining the optimal decision boundaries, which presumably simplifies the problem to be solved by the biological system. When the decoding problem is considered from a more general, probabilistic perspective, estimation of the full joint distribution of neural responses is considerably simplified if the distribution can be factorized. This implies that the neurons can be considered conditionally independent for purposes of decoding. These simplifications are the basis for the recent success of graphical models (Jordan and Sejnowski 2001). Furthermore, if all the information in the neural responses resides in the spike counts of individual neurons, i.e., if d² describes all of the information available in the responses, they can be decoded linearly. This might obviate the need for computational machinery at the single neuron or network level that combines inputs nonlinearly. For example, if dendritic arbors combine their inputs linearly, they would not be able to extract information from differential covariances. Some results have suggested that dendritic arbors process their inputs relatively linearly (Cash and Yuste 1998, 1999), whereas others have shown that some level of nonlinearity can be found (Koch 1998; Margulis and Tang 1998; Nettleton and Spain 2000). In general, however, it is unlikely that the brain is limited to linear computations. Understanding the relation between the features of neural responses that carry information and the processing capabilities of dendritic arbors and networks will provide important converging perspectives for understanding the neural code.

When considering whether or not correlations have an effect, studying information encoding and information decoding can lead to different answers (Fig. 3). For some pairs of simultaneously recorded neurons in the supplementary motor area, we found that noise correlations affected the information encoded. However, the effects were relatively small, and averaged across the population, the mean effect was not significantly different from zero. The effect of noise correlations on information decoding was similar in magnitude to the effect on information encoding. Although the mean of the distribution of effects on decoding was significantly different from zero, this term is in principle nonnegative, so this result is not surprising. At the ensemble level, the effects of noise correlations were somewhat larger, but the average effect for information encoding was again not significantly different from zero. It is important to point out that, as we and others have shown before (Averbeck and Lee 2003; Constantinidis and Goldman-Rakic 2002; Reich et al. 2001), measured correlations, and correspondingly the size of the effect of correlations, depend on the bin size used for their estimation. This is due to the fact that cross-covariances between neurons are much stronger at low frequencies and thus larger bins show stronger correlations between neurons (Averbeck and Lee 2004). We have chosen 66 ms in this study because our previous work (Averbeck and Lee 2003) showed that this was the optimum bin size for information extraction. The correlations we observed were similar in size to those that have been observed in other studies (Constantinidis and Goldman-Rakic 2002; Reich et al. 2001). Therefore the relatively small effect of noise correlation on information coding found in the present study may generalize to other brain areas and task conditions, although this remains to be investigated in future studies.

Implicit in the use of d² is the assumption that the conditional response distributions of the neurons are Gaussian, and have the same variance for different targets. Theoretically, this is a strong limitation of using d² as an information measure because the variance of neural responses tends to scale with the mean response (Averbeck and Lee 2003; Tolhurst et al. 1981; Werner and Mountcastle 1963), and response distributions are at best truncated Gaussians unless spike rates are high (Wiener and Richmond 1999). We have shown that the predicted decoding performance derived from d² closely matched the actual decoding performance of a linear decoding model. We have also shown, through a series of analyses, that linear decoding models can generally extract almost all of the information available in the neural responses. More general decoding models that assumed that variances can change for different targets, as well as a very general multinomial decoding model, were only able to do marginally better than the linear decoding model. The major discrepancy we found was between ΔA_diag measured with the linear and the multinomial decoding algorithms. Although the relative magnitude of the effects was similar across our population, the two decoding algorithms did not agree on a pair-by-pair basis. Continued investigation of the limitations of the Gaussian assumption will be important because most current theoretical models of information coding in the cortex make this assumption (Abbott and Dayan 1999; Shamir and Sompolinsky 2004; Sompolinsky et al. 2001; Wilke and Eurich 2002; Wu et al. 2001).

Another information measure often computed on the responses of pairs of neurons is synergy/redundancy (Averbeck et al. 2003; Gawne and Richmond 1993; Latham and Nirenberg 2005; Narayanan et al. 2005; Puchalla et al. 2005; Schneidman et al. 2003). There are two important differences between this measure and our measures of the effects of noise correlations on encoding and decoding. The first is that synergy/redundancy is a function of both noise correlations and signal correlations. Specifically, even if noise correlations are zero, there can and likely will be redundancy in neural responses. Given that noise correlations play a relatively small role in information coding for pairs of neurons, and measured redundancy is normally large, signal correlations are presumably responsible for the reported redundancy effects. It has been shown that Δd²_shuffled can be small when there are large redundancies (Averbeck et al. 2003). Furthermore, the redundancy is largely a function of the finite entropy of discrete Shannon information, since discrete Shannon information saturates. We would get a similar effect if we calculated a synergy/redundancy statistic on the percent correct classification, which of course saturates at 100%. For example, consider the statistic

$$A_{1,2} - (A_1 + A_2)$$
where A_1 is the percent correct classification for neuron 1, A_2 is the percent correct classification for neuron 2, and A_{1,2} is the joint percent correct classification. If A_1 and A_2 are each 90%, but A_{1,2} is only 98%, the responses would be considered redundant. This could be true even if there were no noise correlations. However, if there are no noise correlations, a statistic based on d², for example

$$d^2_{1,2} - (d^2_1 + d^2_2)$$
will be zero. Thus it seems to us that if synergy/redundancy is being calculated, the separate effects of signal and noise correlations should be examined, as has been done in some studies (Gawne and Richmond 1993; Gawne et al. 1996; Panzeri et al. 1999; Petersen et al. 2001; Pola et al. 2003).
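The contrast between the two kinds of statistic can be worked through numerically. The sketch below assumes that the accuracy statistic has the form A_{1,2} − (A_1 + A_2), that the d² statistic has the form d²_{1,2} − (d²_1 + d²_2), and that accuracy and d² are related by A = Φ(d/2) for two equiprobable targets with Gaussian responses. These forms, the function names, and the resulting joint accuracy are illustrative assumptions and are not values from the study.

```python
from statistics import NormalDist

norm = NormalDist()

def accuracy_from_d2(d2):
    """Fraction correct for two equiprobable Gaussian classes, Phi(d / 2)."""
    return norm.cdf(d2 ** 0.5 / 2.0)

def d2_from_accuracy(A):
    """Invert A = Phi(d / 2) to recover d squared."""
    return (2.0 * norm.inv_cdf(A)) ** 2

# Two neurons that each support 90% correct classification on their own.
d2_1 = d2_from_accuracy(0.90)
d2_2 = d2_from_accuracy(0.90)

# With no noise correlations the covariance is diagonal, so d2 adds across neurons.
d2_joint = d2_1 + d2_2
A_joint = accuracy_from_d2(d2_joint)

accuracy_statistic = A_joint - (0.90 + 0.90)   # strongly negative: looks redundant
d2_statistic = d2_joint - (d2_1 + d2_2)        # exactly zero
```

Under these assumptions the joint accuracy is about 96.5%, so the accuracy-based statistic is strongly negative purely because accuracy saturates, whereas the d²-based statistic is zero in the absence of noise correlations.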

In conclusion, noise correlations can differentially affect information encoding and decoding. Both perspectives are useful, and they address different questions about the nature of the neural code. We found that in general, the effects of noise correlations were relatively small in our population of SMA neurons. However, we have only considered analyses in small ensembles of neurons, and theoretical work (Shamir and Sompolinsky 2004) suggests that small effects of noise correlations in pairs of neurons can become substantial in large populations. Furthermore, because the output of a system cannot contain more information than the input, correlations must ultimately limit information if the number of neurons becomes sufficiently large (Narayanan et al. 2005; Seriès et al. 2004). Consistent with this, the effects of correlations in our study were slightly larger in ensembles than in pairs of neurons. Perhaps this is also an explanation for the saturation effects seen in studies related to neural prosthetics that have attempted to use relatively small ensembles to decode hand kinematics (Averbeck et al. 2005; Paninski et al. 2004; Wessberg et al. 2000). Future studies with larger populations of neurons will help to answer these questions empirically.


 GRANTS
 
 
This work was supported by National Institutes of Health Grants R01-MH-59216, T32-MH-19942, and P30-EY-01319.


 ACKNOWLEDGMENTS
 
 
We thank S. Quessy for help with the experiment and A. Pouget for extensive conversations that led to many of the results in the paper.


 FOOTNOTES
 
The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

Address for reprint requests and other correspondence: B. B. Averbeck, Dept. of Brain and Cognitive Sciences, Center for Visual Science, University of Rochester, Rochester, NY 14627 (E-mail: baverbeck{at}cvs.rochester.edu)


 REFERENCES
 
 
Abbott LF and Dayan P. The effect of correlated variability on the accuracy of a population code. Neural Comput 11: 91–101, 1999.

Arndt C. Information Measures. Berlin: Springer-Verlag, 2001.

Averbeck BB, Chafee MV, Crowe DA, and Georgopoulos AP. Parietal representation of hand velocity in a copy task. J Neurophysiol 93: 508–518, 2005.

Averbeck BB, Crowe DA, Chafee MV, and Georgopoulos AP. Neural activity in prefrontal cortex during copying geometrical shapes. II. Decoding shape segments from neural ensembles. Exp Brain Res 150: 142–153, 2003.

Averbeck BB and Lee D. Neural noise and movement-related codes in the macaque supplementary motor area. J Neurosci 23: 7630–7641, 2003.

Averbeck BB and Lee D. Coding and transmission of information by neural ensembles. Trends Neurosci 27: 225–230, 2004.

Basseville M. Distance measures for signal processing and pattern recognition. Signal Process 18: 349–369, 1989.

Bell AJ and Sejnowski TJ. An information-maximization approach to blind separation and blind deconvolution. Neural Comput 7: 1129–1159, 1995.

Bialek W, Rieke F, de Ruyter van Steveninck RR, and Warland D. Reading a neural code. Science 252: 1854–1857, 1991.

Brown EN, Frank LM, Tang D, Quirk MC, and Wilson MA. A statistical paradigm for neural spike train decoding applied to position prediction from ensemble firing patterns of rat hippocampal place cells. J Neurosci 18: 7411–7425, 1998.

Brown EN, Kass RE, and Mitra PP. Multiple neural spike train data analysis: state-of-the-art and future challenges. Nat Neurosci 7: 456–461, 2004.

Casella G and Berger RL. Statistical Inference. Belmont, CA: Duxbury, 1990.

Cash S and Yuste R. Input summation by cultured pyramidal neurons is linear and position-independent. J Neurosci 18: 10–15, 1998.

Cash S and Yuste R. Linear summation of excitatory inputs by CA1 pyramidal neurons. Neuron 22: 383–394, 1999.

Constantinidis C and Goldman-Rakic PS. Correlated discharges among putative pyramidal neurons and interneurons in the primate prefrontal cortex. J Neurophysiol 88: 3487–3497, 2002.

Dan Y, Alonso JM, Usrey WM, and Reid RC. Coding of visual information by precisely correlated spikes in the lateral geniculate nucleus. Nat Neurosci 1: 501–507, 1998.

Gawne TJ, Kjaer TW, Hertz JA, and Richmond BJ. Adjacent visual cortical complex cells share about 20% of their stimulus-related information. Cereb Cortex 6: 482–489, 1996.

Gawne TJ and Richmond BJ. How independent are the messages carried by adjacent inferior temporal cortical neurons? J Neurosci 13: 2758–2771, 1993.

Golledge HD, Panzeri S, Zheng F, Pola G, Scannell JW, Giannikopoulos DV, Mason RJ, Tovee MJ, and Young MP. Correlations, feature-binding and population coding in primary visual cortex. Neuroreport 14: 1045–1050, 2003.

Hyvarinen A, Karhunen J, and Oja E. Independent Component Analysis. New York: Wiley, 2001.

Johnson KO. Sensory discrimination: decision process. J Neurophysiol 43: 1771–1792, 1980.

Johnson RA and Wichern DW. Applied Multivariate Statistical Analysis. Saddle River, NJ: Prentice Hall, 1998.

Jordan MI and Sejnowski TJ. Graphical models. In: Foundations of Neural Computation (1st ed.). Cambridge, MA: MIT Press, 2001, p. 421.

Kang K, Shapley RM, and Sompolinsky H. Information tuning of populations of neurons in primary visual cortex. J Neurosci 24: 3726–3735, 2004.

Koch C. Biophysics of Computation. Oxford: Oxford Univ. Press, 1998.

Latham PE and Nirenberg S. Synergy, redundancy, and independence in population codes, revisited. J Neurosci 25: 5195–5206, 2005.

Lee D, Port NL, Kruse W, and Georgopoulos AP. Variability and correlated noise in the discharge of neurons in motor and parietal areas of the primate cortex. J Neurosci 18: 1161–1170, 1998.

Lee D and Quessy S. Activity in the supplementary motor area related to learning and performance during a sequential visuomotor task. J Neurophysiol 89: 1039–1056, 2003.

Margulis M and Tang CM. Temporal integration can readily switch between sublinear and supralinear summation. J Neurophysiol 79: 2809–2813, 1998.

Maynard EM, Hatsopoulos NG, Ojakangas CL, Acuna BD, Sanes JN, Normann RA, and Donoghue JP. Neuronal interactions improve cortical population coding of movement direction. J Neurosci 19: 8083–8093, 1999.

Musallam S, Corneil BD, Greger B, Scherberger H, and Andersen RA. Cognitive control signals for neural prosthetics. Science 305: 258–262, 2004.

Narayanan NS, Kimchi EY, and Laubach M. Redundancy and synergy of neuronal ensembles in motor cortex. J Neurosci 25: 4207–4216, 2005.

Nettleton JS and Spain WJ. Linear to supralinear summation of AMPA-mediated EPSPs in neocortical pyramidal neurons. J Neurophysiol 83: 3310–3322, 2000.

Nicolelis MA, Fanselow EE, and Ghazanfar AA. Hebb's dream: the resurgence of cell assemblies. Neuron 19: 219–221, 1997.

Nirenberg S, Carcieri SM, Jacobs AL, and Latham PE. Retinal ganglion cells act largely as independent encoders. Nature 411: 698–701, 2001.

Olshausen BA and Field DJ. Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature 381: 607–609, 1996.

Oram MW, Hatsopoulos NG, Richmond BJ, and Donoghue JP. Excess synchrony in motor cortical neurons provides redundant direction information with that from coarse temporal measures. J Neurophysiol 86: 1700–1716, 2001.

Paninski L, Fellows MR, Hatsopoulos NG, and Donoghue JP. Spatiotemporal tuning of motor cortical neurons for hand position and velocity. J Neurophysiol 91: 515–532, 2004.

Panzeri S, Pola G, Petroni F, Young MP, and Petersen RS. A critical assessment of different measures of the information carried by correlated neuronal firing. Biosystems 67: 177–185, 2002.

Panzeri S and Schultz SR. A unified approach to the study of temporal, correlational, and rate coding. Neural Comput 13: 1311–1349, 2001.

Panzeri S, Schultz SR, Treves A, and Rolls ET. Correlations and the encoding of information in the nervous system. Proc R Soc Lond B Biol Sci 266: 1001–1012, 1999.

Petersen RS, Panzeri S, and Diamond ME. Population coding of stimulus location in rat somatosensory cortex. Neuron 32: 503–514, 2001.

Petersen RS, Panzeri S, and Diamond ME. Population coding in somatosensory cortex. Curr Opin Neurobiol 12: 441–447, 2002.

Pola G, Thiele A, Hoffmann KP, and Panzeri S. An exact method to quantify the information transmitted by different mechanisms of correlational coding. Network 14: 35–60, 2003.

Poor HV. An Introduction to Signal Detection and Estimation. New York: Springer, 1994.

Puchalla JL, Schneidman E, Harris RA, and Berry MJ. Redundancy in the population code of the retina. Neuron 46: 493–504, 2005.

Reich DS, Mechler F, and Victor JD. Independent and redundant information in nearby cortical neurons. Science 294: 2566–2568, 2001.

Rieke F, Warland D, de Ruyter van Steveninck R, and Bialek W. Spikes: Exploring the Neural Code. Cambridge, MA: MIT Press, 1997.

Rolls ET, Franco L, Aggelopoulos NC, and Reece S. An information theoretic approach to the contributions of the firing rates and the correlations between the firing of neurons. J Neurophysiol 89: 2810–2822, 2003.

Romo R, Hernandez A, Zainos A, and Salinas E. Correlated neuronal discharges that increase coding efficiency during perceptual discrimination. Neuron 38: 649–657, 2003.

Schneidman E, Bialek W, and Berry MJ, 2nd. Synergy, redundancy, and independence in population codes. J Neurosci 23: 11539–11553, 2003.

Seriès P, Latham PE, and Pouget A. Tuning curve sharpening for orientation selectivity: coding efficiency and the impact of correlations. Nat Neurosci 7: 1129–1135, 2004.

Shamir M and Sompolinsky H. Nonlinear population codes. Neural Comput 16: 1105–1136, 2004.

Snippe HP and Koenderink JJ. Information in channel-coded systems: correlated receivers. Biol Cybern 67: 183–190, 1992.

Sompolinsky H, Yoon H, Kang K, and Shamir M. Population coding in neuronal systems with correlated noise. Phys Rev E 64: 051904, 2001.

Strong SP, Koberle R, de Ruyter van Steveninck RR, and Bialek W. Entropy and information in neural spike trains. Phys Rev Lett 80: 197–200, 1998.

Taylor DM, Tillery SI, and Schwartz AB. Direct cortical control of 3D neuroprosthetic devices. Science 296: 1829–1832, 2002.

Thomson EE and Kristan WB. Quantifying stimulus discriminability: a comparison of information theory and ideal observer analysis. Neural Comput 17: 741–778, 2005.

Tolhurst DJ, Movshon JA, Thompson ID, and Thompson PG. The dependence of response amplitude and variance of cat visual cortical neurones on stimulus contrast. Exp Brain Res 41: 414–419, 1981.

Truccolo W, Eden UT, Fellows MR, Donoghue JP, and Brown EN. A point process framework for relating neural spiking activity to spiking history, neural ensemble, and extrinsic covariate effects. J Neurophysiol 93: 1074–1089, 2005.

Werner G and Mountcastle VB. The variability of central neural activity in a sensory system, and its implications for the central reflections of sensory events. J Neurophysiol 26: 958–977, 1963.

Wessberg J, Stambaugh CR, Kralik JD, Beck PD, Laubach M, Chapin JK, Kim J, Biggs SJ, Srinivasan MA, and Nicolelis MA. Real-time prediction of hand trajectory by ensembles of cortical neurons in primates. Nature 408: 361–365, 2000.

Wiener MC and Richmond BJ. Using response models to estimate channel capacity for neuronal classification of stationary visual stimuli using temporal coding. J Neurophysiol 82: 2861–2875, 1999.

Wilke SD and Eurich CW. Representational accuracy of stochastic neural populations. Neural Comput 14: 155–189, 2002.

Wu S, Nakahara H, and Amari S. Population coding with correlation and an unfaithful model. Neural Comput 13: 775–797, 2001.

Zohary E, Shadlen MN, and Newsome WT. Correlated neuronal discharge rate and its implications for psychophysical performance. Nature 370: 140–143, 1994.




Copyright © 2006 by the American Physiological Society.