INNOVATIVE METHODOLOGY
Department of Physiology and Biophysics and Department of Psychology, University of Calgary, Calgary, Alberta, Canada
Submitted 16 October 2006; accepted in final form 21 December 2006
| ABSTRACT |
|---|
|
|
|---|
| INTRODUCTION |
|---|
|
|
|---|
The concept of information with respect to neuronal activity, although stemming from Shannon's theory, has no standardized meaning in neuroscience and has been used in several different ways (Borst and Theunissen 1999). For instance, information rates have been estimated at a microscopic physiological scale in the transmembrane response to hormonal stimuli (Prank et al. 2000) or within a single synapse (London et al. 2002). The entropy carried by a spike train has also been widely investigated, the complexity of estimation being reflected by the large range of methods proposed [e.g., histogram method (Strong et al. 1998), vector spaces (Victor 2002), Lempel–Ziv complexity (Amigo et al. 2004)]. Information-related measures between neurons have also been used to study the effect of noise correlation on encoding and decoding stimuli (Averbeck and Lee 2006). Finally, the mutual information between the spike-train responses and a set of stimuli was also estimated to investigate the discrimination abilities of neurons (Borst and Theunissen 1999; Chechik et al. 2006; Gehr et al. 2000; Werner and Mountcastle 1965).
Herein we present the transfer entropy as a new exploratory tool that provides a bridge between the study of neural assemblies and the information about stimuli carried by individual neurons. More precisely, the transfer entropy estimates the part of the activity of a neuron that does not depend on its own past but depends on the past activity of another neuron. In a nutshell, it estimates the information transferred between two neurons in both directions. To our knowledge, the transfer entropy concept has been discussed in Jumarie (1990), but applied only once in the context of physical continuous systems (Schreiber 2000), and has never been applied to spike trains. Yet, to a certain extent, this tool is able to distinguish information resulting from common history and exclude it by appropriate conditioning of the entropy. Transfer entropy also detects asymmetry in neural relations, allowing studies of possible feedback in neural circuits, a topic that recently gained considerable interest (Contreras et al. 1996; Hupe et al. 1998; Krupa et al. 1999; Sillito et al. 1993, 1994; Yan and Suga 1996). Finally, and importantly, transfer entropy takes into account linear and nonlinear flows and thus may represent a very general way to define the causality strength between two spike trains. In particular, the window size for which maximum information is transferred may be useful to study neural integrative properties.
After presenting the mathematical tenets of the statistics, its basic properties will be elucidated through simulations of independent Poisson processes and compared with those of cross-correlation measures. An exploration of the multiunit activity recorded with an array of 16 electrodes in cat auditory cortex will highlight the potential of the method in network studies. Finally, recordings of spontaneous activity in 21 cats will show the statistical distribution of transfer entropy and above all delineate the size of temporal integration windows in primary auditory cortex as a potential first important physiological result.
| METHODS |
|---|
|
|
|---|
Let X1 and X2 be two spike trains. Let X1F(t), X2F(t) be the number of spikes of X1 and X2, respectively, falling in the upcoming time interval [t, t + δf]. Similarly, let X1P(t), X2P(t) be the number of spikes of X1 and X2 falling in the past time interval [t − δp, t]. δf and δp typically range from 1 to <100 ms. In practice, time is considered discrete with tn = n δf, n ∈ {0, 1, 2,...} such that [X1F(tn)]n, [X2F(tn)]n, [X1P(tn)]n, and [X2P(tn)]n are discrete processes.
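The windowed counts defined above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `window_counts`, the half-open window convention, and the choice to skip time points whose past window starts before the recording are all assumptions made here.

```python
from bisect import bisect_right


def window_counts(spikes, duration, delta_f, delta_p):
    """Return (future, past) spike counts sampled at t_n = n * delta_f.

    spikes   : sorted spike times, in the same unit as the windows
    duration : recording length
    future[k] counts spikes in (t, t + delta_f]   (assumed convention)
    past[k]   counts spikes in (t - delta_p, t]   (assumed convention)
    """
    future, past = [], []
    n = 0
    while True:
        t = n * delta_f
        if t + delta_f > duration:
            break  # the future window would run past the recording
        if t - delta_p >= 0:  # skip points whose past window is truncated
            future.append(bisect_right(spikes, t + delta_f) - bisect_right(spikes, t))
            past.append(bisect_right(spikes, t) - bisect_right(spikes, t - delta_p))
        n += 1
    return future, past


# Toy spike train in ms, 10-ms future and past windows
future, past = window_counts([1, 5, 12, 18, 25], 30, 10, 10)
print(future, past)
```

Applying the same function to X1 and X2 with a common grid yields the four aligned count sequences used in the rest of the section.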
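In the stationary case, the transfer entropy from X1 to X2 can be defined as the amount of mutual information between the past of X1 (X1P) and the future of X2 (X2F) when the past of X2 (X2P) is already known, i.e.

$$\mathrm{TE}_{X_1 \to X_2} = I(X_{2F}; X_{1P} \mid X_{2P}) = H(X_{2F} \mid X_{2P}) - H(X_{2F} \mid X_{2P}, X_{1P}) \tag{1}$$

$$\mathrm{TE}_{X_1 \to X_2} = \sum_{x_{2F},\, x_{2P},\, x_{1P}} p(x_{2F}, x_{2P}, x_{1P}) \log \frac{p(x_{2F} \mid x_{2P}, x_{1P})}{p(x_{2F} \mid x_{2P})} \tag{2}$$

where $x_{2F}$, $x_{2P}$, and $x_{1P}$ take values in {0, 1, 2,...}. Under independence between X1 and X2, $\mathrm{TE}_{X_1 \to X_2} = 0$. The equality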
![]() | (3) |
When no hypothesis regarding the distribution of X1/2F/P is made, the theoretical properties of the transfer entropy are probably extremely difficult to know a priori. Nevertheless, if δf and δp are large, the joint distribution of X2F, X1P, and X2P will be broad and sparse. The transfer entropy estimate will thus increase automatically even if there is no causal link. Two corrections are necessary to avoid this effect. We present them in the next several paragraphs.
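A histogram (plug-in) estimate of the conditional mutual information between X1P and X2F given X2P can be sketched as below. The function name `transfer_entropy` and the use of base-2 logarithms are assumptions for illustration; the estimate is taken directly from empirical joint counts, which is also why it inflates when the joint distribution becomes broad and sparse.

```python
from collections import Counter
from math import log2


def transfer_entropy(x2f, x2p, x1p):
    """Plug-in estimate (bits) of I(X2F; X1P | X2P) from aligned count sequences."""
    n = len(x2f)
    joint = Counter(zip(x2f, x2p, x1p))   # counts of (x2f, x2p, x1p)
    fut_past = Counter(zip(x2f, x2p))     # counts of (x2f, x2p)
    past_past = Counter(zip(x2p, x1p))    # counts of (x2p, x1p)
    own_past = Counter(x2p)               # counts of x2p
    te = 0.0
    for (f, p2, p1), c in joint.items():
        # p(f | p2, p1) / p(f | p2), written with raw counts
        ratio = (c * own_past[p2]) / (past_past[(p2, p1)] * fut_past[(f, p2)])
        te += (c / n) * log2(ratio)
    return te


# X2's future copies X1's past: one full bit is transferred
print(transfer_entropy([0, 1, 0, 1], [0, 0, 0, 0], [0, 1, 0, 1]))
```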
Bias
First, we subtract from $\mathrm{TE}_{X_1 \to X_2}$ an estimate of the same transfer entropy in shuffled data, thereby modifying X1P to make it independent of X2F/P. Practically, we randomly shuffle the interspike intervals (ISIs) of X1, which does not change the ISI distribution of X1 but completely disconnects X1P and X2F/P. This procedure was previously used in another context in Hung et al. (2002), for instance. Another possibility is to directly shuffle the values of the process X1P, which gives similar results and may be faster computationally. The shuffled estimate is dubbed $\mathrm{TE}^{\mathrm{shuffled}}_{X_1 \to X_2}$ in the following and is an average of results obtained on n trials.
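The ISI shuffling step can be sketched as follows. This is a minimal illustration under stated assumptions: the helper name `shuffle_isis` is hypothetical, and keeping the first spike time fixed is a choice made here, not something specified in the text.

```python
import random


def shuffle_isis(spikes, rng=None):
    """Return a surrogate train with the same ISIs in random order.

    The first spike time is kept (an assumption of this sketch), so the
    surrogate spans the same interval as the original train.
    """
    rng = rng or random.Random()
    if len(spikes) < 2:
        return list(spikes)
    isis = [b - a for a, b in zip(spikes, spikes[1:])]
    rng.shuffle(isis)
    surrogate = [spikes[0]]
    for isi in isis:
        surrogate.append(surrogate[-1] + isi)
    return surrogate


surrogate = shuffle_isis([0, 1, 3, 6, 10], random.Random(0))
print(surrogate)
```

Recomputing the transfer entropy on several such surrogates of X1 and averaging the results gives the shuffled estimate that is subtracted as a bias term.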
Normalization
Finally, given the properties of mutual information, we define the normalized transfer entropy (NTE) by
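$$\mathrm{NTE}_{X_1 \to X_2} = \frac{\mathrm{TE}_{X_1 \to X_2} - \mathrm{TE}^{\mathrm{shuffled}}_{X_1 \to X_2}}{H(X_{2F} \mid X_{2P})} \tag{4}$$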
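A sketch of the normalization step, assuming that NTE is the shuffle-corrected transfer entropy divided by the conditional entropy H(X2F | X2P), i.e., by the information in the future of X2 that is not explained by its own past. The function names and the zero-entropy guard are assumptions for illustration.

```python
from collections import Counter
from math import log2


def conditional_entropy(x2f, x2p):
    """Plug-in estimate (bits) of H(X2F | X2P) from aligned count sequences."""
    n = len(x2f)
    joint = Counter(zip(x2f, x2p))
    own_past = Counter(x2p)
    return -sum((c / n) * log2(c / own_past[p]) for (_, p), c in joint.items())


def normalized_te(te, te_shuffled_trials, h_cond):
    """Bias-corrected transfer entropy divided by H(X2F | X2P)."""
    te_shuffled = sum(te_shuffled_trials) / len(te_shuffled_trials)
    if h_cond == 0:
        return 0.0  # assumed convention: nothing left to explain in X2F
    return (te - te_shuffled) / h_cond


h = conditional_entropy([0, 1, 0, 1], [0, 0, 0, 0])
print(h, normalized_te(0.5, [0.1, 0.1], h))
```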
Preferred direction of flow
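Similarly to Wang and Kadia (2001) and Schnupp et al. (2006), we also define a selective index of the preferred direction of flow (DF) by

$$\mathrm{DF}_{X_1 \to X_2} = \frac{\mathrm{NTE}_{X_1 \to X_2} - \mathrm{NTE}_{X_2 \to X_1}}{\mathrm{NTE}_{X_1 \to X_2} + \mathrm{NTE}_{X_2 \to X_1}} \tag{5}$$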
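A sketch of a direction-of-flow index, assuming the usual contrast form (difference over sum) of the two NTE estimates. Clipping negative estimates to zero and returning `None` below a small `floor` are assumptions added here so that the index is not computed from two near-zero values.

```python
def direction_of_flow(nte_12, nte_21, floor=0.01):
    """Contrast index in [-1, 1]; +1 means flow only from X1 to X2.

    Returns None when both estimates are too small to be meaningful
    (the threshold `floor` is an illustrative choice).
    """
    a = max(nte_12, 0.0)  # bias removal can yield slightly negative values
    b = max(nte_21, 0.0)
    if a + b < floor:
        return None
    return (a - b) / (a + b)


print(direction_of_flow(0.2, 0.0))
print(direction_of_flow(0.001, 0.002))
```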
Final estimates
All previous quantities depend on δf and δp values. All simulations and investigations on real data convinced us that a clear peak always exists in the surface of NTE values as a function of δf and δp. Consequently, we always assigned to NTE the maximum over the set of δf and δp values.
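Taking the maximum over a grid of window sizes can be sketched as a plain exhaustive search. The callable `nte_at` is a hypothetical stand-in for whatever routine computes NTE for one pair of window sizes; the toy surface in the usage lines is invented for illustration only.

```python
def peak_nte(nte_at, delta_f_values, delta_p_values):
    """Return (max NTE, delta_f, delta_p) over a grid of window sizes.

    nte_at : callable (delta_f, delta_p) -> NTE estimate (hypothetical helper)
    """
    best = None
    for delta_f in delta_f_values:
        for delta_p in delta_p_values:
            value = nte_at(delta_f, delta_p)
            if best is None or value > best[0]:
                best = (value, delta_f, delta_p)
    return best


# Toy single-peaked surface with its maximum at 10 ms / 10 ms
toy = lambda df, dp: 1.0 / (1.0 + (df - 10) ** 2 + (dp - 10) ** 2)
print(peak_nte(toy, range(2, 21, 2), range(2, 21, 2)))
```

The window sizes at the peak are kept alongside the NTE value because they are themselves analyzed later as transmission and integration times.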
Cross-correlation
Cross-correlograms were calculated using custom-made programs in MATLAB (Eggermont and Smith 1995b). The bin size was 2 ms and the resulting cross-correlogram was smoothed with a three-bin running average. Stationarity estimates of the recordings were based on firing rate (mean and variance) in 100-s-long segments of the 15-min recordings for silence. To correct for the overall firing rate, burst firing, and common periodicities in the firing of the neurons, the cross-covariance was deconvolved with the square root of the product of the autocovariance functions. This deconvolution was done in the frequency domain, where it becomes a simple division; Fourier transformation back to the time domain resulted in the corrected cross-correlation coefficient function.
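The first two steps (2-ms binning and three-bin smoothing) can be sketched as below. This is not the MATLAB code referred to above: it produces raw lag counts only and omits the frequency-domain deconvolution. The function names, the ±50-ms lag range, and the edge handling of the running average are assumptions.

```python
from bisect import bisect_left


def cross_correlogram(x1, x2, max_lag=50.0, bin_size=2.0):
    """Histogram of lags (t2 - t1) in [-max_lag, max_lag); times in ms, sorted."""
    n_bins = int(round(2 * max_lag / bin_size))
    counts = [0] * n_bins
    for t1 in x1:
        lo = bisect_left(x2, t1 - max_lag)
        hi = bisect_left(x2, t1 + max_lag)
        for t2 in x2[lo:hi]:
            index = int((t2 - t1 + max_lag) // bin_size)
            counts[min(index, n_bins - 1)] += 1  # guard against rounding at the edge
    return counts


def smooth3(counts):
    """Three-bin running average; edge bins average over the bins available."""
    smoothed = []
    for i in range(len(counts)):
        window = counts[max(0, i - 1): i + 2]
        smoothed.append(sum(window) / len(window))
    return smoothed


raw = cross_correlogram([100.0], [110.0, 111.0])
print(raw[30], smooth3([0, 3, 0]))
```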
Simulations
We simulated independent Poisson processes for X1 and X2. We then replaced in X2 a proportion α ∈ [0, 1] of spikes by the same proportion of spikes of X1 delayed by 10 ms. In this way, the firing rate (FR) of X2 is not modified, but it creates a causality link from X1 to X2 of various strengths proportional to α. The parameter for the exponential distribution underlying the Poisson process is λ = 1/10, which gives an average FR of 10 spikes/s for X1 and X2. Spike trains of 300-s duration are generated for both processes.
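The simulation can be sketched as follows. The function names, the seed handling, and the way the replaced spikes are chosen (a uniform random subset of X2 removed, the same number of randomly chosen X1 spikes copied with the delay) are assumptions; the text specifies only the proportion, the 10-ms delay, the 10 spikes/s rate, and the 300-s duration.

```python
import random


def poisson_train(rate, duration, rng):
    """Homogeneous Poisson spike times (s) with exponential ISIs of mean 1/rate."""
    t, spikes = 0.0, []
    while True:
        t += rng.expovariate(rate)
        if t >= duration:
            return spikes
        spikes.append(t)


def coupled_trains(alpha, rate=10.0, duration=300.0, delay=0.010, seed=0):
    """Return (x1, x2) where a proportion alpha of x2 is x1 delayed by `delay`."""
    rng = random.Random(seed)
    x1 = poisson_train(rate, duration, rng)
    x2 = poisson_train(rate, duration, rng)
    n_swap = min(int(round(alpha * len(x2))), len(x1))
    kept = rng.sample(x2, len(x2) - n_swap)                 # spikes of x2 left untouched
    copied = [t + delay for t in rng.sample(x1, n_swap)]    # delayed copies of x1 spikes
    # Copies made near the end may fall slightly after `duration`; ignored here.
    return x1, sorted(kept + copied)


x1, x2 = coupled_trains(0.5, duration=20.0, seed=1)
print(len(x1), len(x2))
```

Because removed and inserted spikes are equal in number, the spike count of X2 is unchanged, which matches the requirement that its firing rate not be modified.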
Real data
We analyzed spike trains recorded during silence from the primary auditory cortex (AI) of 21 ketamine-anesthetized cats. The length of the recording is 900 s for each data set. Recordings were made with two arrays of eight microelectrodes, arranged in a 4 × 2 pattern with 0.5-mm separation between electrodes. The arrays were independently inserted into the auditory cortex. Details about the anesthesia, the electrode array, and the protocol can be found in Tomita and Eggermont (2005). The spike trains from individual electrodes represent multiple sorted units combined into multiple single-unit recordings. NTE is computed between two such multiple single-unit recordings. In addition to spontaneous spiking activity, we also analyzed the spike trains in response to several stimuli used in previous studies:

1) Poisson: Poisson-distributed click trains, with mean click rate of 8/s and dead time of 20 ms, and lasting 15 min (Eggermont and Smith 1995a).
2) NoiseAM: amplitude-modulated noise, modulation frequency 2 to 64 Hz for AM sounds (Eggermont 2002).
3) PP: randomly presented gamma-tone pips at a rate of 20/s with a range of five octaves between 0.625 and 20 kHz (Eggermont 2006).
4) Meow: typical vocalization of a cat, natural and altered with respect to carrier and envelope (Gehr et al. 2000; Gourévitch and Eggermont 2006).
5) RMeow: time-reversed version of the Meow stimulus.
6) lpamn: wide-band noise (bandwidth: 40 kHz) modulated with a 30-Hz low-pass filtered noise (Eggermont 2006).
7) BaPa: presentation of a /ba/–/pa/ continuum in which voice onset time (VOT) was varied in 5-ms steps from 0 to 70 ms (Aizawa and Eggermont 2006).
8) Gaps: noise bursts with gaps from 5 to 70 ms (Aizawa and Eggermont 2006).
9) Train: periodic click trains, repetition rates from 2 to 64 Hz (Eggermont 2002).
| RESULTS |
|---|
|
|
|---|
Simulations
For α = 1 (full causality, Fig. 1A),
for δp = δf = 10 ms and
(Fig. 1B) as is discussed in the following. The reason that NTE reaches its maximal value for δp = δf = 10 ms is explained through the example in Fig. 2, which also explains why we find $\mathrm{NTE}_{X_1 \to X_2} \approx 0$ for δp + δf < 10 ms.
|
|
$\mathrm{DF}_{X_1 \to X_2} = 1$ except when δp + δf < 10 ms, where both NTE estimates are close to zero, as explained previously. Consequently, the DF statistic should not be used when both NTE estimates are close to zero.
The NTE estimate is nonlinearly related to α (Fig. 3), in contrast to the linear dependency for the cross-correlation (XC). However, NTE is more suited to the study of complex neural networks than XC.
|
Model 1 (Fig. 4A) corresponds to α = 0.6 with a delay of 10 ms between X1 and X2. Model 2 (Fig. 4B) is a combination of the same 60% of spikes of X1 but in three fractions of 20%, each part being delayed by 4, 8, and 10 ms, respectively. Such variability in delays may occur if several parallel pathways with a different number of synaptic delays are activated. This situation is common in the brain, especially in nonprimary sensory areas where the neural discharges are spread out temporally (for a comparison of temporal patterns in posterior auditory field and AI see Phillips and Orman 1984).

δf equal to the minimum delay (4 ms) and δp equal to the maximum delay (10 ms). These properties of NTE emphasize its potential as a tool to investigate integration memory and information transfer in neural assemblies.
The rationale for the need of shuffled estimates and normalization is emphasized in Fig. 5, where the quantities of transfer entropy are plotted for values δf = δp and α = 0.5. The transfer entropy increases when δf and δp increase (Fig. 5A), as a consequence of the broad and sparse joint distributions. Removal of this bias ensures that the transfer entropy stays around 0 when no causality is present (Fig. 5B, dashed line for information transfer from X2 to X1). Finally, because the amount of information available in X1 and X2 also increases when δf and δp increase (Fig. 5C), the normalization of the transfer entropy by this latter value sharpens the main peak at 10 ms. A higher average FR for X1 or X2 would basically have the same effect as increasing δf and δp, i.e., an increase of the amount of information available in X1 or X2. Consequently, the combination of bias removal, normalization (Eq. 4), and controlling the influence on the future of a channel of its own past (Eq. 1) makes NTE mostly independent of the firing rate of both neurons.
|
The ability of the transfer entropy to investigate neural assemblies is described in Fig. 6. Two arrays of eight electrodes are inserted in the auditory cortex of a normal hearing cat (Fig. 6A), array 1 being in a ventral part of AI where some recording sites showed nonprimary behavior (C3, C5, C6). This classification was based on longer response latency, more sustained responses, and nonmonotonicity, i.e., responses peak at an intermediate intensity level (Fig. 6B). For spontaneous firings, the matrix of NTE values (Fig. 6C) suggests various networks of information transfer graphically represented in Fig. 6D (for NTE >0.04). Little information was shared between electrodes in different arrays (Fig. 6C). A cluster analysis based on XC values (Eggermont 2006) showed one cluster for array 2; one cluster consisting of C1, C2, C3, and C4; one cluster consisting of C5 and C6; and one consisting of a single electrode C7 (indicated by different colors in Fig. 6D). The maximum of NTE between electrodes from different arrays was consistently found for higher values of δf and δp (Fig. 6, E and F). Except for C10, a flow from left to right and bottom to top is visible in array 2 (Fig. 6D). Interestingly, a flow from primary to putative nonprimary recording sites is clearly visible for array 1 (Fig. 6D), and even associated with small δf and δp values (Fig. 6, E and F). The transitivity rule is respected here: if there is no relation from channel 2 to 1, then there is none from 2 to 3, and from 3 to 1; however, some strong transfers might occur in both directions (for instance, C1 to C4 and C4 to C1, see Fig. 6C). One possible hypothesis is that this results from an indirect feedback. However, most information transfer occurs in one single preferred direction, as illustrated in Fig. 6G between C2 and C4, both with primary-like responses. For real data, just as for the simulations, a single peak is present in the surface of NTE values as a function of δf and δp (Fig. 6G).
|
δp across stimuli (Fig. 7B). Whereas close sites C9, C10, and C11 shared information for very small and unchanged δp values over all stimuli, distant sites C7 and C13 showed high variability for δp. More precisely, presentation of natural and altered Meows provoked more information transferred from C7 to C13 along with longer δp values. In contrast, pairs C9/C16 and C10/C16 also showed longer δp values during Meows or silence stimuli without any specific increase of NTE compared with other stimuli.
|
Analysis of NTE values obtained for spontaneous activity from 5,650 electrode pairs in AI of 21 cats illustrates the putative statistical properties of the transfer entropy in vivo (Fig. 8). The distribution of NTE values approximately follows an exponential law (Fig. 8, A and B) with parameter λ = 1/0.0225, where 0.0225 is equal to the mean NTE. In particular, 5% of NTE values are >0.0714 and 16% are >0.04, the value taken as a lower limit in constructing the transfer diagram of Fig. 6D. The normalized information transfer computed without conditioning to the past of the current neuron was found to be 25% higher than NTE values, on average. This suggests that common history between pairs of neurons would account for roughly 20% of information transfer values if conditioning was not performed. The NTE values are somewhat correlated with peak cross-correlation values (Fig. 8C, correlation coefficient 0.58). Nevertheless, strong variability is apparent, suggesting the existence of pairs of neurons that are transferring information but are poorly synchronized, or the opposite. This reflects the difference of information transfer revealed with these two tools. Similarly, the lag for the cross-correlation peak and the δp values are weakly correlated (0.31), although δp is generally higher than the lag time in absolute value [P < 10⁻⁶, Wilcoxon test (Wilcoxon 1945); Fig. 8D]. This suggests that activity may be integrated over a larger interval than that strictly associated with the mean delay between neuronal firings.
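The exponential model mentioned above gives a quick way to relate thresholds and tail fractions. The sketch below assumes a pure exponential law with mean 0.0225; the function names are hypothetical. Under that assumption, about 17% of values exceed 0.04 and the 5% tail starts near 0.067, close to but not identical with the empirical figures quoted in the text (16% and 0.0714).

```python
from math import exp, log


def tail_fraction(threshold, mean_nte=0.0225):
    """P(NTE > threshold) under an exponential law with the given mean."""
    return exp(-threshold / mean_nte)


def threshold_for_tail(fraction, mean_nte=0.0225):
    """NTE value exceeded by the given fraction of pairs under the same law."""
    return -mean_nte * log(fraction)


print(round(tail_fraction(0.04), 3))       # fraction of pairs above 0.04
print(round(threshold_for_tail(0.05), 4))  # start of the upper 5% tail
```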
|
Figure 9 presents results about transmission times and neural integration times involved in information transfer in cat AI. Most of the influence of past activity was restricted to the next 5 ms of neuronal activity (distribution of δf values; Fig. 9A). In contrast, the duration of past integration memory was larger, generally extending up to 15 ms but occasionally even up to 35 ms (Fig. 9B). The highest values for NTE were found for integration-memory duration <10 ms (Fig. 9C). As expected, the information transfer decreased with distance between neurons (Fig. 9D), suggesting that, at least during spontaneous activity, redundancy between neuron activities occurs mostly locally. Consequently, the influence of multiunit activity from one recording site onto a distant one is weak and drowned in thousands of other incoming connections to this site. Another consequence is that the minimum δp values increase with distance between electrodes (Fig. 9E). However, interestingly, some high values for δp can be found even for nearby electrodes, suggesting the existence of neurons that process input activity over long temporal integration windows, even if in this case the NTE is necessarily smaller. XC also decreases with distance between neurons in similar fashion to NTE (Fig. 9F).
|
| DISCUSSION |
|---|
|
|
|---|
Most tools used in the investigation of causality in neuroscience, especially in electroencephalography (review in Gourévitch et al. 2006), are based on an interpretation of the Granger Causality definition (Granger 1969): "We say that X1(t) is causing X2(t) [X1(t) → X2(t)] if we are better able to predict X2(t) using all available information than if the information apart from X1(t) had been used." In his paper, Granger interpreted "better able to predict" as a reduction of the variance of the prediction error. Yet, in the light of information theory, the ability to better predict can also be understood through the entropy of the predicted variable. If the uncertainty (entropy) associated with a random variable is reduced, the prediction of its possible values is indeed improved. From Eq. 1, it appears that transfer entropy is the reduction of uncertainty in the future of X2 (X2F) attributed to the knowledge of the past of X1 (X1P). Consequently, when common input does not explain all the activity, NTE is a quantification of a causality link in the Granger sense.
Because NTE is based on information theory, we also posit that it is a very general way to define causality, a way that encompasses both linear and nonlinear relationships between the activities of a pair of neurons. However, only bivariate cases are considered for NTE because "the information apart from" X1P is X2P and so all the available information is implicitly reduced to X1P and X2P. We are aware that a "future challenge is to design methods that truly allow neuroscientists to perform multivariate analyses of multiple spike trains data" (Brown et al. 2004). However, even though NTE theoretically can easily be extended to an n-system of spike trains, it has been restrained in this paper to bivariate cases because of unobserved contributing neurons and the "curse of dimensionality" issues if all units available are used. One consequence is that "direct causality" should probably be avoided as an interpretation of NTE in a multiple spike-train context because of common inputs and potential intricate parallel and intermediate pathways between the pairs of neurons or multiple single units studied. A better interpretation may be that, in the Shannon sense, information present in one spike train is transferred by any synaptic pathway and subsequently observed in another train. Such a tool may thus be extremely useful in redundancy studies in the brain.
Neural assemblies
The greatest interest about neural networks in the brain concerns the parameters describing relations between neurons and their evolution during elicited responses. For instance, the balance between inhibition and excitation appears crucial (Bush and Sejnowski 1996; Kirkland and Gerstein 1998; Xing and Gerstein 1996). It may drive the contraction or enlargement of neural assemblies observed through synchrony (Eggermont 2006). One of the most common definitions for neural assemblies is "a group of neurons [that are] at least transiently working together as indicated by correlation of unit activity" (Gerstein and Kirkland 2001). We feel that the restriction of assembly membership by correlation only is too limited. It seems to us that temporal integration (and thus information transfer as quantified by NTE) defines another parameter of relations between neurons that is also able to emphasize neural assembly properties. An extended definition of neural assemblies would rather become "a group of neurons that are at least transiently working together as indicated by significant levels of synchronization and short-time integration between their unit activities." In this respect it is important to notice that the peak widths of the cross-correlograms (Eggermont 2000) range over the same values as the integration times involved in NTE.
Presently, the size of microelectrode arrays (mostly 16 or 32 electrodes) does not allow exhaustive sampling of neural networks. However, it is likely that investigations in the next decade will produce hundreds of simultaneous recordings, from which more precise and realistic descriptions of neural assembly processing will arise. Regardless, NTE may be useful to make network models or neural computation models more realistic by defining additional physiological parameters (see, for instance, Bush and Sejnowski 1996; Davey et al. 2006; Feldman 1982; Graham and Willshaw 1997; Valiant 2006), especially those including temporal integration (Panchev and Wermter 2006) or feedback (Kirkland and Gerstein 1998; Xing and Gerstein 1996).
In particular, in the continuing debate opposing population codes based on firing rate with neural assembly code resulting from coincident spiking, NTE appears as a useful tool to investigate neural assemblies resulting from firing rate changes induced by temporal integration. Besides, results in Fig. 9E support the hypothesis of long temporal integration windows for some neurons even if small consecutive NTE values do not allow consistent conclusions. For extended studies of neural assemblies, it is likely that NTE can be used to complement the cross-correlation function.
Another important property of NTE dealing with neural assemblies concerns the conditioning to the past of X2 (X2P), in the case of transfer from X1 to X2. This conditioning cannot exclude common input that would provoke simultaneous activity in X2F and X1P because of a delay between X1 and X2. Latency from the thalamus to a cortical cell is remarkably constant across the cortex (typically, 2 ms), despite the wide divergence of inputs from the thalamus (Salami et al. 2003). This common input would thus occur without latency differences in cortical cell pairs. It therefore appears difficult to exclude it if the connection arising from common input overwhelms the strength of the direct connection between the pair of neurons. As previously noted, this is one reason that NTE estimates should preferably be interpreted as an information transfer rather than as a direct causal link. However, the conditioning will exclude all common information between X1P and X2P. Not only is this important in the context of integration of activities between neurons, but because δp is most often greater than the lag detected by the cross-correlation method (Fig. 8D), this will partly exclude the influence of similar values for X1P and X2P that would occur if the lag was small and X1P and X2P were determined by only a common input. We indeed found that 20% of the information transfer between X1P and X2F alone arises from a common history between X1P and X2P and is removed by conditioning to X2P.
Technical choices
Applying information theory to any type of data always requires careful thinking about the parameters used. These parameters can indeed dramatically influence results and conclusions. We chose to directly use Eq. 2 and the data available to estimate the transfer entropy, keeping a nonparametric environment. Some closed forms for TE may exist albeit dependent on the model considered for data. For instance, using notations introduced in METHODS, if X1 and X2 are Poisson processes, [X1F(tn)]n, [X2F(tn)]n, [X1P(tn)]n, and [X2P(tn)]n all follow a Poisson distribution. The computation of TE thus depends only on the model used for the coupling relations between these four random variables. However, to our knowledge, such models have never been seriously considered in the literature and are even suggested as a future challenge in information theory context (Brown et al. 2004). As a consequence, the theoretical distribution of TE appears unreachable at this time, similarly to several causality measures recently proposed in electrophysiology [directed coherence or DCOH (Saito and Harashima 1981); directed transfer function or DTF (Kaminski and Blinowska 1991); partial directed coherence or PDC (Sameshima 1999)].
A significance threshold for TE is also difficult to determine. One possibility is to use the work of Moddemeijer (1989), who basically noted that the histogram represents statistics following a multinomial distribution. He then proposed an approximation for the variance of the entropy estimate in the case of the histogram approximation of the density. Preliminary investigations adapting this idea to the TE statistic did not convince us of the robustness of such an approach, which too often gave significant values. We rather chose to normalize TE because a coefficient between 0 and 1 is easier to interpret, like that for correlation or coherence. In our case, NTE estimates the part of information conveyed by a channel that is independent of its own past but could be found in the past of another channel. The TE statistic alone is indeed not comparable between channels because information conveyed by channels shows a high variability. We then computed the empirical distribution of NTE for spontaneous activity, which may be specific for the cat's auditory cortex. Nonetheless, we think that values >0.03 or 0.04 could indicate real information transfer, albeit a modest one. The putative exponential distribution model for NTE (Fig. 8B) should help to delineate threshold values in future studies showing different NTE averages.
Another choice is the use of the same value δp for both the own past of a channel and the past of the exogenous channel. Obviously, it would be preferable to dissociate them, but the computation cost of an additional parameter to current δf and δp on which to maximize the NTE would be extremely high. One must notice here that this statistic in its current state already requires careful programming to achieve results in a reasonable time. In fact, the computation speed essentially depends on the joint distribution computation and thus on the number of trials used to compute the shuffled estimate $\mathrm{TE}^{\mathrm{shuffled}}_{X_1 \to X_2}$.
It is noted that similarities with the transfer entropy idea of conditioning with respect to the past of another spike train can already be found in the old "cross-intensity functions" (Cox and Lewis 1966; Perkel et al. 1967), although rarely used with neural data (Brown et al. 2004; Eggermont and Smith 1996), and in the nonlinear causality test of Baek and Brock (1992) improved by Hiemstra and Jones (1994). The mutual information between the synaptic input and the output spike train of a single neuron also was investigated by London et al. (2002) using a finite-order Markov model for sequences of activations. Transitional probabilities were estimated by means of a context-weighting tree representation of all possible models. Although complex, this method might represent an alternative to ours for entropy estimation, even if its ability to test several orders of memory and to manage a high number of pairwise combinations in a reasonable time remains to be proved.
Physiological correlates of the results for spontaneous activity
It is not surprising that transfer information is relatively low between cortical neurons (most values are <0.15; Fig. 8A), somewhat similar to maximum levels of synchrony under spontaneous activity (Eggermont 1994). Several histological reasons provide evidence for the weak influence of one neuron on another one, even neighboring ones. Even if one neuron typically receives inputs from several thousands of other neurons [rough estimates of 7,800 for mouse (Braitenberg and Schüz 1998), 9,400 for pyramidal neurons in rat visual cortex (Hellwig 2000), and 24,000–80,000 for human cortex (Abeles 1991)], it is much smaller than the total number of neurons in the brain [1.6 × 10⁷ for mouse (Braitenberg and Schüz 1998), 10¹⁰ for humans (Abeles 1991)], even compared with the number of neurons that would be contained in the volume of the functional area of this neuron [75,000/mm³ in rat visual cortex (Hellwig 2000)]. Moreover, based on excitatory postsynaptic potential values in rat visual cortex (Song et al. 2005), around 26 presynaptic neurons would be needed to cause a postsynaptic action potential, a result that is within the estimate of between five and 300 (Abeles 1991).
Similarly, a decrease of transfer entropy and a fortiori synchrony with distance (Fig. 9, D and F) is consistent with anatomical findings. Hellwig estimated that 70% of synapses of layer 2/3 pyramidal neurons in rat visual cortex are contained in a cylinder-shaped volume of cortex, whose radius parallel to the cortical surface is 500 µm and height is 300 µm (Hellwig 2000). Other studies led to similar results (Gruner et al. 1974; Nicoll and Blakemore 1993). Histological studies of Liley and Wright (1994) and Hellwig (2000) also showed decreasing connection probability with cell separation within pyramidal and stellate neurons of layer 2/3, the probability being <0.2 when the distance is >500 µm. The estimated probability of connection is often even lower in electrophysiology studies, between 5 and 15% for neighbor neurons (Mason et al. 1991; Nicoll and Blakemore 1993; Thomson and Deuchars 1997). Given that the mean synaptic delay in cortex is 1.2 ms with a minimum of 0.5 ms (Mason et al. 1991; Nicoll and Blakemore 1990), it also appears clear that large values for δp (>10 ms) between two multiple single-unit recordings will be associated with distant connections and several synaptic intermediates. This will weaken the influence of the connection and the likelihood of similar activities, and so make NTE values decrease substantially (Fig. 9, C and D). Even if NTE and XC show a similar decrease with distance (Fig. 9, D and F), the variability observed between their values suggests that coincident spiking does not fully reveal the information transferred between neurons (Fig. 8C) and emphasizes the importance of part of the neural code based on temporal integration.
δp values reported in Fig. 9 already provide an insight into the windows of temporal integration potentially used in AI. Figure 9E shows that long windows (>20 ms) can be found even between neighboring sites (<<1.5 mm). Nevertheless, >80% of such observed windows are <15 ms. To our knowledge, most studies about potential temporal integration in auditory processing analyzed the responses to more or less complex stimuli, never under silence. For instance, some neurons in AI respond to brief periodic stimuli only for repetition rates <20–40 Hz (Eggermont 2002; Lu et al. 2001; Schreiner et al. 1997). Time reversal of short (<50-ms) segments in recorded speech does not affect its intelligibility (Saberi and Perrott 1999). The mutual information between some vocalizations and the neural firings in the ferret reached a maximum when the temporal resolution of analysis was between 10 and 40 ms (Schnupp et al. 2006). From awake marmoset monkey responses to periodic click trains, Wang et al. (2003) concluded that rapidly modulated signals would be integrated within a short-time window of about 20–30 ms. These observations suggest that temporal integration over 10 to 50 ms may occur when processing a more or less complex sound. These findings are thus completely in line with δp values mainly between 2 and 15 ms found during silence, their distribution stretching up to 40 ms (Fig. 9C). One underlying question concerns the variation of NTE and temporal integration windows under various stimulus conditions. Figure 7, A and B showed that more information may be transferred between some recording sites during specific stimuli such as Poisson and Meows, whereas the length of the window of temporal integration is not perfectly correlated with variations of NTE. In particular, the study of spontaneous activity may be of more interest than previously expected because some significant levels of information transfer, and so redundancy, can be found between several multiple single units (Figs. 7A and 8A), even when they are >1 mm apart (Fig. 9C). This preliminary result is intriguing and illustrates the potential of the method in understanding certain aspects of brain processing.
In conclusion, normalized transfer entropy (NTE) has promising features that should make it useful for neural network analysis. Based on information theory and an intuitive definition, NTE quantifies the influence, in a nonrestricted sense, that the activity observed in one neuron, or in multiple single units, has on another. NTE has great potential interest for studies of temporal integration as part of the neural code. NTE is a coefficient between 0 and 1 that is easy to interpret and independent of firing rate. NTE may show variability under various stimulus conditions, allowing studies of how neural assemblies encode stimuli. NTE appears robust (one peak over f and p) and shows results complementary to cross-correlation. NTE allows studies of feedback in neural circuits. Further investigations of NTE, f, and p values between different places of a sensory cortical area, and during stimulus presentation, are needed to reveal the potential of this tool and possibly help to understand brain processing. The present application showed that most temporal integration windows during spontaneous activity in cat primary auditory cortex extend from a few milliseconds to 15 ms.
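As an illustration of the quantity summarized above, the following Python sketch computes a plug-in transfer entropy between two binned (0/1) spike trains and divides it by the conditional entropy of the target given its own past, which yields a coefficient between 0 and 1. It is a minimal sketch under stated assumptions, not the estimator used in this study: the function name, the `history` argument (number of past bins, playing a role loosely comparable to p), the prediction of a single future bin, the choice of normalizer, and the absence of any bias correction (for example with shuffled surrogates) are all assumptions made for the example.

```python
import math
import random
from collections import Counter


def _entropy(counts):
    """Shannon entropy (bits) of the empirical distribution given by a Counter."""
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def normalized_transfer_entropy(source, target, history=1):
    """Plug-in TE(source -> target) divided by H(target_next | target_past).

    `source` and `target` are equal-length sequences of 0/1 bins.
    `history` is a hypothetical parameter: the number of past bins used.
    No bias correction is applied (assumption made for brevity).
    """
    if len(source) != len(target):
        raise ValueError("spike trains must have the same number of bins")

    # Joint counts of (next target bin, target past, source past).
    joint = Counter()
    for t in range(history, len(target)):
        own_past = tuple(target[t - history:t])
        src_past = tuple(source[t - history:t])
        joint[(target[t], own_past, src_past)] += 1

    # Marginal counts needed for the two conditional entropies.
    next_and_own = Counter()
    own_only = Counter()
    own_and_src = Counter()
    for (nxt, own_past, src_past), c in joint.items():
        next_and_own[(nxt, own_past)] += c
        own_only[own_past] += c
        own_and_src[(own_past, src_past)] += c

    # H(A | B) = H(A, B) - H(B)
    h_given_own = _entropy(next_and_own) - _entropy(own_only)
    h_given_both = _entropy(joint) - _entropy(own_and_src)

    if h_given_own <= 0:
        return 0.0  # target fully predicted by its own past: nothing to transfer
    te = h_given_own - h_given_both
    # Clamp to [0, 1] to absorb floating-point round-off.
    return min(1.0, max(0.0, te / h_given_own))


# Toy example: y copies x with a one-bin delay, so x should drive y, not the reverse.
random.seed(0)
x = [random.randint(0, 1) for _ in range(2000)]
y = [0] + x[:-1]
nte_xy = normalized_transfer_entropy(x, y)
nte_yx = normalized_transfer_entropy(y, x)
print(f"NTE x->y: {nte_xy:.3f}, NTE y->x: {nte_yx:.3f}")
```

In this toy case the coefficient is asymmetric, which is the property that lets such a measure address direction and feedback where a symmetric coincidence measure cannot. With real recordings, the bin width and the number of past bins would strongly affect the estimate, and a surrogate-based correction would be advisable for short recordings.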
| GRANTS |
|---|
| FOOTNOTES |
|---|
Address for reprint requests and other correspondence: J. J. Eggermont, Department of Psychology, 2500 University Drive N.W., University of Calgary, Calgary, Alberta, Canada T2N 1N4 (E-mail: eggermon{at}ucalgary.ca).
| REFERENCES |
|---|
Aizawa N, Eggermont JJ. Effects of noise-induced hearing loss at young age on voice onset time and gap-in-noise representations in adult cat primary auditory cortex. J Assoc Res Otolaryngol 7: 71–81, 2006.
Amigo JM, Szczepanski J, Wajnryb E, Sanchez-Vives MV. Estimating the entropy rate of spike trains via Lempel–Ziv complexity. Neural Comput 16: 717–736, 2004.
Averbeck BB, Lee D. Effects of noise correlations on information encoding and decoding. J Neurophysiol 95: 3633–3644, 2006.
Baek E, Brock W. A General Test for Nonlinear Granger Causality [Working Paper]. Ames, IA: Univ. of Iowa, 1992.
Baker SN, Gerstein GL. Improvements to the sensitivity of gravitational clustering for multiple neuron recordings. Neural Comput 12: 2597–2620, 2000.
Borst A, Theunissen FE. Information theory and neural coding. Nat Neurosci 2: 947–957, 1999.
Braitenberg V, Schüz A. Cortex: Statistics and Geometry of Neuronal Connectivity. New York: Springer-Verlag, 1998.
Brown EN, Kass RE, Mitra PP. Multiple neural spike train data analysis: state-of-the-art and future challenges. Nat Neurosci 7: 456–461, 2004.
Bush P, Sejnowski T. Inhibition synchronizes sparsely connected cortical neurons within and between columns in realistic network models. J Comput Neurosci 3: 91–110, 1996.
Chechik G, Anderson MJ, Bar-Yosef O, Young ED, Tishby N, Nelken I. Reduction of information redundancy in the ascending auditory pathway. Neuron 51: 359–368, 2006.
Contreras D, Destexhe A, Sejnowski TJ, Steriade M. Control of spatiotemporal coherence of a thalamic oscillation by corticothalamic feedback. Science 274: 771–774, 1996.
Cox DR, Lewis PAW. The Statistical Analysis of Series of Events. New York: Wiley, 1966.
Davey N, Calcraft L, Adams R. High capacity, small world associative memory models. Connect Sci 18: 247–264, 2006.
deCharms RC, Zador A. Neural representation and the cortical code. Annu Rev Neurosci 23: 613–647, 2000.
Eggermont JJ. Neural interaction in cat primary auditory cortex. II. Effects of sound stimulation. J Neurophysiol 71: 246–270, 1994.
Eggermont JJ. Is there a neural code? Neurosci Biobehav Rev 22: 355–370, 1998.
Eggermont JJ. Sound-induced synchronization of neural activity between and within three auditory cortical areas. J Neurophysiol 83: 2708–2722, 2000.
Eggermont JJ. Temporal modulation transfer functions in cat primary auditory cortex: separating stimulus effects from neural mechanisms. J Neurophysiol 87: 305–321, 2002.
Eggermont JJ. Properties of correlated neural activity clusters in cat auditory cortex resemble those of neural assemblies. J Neurophysiol 96: 746–764, 2006.
Eggermont JJ, Smith GM. Separating local from global effects in neural pair correlograms. Neuroreport 6: 2121–2124, 1995a.
Eggermont JJ, Smith GM. Synchrony between single-unit activity and local field potentials in relation to periodicity coding in primary auditory cortex. J Neurophysiol 73: 227–245, 1995b.
Eggermont JJ, Smith GM. Neural connectivity only accounts for a small part of neural correlation in auditory cortex. Exp Brain Res 110: 379–391, 1996.
Feldman JA. Dynamic connections in neural networks. Biol Cybern 46: 27–39, 1982.
Gehr DD, Komiya H, Eggermont JJ. Neuronal responses in cat primary auditory cortex to natural and altered species-specific calls. Hear Res 150: 27–42, 2000.
Gerstein GL, Aertsen AM. Representation of cooperative firing activity among simultaneously recorded neurons. J Neurophysiol 54: 1513–1528, 1985.
Gerstein GL, Kirkland KL. Neural assemblies: technical issues, analysis, and modeling. Neural Netw 14: 589–598, 2001.
Gourévitch B, Bouquin-Jeannes RL, Faucon G. Linear and nonlinear causality between signals: methods, examples and neurophysiological applications. Biol Cybern 95: 349–369, 2006.
Gourévitch B, Eggermont JJ. The spatial representation of neural responses to natural and altered conspecific vocalizations in cat auditory cortex. J Neurophysiol 97: 144–158, 2007.
Graham B, Willshaw D. Capacity and information efficiency of the associative net. Netw Comput Neural Syst 8: 35–54, 1997.