## Abstract

**Zador, Anthony.** Impact of synaptic unreliability on the information transmitted by spiking neurons. *J. Neurophysiol.* 79: 1219–1229, 1998. The spike generating mechanism of cortical neurons is highly reliable, able to produce spikes with a precision of a few milliseconds or less. The excitatory synapses driving these neurons are by contrast much less reliable, subject both to release failures and quantal fluctuations. This suggests that synapses represent the primary bottleneck limiting the faithful transmission of information through cortical circuitry. How does the capacity of a neuron to convey information depend on the properties of its synaptic drive? We address this question rigorously in an information theoretic framework. We consider a model in which a population of independent unreliable synapses provides the drive to an integrate-and-fire neuron. Within this model, the mutual information between the synaptic drive and the resulting output spike train can be computed exactly from distributions that depend only on a single variable, the interspike interval. The reduction of the calculation to dependence on only a single variable greatly reduces the amount of data required to obtain reliable information estimates. We consider two factors that govern the rate of information transfer: the synaptic reliability and the number of synapses connecting each presynaptic axon to its postsynaptic target (i.e., the connection redundancy, which constitutes a special form of input synchrony). The information rate is a smooth function of both mechanisms; no sharp transition is observed from an “unreliable” to a “reliable” mode. Increased connection redundancy can compensate for synaptic unreliability, but only under the assumption that the fine temporal structure of individual spikes carries information. 
If only the number of spikes in some relatively long time window carries information (a “mean rate” code), an increase in the fidelity of synaptic transmission results in a seemingly paradoxical decrease in the information available in the spike train. This suggests that the fine temporal structure of spike trains can be used to maintain reliable transmission with unreliable synapses.

## INTRODUCTION

A pyramidal neuron in the cortex receives excitatory synaptic inputs from 10^{3}–10^{4} other neurons (Shepherd 1990). When an action potential invades the presynaptic terminal of one of these synapses, it sometimes triggers the release of a vesicle of glutamate, which causes current to flow into the postsynaptic dendrite. Some of this current then propagates, passively or actively, to the spike generator, where it may contribute to the triggering of an action potential.

The postsynaptic neuron can be viewed as an input-output element that converts the input spike trains from many presynaptic neurons into a single-output spike train. This input-output transformation is the basic computation performed by neurons. It is the foundation upon which cortical processing is based.

The computational strategies available to a neuronal circuit depend upon the fidelity of its components. For example, the computational power of a single integrate-and-fire neuron depends on the effective noise of the currents driving the spike generator (Zador and Pearlmutter 1996). In the cortex, the transformation of somatic current into an output spike train appears to be highly reliable (Mainen and Sejnowski 1995, see also Bryant and Segundo 1976), in marked contrast to the unreliability of synaptic transmission (Allen and Stevens 1994; Dobrunz and Stevens 1997; Stratford et al. 1996). In this paper, we use simple biophysical models of spike transduction and stochastic synaptic release to explore the implications of synaptic unreliability on information transmission and neural coding in the cortex. Our goal is to provide a quantitative answer to the question: How much information can the output spike train provide about the synaptic inputs? Our answer will be cast in an information-theoretic framework.

## METHODS

### Physiology

Standard slice recording methods were used to obtain Fig. 1. Briefly, patch-clamp recordings were obtained under visual guidance by using infrared optics from 400-μm slices from Long Evans rats [postnatal day (P)14–P20]. Recordings were performed at 33–35°C. Slices were continuously perfused with a solution containing (in mM) 120 NaCl, 3.5 KCl, 2.6 CaCl_{2}, 1.3 MgCl_{2}, 1.25 NaH_{2}PO_{4}, 26 NaHCO_{3}, and 10 glucose, which was bubbled with 95% O_{2}-5% CO_{2} and the pH of which had been adjusted to 7.35. All recordings were obtained in the presence of the α-amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid (AMPA) receptor antagonist 6-cyano-7-nitroquinoxaline-2,3-dione (CNQX, 50 μM). Recording pipettes were filled with (in mM) 170 K gluconate, 10 *N*-2-hydroxyethylpiperazine-*N*′-2-ethanesulfonic acid (HEPES), 10 NaCl, 2 MgCl_{2}, 1.33 ethylene glycol-bis(β-aminoethyl ether)-*N*,*N*,*N*′,*N*′-tetraacetic acid (EGTA), 0.133 CaCl_{2}, 3.5 MgATP, and 1.0 guanosine 5′-triphosphate (GTP), pH 7.2. Resistance to bath was 3–5 MΩ before seal formation.

Data were acquired by using a National Instruments (TX) AT-MIO-16-F-5 A/D card on a Pentium-based computer under the Windows NT (Microsoft) operating system. Software written in Labview (National Instruments) with Dynamic Data Exchange links to Matlab (Mathworks) allowed convenient online synthesis and injection of arbitrary synthetic current waveforms.

### Simulations

All simulations were performed using Matlab 4.2.

### Model of spiking

We use an integrate-and-fire mechanism to model the transformation of synaptic inputs into spike trains in cortical neurons. Let *i*_{syn}(*t*) be the synaptic current driving a leaky integrator with a time constant τ and a threshold *V*_{thresh}. As long as the voltage is subthreshold, *V*(*t*) < *V*_{thresh}, the voltage is given by

$$\tau \frac{dV(t)}{dt} = -[V(t) - V_{\text{rest}}] + R_n i_{\text{syn}}(t) \qquad (1)$$

where *R*_{n} is the input resistance and *V*_{rest} is the resting potential. At the instant the voltage reaches the threshold *V*_{thresh}, the neuron emits a spike and resets to some level *V*_{reset} < *V*_{thresh}. The five parameters of this model, *V*_{thresh}, *V*_{reset}, *V*_{rest}, τ, and *R*_{n}, determine its response to a given input current.

The output of the model is a spike train, i.e., a sequence of times at which *V*(*t*) exceeded threshold. If time is finely discretized into bins shorter than the shortest interspike interval, so that the number of spikes in each bin is either zero or one (but not greater than one), then the spike train can be represented as a binary string *z*_{o}(*t*), with ones at times when the neuron fired and zeros at other times.
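This transformation can be sketched as follows. The sketch below is a minimal Python illustration of the integrate-and-fire dynamics and the binary output string; the forward-Euler discretization and all parameter values (τ = 20 ms, *R*_{n} = 100 MΩ, etc.) are illustrative assumptions, not the values used in the paper's Matlab simulations.

```python
import numpy as np

def integrate_and_fire(i_syn, dt=1e-4, tau=0.02, r_n=100e6,
                       v_rest=-0.070, v_thresh=-0.050, v_reset=-0.065):
    """Leaky integrate-and-fire neuron; returns the binary spike string z_o.

    Parameter values are illustrative placeholders (volts, seconds, ohms, amps).
    """
    v = v_rest
    z_o = np.zeros(len(i_syn), dtype=int)
    for i, i_t in enumerate(i_syn):
        # Euler step of: tau dV/dt = -(V - V_rest) + R_n * i_syn
        v += dt / tau * (-(v - v_rest) + r_n * i_t)
        if v >= v_thresh:          # threshold crossing: emit spike, reset
            z_o[i] = 1
            v = v_reset
    return z_o
```

With these placeholder values, a constant 0.3-nA current drives the steady-state voltage above threshold and produces regular firing, whereas 0.1 nA stays subthreshold and produces none.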

### Model of synaptic drive

We assume that the synaptic current *i*_{syn}(*t*) consists of the sum of very brief—essentially instantaneous—individual excitatory postsynaptic currents (EPSCs). This represents a reasonable simplification of the component of the excitatory input to cortical neurons mediated by fast AMPA receptors, which decay with a time constant of 2–3 ms (Bekkers and Stevens 1990), but not for the component mediated by the slower *N*-methyl-d-aspartate (NMDA) receptor-gated channels.

The synaptic current driving any neuron results from the spike trains of all the other neurons that make synapses onto it. The postsynaptic current depends both on the precise times at which each of the presynaptic neurons fired and on the response at each synapse to the arrival of a presynaptic action potential. If the response at each synapse is either unreliable or variable in amplitude, then even the arrival of precisely the same spike train at each terminal will fail to produce identical postsynaptic current. In what follows, the exact sequence of action potentials arriving at each of the presynaptic terminals is the “signal,” and any variability in the response to repeated trials on which precisely the same sequence is presented represents the “noise.”

After the basic quantal model of synaptic transmission (Katz 1966), we consider two sources of synaptic variability, or noise. The first is that the probability *P*_{r} that a glutamate-filled vesicle is released after presynaptic activation may be less than unity, both in the hippocampus (Allen and Stevens 1994; Hessler et al. 1993; Rosenmund et al. 1993) and in the cortex (Castro-Alamancos and Connors 1997; Stratford et al. 1996). The second is that the postsynaptic current in response to a vesicle may vary even at single individual terminals (Bekkers and Stevens 1990). This quantal variability may arise, for example, from variable amounts of neurotransmitter filling each vesicle (Bekkers and Stevens 1990), but the results of the present study do not depend on the mechanism underlying this variability.

The basic model for the postsynaptic current *i*_{syn} driving the neuron is as follows. We assume that the activity in the population of presynaptic neurons *j* is given by *z*_{j}(*t*), where (by analogy with the output *z*_{o}(*t*) above) *z*_{j}(*t*) is a binary string whose entries are one if the neuron fired and zero otherwise. When an axon fires, the presynaptic terminal releases transmitter with a probability *P*_{r}. If transmitter is released at time *t* at synapse *j*, then the postsynaptic amplitude is given by *q*_{j}(*t*), which is a random variable that represents the quantal variability. Thus the total postsynaptic current is given by

$$i_{\text{syn}}(t) = \sum_j z_j(t) f_j(t) q_j(t) \qquad (2)$$

where the sum over *j* is over the input neurons, the random process *f*_{j}(*t*) representing synaptic failures is a binary string that is one when transmitter is released and zero otherwise, and *q*_{j}(*t*) is a random variable that determines the quantal size of releases when they occur. The processes *i*_{syn}(*t*_{i}), *z*_{j}(*t*_{i}), *f*_{j}(*t*_{i}), and *q*_{j}(*t*_{i}) are discrete-time, but for notational convenience we will often suppress the time index *i*.

A single axon may sometimes make multiple synapses onto a postsynaptic target, or a single synapse might have multiple release sites. We use the term functional contact to describe both these situations. *Equation 2* implicitly assumes that each axon has only a single functional contact onto the postsynaptic neuron. We also consider the case where each axon makes *N*_{r} multiple functional contacts. In this case, the current *i*_{syn} is given by

$$i_{\text{syn}}(t) = \sum_j \sum_{k=1}^{N_r} z_j(t) f_{jk}(t) q_{jk}(t) \qquad (3)$$

where the sum over *k* is over functional contacts, each of which is driven by the same sequence of presynaptic action potentials *z*_{j}(*t*). In this model, all the terminals *k* associated with a single presynaptic axon are activated synchronously, but release failures occur at each contact independently.
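The stochastic drive described above can be sketched in Python. This is a hedged illustration only: the Gaussian quantal-amplitude distribution, its coefficient of variation, and the fixed random seed are assumptions made for the example, not choices specified by the model.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, for reproducibility of the example

def synaptic_current(z, p_r=0.7, q_mean=1.0, q_cv=0.3, n_r=1):
    """Postsynaptic drive from binary input trains z (axons x time bins).

    Each presynaptic spike releases independently at each of n_r functional
    contacts with probability p_r; successful releases draw a quantal
    amplitude q (Gaussian with coefficient of variation q_cv, an assumed form).
    """
    n_axons, n_bins = z.shape
    i_syn = np.zeros(n_bins)
    for _ in range(n_r):                      # sum over functional contacts k
        f = rng.random(z.shape) < p_r         # release successes f_jk(t)
        q = rng.normal(q_mean, q_cv * q_mean, z.shape)  # quantal sizes q_jk(t)
        i_syn += (z * f * q).sum(axis=0)      # sum over input axons j
    return i_syn
```

Setting `p_r=1.0` and `q_cv=0.0` removes both noise sources, so the same input string then always yields the same current.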

The Poisson rate *S*_{net} (impulses/second) at which EPSCs contribute to the postsynaptic current is given in this model by

$$S_{\text{net}} = A \, N_r \, F_{\text{in}} \, P_r \qquad (4)$$

where *A* is the number of afferent axons, *N*_{r} is the number of functional contacts per axon (assumed to be the same for all axons), *F*_{in} is the Poisson rate at which each axon fires (assumed to be the same for all axons), and *P*_{r} is the release probability at each functional contact (assumed to be the same for all contacts). *S*_{net} determines the average postsynaptic current and thereby the output firing rate *R*.

In some of the simulations described below (Figs. 3-5), the parameters *N*_{r} and *P*_{r} were varied. To keep *S*_{net} fixed under these conditions, any decrease in these parameters was compensated for by a proportional increase in *A* × *F*_{in}. For example, if the release probability *P*_{r} was reduced to 0.5 from 1, *F*_{in} was increased twofold.
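This bookkeeping amounts to holding the product *S*_{net} = *A* × *N*_{r} × *F*_{in} × *P*_{r} constant. A small sketch of the compensation rule (the function names are ours, introduced for illustration):

```python
def s_net(a, n_r, f_in, p_r):
    """Poisson rate of EPSC arrivals: S_net = A * N_r * F_in * P_r."""
    return a * n_r * f_in * p_r

def compensate_f_in(f_in, p_r_old, p_r_new):
    """Scale the input firing rate so that S_net stays fixed when P_r changes."""
    return f_in * p_r_old / p_r_new
```

For instance, halving *P*_{r} from 1 to 0.5 doubles the required *F*_{in}, leaving *S*_{net} unchanged.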

### Information rate of spike trains

A typical pyramidal neuron in the cortex receives synaptic input from 10^{3}–10^{4} other neurons. We define the activity in each of these input neurons as the “signal,” and the variability due to the unreliability of synaptic transmission is the “noise.”

How much information does the output spike train *z*_{o}(*t*) provide about the input spike trains *z*_{j}(*t*)? More formally, what is the mutual information *I*(*Z*_{in}(*t*); *Z*_{out}(*t*)) between the ensemble of input spike trains *Z*_{in}(*t*) = {*z*_{1}(*t*), . . . , *z*_{j}(*t*), . . .} and the output spike train ensemble *Z*_{out}(*t*)? We assume that both *Z*_{in}(*t*) and *Z*_{out}(*t*) are completely specified by the activity (i.e., the precise list of spike times) in each spike train; that is, all the information in the spike trains can be represented by the list of spike times, and there is no extra information contained in properties such as spike height or width. Characteristics of the spike train such as the mean or instantaneous rate can be derived from this representation; if such a derived property turns out to be the relevant one, then this formulation can be specialized appropriately.

The mutual information *I*(*Z*_{in}(*t*); *Z*_{out}(*t*)) is defined (Shannon and Weaver 1948) in terms of the entropy *H*(*Z*_{in}) of the ensemble of input spike trains, the entropy *H*(*Z*_{out}) of output spike trains, and their joint entropy *H*(*Z*_{in}, *Z*_{out})

$$I(Z_{\text{in}}; Z_{\text{out}}) = H(Z_{\text{in}}) + H(Z_{\text{out}}) - H(Z_{\text{in}}, Z_{\text{out}}) \qquad (5)$$

The entropies *H*(*Z*_{in}), *H*(*Z*_{out}), and *H*(*Z*_{in}, *Z*_{out}) depend only on the probability distributions *P*(*Z*_{in}), *P*(*Z*_{out}), and the joint distribution *P*(*Z*_{in}, *Z*_{out}), respectively.

Note that because the joint distribution is symmetric, *P*(*Z*_{in}, *Z*_{out}) = *P*(*Z*_{out}, *Z*_{in}), the mutual information is also symmetric, *I*(*Z*_{in}; *Z*_{out}) = *I*(*Z*_{out}; *Z*_{in}). Note also that if the inputs *Z*_{in}(*t*) and outputs *Z*_{out}(*t*) are completely independent, then the mutual information is zero, because the joint entropy is just the sum of the individual entropies, *H*(*Z*_{in}, *Z*_{out}) = *H*(*Z*_{in}) + *H*(*Z*_{out}). This is completely reasonable, because in this case the inputs provide no information about the outputs.
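These identities are easy to check numerically on small discrete distributions. The sketch below, using helper functions of our own (not part of the paper's simulations), computes the mutual information from a joint probability table; an independent joint distribution gives zero information, and the measure is symmetric.

```python
import numpy as np

def entropy(p):
    """Shannon entropy in bits; zero-probability entries contribute nothing."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def mutual_information(p_joint):
    """I(X;Y) = H(X) + H(Y) - H(X,Y), from a joint probability table."""
    p_x = p_joint.sum(axis=1)   # marginal over rows
    p_y = p_joint.sum(axis=0)   # marginal over columns
    return entropy(p_x) + entropy(p_y) - entropy(p_joint)
```

For a perfectly correlated binary pair (joint [[0.5, 0], [0, 0.5]]) the mutual information is exactly 1 bit.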

### Methods for estimating spike train information rates

The expression given in *Eq. 5* for the mutual information is in practice difficult to evaluate because estimating the distributions *P*(*Z*_{in}), *P*(*Z*_{out}), and *P*(*Z*_{in}, *Z*_{out}) may require very large amounts of data. For example, suppose that there are 1,000 input spike trains driving the output and that each spike train is divided into segments 100 ms in length and discretized into 1-ms bins. There are then 2^{100} possible output spike trains, 2^{100×1,000} sets of input spike trains, and 2^{100×1,000} × 2^{100} possible combinations of input and output spike trains forming the space over which the joint distribution *P*(*Z*_{in}, *Z*_{out}) must be estimated. Although this naive calculation is in practice an overestimate (see Buracas et al. 1996 and de Ruyter van Steveninck et al. 1997 for methods that make use of the fact that most spike trains are very unlikely), it emphasizes the potential problems involved in estimating the mutual information. Below we describe two practical methods for computing information rates.

### Reconstruction method

One approach to this dilemma (Bialek et al. 1991, 1993) is to compute a strict lower bound on the mutual information using the reconstruction method. The idea is to “decode” the output and use it to “reconstruct” the input that gave rise to it. The error between the reconstructed and actual inputs is then a measure of the fidelity of transmission and, with a few testable assumptions, can be related to the information. Formally, this method is based on an expression mathematically equivalent to *Eq. 5* involving the conditional entropy *H*(*Z*_{in}‖*Z*_{out}) of the signal given the spike train

$$I(Z_{\text{in}}; Z_{\text{out}}) = H(Z_{\text{in}}) - H(Z_{\text{in}} \| Z_{\text{out}}) \qquad (6)$$

The entropy *H*(*Z*_{in}) is just the entropy of the time series *z*_{j}(*t*) and can be evaluated directly from the Poisson synthesis equation (*Eq. 3*). Intuitively, *Eq. 6* says that the information gained about the spike train by observing the stimulus is just the initial uncertainty about the synaptic drive (in the absence of knowledge of the spike train) minus the uncertainty that remains about the signal once the spike train is known. The reconstruction method estimates the input from the output and then bounds the entropy of the reconstruction errors from above by assuming they are Gaussian. This method, which provides a lower bound on the mutual information, has been used with much success in a variety of experimental preparations (Bialek et al. 1991; de Ruyter van Steveninck and Bialek 1988; de Ruyter van Steveninck and Laughlin 1996; Rieke et al. 1997).

### Direct method

In this paper we will use a direct method (DeWeese 1995, 1996; de Ruyter van Steveninck et al. 1997; Stevens and Zador 1996) to estimate the mutual information. Direct methods use another form of the expression *Eq. 5* for mutual information

$$I(Z_{\text{in}}; Z_{\text{out}}) = H(Z_{\text{out}}) - H(Z_{\text{out}} \| Z_{\text{in}}) \qquad (7)$$

The first term *H*(*Z*_{out}) is the entropy of the output spike train itself, whereas the second term *H*(*Z*_{out}‖*Z*_{in}) is the conditional entropy of the output given the inputs. The first term measures the variability of the spike train in response to the ensemble of different inputs, whereas the second measures the reliability of the response to repeated presentations of the same inputs. The second term depends on the reliability of the synapses and spike-generating mechanism: to the extent that the same inputs produce the same outputs, this term approaches zero.

The direct method has two advantages over the reconstruction method in the present context. First, it does not require the construction of a “reconstructor” for estimating the input from the output. Although the optimal linear reconstructor is straightforward to estimate, the construction of more sophisticated (i.e., nonlinear) reconstructors can be a delicate art. Second, it provides an estimate of information that is limited only by the errors in the estimation of *H*(*Z*_{out}) and *H*(*Z*_{out}‖*Z*_{in}); the reconstruction method by contrast provides only a lower bound on the mutual information that is limited by the quality of the reconstructor.

As noted above, the estimation of *H*(*Z*_{out}) and *H*(*Z*_{out}‖*Z*_{in}) can require vast amounts of data. If, however, interspike intervals (ISIs) in the output spike train were independent, then the entropies could be simply expressed in terms of the entropy of the associated ISI distributions. The information per spike *I*(*Z*_{in}; *T*) is then given simply by

$$I(Z_{\text{in}}; T) = H(T) - H(T \| Z_{\text{in}}) \qquad (8)$$

where *H*(*T*) and *H*(*T*‖*Z*_{in}) are the total and conditional entropies, respectively, of the ISI distribution. The information rate (units: bits/second) is then just the information per spike (units: bits/spike) times the firing rate *R* (units: spikes/second)

$$I(Z_{\text{in}}; Z_{\text{out}}) = R \cdot I(Z_{\text{in}}; T) \qquad (9)$$

The representation of a spike train as a list of spike times {*t*_{o}, . . . , *t*_{n}} is entirely equivalent (except for edge effects) to its representation as a sequence of ISIs {*T*_{o}, . . . , *T*_{n}}, where *T*_{i} = *t*_{i+1} − *t*_{i}. The advantage of using ISIs rather than spike times is that *H*(*T*) depends only on the ISI distribution *P*(*T*), which is a univariate distribution. This dramatically reduces the amount of data required.
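The reduction to a univariate distribution makes the entropies easy to estimate from samples. A minimal sketch (helper names are ours; ISIs are binned at a finite resolution `dt` before the entropy is computed):

```python
import numpy as np

def isi_entropy(isis, dt=0.001):
    """Entropy (bits/spike) of an ISI sample, discretized into bins of width dt."""
    bins = np.round(np.asarray(isis) / dt).astype(int)
    counts = np.bincount(bins)
    p = counts[counts > 0] / counts.sum()
    return float(-(p * np.log2(p)).sum())

def information_rate(h_total, h_cond, rate):
    """Bits/second = firing rate * (total minus conditional ISI entropy)."""
    return rate * (h_total - h_cond)
```

A spike train with a single repeated ISI has zero entropy; one that alternates evenly between two distinguishable intervals carries one bit per spike.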

In the sequel we assume that spike times are discretized at a finite time resolution Δ*t*. The assumption of finite precision keeps the potential information finite. If this assumption is not made, each spike has potentially infinite information capacity; for example, a message of arbitrary length could be encoded in the decimal expansion of a single ISI.

*Equation 8* represents the information per spike as the difference between two entropies. The first term is the total entropy per spike

$$H(T) = -\sum_i P(T_i) \log_2 P(T_i) \qquad (10)$$

where *P*(*T*_{i}) is the probability that the length of the ISI was between *T*_{i} and *T*_{i+1}. The distribution of ISIs can be obtained from a single long (ideally, infinite) sequence of spike times.

The second term is the conditional entropy per spike. The conditional entropy is just the entropy of the ISI distribution in response to a particular set *m* of input spikes [*Z*_{in}(*t*)]_{m}, averaged over all possible sets of input spikes

$$H(T \| Z_{\text{in}}) = \left\langle -\sum_j P(T_j \| [Z_{\text{in}}(t)]_m) \log_2 P(T_j \| [Z_{\text{in}}(t)]_m) \right\rangle_m \qquad (11)$$

where *P*(*T*_{j}‖[*Z*_{in}(*t*)]_{m}) is the probability of obtaining an ISI of length *T*_{j} in response to a particular set of input spikes [*Z*_{in}(*t*)]_{m}, and ⟨·⟩_{m} denotes the average over input sets.

We used the following algorithm for estimating the conditional entropy:

*1*) *Generate ensemble of input spikes.* Some particular ensemble of input spikes (corresponding, for example, to *m* = 17) is generated, [*Z*_{in}(*t*)]_{17} = [*z*^{17}_{1}(*t*), . . . , *z*^{17}_{j}(*t*), . . .], where the *z*^{17}_{j}(*t*) are independent homogeneous Poisson processes (for convenience we assume they have the same rate, but this is not essential).

*2*) *Compute conditional ISI distribution.* The conditional distribution *P*(*T*‖[*Z*_{in}(*t*)]_{17}) of ISIs of the model neuron is obtained by measuring the ISIs on a large (ideally, infinite) number of trials in which a synaptic current is generated from [*Z*_{in}(*t*)]_{17} by using the synaptic noise equations (*Eq. 2* or *Eq. 3*). If the noise is nonzero, then each realization of the synaptic current *i*^{17}_{syn}(*t*) is slightly different, leading to variability in the output ISI.

*3*) *Compute conditional entropy for this input ensemble.* From the conditional distribution, the conditional entropy in response to this particular input ensemble is computed as *H*(*T*‖[*Z*_{in}(*t*)]_{17}) = −∑_{T} *P*(*T*‖[*Z*_{in}(*t*)]_{17}) log_{2} *P*(*T*‖[*Z*_{in}(*t*)]_{17}). This ISI distribution depends on the amount of synaptic noise assumed; if there is no noise, the output distribution assumes only a single value and the conditional entropy is zero.

*4*) *Repeat and average over conditional entropies for other ensembles.* The average conditional entropy per spike is calculated by repeating this procedure for a large (ideally, infinite) number of input patterns [*Z*_{in}(*t*)]_{1}, [*Z*_{in}(*t*)]_{2}, . . . and averaging over the resulting conditional entropies.

In summary, we have described the three steps required to compute the information rate in our model. First, the total entropy per spike is computed from *Eq. 10* and the conditional entropy per spike is computed from *Eq. 11*. Next, the information per spike is computed from *Eq. 8*. Finally, the information rate (information per time) is computed from *Eq. 9*.
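The four-step conditional-entropy estimate can be sketched as a generic loop. This is a simplified illustration: `simulate_isis` is a hypothetical stand-in for running the model of *Eqs. 1*-*3* on one noisy trial of a frozen input ensemble, and the ensemble and trial counts are arbitrary.

```python
import numpy as np

def conditional_isi_entropy(simulate_isis, n_ensembles=50, n_trials=200, dt=0.001):
    """Average conditional ISI entropy (bits/spike), following steps 1-4 above.

    simulate_isis(m, k) is a caller-supplied (hypothetical) function returning
    the ISIs produced on noisy trial k of frozen input ensemble m.
    """
    h_vals = []
    for m in range(n_ensembles):              # step 1: fix input ensemble m
        isis = np.concatenate(
            [np.asarray(simulate_isis(m, k), dtype=float)
             for k in range(n_trials)])       # step 2: conditional ISI sample
        bins = np.round(isis / dt).astype(int)
        counts = np.bincount(bins)
        p = counts[counts > 0] / counts.sum()
        h_vals.append(-(p * np.log2(p)).sum())  # step 3: entropy for ensemble m
    return float(np.mean(h_vals))               # step 4: average over ensembles
```

In the noiseless limit each frozen ensemble yields a single repeated ISI, so the conditional entropy is zero, matching step *3* above.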

### Model assumptions

We have assumed a model of neuronal dynamics in which ISIs are independent. This assumption simplifies the estimation of the information rate, because it reduces the estimation of the multidimensional distribution of spike times to the estimation of the one-dimensional ISI distributions *P*(*T*) and *P*(*T*‖*Z*_{in}(*t*)), from which the mutual information can be calculated exactly. Under what conditions will ISIs be independent? Because correlated ISIs can arise either from the spike-generating mechanism itself or from the input signal, we consider the validity of our assumptions about each in turn.

The first assumption is that the spike-generating mechanism does not induce correlations between ISIs. We have used a standard “memoryless” integrate-and-fire model in which the length of one ISI has no influence on subsequent ISIs. At least in cortical neurons, this assumption is not strictly valid for at least two reasons. First, on long time scales, adaptation (i.e., a change in the firing rate that depends on the firing rate itself) becomes important. Second, low-pass filtering by dendrites may induce temporal correlations in the effective synaptic current reaching the spike generator, even if they did not exist in the input ensemble. Correlations between ISIs may either increase or decrease the information rate.

The second assumption is that the correlations do not arise from the synaptic drive. This assumption may be inadequate for at least three reasons. First, it requires that EPSCs be much shorter than typical ISIs. Correlations in the synaptic drive are unlikely to arise from the fast AMPA component, but might well arise from the NMDA component, which decays much more slowly. A second potential source of correlations in the synaptic drive is correlations in the spike trains of each of the input neurons. To the extent that each input spike train is not a homogenous Poisson spike train, the model must be reevaluated. Finally, correlations might arise through the history-dependence of efficacy at individual synapses (Abbott et al. 1997; Dobrunz and Stevens 1997; Markram and Tsodyks 1996; Varela et al. 1997; Zador and Dobrunz 1997). We have made no attempt to explore the potentially important consequences of such use-dependent effects.

### Informative upper bound

The assumption that successive ISIs are independent (i.e., that the spike train is a renewal process) leads to an exact expression (rather than the lower bound provided by the reconstruction method) for the mutual information, subject only to error in the estimation of the ISI distribution. Here we review the well-known result that a Poisson process (the special case where the ISI distribution is exponential) leads to the maximum entropy spike train, and give the simple closed-form expression for the entropy in this case.

The upper bound on the possible information transmitted in this model is straightforward to calculate (MacKay and McCulloch 1952). The output is a binary string—we have disallowed the possibility of multiple spikes per bin. If the conditional entropy is zero (i.e., if there is no noise whatsoever), then all the entropy is information, and the upper bound on the entropy is equal to the upper bound *I*_{ub} on the information.

The probability of observing a spike in a bin of length Δ*t* depends on the firing rate *R* as *P*_{1} = *R*Δ*t*, and the probability of not observing a spike is *P*_{0} = 1 − *R*Δ*t*. If spikes are independent—that is, if the probability of observing a spike in one bin does not depend on whether there was a spike in any neighboring bin, so that the spike train is a Poisson process—then the entropy per bin is −∑_{i} *P*_{i} log_{2} *P*_{i} = −*P*_{0} log_{2} *P*_{0} − *P*_{1} log_{2} *P*_{1}. In the limit of small Δ*t*, *P*_{0} → 1, so −*P*_{0} log_{2} *P*_{0} → (log_{2} e)*R*Δ*t*, while −*P*_{1} log_{2} *P*_{1} = −*R*Δ*t* log_{2} *R*Δ*t*. Dividing the entropy per bin by the bin length Δ*t* gives the upper bound on the information per unit time

$$I_{\text{ub}} = R \log_2 \frac{e}{R \Delta t} \qquad (12)$$
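This small-Δ*t* limit can be checked numerically: the exact per-bin entropy of independent spikes, divided by the bin width, approaches the closed-form bound as the bins shrink. The function names below are our own, introduced for the sketch.

```python
import numpy as np

def poisson_info_bound(rate, dt):
    """MacKay-McCulloch upper bound: I_ub = R * log2(e / (R * dt)) bits/s."""
    return rate * np.log2(np.e / (rate * dt))

def binary_entropy_rate(rate, dt):
    """Exact per-bin entropy of independent spikes, divided by dt (bits/s)."""
    p1 = rate * dt          # probability of a spike in one bin
    p0 = 1.0 - p1           # probability of no spike
    h_bin = -(p0 * np.log2(p0) + p1 * np.log2(p1))
    return h_bin / dt
```

At *R* = 10 spikes/s and Δ*t* = 0.1 ms the exact rate and the bound agree to better than 0.1%; coarser bins lower the bound, consistent with the finite-precision assumption above.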

## RESULTS

### Synaptic variability is the dominant source of output variability

Mainen and Sejnowski (1995) have previously shown that the timing of spikes produced by cortical neurons in response to somatic current injection can be highly reliable. The currents they injected were obtained by passing a Gaussian signal through a low-pass filter representing the time course of an EPSC and adding a constant offset. Although such a Gaussian current is obtained in the limit as the number of inputs becomes large, Mainen and Sejnowski (1995) did not explicitly relate the current they injected to the underlying synaptic drive.

Figure 1*A* shows a typical experiment in which the same current was injected into the soma of a pyramidal neuron in layer II/III of a slice of rat neocortex and the response on 20 consecutive trials was recorded. In the experiment shown, a single 1,024-ms waveform was generated according to *Eq. 2* and then stored; this precise waveform was injected on 20 trials. Figure 1*A* shows that most of the spikes are aligned with a “jitter” of ≤1 ms, although a few “stray” or “displaced” spikes are also seen. In agreement with the observations of Mainen and Sejnowski (1995), these results show that cortical neurons can generate precisely repeated outputs in response to precisely repeated inputs, even when the driving current corresponds to a synthetic synaptic current generated by an ensemble of independent inputs. The small remaining output variability seen in Fig. 1*A* is due to some combination of experimental instability and the intrinsic imprecision of the spike generator. Experiments in which precisely the same current is injected establish a limit on the output precision of which these neurons are capable. The output variability increases as other sources of variability, such as synaptic noise, are considered.

Synaptic failures occurring at even a relatively low rate dramatically increase the output variability. Figure 1*B* shows the response of the same neuron to injected current (as in Fig. 1*A*, generated according to the synaptic model described in *Eq. 2*), but assuming that synapses failed to elicit a postsynaptic response on average on 3 of every 10 spikes (*P*_{r} = 0.7). The response to 20 consecutive trials was recorded. Thus in contrast to Fig. 1*A*—in which precisely the same current was injected on each trial—for this experiment a somewhat different waveform, corresponding to the random removal of 3/10 of the spikes from the input ensemble, was injected on every trial. Figure 1*B* shows that spikes are no longer well aligned, indicating that under these conditions synaptic failures are the dominant source of output variability.

### Information rate depends on firing rate

Experiments like those shown in Fig. 1 suggest that synaptic noise represents an important source of output variability. Such experiments can be used to estimate information rates in cortical neurons by using techniques developed elsewhere (Buracas et al. 1996; de Ruyter van Steveninck et al. 1997). In an experimental setting, however, information estimates can be distorted by nonstationarity, finite data sizes, variability between neurons, and a number of other factors. Although it is possible to correct for such factors (subject to certain reasonable assumptions), here we focus on the results from a model neuron in which all assumptions are explicit; this permits us to focus specifically on the role of synaptic variability in governing transmitted information.

In what follows, we consider a model in which the spike-generating mechanism is completely deterministic, known, and stationary. Thus variability in the output spike train is due solely to variability among the stochastic inputs. In this section we begin with the limiting case in which the only source of variability among the inputs is the quantal variability of the synapses, i.e., the variation in the postsynaptic response that occurs even when only a single functional contact is successfully activated. Thus in this section we assume not only that *1*) the spike-generating mechanism is completely deterministic, but also that *2*) synapses release transmitter reliably when an action potential invades the presynaptic terminal (*P*_{r} = 1). Here as elsewhere, the exact sequence of action potentials arriving at each of the presynaptic terminals is the signal, and any variability in the response to repeated trials on which precisely the same sequence is presented represents the noise.

The information per spike is defined (*Eq. 8*) as the difference between the total and conditional entropies per spike. Figure 2*A* shows how these quantities depend on the firing rate for the integrate-and-fire spike generation model given by *Eq. 1*. The dashed curve represents the total entropy, which quantifies the total output variability of the spike train. The dotted line represents the conditional entropy, which quantifies the variability that remains when the signal (i.e., the precise firing times of each of the inputs) is held constant. The solid line is the mutual information between the input and the output and is the difference between these quantities. If there were no quantal variability, the conditional entropy would be zero and all the entropy would be information. Figure 2*A* shows that even when the only source of synaptic variability is quantal, only about 3/4 of the spike entropy (6 bits/spike information vs. 8 bits/spike total entropy at 4 Hz) is information. As seen in the next section, additional sources of synaptic variability reduce this fraction further.

The information and entropies per spike decrease monotonically with firing rate. These quantities diverge logarithmically to infinity as the firing rate goes to zero; in fact, the entropies were calculated for firing rates only as low as ∼4 Hz. The behavior of the total entropy per spike at low firing rates can be understood in terms of the results for the limiting case of the Poisson model outlined above (see *Eq. 12*).

In contrast to the entropy and information per spike, the entropy and information per second increase with increasing firing rate. The reason is that the entropy and information per spike depend only logarithmically on firing rate, so the overall dependence, *I* ∝ *R* log_{2} 1/*R*, is increasing at these rates (see *Eq. 12*). Figure 2*B* illustrates the entropy and information rates (units: bits/second) corresponding to the curves shown in Fig. 2*A*. Because of our assumption that time is discretized into bins of length Δ*t*, each containing at most one spike, the information declines back toward zero at very high firing rates (not shown).
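This scaling can be checked numerically. In the Poisson limit (a standard small-*R*Δ*t* approximation; the exact form of *Eq. 12* may differ in detail), the bits per spike fall only logarithmically with rate, so the bits per second still rise:

```python
import numpy as np

dt = 1e-3                          # bin width (s); at most one spike per bin
R = np.array([4.0, 10.0, 40.0])    # firing rates (Hz), illustrative values

# Poisson limit: entropy per spike ~ log2(e / (R * dt)) falls with rate...
bits_per_spike = np.log2(np.e / (R * dt))
# ...but the rate in bits/s = R * bits/spike still increases over this range.
bits_per_sec = R * bits_per_spike
```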

The information rate is a nearly linear function of the firing rate (Fig. 2*B*). This is precisely the behavior that would be expected from the maximum entropy Poisson process (*Eq. 12*). Although the output of the integrate-and-fire model is not a Poisson process, the dependence on firing rate is qualitatively similar: an increased firing rate compensates for a logarithmic decrease in the entropy per spike.

### Information rate depends on release probability

The invasion of a synaptic terminal by an action potential often fails to induce a postsynaptic response, both in the hippocampus (Allen and Stevens 1994; Dobrunz and Stevens 1997) and in the cortex (Stratford et al. 1996). Although the release probability *P _{r}* varies across synapses onto the same neuron (Castro-Alamancos and Connors 1997; Hessler et al. 1993; Rosenmund et al. 1993) and as a function of the history of use (Abbott et al. 1997; Dobrunz and Stevens 1997; Markram and Tsodyks 1996; Varela et al. 1997), for simplicity we assume here that the release probability *P _{r}* is the same at all terminals.

Figure 3*A* shows the dependence of information on firing rate for several values of *P _{r}*. The top curve shows *P _{r}* = 1 and is the same as the solid curve in Fig. 2*B*. The lower three curves show that as *P _{r}* is decreased (to 0.9, 0.6, and 0.3), the form of the dependence is largely preserved, but the curves are shifted down. Thus, as expected, synaptic unreliability lowers the information rate. In these simulations, the input Poisson rate *S*_{net} was held constant as described in *Model of synaptic drive*, so the decrease in the information rate was due solely to an increase in the conditional entropy per spike and not to a change in firing rate.

Figure 3*B* illustrates the dependence on *P _{r}* in more detail. For this curve, the firing rate was held constant at 40 Hz and *P _{r}* was varied from zero to one. The information is a monotonically increasing function of *P _{r}*. This is reasonable, because as the synaptic reliability increases, so should the reliability with which information is transmitted. No sharp transition is observed from an unreliable to a reliable mode.
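The synaptic drive in this model can be sketched as follows. This is a minimal illustration with hypothetical parameter values, not the paper's simulation code, and the normalization (scaling arrivals so the mean rate of released quanta stays fixed as *P _{r}* varies) is one plausible reading of *Model of synaptic drive*: presynaptic spikes arrive as a Poisson process, each arrival releases a quantum with probability *P _{r}* and a fluctuating amplitude, and the summed drive is fed to a deterministic leaky integrate-and-fire unit.

```python
import numpy as np

rng = np.random.default_rng(1)
dt, T = 1e-3, 5.0                  # time step (s) and duration (s)
n_steps = int(T / dt)
s_net = 2400.0                     # net rate of released quanta (Hz), held fixed
p_r = 0.6                          # release probability (illustrative)
q_mean, q_cv = 1.0, 0.3            # quantal amplitude: mean and coefficient of variation

counts = rng.poisson((s_net / p_r) * dt, n_steps)   # presynaptic spikes per bin
released = rng.binomial(counts, p_r)                # quanta that actually release
amps = np.clip(rng.normal(q_mean, q_cv * q_mean, n_steps), 0.0, None)
drive = released * amps                             # quantal fluctuations included

# Deterministic leaky integrate-and-fire: all output variability is synaptic.
tau, thresh, v, n_spikes = 0.02, 25.0, 0.0, 0
for inp in drive:
    v += (-v / tau) * dt + inp
    if v >= thresh:
        n_spikes += 1
        v = 0.0
rate_out = n_spikes / T
```

Because the spike generator itself is deterministic, repeated runs with the same presynaptic spike times differ only through the release failures and quantal noise, which is exactly the noise source the conditional entropy quantifies.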

### Information rate depends on the number of functional contacts per axon

A single axon may sometimes make multiple synapses onto a postsynaptic target, or a single synapse (such as the neuromuscular junction) might have multiple release sites. To avoid ambiguity, we use *functional contact* to refer to any release site from a presynaptic axon to a postsynaptic target, whether it involves multiple synapses per axon or multiple release sites per bouton. At the neuromuscular junction, functional contacts number in the thousands (Katz 1966). At excitatory synapses in the cortex, the number of functional contacts is much smaller, but still sometimes greater than one (Markram and Tsodyks 1996; Sorra and Harris 1993). We have therefore explored the consequences of multiple functional contacts on the information rate.

Figure 4*A* shows the dependence of information rate on release probability for three different values of *N _{r}*, the number of functional contacts per axon. As in the previous simulations, the input Poisson rate *S*_{net} was held constant as described in *Model of synaptic drive*. The bottom curve is the same as that shown in Fig. 3*B*. As the number of functional contacts is increased, the information available in the output spike train increases as well. Because multiple functional contacts can be seen as a form of redundancy, the increase in transmitted information is not unexpected. Although the firing rate was slightly (<10%) higher for large *N _{r}*, the increase in the information rate was due primarily to an increase in the total entropy per spike and a decrease in the conditional entropy per spike, not to the increased firing rate.
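The effect of holding *S*_{net} fixed while increasing *N _{r}* can be seen in a toy calculation (hypothetical numbers; release failures and quantal noise are omitted for clarity): with *N _{r}* contacts per axon and the net contact-activation rate held fixed, presynaptic events become rarer but each delivers *N _{r}* synchronous quanta, so the mean drive is unchanged while its variance grows.

```python
import numpy as np

rng = np.random.default_rng(2)
dt, n_steps = 1e-3, 50_000
s_net = 2400.0        # net rate of functional-contact activations (Hz), held fixed

def drive_counts(n_r):
    """Quanta per bin when each axon makes n_r simultaneous functional contacts."""
    events = rng.poisson((s_net / n_r) * dt, n_steps)   # fewer axonal events...
    return events * n_r                                 # ...each n_r quanta strong

low, high = drive_counts(1), drive_counts(25)
# Means match (~2.4 quanta per 1-ms bin); the variance is ~25-fold larger
# for n_r = 25, reflecting the redistribution into synchronous events.
```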

Figure 4*B* illustrates the dependence on the number of functional contacts *N _{r}* in more detail. For this curve, the release probability was held constant at *P _{r}* = 0.5 and *N _{r}* was varied from 1 to 25. The information saturates at high *N _{r}*, but no sharp transition is seen from a low to a high reliability mode.

### Reliability of mean rate coding

It may seem obvious that because multiple functional contacts increase the fidelity with which a presynaptic signal is propagated, they can overcome the noise induced by synaptic failures and quantal fluctuations and thereby increase the fidelity of neuronal signaling. In the previous section we quantified this intuition under the hypothesis that the precise timing of spikes carries information. To what extent does this conclusion depend on the particular assumptions we have made about the neural code?

According to the “mean-rate” hypothesis for the neural code, the signal is carried not by the times at which spikes occur, but instead by the number of output spikes generated in some relatively long window. Under this hypothesis, multiple functional contacts can actually have the seemingly paradoxical effect of decreasing the transmitted information.

We use the Fano factor (Fano 1947) to assess the reliability of coding under the mean rate hypothesis. The Fano factor is defined as the variance σ^{2}_{N} divided by the mean μ_{N} of the spike count *N* in some time window *W*. The Fano factor can be viewed as a kind of “noise-to-signal” ratio; it is a measure of the reliability with which the spike count could be estimated from a time window that on average contains several spikes. In fact, for a renewal process like the neuronal spike generator considered here, the distribution *P*_{N}(*N*, *W*) of spike counts can be shown (Feller 1971) by the central limit theorem to be asymptotically normal (as the window grows long), with μ_{N} = *W*/μ_{isi} and σ^{2}_{N} = *W*σ^{2}_{isi}/μ^{3}_{isi}, where μ_{isi} and σ_{isi} are, respectively, the mean and the standard deviation of the ISI distribution *P*(*T*_{i}). Thus the Fano factor *F* is related to the coefficient of variation *C*_{ν} = σ_{isi}/μ_{isi} of the associated ISI distribution by *F* = *C*^{2}_{ν}.
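The relation *F* = *C*^{2}_{ν} is easy to verify numerically for a renewal process. The sketch below is illustrative only, using gamma-distributed ISIs rather than the model's actual ISI distribution: it counts spikes in long windows and compares the Fano factor with the squared coefficient of variation.

```python
import numpy as np

rng = np.random.default_rng(3)
shape, scale = 4.0, 0.01             # gamma ISIs: mean 40 ms, CV = 1/sqrt(shape)
isis = rng.gamma(shape, scale, 200_000)
spike_times = np.cumsum(isis)        # a renewal-process spike train

W = 1.0                              # counting window (s), long vs. the mean ISI
edges = np.arange(0.0, spike_times[-1], W)
counts = np.histogram(spike_times, bins=edges)[0]

fano = counts.var() / counts.mean()  # Fano factor of the spike count
cv2 = isis.var() / isis.mean() ** 2  # squared CV of the ISIs (0.25 here)
```

For long windows the two quantities agree; for short windows there are correction terms, which is why the text specifies a window containing several spikes on average.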

Figure 5 shows the Fano factor as a function of the number of functional contacts. The spike trains are the same as those analyzed in Fig. 4*B*. The Fano factor increases monotonically with the number of functional contacts. Because the reliability with which the spike count can be estimated is inversely related to its variability, an increase in the number of functional contacts results in a decrease in the effective signal-to-noise ratio. This suggests that if a mean-rate coding scheme is used, an increase in the number of functional contacts could actually decrease the coding fidelity. This behavior stands in marked contrast to that observed in the previous section, where the increase in functional contacts produced the expected increase in information rate.

How can we account for this seemingly paradoxical decrease in the signal-to-noise ratio with increased redundancy? The resolution lies in the normalization used to increase the connection redundancy. In these simulations, the net input Poisson rate *S*
_{net} was held constant (as described in *Model of synaptic drive*) to keep the mean postsynaptic current *i*
_{syn}(*t*), and therefore the firing rate, constant. A large *N*
_{r} leads to a redistribution of the presynaptic spikes into a small number of highly synchronous events, surrounded by longer periods during which no spike occurred. Normalizations that do not increase the effective synchrony might have given a different result.

These synchronous events have two effects. First, they tend to trigger postsynaptic action potentials at precise times. This increased timing precision decreases the conditional entropy (and thereby increases the total information, by *Eq. 8*) under the coding assumptions analyzed in this section, but has no effect on the available information under the mean rate hypothesis. Second, the increased synchrony increases the variance of the postsynaptic input current, which in turn leads to an increase in the output variance (as assessed by the Fano factor). This increases the total entropy and hence the total information under the coding assumptions analyzed in this section, but actually decreases the effective signal-to-noise ratio under the mean rate hypothesis. Thus increased connection redundancy has diametrically opposed effects on the available information, depending on how the spike trains are decoded.

## DISCUSSION

We have estimated the mutual information between the synaptic drive and the resulting output spike train in a model neuron. We have adopted a framework in which the time at which individual spikes occur carries information about the input. In this formulation, the exact sequence of action potentials arriving at each of the presynaptic terminals is the “signal,” and the “noise” is any variability in the response to repeated trials on which precisely the same sequence is presented. We found that the information was a smooth function of both synaptic reliability and connection redundancy: no sharp transition was observed from an “unreliable” to a “reliable” mode. However, connection redundancy can only compensate for synaptic unreliability under the assumption that the fine temporal structure of individual spikes carries information. If only the number of spikes in some relatively long time window carries information (a “mean rate” code), an increase in the fidelity of synaptic transmission results in a seemingly paradoxical decrease in the information available in the spike train.

### Related work

Information rates for sensory neurons in a wide variety of experimental systems have now been measured for both static (Golomb et al. 1997; Optican and Richmond 1987; Richmond and Optican 1990; Tovee et al. 1993) and time-varying (Bair et al. 1997; Bialek et al. 1991; Buracas et al. 1996; Dan et al. 1996; de Ruyter van Steveninck and Bialek 1988; Gabbiani and Koch 1996; Gabbiani et al. 1996; Rieke et al. 1997; Warland et al. 1997) stimuli. Most of the work on time-varying stimuli used reconstruction methods to obtain a lower bound on the transmitted information; typical values were in the range of 1–3 bits/spike. De Ruyter van Steveninck and Laughlin (1996) applied similar techniques to estimate information rates across graded synapses in the blowfly.

The present model is a direct extension of that considered in Stevens and Zador (1996) and closely related to that in DeWeese (1996). Both used a direct rather than a reconstruction method to estimate the information in a spiking neuron model. In Stevens and Zador (1996), the key assumption was that ISIs were independent, whereas in DeWeese (1996), the key assumption was that spikes were independent.

### Neural code

Although it is generally agreed that the spike train output by a neuron encodes information about the inputs to that neuron, the code by which the information is transmitted remains unclear (see Ferster and Spruston 1995; Stevens and Zador 1995 for recent discussions). One idea (the conventional view in systems physiology) is that it is the mean firing rate alone that encodes the signal and that variability about this mean is noise (Shadlen and Newsome 1994, 1995). An alternative view that has recently gained increasing support is that it is the variability itself that encodes the signal, i.e., that the information is encoded in the precise times at which spikes occur (Abeles et al. 1994; Bialek et al. 1991; Rieke et al. 1997; Softky 1995).

Our results make no assumptions about the neuronal code. Rather, they provide an exact expression for the maximum information that could possibly be transmitted, given the stimuli and the neuronal parameters. The precise timing of spikes is used to achieve this maximum; how much of this available information is actually used by “downstream” neurons is a separate question.

The importance of spike timing in encoding time-varying signals is now well-established in some systems, such as the motion-sensitive H1 neuron of the fly (Bialek et al. 1991). A comparable role for spike timing in mammalian cortex has been more controversial. It has been suggested that motion-sensitive neurons in area MT of awake monkeys encode only fractions of a bit per second and that all of the encoded information is available in the spike count over a relatively long time window (Britten et al. 1992). However, more recent experiments (Bair et al. 1997; Buracas et al. 1996) suggest that these neurons encode information at rates (1–2 bits/spike) comparable with those of the H1 neuron of the fly, when presented with visual stimuli that have appropriately rich temporal structure. Thus it may be wrong to speak of *the* neural code: it may well turn out that some components of the input stimulus (e.g., those that are changing rapidly) are encoded by precise firing times, whereas others are not.

We have shown that an increase in the number of functional contacts per axon can lead to an increase in transmitted information if the timing of spikes encodes the signal, but not if a mean rate code is used. Such an increase can be seen as a special case of neuronal synchrony in which all synapses from a single axon are stimulated at precisely the same instant. This seemingly paradoxical observation is a consequence of the manner in which synchrony affects firing patterns: it increases timing precision, but also increases the trial-to-trial variability in the spike count. It is not clear how synaptic unreliability could be compensated for in a mean rate scenario.

### Information and synaptic unreliability

The present paper is the first to interpret information rates in single cortical neurons in terms of the underlying biophysical sources of the signal and noise. Here signal is the set of firing times over the ensemble of presynaptic neurons, whereas noise is synaptic variability that leads to variability in the firing times of the postsynaptic neuron.

The present study was centrally motivated by the hypothesis that the nervous system is under selective evolutionary pressure to preserve as much information as possible during processing. In the limit this is trivially true: a retina that transmits no information whatsoever about the visual input is no better than no retina at all! Less trivially, computational power in some models increases as the precision of the underlying components increases (Zador and Pearlmutter 1996). If such principles apply to cortical computation, then the cortex may have evolved strategies to compensate for synaptic unreliability, given other constraints.

The most obvious strategy would be simply to increase the synaptic release probability. Indeed, there are synapses (e.g., in the fly retina) (de Ruyter van Steveninck and Laughlin 1996) where the number of release sites *N _{r}
* per terminal is large enough to guarantee a high-fidelity connection under normal conditions. But such multirelease synapses are large and the cortex may be under an additional constraint to minimize size.

It is reasonable to wonder why the more direct approach, setting the release probability *P _{r}* to unity, does not appear to be common. It is well known that the release probability *P _{r}* changes in a history-dependent manner during short-term plasticity (e.g., paired-pulse facilitation and depression, posttetanic potentiation, etc.) (Abbott et al. 1997; Dobrunz and Stevens 1997; Fisher et al. 1997; Magleby 1987; Markram and Tsodyks 1996; Tsodyks and Markram 1997; Varela et al. 1997; Zador and Dobrunz 1997; Zucker 1989). We speculate that a dynamic *P _{r}* is essential to cortical computation. A dynamic *P _{r}* could function as a form of gain control (Abbott et al. 1997; Tsodyks and Markram 1997; Varela et al. 1997). More generally, it could be used to permit efficient computation on time-varying signals (Maass and Zador 1998). Thus we propose that the “reason” that *P _{r}* does not simply approach unity may be that cortical computation requires that *P _{r}* retain a large dynamic range.

The cortex appears to adopt the “redundant connection” approach, albeit on a smaller scale. Figure 4
*B* shows that even a modest increase in the connection redundancy from one to five can double the information rate, from one to two bits/spike. Although a direct comparison is difficult, it is interesting to note that information rates in both anesthetized (Bair et al. 1997) and alert (Buracas et al. 1996) primate visual cortex are in the same range.

In our formulation, the fraction of the signal entropy transmitted by the spike train is small, even when the signal is not corrupted by noise. This follows immediately when we consider that to drive the model neuron to fire at, for example, 40 Hz, impulses must arrive at 2,400 Hz, which is equivalent to 60 input neurons each firing at 40 Hz, with each input axon presumably carrying comparable (and, by assumption, independent) information. This captures what may be an essential feature of the cortex: each pyramidal neuron must in some sense “summarize” with a single spike train the spike trains from 10^{4} other neurons. It is this “summary” that represents the “computation” that a neuron performs. Understanding the fidelity with which this computation can occur is a necessary step toward understanding the computation.
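The bookkeeping behind this argument is simple (illustrative numbers taken from the text):

```python
r_out = 40.0                 # postsynaptic firing rate (Hz)
s_net = 2400.0               # net input spike rate needed to drive it (Hz)
n_inputs = s_net / r_out     # equivalent number of inputs, each firing at 40 Hz
# If each input axon carries information comparable with the output's, the
# single output spike train can convey at most ~1/60 of the total input entropy.
max_fraction = 1.0 / n_inputs
```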

## Acknowledgments

This work was supported by The Sloan Center for Theoretical Neurobiology at the Salk Institute and by a grant to Charles F. Stevens from the Howard Hughes Medical Institute.

- Copyright © 1998 the American Physiological Society