Sensory systems must translate incoming signals quickly and reliably so that an animal can act successfully in its environment. Even at the level of receptor neurons, however, functional aspects of the sensory encoding process are not yet fully understood. Specifically, this concerns the question of how stimulus features and neural response characteristics lead to an efficient transmission of sensory information. To address this issue, we have recorded and analyzed spike trains from grasshopper auditory receptors, while systematically varying the stimulus statistics. The stimulus variations profoundly influenced the efficiency of neural encoding. This influence was largely attributable to the presence of specific stimulus features that triggered remarkably precise spikes whose trial-to-trial timing variability was as low as 0.15 ms, one order of magnitude shorter than typical stimulus time scales. Precise spikes decreased the noise entropy of the spike trains, thereby increasing the rate of information transmission. In contrast, the total spike train entropy, which quantifies the variety of different spike train patterns, hardly changed when stimulus conditions were altered, as long as the neural firing rate remained the same. This finding shows that stimulus distributions that were transmitted with high information rates did not invoke additional response patterns, but instead displayed exceptional temporal precision in their neural representation. The acoustic stimuli that led to the highest information rates and smallest spike-time jitter featured pronounced sound-pressure deflections lasting for 2–3 ms. These upstrokes are reminiscent of salient structures found in natural grasshopper communication signals, suggesting that precise spikes selectively encode particularly important aspects of the natural stimulus environment.
Sensory systems have evolved to process information in a fast and reliable manner. Their efficiency depends on the stimulus characteristics and on how well the stimulus statistics reflect the natural environment (see, e.g., Baddeley et al. 1997; Dan et al. 1996; Laughlin 1981; Olshausen and Field 1996; Rieke et al. 1995; van Hateren and van der Schaaf 1998; Vinje and Gallant 2000). Indeed, already long ago, Attneave (1954) and Barlow (1961) suggested that efficient neural representations should be reserved for stimuli to which an animal is frequently exposed. On the other hand, the behavioral significance of a stimulus may strongly differ from its probability in the natural environment—even a rare stimulus might be crucial for survival.
These observations raise a number of related questions: How can a given sensory system transmit far more information about some stimulus ensembles than about others? Which stimulus features and neural response characteristics are responsible for these differences? Do well-encoded stimulus features carry a particular behavioral meaning? To investigate these questions, we modified natural stimuli to obtain artificial stimulus ensembles that differ in particularly salient directions in stimulus space. Analyzing how the neural responses depend on the specific stimulus statistics can reveal which stimulus attributes are instrumental for efficient sensory encoding.
As a model system, we chose the auditory periphery of grasshoppers. Compared with visual signals or more elaborate acoustic stimuli, the atonal “songs” of grasshoppers are low-dimensional, which makes them ideally suited for this study. We recorded from auditory receptors in vivo and varied the stimulus systematically in three behaviorally relevant directions.
Quantitative comparisons of stochastic responses to different stimuli require a rigorous probabilistic framework. We use the information theoretic approach proposed by Strong et al. (1998). Two factors influence information transmission: if a neuron is to represent a large range of stimuli, it should have a rich repertoire of possible responses; on the other hand, to reliably represent each sensory signal, repeated presentations of one stimulus should elicit nearly identical responses. Examining these factors separately shows how the stimulus statistics shape neural coding efficiency.
A spike-by-spike analysis of spike-time jitter allows us to determine those stimulus features that contribute most to the transmitted information: brief sound-pressure upstrokes that also occur at prominent locations within grasshopper songs. These upstrokes can elicit spikes with remarkable temporal accuracy. Stimulus-dependent spike-time precision might therefore provide a simple mechanism to selectively represent behaviorally relevant features of natural stimuli in a reliable and efficient manner.
Electrophysiology and data acquisition
Experiments were conducted on adult male and female Locusta migratoria. Their legs, wings, head, gut, and dorsal part of the thorax were removed. Once the animal was fixed with wax onto a Peltier element that was heated to a constant temperature of 30°C, the metathoracic ganglion and tympanal nerve were exposed. Action potentials were recorded intracellularly from the axons of auditory receptors located in the tympanal nerve using standard glass microelectrodes (borosilicate; GC100F-10, Harvard Apparatus, Edenbridge, UK) filled with 1 M KCl solution (30–100 MΩ resistance). The signal was amplified (BRAMP-01, NPI Electronic, Tamm, Germany) and recorded by a data acquisition board (PCI-MIO-16E-1, National Instruments, Austin, TX) with a sampling rate of 10 kHz. Detection of action potentials and generation of the stimuli were controlled by OEL (on-line electrophysiology laboratory), a custom-made software package. In those experiments where action potential detection by the software was deemed to be inexact, off-line spike detection was performed. All experiments were conducted in a Faraday cage lined with sound-attenuating foam to reduce echoes. The preparation was placed between two loudspeakers (Esotec d-260, Dynaudio, Skanderborg, Denmark, on a DCA450 amplifier, Denon Electronic, Ratingen, Germany), 60 cm from one another. The stimuli were transmitted to the loudspeakers by a data acquisition board at a conversion rate of 100 kHz and played only from the speaker ipsilateral to the nerve that was being monitored. Recordings were obtained from 43 different receptor cells from 25 animals. Each cell was tested with two or more stimuli, resulting in 150 data sets in total (1 data set corresponds to 1 cell in 1 stimulus condition). The experimental protocol complied with German law governing animal care.
Stimulus design and experimental paradigm
Each experiment began with a measurement of the preferred frequency of the receptor, that is, the frequency for which the threshold of the cell is lowest. This minimum lies typically between 3 and 20 kHz. The preferred frequency was subsequently used as the carrier frequency of the stimulus. The stimulus consisted of random amplitude modulations of the carrier wave. The modulation was generated from an amplitude distribution of controlled shape, standard deviation, and cut-off frequency (see Machens et al. 2001, for a detailed explanation of stimulus construction). The cut-off frequency was at most 800 Hz, that is, far below the carrier frequency.
In each experiment, two different amplitude-modulated stimuli were compared. At the beginning, the baseline amplitudes of the two stimuli were adjusted so that, in both conditions, the cell responded with nearly the same firing rate. This was done with the purpose of neutralizing the strong effect that the firing rate has on the information transmission rate (Borst and Haag 2001). Only cells with firing rates between 70 and 150 Hz that showed no systematic decrease or increase in firing rate are reported here. These constitute 104 experiments of the 150 that were carried out. In most of these 104 experiments, the difference between the firing rates of a cell in response to the two stimuli was rather small; only in two cases was the difference more than 20 Hz, but still less than 40 Hz. Most importantly, residual variations of the firing rates had no systematic dependence on the stimulus condition. Throughout the rest of the experiment, the baseline amplitude remained fixed.
Once the carrier frequency and baseline amplitudes were determined, each stimulus was played for 10 s while the neural activity was recorded. These long stimuli were later used to calculate the neuron's linear forward filter.
When collecting data for the information analysis, each stimulus was presented for N trials, with N ranging between 98 and 533 (average 166), depending on how long the recording could be sustained. The two different stimuli lasted for 1 s and were played alternately, separated by pauses of 700 ms to prevent slow adaptation effects.
Information theoretical analysis
The statistical dependence between the stimulating sound wave and the resulting neural activity was quantified using information-theoretical measures. The first 200 ms of each trial was discarded to exclude the sharp initial transient of the firing rate caused by spike-frequency adaptation (see, e.g., Gollisch and Herz 2004). The voltage traces, with an effective trial length T = 800 ms, were binned into short windows of duration Δt ranging in all cases from 0.4 to 3 ms. The spike train was represented by a string of T/Δt bins. Each digit in the string indicated the number of spikes in the corresponding time bin. A word w of length l was defined as a sequence with l/Δt entries. The sampled words were allowed to overlap with each other.
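The binning and word construction described above can be sketched as follows. This is a minimal illustration, not the analysis code used in the study; the function names, the use of milliseconds, and the convention of giving the word length as a number of bins are assumptions made for the example.

```python
def bin_spike_train(spike_times_ms, t_start_ms, t_stop_ms, dt_ms):
    """Count spikes in consecutive bins of width dt_ms over [t_start_ms, t_stop_ms).

    Spikes outside the analysis window (e.g., the discarded first 200 ms)
    are ignored. Returns a tuple with one spike count per bin.
    """
    n_bins = int(round((t_stop_ms - t_start_ms) / dt_ms))
    counts = [0] * n_bins
    for t in spike_times_ms:
        k = int((t - t_start_ms) // dt_ms)
        if 0 <= k < n_bins:
            counts[k] += 1
    return tuple(counts)


def overlapping_words(binned, n_letters):
    """Return all words of n_letters consecutive bins; successive words overlap."""
    return [binned[k:k + n_letters] for k in range(len(binned) - n_letters + 1)]
```

For a trial of the kind described here, one would call `bin_spike_train(spikes, 200.0, 1000.0, 0.4)` and then extract words whose number of bins equals the word duration divided by Δt.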
The mutual information between stimulus and response is defined as the difference between the total entropy of the spike train and its noise entropy. The total entropy Htotal(l) quantifies the richness and variety in the patterns within the spike train. It is calculated from the word distribution p(w), that is, the probability of finding a word w of length l in the whole collection of trials

$$H_{\mathrm{total}}(l) = -\sum_{w} p(w)\,\log_2 p(w) \qquad (1)$$

The noise entropy Hnoise(l), in turn, is the time average of the trial-to-trial variability at a fixed time within each trial. In this picture, each point in time is associated with a different stimulus, namely, the temporal sequence of sound intensities preceding it. Therefore the degree to which responses are time locked is a measure of the statistical correspondence between stimuli and responses. The noise entropy is calculated from the word distribution p(w|t), that is, the probability of finding a word w starting at time t

$$H_{\mathrm{noise}}(l) = \Big\langle -\sum_{w} p(w|t)\,\log_2 p(w|t) \Big\rangle_{t} \qquad (2)$$

where the angle brackets denote the average over t. These naïve estimates of the entropy depend both on the number of trials N and the word length l. Because finite data sampling biases the estimated information upward (Treves and Panzeri 1995), undersampling problems were controlled by extrapolating the second-order Taylor expansion of the total entropy and noise entropy as a function of 1/N to the case N → ∞. This was done by taking 1/5, 1/4, 1/3, 1/2, and the whole of the data, as in Strong et al. (1998). If there are no long-range correlations, both Htotal(l) and Hnoise(l) grow linearly with l for large l. A linear regression of Htotal(l)/l and Hnoise(l)/l as a function of 1/l therefore allows one to estimate the rates Htotal = lim l→∞ Htotal(l)/l and Hnoise = lim l→∞ Hnoise(l)/l in the limit of infinite word length (Strong et al. 1998). The information rate I is defined as

$$I = H_{\mathrm{total}} - H_{\mathrm{noise}} \qquad (3)$$

Errors caused by goodness of fit of the Taylor expansion and the linear regression were calculated and found to be always <3%.
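As an illustration of the naïve estimates of Htotal(l) and Hnoise(l) defined above, the following sketch computes both from a set of binned trials. It assumes entropies in bits, equal-length trials given as tuples of spike counts, and a word length expressed as a number of bins; the function names are hypothetical. It deliberately omits the extrapolations in 1/N and 1/l, so its output corresponds only to the uncorrected estimates.

```python
import math
from collections import Counter


def entropy_bits(counter):
    """Shannon entropy (bits) of the empirical distribution stored in a Counter."""
    total = sum(counter.values())
    return -sum((c / total) * math.log2(c / total) for c in counter.values())


def naive_entropies(trials, n_letters):
    """Return (H_total, H_noise) for words of n_letters bins.

    trials: list of equal-length tuples of spike counts, one per repetition.
    H_total uses words pooled over all start times and trials (p(w));
    H_noise averages, over start times, the entropy across trials (p(w|t)).
    """
    n_starts = len(trials[0]) - n_letters + 1
    pooled = Counter()
    noise_sum = 0.0
    for k in range(n_starts):
        words_at_k = Counter(trial[k:k + n_letters] for trial in trials)
        pooled.update(words_at_k)
        noise_sum += entropy_bits(words_at_k)
    return entropy_bits(pooled), noise_sum / n_starts
```

Dividing both values by the word duration gives entropy rates, and their difference gives the naïve information estimate for that word length.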
To estimate the size of undersampling errors, a set of five differential equations modeling the spike generation process was used (Watzl 2003). The parameters in the model were adjusted to have the same adaptation and refractory properties as the recorded cells. The artificially generated data showed that the difference of estimating information rates with 100 trials, each one lasting for 800 ms, compared with the more ideal case of 1,000 trials, each one lasting for 10 s was always <1%, for Δt ≥ 0.4 ms. However, when reducing to Δt = 0.2 ms, the error grew to 12%. These simulation results are consistent with information-theoretical considerations (Paninski 2003) that suggest that, for the longest words used in this study, the sampling error at Δt = 0.4 ms is <5% but rapidly grows for smaller Δt. Unless otherwise stated, all information rates are therefore taken at Δt = 0.4 ms.
Quantification of spike-time jitter
To estimate the amount of jitter in repeated spike trains, such as those represented in the raster plots of Fig. 1C and D, the standard deviation, across all trials, of each spike's timing was calculated. To identify aligned spikes automatically, a sliding window spanning from some time t0 to t0 + Δw was used. The width Δw was chosen small enough so that there was a high probability of finding at most a single spike per trial inside the window, yet large enough to encompass the typical amounts of jitter found in the system. Here, we used Δw = 5 ms, as the experimental protocol yielded a mean interspike interval of 10 ms.
The amount of jitter j associated with the action potentials inside (t0, t0 + Δw) is defined as the standard deviation of spike occurrence times in the different trials. To avoid ambiguities, only trials containing one single spike inside the window are used. For given t0, at least one-half the trials were required to have a single spike in the window to proceed to calculate a jitter value j. With this setting, ∼80% of the spikes participated in the calculation of the jitter; the remaining 20% were discarded. Neither these percentages nor the jitter values themselves depended strongly on the fraction of trials required to have a single spike in the chosen interval, as long as this fraction remained less than 90%.
Calculating the jitter j from binned data can underestimate the true jitter. Suppose that all spikes fall into the same bin, such that one-half of the spikes lie at one end of the bin and the other half at the other end. In this worst-case scenario, the true jitter is one-half the bin size. The most conservative estimate of the true jitter j, which we will use throughout, consists therefore of adding one-half the bin size to the binned jitter estimate.
By sliding t0, a collection of j values can be obtained, resulting in a histogram P(j). The mean jitter J is defined as J=∫jP(j)dj. All values reported below were corrected for limited sampling. In all cases, the correction was <1%.
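The sliding-window jitter estimate described in this section can be sketched as follows. The sketch makes several assumptions that are not specified in the text: spike times are given in milliseconds, the standard deviation is the population form, t0 is advanced in fixed steps, and the limited-sampling correction is not applied. The function and parameter names are illustrative only.

```python
import statistics


def window_jitter(trials, t0, width=5.0, min_fraction=0.5, bin_size=0.0):
    """Jitter j for the window [t0, t0 + width), or None if too few trials qualify.

    trials: list of lists of spike times (ms), one list per repetition.
    Only trials with exactly one spike in the window are used, and at least
    min_fraction of all trials must qualify. Half the bin size is added as
    the conservative correction for binned data.
    """
    singles = []
    for spikes in trials:
        inside = [t for t in spikes if t0 <= t < t0 + width]
        if len(inside) == 1:
            singles.append(inside[0])
    if len(singles) < 2 or len(singles) < min_fraction * len(trials):
        return None
    return statistics.pstdev(singles) + 0.5 * bin_size


def mean_jitter(trials, t_start, t_stop, step=0.1, width=5.0, **kwargs):
    """Mean jitter J: average of j over all window positions that yield a value."""
    n_steps = int((t_stop - t_start - width) / step) + 1
    values = []
    for k in range(n_steps):
        j = window_jitter(trials, t_start + k * step, width=width, **kwargs)
        if j is not None:
            values.append(j)
    return sum(values) / len(values) if values else None
```

Averaging the collected j values is equivalent to evaluating J=∫jP(j)dj on the empirical histogram P(j).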
The jitter measure introduced here has the advantage of having units of time, and hence provides a quick, intuitive picture of the temporal dispersion to be expected in the raster plots. In that sense, it is similar to the measure used by Bair and Koch (1996) and differs from other approaches (Neltner et al. 2000; Schreiber et al. 2003) that essentially quantify the degree of coincidence of spike times in different trials.
Calculation of the neural forward filter
The poststimulus time histogram r̄(t) is the trial average of the responses to a fixed stimulus (Rieke et al. 1997)

$$\bar{r}(t) = \frac{1}{N}\sum_{i=1}^{N} r_i(t) \qquad (4)$$

where N is the total number of trials, and ri(t) is the time-dependent firing rate of the neuron in trial i. For this analysis, we use discretized time in bins of length 0.1 ms. Because the stimulus lasted 800 ms, ri(t) is a string with 8,000 entries that are zero, except at those times where a spike is emitted, where ri(t) is 1/(0.1 ms). When the number of trials N is large, r̄(t) dt represents the probability of generating a spike in (t, t + dt). The average firing rate r0 is the temporal integral of r̄(t) divided by the total length of the interval.
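The construction of the poststimulus time histogram and of the average firing rate r0 can be illustrated with a short sketch. It assumes spike times in milliseconds measured from the start of the analyzed interval, so that rates come out in spikes per millisecond (multiply by 1,000 for Hz); the function names are hypothetical.

```python
import math


def psth(trials, duration_ms=800.0, bin_ms=0.1):
    """Trial-averaged firing rate in bins of bin_ms (units: spikes per ms).

    Each spike contributes 1/bin_ms to its bin in the single-trial rate;
    the result is the average of the single-trial rates over all trials.
    """
    n_bins = int(round(duration_ms / bin_ms))
    rate = [0.0] * n_bins
    for spikes in trials:
        for t in spikes:
            k = math.floor(t / bin_ms)
            if 0 <= k < n_bins:
                rate[k] += 1.0 / bin_ms
    n_trials = len(trials)
    return [x / n_trials for x in rate]


def average_rate(rate):
    """Mean firing rate r0: temporal integral of the PSTH divided by its duration."""
    return sum(rate) / len(rate)
```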
The simplest approximation of the input-output relation of a cell is to write r̄(t) as the sum of a constant term and a stimulus-dependent modulation with mean zero, that is

$$\bar{r}(t) = r_0 + \int h(\tau)\, s_1(t-\tau)\, d\tau \qquad (5)$$

where s1(t) = s(t) − s0, and s0 is the temporal average of the time-dependent stimulus s(t). The function h(τ) is called the forward filter of the cell (or as here, simply the filter). Equation 5 implies that the spiking probability is particularly sensitive to those stimulus segments that match the form of h(−τ). The negative sign of the argument indicates that the filter is a time-inverted version of the preferred stimulus of the cell.
The filter can be calculated from the correlation Crs(τ)=∫r(t)s1(t+τ)dt between stimulus and response and the stimulus autocorrelation Css(τ)=∫s1(t) s1(t+τ)dt. This is most easily done in the frequency domain. With Ĉrs(f) denoting the Fourier transform of Crs(τ) and Ĉss(f) that of Css(τ), the Fourier transform ĥ(f) of h(τ) is obtained as (Koch and Segev 1998)

$$\hat{h}(f) = \frac{\hat{C}_{rs}^{*}(f)}{\hat{C}_{ss}(f)} \qquad (6)$$

where the asterisk denotes complex conjugation. Notice that these results can be extended beyond the linear approximation. Equation 6 is still valid when r̄(t) − r0 is a static nonlinear function of the convolution of h and s1. Even in more general cases, where r̄(t) is any nonlinear function of s(t), Eq. 6 gives the best linear approximation of r̄(t) in terms of smallest mean-square error.
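A discrete, frequency-domain version of this filter estimate can be sketched as follows. The sketch assumes circular (periodic) correlations on a short, evenly sampled record and uses a plain discrete Fourier transform written out in pure Python for transparency, which is practical only for short sequences. With Crs(τ) defined with lag t + τ as above, the filter spectrum is the complex conjugate of the cross-spectrum divided by the stimulus power spectrum. Frequencies at which the stimulus has no power, including the zero-frequency component, are set to zero, so the filter is recovered only up to its mean. All function names are illustrative.

```python
import cmath


def dft(x):
    """Discrete Fourier transform of a real or complex sequence (O(n^2))."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * f * t / n) for t in range(n))
            for f in range(n)]


def idft_real(spectrum):
    """Real part of the inverse discrete Fourier transform."""
    n = len(spectrum)
    return [(sum(spectrum[f] * cmath.exp(2j * cmath.pi * f * t / n)
                 for f in range(n)) / n).real
            for t in range(n)]


def forward_filter(rate, stimulus, eps=1e-12):
    """Estimate the linear forward filter h from a PSTH and its stimulus."""
    n = len(rate)
    r0 = sum(rate) / n
    s0 = sum(stimulus) / n
    R = dft([r - r0 for r in rate])
    S = dft([s - s0 for s in stimulus])
    H = []
    for Rf, Sf in zip(R, S):
        css = abs(Sf) ** 2             # spectrum of the stimulus autocorrelation
        crs = Rf.conjugate() * Sf      # spectrum of Crs(tau) = sum_t r(t) s1(t + tau)
        H.append(crs.conjugate() / css if css > eps else 0.0)
    return idft_real(H)
```

For records of realistic length, a fast Fourier transform routine would replace `dft` and `idft_real`, and the spectra would typically be averaged over stimulus segments before taking the ratio.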
In an extension of the forward-filter analyses, filters were also calculated by considering only a subset of the spikes. This subset was selected according to the jitter values j associated with the spikes, for example, those spikes whose j lay within a specified range.
We study how well neural representations in the sensory periphery match different stimulus environments. To this end, the activity of auditory receptor neurons is recorded during a systematic exploration of the stimulus space. This allows us to study the mapping between stimuli and responses. To characterize the trial-to-trial variability of this mapping as well, each stimulus is presented a large number of times. In Fig. 1, two example stimuli are shown, together with the responses of one single receptor. The poststimulus time histograms in the lowest panels show that the stimulus introduces a strong temporal modulation of the spiking probability. In addition, the presence of a certain degree of scatter in the raster plots (middle panels) indicates that the mapping between stimuli and responses is not deterministic. A general description of this mapping, hence, should be grounded on probabilistic methods. Information theory is a branch of statistics that provides rigorous answers to the problem of how faithfully messages are transmitted through a noisy channel and how much information those messages encode. The power of the information-theoretic methods resides in their generality: they make no assumptions about the nature of the transduction process (Borst and Theunissen 1999; Rieke et al. 1997; Strong et al. 1998). Only the joint probability distribution P(r,s) of stimulus s and response r is needed. This makes these methods good candidates for studying the neural encoding of sensory stimuli (see Dayan and Abbott 2001; Rieke et al. 1997).
Within the probability-theoretic framework, we can address key questions in a quantitative manner—what is the relevant temporal resolution for information transmission and how stimulus characteristics influence the trial-to-trial response variability and information rate. The quality of stimulus encoding is assessed by computing the mutual information rate between stimuli and responses (see methods). This allows us to analyze the degree of statistical dependence p(r|s) between the stimulus s and the stochastic neural response r, even if successive spikes are correlated (e.g., the coherent displacement of successive spikes in Fig. 1C) or if the input-output mapping includes sophisticated nonlinear transformations. In the absence of prior knowledge about the neural encoding process, information theory is therefore our method of choice. Comparisons with biologically more intuitive response measures, such as spike-time jitter, will reveal whether these highly reduced measures capture the full complexity of the conditional probability distribution p(r|s) that underlies the information-theoretic approach.
Two qualitatively different processes influence the relation between stimulus and response. First, if a sensory neuron is to represent a large range of stimuli, it should have a rich repertoire of activity patterns. This degree of complexity is given by the total entropy rate Htotal of the responses. Second, if a sensory neuron is to represent each sensory signal in a reliable manner, repeated presentations of one stimulus should elicit nearly identical responses. The effect of the trial-to-trial variability in the neural representation is described by the noise entropy rate Hnoise. As the difference of Htotal and Hnoise, the mutual information provides a quantitative measure for the balanced trade-off between these counteracting response aspects.
As an experimental system, we studied the auditory periphery of L. migratoria, a well-established model system for auditory processing in grasshoppers (Ronacher and Krahe 2000; Ronacher and Römer 1985; Stumpner and Ronacher 1991). Grasshoppers use acoustic courtship signals to call and identify other members of their own species and to assess the quality of potential mates (Balakrishnan et al. 2001; Ronacher and Krahe 1998; von Helversen and von Helversen 1983). The possibility of obtaining long, intracellular recordings from auditory receptors, the comparatively simple statistical structure of their calling songs, and their straightforward behavioral relevance make grasshoppers an ideal system for our study.
Temporal resolution relevant for information transmission
When grasshopper auditory receptors are stimulated with amplitude-modulated (AM) signals, they generate spike trains whose temporal pattern can be remarkably reproducible, as shown in Fig. 1 for a sample cell. Figure 1, C and D, show the responses to the two stimuli presented in Fig. 1, A and B. It is readily seen that the amount of jitter in the trial-to-trial variability of the responses varies noticeably with the stimulus.
We define the amount of spike-time jitter j in a time window spanning from time t0 to t0 + Δw as the standard deviation of the neural firing times within this interval. Here, Δw is set to 5 ms (see methods). The mean jitter J is obtained by sliding t0 along the time axis and averaging over all the j values thus obtained. In the example of Fig. 1D, we find J = 0.45 ms. Furthermore, 20% of the j values are less than 0.25 ms. This number should be compared with the time scale of the AM of the stimulus, which is 5 ms, i.e., more than 20 times larger.
The large differences in spike-timing precision obtained for different stimulus statistics suggest that the amount of jitter is an important aspect of information transmission in different stimulus environments. However, if low-jitter spikes are important, one should be able to find information about the stimulus on time scales as small as a few tenths of a millisecond. To study whether this is indeed the case, we estimated the mutual information between the acoustic stimuli and the responses. To do so, each spike train was transformed into a binary sequence where each digit denotes the presence or absence of a spike within a time window of length Δt (see methods). Figure 2 shows that information rates increased for progressively finer temporal resolution Δt, even down to Δt = 0.4 ms, the minimal value for which we can calculate the information rate reliably. Similar to results in other systems (Liu et al. 2001; Panzeri et al. 2001; Reinagel and Reid 2000; Strong et al. 1998), this finding suggests that temporal precision does contribute to information transmission. Specifically, in this particular example, 60% of the j values were <0.45 ms.
Systematic exploration of the stimulus space
Grasshoppers generate calling songs by rasping their hindlegs across their forewings. The resulting sound wave consists of a broad-band carrier signal with frequencies in the range of 3–40 kHz, whose intensity is strongly modulated in time, resulting in a characteristic rhythmic, chirping sound. When presented with a male's courtship song, female grasshoppers respond with a different acoustic pattern, and the probability of their response depends on the temporal properties of the male's call (Balakrishnan et al. 2001). Apparently, the amplitude modulation of the call carries important cues about the male singer (Machens et al. 2003). Therefore we explored the stimulus space by varying the statistical properties of the modulation, while keeping the carrier wave at the value where each particular receptor is most sensitive. The stimuli consisted of random AM waves characterized by three parameters: shape, standard deviation, and cut-off frequency of the amplitude modulation.
Different receptors vary in their cellular properties, resulting in different response characteristics. To identify the effect of the stimulus on the response (despite the cell-to-cell variability), each cell was presented with two stimuli. One stimulus was the same for all cells: a Gaussian amplitude distribution with standard deviation σ = 6 dB and cut-off frequency fC = 200 Hz, which is shown in Fig. 3A. This is henceforth called the standard stimulus. The other signal, the comparison stimulus, was varied from cell to cell. Three types of comparison stimuli were used, as depicted in Fig. 3, B–D. These stimuli differed from the standard stimulus in one of the following aspects.
FORM OF THE AMPLITUDE DISTRIBUTION.
The rhythmic sequence of high- and low-intensity segments found in natural songs gives rise to a bimodal amplitude distribution. To determine whether this particular distribution has an effect on the quality of information transmission, in some of the cells, the standard (Gaussian) stimulus was compared with a stimulus with the same standard deviation and cut-off frequency, but with the bimodal amplitude distribution taken from a typical grasshopper song (Fig. 3B).
STANDARD DEVIATION σ OF THE AMPLITUDE DISTRIBUTION.
In another set of cells, the standard stimulus was compared with a stimulus that was also Gaussian and with equal cut-off frequency but with a different standard deviation. This allowed us to modify the probability of finding sharp deflections in the signal. Whereas the width σ of the standard stimulus was fixed to 6 dB, the comparison stimulus had either σ = 3 dB or σ = 12 dB. An example of σ = 12 dB is shown in Fig. 3C.
CUT-OFF FREQUENCY FC OF THE AMPLITUDE MODULATION.
In a third set of cells, the standard stimulus was compared with a Gaussian signal with equal SD but different cut-off frequency. The cut-off frequency of the standard stimulus was 200 Hz, that is, roughly the highest frequency found in the power spectrum of natural songs. Comparison stimuli included cut-off frequencies of 25, 100, 400, and 800 Hz. An example of fC = 400 Hz is shown in Fig. 3D. As the cut-off frequency increases, the duration of the fluctuations in the stimulus decreases as 1/fC.
How should the mean sound intensities of the standard and comparison stimulus be chosen? In the natural environment, sound intensities depend on the distance between sender and receiver, so there is no natural value for the mean. Hence, initial calibration of the mean sound intensities (see methods) was designed to yield the same firing rate in response to both stimulus ensembles, thereby avoiding the strong effect of the firing rate on the information transmission rate (Borst and Haag 2001). As desired for this study, our results thus directly reflect the influence of the higher-order stimulus statistics on the transmitted information and are not compromised by spurious firing rate effects. As the average firing rate of the studied receptor neurons ranges from about zero to several hundred Hertz depending on the energy of the sound signal (Gollisch et al. 2002), we aimed at a natural range of firing rates around 100 Hz.
Influence of stimulus characteristics on the information rate
The example of Fig. 1 suggests that the properties of the acoustic stimuli strongly influence the precision in the neural response. To study the effects of different stimuli in a systematic fashion, we compared the information rates for the standard and comparison stimuli in all recorded cells. Figure 4 shows the difference between the information rates obtained in the two stimulus conditions as a function of the rate of the standard stimulus. Hence, if a given cell transmits information at a higher (lower) rate when driven with the comparison stimulus, it appears above (below) the horizontal line.
Figure 4A shows that the bimodal amplitude distribution led to lower information rates than the standard Gaussian stimulus in all tested cells (n = 8). This held even though both the standard and the comparison information rates varied more than twofold from cell to cell. Here, each data point corresponds to a different cell. A paired t-test showed a significant difference in the values obtained for the two stimulus conditions (t(df = 8) = 5.308, P < 0.01). On average, the information rate for the bimodal stimulus was 85 ± 4% of that for the Gaussian distribution.
Figure 4B depicts data from experiments where responses to amplitude modulations with σ = 6 dB (standard stimulus) were compared with stimuli with either σ = 3 dB or σ = 12 dB (comparison stimulus). The information rate of each cell was highest for the stimulus with the largest standard deviation. A paired t-test revealed that this effect is significant (for σ = 12 dB, t(df = 6) = −3.615, P < 0.05, whereas for σ = 3 dB, t(df = 5) = 19.04, P < 0.001). The information transmitted about the stimulus constructed with σ = 3 dB was, on average, 51 ± 4.9% (n = 6) of the information transmitted about the standard stimulus. For σ = 12 dB, the ratio was 124 ± 6.7% (n = 8). Therefore within the tested range, the information rate increased with the standard deviation σ, that is, with the size of the amplitude deflections in the stimulus.
The above analysis shows that stronger amplitude modulations increase the rate of information transmission. Is there a similar influence of the speed at which the stimulus amplitude fluctuates? To study this question, we compared the standard Gaussian stimulus (containing spectral components up to a cut-off frequency fC = 200 Hz) with slower or faster amplitude modulations that were normalized such that all stimuli had the same variance. As seen in Fig. 4C, comparison stimuli with cut-off frequencies of fC = 25, 100, 400, or 800 Hz yielded significantly lower information rates than the standard stimulus for most receptors (t(df = 6) = 6.973, P ≤ 0.001; t(df = 7) = 3.534, P ≤ 0.05; t(df = 6) = 2.580, P ≤ 0.05; t(df = 7) = 3.662, P ≤ 0.01, respectively). Exceptions to this rule were found in 4 of 30 cases, where the comparison stimulus produced larger information rates than the standard stimulus (1 cell with fC = 100 Hz, 2 with fC = 400 Hz, and 1 with fC = 800 Hz). The ratios of the information rate of the comparison stimulus to that of the standard stimulus were, on average, 63 ± 4.2% (fC = 25 Hz, n = 7); 92 ± 2.3% (fC = 100 Hz, n = 8); 88 ± 3.9% (fC = 400 Hz, n = 7); and 75 ± 6.7% (fC = 800 Hz, n = 8).
Random stimuli containing fast variations are less predictable and therefore have a higher entropy rate than slow stimuli. As such, they could be expected to lead to a higher information transmission rate than slower stimuli. Interestingly, the information rate I nevertheless dropped as fC grew beyond 200 Hz, showing that there is an optimal time scale for stimuli to be encoded with high efficiency.
We next asked whether the dependence of the information rate on the stimulus reflects a variation in the richness of the neural code Htotal or in its trial-to-trial variability Hnoise. In Fig. 5, we separately present the variations of the total and the noise entropy rates when switching from the standard to the comparison stimulus. The stimulus conditions are the same as in Fig. 4. The symbols with a black upper half depict the value of the total entropy rate, whereas the gray symbols stand for the noise entropy rate. The three panels of Fig. 5 show that, when the stimulus varied from the standard to the comparison condition, the total entropy remained roughly unchanged; each type of black and white symbol is similarly scattered below and above the horizontal line. A paired t-test showed that the mean total entropy in response to the standard stimulus was not significantly different from that in response to the comparison stimulus (P > 0.1) in all comparisons except for fC = 25 Hz. In contrast, each gray symbol appears preferentially either below or above the horizontal line, depending on the particular type of comparison stimulus. A paired t-test showed that the mean noise entropy in the standard stimulus differs significantly from that in the comparison stimulus (P < 0.05) for all comparisons except for fC = 400 Hz. Hence, under our experimental conditions, the stimulus statistics influenced the information transmission rate by mainly affecting the value of the noise entropy rate and not the total entropy rate.
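The paired comparisons used throughout this section rest on the standard paired t statistic, which can be sketched as below. The example is generic: the function name is hypothetical, any numbers passed to it for demonstration are not data from this study, and the P value would still have to be read from a t distribution with the returned degrees of freedom.

```python
import math
import statistics


def paired_t(standard, comparison):
    """Paired t statistic and degrees of freedom for two matched samples.

    standard, comparison: per-cell values (e.g., information rates) measured
    in the same cells under the two stimulus conditions, in the same order.
    """
    diffs = [a - b for a, b in zip(standard, comparison)]
    n = len(diffs)
    # Standard error of the mean difference, using the sample standard deviation.
    sem = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / sem, n - 1
```

Because each cell serves as its own control, the test is sensitive to a consistent shift between conditions even when absolute values vary strongly from cell to cell.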
Relation between spike-time jitter and information-theoretic measures
The noise entropy rate is influenced by the stimulus statistics. How is this dependence reflected in the spike train? To study this question, we calculated the jitter distribution P(j) corresponding to the whole collection of spikes emitted by a cell to a given stimulus. This distribution is defined as the probability of finding a spike with an amount of jitter j (see methods). Figure 6 shows an example corresponding to the same cell and stimulus condition as in Fig. 1D and shows that jitter values as low as 0.15 ms can be achieved. The jitter distribution P(j) was calculated for all cells and stimulus conditions. In all cases, unimodal distributions were found. The maximum of P(j) was reached for some j between 0.35 and 1.9 ms, depending on the cell. The mean jitter J varied between 0.45 and 1.3 ms, and many recordings contained remarkably precise spikes that jittered as little as 0.15 ms.
As the stimulus statistics are altered, the mean jitter J covaries with the mutual information rate in a remarkably consistent way. This becomes apparent in Fig. 7, where the dependence of J on the stimulus conditions (top) is compared with the mean information rate I (bottom). For simplicity, only cell averages are depicted, and the error bars represent the SD from the average. For all stimulus variations, one can readily see that J and I exhibit opposite trends: the information rate increases whenever the mean jitter decreases.
As a summary of the results obtained thus far, Fig. 8 shows the effect of the mean amount of jitter on the information rate, the total entropy rate, and the noise entropy rate (left). For comparison, the effect of the average firing rate is also shown (right). The mean jitter J is noticeably correlated with the noise entropy Hnoise (Fig. 8C). The total entropy rate does not exhibit any clear dependence on J (Fig. 8B). As a result, the mutual information I (obtained by subtracting Hnoise from Htotal) is strongly correlated with J (correlation coefficient −0.888, significant at the 0.01 level), as shown in Fig. 8A. For the present system, the rather abstract mutual information can thus be largely reduced to the biologically more intuitive, yet less general, measure of spike-time jitter. In addition, Fig. 8E shows that the total entropy rate is accurately predicted by the firing rate of the cell as has been reported previously (Borst and Haag 2001). The noise entropy rate, on the other hand, bears no obvious dependence on the firing rate of the cell (Fig. 8F). Together, these two effects result in a mutual information rate that is only weakly correlated with the firing rate (Fig. 8D).
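The subtraction I = Htotal − Hnoise can be illustrated with a minimal sketch. It assumes that responses to a repeated stimulus have been binned into 0/1 arrays and uses naive plug-in entropies of fixed-length words. The function names, the word length, and the absence of any sampling-bias correction are illustrative assumptions; the estimator actually used in this study is described in its methods and may differ.

```python
import numpy as np
from collections import Counter

def entropy_bits(counts):
    """Naive plug-in entropy (in bits) of a histogram of positive counts."""
    p = np.asarray(list(counts), dtype=float)
    p = p / p.sum()
    return float(-(p * np.log2(p)).sum())

def entropy_rates(binary, word_len, dt):
    """Return (H_total, H_noise, I) in bits/s.

    binary   : array (n_trials, n_bins) of 0/1 responses to one repeated stimulus
    word_len : number of bins per word (hypothetical choice)
    dt       : bin size in seconds
    """
    n_trials, n_bins = binary.shape
    # words[s] holds the word observed at start bin s in every trial
    words = [
        [tuple(binary[r, s:s + word_len].tolist()) for r in range(n_trials)]
        for s in range(n_bins - word_len + 1)
    ]
    # Total entropy: word distribution pooled over all times and trials
    h_total = entropy_bits(Counter(w for ws in words for w in ws).values())
    # Noise entropy: word distribution across trials at a fixed time, averaged over time
    h_noise = float(np.mean([entropy_bits(Counter(ws).values()) for ws in words]))
    duration = word_len * dt
    return h_total / duration, h_noise / duration, (h_total - h_noise) / duration
```

In this sketch, a response that is identical on every trial has zero noise entropy, so the information rate equals the total entropy rate; any trial-to-trial variability raises Hnoise and lowers I without necessarily changing Htotal.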
In the previous section, the mutual information rate I was shown to be tightly related to the mean amount of jitter J in the neural responses. To further evaluate the relationship between I and J, we added artificial jitter to the neural response. The temporal location of each spike was randomly altered by a value drawn from a flat probability distribution in a small interval (t0 − τ, t0 + τ) centered at the true spike time t0. Figure 9A shows the dependence of the mutual information I on the bin size Δt used for binning the spike train (see methods) for a sample cell. Results for the original spike train are represented by squares. The circles show a response set that has been jittered with τ = 0.5 ms. For large values of Δt, the mutual information of both sets coincided. However, as Δt approached τ = 0.5 ms, the information provided by the response with artificial jitter was noticeably lower than that of the true spike train. When the spike train was modified by a larger jitter τ (1 ms, triangles), the discrepancy with the original information rate was even more evident. Notice that the value of Δt at which the two information values began to differ depended on the size of the added jitter τ.
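The jittering procedure can be sketched in a few lines. The helper names below are hypothetical, spike times are assumed to be given in milliseconds, and the binning helper is only a stand-in for the discretization described in the methods.

```python
import numpy as np

def add_uniform_jitter(spike_times_ms, tau_ms, rng):
    """Shift every spike by an independent draw from a flat distribution on (-tau, +tau)."""
    spike_times_ms = np.asarray(spike_times_ms, dtype=float)
    shifts = rng.uniform(-tau_ms, tau_ms, size=spike_times_ms.shape)
    return np.sort(spike_times_ms + shifts)

def bin_spikes(spike_times_ms, dt_ms, duration_ms):
    """Count spikes in consecutive bins of width dt_ms covering [0, duration_ms]."""
    n_bins = int(round(duration_ms / dt_ms))
    edges = np.linspace(0.0, duration_ms, n_bins + 1)
    counts, _ = np.histogram(spike_times_ms, bins=edges)
    return counts

rng = np.random.default_rng(0)
spikes = np.array([12.5, 22.5, 32.5, 42.5])      # toy spike times (ms)
jittered = add_uniform_jitter(spikes, tau_ms=0.5, rng=rng)
coarse_same = np.array_equal(bin_spikes(spikes, 5.0, 50.0),
                             bin_spikes(jittered, 5.0, 50.0))
```

In this toy case the spikes sit in the middle of the 5-ms bins, so the coarse binning cannot see the added jitter, whereas bins comparable to τ can. This mirrors the observation that the information values of original and jittered responses separate only once Δt approaches τ.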
Performing the same analysis on all recorded cells confirmed the findings from the above example. Figure 9B shows a histogram with the original information values for the whole collection of cells and all stimulus conditions. When uniform jitter with τ = 1 ms was added to each response, the corresponding information values dropped as depicted in Fig. 9C. The number of cases with a mutual-information rate more than 300 bits/s was markedly reduced, and correspondingly, the fraction less than 150 bits/s was noticeably increased.
Random manipulations of spike trains have often been used to selectively disrupt some features in the responses but not others (Furukawa and Middlebrooks 2002; Hatsopoulos et al. 2003; Lu and Wang 2003; Reinagel and Reid 2000). Adding jitter is equivalent to convolving the original probability density of generating a spike with a new, artificial distribution [in our case, a flat distribution in (t0 − τ, t0 + τ)]. This operation, however, only introduces noticeable changes to those distributions that were originally narrow. In other words, the spikes that were imprecise from the start remain roughly unchanged. In contrast, the alignment of precise spikes is markedly destroyed. The drop in information rates obtained with jittered spike trains confirms that the mutual information rate is strongly affected by the fraction of highly precise spikes.
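A short calculation makes the asymmetry concrete. It assumes, for illustration only, that the jitter of a spike is summarized by the standard deviation of its timing distribution (the study's own definition is given in its methods). Variances of independent shifts add, and a flat distribution on (−τ, +τ) has variance τ²/3.

```python
import math

def jittered_sd(sigma_ms, tau_ms):
    """SD of a spike time after adding independent uniform jitter on (-tau, +tau).

    Variances add under convolution; the uniform kernel contributes tau**2 / 3.
    """
    return math.sqrt(sigma_ms ** 2 + tau_ms ** 2 / 3.0)

# Most precise spikes (0.15 ms) versus the largest mean jitter reported (1.3 ms)
for sigma in (0.15, 1.3):
    widened = jittered_sd(sigma, tau_ms=1.0)
    print(f"sigma = {sigma} ms -> {widened:.2f} ms (x{widened / sigma:.1f})")
```

Under these assumptions, with τ = 1 ms a spike with 0.15 ms of jitter is broadened roughly fourfold, whereas a spike with 1.3 ms of jitter is broadened by only about 10%.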
Stimulus features that underlie precise spikes
To uncover those stimulus features that are represented by the most accurate spikes, we took a more detailed look at the correspondence between stimuli and responses. To do so, we analyzed the integration properties of the receptors by calculating their linear forward filter characteristics. The linear filter of each cell can be easily obtained from the correlation between stimuli and responses (see methods). This correlation quantifies the degree to which spikes are locked to a particular stimulus feature. The shape of the time-inverted filter represents the stimulus feature that, within the linear hypothesis, drives the cell optimally.
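As a rough illustration of the stimulus-response correlation, the sketch below computes a spike-triggered average, i.e., the mean stimulus segment preceding a spike. For a stimulus with a flat spectrum this is proportional to the time-inverted linear filter; the normalization and any spectral correction used in the study's methods are not reproduced here, and the function name and arguments are hypothetical.

```python
import numpy as np

def spike_triggered_average(stimulus, spike_bins, window):
    """Average the `window` stimulus samples that precede each spike.

    stimulus   : 1-D array of stimulus amplitudes (mean already removed)
    spike_bins : sample indices at which spikes occurred
    window     : number of samples before the spike to include
    """
    stimulus = np.asarray(stimulus, dtype=float)
    # Skip spikes that occur too early to have a full preceding window
    valid = [b for b in spike_bins if b >= window]
    segments = np.stack([stimulus[b - window:b] for b in valid])
    return segments.mean(axis=0)

# Toy example: every spike follows a unit stimulus pulse by 3 samples
stimulus = np.zeros(100)
stimulus[[20, 50, 80]] = 1.0
sta = spike_triggered_average(stimulus, spike_bins=[23, 53, 83], window=10)
```

In the toy example the average is a single peak located 3 samples before the spike, which is the feature to which the spikes are locked.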
However, only those frequency components of the filter that were actually present in the stimulus can be obtained. As the cut-off frequency of the stimulus increases, the filter must therefore reveal its high-frequency content. Our data indicate that the filter has a natural frequency cut-off, as shown in Fig. 10 for a sample cell whose recording lasted long enough to test all five different cut-off frequencies. As fC grows from 25 to 200 Hz, the spectrum of the filter widens in frequency space. However, for fC = 400 Hz and fC = 800 Hz, the fraction of power in the upper half of the frequency range is comparatively small. This means that the filters are dominated by contributions in the range from zero to ∼200 Hz. In Fig. 10B, the temporal behavior of the filters is shown. For concreteness, we defined the preferred stimulus rise time as the interval between the first minimum to the right of the filter's global maximum and this maximum. In terms of the preferred stimulus feature, this corresponds to an upward stimulus deflection. As the cut-off frequency of the stimulus ensemble increases, the cell's preferred rise time settles to a value between 2 and 3 ms. By averaging all cells driven with 400- and 800-Hz cut-off frequencies, the average preferred rise time was estimated as 2.3 ± 0.5 ms. We conclude that spikes preferentially lock to upward stimulus deflections whose rise time lasts between 2 and 3 ms.
Can the preferred locking of spikes to these particular deflections also explain the optimal information transmission obtained for fC = 200 Hz? To answer this question, we analyzed the complete distribution P(j). Figure 11A depicts P(j) for a sample cell. The action potentials fired by this cell were ranked according to the size of their jitter. From this ranking, two subsets of spikes were extracted, each of which defined a new, artificial spike train. The precise spike train was constructed from the 15% of spikes with the lowest amount of jitter (the dark bars on the left of Fig. 11A, corresponding to j ≤ 0.25 ms). This was done for each one of the N trials recorded in the experiment. Similarly, imprecise spike trains were constructed based on the 15% of spikes with the largest amounts of jitter (the dark bars on the right of Fig. 11A, with j ≥ 1.25 ms). We asked whether the average stimulus segment triggering precise spikes differed from that eliciting imprecise spikes. To tackle this issue, the forward filters associated with the two separate subsets of spikes were calculated and are shown in Fig. 11B. Precise spikes occurred in response to larger stimulus excursions compared with imprecise spikes. More generally, if the jitter of the spike subset used to calculate the filter was increased, the height of the filter decreased (Fig. 11C).
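The ranking step can be sketched as follows, assuming that one jitter value has already been assigned to each spike event (how this is done is specified in the methods and not reproduced here). The function name and the default fraction argument are illustrative.

```python
import numpy as np

def split_by_jitter(jitter_ms, fraction=0.15):
    """Return index arrays of the most precise and the least precise events.

    jitter_ms : one jitter value per spike event (assumed precomputed)
    fraction  : share of events placed in each subset (15% in the text)
    """
    jitter_ms = np.asarray(jitter_ms, dtype=float)
    order = np.argsort(jitter_ms, kind="stable")   # ascending jitter
    k = max(1, int(round(fraction * jitter_ms.size)))
    return order[:k], order[-k:]

# Toy example: 20 events with jitter spread evenly between 0.1 and 2.0 ms
jitter = np.linspace(0.1, 2.0, 20)
precise_idx, imprecise_idx = split_by_jitter(jitter)
```

Each index set then defines an artificial spike train, from which a separate forward filter can be computed and compared.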
As revealed by the spike-resolved analysis, spikes were locked to amplitude upstrokes whose rise time lasted between 2 and 3 ms, and the locking improved with increasing size of the upstroke. With this insight, we can finally return to the question of why information transmission is optimal for stimuli with fC = 200 Hz, even though faster stimuli have higher entropy rates. For a cut-off frequency of 25 or 100 Hz, the stimulus only contains slow amplitude modulations, and none of the optimal 2- to 3-ms upstrokes. As the cut-off frequency is increased to 200 Hz, the stimulus exhibits more and more of the preferred features and thereby generates more accurate responses. As the cut-off frequency grows further, even faster modulations are incorporated. Within the framework of a linear filter, however, all stimulus deflections lasting less than about 2 ms are virtually filtered out by the cell. Because the variance of the stimulus was kept fixed as the cut-off frequency varied, these rapid deflections, although ineffective in driving the cell, absorb some of the stimulus power and thereby leave the relevant stimulus frequencies with less power. Hence, even though the optimal 2- to 3-ms excursions are still present, they have smaller amplitudes than for fC ≈ 200 Hz. As a consequence, for fC > 200 Hz, time locking begins to deteriorate, and information transmission rates drop.
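The power argument can be made quantitative under a simplifying assumption that is not stated in the text: the amplitude modulation has a flat power spectrum from zero up to fC. With fixed total variance, the share of power falling into the band that drives the cell (taken here as 0 to 200 Hz) is then simply 200 Hz / fC.

```python
def relevant_band_share(fc_hz, band_hz=200.0):
    """Fraction of the fixed stimulus variance that lies below band_hz.

    Assumes a flat spectrum from 0 to fc_hz (an illustrative assumption).
    """
    return min(1.0, band_hz / fc_hz)

for fc in (200, 400, 800):
    share = relevant_band_share(fc)
    # Amplitudes scale with the square root of power
    print(f"fC = {fc} Hz: power share {share:.2f}, amplitude factor {share ** 0.5:.2f}")
```

Under this assumption, raising fC from 200 to 400 Hz halves the power in the relevant band (amplitudes shrink by a factor of about 0.71), and raising it to 800 Hz leaves one quarter of the power (amplitude factor 0.5), consistent with weaker upstrokes and poorer time locking.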
The statistics of sensory stimuli or synaptic inputs have a strong influence on spike-time jitter (Bair and Koch 1996; Bryant and Segundo 1976; de Ruyter van Steveninck et al. 1997; Mainen and Sejnowski 1995; Warzecha et al. 2000). The stimulus statistics also affect the amount of information carried by a spike train (Lewen et al. 2001; Machens et al. 2001; Rieke et al. 1995; Vinje and Gallant 2000). By quantitatively assessing how this information depends on spike-time jitter, this study investigated the relation between these two observations in the context of sensory adaptation to natural stimulus environments.
As shown by two independent analyses, spike-time jitter indeed has a strong effect on the transmitted information. First, there is a tight correlation between those responses where the amount of jitter was low and those where the information transmission rate was high (Fig. 8A). Second, adding artificial jitter to the responses revealed that information rates are considerably reduced if no precise spikes remain (Fig. 9). Together, these results show that spike-time precision plays a crucial role in neural information transmission in the studied system.
The biophysical noise sources contributing to spike-time jitter in the studied receptor cells may reside in the mechanosensory transduction or in the spike-generating mechanisms. Our black box analysis of the input-output transformation does not allow us to distinguish between these possibilities. To further localize the origin of the measured spike-time variability, dendritic recordings of the transduction currents or interferometric measurements of tympanal vibrations would be needed.
Our data also suggest how the stimulus statistics affect information transmission; by modulating spike-time jitter, the external signal determines how often precise responses occur and thereby influences the rate of information transmission (Fig. 7). Further analysis of the responses revealed that the effect of spike-time jitter on information rates is mediated through the noise entropy of the response (Fig. 8); the total response entropy, on the other hand, did not vary when the stimulus type was changed (Fig. 5). In other words, stimulus types that lead to higher rates of information transmission do so not because they generate a richer repertoire of response patterns but because these patterns are less noisy.
The fact that the stimulus type has little influence on the richness of neural responses may come as a surprise in view of previous studies (see, e.g., Lewen et al. 2001). As pointed out by Borst and Haag (2001), however, information transmission can be strongly influenced by the average firing rate because higher rates allow a neuron to employ a larger variety of response patterns. To study the effect of stimulus statistics on information transmission beyond these manifest effects of firing rate, our experiments were designed to yield the same firing rate irrespective of the specific stimuli that were compared for a given neuron. Under this condition, the total entropy did not differ for different stimulus types. In this system therefore, stimulus statistics do not influence the complexity of neural responses, apart from effects mediated through firing-rate changes. The two quantities governing information transmission—total entropy and noise entropy—therefore seem to be determined by two different stimulus characteristics—overall stimulus intensity and temporal stimulus variations, respectively.
One may wonder whether the correlation between spike-time jitter and rate of information transmission is a trivial fact to be expected for any coding scheme. Although this seems plausible, it is not generally true. One can easily devise coding schemes for which the information transmission rate does not depend on the neural output jitter. The simplest case may be the classical rate code where no information is found on small time scales (Shadlen and Newsome 1998). For a sensory neuron, this situation could occur if the transduction process included a temporal low-pass filter. The observed reduction of information rates with increasing jitter, however, indicates that, at least for this system, the relevant variable in information transmission is indeed the fine temporal placement of spikes. Notice that the transmitted information can also differ for responses to two stimulus ensembles that yield the same average spike-time jitter. Particular spikes can not only jitter across trials but can also be completely absent in some trials. These “missing spikes” do not influence the jitter measure, but clearly affect the information transmitted, and could, in principle, explain some of the variance observed in Fig. 8C.
We conclude that for the auditory system in this study, results obtained using the simple and biologically inspired measure of spike-time jitter are in agreement with the results obtained from a full information-theoretic analysis. However, as shown by the example of “missing spikes,” spike-timing precision and transmitted information need not go hand in hand. How closely both measures are related in other sensory systems remains an open question that needs to be studied case by case.
As suggested by Laughlin (2001) and Schreiber et al. (2002), generating spikes with high temporal accuracy is metabolically expensive. An energy-efficient representation of the natural environment might therefore require a careful match between the most important stimuli and the most accurate responses. We therefore extended the concept of a spike-triggered average such that the average was based on the most (or least) precise spikes only. This showed that responses with small spike-time jitter were preferentially elicited by strong upward stimulus deflections lasting between 2 and 3 ms.
These acoustic features have an important behavioral relevance, because natural songs are structured in syllables whose steep and often overshooting onsets last for one or at most a few milliseconds. Previous results have shown that behaviorally relevant cues about a grasshopper song are contained in the structure and temporal location of these onsets (Balakrishnan et al. 2001; Krahe et al. 2002). Stimulus-dependent spike-time jitter at the sensory periphery might therefore provide a means to encode behaviorally relevant stimuli such that those stimuli can be processed with great efficiency by downstream neurons without wasting metabolic resources to precisely represent less important stimuli.
Within the tested stimulus space, the ensemble with the most frequent instances of these features was a stimulus with Gaussian amplitude modulation, large standard deviation, and a cut-off frequency of 200 Hz. Bimodal distributions, although coinciding with the AM of natural grasshopper songs, were less informative. However, the preference for large amplitude excursions suggests that the system might not have evolved to provide an accurate representation of the entire natural distribution of amplitudes but rather to identify specific stimulus features, which are signaled by pronounced upstrokes of the amplitude. Because the Gaussian distribution contains a larger high-amplitude tail, strong upstrokes appear more frequently than in the bimodal stimulus distribution. This also explains why, in a previous study, naturalistic stimuli with large amplitude modulations led to higher information rates and coding efficacies than artificial stimuli with smaller amplitude modulations (Machens et al. 2001).
Our unexpected finding that naturalistic stimuli are suboptimal compared with Gaussian stimuli with equal variance suggests that sensory systems may be constrained to work as feature detectors for just a few salient characteristics of the input signal. To create these features, natural stimuli use a bimodal amplitude distribution, which may ultimately result from constraints on the sender and not the receiver. After all, grasshoppers are not capable of producing arbitrarily large sound amplitudes and may aim for energetically efficient signals that nevertheless retain precisely encoded amplitude excursions. This hypothesis could be tested with future experiments that study how spike-time jitter varies when the maximal signal amplitude is constrained.
Many recordings contained at least some spikes that jittered as little as 0.15 ms. This is a surprising finding, given that the spike-time jitter is an order of magnitude smaller than typical stimulus time scales. What stimulus aspects are being encoded on such small time scales? Obviously, precise spikes convey accurate information about when a particular stimulus feature occurs. This is of particular importance for grasshoppers, which rely on the detailed temporal structure of conspecific communication signals for mate finding (von Helversen and von Helversen 1997) and therefore need to tag events that mark the signal substructure.
Precise spikes may also help in detecting the presence of specific stimulus features, for example through a coincidence-detector read-out. Ronacher and Römer (1985) have speculated that such a mechanism could underlie the females' rejection of male grasshopper courtship songs that are interspersed with short millisecond gaps. The precise spiking in response to the short amplitude excursions may be critical for the operation of such a detection mechanism. Finally, precise spiking appears to be crucial for sound localization in many auditory systems (Grothe and Klump 2000; Mason et al. 2001). Spikes that appear in relative isolation, e.g., following an amplitude excursion after a quiet period, may be most suited for a comparison in timing between the left and the right ear. One would thus expect that these spikes show particular temporal precision, whereas highly precise firing may be of lesser importance for other stimulus parts.
These hypotheses about the functional role of information transmission by precise spiking are directly related to the question of how the information in the spike train is read out by subsequent neural processing levels. Acridid grasshoppers possess about 50 receptor neurons per ear. Their axons converge onto local interneurons in the auditory neuropil within the metathoracic ganglion. Depending on the specific convergence pattern, these secondary neurons will be driven in a highly reliable manner by low-jitter receptor spikes; high-jitter spikes, on the other hand, may not trigger any response of a downstream coincidence detector. Highly precise spikes found in auditory cortex (DeWeese et al. 2003) may be based on a similar mechanism.
The auditory neuropil of grasshoppers allows the identification of single neurons with distinct response characteristics. Because we now know how the stimulus statistics influence the responses in the receptor cell layer, it should be possible to systematically search for effects in the responses of those neurons that read out the receptor spike trains. Ultimately, this knowledge should help to reveal the mechanisms of fundamental computations carried out by this auditory model system, such as sound localization and time-warp-invariant song recognition.
This work was supported by the Alexander von Humboldt Foundation, the German Federal Ministry of Education and Research, the German Research Foundation, the Israeli Ministry of Science, and the Minerva Foundation of the Max Planck Society.
Present addresses: A. Rokem, The Helen Wills Neuroscience Institute, 132 Barker Hall, MC 3190, The University of California Berkeley, CA 94720–3190; S. Watzl, Columbia University, Department of Philosophy, 708 Philosophy Hall, MC 4971, 1150 Amsterdam Avenue, New York, 10027 NY; T. Gollisch, Department of Molecular and Cellular Biology, Harvard University Cambridge, MA 02138; and I. Samengo, Centro Atómico Bariloche, 8400 San Carlos de Bariloche, Argentina.
Copyright © 2006 by the American Physiological Society