We investigate the capability of turtle retinal ganglion cell (RGC) ensembles to simultaneously encode multiple aspects of visual motion: speed, direction, and acceleration of moving patterns. Bayesian stimulus reconstruction reveals that the instantaneous firing rates of RGCs contain information about all of these stimulus properties. Stimulus velocity is mainly encoded by steady-state firing rates, whereas acceleration can be reconstructed from transient components in RGC activity induced by abrupt velocity changes. Therefore neurons in higher brain areas may in principle extract information about changing velocity from the instantaneous firing activity of RGCs, without the need to compare responses to present velocities to previous ones. However, reconstruction requires the estimation of a combined acceleration and velocity signal, indicating that RGC ensembles signal both properties simultaneously. In accordance with this conclusion, combined velocity/acceleration sensitivity enhances the similarity of artificial spike trains to experimental data by 50% compared with the case of pure velocity tuning. Decoding of motion direction in addition to speed and acceleration requires direction-sensitive cells, which generate higher firing rates for one of the motion directions and therefore show asymmetric velocity tuning. By dividing the entire ensemble of simultaneously recorded cells into one group of direction-sensitive cells and one group with symmetric tuning, we demonstrate that the population of direction-sensitive cells encodes a combination of motion speed, acceleration, and direction. However, estimation of velocity and acceleration is improved by including the larger group of RGC responses that are sensitive to speed but not to motion direction.
To enable adequate behavioral responses in a dynamic environment, animals need to anticipate the future positions of prey, predators, or obstacles. Thus it is essential to correctly judge the velocity of moving objects, including absolute speed, direction of motion, and preferably also abrupt changes in these quantities.
Neurons in various visual brain areas of different species have been shown to process motion-related information. In macaques, neurons in the primary visual cortex and the middle temporal visual area have long been known to be selective for stimulus orientation, speed, and movement direction (Dow 1974; Maunsell and Van Essen 1983). The accessory optic system (AOS), a brain structure involved in retinal image stabilization common to all vertebrates, also contains neurons responding selectively to movement speed and direction of large, textured visual patterns (for review see Simpson 1984). In the pigeon, a subgroup of AOS neurons located in the nucleus lentiformis mesencephali has even been demonstrated to be acceleration sensitive (Cao et al. 2004).
Because all visual information about the surrounding world is transmitted by the action potentials of retinal ganglion cells (RGCs), their spike trains must also contain information about motion in visual scenes. In principle, RGCs could provide the basis for velocity computation in higher brain areas by simply signaling luminance changes within their receptive fields (Adelson and Bergen 1985; Reichardt 1961). In many species, however, RGC subtypes alter their responses systematically depending on stimulus speed (Cleland and Harding 1983; Lee and Willshaw 1978). Additionally, direction-selective retinal ganglion cells (DSGCs) have been described that respond maximally to motion in a particular direction while generating essentially no spikes for movement in the opposite direction (Barlow et al. 1964). The neuronal circuitry, which creates the characteristic responses of DSGCs, has been thoroughly investigated (Clifford and Ibbotson 2003; Taylor and Vaney 2003). Firing rates of DSGCs in the rabbit retina depend not only on the direction of motion, but on the stimulus speed as well (Oyster 1972). Thus a single type of RGC is in principle capable of encoding velocity, i.e., the combination of the absolute speed and the direction of movement.
In the turtle retina, 30–40% of ganglion cells have been shown to be direction selective (Ammermüller et al. 1995; Bowling 1980; Marchiafava 1979). At least some of these cells additionally modulate their firing rate as a function of stimulus speed (Ariel and Adolph 1985). Thus the turtle is ideally suited to investigate the retinal encoding of motion-related information.
In the work presented here, we apply Bayesian estimation to multielectrode recordings from turtle RGCs to reconstruct different aspects of a wide-field motion stimulus. The stimulus moved along a fixed axis and changed its velocity abruptly every 500 ms, occasionally reversing its movement direction. In the first part of the study, we compare the capability of two ganglion cell groups to represent stimulus speed and movement direction. We divide the entire population into direction-sensitive cells and speed-tuned neurons, which show firing rates irrespective of motion direction. The second part is concerned with retinal responses to sudden changes in stimulus speed. We ask whether RGCs signal acceleration at all by modulating their firing rates and whether direction-sensitive or purely speed tuned cells are more sensitive to speed changes.
Preparation and recordings
Electrophysiological measurements were done in turtle (Pseudemys scripta elegans) retinae as described previously (Greschner et al. 2002). Animals were killed according to the guidelines of the University of Oldenburg Ethical Committee and to ECC rules (86/609/ECC). Measurements were performed on isolated retinae with pigment epithelium attached. Retinae were constantly superfused with turtle ringer (120 mM NaCl, 5 mM KCl, 2 mM CaCl2, 2 mM MgCl2, 10 mM glucose, 22 mM NaHCO3, bubbled with 95% O2-5% CO2; pH 7.4).
Ganglion cell activity was recorded extracellularly with a 100-electrode silicon array (Cyberkinetics, Foxborough, MA) inserted from the pigment epithelium side into the ganglion cell layer. The Bionic 100-channel neural signal acquisition system (Cyberkinetics) was used for amplifying, thresholding, and storing the extracellular spike data. Several electrodes of the multielectrode array recorded the activity of more than one cell, indicated by multiple spike waveforms occurring at a single electrode. We used the supervised k-means clustering software SpikeSorter (Cyberkinetics) to separate these spike waveforms. Because the electrode distance of our array is 400 μm, the recording of one cell on more than one electrode is almost excluded. After spike separation, the time stamps of first threshold crossings were stored as a temporal sequence for each unit. Further analysis based on these time stamps was performed using evaluation routines all implemented in the Interactive Data Language (IDL; ITT Visual Information Solutions, Boulder, CO).
Stimulation was done via an optical bench using a white LED as the light source (LXHL-FW6C; Luxeon, San Jose, CA), generating a maximum retinal illuminance of Lmax = 1,000 lux. For the stimulus, we designed a pattern consisting of dark squares on a bright background, covering the whole recording area. Contrast between background and squares was Lmax/Lmin ≈ 25; edge length of the squares and average separation between them was 200 μm on the retina. For comparison, fitting two-dimensional Gaussian functions to the ganglion cell receptive field centers yielded values of σ between 100 and 200 μm. To avoid collinear edges stimulating all cells simultaneously, we jittered the position of each square relative to its position on a regular grid, retaining a minimum separation of 80 μm between squares. A detail of the stimulus pattern is shown in Fig. 1B for illustration. The actual number of squares was 100 × 100, thus covering 40 × 40 mm in total, to ensure that cells at the borders of the recording area are continuously stimulated even when the pattern is displaced during movement. The gray square in the center of Fig. 1B is not part of the stimulus, but represents the area covered by the electrode array. We projected the square pattern onto the retina using an x–y miniature mirror system (Datronik, Rastede, Germany). Stimulus motion was generated by gradually changing the position of the miniature mirrors under the control of a stimulus computer, which was synchronized with the data acquisition computer. Motion was one-dimensional along one axis of the electrode array, and a motion sequence v(t) consisted of transitions between distinct velocity levels. These levels were chosen from a discrete set of nine equidistant velocities in the range from −2.5 to +2.5 mm/s on the retina, corresponding to a maximum angular speed of approximately 30°/s (Northmore and Granda 1991). Negative velocity indicates stimulus movement in the reversed direction along the same axis. After a transition, velocity was kept constant at the new level for 500 ms before the next switch to yet another level. Thus movement with maximum speed between switches displaced the pattern by at most 1.25 mm on the retina, corresponding to roughly one quarter of the recording array. Motion sequences lasting 365 s were generated pseudorandomly and presented nine times per experiment. We ensured that the sequences sampled all velocity levels equally often while at the same time containing an equal number of all possible transitions between velocity levels.
In each retina, we were able to identify signals originating from 80 to 100 ganglion cells, and most of them responded vigorously to the moving stimulus. To ensure that cells were in a steady state during the complete course of our measurements, we counted the total number of spikes during each stimulus repetition. Spiking activity adapted strongly during the first motion sequence of each experiment (Barlow and Hill 1963), whereas the spike numbers of the remaining repetitions differed by maximally 10%. We therefore excluded the first trial from the analysis, leaving eight stimulus repetitions to be evaluated in each experiment. Stability of the preparation was also confirmed by the fact that reconstruction quality did not depend consistently on trial number.
We conducted three experiments with identical stimulation. Experiments 2 and 3 were performed in the same retina preparation. Data presented in the figures are taken from experiment 2, except for plots summarizing all three experiments explicitly. An additional experiment was performed earlier with a slightly different experimental paradigm. In this pilot study, we used a 200-μm square-wave grating as a moving stimulus and velocity was not restricted to nine distinct levels. Moreover, the retina preparation differed insofar as the pigment epithelium was removed. Despite the differences in experimental design, all main results subsequently presented were also found in these data. The qualitative agreement of results obtained under altered experimental conditions confirms the robustness of the effects described here.
Stimulus reconstruction by Bayesian inference statistically exploits the simultaneous occurrences of certain stimuli and neural activity to estimate stimulus features from nerve cell responses. However, the response of retinal ganglion cells lags behind the actual changes in stimulus motion due to the time needed for signal processing in presynaptic retinal cells. The RGC response latency has to be compensated for an optimal assignment of the neuronal responses to the respective stimuli because, otherwise, the performance of the reconstruction process is reduced considerably. To determine the latency, we computed a poststimulus time histogram (PSTH) for each neuron, starting immediately at the end of zero velocity intervals. Averaging over all neurons yields signals clearly displaying a steep rise of retinal activity after motion onset. We considered the earliest responses, which are those evoked by motion onsets with the highest absolute velocity. We defined as the latency Δt the instant at which the pooled ganglion cell signal differs >3SDs from its average value before the motion onset. We determined the minimum latency for each experiment separately (Δt1 = 53 ms, Δt2 = 54 ms, Δt3 = 56 ms) and used these values to temporally align the responses to their corresponding stimuli. Note that we decided not to optimize latency by applying the decoding procedure subsequently described because minimizing the deviation between the actual and the decoded signal yielded estimations changing earlier than the stimulus, thus violating causality.
The Bayesian framework for stimulus reconstruction
The Bayesian framework for stimulus reconstruction implies as its first step to estimate from the experimental data a model to describe the likelihood function P(r|s), i.e., the probability of encountering a neuronal response r given the stimulus s. Using Bayes’ rule, this is combined with the prior distribution of stimuli P(s) to compute the posterior distribution P(s|r). From this function, the stimulus s*—which most probably evoked a given response r—can be inferred. We apply the maximum a posteriori (MAP) estimator; i.e., we choose as the reconstructed stimulus s* the one that maximizes the posterior distribution. Both the neuronal response r and the stimulus s may be vectors, combining responses of multiple neurons and several stimulus properties, respectively.
Here, we use as the neuronal response the number of spikes ri(t) elicited by retinal ganglion cell i within the interval [t, t + τ], and combine these individual spike numbers within the response vector r(t) where N is the total number of recorded ganglion cells. The encoding capabilities of ganglion cell subgroups are investigated by including into r(t) only the responses of neurons belonging to the respective group. For the analysis of single-cell reconstruction quality, the response vector reduces to a scalar r(t) := ri(t).
STIMULUS AND PRIOR DISTRIBUTION.
In the first part of our analysis, we use the time-varying velocity signal described earlier as the only stimulus dimension. Having determined the response latency enables us to relate the latency-shifted stimulus s(t − Δt) := v(t − Δt) to its corresponding neuronal response, i.e., the vector r(t). For each time t, the MAP estimator then yields the stimulus v*(t − Δt) that most probably evoked the neuronal response, resulting in the reconstructed time course of velocities. Later, we include another stimulus property into the analysis by adding a second dimension to the stimulus vector: s(t − Δt) := [v(t − Δt), a(t − Δt)], which consequently yields the most probable combination of both stimulus properties s*(t − Δt) = [v*(t − Δt), a*(t − Δt)].
a(t) represents changes in the absolute velocity, and thus will be denoted the acceleration dimension. We compute a(t) as follows (for a discussion of the rationale of the computation of a(t) see discussion): its first component a1 consists of increases and decreases in the absolute velocity where δt = 1 ms denotes the sample interval of data evaluation. Because visual inspection of the population firing rate suggested that activity is further increased if the stimulus reverses its direction of motion while at the same time increasing the speed, we generated a second component a2(t) We chose v0 = 1 mm/s, which yields peaks in a2 with 0.4 times the maximum value of a1. The a2 component is not strictly necessary for acceleration estimation, but further improves the performance (see results).
To obtain a signal that rises quickly after each velocity transition and smoothly decays afterward, similar to the transient peaks observed in the population firing rate, we finally convolved the sum of a1 and a2 with the kernel function k(t) where k(t) = t × exp[−(t/50 ms)].
The resulting signal a(t) is shown in ⇓⇓Fig. 4A. The 50-ms time constant of k was chosen to fit the time course of the population firing rate. Reconstruction quality degraded steeply for values <40 ms, whereas values >70 ms did not properly reflect the transient character of the acceleration signal. Within this range, reconstruction quality did not critically depend on the exact choice of the kernel's time constant.
Within the Bayesian framework, the stimulus is considered to be a random variable, described by the so-called prior distribution of stimuli P(s). With the prior, it is possible to elegantly consider which stimuli occur more often than others. The estimation process exploits this knowledge and uses it as a bias toward stimuli occurring with a higher probability, which is especially important in cases of uncertain neuronal responses. Here, we use as the prior distribution the proportional frequency of stimuli actually applied during the experiment. In the paradigm using a single stimulus dimension, this corresponds to the number of occurrences of the different discrete velocity levels relative to all velocities presented. Because all levels were chosen equally often, the prior distribution is uniform, with equal probability for each velocity level where δ denotes the Dirac delta function and v1…9 = −2.5, −1.875, …, 2.5 mm/s.
Additionally, we analyzed reconstruction quality as a function of the prior width, motivated by the recent finding that in a psychophysical context, small velocities are favored during estimation by the visual system (Blakemore and Snowden 1999; Stone and Thompson 1992). We chose Gaussian bell curves centered at zero velocity as the prior during the estimation process. Decreasing the width of these curves reduces the probability of large velocities to be estimated, as would be expected for natural stimulus situations (Roth and Black 2007; Steinman and Collewijn 1980). Based on our data set, however, no clear conclusions could be drawn as to whether this modification of the prior distribution improves overall reconstruction quality. Thus only results obtained with a uniform prior distribution of velocities are subsequently presented.
For the acceleration analysis, we computed the proportional frequency of combinations of stimulus velocity and acceleration occurring simultaneously during the experiments, resulting in a two-dimensional prior distribution. This function is no longer flat. Because of the transient nature of the acceleration signal, small values prevail in a(t), such that the prior distribution is largest along the axis of zero acceleration. The fact that large accelerations cannot occur in combination with slow velocities is reflected by some prior entries being equal to zero.
POSTERIOR DISTRIBUTION AND ESTIMATION.
According to Bayes’ formula, the product of the prior distribution and the likelihood is proportional to the posterior distribution P(s|r), which quantifies the probability of each stimulus given the neuronal response Because we use the MAP estimator, the denominator does not need to be determined, because it only scales the distribution without shifting the maximum. The time dependence of r is not shown here for convenience.
Assuming that neurons respond independently from each other, the likelihood may be transformed into a product of single-neuron likelihood functions, which can be determined separately The independence assumption is a simplifying approximation for theoretical convenience, which makes the problem more tractable and is sufficient here to give a reasonable performance, even though probably not all combinations of cells meet this assumption.
If the generation of action potentials in the retinal ganglion cells obeys Poisson statistics (see discussion), the individual likelihood functions can be expressed using the average number fi(s) of spikes elicited by ganglion cell i as a function of the stimulus, i.e., by the tuning function of neuron i. The tuning functions are determined empirically from the experimental data. The complete likelihood function then reads Time dependence is introduced into the estimation procedure because the spike numbers ri vary with time. For each instant in time t, the present spike numbers r(t) in combination with the previously measured average tuning functions f determine a new likelihood and therefore a new posterior distribution of stimuli P[s|r(t)].
With the posterior distribution being known, it is now possible to estimate the stimulus s*(t), which has most probably elicited the neuronal response r(t). This is accomplished by choosing as s*(t) the stimulus that maximizes the posterior To avoid overfitting effects, we determined the likelihood functions from all but one of the eight identical stimulation trials. Estimation was based on the responses recorded during the one remaining trial. This procedure was repeated for each trial.
However, if responses depend on the stimulus history over longer periods of time, reconstruction quality may be overestimated because identical sequences of velocity transitions were used for both training and validation. To exclude this possibility, we repeated reconstruction using the first 319 s of each trial for determining the likelihoods, while retaining the final 45 s for estimation. These final segments contain stimulus sequences not previously presented within the same trial. Reconstruction worked equally well for this method, indicating that the reconstruction capability is independent of stimulus history. Subsequently presented results are thus obtained with the estimation method based on trial-by-trial comparison.
Evaluation of the reconstruction quality
To assess the quality of the reconstructed signal, it has to be compared with the original stimulus. For a quantitative analysis, we applied two methods integrating information about reconstruction precision from the complete experiments. Each of the two stimulus dimensions—velocity and acceleration—was evaluated separately, although they were estimated in combination.
Each stimulus value occurs repeatedly during the experiment. The responses to the same stimulus will depend on how well the neurons encode the stimulus and how strongly noise enters the encoding and measurement processes. Consequently, the reconstructed stimulus values will be scattered around the actual value. The deviation between stimulus s and estimate s* can be described by the conditional probability of estimates P(s*|s) (Stocker and Simoncelli 2006). We computed P(s*|s) by evaluating all eight single-trial time courses of the original and reconstructed stimuli.
The conditional probability of estimates allows for the detailed investigation of the estimation quality separately for each stimulus class. To be able to quantify reconstruction in a single number, we computed the relative reconstruction error E for each experiment as follows where 〈·〉 denotes the time average for the complete duration of each experiment, whereas s̄*(t) symbolizes the average estimate at time t computed from all eight single-trial estimations. E thus relates the mean square difference between stimulus and estimate to the variance of the stimulus. An error of E = 0 is equivalent to a perfect reconstruction, whereas a reconstruction error of E = 1 implies that no aspect of the stimulus could be recovered from the neuronal response (Edin et al. 2004).
Determining the optimal spike count window
The reconstruction quality depends on the duration τ of the integration time window used to count the spike numbers. Small values of τ result in few spikes within the sliding window, yielding estimations reflecting fast variations in the response, which may not be present in the stimulus. On the other hand, large values of τ lead to large errors because the stimulus varies faster than the width of the sliding window. Consequently, the optimal integration time window lies somewhere in between these extremes. We determined the optimal value of τ separately for each experiment by minimizing the relative error E and obtained τ1 = 95 ms, τ2 = 79 ms, and τ3 = 83 ms. All results presented in this study have been computed using these optimal τ values.
Cluster analysis of tuning functions
To identify neurons showing direction sensitivity for the given axis of stimulus motion, we classified individual cell responses according to the form of their tuning curves, using a k-means clustering algorithm. The algorithm's feature vectors were nine-dimensional, consisting of the average firing rates fi(vj) of neuron i for each of the velocity levels vj. For normalization, we divided fi(vj) by the maximum of fi. Without this normalization, classification resulted in a separation only according to the overall activity of a cell, disregarding the form of its tuning function. The k-means algorithm requires predefining the number of cluster centers that are used for classification. We tested cluster numbers in the range from two to seven. A clear separation into distinct groups, which combined neurons preferring the same movement direction, either positive or negative, was obtained with a cluster number of four. This number also yielded the lowest average distance from the tuning curves to the respective cluster centers. Choosing two or three centers resulted in higher distance values and a mixture of direction-sensitive and -nonsensitive neurons within the same clusters. When more than four cluster centers were chosen, only four of them contained data, whereas the others remained empty. Thus the most reasonable grouping obtained with the k-means method was to cluster tuning functions into four classes. We confirmed the stability of the classification by repeating the clustering procedure, each time obtaining identical assignments of single cells to the respective groups. For illustration, we show the tuning functions of the neurons with minimal distances to the four cluster centers computed by the k-means algorithm (Fig. 3A).
We also applied the clustering algorithm to the combined velocity/acceleration tuning functions, but in this case obtained no obviously distinct types. Thus we kept the classification of direction-sensitive and symmetrically tuned responses for the analysis of acceleration encoding.
We recorded ganglion cell action potentials in isolated turtle retinae with a multielectrode array. Retinae were optically stimulated with the pattern shown in Fig. 1B, which moved back and forth along a fixed axis and changed its velocity instantaneously every 500 ms. Pseudorandom sequences of velocity transitions were presented repeatedly. Based on the recorded action potentials, we determined the capacity of retinal ganglion cells to represent stimulus velocity and acceleration.
First, we computed the instantaneous population firing rate by summing the action potentials of all ganglion cells within a sliding time window. The time course of the population activity and the corresponding stimulus speed are shown in Fig. 1A. As can be seen, retinal activity depended on stimulus speed. Most obviously, activity decayed to almost zero if the stimulus did not move at all, and rose steeply at motion onsets (filled arrowheads in Fig. 1A). In general, within the chosen velocity range, slower movement seemed to evoke lower firing rates than fast movement. Additionally, peak firing rates were reached when the difference between previous and new speed was large, i.e., the acceleration of the stimulus was high. After such peaks, the retinal activity decayed to a plateau level, which seemed to be characteristic for the stimulus speed. In the case of speed decreases, the retinal population firing rate sometimes showed small undershoots before rising to the plateau level (open arrowheads in Fig. 1A). Such undershoots occurred only if stimulus speed did not switch to zero.
Thus the population activity of ganglion cells clearly depends on both the stimulus speed and acceleration. To further investigate this dependence, and in particular consider the contributions of single cells individually as opposed to pooling their responses, we applied the method of Bayesian inference, as used by Zhang et al. (1998). We first describe the case of estimating stimulus velocity alone, before later combining velocity and acceleration within the reconstruction process.
Due to photoreceptor integration time and synaptic delays, retinal ganglion cells responded with a certain time lag to the actual changes in stimulus motion. This response latency was particularly evident in the population firing rate signal when the stimulus started to move again after a period of zero velocity (Fig. 1A, filled arrowheads). Latency has to be compensated by temporally shifting the response relative to the stimulus (see methods) because, otherwise, the Bayesian algorithm cannot correctly assign the neuronal responses to the stimuli, resulting in a reduction of performance in the reconstruction process and consequently in underestimating the encoding capabilities of the ganglion cells. It is well known that the latency of retinal ganglion cell responses is not fixed but decreases with increasing stimulus contrast (Bolz et al. 1982; Greschner et al. 2006). Similarly, we found that the onset of maximum speed motion on average evoked population responses 13 ms earlier than the onset of movement with minimum absolute velocity (Fig. 1C). The time shift Δt for latency compensation was defined by the population response to the fastest stimulus movement. For the three experiments, we found Δt1 = 53 ms, Δt2 = 54 ms, and Δt3 = 56 ms, respectively. These values were used to temporally shift the spike trains relative to the stimulus before applying the Bayesian reconstruction algorithm to account for the minimum processing time needed by retinal neurons presynaptic to the RGCs.
Having determined the response latency, we were able to relate the stimulus velocity to its corresponding neuronal response—i.e., the vector consisting of the number of spikes generated by each cell. The average number of spikes in response to each velocity level yielded the cell's tuning curve. For each time step, the tuning curves were weighted with the present spike numbers and the resulting single cell “votes” were combined with each other and with the prior distribution of velocities. The velocity that maximizes the resulting posterior distribution most probably evoked the neuronal response and was therefore chosen as the estimate of the actual stimulus, an approach known as maximum a posteriori (MAP) estimation. Figure 2A shows an example of the actual stimulus velocity and the corresponding average velocity estimate with its SD resulting from eight stimulus repetitions. Despite occasional discrepancies, the average estimate clearly followed the applied stimulus velocity. Movement direction, represented by positive and negative velocity values, was reconstructed robustly most of the time. The algorithm sometimes spuriously reversed the sign of the velocity immediately after the stimulus changes. During these short intervals, single-trial velocity estimations did not necessarily agree with each other, as indicated by the large SDs, and the direction of the reconstructed movement was sometimes inverted compared with the actual stimulus.
The reconstruction quality for the complete duration of the experiment can be represented by the relative reconstruction error E between the actual and average estimated stimuli (see methods). Consistent with previous results (Zhang et al. 1998), we found an optimal value for the spike count window size τ to minimize this error. Using the optimal τ value for each experiment yielded a mean reconstruction error of E = 0.29, with an average τ = 86 ms.
For repetitive stimulation, the values of the estimated velocity v* scattered around the actual value of v, due to the variability of neuronal responses. We characterized this scatter by computing the conditional distribution of the estimates P(v*|v) for all single-trial estimations in each experiment according to Stocker and Simoncelli (2006) (Fig. 2B). The peak of the distribution P(v*|v) was always located at the correct velocity v, indicating that reconstruction was sufficiently precise during the complete experiment. The exact absolute values of the mean estimated velocities for each distribution P(v*|v), however, were consistently smaller than the actual velocities (Fig. 2C). This was a combined effect of two mechanisms. First, spurious inversion of the estimated movement direction shifted the average estimate toward zero. These mistakes in direction estimation are also reflected in the conditional distribution of estimates (Fig. 2B) because correct reconstruction of the absolute velocity—while at the same time confusing the movement direction—increased the probability values along the diagonal from top left to bottom right, perpendicular to the main diagonal of perfect estimation. The mistakes in direction estimation were also responsible for the fact that the scatter around the peak estimate increased with the absolute value of velocity. The second mechanism inducing a bias toward smaller speed when the real speed was at maximum is the prior of the Bayesian estimator, which restricts estimates to maximum possible values. Additionally, this restriction caused the scatter around the maximum absolute velocities to be underestimated. However, despite the occasional estimation errors, this analysis clearly shows that the spikes elicited by a population of turtle retinal ganglion cells contain information about the stimulus velocity, including the direction of motion.
Contribution of individual responses to overall velocity estimation
Having found that populations of retinal ganglion cells signal stimulus velocity, we next tried to determine whether certain cells contributed more to this capability than others. Consistent with the findings of Zhang et al. (1998), we found that on average, the relative error was reduced when more cells were used for the reconstruction. For the available maximum cell numbers the reconstruction quality did not saturate (not shown).
Applying a k-means clustering algorithm, we next analyzed whether the single-cell tuning curves fi(v) differ appreciably in their shape. We found that there are four major classes. Cells within classes one and two were sensitive to the direction of motion along the given axis because their spiking activity during movement in one direction outweighed the activity elicited when the stimulus pattern moved in the opposite direction. Specifically, class one combined cells preferring motion with negative velocities, whereas cells in class two responded best to movement with positive velocities. In contrast, responses of cells within classes three and four increased with larger speed, but did so irrespective of the direction of motion. Cells in class four differed from those in class three only in their higher spontaneous firing rate when the pattern did not move (v = 0). Single neurons were reliably assigned to the same respective classes when the clustering procedure was repeated, confirming the existence of well-distinguishable types of tuning curves. Figure 3A shows one representative tuning curve from each of the four classes. In the following, cells belonging to the first and second classes were combined within the group of direction-sensitive cells, whereas cells from the third and fourth classes were assigned to the group of symmetrically tuned neurons. On average, 39% of all ganglion cells we recorded belonged to the direction-sensitive group for the given axis of stimulus motion.
To determine whether the encoding capabilities of single neurons correspond to their tuning type, we reconstructed the stimulus as described earlier, but this time, estimation was based on individual responses alone. For each cell i, we calculated two separate relative reconstruction errors: Ei|v| for the absolute velocity and Ei± for the direction of motion. These are shown in Fig. 3B, with the tuning type of each neuron indicated by different symbols. For purely random stimulus estimation, an Ei value of 1 is expected. Error values were widely scattered due to the large variations in firing activity for different cells. Cells that were rarely spiking yielded Ei|v| values >2, because the reconstruction algorithm associated all intervals in which no spikes occur with velocity zero. However, several neurons achieved Ei values <1 for one of the two response properties. From the population shown in Fig. 3B, twice as many symmetrically tuned cells than direction-sensitive neurons yielded a low error value for absolute velocity estimation Ei|v| <1. For direction estimation, threefold more direction-sensitive cells than symmetrically tuned neurons were bound to have an error Ei± <1. Experiments 1 and 3 yielded similar ratios. Thus one could argue that the two groups are specialized in encoding two distinct stimulus properties. However, this is not strictly true, as we will show next.
To compare the encoding capacities of a whole group of direction-sensitive neurons on the one hand and the symmetrically tuned cells on the other hand, we estimated the velocity signal by using the combined responses of cells within the respective group. Naturally, the overall error increased as compared with the reconstruction using all cells. For the symmetric tuning group, the reconstruction quality decreased on average to 2.44 times the error value obtained by the complete population. Without the direction-sensitive cells, responses of the remaining cell population often allowed a correct reconstruction of the absolute velocity, but not of the movement direction. Compared with the entire population (Fig. 2B), the erroneous direction estimation of the symmetric tuning group is reflected in P(v*|v) by higher probability values along the diagonal from top left to bottom right, perpendicular to the main diagonal of perfect estimation (Fig. 3C). As a consequence, the SD for large speeds was also increased. Surprisingly, however, reconstruction of the velocity signal using only the responses of direction-sensitive neurons did not show comparable inaccuracies regarding the absolute speed (Fig. 3D). To the contrary, absolute speed was estimated sufficiently well in addition to the movement direction. At first glance, the conditional distribution of estimates looks like the one obtained with the full population (Fig. 2B). This impression was corroborated by the fact that the relative reconstruction error of the direction-sensitive group was on average just 1.14 times higher than the value achieved by the complete population, despite the fact that only about one third of all cells contributed to the estimation. We conclude that the direction-sensitive cells encode both absolute velocity and direction of movement well when considered as a population, whereas including the symmetrically tuned neurons into the estimation process further improves reconstruction quality.
The pooled population firing rate shown in Fig. 1A suggested that not only speed, but also changes in speed (i.e., stimulus acceleration) are encoded by the retinal ganglion cell responses. Thus we added an acceleration signal to the Bayesian reconstruction procedure as a second dimension of the stimulus vector s(t) := [v(t), a(t)], thereby simultaneously estimating velocity and acceleration from the spike numbers generated by the retinal ganglion cells. This estimation of a two-dimensional combination of stimulus properties was performed analogously to the method introduced by Zhang et al. (1998), who reconstructed a two-dimensional quantity as well, i.e., the position s = (x, y) of a rat in a maze.
The question arose how to choose the acceleration signal to be reconstructed. In particular, two aspects had to be considered: first, how to include the direction information in the acceleration signal and, second, how to relate the nearly instantaneous stimulus accelerations to the neuronal responses, which are characterized by peaks in the firing rate that last about 100 ms.
If direction is taken into account, the maximum change in velocity that could possibly occur is a reversal from the fastest speed in one direction to the fastest speed in the opposite direction. However, these events did not trigger the largest peaks in the population firing rate. Much stronger phasic responses were evoked by increases in absolute velocity, e.g., the transition from zero velocity to the highest absolute speed level. Therefore we included these changes in absolute velocity in the estimation process. However, this would result in zero acceleration if the motion stimulus reversed direction without changing absolute speed, even though ganglion cells responded to this situation with intermediate activity. Reconstruction worked reasonably well if these cases were ignored, but estimation quality could still be improved by 15% on average by adding a constant component to the original acceleration signal each time the sign of velocity was inverted (see methods).
To relate instantaneous accelerations to the temporally extended neuronal responses, the physical acceleration signal, a series of events each lasting just 1 ms, needed to be smoothed. This was essential to supply the Bayesian reconstruction algorithm with appropriate data. Without smoothing, neuronal responses would be associated with zero acceleration almost all of the time, preventing the computation of meaningful tuning curves. We convolved the sequence of acceleration events with a function resembling the transient peaks observed in the population firing rate (see methods for details).
As shown in Fig. 4A, the acceleration signal a(t) was estimated successfully as the second stimulus dimension, in addition to the original velocity time course. Latency Δt and spike count window length τ were adopted from the one-dimensional case. Timing and amplitude of the peaks were captured well, with low variation across stimulus trials. In particular, the larger peaks were estimated almost perfectly. Reductions in stimulus speed, i.e., negative accelerations, were only weakly reproduced. This is due to the low spontaneous activity in turtle retinal ganglion cells, which prevented the firing rates from decreasing below baseline when the stimulus decelerated. Nevertheless, the timing of decreases in speed was usually visible in the reconstruction, even if its amplitude was underestimated. These qualitative observations were quantitatively confirmed by the conditional probability of estimates P(a*|a) (Fig. 4B) and the corresponding average values for all experiments (Fig. 4C). There was only a tendency to recognize negative accelerations, whereas high positive accelerations were reconstructed reliably.
Interestingly, acceleration could not be inferred satisfactorily on its own as a single stimulus dimension. In this case, estimation quality was poor because the acceleration signal was overestimated after approximately 200 ms of constant velocity motion. The reconstruction algorithm misinterpreted the sustained firing of ganglion cells in these periods as an ongoing acceleration because this was the only association possible in the case of a one-dimensional acceleration reconstruction. In contrast, the combined estimation using both velocity and acceleration enabled the algorithm to correctly interpret intermediate firing rates during tonic activity as the combination of nearly zero acceleration with medium absolute velocity. This resulted in a reduction of the average relative reconstruction error for the acceleration from Ea = 0.80 to Ea = 0.53, corresponding to the reasonable quality of the estimated signal shown in Fig. 4. Interestingly, the combined estimation did not corrupt the reconstruction of the velocity signal, but even seemed to improve it. This was particularly evident in the first experiment, where the reconstruction error was reduced to E1v = 0.29 (82.4% of its one-dimensional value). For the other experiments, reconstruction was also slightly improved to 99.6 and 93.0% of the one-dimensional values.
Contribution of individual responses to acceleration estimation
Analogously to the analysis for the velocity reconstruction, we next examined the single-cell encoding capabilities with respect to velocity and acceleration. Inspection of the combined tuning functions for velocity and acceleration (Fig. 5A) revealed that most cells generated the strongest activity for combinations of both high speed and high acceleration, as was expected from the population firing rate. Positive accelerations generally evoked higher average spike numbers than did negative ones. Apart from that, no asymmetries in acceleration tuning were obvious.
Reconstructing velocity and acceleration from single-neuron responses yielded the individual error values illustrated in Fig. 5B. In the velocity dimension, ninefold more direction-sensitive cells than symmetrically tuned neurons achieved reconstruction errors Eiv <1, which again confirmed that these neurons are capable of sufficiently encoding both speed and direction of movement. In the acceleration dimension, no clear correspondence between single-cell reconstruction error Eia and tuning type was visible. Only 1.25-fold more symmetrically tuned than direction-sensitive cells had error values Eia <1.
Figure 5, C–F illustrates the results of estimating both velocity and acceleration from either the responses of the direction-sensitive or the symmetrically tuned subpopulation. For both groups, we show the conditional probabilities of estimates for the velocity and the acceleration dimension. Velocity estimation yielded results analogous to those obtained in one-dimensional reconstruction: increased direction errors for the symmetric tuning population (Fig. 5E) and reliable velocity estimation in the direction-sensitive group (Fig. 5C). Regarding the second stimulus dimension, high accelerations were estimated almost perfectly well by both groups, despite the restrictions in cell number. The symmetric tuning group (Fig. 5F) achieved a relative reconstruction error of on average 1.11-fold the value of the complete ensemble of cells. The reconstruction based on the smaller group of direction-sensitive neurons (Fig. 5D) turned out to be only slightly worse with 1.19-fold the full population error, averaged over all three experiments. Thus we conclude that the population of direction-sensitive cells was on its own capable of representing all three major motion properties: direction of movement, absolute velocity, and acceleration.
Description of ganglion cell responses by combined tuning
The derivation of the likelihood functions for the Bayesian reconstruction relied on the assumption of the Poisson statistics of a cell's spike generation. In other words, the instantaneous firing probability of a cell was assumed to be completely specified by how well the present stimulus matches the cell's tuning. To test whether action potential generation of each retinal ganglion cell was indeed characterized by sensitivity to both velocity and acceleration (i.e., by a two-dimension tuning function), we reversed the perspective on the problem and generated spike trains artificially. These artificial responses strictly obeyed the empirically determined tuning of the cells and lacked any temporal structure except the one induced by the stimulus time course. This was done for both the one-dimensional case, in which artificial cells were sensitive only to stimulus velocity, as well as for the two-dimension condition that took both velocity and acceleration tuning into account. We next compared the resulting artificial population response to the experimental data (Fig. 6).
When action potentials were artificially generated based on one-dimensional velocity tuning alone, the amount of tonic activity in these surrogate spike trains was comparable to the plateau activation in the original data. However, the transient overshoots after velocity switches were captured only if the artificial spike trains were created according to the combined velocity/acceleration tuning. In this case, the mean squared deviation between surrogate activity and the experimental population firing rate was reduced to about 50% of the deviation obtained with artificial data based on velocity tuning alone. This increase in similarity adds further evidence to our notion of ganglion cells being sensitive not only to either velocity or acceleration, but to a combination of both features. Still, the original data showed even more pronounced transients. This may be due to spike generation in bursts or spike rate adaptation processes that allowed for high firing rates immediately after low activity before their regulatory effect set in. In conclusion, the comparison of experimental and surrogate population activity indicates that the description of ganglion cell spike generation is significantly improved if the combined sensitivity for velocity and acceleration is taken into account.
The aim of the present study was to demonstrate that retinal ganglion cells directly encode multiple aspects of visual motion. By reconstructing speed, direction, and acceleration of a moving stimulus, we could show that turtle retinal ganglion cells simultaneously transmit information about these stimulus aspects by their instantaneous firing rates. We found that, for uniform global image motion with abrupt changes in speed and direction, the responses of direction-sensitive neurons in the turtle retina contain sufficient information about acceleration and absolute velocity in addition to signaling motion direction.
Bayesian stimulus reconstruction
Stimulus reconstruction has been applied successfully in many studies to investigate the encoding capacities of neural systems, and numerous methods have been developed to estimate the stimulus that most probably elicited a particular neuronal activity, e.g., reverse correlation (Bialek et al. 1991; Stanley et al. 1999; Warland et al. 1997) and population vector averaging (Georgopoulos et al. 1986). In the context of retinal velocity encoding, cross-correlation detectors have been used to estimate the speed of a moving bar, relying on the relative timing of RGC responses not intrinsically tuned to velocity (Frechette et al. 2005). Here, we estimate movement speed, direction, and acceleration based on ganglion cell spike trains by applying Bayesian inference (Sanger 1996, 2003), a reconstruction technique originating from statistical estimation theory, which is superior to basis function methods when evaluating neuronal data (Kass et al. 2005; Zhang et al. 1998).
The most obvious advantage of using Bayesian inference for stimulus reconstruction is the possibility of including prior knowledge about the stimulus ensemble into the estimation process. We used this feature when simultaneously estimating acceleration and velocity. In this case, certain entries in the prior distribution equal zero, representing knowledge about which combinations of stimulus features were impossible to occur. However, we did not observe consistent effects on reconstruction quality when varying the prior distribution of velocity probabilities.
A second advantage of the Bayesian method is that it allows reconstruction when tuning properties are not obvious to human observers. This feature of the Bayesian approach is particularly evident for the reconstruction of acceleration. In contrast to the one-dimensional case of velocity sensitivity, where the tuning curves are familiar and easily interpreted, cells do not show a similar dependence on acceleration alone. The combined two-dimensional velocity/acceleration tuning is more complicated to interpret, but the successful reconstruction demonstrates that the single-cell tuning curves must display a dependence on both stimulus dimensions.
However, the reasonable application of Bayesian inference to a restricted amount of neuronal data like the one we used here requires some simplifying approximations. Most prominently, these are the assumptions that retinal ganglion cells generate action potentials according to Poisson statistics and do so independently of each other.
Even though refractoriness in ganglion cell spike generation makes the Poisson assumption unrealistic, we found that artificially generated Poisson spike trains resemble experimentally measured RGC responses, if they are modeled according to combined velocity and acceleration tuning (Fig. 6). Moreover, our Bayesian investigation method is based on spike counts obtained in intervals that are at least one order of magnitude larger than the absolute refractory period. Thus the temporal fine structure of the responses is not considered, which makes the assumption of Poisson firing a justified simplification for our analysis.
The assumption of uncorrelated responses would also most probably not be supported by a statistical test. However, even though the Bayesian approach ignores additional information that could be present in correlations, it still allows an elegant combination of responses of multiple neurons. Combined tuning properties of the entire cell population result in a reconstruction that is far more accurate than when single cells would have been evaluated separately.
Thus although we did not test whether the assumptions of Poisson firing and independence of cell responses apply to the RGC responses we measured, we are aware that these are nonideal simplifications. However, we argue that the estimation algorithm may serve as a black box model whose components may be chosen arbitrarily, as long as it accurately reconstructs certain stimulus properties. Because this is the case for direction, speed, and acceleration of movement in our experiments, one can conclude that at least these properties have to be encoded by the instantaneous firing rates of RGCs on which the estimation is based.
To our knowledge, acceleration-sensitive neurons have not yet been described in the vertebrate retina. Acceleration sensitivity of single neurons has been found in the pigeon brain, where one third of motion-sensitive cells in the pretectal nucleus encode stimulus acceleration (Cao et al. 2004). Like the retinal ganglion cells described here, pigeon acceleration cells do not exclusively respond to sharp increases or decreases in stimulus speed, but are also tonically active during uniform stimulus motion. However, their firing rate is constant for a large range of velocities, indicated by a plateau-shaped velocity tuning curve. If these cells nevertheless alter their firing during speed changes, the resulting changes in activity can unambiguously be associated with the acceleration of the stimulus. We cannot exclude the existence of ganglion cells with plateau-shaped speed tuning, due to the restricted range of stimulus velocities used in our study. However, for the velocities tested here, we found the same neurons to be both velocity and acceleration sensitive, indicating that retinal acceleration encoding does not rely on a subpopulation of specialized cells. Rather, the transient activity after a velocity switch appears to carry information about the difference in speed. Consistent with our findings, Cao et al. (2004) described a strong transient activation of acceleration-sensitive cells, followed by a sustained component. In these pretectal cells, the ratio of the transient to the sustained component of activity is related to the amount of acceleration. Thus in the pigeon brain, acceleration information is also present in the transient activity of certain cells, similar to what we found for turtle retinal ganglion cells.
The mixture of acceleration and velocity tuning within the same cells possibly originates from effects of short-term adaptation. Motion-sensitive ganglion cells stimulated by a rapid transition from slow to fast motion initially respond vigorously to this event. During the subsequent constant motion, the firing activity drops to a plateau level characteristic for the present motion velocity. This spike rate adaptation may provide the ganglion cells with a band-pass filter characteristic for the transmission of velocity information. Consistent with this finding, the transient component of activation seems to be more pronounced for larger differences between the previous and the present velocity. Figure 5, D and F shows a tendency for higher accelerations being reconstructed more precisely than lower ones, indicating that the transient activity at velocity switches is indeed crucial for the acceleration estimation. It is conceivable that ganglion cells primarily tuned to velocity are capable of transmitting information about image acceleration as well, as long as the timescale of individual velocity transitions is not too slow compared with the band-pass characteristics of the ganglion cells. Thus we propose that acceleration encoding in ganglion cells, as it was found for our stimulation, should generalize at least to other viewing conditions that contain similarly abrupt changes in stimulus velocity.
When designing our stimuli, we applied nearly instantaneous velocity transitions to obtain strong transient activation. Because the actual accelerations were confined to very short intervals of time, they did not yield a sufficient amount of data to allow the computation of average firing rates and acceleration tuning. Therefore we had to use a transformation of the physical acceleration for our estimation and decided to use a smoothed acceleration signal, which was constrained by the time course of the pooled firing rate of all cells. The time constant of the temporal smoothing kernel was chosen to obtain a function resembling the transient peaks observed in the population firing rate, but the exact shape of the kernel did not significantly affect the reconstruction results. Additionally, an additive component within the acceleration signal accounted for the responses of direction-sensitive cells to reversals of motion direction. The exact choice of the smoothed acceleration signal may seem somewhat arbitrary, but according to the black box interpretation of the estimation algorithm described earlier, the successful reconstruction of the signal nevertheless yields valuable conclusions concerning the encoding capacities of retinal ganglion cells.
To initialize eye movements as they are performed during optokinetic responses to large-field motion, estimates of retinal slip speed and movement direction are needed (Priebe et al. 2001). A fast and reliable inference of speed differences may also be important for these optokinetic responses, analogous to the case of pursuit eye movements, where acceleration estimation has been shown to overcome the limitations of delayed feedback (Churchland and Lisberger 2001). In primate middle temporal visual area, acceleration information can be decoded from a population of neurons, although individual cells show no obvious relation to acceleration (Lisberger and Movshon 1999). In the turtle, a population of direction-sensitive retinal ganglion cells projects directly to the animal's accessory optic system (AOS), which contains important circuitry for optomotor reflexes (Ariel 1990; Rosenberg and Ariel 1991). For these cells, the combined transmission of information about various motion properties, like speed, direction, and acceleration, could provide a very useful coding strategy, especially in view of the rather small number of about 1,500 ganglion cells projecting to the turtle AOS (Reiner 1981).
Direction-sensitive, symmetrically tuned, and other retinal ganglion cells
Our analysis revealed that no specialized subpopulations of retinal ganglion cells are needed to separately encode direction, velocity, and acceleration of motion stimuli. In contrast, a population of cells with direction-sensitive tuning encodes all of these stimulus properties in combination. However, this does not mean that the remaining population members generate redundant responses. The precision of velocity estimation is considerably improved if direction-sensitive responses are augmented with responses that are speed tuned, but insensitive to direction. One would expect that at least some of the cells that we found to possess symmetric velocity tuning are actually direction sensitive as well, but with their preferred axes being approximately perpendicular to the one chosen in our experiment. This implies that for another axis of motion, the direction-sensitive cells and the neurons with symmetrical tuning swap roles. If this hypothesis holds true, velocity encoding is enhanced by combining responses from the entire population of direction-sensitive ganglion cells, irrespective of whether motion is along their axes of preferred motion direction.
The turtle retina has been shown to possess as many as 24 different types of ganglion cells, based on morphological and physiological characteristics (Ammermüller and Kolb 1995). Naturally, not all of these types are involved in the processing of movement information. Fortunately, the Bayesian estimation algorithm is not impaired if the population contains cells that are insensitive to motion, keeping their discharge rates constant in the presence of abrupt changes in speed and movement direction. These cells would simply contribute uniform likelihoods, without shifting the resulting maximum of the posterior distribution. However, the fact that on average, including more responses yields better reconstruction qualities indicates that most neurons within our ensemble are indeed sensitive to motion.
Nevertheless, one could imagine that some cells that contribute to the encoding of motion properties are even more sensitive to other stimulus properties, which we did not modify, e.g., stimulus light intensity, size, or wavelength. Within the present paradigm, we cannot check this possibility because velocity was the only stimulus property that varied with time. For instance, varying light intensity in addition to stimulus velocity would in principle allow the estimation of an additional stimulus aspect. Although acceleration estimation was possible only in combination with velocity, and inclusion of the acceleration dimension actually improved precision of velocity inference as well, the advantages of combined estimation are probably due to the close relationship of these two stimulus features, acceleration being the first derivative of velocity. Considering only the firing rates of individual neurons would most likely increase the ambiguity when associating neuronal responses with multidimensional stimuli. Therefore it seems more promising to modify the Bayesian algorithm by incorporating other response properties, possibly solving ambiguities or otherwise improving stimulus reconstruction, e.g., by enabling a faster reconstruction. Of these properties, response latency may be a reasonable choice, given that latency encoding has been proposed to be superior to stimulus decoding based on firing rates (Johansson and Birznieks 2004; Van Rullen et al. 2005).
This work was supported by Deutsche Forschungsgemeinschaft Grants DFG-KR 2925/1–2 and FOR 701.
We thank J. Shelley for critical reading of the manuscript.
The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
- Copyright © 2007 by the American Physiological Society