To make efficient use of their limited signaling capacity, sensory systems often use predictive coding. Predictive coding works by exploiting the statistical regularities of the environment—specifically, by filtering the sensory input to remove its predictable elements, thus enabling the neural signal to focus on what cannot be guessed. To do this, the neural filters must remove the environmental correlations. If predictive coding is to work well in multiple environments, sensory systems must adapt their filtering properties to fit each environment's statistics. Using the visual system as a model, we determine whether this happens. We compare retinal ganglion cell dynamics in two very different environments: white noise and natural. Because natural environments have more power at low temporal frequencies than white noise does, predictive coding is expected to produce a suppression of low frequencies and an enhancement of high frequencies, compared with the behavior in a white-noise environment. We find that this holds, but only in part. First, predictive coding behavior is not uniform: most on cells manifest it, whereas off cells, on average, do not. Overlaid on this nonuniformity between cell classes is further nonuniformity within both cell classes. These findings indicate that functional considerations beyond predictive coding play an important role in shaping the dynamics of sensory adaptation. Moreover, the differences in behavior between on and off cell classes add to the growing evidence that these classes are not merely homogeneous mirror images of each other and suggest that their roles in visual processing are more complex than expected from the classic view.
Sensory systems must confront the problem that their signaling capacity is limited. To make efficient use of this capacity, it has been hypothesized (Barlow 1961) that sensory systems exploit the fact that the environment has statistical regularities and is therefore in part predictable. The predictable aspects of the environment need not be explicitly represented in the neural stream since they are, by definition, not informative. By ignoring the predictable aspects of the environment, neurons can focus their signaling capacity on what is not predictable and thus convey sensory information in an efficient manner. However, this view immediately raises a question: as the organism moves from one environment to another, the statistical regularities of the sensory input change. For the efficient coding hypothesis to hold, the sensory system must adapt appropriately to these changing statistics. Here we ask, does this happen?
Changes in the statistics of a sensory environment may consist of a change in the range of intensities, a change in its correlation structure, or both. Both kinds of changes are known to produce shifts in response properties that are at least qualitatively consistent with the efficient coding hypothesis. When the range of stimulus intensities increases, the neural response gain decreases (Brenner et al. 2000; Fairhall et al. 2001; Gaudry and Reinagel 2007a; Shapley and Victor 1978; Wark et al. 2009). When the correlation structure of the environment changes, neural filtering properties change, both in space (Hosoya et al. 2005; Lesica et al. 2007; Sharpee et al. 2006) and in time (Hosoya et al. 2005; Lesica et al. 2007). Our focus in this study is on the latter.
When one considers correlations, the efficient coding hypothesis translates into a design principle for sensory filters, known as “predictive coding” (Atick and Redlich 1990; Dan et al. 1996; Srinivasan et al. 1982): the design principle is that efficient coding is achieved by filters that remove the correlations in the sensory stream. The implication of this idea is that as an animal moves between environments with different correlation structures, the filters must change so that efficient coding is maintained.
With this in mind, and a focus on temporal correlations, we examined the dynamics of mouse retinal ganglion cells in environments with two very different kinds of temporal correlation structure: natural scenes and white noise. Qualitatively, and consistent with the results of others (Hosoya et al. 2005; Lesica et al. 2007), we find that filtering properties change in a manner that tends to maintain predictive coding across conditions. When switched between environments, neurons shift their dynamics, becoming more high-pass in the naturalistic environment than in white noise.
However, we also identify several aspects of the adaptation of retinal dynamics that are not anticipated from predictive coding. Most prominently, adaptation is not uniform across cell classes: in naturalistic conditions, the on cell population reduces its gain by a factor of 10 at low frequencies, whereas the off cell population, on average, reduces its gain by a factor of only 2. At high frequencies, on cells increase their gain by a factor of almost 2, whereas off cells show no increase. We find no statistical differences between bright and dark in the naturalistic stimulus to account for this difference between cell classes.
In addition to the differences in adaptive behavior between on and off populations, adaptive behavior is also not uniform within the populations. As we will show, this heterogeneity is nontrivial: it does not arise merely because there are cells that are intermediate in behavior between on and off.
In sum, we find that retinal dynamics shift in a manner that is consistent with the demands of predictive coding, but only in part. on and off cells are not simply mirror images of each other, and the way that they adapt to changes in stimulus statistics accentuates this asymmetry and reveals heterogeneity within the cell classes.
Preparation and recording
Recordings from central retinal ganglion cells of the isolated mouse retina were obtained via a flat array of 64 microelectrodes as described in Dedek et al. (2008). Briefly, spike waveforms were recorded using a Multichannel Neuronal Acquisition Processor (Plexon Instruments, Dallas, TX). Two different standard spike-sorting methods were used: a window discriminator or a principal component analysis (PCA)–based waveform sorting algorithm implemented in Chronux (Mitra and Bokil 2008) based on the method of Fee et al. (1996).
The white-noise stimulus consisted of a pseudorandom binary sequence presented in a 20 × 18 stimulus array of 80 × 80-micron (2.6 × 2.6°) square checks, updated every 67 ms. The naturalistic sequence consisted of a movie of a ground-level animal's view of the landscape in New York City's Central Park. The luminance within each 67 ms block was averaged, rescaled to an 8-bit range [0, 255], and presented in the same array as was used for the white-noise stimulus. The two levels used for the binary stimulus were 92 and 202, chosen so that the two stimuli were matched for mean (148) and SD (54). The mean luminance for both stimuli at the retina was 0.24 μW/cm2. Each stimulus sequence was presented for 9,000 frames, lasting 600 s.
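As a minimal sketch of the mean- and SD-matching described above (Python here, with our own variable names), an equiprobable binary stimulus matched to a naturalistic luminance sequence can be constructed by placing the two levels at mean ± SD:

```python
import numpy as np

def matched_binary_levels(naturalistic):
    """Return two equiprobable binary levels whose mean and SD match
    those of a naturalistic luminance sequence (levels at mean ± SD)."""
    mu, sigma = naturalistic.mean(), naturalistic.std()
    return mu - sigma, mu + sigma

# an illustrative sequence with mean 148 and SD 54 (the values quoted above)
seq = np.array([94.0, 202.0, 94.0, 202.0])
lo, hi = matched_binary_levels(seq)
```

Because an equiprobable two-level distribution at mean ± SD reproduces both moments exactly, this construction yields levels close to the 92 and 202 used for the binary stimulus.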
In addition to the above-cited stimulus samples used for constructing the model, we measured responses to repeated presentation of a validation (out-of-sample) sequence. For white noise, this consisted of 60 to 200 repeats of a 90-frame (6 s) sequence (seven retinae); for the naturalistic stimulus (all 82 cells), this consisted of 70 to 180 repeats of a 90-frame sequence (one retina) or a 150-frame (10 s) sequence (eight retinae).
To compare ganglion cell dynamics in white-noise and naturalistic conditions in the context of the predictive coding hypothesis, we require a filter-based description of the stimulus–response relationship in each of the two environments. To provide these descriptions, we fitted separate linear-nonlinear-Poisson (LNP) models (Chichilnisky 2001) to data obtained under the two experimental conditions. The parameter-fitting strategy was based on the maximum-likelihood approach of Pillow et al. (2008) and was tailored to the needs of this study as described in the next section.
Neuronal firing is modeled as an inhomogeneous Poisson process whose intensity (firing probability) p(t) is the result of filtering the stimulus S by a linear filter L and then applying a nonlinearity N

p(t) = N[(L ∗ S)(t)]   (1)

where ∗ represents spatiotemporal convolution, formally defined by

(L ∗ S)(t) = ∫ dx ∫ dy ∫ dτ L(x, y, τ) S(x, y, t − τ)   (2)
Since the stimulus is constant on pixels of size Δx × Δy = 80 × 80 microns and frames of length Δτ = 67 ms, we replace this integral by a discrete sum

(L ∗ S)(t) = ∑_{i=1}^{T} ∑_{nx} ∑_{ny} L(nxΔx, nyΔy, iΔτ) S(nxΔx, nyΔy, t − iΔτ)   (3)

where i covers T = 18 time steps (1.2 s) and the spatial indices nx and ny each cover 10 contiguous integers, corresponding to a 10 × 10 array of pixels covering and approximately centered on the receptive field. Here, the range of stimulus intensities S is mapped to [−0.5, 0.5], with −0.5 corresponding to black and 0.5 corresponding to the maximum luminance, i.e., 0.42 μW/cm2.
To reduce the number of free parameters, we assume that the linear filter L is separable into a product of a temporal kernel Gtemp(τ), characterized by 18 parameters (one for each lag), and a spatial kernel Gspat(x, y), characterized by 100 parameters (one for each pixel) [Pillow et al. (2008) used two such terms]. To remove the ambiguity of multiplicative constants shared by these factors, the kernels Gtemp(τ) and Gspat(x, y) are normalized by ∑_{i=1}^{T} [Gtemp(iΔτ)]² = 1 and ∑_{nx} ∑_{ny} [Gspat(nxΔx, nyΔy)]² = 1, and the overall size of L is brought into a single factor kmult

L(x, y, τ) = kmult Gspat(x, y) Gtemp(τ)   (4)
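As an illustrative sketch of the discretized model of Eqs. 1, 3, and 4 (Python rather than the Matlab used in the study; the kernels below are random placeholders, not fitted ones):

```python
import numpy as np

rng = np.random.default_rng(0)

T_LAGS, NX, NY = 18, 10, 10            # 18 lags (1.2 s), 10 x 10 pixel patch

# illustrative (not fitted) kernels, unit-normalized as in Eq. 4
g_temp = rng.standard_normal(T_LAGS)
g_spat = rng.standard_normal((NX, NY))
g_temp /= np.sqrt(np.sum(g_temp ** 2))
g_spat /= np.sqrt(np.sum(g_spat ** 2))
k_mult = 3.0                           # overall gain factor k_mult

def lnp_intensity(stimulus, nonlinearity=np.exp):
    """Discrete LNP intensity (Eqs. 1, 3, 4). `stimulus` has shape
    (n_frames, NX, NY) with values mapped to [-0.5, 0.5]."""
    n_frames = stimulus.shape[0]
    # project each frame onto the spatial kernel ...
    spatial = np.tensordot(stimulus, g_spat, axes=([1, 2], [0, 1]))
    # ... then convolve causally with the temporal kernel
    gen = np.convolve(spatial, g_temp, mode="full")[:n_frames]
    return nonlinearity(k_mult * gen)

stim = rng.choice([-0.5, 0.5], size=(100, NX, NY))   # binary white noise
p = lnp_intensity(stim)
```

The separability assumption is visible in the code: the spatial projection and the temporal convolution are two independent one-dimensional operations, with a single multiplicative gain.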
To further reduce the number of free parameters, and to provide for a principled extrapolation of the temporal kernel to long times without incurring an artifactual transient after the longest lag explicitly considered (1.2 s), the temporal kernel was constrained to be a sum of 10 basis functions (Pillow et al. 2008). The first five basis functions ej (j = 1,…, 5) were set to 1 on the jth time bin (of length Δτ) and 0 elsewhere. The last five basis functions ej (j = 6,…, 10) were single lobes of raised cosine functions in logarithmic time

ej(τ) = (λj/2){1 + cos[(ln(τ/Δτ) − βj)/αj]}   (5)

for |ln(τ/Δτ) − βj| ≤ παj (and 0 otherwise), where αj ranged from 0.22 to 0.55, βj ranged from 1.8 to 2.6, and λj was chosen for unit normalization. This extrapolation had a negligible effect on the estimated transfer functions, since impulse responses had largely returned to zero at 1.2 s.
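A hedged sketch of such a basis follows (Python; the exact parameterization of the raised cosines may differ in detail from the one used in the study—here each lobe is a raised cosine in ln(τ/Δτ), centered at βj with half-width παj, in the spirit of Pillow et al. 2008):

```python
import numpy as np

def raised_cosine_basis(alphas, betas, n_lags=18):
    """Single-lobe raised cosines in logarithmic time (cf. Eq. 5), with
    lag expressed in units of the frame duration; this exact
    parameterization is an assumption made for illustration."""
    log_t = np.log(np.arange(1, n_lags + 1))   # ln(tau/dtau) for 1..18 frames
    basis = []
    for a, b in zip(alphas, betas):
        arg = (log_t - b) / a
        lobe = np.where(np.abs(arg) <= np.pi, 0.5 * (1.0 + np.cos(arg)), 0.0)
        basis.append(lobe / np.sqrt(np.sum(lobe ** 2)))  # lambda_j: unit norm
    return np.array(basis)

# alpha_j and beta_j spanning the ranges quoted in the text
B = raised_cosine_basis(np.linspace(0.22, 0.55, 5), np.linspace(1.8, 2.6, 5))
```

Working in logarithmic time places narrow lobes at short lags and broad lobes at long lags, which is what allows a smooth, taper-to-zero extrapolation beyond the last explicit lag.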
To allow for exploration of a wide range of kinds of neural responses, the nonlinearity N of Eq. 1 was parameterized as a cubic spline (i.e., a piecewise cubic polynomial with continuous second-order derivatives), which was not constrained to be monotonic. This allowed us to capture the behavior of neurons with on–off characteristics. As we show, this model also provided for accurate estimation of response dynamics under both white-noise and natural scene conditions (Figs. 1B and 2B), as is required to study adaptation.
Instances of the preceding model were independently fit to responses to white-noise and naturalistic stimulation by maximizing the likelihood of the observed responses. To carry out this fit, we began with an exponential nonlinearity N(x) = exp(x + k0), that is

p(t) = exp[(L ∗ S)(t) + k0]   (6)

because for this nonlinearity the log-likelihood has no nonglobal local maxima (Paninski 2004), thus facilitating the parameter estimation. We then replaced the exponential nonlinearity by an approximating 6-knot cubic spline. The model parameters were fit by coordinate ascent, i.e., alternating stages of maximizing the log-likelihood with respect to (i) the spline coefficients and (ii) the filter parameters, until a maximum was reached. The spline parameters were not restricted to monotonic nonlinearities. Empirically, this procedure did not incur a drift in the values of kmult or k0, even though in principle they are redundant with the spline parameters. This procedure was vetted by demonstrating that it properly recovered the parameters of model LNP systems (including systems with nonmonotonic nonlinearities) from their responses to both white-noise and naturalistic inputs (see Figs. 1B and 2B).
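As a minimal sketch of the starting point of this procedure—maximum-likelihood estimation under the exponential nonlinearity of Eq. 6, which makes the model a Poisson GLM with a concave log-likelihood—the following Python fragment fits a purely temporal kernel to synthetic spikes by gradient ascent. It is an illustration on a one-pixel stimulus, not the study's Matlab implementation:

```python
import numpy as np

rng = np.random.default_rng(1)
n_frames, n_lags = 6000, 18

# synthetic one-pixel binary stimulus and a known "ground truth" kernel
stim = rng.choice([-0.5, 0.5], size=n_frames)
lags = np.arange(n_lags)
true_k = np.exp(-lags / 3.0) - 0.5 * np.exp(-lags / 6.0)

# design matrix of lagged stimulus values (causal, lags 1..18)
X = np.zeros((n_frames, n_lags))
for i in range(n_lags):
    X[i + 1:, i] = stim[: n_frames - i - 1]

rate = np.exp(X @ true_k - 1.0)        # Eq. 6 with k0 = -1
spikes = rng.poisson(rate)

# gradient ascent on the Poisson log-likelihood (concave: no local maxima)
k_hat, k0_hat = np.zeros(n_lags), 0.0
for _ in range(500):
    r = np.exp(X @ k_hat + k0_hat)
    k_hat += 0.5 * (X.T @ (spikes - r)) / n_frames
    k0_hat += 0.5 * np.mean(spikes - r)
```

With sufficient data, k_hat converges to true_k; the full procedure then replaces the exponential with the spline nonlinearity and alternates the two maximization stages.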
Calculations were carried out in Matlab, using code adapted from that of Pillow et al. (2008).
Analysis of model parameters
To calculate the frequency response, we Fourier-transformed the temporal kernel Gtemp(τ) numerically. Since the impulse response was measured up to T = 1.2 s, the frequency resolution is bounded by 1/T ≈ 0.8 Hz. That is, the DC value of the frequency response, which is the average of the impulse response over the previous T = 1.2 s (18 time points), captures the average behavior over frequencies from 0 to 0.8 Hz.
In Figs. 1–4 we also show, for the interested reader, the frequency response at a step beyond the guaranteed resolution. To calculate this, we used the fact that the impulse response approaches zero at T = 1.2 s. We therefore tapered it to zero beyond T = 1.2 s by projecting it onto the basis functions indicated in Eq. 5 and calculated frequency responses by Fourier transformation over the interval from 0 to 4T. As a check, we also used zero-padding and found nearly identical results. Note that the conclusions of this study do not depend on knowledge of this fine structure.
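For a reader wishing to reproduce this step, the zero-padded variant of the calculation might look like the following (Python; the kernel shown is an illustrative biphasic shape, not a fitted one):

```python
import numpy as np

DT = 0.067                       # frame duration, s; 18 lags span T ~ 1.2 s

def frequency_response(g_temp, pad_factor=4, dt=DT):
    """Amplitude of the Fourier transform of a temporal kernel,
    zero-padded to pad_factor * T. Padding refines the frequency
    sampling; the guaranteed resolution remains 1/T ~ 0.8 Hz."""
    n = pad_factor * len(g_temp)
    amp = np.abs(np.fft.rfft(g_temp, n=n)) * dt   # Riemann-sum scaling
    freqs = np.fft.rfftfreq(n, d=dt)
    return freqs, amp

taus = DT * np.arange(1, 19)
kernel = np.exp(-taus / 0.1) - 0.5 * np.exp(-taus / 0.25)  # biphasic example
freqs, amp = frequency_response(kernel)
```

Note that padding only interpolates the spectrum; it adds no information beyond the 0.8-Hz resolution set by the 1.2-s kernel support.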
To measure the goodness of fit of the model, we compared the model's ability to predict the neuron's response with the neuron's intrinsic reproducibility. To measure the model's predictive ability, we used the variance explained by the model's prediction of the response to the out-of-sample validation trials (Chichilnisky 2001). To measure the neuron's intrinsic reproducibility, we compared the neural responses to the first and second halves of the validation trials. Our measure of goodness of fit (GOF) was the ratio of these quantities

GOF = V(M, R1) / V(R2, R1)   (7)

where V(A, B) is the fraction of the variance of B explained by A, M is the model response, R1 is the firing rate for the first half of validation-sequence trials, and R2 is the firing rate for the second half of validation-sequence trials. Firing rates R1 and R2 were calculated by convolving the raw spike train sequence with a Gaussian whose standard deviation was half the bin width, 33 ms. We report results for the goodness of fit to the natural scene stimulus, which was measured in all cells; similar results were obtained for the goodness of fit measured with the white-noise stimulus (53/82 cells).
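The GOF computation reduces to two variance-explained ratios; a minimal Python sketch (our own naming, assuming V(A, B) = 1 − var(B − A)/var(B)) is:

```python
import numpy as np

def frac_var_explained(pred, target):
    """V(A, B): fraction of the variance of `target` captured by `pred`."""
    return 1.0 - np.var(target - pred) / np.var(target)

def goodness_of_fit(model_rate, rate_half1, rate_half2):
    """GOF (cf. Eq. 7): the model's predictive power, normalized by the
    neuron's own trial-to-trial reproducibility."""
    return (frac_var_explained(model_rate, rate_half1)
            / frac_var_explained(rate_half2, rate_half1))
```

A GOF near 1 means the model predicts the validation response about as well as the neuron predicts itself across trial halves.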
The bias index (BI) is a measure of the overall balance of on and off inputs, defined by

BI = (RON − ROFF) / (RON + ROFF)   (8)

where RON is the peak firing rate following a step increase in luminance and ROFF is similarly defined for the offset transient. Note that for a cell that has only an on response (RON > 0, ROFF = 0), BI = 1. Similarly, for a cell that has only an off response, BI = −1.
RON and ROFF were measured by adapting the procedure of Carcieri et al. (2003) for use with random binary stimulation. Specifically, we first identified the “optimal” check—i.e., the one that was closest to the receptive field center and covered most of it. Then, to measure RON, we calculated a poststimulus time histogram (PSTH) triggered by portions of the stimulus sequence in which the optimal check was off for five frames (333 ms) and then on for two frames (133 ms). RON was taken as the maximum of the smoothed (Gaussian standard deviation 20 ms) PSTH in the window from 230 to 330 ms (see kernel functions in Figs. 1 and 2) following the onset transient. ROFF was analogously measured from segments of the stimulus sequence in which the optimal check was on for five frames and then off for two frames. With this method, the population distribution of BI values was nearly identical to that reported by Carcieri et al. (2003), who used an “optimal spot” rather than the optimal check used here.
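The BI computation (Eq. 8) and the trigger-finding step can be sketched as follows (Python; the function names are ours):

```python
import numpy as np

def bias_index(r_on, r_off):
    """Eq. 8: +1 for a pure ON response, -1 for a pure OFF response."""
    return (r_on - r_off) / (r_on + r_off)

def onset_triggers(check, n_off=5, n_on=2):
    """Start frames of stimulus segments in which a (binary 0/1) check
    was off for n_off frames and then on for n_on frames."""
    pattern = np.array([0] * n_off + [1] * n_on)
    w = len(pattern)
    return np.array([i for i in range(len(check) - w + 1)
                     if np.array_equal(check[i:i + w], pattern)])

check = np.array([0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 1])
starts = onset_triggers(check)     # qualifying segments begin at frames 0, 7
```

PSTHs triggered on `starts` (and on the complementary on-then-off pattern) would then give the RON and ROFF entering the bias index.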
Our goal is to determine whether ganglion cells adapt to changes in the statistics of their inputs, focusing on temporal changes. To do this, we used a model similar to that of Pillow et al. (2008) to characterize ganglion cell response dynamics in two environments: white-noise checkerboards and naturalistic stimuli with the same spatial discretization and the same mean and variance. This model describes the neural response as the result of three transformations: a linear spatiotemporal filter L, followed by a static nonlinearity N, followed by Poisson spike generation. Because the dynamics are described by the filter L, we focus on it.
We first consider individual example cells and then the behavior of the on and off cell populations as a whole. Data are presented from 82 ganglion cells (38 on, 44 off) in nine retinae.
Figure 1A shows that in typical on cells, a change in the temporal statistics of the sensory input induces a change in response dynamics. In the top row, we compare the temporal kernel of the linear filter measured under white-noise conditions (solid) and naturalistic stimulation (dotted). Visual inspection shows that the temporal kernel under naturalistic conditions has a larger initial transient and a smaller undershoot, suggesting that a change in dynamics has occurred. To understand the implications of this shape change for frequency tuning, we Fourier-transformed the temporal kernels (Fig. 1A, second row). The results showed that sensitivity is augmented at high frequencies (3 to 5 Hz), by a factor of 2 in these examples, corresponding to the larger initial transient. Sensitivity is strikingly attenuated at low frequencies, by a factor of up to 3 to 5 in these three examples and by a factor of 10 on average, as shown in the next section. This loss of low-frequency sensitivity corresponds to the change in the undershoot, resulting in an equalization of the area under the two lobes of the temporal kernel.
To be sure that this change represents an adaptation of ganglion cell dynamics, rather than a behavior built into the LNP model or a result of errors in estimating model parameters, we carried out a simple simulation: we used the LNP model obtained from white-noise conditions, simulated its responses to the white-noise and the naturalistic inputs, and then estimated model parameters from those responses according to the procedures described in methods. This shows what one would expect for a cell whose intrinsic characteristics do not change and controls for errors in estimating model parameters from the naturalistic stimulus ensemble (Sharpee et al. 2006) due to its correlation structure. Results of this analysis are shown in Fig. 1B. In contrast to the corresponding panels of Fig. 1A, the temporal kernels (top row) and their Fourier transforms (bottom row) obtained from white-noise and naturalistic stimulation superimpose. This shows that the changes observed in Fig. 1A cannot be explained by bias or imprecision in the fitting procedure: when the fitting procedure is applied to simulated spike trains generated by LNP neurons, the temporal kernels do not change. In other words, the changes seen in Fig. 1A reflect changes in the neurons' filtering characteristics, not artifacts of the fitting procedure.
off cells (Fig. 2A) show behavior that contrasts dramatically with that of on cells. First, the initial response transient changes very little between white-noise and naturalistic conditions. Correspondingly, as Fourier analysis shows, there is essentially no enhancement of the response in the high-frequency range (≥3 Hz) under naturalistic conditions. Second, although there is a change in the size of the undershoot in the two environments, the Fourier analysis shows that this results in a smaller change in the low-frequency sensitivity than that seen in the on cells of Fig. 1A. The reason for this is that although the undershoot amplitude changes, its area changes very little.
For completeness, we performed the same computational control for off cells as that for on cells (Fig. 2B), confirming that the changes in the measured temporal kernels in the two stimulus conditions indeed reflect a change in filtering characteristics.
Average behavior of on and off cells
Figure 3 summarizes the preceding analysis across the population of 82 retinal ganglion cells. When shifted from white-noise to naturalistic stimuli, on cell (n = 38) sensitivity is, on average, reduced by a factor of 10 at the lowest frequencies and increased by a factor of 1.7 at 5 Hz.
In contrast, the off cell population (n = 44) shows, on average, much less adaptation: there is a reduction in sensitivity by a factor of <2 at low temporal frequencies and no appreciable increase at high temporal frequencies.
The shift seen in on cells makes sense in terms of predictive coding: sensitivity to low frequencies is reduced in an environment in which low-frequency correlations are more prominent. However, this behavior is only qualitatively consistent with the behavior expected from predictive coding.
To work out the quantitative prediction, we determine how the temporal kernel must change so that it removes stimulus correlations as the environment changes. It is convenient to work in the frequency domain and to characterize stimulus correlations by the stimulus power spectrum P(f). Assuming linearity and a neural transfer function A(f), the power spectrum of the response is then R(f) = P(f)|A(f)|². Thus to maintain R(f) constant as the power spectrum changes from P1(f) in environment 1 to P2(f) in environment 2, we must adjust A(f) so that P1(f)|A1(f)|² = P2(f)|A2(f)|². This is equivalent to

|A2(f)| / |A1(f)| = √[P1(f)/P2(f)]   (9)
That is, the change in the transfer function should be inversely proportional to the square root of the change in the power spectrum.
Here, the power spectrum of the white-noise stimulus is flat [P1(f) = K1], whereas the power spectrum of the naturalistic stimulus is approximately P2(f) = K2 f^(−a), where a = 0.83, as determined empirically from our stimulus (see Supplemental Fig. S1). Thus the preceding analysis predicts that adaptation to the naturalistic stimulus would change the Fourier transform of the temporal kernel in proportion to f^(a/2) and that the behavior of on and off cells would be similar to each other. As a comparison with the line of slope a/2 shows (Fig. 3), on cell behavior is qualitatively consistent with this expectation but departs from it quantitatively. off cells, on the other hand, show almost no shift, inconsistent with the expectations of predictive coding.
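Numerically, Eq. 9 under these two power spectra reduces to a gain change proportional to f^(a/2); a brief Python check of the predicted ratios:

```python
import numpy as np

A_EXP = 0.83   # empirical power-law exponent of the naturalistic spectrum

def predicted_gain_ratio(freqs, a=A_EXP):
    """Eq. 9 with P1 flat and P2(f) proportional to f**(-a): the
    transfer-function amplitude should change in proportion to f**(a/2)."""
    return np.asarray(freqs, dtype=float) ** (a / 2.0)

ratios = predicted_gain_ratio([0.2, 1.0, 3.0, 5.0])
# at 5 Hz the predicted enhancement is 5**0.415, roughly a factor of 2
```

This monotonically increasing prediction is the dashed comparison line of slope a/2 on the log-log axes of Fig. 3.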
Heterogeneity of adaptive behavior within ganglion cell populations
At this point, we have seen that adaptive behavior is distributed in a strikingly nonuniform manner across ganglion cells: on cells adapt in a manner that is at least qualitatively consistent with predictive coding, but off cells, on average, show little adaptation. However, the on versus off distinction, although fundamental to any discussion of ganglion cell behavior (Dowling 1970; Rodieck 1973), is a simplified picture of the diversity of ganglion cell classes. In particular, it represents a dichotomy along a continuum of a range of weightings of on and off inputs (Carcieri et al. 2003).
Thus given that on and off cells, on average, have distinct adaptive behaviors (Fig. 3), one can anticipate that individual ganglion cells will differ in their adaptive behavior, reflecting the balance of their on and off inputs. To test this hypothesis, we analyzed how the adaptive behavior of individual cells depended on the relative contributions of their inputs. The results of this analysis are shown in Fig. 4. As the figure shows, there is heterogeneity in adaptive behavior, but it substantially exceeds what would be expected from a mixing of on and off signals.
To show this, we used the “bias index” (BI, Eq. 8) to quantify the relative contribution of on and off inputs to each ganglion cell (Carcieri et al. 2003). A BI of −1 indicates that a cell receives exclusively off input; a BI of +1 indicates that a cell receives exclusively on input. As seen in Fig. 4C, the BI accounts for only a small fraction of the cell-to-cell variation in adaptive behavior. Moreover, the cells with BI near +1 or −1 (those that are likely to have exclusively on or off input) showed a range of behavior just as large as that of the cells whose BI was far from the extremes.
We next formalize these observations: (i) that there is a substantial difference in adaptive behavior between on and off cell classes and (ii) that there is heterogeneity within cell classes that is not explained by the mixture of on and off inputs.
With regard to the first observation, we compared the log of the adaptation ratio (the ratio of the Fourier transforms of the temporal kernels measured under the two conditions) between cell classes. Using a bipartite subdivision into off and on subgroups as in Fig. 3, there was a highly significant difference at DC and 5 Hz (P < 0.001). The difference at 3 Hz was less significant (P < 0.01) and there was no significant difference at the intermediate frequencies (0.2 and 1 Hz) shown in Fig. 4. As noted earlier, the on versus off subdivision is likely an oversimplification: as described by Carcieri et al. (2003), the BI distribution is trimodal, with a cutpoint between off and on–off subsets at a BI of approximately −0.6. However, this refinement does not change our conclusions: with the tripartite subdivision, we found that on cells differed substantially from both off and on–off cells at DC and 5 Hz (P < 0.001), whereas off and on–off cells differed only minimally from each other (P < 0.05 at DC, NS at 5 Hz).
With regard to the second observation, heterogeneity within cell classes, we used the BI as a measure of the mixture of on and off inputs and regressed the log of the adaptation ratio against the BI, separately within each cell class. Within each cell class (using either the bipartite or tripartite division), no more than 10% of the variance could be accounted for by the BI. We conclude that the heterogeneity within cell classes was not simply a result of a mixture of on and off inputs.
It is interesting to note that the heterogeneity within categories does not merely represent random variation of responses: there is structure to the distribution of adaptive behavior. Specifically, ganglion cells that show adaptation at low temporal frequencies also tend to show adaptation at high temporal frequencies. Figure 5 shows this, by comparing the adaptation ratio at DC and at 5 Hz. Across all cells, these ratios are strongly negatively correlated (r = −0.48, P < 0.001). This covariation is concentrated within the on subpopulation (r = −0.40, P < 0.01).
Finally, we rule out the possibility that modeling error accounts for the observed heterogeneity. To do this, we examined the adaptation ratio as a function of goodness of fit of the response model (see methods). As seen in Fig. 6, across the range of cells that were fit well by the model and cells that were fit poorly, on cells decreased their responses at DC and increased them at 5 Hz, whereas off cells showed a much smaller effect. Correspondingly, there was no correlation between adaptation ratio and goodness of fit (DC: r = −0.02; 5 Hz: r = 0.12, both P > 0.05).
In sum, we found that retinal ganglion cells adjust their dynamics so that their low-frequency sensitivity is reduced and their high-frequency sensitivity is enhanced, when moved from a white-noise environment to a naturalistic one. However, this difference is only qualitatively consistent with the expectations of predictive coding and there is substantial nonuniformity, both between and within cell classes.
Predictive coding in a changing environment
Sensory systems have limited signaling capacity and must operate in many kinds of environments. To confront the problem of limited signaling capacity, sensory systems make use of predictive coding (Barlow 1961; Laughlin 1981; Srinivasan et al. 1982). The idea of predictive coding is that a sensory system need not signal what is predictable about its environment and thus can concentrate its capacity on the aspects of the incoming sensory signals that cannot be guessed. To implement this, the filtering properties of a sensory system need to decorrelate the input signal—i.e., remove what is predictable about its environment. This, indeed, is what is found experimentally, both in the spatial domain (Atick and Redlich 1990, 1992; Hosoya et al. 2005; Sharpee et al. 2006; Srinivasan et al. 1982) and in the temporal domain (Dan et al. 1996; Hosoya et al. 2005; Lesica et al. 2007; Srinivasan et al. 1982).
However, sensory systems must also operate in many different environments and these environments may have different statistical characteristics. If predictive coding is to be effective as the environment changes, the filtering properties of the sensory system must adapt to its changing correlation properties.
Our main finding is that this expectation holds (Hosoya et al. 2005; Lesica et al. 2007), but in a heterogeneous fashion across and within cell classes. Specifically, when shifted from a white-noise environment to a naturalistic one, on cells, on average, change their gain in a manner that approximately compensates for the statistics of natural scenes, but off cells, on average, do not (Fig. 3). Moreover, we also find that adaptive behavior is heterogeneously distributed within the cell classes (Figs. 4 and 5). We now consider the functional implications of these two levels of heterogeneity.
Heterogeneity between cell classes
The simplest functional hypothesis that might account for the differences between on and off cells is that it merely reflects statistical asymmetries in the environment. For example, if the bright regions of natural scenes were temporally correlated but the dark regions were not, then predictive coding would in fact account for this asymmetry. However, although there are notable asymmetries in the spatial correlation structure of bright and dark regions (Balasubramanian and Sterling 2009), no such differences in the temporal correlation structure have been identified (Dong and Atick 1995a) and our analysis confirms this (Supplemental Fig. S1). Thus the observed difference in adaptive behavior between on and off cells does not appear to be driven merely by the needs of predictive coding.
An alternative hypothesis is suggested by the recognition that adaptation has potential disadvantages (Wark et al. 2009). If a sensory system completely adapts to the statistics of its sensory environment, then its ability to communicate those statistics has been lost. Similarly, if a sensory system adapts based on an incorrect inference that the environment has changed, it becomes less efficient than if it did not adapt. These losses are mitigated if there is a population of ganglion cells that does not adapt (or that adapts differentially), which is what off cells appear to do.
on and off classes are not mirror images
The differences in adaptive behavior reported here are part of the larger picture of growing evidence that on and off pathways are not simply mirror images, but differ in a more fundamental manner. These differences encompass not only the way that on and off cells adapt, but also the basic aspects of their behavior in a single environment.
At the most basic level, the synaptic mechanisms that provide input to on and off ganglion cells differ in their linearity, their contrast–response characteristics (Zaghloul et al. 2003), and their temporal characteristics (Murphy and Rieke 2006). Their filtering properties differ as well: off cells are more numerous and have smaller receptive fields than those of on cells (Balasubramanian and Sterling 2009) and this difference in spatial resolution persists into visual cortex (Zemon et al. 1988). Our data add a difference in the temporal filtering properties: on cells are more broadband than off cells (Fig. 3), which corresponds to the observation that on cell temporal kernels tend to be biphasic, whereas off cell temporal kernels tend to be triphasic (Figs. 1 and 2).
Further differences between on and off cells are revealed by examining how they adapt to a changing environment, including (i) the striking differences reported here, (ii) the differences in adaptation rates following step changes of white-noise stimuli reported by Wark et al. (2009), and (iii) evidence that on cells adapt their dynamics much more than off cells during dark adaptation (Pandarinath et al. 2009).
Although it is tempting to think of the on and off pathways as a means to subdivide positive and negative visual signals into components that spiking neurons can transmit, the fact that on and off pathways differ in so many respects suggests that this view is an oversimplification. As mentioned earlier, the existence of a subset of ganglion cells that do not adapt, or that adapt differentially, can mitigate some of the deleterious effects of adaptation. However, the fact that on and off cells differ even in their basic filtering properties suggests that they may play different roles in specific visual tasks or environments, even when adaptive behavior is not engaged.
Qualitative, not quantitative, agreement
Even at the level of the average behavior within the on cell population, agreement with the expectations of predictive coding is only approximate (Fig. 3). In this section, we outline some reasons for this. We emphasize, though, that none of these considerations can account for the heterogeneity we observe, either between or within cell classes.
The main reason is that to determine the consequences of the efficient coding hypothesis for response dynamics, it is necessary to make simplifying assumptions: that the environment is Gaussian, and that neurons can be regarded as approximately linear. In principle, the efficient coding hypothesis also applies to scenarios in which the environmental signals have complex, non-Gaussian statistics (Geisler 2008) and neural channels that are nonlinear; however, working out the consequences of the efficient coding hypothesis in this general setting is not tractable. Therefore, as others have done (e.g., Atick and Redlich 1990, 1992; Dan et al. 1996; Dong and Atick 1995a,b; Hosoya et al. 2005; Sharpee et al. 2006; Srinivasan et al. 1982), we focused on the implications of the second-order correlation properties of the stimulus. This leads to the simple, straightforward prediction of predictive coding: that neural filtering will decorrelate (prewhiten) the stimulus.
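This second-order prediction can be made concrete with a small numerical sketch. The spectrum shapes and frequency range below are illustrative choices (not fitted to the data in this study): a 1/f-style naturalistic power spectrum and a flat white-noise spectrum equated for total power. The decorrelating filter's gain is the reciprocal of the stimulus amplitude spectrum, so its output spectrum is flat, and relative to the white-noise condition it attenuates low frequencies and boosts high ones.

```python
import numpy as np

# Illustrative spectra; the exponent and frequency range are assumptions.
freqs = np.linspace(0.5, 30.0, 60)                 # Hz
power_natural = 1.0 / freqs**2                     # ~1/f^2 power spectrum
power_white = np.full_like(freqs, np.mean(power_natural))  # equated power

# Second-order predictive coding: gain inversely proportional to the
# stimulus amplitude spectrum, so the filtered output is white (flat).
gain_natural = 1.0 / np.sqrt(power_natural)
gain_white = 1.0 / np.sqrt(power_white)

out_natural = power_natural * gain_natural**2
out_white = power_white * gain_white**2
assert np.allclose(out_natural, out_natural[0])    # flat output spectrum
assert np.allclose(out_white, out_white[0])

# Relative to white noise, the filter predicted for the natural environment
# suppresses low frequencies and enhances high frequencies.
ratio = gain_natural / gain_white
assert ratio[0] < 1.0 < ratio[-1]
```

This is the frequency-domain statement of the expectation tested in Fig. 3: the change in filter gain between environments should mirror the change in stimulus spectra.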
Neurons have limited power, and their internal noise is frequency dependent; within the linear-Gaussian approximation above, however, neither consideration changes the expectations of predictive coding. To see this, we first consider such a neuron confronted with an environment consisting of uncorrelated Gaussian noise. Intuitively, information transmission is maximized if the neuron devotes most of its power to frequencies at which its internal noise is low. Made rigorous, this intuition becomes the classic “water-filling” theorem (Shannon 1948): a recipe for the linear filter that optimizes the information transmission by a noisy channel in an uncorrelated environment. When the environment has correlations, information transmission can be further improved by first filtering the environmental input to remove what is predictable. This yields an uncorrelated (i.e., white) signal, to which the filter specified by the water-filling theorem can then be applied. The optimal neural transformation is a product of these two components: one that depends only on the environment (the decorrelating filter) and one that depends only on the noise characteristics of the neuron (the water-filling filter). Because the first filter does not depend on the characteristics of the neuron, neuronal noise does not influence how we expect the decorrelating filter to adapt. For a more detailed analysis, see Diamantaras et al. (1999).
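The water-filling component of this factorization can be sketched directly. Given a per-frequency internal noise spectrum and a total signal-power budget, the optimal allocation pours power into low-noise channels up to a common “water level”; channels whose noise exceeds that level receive nothing. The noise values and power budget below are illustrative placeholders.

```python
import numpy as np

def water_fill(noise, total_power, tol=1e-9):
    # Bisect on the water level mu so that sum(max(0, mu - noise)) spends
    # exactly the available power budget.
    lo, hi = noise.min(), noise.max() + total_power
    while hi - lo > tol:
        mu = 0.5 * (lo + hi)
        used = np.clip(mu - noise, 0.0, None).sum()
        if used > total_power:
            hi = mu
        else:
            lo = mu
    return np.clip(0.5 * (lo + hi) - noise, 0.0, None)

noise = np.array([0.2, 0.5, 1.0, 2.0])     # internal noise per channel
alloc = water_fill(noise, total_power=2.0)

assert abs(alloc.sum() - 2.0) < 1e-6       # the budget is spent exactly
assert alloc[0] > alloc[1] > alloc[2]      # more power where noise is low
assert alloc[3] == 0.0                     # noisiest channel is shut off
```

Because this allocation depends only on the neuron's noise, it factors out of the environment-dependent (decorrelating) part of the optimal filter, which is why it drops out of the adaptation prediction.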
Heterogeneity within cell classes
Our second main finding is that even within ganglion cell classes, adaptive behavior is heterogeneous, with the heterogeneity not simply accounted for by a mixing of on and off signals (Figs. 4 and 5). This conclusion holds independent of whether we consider the gamut from off to on to be a continuous one quantified by the bias index, whether we dichotomize this range, or whether we adopt the trichotomous classification of Carcieri et al. (2003). These authors also identified a subset of long-latency cells with on and off responses and a subset of on cells that were sustained. These distinctions support the notion of dynamic heterogeneity within the off and on subsets. However, we are unable to make a precise connection between these distinctions and the present measurements of adaptation, since the long-latency and sustained subsets were defined by Carcieri et al. (2003) on the basis of responses to sustained, optimal spots, a stimulus that was not used in the present experiments.
We now consider the functional implications of this heterogeneity. We begin by considering adaptation in its biological context. From the standpoint of both analytical convenience and the traditional approach to the analysis of neural transductions, it is natural to consider the “white-noise” condition to be the baseline and the naturalistic stimulus to be the perturbation to which ganglion cells adapt (as we have done earlier). However, from a biological perspective, the opposite view is more appropriate: the retina has evolved and developed in the animal's environment, and the white-noise stimulus is the one that represents a perturbation. From this perspective, it is not at all surprising that the adaptation ratio observed is only approximately what is expected from predictive coding: a white-noise environment is rarely encountered and, consequently, is one for which there is little selective pressure for efficiency.
Conversely, the kinds of environments encountered by an animal are likely to be more varied than alternations between a stereotypical “natural scene” and white noise: for example, dense brush, open space, and underground, in various admixtures, viewed in a range of lighting and atmospheric conditions. Each of these environments is characterized by distinctive statistics. Thus, to maintain predictive coding, the retina would be required to have a wide variety of adaptive behaviors, to compensate for a multidimensional range of changes in the environmental spectrum. The heterogeneity we observe may serve this purpose.
Interestingly, one can imagine how a single mechanism for adaptation, such as the well-described contrast gain control, could be varied among different cells (e.g., among different on cells) to produce this heterogeneity. The action of the contrast gain control (Shapley and Victor 1978, 1981) is as follows: when contrast is high, low-frequency responses are attenuated and high-frequency responses are enhanced. Importantly, the setting of the contrast gain control is determined by a neural measure of contrast, not its physical measure, i.e., it is determined by the cells' sensitivity to contrast, not by absolute contrast. This allows the contrast gain control to produce adaptation to temporal correlations: even though the naturalistic stimulus and the white-noise stimulus are equated for physical contrast, the sensitivity is higher for the naturalistic one because it contains more power at the frequencies that the mouse retina is most sensitive to: <3 Hz. Thus the contrast gain control is engaged more strongly for naturalistic stimuli than for white noise of the same physical contrast (Lesica et al. 2007) and could suffice to account for the basic phenomenon that low-frequency responses are attenuated for natural scenes. If this idea is correct, then the sensitivity to contrast should determine the extent of adaptation. This should hold not just between white-noise and naturalistic stimuli, but between any two environments: the extent of the change in dynamics should be governed, quantitatively, by the extent to which the frequency content of the environment activates the contrast gain control.
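The distinction between neural and physical contrast can be illustrated with a toy calculation. The sensitivity profile and spectra below are made-up shapes, chosen only to show that two stimuli equated for physical contrast (total power) can drive a sensitivity-weighted contrast measure very differently when one stimulus concentrates its power at the frequencies where sensitivity is highest.

```python
import numpy as np

# Assumed shapes, for illustration only: a low-pass sensitivity profile
# peaking below ~3 Hz, a 1/f^2 "naturalistic" spectrum, and a flat
# white-noise spectrum.
freqs = np.linspace(0.5, 30.0, 60)                 # Hz
sensitivity = np.exp(-freqs / 3.0)                 # highest below ~3 Hz

power_natural = 1.0 / freqs**2
power_white = np.ones_like(freqs)
# Equate physical contrast: same total power in both stimuli.
power_natural *= power_white.sum() / power_natural.sum()
assert abs(power_natural.sum() - power_white.sum()) < 1e-9

def neural_contrast(power, sens):
    # Contrast as seen through the cell's sensitivity profile.
    return np.sqrt(np.sum(sens**2 * power))

c_nat = neural_contrast(power_natural, sensitivity)
c_white = neural_contrast(power_white, sensitivity)
# The naturalistic stimulus, with its power concentrated at low
# frequencies, engages the sensitivity-weighted measure more strongly.
assert c_nat > c_white
```

Under this sketch, a gain control set by `neural_contrast` would be driven harder by the naturalistic stimulus than by white noise of the same physical contrast, consistent with the argument in the text.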
In this view, variability in individual neurons' expression of the contrast gain control would allow different cells within a population to adapt to different kinds of environments. This heterogeneity in adaptive behavior could be accomplished simply: by a linear admixture of retinal signals that are, and are not, influenced by the contrast gain control. This simplicity does have a limitation, though: it cannot adapt in an arbitrary fashion, since its repertoire is limited to this one-parameter family of mixtures. This kind of limitation may account for the imperfect adaptive behavior that we observe.
An implication of this idea is that the mechanism for contrast gain control might differ between the on and off pathways. This follows from the observation that both on and off cells show effects of the contrast gain control (Benardete and Kaplan 1999; Lesica et al. 2007; Ohzawa et al. 1985; Shapley and Victor 1978, 1981), but they differ in the way in which they adapt to the statistics of their inputs. As mentioned earlier, differences between on and off pathways at a mechanistic level are now well documented (Murphy and Rieke 2006; Zaghloul et al. 2003) and a consequent difference in the mechanism of the gain control would not be unexpected.
Whatever its mechanistic basis, the observation that retinal ganglion cells differ in the extent to which they adapt raises an interesting possibility: that this heterogeneity provides a way to bootstrap adaptation along a single dimension at the level of individual ganglion cells (e.g., a contrast gain control) into multivariate adaptation at the population level. That is, we speculate that local populations containing a mix of neurons that differ in the extent to which they adapt may provide for efficient coding in a much wider range of environments and task demands than could be achieved by a population of neurons that all adapted similarly.
Use of LNP models
Recently, model structures other than LN cascades have been used to study issues related to adaptation (Borst et al. 2005; Famulare and Fairhall 2010; Gaudry and Reinagel 2007b) and this raises the question of why we used an LNP structure. The reason is that our goal was to examine ganglion cell response properties in the context of predictive coding. Since predictive coding makes predictions about filters, we needed a model that describes the response properties in those terms. The LNP served this purpose. If, for example, we had fit one of these alternative models to our data and found that it accounted for the differences in the responses in the two environments, it would not have provided information about whether predictive coding was occurring. To determine that, we would have to characterize the filtering properties of the alternative model under the two operating conditions. Thus we took the LNP approach because it is a direct way to get to the filters. Note that our intent is not to exclude these models; we simply used the LNP approach for a particular purpose.
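For readers unfamiliar with the model class, the LNP cascade can be sketched in a few lines: a linear temporal filter, a static nonlinearity mapping the filtered drive to a firing rate, and Poisson spike generation. The biphasic kernel shape, softplus nonlinearity, and rate scale below are illustrative placeholders, not the fitted parameters of this study.

```python
import numpy as np

rng = np.random.default_rng(0)
dt = 0.005                                   # 5 ms time bins

# L: a biphasic temporal kernel (difference of exponentials; assumed shape).
t = np.arange(0.0, 0.3, dt)
kernel = np.exp(-t / 0.04) - 0.6 * np.exp(-t / 0.08)

# N: a rectifying static nonlinearity mapping drive to firing rate (Hz).
def nonlinearity(g):
    return 40.0 * np.log1p(np.exp(g))        # softplus rectifier

# P: inhomogeneous Poisson spiking given the rate.
stimulus = rng.standard_normal(2000)         # white-noise stimulus
drive = np.convolve(stimulus, kernel, mode="full")[: len(stimulus)] * dt
rate = nonlinearity(drive)
spikes = rng.poisson(rate * dt)

assert rate.min() >= 0.0                     # rates are nonnegative
assert spikes.shape == stimulus.shape
```

The point of the structure for present purposes is that the kernel is an explicit temporal filter, so its Fourier transform can be compared directly, environment by environment, against the prewhitening prediction.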
We have identified adaptive behavior in retinal ganglion cells that is qualitatively consistent with the demands of predictive coding in a changing environment, but is characterized by heterogeneity at two levels: between ganglion cell classes and within ganglion cell classes. We speculate that this heterogeneity has a functional role: to provide for efficient coding in multiple kinds of environments and for multiple tasks, rather than to optimize efficiency for a single stereotypical “natural scene.”
This work was supported by National Eye Institute Grants EY-12978 to S. Nirenberg and EY-9314 and EY-7977 to J. Victor.
No conflicts of interest are declared by the authors.
We thank M. Meytlis for providing the naturalistic stimulus and for assistance with the retinal recordings.
1 The online version of this article contains supplemental data.
Copyright © 2010 the American Physiological Society