## Abstract

In this work, we studied the adaptation of H1, a motion-sensitive neuron in the fly visual system, to the variance of randomly fluctuating velocity stimuli. We ask two questions. *1*) Which components of the motion detection system undergo genuine adaptational changes in response to the variance of the fluctuating velocity signal? *2*) What are the consequences of this adaptation for the information processing capabilities of the neuron? To address these questions, we characterized the adaptation of H1 by estimating the changes in the parameters of an associated Reichardt motion detection model under various stimulus conditions. The strongest stimulus dependence was exhibited by the temporal kernel of the motion detector and was parametrized by changes in the model's high-pass time constant (τ_{H}). This time constant shortened considerably with increasing velocity fluctuations. We showed that this adaptive process contributes significantly to the shortening of the velocity response time-course but not to velocity gain control. To assess the contribution of time-constant adaptation to information transmission, we compared the information rates generated by our adaptive model motion detector with model simulations in which τ_{H} was held fixed at its unadapted value for all stimulus conditions. We found that for intermediate stimulus conditions, fixing τ_{H} at its unadapted value led to higher information rates, suggesting that time-constant adaptation does not optimize total information rates about velocity trajectories. We also found that, over the wide range of stimulus conditions tested here, H1 information rates are dependent on the amplitude of velocity fluctuations.

## INTRODUCTION

Adaptation is usually defined as a change in the response properties of a system, making it better suited to cope with the present environment. One interesting form of adaptive behavior in sensory systems is the adaptive temporal integration of motion stimuli exhibited by neurons in both the fly (Borst et al. 2005) and the primate (Bair and Movshon 2004) visual systems. The time course of these neurons' responses to randomly fluctuating velocity stimuli, as measured in the velocity spike-triggered average (STA), shortens when the amplitude of velocity fluctuations is increased. Previous work (Borst et al. 2005) has shown that a simple model of motion detection, the Reichardt model, predicts an automatic shortening of the time course of the velocity STA with increasing velocity fluctuations, in qualitative agreement with experimental results, even when all model parameters are assumed to be fixed. Similarly, the well-documented velocity gain control observed in H1 (Borst 2003; Brenner et al. 2000a; Fairhall et al. 2001) was shown to be an automatic consequence of the inherent nonlinearity of the Reichardt detector (Borst et al. 2005). Here we ask whether in addition to these effects, parameters of the fly motion detection system undergo genuine adaptational changes in response to the variance of the fluctuating velocity signal. To address this question, we modeled H1 as an array of Reichardt motion detectors, consisting of a high-pass filter (HPF), a low-pass filter (LPF), a multiplier, and a subtraction stage, followed by a static nonlinearity. Model parameters were fit to spike trains recorded from H1 under a wide range of stimulus conditions. We found that the HPF time constant (τ_{H}) is strongly dependent on the stimulus statistics, shortening considerably with increasing velocity fluctuations. 
This parameter change was interpreted as the signature of an adaptive process in the fly motion detection system and was related to the known adaptive properties of motion-sensitive neurons measured in other stimulus paradigms (Borst and Egelhaaf 1987; Borst et al. 2003; de Ruyter van Steveninck et al. 1986; Harris et al. 1999; Reisenman et al. 2003). Comparison of the observed velocity STAs to model simulations with τ_{H} fixed at its minimal and maximal observed values revealed that this adaptation contributes significantly to the shortening of the time course of the velocity response. The second model component that changed with the stimulus statistics was the static nonlinearity, which underwent a relatively weak reduction in slope, quantifying the contribution of additional adaptive processes beyond the automatic gain control (Borst et al. 2005) predicted by the Reichardt model.

Information-theoretic measures are often used to assess the functional role of sensory adaptation. For example, velocity gain control in H1 has been interpreted as an adaptive rescaling of the response that serves to maximize information transmission by enabling the system to use its full dynamic range under changing stimulus conditions (Brenner et al. 2000a). However, our results suggest that gain control is primarily an automatic consequence of the inherent nonlinearity of the motion detector, whereas the primary adaptive process is related to the system's temporal kernel. This adaptation is parametrized in our model by the observed reduction in τ_{H} with increasing velocity fluctuations. We therefore studied the effect of τ_{H} adaptation on information transmission in H1. To address this issue, we compared the information rates generated by our adaptive model motion detector with model simulations in which τ_{H} was held fixed at its minimal and maximal observed values for all stimulus conditions. We found that for intermediate stimulus conditions, longer τ_{H} led to stronger responses, which in turn led to higher information rates. These results suggest that the observed reduction in τ_{H} actually has a detrimental effect on overall information transmission. Similarly, we found that, over the wide range of stimulus conditions tested here, H1 information rates (both bits/s and bits/spike) are dependent on the amplitude of velocity fluctuations, indicating that the system does not optimally use its full dynamic range (bits/s) or operate at optimal efficiency (bits/spike) under all stimulus conditions.

## METHODS

### Experiments

Blowflies (*Calliphora vicina*, *n* = 20) were stimulated by a moving sinusoidal wave grating (22° spatial wavelength, 63% contrast, 14-cd/m^{2} mean luminance), with a low-pass filtered white noise velocity profile. The correlation time of the velocity profile was either τ_{0} = 20 ms or τ_{0} = 100 ms (*n* = 10 flies for each value of τ_{0}; we shall refer to the two groups as “τ_{0} = 20 ms flies” and “τ_{0} = 100 ms flies,” respectively). Each fly was stimulated with five different velocity signals, with SD of σ = 0.1, 0.5, 1, 5, and 10 periods per second (Hz), respectively. To verify consistency of experimental conditions in the two groups of flies, all τ_{0} = 20 ms flies were presented with one additional stimulus with parameters τ_{0} = 100 ms and σ = 5 Hz, and we verified that results were similar to those of the τ_{0} = 100 ms flies for the same stimulus condition. For each stimulus condition, 75–110 sweeps of identical stimuli, each with a duration of 9 s, were presented, with a 1-s pause between stimulus presentations. The grating was presented on a cathode ray tube (Tektronix 608) by means of a Picasso image synthesizer (Innisfree, Cambridge, MA), at a frame rate of 200 Hz. The screen had a horizontal and vertical extent of 65 and 80°, respectively, as seen by the fly, and was positioned at a distance of 7.5 cm from the fly, 75% left-frontal and 25% right-frontal, to optimally stimulate the left H1 neuron. Spikes were recorded extracellularly from the left H1 neuron with a tungsten electrode inserted in the lobula plate, fed through a threshold device, and transferred at 1-kHz temporal resolution to a computer (Pentium II–based PC with a DAS16 I/O board, MetraByte, Taunton, MA). Data analysis was performed using custom-written MATLAB (The MathWorks, Natick, MA) programs on data rebinned to 2-ms temporal resolution, unless otherwise stated.

### Model of motion detection

##### STIMULUS.

Similar to the experiment, the stimulus in the model consists of a moving sine grating, whose velocity *v*(*t*) is expressed as a temporal frequency (in Hz) that reflects the number of spatial periods passing a given image location per second. The luminance level at a given angular location θ at time *t* is given by

$$L(\theta, t) = L_{0}\left\{1 + \rho \sin\left[\frac{2\pi\theta}{\lambda} - x(t)\right]\right\} \tag{1}$$

where *L*_{0} is the mean luminance level, ρ is the contrast of the grating, λ is its spatial wavelength, and *x*(*t*) = 2π∫_{0}^{t}*v*(*u*)*du* is the total displacement of the grating from time 0 to time *t*. The units of *x*(*t*) are such that *x* = 2π corresponds to displacement by one spatial period of the grating. The velocity profile *v*(*t*) is generated by low-pass filtering of gaussian white noise with zero mean. We write the time-lagged velocity autocorrelation as σ^{2}*c*(*t*), where *c*(*t*) = exp(−|*t*|/τ_{0}) is the normalized autocorrelation at time lag *t*, τ_{0} is its time constant, and σ is the SD of the velocity.
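As an illustration of the stimulus statistics described above, the following sketch generates a Gaussian velocity trace with autocorrelation σ^{2}exp(−|*t*|/τ_{0}) and integrates it into the displacement *x*(*t*). It assumes a first-order (single-pole) low-pass filter, which is the simplest filter that yields an exponential autocorrelation; the function names, the 2-ms step, and the seed are illustrative choices and are not taken from the experiments.

```python
import numpy as np

def make_velocity(sigma, tau0, duration, dt=0.002, seed=0):
    """Gaussian velocity trace (Hz) with autocorrelation sigma^2 * exp(-|t|/tau0).

    Assumption: a first-order low-pass filter of white noise (an AR(1) process).
    """
    rng = np.random.default_rng(seed)
    n = int(round(duration / dt))
    a = np.exp(-dt / tau0)                       # one-step correlation
    # Noise scaled so that the stationary SD of v is exactly sigma.
    noise = rng.standard_normal(n) * sigma * np.sqrt(1.0 - a * a)
    v = np.empty(n)
    v[0] = sigma * rng.standard_normal()         # start in the stationary state
    for i in range(1, n):
        v[i] = a * v[i - 1] + noise[i]
    return v

def displacement(v, dt=0.002):
    """x(t) = 2*pi * integral of v, so that x = 2*pi is one spatial period."""
    return 2.0 * np.pi * np.cumsum(v) * dt
```

For example, `make_velocity(sigma=5.0, tau0=0.02, duration=9.0)` would give one sweep comparable in duration and statistics to the σ = 5 Hz, τ_{0} = 20 ms condition.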

##### ELABORATED REICHARDT MOTION DETECTOR MODEL.

We model the fly's motion detection system (Fig. 1) as an array of local correlation-based motion detectors known as Reichardt detectors (Borst and Haag 2002; Egelhaaf 1989; Egelhaaf and Borst 1989; Egelhaaf and Reichardt 1987; Haag et al. 2004; Poggio and Reichardt 1973; Reichardt 1961, 1987; Single and Borst 1998). Reichardt detectors extract the direction of motion by multiplying^{1} the brightness signals from neighboring image locations after asymmetric temporal filtering. This operation is done twice in mirror-symmetrical subunits, whose outputs are subtracted. Following Borst et al. (2003) and Bialek and de Ruyter van Steveninck (2005), we used a Reichardt detector with an LPF in one input line to the multiplier and an HPF in the cross arm. The output of a local Reichardt detector at angular location θ in response to a luminance field *L*(θ,*t*) is given by

$$y_{\theta}(t) = \int_{0}^{\infty} d\tau \int_{0}^{\infty} d\tau'\, K(\tau, \tau')\, L(\theta + \varepsilon, t - \tau)\, L(\theta, t - \tau') \tag{2}$$

where *K*(τ, τ′) = *K*_{H}(τ)*K*_{L}(τ′) − *K*_{L}(τ)*K*_{H}(τ′) is the time response of the Reichardt detector, *K*_{H}(τ) = δ(τ) − exp(−τ/τ_{H}) and *K*_{L}(τ) = exp(−τ/τ_{L}) are the impulse responses of the HPF and LPF, respectively, τ_{H} and τ_{L} are their time constants, and ε is the angular spacing between the two arms of the motion detector.

The outputs of these local motion detectors then undergo spatial summation. For simplicity, we assume uniform summation over a visual field that is spanned by an integer number *n* of spatial periods of the sinusoidal grating stimulus (*Eq. 1*). Substituting *Eq. 1* into *Eq. 2* and integrating over θ, we find that the output signal of the Reichardt detector array in response to a moving sinusoidal grating is given by

$$y(t) = y_{0} \int_{0}^{\infty} d\tau \int_{0}^{\infty} d\tau'\, K_{H}(\tau)\, K_{L}(\tau')\, \sin\left[\Delta x(t - \tau, t - \tau')\right] \tag{3}$$

where *y*_{0} = *n*λ(*L*_{0}ρ)^{2} sin(2πε/λ), and Δ*x*(*t*, *t*′) = 2π∫_{t′}^{t}*v*(*u*)*du* is the total displacement of the grating from time *t*′ to time *t*, in the same units as those used for *x*(*t*) in *Eq. 1*. Importantly, the dependence of the Reichardt detector array's output signal on the stimulus history is proportional to the sine of Δ*x*, rendering motion detection an inherently nonlinear process.^{2}
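A minimal numerical sketch of the array output follows. It assumes that the output has the form *y*_{0}∬*K*_{H}(τ)*K*_{L}(τ′) sin[Δ*x*(*t* − τ, *t* − τ′)]*d*τ*d*τ′ with Δ*x*(*t* − τ, *t* − τ′) = *x*(*t* − τ) − *x*(*t* − τ′), and uses the identity sin(*a* − *b*) = sin *a* cos *b* − cos *a* sin *b* to replace the double integral by four one-dimensional causal filters. The recursive rectangle-rule filter, the function names, and the default 2-ms step are illustrative assumptions rather than the implementation used for the analysis.

```python
import numpy as np

def exp_filter(s, tau, dt):
    """Causal convolution of s with exp(-t/tau), unnormalised as in K_L.

    Rectangle-rule recursion; accurate to order dt/tau.
    """
    a = np.exp(-dt / tau)
    out = np.empty(len(s), dtype=float)
    acc = 0.0
    for i, val in enumerate(s):
        acc = a * acc + val * dt
        out[i] = acc
    return out

def reichardt_output(x, tau_l, tau_h, dt=0.002, y0=1.0):
    """Reichardt array output for a grating displacement trace x(t) (radians)."""
    s, c = np.sin(x), np.cos(x)
    # High-pass arm: K_H = delta(t) - exp(-t/tau_h)
    hs = s - exp_filter(s, tau_h, dt)
    hc = c - exp_filter(c, tau_h, dt)
    # Low-pass arm: K_L = exp(-t/tau_l)
    ls = exp_filter(s, tau_l, dt)
    lc = exp_filter(c, tau_l, dt)
    # sin[x(t-tau) - x(t-tau')] = sin x(t-tau) cos x(t-tau') - cos x(t-tau) sin x(t-tau')
    return y0 * (hs * lc - hc * ls)
```

Under this sketch, reversing the direction of motion (*x* → −*x*) flips the sign of the output exactly, and a constant velocity produces a constant output once the filter transients have decayed.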

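Under our stimulus conditions, the mean of the Reichardt detector array's output signal is zero. Squaring *Eq. 3* and averaging over the Gaussian ensemble of velocity stimuli, we find that the variance of the output signal of the Reichardt detector array is given by

$$\langle y^{2} \rangle = y_{0}^{2} \int_{0}^{\infty}\! d\tau_{1} \int_{0}^{\infty}\! d\tau'_{1} \int_{0}^{\infty}\! d\tau_{2} \int_{0}^{\infty}\! d\tau'_{2}\; K_{H}(\tau_{1}) K_{L}(\tau'_{1}) K_{H}(\tau_{2}) K_{L}(\tau'_{2})\, e^{-2\pi^{2}\sigma^{2}\left[\Delta(\tau'_{1} - \tau_{1}) + \Delta(\tau'_{2} - \tau_{2})\right]} \sinh\left[4\pi^{2}\sigma^{2}\, \Gamma(\tau_{1}, \tau'_{1}; \tau_{2}, \tau'_{2})\right] \tag{4}$$

where Γ(τ_{1}, τ′_{1}; τ_{2}, τ′_{2}) ≡ ∫_{τ′_{1}}^{τ_{1}}*du*_{1}∫_{τ′_{2}}^{τ_{2}}*du*_{2}*c*(*u*_{1}−*u*_{2}) is the covariance of Δ*x*(τ_{1}, τ′_{1}) and Δ*x*(τ_{2}, τ′_{2}), normalized by 4π^{2}σ^{2}. The normalized variance of Δ*x*(τ, τ′) is given by Δ(τ′−τ) ≡ Γ(τ, τ′; τ, τ′).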

The output signal of the Reichardt detector array can assume both positive and negative values, producing a positive signal in response to preferred-direction motion stimuli and a negative signal in response to motion in the opposite direction. This signal can be interpreted as proportional to the total input received by H1 (Borst et al. 1995; Haag et al. 2004; Single and Borst 1998; Single et al. 1997). To compare our model to the spike trains of H1, we introduce an additional static nonlinearity, *f*(·), which transforms the Reichardt detector array's output signal, *y*(*t*), into a positive firing rate, *r*(*t*). We make no a priori assumptions about the shape of this static nonlinearity; instead, we calculate it directly from the data, as part of the parameter estimation process (see *Parameter estimation*). We also allow for an additional fixed delay in the system, *t*_{d}. The firing rate of our H1 model is therefore given by

$$r(t) = f\left[y(t - t_{d})\right] \tag{5}$$

In this work, we shall refer to *y*(*t*) as the Reichardt detector output signal and to *r*(*t*) as the response or the firing rate of our H1 model.

### Velocity response time course and gain

The time course of the velocity response is described by the stimulus-response cross-correlation function, defined as

$$c_{rv}(t) = \left\langle r(t' + t)\, v(t') \right\rangle_{t'} \tag{6}$$

Under our stimulus conditions, this function is equivalent to the spike-triggered average of the velocity stimulus.

We define the velocity response function, *R*_{t}(*v*), as the average firing rate at time *t*′ + *t*, subject to the condition that the velocity at time *t*′ is equal to *v*

$$R_{t}(v) = \left\langle r(t' + t) \mid v(t') = v \right\rangle_{t'} \tag{7}$$

where 〈…〉_{t′} denotes average over time *t*′. Under stationary stimulus conditions, this is equivalent to averaging over the gaussian velocity stimulus ensemble. We define the velocity response gain, *G*_{t}, as the maximal slope of the velocity response function

$$G_{t} = \max_{v} \frac{dR_{t}(v)}{dv} \tag{8}$$

In this work, we will use the velocity response function and gain at a time lag *t*_{peak} equal to the time of the peak of *c*_{rv}(*t*).

### Data analysis

##### PARAMETER ESTIMATION.

To estimate the parameters of the fly's motion detection system, we minimize the mean-square error (MSE) between the peristimulus time histogram (PSTH) of H1, *r*_{data}(*t*), and the firing rate predicted by the model, *r*(*t*)

$$\mathrm{MSE} = \left\langle \left\{ r_{\mathrm{data}}(t) - f\left[y_{\tau_{L},\tau_{H}}(t - t_{d})\right] \right\}^{2} \right\rangle_{t} \tag{9}$$

where *y*_{τL,τH}(*t*−*t*_{d}) is the (delayed) output signal generated by a model Reichardt detector array with time constants τ_{L} and τ_{H}, in response to the stimulus used in the experiment. As is explained below (*Performance and model selection*), the average 〈…〉_{t} is taken either over all stimulus conditions or only over times belonging to one particular value of σ, depending on the particular model variant. Similarly, τ_{L} and τ_{H} denote either a single time constant or a vector of five time constants, one for each σ, depending on the model variant.

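To calculate the static nonlinearity ƒ_{τL,τH,td}^{est}(*y*′) that minimizes *Eq. 9* for a particular choice of τ_{L}, τ_{H}, and *t*_{d}, we rewrite *Eq. 9* as

$$\mathrm{MSE} = \int dy'\, P_{\tau_{L},\tau_{H}}(y') \left\langle \left[ r_{\mathrm{data}}(t) - f(y') \right]^{2} \,\middle|\, y_{\tau_{L},\tau_{H}}(t - t_{d}) = y' \right\rangle \tag{10}$$

where *P*_{τL,τH}(*y*′) is the probability that *y*_{τL,τH}(*t* − *t*_{d}) = *y*′, and 〈…|*y*_{τL,τH}(*t* − *t*_{d}) = *y*′〉 denotes an average over all times for which *y*_{τL,τH}(*t* − *t*_{d}) = *y*′. Minimizing *Eq. 10* with respect to the function *f*(*y*′), we find that the estimated static nonlinearity for a given choice of τ_{L}, τ_{H}, and *t*_{d} is equal to the average firing rate of the neuron, conditioned on *y*′

$$f^{\mathrm{est}}_{\tau_{L},\tau_{H},t_{d}}(y') = \left\langle r_{\mathrm{data}}(t) \,\middle|\, y_{\tau_{L},\tau_{H}}(t - t_{d}) = y' \right\rangle \tag{11}$$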

In practice, ƒ_{τL,τH,td}^{est}(y′) is estimated by binning the values of *y*′ (binwidth = Std(*y*_{τL,τH})) and calculating the average firing rate for all times *t* in which *y*_{τL,τH}(*t*−*t*_{d}) falls into a given bin.

Substituting the left-hand side of *Eq. 11* into *Eq. 10*, we now find that minimizing the MSE is equivalent to maximizing the following objective function

$$V[\tau_{L}, \tau_{H}, t_{d}] = \int dy'\, P_{\tau_{L},\tau_{H}}(y') \left[ f^{\mathrm{est}}_{\tau_{L},\tau_{H},t_{d}}(y') \right]^{2} \tag{12}$$

with respect to τ_{L}, τ_{H}, and *t*_{d}. As in *Eq. 9*, the averages in *Eqs. 11* and *12* are either over all stimulus conditions or over only one value of σ, and τ_{L} and τ_{H} are either scalars or vectors, depending on the model variant.
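The binning estimate of the static nonlinearity and the resulting objective can be sketched as follows. The sketch assumes that the objective is the occupancy-weighted mean of the squared binned nonlinearity, which is what the MSE reduces to once the conditional mean is substituted for *f*; the function names and the handling of empty bins are illustrative choices, and the default bin width of one SD of *y* follows the text.

```python
import numpy as np

def fit_nonlinearity(y, r, bins_per_std=1.0):
    """Binned estimate of f(y') = <r | y = y'>.

    y : detector output, already delayed by t_d; r : PSTH on the same time grid.
    Returns bin centres, conditional mean rates, and bin occupancies P(y').
    Assumes y is not constant (non-zero SD).
    """
    width = bins_per_std * np.std(y)
    idx = np.floor((y - y.min()) / width).astype(int)
    n_bins = idx.max() + 1
    counts = np.bincount(idx, minlength=n_bins)
    sums = np.bincount(idx, weights=r, minlength=n_bins)
    occupied = counts > 0                      # skip empty bins
    f_est = sums[occupied] / counts[occupied]
    centres = y.min() + (np.arange(n_bins)[occupied] + 0.5) * width
    p = counts[occupied] / counts.sum()
    return centres, f_est, p

def objective_v(y, r):
    """V = sum over bins of P(y') * f_est(y')^2; larger V means smaller MSE."""
    _, f_est, p = fit_nonlinearity(y, r)
    return np.sum(p * f_est ** 2)
```

Because the occupancy-weighted mean of `f_est` equals the mean of `r`, `objective_v` always lies between the squared mean rate and the mean squared rate, and it grows as the detector output becomes more predictive of the firing rate.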

Because the mean of ƒ_{τL,τH,td}^{est}[*y*_{τL,τH}(*t* − *t*_{d})] is always, by definition, equal to the mean firing rate 〈*r*_{data}(*t*)〉, maximizing *V* is equivalent to maximizing the variance of ƒ_{τL,τH,td}^{est}[*y*_{τL,τH}(*t* − *t*_{d})]. Our objective function *V* (*Eq. 12*) is equivalent to the one introduced by Paninski (2003) in the context of estimation of the parameters of a linear-nonlinear (LN) model, consisting of a linear filter (*K*) followed by a static nonlinearity (*f*), under non-gaussian stimulus conditions. Here we show that *Eq. 12*, which was derived in Paninski (2003) using a φ-divergence technique, can also be derived from the principle of minimization of the MSE. We apply the method to our scenario, in which the filter is assumed to have the form of a Reichardt detector, *K* = *K*_{H}(τ)*K*_{L}(τ′), parametrized only by the two time constants τ_{L} and τ_{H}. The sin[Δ*x*(*t* − τ, *t* − τ′)] term in *Eq. 3* corresponds to the non-gaussian stimulus discussed in Paninski (2003). Both the stimulus and the filter are, in our scenario, vectors in the space of functions of two times, τ and τ′.

##### PERFORMANCE AND MODEL SELECTION.

We consider three possible loci of adaptation: the two motion detector time constants τ_{L} and τ_{H} and the static nonlinearity *f*(·). To determine which, if any, of these parameters adapt to the velocity variance (σ^{2}), we compare the performance of all 2^{3} = 8 possible variants of our model. In each model variant, some of the parameters are estimated separately for each value of σ, whereas the other(s) are assumed to be fixed for all values of σ (see the table in Fig. 3). For simplicity, the delay *t*_{d} is assumed to be independent of σ in all model variants.

The simplest model variant is *model A* (see table in Fig. 3), in which all parameters are fixed for all values of σ. For this model variant, the averages in *Eqs. 11* and *12* are taken over all stimulus conditions, and τ_{L} and τ_{H} are scalars. To estimate the parameters of this model, we compute *y*_{τL,τH}(*t* − *t*_{d}), ƒ_{τL,τH,td}^{est}(·), and *V*[τ_{L}, τ_{H}, *t*_{d}] for all possible values of τ_{L}, τ_{H}, and t_{d} (upper and lower limits 2 ms ≤ τ_{L}, τ_{H} ≤ 600 ms, 0 ≤ *t*_{d} ≤ 60 ms, sampled at 2-ms intervals), using *Eqs. 3*, *11*, and *12*. We maximize *V*[τ_{L}, τ_{H}, *t*_{d}] by exhaustively screening this entire parameter space.

In *models B*, *C*, and *D*, *f*(·) is kept fixed for all values of σ, whereas τ_{L} and/or τ_{H} are allowed to adapt to the stimulus conditions. For these model variants, the averages in *Eqs. 11* and *12* are still over all stimulus conditions, but τ_{L} and/or τ_{H} become vector(s) of five time constants, one for each σ. Screening this high-dimensional parameter space exhaustively is prohibitively time consuming. We therefore maximize *V* for these model variants using a direct search algorithm, sampling progressively smaller (50 to 2 ms) intervals around the current estimated maximum. If no new maximum is found and the interval reaches 2 ms, the search is terminated; if a new maximum is found, the interval is increased and the search continues around the new maximum. The parameters that were estimated for *models F*, *G*, and *H*, respectively, are used as initial conditions for this algorithm for *models B*, *C*, and *D*, respectively.
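One way to realize a direct search of this kind is sketched below. The text specifies only that the sampling interval shrinks from 50 to 2 ms, is widened again after each improvement, and that the search ends when no improvement is found at 2 ms; the coordinate-wise probing, the particular step schedule, and the function name are assumptions made for illustration.

```python
import numpy as np

def direct_search(objective, start, lo=2.0, hi=600.0,
                  steps=(50.0, 20.0, 10.0, 4.0, 2.0)):
    """Maximise objective(params) by probing each parameter at +/- step (ms).

    Assumed step schedule; all steps are multiples of 2 ms so the search
    stays on the 2-ms grid used for the exhaustive screens.
    """
    best = np.array(start, dtype=float)
    best_val = objective(best)
    level = 0
    while True:
        step = steps[level]
        improved = False
        for i in range(best.size):
            for delta in (-step, step):
                trial = best.copy()
                trial[i] = np.clip(trial[i] + delta, lo, hi)
                val = objective(trial)
                if val > best_val:             # strict improvement only
                    best, best_val, improved = trial, val, True
        if improved:
            level = max(level - 1, 0)          # widen around the new maximum
        elif level == len(steps) - 1:
            return best, best_val              # finest interval, no improvement
        else:
            level += 1                         # shrink the interval
```

In use, `objective` would evaluate *V* for a candidate vector of time constants, and `start` would hold the initial conditions taken from the corresponding model with adaptive *f*(·).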

For *models E*–*H*, in which *f*(·) is allowed to adapt, the averages in *Eqs. 11* and *12* are calculated separately for each stimulus condition, resulting in five separate objective functions *V*_{σ} [τ_{L}, τ_{H}, *t*_{d}], σ = 0.1, 0.5, 1, 5, or 10 Hz, one for each stimulus condition. The arguments τ_{L} and τ_{H} of each *V*_{σ} are scalars. Each objective function is calculated exhaustively for all possible τ_{L}, τ_{H}, and t_{d}, as was done for *model A*. Each *V*_{σ} is maximized with respect to whatever time constant(s) are being allowed to adapt, obtaining the estimated adaptive time constant(s) for each possible choice of *t*_{d} and fixed time constant (if any). We maximize the sum of the resulting five *V*_{σ}^{max}[τ_{fixed}, *t*_{d}], to obtain the estimated fixed time constants(s) and *t*_{d}.

To compare the performance of the different model variants, we use a fivefold cross-validation procedure. Parameters for each model variant are estimated for a given fly using data from four fifths of the stimulus duration and tested on the remaining one fifth of the data from the same fly. The firing rate of the model in response to the test stimuli is calculated by applying *Eq. 3* to the test stimulus using the estimated time constants, yielding a Reichardt detector signal *y*_{test}(*t*) and generating a firing rate *r*_{test}(*t*) by linear interpolation of the estimated *f*(*y*). We calculate the MSE between *r*_{test}(*t*) and the response (PSTH) of H1 to the test stimulus. We repeat this procedure for five different choices of training and test datasets (folds), with each fold containing data from all five σ conditions. The resulting generalization MSEs are averaged over the five folds, yielding *n* = 10 generalization scores for each model, one for each fly. We perform a three-factor [τ_{L}, τ_{H}, and *f*(·)] two-level (fix/ad) repeated-measures ANOVA (Keppel and Wickens 2004) to determine which, if any, of the three model components significantly improve the generalization score when they are allowed to adapt. ANOVAs are performed using SPSS software (SPSS, Chicago, IL); effects with *P* < 0.01 are considered statistically significant. We also calculate the correlation coefficients between the predicted and actual firing rates in response to the various test stimuli.

##### CALCULATION OF VELOCITY RESPONSE TIME COURSE AND GAIN.

Stimulus-response cross-correlations are calculated using *Eq. 6*. After subtracting the baseline value, measured over the range of 5 × τ_{0} ≤ *t* ≤ 1 s during the prestimulus period, we normalize the correlation functions by their peak values. We calculate the full-width at half-max and the peak latency of the resulting normalized correlation functions.
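The width and latency measurements can be sketched as below, assuming the cross-correlation is defined as 〈*r*(*t*′ + *t*)*v*(*t*′)〉 averaged over *t*′ and sampled at non-negative lags. The linear interpolation of the half-maximum crossings and the function names are illustrative choices; the baseline is passed in as a number because its measurement window depends on the recording protocol.

```python
import numpy as np

def cross_correlation(r, v, max_lag):
    """c_rv[lag] = mean over t' of r(t' + lag) * v(t'), for lag = 0..max_lag bins."""
    n = len(r)
    return np.array([np.mean(r[lag:] * v[:n - lag]) for lag in range(max_lag + 1)])

def width_and_latency(c, dt, baseline=0.0):
    """Full width at half maximum and peak latency of a correlation function."""
    c = np.asarray(c, dtype=float) - baseline
    c = c / c.max()                            # normalise the peak to 1
    peak = int(np.argmax(c))
    left = peak
    while left > 0 and c[left - 1] >= 0.5:
        left -= 1
    right = peak
    while right < len(c) - 1 and c[right + 1] >= 0.5:
        right += 1
    # Linearly interpolate the two half-maximum crossings (in bins).
    t_left = float(left)
    if left > 0:
        t_left = left - (c[left] - 0.5) / (c[left] - c[left - 1])
    t_right = float(right)
    if right < len(c) - 1:
        t_right = right + (c[right] - 0.5) / (c[right] - c[right + 1])
    return (t_right - t_left) * dt, peak * dt
```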

Velocity response functions (*Eq. 7*) are calculated at a time lag equal to the peak of the stimulus-response cross-correlation *c*_{rv}(*t*) (*Eq. 6*). Velocity values are binned with a binwidth of 0.25 × σ. The velocity response gain is estimated by computing the maximal slope of the appropriate velocity response function. Slopes are computed using a velocity range of width 0.75 × σ.
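A sketch of the gain estimate is given below. The bin width of 0.25σ and the slope window of 0.75σ follow the text; fitting a least-squares line to the bin centres within each window is an assumption, since the text does not state how the slope within the window was computed, and the function names are hypothetical.

```python
import numpy as np

def velocity_response(r, v, lag, sigma, bin_frac=0.25):
    """Mean rate at time t' + lag, conditioned on the binned velocity at t'."""
    n = len(r)
    v0, r_lag = v[:n - lag], r[lag:]
    width = bin_frac * sigma
    k = np.round(v0 / width).astype(int)       # bins centred on multiples of width
    ks = np.unique(k)
    rates = np.array([r_lag[k == kk].mean() for kk in ks])
    return ks * width, rates

def max_gain(centres, rates, sigma, span_frac=0.75):
    """Largest fitted slope over all windows of width span_frac * sigma."""
    best = -np.inf
    tol = 1e-9 * sigma                         # guards against rounding at the edge
    for c0 in centres:
        sel = (centres >= c0) & (centres <= c0 + span_frac * sigma + tol)
        if sel.sum() >= 2:                     # need two points for a slope
            slope = np.polyfit(centres[sel], rates[sel], 1)[0]
            best = max(best, slope)
    return best
```

Here `lag` would be the peak lag of the cross-correlation, expressed in bins.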

##### CALCULATION OF INFORMATION RATES.

We calculate the mutual information between the spike count *k*(*t*) in a given window at time *t*, and the stimulus history {*v*(*t*+*t*′)}_{t′=−∞}^{∞}, for each stimulus condition. We use window sizes of 2, 4, 10, and 20 ms for the τ_{0} = 20 ms flies and 2, 4, 10, 20, 40, and 100 ms for the τ_{0} = 100 ms flies. This analysis makes no assumptions about which aspects of the stimulus are being encoded by the spike counts. Following Strong et al. (1998), and assuming sufficient sampling of the stimulus space, we replace the average over all possible stimulus histories with an average over time, yielding

$$I(k; \{v\}) = H(k) - H(k \mid \{v\}) \tag{13}$$

where *H*(*k*) = −∑_{k}*P*(*k*)log_{2}*P*(*k*) is the total entropy of the spike counts, and *H*(*k*|{*v*}) = *H*(*k*|*t*) = −∑_{k,t}*P*(*k*|*t*)log_{2}*P*(*k*|*t*) is the noise entropy. *P*(*k*|*t*) is the probability of observing *k* spikes at time *t*, and *P*(*k*) = ∑_{t}*P*(*k*|*t*) is the marginal distribution of *k*. The results of this analysis are presented in units of bits per second, and in units of bits per spike, calculated by dividing the result by the mean firing rate in each stimulus condition.

To calculate the total information rates of the H1 spike trains (*Eq. 13*), we estimate the distribution *P*(*k*|*t*) empirically by constructing a histogram of spike counts measured at a given time during different trials. Following Strong et al. (1998) and Borst (2003), we control for sampling bias by calculating *H*(*k*|*t*) based on data fractions of 1/10, 1/7, 1/5, 1/4, 1/3, 1/2, and 1 of the trials and performing a quadratic extrapolation to infinite sample size. To control for sampling bias of *H*(*k*) caused by finite stimulus duration, we calculate *H*(*k*) based on subsegments of 1/10, 1/7, 1/5, 1/4, 1/3, 1/2, and 1 of the stimulus duration, average over all subsegments for each fraction, and perform a quadratic extrapolation to infinite stimulus duration. As expected (Strong et al. 1998) for datasets of similar size (Brenner et al. 2000b), this correction is small.
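The sampling-bias correction for the noise entropy can be sketched as follows. The data fractions are those listed in the text; drawing a single random subset of trials per fraction, fitting a quadratic in the inverse data fraction, and reading off the intercept are illustrative assumptions about how the extrapolation is carried out, and the function names are hypothetical.

```python
import numpy as np

FRACTIONS = (1 / 10, 1 / 7, 1 / 5, 1 / 4, 1 / 3, 1 / 2, 1)

def entropy_bits(counts):
    """Plug-in entropy (bits) of an array of integer spike counts."""
    _, n = np.unique(counts, return_counts=True)
    p = n / n.sum()
    return -np.sum(p * np.log2(p))

def noise_entropy(k):
    """Time-averaged across-trial entropy; k has shape (trials, time bins)."""
    return np.mean([entropy_bits(k[:, t]) for t in range(k.shape[1])])

def extrapolated_noise_entropy(k, seed=0):
    """Quadratic extrapolation of the noise entropy to infinite sample size."""
    rng = np.random.default_rng(seed)
    n_trials = k.shape[0]
    inv_frac, h = [], []
    for f in FRACTIONS:
        m = max(2, int(round(f * n_trials)))
        rows = rng.choice(n_trials, size=m, replace=False)
        inv_frac.append(n_trials / m)          # inverse of the data fraction used
        h.append(noise_entropy(k[rows]))
    coeffs = np.polyfit(inv_frac, h, 2)        # H = a/f^2 + b/f + H_inf
    return coeffs[-1]                          # intercept: infinite-sample estimate
```

The same fit applied to the total entropy, with subsegments of the stimulus in place of subsets of trials, would give the second correction described above.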

We also calculate the information rates predicted by the Reichardt model of motion detection. For the smallest window size of 2 ms, we can safely assume that only one spike per time window can be produced, because of the refractory period of the H1 neuron. We therefore interpret the firing rate of the Reichardt model, *r*(*t*), expressed in units of spikes per bin, as an instantaneous probability of spiking, allowing us to evaluate *Eq. 13* using *P*(*k* = 1|*t*) = *r*(*t*) and *P*(*k* = 0|*t*) = 1 − *r*(*t*). For larger window sizes, it is necessary to incorporate a description of the trial-to-trial variability of the spike counts into our model. The responses of H1 to time-varying velocity signals such as ours are known to be highly reliable across trials, rendering the often-used inhomogeneous Poisson model a poor description of the spiking statistics (de Ruyter van Steveninck et al. 1997; Haag and Borst 1997). To incorporate a more accurate description of the spiking statistics into our model, we generalize the static nonlinearity *f*(*y*) (*Eq. 11*), by calculating the full distribution of spike counts for a given value of the Reichardt detector output signal, *P*(*k*|*y*). This distribution is calculated directly from the spike trains of H1 for each fly and for each stimulus condition, using the output signal *y*_{τL,τH}(*t* − *t*_{d}) of a Reichardt detector with the τ_{L}, τ_{H} calculated in the parameter estimation procedure and delayed by the estimated *t*_{d}. Values of *y* are discretized with a binwidth of 3/7 · Std(*y*_{τL,τH}). The mean of *P*(*k*|*y*) is equal, by definition, to the value of the static nonlinearity *f*_{τL,τH,td}^{est}(*y*) (*Eq. 11*), whereas its overall shape describes the trial-to-trial variability of the spike counts for a given value of the Reichardt detector output signal. This technique retains the assumption that the spiking statistics are determined exclusively by the instantaneous value of the Reichardt detector output signal, *y*, while “borrowing” the detailed behavior of the trial-to-trial variability from the actual data. For a description of a similar technique, see Brenner et al. (2000a).
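For the 2-ms window, where each bin holds at most one spike, the model information rate can be sketched directly from the predicted spike probability per bin. The conversion to bits per second by dividing by the bin width is an assumption about how the units were obtained, and the function names are illustrative.

```python
import numpy as np

def binary_entropy(p):
    """Entropy (bits) of a Bernoulli variable with success probability p."""
    p = np.clip(p, 1e-12, 1.0 - 1e-12)         # avoid log2(0)
    return -(p * np.log2(p) + (1.0 - p) * np.log2(1.0 - p))

def bernoulli_information(r, dt=0.002):
    """Information carried by single-bin spike counts.

    r : model spike probability per bin (spikes/bin), with P(k=1|t) = r(t).
    Returns (bits per second, bits per spike).
    """
    r = np.asarray(r, dtype=float)
    total = binary_entropy(np.mean(r))         # entropy of the marginal P(k)
    noise = np.mean(binary_entropy(r))         # time-averaged conditional entropy
    bits_per_s = (total - noise) / dt
    mean_rate = np.mean(r) / dt                # spikes per second
    return bits_per_s, bits_per_s / mean_rate
```

A constant `r` carries no information under this measure, whereas a fully deterministic alternation between 0 and 1 spikes per bin carries one bit per bin.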

To calculate the information rates predicted by the Reichardt model, we use *Eq. 3* to calculate the Reichardt detector output signal *y*_{τL,τH}(*t*−*t*_{d}) generated in response to the stimuli used in the experiment, using the parameters τ_{L}, τ_{H}, and *t*_{d} previously estimated for each fly, averaged over training sets. For each stimulus condition (σ), we use the appropriate adaptive value of τ_{H}, as determined by the fitting procedure for model variant G. We calculate the generalized model response *P*_{model}(*k*|*t*) by linear interpolation of the appropriate *P*(*k*|*y*) at *y* = *y*_{τL,τH}(*t*−*t*_{d}) [negative *P*_{model}(*k*|*t*) values resulting from extrapolation beyond the edges of *P*(*k*|*y*) were set to zero and the distribution was renormalized]. The resulting model response is then substituted into *Eq. 13*, yielding the model's information rate. To quantify the contribution of adaptation to information transmission, we repeat this analysis with τ_{H} fixed at its minimal and maximal observed values for all stimulus conditions and compare the results.

## RESULTS

### Performance and model selection

We find that the elaborated Reichardt model accounts well for the complex responses of H1 to our randomly fluctuating velocity stimuli. Figure 2*A* shows 3.6-s segments of the experimentally measured firing rate (PSTH) of an H1 neuron under the five velocity variance conditions compared with the responses predicted by the Reichardt model (variant G, see table in Fig. 3). The model response follows the actual PSTH quite closely, despite the fact that model parameters for each 1.8-s segment were estimated using the remaining portion of the data (see *Methods*). The average correlation coefficients (CCs) between actual responses and model (variant G) predictions were between 0.76 and 0.96, depending on the stimulus condition.

To determine which parameters of the motion detection system adapt to stimulus statistics, we compare the performance of eight model variants. In each model variant, some model parameters are estimated separately for each velocity variance condition, thereby allowing for adaptation to stimulus conditions, whereas the other parameter(s) were assumed to be fixed for all values of σ. Figure 3 shows the MSEs for each of the eight model variants, for the τ_{0} = 20 ms and τ_{0} = 100 ms flies. The table indicates which components are kept fixed (fix) and which are allowed to adapt to stimulus statistics (ad) in each model variant. Comparing *models A* and *B* with *models C*–*F*, we observe that allowing adaptation of either τ_{H} or *f*(·) leads to a significant (3-way repeated-measures ANOVA, *n* = 10, *P* ≤ 0.001, main effects) reduction in the generalization error of the model. In contrast, estimating τ_{L} separately for each stimulus condition does not reduce the generalization error (*P* = 0.829 for τ_{0} = 20 ms, *P* = 0.189 for τ_{0} = 100 ms), as can be seen by comparing *model A* to *B*, *C* to *D*, etc. Allowing both τ_{H} and *f*(·) to adapt to the stimulus conditions (*model G*) results in a further reduction of the generalization error, as can be seen by comparing *model G* to *models C*–*F*. These findings are consistent for both of the stimulus correlation times (τ_{0}) used in our experiments [for τ_{0} = 20 ms, there was also a significant (*P* < 0.001) negative interaction between τ_{H} and *f*(·), reflecting the fact that the contributions of these 2 parameters to the MSE sum sublinearly]. We conclude that model variant G, with adaptive τ_{H} and *f*(·), is the best fit to H1 responses under our experimental conditions. 
This model variant also outperformed model variant E [adaptive *f*(·), fixed τ_{H}] under each individual stimulus condition (Wilcoxon sign-rank tests for generalization MSEs and CCs, *P* < 0.01), with the exception of τ_{0} = 100 ms, σ = 10 Hz, where the differences were not statistically significant.

Figure 2, *B* and *C*, compares segments of the experimentally measured firing rate (PSTH) with the responses of model variants E [adaptive *f*(·), fixed τ_{H}, brown] and G [adaptive *f*(·) and τ_{H}, blue] for a low-variance (σ = 0.1 Hz; Fig. 2*B*) and a high-variance (σ = 5 Hz; Fig. 2*C*) stimulus condition. In both cases, *model G* is a better match to the actual response, because of adaptation of the high-pass time constant τ_{H}, which is long for the low-variance condition and short for high-velocity variance (see *Motion detector parameters* and Fig. 4*A*). The fixed τ_{H} estimated under model variant E had an intermediate value, leading to an overly brisk model response to the low-variance stimulus (Fig. 2*B*) and an excessively sluggish response under the high-variance condition (Fig. 2*C*). For the remainder of this paper, we will therefore present results for model variant G, which we will refer to as the adaptive Reichardt model.

### Motion detector parameters

The strongest dependence on stimulus statistics is exhibited by the HPF time constant τ_{H} (Fig. 4*A*), which shrinks from ∼200 ms to as little as 20 ms with increasing σ. The behavior of τ_{H} is consistent for the two stimulus correlation times used in our experiments. The low-pass filter time constant (τ_{L}) amounted to ∼30 ms, and the fixed delay (*t*_{d}) was ∼20 ms. The second model component that exhibits statistically significant adaptation is the static nonlinearity (Fig. 4, *C* and *D*). The maximal slope of this function decreases by a factor of 3–4 as σ is increased (Fig. 4*B*). Model parameters estimated from τ_{0} = 20 ms flies' responses to a σ = 5 Hz, τ_{0} = 100 ms control stimulus were consistent with the behavior of the τ_{0} = 100 ms flies at σ = 5 Hz, although their mean firing rates were slightly lower than those of the τ_{0} = 100 ms flies.

In conclusion, we observe significant and systematic adaptation of motion detector parameters to the variance of our random velocity stimuli, as parametrized by the changes in the HPF time constant τ_{H} and the static nonlinearity *f*(·) in model variant G. In the following sections we will analyze the contribution of these parameter changes to the system's velocity response properties and information transmission.

### Contribution of time-constant adaptation to velocity response time course and gain

We quantify the effective time course of the H1 velocity response by calculating the normalized stimulus-response cross-correlation function (*Eq. 6*). Figure 5*A* shows the shape of this function for the various stimulus conditions, calculated from the H1 spike trains and from the responses of model variant G. As previously reported in both the fly (Borst et al. 2005) and primate (Bair and Movshon 2004) visual systems, the width and peak latency of this function decrease with increasing velocity fluctuations (Fig. 5, *B* and *C*; black traces). The adaptive Reichardt model (Fig. 5, *B* and *C*; blue traces) reproduces this effect faithfully.
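As an illustration of how a width and a peak latency can be read off a stimulus-response cross-correlation function, the following is a minimal sketch only. The exact normalization of *Eq. 6* is not reproduced here, so the sketch assumes mean-subtracted signals, normalization by the peak value, and a width defined as the number of lags at or above half the peak. The function names, the 1-ms bin, and the synthetic test signal (a "response" that is simply a delayed copy of a white-noise "velocity") are hypothetical and are not taken from the recordings.

```python
import random


def normalized_cross_correlation(stimulus, response, max_lag):
    """Correlate the response with the past stimulus for lags 0..max_lag (in bins).

    Both signals are mean-subtracted, and the result is divided by its
    largest-magnitude value so that the peak equals 1 (assumed normalization).
    """
    n = len(stimulus)
    mean_s = sum(stimulus) / n
    mean_r = sum(response) / n
    s = [x - mean_s for x in stimulus]
    r = [x - mean_r for x in response]
    raw = [sum(r[t] * s[t - lag] for t in range(lag, n))
           for lag in range(max_lag + 1)]
    peak = max(raw, key=abs)
    return [c / peak for c in raw]


def peak_latency_and_width(c, dt):
    """Return (peak latency, width) in seconds.

    Width is the number of lags at or above half the peak, times dt.
    Contiguity of those lags is not checked in this sketch.
    """
    peak_index = max(range(len(c)), key=lambda i: c[i])
    n_above_half = sum(1 for value in c if value >= 0.5 * c[peak_index])
    return peak_index * dt, n_above_half * dt


# Synthetic check: the "response" is the stimulus delayed by 5 bins.
rng = random.Random(0)
dt = 0.001                      # 1-ms bins (assumption)
delay_bins = 5
velocity = [rng.gauss(0.0, 1.0) for _ in range(2000)]
rate = [0.0] * delay_bins + velocity[:-delay_bins]

c_rv = normalized_cross_correlation(velocity, rate, max_lag=50)
latency, width = peak_latency_and_width(c_rv, dt)
print(f"peak latency = {latency * 1000:.1f} ms, width = {width * 1000:.1f} ms")
```

For a pure delay of white noise the function is a single narrow peak at the delay; a response with temporal integration would broaden the peak and shift it to longer lags.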

To quantify the contribution of τ_{H} adaptation to this effect, we simulate the model's response to the various stimulus conditions, but with τ_{H} held fixed at the minimal (20 ms; Fig. 5, *B* and *C*, red traces) or maximal (200 ms; Fig. 5, *B* and *C*, green traces) values observed in our experiments. We measure the width and latency of the resulting *c*_{rv}. As previously reported (Borst et al. 2005), the width and latency decrease with σ even when τ_{H} is held fixed at 200 ms (Fig. 5, *B* and *C*, green traces; the slight nonmonotonicity of some of the τ_{H} = 200 ms simulation results shown in Fig. 5, *B* and *C*, is an artifact of the particular stimulus realizations used in our experiments and did not occur in model simulations when longer stimuli were used). However, the actual observed time-course control (Fig. 5, *B* and *C*, black traces) is clearly steeper, indicating that τ_{H} adaptation contributes significantly to this effect under most stimulus conditions. For large stimulus fluctuations (i.e., σ = 10 Hz), the inherent nonlinearity dominates, automatically suppressing the contribution of past velocity history regardless of the value of τ_{H}. Conversely, when τ_{H} is fixed at 20 ms (red traces), there is no noticeable automatic time-course control under our stimulus conditions, because of the weak history dependence of the model response.
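The logic of holding τ_{H} fixed can be illustrated with a minimal sketch of a single Reichardt detector. This is not the fitted model: it assumes a single detector pair viewing a sinusoidal grating, first-order filters integrated with a forward-Euler step, a receptor phase separation of π/2, τ_{L} = 30 ms (the value quoted above), and no static nonlinearity *f*(·). All function and parameter names are hypothetical. For simplicity it is driven by constant motion rather than by the random velocity stimuli used in the experiments, so it shows only how τ_{H} enters the computation, not the time-course results of Fig. 5.

```python
import math


def low_pass(x, tau, dt):
    """First-order low-pass filter (forward Euler), starting from zero state."""
    out, state = [], 0.0
    alpha = dt / tau
    for xi in x:
        state += alpha * (xi - state)
        out.append(state)
    return out


def high_pass(x, tau, dt):
    """First-order high-pass filter: the input minus its low-passed copy."""
    return [xi - li for xi, li in zip(x, low_pass(x, tau, dt))]


def reichardt_output(velocity_hz, tau_h, tau_l=0.030, dphi=math.pi / 2, dt=0.001):
    """Opponent output y(t) of one detector pair for a sinusoidal grating.

    velocity_hz: temporal frequency of the grating in each time bin (Hz).
    The two receptor signals differ by the spatial phase dphi (assumed value).
    """
    phase, s1, s2 = 0.0, [], []
    for v in velocity_hz:
        phase += 2.0 * math.pi * v * dt
        s1.append(math.sin(phase))
        s2.append(math.sin(phase - dphi))
    l1, l2 = low_pass(s1, tau_l, dt), low_pass(s2, tau_l, dt)
    h1, h2 = high_pass(s1, tau_h, dt), high_pass(s2, tau_h, dt)
    # Multiply LPF of one arm with HPF of the other, then subtract the mirror term.
    return [a * b - c * d for a, b, c, d in zip(l1, h2, l2, h1)]


def steady_state_mean(v_hz, tau_h, n_bins=2000):
    """Mean output over the second half of a constant-velocity simulation."""
    y = reichardt_output([v_hz] * n_bins, tau_h)
    half = n_bins // 2
    return sum(y[half:]) / (n_bins - half)


for tau_h in (0.020, 0.200):    # minimal and maximal values observed above
    print(f"tau_H = {tau_h * 1000:.0f} ms: "
          f"preferred {steady_state_mean(2.0, tau_h):+.3f}, "
          f"null {steady_state_mean(-2.0, tau_h):+.3f}")
```

In this sketch the output changes sign with the direction of motion, and at 2 Hz the steady-state output is larger with τ_{H} = 200 ms than with τ_{H} = 20 ms, because the longer high-pass time constant passes more of the slowly varying input.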

The H1 neuron is also known to rapidly adapt its velocity response gain (*Eq. 8*) to the amplitude of stimulus velocity fluctuations (Borst et al. 2005; Brenner et al. 2000a; Fairhall et al. 2001). Figure 6*A* shows the velocity response functions (*Eq. 7*) for the various stimulus conditions, as calculated from the H1 spike trains and from the adaptive Reichardt model responses. The black and blue traces in Fig. 6*B* show the gain of this function (*Eq. 8*) for data and the adaptive Reichardt model, respectively. Over our range of σ, the velocity gain is reduced by a factor of 22 (τ_{0} = 20 ms) or 35 (τ_{0} = 100 ms) in H1 responses and by a similar factor in the adaptive Reichardt model. To determine whether time-constant adaptation contributes to the observed gain control, we calculated the velocity response gain from model simulations in which τ_{H} was held fixed at 20 or 200 ms for all stimulus conditions (Fig. 6*B*, red and green traces). The behavior of the gain is not altered when τ_{H} is kept fixed at its unadapted value of 200 ms (green), indicating that time-constant adaptation does not contribute to velocity gain control.

The contribution of *f*(·) adaptation to velocity gain control is difficult to quantify without an exact description of the saturation properties of *f*(·) and their dependence on stimulus conditions. However, we note that the slope of *f*(·) decreases by a factor of 3–4 (Fig. 4*B*), whereas the velocity gain is reduced by a factor of 22–35 (Fig. 6*B*), suggesting that most of the observed gain control is explained by the inherent nonlinearity of the motion detector as described by the Reichardt model, independent of parameter change (Borst et al. 2005). The decrease in the slope of *f*(·) can be interpreted as the signature of additional processes of gain adaptation that are not accounted for by the Reichardt model.

### Information transmission

The information rate of the H1 neuron has been found to be rather insensitive to the statistics of the velocity stimulus, whereas decreasing the size or the contrast of the grating results in a strong deterioration of information transmission (Borst 2003; Fairhall et al. 2001). However, these studies examined only a limited range of stimulus conditions. Here we calculate H1 information rates for a wide range of velocity variance conditions, spanning two orders of magnitude (Fig. 7, *A* and *B*, black traces). We find that over this wide range of σ, H1 information rates (bits/s, Fig. 7*A*; bits/spike, Fig. 7*B*, black traces) are not invariant to stimulus statistics. The adaptive Reichardt model (blue traces) exhibits similar behavior, albeit with lower overall information rates, reflecting the portion of response fluctuations that are not accounted for by the model.

We find that H1 information rates (bits/s) are correlated, across flies and stimulus conditions, with the mean (*r* = 0.76) and the variance (*r* = 0.98) of the firing rate. The mean firing rate (Fig. 7*C*, black) exhibits a bell-shaped dependence on σ, and the variance of the firing rate (Fig. 7*D*, black), as well as the information rate (bits/s; Fig. 7*A*), shows a similar trend for our range of stimulus conditions. Similar findings regarding the mean and variance of H1 firing rates have been previously reported in Flanagin (2006). The mean firing rates of our adaptive Reichardt model (Fig. 7*C*, blue) are by definition equal to those observed in the data, because of the way the static nonlinearity is defined (*Eq. 11*). The response variance of the adaptive Reichardt model (Fig. 7*D*, blue) exhibits a bell-shaped trend similar to that observed in the data.

Following the steps described in *Elaborated Reichardt motion detector model*, we derive an analytical expression for the variance of the Reichardt detector output signal, Var(*y*) (*Eq. 4*). This calculation reveals that Var(*y*) is a bell-shaped function of σ (Fig. 8). Thus the basic Reichardt model, without the static nonlinearity, already predicts bell-shaped behavior of the mean firing rate, firing-rate variance, and information rate (bits/s).

The velocity gain control observed in H1 has been interpreted as an adaptive rescaling of the response, which serves to maximize information transmission by enabling the system to use its full dynamic range under changing stimulus conditions (Brenner et al. 2000a). However, our results suggest that gain control is primarily an automatic consequence of the inherent nonlinearity of the motion detector, whereas the primary adaptive process is related to the system's temporal kernel. This adaptation is parametrized in our model by changes in τ_{H}. We therefore ask whether τ_{H} adaptation maximizes information transmission. To address this question, we calculate the information rates (*Eq. 13*) predicted by the adaptive Reichardt model (Fig. 7*A*, blue traces). We compare these results to model simulations in which τ_{H} is held fixed at its minimal (20 ms; Fig. 7*A*, red traces) or maximal (200 ms; Fig. 7*A*, green traces) observed value. We find that for intermediate values of σ, time-constant adaptation actually has a detrimental effect on information transmission, because information rates (bits/s) would have been higher had τ_{H} remained at its unadapted value of 200 ms. The information rate in bits per spike (Fig. 7*B*) is essentially the same with τ_{H} adaptation (blue trace) and with τ_{H} fixed at 200 ms (green trace), indicating that τ_{H} adaptation cannot be understood as an optimization of this measure, either. These results did not depend on the size of the time window used to count spikes (we show results for spike count window sizes of 4 ms, because these yielded the highest information rates at most σ for both values of τ_{0}), or on the extrapolation method used for *f*(*y*).

Our analytical results (*Eq. 4*; Fig. 8) show that the variance of the Reichardt detector output signal, Var(*y*), is an increasing function of τ_{H} for our range of parameters and stimulus conditions. This increase leads to the higher firing rates, response variance, and information rates (bits/s) observed in the full model simulations (green traces compared with blue in Fig. 7, *A*, *C*, and *D*), where *y* is fed through the static nonlinearity, *f*(*y*), to generate the model's firing rate. Thus our results suggest that saturation of the system's dynamic range, parametrized in our model by the shape of the static nonlinearity, is not an important limiting factor for information transmission under our stimulus conditions.

## DISCUSSION

### Adaptation of motion detector parameters

In this work, we studied the adaptation of the H1 neuron to stimulus velocity variance by fitting the neural responses to a simple model of motion detection. The model consisted of an array of Reichardt motion detectors, containing an HPF, an LPF, a multiplier, and a subtraction stage, followed by a static nonlinearity (Fig. 1). We found that the HPF time constant and static nonlinearity of the motion detector adapt to stimulus statistics (Figs. 3 and 4). The HPF time constant shortens considerably when stimulus fluctuations are increased, whereas the static nonlinearity shows a relatively small reduction in its slope. In contrast, we did not detect any adaptation of the LPF time constant (Fig. 3, *column A* vs. *B*, *C* vs. *D*, etc.). This behavior was highly consistent for the two stimulus correlation times (τ_{0}) used.

Our findings regarding the time constants of H1 are consistent with earlier results obtained using simple motion stimuli in adapt-and-probe experimental paradigms. In its unadapted state, the H1 cell has been found to respond to a brief motion pulse with a sudden rise in activity, followed by an exponential decay with a time constant of ∼300 ms. After prolonged exposure to constant motion stimuli, the decay time constant of the impulse response was found to shorten to values as low as 30 ms. The extent of the shortening was found to depend systematically on the velocity and the contrast of the adapting stimulus (de Ruyter van Steveninck et al. 1986). This adaptation was found to occur after exposure to motion in either the preferred or the null direction and even after exposure to flicker stimuli (Borst and Egelhaaf 1987). Adaptation was observed only when the test stimulus was presented in the same area of the visual field as the adapting stimulus, indicating that this is a spatially local process (de Ruyter van Steveninck et al. 1986; Reisenman et al. 2003). Analytical treatment of the Reichardt model with an HPF in one arm and an LPF in the other, as used in this work, shows that the impulse response time constant is equal to τ_{H} (Borst et al. 2003), indicating that this parameter undergoes motion adaptation. In this work, we found that τ_{H} shortens after exposure to white-noise velocity stimuli, with the amount of shortening depending systematically on the variance of the velocity fluctuations, extending the results of Borst and Egelhaaf (1987).

We did not detect any significant adaptation of the LPF time constant of the motion detection system for our stimulus conditions. This result is consistent with the behavior of the steady-state responses of H1 and other fly motion-sensitive neurons to constant motion stimuli. This response has been found to peak at stimulus velocities of ∼2–10 Hz, independent of the stimulus contrast (Harris et al. 1999; Reisenman et al. 2003). Similarly, the steady-state responses of HS motion-sensitive neurons in the drone fly (*Eristalis tenax*) have been found to peak at ∼7 Hz, independent of prior motion adaptation (Harris et al. 1999). These findings indicate that the relevant time constant for the steady-state velocity tuning is fixed at a few tens of milliseconds [i.e., 1/(2π × 4 Hz) ≈ 40 ms] and does not undergo significant motion adaptation. Analytical treatment of the Reichardt model shows that the location of the peak of the steady-state response depends primarily on τ_{L}, explaining the different behavior of the impulse and steady-state responses (Borst et al. 2003).
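The conversion from a peak temporal frequency to a time constant used in the bracketed estimate above can be checked with a few lines of arithmetic. The sketch below applies the same 1/(2π*f*) relation to the ends of the reported ∼2–10 Hz range; the function name is hypothetical, and the relation is used here only as the rough rule of thumb it is in the text, not as an exact property of the fitted model.

```python
import math


def time_constant_ms(peak_frequency_hz):
    """Time constant tau = 1 / (2*pi*f), returned in milliseconds."""
    return 1000.0 / (2.0 * math.pi * peak_frequency_hz)


for f_hz in (2.0, 4.0, 10.0):
    print(f"{f_hz:4.1f} Hz -> {time_constant_ms(f_hz):5.1f} ms")
```

The reported range of peak frequencies thus corresponds to time constants between roughly 16 and 80 ms, that is, a few tens of milliseconds, in line with the fitted τ_{L} of ∼30 ms.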

Lindemann et al. (2005) recorded the graded membrane potential responses of the blowfly motion-sensitive neuron HSE to naturalistic motion stimuli and fitted them to a similar model of motion detection. Their analysis yielded seemingly nonsystematic changes in the estimated motion detector time constants, which only weakly improved the goodness of fit. In contrast, our experimenter-controlled velocity stimuli enabled us to observe a systematic reduction in τ_{H} with increasing velocity fluctuations, and to examine the relative contribution of this adaptation to H1's response time course and information transmission properties. In addition, our cross-validation procedure indicated that of the two time constants, only τ_{H} exhibits significant adaptation under our stimulus conditions.

In addition to the shortening of τ_{H}, we also found a decrease in the slope of the static nonlinearity of the motion detector (Fig. 4*B*). This would predict a reduction in the amplitude of the impulse and steady-state responses after motion adaptation. Interestingly, Harris et al. (1999) found that the amplitude of both the impulse and the steady-state responses of the HS neuron was reduced after exposure to constant motion stimuli because of a hyperpolarizing afterpotential.

### Contribution to velocity response time-course and gain control

By comparing our results to model simulations in which τ_{H} was fixed at its maximal and minimal observed values, we showed that time-constant adaptation contributes significantly to the shortening of the velocity response time course (*c*_{rv}) but not to velocity gain control (Figs. 5 and 6). Over our range of σ, the velocity gain was reduced by a factor of 22 (τ_{0} = 20 ms) to 35 (τ_{0} = 100 ms) (Fig. 6*B*). A large portion of this effect can be accounted for by the inherent adaptive properties of the motion detection system, as described in Borst et al. (2005). The contribution of additional processes of gain adaptation is parametrized by the static nonlinearity *f*(·), which decreases its slope by a factor of three to four as σ is increased (Fig. 4*B*).

### Adaptation and information transmission

Our model simulations indicate that time-constant adaptation in H1 does not optimize information transmission, because information rates would have been higher (bits/s) or unchanged (bits/spike) if τ_{H} had remained at its maximal value of 200 ms for all stimulus conditions (Fig. 7, *A* and *B*, cf. green and blue traces). Our results suggest that, as stimulus fluctuations are increased, time-constant adaptation facilitates increased encoding of recent velocities, at the expense of the system's overall information rate. Identifying the specific stimulus features encoded by H1 under different stimulus conditions and quantifying the reasons for their selection from a functional point of view are interesting challenges for future experimental and theoretical studies.

Optimal information transmission also predicts that information rates should remain constant over changing stimulus conditions. This prediction was corroborated in Borst (2003), where information rates were found to be largely independent of the stimulus entropy (firing rates were fairly constant for the stimuli used in that work). Similarly, Fairhall et al. (2001) found that H1 information rates scale with the firing rate, implying that the system is operating at a constant, presumably optimal, level of efficiency (bits/spike). However, these studies examined only a limited range of stimulus conditions. In this work, we calculated H1 information rates for a wider range of velocity fluctuation amplitudes, spanning two orders of magnitude. We found that, over this range of stimulus conditions, H1 information rates (both bits/s and bits/spike) are dependent on the amplitude of velocity fluctuations.

### Multiple adaptive processes

Adaptation is usually defined as a change in the response properties of a given system, making it better suited to cope with the present environment. The fly motion detection system is a classic example of multiple adaptive behaviors (Borst and Egelhaaf 1987; Borst et al. 1995, 2005; Brenner et al. 2000a; de Ruyter van Steveninck et al. 1986; Fairhall et al. 2001; Harris et al. 1999; Reisenman et al. 2003; Single et al. 1997). In this work, we focused on the adaptive temporal integration of motion signals in H1. We related the well-known adaptation of the H1 velocity impulse response observed in simple adapt-and-probe paradigms (Borst and Egelhaaf 1987; Borst et al. 2003; de Ruyter van Steveninck et al. 1986; Reisenman et al. 2003) to the narrowing of the velocity STA observed in experiments using random motion stimuli (Borst et al. 2005), through adaptation of the motion detector time constant τ_{H}. Our model simulations indicated that adaptation of τ_{H} does not optimize the system's overall information rate. In addition, we found that H1 information rates are not constant over our range of stimulus conditions.

## GRANTS

A. Borst was supported by the Max Planck Society. H. Sompolinsky is partially supported by a U.S.–Israel Binational Science Foundation grant and funding from the Volkswagen Foundation.

## Footnotes

^{1}For possible mechanisms of neuronal multiplication, see Gabbiani et al. (2002) and Torre and Poggio (1978).

^{2}For nonsinusoidal periodic gratings, we find that *y*(*t*) is a weighted sum over the spatial frequencies present in the grating, where each element of the sum is proportional to sin(*k*Δ*x*), *k* = 1, 2, … (M.N.S., unpublished notes).

The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked “*advertisement*” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

- Copyright © 2007 by the American Physiological Society