Neurons in the accessory optic system (AOS) and pretectum are involved in the analysis of optic flow and the generation of the optokinetic response. Previous studies found that neurons in the pretectum and AOS exhibit direction selectivity in response to large-field motion and are tuned in the spatiotemporal domain. Furthermore, it has been emphasized that pretectal and AOS neurons are tuned to a particular temporal frequency, consistent with the “correlation” model of motion detection. We examined the responses of neurons in the nucleus of the basal optic root (nBOR) of the AOS in pigeons to large-field drifting sine wave gratings of varying spatial (SF) and temporal frequencies (TF). nBOR neurons clustered into two categories: “Fast” neurons preferred low SFs and high TFs, and “Slow” neurons preferred high SFs and low TFs. The fast neurons were tuned for TF, but the slow nBOR neurons had spatiotemporally oriented peaks that suggested velocity tuning (TF/SF). However, the peak response was not independent of SF; thus we refer to the tuning as “apparent velocity tuning” or “velocity-like tuning.” Some neurons showed peaks in both the fast and slow regions. These neurons were TF-tuned at low SFs, and showed velocity-like tuning at high SFs. We used computer simulations of the response of an elaborated Reichardt detector to show that both the TF-tuning and velocity-like tuning shown by the fast and slow neurons, respectively, may be explained by modified versions of the correlation model of motion detection.
The pretectum and the accessory optic system (AOS) have been implicated in the processing of the visual consequences of self-motion, known as optic flow (Gibson 1954), and the generation of the optokinetic response (OKR) to facilitate retinal image stabilization (for reviews see Grasse and Cynader 1990; Simpson 1984; Simpson et al. 1988). The AOS and pretectum are highly conserved in vertebrates. The mammalian pretectal nucleus of the optic tract (NOT) is homologous to the nucleus lentiformis mesencephali (LM) in birds, whereas the avian nucleus of the basal optic root (nBOR) of the AOS is homologous to the medial (MTN) and lateral terminal nuclei (LTN) of the mammalian AOS (Fite 1985; McKenna and Wallman 1985a; Simpson 1984; Simpson et al. 1988; Weber 1985). In numerous species, it has been shown that pretectal and AOS neurons have large receptive fields in the contralateral eye and exhibit direction selectivity to moving large-field stimuli (NOT: Collewijn 1975a,b; Hoffmann and Schoppmann 1981; Mustari and Fuchs 1990; Volchan et al. 1989; LM: Cooper and Magnin 1986; Fan et al. 1995; Fite et al. 1989; Katte and Hoffmann 1980; McKenna and Wallman 1985b; Winterson and Brauth 1985; Wylie and Frost 1996; MTN/LTN: Grasse and Cynader 1982; Grasse et al. 1984; Natal and Britto 1987; Soodak and Simpson 1988; nBOR: Burns and Wallman 1981; Gioanni et al. 1984; Morgan and Frost 1981; Rosenberg and Ariel 1990; Wylie and Frost 1990a). The AOS and pretectum provide input to olivo-vestibulocerebellar pathways that respond best to patterns of optic flow resulting from self-translation and self-rotation (Graf et al. 1988; Simpson et al. 1981; Wylie and Frost 1993, 1999; Wylie et al. 1993, 1998).
Using large-field drifting sine wave gratings of varying spatial frequency (SF) and temporal frequency (TF), a few studies have shown that AOS and pretectal neurons are tuned in the spatiotemporal domain. Ibbotson et al. (1994) recorded from the NOT of wallabies and found that there were two groups of neurons: those that preferred high SFs and low TFs versus those that preferred low SFs and high TFs. Given that velocity = TF/SF, these two groups were referred to as “slow” and “fast” neurons, respectively. Strikingly similar observations were made in the pigeon LM and nBOR (Crowder and Wylie 2001; Wylie and Crowder 2000). Wolf-Oberhollenzer and Kirschfeld (1994) also recorded the responses of pigeon nBOR neurons to sine wave gratings, but they used a restricted range of SFs (<0.185 cpd), which did not include the SFs that maximally stimulate slow neurons (0.25–2 cpd in pigeon nBOR and LM, and wallaby NOT; Crowder and Wylie 2001; Ibbotson et al. 1994; Wylie and Crowder 2000). Both Ibbotson et al. (1994) and Wolf-Oberhollenzer and Kirschfeld (1994) emphasized that neurons were tuned to TF rather than stimulus velocity, consistent with the “correlation” model of motion detection (Barlow and Levick 1965; Reichardt 1957, 1961; van Santen and Sperling 1985) as opposed to the “gradient” models, which predict velocity tuning over a broad range of SFs and TFs (e.g., Buchner 1984; Marr and Ullman 1981; Srinivasan 1990).
In the present study we recorded the responses of neurons in the pigeon nBOR to drifting sine wave gratings, but used a broader range of SFs than those used by Wolf-Oberhollenzer and Kirschfeld (1994). We found that, whereas the fast cells were tuned to TF, the responses of the slow cells were more closely related to velocity than to TF. Although it has been assumed that the correlation model of motion detection (Barlow and Levick 1965; Reichardt 1957, 1961; van Santen and Sperling 1985) is not well suited for the measurement of image velocity, some versions of the correlation model produce responses that are dependent on image speed (e.g., Zanker et al. 1999). The data are discussed with regard to these recent elaborations of the correlation model of motion detection.
Surgery and extracellular recording
The methods used conformed to the guidelines established by the Canadian Council on Animal Care and were approved by the Biosciences Animal Welfare and Policy Committee at the University of Alberta. Details for anesthesia, extracellular recording, stimulus presentation, and data analysis were previously described by Wylie and Crowder (2000). Briefly, pigeons were anesthetized with a ketamine (65 mg/kg)–xylazine (8 mg/kg) mixture (im) and supplemental doses were administered as necessary. Based on the pigeon stereotaxic atlas (Karten and Hodos 1967), sufficient bone and dura were removed to access the nBOR with vertical penetrations. Recordings were made with tungsten microelectrodes (impedance 2–5 MΩ) or glass micropipettes filled with 2 M NaCl (tip diameters 4–5 μm; impedance 2–5 MΩ). The extracellular signal was amplified, filtered, displayed on an oscilloscope, and fed to a window discriminator. Transistor-transistor logic (TTL) pulses representing single spikes were fed to a 1401plus [Cambridge Electronic Designs (CED)], and peristimulus time histograms (PSTHs) were constructed with Spike2 software (CED).
After neurons in the nBOR were isolated, the direction preference and the approximate locations of the receptive field boundaries were qualitatively determined by moving a large (90 × 90°) handheld stimulus in various areas of the visual field. Directional tuning and spatiotemporal tuning were determined quantitatively with sine wave gratings that were generated by a VSGThree graphics computer (Cambridge Research Designs, Cambridge, UK), and back-projected onto a tangent screen that was located 50 cm from the bird (90 × 75°). Direction tuning was tested using gratings of an effective SF and TF at 15 or 22.5° increments, whereas spatiotemporal tuning was tested using gratings of varying SF [0.03–2 cycles/deg (cpd)] and TF [0.03–16 cycles/s (Hz)] moving in the preferred and antipreferred directions. Each sweep consisted of 4 s of motion in one direction, a 3 s pause, 4 s of motion in the opposite direction, followed by a 3 s pause. Firing rates were averaged over 3–5 sweeps. Contour plots of the mean firing rate in the spatiotemporal domain were made using Sigma Plot.
In some cases, when tungsten microelectrodes were used, electrolytic lesions were placed at the recording site (30 μA for 8–10 s, electrode positive). At the end of each experiment, animals were given a lethal dose of sodium pentobarbital (100 mg/kg ip) and immediately perfused with saline followed by 4% paraformaldehyde. Brains were extracted, postfixed for 2–12 h (4% paraformaldehyde with 20% sucrose) and then left in 30% sucrose for ≥24 h. Frozen sections (45 μm thick in the coronal plane) through the nBOR were collected. Sections were mounted onto gelatin-coated slides and counterstained with neutral red. Light microscopy was used to localize electrode tracks and lesion sites.
Extensive quantitative data, including directional and spatiotemporal tuning to sine wave gratings of varying SF and TF, were obtained from 53 nBOR neurons in 26 animals. Most neurons, although broadly tuned, were excited in response to motion in a particular direction (preferred direction) and inhibited below the spontaneous rate (SR) in response to motion in the (approximately) opposite direction (antipreferred direction). Each neuron's direction preference was assigned by calculating the maximum of the best cosine fit to the tuning curve. As shown in Fig. 1, there was an obvious clustering into 4 groups: 5 (9%), 9 (17%), 15 (28%), and 24 (45%) neurons preferred forward (temporal to nasal), backward (nasal to temporal), downward, and upward motion, respectively. These data are in agreement with previous studies of the pigeon nBOR. Wylie and Frost (1990a) found that upward, downward, and backward cells are equally abundant, but forward cells were rare (see also Gioanni et al. 1984; Rosenberg and Ariel 1990). It has been noted that a small subpopulation of nBOR neurons have binocular receptive fields and respond best to particular patterns of optic flow resulting from either self-rotation or self-translation (Wylie and Frost 1990b, 1999; Wylie et al. 1998). No such neurons were recorded in the present study.
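The assignment of a preferred direction from the best cosine fit can be sketched as follows. This is a minimal illustration, not the authors' analysis code: it assumes the directions are evenly spaced over 360°, in which case the phase of the least-squares cosine fit equals the angle of the rate-weighted vector sum. The function name, the synthetic tuning curve, and the angle convention are hypothetical.

```python
import math

def preferred_direction_deg(directions_deg, rates):
    """Peak of the best-fit cosine to a direction-tuning curve.

    Assumes directions are evenly spaced over 360 deg, so the phase of the
    least-squares cosine fit is the angle of the rate-weighted vector sum.
    """
    c = sum(r * math.cos(math.radians(d)) for d, r in zip(directions_deg, rates))
    s = sum(r * math.sin(math.radians(d)) for d, r in zip(directions_deg, rates))
    return math.degrees(math.atan2(s, c)) % 360.0

# Hypothetical tuning curve sampled at 22.5 deg increments, peaking at 90 deg
# (which direction 90 deg denotes is an arbitrary convention here).
directions = [i * 22.5 for i in range(16)]
rates = [10.0 + 30.0 * math.cos(math.radians(d - 90.0)) for d in directions]
example_pd = preferred_direction_deg(directions, rates)
```

For unevenly sampled directions a general least-squares fit would be needed instead of the vector sum.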
Spatiotemporal properties of nBOR neurons
Figure 2 shows the responses of an nBOR neuron to gratings drifting in the preferred (up) and antipreferred (down) directions. PSTHs to 36 combinations of SF (abscissa) and TF (ordinate) are shown. Each PSTH is for a single sweep, where each sweep consisted of 4 s of motion in the preferred direction (upward motion, solid line), followed by a 3 s pause, followed by 4 s of motion in the antipreferred direction (downward motion, broken line). Note that this cell showed strong excitation to motion in the preferred direction and strong inhibition to motion in the antipreferred direction. The neuron responded to several of the gratings, but the degree of the excitation and inhibition was variable. Note that for 1 cpd/0.03 Hz the neuron showed excitation rather than inhibition to motion in the anti-preferred direction. The asterisk (*) and pound (#) symbols, respectively, indicate the peak excitatory (ER) and inhibitory responses (IR) in the spatiotemporal domain (0.25 cpd/0.125 Hz) based on the average firing rate over the 4 s epoch. This average encompassed the steady-state and transient responses during the epoch. An onset transient, variable in size, was present in response to motion in the preferred direction for most gratings. Onset transients to motion in the antipreferred direction were less common, as were offset transients to motion in both directions. In this report we do not further address these transients and other temporal factors (such as oscillations in the responses apparent in some PSTHs in Fig. 2). [Ibbotson et al. (1994), Price and Ibbotson (2002), and Wolf-Oberhollenzer and Kirschfeld (1994) have already provided extensive descriptions of temporal factors].
To graphically illustrate tuning in the spatiotemporal domain, contour plots were constructed for both the preferred and antipreferred directions (see Fig. 3). Because large-field motion in the preferred direction elicits excitation and motion in the antipreferred direction elicits inhibition, we refer to these as excitatory response plots (ER plots) and inhibitory response plots (IR plots), respectively. TF and SF were plotted on the ordinate and abscissa, respectively, and the firing rate (relative to the SR) was plotted on the z-axis. The diagonal lines overlaying the contour plots indicate particular velocities (TF/SF). In these plots, the black represents the SR, red represents excitation, and green represents inhibition. Progressively brighter and less saturated reds/greens represent greater magnitudes of excitation/inhibition, such that the peaks are shown as off-white. The neurons shown in Fig. 3, A and B clearly had two peaks in their ER plots. For the neuron in Fig. 3A there was a primary peak at 1 cpd/0.5 Hz (60 spikes/s) and a smaller secondary peak at 0.125 cpd/16 Hz (20 spikes/s). For the neuron in Fig. 3B there was a primary peak at 0.063 cpd/16 Hz (45 spikes/s) and a smaller secondary peak at 0.5 cpd/0.125 Hz (35 spikes/s). The neuron shown in Fig. 3C had a single peak in its ER (200 spikes/s above SR) to high SF gratings (0.5–1 cpd) drifting at mid-low TFs (0.5–2 Hz). Of the 53 ER plots, 25 showed a single peak (e.g., Fig. 3C) and 28 showed multiple peaks (e.g., Fig. 3, A and B). The IR plot in Fig. 3B showed a similar profile to the ER plot for that neuron, but this was not the case for the neuron shown in Fig. 3C. The neuron in Fig. 3C was maximally excited (200 spikes/s) by high SFs drifting at mid-TFs in the preferred direction, but maximally inhibited (–12 spikes/s) by mid-SFs (0.25 cpd) drifting at high TFs (16 Hz) in the antipreferred direction. For 16 neurons the ER and IR plots showed a similar tuning profile (as in Fig. 3B; see also Fig. 2). However, for 33 neurons, the tuning in the spatiotemporal domain was quite different for the ER and IR plots (as in Fig. 3C).
Quantitative analysis of the ER plots
Stimulus velocity (in degrees per second; °/s) is calculated as TF/SF. Thus from the contour plots it is straightforward to see whether a cell is tuned to TF or velocity. A contour plot showing perfect velocity tuning would have an elongated peak, such that the slope is equal to 1. TF-tuning is exemplified by contour plots that are symmetrical about a horizontal line through the peak, indicating a preference for the same TF over a range of SFs.
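A short sketch may make the geometry concrete: because velocity = TF/SF, any two gratings with the same velocity lie on a line of slope 1 when SF and TF are plotted on logarithmic axes. The function names are hypothetical, and the 4°/s fast/slow border is the one adopted later in the text.

```python
import math

FAST_SLOW_BORDER_DEG_PER_S = 4.0  # border between "fast" and "slow" used in the text

def velocity_deg_per_s(sf_cpd, tf_hz):
    """Grating velocity in deg/s from SF (cycles/deg) and TF (cycles/s)."""
    return tf_hz / sf_cpd

def zone(sf_cpd, tf_hz):
    """Label a spatiotemporal location as lying in the fast or slow zone."""
    v = velocity_deg_per_s(sf_cpd, tf_hz)
    return "fast" if v > FAST_SLOW_BORDER_DEG_PER_S else "slow"

def log_log_slope(sf_a, tf_a, sf_b, tf_b):
    """Slope of the line joining two gratings on log(SF) vs. log(TF) axes."""
    return (math.log(tf_b) - math.log(tf_a)) / (math.log(sf_b) - math.log(sf_a))

# Two gratings moving at the same velocity (2 deg/s) but differing in SF.
iso_velocity_slope = log_log_slope(0.25, 0.5, 1.0, 2.0)
```

A TF-tuned peak, in contrast, extends along a horizontal line (slope 0) on the same axes.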
From Fig. 3 it is clear that not all the neurons were tuned to TF. To quantify the orientation of the peaks in the ER plots, each peak was fit to a 2D Gaussian function, using a slightly modified version of the method of Perrone and Thiele (2001)
$$G(u, \omega) = \exp\left(-\frac{(u')^2}{\sigma_x^2}\right) \exp\left(-\frac{(\omega')^2}{\sigma_y^2}\right) + P$$
where $u' = (u - x)\cos\theta + (\omega - y)\sin\theta$ and $\omega' = -(u - x)\sin\theta + (\omega - y)\cos\theta$, u is ln (SF), ω is ln (TF), θ is the angle of the Gaussian, (x, y) is the location of the peak of the Gaussian, σx and σy are the spread of the Gaussian in the u′ and ω′ dimensions, respectively, and P is a constant. The values σx, σy, x, y, θ, and P were optimized to minimize the sum of the mean error between the real and G values using the solver function in Microsoft Excel.
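The fitted function can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' spreadsheet: it assumes the Gaussian is the product of two exponentials in rotated log-frequency coordinates (u′, ω′) plus the constant P, as implied by the parameter definitions, and it uses a sum of squared errors as the cost, which may differ from the error measure minimized in Excel. All function names and example parameter values are hypothetical.

```python
import math

def oriented_gaussian(sf_cpd, tf_hz, x, y, sigma_x, sigma_y, theta_deg, p):
    """2D Gaussian in (ln SF, ln TF) space, rotated by theta (assumed form)."""
    u, w = math.log(sf_cpd), math.log(tf_hz)
    th = math.radians(theta_deg)
    u_rot = (u - x) * math.cos(th) + (w - y) * math.sin(th)
    w_rot = -(u - x) * math.sin(th) + (w - y) * math.cos(th)
    return math.exp(-u_rot ** 2 / sigma_x ** 2) * math.exp(-w_rot ** 2 / sigma_y ** 2) + p

def sum_squared_error(params, data):
    """Cost for an optimizer; data is a list of (sf_cpd, tf_hz, response).

    Squared error is an assumption made for this sketch.
    """
    return sum((oriented_gaussian(sf, tf, *params) - r) ** 2 for sf, tf, r in data)

# Hypothetical peak at 1 cpd / 0.5 Hz, elongated along theta = 45 deg.
params = (math.log(1.0), math.log(0.5), 2.0, 0.5, 45.0, 0.0)
at_peak = oriented_gaussian(1.0, 0.5, *params)
along_ridge = oriented_gaussian(2.0, 1.0, *params)    # same velocity, higher SF
across_ridge = oriented_gaussian(2.0, 0.25, *params)  # different velocity
```

Constraining `theta_deg` to zero gives the nonoriented fit; leaving it free gives the oriented fit. With θ = 45° the function falls off slowly along an iso-velocity line and quickly across it.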
Following Perrone and Thiele (2001), each ER peak was fitted to two different types of Gaussian functions: nonoriented and oriented. In the nonoriented function θ was constrained to zero, whereas θ was free to take on any value in the oriented Gaussian function. The square of the Pearson product moment correlation coefficient (r²) was calculated for each Gaussian to measure the overall fit to the data. Averaged across the entire data set, which consisted of 52 fitted peaks, the r² values of the oriented and nonoriented fits were 0.84 ± 0.09 and 0.77 ± 0.11, respectively (mean ± SD). These were significantly different (single-sample Student t-test, P < 0.0001). (There were 13 neurons that were not fit with Gaussians either because the two peaks in the ER plot appeared inseparable, or there were more than two peaks in the contour plot.)
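The goodness-of-fit measure is the squared Pearson correlation between the measured responses and the values of the fitted Gaussian at the same SF/TF combinations. A minimal sketch follows; the function name and the toy numbers are hypothetical.

```python
import math

def pearson_r_squared(observed, fitted):
    """Square of the Pearson product-moment correlation coefficient."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_f = sum(fitted) / n
    cov = sum((o - mean_o) * (f - mean_f) for o, f in zip(observed, fitted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_f = sum((f - mean_f) ** 2 for f in fitted)
    return cov ** 2 / (var_o * var_f)

# Toy example: a fit that is a scaled copy of the data correlates perfectly.
perfect = pearson_r_squared([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
partial = pearson_r_squared([1.0, 2.0, 3.0], [1.0, 3.0, 2.0])
```

Because r² is insensitive to scaling and offset, it measures how well the shape of the fitted surface matches the data rather than the absolute error.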
In Fig. 3 oriented Gaussian fits to the ER plots of the 3 neurons are shown. For perfect velocity-tuning θ would equal 45° (i.e., a slope of 1), but for TF-tuning θ would equal 0° or 90°. For the neurons in Fig. 3, A and B, the peaks in the fast and slow regions were fit separately, and the gray borders indicate the range of SFs and TFs used for each fit. For the neuron in Fig. 3A, the θ values for the fast and slow peaks were 85 and 42°, respectively. For the neuron in Fig. 3B, the θ values for the fast and slow peaks were 87 and 37°, respectively. For the neuron in Fig. 3C, which had a single slow peak, θ = 57°.
Figure 4 shows the location [(x, y); circles] and orientation (θ; solid line) of each oriented Gaussian fit. For those ER plots with two peaks, the locations of the primary and secondary peaks were plotted as filled and empty dots, respectively. Following previous studies of the pretectum and AOS (Crowder and Wylie 2001; Ibbotson et al. 1994; Wylie and Crowder 2000), we use 4°/s as the border between “Fast” and “Slow” neurons, although the distinction in the data is not as apparent as in those previous studies. For fast cells the peak excitation occurred in response to low-mid SFs (0.03–0.13 cpd) and mid-high TFs (0.5–16 Hz). For slow cells the peak excitation occurred in response to mid-high SFs (0.3–2 cpd) and low-mid TFs (0.06–2 Hz). As shown in Table 1, which considers data from only the primary peaks, the average SF and TF of the fast ERs were 0.078 cpd and 2.84 Hz, respectively. The average SF and TF of the slow neurons were 0.53 cpd and 0.30 Hz, respectively. (All values were first transformed to the natural log, the average was calculated, and then the inverse transformation was performed.) As indicated by the orientation of lines in Fig. 4, for most peaks in the fast zone θ approximated 0 or 90° (suggesting TF-tuning), whereas θ approached 45° for most peaks in the slow zone (suggesting velocity-tuning).
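The averaging procedure described in parentheses (log transform, arithmetic mean, inverse transform) is a geometric mean. A minimal sketch, with a hypothetical function name and toy values rather than the recorded data:

```python
import math

def geometric_mean(values):
    """Average in natural-log space, then transform back."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

# Toy example: two hypothetical peak SFs two octaves apart (in cpd).
example_mean_sf = geometric_mean([0.25, 1.0])
```

Averaging in log space suits SF and TF because the stimuli were sampled in roughly octave steps, so an arithmetic mean would be dominated by the highest frequencies.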
Figure 5 shows the responses of two cells as a function of velocity (left column) and TF (right column). Responses to low SFs (0.03–0.125 cpd) and high SFs (0.25–1 cpd) are separated into top and bottom panels, respectively. The neuron in Fig. 5A showed velocity tuning to high SFs with a peak response at 1°/s (bottom left panel). At low SFs, this neuron was more closely tuned to TF (top right panel; peak at 0.125 Hz) than to velocity. Figure 5B also shows a neuron that was more closely tuned for velocity (peak at 0.1–1°/s) at high SFs, but TF-tuned at lower SFs, with a sharp peak at 16 Hz.
Direction tuning in fast and slow zones
Figure 6 shows three down cells (A, C, D) and one back cell (B) from which direction-tuning curves were collected using slow gratings (solid line, 0.5 cpd/0.5 Hz) and fast gratings (dashed line, 0.063 cpd/4 Hz). The firing rate relative to the SR (gray circle) is plotted as a function of the direction of motion in polar coordinates (i.e., the SR has been set to zero; outside the gray circle= excitation, inside = inhibition). The neurons in Fig. 6, A, C, and D preferred the slow gratings, showing a much greater depth of modulation. The neuron in Fig. 6B responded equally to slow and fast gratings. Solid and dashed arrows represent the neuron's preferred direction for slow and fast gratings, respectively, as calculated from the best-fit cosines to the tuning curves. The neurons in Fig. 6, A and C showed very little variation in preferred direction in response to slow and fast gratings. The neurons in Fig. 6, B and D had differences of about 20° in their preferred directions in response to slow and fast gratings, but these were the largest changes we observed. No cells showed large enough differences in direction preference to be classified as one direction type in response to slow gratings and another direction type in response to fast gratings.
In the present study we examined the responses of neurons in the pigeon AOS to large-field drifting sine wave gratings. nBOR neurons clustered into two groups based on the location of peak response in the spatiotemporal domain: fast cells that preferred low SFs and high TFs, and slow cells that preferred high SFs and low TFs, although many neurons showed peaks in both the fast and slow regions. Most of the fast peaks were tuned to a specific TF (see Figs. 3, A and B, 4, 5), whereas most of the slow peaks showed apparent velocity tuning, insofar as the 2D Gaussians fit to the slow peaks were oriented at about 45° (see Figs. 2, 3, 4, 5). Strictly speaking, the slow neurons cannot be called velocity-tuned because the response is SF dependent. For example, the ER plot shown in Fig. 3C shows a peak oriented at approximately 45°, suggestive of velocity-tuning. However, the response to 1 cpd/2 Hz (2°/s) was about 200 spikes/s, whereas the response to 0.25 cpd/0.5 Hz (2°/s) was 150 spikes/s. A velocity-tuned neuron would respond equally well to a preferred velocity, irrespective of the SF, and the peak in the ER plot would appear as an elongated ridge (Zanker et al. 1999). Thus we use the term “velocity-like” tuning, or “apparent velocity tuning.”
Comparison with previous studies of the pretectum of birds and mammals
Ibbotson et al. (1994) were the first to demonstrate that neurons in the pretectum (wallaby NOT) were tuned in the spatiotemporal domain to either low SF/high TFs (fast cells) or high SFs/low TFs (slow cells). Subsequently, Wylie and Crowder (2000) showed that neurons in the pretectum (nucleus LM) of pigeons contained such fast and slow neurons. Following Ibbotson and Price (2001), a direct comparison of the pigeon and wallaby pretectal data is offered in Table 1, along with data from the pigeon nBOR from the present study. The mean velocity of the slow and fast NOT neurons was 0.8 and 50°/s, respectively, remarkably similar to what we found for the pigeon LM (1.08 and 52°/s, respectively). Such similarities may arise from convergent evolution in response to similar visual environments, or point toward a highly conserved visual system of ancient origin (Ibbotson and Price 2001). Table 1 shows that the TF, SF, and velocity preferences of the fast and slow nBOR neurons are similar to their counterparts in the LM. Note that the percentage of fast cells in the nBOR is much less than that in the LM (also see Crowder and Wylie 2001).
In our previous study of the spatiotemporal tuning in the pigeon LM (Wylie and Crowder 2000), we reported that velocity-tuning was rare. In fact, of 35 ER plots only 1 appeared as velocity-tuned, whereas 14 were TF-tuned. The results of the present study prompted us to reexamine the LM data, with emphasis on the slow cells. The data set from Wylie and Crowder (2000) consisted of 12 slow cells, but we have subsequently recorded from an additional 8 slow LM neurons (e.g., from Crowder et al. 2003). 2D Gaussian functions were fit to the peaks in the LM ER plots, and the locations (x, y) and orientations (θ) of the oriented Gaussian fits are shown in Fig. 4, alongside the same data from the nBOR cells. Of the 20 slow peaks, 12 had slopes that approached 45° (i.e., within 20°). Thus it appears that slow neurons in nBOR and LM show apparent velocity-tuning.
Implications for models of motion detection
Initially proposed by Reichardt (1961), the correlation model of motion detection has been very successful in describing motion processing in animal vision (for reviews see Borst and Egelhaaf 1989; Buchner 1984; Clifford and Ibbotson 2003; Srinivasan et al. 1999). The classic correlation detector consists of two subunits, or “half-detectors,” each selective for motion in opposite directions. When the outputs of these two half-detectors are subtracted from each other, a highly directional motion detector is created (see also the appendix, Fig. A1). Recent elaborations of the basic correlation-type detector have involved the addition of spatial and temporal prefilters (e.g., Dawson and DiLollo 1990; Ibbotson and Clifford 2001; Price and Ibbotson 2002). The energy model is a variant of this basic scheme and generates similar response properties to elaborated correlation-type detectors (Adelson and Bergen 1985; Zanker et al. 1999).
One of the most prominent features of the correlation model of motion detection is its dependency on the spatial structure and contrast of the visual stimulus (Buchner 1984; Reichardt 1961). Furthermore, correlation motion detectors are tuned to a particular TF rather than to a particular velocity (for reviews see Buchner 1984; Egelhaaf et al. 1989; Ibbotson et al. 1994; Srinivasan et al. 1999; Wolf-Oberhollenzer and Kirschfeld 1994). This TF-tuning has been used as an identifying characteristic of the correlation scheme for many years (e.g., Wolf-Oberhollenzer and Kirschfeld 1994). Behavioral and physiological studies of insects over the last 40 years have emphasized that the motion detectors underlying the optokinetic “turning response” are of the correlation type (Srinivasan et al. 1999). The amplitude of the turning response is dependent on TF rather than on the velocity of the stimulus, and the responses of the optic flow sensitive neurons in the visual neuropile exhibit properties consistent with the correlation model, including tuning for TF rather than for velocity (e.g., Borst and Egelhaaf 1989; Buchner 1984; Eckert 1980; Egelhaaf et al. 1989, 1990; Hausen 1984; O'Carroll et al. 1996; Reichardt 1969). Moreover, there is behavioral and physiological evidence from cats, monkeys, and humans indicating that detectors of the correlation type are involved in motion analysis in mammals (Miles and Kawano 1987; Tolhurst and Movshon 1975; see also Borst and Egelhaaf 1989; Nakayama 1985).
Evidence in favor of the correlation scheme has also been reported for the optokinetic system. Neurons in the wallaby NOT were sensitive to contrast and most were tuned to TF (Ibbotson and Price 2001; Ibbotson et al. 1994). Turke et al. (1996) recorded optokinetic head movements in unrestrained pigeons in response to horizontally drifting gratings of varying SF, contrast, and stimulus velocity. They noted a strong dependency on contrast and TF rather than on velocity. In a study of responses of neurons in the pigeon nBOR, Wolf-Oberhollenzer and Kirschfeld (1994) reported that most neurons were TF-tuned, but only one neuron tested showed velocity-tuning. This is in stark contrast to our findings. However, neither Wolf-Oberhollenzer and Kirschfeld (1994) nor Turke et al. (1996) used the higher SFs that would maximally excite the slow nBOR cells discussed in the present study. Indeed, the classic correlation motion detection model cannot account for the velocity-like tuning of slow nBOR neurons.
Recently Zanker et al. (1999) explicitly showed that altering the subtraction step, or “balance,” of the two half-detectors critically affects the tuning of the detector. The classic correlation scheme described above is a fully balanced detector, where the inputs from the two antisymmetric half-detectors are equally weighted. Recall that this fully balanced detector is TF-tuned. Conversely, Zanker et al. (1999) showed that a fully unbalanced detector, which is essentially a lone half-detector, is velocity-tuned. Finally, a partially balanced detector had responses between these two extremes, with velocity-tuning that was weakly dependent on SF (Zanker et al. 1999). It is possible that the velocity-like tuning in the slow zone of nBOR neurons represents the output of a partially balanced correlation-type motion detector. We illustrate this in Fig. 7, which shows the ER plots of simulations generated by a model of an elaborated Reichardt detector. The details of the model can be found in the appendix. We used a model from Dawson and DiLollo (1990), but with delay filters given by Clifford et al. (1998) and temporal prefilters described by Ibbotson and Clifford (2001) and Price and Ibbotson (2002). Moreover, following Zanker et al. (1999) we manipulated the balance by varying the gain (α) of the subtraction step where the response = (S1) – (αS2). When α = 1 the detector is fully balanced, and when α = 0 the detector is fully unbalanced (i.e., a half-detector) (Zanker et al. 1999). In Fig. 7A we modeled the ER plot of a slow cell with a peak response to 1 cpd/0.5 Hz. When α = 1 (left) the ER plot shows TF-tuning, but when α = 0.5 (right) velocity-like tuning is evident. When a 2D Gaussian is fit to this peak θ = 56°, but clearly the response is dependent on SF. Thus we suggest that the slow nBOR neurons might represent the output of partially balanced correlation detectors, perhaps approaching half-detectors. Other electrophysiological evidence from the fly's visual system (Egelhaaf et al. 1989) and the wallaby pretectum (Ibbotson and Clifford 2001; Ibbotson et al. 1994; Price and Ibbotson 2002) also suggests that the underlying motion detectors are not perfectly balanced.
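The effect of the balance parameter can be illustrated with a much simpler detector than the elaborated model in the appendix. The sketch below is not that model: it assumes a bare Reichardt detector with two inputs separated by `dx_deg`, a first-order low-pass filter (time constant `tau_s`) as the delay, no spatial or temporal prefilters, and a drifting sine grating, for which the time-averaged half-detector outputs S1 and S2 have closed forms. All names and parameter values are hypothetical.

```python
import math

def reichardt_response(sf_cpd, tf_hz, alpha, dx_deg=1.0, tau_s=0.05):
    """Time-averaged response = S1 - alpha * S2 to a drifting sine grating.

    Simplifying assumptions: first-order low-pass delay filter, no prefilters.
    """
    omega = 2.0 * math.pi * tf_hz
    phi = 2.0 * math.pi * sf_cpd * dx_deg          # spatial phase between inputs
    gain = 1.0 / math.sqrt(1.0 + (omega * tau_s) ** 2)  # low-pass amplitude
    lag = math.atan(omega * tau_s)                      # low-pass phase lag
    s1 = 0.5 * gain * math.cos(phi - lag)  # delayed input 1 times input 2
    s2 = 0.5 * gain * math.cos(phi + lag)  # input 1 times delayed input 2
    return s1 - alpha * s2

def peak_tf(sf_cpd, alpha, tfs):
    """TF (from a grid) giving the largest response at one SF."""
    return max(tfs, key=lambda tf: reichardt_response(sf_cpd, tf, alpha))

TFS = [0.1 * 200.0 ** (i / 59) for i in range(60)]  # 0.1 to 20 Hz, log spaced
SFS = [0.05, 0.1, 0.2]                              # cpd
balanced_peaks = [peak_tf(sf, 1.0, TFS) for sf in SFS]  # alpha = 1
half_peaks = [peak_tf(sf, 0.0, TFS) for sf in SFS]      # alpha = 0
```

In this reduced detector the fully balanced response is separable, sin(φ)·ωτ/(1 + ω²τ²), so the preferred TF is the same at every SF (TF-tuning). With α = 0 the preferred TF rises with SF, giving an oriented ridge in the spatiotemporal domain, which is consistent with the qualitative result of Zanker et al. (1999) described above.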
In the present study we found that the fast nBOR neurons exhibit TF-tuning. Although one could conclude that this implies that the underlying motion detectors are fully balanced, with Fig. 7B we show that this is not necessarily the case. On the left, we modeled a fully balanced detector tuned to 1 cpd/8 Hz, with rather restrictive temporal prefilters. Note the TF-tuning. On the right we show the response of the model with the same parameters except α = 0. The peak, which has been pushed to the lower range of SFs, appears TF-tuned. Clearly the shape of the peak in the spatiotemporal domain is dependent on both the prefilter settings and the balance of the detector. Dror et al. (2001) demonstrated that other processes such as response compression and adaptation are also critical when considering velocity estimations by Reichardt detectors.
Another model of motion detection that may be applied to the current results is the weighted intersection mechanism (WIM) model developed by Perrone and Thiele (2002). The WIM model of velocity sensitivity was developed to show how MT neurons in the primate extrastriate cortex could build velocity-tuned spatiotemporal peaks by summing the spatiotemporal inputs from a sustained V1 neuron and a transient V1 neuron. In this model, the spatiotemporal tuning of the sustained V1 neuron must differ slightly from the tuning of the transient V1 neuron; this difference produces a diagonal peak in the spatiotemporal domain enabling narrow velocity tuning (Perrone and Thiele 2002). Although this model appears to be tailor-made for the geniculostriate pathway, it demonstrates that the spatiotemporal tuning from multiple inputs can be combined to shape the spatiotemporal tuning of an afferent neuron. This shaping has already been shown experimentally in the AOS and pretectum. The spatiotemporal tuning of LM neurons is drastically altered when input from the nBOR is inactivated by tetrodotoxin (Crowder et al. 2003). Similar results are expected for nBOR neurons if the LM were inactivated. Antidromic stimulation studies in the turtle AOS indicate that the receptive fields of AOS neurons result from the pooling of multiple directionally selective retinal inputs (Kogo et al. 1998). The spatiotemporal tuning of these retinal inputs could be combined to form velocity-like tuning.
Function of fast and slow neurons
Ibbotson et al. (1994) provide an extensive discussion of the potential role of the slow and fast NOT neurons in the generation and maintenance of optokinetic nystagmus (OKN). Immediately after the onset of an optokinetic stimulus, there is a 50- to 100 ms latent period before ocular following begins (e.g., Collewijn 1972). During this period, the retinal slip velocity (RSV) is high, and Ibbotson et al. (1994) suggest that the fast NOT neurons are responsible for initiating ocular following (the “direct” phase of OKN; Cohen et al. 1977; Gellman et al. 1990; Miles et al. 1986). Moreover, they suggest that the fast neurons are involved in the charging of the velocity storage mechanism (“indirect” phase of OKN) when stimulus speeds are high. Ibbotson et al. (1994) noted that rapidly moving visual images become blurred, which is consistent with the fact that the fast NOT neurons respond best to low SFs. The slow NOT neurons would become active when the RSV is low, and they would continue to charge the velocity storage mechanism at these slow velocities. Pigeons lack the direct phase of OKN, but they do possess a velocity storage mechanism (Gioanni 1988; Nalbach 1992). This precludes the fast LM and nBOR neurons in pigeons from a role in the direct component of OKN, as proposed for the fast NOT neurons. However, it is reasonable to imagine that the fast and slow nBOR and LM neurons are involved in charging the velocity storage mechanism as proposed for the fast and slow NOT neurons. Those neurons with peaks in both the fast and slow regions would be active when RSV is high or low.
Srinivasan et al. (1999) offer another function for the fast and slow cells (see also Heeger 1987; Simoncelli and Heeger 1998). They refer to the single motion detector with a peak in the spatiotemporal domain as a “correlator.” The spatiotemporal tuning of a single correlator would look similar to a contour plot of a pretectal or AOS neuron with a sharp peak in the spatiotemporal domain. The response of a single correlator is ambiguous because all points that lie on a given response contour in the spatiotemporal domain represent combinations of SFs and TFs that elicit the same response. If contrast is allowed to vary, another degree of uncertainty is added. Because the response of the motion detector will increase with contrast (until saturation is reached), all points on a given response contour will be confounded with points on a weaker response contour if the contrast representing the weaker contour is appropriately increased. The above ambiguity can be removed if more than one correlator is incorporated into the movement detecting process, with each correlator having a different spatiotemporal frequency optimum. The velocity of a stimulus would be coded by the relative activity of the correlators. Manipulating the contrast of the stimulus would affect all correlators equally, but the stimulus velocity would still determine the ratio of the activity between the correlators. In this scheme, the velocity of a stimulus can be estimated unambiguously and independently of spatial structure or contrast based on the population response (Srinivasan et al. 1999). Srinivasan et al. (1999) noted that visual systems of insects have two classes of direction-selective neurons differing with respect to preferred TF (Horridge and Marcelja 1992) and the optokinetic system in crabs has three such classes of neurons (Nalbach 1990).
When this model is applied to the AOS and pretectum, the fast and slow cells take on the roles of two classes of correlators. Theoretically, the RSV could be reliably encoded by the pattern of activity in nBOR neurons, and the velocity storage mechanism would be provided with a velocity signal that is unambiguous and independent of spatial structure or contrast of the visual stimulus. Furthermore, this velocity information could be used for other behaviors such as flight speed and “odometry,” which require an unambiguous velocity signal (Srinivasan et al. 1999).
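The contrast invariance of a ratio code of this kind can be illustrated with a minimal Python sketch. It assumes the standard closed-form mean response of a balanced correlator with a first-order low-pass delay, which scales with the square of contrast below saturation. The function names, the two time constants (0.01 and 0.5 s), and the shared 2° sampling base are hypothetical choices for illustration only; they are not values from Srinivasan et al. (1999) or from the recordings.

```python
import math

def correlator_mean_response(contrast, sf_cpd, tf_hz, tau_s, spacing_deg):
    """Time-averaged output of a balanced correlator (assumed closed form).

    contrast    : stimulus contrast (arbitrary units, below saturation)
    sf_cpd      : spatial frequency (cycles/deg)
    tf_hz       : temporal frequency (Hz)
    tau_s       : time constant of the first-order low-pass delay (s)
    spacing_deg : angular separation of the two subunits (deg)
    """
    omega = 2.0 * math.pi * tf_hz
    temporal = (omega * tau_s) / (1.0 + (omega * tau_s) ** 2)
    spatial = math.sin(2.0 * math.pi * sf_cpd * spacing_deg)
    return contrast ** 2 * temporal * spatial

def response_ratio(contrast, sf_cpd, tf_hz):
    """Ratio of a hypothetical 'fast' to a hypothetical 'slow' correlator."""
    fast = correlator_mean_response(contrast, sf_cpd, tf_hz, tau_s=0.01, spacing_deg=2.0)
    slow = correlator_mean_response(contrast, sf_cpd, tf_hz, tau_s=0.5, spacing_deg=2.0)
    return fast / slow

# Each correlator alone changes with contrast, but the ratio does not.
for c in (0.2, 0.8):
    print(c, correlator_mean_response(c, 0.1, 2.0, 0.01, 2.0), response_ratio(c, 0.1, 2.0))
```

In this simplified case both correlators share the same sampling base, so the ratio depends on TF alone; the sketch is meant only to show that a common contrast factor cancels in the ratio.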
For the simulations shown in Fig. 7 we examined the responses of an elaborated version of the Reichardt detector, depicted in Fig. A1, to 36 SF/TF combinations. It was created by modifying a model originally proposed by Dawson and Di Lollo (1990) by incorporating alternative temporal filters proposed by Ibbotson and Clifford (2001) and Price and Ibbotson (2002). The stimuli for the model, designed to closely resemble the stimuli used during the neural recordings, consisted of a blank gray screen (at mean luminance) for 2 s, followed by a drifting sine wave grating for 2 s. The elementary motion detector (EMD) consists of two subunits (A, B) that we assume to be separated by 2°. The model consists of 5 stages: I, prefiltering; II, delay filtering; III, multiplication; IV, subtraction; and V, phase averaging.
Stage I: prefiltering
In the original Dawson and Di Lollo (1990) model, spatial and temporal band-pass prefiltering was performed to represent photoreceptor responses. In that model, impulses were used as stimuli, a difference of Gaussians (DOG) was used to perform spatial filtering, and temporal filtering was accomplished by an impulse response function defined by Adelson and Bergen (1985). In the current study, we were interested in studying the model's responses to drifting sine gratings. For such stimuli, spatial DOG filtering produces a sinusoid of the same frequency and phase. Because of this, we did not employ spatial prefilters, although we assume that such prefiltering is carried out by the visual system. Ibbotson and Clifford (2001) also adopted this approach in their simulations.
In the current model, the raw signal s(t) that was presented to a subunit of the EMD at time t was defined as a drifting sinusoid, where L is the mean luminance in arbitrary units, C is the contrast in arbitrary units, fs is the spatial frequency in cycles/radian, ft is the temporal frequency in 2π × cycles/s, x is the spatial location of the subunit's detector, P is a phase shift of the signal (radians), and t is time (s). In the simulation, the value of x for the left detector was 0, and the value of x for the right detector was π/90 radians (2°).
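One form of drifting sinusoid that is consistent with the symbols and units listed above can be sketched in Python as follows. The exact expression, including the sign of the temporal term and the placement of the factor 2π, is an assumption; the function names and default values are hypothetical.

```python
import math

def cpd_to_cycles_per_radian(sf_cpd):
    """Convert a spatial frequency from cycles/deg to cycles/radian."""
    return sf_cpd * 180.0 / math.pi

def grating_signal(t, x, fs, ft, mean_lum=1.0, contrast=0.5, phase=0.0):
    """Luminance seen by one subunit (assumed form of the raw signal).

    t        : time (s)
    x        : location of the subunit's detector (radians)
    fs       : spatial frequency (cycles/radian)
    ft       : temporal frequency (2*pi x cycles/s, i.e., rad/s)
    mean_lum : mean luminance L (arbitrary units)
    contrast : contrast C (arbitrary units)
    phase    : phase shift P (radians)
    """
    return mean_lum + contrast * math.sin(2.0 * math.pi * fs * x - ft * t + phase)

# Left detector at x = 0, right detector at x = pi/90 radians (2 deg).
fs = cpd_to_cycles_per_radian(0.25)
left = grating_signal(0.0, 0.0, fs, ft=2.0 * math.pi)
right = grating_signal(0.0, math.pi / 90.0, fs, ft=2.0 * math.pi)
print(left, right)
```

With this convention, a grating of 0.25 cycles/deg places the two detectors half a cycle apart, because the 2° separation spans 0.5 cycle.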
Temporal filtering was then performed by convolving the signal for each detector with a band-pass filter of the type used by Price and Ibbotson (2002). This filter is a difference between exponential functions, where τ1 and τ2 are the time constants of these respective functions (in s), and β is the gain of the temporal filter, which has a value between 0 and 1.
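The convolution with a difference of exponentials can be sketched as below. The kernel is one common way to write such a filter, with each exponential normalized to unit area and β weighting the second one; this parameterization is an assumption and is not necessarily the exact expression of Price and Ibbotson (2002). The function names, time constants, and time step are hypothetical.

```python
import math

def bandpass_kernel(tau1, tau2, beta, dt, duration):
    """Sampled difference-of-exponentials impulse response (assumed form).

    h(t) = exp(-t/tau1)/tau1 - beta * exp(-t/tau2)/tau2, for t >= 0.
    Each sample is multiplied by dt so that a discrete sum approximates
    the continuous convolution integral.
    """
    n = int(round(duration / dt))
    return [
        (math.exp(-i * dt / tau1) / tau1 - beta * math.exp(-i * dt / tau2) / tau2) * dt
        for i in range(n)
    ]

def causal_convolve(signal, kernel):
    """Causal discrete convolution; output has the same length as signal."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k in range(min(n + 1, len(kernel))):
            acc += kernel[k] * signal[n - k]
        out.append(acc)
    return out

# Hypothetical parameters: tau1 = 20 ms, tau2 = 200 ms, 1-ms time step.
kernel = bandpass_kernel(tau1=0.02, tau2=0.2, beta=1.0, dt=0.001, duration=2.0)
filtered = causal_convolve([1.0] * 50, kernel)
```

Under this parameterization, β = 1 gives a kernel whose area is close to zero, so constant luminance is suppressed, whereas β = 0 reduces the filter to a single low-pass exponential.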
Stage II: delay filtering
To detect motion, delayed versions of the signals being detected by both receptors in the EMD were computed. In the original Dawson and Di Lollo (1990) model, this was accomplished by a pure phase shift. In the current model, this was instead achieved by convolving the prefiltered signals from stage I with a first-order low-pass filter that was used by Clifford et al. (1998), where t is time (s) and τ is the time constant of the filter (s).
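A first-order low-pass filter of this kind can be sketched with a simple recursion. The sketch assumes the usual unit-gain impulse response h(t) = (1/τ)exp(−t/τ) for t ≥ 0; the normalization used by Clifford et al. (1998) may differ, and the function name, time step, and time constant below are hypothetical.

```python
import math

def lowpass_delay(signal, tau, dt):
    """Apply a unit-gain first-order low-pass filter to a sampled signal.

    The recursion y[n] = a*y[n-1] + (1 - a)*x[n], with a = exp(-dt/tau),
    is the exact discretization for an input held constant over each step.
    The filter state starts at zero.
    """
    a = math.exp(-dt / tau)
    out = []
    y = 0.0
    for s in signal:
        y = a * y + (1.0 - a) * s
        out.append(y)
    return out

# Step response with a hypothetical 50-ms time constant and 1-ms time step.
step = [1.0] * 200
delayed = lowpass_delay(step, tau=0.05, dt=0.001)
```

Unlike a pure phase shift, this filter both delays and attenuates the signal, and both effects grow with temporal frequency.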
Stage III: multiplication
In the multiplication step, the left-delayed signal was multiplied by the right-undelayed signal to produce the signal from the left half of the detector (S1). The signal from the right half of the detector (S2) was calculated in a similar fashion, by multiplying the right-delayed signal by the left-undelayed signal.
Stage IV: subtraction
In the subtraction step, the signal from the right half of the EMD (S2) was scaled by α, which controls the “balance” of the detector (Zanker et al. 1999), and then subtracted from the signal from the left half of the detector (S1), such that Output = (S1) – (αS2), where 0 ≤ α ≤ 1. When α = 1, the detector is fully balanced; when α < 1, the detector is said to be partially balanced. With α = 0, the fully unbalanced EMD is referred to as a “half-detector” (Zanker et al. 1999).
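Stages III and IV amount to two sample-by-sample products followed by a weighted difference. A minimal sketch, with a hypothetical function name and toy input values, is:

```python
def emd_output(left, right, left_delayed, right_delayed, alpha=1.0):
    """Multiply and subtract the signals of the two half-detectors.

    S1 = left_delayed * right   (left half of the detector)
    S2 = right_delayed * left   (right half of the detector)
    Output = S1 - alpha * S2, with 0 <= alpha <= 1.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie between 0 and 1")
    s1 = [ld * r for ld, r in zip(left_delayed, right)]
    s2 = [rd * l for rd, l in zip(right_delayed, left)]
    return [a - alpha * b for a, b in zip(s1, s2)]

# Toy values only; these are not simulation results.
balanced = emd_output([1.0, 2.0], [3.0, 4.0], [0.5, 1.0], [2.0, 3.0], alpha=1.0)
half_detector = emd_output([1.0, 2.0], [3.0, 4.0], [0.5, 1.0], [2.0, 3.0], alpha=0.0)
```

With α = 0 the output reduces to S1 alone, which is the half-detector case described above.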
Stage V: phase averaging
The final step, phase averaging, serves the purpose of spatial integration that is found in models that employ an array of EMDs (e.g., Ibbotson and Clifford 2001; Price and Ibbotson 2002; Zanker et al. 1999). Because the response of an EMD is sensitive to the phase of the grating (e.g., Buchner 1984) and because we were using a rather short duration of motion for the slower TFs, we averaged the response of the detector to 4 different stimuli. These stimuli differed only in phase, which was manipulated by varying the value of P in the equation for the drifting sinusoid that was provided earlier. The values of P for the 4 different stimuli were 0, π/2, π, and 3π/2 radians (i.e., 0, 90, 180, and 270°).
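The five stages can be chained into a compact end-to-end sketch. It is not the code used for the simulations: the form of the stimulus, the parameterization of the band-pass prefilter (a difference of two unit-gain low-pass filters), all time constants, the contrast, and the time step are assumptions, and the function names are hypothetical. The 2-s blank period, 2-s motion period, 2° subunit separation, and 4 phases follow the text.

```python
import math

DT = 0.001                  # time step (s); hypothetical
SPACING = math.pi / 90.0    # subunit separation: 2 deg in radians

def lowpass(signal, tau):
    """Unit-gain first-order low-pass filter (recursive form, zero initial state)."""
    a = math.exp(-DT / tau)
    out, y = [], 0.0
    for s in signal:
        y = a * y + (1.0 - a) * s
        out.append(y)
    return out

def stimulus(x, sf_cpd, tf_hz, phase, lum=1.0, contrast=0.5, blank=2.0, motion=2.0):
    """Gray screen for `blank` s, then a drifting grating for `motion` s."""
    fs = sf_cpd * 180.0 / math.pi      # cycles/deg -> cycles/radian
    ft = 2.0 * math.pi * tf_hz         # Hz -> rad/s
    n_blank, n_motion = round(blank / DT), round(motion / DT)
    grating = [lum + contrast * math.sin(2.0 * math.pi * fs * x - ft * i * DT + phase)
               for i in range(n_motion)]
    return [lum] * n_blank + grating, n_blank

def emd_response(sf_cpd, tf_hz, phase, tau1=0.005, tau2=0.05, beta=1.0,
                 tau_delay=0.05, alpha=1.0):
    """Mean EMD output during the motion period for one grating phase."""
    prefiltered = []
    for x in (0.0, SPACING):
        raw, n_blank = stimulus(x, sf_cpd, tf_hz, phase)
        fast, slow = lowpass(raw, tau1), lowpass(raw, tau2)
        prefiltered.append([f - beta * s for f, s in zip(fast, slow)])  # stage I
    left, right = prefiltered
    left_d, right_d = lowpass(left, tau_delay), lowpass(right, tau_delay)  # stage II
    out = [ld * r - alpha * rd * l                     # stages III and IV
           for l, r, ld, rd in zip(left, right, left_d, right_d)]
    moving = out[n_blank:]
    return sum(moving) / len(moving)

def phase_averaged_response(sf_cpd, tf_hz, **params):
    """Stage V: average the response over 4 grating phases."""
    phases = (0.0, math.pi / 2.0, math.pi, 3.0 * math.pi / 2.0)
    return sum(emd_response(sf_cpd, tf_hz, p, **params) for p in phases) / len(phases)

# Opposite drift directions (sign of TF) should give responses of opposite sign
# for a fully balanced detector.
rightward = phase_averaged_response(0.125, 2.0)
leftward = phase_averaged_response(0.125, -2.0)
print(rightward, leftward)
```

Evaluating `phase_averaged_response` over a grid of SF/TF combinations would give a contour plot in the spatiotemporal domain; varying `alpha` and the time constants is one way to explore how the location and orientation of the peak change.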
This research was supported by funding from the Natural Sciences and Engineering Research Council (NSERC) of Canada to D.R.W. Wylie. N.A. Crowder was supported by an NSERC postgraduate scholarship. M.R.W. Dawson was supported by an NSERC Discovery Grant.
We thank P. McGivern for help with the statistical analysis and the two anonymous reviewers for their insightful comments.
The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
- Copyright © 2003 by the American Physiological Society