The visual system uses binocular disparity to discriminate the relative depth of objects in space. Because the striate cortex is the first site along the central visual pathways at which signals from the left and right eyes converge onto a single neuron, encoding of binocular disparity is thought to begin in this region. There are two possible mechanisms for encoding binocular disparity through simple cells in the striate cortex: a difference in receptive field (RF) position between the two eyes (RF position disparity) and a difference in RF profiles between the two eyes (RF phase disparity). Although there is evidence that supports each of these schemes, both mechanisms have not been examined in a single study to determine their relative roles. In this study, we have measured RF position and phase disparities of individual simple cells in the cat’s striate cortex to address this issue. Using a sophisticated RF mapping technique that employs binary m-sequences, we have obtained left and right eye RF profiles of two or more cells recorded simultaneously. A version of the reference-cell method was used to estimate RF position disparity. We find that RF position disparities generally are limited to values that are not sufficient to encode large binocular disparities. In contrast, RF phase disparities cover a wide range of binocular disparities and exhibit dependencies on RF orientation and spatial frequency in a manner expected for a mechanism that encodes binocular disparity. These results suggest that binocular disparity is encoded mainly through RF phase disparity. However, RF position disparity may play a significant role for cells with high spatial frequency selectivity that are constrained to have only small RF phase disparities.
An image of an object either in front of or behind the point of visual fixation projects onto slightly different locations on the retinae in the two eyes. This difference, binocular disparity, is by itself a sufficient cue for our perception of depth (Julesz 1960; Wheatstone 1838). Since the discovery that most neurons in the striate cortex of cats and monkeys are selective to binocular disparity (e.g., Barlow et al. 1967; Pettigrew 1965; Pettigrew et al. 1968; Poggio and Fischer 1977), describing how these neurons encode binocular disparity has become an important issue for understanding neural mechanisms of binocular fusion and stereopsis (DeAngelis et al. 1991, 1995; Fleet et al. 1996; Freeman and Ohzawa 1990; Joshua and Bishop 1970; Maske et al. 1984; Nikara et al. 1968; Nomura et al. 1990; Ohzawa et al. 1996; Qian 1994; Qian and Zhu 1997; Wagner and Frost 1993; Zhu and Qian 1996).
There are two plausible hypotheses for how cortical neurons encode binocular disparity. The traditional view, illustrated in Fig. 1 A, is that left and right eye RFs of a neuron have the same spatial profile, but their positions are not necessarily at retinal correspondence, creating RF position disparity through which binocular disparity can be encoded (Maske et al. 1984; Nikara et al. 1968; Wagner and Frost 1993). In this scheme, the range of binocular disparity that can be encoded is limited by the range of RF position disparity.
Alternatively, binocular disparity can be encoded through a difference in RF profiles or phases between the two eyes, without RF position disparity (DeAngelis et al. 1991, 1995; DeValois and DeValois 1988; Fleet et al. 1996; Freeman and Ohzawa 1990; Nomura et al. 1990; Ohzawa et al. 1996; Qian 1994; Qian and Zhu 1997; Zhu and Qian 1996). This is illustrated in Fig. 1 B. Because, by definition, RF phase disparity is limited to a range between ±180° phase angle (deg PA), the range of binocular disparity that can be encoded with this scheme is proportional to the period of the RF or inversely proportional to the spatial frequency of the RF.
RF position disparities of neurons in the cat’s striate cortex were demonstrated first by Nikara et al. (1968). They measured the positions of left and right eye RFs using moving bars and edges and found that the distribution of RF position disparities ranges between ±1.2 deg visual angle (deg VA) with a standard deviation of ∼0.6 deg VA. Similar results also were obtained by Joshua and Bishop (1970) and by von der Heydt et al. (1978).
Support for the position encoding mechanism also was provided by Maske et al. (1984), who measured spatial profiles for left and right eye RFs of binocular simple cells in cats using moving bright and dark bars. They argued that if a binocular neuron is to serve as a depth detector, then the left and right eye RFs should have an almost identical organization. This would ensure that the cell would respond to the same object features in the two eyes. They report that the number and spatial sequence of on and off subregions for the left and right eye RFs are always precisely the same. This suggests that there is very little RF phase disparity between the two eyes and hence favors the position encoding mechanism. However, the question of whether left and right eye RFs have the same profile or not cannot be answered solely on the basis of the number and spatial sequence of on and off subregions. The relative strength of subregions must be examined as well. In fact, some of the data in Maske et al. (e.g., RF profiles shown in Fig. 2, A and B, of their paper) actually indicate that the relative strengths of subregions can be different in the two eyes; this suggests some degree of RF phase disparity.
Wagner and Frost (1993) also proposed that binocular disparity is encoded through RF position disparity. They applied a technique used in auditory research to study disparity tuning of neurons in the visual Wulst of barn owls. They found that the peaks of disparity tuning functions measured with sinusoidal gratings of various spatial frequencies are all at about the same disparity, which they call the characteristic disparity. Because there is no well-defined characteristic disparity for RFs that have different profiles in the two eyes, they concluded that the left and right eye RFs must be identical in shape and, therefore, that binocular disparity must be encoded through RF position disparity.
However, the examination of characteristic disparity to distinguish position from phase encoding mechanisms has been questioned by Zhu and Qian (1996). They showed that model neurons with an RF phase disparity exhibit an approximate characteristic disparity. Because the data shown in the papers of Wagner and Frost (1993, 1994) are virtually indistinguishable from those predicted by the model neurons, Zhu and Qian contend that the mere existence of an approximate characteristic disparity should not be taken as evidence against the phase encoding mechanism.
The idea of using spatial phase as a primitive for encoding binocular disparity came from computational studies in vision (Jenkin and Jepson 1988; Sanger 1988; see also DeValois and DeValois 1988 for the same idea). Jenkin and Jepson (1988) proposed a method for measuring binocular disparity as a local phase difference between the left and right eye images. Sanger (1988) also used local phase disparities between stereo half-images to solve the correspondence problem. Later, RF phase disparity was implemented into models of binocular neurons in the striate cortex as a mechanism for encoding binocular disparity (Fleet et al. 1996; Nomura et al. 1990; Ohzawa et al. 1990; Qian 1994; Qian and Zhu 1997).
Physiological evidence for RF phase disparity first was reported by Freeman and Ohzawa (1990). They measured spatiotemporal RF profiles for binocular simple cells in cats using the sophisticated RF mapping technique of Jones and Palmer (1987) and showed that left and right eye RFs of some neurons have different spatial profiles (and therefore different spatial phases). Subsequently, they also found that phase is the only RF parameter that can be quite different between the two eyes (DeAngelis et al. 1995; Ohzawa et al. 1996). In these studies, however, they did not know the location of the corresponding points on the retinae, and RF position disparities were never examined.
The above mentioned studies have established the existence of RF position and phase disparities and have hypothesized the two encoding mechanisms. However, because each of these studies has examined only one of the two hypotheses, the relative roles of the two encoding mechanisms have not been determined. It is still not clear whether simple cells encode binocular disparity through both RF position and phase disparities or through either one alone. Therefore the question of how neurons encode binocular disparity still remains open and cannot be resolved unless one examines both RF position and phase disparities for individual neurons at the same time.
However, obtaining RF position and phase disparities for individual neurons in a single experiment is not an easy task. To estimate RF phase disparity, one needs to obtain detailed profiles of left and right eye RFs. This can be done with reasonable accuracy using one of the techniques of white noise analysis (e.g., Citron et al. 1981; DeAngelis et al. 1993; Jacobson et al. 1993; Jones and Palmer 1987; Ohzawa et al. 1990; Reid and Shapley 1992; Reid et al. 1997). Because mapping RFs takes time, usually 5–20 min, the animal has to be paralyzed to prevent the eyes from moving. This creates a problem for measurements of RF position disparity. In a paralyzed preparation, eye muscles are relaxed and the visual axes of the eyes deviate from a normal fixation position. As a consequence, it is very difficult to locate corresponding points on the retinae and use them to measure RF position disparity (see Fig. 1 A). Although there have been attempts to determine the corresponding points based on landmarks on the retinae (e.g., Barlow et al. 1967; Nikara et al. 1968; von der Heydt et al. 1978), the procedure is rather complicated and subjective and introduces many possible sources of errors.
To get around this problem, RF position disparity can be measured with respect to the RF position of a reference cell instead of corresponding points (Ferster 1981; Hubel and Wiesel 1970; LeVay and Voigt 1988). Statistical considerations indicate that a distribution of RF position disparities for a population of neurons can be obtained and that RF position disparities of individual neurons can be estimated with a known amount of uncertainty (see methods for details).
Here, the long-standing question of how neurons encode binocular disparity finally is addressed appropriately by measurements of both RF position and phase disparities for individual simple cells with a reference-cell method and an RF mapping technique using binary m-sequence noise (Reid and Alonso 1995; Reid and Shapley 1992; Reid et al. 1997; Sutter 1987, 1992). Relative contributions of RF position and phase disparities to the encoding of binocular disparity are examined in relation to various RF parameters. Preliminary results of this study have been reported (Anzai et al. 1997).
Extracellular recordings were made from single neurons in the striate cortex of anesthetized and paralyzed adult cats. Thirty minutes before anesthesia, acepromazine maleate (0.2 mg/kg) and atropine sulfate (0.04 mg/kg) were injected subcutaneously. Surgery was performed under 2–3% isoflurane anesthesia.
A femoral vein was cannulated for intravenous infusion, a tracheal tube and a rectal thermometer were inserted, and electrocardiograph (ECG) leads and electroencephalograph (EEG) screw electrodes were positioned. A craniotomy (∼5 mm diam) was performed around Horsley-Clarke coordinates P4 L2 and the dura was carefully removed. Two tungsten-in-glass electrodes (Levick 1972) were positioned just above the surface of the cortex at an angle of 10° medial and 20° anterior, and the opening hole was closed with agar and sealed with wax to form a closed chamber.
During recording, the animal was anesthetized and paralyzed by intravenous infusion of a mixture of thiopental sodium (Pentothal, 1.0 mg · kg⁻¹ · h⁻¹) and gallamine triethiodide (Flaxedil, 10 mg · kg⁻¹ · h⁻¹), combined with a 5% dextrose and lactated Ringer solution (0.5 ml · kg⁻¹ · h⁻¹). In addition, the animal received intravenous infusion of a 5% dextrose and lactated Ringer solution (5 ml · kg⁻¹ · h⁻¹) to prevent dehydration. The animal was respired artificially with a mixture of N₂O (70%) and O₂ (30%) at 25 strokes/min. The body temperature, end-tidal CO₂, heart rate, ECG, and EEG were monitored continuously through a PC-based physiological monitoring and analysis system (Ghose et al. 1995). The body temperature was maintained near 38°C and end-tidal CO₂ at 4–4.5%. Intratracheal pressure also was monitored. The pupils were dilated with 1% atropine sulfate, and nictitating membranes were retracted with 10% phenylephrine hydrochloride. Contact lenses (+2D) with artificial pupils of 4 mm in diameter were placed on both corneas. Every 12 h, the contact lenses were removed and cleaned, and the clarity of the refractive media was checked with an ophthalmoscope.
Small lesions were made along each recording track when the electrodes were withdrawn. At the end of an experiment, the animal was killed with an overdose of pentobarbital sodium (Nembutal), perfused, and fixed through the heart with a buffered 0.9% saline solution followed by 10% formalin. A block of cortex around the recording site was removed and sectioned at 40-μm intervals. From thionin-stained sections, the cortical laminae were identified, and electrode tracks were reconstructed.
The animal was positioned in front of a tangent screen on which a bar stimulus of variable size and orientation could be swept throughout the visual field in any direction. A PC-based visual stimulator displayed various visual stimuli on two independent CRT displays (Nanao T2-17, 28 × 22 cm active display area, 76-Hz refresh rate), thereby allowing independent stimulation of the two eyes. Each stimulus display was positioned on one side of the animal at a distance of ∼57 cm and subtended a visual angle of 28 × 22°. The stimuli were visible to the animal by reflection from semisilvered mirrors. The displays were adjusted to have a mean luminance of 20 cd/m² as seen through the semisilvered mirrors. A personal computer (Pentium-90 MHz) controlled the visual stimulator and acquired data.
Action potentials were recorded with a pair of electrodes that were separated laterally by 400–600 μm. Signals were amplified, sent to a spike discriminator, and monitored through speakers and oscilloscopes. The isolated spikes were stored along with codes to indicate the time of occurrence (at a resolution of 40 μs) and the identity of the input channel. Data were written to a hard disk and displayed in real time on a monitor screen.
The optic disk for each eye was projected with a reversible ophthalmoscope onto the tangent screen. The electrodes then were advanced while a search was made for neural activity evoked by sweeping a bright bar across the tangent screen. Once a spike was isolated, the RF of the neuron was located on CRT displays with a drifting sinusoidal grating, and the neuron’s preferred orientation and spatial frequency were determined qualitatively for each eye.
The tuning of the neuron for orientation, spatial frequency, temporal frequency, and relative interocular spatial phase then was examined quantitatively. A drifting sinusoidal grating of 40–50% contrast was presented for 4 s to either eye in a randomly interleaved manner. For the tuning of relative interocular spatial phase, gratings were presented dichoptically. The temporal frequency of the grating was set at 2 Hz except for the temporal frequency tuning measurement. Other parameters of the grating were set to the optimal values that were obtained qualitatively until the optimal values were estimated quantitatively from each tuning test. During each tuning test, the value of the tuning parameter was varied, and stimuli were presented in a random order. Optimal stimulus parameters were determined based on either mean spike rates (DC responses) or the first harmonic responses at the temporal frequency of the grating, whichever was greater. Neurons were classified as simple if the first harmonic response was greater than the DC response (Skottun et al. 1991) and/or if on (bright excitatory) and off (dark excitatory) subregions were clearly defined (Hubel and Wiesel 1959). Otherwise, they were classified as complex.
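The first-harmonic criterion above can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function names, the binned-rate input, and the amplitude convention (F1 taken as twice the magnitude of the Fourier coefficient at the grating's temporal frequency) are assumptions, and the subregion-based criterion is not covered.

```python
import cmath
import math

def dc_and_f1(psth, bin_s, temporal_freq_hz):
    """Return (DC, F1) of a response histogram.

    psth: firing rate per time bin (spikes/s), assumed to span a whole
          number of stimulus cycles.
    bin_s: bin width in seconds.
    temporal_freq_hz: temporal frequency of the drifting grating.
    """
    n = len(psth)
    dc = sum(psth) / n  # mean firing rate
    # Fourier coefficient at the stimulus temporal frequency
    coeff = sum(rate * cmath.exp(-2j * math.pi * temporal_freq_hz * k * bin_s)
                for k, rate in enumerate(psth)) / n
    f1 = 2.0 * abs(coeff)  # peak amplitude of the first harmonic
    return dc, f1

def classify(psth, bin_s, temporal_freq_hz):
    """Label a cell 'simple' when F1 exceeds DC, otherwise 'complex'."""
    dc, f1 = dc_and_f1(psth, bin_s, temporal_freq_hz)
    return "simple" if f1 > dc else "complex"
```

A half-wave rectified sinusoidal response, typical of a strongly modulated cell, has F1 larger than DC and is labeled simple; a nearly constant elevated rate is labeled complex.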
RF MAPPING WITH BINARY M-SEQUENCE NOISE.
The RF mapping technique employed here was adopted from a systems-analysis method originally developed by Sutter (1987, 1992). Receptive fields of neurons were mapped with white noise stimuli generated according to binary m-sequences (Golomb 1982; Zierler 1959). Two kinds of stimulus configuration were used: two- and one-dimensional (2D and 1D) patterns (Fig. 2). In the 2D case, a square patch, large enough to completely contain the RF of the neuron, was divided into 12 × 12 square elements. The size of each element was set to approximately one-fourth of the optimal spatial period (the inverse of the optimal spatial frequency) of the grating. Four elements at each corner were eliminated so that the total number of elements in a patch was always 128 (144 − 16 = 128, a power of 2) for optimal measurements (Sutter 1992). In the 1D case, a square patch was divided into 16 rectangular elements. The square patch was rotated so that the orientation of the rectangular elements coincided with the optimal value for the cells recorded. In 2D mapping, RFs of multiple cells can be measured simultaneously regardless of the cells’ orientation preferences. In 1D mapping, however, all cells must have similar orientation preferences for their RFs to be obtained simultaneously.
Two square patches were presented simultaneously, one for each eye. The luminance of each element in the patches was modulated every 40 ms according to m-sequences and took binary values, either +18 cd/m² or −18 cd/m² from the mean luminance of the CRT display. The stimulus sequences for all elements were derived from a single m-sequence, but they were shifted temporally by ≥2.5 s from one another. This ensured that the luminance modulation of each element was uncorrelated in space and time, both within each eye and between the two eyes, for the purpose of RF mapping. There are many different sequences for a given length (period) of m-sequences. Some provide more accurate estimates of the RF than others. This is because responses due to nonlinear interactions among stimulus elements could appear as a part of the RF (the anomaly problem, see Sutter 1992 for details) depending on the sequence. We screened m-sequences beforehand and used only those that were relatively free from the potential anomaly problem. The m-sequences used in 2D measurements were ∼10 min long (a sequence period of 2¹⁴ − 1), and measurements were repeated using the same m-sequences but with inverted luminance polarity. This procedure, known as the inverse-repeat (see e.g., Sutter 1992), was another way to avoid the potential anomaly problem. The 1D measurements were carried out either in the same manner as the 2D measurements (inverse-repeated sequences of 2¹⁴ − 1 period) or with m-sequences ∼20 min long (a sequence period of 2¹⁵ − 1).
Each spike train recorded as a response to binary m-sequence noise was cross-correlated with the stimulus sequences by means of the fast m-transform (Sutter 1991) to obtain RF maps. The cross-correlation between stimulus sequences in the left eye and a spike train yielded a left eye RF, whereas the cross-correlation between stimulus sequences in the right eye and the spike train yielded a right eye RF. Details of how RFs are constructed are described in Anzai et al. (1999a; see also Anzai 1997). The RFs were analyzed to obtain RF position and phase disparities for individual simple cells. The RFs were fitted first with a Gabor function (Gabor 1946). Then the parameters of the best fitting Gabor function were used for computing RF position and phase disparities.
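For readers unfamiliar with m-sequences, the sketch below generates one period of a binary m-sequence with a linear feedback shift register and converts a sequence length into a stimulus duration at 40 ms per frame. The tap positions and the short degree-4 register in the usage note are illustrative assumptions; the sequences the authors screened and used are not specified in the text.

```python
def m_sequence(taps, degree):
    """Generate one period of a binary m-sequence as +1/-1 values.

    taps: 1-based register positions XORed to form the feedback bit.
          They must correspond to a primitive polynomial for the
          sequence to reach the maximal period 2**degree - 1.
    degree: number of register stages.
    """
    state = [1] * degree  # any nonzero seed works
    out = []
    for _ in range(2 ** degree - 1):
        out.append(1 if state[-1] else -1)  # output the last stage
        feedback = 0
        for t in taps:
            feedback ^= state[t - 1]
        state = [feedback] + state[:-1]  # shift and insert feedback
    return out

def duration_min(degree, frame_s=0.040):
    """Duration in minutes of one m-sequence period at frame_s per step."""
    return (2 ** degree - 1) * frame_s / 60.0
```

With `m_sequence((4, 3), 4)` the register cycles through all 15 nonzero states, and the circular autocorrelation of the ±1 sequence equals −1 at every nonzero lag, which is the property that makes shifted copies of one sequence nearly uncorrelated. `duration_min(14)` gives about 10.9 min and `duration_min(15)` about 21.8 min, consistent with the approximate durations quoted above.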
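The cross-correlation step can be illustrated with a direct (slow) computation. This sketch does not implement the fast m-transform; it uses plain summation on hypothetical inputs, and the function names are assumptions introduced for illustration.

```python
def reverse_correlate(stim, spikes, max_delay):
    """Cross-correlate a spike train with each stimulus element.

    stim[e][t]: +1/-1 luminance of element e on frame t.
    spikes[t]: spike count on frame t.
    Returns rf[e][d]: mean of spikes[t] * stim[e][t - d] over frames,
    for delays d = 0 .. max_delay (in frames).
    """
    n_frames = len(spikes)
    rf = []
    for element in stim:
        row = []
        for d in range(max_delay + 1):
            acc = sum(spikes[t] * element[t - d] for t in range(d, n_frames))
            row.append(acc / (n_frames - d))
        rf.append(row)
    return rf

def optimal_delay(rf):
    """Delay at which the sum of squared RF values is largest."""
    n_delays = len(rf[0])
    energy = [sum(row[d] ** 2 for row in rf) for d in range(n_delays)]
    return energy.index(max(energy))
```

Running the function once with left eye stimulus sequences and once with right eye sequences against the same spike train yields the two monocular maps. `optimal_delay` mirrors the sum-of-squares criterion described for selecting the spatial profile to fit.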
A spatial profile of each RF at the optimal cross-correlation delay (the delay at which the sum of squared values of all data points in the RF is maximum; the same delay is used for both left and right eye RFs) was fitted with a Gabor function (Gabor 1946) using a Levenberg-Marquardt method (Press et al. 1992). The exact formulae of the functions used to fit 1D and 2D RF profiles are described in the appendix (Fig. A1). Briefly, a Gabor function is the product of a Gaussian envelope and a sinusoid. The RF center coordinates and the RF phase are obtained as the center coordinates of the Gaussian envelope (X_o and Y_o) and the phase of the sinusoid (φ), respectively, and are used to compute RF disparities as described in the next section.
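A 1D Gabor function of the kind described can be written as below. The parameterization (amplitude `k`, envelope width `sigma`, phase in degrees measured relative to the envelope center) is an assumption chosen for illustration and may differ from the formulae the authors actually fitted.

```python
import math

def gabor_1d(x, k, x0, sigma, freq, phase_deg):
    """1D Gabor: Gaussian envelope centered at x0 times a sinusoid.

    x, x0, sigma: positions and envelope width in deg VA.
    freq: spatial frequency in cycles/deg.
    phase_deg: phase of the sinusoid in deg PA, relative to x0.
    """
    envelope = math.exp(-((x - x0) ** 2) / (2.0 * sigma ** 2))
    carrier = math.cos(2.0 * math.pi * freq * (x - x0)
                       + math.radians(phase_deg))
    return k * envelope * carrier
```

Under this convention a phase of 0° gives an even-symmetric profile about `x0` and a phase of ±90° gives an odd-symmetric one, so two RFs with identical envelopes but different phases have different spatial profiles.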
ESTIMATING RF PHASE DISPARITY.
An RF phase disparity (d_P) was obtained as the difference between RF phases for the left and right eyes (see appendix for a formal definition). Phase disparity can be expressed in two ways: deg in phase angle (deg PA), and deg in visual angle (deg VA). Phase disparity in deg VA can be derived from phase disparity in deg PA by taking the RF spatial frequency (the frequency of the sinusoid in Fig. A1) into account. There is an important distinction between phase disparities expressed in deg PA and deg VA. Phase disparity in deg PA indicates the similarity or dissimilarity between spatial profiles of the left and right eye RFs, whereas phase disparity in deg VA indicates a spatial offset between sinusoidal components of the left and right eye RFs. The latter is comparable with position disparity, but the former is not.
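The conversion between the two units can be sketched as follows. The sign convention (left minus right) and the function names are assumptions; the conversion itself follows from one full cycle (360 deg PA) spanning one spatial period (1/f deg VA).

```python
def wrap_deg(angle):
    """Wrap an angle in degrees into the interval [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0

def phase_disparity_pa(phi_left, phi_right):
    """Phase disparity in deg PA (assumed convention: left minus right)."""
    return wrap_deg(phi_left - phi_right)

def phase_disparity_va(d_pa, sf_cpd):
    """Convert phase disparity from deg PA to deg VA.

    sf_cpd: RF spatial frequency in cycles/deg.
    """
    return d_pa / (360.0 * sf_cpd)
```

For example, a 90 deg PA disparity corresponds to 0.5 deg VA at 0.5 cycles/deg but only 0.125 deg VA at 2 cycles/deg, which is why cells tuned to high spatial frequencies can carry only small phase disparities in visual angle.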
ESTIMATING RF POSITION DISPARITY USING A REFERENCE-CELL METHOD.
An RF position disparity was estimated using a reference-cell method (Ferster 1981; cf. Hubel and Wiesel 1970; LeVay and Voigt 1988), which is illustrated in Fig. 3 A. This method requires RFs of at least two binocular simple cells recorded simultaneously. For each RF measurement, cells are grouped in distinct pairs. One member of each pair, which is chosen randomly, is regarded as a reference cell and the RF position disparity of the other member is measured relative to the RF position disparity of the reference cell. In other words, the RF position disparity of a cell is obtained as the distance in deg VA between the centers of the cell’s left and right eye RFs while the RF position disparity of the reference cell is assumed to be zero. An RF position disparity measured here is, therefore, the relative position disparity of one cell to that of a reference cell. However, as illustrated in Fig. 3 B, the standard deviation of the population distribution for true position disparity is expected to be smaller than that for relative position disparity by a factor of √2 (see appendix for proof). This is simply because two samples drawn randomly from a distribution sometimes add and sometimes cancel each other when they are summed (or their difference is taken), and as a result, the distribution of the sum (or difference) becomes broader than the original. Therefore the population distribution of true position disparities can be recovered from the distribution of relative position disparities for a population of cells. Furthermore, true position disparities of individual neurons can be estimated with a specified amount of uncertainty (see the legend of Fig. 3 B). Using this method, the RF position disparity along the direction perpendicular to the RF orientation (d_X), which is also the direction in which the RF phase disparity is measured, was estimated.
In addition, for 2D RF data, the RF position disparity along the direction parallel to the RF orientation (d_Y) was estimated (see appendix for formal definitions of the RF position disparities).
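The √2 relationship can be checked numerically with a small simulation. This is an illustrative sketch under the stated independence assumption, with normally distributed true disparities and a hypothetical population standard deviation; it is not the authors' procedure.

```python
import math
import random
import statistics

def simulate_relative_disparities(sigma_true, n_pairs, seed=0):
    """Relative disparity of a cell with respect to a reference cell.

    Both true disparities are drawn independently from a normal
    distribution with standard deviation sigma_true (deg VA).
    """
    rng = random.Random(seed)
    return [rng.gauss(0.0, sigma_true) - rng.gauss(0.0, sigma_true)
            for _ in range(n_pairs)]

def estimate_true_sd(relative):
    """Recover the SD of true disparities from relative disparities."""
    return statistics.pstdev(relative) / math.sqrt(2)
```

With `sigma_true = 0.4` and 20,000 simulated pairs, `estimate_true_sd` returns a value close to 0.4, whereas the raw standard deviation of the relative disparities is close to 0.4 × √2 ≈ 0.57.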
ESTIMATING MEASUREMENT ERRORS ASSOCIATED WITH POSITION AND PHASE DISPARITY DATA.
It is important to know how much measurement variability there is in each disparity estimate. Unfortunately, RFs are usually measured only once for each cell, and we do not have multiple independent estimates of the RF disparity to compute variability. Alternatively, we can compute the amount of variation in the disparity estimates using a Monte Carlo simulation (e.g., Press et al. 1992). The Monte Carlo simulation is a standard technique for generating random samples of data, simulating independent measurements.
Here, instead of generating random samples of RF maps, we generated random samples of parameters of the best-fitting Gabor function for each RF. This served our purpose because the best fitting parameters are subject to random variation due to variability in RF measurements. We assumed that the error associated with each parameter was additive and conformed to a normal distribution. The fitting algorithm (Levenberg-Marquardt) used in our study provided best-fitting parameters as well as the amount of variation in each parameter and the amount of covariation between any two parameters. Using the variance-covariance information, we generated random samples of the parameters and computed RF position and phase disparities for each cell. By repeating this a large number of times (10,000), distributions of phase and position disparities were obtained. The standard deviations of the distributions represent standard errors associated with estimates of position and phase disparities.
We have obtained either 2D or 1D (or both) profiles of left and right eye RFs from 97 simple cells in 14 adult cats. Of these, 48 cells were recorded individually under conditions in which RF position disparity could not be determined. The remaining 49 cells were either from pair recordings (20 cases) or from trio recordings (3 cases). For each cell, an RF phase disparity d_P was obtained. A total of 23 multiple-cell recordings yielded 29 distinct cell pairs, and an RF position disparity d_X was estimated for each of these pairs using a reference-cell method. Among these pairs, there were 15 cases in which 2D RF maps were obtained so that an RF position disparity d_Y also could be estimated.
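The parameter-resampling idea can be sketched for the simplest case of two fitted parameters whose difference defines a disparity. The restriction to two parameters, the hand-written 2 × 2 Cholesky factorization, and the numerical values in the usage note are assumptions for illustration; the actual analysis sampled the full set of Gabor parameters.

```python
import math
import random
import statistics

def sample_correlated_pair(mean, cov, rng):
    """Draw one sample from a 2D normal using a 2x2 Cholesky factor.

    mean: (m1, m2); cov: ((var1, cov12), (cov12, var2)).
    """
    (a, c), (_, b) = cov
    l11 = math.sqrt(a)
    l21 = c / l11
    l22 = math.sqrt(b - l21 ** 2)
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    return mean[0] + l11 * z1, mean[1] + l21 * z1 + l22 * z2

def disparity_standard_error(mean, cov, n_samples=10000, seed=0):
    """Monte Carlo standard error of the difference of two parameters."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_samples):
        first, second = sample_correlated_pair(mean, cov, rng)
        diffs.append(first - second)
    return statistics.stdev(diffs)
```

For hypothetical variances of 25 and 16 with covariance 10, the simulated standard error approaches √(25 + 16 − 2 × 10) = √21 ≈ 4.58, the value expected analytically for a difference of correlated normal variables.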
Examples of RF maps
Each panel in Fig. 4 shows an example of left and right eye RFs for a pair of simple cells recorded simultaneously. As reported previously (DeAngelis et al. 1991, 1995; Freeman and Ohzawa 1990; Ohzawa et al. 1996), left and right eye RFs can have different spatial profiles. For instance, the right eye RF of cell A and the left eye RF of cell B shown in Fig. 4 A are approximately even-symmetric, whereas their RFs in the other eye are closer to odd-symmetric. The difference between the left and right eye RF profiles indicates an RF phase disparity. On the other hand, RF position disparities (the distance between the centers of left and right eye RFs for cell B minus that for cell A, a reference cell) appear to be relatively small in the examples shown. In general, we find that the left eye RFs of cells A and B overlap in a manner similar to their right eye RFs, i.e., their relative locations in one eye are comparable with those in the other eye.
We have fitted each RF with a Gabor function, which generally provides a good fit. The parameters of the best fitting function, such as the width of the Gaussian (RF size) and the frequency of the sinusoid (RF spatial frequency), are matched well in the two eyes as described previously (DeAngelis et al. 1995; Ohzawa et al. 1996). The difference in RF orientation between the two eyes is always very similar for cells recorded simultaneously. This confirms the previous finding of Nelson et al. (1977) that the interocular orientation disparity of each cell is minimal once RF orientations are corrected for cyclorotation of the eyes due to paralysis. On the basis of the parameters of the best fitting Gabor function, we have computed RF phase and position disparities.
Histograms of RF position and phase disparities
Figure 5 shows histograms of RF position and phase disparities for a population of simple cells. In Fig. 5 A, a histogram of phase disparity in deg PA is shown. Cells for which position disparities also are estimated (matched samples) are shown (■). The phase disparities are distributed around zero, indicating that cells with similar RF profiles in the two eyes are most numerous. However, the distribution is rather broad; there are also many cells with dissimilar RF profiles in the two eyes. The phase disparities are mostly limited within ±90°. It has been suggested that, because of the cyclic nature of phase, phase disparity must be limited to a quarter cycle (90°) in order for band-pass filters to unambiguously encode binocular disparity (Blake and Wilson 1991; Marr and Poggio 1979). The data shown in Fig. 5 A demonstrate that the visual system by and large satisfies this requirement.
The phase disparity histogram is replotted in Fig. 5 B in terms of deg VA so that it can be directly compared with the position disparity histograms shown in Fig. 5, C and D. Both the position and phase disparities are distributed around zero, and the disparities of most cells are within ±1 deg VA. This range corresponds roughly to the limits of binocular fusion in cats (Packwood and Gordon 1975). The standard deviations of the distributions for position disparities d_X and d_Y are 0.52 and 0.62 deg VA, respectively. These values divided by √2, i.e., 0.37 and 0.44, are the estimated standard deviations of the distributions for true position disparities (see Fig. 3 B). These numbers are comparable with the results of the recent study by Hetherington and Swindale (1997; also personal communication) who measured RF position disparities of neurons in the cat’s striate cortex using a tetrode (e.g., Gray et al. 1995; Wilson and McNaughton 1993) and a variation of the reference-cell method. The standard deviation for the phase disparity distribution is 0.59 deg VA (0.68 deg VA for the matched sample distribution), which is 1.6 (1.8 for the matched sample distribution) times greater than that of the distribution for true position disparity d_X. Statistical analysis indicates that the phase disparity distribution has a larger variance compared with the distribution for true position disparity [F test: F ratio = 2.546, df = (96, 28), P < 0.01; with the matched sample distribution, F ratio = 3.403, df = (28, 28), P < 0.01]. Thus position disparity is limited to a relatively small range compared with that of phase disparity.
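The arithmetic behind these figures can be retraced from the rounded standard deviations quoted in the text. This is a consistency check only; because it starts from rounded inputs, the variance ratios it produces are close to, but not identical with, the reported F ratios.

```python
import math

def true_sd(relative_sd):
    """SD of true position disparity from the SD of relative disparity."""
    return relative_sd / math.sqrt(2)

def variance_ratio(sd_a, sd_b):
    """Ratio of variances given two standard deviations."""
    return (sd_a / sd_b) ** 2

sd_true_dx = true_sd(0.52)   # about 0.37 deg VA
sd_true_dy = true_sd(0.62)   # about 0.44 deg VA
ratio_all = 0.59 / sd_true_dx       # about 1.6
ratio_matched = 0.68 / sd_true_dx   # about 1.8
f_all = variance_ratio(0.59, sd_true_dx)       # near the reported 2.546
f_matched = variance_ratio(0.68, sd_true_dx)   # near the reported 3.403
```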
Factors that may contribute to the difference between disparity histograms
It is possible that the difference between the position and phase disparity distributions may be due to differences in the amount of error associated with the estimates of position and phase disparities. To examine this possibility, a Monte Carlo simulation has been conducted to obtain a standard error for each disparity estimate. More than 90% of the standard errors are <0.25 deg VA. Mean values of the standard errors for position disparitiesd X andd Y and phase disparity are 0.12, 0.1, and 0.12 deg VA, respectively. Therefore errors in the disparity estimates are comparable for position and phase disparities and cannot account for the difference between the distributions of position and phase disparities.
Another possible factor that may contribute to the difference in standard deviation between the distributions is local clustering of cells with similar position disparities. The assumption necessary for the reference-cell method used in this study to work is that the true position disparities of cells recorded simultaneously are independent, or uncorrelated (see Fig. 3 B). If this assumption does not hold, the factor used to estimate the standard deviation of the distribution for true position disparity would be something other than (see Eq. EA15 in the ). Suppose that there was a negative correlation between the true position disparities of cells recorded simultaneously, i.e., one cell exhibits a crossed disparity and the other, an uncrossed disparity. Then the factor would be more than and using a factor of would be overestimating the standard deviation of the true position disparity distribution. Therefore the conclusion drawn in the previous section would not change. However, if there was a positive correlation, i.e., true position disparities of nearby neurons were similar, then the factor would be something between 0 and . In this case, using a factor of would underestimate the standard deviation of the true position disparity distribution and would contribute to the difference between standard deviations of distributions for position and phase disparities.
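The dependence of the correction factor on correlation follows from the variance of a difference of two random variables. The derivation below is a standard result written with symbols introduced here for illustration (σ for the standard deviation of true position disparity, ρ for the correlation between the true disparities of a cell and its reference cell); its notation may differ from that of the authors' own equation.

```latex
\begin{align}
\sigma_{\mathrm{rel}}^{2}
  &= \operatorname{Var}(d_{1} - d_{2})
   = \operatorname{Var}(d_{1}) + \operatorname{Var}(d_{2})
     - 2\operatorname{Cov}(d_{1}, d_{2}) \\
  &= 2\sigma^{2}(1 - \rho), \\
\sigma &= \frac{\sigma_{\mathrm{rel}}}{\sqrt{2(1 - \rho)}} .
\end{align}
```

With ρ = 0 the factor is √2; with ρ < 0 it exceeds √2 (up to 2 at ρ = −1); with 0 < ρ < 1 it lies between 0 and √2, in agreement with the cases discussed above.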
Unfortunately, it is not possible to determine if there is such a correlation between true position disparities of cells recorded simultaneously. However, if cells with similar preference for binocular disparity were clustered (i.e., the sum of position and phase disparities is similar for nearby cells) and their position disparities were correlated, then their phase disparities also would have to be correlated. Whether phase disparities of cells recorded simultaneously are correlated can be examined. In Fig. 6, phase disparities of individual cells are plotted against those of reference cells. Data obtained from pair recordings made through a single electrode are shown as open circles, and those in which a pair of cells were recorded from different electrodes that are separated by 400–600 μm are shown as filled circles. No correlation is evident in this plot (correlation coefficient r = −0.06, R² = 0.3%), indicating that phase disparities of nearby cells are not correlated. This suggests that if cells with similar preferences for binocular disparity were clustered, position disparities of these cells would not be correlated. However, it has been reported that preferred binocular disparities of nearby cells are correlated only weakly (LeVay and Voigt 1988). Therefore it is still possible, though unlikely, that position disparities of nearby cells are somewhat correlated.
Relationship between RF position and phase disparities
Although the range of position disparities is smaller than that of phase disparities, position disparities still may contribute to the overall preference of cells for binocular disparity. How does a position disparity contribute to the cell’s preferred disparity? Does it always add to phase disparity to yield a cell’s preferred disparity that is larger than either the phase or position disparity? Or does it always cancel a phase disparity? In Fig. 7, position disparities of individual cells are plotted against their phase disparities. No correlation is found between position and phase disparities (correlation coefficient r = 0.12, R² = 1.45%), suggesting that position and phase disparities are largely independent of each other. In other words, they may add up or partially cancel each other.
Relationship between disparity and RF orientation
It has been shown that RF profiles for the left and right eyes are relatively matched for cells tuned to horizontal orientations, whereas those for cells tuned to vertical orientations are predominantly dissimilar (DeAngelis et al. 1991, 1995; Ohzawa et al. 1996). This finding is confirmed by the data reported here. In Fig. 8 A, magnitudes of phase disparities in deg PA are plotted for individual cells as a function of RF orientation. Orientations of 0 and 90° correspond to horizontal and vertical, respectively. Cells tuned to horizontal orientations tend to have small phase disparities, indicating that left and right eye RFs of these cells have relatively similar spatial profiles. In contrast, phase disparities of cells tuned to more oblique and vertical orientations are spread along the y axis, indicating that the spatial profiles of left and right eye RFs are quite different for some cells. A statistical analysis indicates that the distribution of phase disparity for cells tuned to orientations within ±20° from horizontal has a smaller variance compared with the distribution for cells tuned to orientations within ±20° from vertical [F test: F ratio = 2.995, df = (18,25), P < 0.01]. Limiting data points to matched samples (○) does not alter the statistical significance [F test: F ratio = 6.826, df = (10,7), P < 0.01].
This result implies that cells tuned to horizontal orientations encode a small range of binocular disparity compared with cells tuned to vertical orientations. This orientation anisotropy is expected because binocular parallax yields a larger range of binocular disparities along horizontal than vertical directions due to the fact that the eyes are displaced laterally. In Fig. 8 B, magnitudes of position (●) and phase disparities (○ and ▵; ○ indicate matched samples) in deg VA are plotted as a function of RF orientation. As expected, there is a tendency for cells tuned to horizontal orientations to have small phase disparities compared with those tuned to vertical orientations. This tendency is also statistically significant; the distribution of phase disparity for cells tuned to orientations within ±20° from horizontal has a smaller variance compared with the distribution for cells tuned to orientations within ±20° from vertical [F test: F ratio = 2.935, df = (18,25), P < 0.01]. This is also true for the matched samples [F test: F ratio = 7.041, df = (10,7), P < 0.01]. A similar orientation anisotropy was reported by Barlow et al. (1967), who examined the range of cells’ preferred binocular disparities (but see Ferster 1981; LeVay and Voigt 1988). On the other hand, no orientation anisotropy is found for position disparity [F test: F ratio = 2.307, df = (10,7), P = 0.14], which is consistent with most previous studies that measured the position difference between left and right eye RFs (Joshua and Bishop 1970; Nikara et al. 1968; von der Heydt et al. 1978).
Relationship between disparity and RF spatial frequency
Figure 9 shows how position and phase disparities depend on RF spatial frequency. In Fig. 9 A, magnitudes of phase disparities in deg PA are plotted as a function of RF spatial frequency. As reported previously (DeAngelis et al. 1995; Ohzawa et al. 1996), there is no obvious tendency for cells tuned to different spatial frequencies to have different ranges of phase disparities (linear regression: slope = 0.49, P = 0.35; for the matched samples, slope = 0.31, P = 0.77). This suggests that the similarity or dissimilarity between spatial profiles of left and right eye RFs does not depend on RF spatial frequency.
In contrast, phase disparities in deg VA clearly show a dependency on RF spatial frequency. In Fig. 9 B, magnitudes of phase disparities (○ and ▵; ○ indicate matched samples) in deg VA are plotted, together with position disparities (●), as a function of RF spatial frequency. As a reference, phase disparities equivalent to 180 and 90 deg PA are indicated by the solid and dashed lines, respectively. Phase disparities are scattered below the solid line, suggesting that they can be used to encode a wide range of binocular disparities within the limit indicated (—). A linear regression analysis indicates that there is a tendency for phase disparity to decrease with spatial frequency (slope = −0.86, P < 0.01). This tendency becomes weaker (linear regression: slope = −1.05, P = 0.076) when data points are limited to matched samples (○). Nevertheless, the matched samples are still scattered widely below the solid line. In any case, the observed tendency is consistent with previous reports that the range of cells’ preferred binocular disparities (Pettigrew et al. 1968) and the width of binocular disparity tuning (Ferster 1981) increase with RF size.
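The relationship between the two units can be illustrated with a short sketch. It assumes only that one full cycle (360 deg PA) of a sinusoid with spatial frequency f (c/deg) spans 1/f deg VA; the function name and the example frequencies are hypothetical and are not taken from the data.

```python
def phase_to_visual_angle(phase_deg_pa: float, spatial_freq_cpd: float) -> float:
    """Convert a phase disparity in deg PA to deg VA at a given spatial frequency (c/deg)."""
    if spatial_freq_cpd <= 0:
        raise ValueError("spatial frequency must be positive")
    # One full cycle (360 deg PA) spans 1/f deg of visual angle.
    return (phase_deg_pa / 360.0) / spatial_freq_cpd


# Illustrative frequencies only (not measured values): the half-cycle and
# quarter-cycle limits shrink in deg VA as spatial frequency increases.
for freq in (0.25, 0.5, 1.0, 2.0):
    half_cycle = phase_to_visual_angle(180.0, freq)
    quarter_cycle = phase_to_visual_angle(90.0, freq)
    print(f"{freq:4.2f} c/deg: 180 deg PA = {half_cycle:.3f} deg VA, "
          f"90 deg PA = {quarter_cycle:.3f} deg VA")
```

Both reference limits are inversely proportional to spatial frequency, which is why a fixed range in deg PA corresponds to a progressively smaller range in deg VA at higher spatial frequencies.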
Unlike phase disparity, which is limited to ±180 deg PA by definition, position disparity has no such limit in theory. This is an important advantage of the disparity encoding scheme based on RF position disparity because it allows the visual system to encode a larger range of binocular disparity than would be possible with RF phase disparity. It becomes especially important at high spatial frequencies for which RF phase disparity in deg VA is necessarily small. However, the visual system does not appear to take advantage of this. As shown in Fig. 9 B, position disparities are generally very small (note that the spread of the position disparities along the vertical axis in the figure would be even smaller by a factor of √2 for true position disparities), and most of them fall well below the 90 deg phase disparity line. They are relatively constant across spatial frequency (regression slope = −0.75, P = 0.089). Therefore unless RF spatial frequencies are very high, the range of binocular disparity that can be encoded would be larger with RF phase disparity than with RF position disparity.
These results also suggest that if the visual system were to encode binocular disparity through position disparity, its performance in binocular fusion and stereo tasks would not depend on stimulus spatial frequency, whereas if phase disparity were used, a dependence on spatial frequency would be expected. It has been reported that performance of human observers in binocular fusion and stereo tasks does depend on stimulus spatial frequency (DeValois 1982; Felton et al. 1972; Kulikowski 1978; Legge and Gu 1989; Richards and Kaye 1974; Schor and Wood 1983; Schor et al. 1984a,b; Smallman and MacLeod 1994). One such example is shown in Fig. 10. In the figure, the fusion limit of human observers is plotted as a function of stimulus spatial frequency. Data points (○) are replotted from a study by Schor et al. (1984b). They found that the fusion limit of human observers decreases with stimulus spatial frequency (size-disparity correlation) in a manner similar to the prediction of a phase encoding model (—), up to a spatial frequency of ∼2.5 c/deg. Beyond this spatial frequency, however, the performance of human observers deviates from the prediction and becomes constant (- - -). The physiological data reported here are concordant with the psychophysical data in the sense that phase disparity seems to provide the upper limit of binocular disparity at low spatial frequencies and position disparity provides a constant limit at high spatial frequencies.
Using a quantitative RF mapping technique, combined with a reference-cell method, RF position and phase disparities for simple cells in the cat’s striate cortex have been estimated. Position disparities are generally small and are only suitable for encoding small binocular disparities. They do not show any correlation with RF orientation or spatial frequency. It seems, therefore, that RF position disparity may be a byproduct of random jitter in RF position. On the other hand, phase disparities cover a wide range of binocular disparities and exhibit an orientation anisotropy. They are generally within the quarter cycle limit and provide a basis for the size-disparity correlation observed in psychophysics. Considered together, these results strongly favor the notion that binocular disparity is encoded mainly through RF phase disparity. However, RF position disparity still may play an important role in encoding binocular disparity at high spatial frequencies for which RF phase disparity becomes necessarily small in deg VA.
RF position and phase for encoding image phase
In their ground-breaking work, Hubel and Wiesel (1962) found that neurons in the striate cortex responded to elongated slits or bars and oriented edges more effectively than to diffuse light or spots. Subsequently these neurons were considered bar and edge detectors (e.g., Barlow 1972; Bishop et al. 1971; Hubel 1963). Neurons selective to binocular disparity also were considered as detectors of the monocular trigger features that are located at slightly different retinal locations in the two eyes (Barlow et al. 1967; Bishop 1973; Maske et al. 1984). In a sense, binocular disparity was yet another trigger feature. The idea of feature detectors seemed to fit with the notion that the visual system analyzes an image first by decomposing it into simple features like bars and edges (Lindsay and Norman 1972; Neisser 1967).
However, the introduction of sinusoidal grating stimuli (Campbell and Robson 1968; Robson 1966; Schade 1956) for analysis of the visual system in the frequency domain provided an alternative view: the visual system analyzes an image first by decomposing it into various spatial frequency components. This notion received strong support from the various studies that demonstrated that responses of neurons in the striate cortex can be predicted not by the bars and edges that are contained in the stimulus but by the frequency components in the stimulus (e.g., Albrecht and DeValois 1981; DeValois 1982; DeValois and DeValois 1988; DeValois et al. 1978, 1979; Maffei et al. 1979; Pollen and Ronner 1982). It now is accepted widely that one of the main functions of neurons in the striate cortex is to perform band-pass filtering in the spatiotemporal frequency domain.
As frequency-based processing devices, these neurons do not encode the position of bars and edges per se; they encode the position of frequency components in the image, namely the phase. Indeed, the importance of image phase has been pointed out by many researchers (e.g., Morrone and Burr 1988; Oppenheim and Lim 1981; Piotrowski and Campbell 1982). Likewise, binocular disparity is not encoded as the relative position of bars and edges in the images between the two eyes but as the relative phase of the frequency components in the images (DeValois and DeValois 1988). This view is consistent with computational studies that demonstrated that phase disparities in a stereo image can be used to compute the binocular disparity of the image (Jenkin and Jepson 1988) and solve the correspondence problem (Sanger 1988).
Because position and phase are interchangeable in space, one could build a visual system that detects the relative image phase using Gabor-like RFs with a position offset, a phase offset, or both. If the relative phase is detected only through a position offset, then the RFs should be identical in shape. However, RFs of simple cells come in a variety of monocular phases (DeAngelis et al. 1993; Field and Tolhurst 1986; Jones and Palmer 1987) and also a variety of binocular phase disparities, as shown in this and previous studies (DeAngelis et al. 1991, 1995; Freeman and Ohzawa 1990; Ohzawa et al. 1996). In addition, the results presented here show that position disparities are relatively small compared with phase disparities. Therefore at least in the binocular domain, image phase is likely to be detected through a phase encoding mechanism. Although there have been no studies that have examined position and phase mechanisms for encoding monocular image phase, Pollen and Ronner (1981, 1982) reported that adjacent simple cells recorded simultaneously from the same electrode tend to be tuned to the same orientation and spatial frequency but differ in spatial phase by ∼90° (see also Liu et al. 1992). Therefore it is likely that a phase-encoding mechanism operates in the monocular domain as well. Recent models of cortical neurons explicitly use RF phases to encode monocular phase as well as binocular phase disparity (e.g., Fleet et al. 1996; Nomura et al. 1990; Ohzawa et al. 1990; Pollen and Ronner 1981, 1982; Qian 1994).
Despite the recent emphasis on phase encoding, however, one should not abandon position encoding altogether. One serious limitation of phase encoding is that the range of spatial displacement that can be encoded through phase decreases with increasing spatial frequency. This is simply because a constant phase angle spans a smaller and smaller distance in visual angle as spatial frequency increases. On the other hand, position does not have such a dependency on spatial frequency. Therefore as noted in the preceding text, position encoding may be quite useful at high spatial frequencies, at which the phase in degree visual angle becomes necessarily small.
Definition of RF position disparity and the aperture problem
In this study, RF disparities were measured in the direction perpendicular to the RF orientation. This is necessary in the case of phase disparity because it is defined as orthogonal to the RF orientation. However, position disparities could have been measured in any direction. For instance, a position disparity can be defined by a vector that connects the centers of the left and right eye RFs, irrespective of the RF orientation. Another way to define a position disparity is to project the vector onto the horizontal and vertical axes of the visual field. This definition may be suitable for direct comparisons with data from psychophysical studies for which horizontal and vertical disparities of stimuli are manipulated. If the vector is projected onto the axes parallel and orthogonal to the RF orientation, one obtains the definition used in this study. All of these definitions are equivalent, but it is not immediately clear which is most appropriate (Cumming 1997).
There are two practical reasons why position disparities had to be measured orthogonal to the RF orientation in this study. First, it is the only direction in which the position disparity is obtainable from 1D RF data. Second, to compare phase and position disparities, the position disparity must be measured in the same direction as the phase disparity. Although these reasons certainly dictated the way in which the analysis was conducted in this study, they do not necessarily provide a full justification for the choice of one definition over others. Are there any other justifications that have a more functional basis? To answer this question, one needs to consider if neurons in the striate cortex can distinguish directions in which binocular disparity is introduced.
Each neuron in the striate cortex has to encode a binocular disparity based on the information available within its left and right eye RFs. When an extended stimulus only containing a 1D pattern such as a sinusoidal grating is presented over the RFs, there are an infinite number of directions in which binocular disparity can be introduced to yield the same stereo image within the RFs. In Fig. 11, three examples of such a stimulus are illustrated. Although the amount and direction of binocular disparity (indicated by arrows) are different for each stimulus, the image within the RF (indicated by circles) of each eye is the same. Therefore responses of a neuron to these stimuli would be the same regardless of the direction of binocular disparity. It should be noted that the vector component of binocular disparity parallel to the orientation of the image pattern is different for each stimulus, whereas the vector component orthogonal to the pattern orientation is the same. In other words, changes of a stimulus in the direction parallel to the pattern orientation are not detectable, but those in the orthogonal direction are. This is called the aperture problem and is analogous to the aperture problem in identification of direction of motion (e.g., Movshon et al. 1985). Because neurons in the striate cortex respond best to stimuli that are elongated along the RF orientation, they can encode binocular disparity in the direction orthogonal to, but not parallel to, the RF orientation.
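The decomposition described above can be sketched numerically. The following is an illustration only: the function names, the sign convention for the orthogonal axis, and the example values (a pattern oriented at 45° and an orthogonal component of 0.2 deg) are assumptions made here and are not taken from the study.

```python
import math


def _axes(orientation_deg):
    """Unit vectors parallel (u) and orthogonal (n) to a pattern orientation.

    0 deg = horizontal, 90 deg = vertical; the sign of n is an arbitrary choice.
    """
    theta = math.radians(orientation_deg)
    u = (math.cos(theta), math.sin(theta))
    n = (math.sin(theta), -math.cos(theta))
    return u, n


def decompose_disparity(dx, dy, orientation_deg):
    """Return the (parallel, orthogonal) components of a disparity vector (dx, dy)."""
    u, n = _axes(orientation_deg)
    return dx * u[0] + dy * u[1], dx * n[0] + dy * n[1]


def make_disparity(parallel, orthogonal, orientation_deg):
    """Build a disparity vector (dx, dy) from its parallel and orthogonal components."""
    u, n = _axes(orientation_deg)
    return parallel * u[0] + orthogonal * n[0], parallel * u[1] + orthogonal * n[1]


# Three hypothetical disparity vectors that differ only in the component
# parallel to the pattern; a 1D pattern cannot distinguish among them.
ORIENTATION_DEG = 45.0
ORTHOGONAL_DEG_VA = 0.2
vectors = [make_disparity(p, ORTHOGONAL_DEG_VA, ORIENTATION_DEG)
           for p in (-0.3, 0.0, 0.4)]
components = [decompose_disparity(dx, dy, ORIENTATION_DEG) for dx, dy in vectors]

for (dx, dy), (par, orth) in zip(vectors, components):
    print(f"disparity=({dx:+.3f}, {dy:+.3f})  parallel={par:+.3f}  orthogonal={orth:+.3f}")
```

The three vectors differ in direction and length, yet their orthogonal components are identical, which is the only part of the disparity that a 1D pattern seen through the RF can signal.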
Psychophysical data also support this assertion; stereoacuity of human observers for 1D stimuli such as gratings and oriented Gabor patches decreases in proportion to the cosine of the stimulus orientation angle from the vertical in the frontal plane, and the depth threshold expressed in phase disparity at right angle to the stimulus orientation remains constant (e.g., Morgan and Castet 1997; see Howard and Rogers 1995 for a review). These results are consistent with the detection of binocular disparity by neurons in the striate cortex along the direction orthogonal to the stimulus orientation rather than along the horizontal axis of the visual field. Therefore the RF position disparity measured orthogonal to the RF orientation is indeed suitable for the analyses conducted in this study.
Position and phase encoding in other visual tasks
As described in results, psychophysical data such as those of Schor et al. (1984b) indicate that the performance of human observers in binocular fusion and stereo tasks consists of two parts: a spatial frequency dependent portion (at low spatial frequencies) and an independent portion (at high spatial frequencies). Interestingly, this dual behavior is apparently not unique to binocular fusion and stereopsis but also is found in various spatial tasks (Baker et al. 1989; Boulton and Baker 1991; Burr 1980; Burr et al. 1986; Chang and Julesz 1985; Cleary and Braddick 1990a,b; DeValois and DeValois 1988; Westheimer 1978; Yo et al. 1989). For example, DeValois and DeValois (1988) measured the threshold of human observers for displacement of sinusoidal gratings at various spatial frequencies. They found that, at spatial frequencies <2 c/deg, the threshold decreased with spatial frequency (see also Burr 1980; Yo et al. 1989), but for higher spatial frequencies the threshold was approximately constant (see also Westheimer 1978). Similar results also have been reported for measurements of maximum displacement (Dmax) for correct identification of direction in short-range apparent motion (Baker et al. 1989; Boulton and Baker 1991; Burr et al. 1986; Chang and Julesz 1985; Cleary and Braddick 1990a,b).
It is tempting to speculate that the dual behavior observed for binocular fusion and stereopsis, monocular displacement detection, and short-range apparent motion all share the same neural basis: a phase encoding mechanism for low spatial frequencies and a position encoding mechanism for high spatial frequencies. For the phase encoding mechanism to work properly, RF centers have to be at the same position. However, RF position is subject to slight random jitter (Hetherington and Swindale 1997). For RFs with low spatial frequency selectivity, this is not a problem because the amount of jitter is very small compared with the size of the RFs. For RFs with high spatial frequency selectivity, however, the amount of jitter may be significant compared with the size of the RFs, and position encoding becomes more reliable than phase encoding. To explain the dual behavior of their displacement threshold data, DeValois (1982; see also DeValois and DeValois 1988) proposed a two-stage model in which a phase processing stage (presumably at the striate cortex level) is followed by a position processing stage (extrastriate cortex). Our results suggest that both mechanisms may reside at the level of the striate cortex.
RF disparity and cells’ tuning for binocular disparity
In this study, RF position and phase disparities of simple cells have been examined as mechanisms through which binocular disparity is encoded. An implicit assumption here is that the cell’s preferred binocular disparity can be predicted from differences between left and right eye RFs, i.e., that the cell’s responses to binocular stimulation can be predicted from its responses to monocular stimulation. This assumption is true if signals from the left and right eyes are combined linearly or nearly so. If the binocular combination of signals is nonlinear, then the assumption may or may not hold depending on the type of nonlinearity.
There is some evidence that indicates that a combination of left and right eye signals is indeed linear. Ohzawa and Freeman (1986a) studied phase-specific binocular interactions of simple cells in the cat’s striate cortex using drifting sinusoidal gratings and suggested that the majority of binocular interactions may be accounted for by a simple linear summation of monocular signals and a threshold mechanism. In addition, it is shown in the following paper (Anzai et al. 1999a) that a simple cell can be modeled as a linear binocular filter followed by a static nonlinearity. Because the static nonlinearity only affects response amplitude and not peak locations of disparity tuning functions, the RF disparity of a cell should correspond to the optimal binocular disparity for that cell.
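The argument that a static nonlinearity leaves the peak of a disparity tuning function in place can be checked with a toy calculation. This is a sketch under stated assumptions, not the model of the following paper: the Gabor-shaped linear tuning function, its parameter values, and the half-squaring nonlinearity are all hypothetical choices made here for illustration. The argument requires only that the nonlinearity be monotonically increasing over positive responses.

```python
import math


def linear_response(disparity, freq=0.5, sigma=1.0, phase_deg=60.0):
    """Hypothetical output of a linear binocular filter as a function of disparity (deg VA)."""
    envelope = math.exp(-disparity ** 2 / (2.0 * sigma ** 2))
    carrier = math.cos(2.0 * math.pi * freq * disparity - math.radians(phase_deg))
    return envelope * carrier


def half_square(response):
    """Assumed static nonlinearity: half-wave rectification followed by squaring."""
    return max(response, 0.0) ** 2


def peak_location(xs, ys):
    """Return the x value at which ys is largest."""
    best_index = max(range(len(ys)), key=ys.__getitem__)
    return xs[best_index]


disparities = [i * 0.01 - 3.0 for i in range(601)]      # -3 to +3 deg VA
linear = [linear_response(d) for d in disparities]
output = [half_square(r) for r in linear]

peak_linear = peak_location(disparities, linear)
peak_output = peak_location(disparities, output)
print(f"peak of linear stage: {peak_linear:+.2f} deg VA")
print(f"peak after nonlinearity: {peak_output:+.2f} deg VA")
```

The nonlinearity changes the amplitude and sharpness of the tuning function, but the disparity at which the response is maximal is the same before and after it is applied.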
Orientation dependency of binocular disparity
Because our eyes are displaced laterally, binocular parallax yields a larger range of binocular disparities along horizontal compared with vertical directions. Therefore the range of RF disparity is expected to be larger for cells tuned to vertical compared with horizontal orientations. Physiological evidence for this orientation anisotropy first was reported by Barlow et al. (1967). They measured for each cell the binocular disparity necessary to evoke maximum binocular facilitation, and found that the range of disparities was larger for horizontal (±3.3°) compared with vertical disparity (±1.1°).
However, subsequent studies failed to find such an orientation anisotropy (Ferster 1981; Joshua and Bishop 1970; LeVay and Voigt 1988; Nikara et al. 1968; von der Heydt et al. 1978). For example, Nikara et al. (1968) measured positions of left and right eye RFs using moving bars and edges and found that the RF position disparities ranged between ±1.2°, with a standard deviation of ∼0.6°, in both the horizontal and vertical directions. Similar results were also obtained by von der Heydt et al. (1978). Joshua and Bishop (1970) found RF position disparity to be dependent on eccentricity. For small eccentricities, they observed no difference between the ranges of RF position disparities along the horizontal and vertical directions, but at horizontal eccentricities beyond 8°, they found an orientation anisotropy. Therefore they concluded that the orientation anisotropy reported in Barlow et al. (1967) could be attributed to the large range of eccentricities from which data were sampled and errors in measurement of RFs in the periphery.
Although eccentricity might have been a factor that contributed to the difference between the results of Barlow et al. (1967) and those of the three studies just mentioned (Joshua and Bishop 1970; Nikara et al. 1968; von der Heydt et al. 1978), there is one other important factor that needs to be considered. Barlow et al. reached their conclusion based on the measurements of cells’ preferred binocular disparities, whereas others used the measurements of monocular RF locations, i.e., RF position disparities. A cell’s preferred binocular disparity is the sum of its RF phase and position disparities. If RF position disparities are relatively small compared with RF phase disparities and if RF phase disparity, but not RF position disparity, shows orientation anisotropy as found in the current study, then the previous reports are all consistent. Barlow et al. measured something very close to RF phase disparity, whereas others measured RF position disparity. In fact, Barlow et al. observed that the centers of the monocular RFs did not necessarily correspond to the positions of the stimuli that elicited the maximal binocular facilitation.
There are two more studies that are relevant to this issue. Ferster (1981) and LeVay and Voigt (1988) measured the preferred binocular disparities of cells, and their results are, therefore, comparable with those of Barlow et al. (1967). Yet they failed to find an orientation anisotropy. Because they used a reference-cell method, errors in their measurements are probably less than those in Barlow et al. However, the orientation anisotropy found in the current study is also not very strong, though it is statistically significant (Fig. 8 B). Therefore it is not surprising that they did not find an orientation anisotropy. On the other hand, RF phase disparity in phase angle does show a clear orientation anisotropy as presented in Fig. 8 A and also in previous studies (DeAngelis et al. 1991, 1995; Ohzawa et al. 1996), which indicates that a similarity (or dissimilarity) of left and right eye RFs depends on RF orientation.
Phase encoding and the stereo correspondence problem
Numerous psychophysical studies have shown that performance of human observers in binocular fusion and stereo tasks depends on stimulus spatial frequency, at least at low spatial frequencies (DeValois 1982; Felton et al. 1972; Kulikowski 1978; Legge and Gu 1989; Richards and Kaye 1974; Schor and Wood 1983; Schor et al. 1984a,b; Smallman and MacLeod 1994). In general, threshold disparity increases as spatial frequency decreases or as stimulus size increases. Hence this relationship is called a size-disparity correlation (Schor and Wood 1983).
The range of RF phase disparity also exhibits a size-disparity correlation (Fig. 9 B); the range decreases as spatial frequency increases. This means that cells tuned to high spatial frequencies can encode only small binocular disparities, whereas cells tuned to low spatial frequencies could encode relatively large binocular disparities as well. Marr and Poggio (1979) suggested that the stereo correspondence problem can be solved first at coarse scales (low spatial frequencies) to limit the range of disparities for which the match is sought and then at fine scales (high spatial frequencies) to find the match (see also Nishihara 1987; Quam 1987). It also has been suggested that binocular disparity should be limited to a quarter cycle (90 deg PA) for band-pass filters with a bandwidth comparable with that of cortical cells to unambiguously encode binocular disparity (Blake and Wilson 1991; Marr and Poggio 1979). As shown in Fig. 5 A, RF phase disparity satisfies this requirement.
These results imply that the stereo correspondence problem may be solved, at least partially if not completely, at the very earliest stage of cortical processing. This imposes important constraints as to what the possible algorithms for solving the correspondence problem should be based on. Sanger (1988) proposed a phase-based algorithm for solving the correspondence problem by computing something equivalent to interocular cross-correlation of a pair of band-limited images (see also Jenkin and Jepson 1988). In the following two papers (Anzai et al. 1999a,b), it is shown that outputs of simple and complex cells contain response components due to multiplicative binocular interaction, the key ingredient for computing interocular cross-correlation. Therefore these neurons may indeed form a neural basis for solving the correspondence problem.
RF disparities of complex cells
In this study, RF disparities of only simple cells are examined. How do complex cells encode binocular disparity? It has been shown that a significant fraction of complex cells are selective to binocular disparity (Ferster 1981; LeVay and Voigt 1988; Ohzawa and Freeman 1986b; Ohzawa et al. 1990, 1997; Pettigrew et al. 1968; Poggio et al. 1985; von der Heydt et al. 1978). However, it is quite difficult to study their RF mechanisms for encoding binocular disparity. This is simply because complex cells respond to both bright and dark stimuli at the same location of space, and therefore their RF profiles, the responses to bright stimuli minus the responses to dark stimuli, are relatively flat (see Fig. 1, C and F, in Anzai et al. 1999b for examples). Furthermore RF profiles obtained with either bright or dark stimuli alone are approximately Gaussian shaped (e.g., Dean and Tolhurst 1983; Heggelund 1981; Kulikowski et al. 1981; Movshon et al. 1978; Ohzawa et al. 1990; Palmer and Davis 1981; Schiller et al. 1976; see also Baker and Cynader 1986) and no sinusoidal component is present. Therefore it is not possible to estimate RF position and phase disparities of complex cells from their monocular RFs.
Numerous studies have provided evidence consistent with the idea, originally suggested by Hubel and Wiesel (1962), that complex cells are made up of subunits that resemble simple cells (Baker and Cynader 1986; Dean and Tolhurst 1983; Emerson et al. 1987, 1992; Gaska et al. 1994; Movshon et al. 1978; Ohzawa et al. 1990; Pollen and Ronner 1982; Szulborski and Palmer 1990). Therefore the RF properties of complex cells are thought to be inherited directly from the underlying subunits. Then one would expect that complex cells encode binocular disparity in the same manner as simple cells do, i.e., mainly through RF phase disparity.
Recently, Ohzawa et al. (1997) measured interocular two-bar interactions and estimated disparity tuning curves for complex cells. They found that the disparity tuning of complex cells can be either symmetric or asymmetric in shape. Because phase encoding, but not position encoding, predicts asymmetric disparity tuning, they concluded that complex cells encode binocular disparity through RF phase disparity of subunits.
In the third paper of this series (Anzai et al. 1999b), RF profiles of subunits are estimated from the binocular interaction RFs of complex cells, and RF position and phase disparities of subunits are obtained. It is shown that the RF position and phase disparities of complex cell subunits are mostly consistent with those of simple cells as reported in the current study. Therefore complex cells seem to encode binocular disparity mainly through RF phase disparity, just like simple cells.
We are grateful to Dr. Erich Sutter for advice on binary m-sequences and their applications to receptive field mapping. We also thank Drs. Bruce Cumming, Greg DeAngelis, Karen DeValois, Russel DeValois, Ed Erwin, Edwin Lewis, and Clifton Schor for discussions and helpful comments and suggestions.
This work was supported by research and CORE grants from the National Eye Institute (EY-01175 and EY-03176).
Address for reprint requests: R. D. Freeman, 360 Minor Hall, School of Optometry, University of California, Berkeley, CA 94720-2020.
The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
1 There are a number of variations in the reference-cell method. For instance, Ferster (1981) used a different reference cell for each measurement, whereas Hubel and Wiesel (1970) kept a single reference cell for the entire experiment. We have employed Ferster’s method because it is more practical and the data obtained are amenable to a simple statistical analysis described here.
2 The rotation of the RFs is necessary to correct cyclorotation of the eyes due to paralysis.
Copyright © 1999 The American Physiological Society
Gabor functions used to fit RF profiles
A Gabor function ($G$) is the product of a Gaussian envelope ($E$) and a sinusoid ($S$). The function used to fit one-dimensional (1D) RF profiles is illustrated in Fig. A1A. There are six free parameters for the 1D Gabor function: the center coordinate of the Gaussian ($X_o$), the width of the Gaussian ($W$), the frequency of the sinusoid ($f$), the phase of the sinusoid ($\phi$), the amplitude ($A_m$), and the amplitude offset ($A_o$). The amplitude offset is expected to be zero and in practice is always very close to zero, so the parameter is not strictly necessary. It is nonetheless included because real data always yield a finite value for it, no matter how small.
Figure A1B shows the Gabor function used to fit two-dimensional (2D) RF profiles. The 1D profiles in the figure are sections through the center of the Gabor function. There are 10 free parameters: the center coordinate of the Gaussian on the x axis ($X_o$), the center coordinate of the Gaussian on the y axis ($Y_o$), the rotation angle of the Gaussian ($\gamma$), the width of the Gaussian along the minor axis ($W_p$), the width of the Gaussian along the major axis ($W_q$), the frequency of the sinusoid ($f$), the phase of the sinusoid ($\phi$), the rotation angle of the sinusoid ($\theta$), the amplitude ($A_m$), and the amplitude offset ($A_o$). A 2D Gabor function can be formulated in a number of different ways. This particular form was chosen for empirical reasons. First, the parameters of the function are fairly uncorrelated, i.e., they are not redundant. Second, the parameters almost always converge, i.e., the fitting algorithm can find the best fit.
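As an illustration of the six-parameter 1D function, the following is a minimal sketch rather than the fitting function used in the study. The exact functional form appears only in Fig. A1, so the envelope expression, the cosine convention, and the choice to measure phase at the envelope center are assumptions, as are all names in the code.

```python
import math

def gabor_1d(x, x0, w, f, phase_deg, amp, offset=0.0):
    """1D Gabor: offset + amp * (Gaussian envelope) * (sinusoid).

    Assumed form, not taken from the source:
      envelope E(x) = exp(-((x - x0) / w) ** 2)
      sinusoid S(x) = cos(2*pi*f*(x - x0) + phase)
    x, x0, w are in deg of visual angle, f in cycles/deg,
    phase_deg in deg of phase angle.
    """
    envelope = math.exp(-((x - x0) / w) ** 2)
    sinusoid = math.cos(2.0 * math.pi * f * (x - x0) + math.radians(phase_deg))
    return offset + amp * envelope * sinusoid

if __name__ == "__main__":
    # Sample an even-symmetric (phase 0) and an odd-symmetric (phase 90) profile.
    for x in (-1.0, -0.5, 0.0, 0.5, 1.0):
        even = gabor_1d(x, 0.0, 1.0, 0.5, 0.0, 1.0)
        odd = gabor_1d(x, 0.0, 1.0, 0.5, 90.0, 1.0)
        print(f"x={x:+.1f}  even={even:+.3f}  odd={odd:+.3f}")
```

Under these assumptions, a phase of 0 deg gives a profile that is symmetric about the envelope center, and a phase of 90 deg gives one that is antisymmetric about it when the offset is zero.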
Definition of RF disparities
RF PHASE DISPARITY.
RF phase disparity $d_P$ is defined as follows (Equations A1 and A2), where $\phi_L$ and $\phi_R$ denote the spatial phases, in deg PA, of the left and right eye RFs, respectively, and $f_L$ and $f_R$ denote the spatial frequencies, in cycles per deg VA, of the left and right eye RFs, respectively. Phase disparity in deg PA indicates the similarity or dissimilarity between the spatial profiles of the left and right eye RFs, provided that $f_L$ and $f_R$ are comparable. Phase disparity in deg VA, on the other hand, indicates a spatial offset between the sinusoidal components of the left and right eye RFs. The negative sign leading the parentheses in Eq. A2 is necessary so that phase disparity in deg VA and position disparity have the same sign for disparities in the same direction.
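The two forms of phase disparity can be sketched as follows. Equations A1 and A2 are not reproduced in this copy, so this is a reconstruction from the description above and not the authors' exact expressions: the left-minus-right order, the wrapping of the phase difference to [-180, 180), and the function names are all assumptions. Only the leading negative sign in the deg VA form is stated in the text.

```python
def phase_disparity_deg_pa(phi_left, phi_right):
    """Phase disparity in deg of phase angle (PA).

    Assumed: left minus right, wrapped to [-180, 180).
    """
    d = phi_left - phi_right
    return (d + 180.0) % 360.0 - 180.0

def phase_disparity_deg_va(phi_left, f_left, phi_right, f_right):
    """Phase disparity in deg of visual angle (VA).

    Each phase (deg PA) is converted to a spatial offset by dividing by
    360 * f (f in cycles/deg VA). The leading minus sign follows the
    text; the left-minus-right order inside it is an assumption.
    """
    return -(phi_left / (360.0 * f_left) - phi_right / (360.0 * f_right))

if __name__ == "__main__":
    # A quarter-cycle phase difference at 0.5 cycles/deg.
    print(phase_disparity_deg_pa(90.0, 0.0))
    print(phase_disparity_deg_va(90.0, 0.5, 0.0, 0.5))
```

Because a phase difference cannot exceed half a cycle in magnitude, the corresponding spatial offset is bounded by 1/(2f) deg VA when the two eyes share a spatial frequency f. This is why cells tuned to high spatial frequencies can carry only small phase disparities.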
RF POSITION DISPARITY.
Let $X_o$ and $Y_o$ be the center coordinates of the RF mapped on the visual field, and $\theta$, the RF orientation. Parenthesized superscripts, A and B, will be used on these parameters to indicate that they belong to cell A (a reference cell) and cell B, respectively. The RF coordinates of cell B can be transformed into a new coordinate system so that the center coordinates of cell A become the origin and the RF orientation of cell A matches the ordinate. The new coordinates of cell B’s RF are given by Equations A3 and A4, and the RF orientation of cell B with respect to the ordinate is given by Equation A5. This transformation, when performed on RFs in the left and right eyes separately, is equivalent to shifting and rotating the RF maps so that the left and right eye RFs of cell A are at the same location and orientation, i.e., there is no RF position disparity for cell A (see Fig. 3A). Then the RF position disparities of cell B can be measured as in Equations A6 and A7, where $d_X$ and $d_Y$ denote position disparities along the directions perpendicular to and parallel to the orientation of the right eye RF, respectively, and the subscripts L and R indicate that the parameters belong to the left and right eye RFs, respectively.
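The shift-and-rotate procedure can be sketched as below. Equations A3 to A7 are not reproduced in this copy, so the code is one plausible reading under stated assumptions: orientation is taken as the angle of the RF's long axis measured counterclockwise from the abscissa, disparity is taken as left minus right, and all function and argument names are hypothetical.

```python
import math

def to_reference_frame(xb, yb, theta_b, xa, ya, theta_a):
    """Express cell B's RF in a frame centered on cell A's RF and rotated
    so that cell A's orientation lies along the ordinate.

    Angles are in deg, measured counterclockwise from the abscissa
    (an assumed convention). Returns (x', y', orientation of B relative
    to the ordinate).
    """
    alpha = math.radians(90.0 - theta_a)   # rotation that puts A on the ordinate
    dx, dy = xb - xa, yb - ya              # shift A's center to the origin
    x_new = dx * math.cos(alpha) - dy * math.sin(alpha)
    y_new = dx * math.sin(alpha) + dy * math.cos(alpha)
    return x_new, y_new, theta_b - theta_a

def position_disparity(left_a, left_b, right_a, right_b):
    """Relative position disparity of cell B, with cell A as the reference.

    Each argument is (x, y, theta_deg) for one cell in one eye. The
    left-minus-right offset is projected onto the axes perpendicular
    and parallel to cell B's right eye RF orientation.
    """
    xl, yl, _ = to_reference_frame(*left_b, *left_a)
    xr, yr, ori_r = to_reference_frame(*right_b, *right_a)
    ddx, ddy = xl - xr, yl - yr
    psi = math.radians(ori_r)              # right eye orientation relative to ordinate
    d_perp = ddx * math.cos(psi) + ddy * math.sin(psi)
    d_par = -ddx * math.sin(psi) + ddy * math.cos(psi)
    return d_perp, d_par

if __name__ == "__main__":
    # Cell B sits 0.5 deg from cell A in the left eye and 0.2 deg in the right eye.
    print(position_disparity((1.0, 2.0, 90.0), (1.5, 2.0, 90.0),
                             (0.0, 0.0, 90.0), (0.2, 0.0, 90.0)))
```

Applying the transform to each eye separately removes any global shift or rotation between the two RF maps. A pure cyclorotation of one eye's map therefore yields zero relative disparity in this sketch.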
Relationship between distributions of true and relative position disparities
Suppose that the population distribution of true position disparity (see Fig. 3B, top) has a mean $\mu_t$ and a variance $\sigma_t^2$ (Equations A8 and A9), where $d$ is a sample taken from the distribution and $E[\,]$ denotes the expected value of the quantity inside the brackets. Then the mean ($\mu_r$) of the population distribution for relative position disparity (see Fig. 3B, bottom) is given by Equation A10, where $d_A$ and $d_B$ are the true position disparities of cell A (a reference cell) and cell B, respectively. The variance ($\sigma_r^2$) of the distribution is given by Equation A11, where $r_{AB}$ is the coefficient of correlation between $d_A$ and $d_B$ (Equation A12). Substituting Eq. A9 into Eq. A11 yields Equation A13. Assuming that true position disparities are distributed around zero, i.e., $\mu_t = 0$, gives Equation A14. Therefore the relationship between the standard deviations for the true and relative position disparities is given by Equation A15. This equation indicates that when there is no correlation between the true position disparities of cell A and cell B, i.e., $r_{AB} = 0$, the standard deviation for the true position disparity is expected to be smaller than that for the relative position disparity by a factor of $\sqrt{2}$.
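The numbered equations are not reproduced in this copy. The following derivation is a reconstruction consistent with the definitions in the text, not necessarily the authors' exact intermediate steps. It assumes that relative disparity is $d_B - d_A$ and that $d_A$ and $d_B$ are drawn from the same distribution.

```latex
\begin{align*}
\mu_t &= E[d], \qquad \sigma_t^2 = E\big[(d - \mu_t)^2\big] \\
\mu_r &= E[d_B - d_A] = \mu_t - \mu_t = 0 \\
\sigma_r^2 &= E\big[(d_B - d_A - \mu_r)^2\big] \\
           &= E\big[(d_B - \mu_t)^2\big] + E\big[(d_A - \mu_t)^2\big]
              - 2\,E\big[(d_A - \mu_t)(d_B - \mu_t)\big] \\
r_{AB} &= \frac{E\big[(d_A - \mu_t)(d_B - \mu_t)\big]}{\sigma_t^2} \\
\sigma_r^2 &= 2\,\sigma_t^2\,(1 - r_{AB}) \\
\sigma_t &= \frac{\sigma_r}{\sqrt{2\,(1 - r_{AB})}}
\end{align*}
```

With $r_{AB} = 0$ the last line reduces to $\sigma_t = \sigma_r / \sqrt{2}$. Written this way, the result does not depend on the value of $\mu_t$, because the mean cancels in the difference $d_B - d_A$.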