The responses of single cortical neurons were measured as a function of the binocular disparity of dynamic random dot stereograms for a large sample of neurons (n = 787) from V1 of the awake macaque. From this sample, we selected 180 neurons whose tuning curves were strongly tuned for disparity, well sampled and well described by one-dimensional Gabor functions. The fitted parameters of the Gabor functions were used to resolve three outstanding issues in binocular stereopsis. First, we considered whether tuning curves can be meaningfully divided into discrete tuning types. Careful examination of the distributions of the Gabor parameters that determine tuning shape revealed no evidence for clustering. We conclude that a continuum of tuning types is present. Second, we investigated the mechanism of disparity encoding for V1 neurons. The shape of the disparity tuning function can be used to distinguish between position-encoding (in which disparity is encoded by an interocular shift in receptive field position) and phase-encoding (in which disparity is encoded by a difference in the receptive field profile in the two eyes). Both position and phase encoding were found to be common. This was confirmed by an independent assessment of disparity encoding based on the measurement of disparity sensitivity for sinusoidal luminance gratings of different spatial frequencies. The contributions of phase and position to disparity encoding were compared by estimating a population average of the rate of change in firing rate per degree of disparity. When this was calculated separately for the phase and position contributions, they were found to be closely similar. Third, we investigated the range of disparity tuning in V1 as a function of eccentricity in the parafoveal range. We find few cells which are selective for disparities greater than ±1° even at the largest eccentricity of ∼5°. The preferred disparity was correlated with the spatial scale of the tuning curve, and for most units lay within a ±π radians phase limit. Such a size-disparity correlation is potentially useful for the solution of the correspondence problem.
Prince et al. (2002) measured the disparity selectivity of V1 neurons with dynamic random dot stereogram (RDS) stimuli and used Gabor functions to describe their tuning profiles. This paper aims to resolve three issues about the mechanism and range of encoding of horizontal disparity in V1. First, we examine whether distinct types of tuning profile are present, as described by Poggio (1995), or whether the shapes of these profiles form a continuum (Freeman and Ohzawa 1990). Second, we examine whether phase disparities or position disparities are used to encode nonzero disparities. Third, we examine the range of disparities encoded by the population of disparity-selective V1 neurons in a way that permits comparison with psychophysical data. We also examine the relationship between the spatial scale of the receptive field (RF) and the range of disparities it encodes.
Studies of disparity encoding in macaque V1 neurons (Poggio 1995; Poggio and Talbot 1981; Poggio et al. 1988) established a widely used classification scheme. Tuned zero (T0) cells responded strongly to zero disparity but their firing was suppressed at disparities away from zero. Tuned inhibitory (TI) cells showed the opposite pattern of response, so their firing was suppressed at zero disparity. Tuned near (TN) and tuned far (TF) cells showed sharp excitation at a given near (crossed) or far (uncrossed) disparity, respectively. Near (NE) cells responded well to a wide range of near disparities and less to far disparities, while far (FA) cells showed the opposite pattern of response.
Although these descriptions have been widely adopted (e.g., Burkhalter and van Essen 1986; Hubel and Livingstone 1987; Maunsell and van Essen 1983; Poggio and Talbot 1981; Poggio et al. 1985), there has been no quantitative demonstration that these proposed descriptions identify truly distinct classes of neurons. If a Gabor function is used to describe the disparity tuning profile, then each proposed class of neuron corresponds to a characteristic value of phase in the Gabor function, as first pointed out by Freeman and collaborators (DeAngelis et al. 1991; Freeman and Ohzawa 1990; Ohzawa et al. 1996), who concluded that these responses are better viewed as a continuum (see also LeVay and Voigt 1988). Those studies were conducted in anesthetized cats, whereas the classification of Poggio describes data from the awake monkey. This paper investigates whether data from Prince et al. (2002) indicate the existence of discrete tuning types or a continuum of tuning shapes.
A second issue concerns the nature of the mechanisms for encoding nonzero disparities. One approach is through a difference in position between the left and right eyes' RFs (Barlow et al. 1967; Nikara et al. 1968), which is termed “position disparity encoding.” More recently Ohzawa et al. (1990) and DeAngelis et al. (1991) have argued for an alternative “phase disparity encoding” scheme, in which the left and right eyes' RFs have the same mean position but different profiles. If the monocular RFs are described by two-dimensional Gabor functions, selectivity for nonzero disparities can be achieved by changing the relative phase of the Gabor functions between the two eyes' RFs (Ohzawa et al. 1990).
The shape of the disparity tuning profile can distinguish whether the underlying encoding is due to a position or phase mechanism. With the binocular energy model (Ohzawa et al. 1990), the position disparity component specifies the center of the range of disparities over which any change in firing rate occurs (either increases or decreases). The symmetry (even or odd) of the tuning curve around this center position specifies the phase component (see Prince et al. 2002 for details). Ohzawa et al. (1997) and Anzai et al. (1999c) used this principle to estimate phase shifts in disparity-selective complex cells of the cat. Because we measured the position of both eyes, we were able to exploit the same principle to estimate phase and position shifts in both simple and complex cells.
An encoding scheme based purely on phase disparities predicts that neurons with a peak response to large nonzero disparities should also respond over a wide range of disparities because both these factors are controlled by the spatial scale of the underlying monocular RFs. In principle, position disparities can be larger than the spatial period of the RF, allowing them to encode larger disparities. Regardless of the encoding scheme, there may be benefits in limiting disparity encoding according to spatial scale. A number of stereo correspondence algorithms make use of such a “size-disparity” correlation to limit the number of false matches (e.g., Marr and Poggio 1979). Some psychophysical evidence in humans suggests that this constraint is present (Smallman and MacLeod 1994). We examined whether the same constraint is manifest at the level of single neurons in primate V1. Finally, we compare the range of disparities encoded by the population of V1 neurons with the range of disparities that support psychophysical judgments of depth.
The methods were described in detail in the preceding paper (Prince et al. 2002) and are briefly summarized here. We measured disparity selectivity to dynamic RDS in 787 isolated single units in V1 of two awake monkeys. This paper analyses the responses of a subset of those neurons selected by four criteria. First, they had to be strongly disparity tuned (n = 338), defined by a value of F index > 0.8, where
$$F_{\mathrm{index}} = \frac{MS_{\mathrm{between}}}{MS_{\mathrm{between}} + MS_{\mathrm{within}}} \tag{1}$$
and MSwithin and MSbetween are the familiar terms from a one-way ANOVA. MSwithin represents the mean variability of firing to any constant disparity and MSbetween represents the variance in firing rate that is elicited by changes in disparity. One way to think of this index is that it rescales the ANOVA F ratio to values between 0 and 1. Hence, our criterion F index of 0.8 is equivalent to requiring that the between-means variance is four times greater than the within-means variance. Every neuron with F index > 0.8 showed a significant (5%) effect of disparity on firing rate by both a one-way ANOVA and a Kruskal-Wallis test. All of these statistical tests were performed on the square root of the firing rate.
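The selection index described above can be illustrated with a minimal sketch. It assumes the index has the form MSbetween / (MSbetween + MSwithin), which is the form implied by the statement that a criterion of 0.8 corresponds to a fourfold variance ratio. The function name `f_index` and the input layout (one array of per-trial firing rates per disparity) are hypothetical choices for illustration, not the authors' code.

```python
import numpy as np

def f_index(rates_by_disparity):
    """Rescaled ANOVA F ratio in [0, 1] (assumed form: MSb / (MSb + MSw)).

    rates_by_disparity: sequence of 1-D arrays, one per tested disparity,
    each holding the firing rate on every trial at that disparity.
    """
    # The text states that all tests used the square root of the firing rate.
    groups = [np.sqrt(np.asarray(g, dtype=float)) for g in rates_by_disparity]
    k = len(groups)                       # number of disparities
    n_total = sum(len(g) for g in groups)  # total number of trials
    grand_mean = np.concatenate(groups).mean()

    # One-way ANOVA sums of squares.
    ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
    ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / (ms_between + ms_within)
```

With this form, an F ratio of 4 maps to an index of 4 / (4 + 1) = 0.8, matching the criterion quoted in the text.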
Second, we required that the neuron had been tested with at least seven different disparities (n = 253). A one-dimensional Gabor function was then fit to the disparity tuning data
$$R(d) = R_{\mathrm{mean}} + A \exp\!\left(-\frac{(d - d_0)^2}{2\sigma^2}\right) \cos\bigl(2\pi f (d - d_0) + \phi\bigr) \tag{2}$$
where d is the horizontal disparity of the stimulus, Rmean is the mean height of the curve, A is the amplitude, d0 is the mean position of the curve in disparity (which we shall call the disparity offset), and σ is the width of the function. The phase, φ, is measured relative to the center of the Gaussian envelope. The frequency term, f, was not a free parameter in the fit but was fixed at the “disparity frequency” as described in the accompanying paper (Prince et al. 2002). This is defined as the frequency at the peak of a continuous Fourier transform of the disparity tuning curve (after removing the overall mean firing). As demonstrated in the accompanying paper, the energy model predicts that the phase of the fitted Gabor (φ) is equal to the underlying phase disparity of the binocular RF, and the disparity offset (d0) is equal to the position disparity.
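As a rough illustration of the tuning model and of the "disparity frequency", the sketch below evaluates a one-dimensional Gabor and locates the peak of a numerically approximated continuous Fourier transform. The Gabor form is an assumption: it is the standard Gaussian-times-cosine written with the parameters the text defines, with phase measured relative to the envelope center. The frequency grid, the function names, and the use of `np.gradient` for integration weights are illustrative choices, not the procedure of the accompanying paper.

```python
import numpy as np

def gabor(d, r_mean, amp, d0, sigma, f, phi):
    """Assumed 1-D Gabor: Gaussian envelope centered on d0 times a cosine
    whose phase phi is measured relative to the envelope center."""
    d = np.asarray(d, dtype=float)
    envelope = np.exp(-((d - d0) ** 2) / (2.0 * sigma ** 2))
    return r_mean + amp * envelope * np.cos(2.0 * np.pi * f * (d - d0) + phi)

def disparity_frequency(disparities, rates, freqs=None):
    """Frequency (cycles/deg) at the peak of an approximate continuous
    Fourier transform of the mean-subtracted tuning curve."""
    d = np.asarray(disparities, dtype=float)
    r = np.asarray(rates, dtype=float)
    r = r - r.mean()                      # remove the overall mean firing
    if freqs is None:
        # Hypothetical search grid; should stay below the sampling limit.
        freqs = np.linspace(0.05, 8.0, 800)
    weights = np.gradient(d)              # local sample spacing
    spectrum = np.array([
        np.abs(np.sum(weights * r * np.exp(-2j * np.pi * fq * d)))
        for fq in freqs
    ])
    return float(freqs[np.argmax(spectrum)])
```

In a full analysis the returned frequency would be held fixed while the remaining Gabor parameters are fit by nonlinear least squares.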
After fitting, a third selection criterion was applied. Thirteen neurons were rejected because the fit accounted for <75% of the disparity related variance in firing rate, and 48 were rejected because the range of disparities sampled covered a range of less than two SDs of the fitted Gabor. In these cases, the range of disparities chosen during data acquisition did not adequately constrain the fit. Fourth, we visually inspected the data-set and removed a further 12 cells on the basis that the Gabor did not accurately describe the variation in the tuning profile. There was no consistent pattern to these neurons' tuning functions and in general there was no hint that an alternative functional form might be consistently superior to the Gabor model. After this selection process, 180 disparity tuning profiles remained (93 from monkey Rb, 87 from monkey Hg), each of which was well tuned for horizontal disparity and well described by a Gabor function. Note that of the 253 neurons that were strongly tuned for disparity, only 25 (10%) were excluded because the Gabor yielded a poor description.
Both animals maintained conjugate fixation much more precisely than the required fixation window (see Prince et al. 2002 for details). For each disparity tuning curve, we measured the SD of the trial mean conjugate eye position. The mean value of this SD was 0.059° for horizontal position and 0.056° for vertical position. Thus it is extremely unlikely that variation in conjugate eye position compromised the tuning curves.
The mean of these SDs for vergence was 0.039° for monkey Rb and 0.137° for monkey Hg. A number of human studies have indicated that vergence is maintained considerably more accurately than this (Collewijn et al. 1988; Enright 1991; McKee and Levi 1987; Ogle 1964; Riggs and Neill 1960; St. Cyr and Fender 1969), so it is likely that our measures of vergence errors are limited by instrumental artifacts. Even so, these figures indicate that the effect of vergence fluctuations on the great majority of tuning curves was negligible. Furthermore there was no correlation between the disparity frequency (see Prince et al. 2002) and the SD of vergence (r = 0.017, P > 0.8). Similarly there was no correlation between measured fixation disparity and the estimate of position disparity across tuning curves. Thus there is no evidence to indicate that eye movements compromised the conclusions in the following text.
Classification of disparity profiles
The population distribution of disparity tuning profiles in V1 was examined to test whether distinct classes of disparity tuning (T0, TI, etc.) could be found. The shape of each disparity tuning curve was described by a Gabor function (Prince et al. 2002), which has the advantage that several of the individual parameters in Eq. 2 have clearly defined interpretations in terms of Poggio's classification. The most relevant parameter is the phase of the fitted Gabor, as illustrated by the example tuning curves in Fig. 1.
Freeman and Ohzawa (1990) pointed out that phases near 0 correspond to T0 cells (with a symmetrical peak near 0 disparity, Hg089 in Fig. 1) and phases near π (with a trough near 0, like Rb537) correspond to TI neurons. NE and FA cells are described by Poggio et al. (1988) as having broad asymmetrical tuning curves with excitation for near (or far) disparities and inhibition at far (or near) disparities. This description implies odd-symmetry, requiring phase shifts in the region of ±π/2 (Rb793 and Rb590). The other important parameter is the disparity offset: phases near zero combined with small offsets would be classified as TE.
The remaining parameters in Eq. 2 have no influence on the classification. Therefore the phase (φ) and disparity offset (d0) should suffice to identify the distinct classes, which should be apparent as a clustering in a scatter plot of these two parameters. Figure 1 shows such a scatter plot in which the points form a continuum. Importantly, our failure to identify distinct groupings does not arise from our inability to find prototypical examples of the tuning curves described by Poggio et al. (1988). The examples of tuning curves along the margins of Fig. 1 are readily identified as classic examples of the prototypes and they occupy their predicted locations in this space. The difficulty is that many clear cases of intermediate types also exist.
We examine the distribution of fitted phases more closely in Fig. 2, which shows the smoothed frequency density function for fitted phases. This is compared with the distributions reported in cat (combined data from Anzai et al. 1999a,c; DeAngelis et al. 1991) and barn owl (Nieder and Wagner 2000). The three distributions are similar with a predominance of cells exhibiting phases near 0 and a paucity of neurons with phases near π. The fitted phase φ can be used to assign each neuron to the categories defined by Poggio: for TE neurons, −π/4 < φ < π/4; for TI neurons, −π < φ < −3π/4 or 3π/4 < φ < π; and for NE/FA neurons, π/4 < φ < 3π/4 or −3π/4 < φ < −π/4. Applying this simple criterion to the 180 neurons considered here, 69 (38%) were TE neurons, 29 (16%) were TI neurons, 38 (21%) were NE neurons, and 44 (25%) were FA neurons. These proportions are similar to those reported by Poggio and Fischer (1977) and Poggio et al. (1988).
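The phase criterion above can be written as a short function. This is a sketch only: the text does not state which sign of phase corresponds to NE and which to FA, so the two are returned as a combined label, and phases falling exactly on a boundary (which the strict inequalities leave undefined) are assigned to NE/FA by assumption. The function name is hypothetical.

```python
import math

def classify_by_phase(phi):
    """Assign a Poggio-style label from the fitted Gabor phase (radians)."""
    # Wrap the phase into the interval (-pi, pi].
    phi = math.atan2(math.sin(phi), math.cos(phi))
    magnitude = abs(phi)
    if magnitude < math.pi / 4:
        return "TE"       # even-symmetric, peak at the envelope center
    if magnitude > 3 * math.pi / 4:
        return "TI"       # even-symmetric, trough at the envelope center
    # Odd-symmetric range; NE versus FA depends on a sign convention
    # that is not specified in the text, so it is left unresolved here.
    return "NE/FA"
```

Because the labels come from thresholds on a continuous variable, they partition the phase axis without implying that the underlying distribution is clustered.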
Although the TE/TI/NE/FA classification has a clear interpretation in terms of phase and position disparities, it is possible that other patterns of clustering could be present. Figure 3 therefore examines the relationship of position, phase, disparity frequency, and the SD of the Gaussian envelope. Once again, there is no clustering into distinct groups. Note that 98 of the 180 neurons analyzed here were not significantly better described by a Gabor than by a Gaussian (sequential F test at the 5% level). In these cases, the Gabor function is still a suitable description of the data even though it produces little improvement over the Gaussian fit. The Gabor fit correctly assigns even-symmetric phases to these curves, as expected. The fitted curves may nonetheless require substantial disparity offsets (like the example in Fig. 8 A of the accompanying paper). For these types of neuron, position disparity is the only effective encoding mechanism.
The classification of NE/FA neurons is potentially problematic. In addition to odd-symmetry, Poggio (1995) describes their responses as “extended rather than tuned.” In other words, there is a broad range of disparities over which their response changes little. Many examples (e.g., Fig. 8 in Poggio et al. 1988) show broad plateaus in firing rate that extend to the largest disparities tested. We found no examples of this phenomenon in our dataset: when sufficiently large disparities were used, the response rate invariably approached the baseline level at both crossed and uncrossed disparities. Indeed, this seems inevitable with dynamic RDS because the stimulus within a finite RF is uncorrelated when the disparity is large. We have also been unable to find a single published example of a NE/FA cell from V1 characterized with RDS that shows extended plateaus. It seems likely that the discrepancy is attributable to the stimuli used, since the earlier demonstrations of NE and FA tuning types were performed with bars. Here it is possible that at the largest disparities, the stimulus continues to cross the RF in at least one eye. It therefore seems appropriate to equate odd-symmetric disparity tuning in response to RDS to the NE/FA classification (as argued by DeAngelis et al. 1995; Nomura et al. 1990).
It is useful to consider monocular responses in more detail. Poggio and Fischer (1977) and LeVay and Voigt (1988) both suggested that NE/FA cells tended to be dominated by one eye, whereas TE cells generally had balanced ocularity. Figure 4 A plots the monocularity index (see Prince et al. 2002) as a function of the symmetry of the Gabor function, which is quantified as the modulus of the fitted phase parameter. A monocularity index of one means that the cell is entirely monocular, whereas a value of zero indicates a cell with balanced ocularity. There is no reliable relationship between the symmetry of the curve and the ocularity of the cell.
Another relationship, noted by Poggio and Talbot (1981), was that TE neurons were commonly associated with strong binocular facilitation at the preferred disparity and weak monocular responses. Figure 4 B provides quantitative evidence for such a phenomenon: the majority of neurons with fitted phases near 0 (TE neurons, □) show a maximum binocular response that is substantially greater than either monocular response. The majority of neurons with fitted phases near π (TI neurons, ▴) show maximum binocular responses similar to their largest monocular response. Although there is a clear correlation between phase and monocular responsiveness, again there appears to be a continuum of response types. This relationship between phase and monocular responses may ultimately reflect the way in which a limited dynamic range is exploited to encode disparity. TI neurons respond to certain disparities primarily by suppressing their firing relative to their response to an uncorrelated stimulus. Hence a substantial response to monocular or uncorrelated dots is necessary to allow reductions in firing rate to convey any useful information.
Phase and position encoding
Ohzawa et al. (1997) and Anzai et al. (1999c) have shown that the precise shape of the disparity tuning profile of complex cells can be used to determine whether the underlying encoding was phase or position based, without requiring measurement of the monocular RF shape. In the accompanying paper (Prince et al. 2002), we present simulations of model complex cells that illustrate this point and are summarized in Fig. 5. Figure 1 shows our estimates of both phase and position disparities. We now develop a statistical approach to testing for their presence.
Initially, Gabor functions were fitted in which the phase and position parameters were both allowed to vary. To test for the existence of a significant nonzero position disparity component, the disparity offset parameter was fixed at zero. The phase, amplitude, and mean firing rate parameters were then refit. The variance that could be explained by this zero-position fit was compared with the variance explained by the original Gabor fit. A sequential F test (see Draper and Smith 1998, p. 159–160) was used to determine whether the position parameter contributed significantly to the model. Equivalent tests evaluated the contribution of the phase parameter and the contribution of both parameters together. Although these fits are nonlinear in their parameters, Monte Carlo simulations on test data indicated that using this test with a 5% criterion for the F test did indeed produce a type I error rate of ∼5%.
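The nested-model comparison described above can be sketched with the textbook extra-sum-of-squares F statistic. The function name, argument names, and the numbers in the usage note are hypothetical; the parameter counts actually used for each reduced fit are not stated here, so they are left as inputs.

```python
def sequential_f(sse_reduced, sse_full, n_params_reduced, n_params_full, n_points):
    """Extra-sum-of-squares F statistic for two nested least-squares fits.

    sse_reduced: residual sum of squares with the tested parameter(s) fixed
    sse_full:    residual sum of squares with all parameters free
    Returns (F, numerator df, denominator df).
    """
    df_num = n_params_full - n_params_reduced   # parameters being tested
    df_den = n_points - n_params_full           # residual df of the full fit
    f_stat = ((sse_reduced - sse_full) / df_num) / (sse_full / df_den)
    return f_stat, df_num, df_den
```

For example, `sequential_f(30.0, 10.0, 4, 5, 15)` gives F = 20 on (1, 10) degrees of freedom. A P value can then be read from the F distribution, for instance with `scipy.stats.f.sf(f_stat, df_num, df_den)` if SciPy is available. For nonlinear fits the F distribution holds only approximately, which is why the text checks the type I error rate by Monte Carlo simulation.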
These tests allowed us to classify cells as requiring nonzero phase disparity only, nonzero position disparity only, both, or neither. A small number of cells required either a nonzero position component or a nonzero phase component but not both. Figure 6 shows two examples, one that required a nonzero position disparity (left) and the other that required both a nonzero phase and a nonzero position disparity (right). In each case, the top row shows the original Gabor fit where all parameters are free to vary. The middle row shows the fit when the disparity offset of the Gabor curve is restricted to be at zero disparity (i.e., only phase disparities are used to fit the data). Neither disparity tuning curve is well described under these conditions. This suggests that a nonzero position disparity is necessary to describe both of these tuning curves. The bottom row shows fits with a free position term but a phase component fixed at zero (i.e., only position disparities are used to fit the data, and the fits are constrained to be even symmetric). The cell on the left is well described by this fit, and we conclude that a phase-disparity component of zero is sufficient. However, the cell in the right-hand column also requires a nonzero phase component to be present.
Figure 7 shows examples of the other three possible categories. From top to bottom, these three cells were classified as requiring “phase disparity,” “either phase or position disparity,” and “neither phase nor position disparity,” respectively. Across the population of 180 cells, 45 (25%) cells required nonzero phase disparity only, 26 (14%) cells required nonzero position disparity only and 78 (43%) cells required both components to describe their disparity tuning curves. Either a nonzero phase or a nonzero position component (but not both) was required to explain a further 20 (11%) curves. Neither nonzero phase nor nonzero position disparity was required to explain the remaining 11 (6%) curves.
As a whole, it appears that both phase and position encoding of horizontal disparity are frequently found in macaque V1, but there are circumstances under which this conclusion may be invalid. Figure 8 shows that if a neuron has an oblique orientation preference and a large vertical position difference between the eyes, then the responses to horizontal disparity measured with zero vertical disparity can be odd-symmetric. Thus our analysis depends on an assumption that for obliquely oriented RFs, any vertical position disparity is small relative to the spatial period of the disparity tuning. To evaluate this, we re-examined the disparity tuning curves for cells that prefer nearly vertical orientations (in which this confound cannot be present). The distribution of phase and position shifts in this group was similar to those in the population as a whole. It therefore seems unlikely that vertical position shifts have substantially biased our estimates of phase shift.
As a further check, we employed another, quite different, method for estimating phase and position disparity components in cortical cells, suggested by Fleet et al. (1996) and Wagner and Frost (1993). Disparity-tuning profiles were measured for drifting sinusoidal gratings and the results were compared for different spatial frequencies. If a pure position component were present, all the tuning profiles should peak at the same disparity when expressed in terms of position. If a pure phase component were present, all the tuning profiles should peak at the same disparity when expressed in terms of phase. These predictions hold even if a vertical position shift is present. Example data are shown in Fig. 9. The first panel shows the disparity tuning function for RDS. The fitted Gabor function is nearly symmetric, and has a small position shift toward uncrossed disparities. This suggests that the disparity encoding consists of a small position shift toward uncrossed disparities, but no interocular phase difference. Figure 9, B and C, shows the disparity tuning profiles for drifting sinusoidal grating patches at two different spatial frequencies, expressed in units of position and phase disparity respectively. The fitted sinusoids align well when plotted in terms of position disparity, but not in terms of phase disparity. This again suggests that a small uncrossed position disparity is present, and the contribution of phase disparity is minimal. Quantitative measures of RF location in each eye, varying the location of small grating patches, also showed a small horizontal position shift (Fig. 9 D).
The contributions of phase and position components to disparity tuning were estimated in this way for 15 units. The smallest disparity for which the phase of the fitted sinusoidal function was identical at both frequencies was taken as an estimate of the position shift. The phase of the sinusoid at this point was taken as an estimate of the phase disparity component. In agreement with the results derived from RDS tuning, this method suggested a continuous distribution of phase shifts. Estimates of phase disparity from disparate gratings and random dot stereograms were significantly correlated (T monotone association, P ≤ 0.005) (see Fisher 1993, p. 148). Although the estimates of position disparity were not correlated (Spearman's rank correlation, rs = −0.02), they were all estimated to be small (14/15 ≤ 0.12°) by both methods in this sample of neurons. It therefore seems that our estimates of phase and position shifts derived from the tuning to random dot patterns are reliable.
Overall, this analysis confirms the major conclusion that we wish to draw from Fig. 1 and Fig. 10: both phase and position disparities are used in constructing disparity-selective responses in primate V1. The relative contribution of each mechanism to the range of disparities coded is considered in the next section.
Range of disparity encoding in V1
In considering the range of disparities encoded by a population of neurons, it is natural first to consider the distribution of “preferred” disparities. Unfortunately, the preferred disparity of a cell with a Gabor-shaped tuning curve does not have a straightforward interpretation. One possibility is to use the disparity at which the cell fires the most spikes. However, TI type cells, which have a primarily inhibitory response to binocular correlation, fail to exhibit a distinct preferred disparity using this criterion (see Fig. 4 C). A related measure is plotted in Fig. 11 A. We define the “maximum interaction position” as the disparity that produces the greatest deviation from the response to uncorrelated stimuli. For TI cells, this is the location of the trough.
This “maximum interaction” criterion works well for tuned cells but may not characterize odd-symmetric tuning curves well. The maximum interaction position of such tuning curves may be finely balanced between large near and far disparities. As an alternative, we define the “maximum slope position” as the position on the curve where the square root of the response changes most rapidly. At this position, disparity discrimination performance is greatest (see Britten et al. 1992; Prince et al. 2000). The square root operation eliminates the relationship between mean firing rate and variance (see the appendix of the accompanying paper, Prince et al. 2002). The relationship between these measures of preferred disparity and eccentricity is shown in Fig. 11.
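The two measures of preferred disparity defined above can be sketched numerically. The code below evaluates them on a sampled tuning curve; the Gabor form, the function names, and the use of a finite-difference gradient are assumptions made for illustration, and the response to uncorrelated dots is passed in as a separate measured value rather than derived from the fit.

```python
import numpy as np

def gabor(d, r_mean, amp, d0, sigma, f, phi):
    """Assumed 1-D Gabor tuning curve (phase relative to envelope center)."""
    d = np.asarray(d, dtype=float)
    envelope = np.exp(-((d - d0) ** 2) / (2.0 * sigma ** 2))
    return r_mean + amp * envelope * np.cos(2.0 * np.pi * f * (d - d0) + phi)

def max_interaction_position(d, rates, uncorrelated_rate):
    """Disparity giving the largest deviation (up or down) from the
    response to binocularly uncorrelated stimuli."""
    d = np.asarray(d, dtype=float)
    rates = np.asarray(rates, dtype=float)
    return float(d[np.argmax(np.abs(rates - uncorrelated_rate))])

def max_slope_position(d, rates):
    """Disparity at which the square root of the response changes fastest."""
    d = np.asarray(d, dtype=float)
    root = np.sqrt(np.clip(np.asarray(rates, dtype=float), 0.0, None))
    slope = np.gradient(root, d)          # finite-difference derivative
    return float(d[np.argmax(np.abs(slope))])
```

For a TI-like curve (phase π, zero offset) the maximum interaction position falls at the trough, while for an odd-symmetric curve the maximum slope position falls near the envelope center, which is where each measure is most stable.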
Each of the preceding measures is appropriate for some types of tuning but is unstable for others. Indeed it may be inappropriate to characterize the disparity sensitivity of the cell with a single disparity value. An alternative approach is to examine the parameters of the Gabor functions used to describe the disparity tuning profile. Figure 11, C and D, plots the spatial frequency (f) and SD (σ) parameters for the disparity tuning profiles as a function of eccentricity. It can be seen that there is an increase in scale as the RFs move away from the fovea. However, there is no evidence for separate “coarse” and “fine” processing systems at any one eccentricity.
Although these newly developed measures of disparity sensitivity better reflect the information conveyed by V1 about disparity, in Fig. 12 we follow another approach. We estimate disparity discriminability as the mean absolute change in population firing rate in response to a small change in horizontal disparity about a given pedestal. For each tuning profile, the absolute slope of the square root of the fitted Gabor function was calculated for disparities from −1.0 to 1.0°. This quantity is closely related to disparity discriminability, as the variance of the square root of the firing rate is not strongly dependent on the mean (appendix A of the accompanying paper, Prince et al. 2002). RFs with eccentricities from 1.0 to 4.5° were grouped into six bins except for a small number outside this range, which were included in the nearest bin. The estimate of disparity discriminability was then averaged across all neurons in each bin.
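The population measure described above can be sketched as follows for a single eccentricity bin. The Gabor form, the parameter ordering, the disparity grid, and the function names are assumptions for illustration; negative fitted rates are clipped to zero before the square root, which is a choice made here and not one stated in the text.

```python
import numpy as np

def gabor(d, r_mean, amp, d0, sigma, f, phi):
    """Assumed 1-D Gabor tuning curve (phase relative to envelope center)."""
    d = np.asarray(d, dtype=float)
    envelope = np.exp(-((d - d0) ** 2) / (2.0 * sigma ** 2))
    return r_mean + amp * envelope * np.cos(2.0 * np.pi * f * (d - d0) + phi)

def population_sensitivity(params_list, d=None):
    """Mean absolute slope of sqrt(fitted rate) across neurons.

    params_list: iterable of (r_mean, amp, d0, sigma, f, phi) tuples,
    one per neuron in the bin. Returns (disparities, mean |slope|).
    """
    if d is None:
        d = np.linspace(-1.0, 1.0, 201)   # the -1 to 1 deg range in the text
    slopes = []
    for params in params_list:
        rate = np.clip(gabor(d, *params), 0.0, None)
        slopes.append(np.abs(np.gradient(np.sqrt(rate), d)))
    return d, np.mean(slopes, axis=0)
```

Two odd-symmetric neurons of opposite phase centered on zero disparity, for example, yield a sensitivity profile that peaks at zero even though neither neuron has its maximum firing rate there.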
The resulting functions are plotted in Fig. 12. At all eccentricities, the peak sensitivity is close to zero disparity and decreases as a function of disparity. There is very little sensitivity to disparities of greater than ±1°. At larger eccentricities, the peak sensitivity decreases but the shape of the function is unchanged. At first sight, this finding might seem incompatible with a predominance, noted earlier, of symmetrical T0-type cells, which do not vary their response at zero disparity. However, several other types of cell (NE, FA, TN, TF) change their response rates near zero disparity and, as a whole, these cause the peak sensitivity to be close to zero.
This analysis assumes that all neurons in the population have the same relationship between the variance and the mean of their spike counts. We performed the entire analysis in an alternative way by weighting each neuron's contribution by the inverse of each measured SD of the square root of the firing rate. This resulted in no substantial difference in the profile of sensitivity to disparity from the outcome shown in Fig. 12.
Phase and position disparities
The disparity sensitivity profile can also be used to compare the separate contributions of phase and position encoding. Figure 10 shows a scatter plot of the phase and position disparity components of the tuning curves. Position disparity was estimated from the disparity offset (d0) of the fitted Gabor. The contribution of phase disparity was estimated by rescaling the phase parameter of the fitted Gabor (φ) by the wavelength of the sinusoidal component of the fitted Gabor (1/f) and changing its sign, so that positive values of both parameters signify far disparities. Both phase and position disparity mechanisms are present and each encodes large ranges of disparity, as in the cat (Anzai et al. 1999a). The distribution of position shifts for cells that were adequately described by a Gaussian function (n = 98, σ = 0.2094°) was found to be the same as for the whole population.
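The rescaling of phase into an equivalent disparity can be written in one line. This sketch assumes that "rescaling by the wavelength" means expressing the phase as a fraction of a cycle (φ/2π) and multiplying by 1/f, with the sign flipped as the text describes; the function name is hypothetical.

```python
import math

def phase_to_equivalent_disparity(phi, f):
    """Equivalent disparity (deg) of a phase shift phi (radians) for a
    tuning curve with disparity frequency f (cycles/deg)."""
    # Fraction of a cycle times the wavelength (1/f), with the sign changed
    # so that the result follows the same convention as the offset d0.
    return -phi / (2.0 * math.pi * f)
```

For instance, a phase of −π/2 at 2 cycles/deg corresponds to 0.125°, and a phase of magnitude π always maps to half a wavelength.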
When quantified in this way, a slightly larger range of disparities is encoded by interocular phase differences (σ = 0.264°) than by position mechanisms (σ = 0.211°), but these numbers are not straightforward to compare. The conversion from phase disparity to equivalent position disparity considers only the disparity that produces the highest firing rate. This suffers from the limitation mentioned in the preceding text; the limitation is illustrated by neurons with the TI response pattern (±π phase shift) that are simply inverted forms of T0 curves (0 phase shift). Despite this, phase shifts of π are plotted as equivalent to large position disparities.
We therefore examined the contributions of phase and position disparities to the surface of sensitivity shown in Fig. 12. In Fig. 13, we have re-plotted these data, separating the contributions of position shifts and phase shifts. The solid line shows the sensitivity attributable to the phase component alone. For this curve, the position parameter of the Gabor fits was set to zero, i.e., the best-fitting Gabor function was simply translated to a position where the Gaussian envelope had zero disparity. Note that the data were not refit, so this translated Gabor shows the range of disparities that would be encoded by the phase disparity if there had been zero position disparity. The population sensitivity was recalculated from this set of curves as for Fig. 12. The dashed line shows the equivalent calculation for position disparity. In this case, the phase component of the Gabor fits was fixed at zero before the sensitivity was calculated. Thus this shows the range of disparities that would be encoded by the position disparity if there had been no phase disparities. This comparison shows that similar ranges of disparity are encoded by phase and position mechanisms with a measure that employs information from the whole of the tuning curve.
Comparison with psychophysics
For comparison with the V1 physiology, we measured the largest disparity at which psychophysical observers could perform a front/back discrimination (stereo Dmax) (Glennerster 1998), using an RDS stimulus whose spatial properties were set to the mean of the stimulus set used for the neuronal recording. This was performed with the two animals used in this study and with two human observers. For one animal (Hg), the performance was stable with 75% performance at 0.602°, similar to the values found for the human observers (0.475 and 0.453°). Although monkey Rb showed variable performance, the value of stereo Dmax calculated day by day was always smaller than these three values, and the largest measured value for Rb was 0.308°. Thus it appears that at least for these stimuli, primates are unable to determine the sign of disparities >0.6°. This reflects a limit that is consistent with the total range of responses of V1 neurons to the disparity of RDS patterns.
The preceding section described the range of disparities encoded across the entire population of V1 neurons. There are several reasons (see discussion) why it may be advantageous to limit the range of disparities encoded depending on the periodicity of the tuning function, that is, to have a size-disparity correlation. This is examined in Fig. 14, where the maximum interaction position is plotted as a function of the frequency of the fitted Gabor. The solid line (—) indicates a ±λ/2 phase limit. For almost all cells, the preferred disparity is within this range. Moreover, the range of disparity encoding decreases as a function of the spatial frequency component of the tuning curve. We conclude that a size-disparity correlation is present. This is partially a reflection of the phase-disparity encoding: the maximum interaction position of a cell that encodes disparity using no position shift, but a nonzero phase shift will necessarily be within ±90°. A size-disparity correlation is also evident in the distribution of position shifts (Fig. 14), a measure which is not influenced by this constraint. Note that only 4/180 neurons have position disparities that correspond to phases exceeding the ±λ/2 limit. For these four neurons, the disparity tuning curve is displaced along the disparity axis by more than one half of its spatial period.
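The ±λ/2 criterion can be expressed as a simple check. The sketch below assumes disparities in degrees and Gabor frequencies in cycles per degree. The function names are hypothetical, and the check is only a restatement of the limit, not the analysis code used for Fig. 14.

```python
def half_cycle_limit(f):
    """Half the spatial period (lambda / 2, in degrees) of a carrier with frequency f (cycles/deg)."""
    return 1.0 / (2.0 * f)

def exceeds_half_cycle(disparity, f):
    """True if the disparity is displaced by more than half a period of the tuning curve."""
    return abs(disparity) > half_cycle_limit(f)

def fraction_within_limit(cells):
    """Fraction of cells whose disparity lies inside the +/- lambda/2 limit.

    cells: iterable of (disparity_deg, frequency_cycles_per_deg) pairs.
    """
    cells = list(cells)
    inside = sum(1 for d, f in cells if not exceeds_half_cycle(d, f))
    return inside / len(cells)
```

Because the limit scales as 1/f, a fixed disparity that is acceptable for a cell with a low-frequency tuning curve can exceed the limit for a cell with a high-frequency one. This is the sense in which the limit ties the usable disparity range to spatial scale.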
Classification of tuning curves
Poggio and collaborators (Poggio 1995; Poggio and Fischer 1977; Poggio and Talbot 1981; Poggio et al. 1988) classified disparity tuning curves in V1 into one of six categories based on a qualitative analysis of the tuning shape. We quantified the shapes of the tuning profiles with Gabor functions, whose phase and disparity offset parameters determine how the neurons should be classified. There was no indication of any clustering into distinct groups. Examples of neurons that would fall into each response category were found, in similar proportions to those reported previously (Poggio and Fischer 1977; Poggio et al. 1988). We conclude that the terminology developed by Poggio et al. remains a useful set of descriptive labels, but these labels appear not to identify distinct classes of neurons. Quantitative studies in cat striate cortex reached the same conclusion (Anzai et al. 1999a,b; Ohzawa et al. 1997).
Phase and position mechanisms
The present study shows that both phase- and position-based mechanisms encode horizontal disparity in cortical area V1 of the monkey. Thus it is necessary to use a “hybrid” (Fleet et al. 1996) phase and position model to account for these data. The advantages that may derive from using both types of disparity are unclear. Erwin and Miller (1999) have suggested that it reflects a developmental drive toward “subregion correspondence.” Their model predicts a negative correlation between phase and position disparities, but our data show a modest positive correlation.
Simply demonstrating that both phase and position disparities occur using tests of statistical significance does not demonstrate that they contribute equally to binocular visual processing. One way to assess their relative importance is to consider the extent to which each mechanism contributes to the ability of V1 to encode a range of disparities. We expressed the contribution of the phase and position components in terms of the rate of change in firing across the population as a function of disparity. This metric suggests that the phase- and position-components encode very similar disparity ranges. Our data may be compared directly with those of Anzai et al. (1997, 1999a), who examined phase and position encoding in simple cells from cat area 17. With a simple numerical comparison, they found that phase shifts encoded a somewhat larger range of disparities than position shifts. The similarity of the two data sets is striking, given the many differences in the methods used.
The data from monkey and cat are similar to those recently reported from the visual Wulst of the barn owl (Nieder and Wagner 2000): Gabor functions were good fits, there was a similar distribution of fitted phases, and there was a modest clustering of neurons by disparity selectivity. The chief discrepancy is with some of the earlier data reported from barn owl (Wagner and Frost 1993, 1994). Analysis of the response to sinusoidal gratings of different frequencies led to the conclusion that phase disparities were small or nonexistent in the barn owl. It is not clear how to reconcile these earlier observations with the more recent data in the same species. In the awake monkey, we found that this method produced similar estimates of phase disparity.
Relationship with ocularity
The accompanying paper (Prince et al. 2002) showed that a strong response to uncorrelated RDS stimuli is always accompanied by at least one substantial monocular response. Frequently, for TI neurons, the binocular responses were all smaller than the monocular response through the dominant eye, suggesting that inhibitory processes are involved. All formulations of the energy model to date have proposed only the use of excitatory outputs after half-wave rectification in simple cells. This has two consequences, which may be appreciated by consideration of the effect of dynamic RDS stimuli on the simple cells shown in Fig. 5. First, the response to the preferred disparity (derived from net excitation in both eyes) should be larger than either monocular response. Similarly the response to the null disparity should be smaller than the weaker of the two monocular responses. Second, monocular stimulation with a dynamic RDS in either eye should produce a net excitation—some samples in the dynamic sequence of patterns will be inhibitory, whereas others will be excitatory but the rectification leaves only positive responses.
The example of a TI cell in Fig. 4 C shows a clear case that obeys none of these predictions: the maximum binocular response is smaller than the response to monocular RDS in the dominant (left) eye; the minimum binocular response is greater than the response to monocular stimulation of the nondominant (right) eye; and the response to right monocular stimulation is lower than the spontaneous activity. This pattern strongly suggests that at some point the results of half-wave rectification have been passed through an inhibitory synapse.
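The three predictions of a purely excitatory energy model can be written as explicit comparisons, which makes it easy to see how a cell like this one violates all of them. The sketch below is illustrative only: the function name, argument names, and result keys are hypothetical, and all firing rates are assumed to be in the same units.

```python
def energy_model_checks(binocular_rates, mono_left, mono_right, spontaneous):
    """Test the predictions of an energy model with purely excitatory outputs.

    binocular_rates: responses across the tested disparities.
    mono_left, mono_right: responses to monocular RDS in each eye.
    spontaneous: spontaneous firing rate.
    Returns a dict mapping each prediction to True (obeyed) or False (violated).
    """
    peak = max(binocular_rates)      # response at the preferred disparity
    trough = min(binocular_rates)    # response at the null disparity
    return {
        "peak_exceeds_both_monocular": peak > max(mono_left, mono_right),
        "trough_below_weaker_monocular": trough < min(mono_left, mono_right),
        "monocular_net_excitation": mono_left > spontaneous and mono_right > spontaneous,
    }
```

A cell with the pattern described above (peak binocular response below the dominant-eye response, minimum binocular response above the nondominant-eye response, and one monocular response below spontaneous activity) returns False for all three entries.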
Range of disparity encoding and psychophysics
A strategy of encoding disparity purely based on phase would have significant implications for the solution of the correspondence problem. Phase encoding necessitates a linkage between the preferred frequency of the cell and the range of disparity selectivity. This is known as “size-disparity correlation” in which the range of disparity encoding is limited to ±1/2f where f is the disparity frequency of the cell. If one is prepared to assume that the correct disparity is within this range, the correspondence problem is eased, a fact first pointed out by Marr and Poggio (1979).
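The origin of the ±1/2f limit can be made explicit with a short worked derivation. It assumes a Gabor tuning function with zero position shift, a Gaussian envelope of width σ, and phase φ expressed in radians. The symbols R, A, B, and d* are introduced here for illustration only.

```latex
% Pure phase encoder: Gaussian envelope centred on zero disparity (assumed form)
R(d) = B + A \exp\!\left(-\frac{d^{2}}{2\sigma^{2}}\right) \cos\left(2\pi f d + \phi\right),
\qquad \phi \in [-\pi, \pi]

% The carrier peak nearest to zero lies at
d^{*} = -\frac{\phi}{2\pi f}
\quad\Longrightarrow\quad
|d^{*}| \le \frac{1}{2f} = \frac{\lambda}{2}

% For |d| > |d^{*}| the envelope is smaller than at d^{*} and the carrier cannot exceed 1,
% so the maximum of R lies within [-|d^{*}|, |d^{*}|], i.e. within \pm 1/(2f).
```

Under these assumptions the preferred disparity of a pure phase encoder can never be more than half a carrier period from zero, so a cell tuned to a high disparity frequency is necessarily restricted to small disparities.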
Several psychophysical experiments provide behavioral evidence that a size-disparity correlation exists in human vision (Schor et al. 1984; Smallman and Macleod 1994), although some of these correlations may simply reflect properties of the stimulus (see Prince and Eagle 1999). Using isolated Gabor patches, Prince and Eagle (1999) showed that performance extends to large disparities, much larger than one cycle of the stimulus (see also Schor and Wood 1983; Simmons and Kingdom 1995). These disparities (several degrees) are too large to be accounted for by the range of disparities encoded by neurons in this study. One possible explanation for this discrepancy is that responses to larger disparities are present in extra-striate areas. Note that this could not simply be inherited from neurons in V1, which do not respond at large disparities, and would presumably need to be constructed from monocular signals.
In contrast to these results with Gabor patches, psychophysical experiments with random dot stimuli show that depth discrimination is only possible within a range of disparities similar to that encoded by V1 neurons (see results and Glennerster 1998). It may be that the psychophysical ability to identify large disparities with isolated stimuli (Gabor patches or bars) exploits signals in V1 neurons in a more subtle way than simply pooling the outputs of disparity selective neurons. For example, the monocular responses of V1 neurons may implicitly encode the disparity of spatially isolated stimuli. In a crowded random-dot pattern, the binocular response to uncorrelated dots means that the only reliable information about disparity is in the form of disparity-selective responses.
At the other extreme, the responses of V1 neurons appear to be sensitive enough to encode disparities in the neighborhood of psychophysical threshold (Prince et al. 2000). Together, these data indicate that disparity-selective responses in V1 are sufficient to support psychophysical disparity judgments with random dot stereograms. Whether they might also be sufficient to support other binocular functions, such as fusion or the control of vergence eye movements, requires further investigation.
We are grateful to H. Bridge, O. Thomas, and A. Pointon for help in collecting the data and to G. DeAngelis and J. Read for comments on earlier drafts of this work. We are also grateful to G. DeAngelis and W. Newsome for access to data from macaque area MT and to H. Wagner and A. Nieder for data from the owl visual Wulst. B. G. Cumming was a Royal Society University Research Fellow.
This work was supported by the Wellcome Trust.
Address for reprint requests: A. J. Parker, University Laboratory of Physiology, Parks Road, Oxford OX1 3PT, UK (E-mail:).
Copyright © 2002 The American Physiological Society