## Abstract

Horizontal disparity tuning for dynamic random-dot stereograms was investigated for a large population of neurons (*n* = 787) in V1 of the awake macaque. Disparity sensitivity was quantified using a measure of the discriminability of the maximum and minimum points on the disparity tuning curve. This measure and others revealed a continuum of selectivity rather than separate populations of disparity- and nondisparity-sensitive neurons. Although disparity sensitivity was correlated with the degree of direction tuning, it was not correlated with other significant neuronal properties, including preferred orientation and ocular dominance. In accordance with the Gabor energy model, tuning curves for horizontal disparity were adequately described by Gabor functions when the neuron's orientation preference was near vertical. For neurons with orientation preferences near horizontal, a Gaussian function was more frequently sufficient. The spatial frequency of the Gabor function that described the disparity tuning was weakly correlated with measurements of the spatial frequency and orientation preference of the neuron for drifting sinusoidal gratings. Energy models make several predictions about the relationship between the response rates to monocular and binocular dot patterns. Few of the predictions were fulfilled exactly, although the observations can be reconciled with the energy model by simple modifications. These same modifications also provide an account of the observed continuum in strength of disparity selectivity. A weak correlation between the disparity sensitivity of simultaneously recorded single- and multiunit data was revealed, as well as a weak tendency to show similar disparity preferences. This is compatible with a degree of local clustering for disparity sensitivity in V1, although this is much weaker than that reported in area MT.

## INTRODUCTION

Selectivity for binocular disparity was initially demonstrated using elongated bar stimuli in cat area 17 (Barlow et al. 1967; Pettigrew et al. 1968) and V1 of the awake monkey (Poggio and Fischer 1977). Subsequently, Poggio and colleagues (Poggio 1995; Poggio et al. 1985, 1988) examined the sensitivity to horizontal disparity in random-dot stereograms (RDS) in macaque V1. None of the studies using RDS has attempted to describe the disparity tuning quantitatively. Consequently, there has been no quantitative analysis of the relationship between disparity selectivity to RDS and other fundamental properties of V1 neurons, such as orientation tuning and ocular dominance.

There are several reasons why it is important to study these issues with RDS. First, a change in the disparity of a bar stimulus also generates changes in the monocular images, which by themselves may influence neuronal firing. By contrast, the monocular images of random-dot stimuli are spatially homogeneous. There is nothing that can be discovered about the disparity of RDS by inspecting one eye's image alone. Second, random-dot patterns contain a complete spectrum of orientations, which permits horizontal disparities to be explored regardless of the neuron's orientation preference. With bars or gratings, only the component of disparity orthogonal to the *stimulus* orientation can influence the neuron regardless of the receptive field properties. Remarkably, there are no published data that compare selectivity for the horizontal disparity of orientation broadband stimuli against orientation preference in area V1. Third, ever since the initiative of Julesz (1964, 1971), many psychophysical studies of stereopsis have used RDS to isolate binocular from monocular processes. To understand the physiological substrate of such behavior, it is important to use equivalent stimuli in a species whose psychophysical performance approaches that of human observers (Harwerth and Boltz 1979; Harwerth et al. 1995; Prince et al. 2000; Siderov and Harwerth 1995) and in a brain area where the neuronal performance can potentially account for the precision of psychophysical performance (Prince et al. 2000).

We therefore undertook a quantitative survey of the responses to RDS in a large population of V1 neurons (*n* = 787) recorded from awake behaving monkeys. This provides a detailed description of the prevalence, type, and range of disparity tuning in macaque V1 as well as the relationship between disparity selectivity and other RF properties. We also tested quantitatively models of the underlying mechanisms of disparity tuning. Specifically, we sought to determine whether the observed data are compatible with the “energy” model of disparity selective neurons (Ohzawa et al. 1990), which was developed to describe data from area 17 of the cat. In this model, binocular simple cells are modeled as linear filters followed by a static output nonlinearity, namely a half-squaring operation (Albrecht and Geisler 1991; Heeger 1992; Jagadeesh et al. 1993; Movshon et al. 1978; Tolhurst and Dean 1987, 1990). Disparity selectivity arises because the response of the linear filter for one eye is summed with that of a similar filter for the other eye before the half-squaring operation. Hence the expansive nonlinearity provides an increase in firing rate when both left- and right-eye filters match the image. Ohzawa et al. (1990) suggest that a complex cell may be constructed by combining four simple cells that have different monocular phase profiles but are all tuned to the same disparity.
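The structure of the energy model can be illustrated with a minimal one-dimensional sketch. It assumes Gabor-shaped receptive fields that are identical in the two eyes (so the model cell prefers zero disparity) and subunit phases stepping by 90°; the function names, parameter values, and pixel units are illustrative choices, not taken from the study.

```python
import math
import random

N_PIXELS = 64  # hypothetical width of the 1-D image, in pixels


def gabor_rf(phase, sigma=6.0, freq=0.08):
    """1-D Gabor receptive field (illustrative parameters, pixel units)."""
    centre = (N_PIXELS - 1) / 2.0
    return [
        math.exp(-((x - centre) ** 2) / (2.0 * sigma ** 2))
        * math.cos(2.0 * math.pi * freq * (x - centre) + phase)
        for x in range(N_PIXELS)
    ]


# Four simple-cell subunits whose monocular phases step by 90 degrees.
SUBUNIT_RFS = [gabor_rf(k * math.pi / 2.0) for k in range(4)]


def simple_cell(rf, left_img, right_img):
    """Sum the left- and right-eye linear filter outputs, then half-square."""
    drive = sum(w * p for w, p in zip(rf, left_img))
    drive += sum(w * p for w, p in zip(rf, right_img))
    return max(0.0, drive) ** 2


def complex_cell(left_img, right_img):
    """Complex cell: sum of the four simple-cell subunits."""
    return sum(simple_cell(rf, left_img, right_img) for rf in SUBUNIT_RFS)


def random_dots(rng):
    """Row of black (-1) and white (+1) dots on gray (0), 25% density."""
    return rng.choices((-1.0, 0.0, 1.0), weights=(1, 6, 1), k=N_PIXELS)


def mean_response(correlated, n_patterns=500, seed=0):
    """Average complex-cell response over many random-dot patterns."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_patterns):
        left = random_dots(rng)
        # Correlated: same pattern in both eyes (zero disparity).
        right = left if correlated else random_dots(rng)
        total += complex_cell(left, right)
    return total / n_patterns


resp_correlated = mean_response(correlated=True)
resp_uncorrelated = mean_response(correlated=False)
```

In this sketch the mean response to correlated dots at the preferred disparity is expected to be about twice the mean response to binocularly uncorrelated dots, because the two monocular filter outputs add coherently before the squaring step.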

According to the energy model, the shape of the disparity tuning curve is determined by the shapes of the monocular receptive fields. The strongest test of the model is therefore to perform a quantitative comparison of the shape of the monocular subunits and the disparity selectivity. Such an analysis has recently been performed for simple cells in anesthetized cats (Anzai et al. 1999b). The analysis has not been performed for complex cells because the spatial nonlinearity of these cells makes it difficult to determine the receptive field (RF) profile of the subunits. In awake animals, the analysis is difficult even for simple cells, owing to the complications created by fixational eye movements (Livingstone and Tsao 1999).

Fortunately, there are many other tests of the energy model that can be applied in the absence of direct measurements of monocular RF structure. These are based on assuming a specific functional form for the underlying monocular subunits. A suitable choice is the Gabor function, which has been extensively evaluated as a description of the monocular RF profiles of simple cells in cortical area V1 (Daugman 1985; Jones and Palmer 1987; Marcelja 1980). Under the assumption that the Gabor is a correct description for the monocular profiles, the shape of the disparity tuning curve should be well described by a Gabor function, whose parameters should be related to the spatial properties of the neuron (for example, its orientation and spatial frequency tuning). Some deviations from the Gabor model for V1 cortical neurons have been previously observed (Hawken and Parker 1987). Although similar deviations occur within the present data set, they are slight.

The energy model is commonly combined with the assumption that its subunits are described by Gabor functions. This has been used widely (e.g., Fleet et al. 1996a,b; Prince and Eagle 2000; Qian 1994; Qian and Zhu 1997) since its inception and will be referred to here as the “Gabor energy model.” All experimental data used to assess its validity (Anzai et al. 1999a,b, 1997; DeAngelis et al. 1995; Ohzawa and Freeman 1986a,b; Ohzawa et al. 1990, 1996, 1997) have been gathered from V1 neurons in the anesthetized cat using one-dimensional stimuli (bars or gratings). The present paper gives a quantitative summary of the responses of cortical neurons in V1 of the awake monkey to RDS patterns and examines how well the energy model describes these responses.

The accompanying paper (Prince et al. 2002) concentrates on those neurons that show strong disparity selectivity. The parameters of the curves fitted to the disparity tuning functions are used to address four questions concerning the mechanism of disparity selectivity. *1*) Is there any evidence for a distinct grouping into different types of disparity tuning curve? *2*) Are interocular differences in RF position or phase used to generate selectivity for nonzero disparities? *3*) What range of disparities is signaled by V1 neurons? *4*) Is the disparity encoding limited by the periodicity of the tuning curve (size-disparity correlation)? Together, these two papers provide a comprehensive, quantitative account of the properties of disparity-selective neurons in primate V1.

## METHODS

### General methods

The methods employed in this experiment for recording from V1 of the awake behaving monkey have been described in full in Cumming and Parker (1999). All of the procedures carried out complied with the United Kingdom Home Office regulations on animal experimentation. In brief, extracellular recordings were made from the striate cortex of two adult monkeys (*Macaca mulatta*), which had been trained to perform attentive fixation while viewing visual stimuli in a Wheatstone stereoscope for fluid rewards. Single-unit sensitivity to dynamic random-dot stereograms was measured as a function of horizontal disparity. Sinusoidal and bar stimuli were used to characterize a variety of other parameters.

### Apparatus and single-unit recording

Binocular stimuli were presented on two monochrome monitors (Tektronix GMA 201) driven by a split-color signal from a Silicon Graphics Indigo computer and viewed using a Wheatstone stereoscope. Mean luminance was 188 cd.m^{−2}, the maximum contrast was 99%, and the frame rate was 72 Hz. The screens were at a distance of 89 cm from the eyes, such that each pixel subtended 0.98 arc min. For a small number of the later experiments, EIZO FlexScan F78 monitors were used with a mean luminance of 42 cd.m^{−2}. The positions of both of the animals' eyes were monitored using a magnetic scleral search coil system (C-N-C Engineering). To initiate a stimulus presentation, the animals were required to fixate to within either 0.4*°* (*monkey Rb*) or 0.6*°* (*monkey Hg*) of a binocularly presented spot. If the animal failed to maintain fixation within this window for the trial duration of 2 s, the trial was abandoned and a brief time-out period ensued. For the majority of trials, oculomotor control was much tighter than these limits.

Tungsten-in-glass microelectrodes (Merrill and Ainsworth 1972) were passed transdurally into the opercular cortex. Extracellular measurements of electrical activity in cortical area V1 were made from the left hemisphere for *monkey Hg,* and both hemispheres for *monkey Rb.* On isolation of a single unit, the classical minimum response field was determined, and its orientation preference was measured with a sweeping bar stimulus (see following text). Ninety-five percent of the RF centers were at eccentricities between 0.99 and 4.93*°*.

### Measurement of disparity tuning functions

The disparity sensitivity of single units was assessed using dynamic random-dot stereogram patterns. Each stereo-half consisted of equal numbers of black and white dots (usually 0.08 × 0.08*°*) presented against a midgray background with an overall density of 25%. A new pattern of random dots was used on each video frame to construct the stereograms. Thus in a 2-s presentation, there were 144 frames. As an absolute minimum, every measurement of a disparity-tuning curve was based on at least two trials for each disparity, which means that ≥288 different random-dot patterns were presented to each neuron for each disparity tested. For practical purposes, the sum of this stimulation is spatially homogeneous in each monocular image. In practice, many more trials were acquired for data that were subjected to detailed quantitative analysis (see following text).

The stereogram stimuli consisted of a central circular region that varied in disparity and a surround region that was held at a constant disparity. The central region was sufficiently large to cover the monocular receptive fields of the cell even at the largest disparity tested. The surround region was present to mask the monocular shifts in stimulus position that accompany the introduction of disparity, and to provide a reference for simultaneous psychophysical judgments (see Prince et al. 2000 for details). The disparity of the surround has been demonstrated not to influence the mean firing rate of units in V1 (see Cumming and Parker 1999). An initial test of disparity selectivity was carried out using five stimuli with disparities varying from −0.4 to 0.4*°*. If the mean firing rate was <10 spikes/s for all disparities, then the data were discarded. If the cell did not modulate its firing rate with disparity, then measurements of spatial properties were made and another unit was sought. For a subset of cells, a wider range of disparities was sampled at the outset to ensure that cells tuned only to large disparities were not missed. In all cases where disparity tuning was found, there was some modulation in the range *±*0.4*°*, even if the maximum response was for a greater disparity.

If the cell modulated its firing rate with disparity, further measurements of disparity sensitivity were made, and the stimulus disparities were adjusted to cover the range over which modulation occurred. In general, the disparity tuning curves analyzed here were used for a variety of other studies, which influenced the choice of sampling. Thus the quantity and range of data gathered varied widely from cell to cell. The number of disparity levels sampled varied from 5 to 34, and the total number of trials varied from 10 to 958. For some cells, responses to binocularly uncorrelated random-dot stereograms were also measured. Disparity tuning functions were recorded from a total of 787 V1 cortical cells, of which 489 were from *monkey Rb* and 298 were from *monkey Hg.*

Sensitivity to disparity was first characterized with a binocular interaction index or BII (Ohzawa and Freeman 1986b; Smith et al. 1997b), which measures the degree to which the firing modulates with disparity. A BII of near 1 indicates that disparity variations can modulate the firing rate from zero to the maximum rate on the disparity tuning curve. A BII of near 0 indicates that disparity hardly changes the firing rate at all. The binocular interaction index is defined as

$$\mathrm{BII} = \frac{R_{\mathrm{max}} - R_{\mathrm{min}}}{R_{\mathrm{max}} + R_{\mathrm{min}}} \tag{1}$$

where *R*_{max} is the firing rate at the preferred disparity (i.e., the greatest response on the tuning curve) and *R*_{min} is the firing rate at the least preferred disparity (the minimum response on the tuning curve).
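As a small worked illustration, the index can be computed directly from the mean rates of a tuning curve. The sketch assumes the form BII = (*R*_{max} − *R*_{min})/(*R*_{max} + *R*_{min}); the function name and the example rates are hypothetical.

```python
def binocular_interaction_index(mean_rates):
    """BII = (R_max - R_min) / (R_max + R_min) for a curve of mean rates."""
    r_max = max(mean_rates)
    r_min = min(mean_rates)
    if r_max + r_min == 0:
        raise ValueError("BII is undefined when all rates are zero")
    return (r_max - r_min) / (r_max + r_min)


# Hypothetical mean firing rates (spikes/s) at five disparities.
rates = [12.0, 30.0, 48.0, 20.0, 12.0]
bii = binocular_interaction_index(rates)  # (48 - 12) / (48 + 12) = 0.6
```

Because the index uses only the two extreme means, it carries no information about trial-to-trial variability.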

The Gabor energy model predicts that disparity tuning curves will have the form of either a one-dimensional Gabor function or a Gaussian curve (see appendix). Well-tuned cells were fit with both of these models using a nonlinear least squares algorithm (Numerical Algorithms Group, Oxford). A one-dimensional Gabor function may be described by the equation

$$R(d) = \mathrm{Pos}\left[ R_{\mathrm{mean}} + A \exp\left( -\frac{(d - d_0)^2}{2\sigma^2} \right) \cos\bigl( 2\pi f (d - d_0) + \phi \bigr) \right] \tag{2}$$

where *R*_{mean} is the mean height of the curve (binocular baseline firing), *A* is the amplitude, *d* is the stimulus disparity, *d*_{0} is the mean position of the curve in disparity (disparity offset), and σ is the width of the function. The frequency and phase of the Gabor function are controlled by the parameters *f* and φ, respectively. Note that phase is defined relative to the disparity offset. Hence the phase parameter describes the symmetry of the tuning profile relative to the mean position of the Gaussian envelope. The operation Pos denotes half-wave rectification. For the energy model, the binocular baseline firing rate about which modulations occur (*R*_{mean}) is the response of the neuron to dots that are binocularly uncorrelated.
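The tuning function can be written out as a short sketch. It assumes the conventional parameterization, a Gaussian envelope of width σ centered on *d*_{0} multiplied by a cosine whose phase is measured relative to *d*_{0}, with half-wave rectification applied to the sum; the function and argument names are illustrative.

```python
import math


def gabor_tuning(d, r_mean, amp, d0, sigma, freq, phase):
    """Half-wave rectified 1-D Gabor disparity tuning curve.

    d and d0 are in degrees, freq in cycles/degree, phase in radians.
    """
    envelope = math.exp(-((d - d0) ** 2) / (2.0 * sigma ** 2))
    carrier = math.cos(2.0 * math.pi * freq * (d - d0) + phase)
    return max(0.0, r_mean + amp * envelope * carrier)  # Pos[...]


# Hypothetical even-symmetric cell tuned to zero disparity.
peak = gabor_tuning(0.0, r_mean=20.0, amp=15.0, d0=0.0,
                    sigma=0.2, freq=2.0, phase=0.0)      # 20 + 15 = 35
flank = gabor_tuning(0.25, r_mean=20.0, amp=15.0, d0=0.0,
                     sigma=0.2, freq=2.0, phase=0.0)     # carrier = cos(pi) = -1
```

With a phase of 0 the curve is symmetric about *d*_{0}, and with a phase of ±π/2 it is odd-symmetric, which is why the phase parameter describes the symmetry of the tuning profile.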

Three constraints were placed on the Gabor fitting. First, the amplitude parameter *A* was restricted to be less than the observed range of firing rates. This ensured that the fitted curve was limited to a plausible range of firing rates. Note that for the Gabor function, the largest possible range of firing rates is *R*_{mean} ± *A*, so the restriction on *A* allows for the fitted modulation to be up to twice the experimentally observed range. Second, an upper limit on the frequency of the sinusoidal component (*f*) was set so that it did not exceed the limit constrained by the data sampling. Third, the disparity offset (*d*_{0}) was constrained to be within the range of the data samples. For each curve, a Gaussian function was also fit. This is defined identically to the Gabor with the cosine term omitted. A sequential *F* test was carried out to test whether the Gabor function explained significantly more of the variation in the curves than the Gaussian function. Because these regressions are nonlinear in their parameters, a separate numerical simulation was carried out to confirm that the *F* test had an appropriate rejection rate.
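The comparison of nested fits can be sketched as follows. The statistic shown is the standard extra-sum-of-squares *F* ratio; the parameter counts follow from the definitions above (four for the Gaussian, six for the Gabor, which adds *f* and φ). Whether the number of data points is taken as disparity levels or individual trials is not specified here, so `n_points` and the residual sums of squares are hypothetical inputs.

```python
def sequential_f(rss_reduced, rss_full, p_reduced, p_full, n_points):
    """Extra-sum-of-squares F ratio for two nested least-squares fits."""
    df_num = p_full - p_reduced   # extra parameters in the fuller model
    df_den = n_points - p_full    # residual degrees of freedom
    if df_num <= 0 or df_den <= 0:
        raise ValueError("need p_full > p_reduced and n_points > p_full")
    return ((rss_reduced - rss_full) / df_num) / (rss_full / df_den)


# Hypothetical residual sums of squares from Gaussian and Gabor fits.
f_stat = sequential_f(rss_reduced=12.0, rss_full=6.0,
                      p_reduced=4, p_full=6, n_points=16)
# ((12 - 6) / 2) / (6 / 10) = 5.0
```

The ratio would then be compared with an *F* distribution on (2, `n_points` − 6) degrees of freedom; that step is omitted to keep the sketch free of external libraries.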

Many statistical procedures, including regression analysis, rely on an assumption of homogeneity of variance. This poses a problem for neuronal firing data, in which the variance is known to be approximately proportional to mean firing rate (e.g., Dean 1981; Tolhurst et al. 1981). Taking the square root of measured firing provides a variance stabilizing transformation. This should remove the mean:variance dependency and de-skew the data (see Armitage and Berry 1994; Snedecor and Cochran 1989), and the appendix shows that the transform achieves this for our dataset. All statistical analysis and curve fitting in this paper were hence performed on the square root of the firing rate. When analytic functions such as Gabor curves are fit, the square root of the function is fitted to the square root of the data. The result is that of fitting a Gabor function in which data at high firing rates are weighted less because those firing rates are known to be more variable.
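The effect of the transformation can be illustrated with simulated counts. The sketch assumes Poisson-distributed spike counts as a stand-in for real data, which is only one way of obtaining a variance proportional to the mean; the sampler, seeds, and mean counts are illustrative.

```python
import math
import random
import statistics


def poisson_sample(rng, lam):
    """Draw one Poisson count with Knuth's multiplication method."""
    limit = math.exp(-lam)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def variances(mean_count, n_trials=5000, seed=1):
    """Return (variance of raw counts, variance of square-rooted counts)."""
    rng = random.Random(seed)
    counts = [poisson_sample(rng, mean_count) for _ in range(n_trials)]
    raw = statistics.variance(counts)
    rooted = statistics.variance([math.sqrt(c) for c in counts])
    return raw, rooted


raw_low, root_low = variances(10.0)    # low mean count per trial
raw_high, root_high = variances(40.0)  # fourfold higher mean count
```

Under this assumption the raw variance grows roughly fourfold between the two conditions, whereas the variance of the square-rooted counts stays near 0.25 in both.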

### Measurement of spatial properties

Orientation preference was assessed using binocular sweeping bar stimuli. Mean firing rate was measured at a number of orientations and a Gaussian curve was fitted to the resulting tuning profile. The position of the peak of this curve was taken to be the preferred orientation. For many units, orientation preference was also tested with binocular, drifting sinusoidal gratings of the optimal spatial and temporal frequency. A Gaussian curve was also fitted to this orientation response curve, and the peak position was taken as a measure of the preferred orientation. Where both bar and grating stimuli were used to measure the cell's orientation preference, results were generally in close agreement. In these cases, the sinusoidal grating data were used. For both stimuli, the half-width at half height of the fitted Gaussian curve was taken as a measure of the orientation bandwidth.

Preferred spatial frequency was assessed by presenting a drifting grating patch at the preferred orientation. The neuronal response was measured at a number of spatial frequencies (usually 5), and a Gaussian in log frequency was fit to the resulting data. The peak of this Gaussian was taken as a measure of the spatial frequency preference of the cell. When the peak position of the fitted Gaussian was above or below the data range, the data were deemed not to permit the designation of a preferred spatial frequency. It should be noted that spatial frequency sampling was usually sparse (typically at 1, 2, 4, 8, and 16 cpd), and hence our estimates of the preferred spatial frequency for luminance gratings are less precise than estimates of other parameters. Examples of disparity, orientation and spatial frequency tuning curves are shown in Fig. 1. The spatial frequency preference was usually measured after characterizing disparity selectivity. In many cases, the unit isolation was lost before this stage was reached so there are many units for which no spatial frequency tuning is available.

Ocular dominance was determined by presenting monocular drifting gratings of the preferred spatial frequency, orientation, and direction to each eye alone. For some cells, monocular random-dot stimuli were interleaved in the main disparity tuning measurement, and a further estimate of ocular dominance was produced from these. For both stimuli, the eye that was not being tested viewed a blank, dark screen. The ocular dominance index (ODI) was defined by LeVay and Voigt (1988) as the response of the ipsilateral eye to a monocular stimulus divided by the sum of the ipsi- and contralateral responses (see *Eq. 3*). Hence, cells that have an ocular dominance index near 1 have a large ipsilateral response, cells with an ocular dominance index near 0 have a large contralateral response, and cells with an ODI of near 0.5 are well balanced. This can be re-expressed as a monocularity index MI, where 0 is totally binocular and 1 is totally monocular (see *Eq. 4*). It should be noted that, unlike LeVay and Voigt (1988), spontaneous rates were not measured explicitly in the present work and have not been subtracted from these measures. However, examination of the mean firing rate in the prestimulus period suggests that spontaneous rates were almost always small compared with the responses to random-dot patterns

$$\mathrm{ODI} = \frac{R_{\mathrm{ipsi}}}{R_{\mathrm{ipsi}} + R_{\mathrm{contra}}} \tag{3}$$

$$\mathrm{MI} = \left| 2\,\mathrm{ODI} - 1 \right| \tag{4}$$

Direction preference was tested by presenting a monocular drifting grating of the preferred temporal frequency, spatial frequency, and orientation to the dominant eye and comparing the response in the two possible drift directions. The direction tuning index or DTI quantifies the difference in firing to the preferred and null directions. It is defined as the difference between the responses in the preferred and null directions divided by their sum

$$\mathrm{DTI} = \frac{R_{\mathrm{pref}} - R_{\mathrm{null}}}{R_{\mathrm{pref}} + R_{\mathrm{null}}} \tag{5}$$

Cells were classified as simple or complex using the method of Cumming et al. (1999): responses to drifting gratings were analyzed but stimulus cycles during which a saccade was made were discarded. Cells in which the F1:F0 harmonic ratio was greater than one were classified as simple, following Skottun et al. (1991).
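The indices described in this section reduce to a few lines of arithmetic, sketched below. The ODI and DTI follow the verbal definitions in the text; the monocularity index is written as |2·ODI − 1|, which is an assumed form chosen only because it maps balanced cells to 0 and fully monocular cells to 1. Function names and example rates are hypothetical.

```python
def ocular_dominance_index(r_ipsi, r_contra):
    """Ipsilateral response divided by the sum of both monocular responses."""
    return r_ipsi / (r_ipsi + r_contra)


def monocularity_index(odi):
    """Assumed form: 0 for balanced cells (ODI = 0.5), 1 for monocular cells."""
    return abs(2.0 * odi - 1.0)


def direction_tuning_index(r_pref, r_null):
    """(preferred - null) / (preferred + null)."""
    return (r_pref - r_null) / (r_pref + r_null)


def is_simple(f1, f0):
    """Simple if the F1:F0 harmonic ratio exceeds one (f0 must be > 0)."""
    return f1 / f0 > 1.0


# Hypothetical mean responses in spikes/s.
odi = ocular_dominance_index(r_ipsi=30.0, r_contra=10.0)  # 0.75
mi = monocularity_index(odi)                              # 0.5
dti = direction_tuning_index(r_pref=40.0, r_null=10.0)    # 0.6
```

None of these functions subtracts a spontaneous rate, matching the note above that spontaneous activity was not removed from the measures.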

## RESULTS

### Prevalence of disparity tuning

In this section we address two questions. First, we consider whether the degree of tuning for horizontal disparity is distributed continuously or whether there is evidence for a distinct population of disparity selective cells. Second, we examine whether disparity selectivity is correlated with other neuronal properties. To answer these questions, disparity selectivity must be adequately characterized.

The most common way to assess disparity selectivity has been to employ a “relative modulation” index. For example, Ohzawa and Freeman (1986a) and Smith et al. (1997b) measured disparity tuning using drifting sinusoidal grating stimuli. They fitted sinusoidal functions to the tuning curves and defined the BII to be the ratio of the amplitude to the mean firing level. In this paper, we use a related measure, also referred to as the BII, which is suitable for use with data from random-dot stereograms (see *Eq. 1* in methods).

There are several potential difficulties with the BII because it takes no account of the variability in firing, nor the dependence of this variability on the firing rate. For these reasons we developed a different, statistical index that estimates the discriminability of the maximum and minimum points on the disparity tuning profile. We call this the disparity discrimination index or DDI

$$\mathrm{DDI} = \frac{R_{\mathrm{max}} - R_{\mathrm{min}}}{R_{\mathrm{max}} - R_{\mathrm{min}} + 2\,\mathrm{RMS}_{\mathrm{error}}} \tag{6}$$

where, as before, *R*_{max} is the greatest response on the measured tuning curve, and *R*_{min} is the smallest response. RMS_{error} is the square root of the residual variance around the means across the whole tuning curve. All of these calculations were performed on the square root of the firing rates.
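A minimal sketch of the calculation is given below. It assumes the form DDI = (*R*_{max} − *R*_{min})/(*R*_{max} − *R*_{min} + 2·RMS_{error}), applies the square-root transform to every trial first, and estimates the residual variance with *N* − *k* degrees of freedom (*N* trials, *k* disparities); the choice of degrees of freedom and the function name are assumptions.

```python
import math


def disparity_discrimination_index(trials_by_disparity):
    """DDI from per-trial firing rates, one inner list per disparity."""
    # Work on square-rooted rates throughout.
    rooted = [[math.sqrt(r) for r in trials] for trials in trials_by_disparity]
    means = [sum(t) / len(t) for t in rooted]

    # Residual variance around each disparity's mean, pooled over the curve.
    ss_within = sum((x - m) ** 2 for t, m in zip(rooted, means) for x in t)
    dof = sum(len(t) for t in rooted) - len(rooted)
    if dof <= 0:
        raise ValueError("need more trials than disparities")
    rms_error = math.sqrt(ss_within / dof)

    modulation = max(means) - min(means)
    return modulation / (modulation + 2.0 * rms_error)


# Hypothetical rates (spikes/s): two disparities, two trials each.
ddi = disparity_discrimination_index([[1.0, 9.0], [25.0, 49.0]])
# rooted means are 2 and 6, RMS error is sqrt(2), so DDI = 4 / (4 + 2*sqrt(2))
```

A perfectly reliable neuron (no trial-to-trial variation) gives a DDI of 1, and the index falls toward 0 as the noise grows relative to the modulation.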

The DDI essentially compares the difference in firing to the preferred and least-preferred disparities to the within-stimulus variation in neuronal firing. If the disparity tuning curve modulates a great deal and the response to each particular disparity on the curve is statistically reliable, then this index will be near one. If the firing rate is not modulated by disparity, then the fluctuations in the disparity tuning curve will be due to noise and this index will be small. Better estimates of the term RMS_{error} are of course achieved by increasing the number of stimulus presentations. However, increasing the duration over which rates are measured systematically reduces RMS_{error}, leading to larger values of the DDI. It is therefore important that comparisons of this type of measure are made between datasets with the same interval of time over which firing rates are measured, since changes in RMS_{error} alter the value of the DDI, even if *R*_{max} and *R*_{min} do not change. The DDI is related to the ability of an ideal observer to perform a disparity discrimination task, given only the measured firing rates of the neuron at the preferred and least-preferred disparities. See Prince et al. (2000) for a further discussion of neuronal discrimination of horizontal disparities.

Figure 2 compares these two measures of disparity selectivity in a way that reveals three advantages of the DDI. First, the BII is negatively correlated with the mean firing rate, whereas the DDI is not. The reason is that neurons with low firing rates can spuriously acquire a high value of BII simply due to random fluctuations of an otherwise weak response. Second, and conveniently, the frequency distribution of the DDI is close to Gaussian. Third, and most importantly, the DDI is a better indicator of whether disparity-induced modulation is statistically reliable. For all these reasons, it is more appropriate to use the DDI when examining the correlation of disparity selectivity with other neuronal properties. Figure 3 presents examples of disparity tuning curves with low (*A*), medium (*B*), and high (*C*) disparity tuning indices. Note that error bars on these plots represent the SDs of the firing rate. Across the entire population, the order of DDI values accorded well with judgments by eye of the strength of disparity tuning.

Figure 2 shows the distributions of both measures of disparity selectivity (BII and DDI) for the whole population (787 neurons). There is no evidence of two distinct populations despite the large sample. Rather there is a continuum in the strength of disparity tuning. A similar result using gratings has been reported both in the cat (Ohzawa and Freeman 1986a,b) and the monkey (Smith et al. 1997b). We also quantified disparity tuning in a number of other ways, including the maximum rate of change of firing with disparity and the *F* ratio from a one-way ANOVA. None of these showed a separation into two populations.

Because the distribution of disparity selectivity is unimodal, the proportion of neurons that are deemed to be disparity selective will depend on the criterion used. For example, 378/787 (48%) neurons showed significant modulation at the 5% level on a Kruskal-Wallis test. On the other hand, if a one-way ANOVA is used, 431/787 (55%) of our V1 neurons were significant at the 5% level. Figure 2 shows that the DDI is quite closely related to the statistical classification: values of DDI smaller than 0.4 are very rarely the result of significant modulation, and values of DDI >0.6 are almost invariably the result of significant modulation. Even so, some cases where the DDI is >0.6 and statistically significant actually represent very weak tuning (see Fig. 3*B*). With a more stringent criterion that accepts values of DDI >0.8, most tuning curves were strongly modulated by disparity and could be reliably quantified at later stages.
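For reference, the *F* ratio of a one-way ANOVA with disparity as the factor can be sketched as follows. This is the textbook between-groups to within-groups mean-square ratio; the data and function name are hypothetical, and the conversion to a *P* value (which needs the *F* distribution) is omitted.

```python
def one_way_anova_f(groups):
    """F ratio: between-group mean square over within-group mean square."""
    n_total = sum(len(g) for g in groups)
    k = len(groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


# Hypothetical square-rooted rates at two disparities, two trials each.
f_ratio = one_way_anova_f([[1.0, 3.0], [5.0, 7.0]])
# ss_between = 16 (1 d.f.), ss_within = 4 (2 d.f.), so F = 16 / 2 = 8
```

Each inner list holds the trials recorded at one disparity, so the ratio grows when the differences between disparities are large relative to the trial-to-trial scatter.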

### Comparing selectivity for disparity with other neuronal properties

The preceding section establishes the DDI as a valid measure of the strength of disparity tuning. Figure 4 shows the relationship between the DDI and other neuronal properties.

Figure 4*A* shows a weak, but significant, positive correlation (*r*_{s} = 0.23, *P* ≤ 0.002) between the DDI and direction selectivity. Inspection of the plot indicates that it is uncommon for neurons to show a combination of strong direction selectivity and weak disparity sensitivity. Figure 4*B* plots the disparity discrimination index as a function of the degree of monocularity. There is no tendency for neurons that respond equally to monocular stimulation in each eye to exhibit a greater sensitivity to disparity, again in agreement with earlier quantitative data from anesthetized animals. Although Smith et al. (1997b) found that simple cells with ocular imbalance tended to have lower disparity sensitivity than those that were balanced, they found no relationship for complex cells. In area 17 of the cat, Ohzawa and Freeman (1986b) reported that the degree of binocular interaction in simple cells did not depend on their ocular dominance.

Figure 4*D* shows no correlation between disparity selectivity and preferred orientation: cells that are highly sensitive to disparity are found at all orientations. In practice, only a modest relationship between orientation preference and disparity sensitivity in random-dot stereograms is predicted by the binocular energy model (see appendix and Fig. 12). Even this is not observed in our dataset. The correlation between orientation bandwidth and DDI in Fig. 4*C* is weak compared with that found by Smith et al. (1997b) and not statistically significant (*r*_{s} = −0.09, *P* < 0.07). This difference may reflect the fact that Smith et al. (1997b) used the BII, a measure that depends on mean firing rate. Indeed in our data, a stronger, statistically significant correlation was found between the BII and orientation bandwidth (*r*_{s} = −0.13, *P* < 0.012). However, we also found that orientation bandwidth was negatively correlated with the mean firing rate.

Figure 4*E* indicates that there is no tendency for disparity selectivity to vary with the preferred stimulus spatial frequency (*r*_{s} = 0.07, n.s.). Figure 4*F* shows that there is no tendency for the strength of disparity selectivity to change as a function of eccentricity (*r*_{s} = 0.03, n.s.). Indeed, the general lack of structure in these data sets is a little surprising at first glance. As a precaution, we re-examined all of these relationships after setting a tighter criterion on the average firing rate achieved at the most preferred disparity on the tuning curve. Previously, this had been 10 impulses/s (see methods). Raising this value to 40 impulses/s did not substantially alter the conclusions.

We also examined the relationship between disparity sensitivity and classification as simple or complex. Cells were classified by the method of Cumming et al. (1999) (see methods). Of 226 neurons to which this analysis could be applied, 57 were classified as simple. It has previously been claimed (e.g., Poggio et al. 1985) that simple cells do not respond to random-dot stereograms and also rarely show disparity selectivity to RDS. It is important to consider these two issues separately. There is a significant negative correlation between mean firing rate to RDS patterns and F1:F0 ratio (*r*_{s} = −0.377, *P* ≪ 0.0001), confirming that simple cells tend to have lower firing rates on average than complex cells in response to RDS. This is unsurprising because simple cells can only respond to those dot patterns that happen to match their monocular phase preferences. Complex cells may be stimulated by all dot patterns. As a consequence of the relationship between the F1:F0 ratio and the response rates to RDS, it is important to use a measure of disparity selectivity that is not influenced by mean firing rate when comparing simple and complex cells. The DDI has this property, and we found no relationship between the F1:F0 ratio and DDI (*r*_{s} = 0.026, n.s.). Figure 5*C* shows an example tuning curve from one simple cell that is strongly selective for disparity in these random-dot stereograms. We conclude that both simple and complex cells respond to dynamic random-dot stereograms and vary their response as a function of the disparity of such patterns.

For many cells, disparity tuning was also measured for drifting sinusoidal gratings as a function of the interocular phase difference. The spatial frequency, temporal frequency, orientation and drift direction of these gratings were matched to the preferred values for each unit. The disparity discrimination index for sinusoidal gratings was significantly correlated with the disparity discrimination index for random-dot stereograms (*r*_{s} = 0.28; *P* ≤ 0.00015, *n* = 176). One reason why this correlation might be less than perfect is that if the spatial properties of the grating or the RDS pattern are not optimal for the neuron, this may limit the DDI. A second reason is that mis-sampling in either tuning curve inevitably affects the value of DDI that is measured. Nonetheless, neurons that exhibit disparity tuning to dynamic random-dot patterns also generally exhibit tuning to sinusoidal grating stimuli.

### Description of disparity tuning curves

To summarize the population of disparity tuning curves, it is useful to fit analytic functions to the disparity tuning data and examine the resulting parameters. The energy model predicts that the disparity tuning profile for dynamic random-dot stereograms will take the form of the horizontal cross-correlation between the left- and right-eye receptive field shapes (see appendix). We did not measure directly the shape of the monocular receptive fields. Previous measures of this property have concluded that a reasonable description of the monocular receptive fields can be delivered by Gabor functions, which consist of a sinusoid multiplied by a Gaussian envelope (Daugman 1985; Marcelja 1980; but see also Hawken and Parker 1987). Under these circumstances, the energy model predicts that the disparity tuning functions should be well described by a Gabor function, which is the function we therefore fit to the data (as described in methods).

The requirements for this stage of the analysis were that neurons should be strongly modulated by disparity and their modulation should have been reliably characterized. The DDI provides a good estimate of the extent to which neuronal activity is modulated by disparity, but it provides no statistical assurance that there was reliable stimulus-related variance within the tuning profile as a whole. We used another metric to select neurons for further analysis, the *F*_{index}, given by Equation 7, where MS_{error} and MS_{treatment} are the familiar “mean-square” terms from a one-way ANOVA (performed on ). MS_{error} is the mean “within disparity variance,” and MS_{treatment} is the mean “between disparity variance.” The *F*_{index} is maximized when the mean firing rate exhibits consistent stimulus-related changes at several different disparities. Hence, this criterion rejects some cells that had a high DDI but inadequate sampling of their tuning curves. Although the DDI provides the best estimate of how strongly disparity modulates a neuron's activity, the *F*_{index} identifies more accurately neurons that had both strong modulation *and* a reliable characterization of their tuning function: strong modulation by disparity is essentially a property of the neuron itself, whereas the reliable characterization of the tuning function is more an indicator of the quality of the experimental data gathered on a particular neuron. Using the *F*_{index} captures both of these criteria, so neurons that had an *F*_{index} of >0.8 were admitted to further analysis (338/787 cells, 44%). Tuning curves that had been sampled at fewer than seven disparities were in any case rejected from further model-based analysis. This yielded 253 cells (136 from *monkey Rb* and 117 from *monkey Hg*).
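The two mean-square terms can be obtained directly from repeated measurements at each disparity. The following Python sketch is illustrative only: the function name `anova_mean_squares` and the example responses are hypothetical, and it deliberately stops short of combining the two terms, since that step is defined by Equation 7.

```python
from statistics import mean

def anova_mean_squares(groups):
    """Return (ms_treatment, ms_error) for a one-way ANOVA.

    groups: list of lists; each inner list holds the responses
    recorded on repeated trials at one disparity.
    """
    k = len(groups)                          # number of disparities
    n_total = sum(len(g) for g in groups)    # total number of trials
    grand = mean(x for g in groups for x in g)

    # Between-disparity (treatment) sum of squares
    ss_treatment = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    # Within-disparity (error) sum of squares
    ss_error = sum((x - mean(g)) ** 2 for g in groups for x in g)

    ms_treatment = ss_treatment / (k - 1)
    ms_error = ss_error / (n_total - k)
    return ms_treatment, ms_error

# Hypothetical responses at three disparities, three trials each
responses = [[10.0, 12.0, 11.0], [30.0, 28.0, 32.0], [9.0, 11.0, 10.0]]
ms_treatment, ms_error = anova_mean_squares(responses)
```

A large MS_{treatment} relative to MS_{error} indicates that the differences between disparities are consistent across trials, which is the property the selection criterion is meant to capture.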

Six disparity tuning profiles and their associated Gabor fits are illustrated in Fig. 5. The Gabor function described the disparity tuning data extremely well: for 163/253 neurons the fit accounted for ≥90% of the variance attributable to disparity and for 233/253 neurons the fit accounted for ≥75% of the variance. Only a few disparity tuning functions were not well fit by Gabors. For most of these cases, the disparity tuning curves appeared to lack any coherent form. The only systematic deviation from the Gabor model that was noted is illustrated in Fig. 5, *E* and *F*. For these curves, the side lobes of the disparity tuning function are noticeably wider than the central peak. This is the same form of deviation from the Gabor model noted by Hawken and Parker (1987). However, its effect in this dataset of disparity tuning curves is slight. Within this dataset, the examples in Fig. 5, *E* and *F*, are relatively poor fits: 68% of all fits accounted for a larger fraction of the variance than that accounted for by the fit in Fig. 5*F*. Nonetheless, the best-fitting Gabor functions in Fig. 5, *E* and *F*, still capture the essential features of these tuning curves. In particular, the fitted phase provides an accurate reflection of the degree of symmetry in the tuning curve, and the fitted position of the Gaussian envelope describes well the center of the disparity range over which modulation occurs.
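As a rough illustration of the goodness-of-fit figures quoted above, the sketch below computes the fraction of between-disparity variance captured by a fit. It assumes that "variance attributable to disparity" means the variance of the mean responses across disparities; the function name and the numbers are hypothetical.

```python
def fraction_variance_explained(mean_rates, fitted_rates):
    """1 - (residual SS) / (SS of the mean rates about their grand mean)."""
    grand = sum(mean_rates) / len(mean_rates)
    ss_total = sum((r - grand) ** 2 for r in mean_rates)
    ss_resid = sum((r - f) ** 2 for r, f in zip(mean_rates, fitted_rates))
    return 1.0 - ss_resid / ss_total

# Hypothetical mean rates at seven disparities and the fitted values
observed = [5.0, 8.0, 20.0, 35.0, 18.0, 7.0, 5.0]
fitted = [5.5, 8.5, 19.0, 34.0, 19.0, 7.5, 4.5]
frac = fraction_variance_explained(observed, fitted)
```

With these made-up values the fit would fall in the ≥90% group.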

One parameter that needs to be interpreted with care is the fitted frequency. Figure 6 shows an example with two different fits to the same disparity tuning data. The sinusoidal and Gaussian components of the fitted Gabors are shown separately in the *bottom two panels.* Although the fitted frequencies are very different, these combine with equally different Gaussians to produce very similar looking Gabor functions. To provide a fitted measure that fairly reflected the spatial scale of the modulation in disparity, we adopted a procedure in which the frequency of the Gabor function was derived directly from the data. We examined the Fourier spectra of the disparity tuning curves after the DC component had been removed (Fig. 6). The frequency of the sinusoidal component of the Gabor was set equal to the Fourier component with the greatest energy, which we term the “disparity frequency” of the tuning curve (following Ohzawa et al. 1997). The remaining five parameters of the Gabor curve were then refit. In all cases the shape of the new fit was extremely similar to the previous fit. However, the “disparity frequency” corresponded much better with the scale of the tuning curves as assessed “by eye.”
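The disparity-frequency procedure can be sketched as follows. This is a minimal illustration rather than the analysis code: the function name, the frequency grid (`f_max`, `n_freqs`) and the synthetic tuning curve are assumptions, and the Fourier transform is simply evaluated on a fine grid of candidate frequencies.

```python
import cmath
import math

def disparity_frequency(disparities, rates, f_max=4.0, n_freqs=400):
    """Frequency (cycles/deg) of the Fourier component with the greatest
    energy, after removing the DC component of the tuning curve."""
    dc = sum(rates) / len(rates)
    centred = [r - dc for r in rates]        # remove the DC component
    best_f, best_energy = 0.0, -1.0
    for i in range(1, n_freqs + 1):          # skip f = 0
        f = f_max * i / n_freqs
        coeff = sum(c * cmath.exp(-2j * math.pi * f * d)
                    for c, d in zip(centred, disparities))
        energy = abs(coeff) ** 2
        if energy > best_energy:
            best_f, best_energy = f, energy
    return best_f

# Synthetic Gabor-shaped tuning curve: 1 cycle/deg carrier, SD 0.4 deg
disparities = [-1.0 + 0.1 * i for i in range(21)]
rates = [20.0 + 15.0 * math.exp(-d * d / (2 * 0.4 ** 2))
         * math.cos(2 * math.pi * 1.0 * d) for d in disparities]
peak = disparity_frequency(disparities, rates)
```

For this synthetic curve the peak lies close to the 1 cycle/deg carrier. Once the frequency is fixed in this way, only the remaining Gabor parameters need to be refit.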

We also investigated whether a more economical model than a Gabor function would suffice. Many curves could be well described by a four-parameter Gaussian model. We found that for 142/253 cells, the addition of the frequency (*f*) and phase (*φ*) components for the Gabor model did not provide a significant improvement in the fit (sequential *F* test, *P* ≥ 0.05; see methods for details). Fig. 7, *A* and *B*, demonstrates two curves that are well described by Gaussian functions. There is no suggestion of a sinusoidal component in these disparity tuning curves. Figure 7*C* presents an example of a borderline case, in which the Gabor model is significantly better than a Gaussian model but only at the 5% level. The disparity tuning curves in Figs. 1*A* and 5*A* provide examples where Gaussian models are insufficient. Note that a failure to demonstrate statistically that a Gabor fit is necessary does not guarantee that the underlying tuning is truly Gaussian, only that we cannot reject that possibility. If the underlying Gabor shape had relatively shallow side lobes (i.e., the frequency is low relative to the SD), then our sampling may not have been fine enough to detect these reliably.
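The logic of a sequential *F* test for nested models can be sketched as below. This assumes the standard form based on residual sums of squares; the details given in methods may differ, and the function name and example numbers are hypothetical.

```python
def sequential_f(ss_reduced, ss_full, p_reduced, p_full, n_points):
    """F statistic comparing a reduced model (e.g. 4-parameter Gaussian)
    with a full model (e.g. 6-parameter Gabor) fitted to n_points values.

    ss_reduced, ss_full: residual sums of squares of the two fits.
    Returns (F, numerator df, denominator df).
    """
    df_num = p_full - p_reduced
    df_den = n_points - p_full
    f_stat = ((ss_reduced - ss_full) / df_num) / (ss_full / df_den)
    return f_stat, df_num, df_den

# Hypothetical residuals for a tuning curve sampled at 11 disparities
f_stat, df_num, df_den = sequential_f(
    ss_reduced=120.0, ss_full=60.0, p_reduced=4, p_full=6, n_points=11)
```

In this made-up case *F* = 2.5 on (2, 5) degrees of freedom, below the tabulated 5% critical value of about 5.79, so the extra frequency and phase parameters would not be justified and the Gaussian would be retained.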

For neurons that are adequately described by Gaussian tuning profiles, the frequency term in *Eq. 4* is poorly constrained: provided the period of the cosine term is large relative to the SD of the Gaussian, it has little influence. This does not mean that the measure of “disparity frequency” is uninterpretable. Somewhat paradoxically, Gaussian tuning curves give rise to a peak “disparity frequency” in the continuous Fourier transform (see Fig. 6*D*) of the disparity tuning data, because the DC component is removed prior to the transform. A similar issue arises with neurons whose disparity tuning is odd symmetric and broadly tuned for disparity, for which a low-frequency, odd-symmetric Gabor is the most appropriate functional form. The disparity frequency gives a consistent measure of the spatial scale of the disparity modulation that can be applied to both Gabor-shaped and Gaussian tuning curves. For this reason, all subsequent analysis is performed on Gabor fits in which the frequency term was not a free parameter, but was set to the disparity frequency.

For those cells where the Gaussian model is sufficient, the Gabor still yields an equally good description even though some of the parameters of the Gabor are poorly constrained. Two important parameters are still well constrained, even in these cases. The first is the horizontal position term. The curve in Fig. 7*A* is Gaussian in shape and also requires a nonzero position term. The second is the phase term, which is inevitably near zero or *π*, because the Gaussian shape is symmetrical. When applying population analyses to the shape of tuning curves, the shape of the fitted Gabor is used even for cells where a Gaussian would have been adequate.

### Relating disparity tuning to spatial properties

The preceding section demonstrates that Gabor functions provide highly accurate descriptions of the shape of disparity tuning functions in primate V1. Given that the underlying monocular tuning curves are often well described by Gabor functions, this represents a successful prediction of the energy model. It would be difficult to reconcile the energy model with disparity tuning curves that looked very different from Gabor functions. Of course, this does not demonstrate that the shape of each disparity tuning function is explained by the monocular RF structure of that particular cell. Because we did not measure monocular line-weighting functions, we rely on other measures to test the link between the spatial properties of the RF and the disparity tuning function.

Before attempting such an analysis, it is important to restrict the sample to neurons that are well tuned and well described by the Gabor fits. As described in the preceding text, we initially fit Gabor functions only to the 253 neurons with *F*_{index} > 0.8 that had been sampled at seven or more disparities. Next, we rejected 13 cells for which the Gabor fit accounted for <75% of the between-disparities variance. Also, we excluded 48 cells for which our measurements of disparity tuning covered <2 SDs of the fitted Gabor (which is to say the sampled disparities did not adequately constrain the fit). All of these cases also had a minimum of three samples per period of the disparity frequency (i.e., the sampling was always above the Nyquist limit). Finally, we removed by hand a further 12 cells for which the Gabor did not provide a realistic description of the variation in the curve. After these refinements, 180 tuning curves remained, for which the fit of the Gabor function was both extremely good and adequately constrained. Note that this group includes tuning curves that could be described by Gaussians, as the Gabor function still provides a good fit to the data with a Gaussian form.

The Gabor energy model makes clear predictions about how the form of the disparity tuning curve should depend on preferred orientation and spatial frequency (see appendix for a detailed discussion). First, the sinusoidal component of the Gabor curve should increase in scale as the preferred orientation of the neuron moves from vertical to horizontal (when preferred spatial frequency is held constant). When the orientation approaches horizontal, the period of the sinusoid becomes very broad, and the tuning profile either becomes Gaussian or flat, depending on the original symmetry of the curve (see appendix, Fig. 12).

This prediction was tested by taking the ratio of the SD parameter, σ, to the wavelength of the sinusoid, λ = 1/*f*. This ratio, σ/λ, provides an estimate of the number of cycles in the tuning curve. As this ratio decreases, the curve becomes more nearly Gaussian. Figure 8 shows the ratio plotted as a function of the preferred orientation. There is a marked absence of points in the upper left quadrant, indicating that cells with orientations near horizontal tend to have more Gaussian tuning curves as predicted by the energy model. The relationship is statistically significant: for cells with orientations within 45° of horizontal, the spread of values for this parameter σ/λ was significantly smaller than for those with orientations within 45° of vertical (*F* test, *P* < 0.005). Nonetheless, some cells with vertically oriented receptive fields also exhibit disparity tuning curves that are well described by a Gaussian. This is what might be expected if the spatial frequency tuning of these cells was very broad.

In fact, it is notable that the mean number of cycles per SD of the Gabor envelope is only 0.25 for the whole population. Hence, the Fourier amplitude spectra of the majority of the tuning curves are effectively low-pass. This is surprising because the Gabor energy model predicts that the bandwidth of the tuning curves should strictly be narrower than that of the underlying RFs measured monocularly with sinusoidal gratings (appendix). For sinusoidal luminance gratings, primate V1 neurons recorded under anesthesia typically have spatial frequency bandwidths (full width at half height) of ∼1.5 octaves (DeValois et al. 1982). This corresponds to 0.38 cycles/SD, which is substantially larger than the majority of values in Fig. 8, even for vertically oriented cells. These data are highly suggestive of a discrepancy between the bandwidth of the disparity frequency and the tuning for sinusoidal luminance gratings. However, directly comparable data of sufficient quality were not available on sufficient neurons to evaluate this hypothesis fully.
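The conversion from an octave bandwidth to cycles per SD can be sketched as follows. The sketch assumes a Gabor with a Gaussian amplitude spectrum and a bandwidth measured as full width at half height of that amplitude spectrum; the exact convention behind the figure quoted above is not stated here, and the function name is hypothetical.

```python
import math

def cycles_per_sd(bandwidth_octaves):
    """Carrier cycles per envelope SD for a Gabor whose amplitude spectrum
    has the given full width at half height (in octaves)."""
    ratio = 2.0 ** bandwidth_octaves            # f_high / f_low
    # Half-width at half height as a fraction of the centre frequency
    rel_half_width = (ratio - 1.0) / (ratio + 1.0)
    # Spectrum SD is 1 / (2*pi*sigma); half-width = sqrt(2 ln 2) * SD
    return math.sqrt(2.0 * math.log(2.0)) / (2.0 * math.pi * rel_half_width)

value = cycles_per_sd(1.5)
```

Under these assumptions a 1.5-octave bandwidth gives roughly 0.39 cycles/SD, close to the value quoted above and well above the population mean of 0.25.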

The second prediction of the model is that the frequency at which the disparity tuning function modulates (disparity frequency) should equal the spatial frequency of a horizontal section through the RF (or RF subunits, for complex cells). This is closely related to the spatial frequency of a horizontal cross section through the preferred grating stimulus (the “horizontal frequency”). Figure 9 compares the disparity frequency and the horizontal frequency. The data are clustered around the identity line, indicating that the spatial scale of the disparity tuning curves is on average similar to that predicted by spatial properties. The correlation between disparity frequency and the horizontal frequency is also significant, but it is weak (Spearman's rank correlation coefficient, *r*_{s} = 0.36, *P* ≤ 0.01, *n* = 52). Note that the predicted relationship does not necessarily follow the identity line: the estimated disparity frequency never reaches very low values because the Gaussian envelope of the tuning curve places a lower limit on this parameter. This explains why some of the points on the left of the graph lie above the identity line. Nonetheless, this phenomenon cannot entirely explain the weakness of the correlation. Even when the analysis was restricted to tuning curves that were not Gaussian in shape, the correlation was similar. Ohzawa et al. (1997) compared the disparity frequency perpendicular to the RF orientation with the preferred spatial frequency of complex cells in cat area 17. They also found only a weak relationship, and the linear regression had a slope that was considerably less than unity.
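The horizontal frequency used in this comparison follows from simple geometry. The sketch below assumes an orientation convention in which 0° denotes horizontal bars and 90° vertical bars; the convention and the function name are assumptions made for illustration.

```python
import math

def horizontal_frequency(spatial_freq, orientation_deg):
    """Spatial frequency (cycles/deg) along a horizontal section through
    a grating of the given frequency and orientation.

    orientation_deg: 0 = horizontal bars, 90 = vertical bars (assumed).
    """
    return spatial_freq * abs(math.sin(math.radians(orientation_deg)))

vertical = horizontal_frequency(2.0, 90.0)    # full frequency
oblique = horizontal_frequency(2.0, 30.0)     # half the frequency
horizontal = horizontal_frequency(2.0, 0.0)   # no horizontal modulation
```

A vertically oriented grating modulates at its full spatial frequency along a horizontal section, whereas a horizontal grating does not modulate at all, which is why the predicted disparity frequency falls toward zero as the preferred orientation approaches horizontal.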

### Relation between monocular responses and disparity tuning

The energy model makes several predictions about the relationship between monocular and binocular responses. First, purely monocular cells, which lack excitatory input from one eye, should not be disparity selective. In fact, the extent of disparity selectivity was found to be unrelated to ocular dominance (Fig. 4*F*). Second, the response to binocular uncorrelated RDS should equal the sum of the monocular responses to RDS (appendix). We took the baseline firing of the fitted Gabor as a measure of the response to uncorrelated RDS. This baseline firing reflects the binocular response to stimuli of large disparity: because V1 RFs are small, large disparities mean that the stimulus within the RF is binocularly uncorrelated. In 18 cases, we measured the responses to binocularly uncorrelated dot patterns. These independent measures of activity were closely correlated with the fitted baseline of the Gabor function (*r*_{s} = 0.917, *P* ≪ 0.0001, *n* = 18). Thus V1 neurons calculate binocular correlation over a finite area. Moreover, the range of disparity samples was broad enough in our experiments to determine the response to uncorrelated binocular stimuli from the flanks of the disparity tuning curves. Taking the baseline firing of the fitted Gabor as a measure of the binocular response, this response was well correlated with the strength of the monocular responses, but was usually closer to their mean than to their sum (see Fig. 10*A*).

The third relationship predicted by the energy model is that the ability of changes of disparity to cause changes in the firing rate of the neuron should be determined by the strength of monocular responses (see appendix). Once again, although there is a significant correlation (Fig. 10*B*), the observed amplitude of modulation tends to be smaller than predicted. Both this observation and the previous one show that the monocular responsiveness to random-dot patterns does play an important role in determining the binocular response to RDS. This is only in partial agreement with the energy model because some additional factor (perhaps response normalization) leads to lower than predicted activity in response to binocular stimuli.
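The second and third predictions both follow from the squaring step of the energy model. As a sketch only (the symbols $L$ and $R$, denoting the zero-mean outputs of the left- and right-eye linear subunits, are introduced here for illustration; the appendix gives the full treatment):

```latex
\begin{align}
(L + R)^2 &= L^2 + R^2 + 2LR \\
\langle (L + R)^2 \rangle &= \langle L^2 \rangle + \langle R^2 \rangle
  \quad \text{when the two images are uncorrelated, so that } \langle LR \rangle = 0 \\
\left| 2\langle LR \rangle \right| &\le 2\sqrt{\langle L^2 \rangle \, \langle R^2 \rangle}
\end{align}
```

If $\langle L^2 \rangle$ and $\langle R^2 \rangle$ are identified with the monocular responses, the second line is the sum rule for uncorrelated stimuli, and the third shows that the disparity-dependent cross term scales with the product of the square roots of the monocular responses.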

### Architecture of disparity tuning

In this section, we address the question of whether there is a “functional architecture” for disparity tuning in macaque V1. Other properties of cells in the striate cortex are known to be organized in a systematic fashion, such as the columnar organization of orientation preference (Hubel and Wiesel 1977). For cat visual cortex, Blakemore (1970) proposed the existence of “constant depth” columns, in which the preferred disparity was the same for all units. However, LeVay and Voigt (1988) found that neurons with similar disparity preferences were only weakly clustered together.

A recent method used to examine the functional architecture of disparity in MT has been to compare single- and multiunit data recorded at the same site (DeAngelis and Newsome 1999). During many experiments, we recorded both a clearly isolated spike from the neuron under investigation and a mixture of unisolated spikes from nearby neurons (multiunit activity). We used this approach to compare disparity tuning curves for the multiunit data with those for the isolated single spikes. This was performed for 195 sites where we had recorded substantial multiunit activity, the distinction between the isolated unit and the multiunit spikes was extremely clear, and there was no slow drift in the number of multiunit events over time.

Figure 11, *left*, shows a plot of the disparity discrimination index for single-unit data as a function of the same parameter for the multiunit data. These are significantly correlated (*r* = 0.36, *P* ≪ 0.0001, *n* = 195), suggesting that disparity-selective neurons are to some extent clustered together as if there is a columnar organization. This is not merely a consequence of the functional architecture for ocular dominance: the disparity discrimination index was found to be independent of the ocular dominance for both the single-unit data (see Fig. 4*F*) and the multiunit data (not shown). The plot also shows data from a study of disparity tuning in visual area MT (DeAngelis and Newsome 1999), which has been re-analyzed using the same method. Cells in MT generally exhibit much stronger disparity tuning. Moreover, the correlation between the degree of disparity tuning in single- and multiunit data is considerably stronger in MT (*r* = 0.61, *P* ≪ 0.0001) than in V1.

Given that disparity tuning tends to be present in the multiunit data when the single unit is itself disparity tuned, it is appropriate to consider whether the disparity tuning profile of nearby neurons tends to be similar. To maintain compatibility with the work of DeAngelis and Newsome (1999), we selected cells for which both the single- and multiunit responses were significantly modulated by disparity (ANOVA, *P* < 0.05). We then fit cubic splines to each response profile and took the position of the maximum as a measure of the preferred disparity. Figure 11, *right*, shows the preferred disparity for the single-unit data plotted as a function of the preferred disparity of the multiunit data. There is a weak correlation between the preferred disparities, demonstrating that the organization of disparity sensitivity in V1 is not random. However, the correlation is much weaker than the one found in MT (DeAngelis and Newsome 1999).

As a check on the validity of our procedures, we carried out a similar analysis that compared the orientation preferences of single units in V1 against multiunit data from the same site. As expected, the relationship here was much stronger than that for disparity. For 106 recording sites tested with sinusoidal gratings, the correlation was 0.972, and for 102 recording sites tested with sweeping bar stimuli, the correlation was 0.870 (both values highly significant). We conclude that if there is a correlation between the preferred disparity of nearby neurons, it is considerably weaker than the correlation in area MT (DeAngelis and Newsome 1999) and also much weaker than the organization for orientation in V1.

## DISCUSSION

We have carried out a quantitative analysis of disparity selectivity in V1 of the awake monkey. There is a continuum in the degree of disparity selectivity with no evidence of distinct populations of disparity-selective and -insensitive neurons. This conclusion is reinforced by the observation that disparity selectivity is largely uncorrelated with other cell properties (including preferred orientation and ocular dominance). The only parameter that correlates significantly with disparity selectivity is direction selectivity, and even this correlation is modest. Our quantitative analysis of a large population of neurons tested with RDS in the awake animal broadly agrees with earlier studies from anesthetized animals using grating stimuli.

We selected dynamic RDS for measurement of the tuning functions for horizontal disparity. These stimuli isolate sensitivity to binocular disparity from any responses to the monocular stimulus. Also, dynamic random-dot stimuli allow the measurement of sensitivity for horizontal disparity regardless of the unit's preferred orientation. This reveals that neurons may exhibit tuning for horizontal disparity irrespective of their orientation preference.

Some authors have advanced a priori grounds for expecting a relationship between the selectivities for orientation and horizontal disparity. Gonzalez and Perez (1998) state “since horizontal edges do not produce horizontal disparity, units sensitive to horizontal disparities are expected to have predominantly vertical orientation preference” (p. 215). Similarly, Anzai et al. (1999c) suggest that “because neurons in the striate cortex respond best to stimuli which are elongated along the RF orientation, they can encode disparity in the direction orthogonal to, but not parallel to the RF orientation” (p. 884). These statements are true of oriented stimuli: applying a horizontal disparity to a horizontal bar or grating does indeed produce no change in the portion of the stimulus overlapping the RF, so disparity cannot in principle be signaled. However, the same logic does not apply to oriented RF structures. With orientation broadband stimuli, such as RDS, even neurons that prefer horizontal orientations can signal horizontal disparity. The appendix demonstrates this point theoretically for the energy model, and Fig. 4*D* shows experimentally that it is true for V1 neurons. Our findings agree with those of other quantitative studies, which report no correlation between the BII and preferred orientation (Ohzawa and Freeman 1986a,b; Smith et al. 1997b).

For all of these comparisons, we have employed a new metric to assess the extent of disparity selectivity. There is no consensus about how disparity selectivity should be measured, but the most commonly used measure has been the BII (e.g., Ohzawa and Freeman 1986b; Smith et al. 1997b). A drawback of that measure is its sensitivity to response variability. Consider two neurons with the same BII, one whose firing rate was modulated from 10 to 15 imp/s by disparity and the other from 100 to 150 imp/s. As the variance of neuronal firing is proportional to mean spike counts, the coefficient of variation (SD divided by mean count) actually decreases with the mean. Hence, the statistical discriminability between the higher firing rates is greater than between the lower rates. So in a statistical sense the neuron with the higher firing rates would actually carry more information about disparity. Furthermore, neurons with a highly variable firing rate will tend to give rise to larger values of the BII simply because the random variations produce a larger difference between the maximum and minimum firing rates. Finally, a BII of 1 can only occur when the lowest firing rate is zero. These factors give rise to an inverse correlation between BII and mean firing rate (Fig. 2*B*).
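The two-neuron example can be made concrete with a short sketch. It assumes the conventional form of the BII, (max − min)/(max + min), and treats the rates as spike counts whose variance equals `fano` times the mean; the `discriminability` function is a simple illustration and is not the DDI.

```python
import math

def bii(r_max, r_min):
    """Binocular interaction index in its conventional form (assumed)."""
    return (r_max - r_min) / (r_max + r_min)

def discriminability(r_max, r_min, fano=1.0):
    """Difference in mean counts divided by the SD expected when the
    variance is proportional to the mean (variance = fano * mean)."""
    sd = math.sqrt(fano * (r_max + r_min) / 2.0)
    return (r_max - r_min) / sd

low = (15.0, 10.0)      # neuron modulated from 10 to 15 imp/s
high = (150.0, 100.0)   # neuron modulated from 100 to 150 imp/s

bii_low, bii_high = bii(*low), bii(*high)
d_low, d_high = discriminability(*low), discriminability(*high)
```

Both neurons have a BII of 0.2, yet under this assumption the extremes of the tuning curve are roughly three times more discriminable for the neuron with the higher firing rates.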

In this paper, we have developed an alternative measure (DDI) related to the discriminability of the highest and lowest points on the disparity tuning curve. This measure more accurately reflects the disparity information carried by the neuron (see also Britten et al. 1992; Prince et al. 2000), and we find that this measure is not correlated with the mean firing rate. Of course, all indices need to be interpreted with some care because the value of the index can be affected by the choice of other stimulus parameters. For example, a reduction in contrast that reduces the firing rate for all conditions will have different effects on the BII and the DDI. Broadly viewed, the overall pattern of results with the DDI is similar to that obtained using the BII: the lack of correlation between most RF properties and the strength of disparity tuning is largely independent of the metric used.

### Functional architecture of disparity tuning in V1

Our comparison of single- and multiunit data provides evidence for a functional clustering for disparity tuning in V1. If the single unit is tuned for disparity, the multiunit response is more likely to be disparity tuned than predicted by chance. Nearby cells also show a weak tendency to exhibit similar tuning profiles. These data are compatible with any form of clustering, either laminar or columnar. Furthermore, the clustering is much less clear than that demonstrated in area MT by DeAngelis and Newsome (1999). That study also showed a systematic relationship between the disparity selectivity of nearby sites, which our electrode penetrations have not allowed us to examine in V1. However, the weak correlation between multiunit tuning and single-unit tuning means that any such map could not be as consistent as that in MT. Our data suggest the same conclusion advanced by LeVay and Voigt (1988) for a mixed population of cells from areas 17 and 18 of the cat: any columnar architecture for disparity is weak.

### Assessment of the energy model

In simple cells of the anesthetized cat, Anzai et al. (1999a) recently showed that the shape of the monocular RF largely explained the shape of the disparity selectivity as predicted by the energy model. In agreement with earlier studies, the shapes of the monocular RFs were well described by Gabor functions. If one assumes that complex cells are derived from Gabor-shaped subunits, then the energy model makes many predictions about the form of disparity tuning curves in both simple and complex cells. This model (which we call the “Gabor energy” model) provides the most complete description of how disparity selectivity might be produced in simple and complex cells in the primary visual cortex. In a series of papers, Ohzawa and colleagues have demonstrated that there is considerable evidence for this type of processing in the anesthetized cat (Anzai et al. 1999a–c; Ohzawa et al. 1996, 1997). Several studies in the macaque have also yielded data compatible with the Gabor energy model. Smith et al. (1997a) demonstrated that summation in the monocular subunits must be linear by varying stimulus contrast. Cumming and Parker (1997) demonstrated that tuning curves invert when random-dot stereograms are contrast-reversed in one eye, as Ohzawa et al. (1990) had shown using contrast-reversed bar stimuli in the cat. Livingstone and Tsao (1999) reported a similar result in the awake macaque.

This paper presents the analysis of a substantial dataset from the awake macaque, which has been compared with the predictions of the binocular energy model. The disparity tuning profile is generally well described by a Gabor function, as in the anesthetized cat (Ohzawa et al. 1996, 1997). The form of this tuning changes as a function of the neuron's preferred orientation: as this approaches horizontal, the wavelength of the sinusoidal component of the disparity tuning profile increases. Hence, the profile is dominated by the Gaussian envelope, and we find that the curves become well described by a Gaussian function. Thus even the simplest form of the energy model (Ohzawa et al. 1990) provides a good quantitative account of many aspects of disparity selectivity to random-dot stereograms in the awake animal.

### Discrepancies with the energy model

Nonetheless, several observations are incompatible with the disparity energy model in its simplest form. Some of these discrepancies might readily be accommodated without altering the principal features of the model. One example is the comparison of the responses to monocular random-dot patterns and to binocular uncorrelated dots. The energy model suggests that the binocular response will be the sum of the monocular responses. The binocular data lie closer to the mean of the monocular responses. Therefore the response to binocular uncorrelated dots is usually *smaller* than the larger monocular response. This implies that if a pattern of random dots is shown to the dominant eye alone, presenting uncorrelated dots to the nondominant eye should lead to a reduction in the response, which is clearly not a straightforward additive interaction. However, incorporation of a binocular response normalization (Fleet et al. 1996a) could reconcile the data with the main features of the disparity energy model.

Another readily explicable discrepancy concerns the degree of disparity sensitivity. Like many earlier studies, our results demonstrate that the degree of tuning is continuously variable. To apply the energy model to a large population of real neurons, it must accommodate the presence of binocular cells with weak disparity tuning. One approach is to apply different weights to the left- and right-eye inputs; this reduces the predicted extent of modulation. Two observations suggest this explanation is insufficient. First, this explanation implies a relationship between disparity tuning and ocular dominance index that is not present in these data (Fig. 4*F*) or those of earlier studies (LeVay and Voigt 1988; Smith et al. 1997b). Second, the simple form of the Gabor energy model proposed by Ohzawa et al. (1990) predicts that the fitted amplitude of the Gabor should equal the product of the square roots of the monocular responses (see appendix), and this is not found (see Fig. 10*B*).
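The effect of unequal monocular weights on the predicted modulation can be made explicit with a short worked expression. This is a sketch under stated assumptions: the gains $a$ and $b$, the linear filter outputs $u_l$ and $u_r$, the common monocular energy $C^2$, and the normalized interocular correlation $\rho(d)$ are illustrative symbols, not quantities defined in the text, and $\rho(d)$ is assumed to span the full range from $-1$ to $1$.

```latex
\begin{align}
E(d) &= \left\langle (a\,u_l + b\,u_r)^2 \right\rangle
      = (a^2 + b^2)\,C^2 + 2ab\,C^2\,\rho(d),
      \qquad \langle u_l^2 \rangle = \langle u_r^2 \rangle = C^2 \\
\frac{E_{\max} - E_{\min}}{E_{\max} + E_{\min}}
     &= \frac{(a+b)^2 - (a-b)^2}{(a+b)^2 + (a-b)^2}
      = \frac{2ab}{a^2 + b^2} \le 1
\end{align}
```

Under these assumptions, equal weights give a relative modulation of 1, a 2:1 weight ratio gives 0.8, and a 4:1 ratio gives about 0.47, so weak tuning would require markedly unbalanced monocular inputs.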

Weak disparity tuning could also result from the energy model if either the RF properties are not matched in the left and right eyes or a number of subunits with different disparity preferences are inappropriately combined. Either of these modifications yields model cells with weaker disparity selectivity. Both would also produce smaller modulations in firing rate than predicted from the monocular response levels (as shown in Fig. 10*B*). Nonoptimal combination of the monocular subunits might also partially divorce the disparity tuning profile from the spatial frequency and orientation preference. This could explain why we, like Ohzawa et al. (1997), find only a weak relationship between the frequency component of the disparity tuning profile and the preferred grating stimulus.

Although these modifications would create additional complications, they also seem plausible, given that real complex cells probably receive inputs from more than four subunits. An extreme form of this hypothesis is that the subunit combination targeted onto complex cells is essentially random: those cells that are strongly disparity tuned are those where the combination has resulted (by chance) in appropriately matched subunits. One argument against this extreme view is the predominance of cells with symmetric (TE/TI) type tuning (see Prince et al. 2002).

Other features of our data point to a discrepancy that could never be explained by such simple modifications. The energy model predicts that the shape of the disparity tuning is determined by the cross-correlation function between the RF subunits in the two eyes (see appendix). This predicts that the frequency bandwidth of the Gabor describing the disparity tuning curve should be narrower than the spatial frequency bandwidth of the neuron measured with luminance gratings. However, we find that the frequency bandwidth of the fitted Gabors is very broad, and in many cases, a Gaussian curve provides a statistically adequate fit. Unfortunately, our measures of spatial frequency selectivity for sinusoidal luminance gratings did not generally provide a reliable measure of bandwidth, so this property of the disparity tuning curves must be compared with measures of spatial frequency bandwidth from other studies (DeValois et al. 1982). Our data could equally be explained if disparity-selective neurons tended to have broader than average spatial frequency bandwidths. More densely sampled data will be required to resolve this issue.

A further substantial deviation from the energy model, noted in previous studies, is that the responses to stimuli of opposite contrast in the two eyes show weaker modulation than same contrast stimuli (Cumming and Parker 1997; Livingstone and Tsao 1999; Ohzawa et al. 1990). The energy model predicts equal modulations in both conditions (Cumming and Parker 1997), and this discrepancy has been interpreted by some as representing a step toward the solution of the correspondence problem (Ohzawa 1998). However, a recent modeling study shows that the weaker modulation can be predicted by simple modifications of the energy model without the need for nonlocal circuitry (Read et al. 2000).

### Summary of the energy model

Taking the earlier studies together with the data presented here, the disparity energy model appears to be a good, basic description of disparity selective neurons in V1. However, in its simplest form, it cannot account for all aspects of disparity tuning data in the striate cortex. Further modeling work is required to determine whether plausible modifications to the model can be reconciled with all the experimental data.

The success of Gabor functions in describing the form of the disparity tuning functions allowed further analysis in the accompanying paper (Prince et al. 2002) to answer several questions concerning the representation of disparity in V1. The distribution of fitted phase and position parameters is used to examine whether or not the categories described by Poggio and collaborators (Poggio 1995; Poggio and Fischer 1977; Poggio and Talbot 1981; Poggio et al. 1988) represent distinct groups or a continuum. These parameters also allow us to assess the contributions of phase differences and position differences in encoding disparity. Finally, the population of fitted curves is used to quantify the range of disparities that is successfully encoded by the population of V1 neurons, and we examine whether this range is related to the periodicity of each tuning curve.

### What do disparity-selective cells in V1 calculate?

There has been considerable discussion about whether cells in V1 solve the correspondence problem. Poggio et al. (1985) suggested that selectivity for disparity in dynamic random-dot stereograms indicated that the correspondence problem had been overcome. However, Cumming and Parker (1997, 2000) demonstrate that cells in V1 respond to false matches that are not perceived psychophysically.

The energy model is sensitive to the extent of correlation between left and right images after a monocular filtering operation. There are several ways in which the output of the energy model may contribute to the estimation of disparity (see Fleet et al. 1996b; Prince and Eagle 2000). Because the calculation is performed over a limited area (the RF), there will always be circumstances in which false matches can produce substantial correlations. Conversely, the similarity of V1 monocular receptive field characteristics (Ohzawa et al. 1996; Skottun and Freeman 1984) in the two eyes effectively restricts matches to similar spatial and temporal frequencies and orientations in each eye. This eliminates many potential false matches. This also implicitly favors gentle changes in disparity (cf. Burt and Julesz 1981) because responses to stimuli in which there is no interocular orientation or frequency difference are favored.

The data presented here, combined with earlier studies in the cat, indicate that the energy model accounts for most features of disparity tuning in V1 neurons. This then provides a description of what is calculated by disparity selective cells in V1. This provides a basis for understanding and evaluating the contribution of extra-striate visual areas to the processing of binocular, stereoscopic depth perception.

## Acknowledgments

We thank H. Bridge and O. Thomas for help in collecting the data, G. DeAngelis and W. Newsome for access to disparity-tuning data from area MT, and G. DeAngelis and J. Read for comments on earlier drafts of this work. B. G. Cumming was a Royal Society University Research Fellow.

This work was supported by the Wellcome Trust.

## REGRESSION AND NEURONAL DATA

Many standard statistical methods (including regression analysis) rely on the assumption of homogeneity of variance. Unfortunately, in real neuronal firing distributions, the variance of spike counts is found to be approximately proportional to the mean spike count (Dean 1981; Lee et al. 1998; Tolhurst et al. 1981), which means that the variances are not homogeneous. These count distributions are often modeled as the result of a Poisson process (e.g., Teich and Khanna 1985), in which the constant of proportionality is 1.0. If spike counts are expressed in firing rates (over a fixed period), then this constant will depend on the period over which spikes are counted, but the variance of the counts will remain proportional to the mean spike count. One way to deal with this problem is to weight samples according to the measured variance. However, reliable estimates of the variance require large samples, so these methods are hazardous when small numbers of repetitions have been used. An alternative, which we employ, is to apply a transform to the whole population that removes the relationship between mean and variance across the population (see p. 287 in Snedecor and Cochran 1989).

Suppose that the distribution of mean firing rates to a certain stimulus is described by the random variable, **x**. We wish to apply a smooth transformation to this random variable, *g*(**x**), to eliminate the dependence of the variance, σ_{x}^{2}, on the mean, η_{x}. We can approximate this transformation using a Taylor expansion

Equation A1

The expected values of the first two moments of this expansion can be shown (see Papoulis 1991) to be approximated by

Equation A2

Equation A3

Hence, for large *N,* we deduce that the distribution of the transformed variable, *g*(**x**), is approximated by the normal distribution

Equation A4

For neuronal firing distributions, the variance of firing is approximately proportional to the mean, σ_{x}^{2} ∝ η_{x}. It is hence possible to choose the function *g* such that *g*(**x**) has approximately constant variance. Specifically, if we choose the function *g*(**x**) = *x*^{1/2}, then the expression for the variance of *g*(**x**) given in *Eq. A4* will be constant.
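As a sketch of the argument, the standard delta-method forms of the expansion and its first two moments are written out below. These are textbook expressions and may differ in detail from Eqs. A1 to A4; the proportionality constant $k$ is an illustrative symbol not used in the text.

```latex
\begin{align}
g(x) &\approx g(\eta_x) + g'(\eta_x)\,(x - \eta_x)
        + \tfrac{1}{2}\,g''(\eta_x)\,(x - \eta_x)^2 \\
E\{g(x)\} &\approx g(\eta_x) + \tfrac{1}{2}\,g''(\eta_x)\,\sigma_x^2 \\
\operatorname{Var}\{g(x)\} &\approx \left[g'(\eta_x)\right]^2 \sigma_x^2 \\
\intertext{With $\sigma_x^2 = k\,\eta_x$ and $g(x) = x^{1/2}$, so that $g'(\eta_x) = 1/(2\sqrt{\eta_x})$:}
\operatorname{Var}\{g(x)\} &\approx \frac{1}{4\,\eta_x}\,k\,\eta_x = \frac{k}{4}
\end{align}
```

To this order the variance of the transformed variable no longer depends on the mean; for a Poisson process ($k = 1$ for counts) it is approximately 1/4.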

To test whether the transformation has been successful, we examined the correlation between the mean and variance of the firing rate for each neuron. As expected, the distribution of these correlation coefficients was strongly biased toward positive values [mean value of Fisher-transformed values is 0.86 ± 0.72 (SD), *t*-test *P* ≪ 0.0001]. This provides confirmation that the variance does increase with the mean firing rate and that it is appropriate to transform the data. After the square root transformation, the distribution is more nearly symmetrical about 0, although the mean of the Fisher-transformed correlation coefficients, 0.21 ± 0.68, is still significantly different from zero (*P* < 0.005, *t*-test). This suggests that our variance-stabilizing transform has largely (but not completely) removed the relationship between mean and variance. Although transformation with a slightly different exponent may be required to completely remove the relationship, the simplicity of the square root operation, and its theoretical suitability for Poisson processes, led us to use it.
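The check described above can be illustrated on simulated data. The following is a minimal sketch, assuming ideal Poisson spike counts and a hypothetical set of nine stimuli with 200 repeats each; the function name, the mean counts, and the use of a plain Pearson correlation are illustrative choices, not the analysis applied to the recorded neurons.

```python
import numpy as np

def mean_variance_correlation(counts):
    """Pearson correlation between per-stimulus mean and variance.

    counts has shape (n_stimuli, n_repeats).
    """
    means = counts.mean(axis=1)
    variances = counts.var(axis=1, ddof=1)
    return np.corrcoef(means, variances)[0, 1]

rng = np.random.default_rng(0)

# Hypothetical tuning curve: mean spike counts for 9 stimuli.
mean_counts = np.linspace(5.0, 60.0, 9)

# Simulated Poisson counts, 200 repeats per stimulus.
counts = rng.poisson(lam=mean_counts[:, None], size=(9, 200))

# Before the transform, variance grows with the mean.
r_raw = mean_variance_correlation(counts)

# After the square root transform, the variance is roughly constant
# (about 1/4 for Poisson counts), so the correlation largely disappears.
sqrt_counts = np.sqrt(counts)
r_sqrt = mean_variance_correlation(sqrt_counts)
var_sqrt = sqrt_counts.var(axis=1, ddof=1)

print(f"raw counts:  r = {r_raw:.2f}")
print(f"sqrt counts: r = {r_sqrt:.2f}, variances = {np.round(var_sqrt, 2)}")
```

For real neurons the proportionality constant need not be 1, which is one reason a residual correlation can survive the transform.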

Before applying this transform, the firing distributions around individual mean values (the response distribution for a single stimulus) also tended not to be Gaussian but exhibited a positive skew (sign test, *P* < 0.0001), as expected for a Poisson-like process. The square root operation also acts to eliminate this tendency (sign test, NS after square root transform).

We conclude that square root firing rate is a statistically superior measure of neuronal activity to firing rate itself (see Lee et al. 1998). This is intriguing in the light of the fact that cortical cells impose an output nonlinearity, often modeled by half-squaring (Heeger 1992). This may help to prevent information loss at high firing rates due to spiking mechanisms that resemble a Poisson process.

## BINOCULAR ENERGY MODEL

In this section, we examine the predictions of the energy model of Ohzawa et al. (1990) for dynamic random-dot stereograms. Ohzawa et al. (1990) propose that complex cells are constructed by summing the outputs from four disparity-selective simple cells, each sensitive to different monocular phases. Because the model complex cell is a linear sum of four simple cells, it is the disparity selectivity of those simple cells that determines the disparity selectivity of the complex cell. We therefore consider first how these simple cells achieve disparity selectivity. Simple cells are modeled as the square of the sum of the linear RF responses from each eye. If *P*_{l}(*x, y*) and *P*_{r}(*x, y*) are the left and right RF profiles, respectively, and *I*_{l}(*x, y*) and *I*_{r}(*x, y*) are the left and right images, then the simple cell response can be described

Equation B1

If we expand the squared term, we obtain

Equation B2

Here, the first two terms are the squared left and right filter responses. As such, they reflect a measure of the contrast energy passing through the left and right filters, which we denote *C*_{l}^{2} and *C*_{r}^{2}, respectively. These terms reflect a combination of the gain of the filters and the monocular contrasts of the random-dot stereogram. On average, these terms are constant, regardless of the stimulus disparity. The sensitivity to interocular correlation results from the third term, which multiplies the responses of the left and right eyes' filters. Hence, if these responses are similar this term will be positive, but if they are dissimilar it will be negative.
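One way to write the simple cell response and its expansion, following the verbal description above, is sketched below. The symbol $r_s$ for the simple cell response is an illustrative choice, and the notation may differ from Eqs. B1 and B2.

```latex
\begin{align}
r_s &= \left[ \iint P_l(x,y)\,I_l(x,y)\,dx\,dy
            + \iint P_r(x,y)\,I_r(x,y)\,dx\,dy \right]^2 \\
    &= \left[ \iint P_l\,I_l\,dx\,dy \right]^2
     + \left[ \iint P_r\,I_r\,dx\,dy \right]^2
     + 2 \left[ \iint P_l\,I_l\,dx\,dy \right]
         \left[ \iint P_r\,I_r\,dx\,dy \right]
\end{align}
```

Only the last term involves both eyes, so it is the only term whose average value can depend on stimulus disparity.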

In an ideal random-dot stereogram, the right eye's image is equal to the left eye's image after shifting by the stimulus disparity, *d*, and rescaling by the interocular contrast ratio. Hence, we can rearrange the final term

Equation B3

The final term in *Eq. B3* is closely related to the cross-correlation of the left- and right-filter profiles. In fact, it can be demonstrated that for the special case of random-dot stereograms it will be equal to the cross-correlation of the filter profiles at a given displacement, *d.* To understand why this is the case, it is necessary to consider two special properties of an ideal random pattern, *I*(*x, y*). First, the expected value of the autocorrelation function will be zero everywhere except at the origin (when the images are in register). From this property, it can be shown that the expected cross-correlation between *I*(*x, y*) convolved with filter *F*_{l} and the same pattern, *I*(*x, y*), convolved with filter *F*_{r} is proportional to the cross-correlation of the filter shapes themselves

Equation B4

where ⊗ denotes convolution and ★ denotes cross-correlation, and *C* is the image contrast. The second important property of random-dot patterns is their ergodic nature. One consequence is that for a given disparity, the expected long-term average response of any pair of filters in the convolution in *Eq. B4* will be proportional to the response of a population of filters that is sampling different spatial regions of the random pattern. This allows us to rewrite *Eq. B3* as

Equation B5

From the final term, we expect the disparity tuning curve to have a shape that depends on the cross-correlation of the left- and right-eye filters. This cross-correlation component will necessarily contain the same frequency components as the receptive fields themselves. Hence we expect the frequency of the sinusoidal component of the disparity tuning profile to relate to the monocular receptive field shape. The Fourier transform of the cross-correlation is the product of the Fourier transforms of left and right filters. If these two filters have the same amplitude spectrum, the amplitude spectrum of the cross-correlation will therefore be the square of the amplitude spectrum of the filter. This results in a narrower frequency bandwidth for the disparity tuning curve than for the monocular filters.
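The bandwidth argument can be checked numerically. The sketch below assumes a one-dimensional Gabor RF that is identical in the two eyes; the envelope width (0.25°), carrier frequency (2 cycles/deg), sampling step, and the half-height definition of bandwidth are all illustrative assumptions rather than values taken from the data.

```python
import numpy as np

def gabor(x, sigma, freq, phase=0.0):
    """1D Gabor: Gaussian envelope times a cosine carrier."""
    return np.exp(-x**2 / (2 * sigma**2)) * np.cos(2 * np.pi * freq * x + phase)

def half_height_bandwidth(freqs, amplitude):
    """Width of the frequency range where amplitude >= half its peak."""
    above = freqs[amplitude >= 0.5 * amplitude.max()]
    return above.max() - above.min()

dx = 0.01                          # sample spacing in degrees (assumed)
x = np.arange(-400, 400) * dx      # positions from -4 to 4 deg
rf = gabor(x, sigma=0.25, freq=2.0)

# Amplitude spectrum of the monocular RF.
freqs = np.fft.rfftfreq(x.size, d=dx)
rf_amplitude = np.abs(np.fft.rfft(rf))

# Cross-correlation of identical left and right RFs, and its spectrum.
xcorr = np.correlate(rf, rf, mode="same")
xcorr_amplitude = np.abs(np.fft.rfft(xcorr))

bw_rf = half_height_bandwidth(freqs, rf_amplitude)
bw_xcorr = half_height_bandwidth(freqs, xcorr_amplitude)
print(f"RF bandwidth: {bw_rf:.3f} c/deg, cross-correlation bandwidth: {bw_xcorr:.3f} c/deg")
```

For a Gaussian spectral envelope, squaring the amplitude spectrum narrows the half-height bandwidth by a factor of √2, up to the frequency resolution of the sampled spectrum.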

The discussion so far has been limited to simple cells, but the predicted responses of complex cells to random-dot stereograms are very similar. In the energy model (Ohzawa et al. 1990), a complex cell takes inputs from four simple cells tuned to the same disparity but with different absolute monocular phase selectivities. Hence the complex cell response is independent of monocular phase but selective to a specific disparity. In an idealized model, the predicted tuning curve for each of these simple cell subunits will be identical and hence the complex cell response will show the same characteristics.
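A minimal one-dimensional simulation illustrates these properties. It is a sketch under several assumptions: the random-dot pattern is replaced by binary ±1 noise along one dimension, disparity is applied as a circular shift in samples, and the four half-squared simple cells are replaced by a quadrature pair of fully squared subunits, which gives the same sum. All parameter values and function names are hypothetical.

```python
import numpy as np

def gabor(x, sigma, freq, phase):
    """1D Gabor: Gaussian envelope times a cosine carrier."""
    return np.exp(-x**2 / (2 * sigma**2)) * np.cos(2 * np.pi * freq * x + phase)

def energy_response(left_images, right_images, x, sigma=0.25, freq=2.0):
    """Mean model complex cell response over a batch of image pairs (rows)."""
    total = np.zeros(left_images.shape[0])
    for phase in (0.0, np.pi / 2):           # quadrature pair of subunits
        rf = gabor(x, sigma, freq, phase)    # identical RF in both eyes
        left = left_images @ rf              # linear left-eye filter output
        right = right_images @ rf            # linear right-eye filter output
        total += (left + right) ** 2         # squared binocular sum
    return total.mean()

rng = np.random.default_rng(1)
dx = 0.02                                    # degrees per sample (assumed)
x = np.arange(-100, 100) * dx                # positions from -2 to 2 deg
patterns = rng.choice([-1.0, 1.0], size=(4000, x.size))
independent = rng.choice([-1.0, 1.0], size=patterns.shape)
blank = np.zeros_like(patterns)

# Disparity tuning: the right image is the left image shifted by d samples.
disparities = np.arange(-40, 41, 2)
tuning = np.array([
    energy_response(patterns, np.roll(patterns, d, axis=1), x)
    for d in disparities
])

# Monocular and binocular uncorrelated conditions.
left_only = energy_response(patterns, blank, x)
right_only = energy_response(blank, patterns, x)
uncorrelated = energy_response(patterns, independent, x)

print("preferred disparity (deg):", disparities[np.argmax(tuning)] * dx)
print("uncorrelated / sum of monocular:", uncorrelated / (left_only + right_only))
```

With identical RFs in the two eyes the simulated tuning curve is symmetrical about zero disparity, and the response to uncorrelated patterns is close to the sum of the two monocular responses, as the model predicts.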

Figure B1 presents simulations of disparity tuning curves produced by model complex cells with different combinations of Gabor filters. Each column shows the simulated disparity tuning curves for three different orientations. The different columns compare the results for different disparity tuning profiles. In each case, the left- and right-eye RFs of one of the simple cell subunits are shown in the corner of the plot. Notice that the shape of the disparity tuning curve mirrors the horizontal cross-correlation of these left and right RFs. In the *left column,* the curves are symmetrical because the left- and right-eye RFs are themselves identical and symmetrical. As the orientation of the RF gets closer to horizontal, the scale of the sinusoidal component in the disparity tuning profile increases and the curve becomes more like a Gaussian curve. The *right column* presents examples for which the RFs have an interocular phase difference of 90°. Here, the difference in the RF shapes produces asymmetric tuning curves. As the orientation becomes closer to horizontal, the sinusoidal component increases in scale until the tuning curve is extinguished entirely. Vertical disparity tuning curves could similarly be constructed for these model cells. These would be expected to take the form of the vertical cross-correlation of the left and right RFs.

When a left monocular stimulus is presented, only the first term of *Eq. B5* is nonzero. Hence, we measure *C*_{l}^{2}. Similarly, we can measure *C*_{r}^{2} with a right monocular stimulus. If this simple energy model is correct, we expect the amplitude of the Gabor fit to the disparity tuning curve (2*C*_{l}*C*_{r} in *Eq. B5*) to be equal to twice the square root of the product of these terms. In response to uncorrelated dot patterns, the term 2*C*_{l}*C*_{r} will be zero on average, so the predicted response is the sum of the monocular responses. These predictions were tested in Fig. 10.
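The two predictions amount to simple arithmetic on the monocular responses. The helper below is a hypothetical illustration, assuming that the monocular responses are expressed relative to any baseline firing so that they correspond directly to *C*_{l}^{2} and *C*_{r}^{2}; the function name and the example values are not taken from the data.

```python
import math

def energy_model_predictions(left_monocular, right_monocular):
    """Predictions of the simple energy model from monocular responses.

    left_monocular and right_monocular stand for C_l^2 and C_r^2,
    assumed to be measured relative to baseline firing.
    Returns (predicted Gabor amplitude, predicted uncorrelated response).
    """
    # Amplitude of the disparity-dependent term: 2 * C_l * C_r.
    amplitude = 2.0 * math.sqrt(left_monocular * right_monocular)
    # Binocular uncorrelated response: sum of the monocular responses.
    uncorrelated = left_monocular + right_monocular
    return amplitude, uncorrelated

# Hypothetical monocular responses of 9 and 4 (arbitrary units).
amplitude, uncorrelated = energy_model_predictions(9.0, 4.0)
print(amplitude, uncorrelated)  # 12.0 13.0
```

The predicted amplitude never exceeds the predicted uncorrelated response and equals it only when the two monocular responses are equal.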

## Footnotes

Address for reprint requests: A. J. Parker, University Laboratory of Physiology, Parks Road, Oxford OX1 3PT, UK (E-mail: andrew.parker@physiol.ox.ac.uk).

- Copyright © 2002 The American Physiological Society