Abstract
Complex cells in the striate cortex exhibit extensive spatiotemporal nonlinearities, presumably due to a convergence of various subunits. Because these subunits essentially determine many aspects of a complex cell receptive field (RF), such as tuning for orientation, spatial frequency, and binocular disparity, examination of the RF properties of subunits is important for understanding functional roles of complex cells. Although monocular aspects of these subunits have been studied, little is known about their binocular properties. Using a sophisticated RF mapping technique that employs binary m-sequences, we have examined binocular interactions exhibited by complex cells in the cat’s striate cortex and the binocular RF properties of their underlying functional subunits. We find that binocular interaction RFs of complex cells exhibit subregions that are elongated along the frontoparallel axis at different binocular disparities. Therefore responses of complex cells are largely independent of monocular stimulus position or phase as long as the binocular disparity of the stimulus is kept constant. The binocular interaction RF is well described by a sum of binocular interaction RFs of underlying functional subunits, which exhibit simple cell-like RFs and a preference for different monocular phases but the same binocular disparity. For more than half of the complex cells examined, subunits of each cell are consistent with the characteristics specified by an energy model, with respect to the number of subunits as well as relationships between the subunit properties. Subunits exhibit RF binocular disparities that are largely consistent with a phase mechanism for encoding binocular disparity. These results indicate that binocular interactions of complex cells are derived from simple cell-like subunits, which exhibit multiplicative binocular interactions. Therefore binocular interactions of complex cells are also multiplicative. This suggests that complex cells compute something analogous to an interocular cross-correlation of images for a local region of visual space. The result of this computation can be used for solving the stereo correspondence problem.
INTRODUCTION
Complex cells are nonlinear computing devices. This was already apparent in Hubel and Wiesel’s original description of complex cells in the cat’s striate cortex (Hubel and Wiesel 1962). They observed that receptive fields (RFs) of complex cells generally do not show discrete on and off subregions, but appear to consist of overlapping on and off regions. The subregions, when found, do not follow the rules of summation between on (or off) subregions and antagonism between on and off subregions. In fact, complex cells respond to a stimulus regardless of its position within the RF. Otherwise, like simple cells, they exhibit selectivity to stimulus orientation. As a possible scheme for explaining complex cell RFs, Hubel and Wiesel (1962) proposed a hierarchical model in which simple cells with similar orientation preferences but different RF positions feed into a complex cell.
Because complex cells do not satisfy the principle of linear superposition, their first-order responses (e.g., responses to single bars) do not predict their RF properties such as tuning for orientation, spatial frequency, and binocular disparity. However, researchers have found that second-order responses (e.g., responses to pairs of bars) do provide useful predictions for RF properties of complex cells. For example, Movshon et al. (1978) measured two-bar interaction profiles of complex cells in the cat’s striate cortex and found that the interaction profiles along the direction perpendicular to the cells’ preferred orientations exhibit on and off subregions similar to those of simple cells (see also Baker and Cynader 1986; Gaska et al. 1994; Rybicki et al. 1972). They showed that the inverse Fourier transform of spatial frequency tuning measured with drifting sinusoidal gratings agrees well with the two-bar interaction profile (see also Gaska et al. 1994). This suggests that there are linear subunits underlying the RFs of complex cells. Later, Szulborski and Palmer (1990) measured two-dimensional profiles of the second-order interaction using a pair of small square or rectangular stimuli and found that the interaction profile consists of on and off subregions that are elongated along the axis of a cell’s preferred orientation (see also Heggelund 1981).
The second-order interaction also has been examined in the joint space-time domain (Baker and Cynader 1986; Emerson et al. 1987, 1992; Gaska et al. 1994; Movshon et al. 1978). Emerson et al. (1987, 1992) measured two-bar interactions exhibited by complex cells in the cat’s striate cortex using ternary white noise. They found that direction-selective complex cells exhibit space-time-oriented interaction profiles, indicating that underlying subunits are direction selective. Because the interaction does not depend on the positions of the two bars within the RF as long as the interspacing and time offset of the bars are kept constant, they concluded that subunits are distributed uniformly across the RF.
Ohzawa et al. (1990, 1997) examined the second-order interaction between the two eyes by measuring binocular interaction profiles of complex cells in the cat’s striate cortex with a pair of bars (1 in each eye) flashed randomly across the RF. They found that the profiles are largely independent of the monocular stimulus position. That is, complex cells respond to bars regardless of their monocular positions as long as the interocular spatial offset, i.e., binocular disparity, is kept constant (see also von der Heydt et al. 1978 for a similar observation). These results suggest that underlying subunits are binocular and share the same optimal binocular disparity (see also Ohzawa and Freeman 1986).
All of these studies indicate that complex cells are composed of subunits that are, to a first approximation, linear. Subunit RFs are strikingly similar to those of simple cells, and they seem to determine RF properties of complex cells. Therefore these results are consistent with the hierarchical model of Hubel and Wiesel (1962). However, it should be noted that these measured subunits do not necessarily represent individual afferent neurons. Because second-order interaction profiles are likely to reflect responses of multiple afferent neurons, the subunits should be regarded as functional (Emerson et al. 1987; see also Szulborski and Palmer 1990) rather than cellular units.
The above-mentioned studies also suggest that subunits that feed into a complex cell are relatively homogeneous in the sense that they share some of the same optimal stimulus parameters, such as orientation, spatial frequency, direction selectivity, and binocular disparity. However, because complex cells respond to bright and dark stimuli at the same location within the RF, on and off subregions of the subunits need to overlap extensively to make up a complex cell RF. In other words, the spatial relationship of subunit RFs must conform to one of the following conditions: the RFs are located at different positions as Hubel and Wiesel (1962) originally suggested; they are at the same position but have different spatial phases; or they are at different positions and have different spatial phases.
Various models of complex cells have been proposed using one of the spatial relationships among subunit RFs (e.g., Cavanagh 1984; Glezer et al. 1980, 1982; Pollen and Ronner 1983; Pollen et al. 1989; Spitzer and Hochstein 1985, 1988). For instance, Pollen et al. (Pollen and Ronner 1983; Pollen et al. 1989) proposed that a complex cell consists of four subunits: a pair of even- and odd-symmetric subunits (a quadrature pair) and their sign-inverted versions. This is an attractive model from a computational point of view because these four subunits are sufficient to represent a local Fourier spectrum of the stimulus (Pollen and Ronner 1982; Pollen et al. 1989) and are building blocks of what is known as an energy model (Adelson and Bergen 1985; Watson and Ahumada 1985).
A linear filter followed by a squaring device and then an integrator is called an energy detector (Green and Swets 1966). It generally is modeled as two linear bandpass filters that are in a quadrature phase relationship, with the outputs of the linear filters squared and then summed (Adelson and Bergen 1985; Watson and Ahumada 1985). The model of Pollen et al. (Pollen and Ronner 1983; Pollen et al. 1989) described in the preceding text is a more physiologically plausible variation of the energy model in that each linear filter is subdivided further into two linear filters that are sign-inverted versions of each other, and their outputs are rectified before being squared and summed. As a model for binocular complex cells, Ohzawa et al. (1990) proposed a binocular version of the energy model that responds to the stimulus energy associated with binocular disparity. The model provides a good first approximation to binocular interactions exhibited by complex cells in the cat’s striate cortex (Ohzawa et al. 1990, 1997). In fact, the second-order interactions exhibited by complex cells described earlier (Emerson et al. 1987, 1992; Movshon et al. 1978; Ohzawa et al. 1990, 1997; Szulborski and Palmer 1990; see also Baker and Cynader 1986; Gaska et al. 1994; Rybicki et al. 1972) are all, at least qualitatively, consistent with an energy model.
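The energy detector described above can be illustrated with a minimal sketch. It assumes 1D Gabor-shaped filters with arbitrary parameter values; the function names (`gabor`, `energy_response`, `four_subunit_response`) and the choice of sigma and frequency are hypothetical and are not taken from any of the cited models. The sketch shows two things: the quadrature pair gives a response that is nearly independent of the spatial phase of a grating, and the four-subunit variant with half-squaring gives the same output as the two-filter form.

```python
import math

def gabor(x, sigma, freq, phase):
    """1D Gabor profile: Gaussian envelope times a sinusoidal carrier (illustrative)."""
    return math.exp(-x * x / (2.0 * sigma * sigma)) * math.cos(2.0 * math.pi * freq * x + phase)

def linear_output(stimulus, xs, sigma, freq, phase):
    """Inner product of the stimulus with a Gabor filter (the linear stage)."""
    return sum(gabor(x, sigma, freq, phase) * s for x, s in zip(xs, stimulus))

def energy_response(stimulus, xs, sigma=1.0, freq=0.5):
    """Square and sum the outputs of an even/odd (quadrature) filter pair."""
    even = linear_output(stimulus, xs, sigma, freq, 0.0)
    odd = linear_output(stimulus, xs, sigma, freq, -math.pi / 2.0)
    return even * even + odd * odd

def four_subunit_response(stimulus, xs, sigma=1.0, freq=0.5):
    """Four half-squared subunits: a quadrature pair plus sign-inverted copies."""
    total = 0.0
    for phase in (0.0, -math.pi / 2.0):
        lin = linear_output(stimulus, xs, sigma, freq, phase)
        for sign in (1.0, -1.0):
            rectified = max(0.0, sign * lin)  # half-wave rectification
            total += rectified * rectified    # followed by squaring
    return total

# Gratings at the filters' preferred frequency but different spatial phases
xs = [i * 0.1 for i in range(-50, 51)]
responses = []
for k in range(8):
    phi = k * math.pi / 4.0
    grating = [math.cos(2.0 * math.pi * 0.5 * x + phi) for x in xs]
    responses.append(energy_response(grating, xs))
```

Because a half-squared subunit and its sign-inverted copy together reproduce the square of the linear output, the two functions agree; this is the sense in which the four-subunit scheme is a variation of the energy model rather than a different computation.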
Although the energy model has been increasingly popular for complex cells (e.g., Emerson et al. 1992; Fleet et al. 1996; Ohzawa et al. 1990, 1997; Pollen et al. 1989; Qian 1994; Qian and Zhu 1997), quantitative evaluations of the model have been limited. In particular, binocular properties of subunits that underlie complex cells have not been examined to determine if they are consistent with subunit components of an energy model.
Here, the analysis of nonlinear binocular interactions is extended to complex cells to learn about how they process binocular information. Binocular interaction RFs and monocular RFs of complex cells in the cat’s striate cortex are measured with spatiotemporal white noise generated according to binary m-sequences (Sutter 1992). Through the examination of binocular interaction RFs, the responses of complex cells to binocular disparity are described. Functional subunits that underlie individual complex cells are estimated by applying singular value decomposition (SVD) to the binocular interaction RF of each cell. To evaluate an energy model for complex cells, the number of subunits as well as the RF properties of subunits are compared with those predicted by the energy model. Phase and position disparities between left and right eye RFs of subunits also are estimated to address the issue of how complex cells encode binocular disparity. Results of these analyses provide important clues for understanding the neural computations performed by binocular complex cells and the participation of subunits in the computations. Possible functional roles of complex cells in processing binocular information are considered.
METHODS
Details of surgical and histological procedures, apparatus, and recording methods are identical to those described in the preceding papers (Anzai et al. 1999a,b). Binocular interaction RFs and monocular RFs of complex cells are measured using dichoptic one-dimensional (1D) binary m-sequence noise (for details of the stimulus configuration, see Anzai et al. 1999a). The RFs are constructed as described in Anzai et al. (1999b). The binocular interaction RFs are decomposed into those of functional subunits that underlie complex cells using the singular value decomposition (SVD). Phase and position disparities between left and right eye RFs of subunits are estimated to determine the relative contribution of the two disparities to the encoding of binocular disparity through complex cells.
SVD of binocular interaction RFs
To estimate functional subunits that underlie binocular complex cells, an SVD is performed for each cell on its binocular interaction RF at the optimal correlation delay (the delay at which the sum of squared values of all data points in the RF is maximum). The SVD is a standard technique of linear algebra (e.g., Press et al. 1992) that can be used to obtain a description of data in terms of orthogonal (quadrature) components, i.e., components that are mutually uncorrelated. The original data are described as a linear sum of the SVD components, which are ordered such that each component accounts for a progressively smaller fraction of the total variance in the data. Mathematically, the SVD is equivalent to principal component analysis.
Performed on the binocular interaction RF (B) of a complex cell, the SVD breaks the RF into a number of binocular interaction RFs, each of which represents an SVD component (see Fig. 3 for an example of SVD). The SVD components are considered subunits of the complex cell. However, it should be noted that the SVD components do not necessarily represent actual afferent neurons underlying the cell. Rather they are likely to represent a combination of multiple afferent neurons. Therefore they should be regarded as functional subunits.
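The binocular interaction RF of each SVD component is described by the product of left (L) and right (R) eye RFs, weighted by a constant (W). In matrix notation, the SVD is formulated as

\[ B = L\,W\,R^{T} \tag{1} \]

where the columns of L and R are the left and right eye RFs of the SVD components, respectively, and W is a diagonal matrix of the component weights.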
To estimate a noise level for each SVD component, the SVD also is conducted on binocular interaction RFs that contain only noise. For each cell, binocular interaction RFs are obtained at noncausal correlation delays (the delays for which the response precedes the stimulus) ranging from −45 to −240 ms at 5-ms intervals, and an SVD is performed on each RF. Then weights of the noise SVD components are averaged separately for each component order. The mean weights (e.g., Fig. 3 B, ●) represent estimated noise levels for the SVD components obtained from the RF at the optimal correlation delay.
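The decomposition and the noise-level estimate described in this section can be sketched as follows. This is an illustration under stated assumptions, not the analysis code used for the data: the RF is synthetic (two Gabor-like rank-1 terms plus Gaussian noise), the helper names are hypothetical, and the criterion of exceeding twice the mean noise weight is an arbitrary stand-in for "significantly above" the noise. The 40 noise-only RFs correspond to the number of noncausal delays between −45 and −240 ms at 5-ms intervals.

```python
import numpy as np

def svd_components(B):
    """Return left eye RFs (columns of L), weights w, and right eye RFs (columns of R)."""
    L, w, Rt = np.linalg.svd(B, full_matrices=False)
    return L, w, Rt.T

def mean_noise_weights(noise_rfs):
    """Average the weights of noise-only RFs separately for each component order."""
    return np.mean([np.linalg.svd(N, compute_uv=False) for N in noise_rfs], axis=0)

def count_above_noise(w, noise_w, factor=2.0):
    """Count leading components whose weight exceeds factor * mean noise weight
    (the factor is an assumed, illustrative criterion)."""
    count = 0
    for above in w > factor * noise_w:
        if not above:
            break
        count += 1
    return count

# Synthetic binocular interaction RF: rows index left eye position, columns right eye position
rng = np.random.default_rng(0)
x = np.linspace(-2.0, 2.0, 16)
envelope = np.exp(-x**2 / 2.0)
even = envelope * np.cos(np.pi * x)
odd = envelope * np.sin(np.pi * x)
B = np.outer(even, even) + np.outer(odd, odd) + 0.02 * rng.standard_normal((16, 16))
noise_rfs = [0.02 * rng.standard_normal((16, 16)) for _ in range(40)]

L, w, R = svd_components(B)
noise_w = mean_noise_weights(noise_rfs)
variance_fraction = w**2 / np.sum(w**2)   # fraction of total variance per component
n_significant = count_above_noise(w, noise_w)
```

Each component's binocular interaction RF is the outer product `w[k] * np.outer(L[:, k], R[:, k])`, and summing all of them recovers `B`. The squared weights give the fraction of variance accounted for by each component, which is how the components are ordered.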
Estimating interocular RF disparities of SVD components
Monocular RFs of the first and second SVD components are fitted with a 1D Gabor function (see Anzai et al. 1999a for details of the fitting procedure). Then RF position and phase disparities of the first SVD component are computed for each cell by applying a reference-cell method (Anzai et al. 1999a) to the first two SVD components of the same cell rather than two different cells. An RF phase disparity of the first SVD component is obtained as the difference in RF phase between the left and right eye RFs of the component. An RF position disparity of the first SVD component is obtained as the difference in RF position between the two eyes, while left and right eye RFs of the second SVD component (a reference) are assumed to be at retinal correspondence (i.e., 0 RF position disparity). Therefore RF position disparities measured here are relative position disparities and are subjected to a statistical analysis to estimate true position disparities. See Anzai et al. (1999a) for formal definitions of the RF disparities and a statistical analysis of the RF position disparity.
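The two disparity measures can be written down compactly once Gabor fits are available. The sketch below assumes that each fit yields a phase (in deg PA) and an envelope center (in deg VA) for each eye; the function names and the left-minus-right sign convention are assumptions made for illustration, and the formal definitions are those of Anzai et al. (1999a).

```python
def wrap_deg(angle):
    """Wrap an angle in degrees to the interval [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0

def phase_disparity(left_phase_deg, right_phase_deg):
    """Interocular difference in fitted RF phase (left minus right, assumed convention)."""
    return wrap_deg(left_phase_deg - right_phase_deg)

def relative_position_disparity(comp_left_x, comp_right_x, ref_left_x, ref_right_x):
    """Position disparity of a component relative to a reference component whose
    own position disparity is taken to be zero."""
    return (comp_left_x - comp_right_x) - (ref_left_x - ref_right_x)
```

For example, `phase_disparity(170.0, -170.0)` gives −20 rather than 340, because phase differences are only meaningful modulo a full cycle. Subtracting the reference removes any common interocular offset, which is why the result is a relative rather than an absolute position disparity.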
RESULTS
Monocular RFs and binocular interaction RFs have been obtained for 64 binocular complex cells in 15 adult cats. Of these, 48 cells exhibited significant binocular interactions and are analyzed here. The remaining 16 cells showed very weak, if any, binocular interactions due to low signal-to-noise ratios (5 cells were nevertheless strongly responsive to stimulation of either eye; the other 11 cells were either ocularly unbalanced, responding almost exclusively to only one eye, or were not responsive to stimulation of either eye). These cells have been excluded from the analysis.
Examples of monocular RFs and binocular interaction RFs
Figure 1 shows examples of monocular RFs (L and R) and binocular interaction RFs (B) for six complex cells. Most complex cells respond to bright and dark stimuli at the same location in space. Therefore their monocular RFs, the responses to bright stimuli minus the responses to dark stimuli, are in general relatively flat (e.g., Fig. 1, C and F), although there are some cells that exhibit significant residual responses in monocular RFs (e.g., Fig. 1, A and D).
The binocular interaction RF is a profile of responses to stimuli of matched polarity (bright-bright and dark-dark) in the two eyes minus the responses to stimuli of mismatched polarity (bright-dark and dark-bright) in the two eyes. It represents the responses attributable to nonlinear binocular interaction. Unlike simple cells, complex cells exhibit binocular interaction RFs that are not left-right separable, as shown in Fig. 1. Instead their binocular interaction RFs consist of subregions that are elongated along the frontoparallel axis X_F. Therefore responses of complex cells are largely independent of monocular stimulus position or phase as long as the binocular disparity of the stimulus is kept constant. In other words, complex cells are truly tuned to binocular disparity. Similar observations have been made by von der Heydt et al. (1978) and Ohzawa et al. (1990, 1997). The profile along the binocular disparity axis D varies from cell to cell, suggesting that each cell has a different tuning function for binocular disparity. These binocular interaction RFs are qualitatively consistent with those predicted by a binocular disparity energy model (Ohzawa et al. 1990, 1997). In the next section, the binocular interaction RFs are examined to see if they agree quantitatively with the predictions of the energy model.
Singular value decomposition (SVD) of binocular interaction RFs
Binocular complex cells have been modeled as detectors of the stimulus energy associated with binocular disparity (Fleet et al. 1996; Ohzawa et al. 1990, 1997; Qian 1994). Figure 2 shows the structure of the model. It consists of two major units, each of which is enclosed by a dashed line in the figure. These units are said to be in quadrature, i.e., the spatial phases of monocular RFs for one unit and those for the other are 90° apart. Thus the model responds to a stimulus independent of its spatial phase. Each quadrature unit consists of two simple cell-like subunits, each of which is modeled as a linear binocular filter followed by a half-squaring nonlinearity (the structure described for simple cells in the previous paper, Anzai et al. 1999b). These subunits have monocular RF profiles that are sign-inverted versions of each other so that the model responds equally to both bright and dark bars at the same location of space.
Because outputs of the subunits are combined linearly in this model, the binocular interaction RF of the model is a sum of binocular interaction RFs for individual subunits. Therefore if binocular complex cells are consistent with the model, then one should be able to describe their binocular interaction RFs as a sum of binocular interaction RFs for subunits that are in quadrature. To test this prediction, SVD has been performed on the binocular interaction RF of each complex cell to obtain a description of the RF in terms of orthogonal (quadrature) components (see METHODS for details about the SVD). The SVD components are ordered such that each component accounts for a progressively smaller fraction of the total variance in the data. Because an energy model consists of a pair of units that are in quadrature, the model predicts that the number of SVD components necessary to represent its binocular interaction RF is two. Note that the subunits comprising each quadrature unit are not independent, but are sign-inverted versions of each other. Therefore these subunits would be represented by a single SVD component.
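The prediction of two SVD components can be checked on a simulated model cell. The sketch below is a simplified stand-in for the model of Fig. 2, under several assumptions: monocular RFs are 1D Gabor profiles with arbitrary parameters, the stimulus is one bar per eye of polarity +1 (bright) or −1 (dark), and all names are hypothetical. The binocular interaction RF is computed as matched-polarity responses minus mismatched-polarity responses.

```python
import numpy as np

def gabor(x, sigma, freq, phase):
    """1D Gabor profile used as an illustrative monocular RF."""
    return np.exp(-x**2 / (2.0 * sigma**2)) * np.cos(2.0 * np.pi * freq * x + phase)

def half_square(v):
    """Half-wave rectification followed by squaring."""
    return max(v, 0.0) ** 2

def model_response(l_vals, r_vals, p_left, p_right):
    """Sum over quadrature units of a subunit and its sign-inverted partner."""
    total = 0.0
    for l, r in zip(l_vals, r_vals):
        drive = p_left * l + p_right * r          # linear binocular filter
        total += half_square(drive) + half_square(-drive)
    return total

def interaction_rf(left_rfs, right_rfs):
    """Matched-polarity minus mismatched-polarity responses at every (left, right) position."""
    n = len(left_rfs[0])
    B = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            l_vals = [rf[i] for rf in left_rfs]
            r_vals = [rf[j] for rf in right_rfs]
            matched = model_response(l_vals, r_vals, 1, 1) + model_response(l_vals, r_vals, -1, -1)
            mismatched = model_response(l_vals, r_vals, 1, -1) + model_response(l_vals, r_vals, -1, 1)
            B[i, j] = matched - mismatched
    return B

x = np.linspace(-2.0, 2.0, 16)
dphi = np.pi / 3.0                                # assumed interocular phase disparity
left_rfs = [gabor(x, 0.8, 0.5, ph) for ph in (0.0, -np.pi / 2.0)]
right_rfs = [gabor(x, 0.8, 0.5, ph + dphi) for ph in (0.0, -np.pi / 2.0)]
B_model = interaction_rf(left_rfs, right_rfs)
weights = np.linalg.svd(B_model, compute_uv=False)
```

In this sketch the linear monocular terms cancel in the matched-minus-mismatched difference, leaving a sum of two left-right products, one per quadrature unit. Accordingly only the first two entries of `weights` are nonzero to within rounding error, and the sign-inverted partners contribute no additional component.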
Figure 3 shows an example of the SVD analysis. The monocular RFs and binocular interaction RF of the raw data are shown in Fig. 3 A. The SVD has been conducted on the binocular interaction RF to obtain 16 mutually uncorrelated components. Weights of the components are shown in Fig. 3 B (○) along with mean weights of components obtained from the SVD performed on estimated noise in the binocular interaction RF (●). Only the first two components have weights that are significantly above those of the noise SVD components. These two components account for >80% of the variance in the raw data. The percentage goes up to 95% if the variance accounted for by noise is subtracted. Binocular interaction RFs and monocular RFs of the first six components are shown in Fig. 3, C–H. The first two components exhibit monocular and binocular interaction RFs that are strikingly similar to those of simple cells (see Anzai et al. 1999b). Although the binocular interaction RF of an SVD component is, by definition (Eq. 1), the product of its left and right eye RFs, the actual shape of the RFs is determined by the data. These results are consistent with the prediction of an energy model.
However, slight deviations from the prediction also have been observed for some cells. In Fig. 4, another example of the SVD analysis is shown. As in the previous example, there are two major components (the 1st and 2nd) that account for ∼85% of the variance in the raw data. Their RFs are very much like those of simple cells. In addition to these two components, this cell also exhibits two weak but significant components (the 3rd and 4th) whose RFs do not resemble those of simple cells. Altogether, the first four components account for >95% of the total variance in the data. The existence of the third and fourth components suggests that relationships among subunits of complex cells may not be as constrained as those of an energy model. An interpretation of these extra components is considered later in the discussion.
To examine if individual cells are consistent with an energy model, the minimum number of SVD components necessary to represent the binocular interaction RF is estimated for each cell. The number is determined by dividing a plot of component weights into two portions according to the rate of change in component weight (Scree test) (Gorsuch 1983) and counting the number of components in the first portion. For example, a plot of component weights shown in Fig. 3 B consists of two parts: a quickly decreasing part (the first 2 components) and a more gradually and linearly decreasing part (the third and the rest of the components). The latter portion is virtually indistinguishable from the noise level and is not necessary to represent the binocular interaction RF. Therefore the number of SVD components for this cell is considered to be two. Likewise, the minimum number of SVD components for the cell shown in Fig. 4 is determined to be four. For most cells examined, the transition between the two portions is abrupt and quite obvious. However, some cells exhibit transitions that are gradual, and it is not immediately clear how many components these cells should be considered to have. In such cases, the latter portion is determined first as a gradually and linearly decreasing part, and the remaining part then is assigned to the first portion.
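One way to mimic this procedure numerically is sketched below. It is an assumed automation of a judgment that the text describes as being made from the plot: a straight line is fitted to the tail of the weight plot (the gradually and linearly decreasing part), and the leading components that lie clearly above the extrapolated line are counted. The function name, the tail size, and the tolerance are all hypothetical choices.

```python
import numpy as np

def scree_count(weights, tail_size=8, tolerance=3.0):
    """Count leading components lying clearly above a line fitted to the tail of the weights."""
    w = np.asarray(weights, dtype=float)
    idx = np.arange(len(w))
    tail = idx[-tail_size:]
    # Fit the linearly decreasing portion, then extrapolate it to all component orders
    slope, intercept = np.polyfit(tail, w[tail], 1)
    resid_sd = np.std(w[tail] - (slope * tail + intercept))
    threshold = slope * idx + intercept + tolerance * max(resid_sd, 1e-12)
    count = 0
    for above in w > threshold:
        if not above:
            break
        count += 1
    return count

# Illustrative weight plots: a steep first portion followed by a linear tail
two_component_cell = [3.3, 3.1] + [0.16 - 0.01 * k for k in range(14)]
four_component_cell = [3.0, 2.5, 0.8, 0.6] + [0.16 - 0.01 * k for k in range(12)]
```

With these made-up weights, `scree_count(two_component_cell)` returns 2 and `scree_count(four_component_cell)` returns 4. Fitting the tail first and assigning whatever remains to the first portion follows the order of operations described for cells with gradual transitions.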
In Fig. 5 A, a histogram of the minimum number of SVD components for the population of complex cells examined is shown. The majority (56%) of the cells exhibit two components. Therefore these cells are consistent with an energy model. Almost all of the remaining cells exhibit either three or four SVD components, indicating that binocular complex cells are composed of only a small number of functional subunits that are linearly independent.
Although the existence of the extra components is a clear deviation from the prediction of an energy model, an energy model still provides a good approximation to the data on average. Figure 5 B shows a summary of how much variance in the raw data each SVD component accounts for. Each data point is a mean value for the population of complex cells examined. The variance accounted for by noise was subtracted from the data for each cell before the mean was computed. Open circles represent percentages of the total variance accounted for by each SVD component, and open triangles indicate cumulative percentages. Error bars represent ±SD. On average, the first and second SVD components account for ∼50 and 30% of the variance in the data, respectively. Each of the remaining components contributes an average of only ∼6% or less of the total variance. Therefore in general, binocular interaction RFs of complex cells can be well approximated by the sum of binocular interaction RFs of two units that are in quadrature, i.e., the binocular interaction RF of an energy model.
Comparisons between the first and second SVD components
In addition to the number of underlying functional subunits, an energy model also makes a prediction about the RF properties of the subunits. A binocular disparity energy model assumes that the RF properties of subunits are the same except for their spatial phases, which are constrained to be in quadrature. Therefore if binocular complex cells are consistent with the energy model, then their SVD components should have all RF parameters but spatial phase in common. To examine if this prediction holds, left and right eye RFs of the first and second SVD components were fitted with a 1D Gabor function, and the center coordinate of the Gaussian envelope, envelope width, spatial frequency, and binocular phase disparity of the RFs were extracted (see Anzai et al. 1999a for a definition of the 1D Gabor function). Figure 6 shows scatter plots of the RF parameters for the second SVD components against those for the first SVD components. A slope of unity is indicated by the solid line. Most of the data points are scattered around the solid line, indicating that RF properties of the first and second SVD components are very similar. Therefore the first two SVD components of binocular complex cells are consistent with subunits of an energy model in regard to RF properties.
Interocular RF disparities of SVD components
In the first paper of this series (Anzai et al. 1999a), it is shown that the range of position disparities between the left and right eye RFs of simple cells is relatively small compared with that of RF phase disparities. This suggests that RF phase disparity plays a major role in encoding binocular disparity for simple cells. However, because RF phase disparities of cells tuned to high spatial frequencies are necessarily small in degree visual angle (deg VA), RF position disparity may still play an important role in encoding binocular disparity for those cells.
Because monocular RFs of complex cells do not exhibit structures that allow one to measure the binocular disparities of RFs, the neural mechanism through which complex cells encode binocular disparity is not well understood. Here this issue is addressed by examining RF phase and position disparities of SVD components with a reference-cell method (see METHODS; see also Anzai et al. 1999a for details of estimating RF phase and position disparities). The relative contributions of RF phase and position disparities to the encoding of binocular disparity are examined in relation to various RF parameters.
DISPARITY HISTOGRAMS.
Figure 7 shows histograms of RF phase and position disparities for the first SVD components. In Fig. 7 A, a histogram of RF phase disparity in degree phase angle (deg PA) is shown. The distribution is centered around zero, indicating that components with similar RF profiles in the two eyes are most numerous. However, there are also many components that exhibit large disparities, suggesting that their RF profiles are quite dissimilar between the two eyes. The majority of the components have RF phase disparities within ±90°. Therefore the first SVD components of most complex cells satisfy the quarter cycle limit suggested by Marr and Poggio (1979) for unambiguously encoding binocular disparity through bandpass filters.
The phase disparity histogram is replotted in Fig. 7 B in terms of deg VA so that it can be directly compared with the position disparity histogram, which is shown in Fig. 7 C. Both position and phase disparities are distributed around zero. The standard deviations of the distributions are 0.64 and 0.21 deg VA for phase and position disparities, respectively. The phase disparity distribution is clearly broader than the position disparity distribution, and this is statistically significant (F test, P < 0.01). Because the second SVD components have been used as “references” to estimate position disparities for the first SVD components, the standard deviation of the distribution for true position disparities of the first SVD components is expected to be smaller than the distribution for the measured (relative) position disparities by a factor of
RELATIONSHIP BETWEEN POSITION AND PHASE DISPARITIES.
Because the overall preference of cells for binocular disparity is determined by the sum of RF phase and position disparities, it would be interesting to know if there is a relationship between the two types of RF disparities. For example, they may always add up or they may always partially cancel each other. Figure 8 shows a scatter plot of position disparity against phase disparity. Data points are scattered widely along the phase disparity axis. Although a linear regression analysis indicates that there is a weak but significant correlation between the two disparities (P = 0.01), the correlation coefficient is only 0.36, and just 13% of the variance in the data is accounted for by the model. Therefore there is a tendency for phase and position disparities to add up, but it is only of marginal significance.
RELATIONSHIP BETWEEN DISPARITY AND RF ORIENTATION.
As described in Anzai et al. (1999a) and in previous studies (DeAngelis et al. 1991, 1995; Ohzawa et al. 1996), profiles of left and right eye RFs are relatively well matched for simple cells tuned to horizontal orientations, whereas those for cells tuned to vertical orientations are predominantly dissimilar. However, this is not the case for the first SVD components of binocular complex cells. In Fig. 9 A, magnitudes of phase disparities in deg PA are plotted as a function of RF orientation (the cell’s optimal orientation for gratings). Orientations of 0 and 90° correspond to horizontal and vertical, respectively. Phase disparities of SVD components for complex cells tuned to horizontal orientations (≤20°) are widely scattered. Because simple cells tuned to horizontal orientations do not exhibit phase disparities >90 deg PA (Anzai et al. 1999a), it seems unlikely that they can account for the large phase disparities of the SVD components at horizontal orientations. Therefore it is possible that some of the complex cells that are tuned to horizontal orientations receive non-simple cell input and are not consistent with the hierarchical model of Hubel and Wiesel (1962). Except for the large phase disparities at horizontal orientations, the data are generally comparable with those for binocular simple cells reported in Anzai et al. (1999a).
When phase disparities (○) are plotted in deg VA, the range of binocular disparity is still large for some cells tuned to horizontal orientations (Fig. 9 B). Therefore complex cells tuned to horizontal orientations can encode large vertical disparities. This is rather counterintuitive considering that the range of horizontal disparities is expected to be larger than that of vertical disparities due to the lateral displacement of the eyes. On the other hand, position disparities (●) are relatively constant across all orientations and are limited to small values. Position disparities of simple cells also are limited to small values and are independent of RF orientation, as described in Anzai et al. (1999a).
RELATIONSHIP BETWEEN DISPARITY AND RF SPATIAL FREQUENCY.
Figure 10 shows how position and phase disparities of the first SVD components depend on RF spatial frequency. In Fig. 10 A, magnitudes of phase disparities in deg PA are plotted as a function of RF spatial frequency. No obvious correlation is found in the data. Therefore whether spatial profiles of left and right eye RFs are similar or dissimilar does not depend on the RF spatial frequency of the components.
However, phase disparities in deg VA clearly show dependency on RF spatial frequency. In Fig. 10 B, magnitudes of phase disparities (○) in deg VA are plotted, together with position disparities (●), as a function of RF spatial frequency. As a reference, phase disparities equivalent to 180 and 90 deg PA are indicated by the solid and dashed lines, respectively. The data points for phase disparities are scattered below the solid line, suggesting that phase disparity encodes a wide range of binocular disparities within the limit indicated by the solid line. A regression analysis indicates that there is a tendency for phase disparity to decrease with spatial frequency (slope = −1.99, P < 0.01). This is consistent with the size-disparity correlation observed in human psychophysics (DeValois 1982; Felton et al. 1972; Kulikowski 1978; Legge and Gu 1989; Richards and Kaye 1974; Schor and Wood 1983; Schor et al. 1984a,b; Smallman and MacLeod 1994). On the other hand, position disparities are in general limited to small values and are relatively constant across spatial frequency (regression slope = −0.30, P = 0.07).
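The reference lines in Fig. 10 B follow from the conversion between phase angle and visual angle. The sketch below assumes that RF spatial frequency is expressed in cycles/deg, so that one full cycle (360 deg PA) spans 1/frequency deg VA; the function name and the example frequencies are illustrative and are not taken from the study.

```python
def phase_to_visual_angle(phase_deg_pa: float, spatial_freq_cpd: float) -> float:
    """Convert a phase disparity (deg phase angle) to deg visual angle.

    Assumes spatial frequency in cycles/deg: a full cycle (360 deg PA)
    covers 1 / spatial_freq_cpd deg of visual angle.
    """
    if spatial_freq_cpd <= 0:
        raise ValueError("spatial frequency must be positive")
    return (phase_deg_pa / 360.0) / spatial_freq_cpd


# Hypothetical example frequencies (cycles/deg), not values from the study.
for sf in (0.25, 0.5, 1.0):
    limit_180 = phase_to_visual_angle(180.0, sf)  # solid reference line
    limit_90 = phase_to_visual_angle(90.0, sf)    # dashed reference line
    print(f"{sf:.2f} c/deg: 180 deg PA = {limit_180:.2f} deg VA, "
          f"90 deg PA = {limit_90:.2f} deg VA")
```

Because the conversion divides by spatial frequency, a fixed phase disparity corresponds to a smaller visual angle at higher frequencies, which is why the limit lines fall with spatial frequency.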
These results are consistent with the idea that complex cells encode binocular disparity through RF phase disparity. However, because phase disparities necessarily are limited to small values (in deg VA) at high spatial frequencies, position disparities still may play an important role in encoding binocular disparity for cells tuned to high spatial frequencies. These results are comparable to those obtained from binocular simple cells (see Anzai et al. 1999a).
DISCUSSION
In this study, white noise analysis has been applied to measurements of binocular interaction RFs and monocular RFs for complex cells in the cat’s striate cortex. Binocular interaction RFs of complex cells are found to be elongated along the frontoparallel axis at a particular binocular disparity. In other words, the binocular interaction exhibited by complex cells is independent of monocular stimulus position (within limits) or phase as long as stimulus binocular disparity is kept constant. In this sense, complex cells are truly tuned to binocular disparity. The binocular interaction RF is shown to be well described by a sum of binocular interaction RFs of underlying functional subunits that exhibit simple cell-like RFs and preference for different monocular phases but the same binocular disparity. A majority of the complex cells examined are found to be consistent with an energy model, with respect to the number of subunits, as well as to the relationships between RF properties of subunits. Subunits also exhibit interocular RF disparities that are largely consistent with a phase mechanism for encoding binocular disparity. These results indicate that binocular interactions of complex cells are derived from simple cell-like subunits, which exhibit multiplicative binocular interactions. Therefore binocular interactions of complex cells are also multiplicative. This suggests that complex cells compute something analogous to the interocular cross-correlation of images within a local region of space. The result of the computation can be used for solving the stereo correspondence problem.
Binocular interaction RF and binocular disparity tuning
As described in the preceding paper (Anzai et al. 1999b), binocular interaction RFs of simple cells are left-right separable; this indicates that the binocular interaction depends on monocular phases of the stimulus. On the other hand, binocular interaction RFs of complex cells consist of subregions that are elongated along the frontoparallel axis, and they are left-right inseparable. The inseparable RF presumably is constructed by combining separable RFs of simple cell-like subunits that exhibit preferences for different monocular phases but for the same binocular disparity. This eliminates the monocular phase dependency at the complex cell level. Therefore complex cells respond to a stimulus regardless of its monocular phase as long as the binocular disparity of the stimulus is kept constant. Because of this, the binocular interaction RF can be reduced to a one-dimensional function of binocular disparity by integrating the RF along the frontoparallel axis (Ohzawa et al. 1997). The resulting function represents the binocular disparity tuning of a cell.
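The reduction to a disparity tuning function can be sketched numerically. The example below is a minimal illustration, not the analysis code used in the study: it assumes Gabor-shaped subunit RFs, a binocular interaction RF sampled on a square grid indexed by (left eye position, right eye position), and a sign convention in which disparity is right eye position minus left eye position. All parameter values and function names are hypothetical.

```python
import numpy as np


def gabor(x, sf, phase_deg, sigma):
    """Gabor profile: Gaussian envelope times a cosine carrier."""
    return np.exp(-x**2 / (2.0 * sigma**2)) * np.cos(
        2.0 * np.pi * sf * x + np.deg2rad(phase_deg))


def disparity_tuning(b_rf, dx):
    """Integrate a binocular interaction RF along the frontoparallel axis.

    b_rf[i, j] is indexed by (left eye position, right eye position).
    Points of constant disparity (j - i) lie on one diagonal, so summing
    each diagonal integrates along the frontoparallel direction.
    """
    n = b_rf.shape[0]
    offsets = np.arange(-(n - 1), n)
    tuning = np.array([np.trace(b_rf, offset=k) for k in offsets]) * dx
    return offsets * dx, tuning


# Hypothetical parameters: 0.5 c/deg carrier, 1 deg envelope, 90 deg PA disparity.
x = np.linspace(-6.0, 6.0, 241)
dx = x[1] - x[0]
sf, sigma, phase_disparity = 0.5, 1.0, 90.0

# Two subunits in quadrature, sharing the same interocular phase disparity.
subunits = [(gabor(x, sf, p, sigma), gabor(x, sf, p + phase_disparity, sigma))
            for p in (0.0, 90.0)]

# Each subunit contributes a left-right separable (outer product) term.
b_rf = sum(np.outer(left, right) for left, right in subunits)

disparity, tuning = disparity_tuning(b_rf, dx)
print("preferred disparity (deg VA):", disparity[np.argmax(tuning)])
```

Summing the diagonals of an outer product is the same operation as cross-correlating the left and right eye profiles, so the tuning curve obtained this way equals the sum of the subunits' interocular cross-correlations.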
Ferster (1981) modeled the binocular disparity tuning of simple cells in areas 17 and 18 of cats as a cross-correlation between left and right eye RFs. It is shown in the preceding paper (Anzai et al. 1999b) that this model is indeed appropriate for the binocular disparity tuning of simple cells. He also applied the same model for complex cells (Ferster 1981). He measured activity profiles of complex cells using a pair of bars; one bar was presented to one eye as a conditioning stimulus to raise the overall response level and the other was swept across the RF of the other eye to obtain an activity profile. By doing this for each eye, he obtained left and right eye activity profiles and computed an interocular cross-correlation of the profiles to predict the binocular disparity tuning. Although it is not clear if the activity profiles are analogous to the interocular two-bar interaction profiles described in this study, if they were, then they would correspond to monocular RFs of underlying functional subunits of a complex cell. Therefore the binocular disparity tuning obtained as a cross-correlation between the left and right eye activity profiles is of an underlying functional subunit (which exhibits a separable binocular interaction RF) rather than of the complex cell itself (which exhibits an inseparable binocular interaction RF). To obtain binocular disparity tuning for a complex cell, one needs to sum interocular cross-correlations of monocular RFs for all subunits. This may explain why the model worked better for simple cells than for complex cells (Ferster 1981). Nonetheless because subunits of a complex cell are expected to have similar binocular disparity tuning, the model still should provide a good approximation to the tuning of the cell.
In any case, the results of Ferster (1981) and of this study suggest that binocular interactions exhibited by complex cells are multiplicative, a direct consequence of inheriting multiplicative binocular interactions from underlying subunits. This has an important implication as to what the functional roles of complex cells might be, which is discussed in the following text. It should be noted that the binocular summation at the input stage of subunits is still linear (Ohzawa and Freeman 1986), and this is not incompatible with a multiplicative interaction, which is observed at the output stage of complex cells.
Subregions of the binocular interaction RF
The binocular interaction RFs of complex cells consist of subregions that are elongated along the frontoparallel axis at different binocular disparities. The subregions of positive values (solid contours in Fig. 1) can be attributed (but not necessarily exclusively) to responses to interocular polarity-matched stimuli (bright or dark bars presented to the 2 eyes), and the subregions of negative values (dashed contours in Fig. 1) to responses to interocular polarity-mismatched stimuli (a bright bar presented to 1 eye and a dark bar to the other eye). On the basis of similar observations, Ohzawa et al. (1990) suggested that complex cells respond to different binocular disparities depending on the interocular polarity combination of the stimulus. However, because the binocular interaction RF of a complex cell can be described as a sum of cross-correlations between left and right eye RFs for underlying subunits, an alternative interpretation is possible.
Consider a stimulus, say a sinusoidal grating (or a Fourier component of a more complicated stimulus), presented at zero disparity. Figure 11 shows 1D profiles of the luminance distribution relative to the mean luminance level for the left (L) and right (R) eye images of such a stimulus. The contour plot in the figure is obtained by multiplying the left and right eye stimulus profiles. This plot represents the spatial structure of the interocular cross-correlation for the stimulus, in the sense that integrating it along the frontoparallel axis X_F yields the interocular cross-correlation function of the stimulus. The solid and dashed contours represent positive (stimulus polarities are matched between the 2 eyes) and negative (stimulus polarities are mismatched between the 2 eyes) values, respectively. The solid horizontal lines are constant disparity lines that go through solid-contour regions. In other words, along the solid lines, stimulus polarities between the two eyes are always matched. On the other hand, stimulus polarities between the two eyes are always opposite (i.e., only dashed-contour regions are found) along the dashed horizontal lines. The dashed lines indicate constant disparities that are shifted from the disparities indicated by the solid lines by an amount that is equivalent to 180 deg PA of the sinusoidal grating. Therefore an extended periodic (or bandpass filtered) stimulus has an interocular cross-correlation structure that consists of two types (polarity-matched and polarity-mismatched) of subregions at different binocular disparities, despite the fact that the stimulus itself is defined at a single binocular disparity (zero disparity for the example shown in Fig. 11).
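The structure of this plot can be written out for a single grating. The following is a sketch under assumed conventions that are not spelled out in the text: the grating has spatial frequency f and unit contrast, the frontoparallel coordinate is the mean of the two monocular positions, and disparity is the right eye position minus the left eye position.

```latex
\begin{align*}
L(x_L) &= \cos(2\pi f x_L), \qquad R(x_R) = \cos(2\pi f x_R), \\
X_F &= \tfrac{1}{2}(x_L + x_R), \qquad d = x_R - x_L, \\
L(x_L)\,R(x_R) &= \tfrac{1}{2}\left[\cos(2\pi f d) + \cos(4\pi f X_F)\right].
\end{align*}
% At d = 0 the product is (1/2)[1 + cos(4 pi f X_F)] >= 0 (polarity matched).
% At d = +/- 1/(2f), i.e. 180 deg PA, it is (1/2)[-1 + cos(4 pi f X_F)] <= 0
% (polarity mismatched).
% Averaging over X_F across full periods removes the second term:
\begin{equation*}
\big\langle L\,R \big\rangle_{X_F} = \tfrac{1}{2}\cos(2\pi f d).
\end{equation*}
```

Under these assumptions the product never changes sign along a line of zero disparity or along a line shifted by half a period, and the average along X_F is a cosine function of disparity, which is the interocular cross-correlation function of the grating.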
The stimulus illustrated here would be effective for a complex cell that exhibits a binocular interaction RF that consists of polarity-matched response subregions at zero disparity and polarity-mismatched response subregions at disparities equivalent to ±180 deg phase of the sinusoidal grating. Because binocular interaction RFs of complex cells are elongated along the frontoparallel axis, the stimulus would be effective regardless of its monocular phases as long as the binocular disparity of the stimulus is kept constant. Therefore it seems that subregions of the binocular interaction RF represent a structure suitable for detecting a stimulus as being at a particular binocular disparity rather than a mechanism designed for detecting different disparities depending on the interocular polarity combination of the stimulus.
As mentioned earlier, integrating the contour plot in Fig. 11 along the frontoparallel axis X_F yields the interocular cross-correlation function of the stimulus. Note that this is also how the binocular disparity tuning of a cell is obtained from the binocular interaction RF. In this sense, the binocular disparity tuning function of a cell can be interpreted as a matching template to be compared with, or a filter to be applied to, the interocular cross-correlation function of the stimulus.
Assumptions involved in the SVD analysis and interpretation of SVD components
In this study, the SVD has been applied to binocular interaction RFs of complex cells to estimate RFs of underlying subunits. The SVD components obtained from the analysis are assumed to represent underlying functional subunits, but not necessarily actual afferent neurons. But what does it mean that subunits are functional? How should they be interpreted? Before answering these questions, assumptions involved in the SVD analysis need to be examined.
The use of the SVD on the binocular interaction RF involves three assumptions. First of all, binocular interaction RFs of subunits are assumed to be left-right separable, i.e., the binocular interaction RF of a subunit is proportional to the product of left and right eye RFs of the subunit. In the preceding paper (Anzai et al. 1999b), binocular interaction RFs of most simple cells were shown to be proportional to the product of their left and right eye RFs. Therefore if one assumes that subunits of complex cells are either simple cells or LGN cells that are arranged in such a way that they are functionally equivalent to individual simple cells at the dendrites of complex cells, then binocular interaction RFs of subunits are expected to be separable.
Second, the binocular interaction RF of a complex cell is assumed to be a sum of binocular interaction RFs of subunits. In other words, a subunit’s output has an additive contribution to the complex cell. Although there is no direct evidence for this assumption, the behavior of complex cells is consistent with the assumption (e.g., Emerson et al. 1992; Gaska et al. 1994; Glezer et al. 1980; Hubel and Wiesel 1962; Movshon et al. 1978; Ohzawa et al. 1990, 1997; Spitzer and Hochstein 1985), and there is no evidence that suggests otherwise.
Finally, RFs of subunits are assumed to be mutually orthogonal or in a quadrature phase relationship. This is probably the most critical assumption for understanding what SVD components represent. Because nearby simple cells have been shown to be in quadrature (Liu et al. 1992; Pollen and Ronner 1981), it is possible that subunits that feed into a complex cell are indeed in quadrature. However, because there is no firm evidence for or against this assumption, interpretation of SVD components needs to be considered both for the case where this assumption holds and for the case where it does not.
Suppose that subunits are indeed in quadrature and exhibit preference for the same spatial frequency, then the number of SVD components should be two, as an energy model predicts. However, the converse is not true. If the number of SVD components is two, subunits may or may not be in quadrature. Because subunits are assumed to be linearly summed to make up a complex cell, there is no unique solution for dividing a binocular interaction RF of a complex cell into RFs of subunits unless one makes an assumption regarding relationships among the subunits, such as a quadrature phase constraint. Therefore SVD components do not necessarily represent individual subunits but linear combinations of subunits. In this sense, SVD components are only functionally equivalent to subunits. Obviously this is not a major limitation if one would like to know functional structures of complex cells. It is a problem, however, if one wishes to identify the actual physical implementation of the functional structures.
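The relationship between quadrature subunits and the number of SVD components can be illustrated with a synthetic example. This is a minimal sketch, not the procedure used in the study: it assumes noise-free Gabor subunit RFs with hypothetical parameters, and it uses an arbitrary numerical threshold where the actual analysis would need a noise-based significance criterion.

```python
import numpy as np


def gabor(x, sf, phase_deg, sigma):
    """Gabor profile: Gaussian envelope times a cosine carrier."""
    return np.exp(-x**2 / (2.0 * sigma**2)) * np.cos(
        2.0 * np.pi * sf * x + np.deg2rad(phase_deg))


# Hypothetical subunit parameters (not taken from the recorded cells).
x = np.linspace(-4.0, 4.0, 81)
sf, sigma, phase_disparity = 0.5, 1.0, 60.0

# Binocular interaction RF as a sum of left-right separable subunit terms.
left_rfs = [gabor(x, sf, p, sigma) for p in (0.0, 90.0)]
right_rfs = [gabor(x, sf, p + phase_disparity, sigma) for p in (0.0, 90.0)]
b_rf = sum(np.outer(l, r) for l, r in zip(left_rfs, right_rfs))

# SVD: columns of u are left eye profiles, rows of vt are right eye profiles.
u, s, vt = np.linalg.svd(b_rf)

# Arbitrary illustrative threshold; real data would need a noise criterion.
n_components = int(np.sum(s / s[0] > 1e-6))
print("number of SVD components:", n_components)

# The first SVD left eye profile lies in the span of the subunit left eye
# RFs, i.e. it is a linear combination of them, not necessarily one of them.
left_basis = np.column_stack(left_rfs)
coef, *_ = np.linalg.lstsq(left_basis, u[:, 0], rcond=None)
residual = np.linalg.norm(left_basis @ coef - u[:, 0])
print("mixing coefficients:", coef, "residual:", residual)
```

In this construction the two singular values are nearly equal, so the decomposition is free to return any rotation of the quadrature pair. This is one way to see why SVD components are functionally equivalent to subunits without identifying the individual subunits themselves.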
If the number of SVD components is more than two, that indicates the existence of extra subunits that are not in quadrature, provided that the subunits have the same spatial frequency as that of the first two components (i.e., quadrature subunits). However, RFs of the extra SVD components may not represent those of the underlying nonquadrature subunits. This again is because SVD components are linear combinations of real subunits. Therefore the extra SVD components, which are likely to be linear combinations of quadrature as well as nonquadrature subunits, generally do not exhibit RFs that resemble those of simple cells (e.g., Fig. 4, E and F).
If SVD components do not represent individual subunits but functional subunits, then what do comparisons between RF properties of the first and second SVD components (Fig. 6) and those between RF phase and position disparities (Figs. 7-10) show? For cells with only two SVD components, the comparisons describe functional structures of the cells in terms of functional subunits. For cells with more than two SVD components, they describe functional structures that approximate behavior of the cells best.
System structure of complex cells and comparisons with that of an energy model
Complex cells have been modeled as a system that consists of parallel subunits (e.g., Glezer et al. 1980, 1982; Hubel and Wiesel 1962; Ohzawa and Freeman 1986; Ohzawa et al. 1990, 1997; Pollen and Ronner 1983; Pollen et al. 1989; Spitzer and Hochstein 1985, 1988). Each subunit is modeled as a linear filter followed by a static nonlinearity and is assumed to represent a simple cell or a collection of LGN cells. Variations of the model in the number of subunits and relationships between the subunits can account for the behavior of various complex cells.
The SVD analysis conducted in this study indicates that two functional subunits that form a quadrature pair are sufficient to account for binocular interaction RFs of a majority of complex cells. In other words, most complex cells are consistent with an energy model. For the model to be more physiologically plausible, each member of a quadrature pair needs to be represented by two subunits that are sign-inverted versions of each other. Therefore at least four subunits are needed to model a complex cell.
Some complex cells are shown to deviate slightly from an energy model; more than two SVD components are required for these cells to describe their binocular interaction RFs. This indicates the existence of nonquadrature subunits. There are at least two possibilities for the origin of the nonquadrature subunits. One is that two subunits that make up a member of a quadrature pair may not be exactly sign-inverted versions of each other. This could explain why monocular RFs of some complex cells are not entirely flat. Nonquadrature subunits are likely to be due to a misalignment of the RF position.
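A four-subunit energy model of this kind can be sketched as follows. The example is illustrative only and rests on several assumptions that go beyond the text: subunit RFs are Gabor functions that are identical in the two eyes (zero preferred disparity), the static nonlinearity is half-squaring, and the stimulus is a unit-contrast grating at the RF spatial frequency. Function names and parameter values are hypothetical.

```python
import numpy as np


def gabor(x, sf, phase_deg, sigma):
    """Gabor profile: Gaussian envelope times a cosine carrier."""
    return np.exp(-x**2 / (2.0 * sigma**2)) * np.cos(
        2.0 * np.pi * sf * x + np.deg2rad(phase_deg))


def half_square(v):
    """Half-wave rectification followed by squaring."""
    return np.maximum(v, 0.0) ** 2


def complex_cell_response(stim_phase_deg, stim_disparity_deg, sf=0.5, sigma=1.0):
    """Response of a four-subunit energy model to a binocular grating.

    stim_phase_deg is the monocular phase of the left eye grating;
    stim_disparity_deg is the interocular phase shift (deg PA).
    """
    x = np.linspace(-6.0, 6.0, 241)
    dx = x[1] - x[0]
    carrier = 2.0 * np.pi * sf * x
    left_img = np.cos(carrier + np.deg2rad(stim_phase_deg))
    right_img = np.cos(carrier + np.deg2rad(stim_phase_deg + stim_disparity_deg))
    response = 0.0
    for rf_phase in (0.0, 90.0):  # quadrature pair
        rf = gabor(x, sf, rf_phase, sigma)
        # Linear binocular summation at the input stage of the subunit.
        drive = np.sum(rf * (left_img + right_img)) * dx
        # Two sign-inverted half-squaring subunits per quadrature member.
        response += half_square(drive) + half_square(-drive)
    return response


phases = np.arange(0.0, 360.0, 45.0)
matched = np.array([complex_cell_response(p, 0.0) for p in phases])
opposite = np.array([complex_cell_response(p, 180.0) for p in phases])
print("zero disparity, varying monocular phase:", matched)
print("180 deg PA disparity, varying monocular phase:", opposite)
```

The two sign-inverted half-squaring subunits together behave like a single full-squaring unit, so the summed output depends on stimulus disparity but, to a very good approximation for these parameters, not on monocular phase.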
Another possibility is that the number of subunits may be more than four. An energy model predicts that the aspect ratio of the binocular interaction RF should be one, i.e., the extent of the RF along the frontoparallel axis should be the same as that along the binocular disparity axis. However, some complex cells exhibit binocular interaction RFs that are elongated along the frontoparallel axis more than is expected from their extent along the binocular disparity axis. This suggests that more subunits may be added to expand the overall RF. As a special case, additional subunits may form quadrature pairs themselves. It has been demonstrated that spatial pooling of multiple quadrature pairs improves the reliability of disparity tuning (Qian and Zhu 1997). Therefore the deviations from an energy model seen in some complex cells actually may be advantageous from a computational point of view.
In this study, cells that did not exhibit significant binocular interaction RFs were not analyzed. However, some of these cells still can be activated by stimulation of either eye alone. Similar cells also were reported previously (e.g., Ferster 1981; Ohzawa and Freeman 1986). These cells can be explained by a difference in ocular dominance among subunits (Ohzawa and Freeman 1986). That is, subunits of these cells are presumably quite monocular so that they do not exhibit a significant binocular interaction. However, some subunits are left eye dominant, whereas others are right eye dominant. Therefore such complex cells still respond to stimulation of either eye. There is also a possibility that subunits are binocular, but their preferred binocular disparities are uniformly distributed (Ohzawa and Freeman 1986). In any case, these cells cannot encode binocular disparity, and they are likely to play little or no direct role in the processing of binocular disparity information.
Finally, it should be pointed out that the model examined in this study is a feed-forward model and is by no means complete. Complex cells exhibit various nonlinear properties, including contrast gain control (e.g., Ohzawa et al. 1982, 1985) and end- and side-inhibition (e.g., Blakemore and Tobin 1972; DeAngelis et al. 1994; DeValois et al. 1985; Hubel and Wiesel 1968; Kato et al. 1978; Maffei and Fiorentini 1976). Although it is not clear at this point if these nonlinearities are essential for the processing of binocular information, they eventually need to be incorporated into any complete model of complex cells.
RF position and phase disparities of complex-cell subunits
To examine how complex cells encode binocular disparity, RF position and phase disparities of underlying functional subunits were estimated from left and right eye RFs of SVD components. The range of RF position disparities is found to be quite small compared with that of RF phase disparities. In addition, RF phase disparity, but not RF position disparity, was found to exhibit a dependency on the RF spatial frequency, a result consistent with the size-disparity correlation observed in human psychophysics (DeValois 1982; Felton et al. 1972; Kulikowski 1978; Legge and Gu 1989; Richards and Kaye 1974; Schor and Wood 1983; Schor et al. 1984a,b; Smallman and MacLeod 1994). Therefore it appears that complex cells encode binocular disparity mainly through the RF phase disparity. However, because RF phase disparities for cells tuned to high spatial frequencies are necessarily small in deg VA, RF position disparities still may play an important role in encoding binocular disparity for these cells.
A reference-cell method (see Anzai et al. 1999a for details) was applied for the estimation of position disparities; the position disparity of the first SVD component was measured for each cell with respect to RF positions of the second SVD component (a reference) of the same cell. In other words, the position disparity measured here is the relative position disparity of the first SVD component to that of the second SVD component. Assuming that true position disparities of the first and second SVD components are uncorrelated, the distribution of relative position disparities is expected to be broader than that of true position disparities by a factor of √2.
Suppose that the position disparities of SVD components indeed were correlated. Then the distribution of true position disparities would be broader than that estimated in this study. How much broader would it be? This question cannot be answered unless one measures a degree of correlation between position disparities of the first and second SVD components. However, if one assumes that subunits are simple cells, then the distribution of position disparities for simple cells provides the upper limit for the broadness of the distribution. A comparison between the phase disparity distribution of complex cells and the distribution for true position disparity of simple cells (Anzai et al. 1999a) indicates that the former is broader than the latter (see results). Therefore the range of binocular disparities that can be encoded through RF phase disparity is still larger than that for RF position disparity.
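The dependence of the broadening on correlation can be made explicit with a short calculation. This is a sketch under two assumptions that are not stated in the text: the true position disparities of the two components have the same standard deviation, and the estimate of the true distribution is obtained by scaling the relative distribution as if the components were uncorrelated. The symbols are introduced here for illustration only.

```latex
% d_1, d_2: true position disparities of the first and second SVD components,
% with common standard deviation \sigma and correlation coefficient \rho.
\begin{align*}
r &= d_1 - d_2, \\
\operatorname{Var}(r) &= \sigma^2 + \sigma^2 - 2\rho\sigma^2 = 2\sigma^2(1 - \rho), \\
\sigma_r &= \sigma\sqrt{2(1 - \rho)}.
\end{align*}
% Uncorrelated case (\rho = 0): \sigma_r = \sqrt{2}\,\sigma.
% An estimate made under that assumption, \hat{\sigma} = \sigma_r/\sqrt{2},
% relates to the true value as
\begin{equation*}
\frac{\sigma}{\hat{\sigma}} = \frac{1}{\sqrt{1 - \rho}},
\end{equation*}
% so for 0 < \rho < 1 the true distribution is broader than the estimate.
```

Under these assumptions the amount of underestimation depends only on the correlation coefficient, which is why it cannot be quantified without measuring that correlation.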
On the basis of the results presented in Anzai et al. (1999a) and in this study, the following picture emerges as a neural mechanism for encoding binocular disparity. Depending on the phase disparity, the profile (phase) of the binocular interaction RFs along the binocular disparity axis changes, i.e., subregions of the binocular interaction RF are located at different depths. Cells are tuned to spatial frequency, and therefore, binocular disparity is encoded at each spatial scale. The binocular interaction RFs would be large for cells tuned to low spatial frequencies, and hence they could encode a wide range of binocular disparity. Cells tuned to somewhat higher spatial frequencies would have correspondingly smaller binocular interaction RFs, and hence they encode a smaller range of binocular disparity (the size-disparity correlation). Small RF position disparities would not affect their binocular disparity tuning. However, for cells tuned to very high spatial frequencies, the RF position disparities may not be negligible. Therefore monocular RFs are no longer at the corresponding points, and the location of the binocular interaction RFs may be shifted in depth around the fixation plane. This would effectively expand the range of binocular disparity that could be encoded by cells tuned to high spatial frequencies. In other words, the range of binocular disparity would be determined by the range of RF position disparity, and would no longer be a function of spatial frequency (a constant disparity limit).
Functional roles of binocular complex cells
In the preceding paper (Anzai et al. 1999b), binocular interactions exhibited by simple cells were shown to be multiplicative at the output stage. It also was shown that, because of the multiplicative binocular interaction, responses of binocular simple cells contain a component that is formally equivalent to a cross-correlation of the left and right eye images that are bandpass filtered. Because subunits of complex cells are functionally equivalent to simple cells, complex cells would be expected to exhibit a multiplicative binocular interaction. Therefore complex cells also compute something analogous to an interocular cross-correlation of images in a local region. The difference between the interocular cross-correlation computed by simple cells and that computed by complex cells is that the former depends on the monocular stimulus phases, whereas the latter does not; i.e., the binocular interaction RF is left-right separable for simple cells, whereas it is inseparable for complex cells.
An interocular cross-correlation is a fundamental computation for the processing of binocular information. For example, it has been shown that the stereo correspondence problem can be solved by computing the interocular cross-correlation of stereo images (Jenkin and Jepson 1988; Sanger 1988). There are also psychophysical studies that indicate that the visual system is very sensitive to the interocular correlation of images (Cormack et al. 1991, 1993; Stevenson et al. 1991, 1992; Tyler and Julesz 1978) and that cyclopean processing in humans is consistent with multiplicative mechanisms such as an interocular cross-correlation (Cormack et al. 1991; Stevenson et al. 1991). The results of this study suggest that complex cells may underlie these psychophysical data and play an important role in solving the stereo correspondence problem.
The multiplicative binocular interaction results from a squaring nonlinearity that follows a linear binocular filter of a subunit. Because a linear binocular filter is simply the sum of left and right eye linear filters, the monocular interaction also is expected to be multiplicative. This suggests that complex cells may compute something analogous to autocorrelation of the monocular image in a local region (Movshon et al. 1978), which is also an algorithm useful for detecting stimulus attributes such as form and motion. Combining monocular and binocular processing, complex cells can be considered local spatiotemporal correlators. From a computational point of view, this description may be preferable to an energy detector because it indicates the algorithm of the neural computations that they perform. However, it should be noted that because a Fourier transform of an autocorrelation function of signals yields a Fourier power spectrum of the signals, the description of complex cells as a local correlator is equivalent to the notion of an energy detector.
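The stated equivalence between a correlator and an energy detector rests on the relationship between an autocorrelation function and the power spectrum. The check below is a minimal numerical sketch for a discrete, periodic (circular) signal; the random test signal and the variable names are illustrative and have no connection to the recorded data.

```python
import numpy as np

# Arbitrary real-valued test signal (fixed seed for reproducibility).
rng = np.random.default_rng(0)
signal = rng.standard_normal(256)
n = signal.size

# Power spectrum: squared magnitude of the discrete Fourier transform.
power = np.abs(np.fft.fft(signal)) ** 2

# Circular autocorrelation computed directly from its definition:
# autocorr[lag] = sum over n of signal[n] * signal[(n + lag) mod N].
autocorr = np.array([np.dot(signal, np.roll(signal, -lag)) for lag in range(n)])

# Fourier transform of the autocorrelation; the imaginary part is
# numerically negligible for a real signal.
power_from_autocorr = np.fft.fft(autocorr).real

print("max abs difference:", np.max(np.abs(power - power_from_autocorr)))
```

The two spectra agree to within floating-point error, which is the discrete form of the relationship invoked in the text.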
In the series of three papers presented here, we have described the functional architecture of neurons in the striate cortex for processing binocular information. Simple cells exhibit interocular RF phase disparities that are suitable for detecting binocular disparities in the retinal images (Anzai et al. 1999a). The binocular disparity information encoded through such a mechanism then is subjected to a nonlinearity to perform a computation that is analogous to an interocular cross-correlation of images in a local region of space (Anzai et al. 1999b). Simple cells provide monocular phase specific components of the computation (Anzai et al. 1999b), whereas complex cells combine outputs of simple cell-like subunits to eliminate the monocular phase specificity as shown in the current paper. The results of the computation are useful for solving the stereo correspondence problem. Considered together with the previous work we have described, we now have a good understanding of the functional roles of simple and complex cells with respect to binocular vision. Therefore our findings provide a solid foundation on which to base exploration of the next stages of binocular visual processing.
Acknowledgments
We are grateful to Dr. Erich Sutter for advice on binary m-sequences and their applications to receptive field mapping and to Dr. Stanley Klein for advice on singular value decomposition analysis. We also thank Drs. Russel DeValois and Edwin Lewis for discussions and helpful comments and suggestions.
This work was supported by research and CORE grants from the National Eye Institute (EY01175 and EY03176).
Footnotes

Address reprint requests to: R. D. Freeman, 360 Minor Hall, School of Optometry, University of California, Berkeley, CA 94720-2020.

The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
 Copyright © 1999 The American Physiological Society