J Neurophysiol 86: 143-155, 2001;
0022-3077/01 $5.00

The Journal of Neurophysiology Vol. 86 No. 1 July 2001, pp. 143-155
Copyright ©2001 by the American Physiological Society

Modeling V1 Disparity Tuning to Time-Varying Stimuli

Yuzhi Chen,1 Yunjiu Wang,2 and Ning Qian1

 1Center for Neurobiology and Behavior and Department of Physiology and Cellular Biophysics, Columbia University, New York, New York 10032; and  2Laboratory of Visual Information Processing, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China


    ABSTRACT

Chen, Yuzhi, Yunjiu Wang, and Ning Qian. Modeling V1 Disparity Tuning to Time-Varying Stimuli. J. Neurophysiol. 86: 143-155, 2001. Most models of disparity selectivity consider only the spatial properties of binocular cells. However, the temporal response is an integral component of real neurons' activities, and time-varying stimuli are often used in experiments on disparity tuning. To understand the temporal dimension of V1 disparity representation, we incorporate a specific temporal response function into the disparity energy model and demonstrate that the binocular interaction of complex cells is separable into a Gabor disparity function and a positive time function. We then investigate how the model simple and complex cells respond to widely used time-varying stimuli, including motion-in-depth patterns, drifting gratings, moving bars, moving random-dot stereograms, and dynamic random-dot stereograms. It is found that both model simple and complex cells show more reliable disparity tuning to time-varying stimuli than to static stimuli, but similarities in the disparity tuning between simple and complex cells depend on the stimulus. Specifically, the disparity tuning curves of the two cell types are similar to each other for either drifting sinusoidal gratings or moving bars. In contrast, when the stimuli are dynamic random-dot stereograms, the disparity tuning of simple cells is highly variable, whereas the tuning of complex cells remains reliable. Moreover, cells with similar motion preferences in the two eyes cannot be truly tuned to motion in depth regardless of the stimulus type. These simulation results are consistent with a large body of extant physiological data, and provide some specific, testable predictions.


    INTRODUCTION

Numerous physiological studies have documented disparity-tuned cells in V1 (Barlow et al. 1967; Freeman and Ohzawa 1990; Poggio and Poggio 1984). To understand the mechanism of tuning, many researchers have also investigated how the disparity responses of a cell may be explained by the underlying binocular receptive field (RF) structure. Since disparity is a spatially defined property, nearly all stereo models are solely based on spatial considerations while leaving out the temporal dimension as irrelevant. Specifically, most models (Fleet et al. 1996; Nomura et al. 1990; Ohzawa et al. 1990; Qian 1994; Sanger 1988; Zhu and Qian 1996) only consider how the spatial RFs of binocular cells may respond to static stimuli and generate the physiologically observed disparity tuning curves, such as the tuned, near, and far types found in V1 (Poggio and Fischer 1977; Poggio et al. 1988). However, the spatial and temporal response properties always come together for real neurons. More importantly, physiological studies of disparity tuning often use time-varying stimuli such as motion-in-depth patterns, drifting gratings, moving bars, moving random-dot stereograms, or dynamic random-dot stereograms in addition to static images. To fully understand these data, the temporal response properties of cortical cells must be considered.

There is also a functional reason to include time in stereo modeling: consistent with the physiological finding that many visual cortical cells are tuned to both disparity and motion (Bradley et al. 1995; Maunsell and Van Essen 1983; Ohzawa et al. 1996), there is increasing psychophysical evidence indicating that motion and stereo interact with each other in generating our perception (Anstis and Harris 1974; Nawrot and Blake 1989; Qian et al. 1994a; Regan and Beverley 1973). We have already proposed a model for motion-stereo integration based on the general properties of binocular, spatiotemporal RFs of visual cortical cells (Qian 1994; Qian and Andersen 1997; Qian et al. 1994b). However, we did not explicitly model the disparity tuning curves of cortical cells to specific time-varying stimuli. In this paper, we first present a simple function that conveniently describes the temporal response profiles of real V1 cells and incorporate this function into the disparity energy model (Ohzawa et al. 1990; Qian 1994). We then apply the model to investigate V1 disparity responses to a variety of time-varying stimuli used in physiological experiments. Some of the results were reported previously in abstract form (Chen et al. 2000).


    METHODS

It is well established that the spatial RFs of V1 simple cells can be accurately fit by Gabor functions (Daugman 1985; Jones and Palmer 1987; Marcelja 1980; Ohzawa et al. 1990). Since we are concerned with disparity tuning instead of orientation tuning in this paper, we only consider vertically oriented binocular cells whose left and right RFs are given by (DeAngelis et al. 1991; Ohzawa et al. 1990, 1996)
$$g_l(x, y) = \frac{1}{2\pi\sigma_x\sigma_y} \exp\left(-\frac{x^2}{2\sigma_x^2} - \frac{y^2}{2\sigma_y^2}\right) \cos(\omega_x^{o} x + \phi_l) \tag{1}$$

$$g_r(x, y) = \frac{1}{2\pi\sigma_x\sigma_y} \exp\left(-\frac{x^2}{2\sigma_x^2} - \frac{y^2}{2\sigma_y^2}\right) \cos(\omega_x^{o} x + \phi_r) \tag{2}$$
where $\omega_x^{o}$ is the preferred horizontal spatial frequency, $\sigma_x$ and $\sigma_y$ determine the RF dimensions along the horizontal and vertical axes, respectively, and $\phi_l$ and $\phi_r$ are the phase parameters for the left and right RFs, respectively. For oriented stimuli (e.g., bars and gratings), we assume that the stimulus orientations are aligned with the cells' preferred orientation. For moving stimuli, we assume that the direction of motion is perpendicular to the orientation of the RFs.
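The Gabor RFs of Eqs. 1 and 2 can be evaluated directly. The following is a minimal Python sketch, not the authors' simulation code: the function name `gabor_rf` and its argument names are illustrative, and the parameter values are borrowed from the Fig. 3 caption. The left and right RFs differ only in their phase parameter.

```python
import math

def gabor_rf(x, y, sigma_x, sigma_y, omega_x, phi):
    """Spatial RF of Eqs. 1 and 2: normalized Gaussian envelope times a cosine carrier."""
    norm = 1.0 / (2 * math.pi * sigma_x * sigma_y)
    envelope = math.exp(-x**2 / (2 * sigma_x**2) - y**2 / (2 * sigma_y**2))
    return norm * envelope * math.cos(omega_x * x + phi)

# Illustrative values from the Fig. 3 caption:
# 4 cycles/deg, sigma_x = 0.1 deg, sigma_y = 0.2 deg, phi_l = 0 deg, phi_r = 60 deg.
omega_x = 2 * math.pi * 4.0                 # angular spatial frequency (rad/deg)
phi_l, phi_r = 0.0, math.radians(60.0)

g_l_center = gabor_rf(0.0, 0.0, 0.1, 0.2, omega_x, phi_l)
g_r_center = gabor_rf(0.0, 0.0, 0.1, 0.2, omega_x, phi_r)
```

At the RF center the envelope equals 1, so the two values differ only by the factor cos(60°) = 0.5 introduced by the interocular phase difference.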

Unlike the spatial RFs, the temporal response of cortical cells is not Gabor-like (DeAngelis et al. 1993a, 1999; Ohzawa et al. 1996). We examined the temporal profiles of real V1 cells and found that they can be conveniently described by an envelope of the gamma probability density function, multiplied by a sinusoidal modulation
$$h(t) = \begin{cases} \dfrac{1}{\Gamma(\alpha)\tau^{\alpha}}\, t^{\alpha-1} \exp\left(-\dfrac{t}{\tau}\right) \cos(\omega_t^{o} t + \phi_t) & t \ge 0 \\ 0 & t < 0 \end{cases} \tag{3}$$
Here $\tau$ is the time constant for the envelope, $\alpha$ determines the degree of skewness, and $\Gamma(\alpha)$ is the standard gamma function for normalization; for simplicity, we let $\alpha = 2$ in this paper, and $\Gamma(2) = 1$. The sinusoidal term with frequency $\omega_t^{o}$ generates alternating on and off responses. Since for many real cells the first half cycle of the temporal response is shorter by various amounts than the second half cycle, the parameter $\phi_t$ is introduced to reduce the length of the first half cycle. (Due to the rapid decay of the exponential, the durations of the 3rd and later half-cycles are not important.) The $\phi_t$ parameter also determines whether the initial response is on or off. Although previously proposed functions can fit the real temporal responses just as well (Adelson and Bergen 1985; DeAngelis et al. 1999; Watson and Ahumada 1985), we prefer Eq. 3 because all parameters have simple, intuitive meanings. Equation 3 is plotted for two different sets of parameters in Fig. 1A. The two curves are representative of the real temporal responses from V1 (DeAngelis et al. 1993a; Ohzawa et al. 1996).
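To make the role of the phase parameter concrete, the temporal function of Eq. 3 can be sketched in Python as follows. The names `temporal_response` and `first_half_cycle` are hypothetical helpers introduced here for illustration; the parameter values are those quoted for Fig. 1A. The helper `first_half_cycle` assumes a phase between -π/2 and π/2, so that the first zero crossing of the carrier occurs where its argument reaches π/2.

```python
import math

def temporal_response(t, tau, omega_t, phi_t, alpha=2.0):
    """Eq. 3: gamma-density envelope times a cosine carrier; zero for t < 0."""
    if t < 0:
        return 0.0
    envelope = t ** (alpha - 1) * math.exp(-t / tau) / (math.gamma(alpha) * tau ** alpha)
    return envelope * math.cos(omega_t * t + phi_t)

def first_half_cycle(omega_t, phi_t):
    """Time of the first zero crossing of the carrier (assumes -pi/2 < phi_t < pi/2)."""
    return (math.pi / 2 - phi_t) / omega_t

# Fig. 1A parameters: omega_t / 2pi = 7.2 Hz, tau = 0.016 s.
omega_t = 2 * math.pi * 7.2
tau = 0.016
later_half_cycle = math.pi / omega_t                  # duration of every later half cycle
t_zero_a = first_half_cycle(omega_t, 0.1 * math.pi)   # first curve
t_zero_b = first_half_cycle(omega_t, -0.4 * math.pi)  # second curve
```

For both phase values the first half cycle comes out shorter than the later ones, which is the shortening effect described above.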



Fig. 1. A: temporal responses of Eq. 3 plotted for two sets of parameters. The positive and negative values represent on and off responses, respectively. For both curves, $\omega_t^{o}/2\pi = 7.2$ Hz and $\tau = 0.016$ s, but $\phi_t = 0.1\pi$ and $-0.4\pi$, respectively. B: the corresponding Fourier amplitude spectra on a log-log scale showing the band-pass and low-pass behavior, respectively. These temporal response profiles and amplitude spectra closely resemble those of real V1 cells.

The frequency tuning of Eq. 3 is determined by its Fourier transform, which can be calculated analytically as
$$\mathcal{H}(\omega_t) = \frac{1}{2\tau^2} \left\{ \frac{\exp(i\phi_t)}{[1/\tau + i(\omega_t - \omega_t^{o})]^2} + \frac{\exp(-i\phi_t)}{[1/\tau + i(\omega_t + \omega_t^{o})]^2} \right\}, \qquad i = \sqrt{-1} \tag{4}$$
for $\alpha = 2$. Note that because of $\phi_t$, $\omega_t^{o}$ may not be close to the preferred temporal frequency of the function. The amplitude spectra for the temporal responses in Fig. 1A are plotted in B, showing band-pass and low-pass characteristics, respectively. These two types of frequency tuning behavior correspond to transient and sustained responses, respectively (Hawken et al. 1996).
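The analytic transform in Eq. 4 can be checked against a direct numerical integration of Eq. 3. The sketch below is only an illustration: it assumes the transform convention of integrating h(t)·exp(-iωt) over t, and the function names, the integration window `t_max`, and the step count `n` are choices made here rather than values from the paper.

```python
import cmath
import math

def spectrum_analytic(omega, tau, omega0, phi_t):
    """Eq. 4 (alpha = 2): Fourier transform of the temporal response."""
    a = 1.0 / tau
    term_minus = cmath.exp(1j * phi_t) / (a + 1j * (omega - omega0)) ** 2
    term_plus = cmath.exp(-1j * phi_t) / (a + 1j * (omega + omega0)) ** 2
    return (term_minus + term_plus) / (2 * tau ** 2)

def spectrum_numeric(omega, tau, omega0, phi_t, t_max=0.5, n=50000):
    """Midpoint-rule integral of h(t) * exp(-i * omega * t) over [0, t_max]."""
    dt = t_max / n
    total = 0j
    for k in range(n):
        t = (k + 0.5) * dt
        h = (t / tau ** 2) * math.exp(-t / tau) * math.cos(omega0 * t + phi_t)
        total += h * cmath.exp(-1j * omega * t) * dt
    return total

# Fig. 1 parameters; the two phases give very different responses at zero frequency.
tau, omega0 = 0.016, 2 * math.pi * 7.2
dc_first = abs(spectrum_analytic(0.0, tau, omega0, 0.1 * math.pi))
dc_second = abs(spectrum_analytic(0.0, tau, omega0, -0.4 * math.pi))
```

With these parameters the zero-frequency amplitude is close to zero for the first phase value and substantial for the second, in line with the band-pass and low-pass descriptions of the two curves.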

The temporal function $h(t)$ can then be combined with the spatial function $g(x, y)$ to model three-dimensional spatiotemporal RFs of simple cells (Adelson and Bergen 1985; Watson and Ahumada 1985). For binocular simple cells, this can be done for the left and right RFs separately
$$f_l(x, y, t) = g_l(x, y)\,h(t) + \eta\,\hat{g}_l(x, y)\,\hat{h}(t) \tag{5}$$

$$f_r(x, y, t) = g_r(x, y)\,h(t) + \eta\,\hat{g}_r(x, y)\,\hat{h}(t) \tag{6}$$
where the $\hat{g}$ and $\hat{h}$ functions are obtained from the corresponding $g$ and $h$ functions by replacing all the cosine terms with sine terms. The constant weighting factor $\eta$, between 0 and 1, is introduced to model various degrees of directional sensitivity (Adelson and Bergen 1985; Watson and Ahumada 1985).
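The construction in Eqs. 5 and 6 can be sketched in a few lines of Python. This is an illustrative sketch with α = 2; the function names, the `params` dictionary, and the use of a `carrier` argument to switch between the cosine and sine versions are conveniences introduced here, and the numerical values are taken from the Fig. 3 caption.

```python
import math

def spatial(x, y, sigma_x, sigma_y, omega_x, phi, carrier):
    """Gabor profile; carrier=math.cos gives g, carrier=math.sin gives its sine version."""
    norm = 1.0 / (2 * math.pi * sigma_x * sigma_y)
    envelope = math.exp(-x**2 / (2 * sigma_x**2) - y**2 / (2 * sigma_y**2))
    return norm * envelope * carrier(omega_x * x + phi)

def temporal(t, tau, omega_t, phi_t, carrier):
    """Temporal profile (alpha = 2); carrier selects h or its sine version."""
    if t < 0:
        return 0.0
    return (t / tau**2) * math.exp(-t / tau) * carrier(omega_t * t + phi_t)

def spatiotemporal_rf(x, y, t, p, phi, eta):
    """Eqs. 5 and 6: f = g * h + eta * g_hat * h_hat for one eye with phase phi."""
    g = spatial(x, y, p["sigma_x"], p["sigma_y"], p["omega_x"], phi, math.cos)
    g_hat = spatial(x, y, p["sigma_x"], p["sigma_y"], p["omega_x"], phi, math.sin)
    h = temporal(t, p["tau"], p["omega_t"], p["phi_t"], math.cos)
    h_hat = temporal(t, p["tau"], p["omega_t"], p["phi_t"], math.sin)
    return g * h + eta * g_hat * h_hat

# Illustrative parameters from the Fig. 3 caption.
params = {"sigma_x": 0.1, "sigma_y": 0.2, "omega_x": 2 * math.pi * 4.0,
          "tau": 0.020, "omega_t": 2 * math.pi * 6.0, "phi_t": 0.1 * math.pi}
f_left = spatiotemporal_rf(0.02, 0.0, 0.03, params, 0.0, 0.6)
```

With η = 0 the RF is the separable product g·h. With η = 1 the two carrier products combine into a single cosine of the difference between the spatial and temporal arguments, which is the space-time oriented profile of a fully directional cell.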

The response of simple cells to a stereo image pair $I_l(x, y, t)$ and $I_r(x, y, t)$ can be approximated by linear spatiotemporal filtering (DeAngelis et al. 1993b; Jones and Palmer 1987; Ohzawa et al. 1990), followed by half-squaring (Anzai et al. 1999a,b; Heeger 1992)
$$r_s(t) = \Theta\left[ \iiint_{-\infty}^{+\infty} \{ f_l(x, y, t - t')\,I_l(x, y, t') + f_r(x, y, t - t')\,I_r(x, y, t') \}\, dx\, dy\, dt' \right] \tag{7}$$
where the half-squaring operation is defined as
$$\Theta[X] = \begin{cases} X^2 & X \ge 0 \\ 0 & X < 0 \end{cases} \tag{8}$$
For some simulations, we also included a threshold to be subtracted from the integral in Eq. 7 before half-squaring. These will be mentioned specifically in RESULTS. The threshold tends to make tuning curves sharper by removing small responses.
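The half-squaring nonlinearity of Eq. 8, together with the optional threshold just described, amounts to the following small Python sketch. The function name `half_square` and the keyword `threshold` are illustrative; the threshold is subtracted from the linear filter output before rectification, as stated above.

```python
def half_square(linear_response, threshold=0.0):
    """Eq. 8 applied after subtracting an optional threshold from the linear response."""
    x = linear_response - threshold
    return x * x if x >= 0 else 0.0

# A threshold removes small responses entirely and shrinks larger ones.
weak_without = half_square(0.1)
weak_with = half_square(0.1, threshold=0.2)
strong_with = half_square(1.0, threshold=0.2)
```

Because weak inputs fall below the threshold and are set to zero while strong inputs survive, tuning curves built from such responses become sharper.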

Under the assumption that the RF size is much larger than the horizontal disparity $D$ of the stimulus, it can be shown that the simple cell response is approximately (see APPENDIX)
$$r_s(t) \approx \Theta\left[ 2B(t) \cos\left( \theta(t) + \frac{\phi_+ - \omega_x^{o} D}{2} \right) \cos\left( \frac{\phi_- + \omega_x^{o} D}{2} \right) \right] \tag{9}$$
where
$$\phi_+ \equiv \phi_l + \phi_r, \qquad \phi_- \equiv \phi_l - \phi_r \tag{10}$$
and $B(t)$ and $\theta(t)$ (defined in APPENDIX) are independent of $\phi_l$, $\phi_r$, and $D$. Equation 9 is a generalization of our previous results obtained with spatial RFs only (Qian 1994; Qian and Zhu 1997). It indicates that in addition to stimulus disparity, simple cells are also sensitive to $\theta(t)$, which depends on the spatiotemporal details (or Fourier phase) of the stimulus.

We model complex cell responses using the well-known quadrature pair method for disparity energy computation (Adelson and Bergen 1985; Emerson et al. 1992; Ohzawa et al. 1990; Pollen 1981; Qian 1994; Watson and Ahumada 1985). The complex cells derive both their spatial and temporal properties from the constituent simple cells. Because the half-squaring operation contains a half-wave rectification, for each complex cell we need to sum the responses of four simple cells (Ohzawa et al. 1990), all with identical $\phi_-$ but with their $\phi_+/2$ differing in steps of $\pi/2$. (This is exactly equivalent to summing the squared responses of two simple cells without the half-squaring.) The resulting complex cell response is approximately
$$r_q(t) \approx \left[ 2B(t) \cos\left( \frac{\phi_- + \omega_x^{o} D}{2} \right) \right]^2 \tag{11}$$
which has more reliable disparity tuning because it is no longer a function of $\theta(t)$. The preferred disparity of the cell is thus
$$D_{pref} \approx -\frac{\phi_-}{\omega_x^{o}} \tag{12}$$
which is the same as for the static case (Qian 1994).
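The step from the simple cell approximation (Eq. 9) to the complex cell response (Eq. 11) can be verified numerically. The Python sketch below treats B and θ as given numbers rather than computing them from a stimulus, and the function names are introduced here for illustration only. Shifting φ+/2 in steps of π/2 is implemented as shifting φ+ in steps of π.

```python
import math

def half_square(x):
    """Eq. 8."""
    return x * x if x >= 0 else 0.0

def simple_response(B, theta, phi_plus, phi_minus, omega_x, D):
    """Eq. 9, with B and theta supplied directly instead of derived from a stimulus."""
    linear = (2 * B * math.cos(theta + (phi_plus - omega_x * D) / 2)
              * math.cos((phi_minus + omega_x * D) / 2))
    return half_square(linear)

def complex_response(B, theta, phi_plus, phi_minus, omega_x, D):
    """Sum of four simple cells whose phi_plus / 2 differ in steps of pi / 2."""
    return sum(simple_response(B, theta, phi_plus + k * math.pi, phi_minus, omega_x, D)
               for k in range(4))

def preferred_disparity(phi_minus, omega_x):
    """Eq. 12."""
    return -phi_minus / omega_x

# Fig. 3 caption values: 4 cycles/deg, phi_l = 0 deg, phi_r = 60 deg.
omega_x = 2 * math.pi * 4.0
phi_l, phi_r = 0.0, math.radians(60.0)
d_pref = preferred_disparity(phi_l - phi_r, omega_x)   # in degrees of visual angle
```

For any value of θ, exactly one cell in each antiphase pair passes the rectification, so the four half-squared terms add up to the squared expression of Eq. 11. With the Fig. 3 parameters, Eq. 12 gives 1/24°, consistent with the preferred disparity of about 0.04° quoted in that caption.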

Previously, we pointed out that for both physiological and computational reasons, a spatial pooling step should be added after the quadrature-pair construction to better simulate complex cell responses (Qian and Zhu 1997; Zhu and Qian 1996). We add this step for modeling complex cell responses to the random-dot type of stimuli, as such pooling significantly improves the reliability of disparity tuning (Fleet et al. 1996; Qian and Zhu 1997; Zhu and Qian 1996). The pooling step is omitted for bar and grating stimuli because it does not make any difference for those stimuli. The weighting function for the spatial pooling is a normalized, circularly symmetric two-dimensional Gaussian with a $\sigma$ equal to $\sigma_x$ in Eqs. 1 and 2.


    RESULTS

Binocular interaction RFs of complex cells

Equations 5 and 6 can be used to model simple cells' binocular, spatiotemporal RFs (results not shown), which are first-order kernels of the white noise analysis (Adelson and Bergen 1985; Anzai et al. 1999a; DeAngelis et al. 1999; Ohzawa et al. 1996). One cannot obtain similar first-order RFs for complex cells because complex cells do not have separated on and off subregions. However, as Ohzawa, DeAngelis, and Freeman (1997) have shown, real complex cells have well-defined binocular interaction RFs, which are the impulse response functions obtained by flashing a line at the preferred orientation at time $t$ to locations $x_l$ and $x_r$ in the two eyes, respectively. It is a first-order temporal and second-order spatial kernel. Previously, Ohzawa et al. (1997) have modeled the second-order spatial kernel. Here we add the time variable and compare our simulations with the experimental data.

It can be shown that the binocular interaction RF defined by Ohzawa et al. (1997) for a complex cell can be written as (see APPENDIX)
$$F_c(D, t) = S(D)\,H(t) \tag{13}$$
where
$$S(D) = \frac{4}{\sqrt{\pi}\,\sigma_x} \exp\left( -\frac{D^2}{4\sigma_x^2} \right) \cos(\omega_x^{o} D + \phi_-) \tag{14}$$

$$H(t) = h^2(t) + \eta^2\,\hat{h}^2(t) \tag{15}$$
Remarkably, Eq. 13 is separable in disparity and time regardless of whether the underlying simple cells for the complex cell are spatiotemporally separable or not (i.e., $\eta = 0$ or not). This is true so long as the simple cells are described by Eqs. 5 and 6 and therefore have matched degrees of spatiotemporal orientation in the two eyes (Ohzawa et al. 1996). Also note that $S(D)$ is a Gabor function of disparity $D$ (Zhu and Qian 1996) and that unlike the temporal response $h(t)$ for the constituent simple cells, the temporal response $H(t)$ of the complex cell's binocular interaction RF is always positive, indicating that the Gabor disparity tuning of complex cells does not vary over time. These features are consistent with experimental data (Ohzawa et al. 1997).
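The separable form of Eqs. 13-15 is straightforward to evaluate. The following Python sketch is illustrative only: the function names are introduced here, α = 2 is assumed for the temporal envelope, and the parameter values are those listed in the Fig. 2 caption.

```python
import math

def disparity_profile(D, sigma_x, omega_x, phi_minus):
    """Eq. 14: Gabor function of disparity."""
    return (4.0 / (math.sqrt(math.pi) * sigma_x)
            * math.exp(-D**2 / (4 * sigma_x**2))
            * math.cos(omega_x * D + phi_minus))

def temporal_profile(t, tau, omega_t, phi_t, eta):
    """Eq. 15 with alpha = 2: H(t) = h(t)^2 + eta^2 * h_hat(t)^2."""
    if t < 0:
        return 0.0
    envelope = (t / tau**2) * math.exp(-t / tau)
    h = envelope * math.cos(omega_t * t + phi_t)
    h_hat = envelope * math.sin(omega_t * t + phi_t)
    return h * h + eta * eta * h_hat * h_hat

def interaction_rf(D, t, sigma_x, omega_x, phi_minus, tau, omega_t, phi_t, eta):
    """Eq. 13: binocular interaction RF as a product of disparity and time profiles."""
    return (disparity_profile(D, sigma_x, omega_x, phi_minus)
            * temporal_profile(t, tau, omega_t, phi_t, eta))

# Fig. 2 caption values: 0.4 cycles/deg, 2 Hz, sigma_x = 0.8 deg, tau = 60 ms.
sigma_x, omega_x = 0.8, 2 * math.pi * 0.4
tau, omega_t, phi_t = 0.060, 2 * math.pi * 2.0, 0.1 * math.pi
s_tuned_excitatory = disparity_profile(0.0, sigma_x, omega_x, 0.0)       # phi_minus = 0
s_tuned_inhibitory = disparity_profile(0.0, sigma_x, omega_x, -math.pi)  # phi_minus = -pi
```

Since H(t) is a sum of squares it never goes negative, so the sign of the interaction RF at any disparity is set by S(D) alone at all times.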

Equation 13 is plotted in Fig. 2 for four model complex cells. The time-integrated tuning curves are also shown at the bottom of each panel, indicating that these cells are tuned-excitatory (TE), tuned-inhibitory (TI), near (NE), and far (FA) types, respectively, according to Poggio's classification. The disparity-time separability in Eq. 13 is clearly exhibited in the figure for both the nondirectional cell ($\eta = 0$, Fig. 2A) and the strongly directional cell ($\eta = 1$, Fig. 2B).



Fig. 2. Binocular interaction RFs (or D-T profiles) of 4 model complex cells plotted according to Eq. 13. The solid and dashed contours represent the positive and negative values, respectively. Below each panel is the disparity tuning curve generated by integrating the D-T profile along the time axis. These complex cells are constructed from simple cell RFs all with $\omega_x^{o}/2\pi = 0.4$ cycles/deg, $\omega_t^{o}/2\pi = 2$ Hz, $\sigma_x = 0.8°$, $\sigma_y = 1.2°$, $\tau = 60$ ms, and $\phi_t = 0.1\pi$. The $\phi_-$ and $\eta$ parameters are A: $0$, $0$; B: $-\pi$, $1$; C: $-\pi/2$, $0.3$; D: $\pi/2$, $0.6$, respectively. Therefore A is a tuned-excitatory (TE) and nondirectional complex cell; B is a tuned-inhibitory (TI) and strongly directional complex cell; C and D are near (NE) and far (FA) complex cells, respectively, with intermediate degrees of directional selectivity.

Another feature in Fig. 2 is that the D-T profiles of nondirectional or weakly directional complex cells (Fig. 2, A and C) have two peaks along the time axis, while strongly directional complex cells (Fig. 2, B and D) are unimodal over time. This originates from Eq. 15. When the directional factor $\eta = 0$, the complex cell temporal response function becomes
$$H(t) = \left[ \frac{t}{\tau^2} \exp\left( -\frac{t}{\tau} \right) \cos(\omega_t^{o} t + \phi_t) \right]^2 \tag{16}$$
which can show multiple peaks in time because of the cosine term. On the other hand, when the directional factor $\eta = 1$, we have
$$H(t) = \left[ \frac{t}{\tau^2} \exp\left( -\frac{t}{\tau} \right) \right]^2 \tag{17}$$
which can only have one peak. This relationship between directionality and the peak number along the time dimension in D-T plots is a testable prediction.
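The prediction about peak number can be illustrated by counting local maxima of H(t) on a time grid. The Python sketch below uses the temporal parameters from the Fig. 2 caption; the function names, the 1-ms grid, the 0.6-s window, and the rule of ignoring maxima below 1% of the global maximum are all choices made here for illustration.

```python
import math

def complex_temporal_profile(t, tau, omega_t, phi_t, eta):
    """H(t) of Eq. 15 with alpha = 2; reduces to Eq. 16 (eta = 0) or Eq. 17 (eta = 1)."""
    if t < 0:
        return 0.0
    envelope = (t / tau**2) * math.exp(-t / tau)
    arg = omega_t * t + phi_t
    return envelope**2 * (math.cos(arg)**2 + eta**2 * math.sin(arg)**2)

def count_peaks(values, rel_floor=0.01):
    """Count local maxima that exceed rel_floor times the global maximum."""
    floor = rel_floor * max(values)
    return sum(1 for i in range(1, len(values) - 1)
               if values[i] > floor and values[i - 1] < values[i] >= values[i + 1])

# Fig. 2 caption values: omega_t / 2pi = 2 Hz, tau = 60 ms, phi_t = 0.1 pi.
tau, omega_t, phi_t = 0.060, 2 * math.pi * 2.0, 0.1 * math.pi
times = [k * 0.001 for k in range(601)]   # 0 to 0.6 s in 1-ms steps

peaks_nondirectional = count_peaks(
    [complex_temporal_profile(t, tau, omega_t, phi_t, 0.0) for t in times])
peaks_directional = count_peaks(
    [complex_temporal_profile(t, tau, omega_t, phi_t, 1.0) for t in times])
```

For η = 1 the cosine and sine terms sum to one and only the unimodal envelope remains, whereas for η = 0 the squared cosine carves the envelope into separate lobes.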

Motion in depth

When an object is moving toward or away from an observer, the binocular disparity of the object changes over time, and the motion speeds or directions in the two eyes are different. The fact that the disparity tuning of complex cells does not vary with time (Fig. 2) implies that these cells are not tuned to motion in depth (Ohzawa et al. 1997; Qian 1994; Qian and Andersen 1997). Consistent with this, most V1 cells have the same motion preference for the two eyes, and give the strongest response to the frontoparallel motion at the preferred disparity (Ohzawa et al. 1996, 1997; Poggio and Talbot 1981). In addition, Maunsell and Van Essen (1983) reported that no MT (V5) cells were found to be truly tuned for motion in depth when the motion trajectories of the stimuli were properly positioned (see following text).

We have simulated motion-in-depth tuning curves under a variety of conditions (Figs. 3-5). The format of each plot in each figure is identical to that used by Maunsell and Van Essen (1983). Twelve motion trajectories, represented "around the clock," were considered for each tuning curve. The 0 and 180° paths represent the rightward and leftward motions, respectively, in a frontoparallel plane; the 90 and 270° represent motions straight away from and toward the observer, respectively. The remaining eight trajectories represent intermediate, oblique paths in depth. Maunsell and Van Essen (1983) pointed out that to properly assess the motion-in-depth tuning, the mid-points of all trajectories should meet at a point with the preferred disparity of the cell. In this case, the 0 and 180° trajectories are on the cell's preferred disparity plane if it exists.



Fig. 3. Motion-in-depth tuning curves of a model simple cell (A) and a model complex cell (B) to a bar, moving along 12 paths whose mid-points coincide at a point on the cells' preferred disparity plane. The two rows in A and B are for the cases with and without threshold, respectively. The threshold is equal to 20% of the maximum response of the linear filtering in Eq. 7. The RF parameters of the simple cell (A) are $\omega_x^{o}/2\pi = 4$ cycles/deg, $\omega_t^{o}/2\pi = 6$ Hz, $\sigma_x = 0.1°$, $\sigma_y = 0.2°$, $\tau = 20$ ms, $\phi_l = 0°$, $\phi_r = 60°$, $\phi_t = 0.1\pi$, and $\eta = 0.6$. The complex cell (B) receives inputs from the simple cell and 3 other simple cells according to the quadrature method. The bar size and duration are 0.1 × 1° and 0.33 s, respectively. The integrated responses over the 0.33 s period are plotted. The cells have a preferred disparity of 0.04° and a preferred speed of 1.8°/s. (Note that $\omega_t^{o}/\omega_x^{o}$ is not close to the preferred speed because $\omega_t^{o}$ is not close to the preferred temporal frequency of the cell.) The RFs are computed in a three-dimensional region of 0.5° × 1° × 0.1 s. The spatial and temporal sampling steps used in the simulations are 0.01° and 5 ms, respectively.

The 12 trajectories for the moving stimuli are specified by the horizontal speeds for the two eyes (Maunsell and Van Essen 1983). Starting from the 0° path and going counterclockwise, the 12 speed pairs for the left and right eyes used in our simulations are (1.8, 1.8), (0.6, 1.8), (-0.6, 1.8), (-1.8, 1.8), (-1.8, 0.6), (-1.8, -0.6), (-1.8, -1.8), (-0.6, -1.8), (0.6, -1.8), (1.8, -1.8), (1.8, -0.6), and (1.8, 0.6), in deg/s.
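The 12 speed pairs trace the perimeter of a square in the (left-eye speed, right-eye speed) plane, which suggests one compact way to generate them. The Python sketch below is an illustration, not the authors' procedure; the function name and the integer grid are introduced here. It takes the eleventh pair (the 300° path) as (1.8, -0.6), the reversal of the 120° path, so that every path is the exact reverse of the one 180° away.

```python
def motion_in_depth_speed_pairs(unit=0.6):
    """Walk the perimeter of a square in (v_left, v_right) space, 3 steps per side.

    Speeds are built on the integer grid {-3, -1, 1, 3} and scaled by `unit` deg/s.
    """
    grid_pairs = []
    grid_pairs += [(v_l, 3) for v_l in (3, 1, -1)]     # 0, 30, 60 deg paths
    grid_pairs += [(-3, v_r) for v_r in (3, 1, -1)]    # 90, 120, 150 deg paths
    grid_pairs += [(v_l, -3) for v_l in (-3, -1, 1)]   # 180, 210, 240 deg paths
    grid_pairs += [(3, v_r) for v_r in (-3, -1, 1)]    # 270, 300, 330 deg paths
    return [(round(v_l * unit, 1), round(v_r * unit, 1)) for v_l, v_r in grid_pairs]

speed_pairs = motion_in_depth_speed_pairs()
# Interocular speed difference for each path; zero only for frontoparallel motion.
speed_differences = [round(v_l - v_r, 1) for v_l, v_r in speed_pairs]
```

The two frontoparallel paths (0 and 180°) are the only ones with equal speeds in the two eyes, and the two straight-in-depth paths (90 and 270°) have equal and opposite speeds.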

MOVING BARS. Figure 3 shows the results for a directional simple cell (A) and the corresponding complex cell (B) in response to a moving bar stimulus. The two rows are for the cases with and without a threshold term in Eq. 7, respectively. Since both the left and right RFs of the model cells prefer leftward motion, it is not surprising that the tuning curves are peaked in the left, frontoparallel direction, indicating that these cells are not tuned to motion in depth. We have also performed simulations with nondirectional model cells (results not shown). In this case, the tuning curves usually had two peaks pointing at 0 and 180° directions, and for simple cells, there were additional, smaller peaks at 90 and 270° directions, again indicating the absence of motion-in-depth tuning. These results are consistent with the physiological data for the majority of visual cortical cells (Maunsell and Van Essen 1983; Poggio and Talbot 1981). The inclusion of a threshold term (2nd row) makes the tuning curves sharper because it suppresses small responses from the nonpreferred paths. This could explain some sharp tuning curves found experimentally (Maunsell and Van Essen 1983; Poggio and Talbot 1981).

Although most cortical cells are like those shown in Fig. 3, preferring frontoparallel motion with fixed disparity, there is evidence that some cells in areas V1 and V2 are tuned to motion toward or away from the observer (Cynader and Regan 1978; Poggio and Talbot 1981). However, cells preferring frontoparallel motion may appear to be tuned to motion in depth if the mid-points of the stimulus trajectories meet at a point outside the preferred disparity plane (Maunsell and Van Essen 1983). Under this condition, the 0 and 180° trajectories are not in the cell's preferred disparity plane and thus may not evoke the strongest responses. By contrast, the cell may be most excited by the oblique depth-path that happens to have the best overlap with the preferred disparity plane. The tuning curves under this "off-preferred-plane" situation for the same simple and complex cells in Fig. 3 are shown in the top row of Fig. 4. Here, the mid-points of all paths meet at a point with a disparity of -0.04° while the cells' preferred disparity is 0.04°. As predicted by Maunsell and Van Essen (1983), now the cells appear to prefer motion along oblique paths in depth. Thus some cells may appear tuned to motion in depth simply because of the improper choice of the test paths in an experiment. However, this possibility does not rule out the existence of cortical cells that are truly tuned to motion in depth. These cells should have different preferred directions or speeds in the two eyes (Cynader and Regan 1978; Poggio and Talbot 1981) and can thus show motion-in-depth tuning even when the stimulus paths are properly chosen. Our simulation results for a simple and a complex cell preferring opposite directions of motion in the two eyes are shown in the bottom row of Fig. 4. The cells are tuned to motion straight away from the observer. Unlike the cells in the top row, these true motion-in-depth cells have a single prominent peak in their tuning curves.



Fig. 4. Two ways of having tuning peaks away from frontoparallel planes. Top: the simple (A) and complex (B) cells are identical to those in Fig. 3 (with threshold). Although they actually prefer frontoparallel motion, they appear tuned to motion in depth here because the mid-points of the stimulus paths meet at a point with disparity -0.04° instead of the cells' preferred disparity 0.04°. Bottom: on the other hand, these cells are truly tuned to motion in depth because the directional preferences of left and right RFs are opposite. The parameters are identical to those for Fig. 3 except that $\omega_t^{o}/2\pi$ for the right RF has been changed from 6 to -6 Hz to generate opposite directional preference.

RANDOM-DOT STEREOGRAMS. We have also simulated motion-in-depth tuning curves of the same simple and complex cells in Fig. 3 (with threshold) to coherently moving random-dot stereograms (MRDSs), and dynamic random-dot stereograms (DRDSs), and examined the effect of spatial pooling (see METHODS) for the complex cell responses. The dots of a MRDS are all on the same disparity plane at a given time and the whole plane moves along each of the 12 motion paths mentioned in the preceding text. Each MRDS is large enough so that it covers the cells' RFs at all times without the edge effect. A DRDS is identical to the corresponding MRDS in terms of disparity change over time, but the dot positions are randomly replotted for each frame. To investigate the reliability of the tuning curves, we simulated two tuning curves for each case, with two sets of independently generated MRDSs or DRDSs. The results are shown in Fig. 5. It can be seen that the tuning for MRDSs is very similar to that for moving bars (Fig. 3), except that the curves are narrower because there are more weak responses for MRDSs than for moving bars that are suppressed by the threshold. The curves for DRDSs, on the other hand, are quite different. First, because DRDSs, by definition, can only have disparity changes over time, but no directions of motion, the tuning curves are symmetrical with respect to the 90-270° axis. This is independent of the direction selectivity of the cell. Second, the two curves from the two independent simulations are very different from each other for the simple cell but are quite similar to each other for the complex cell with spatial pooling. This indicates that complex cells have more reliable tuning to DRDSs than do simple cells. Finally, the tuning curves for DRDSs are not as narrow as those for moving bars or MRDSs. For the simple cell, the main peak location is often located outside the preferred disparity plane. 
These specific features of motion-in-depth tuning to MRDSs and DRDSs can be tested experimentally, and have implications for some relevant psychophysical observations (see DISCUSSION).
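The difference between the two stimulus types can be made concrete with a minimal sketch. It assumes a binary dot grid whose cell size equals the dot size (0.02°, giving a 25 × 50 grid for a 0.5 × 1° stimulus), integer shifts in units of one dot, and wrap-around at the edges as a stand-in for a stimulus large enough to avoid edge effects. The function names and the example shift sequences are hypothetical and are not taken from the simulations reported here.

```python
import random

def random_dot_frame(width, height, density, rng):
    """Binary dot image: each cell is a dot with probability `density`."""
    return [[1 if rng.random() < density else 0 for _ in range(width)]
            for _ in range(height)]

def shift_columns(frame, shift):
    """Shift a frame horizontally by `shift` cells (wrap-around is an assumption)."""
    width = len(frame[0])
    return [[row[(x - shift) % width] for x in range(width)] for row in frame]

def stereo_sequence(left_shifts, right_shifts, width, height, density,
                    dynamic, seed=0):
    """Return a list of (left, right) frames, one pair per time step.

    dynamic=False -> MRDS: one dot pattern is reused and shifted coherently.
    dynamic=True  -> DRDS: the dot pattern is redrawn on every frame, so only
                     the disparity (right shift minus left shift) carries over.
    """
    rng = random.Random(seed)
    base = random_dot_frame(width, height, density, rng)
    pairs = []
    for ls, rs in zip(left_shifts, right_shifts):
        if dynamic:
            base = random_dot_frame(width, height, density, rng)
        pairs.append((shift_columns(base, ls), shift_columns(base, rs)))
    return pairs

# 25 frames correspond to 0.5 s at a 50-Hz refresh rate.
n_frames = 25
left_shifts = list(range(n_frames))               # illustrative path only
right_shifts = [2 * k for k in range(n_frames)]   # disparity grows by 1 cell per frame
mrds = stereo_sequence(left_shifts, right_shifts, 25, 50, 0.10, dynamic=False)
drds = stereo_sequence(left_shifts, right_shifts, 25, 50, 0.10, dynamic=True)
```

In this sketch both sequences have the same disparity on every frame, but only the MRDS carries a consistent monocular direction of motion.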



Fig. 5. Motion-in-depth tuning curves of a model simple cell (A) and a model complex cell without (B) and with (C) spatial pooling to moving and dynamic random-dot stereograms (MRDSs and DRDSs, respectively), with paths centered on a point at the cells' preferred disparity plane. The cell parameters are identical to those in Fig. 4 except that a spatial pooling step was added in C. The pooling function is a normalized, symmetric 2-dimensional Gaussian with a $\sigma$ of 0.1°. The two curves shown in each panel (○ and *) are obtained with 2 independently generated sets of stimuli. The dot size is 0.02 × 0.02° and dot density is 10%. The overall size, refresh rate, and duration of each stimulus are 0.5 × 1°, 50 Hz, and 0.5 s, respectively.

Similar to Fig. 4 for the bar stimuli, MRDSs and DRDSs can also give false motion-in-depth tuning if the motion paths are not properly chosen, and real motion-in-depth tuning can only be obtained with cells preferring opposite directions in the two eyes.

Disparity tuning curves

DRIFTING SINUSOIDAL GRATINGS AND BARS. Unlike the motion-in-depth stimuli discussed in the preceding text, all stimuli in this and subsequent subsections have a constant disparity over time. Ohzawa and Freeman (1986a,b) used binocular drifting sinusoidal gratings to test the disparity tuning of V1 cells in the cat. Figure 6 shows the response time courses and disparity tuning curves of a model simple and complex cell stimulated by drifting sinusoidal gratings of various interocular phase differences. The parameters are chosen to simulate the data shown in Fig. 3 of Ohzawa and Freeman (1986b) for the simple cell, and Fig. 1 of Ohzawa and Freeman (1986a) for the complex cell. Because that particular simple cell had shorter active half-cycles than silent half-cycles, we include a threshold equal to 20% of the maximum value of the linear filtering result in Eq. 7. The spatial and temporal frequencies of the gratings match the preferred frequencies of the cells, as in the actual experiments. Ohzawa and Freeman (1986b) used the first harmonic amplitude of the simple cell response for plotting the tuning curve. We simply use the time-integrated total response because it is proportional to the first harmonic in the context of our model. Figure 6 shows that the responses of both the simple and complex cells depend on the interocular phase difference (proportional to disparity) of the gratings. The simple cell's responses are modulated sinusoidally in time followed by rectification, while the complex cell responses are sustained. These features agree with the experimental data (Ohzawa and Freeman 1986a,b).
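The proportionality between interocular phase difference and disparity can be illustrated with a short conversion. This is a minimal sketch, assuming phase in degrees and spatial frequency in cycles/deg. The function name is hypothetical, and the 0.3 cycles/deg value is the grating frequency used for the model simple cell in Fig. 6. Because a grating is periodic, the equivalent disparity is defined only up to a whole number of grating periods.

```python
def phase_to_disparity_deg(phase_diff_deg, spatial_freq_cpd):
    """Positional disparity (deg of visual angle) equivalent to an interocular
    phase difference (deg of phase) for a grating of the given spatial
    frequency (cycles/deg)."""
    return (phase_diff_deg / 360.0) / spatial_freq_cpd

# Phase differences used for the tuning curves: 0 to 330 deg in 30-deg steps.
for phase in range(0, 360, 30):
    print(phase, round(phase_to_disparity_deg(phase, 0.3), 3))
```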



Fig. 6. Response time courses and disparity tuning curves of a model simple cell (A) and a model complex cell (B) stimulated by drifting sinusoidal gratings. Left: the response time courses as the interocular phase difference of the grating varied from 0 to 330° in 30° steps. The initial 0.3 s of transient responses has been excluded to show the steady-state behavior. The left and right monocular responses (LE and RE) of the cells are also shown. Right: the disparity tuning curves created by integrating the responses over a 1-s period. The vertical lines indicate the predicted preferred disparities according to Eq. 12. The simple cell (A) has spatiotemporally inseparable binocular RFs, with $\omega_x^{o}/2\pi$ = 0.3 cycles/deg, $\omega_t^{o}/2\pi$ = 2 Hz, $\sigma_x$ = 1°, $\sigma_y$ = 1.6°, $\tau$ = 60 ms, $\phi_l$ = 0°, $\phi_r$ = -120°, $\phi_t$ = 0.1$\pi$, and $\eta$ = 0.6. The RFs are computed in a 3-dimensional region of 5° × 8° × 0.3 s. The threshold value is equal to 20% of the maximum linear filtering response of the simple cell. The RF parameters of the complex cell (B) are $\omega_x^{o}/2\pi$ = 0.4 cycles/deg, $\omega_t^{o}/2\pi$ = -2 Hz, $\sigma_x$ = 0.8°, $\sigma_y$ = 1.2°, $\tau$ = 60 ms, $\phi_-$ = 210°, $\phi_t$ = -0.1$\pi$, and $\eta$ = 0.6. The RFs are computed over a region of 4° × 6° × 0.3 s. The spatial and temporal frequencies of the gratings match the preferred spatial and temporal frequencies of the cells. The initial phase of the right image is fixed at 60° for both cells and that of the left image is varied from 60° to 390° in steps of 30°. The spatial and temporal sampling intervals for the simulations are 0.1° and 10 ms, respectively.

Another feature in Fig. 6A is that the temporal responses of the simple cell are tilted to the right as the interocular phase difference increases. This is also consistent with the physiological results in Fig. 3 of Ohzawa and Freeman (1986b). It can be shown that this tilt stems from the specific way of introducing binocular disparity. In both the experiments (Ohzawa and Freeman 1986a,b) and our simulations, the disparity is generated by keeping the grating phase of one eye's image fixed while varying the phase in the other eye. If the disparity is symmetrically divided between the two eyes, then the tilt disappears (results not shown). The reason is that the asymmetric disparity generates a small positional change that leads to a temporal delay in the simple cell's response.
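The origin of the tilt can be sketched with a simplified calculation. Assume, for illustration only, that the linear stage of an ocularly balanced simple cell sums two monocular sinusoids of equal amplitude $A$ and temporal frequency $\omega_t$, with constant phase offsets from the RFs absorbed into $\phi_0$. These symbols are introduced here and are not the notation of Eq. 7. When the interocular phase difference $\Delta$ is added to one eye only,

$$A\cos(\omega_t t + \phi_0 + \Delta) + A\cos(\omega_t t + \phi_0) = 2A\cos(\Delta/2)\,\cos(\omega_t t + \phi_0 + \Delta/2),$$

so the temporal phase of the summed response shifts by $\Delta/2$, which corresponds to a time shift of $\Delta/(2\omega_t)$. When the same difference is divided symmetrically between the eyes,

$$A\cos(\omega_t t + \phi_0 + \Delta/2) + A\cos(\omega_t t + \phi_0 - \Delta/2) = 2A\cos(\Delta/2)\,\cos(\omega_t t + \phi_0),$$

and the temporal phase no longer depends on $\Delta$. Under these assumptions, a 30° step in interocular phase at 2 Hz shifts the response by 15° of temporal phase, or about 21 ms. The direction of the shift depends on the sign conventions for phase and drift direction.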

The model cells used in the preceding simulations are ocularly balanced. However, similar results can be obtained when one eye is more dominant than the other. There are two ways to introduce ocular dominance into the model. The first method is to introduce a weighting factor in front of one of the two RF profiles in Eq. 7. Mathematically, this is equivalent to presenting a stereogram with different contrast scales (but of the same contrast sign) to the two eyes. As we have shown previously (Qian 1994; Qian and Mikaelian 2000), the tuning curves will maintain the same shape under this condition although the pedestal will be higher and the amplitude will be smaller. The second method for introducing ocular dominance is to assume that one eye has a higher response threshold than the other. We find through simulations that again similar tuning curves can be obtained unless one of the thresholds is so high that the corresponding eye does not respond (results not shown).
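The effect of the weighting factor can be illustrated with an idealized calculation. Assume, as a simplification that is not derived in this section, that the disparity-dependent part of the complex cell response is the energy of two monocular contributions with weights $a$ and $b$ and relative phase $\theta$ (symbols introduced here for illustration):

$$\left| a\,e^{i\theta_l} + b\,e^{i\theta_r} \right|^2 = a^2 + b^2 + 2ab\cos\theta, \qquad \theta = \theta_l - \theta_r .$$

The peak remains at $\theta = 0$ for any positive $a$ and $b$, so the preferred disparity is unchanged. The maximum is $(a+b)^2$ and the minimum is $(a-b)^2$, so after normalization by the peak the pedestal is $[(a-b)/(a+b)]^2$, which is zero for balanced eyes and grows as the weights become more unequal, while the modulation amplitude shrinks correspondingly.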

We have also simulated response time courses and disparity tuning curves of simple and complex cells to moving bars (results not shown). As in the grating case, the tuning curves for both simple and complex cells peak at locations predicted by Eq. 12, and the vertical alignment of the response time courses depends on whether the disparities are introduced symmetrically in the two eyes or not. For directional cells, the disparity tuning curves for the preferred and anti-preferred directions have the same peak locations although the response amplitudes differ markedly. These features are consistent with the experimental data in Fig. 4 of Poggio and Fischer (1977). For each bar sweep, the complex cells give longer responses than the corresponding simple cells because the former do not have the discrete on and off RF subregions.

RANDOM-DOT STEREOGRAMS. Poggio et al. (1985, 1988) also applied DRDSs to measure disparity tuning curves. In their experiments, each stereogram maintained a constant disparity during a trial, but the actual dot locations were randomly re-plotted from frame to frame. They found that simple cells do not show reliable disparity tuning to DRDSs but that complex cells do.

To investigate how reliably our model simple and complex cells were disparity-tuned to DRDSs, we computed, for each cell type, 1,000 disparity tuning curves from 1,000 independent sets of DRDSs, all generated from the same parameters. All DRDSs had a refresh rate of 100 Hz as in Poggio et al.'s experiments. Figure 7 shows the results. We also considered the effect of adding a spatial pooling stage to the complex cell responses (Fig. 7C, see METHODS). For clarity, only 30 randomly picked curves for each cell are shown in the top panels. The distribution histograms of the preferred disparities (bottom panels) are compiled from all 1,000 curves. It is clear from the figure that the peak location of the tuning curves is much more variable for the simple cells than for the complex cells and that spatial pooling helps to further improve the reliability of the complex cell responses. Specifically, 40, 77, and 99% of the tuning curves peak within 0.02° of the predicted preferred disparity for the simple cell, the complex cell without pooling, and the complex cell with pooling, respectively. Additional simulations show that for complex cells, the standard deviation of the peak locations is inversely proportional to the $\sigma$ of the two-dimensional Gaussian used for the spatial pooling. Since the number of cells ($N$) pooled is proportional to $\sigma^2$, the variability of the peak locations follows the inverse $\sqrt{N}$ law, as expected. However, the improvement from the simple cell to the complex cell (without pooling) is about twice that expected from the inverse $\sqrt{N}$ law because the four simple cells in the quadrature method are specifically picked to reduce variability.
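The inverse $\sqrt{N}$ law itself can be checked with a small Monte Carlo sketch. It does not simulate the model cells. It only assumes that pooling averages $N$ independent, identically distributed noisy estimates (unit-variance Gaussian noise here), and the function name and trial count are hypothetical.

```python
import math
import random
import statistics

def sd_of_pooled_estimate(n_pooled, n_trials=4000, seed=1):
    """Standard deviation of the mean of `n_pooled` independent N(0, 1) samples,
    estimated from `n_trials` repetitions."""
    rng = random.Random(seed)
    means = [sum(rng.gauss(0.0, 1.0) for _ in range(n_pooled)) / n_pooled
             for _ in range(n_trials)]
    return statistics.pstdev(means)

for n in (1, 4, 16, 64):
    # Compare the simulated SD with the 1/sqrt(N) prediction.
    print(n, round(sd_of_pooled_estimate(n), 3), round(1 / math.sqrt(n), 3))
```

If $N$ grows as $\sigma^2$, doubling the pooling $\sigma$ quadruples $N$ and should therefore halve the standard deviation, which is the inverse proportionality to $\sigma$ described above.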



Fig. 7. Disparity tuning curves of a model simple cell (A), and a model complex cell without (B) and with (C) spatial pooling, in response to DRDSs. Top: 30 disparity tuning curves obtained from 30 independent DRDSs. Each point on a curve was obtained by integrating the response over a period of 500 ms. The curves in a panel are normalized by the strongest response. Bottom: the distribution histograms of the peak locations, each compiled from 1,000 disparity tuning curves. The bin size of the histograms is 0.02°. The vertical lines indicate the predicted preferred disparities according to Eq. 12. The RF parameters of the simple cell (A) are $\omega_x^{o}/2\pi$ = 4 cycles/deg, $\omega_t^{o}/2\pi$ = 6 Hz, $\sigma_x$ = 0.1°, $\sigma_y$ = 0.2°, $\tau$ = 20 ms, $\phi_l$ = $\phi_r$ = 60°, $\phi_t$ = 0.1$\pi$, and $\eta$ = 0.6. The RFs are computed in a 3-dimensional region of 0.5° × 1° × 0.1 s. The complex cell (B) receives inputs from the simple cell and 3 other simple cells according to the quadrature method. C: the spatial pooling procedure (see METHODS) is added to the complex cell in B. The pooling function is a normalized, symmetric 2-dimensional Gaussian with a $\sigma$ of 0.1°. The dot size is 0.02 × 0.02° and dot density is 10%. The overall size, refresh rate, and duration of the stimuli are 1° × 1.2°, 100 Hz, and 0.5 s, respectively. The spatial and temporal sampling steps for these simulations are 0.01° and 5 ms, respectively.

Our simulation result, that disparity tuning curves to DRDSs are more reliable in complex cells than in simple cells, is in qualitative agreement with the experimental data of Poggio and coworkers (Poggio et al. 1985, 1988). Quantitatively, however, there may be some discrepancies. Although they did not publish any simple cell tuning curves to DRDSs, Poggio et al. (1985, 1988) reported that nearly all neurons responding to DRDSs are complex cells and that simple cells are not tuned to these stimuli. In contrast, the simulated tuning curves in Fig. 7A are not completely random but show a tendency to peak around the preferred disparity of the corresponding complex cell (marked by the vertical line in the figure). A close examination reveals that the disparity tuning trend of the model simple cell results from the fact that a small number of frames in each DRDS generate relatively reliable tuning because they happen to contain dot distributions that excite the cell strongly.

A closely related problem in Fig. 7A is that the response amplitudes of the simple cell to different sets of DRDSs fluctuated over a very large range (because some DRDSs happen to contain more frames that strongly excite the cell than other DRDSs). However, experimental data show that although some V1 cells occasionally give a strong response to one random-dot pattern and a weak response to another pattern, most cells have comparable responses to different random dot stimuli (Qian and Andersen 1995; Skottun et al. 1988; Snowden et al. 1992).

The preceding two problems can be resolved by introducing the following contrast response function to replace the half-squaring operation in Eq. 8
$$R[X] = \begin{cases} \dfrac{R_{max} X^{n}}{X^{n} + X_{50}^{n}}, & X \ge 0 \\ 0, & X < 0 \end{cases} \tag{18}$$
where R is the simple cell response, X is the result of linearly filtering a stereo stimulus through the binocular spatiotemporal RFs of the simple cell, and Rmax, X50, and n denote, respectively, the maximum response, the X at which the response reaches half its maximum value, and the exponent that determines the steepness of the function (Albrecht and Hamilton 1982; Sclar et al. 1990). It has been shown that this type of contrast response can be implemented by a normalization procedure following the half-squaring operation (Heeger 1992). Like the discharge of real simple cells, Eq. 18 saturates at high stimulus contrast. When n = 2, the equation reduces to half-squaring at low stimulus contrast. Since this function compresses the response range, it should effectively increase the contributions to tuning curves from those frames in a DRDS that evoke relatively weak responses, and consequently reduce the tuning reliability of the model simple cells because weak responses usually generate poor tuning curves. The simulation results confirm this expectation (Fig. 8). The simple cell's disparity tuning to DRDSs became much more variable while the tuning of the complex cell remained reliable, especially with spatial pooling. These results are more consistent with Poggio's experimental reports (Poggio et al. 1985, 1988) than are those in Fig. 7, although we cannot make a quantitative comparison due to the lack of published experimental data.
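The compressive effect of Eq. 18 can be seen in a minimal numerical sketch. The default parameter values are those listed for Fig. 8 (Rmax = 1, X50 = 10, n = 2). The function name and the two example inputs are hypothetical and chosen only for illustration.

```python
def contrast_response(x, r_max=1.0, x50=10.0, n=2.0):
    """Saturating contrast response of Eq. 18 applied to a linear filter output x."""
    if x < 0:
        return 0.0
    xn = x ** n
    return r_max * xn / (xn + x50 ** n)

# Compare how a weak and a strong linear response are weighted under
# half-squaring and under the saturating function.
weak, strong = 5.0, 50.0
ratio_half_squaring = (strong ** 2) / (weak ** 2)
ratio_saturating = contrast_response(strong) / contrast_response(weak)
print(ratio_half_squaring, ratio_saturating)
```

With these inputs the strong-to-weak response ratio is far smaller under Eq. 18 than under half-squaring, so frames that evoke weak responses contribute relatively more to the time-integrated tuning curve.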



Fig. 8. Disparity tuning curves to DRDSs with contrast saturation. The simulations are identical to those in Fig. 7 except that Eq. 8 is replaced by Eq. 18. The parameters of the contrast response function are Rmax = 1, X50 = 10, and n = 2.

We next simulated the responses of the cells used for Fig. 8 to MRDSs. The results are shown in Fig. 9. The simple cell's disparity tuning to MRDSs is clearly much more reliable than its tuning to DRDSs. The reason is that $\theta(t)$ in Eq. 9 varies randomly over time for DRDSs, while it changes smoothly for MRDSs. Since the temporal averaging of a continuous $\theta(t)$ is much closer to a constant than is the averaging of some random values, coherently moving stereograms should always generate more reliable disparity tuning curves than the random frames unless a very large number of frames (>200) is used (in which case both types of tuning curves become reliable). This is a specific prediction that can be tested physiologically. Poggio et al. measured disparity tuning of some V1 cells to MRDSs (Poggio et al. 1985, 1988). Unfortunately, they did not systematically compare the cells' responses to DRDSs and MRDSs but instead appeared to group the two types of stereograms together as the "cyclopean stimuli."
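The averaging argument can be illustrated numerically. The sketch below does not evaluate Eq. 9. It assumes only that the stimulus-dependent contribution behaves like the time average of $\cos\theta(t)$, that $\theta(t)$ advances linearly over a whole number of cycles for a coherently moving pattern (an idealization), and that it is drawn independently and uniformly on each frame for a dynamic pattern. The function names, the 50-frame count (0.5 s at 100 Hz), and the 4-cycle ramp are illustrative assumptions.

```python
import math
import random

def mean_cos_smooth(n_frames, n_cycles=1):
    """Time average of cos(theta) when theta advances uniformly over whole cycles."""
    return sum(math.cos(2 * math.pi * n_cycles * k / n_frames)
               for k in range(n_frames)) / n_frames

def rms_mean_cos_random(n_frames, n_trials=2000, seed=0):
    """RMS (over trials) of the time average of cos(theta) for random theta."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        m = sum(math.cos(rng.uniform(0.0, 2 * math.pi))
                for _ in range(n_frames)) / n_frames
        total += m * m
    return math.sqrt(total / n_trials)

smooth_residual = abs(mean_cos_smooth(50, n_cycles=4))
random_residual = rms_mean_cos_random(50)
print(smooth_residual, random_residual)
```

Under these assumptions the smooth case averages out almost exactly, whereas the random case leaves a residual that shrinks only as the inverse square root of the number of frames.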



Fig. 9. Disparity tuning curves to MRDSs with contrast saturation. The simulations are identical to those in Fig. 8 except that MRDSs are used. All MRDSs move leftward at a speed of 2°/s.

Finally, for the purpose of comparison, we also simulated the disparity tuning of the cells in Fig. 8 to static random-dot stereograms (SRDSs). The results are shown in Fig. 10. Consistent with our previous simulations with spatial RFs only (Qian 1994; Qian and Zhu 1997; Zhu and Qian 1996), the simple cell showed completely random disparity tuning curves when different sets of SRDSs were used, while the complex cell maintained reasonable tuning reliability when spatial pooling was applied. Moreover, for all cell types, disparity tuning to SRDSs is not as reliable as that to DRDSs, which in turn is not as reliable as the tuning to MRDSs. This is easy to understand because for static patterns there is only a single value for $\theta(t)$ in Eq. 9, and therefore temporal integration does not help to reduce the influence of the first cosine term in the equation.



Fig. 10. Disparity tuning curves to SRDSs with contrast saturation. The simulations are identical to those in Fig. 8 except that SRDSs are used.


    DISCUSSION

The main goal of this paper is to understand how V1 cells respond to binocular disparity in time-varying stimuli. We introduced a specific function that conveniently describes temporal response profiles of real cortical cells including the transient (or band-pass) and the sustained (low-pass) types. We then incorporated this temporal function into the disparity energy model (Ohzawa et al. 1990; Qian 1994) and found that the binocular interaction RFs of V1 complex cells, with the typical disparity-time separability in the D - T plot (Ohzawa et al. 1997), can be explained. The disparity part is a Gabor function and the time part is always positive. Finally, we investigated how the model simple and complex cells respond to various time-varying stimuli, including motion-in-depth patterns, drifting gratings, moving bars, MRDSs and DRDSs. We found that the simulated tuning curves agree with the extant experimental data quite well (Cynader and Regan 1978; Ohzawa and Freeman 1986a,b; Poggio and Fischer 1977; Poggio and Talbot 1981; Poggio et al. 1985). Our results indicate that both spatial pooling and temporal averaging can significantly improve the reliability of disparity tuning and that in general, complex cells are much better disparity detectors than simple cells (Ohzawa et al. 19