1Department of Psychology, Stanford University, Stanford, California 94305-2130; and 2University Laboratory of Physiology, Oxford OX1 3PT, United Kingdom
Submitted 29 October 2003; accepted in final form 5 April 2004
| ABSTRACT |
|---|
|
|
|---|
) showed an equal adaptation to both; and 3) early visual areas (V1, V2, V3) showed a small effect in both experiments. These results indicate that processing in dorsal areas may rely mostly on information about absolute disparities, while ventral areas split neural resources between the two types of stereoscopic information so as to maintain an important representation of relative disparity.
| INTRODUCTION |
|---|
|
|
|---|
In the following decades, psychophysical and computational work emphasized the distinction between two stages in the processing of stereoscopic information (Marr 1985): 1) local matching of the retinal images to obtain estimates of the absolute disparity of objects or surfaces in reference to where the eyes are fixating; and 2) a more perceptually useful representation based on the relative disparity between different objects, independent of fixation depth. We use the term "absolute disparity" to mean absolute retinal disparity. The difference between absolute and relative disparity is shown in Fig. 1.
The neural substrate of human stereoscopic vision has been studied previously using visually evoked potentials (Braddick and Atkinson 1983; Fiorentini and Maffei 1970; Lehmann and Julesz 1978; Norcia and Tyler 1984; Norcia et al. 1985; Regan and Spekreijse 1970), PET (Gulyas and Roland 1994), and functional MRI (fMRI) (Backus et al. 2001; Gilaie Dotan et al. 2002; Kwee et al. 1999; Nakadomari et al. 1999; Ptito et al. 1993; Rutschmann and Greenlee 1999; Tsao et al. 2003). The number of studies is rather small, mostly because binocular presentation of visual stimuli can pose technical difficulties when constrained by PET and fMRI scanners. There is general agreement from these studies that disparity signals are spread throughout the visual cortex with, as yet, no putative "depth area" analogous to MT for motion.
We opted for a design that exploits the phenomenon of neural adaptation to assess selectivity for absolute and relative disparity. This method has previously been used to identify visual areas selective for objects (Grill-Spector et al. 1999), motion (Huk and Heeger 2002; Huk et al. 2001), color (Engel and Furmanski 2001), and shape (Kourtzi and Kanwisher 2001). It relies on a comparison of the fMRI response to stimuli under two conditions, one in which the attribute of interest remains constant, thus causing adaptation in neurons sensitive to this attribute, and one in which the attribute is varied so as to avoid adaptation. Any reduction in the fMRI signal should be a consequence of the reduction in response from adapted neurons selective for the attribute being tested. The major advantage of this approach is that it targets specific subpopulations of neurons, as response differences can be referred back to adaptation-tagged neurons (Grill-Spector and Malach 2001). In conventional fMRI experiments, on the other hand, it is not possible to know whether response differences (or lack of differences) between stimulus conditions are due to changes in the activity of one subpopulation of neurons, changes in how activity is subdivided across different subpopulations, or both.
We presented observers with two pairs of transparent planes in depth: one pair above and the other below fixation. For each pair of planes, we could independently vary the distance in depth between each plane and fixation (absolute disparity) and the distance in depth between the two planes (relative disparity). We measured adaptation as a function of both manipulations and found that 1) dorsal areas (V3A, MT+/V5, V7) adapted only to absolute disparity; 2) ventral areas (hV4, V8/V4α) adapted to both absolute and relative disparity; and 3) early visual areas (V1, V2, V3) showed a small effect in both experiments.
| METHODS |
|---|
|
|
|---|
The experiments were undertaken with the written consent of each subject and in compliance with the safety guidelines for MRI research. Subjects participated in multiple MRI scanning sessions on different days: one to obtain a standard, high-resolution, anatomical scan; one to functionally define retinotopic visual areas; one to provide further definition for area MT+/V5; and seven sessions to measure fMRI responses in the different experimental conditions (2 sessions for absolute disparity, 2 for relative disparity, 1 for both absolute and relative, 1 for the control experiment, and 1 for the baseline measurement). Across all sessions, each subject performed between 24 and 30 scans for each of the absolute and relative disparity experiments. All subjects had normal or corrected-to-normal vision.
Experimental set-up and visual stimuli
Observers (lying on their backs) viewed stimuli through a custom-made stereoscope (made of a set of 2 mirrors and a pair of 8x binoculars, 320 cm from the display) that allowed separate projection of left and right halves of the LCD monitor (NEC multisynch LCD 2000, placed inside a Faraday box) to left and right eyes (Fig. 2A). To ensure that the left eye did not see the image meant for the right eye, and vice versa, subjects held a septum between their knees. A bite bar stabilized subjects' heads.
Psychophysical task
It is critical to control the subject's attention when attempting to measure stimulus-evoked responses in visual cortex. Several studies have shown that the attentional state of the observer can have dramatic effects on fMRI signals in visual cortex (Brefczynski and DeYoe 1999; Gandhi et al. 1999; Huk et al. 2001; Kastner et al. 1999; Martinez et al. 1999; Ress et al. 2000; Tootell et al. 1998). Moreover, trial-to-trial variability in attention is correlated with trial-to-trial variability in the amplitude of the fMRI responses (Ress et al. 2000). To control attention in our experiments, the subject performed a difficult disparity-discrimination task in all experimental and control conditions.
The subjects' task was to detect the pair of depth planes that contained the larger interplane depth distance (difference in absolute disparity between 2 planes belonging to the same pair). On each trial, the interplane distance was nearly the same for both pairs of planes, except one pair had a small additional disparity Δd (from 5 to 25% of the interplane distance; Fig. 2C). While in the scanner, subjects were asked to choose either the top or the bottom pair by pressing one of two keys during the 1 s following stimulus presentation. No response was recorded if subjects failed to respond within 1 s, but this rarely occurred. Auditory feedback was provided (correct/incorrect) immediately following each trial.
In one subject (PN), Δd was held constant at psychophysical threshold (absolute disparity experiment, 10% of interplane distance; relative disparity experiment, 7%). For the other two subjects, task difficulty was adjusted dynamically using a staircase procedure: the interplane disparity difference was decreased slightly after two consecutive correct responses, and it was increased slightly after an incorrect response. Each block started with the smallest Δd.
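The staircase rule described above can be sketched as follows. This is only an illustration under stated assumptions: the step size (5% of the interplane distance), the clamping to the 5 to 25% range, and the reset of the correct-response counter after each step are not given in the text, and the function names are hypothetical.

```python
def staircase_update(level, correct, streak, step=0.05, lo=0.05, hi=0.25):
    """Two-down/one-up rule. Returns (new_level, new_streak).

    level  : additional disparity as a fraction of the interplane distance
    streak : number of consecutive correct responses so far
    step, lo, hi are assumed values, not taken from the paper.
    """
    if correct:
        streak += 1
        if streak == 2:                      # two consecutive correct: harder
            level = max(lo, round(level - step, 10))
            streak = 0                       # assumed: counter resets after a step
    else:                                    # any error: easier
        level = min(hi, round(level + step, 10))
        streak = 0
    return level, streak


def run_block(responses, start=0.05, **kwargs):
    """Return the level shown on each trial, starting from the smallest level."""
    level, streak, history = start, 0, []
    for correct in responses:
        history.append(level)
        level, streak = staircase_update(level, correct, streak, **kwargs)
    return history


if __name__ == "__main__":
    # One error, two correct responses, then two errors.
    print(run_block([False, True, True, False, False]))
```

A two-down/one-up rule of this kind makes the task easier after every error but harder only after a run of correct responses, which keeps accuracy well above chance while staying short of ceiling.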
Main experiments
Two experiments were run to study absolute and relative disparity separately. Scanning sessions consisted of 10 to 12 repeated scans. Each scan utilized a block alternation procedure containing 12 cycles of block-alternation, and each block consisted of eight trials (2 s/trial).
In the absolute disparity experiment, interplane distance was kept constant in all trials throughout the scan (with the exception of Δd used for the psychophysical task), thus keeping local relative disparity unchanged (Fig. 2D). On "same" blocks, both pairs of planes were at ±18 arcmin (centered around fixation), providing constant local absolute disparity. On "mixed" blocks, each of the eight trials was randomly assigned (without replacement) one of the eight possible combinations of absolute disparities (Table 1; Fig. 2D). The top and bottom pairs were symmetrically placed in depth around fixation.
Control experiments were performed to determine if the decreased response during "same" blocks in our main experiments could be explained by a lower responsivity to the specific absolute (or relative) disparity that was repeatedly presented during that block compared with the disparities chosen for the "mixed" block. In the control experiments, we measured the summed response to the "mixed" disparities and compared it with the response to the "same" disparity. Each session consisted of 16 scans, 8 corresponding to the 8 different absolute disparities presented in the "mixed" block of the absolute disparity main experiment (Table 1), and 8 corresponding to the relative disparities (Table 2). Each scan alternated between a pair of stimulus conditions, and subjects performed the same task as in the main experiment, using the same staircase procedure.
For the absolute disparity control, one block consisted of four repeated presentations of the stimulus that was used in the "same" block of the absolute disparity experiment (±18 arcmin), while the other block consisted of four repeated presentations of one of the eight absolute disparity stimuli (Table 1). In other words, across the eight scans, we showed exactly the same stimuli as in the absolute disparity main experiment, but they were reordered to present the eight stimuli from the "mixed" block in eight separate scans. The data were analyzed by averaging across these eight scans. For the relative disparity control, likewise, one block consisted of four repeated presentations of a stimulus that had the same relative disparity (±18 arcmin) used in the "same" block of the relative disparity experiment (which was also the same stimulus used in the absolute disparity control), while the other block in each scan consisted of four repeated presentations of one of the eight relative disparity stimuli (Table 2).
To understand the logic of the control experiment, we begin by assuming that there was no adaptation and that the response modulations observed in the main experiment were caused by a lower responsivity to the stimuli in the "same" block than to those in the "mixed" block. If this were true (in the complete absence of adaptation), the control experiment would have given the same results as the main experiment. It did not (see Table 3), from which we conclude that the responses measured in the main experiment were caused by adaptation to the repeated stimulus presentations in the "same" block, not by a lower responsivity to the stimuli in the "same" block.
Baseline scans
To properly compare the strength of adaptation effects, it is necessary to take into account possible differences in baseline responsivity across visual areas. We therefore defined a "disparity selectivity index," which we computed as the ratio of the mean response from the adaptation scans to the mean response elicited during a separate series of baseline scans, separately for each subject.
The baseline scans consisted of 12 blocks, eight trials per block (as in the main experiment). During the "on" block, we presented a version of the stimulus in which each dot in each plane was randomly assigned one of the absolute disparities used in the main experiment. This generated two clouds of dots in depth, one above and one below fixation. During the "off" block, no stimulus was presented (other than the fixation square). Subjects were not performing any task.
Defining the visual areas
Visual areas were defined using well-established methods that have been extensively described previously (DeYoe et al. 1996; Engel et al. 1994, 1997; Sereno et al. 1995). Briefly, we presented rotating and expanding-contracting stimuli that evoked traveling waves of activity in retinotopic visual areas. To visualize the retinotopic maps, we rendered the fMRI data on a computationally flattened representation (flat map) of each subject's brain using software developed at Stanford University (Teo et al. 1998; Wandell et al. 2000). These procedures allowed for the determination of area boundaries and identification according to well-known anatomical arrangements. We mapped V1, V2, V3, V3A, hV4 (Wade et al. 2002), V7 (Press et al. 2001; Tootell et al. 1998), and V8 (Hadjikhani et al. 1998) in each subject. The anterior borders of V7 and V8 were often difficult to define precisely. We defined those areas consistent with the retinotopic maps and so that their size would be comparable to that of the early areas. Because our analysis focuses on differences between groups of areas, an exact definition of area boundaries is not crucial to our conclusions. There is some debate over the definition and nomenclature of some of these visual areas, such as V8 versus V4α (Wandell and Wade 2003; Zeki and Bartels 1999) and V4v versus hV4 (Fize et al. 2003; Wade et al. 2002). We adopted hV4 as defined by Wade et al. (2002), and we defined V8 as an area adjacent to hV4 with a retinotopic map of a quarter field.
Area MT+ (the MT complex), also known as V5, was further defined in a separate scanning session as a contiguous region of gray matter in the occipital extension of the inferior temporal cortex that responded more strongly to full field moving dots than to a stationary dot pattern (Huk et al. 2002; Tootell et al. 1995; Zeki et al. 1991).
Each of the above visual areas was further restricted according to the measured responses from "reference" scans. This was done for two reasons. First, the reference scan responses were used to extract the subregion of each visual area that corresponded retinotopically to the stimulus aperture. Second, we have found that doing so reduces session-to-session variability in the fMRI responses. We believe that some of the session-to-session variability derives from small differences in slice orientation and position such that voxels that were mostly gray matter in one session were only partially gray matter in a subsequent session. A liberal statistical threshold on the reference scan responses effectively compensated for this source of variability. The reference scans, which were run at the beginning and at the end of each session, consisted of 12 blocks, eight trials per block (as in the main experiment). During the "on" block, we presented the stimulus with all planes at zero (fixation) absolute disparity, thus reducing it to two square planes, one above and one below fixation. During the "off" block, no stimulus was presented (other than the fixation square). Subjects did not perform a task during the reference scans. We selected a subregion of gray matter within each visual area that was correlated with the stimulus alternations in the reference scans. The data were analyzed with several different choices of correlation threshold to ensure that our conclusions did not depend on a particular value.
The specific stimuli used for the reference and baseline conditions were somewhat arbitrary. The only critical issues in choosing these stimuli were that 1) the stimuli evoked strong responses in all of the visual cortical areas and 2) we needed to acquire separate, statistically independent data for each of these two purposes. We could have used the same stimuli for the reference scans and for the baseline scans. In fact we directly compared the spatial extent of activity to the reference and baseline stimuli and found that there was virtually no difference between the two conditions. Furthermore, we also re-analyzed our data using the reference scan stimulus (rather than the baseline condition) to calculate the adaptation indices, and again the results were almost identical.
Acquisition and analysis of fMRI data
MR imaging was performed on a GE 3T scanner with custom-designed dual surface coils (NMSC-002-TR-3GE transmit-receive coil, Nova Medical, Wakefield, MA).
Each scanning session began by acquiring a set of T1-weighted structural images using a spin-echo pulse sequence (500-ms repetition time, 15-ms echo time, 90° flip angle) in the same slices as the functional images. These inplane anatomical images were aligned to a high-resolution anatomical scan of each subject's brain using custom software (Nestares and Heeger 2000), so that the functional data (across multiple scanning sessions) from a given subject were co-registered.
fMRI scans were performed using a T2*-sensitive, gradient-recalled echo, spiral pulse sequence (Glover 1999; Glover and Lai 1998). Eleven obliquely oriented slices (either parallel or perpendicular to the calcarine sulcus) were acquired every 1.3 s (TR = 666 ms with 2 interleaves/frame), with an effective spatial resolution of 2.9 × 2.9 × 4 mm.
The fMRI data were preprocessed by 1) discarding the first block of each scan to minimize transient effects of magnetic saturation and to allow the hemodynamics to reach steady state, 2) removing the linear trend in the time series at each voxel to compensate for slow signal drift (Smith et al. 1999), 3) dividing each voxel's time series by its mean intensity to convert the data from arbitrary image intensity units to percent signal modulation and to compensate for the decrease in mean image intensity with distance from the coil, and 4) averaging the resulting time-series across the gray matter subregion of each visual area that corresponded retinotopically to the stimulus.
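The four preprocessing steps can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name and arguments are hypothetical, the trend is removed as a least-squares slope while the mean is kept (so that step 3 can still divide by it), and the number of frames per block is left as a parameter because it depends on the acquisition timing.

```python
from statistics import mean


def preprocess(voxels, frames_per_block):
    """Return the region-averaged percent-modulation time series.

    voxels           : list of per-voxel time series (raw image intensities)
    frames_per_block : number of frames in one block (assumed known)
    """
    cleaned = []
    for ts in voxels:
        ts = ts[frames_per_block:]                    # 1) discard the first block
        n = len(ts)
        t = [i - (n - 1) / 2 for i in range(n)]       # centered time axis (sums to 0)
        m = mean(ts)
        # Least-squares slope of intensity against time.
        slope = sum(ti * (y - m) for ti, y in zip(t, ts)) / sum(ti * ti for ti in t)
        detrended = [y - slope * ti for ti, y in zip(t, ts)]   # 2) remove trend, keep mean
        cleaned.append([100.0 * (y / m - 1.0) for y in detrended])  # 3) percent modulation
    # 4) average across the voxels of the region, frame by frame
    return [mean(frame) for frame in zip(*cleaned)]


if __name__ == "__main__":
    drifting = [999, 999, 100, 101, 102, 103]         # pure linear drift after block 1
    print(preprocess([drifting], frames_per_block=2))
```

Dividing by the voxel mean before averaging means that voxels far from the surface coil, which have lower raw intensity, contribute on the same scale as voxels close to it.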
We quantified the stimulus-evoked responses by fitting a sinusoid to the resulting average time series. The amplitude and phase of this sinusoid equal the component of the discrete Fourier transform (DFT) at the block alternation period. The amplitude reflects the difference in cortical activity evoked by alternating between the two stimulus conditions. The phase reflects the temporal delay caused by the sluggish hemodynamics. The hemodynamic delay was estimated by computing the vector average of all the bivariate (amplitude and phase) responses from all of the experimental conditions, separately for each visual area in each subject. Then the response amplitudes were estimated by projecting the bivariate responses onto a line corresponding to the estimated hemodynamic delay (Heeger et al. 1999). This response amplitude was positive when the blood oxygenation level-dependent (BOLD) signal evoked during the "mixed" block was larger than the "same" block. The response amplitudes were averaged across the repeated scans for each visual area and each observer.
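The sinusoid fit and the projection onto the estimated hemodynamic delay can be illustrated as below. The sketch assumes that the analyzed time series contains a whole number of block-alternation cycles, so that the fitted sinusoid coincides with a single DFT bin; the function names and the synthetic data are hypothetical.

```python
import cmath
import math


def fourier_component(ts, n_cycles):
    """Complex amplitude of the sinusoid with n_cycles cycles over the series.

    For ts[k] = A*cos(2*pi*n_cycles*k/n - phi) this returns A*exp(-1j*phi),
    provided 0 < n_cycles < n/2.
    """
    n = len(ts)
    return 2.0 / n * sum(
        y * cmath.exp(-2j * math.pi * n_cycles * k / n) for k, y in enumerate(ts)
    )


def projected_amplitudes(components):
    """Project each bivariate response onto the phase of their vector average.

    The vector average stands in for the hemodynamic delay; it must be
    nonzero for the projection to be defined.
    """
    ref = sum(components) / len(components)
    unit = ref / abs(ref)
    return [(c * unit.conjugate()).real for c in components]


if __name__ == "__main__":
    n, cycles = 16, 2
    ts = [3.0 * math.cos(2 * math.pi * cycles * k / n - 0.5) for k in range(n)]
    c = fourier_component(ts, cycles)
    print(abs(c), cmath.phase(c))
    print(projected_amplitudes([c, -0.5 * c]))
```

Projecting onto a common phase, rather than taking the magnitude of each response, is what allows an amplitude to come out negative when the response is in antiphase with the average delay.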
Adaptation indices were computed from the response amplitudes as shown in Fig. 3. The three small panels show time series (data for 1 subject averaged across scans) from the baseline scan (Fig. 3A), the absolute disparity adaptation experiment (Fig. 3B), and the relative disparity adaptation experiment (Fig. 3C). For each of these conditions, we computed a response amplitude in each visual area. The absolute disparity adaptation index was computed by dividing the response amplitude from the absolute disparity experiment by the response amplitude from the baseline experiment; this is the quantity on the abscissa in Fig. 3D. The relative disparity adaptation index was similarly obtained and plotted on the ordinate. A positive adaptation index indicates adaptation (i.e., smaller response during "same" blocks compared with "mixed" blocks). This procedure of normalizing by a baseline response is the same as that used by Huk and Heeger (2002) and Backus et al. (2001), and is analogous to procedures used in neurophysiology for computing selectivity indices.
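As a worked illustration of the normalization, the index is simply a ratio of two response amplitudes. The amplitudes below are made-up numbers chosen for the example, not values from this study, and the function name is hypothetical.

```python
def adaptation_index(adaptation_amp, baseline_amp):
    """Ratio of the adaptation-scan amplitude to the baseline-scan amplitude."""
    if baseline_amp == 0:
        raise ValueError("baseline amplitude must be nonzero")
    return adaptation_amp / baseline_amp


if __name__ == "__main__":
    # Hypothetical amplitudes (percent signal modulation) for one visual area.
    baseline = 1.0
    absolute = 0.25     # "mixed" minus "same", absolute disparity experiment
    relative = -0.05    # "mixed" minus "same", relative disparity experiment
    print(adaptation_index(absolute, baseline))   # abscissa value for this area
    print(adaptation_index(relative, baseline))   # ordinate value for this area
```

With these illustrative numbers the area would sit to the right of the origin and slightly below the horizontal axis: adaptation to absolute disparity and none to relative disparity.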
| RESULTS |
|---|
|
|
|---|
We observed effects consistent with adaptation in both experiments, but the adaptation effects differed across visual areas (Fig. 4). Early areas (Fig. 4, black symbols) exhibited small (but significant, see Table 4) effects that were roughly the same for both types of disparity. Dorsal areas (Fig. 4, blue symbols), on the other hand, departed from early areas in that they showed strong adaptation to absolute disparity (stronger than early areas, see Table 5), but no adaptation to relative disparity (Table 4). This is clearly visible in the projection histogram at the top right of Fig. 4, where the blue data points are shifted away from the unity line. Ventral areas (Fig. 4, red symbols) showed equal and significant (Table 4) adaptation to both types of disparity, but the amount of adaptation (to both relative and absolute disparity) was greater than in early areas (highly significant, Table 5). This pattern is summarized by the two arrows in Fig. 4, which start at the average for the early visual areas and point to the averages for the dorsal and ventral areas. The projection histogram at the bottom of the plot shows the adaptation to absolute disparity and shows the greater adaptation for both dorsal and ventral areas (red and blue bars) compared with the early areas (black bars). Similarly, the histogram on the left shows the adaptation to relative disparity and shows the greater adaptation for ventral areas (red bars) than for early and dorsal areas (black and blue bars).
Reductions in activity during "same" blocks compared with "mixed" blocks could be explained in two different ways. The first explanation is based on adaptation: the single disparity that is repeatedly presented in the "same" block leads to adaptation of the neurons that respond to it and therefore to reduced neural activity. This does not happen during the "mixed" blocks, because different disparities are presented and never repeated. Our current hypothesis is based on this explanation. However, an alternative explanation could be that the disparity we chose for the "same" block was less effective per se (in the absence of adaptation) in driving neuronal activity than the disparities chosen for the "mixed" block. This would be sufficient to cause a smaller response in the "same" block. To verify that this potential explanation did not apply in our experiments, we ran a control experiment designed to measure the responses to each of the 16 stimulus conditions (8 for absolute, 8 for relative) separately (see METHODS).
The results of the control experiments indicate that our main results were due to adaptation and not simply to differences in responsivity to the different disparity levels (Fig. 5). If our results were due to adaptation, the effects we observed in our main experiments should disappear in the control. If, on the other hand, the effects we observed were simply due to differences in responses to the different disparity levels, the outcome of the control experiment should look very similar to that obtained in the main experiment. Most of the data from the control experiments showed no effect (the data points clearly cluster around the origin). When averaged across subjects, the data showed a small effect of negative absolute disparity adaptation in three visual areas (Table 4), but there were no statistically significant differences between groups of areas (early, dorsal, and ventral) in either absolute or relative disparity adaptation indices (Table 5).
These results were also evident in the fMRI response amplitudes, separately for each subject and visual area (Fig. 6; Table 3B). The responses were small or negative for both control experiments in all three subjects (Fig. 6). In only a few cases were the responses statistically significant (Table 3B).
Psychophysics
The purpose of the psychophysical task was to engage subjects equally in both the "same" and the "mixed" blocks, so that differences in activation between these two blocks would not be dominated by the subjects' attentional state (Huk and Heeger 2000). To that end, we adjusted the task difficulty to control performance accuracy as a proxy for controlling attention.
Performance was not perfectly matched between blocks, but the same small differences were present in both the main and control experiments (Table 6). The mismatch in performance resulted from a technical limitation on the number of interplane disparities that could be sampled by the staircase procedure. While there were differences in psychophysical performance between (same and mixed) blocks, these differences were present in both our main experiments and in our control experiments, but significant adaptation effects were only observed in the former. We can conclude that adaptation effects observed in the main experiment were not a consequence of differences in psychophysical performance.
| DISCUSSION |
|---|
|
|
|---|
Our results can be summarized as follows. In early visual areas (V1, V2, and V3), there were small effects in both the absolute and relative disparity experiments. In dorsal and ventral streams, different patterns emerged. Dorsal areas (V3A, V7, and MT+) exhibited adaptation to (and hence, selectivity for) absolute disparity but not relative disparity. Ventral areas (hV4 and V8), on the other hand, exhibited equal adaptation to both types of disparity, implying either that an equal number of neurons in these areas are selective for both absolute and relative disparity or that the neurons exhibit partial selectivity for relative disparity. The control experiment allowed us to exclude the possibility that these results were due to differential responses to the particular stimulus configurations in the absence of adaptation. The psychophysical data allowed us to exclude the possibility that our results were due to differences in attention or behavioral performance across conditions.
Overall size of the effects
The effects of adaptation that we observed were rather small; the overall mean adaptation index was 0.1, meaning that the response differences due to adaptation were only 10% as large as those evoked by alternating the stimuli with a blank screen. There are at least two reasons for the small size of the effects. First, not all neurons in early visual areas are selective for disparity, and even among "disparity-selective" cells, there is a continuum of selectivity (DeAngelis and Newsome 1999; Prince et al. 2002b; Roy et al. 1992). Second, our experimental protocol only exposed adaptation in a fraction of this disparity-selective subpopulation. Our experiments targeted neurons with a disparity tuning profile that was reasonably selective for one of our stimulus conditions (with a particular absolute disparity or a particular relative inter-plane disparity) but not other stimuli (with disparities as nearby as ±4 arcmin). Electrophysiological (Prince et al. 2000, 2002b) and psychophysical (Cormack et al. 1993; Neri et al. 1999; Stevenson et al. 1992) studies indicate that it is reasonable to expect this degree of selectivity in some disparity-selective neurons. Many neurons are, however, responsive to a broad range of disparities (1°) (Prince et al. 2002a). These neurons would respond, and consequently adapt, equally to stimuli presented in both ("same" and "mixed") blocks; therefore no differential effect would be observed.
Effect of relative disparity between upper and lower hemifields
Is it possible that the interpretation of our results is confounded by the presence of neurons whose receptive fields were large enough to span both pairs of planes? When referring to relative disparity in the design of these experiments, we mean the local interplane distance restricted to a spatial region within either the top or bottom pair of planes. Because all the visual areas explored in this study were defined by their retinotopic organization, it is unlikely that the receptive fields of many neurons spanned both upper and lower hemifields. We accept, however, that there could have been some neurons (e.g., with central receptive fields) for which this was the case. In the absolute disparity experiment "mixed" block, there was a changing relative disparity between the upper and lower pairs of planes. If a significant proportion of neurons were selective to the relative disparity between the upper and lower hemifields in a particular visual area, this change in relative disparity would predict the larger response that was found in the "mixed" block in the absolute disparity experiment. Thus if we only consider the results of this experiment, it is not possible to distinguish adaptation to relative disparity between the upper and lower visual fields from adaptation to absolute disparity within a stimulus pair. However, the interpretation of our results is unambiguous when we take into account the results of the relative disparity experiment. In the relative disparity experiment, the relative disparity between the upper and lower visual fields was constant during the "mixed" block, and it was changing during the "same" block. Therefore if it were the relative disparity between the upper and lower pairs of planes causing the adaptation in the absolute disparity experiment, one would also expect a larger response to the "same" condition in the relative disparity experiment (i.e., a negative adaptation to relative disparity). 
Because this only occurred in a single subject (Table 3A, subject HB, visual areas V3A, V7, and MT+), we suggest that this is an unlikely explanation of our results.
Potential role of featural attention
Detailed analysis of our psychophysical data allows us to exclude the possibility that our results were due to nonspecific changes in the attentional state of our subjects (see RESULTS). However, it seems possible that our results could be partially influenced by more specific extraretinal signals, such as featural attention. Our relative disparity task required observers to select different regions of depth space in different conditions, and it is possible that the attentional strategies underlying these selection processes were different for the different stimulus conditions we explored. For example, the "same" block in the absolute disparity experiment was the only condition in which observers could always focus on the same depth region. Another interpretation along similar lines may be that top-down attention is directed to regions of cortex encoding the property that varies rather than to those representing the property that remains constant; in this scenario, our signals would not reflect neural adaptation but rather top-down attentional shifts.
It appears to us that any interpretation along these lines would have to be rather convoluted to account for all our results, including the differences we observed across visual areas, and that such formulations would not be aided by solid experimental evidence. Instead, we provide a physiologically justified interpretation of our data that is consistent with a large number of previous studies on this subject.
Possible physiological interpretations
In monkey electrophysiology, data directly pertaining to the issue addressed in this paper (coding of absolute and relative disparity) is available for areas V1, V2, MT, and IT. However, there are many different stimuli that can be used for investigating relative disparity tuning. The current experiment used transparent planes, while many single unit studies used a center-surround relative disparity configuration. There is currently no evidence to suggest whether or not these different types of stimuli will be coded with the same neurons. We also briefly consider studies measuring the response of single neurons to binocularly correlated and anti-correlated random dot stereograms (RDS), as depth is only perceived in correlated stereo-pairs. If neurons in specific visual areas represent perceived depth, one might expect neurons in such areas to be both selective for relative disparity, and only respond to correlated RDS stimuli.
Our results in early visual areas differ from the single-unit physiology. Single unit recordings have shown that V1 neurons are only selective for absolute disparity (Cumming and Parker 1999). V2, on the other hand, contains a small but significant number of units that show partial selectivity for relative disparity (Thomas et al. 2002). No such distinction between V1 and V2 was seen in our data. Rather, we observed small but roughly equal effects for both relative and absolute disparities in V1, V2, and V3. Our failure to find a difference between V1 and V2 in selectivity for relative disparity might follow from the fact that any such difference would be expected to be small when averaged across the entire neural population (with a small proportion of V2 neurons exhibiting only partial selectivity for relative disparity). The adaptation to relative disparity that we observed in V1 is not consistent with previous literature (Cumming and Parker 1999), and we do not have an obvious explanation for it. One possibility is that it may reflect feedback from ventral areas, which did show a stronger effect for relative disparity. The overall conclusion that we draw with respect to early areas is that, consistent with the physiology, they are likely to be involved in the early steps of stereoscopic processing. This is also in line with studies using correlated and anticorrelated random dot stereograms (Cumming and Parker 1997).
The only dorsal area for which we have relevant electrophysiological data is MT. Using center-surround RDS stimuli, Uka and DeAngelis (2002) reported that there is no coding of relative disparity in MT, and a similar conclusion was reached by Tsao et al. (2003). Our fMRI results are in agreement with these studies and actually take an even stronger stand in this respect: the data in Fig. 4 suggest that, as one proceeds along the dorsal stream, stereoscopic processing becomes more involved with information about absolute disparity and less with relative. There is, however, evidence that some MT neurons are selective for relative disparity in that they respond selectively to slanted surfaces, irrespective of the absolute position in depth of the surface (Nguyenkim and DeAngelis 2003). Our results may also seem inconsistent with micro-stimulation studies in which it was shown that activity in MT can influence stereoscopic judgements (DeAngelis et al. 1998), but we would argue that this is not necessarily the case. What these micro-stimulation studies show is that MT is part of the circuitry that leads to the final stereoscopic percept (as assessed using certain behavioral parameters), in the same way that the computation of absolute disparities is part of the processing that leads to the computation of relative disparities. However, they do not imply that the neural representation of the final stereoscopic percept must reside in MT. This consideration also applies to psychophysical studies in which it was shown that anticorrelated signals affect behavioral responses in a way that resembles very closely the response of neurons in V1 (Neri et al. 1999). While the earliest stage of binocular combination can be exposed psychophysically in conditions in which it affects behavioral performance, it does not mean that the final percept actually resides at that stage.
Ventral area IT neurons are selective for disparity-defined shapes, a computation that requires the use of relative disparity (Janssen et al. 2000). However, such stimuli are more complicated than the transparent planes used in this study, because they contain disparity gradients and curvature. The same group showed that neurons in IT do not respond to anticorrelated random-dot stimuli (Janssen et al. 2003). It is possible that anticorrelated signals are discarded at an earlier stage between V1 and IT, but at least at the level of this ventral area neuronal signals seem to correlate more closely with perception. The fMRI results presented here point in a direction that is consistent with this electrophysiological evidence. Our data indicate that if one is to find a visual area that codes predominantly for relative disparity, this is likely to be along the ventral, and not the dorsal, stream.
Our results are also in agreement with previous fMRI studies of stereoscopic depth perception. Backus et al. (2001) addressed the relationship between neural activity and stereoscopic depth perception. These authors showed that most retinotopically defined areas in visual cortex display activity that correlates with behavioral parameters of stereoscopic perception; this trend became more evident progressing from V1 toward extrastriate cortex and was particularly marked in area V3A. A recent study by Tsao et al. (2003) examined disparity sensitivity using fMRI in both macaque monkeys and humans. Using a disparity checkerboard pattern, they showed a similar pattern of activation to Backus et al. (2001) and also showed that MT does not respond well to this edge-rich disparity pattern but prefers large disparity patterns that changed coherently (i.e., without relative disparity). Neither of these studies, however, was specifically designed to assess selectivity for relative versus absolute disparity. Although Tsao et al. (2003) did attempt to address this issue, their experiments did not allow for a segregation of neural subpopulations specifically encoding for relative disparity. Their relative disparity condition contained (obviously) strong absolute disparity signals, and it is conceivable that their absolute disparity condition evoked responses in relative disparity selective units.
It is widely believed that the phenomena of stereovision require an explanation in terms of the relative changes in the responses of subpopulations of neurons with different disparity selectivities. An increase in overall level of activity in a cortical area in response to a stereoscopic stimulus does not mean that neurons in that area are selective for disparity. Such an increase in activity might, to the contrary, be caused by increased attention to the stereoscopic percept (Huk et al. 2001). Likewise, a lack of an increase does not imply that the neurons are not selective. For example, a particular cortical area may contain subpopulations of neurons with different selectivities such that changing the stimulus shifts the activity from one subpopulation to another, while overall activity remains unchanged. Our goal in using an adaptation protocol was to explicitly assess selectivity for absolute and relative disparity to identify the roles played by individual brain regions in different computational stages of disparity processing. Adaptation can be used to reveal the selectivities of subpopulations of neurons in the human brain, even when those neurons are intermingled at a spatial scale that is finer than the spatial sampling resolution (voxel size) of the fMRI measurements (Engel and Furmanski 2001; Grill-Spector and Malach 2001; Huk and Heeger 2002; Huk et al. 2001; Kourtzi and Kanwisher 2001).
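The logic of this argument can be illustrated with a toy simulation. The following is a minimal sketch, not the analysis used in this study: it assumes a hypothetical voxel containing two intermingled subpopulations with Gaussian disparity tuning and a simple multiplicative gain-adaptation rule. All names and parameter values (`tuning`, `voxel_response`, `adapt_rate`, `recover_rate`, the preferred disparities of -0.3 and 0.3) are illustrative assumptions rather than quantities taken from the data.

```python
import math


def tuning(pref, disparity, width=0.2):
    """Gaussian tuning curve (hypothetical shape and width)."""
    return math.exp(-((disparity - pref) ** 2) / (2 * width ** 2))


def voxel_response(disparities, prefs=(-0.3, 0.3),
                   adapt_rate=0.3, recover_rate=0.3):
    """Mean pooled signal of a toy voxel over a block of stimuli.

    Each subpopulation has a gain that starts at 1, drops in proportion
    to how strongly it was just driven, and recovers toward 1 otherwise.
    The voxel signal is the sum over subpopulations, as if the neurons
    were intermingled below the spatial resolution of the measurement.
    """
    gains = [1.0 for _ in prefs]
    total = 0.0
    for d in disparities:
        drives = [tuning(p, d) for p in prefs]
        total += sum(g * r for g, r in zip(gains, drives))
        # Assumed adaptation rule: loss proportional to drive, plus recovery.
        gains = [g - adapt_rate * g * r + recover_rate * (1.0 - g)
                 for g, r in zip(gains, drives)]
    return total / len(disparities)


# Unadapted, the pooled signal is the same for both disparities, so the
# overall activity level alone does not reveal the underlying selectivity.
print(voxel_response([0.3]), voxel_response([-0.3]))

# A block repeating one disparity adapts a single subpopulation, whereas an
# alternating block spreads the drive across both subpopulations.
same_block = [0.3] * 8
different_block = [0.3, -0.3] * 4
print(voxel_response(same_block), voxel_response(different_block))
```

In this sketch the pooled response is lower for the repeated block than for the alternating block, even though the unadapted response to the two disparities is identical. A voxel without disparity-selective subpopulations would show no such difference under the same rule.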
The results presented here suggest a rough dichotomy between dorsal and ventral streams, in which signals about absolute disparity are mostly represented in dorsal areas, whereas relative disparity coding is represented (together with absolute disparity) in ventral areas. Tyler (1990) suggested a similar dichotomy between magno and parvo pathways on the basis of psychophysical evidence. It is important to notice that, although it is true that humans perform much better when relative disparity signals are available (Westheimer 1979), and that large changes in absolute disparity go unnoticed when relative disparity is kept constant (Erkelens and Collewijn 1985; Regan et al. 1986), it is also the case that humans do perceive absolute disparity and can use it to make coarse judgments about the depth of objects (stereoacuity thresholds for the depth task designed by Westheimer 1979 are measurable when only absolute disparity is available, despite being 5 times larger than the corresponding relative disparity thresholds). This means that the concept of associating only relative disparity with stereoscopic perception is an oversimplification. A less radical view would be to associate different behavioral demands with the two different types of disparity information. For example, absolute disparity can be useful in providing a rough estimate of the distance of an approaching object, as well as in controlling vergence eye movements (Howard and Rogers 1995). If signals about absolute disparity are made available more quickly to visual cortex, spatial navigation and orienting may benefit from these signals. This would also make sense in the context of our results, because the dorsal stream has been associated with this type of behavior (Goodale and Milner 1992). Relative disparity becomes necessary when making fine judgments about the precise three-dimensional shape of an object. This also makes sense in the context of our results because shape processing and object recognition have been associated with the ventral stream (Goodale and Milner 1992).
It is clear that further research is needed, using both electrophysiological and imaging techniques, if we are to obtain a full account of the neural basis of stereoscopic processing. Our results prompt electrophysiologists interested in these problems to direct their efforts toward the ventral stream, a region of cortex that has so far been relatively neglected in studies of perceptually relevant stereoscopic signals compared with its dorsal counterpart.
| GRANTS |
|---|
| ACKNOWLEDGMENTS |
|---|
Present addresses: P. Neri, 486-488 Minor Hall, University of California, Berkeley, CA 94720-2020; D. J. Heeger, Department of Psychology and Center for Neural Science, New York University, 6 Washington Place, New York, NY 10003.
| FOOTNOTES |
|---|
Address for reprint requests and other correspondence: P. Neri, Dept. of Zoology, Univ. of Cambridge, Downing St., Cambridge CB2 3EJ, UK (E-mail: pn232{at}hermes.cam.ac.uk).
| REFERENCES |
|---|
Backus BT, Fleet DJ, Parker AJ, and Heeger DJ. Human cortical activity correlates with stereoscopic depth perception. J Neurophysiol 86: 2054-2068, 2001.
Badcock C and Schor C. Depth-increment detection functions for individual spatial channels. J Opt Soc Am A 2: 1211-1216, 1985.
Barlow HB, Blakemore C, and Pettigrew JD. The neural mechanism of binocular depth discrimination. J Physiol 193: 327-342, 1967.
Blakemore CB. The range and scope of binocular depth discrimination. J Physiol 211: 599-622, 1970.
Braddick OJ and Atkinson J. Some recent findings on the development of human binocularity: a review. Behav Brain Res 10: 141-150, 1983.
Brefczynski JA and DeYoe EA. A physiological correlate of the spotlight of visual attention. Nature Neurosci 2: 370-374, 1999.
Burkhalter A and Van Essen DC. Processing of color, form and disparity information in visual areas VP and V2 of ventral extrastriate cortex in the macaque monkey. J Neurosci 6: 2327-2351, 1986.
Cormack LK, Stevenson SB, and Schor CM. Disparity-tuned channels of the human visual system. Vis Neurosci 10: 585-596, 1993.
Cumming BG. Stereopsis: where depth is seen. Curr Biol 12: R93-R95, 2002.
Cumming BG and Parker AJ. Responses of primary visual cortical neurons to binocular disparity without depth perception. Nature 389: 280-283, 1997.
Cumming BG and Parker AJ. Binocular neurons in V1 of awake monkeys are selective for absolute, not relative, disparity. J Neurosci 19: 5602-5618, 1999.
Cumming BG and Parker AJ. Local disparity not perceived depth is signaled by binocular neurons in cortical area V1 of the ma