Laboratory for Cognitive Neuroscience, Graduate School of Frontier Biosciences and Graduate School of Engineering Science, Osaka University, Toyonaka, Osaka, Japan
Submitted 28 March 2005; accepted in final form 29 June 2005
| ABSTRACT |
|---|
| INTRODUCTION |
|---|
Area V4 is a major stage in the ventral visual processing pathway, which projects from V1 to IT of the monkey extrastriate visual cortex. It is involved in object and color vision (Heywood et al. 1992; Merigan 1996; Schiller 1993), as well as more cognitive functions such as attention and visual search (Connor et al. 1997; Mazer and Gallant 2003; McAdams and Maunsell 1999; Moore and Armstrong 2003; Motter 1994; Ogawa and Komatsu 2004). V4 neurons are also sensitive to binocular disparity, suggesting that they carry information about stereoscopic depth (Hegdé and Van Essen 2005; Hinkle and Connor 2001, 2002; Tanabe et al. 2004; Watanabe et al. 2002).
Some studies used solid-figure stereograms (SFSs) to test the disparity selectivity of V4 neurons (Hinkle and Connor 2001, 2002; Watanabe et al. 2002). An SFS is a reasonable stimulus choice for V4 cells because these neurons are selective for the form and pattern of a stimulus (Desimone and Schein 1987; Gallant et al. 1996; Kobatake and Tanaka 1994; Pasupathy and Connor 1999). Additionally, local features of natural images often contain contrast edges similar to those in solid figures. Tests with SFSs may thus be relevant for understanding depth coding in V4 in some respects, but they do not probe genuine disparity selectivity. In contrast, dynamic random-dot stereograms (RDSs) provide a test that isolates neuronal sensitivity to binocular disparity from neuronal modulation caused by other factors.
When viewing an RDS, the stereoscopic system extracts a globally consistent match from the numerous possible matches between the visual patterns projected to the left and right eyes (Julesz 1971). A shift in the depth of the visual target is visible only to the stereoscopic system and is invisible to a monocular system: the random-dot patch does not shift in position, and the constant renewal of the dot pattern prevents any coherent motion of the dots. On the other hand, much less computational complexity is required when viewing an SFS. The matching of the corresponding patterns in the two eyes is unique for an SFS. Even a monocular system is sensitive to a shift in the depth of an SFS because the shift is always associated with a positional shift of the monocular image. Thus with an SFS, it is difficult to isolate responses to binocular disparity from responses to positional shifts in monocular features.
To fully characterize stereo processing in V4, it is thus important to study the disparity tuning of V4 neurons using both an RDS and an SFS. V1 and V2 neurons reportedly exhibit similar disparity tuning for RDSs and for SFSs (Gonzales and Perez 1998; Poggio 1990; Poggio et al. 1985), although no quantitative population analyses have been published. A recent study examined the disparity tuning of V4 neurons with both RDSs and SFSs (Hegdé and Van Essen 2005) and showed that the type of stereogram influences the disparity tuning of most V4 cells. In the present study, we addressed two issues regarding the comparison of disparity tuning for an RDS and for an SFS. First, we examined the differences in the parameters that characterize the disparity-tuning curves. Second, we examined whether the effect of the positional shifts in the monocular images of an SFS can explain the differences between these curves. We also addressed the functional organization of disparity-selective cells in V4 by analyzing the single-unit and multiunit responses recorded simultaneously at the same sites.
| METHODS |
|---|
Two female and one male Japanese macaque monkeys (Macaca fuscata) were used. Details of the surgical procedure have been published elsewhere (Uka et al. 2000). Briefly, under full anesthesia and aseptic conditions, stainless steel and plastic bolts were screwed into holes drilled through the skull. Acrylic resin was mounted over the skull, firmly connecting the bolts with a plastic post for head restraint. A custom-made plastic recording chamber was placed 5 mm posterior and 25 mm dorsal to the ear canal. After at least 1 wk of recovery, scleral search coils made of Teflon-insulated stainless steel wires (Cooner Wire, Chatsworth, CA) were implanted into both eyes. All surgical, experimental, and care protocols conformed to the National Institutes of Health Guide for the Care and Use of Laboratory Animals (1996), and were approved by the Animal Experiment Committee of Osaka University.
Task and visual stimulation
A monkey was seated with its head restrained in a primate chair. A computer display (NuVision 21MX, MacNaughton, Beaverton, OR) was set 57 cm away from the monkey's eyes. The display subtended 40 x 30° of the monkey's visual field. We trained the monkeys to perform a simple fixation task, which was controlled using a commercially available system (TEMPO, Reflective Computing, St. Louis, MO). When a white fixation point (0.2 x 0.2°) appeared at the center of the screen, the monkey was required to make an eye movement toward it within 500 ms. During the next 2 s, the monkey had to maintain fixation within an invisible window that was typically a 1.4 x 1.4° square centered at the fixation point. The vergence was restricted to be within ±0.5° relative to the fixation plane. After successful trials, the monkey was rewarded with a drop of juice or water. When the monkey's eyes moved away from the fixation or the vergence window, the task was aborted immediately and no reward was delivered. Intertrial intervals were randomly selected to be 0.5, 1.0, or 1.5 s at an equal probability.
We developed a visual stimulation program using the OpenGL Utility Toolkit (GLUT). A dynamic RDS was presented for 1 s from 500 ms after the onset of the fixation period to 500 ms before the offset (i.e., the RDS period was centered on the 2-s fixation period). The random-dot pattern was composed of 50% bright (3.5 cd/m²) and 50% dark (0.4 cd/m²) dots (Fig. 1A). All the dots had a size of 0.17 x 0.35° and were positioned following a uniform distribution. The dot density was 26%. Antialiasing was accomplished by the hardware of the video board. The random-dot pattern was renewed every five frames (12 Hz). The RDS was a bipartite patch consisting of a center disk whose disparity was varied within a range of ±1.6° or ±1.2° and a surrounding annulus whose disparity was fixed at zero. The width of the annulus was 1°. Because the largest disparity tested in the present experiments was 1.6°, the largest shift of the center disk was +0.8° in one eye and −0.8° in the other eye. The width of the annulus ensures that no dots composing the disk appear outside the random-dot patch (i.e., no shift occurs in the patch position). The background was a uniform field of midlevel luminance (1.5 cd/m²). For visual stimulation, we illuminated only the red phosphors because their relatively quick decay time resulted in minimal interocular cross talk. With our dichoptic display device in which a liquid-crystal filter alternated the polarity of the display every frame (120 Hz), the left-to-right cross talk was 10% and the right-to-left cross talk was 0% (Tanabe et al. 2004).
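The geometry of the center disk described above can be illustrated with a short sketch. It is not the GLUT stimulation program used in the experiments; the function names and the choice of which eye receives the positive shift are assumptions made for illustration only.

```python
def monocular_shifts(disparity_deg):
    """Split a horizontal disparity into equal and opposite shifts for the two eyes.

    Which eye receives the positive shift is an assumed convention.
    """
    half = disparity_deg / 2.0
    return half, -half


def disk_stays_inside(disparity_deg, annulus_width_deg=1.0):
    """True if the shifted center disk stays within the outer edge of the annulus."""
    shift_a, shift_b = monocular_shifts(disparity_deg)
    return max(abs(shift_a), abs(shift_b)) <= annulus_width_deg


# Largest disparity tested in the experiments: 1.6 deg.
print(monocular_shifts(1.6))
print(disk_stays_inside(1.6))
```

With a 1° annulus, any disparity up to 2° keeps the disk inside the patch, which is why the patch position itself carries no monocular cue.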
After the monkeys were sufficiently trained, a hole for electrode insertion was drilled through the skull inside the recording chamber.
In recording sessions, a micromanipulator (MO-95S, Narishige, Tokyo, Japan) with a tungsten-in-glass microelectrode was attached to the recording chamber. Extracellular voltage signals were amplified and filtered with custom-made instruments. Action potentials of single units were isolated by either a custom-made window-discriminator or a template-matching spike-sorting system (Multi Spike Detector, Alpha-Omega Engineering, Nazareth, Israel) and were recorded at 1-ms resolution. In parallel, background multiunit activity and the v-sync pulses of the visual stimulation were recorded. Eye positions were monitored with the magnetic search coil technique (MEL-25, Enzansi Kogyou, Tokyo, Japan). Eye position signals were recorded at a 1-kHz sampling rate. Before recording, we located area V4 inside the recording chamber of each monkey based on the retinotopic map, the size-eccentricity relationship of receptive fields (RFs), and the surrounding sulci (Gattass et al. 1988; Watanabe et al. 2002).
When single-unit spikes were isolated, we determined the position of the cell's RF and the effective stimulus. The probe stimulus was selected from a small bright bar, a small RDS patch, and a drifting sinusoidal grating, based on which stimulus evoked the highest response. We used this stimulus to map the cell's RF. In the initial experiments, we mapped the RF with only a probe stimulus at zero disparity. In later experiments, we tried to maximally drive the cell by testing both crossed and uncrossed disparities in addition to zero disparity for the initial survey of the cell's receptive field. After the RF was determined, we presented an RDS that covered the entire RF. If an RDS of this size evoked a sufficiently strong response, we went on to the recording session; otherwise, we reduced the size of the RDS patch until a sufficient response was evoked. The eccentricity of the RFs ranged from 3.7 to 15.0°. The minimum patch diameter of the RDSs used in the experiments was 4°. In the recording sessions, the tested disparity values and the left and right monocular presentations were randomly ordered. All stimulus conditions were tested at least once before going on to the next block of trials.
In a subset of the tested cells, we examined the horizontal disparity tuning for a bright-bar SFS (Fig. 1B). The length, width, and orientation of the bar were manually adjusted to effectively drive the cell at zero disparity. The bar was typically slightly shorter than the diameter of the RF (range of bar length: 2.3 to 7.4°). A typical bar width was 0.4° (range: 0.2 to 1.4°). In addition to the disparity-tuning test, we examined the responses to monocular presentations of the SFS. The monocular images were the same images for one of the eyes in the disparity-tuning test. For instance, the left monocular image corresponding to an uncrossed disparity of 0.2° was a bar shifted 0.1° to the left from the center. The trials for the disparity tuning for an RDS, the disparity tuning for an SFS, the left monocular shift tuning for an SFS, and the right monocular shift tuning for an SFS were interleaved randomly in the same recording session. Neither orientation disparities nor vertical disparities were presented in this study.
Histology
After recording experiments were completed, we histologically reconstructed the recording site in one of the three monkeys. The monkey was overdosed with pentobarbital sodium (64 mg/kg of body weight), and then transcardially perfused with phosphate-buffered saline followed by 4% paraformaldehyde. A metal pin was inserted into the brain at each corner of the recording chamber (four black dots in Fig. 1C). We removed the brain from the skull and cut out a block of brain tissue circumscribed by the four pins. The brain tissue was immersed in a graded series of sucrose solutions (10-30%) for 3 days. Then it was frozen, cut into 50-µm sections, and mounted on gelatin-coated glass slides. After the sections were dried, they were stained with standard Nissl staining methods. We found scars from electrode penetrations only in the prelunate gyrus (shaded area in Fig. 1C). The recording site was histologically identified as area V4.
Data analysis
The spike train of each trial was aligned to the onset of the visual stimulus using the presentation command signal and the following v-sync pulses. The response was evaluated as the firing rate from 80 ms after the onset to 80 ms after the offset of the visual stimulus. The 80-ms delay compensated for the typical neuronal firing latency in V4 (Tanabe et al. 2004). For reliable statistics, cells were discarded from the analysis unless at least six trials with good spike isolation were accumulated for each stimulus condition. For most cells, data were accumulated for ten trials.
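The response window can be sketched as follows. This is a minimal illustration rather than the analysis code used in the study; the function name, argument names, and the example spike times are hypothetical.

```python
def firing_rate(spike_times_ms, stim_on_ms, stim_off_ms, delay_ms=80.0):
    """Firing rate (spikes/s) in a window delayed relative to the stimulus.

    spike_times_ms: spike times of one trial, in ms.
    The window runs from onset + delay_ms to offset + delay_ms.
    """
    start = stim_on_ms + delay_ms
    stop = stim_off_ms + delay_ms
    n_spikes = sum(1 for t in spike_times_ms if start <= t < stop)
    return n_spikes / ((stop - start) / 1000.0)


# Hypothetical trial: 1-s stimulus presented from 500 to 1500 ms.
spikes = [100.0, 590.0, 700.0, 1200.0, 1570.0, 1600.0]
rate = firing_rate(spikes, stim_on_ms=500.0, stim_off_ms=1500.0)
print(rate)
```

Shifting both edges of the window by the same delay keeps its duration equal to the stimulus duration, so rates remain comparable across conditions.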
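For an assessment of disparity selectivity, we calculated the disparity discrimination index (DDI) (DeAngelis and Uka 2003; Prince et al. 2002a). The firing rate from each trial was square-root transformed, and the DDI was calculated as follows

$$\mathrm{DDI} = \frac{R_{\max} - R_{\min}}{R_{\max} - R_{\min} + 2\,\mathrm{RMS}_{\mathrm{error}}}$$

where Rmax and Rmin are the mean responses to the most and least effective disparities, respectively, and RMSerror is the square root of the residual variance around the means across the whole tuning curve.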
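A minimal sketch of the DDI computation is given below. It assumes the definition of Prince et al. (2002a), DDI = (Rmax − Rmin)/(Rmax − Rmin + 2·RMSerror), and it assumes that the residual variance is divided by the number of trials minus the number of disparities. The function name and the example firing rates are hypothetical.

```python
import math


def ddi(trials_by_disparity):
    """Disparity discrimination index from square-root-transformed firing rates.

    trials_by_disparity: dict mapping disparity -> list of firing rates (spikes/s).
    """
    sqrt_trials = {d: [math.sqrt(r) for r in rates]
                   for d, rates in trials_by_disparity.items()}
    means = {d: sum(v) / len(v) for d, v in sqrt_trials.items()}
    r_max, r_min = max(means.values()), min(means.values())

    # Residual sum of squares around each condition mean.
    sse = sum((x - means[d]) ** 2 for d, v in sqrt_trials.items() for x in v)
    n_total = sum(len(v) for v in sqrt_trials.values())
    n_cond = len(sqrt_trials)
    # Assumed degrees of freedom: trials minus conditions.
    rms_error = math.sqrt(sse / (n_total - n_cond))

    return (r_max - r_min) / (r_max - r_min + 2.0 * rms_error)


# Hypothetical data: three disparities, three trials each.
example = {
    -0.4: [36.0, 49.0, 25.0],
    0.0: [16.0, 9.0, 16.0],
    0.4: [4.0, 9.0, 4.0],
}
print(round(ddi(example), 3))
```

The index approaches 1 when the response modulation is large relative to the trial-to-trial variability and equals 0 when the mean responses are identical across disparities.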
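We fitted Gabor functions to quantitatively analyze the disparity-tuning characteristics of V4 cells. A Gabor function is expressed as follows

$$R(x) = y_0 + A \exp\!\left(-\frac{(x - x_0)^2}{2\sigma^2}\right) \cos\!\left(2\pi f (x - x_0) + \phi\right)$$

where x is the binocular disparity.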
The parameter combination that minimized the merit function was obtained with the "fmincon" function of the Optimization Toolbox in MATLAB (The MathWorks, Natick, MA). This function constrained the parameter values to prevent unreasonable fitting results. The vertical offset (y0) was constrained to values between zero and the maximum response of all the trials. The amplitude of the Gaussian envelope (A) was constrained between zero and twice the difference between the maximum and minimum responses of all the trials. The horizontal offset of the Gaussian envelope (x0) was constrained to values within the disparity range being tested. The width of the Gaussian envelope (σ) was constrained to values between 0.1° and the total range of tested disparities. The frequency of the cosine carrier (f) was constrained to ±10% of the frequency at the primary peak of the power spectrum of the raw tuning curve. The phase of the cosine carrier (φ), relative to the center of the Gaussian envelope, was constrained to be within ±3π.
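The Gabor function and the six constraints can be written down compactly. The sketch below is illustrative only: it does not perform the constrained minimization (done with MATLAB's fmincon in the study), the function and key names are hypothetical, the peak frequency of the power spectrum is taken as an input computed elsewhere, and the phase bound of ±3π is an assumption about the value lost from the text.

```python
import math


def gabor(x, y0, A, x0, sigma, f, phi):
    """Gabor disparity-tuning function: Gaussian envelope times cosine carrier."""
    envelope = math.exp(-((x - x0) ** 2) / (2.0 * sigma ** 2))
    carrier = math.cos(2.0 * math.pi * f * (x - x0) + phi)
    return y0 + A * envelope * carrier


def parameter_bounds(disparities, all_trial_rates, peak_freq):
    """(lower, upper) bounds for each parameter, following the text.

    peak_freq: frequency at the primary peak of the power spectrum of the
    raw tuning curve (assumed to be computed elsewhere).
    """
    r_max, r_min = max(all_trial_rates), min(all_trial_rates)
    d_min, d_max = min(disparities), max(disparities)
    return {
        "y0": (0.0, r_max),
        "A": (0.0, 2.0 * (r_max - r_min)),
        "x0": (d_min, d_max),
        "sigma": (0.1, d_max - d_min),
        "f": (0.9 * peak_freq, 1.1 * peak_freq),
        "phi": (-3.0 * math.pi, 3.0 * math.pi),  # assumed bound
    }


# Hypothetical tested disparities (deg) and single-trial rates (spikes/s).
disparities = [-1.6, -0.8, 0.0, 0.8, 1.6]
rates = [2.0, 30.0, 12.0, 7.0]
bounds = parameter_bounds(disparities, rates, peak_freq=0.5)
print(bounds["A"], bounds["sigma"])
```

Any bounded optimizer could consume a table of this form; the bounds only keep the fit within a physiologically plausible region and do not by themselves resolve the trade-offs among parameters.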
Direct comparison of the fitted Gabor function parameters of the disparity-tuning curves for RDSs and for SFSs was made by simultaneously fitting the two sets of data. x0 and φ both reflect the horizontal position, whereas f and σ both reflect the width of a disparity-tuning curve. Thus very similar curves can be obtained with different combinations of these parameter values (Prince et al. 2002a). To improve comparisons of the fitted parameters of the two curves, both x0 and σ were shared by both curves. The other four parameters, y0, A, f, and φ, were fitted independently using the six constraints described above. For the merit function, we calculated the summed squared error of the raw firing-rate values with respect to the raw values of the Gabor function, rather than their square-root values, because a square-root transformation would not be defined for the negative values we obtained when we subtracted responses to monocular presentations from responses to binocular presentations. Details of the subtraction are described in the RESULTS section.
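The merit function for the simultaneous fit can be sketched as below, with x0 and the envelope width shared between the two curves and the remaining four parameters free for each stereogram type. The parameter ordering, the function names, and the synthetic data are assumptions for illustration; the minimization itself is omitted.

```python
import math


def gabor(x, y0, A, x0, sigma, f, phi):
    """Gabor disparity-tuning function."""
    envelope = math.exp(-((x - x0) ** 2) / (2.0 * sigma ** 2))
    return y0 + A * envelope * math.cos(2.0 * math.pi * f * (x - x0) + phi)


def joint_sse(params, rds_data, sfs_data):
    """Summed squared error for two tuning curves sharing x0 and sigma.

    params: (x0, sigma, y0_rds, A_rds, f_rds, phi_rds,
             y0_sfs, A_sfs, f_sfs, phi_sfs)
    rds_data, sfs_data: lists of (disparity, raw firing rate) pairs.
    """
    x0, sigma = params[0], params[1]
    y0_r, A_r, f_r, phi_r = params[2:6]
    y0_s, A_s, f_s, phi_s = params[6:10]
    sse = 0.0
    for x, r in rds_data:
        sse += (r - gabor(x, y0_r, A_r, x0, sigma, f_r, phi_r)) ** 2
    for x, r in sfs_data:
        sse += (r - gabor(x, y0_s, A_s, x0, sigma, f_s, phi_s)) ** 2
    return sse


# Synthetic noise-free data generated from known (hypothetical) parameters.
true_params = (-0.2, 0.5, 20.0, 15.0, 0.6, 1.0, 5.0, 10.0, 0.3, 0.5)
disparities = [-1.2, -0.8, -0.4, 0.0, 0.4, 0.8, 1.2]
rds = [(x, gabor(x, 20.0, 15.0, -0.2, 0.5, 0.6, 1.0)) for x in disparities]
sfs = [(x, gabor(x, 5.0, 10.0, -0.2, 0.5, 0.3, 0.5)) for x in disparities]
print(joint_sse(true_params, rds, sfs))
```

Because the error is computed on raw values, the function remains defined when some data points are negative, as happens after the monocular responses are subtracted.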
The phase parameter φ indicates how much the disparity-tuning curve is displaced from the center of the Gaussian envelope and gives an estimate of the symmetry of the curve. Estimation of symmetry with φ, however, is sometimes misleading when the cycle of the carrier 1/f is much larger than the envelope width σ (Prince et al. 2002a). The symmetry phase is a better estimate to evaluate the symmetry of the fitted function itself (Read and Cumming 2004). To obtain the symmetry phase, we first calculated the centroid of the fitted function R(x), denoted here by xc. When the reflection of R(x) about the centroid is Rrefl(x) = R(−x + 2xc), a completely even-symmetric tuning would satisfy R(x) = Rrefl(x), whereas a completely odd-symmetric tuning would satisfy R(x) = −Rrefl(x). Thus the contributions of the ideally even and the ideally odd components of R(x) were calculated as Reven(x) = [R(x) + Rrefl(x)]/2 and Rodd(x) = [R(x) − Rrefl(x)]/2, respectively. We then weighted the contributions of Reven(x) as the maximum deviation from zero E and of Rodd(x) as the peak value O. E was assigned a positive or a negative sign depending on whether the maximum deviation of Reven(x) was a positive or a negative peak, respectively. O was assigned a positive or a negative sign depending on whether the position of the peak of Rodd(x) was on the negative or the positive side of the abscissa with respect to the centroid, respectively. Finally, the symmetry phase was obtained as the angle (rad) formed from the projections of E and O onto two perpendicular axes for the even and odd components.
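The decomposition into even and odd components can be sketched numerically. The code below is an illustration under stated assumptions, not the authors' implementation: the centroid is supplied by the caller rather than computed, the curve is assumed to have its baseline already removed, the components are evaluated on a regular grid, and the angle is taken with the even component on the horizontal axis. All names are hypothetical.

```python
import math


def symmetry_phase(curve, centroid, x_min, x_max, n=2001):
    """Symmetry phase (rad) of a baseline-subtracted tuning curve.

    curve: callable R(x) with the baseline already removed (assumption).
    centroid: position about which the curve is reflected.
    """
    xs = [x_min + (x_max - x_min) * i / (n - 1) for i in range(n)]
    even = [(curve(x) + curve(2.0 * centroid - x)) / 2.0 for x in xs]
    odd = [(curve(x) - curve(2.0 * centroid - x)) / 2.0 for x in xs]

    # E: signed maximum deviation of the even component from zero.
    E = max(even, key=abs)

    # O: peak of the odd component, positive if the peak lies on the
    # negative side of the centroid and negative otherwise.
    i_peak = max(range(n), key=lambda i: odd[i])
    O = odd[i_peak] if xs[i_peak] < centroid else -odd[i_peak]

    return math.atan2(O, E)


def even_curve(x):
    """Even-symmetric example: Gaussian envelope times a cosine."""
    return math.exp(-x ** 2 / (2 * 0.5 ** 2)) * math.cos(2 * math.pi * 0.5 * x)


def odd_curve(x):
    """Odd-symmetric example with its peak at negative disparities."""
    return math.exp(-x ** 2 / (2 * 0.5 ** 2)) * math.cos(
        2 * math.pi * 0.5 * x + math.pi / 2)


print(symmetry_phase(even_curve, 0.0, -2.0, 2.0))
print(symmetry_phase(odd_curve, 0.0, -2.0, 2.0))
```

In this sketch a curve with a single central peak yields a phase near zero, and a curve with a peak at crossed disparities and a trough at uncrossed disparities yields a phase near π/2.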
| RESULTS |
|---|
Our database included 260 cells recorded from the dorsal V4 (i.e., lower visual field representation) in three monkeys (76, 29, and 155 from monkeys 1, 2, and 3, respectively). Cells that significantly responded to at least one of the tested disparities (t-test with correction for multiple comparisons) in an RDS were selected for assessment of their disparity selectivity. Over half of the visually responsive cells (142/224, 63%) exhibited significant selectivity for disparity (Kruskal-Wallis test, P < 0.05). We calculated the DDI values for all the visually responsive cells (n = 224; 62, 22, and 140 from monkeys 1, 2, and 3, respectively). The DDI values were distributed unimodally with a mean of 0.48 (SD 0.14) (Fig. 2A). V4 neurons did not fall into discrete classes of disparity-sensitive or -insensitive cells, and thus it might be appropriate to quantify the disparity-tuning characteristics of all cells. However, it is difficult, if not meaningless, to interpret the disparity tuning of cells that were not significantly disparity sensitive. We thus selected cells with significant disparity tuning for further analysis.
To determine the preferred disparity of each significantly disparity-sensitive cell, we interpolated the disparity-tuning data with a spline curve. The disparity at the peak of the curve was determined to be the preferred disparity of that cell. Sixteen of the 142 disparity-selective cells (11%) whose preferred disparities were at either end of the tested disparity range were discarded from the analysis because the preferred disparities were likely to lie outside the tested range. This method for determining the preferred disparity was used throughout the rest of this paper. The distribution of preferred disparities exhibited a sharp peak near −0.4° (Fig. 2B). A large portion of V4 cells preferred small crossed disparities (−0.6° < disparity < 0°), whereas a small population of neurons preferred uncrossed disparities. The overall bias of cells in V4 for crossed disparities was consistent with previous studies that examined disparity selectivity with SFSs (Hinkle and Connor 2001; Watanabe et al. 2002).
Description of disparity-tuning characteristics
All cells with significant disparity sensitivity were fitted with a Gabor function (n = 142). To evaluate the goodness-of-fit, we calculated the R² value for each fitted tuning curve. The disparity-tuning curve of most cells demonstrated an R² value >0.6 (n = 133). The Gabor function provided a fairly good description of the disparity-tuning curve for most V4 cells. Nine cells with R² <0.6 were discarded from the following analysis because a Gabor function was not adequate to describe their disparity-tuning profiles. Low R² values were associated with poor disparity sensitivity rather than with bad fitting quality. This was supported by a highly significant correlation between DDI and R² (Spearman's rank correlation rS = 0.48, P = 10⁻⁹; data not shown). Therefore the presence of cells with low R² values reflects noisy data, not disparity-tuning curves that cannot be captured by a Gabor function (see also Tanabe et al. 2004).
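The goodness-of-fit criterion can be illustrated with the usual coefficient of determination. The sketch assumes that R² is computed between the mean response at each disparity and the fitted value, which the text does not state explicitly; the function names and the numbers are hypothetical.

```python
def r_squared(observed, fitted):
    """Coefficient of determination between observed and fitted responses."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - f) ** 2 for o, f in zip(observed, fitted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def passes_fit_criterion(observed, fitted, threshold=0.6):
    """True if the fit meets the R-squared threshold used for inclusion."""
    return r_squared(observed, fitted) >= threshold


# Hypothetical mean responses (spikes/s) and Gabor fit values.
observed = [10.0, 20.0, 30.0, 20.0, 10.0]
fitted = [12.0, 18.0, 30.0, 20.0, 10.0]
print(round(r_squared(observed, fitted), 3))
```

A weakly tuned cell has little total variance across disparities, so even small residuals lower R², which is consistent with the reported correlation between DDI and R².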
Typical disparity-tuning curves possessed a pronounced peak at a small crossed disparity and a shallow dip at a small, uncrossed disparity. Although a few cells gave a low baseline response (Fig. 3A), most cells responded strongly even at nonpreferred disparities (Fig. 3B). Disparity-tuning curves with a clear dip (Fig. 3C) or a peak at an uncrossed disparity (Fig. 3D) were relatively rare. To capture the overall distribution of the disparity-tuning profiles from the population of V4 cells, we analyzed two Gabor function parameters: the center x0 and the phase φ. The center values were distributed tightly around a small crossed disparity (mean −0.21°) (Fig. 3E, top histogram). There was, however, a notable fraction of cells whose center value lay at either end of the tested disparities. These tuning curves were very broad or nearly monotonic. The distribution of the phases was relatively broad with a peak near π/3 and a dip near zero (Fig. 3E, right histogram). A scatter plot of φ versus x0 shows a single dense region, indicating that many cells had similar disparity-tuning profiles (Fig. 3E, center plot). The example cells, A through D, are indicated inside the plot to demonstrate that the first two examples were typical profiles of V4 cells and that the next two examples were relatively rare cases.
We plotted the distribution of the Gabor phase φ on a polar coordinate graph, where the azimuth is the phase and the deviation from the center is the distribution density of cells (Fig. 4A). According to the conventional classification of cells based on the Gabor phase of the disparity-tuning curve, most V4 cells would be classified as "near-cells" because of their odd-symmetric disparity-tuning curves (DeAngelis et al. 1991) [...] (Prince et al. 2002a) [...] π/2 were estimated to have a symmetry phase near zero (Fig. 4B). The other cells' phases were consistent across both measures of symmetry, although a sign inversion took place in the symmetry of one cell whose phase was near π. Consequently, the distribution of the symmetry phases of V4 cells exhibited a strong bias toward even symmetry (Fig. 4C). The distribution of the data points in the joint parameter space of the symmetry phase versus the centroid (Fig. 4D) is a better description of the overall disparity-tuning profile from the population of V4 cells than the distribution of the data points shown in Fig. 3E (the Gabor phase vs. the center value). Even in the joint parameter space of Fig. 4D, the examples shown in Fig. 3, A and B represent typical cases, and the examples shown in C and D are among the rare cases.
To compare disparity-tuning characteristics between cortical areas, it is better to focus on neuronal populations with similar RF eccentricities. To facilitate such comparisons, we examined the dependency of three parameters that characterize the disparity-tuning curve on RF eccentricity. Because RF size increases with RF eccentricity, neurons with larger RF eccentricities should cover a larger range of disparities and have broader disparity-tuning curves. The eccentricities of the RFs of the analyzed neurons had a mean of 6.8° (SD 1.6°). The absolute value of the preferred disparity was weakly correlated with the RF eccentricity (Spearman's rank correlation, rS = 0.19, P = 0.03) (Fig. 6A). The distribution of the symmetry phases, however, did not change with the RF eccentricity (Fig. 6B). Unlike neurons in V1 and MT (DeAngelis and Uka 2003; Prince et al. 2002b; but see Durand et al. 2002), the Gabor frequency of V4 neurons did not correlate with RF eccentricity (Fig. 6C; rS = 0.08, P = 0.35), at least in the range of eccentricities tested in this study. We also examined the correlation between the DDI and RF eccentricity of the 142 cells with statistically significant disparity tuning. However, we found no correlation between them (r = 0.02, P = 0.76; data not shown).
The range for optimal disparity discriminability
The typical disparity-tuning profile of V4 cells implies that most of these neurons are most sensitive to subtle disparity differences near zero disparity. To check this point, we calculated the slopes of each of the fitted disparity-tuning curves. It is difficult to directly compare how well a neuron discriminates between two adjacent disparities at one disparity pedestal with how well it does so at another pedestal because both the mean firing rate and the firing-rate variance depend on the disparity. To normalize the variance across different disparity pedestals, we square-root transformed the fitted Gabor function and then calculated its first derivative (Prince et al. 2002b). Four cells whose fitted curves were clipped at zero were discarded because derivatives are not defined for broken curves.
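The slope analysis can be sketched as a numerical derivative of the square-root-transformed Gabor function. This is an illustration only: the derivative is approximated by central differences on a regular grid, a curve is treated as clipped whenever it reaches zero at any grid point, and the parameter values are hypothetical.

```python
import math


def gabor(x, y0, A, x0, sigma, f, phi):
    """Gabor disparity-tuning function."""
    envelope = math.exp(-((x - x0) ** 2) / (2.0 * sigma ** 2))
    return y0 + A * envelope * math.cos(2.0 * math.pi * f * (x - x0) + phi)


def sqrt_slope(x, params, h=1e-4):
    """Central-difference derivative of the square-root-transformed Gabor at x."""
    upper = math.sqrt(max(gabor(x + h, *params), 0.0))
    lower = math.sqrt(max(gabor(x - h, *params), 0.0))
    return (upper - lower) / (2.0 * h)


def disparity_of_max_abs_slope(params, x_min, x_max, n=641):
    """Disparity with the steepest slope, or None if the curve reaches zero."""
    xs = [x_min + (x_max - x_min) * i / (n - 1) for i in range(n)]
    if any(gabor(x, *params) <= 0.0 for x in xs):
        return None  # stands in for discarding curves clipped at zero
    return max(xs, key=lambda x: abs(sqrt_slope(x, params)))


# Hypothetical fit: (y0, A, x0, sigma, f, phi).
params = (20.0, 15.0, -0.2, 0.4, 0.6, math.pi / 3)
best = disparity_of_max_abs_slope(params, -1.6, 1.6)
print(best, sqrt_slope(best, params))
```

Taking the square root before differentiating approximately equalizes the variance across disparities for Poisson-like firing, so slopes at different pedestals become comparable.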
For the remaining cells (n = 129), we determined the disparity of the maximum absolute slope value. The distribution of the disparities at the maximum absolute slope exhibited a markedly sharp peak very close to zero disparity with the mean at 0.11° (SD 0.66°) (Fig. 7A). Most of the slope values composing the peak in the maximum slope distribution were negative values (Fig. 7B). Moreover, the absolute slope value averaged across the population of V4 cells demonstrated a conspicuous peak at a crossed disparity close to zero (Fig. 7C). The characteristics of the disparity-tuning profiles of typical V4 cells were preserved in the average disparity-tuning function for the population of V4 neurons (Fig. 7D). Unlike the conventional distributed representation scheme, the pooled response of a population of V4 neurons had a surprisingly strong response bias for a crossed disparity in an RDS.
During the recording experiments, we monitored both single-unit (SU) and background multiunit (MU) spike waveforms. The detection threshold for the MU spikes was adjusted on-line so that the tails of the SU spikes with triphasic waveforms were not mixed with MU spikes. If this separation was not successful on-line, we subtracted the SU firing rate from the MU firing rate during off-line analyses. We discarded any MU data if the subtraction yielded negative MU firing rates. At 118 of the 224 sites where SUs gave significant responses, we succeeded in separating the MU activity from the SU activity (Fig. 8A). At recording sites where the SU displayed a high magnitude of disparity sensitivity, the MU also displayed a high magnitude of disparity sensitivity, and vice versa (Fig. 8B). To evaluate this tendency across the 118 sites, we calculated the DDI values for both SUs and MUs. Indeed, the DDI values of MUs strongly correlated with the DDI values of SUs (r = 0.64, P = 10⁻¹⁴) (Fig. 8C). This correlation indicates that the magnitude of disparity selectivity is shared by nearby cells within V4.
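The off-line separation rule can be expressed in a few lines. The sketch below assumes that the subtraction is applied to the firing rate of each stimulus condition and that a single negative value is enough to discard the MU data of a site; the function name and the example rates are hypothetical.

```python
def corrected_mu_rates(mu_rates, su_rates):
    """Subtract SU firing rates from MU firing rates, condition by condition.

    Returns None when any corrected rate is negative, mirroring the rule
    of discarding such MU data.
    """
    corrected = [mu - su for mu, su in zip(mu_rates, su_rates)]
    if any(rate < 0 for rate in corrected):
        return None
    return corrected


# Hypothetical rates (spikes/s) for two stimulus conditions at one site.
print(corrected_mu_rates([50.0, 60.0], [10.0, 20.0]))
print(corrected_mu_rates([5.0, 60.0], [10.0, 20.0]))
```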
Comparison of disparity tuning for RDSs and SFSs
Among the 260 cells in the initial database, we examined 57 cells for disparity tuning for both an RDS and an SFS. Of these cells, 56 cells were responsive to at least one of the binocular stimuli (t-test, P < 0.05 divided by the number of binocular stimuli).
Direct comparisons were made between the disparity-tuning curves obtained using the two types of stereograms. After superimposing the two disparity-tuning curves, we observed a number of differences. Many cells had a broader tuning curve and a higher baseline response for an SFS than for an RDS (Fig. 9A). For many cells, the end of the disparity-tuning curve for the SFSs did not fall near the baseline response level; thus the overall shape of the curve was nearly monotonic. Furthermore, many cells had different preferred disparities when they were examined with RDSs and with SFSs (Fig. 9B). Some cells exhibited large differences in the magnitude of their responses elicited by the two types of stereograms. We encountered cells that gave much stronger responses to an SFS than to an RDS (Fig. 9C) as well as cells that preferred an RDS to an SFS. Other cells exhibited an equal baseline response level to both types of stereograms, but modulated their responses depending on the disparity value only for one of the two types of stereograms (Fig. 9D). The disparity tunings measured with an RDS and an SFS can thus be drastically different from each other.
The amplitude A, the frequency f, and the phase φ did not correlate between the disparity-tuning curves obtained using the two types of stereograms (Spearman's rank correlation rS = 0.013, P = 0.93; rS = 0.14, P = 0.35; rS = 0.031, P = 0.84 for A, f, and φ, respectively). The Gabor frequency parameter f was higher for the disparity-tuning curve obtained with an RDS than for that obtained with an SFS (Wilcoxon's signed-rank test, P = 0.00023) (Fig. 11B). It is unlikely that our fitting constraint (i.e., the two curves shared the values of the parameters x0 and σ) confounded the results because all the results presented here were reproduced even when the two disparity-tuning curves were fitted independently.
Correction for shifts in the monocular images of SFS
The disparity energy model is a simple description of neuronal selectivity for disparity at the early stages of stereo processing (Ohzawa et al. 1990). A static nonlinearity after binocular summation is the key factor for the disparity-tuned responses to dynamic RDSs (Anzai et al. 1999; see APPENDIX). Because of this nonlinearity, responses to a binocular stimulus involve a binocular interaction component in addition to the sum of the respective signals from the two eyes (Fig. 12A; Ohzawa et al. 1997). Because a dynamic RDS is a spatiotemporal white noise, any neuronal sensitivity to a monocular feature is averaged out when the stimulus is a dynamic RDS. The monocular response components are constant across all disparities. The response modulation to an RDS directly reflects the binocular interaction component (Fig. 12A, top row). On the other hand, the monocular image of an SFS shifts its position depending on the disparity. Neuronal sensitivity to monocular features affects the monocular components as the disparity is changed. The response modulation to an SFS is the sum of the modulation of the monocular component and the modulation of the binocular interaction component (Fig. 12A, bottom row). To eliminate the effects of the monocular component in the response to an SFS for the recorded V4 neurons, we mimicked the binocular interaction component of the disparity energy model. This was done by subtracting the trial-averaged response to a monocularly presented SFS from the trial-by-trial responses to a binocularly presented SFS at each corresponding disparity.
Among the 57 cells whose disparity tunings were examined with both an RDS and an SFS, we recorded the responses of 50 cells to monocular presentations of an SFS to the left and right eyes in the same block as the binocular presentations. The disparity-tuning data for an SFS were corrected for monocular features by subtracting the sum of the mean firing rates elicited by the left and right monocular presentations at each corresponding disparity. Only one of the 37 cells that originally had statistically significant disparity sensitivity for an SFS lost its sensitivity after correction for monocular shifts (Kruskal-Wallis test, P < 0.05). Using the resulting surrogate binocular interaction component for an SFS, we refitted the Gabor function for both RDS- and SFS-induced responses (Fig. 12C) of 44 cells that were statistically disparity sensitive for either RDSs or SFSs (Kruskal-Wallis test, P < 0.05). Seven cells were discarded because of inadequate fitting results (R² < 0.6). Because of the subtraction of the responses to monocular stimuli, many of the surrogate binocular interactions for an SFS were negative. Although negative responses are not realistic, this does not affect the following analysis because we focus only on the modulation, and not on the baseline level of the disparity-tuning data.
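The correction for monocular features can be sketched as follows. The data layout (dictionaries keyed by disparity) and the function name are assumptions for illustration, and the firing rates are invented.

```python
from statistics import mean


def surrogate_binocular_interaction(binocular_trials, left_trials, right_trials):
    """Subtract the summed mean monocular responses from each binocular trial.

    Each argument maps disparity -> list of single-trial firing rates (spikes/s).
    The monocular trials at a given disparity use the same half-images as the
    binocular stimulus at that disparity.
    """
    corrected = {}
    for d, trials in binocular_trials.items():
        monocular_sum = mean(left_trials[d]) + mean(right_trials[d])
        corrected[d] = [r - monocular_sum for r in trials]
    return corrected


# Hypothetical responses at two disparities.
binocular = {0.0: [30.0, 34.0], 0.4: [12.0, 10.0]}
left = {0.0: [10.0, 12.0], 0.4: [8.0, 8.0]}
right = {0.0: [6.0, 8.0], 0.4: [9.0, 7.0]}
result = surrogate_binocular_interaction(binocular, left, right)
print(result)
```

Subtracting trial-averaged monocular responses from single binocular trials preserves the trial-to-trial variability needed for the significance tests, while allowing the corrected values to become negative.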
We compared the parameters of the fitted Gabor function between the surrogate binocular interaction components obtained using RDSs and SFSs. The amplitude A and the frequency f did not correlate between the two types of stereograms (rS = 0.13, P = 0.45; rS = 0.20, P = 0.24 for A and f, respectively) (Fig. 13, A and B). The frequency f was significantly larger for an RDS than for an SFS (Wilcoxon's signed-rank test, P = 0.03). Although there was no correlation between the Gabor frequency f for the surrogate binocular interaction and the width of the bar in the SFS (rS = 0.21, P = 0.22), the difference in the frequency magnitude may partially be explained by the fact that the bar in the SFS was wider than the dots in the RDS (see APPENDIX). These results were essentially identical to those obtained before subtracting the contribution of the monocular features (Fig. 11, A and B). In contrast, we found a weak, but significant, correlation in the Gabor phase between the tuning curves for the two types of stereograms (rS = 0.41, P = 0.012) (Fig. 13C). This correlation held for the symmetry phase as well (rS = 0.41, P = 0.012) (not shown). Although the phases of the disparity-tuning curves were correlated, the preferred disparity of the surrogate binocular interaction for an SFS did not correlate with that for an RDS (rS = 0.075, P = 0.77) (Fig. 13D). Because the correlation in the phase was absent before subtracting the contribution of the monocular features (Fig. 11C), our data suggest that the modulations of the responses arising from positional shifts in the monocular images of an SFS contribute, at least in part, to the discrepancy in the disparity-tuning profiles obtained with an RDS or an SFS.
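The rank correlations (rS) reported above compare one fitted parameter per cell across the two stereogram types. A minimal sketch of Spearman's rank correlation is given below, assuming ties receive the average of their ranks; the helper names and the parameter values are hypothetical and are not taken from the recorded data. The sketch returns only the coefficient, not the P value.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_r(x, y):
    """Spearman's rS: the Pearson correlation computed on the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long samples with at least 2 cells")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    # Undefined (division by zero) if either sample is constant.
    return cov / (var_x * var_y) ** 0.5


# Hypothetical fitted parameter (one value per cell) for each stimulus type.
param_rds = [0.2, 0.9, 1.4, 0.5, 1.1]
param_sfs = [0.4, 0.7, 1.6, 0.3, 1.0]
print(spearman_r(param_rds, param_sfs))
```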
| DISCUSSION |
|---|
Functional architecture of stereo processing
The functional architecture of disparity tuning was addressed by analyzing the similarities between SU and MU responses. This approach revealed that disparity-selective neurons are columnarly organized in MT (DeAngelis and Newsome 1999). The V4 data of the present study showed a strong correlation between the SU and MU DDIs (rS = 0.64), which is similar to the correlation reported in MT (rS = 0.66) and stronger than the correlation reported in V1 (rS = 0.37) (Prince et al. 2002a). Therefore, as in MT, clustering of disparity-sensitive cells and clustering of disparity-insensitive cells exist in the organization of V4, although this clustering was more continuous than discrete because the DDIs exhibited a unimodal instead of a bimodal distribution. Because we did not examine whether the clusters were arranged perpendicular to the surface of the cortex, we cannot address the existence of disparity columns in V4.
Within the clusters of disparity-sensitive cells, there was a weak correlation between the SU and MU preferred disparities (r = 0.30). The magnitude of this correlation is more similar to the correlation observed in V1 (r = 0.30) than the correlation observed in MT (r = 0.91). The small correlation found in V4 does not directly indicate that nearby cells possess a variety of preferred disparities. Neither the distribution of SU preferred disparities nor the distribution of MU preferred disparities covered the full range of tested disparities (Fig. 8D). Thus in V4, the low correlation between the SU and MU preferred disparities reflected only the prevalence of cells preferring a limited range of disparities.
The analysis of SU and MU responses was very similar to the one in Watanabe et al. (2002). However, Watanabe et al. used an SFS as their stimulus. The present results show that the disparity tuning of V4 cells differs drastically depending on whether the stimulus is an SFS or an RDS. It was therefore an empirical question whether the changes in disparity tuning are coherent among nearby cells when the stimulus is switched from an RDS to an SFS. The changes turned out to be coherent, because an SU/MU correlation was observed with both an RDS (this study) and an SFS (Watanabe et al. 2002).
Even-symmetric disparity tuning in V4
Disparity-tuning profiles of cortical neurons are conventionally classified into distinct subtypes (Poggio and Fischer 1977). One qualitative characteristic of these subtypes is the symmetry of the disparity-tuning curve, which can be captured by the phase parameter of the fitted Gabor function (DeAngelis et al. 1991). A "near"- or a "far"-type neuron has an odd-symmetric disparity-tuning curve. These classes of neurons are implicated in vergence eye movements that bring the left and right retinal images into registration (Marr and Poggio 1979). A "tuned-near," a "tuned-zero," or a "tuned-far" neuron has an even-symmetric disparity-tuning curve. These classes of neurons are thought to act as disparity detectors (Marr and Poggio 1979). Few studies, however, have inferred the functional significance of neurons from their disparity-tuning profiles, especially because quantitative examination failed to find these discrete classes of neurons in V1 (see Cumming and DeAngelis 2001 for a review; Prince et al. 2002b).
Neurons in MT have a strong bias for odd symmetry, and neurons in the medial superior temporal area (MST) apparently have an even stronger bias for odd symmetry (see Cumming and DeAngelis 2001 for a review; DeAngelis and Uka 2003; Takemura et al. 2001). The disparity-tuning curves of neurons in V4 had Gabor phases between π/4 and π/2, which would classify them as odd symmetric if they were evaluated solely based on this parameter. The reevaluation based on symmetry phase, however, indicated that the actual V4 tuning curves were closer to even symmetric. Although the symmetry of MT and MST cells has not been evaluated based on symmetry phase, it is likely that the majority of their disparity-tuning curves are indeed odd symmetric because the published disparity-tuning curves have a clear trough together with a peak. In contrast to areas MT and MST, neurons with odd-symmetric disparity-tuning profiles were rare in V4. The symmetry estimate in the present study was not confounded by truncation of the tuning curve at zero firing rates. Our fitting procedure was allowed to search for the best-fitting truncated function. Nevertheless, it yielded only four cells that had clipped tuning curves. The observation that V4 neurons have predominantly even-symmetric disparity tuning does not directly indicate that V4 neurons serve as disparity detectors in the classical sense. Such disparity detectors are each assigned to signal the presence of a target with a particular binocular disparity (Marr and Poggio 1976). To sufficiently encode visual information with this interval-encoding scheme, the archetypal disparity detectors should cover an adequately wide range of stereoscopic depth (Lehky and Sejnowski 1990).
The preferred disparities of V4 neurons, however, were strongly biased toward a narrow range of disparities. Although there are more MT neurons that prefer crossed disparities than uncrossed disparities, the bias for crossed disparities in V4 appears to be stronger than is the bias observed in MT or V1 (DeAngelis and Uka 2003; Prince et al. 2002b). Our results suggest that the role of V4 neurons is different from the classical disparity detection. Rather than an interval-encoding scheme, we suggest revisiting a rate-encoding scheme for stereoscopic depth representation in V4. According to the rate-encoding scheme, depth is nearly proportional to the pooled firing rate of a population of relevant neurons, at least within a particular range of disparities. This scheme accounts for some aspects of psychophysical performance during stereoacuity judgments (Badcock and Schor 1985), but was rejected only because no physiological studies had reported disparity-tuning characteristics suitable for the rate-encoding scheme (Lehky and Sejnowski 1990). Because the pooled disparity-tuning curve exhibited a steep slope near zero disparity in this study, the population of V4 neurons had a tuning profile that should allow for rate encoding of fine stereoscopic depth.
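The rate-encoding idea can be illustrated with a toy population. The sketch below is not a model fitted to the recorded cells. It assumes a commonly used Gabor form for the tuning curve (baseline plus a Gaussian envelope multiplied by a cosine carrier), assumes that crossed disparities are written as negative values, and uses made-up parameter values with preferred disparities clustered at small crossed disparities. Under these assumptions the pooled rate changes monotonically, and steeply, across zero disparity, which is the property a rate code needs.

```python
import math


def gabor_tuning(d, baseline, amplitude, d0, sigma, freq, phase):
    """Assumed Gabor tuning curve: baseline + Gaussian envelope * cosine carrier."""
    envelope = math.exp(-((d - d0) ** 2) / (2.0 * sigma ** 2))
    carrier = math.cos(2.0 * math.pi * freq * (d - d0) + phase)
    return baseline + amplitude * envelope * carrier


def pooled_rate(d, population):
    """Mean firing rate across the population at disparity d (degrees)."""
    return sum(gabor_tuning(d, **cell) for cell in population) / len(population)


def pooled_slope(d, population, step=1e-3):
    """Central-difference estimate of the slope of the pooled tuning curve."""
    upper = pooled_rate(d + step, population)
    lower = pooled_rate(d - step, population)
    return (upper - lower) / (2.0 * step)


# Hypothetical even-symmetric cells (phase 0) preferring small crossed disparities.
population = [
    dict(baseline=10.0, amplitude=20.0, d0=d0, sigma=0.25, freq=0.5, phase=0.0)
    for d0 in (-0.30, -0.25, -0.20, -0.15)
]

for d in (-0.1, 0.0, 0.1, 0.2, 1.0):
    print(d, round(pooled_rate(d, population), 2), round(pooled_slope(d, population), 2))
```

Within the monotonic segment around zero disparity, a downstream reader could in principle map the pooled rate back to a unique disparity; far from zero the pooled curve is flat and carries little information about depth.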
What information is signaled by V4 neurons?
Because the preferred disparities were confined to a narrow range at a small crossed disparity, the positions on the disparity-tuning curves with the steepest slopes were confined to a narrow range near zero disparity. The sensitivity for subtle disparity increments is thus highest near a zero disparity pedestal. For an observer or a downstream neural system that has access only to the firing of V4 neurons, the largest benefit is attained when these responses are used to detect a subtle protrusion from the background.
This finding agrees with area V4 lesion studies. Ablation or pharmacological inactivation of area V4 leads to mild deficits in a number of visual detection and discrimination tasks and severe impairments in form discrimination and the detection of structured patterns embedded in a random background (Heywood et al. 1992; Merigan 1996; Schiller 1993). Our results demonstrated that many V4 cells were strongly activated when a stereoscopic percept of a disk is slightly protruding from a background. A protrusion is a salient and structured visual feature, and neuronal responses in V4 are suited to detect this. Because both stereoscopic depth contrast and chromatic contrast can trigger large responses in V4 (Schein and Desimone 1990; Zeki 1983), the visual attribute that renders the salience may not be important to V4 cells. Because neuronal firing in V4 predicts a saccadic eye movement toward the location of the neuron's RF (Mazer and Gallant 2003), V4 responses to salient features are likely used by the oculomotor system in more natural viewing conditions.
The validity of the neuron-orthoneuron hypothesis in V4 for stereoscopic depth discrimination
In principle, our results only suggest that V4 neurons are suited for the detection of stereoscopic depth. To address the role of V4 neurons more directly, it is important for future studies to examine neuronal responses while monkeys are engaged in a specific stereoscopic task. The design of future studies will critically rely on the framework of how V4 responses are read out by a downstream system that is involved in decision making during a stereoscopic task. The data of the present study are informative for that framework, and in turn for devising a task that is appropriate for exploring the function of V4 neurons.
A recent study compared neuronal performance of MT cells recorded while a subject performed a task with the psychophysical performance (Uka and DeAngelis 2003). The monkey was trained to indicate whether it perceived a near or a far plane embedded in binocularly uncorrelated dots. The near and far planes were presented in successive trials, and the two planes were never presented simultaneously. As a consequence of the task design, this study evaluated the neuronal performance by assuming the existence of a hypothetical antineuron that had an exactly opposite disparity preference. The receiver-operating characteristic (ROC) analysis was based on the estimation of an ideal observer who randomly samples responses from the recorded neuron and the hypothetical antineuron (Britten et al. 1992). The assumption is supported by an earlier report that the preferred disparities of MT neurons are distributed sufficiently widely, ranging from crossed to uncrossed, although there is a small bias toward neurons preferring crossed disparities (DeAngelis and Uka 2003). This scheme could at least explain