The Journal of Neurophysiology Vol. 87 No. 6 June 2002, pp. 3102-3116
Copyright ©2002 by the American Physiological Society
1The Interdisciplinary Center for Neural Computation and 2Department of Neurobiology, Hebrew University of Jerusalem, Jerusalem 91904; 3Department of Neurobiology, Weizmann Institute of Science, Rehovot 76100; 4Imaging Department, Wohl Institute for Advanced Imaging, Sourasky Medical Center, Tel Aviv 64239; and 5Faculty of Medicine, Tel Aviv University, Tel Aviv 69978, Israel
ABSTRACT
Avidan, Galia, Michal Harel, Talma Hendler, Dafna Ben-Bashat, Ehud Zohary, and Rafael Malach. Contrast Sensitivity in Human Visual Areas and Its Relationship to Object Recognition. J. Neurophysiol. 87: 3102-3116, 2002. An important characteristic of visual perception is the fact that object recognition is largely immune to changes in viewing conditions. This invariance is obtained within a sequence of ventral stream visual areas beginning in area V1 and ending in high order occipito-temporal object areas (the lateral occipital complex, LOC). Here we studied whether this transformation could be observed in the contrast response of these areas. Subjects were presented with line drawings of common objects and faces in five different contrast levels (0, 4, 6, 10, and 100%). Our results show that indeed there was a gradual trend of increasing contrast invariance moving from area V1, which manifested high sensitivity to contrast changes, to the LOC, which showed a significantly higher degree of invariance at suprathreshold contrasts (from 10 to 100%). The trend toward increased invariance could be observed for both face and object images; however, it was more complete for the face images, while object images still manifested substantial sensitivity to contrast changes. Control experiments ruled out the involvement of attention effects or hemodynamic "ceiling" in producing the contrast invariance. The transition from V1 to LOC was gradual with areas along the ventral stream becoming increasingly contrast-invariant. These results further stress the hierarchical and gradual nature of the transition from early retinotopic areas to high order ones, in the build-up of abstract object representations.
INTRODUCTION
Recently, several neuroimaging studies have revealed a high order cortical region, located at the occipito-temporal junction (the lateral occipital complex, LOC), which possesses a number of functional properties associated with high-level object-related representations. Thus the LOC has been shown to manifest a high degree of size and position invariance (Grill-Spector et al. 1999) and to be activated by image completion (Lerner et al. 2001b) and grouping processes (Hasson 2001; Kourtzi and Kanwisher 2001). These processes are remarkably similar to those encountered in recognition performance. Finally, using a backward masking paradigm, it was shown that the activation pattern in the LOC is highly correlated with the subjects' recognition performance rather than the physical duration of stimulus exposure (Grill-Spector et al. 2000).
One issue that remains unresolved is the characteristic of the process by which the functional transformation from the retinal image to high-level object representation is accomplished. It is well established that a sequence of ventral stream object areas is involved (Felleman and Van Essen 1991; Lerner et al. 2001a; Tootell et al. 1996), but the relative contribution of each stage in the process is unknown. For example, it is not clear whether the transformation is gradual, where each stage in the sequence is contributing a small increment, or whether it occurs in a few large steps.
Here we used a single well-defined visual property, that of image contrast, to follow the transformation in image representation along the entire constellation of ventral stream human visual areas. Using this approach, differences between areas in terms of their contrast response function could be explored and related to the putative hierarchical processing which exists between visual cortical areas along the ventral stream (Ungerleider and Mishkin 1982).
Contrast is a suitable parameter to study because perceptually, object recognition is highly invariant to contrast changes beyond a minimal contrast level. However, retinal responses are highly sensitive to all contrast levels. Consequently, the contrast response function can be used as a tool to explore to what extent activation in a given visual area is determined by the physical contrast of the stimulus and to what extent it is related to the subject's perceptual performance. The answer to this question could shed light on the nature of the hierarchical processing in the visual system. More specifically, it will aid in determining to what extent the establishment of contrast invariance is a gradual process, whether contrast invariance in a given cortical area is specific to particular object shapes, and which areas are most closely related to recognition performance.
Using functional magnetic resonance imaging (fMRI), we studied the contrast response function along the entire set of ventral-stream human visual areas. Our results reveal that the correlation between physical stimulus contrast and fMRI response shows a gradual and consistent decline as one moves to high-order visual areas along the ventral stream. Concurrently, the fMRI signal shows a consistently increasing correlation with recognition performance. Thus the two subdivisions of the lateral occipital complex, the dorsal lateral occipital region (LO) and the more ventral and temporal region located in the posterior fusiform gyrus (pFs), showed the strongest tendency toward contrast invariance, especially for face stimuli.
These results reflect a hierarchical trend in the human visual cortex in which cortical responses gradually depart from the physical aspects of the visual stimulus and become correlated with perceptual experience. Some of these results have been published previously in abstract form (Avidan-Carmel et al. 2000).
METHODS
Subjects
Twelve healthy subjects (6 women, ages 24-50) participated in one or more of the experiments. All subjects had normal or corrected-to-normal vision and provided written informed consent. The Tel-Aviv Sourasky Medical Center approved the experimental protocol.
MRI setup
Subjects were scanned in a 1.5-T Signa Horizon LX 8.25 GE scanner equipped with a standard birdcage head coil. In the block-design experiments (experiments 1, 2, 4, and 5), blood-oxygenation-level-dependent (BOLD) contrast was obtained with a gradient-echo echo-planar imaging (EPI) sequence (TR = 3,000 ms, TE = 55 ms, flip angle = 90°, field of view 24 × 24 cm², matrix size 80 × 80). The scanned volume included 17 nearly axial slices of 4-mm thickness and 1-mm gap. In the event-related experiment (experiment 3), the scanning parameters were changed (TR = 1,500 ms, TE = 55 ms, flip angle = 70°) and the scanned volume included eight oblique slices. T1-weighted high-resolution (1 × 1 × 1 mm) anatomical images and a three-dimensional spoiled gradient echo sequence were acquired on each subject to allow accurate cortical segmentation, reconstruction, and volume-based statistical analysis.
Visual stimulation
Stimuli were generated on a PC and projected via an LCD projector (Epson MP 7200) onto a tangent screen positioned over the subject's forehead and viewed through a tilted mirror located above subjects' eyes.
Experiments
EXPERIMENT 1: FACES AND OBJECTS. Ten subjects participated in the experiment. The experiment (Fig. 1), which lasted 450 s, consisted of 12 different stimulus conditions and had 57 epochs which were presented in a block design paradigm. Stimuli were 19 × 17° black on white line drawings of faces and objects, and control stimuli were texture patterns (for an example, see Fig. 1). The face stimuli were either a woman, man or a child and the object stimuli included: man-made objects, images of vehicles and images of buildings.
EXPERIMENT 2: FACES, CARS, AND HOUSES. Eight subjects participated in experiment 2, which consisted of 12 different stimulus conditions. Stimuli were pictures of faces, houses, and cars presented at contrasts of 4, 6, 10, and 100%, and each condition was repeated twice. Stimuli were generated in the same way as in experiment 1. Presentation of stimuli and task (i.e., subordinate category naming) were identical to experiment 1.
EXPERIMENT 3: EVENT-RELATED CONTROL. Five subjects participated in experiment 3, in which we used the 100 and 10% contrast pictures of faces and objects that were used in experiment 1. Sixty-four presentations of 16 different faces and 16 different objects (2 contrast levels each) were presented in a counter-balanced event-related paradigm. Each stimulus was presented for 300 ms followed by a 5,700-ms interval, and the experiment lasted 444 s. The experiment began with a 24-s blank and ended with an 18-s blank. In addition, there were two more long blank epochs during the experiment, each lasting 9 s. The subjects' task was to covertly name each of the stimuli that were presented.
EXPERIMENTS 4 AND 5: ATTENTION CONTROLS. Six subjects participated in both experiments 4 and 5. In these experiments, we used 54 different pictures of faces presented in 100 and 10% contrast. Each experiment lasted 228 s and consisted of 13 visual epochs of 9 s followed by a short blank of 6 s. The first visual epoch consisted of images of texture patterns and was not included in the statistical test. In addition there were two long blanks at the beginning and end of the experiment (21 and 12 s, respectively). Each picture was presented for 200 ms followed by 800 ms of blank. During the visual epochs, there was a small light-gray fixation point centered on each image while during the blank epochs the color of the fixation point was red.
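The run lengths quoted for the control experiments can be checked against the stated trial and blank durations. The short sketch below does that bookkeeping; the helper name `run_duration` is purely illustrative and is not part of the study's software.

```python
# Timing bookkeeping for the control runs (all values in seconds).
# The helper is a hypothetical illustration, not the authors' code.

def run_duration(n_trials, trial_len, blanks):
    """Total run length: repeated trials plus any extra blank periods."""
    return n_trials * trial_len + sum(blanks)

# Experiment 3: 64 events of 300 ms stimulus + 5,700 ms interval,
# a 24-s blank at the start, an 18-s blank at the end, and two 9-s blanks.
exp3 = run_duration(64, (300 + 5700) / 1000, [24, 18, 9, 9])

# Experiments 4 and 5: 13 visual epochs of 9 s, each followed by a 6-s blank,
# plus long blanks of 21 s and 12 s at the beginning and end.
exp45 = run_duration(13, 9 + 6, [21, 12])

print(round(exp3), exp45)  # 444 228
```

Both totals agree with the durations given in the text (444 s and 228 s).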
In experiment 4, the color of the fixation point was changed once or twice per epoch to a darker gray, and subjects had to perform a one-back memory task on the color of the fixation point. In four of the six subjects who were scanned in each experiment, we measured performance during the fMRI scan. Subjects provided their responses through a "Neuroscan" response box, and data were collected by in-house software. Subjects had to press one button when the fixation point did not change its color ("same") and another button when it did change its color ("different"). The fixation point disappeared during the short 800-ms interstimulus interval (ISI) blank so that the task had to be performed on the visual stimuli and not during the ISI. In experiment 5, the color of the fixation point did not change during the visual epochs and subjects had to perform a one-back memory task on the identity of the face images. A face was repeated once or twice during each epoch. Again, subjects' performance was collected via a response box.
Mapping borders of visual areas
The representation of vertical and horizontal visual field meridians was mapped in all subjects to delineate borders of retinotopic areas (DeYoe et al. 1996; Grill-Spector et al. 1998a; Sereno et al. 1995). Visual stimulation was presented at a rate of 4 Hz in 18-s blocks and consisted of triangular wedges that compensated for the expanded foveal representation. The wedges were presented either vertically (upper or lower vertical meridians) or horizontally (left or right horizontal meridians). The wedges consisted of either gray-level natural images or black and white object-from-texture pictures (Grill-Spector et al. 1998a). Subjects were asked to fixate on a small central cross. Visual epochs alternated with 6-s blanks. Four cycles of the stimuli were shown.
Because the exact parceling of the ventral areas V4 and V8 is still debatable in the literature (Hadjikhani et al. 1998; Zeki and Marini 1998), we defined a combined focus V4/V8 for which the posterior border is the upper visual meridian representation. The anterior border was defined as the border passing through half field representation (a lower visual field representation, a horizontal one and an upper visual field representation). The collateral sulcus activation was defined as activation that was located anterior to and outside from the upper visual field representation and was therefore not retinotopic (see Fig. 2A).
Data analysis
fMRI data were analyzed with the "BrainVoyager" software package (Brain Innovation, Maastricht, Netherlands) and with complementary in-house software. The data of each subject from each scan were analyzed separately. The first three images of each functional scan were discarded and a hemodynamic lag of 3 s was assumed. The functional images were superimposed on two-dimensional (2D) anatomical images and incorporated into the three-dimensional (3D) data sets through trilinear interpolation. The complete data set was transformed into Talairach space (Talairach and Tournoux 1988). Preprocessing of functional scans included 3D-motion correction and filtering out of low frequencies up to five cycles per experiment (slow drift).
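One way to picture "filtering out of low frequencies up to five cycles per experiment" is to subtract, from each voxel time course, the sine and cosine components that complete one to five cycles over the run. The sketch below is only an assumption about how such a filter could work; the function name and the toy signal are hypothetical, and the actual filter used by the analysis software may differ.

```python
import math

def remove_slow_drift(signal, max_cycles=5):
    """Subtract Fourier components with 1..max_cycles cycles per run.

    The mean (0 cycles) is kept so that a baseline level remains available.
    Assumes max_cycles is smaller than half the number of samples.
    """
    n = len(signal)
    cleaned = list(signal)
    for k in range(1, max_cycles + 1):
        cos_k = [math.cos(2 * math.pi * k * t / n) for t in range(n)]
        sin_k = [math.sin(2 * math.pi * k * t / n) for t in range(n)]
        # Projection of the signal onto the k-cycle cosine and sine.
        a = 2 / n * sum(s * c for s, c in zip(signal, cos_k))
        b = 2 / n * sum(s * c for s, c in zip(signal, sin_k))
        cleaned = [x - a * c - b * s for x, c, s in zip(cleaned, cos_k, sin_k)]
    return cleaned

# Toy voxel time course: baseline 100, a 2-cycle drift, and a 25-cycle signal.
n = 150
drift = [3 * math.sin(2 * math.pi * 2 * t / n) for t in range(n)]
task = [math.sin(2 * math.pi * 25 * t / n) for t in range(n)]
raw = [100 + d + s for d, s in zip(drift, task)]
filtered = remove_slow_drift(raw)
```

In this toy case the slow drift is removed while the baseline and the faster component are left intact.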
Statistical analysis was based on the General Linear Model (GLM) (Friston et al. 1995). In this analysis, a linear combination of several predictor variables $x_i$ is used to predict the variation of an observed variable $y$

$$y(t) = b_0 + b_1 x_1(t) + \dots + b_n x_n(t) + e(t)$$

The error values $e(t)$ are the deviations of the predicted values $\hat{y}(t)$ from the measured signal value $y(t)$ at each time point. The GLM analysis is performed independently for the time course of each individual voxel. The results of a GLM analysis of a voxel time course are estimates for the regression weights $b_i$ such that the predicted values $\hat{y}(t)$ are as close as possible to the measured values $y(t)$ at each time point. The least-squares method is used for estimating the regression weights such that the error values $e(t)$ are minimized

$$\sum_t e(t)^2 = \sum_t \left[ y(t) - \hat{y}(t) \right]^2 \rightarrow \min$$
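As an illustration of the voxel-wise least-squares fit described above, the following sketch estimates regression weights for a single toy time course. It is a minimal example under stated assumptions: the predictors are simple boxcars shifted by one volume (standing in for the 3-s hemodynamic lag at TR = 3 s), the function names are hypothetical, and the toy data are noise-free. It is not the BrainVoyager implementation.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def fit_glm(y, predictors):
    """Least-squares weights for y(t) = b0 + sum_i b_i x_i(t) + e(t)."""
    columns = [[1.0] * len(y)] + [list(p) for p in predictors]
    # Normal equations: (X^T X) b = X^T y
    xtx = [[sum(u * v for u, v in zip(ci, cj)) for cj in columns] for ci in columns]
    xty = [sum(u * v for u, v in zip(ci, y)) for ci in columns]
    return solve_linear_system(xtx, xty)

def boxcar(n_volumes, onsets, length, lag=1):
    """1 during an epoch and 0 elsewhere, shifted by `lag` volumes."""
    x = [0.0] * n_volumes
    for onset in onsets:
        for t in range(onset + lag, min(onset + lag + length, n_volumes)):
            x[t] = 1.0
    return x

# Toy example: two conditions, each with two 3-volume epochs.
faces = boxcar(30, onsets=[2, 14], length=3)
objects = boxcar(30, onsets=[8, 20], length=3)
y = [100 + 2.0 * f + 1.0 * o for f, o in zip(faces, objects)]  # noise-free
b0, b_faces, b_objects = fit_glm(y, [faces, objects])
```

With real data the same fit would be repeated for every voxel, and the residuals would feed the statistical test.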
When mapping the relative contribution of two functional responses (Fig. 5A, all face epochs vs. all object epochs in that example), the color coding represents the relative contribution of either set. If both predictor sets contribute roughly equally to the activation at a voxel, this voxel will be colored in blue. Green and red colors show strong contribution of one predictor set over the other. The exact color used depends on the level of differential contributions by each predictor set.
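Percent signal change for each subject in each experiment was calculated as the percent activation from a blank baseline

$$\text{percent signal change} = 100 \times \frac{S - S_{\text{blank}}}{S_{\text{blank}}}$$

where $S$ is the signal during a stimulus epoch and $S_{\text{blank}}$ is the mean signal during the blank epochs.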
All epochs belonging to the same condition were averaged together to provide an average condition epoch time course (Figs. 2B, 3, and 5B). Error bars indicate the standard error of the mean in each condition across all subjects.
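The two steps just described, expressing the signal relative to a blank baseline and averaging epochs of the same condition, can be sketched as follows. This assumes that "percent activation from a blank baseline" means the relative deviation from the mean blank signal; the function names and the numbers are hypothetical.

```python
def average_epochs(epochs):
    """Point-by-point average of equally long epoch time courses."""
    return [sum(values) / len(values) for values in zip(*epochs)]

def percent_signal_change(time_course, blank_signal):
    """Percent deviation of each time point from the mean blank signal."""
    baseline = sum(blank_signal) / len(blank_signal)
    return [100.0 * (s - baseline) / baseline for s in time_course]

# Hypothetical raw values for one voxel (arbitrary scanner units).
blank = [1000.0, 1002.0, 998.0]
face_epochs = [
    [1010.0, 1012.0, 1011.0],
    [1008.0, 1010.0, 1009.0],
]

mean_epoch = average_epochs(face_epochs)          # [1009.0, 1011.0, 1010.0]
pcs = percent_signal_change(mean_epoch, blank)    # about 0.9, 1.1, 1.0 percent
```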
The cortical surface was reconstructed from the 3D-spoiled gradient echo scan. The procedure included segmentation of the white matter using a grow-region function, the smooth covering of a sphere around the segmented region, and the expansion of the reconstructed white matter into the gray matter. The sulci were smoothed using a cortical "inflation" procedure. The surface was cut along the Calcarine sulcus and unfolded into the flattened format. The obtained activation maps were superimposed on the unfolded cortex and the Talairach coordinates were determined for the center of each region of interest (ROI).
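Contrast invariance ratio: to evaluate quantitatively the differences in the fMRI signal for the 100 and 10% contrast epochs for the faces and objects separately, we calculated a contrast invariance ratio (Fig. 4, Table 1)

$$\text{contrast invariance ratio} = \frac{\text{fMRI pcs at 10\% contrast}}{\text{fMRI pcs at 100\% contrast}}$$

where pcs denotes percent signal change.

[...] norm. fMRI pcs at 10%) [...] (%correct at 100% [...] %correct at 10%)].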
A one-way paired t-test was conducted to find whether the distance measure in area V1 was significantly greater than in the high-order visual areas: LO, pFs, and the Coll.
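A small sketch of how a contrast invariance ratio could be computed per subject and summarized as mean ± SE is given below. It assumes the ratio is the response at 10% contrast divided by the response at 100% contrast, which is inferred from the reported values (a ratio near 0.5 where the signal roughly halves, near 1 where it is invariant) rather than taken from an explicit formula. The per-subject numbers are hypothetical, not data from the study.

```python
import math
from statistics import mean, stdev

def contrast_invariance_ratio(pcs_10, pcs_100):
    """Response at 10% contrast relative to the response at 100% contrast."""
    return pcs_10 / pcs_100

def mean_and_se(values):
    """Mean and standard error of the mean across subjects."""
    return mean(values), stdev(values) / math.sqrt(len(values))

# Hypothetical per-subject percent signal change in one area.
v1_pcs_10 = [0.6, 0.5, 0.7, 0.6]
v1_pcs_100 = [1.2, 1.0, 1.0, 1.2]

ratios = [contrast_invariance_ratio(low, high)
          for low, high in zip(v1_pcs_10, v1_pcs_100)]
ratio_mean, ratio_se = mean_and_se(ratios)
```

Note that averaging per-subject ratios is a design choice of this sketch; a ratio of group means would generally give a slightly different number.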
RESULTS
Experiment 1
The aim of experiment 1 was to characterize the fMRI response of various visual areas to different contrast levels of visual stimuli. The stimuli we used were black-and-white line drawings of either objects or faces that were presented at five different contrast levels (0, 4, 6, 10, and 100%) in a short-block design paradigm (see METHODS and Fig. 1). Control stimuli were texture patterns that were presented at two contrast levels (10 and 100%). Epochs containing visual stimuli (including the 0% contrast epochs) were indicated by a gray fixation point at the center of each image, while interleaved blank epochs were identified by a red fixation point, located at the center of the screen. This was done mainly to differentiate between low-contrast epochs that engaged subjects' attention when attempting to recognize the objects from the blank epochs, which required only fixation.
Contrast response in visually active areas
Data from 10 subjects were analyzed. The basic statistical test applied to the data searched for voxels that were activated by all visual images (objects, faces, and patterns) irrespective of their contrast level, compared with blank (visual > blank). This test revealed activation in the entire set of visual areas as demonstrated in one representative subject in Fig. 2A. Activation maps are presented in two different formats. On the left, the data are shown on both hemispheres of an inflated brain seen from a ventral view. In the middle, data are shown on a flattened map of the right hemisphere. The visual areas of each subject were delineated by superimposing meridian maps that were obtained in a separate scan on the flattened activation maps obtained in the current experiment (see METHODS). The meridian borders are indicated on the central flattened map by white dotted lines. A detailed meridian map of the same subject is shown on the right, the same retinotopic borders as in the middle image are indicated by the red dotted lines.
This statistical test highlighted activation in the retinotopic areas, stretching from V1 to V4/V8 ventrally (see METHODS) and to V3A dorsally. High-order activation was found in two main foci: LO focus, which was situated ventrally and posteriorly to area MT and extending into the posterior inferotemporal sulcus, and a focus in the vicinity of the pFs, which is anterior and lateral to area V4/V8 and extending into the inferior temporal sulcus (see Table 1 for Talairach coordinates). The latter focus (pFs) may overlap the fusiform face area (FFA) described previously (Kanwisher et al. 1997) (see Fig. 5A). Another focus was situated within the anterior portion of the collateral sulcus (Coll., see METHODS).
In the dorsal pathway, there were two additional foci, one located adjacent to the upper visual field representation of area V3A, probably corresponding to area V7 (Hadjikhani et al. 1998; Mendola et al. 1999), and another region, located within the IPS.
After establishing an anatomical definition for each activation focus, we derived the average activation profiles (i.e., contrast response functions) for each cortical area for each subject using the flattened brain format. Figure 2B shows the contrast response function (averaged across all subjects) for the various areas in the ventral pathway. Red and green graphs denote activation profiles for face and object images, respectively. A conspicuous difference was observed in the contrast response function of early and intermediate versus higher visual areas. Early and intermediate visual areas (V1-Vp, V4/V8, respectively) manifested strong contrast dependence at suprathreshold contrast levels, for both faces and objects, so that when the contrast was lowered from 100 to 10%, signal intensity was reduced to about half. On the other hand, higher-order, object-related areas (LO, pFs) showed a significantly lower contrast dependence at this range. This effect was clearly evident for face images and was somewhat weaker for objects.
Figure 3 shows time-course data averaged across 10 subjects from areas V1, LO, and the pFs for the face, object, and pattern stimuli. As shown in Fig. 2B, the marked difference in terms of contrast response function between activation in primary visual cortex (V1) and higher-order areas (LO, pFs) is evident.
To make sure that activation in lower visual areas was not underestimated due to the statistical test used (visual > blank), activation in areas V1 and V2 was also sampled from 4 subjects by a statistical test comparing activation in 100% contrast epochs versus blank and ignoring the rest of the epochs (all 100% epochs > blank). Comparing the number of activated V1/V2 voxels in this test versus the former one (using the same statistical threshold for each subject) revealed no significant difference (paired t-test P < 0.19) but a trend of reduction in the number of voxels in the latter test (all 100% epochs > blank) probably due to its weaker statistical power.
To obtain a quantitative comparison of the level of suprathreshold contrast invariance across the different areas, we calculated a "contrast invariance ratio" (see METHODS). High levels of this ratio indicate greater invariance to contrast changes. Figure 4 exhibits the contrast invariance ratio for each of the visual areas presented in Fig. 2B for each type of visual stimuli used in the experiment. The leftmost bar graph represents the data obtained for the face stimuli, the middle one for the objects, and the rightmost one for the pattern stimuli. Note the gradual increase in the contrast invariance ratio going from early retinotopic visual areas (V1-Vp), which manifested high sensitivity to changes in contrast, through intermediate areas (V4/V8), which showed less sensitivity to contrast changes, to higher, nonretinotopic areas (LO, pFs, Coll.), which manifested a high degree of contrast invariance, particularly for faces but also for objects compared with area V1. [V1, contrast invariance ratio: 0.59 ± 0.10 (mean ± SE), 0.53 ± 0.07 for faces and objects, respectively; LO, contrast invariance ratio: 1.06 ± 0.04, 0.78 ± 0.06; pFs: 0.94 ± 0.05, 0.75 ± 0.07 for faces and objects, respectively, and see Table 2 for ratios of all areas.]
It should be noted that in this experiment there were epochs that contained 0% contrast stimuli in which subjects were instructed to attempt to recognize objects, so that imagery and/or expectation effects could be detected. Across all areas the activation for the 0% contrast epochs was not significantly different from zero (t-test, P < 0.10). Thus it seems that under the specific conditions of the present experiment there was no clear evidence for a component of imagery or expectation-related activation in any of the studied areas. Such effects were reported by other studies (Ishai et al. 2000; Kastner et al. 1999). This difference may be due to the fact that in the present experiment subjects were not required to actively try to imagine stimuli in low-contrast epochs.
An interesting question is whether the trends toward increased invariance continued beyond the LOC. To test this possibility, we looked at activation found for the same statistical test (visual > blank) in the prefrontal region within the vicinity of the middle frontal sulcus. Such activation was found in 6 of the 10 subjects who participated in experiment 1 and was generally noisier than activation in visual areas. Interestingly, this frontal focus exhibited similar results to those obtained in areas LO and pFs, in terms of their contrast invariance ratios (contrast invariance ratio in prefrontal region: faces: 1.00 ± 0.11; objects: 0.72 ± 0.11; patterns: 0.77 ± 0.10, compare with LO and pFs ratios given in Table 2).
In the current experiment, we tried to minimize the interaction between different experimental conditions that might cause contrast adaptation due to repeated presentation of the same stimuli in different contrast levels. This was done by pseudo-randomization of the experimental design (see METHODS for details). However, in three epochs (of 8), high-contrast stimuli appeared before the low-contrast epochs of the same stimuli. In all other cases, low-contrast epochs appeared before high-contrast epochs. Comparing subjects' performance during low contrast (4 and 6%) epochs, in which adaptation could take place, to epochs in which adaptation was not possible did not reveal a significant difference (paired t-test P < 0.2). This implies that such adaptation effects were indeed minimized in the present experiment.
Contrast response in specific object-category regions
An interesting question is whether the contrast response function in high-order areas is related to the shape selectivity of the different foci of activation. To explore this issue, we conducted additional statistical tests that looked for the specific functional signature of object-selective brain regions (Epstein and Kanwisher 1998; Ishai et al. 1999; Kanwisher et al. 1997). Figure 5A shows activation maps of the left and right hemispheres of one subject. In this map, voxels were color coded according to the relative contribution of two predictors (Friston et al. 1995; Goebel et al. 1998) (and see METHODS for details of analysis). Specifically, voxels were color coded according to their activation by face epochs (red), object epochs (green), and both (blue), regardless of their contrast level. Note that this test is somewhat different from conventional object-selectivity tests, which typically use exclusively high-contrast images. Retinotopic borders are indicated by the white dotted lines.
Face-related voxels (red) appeared mainly within LO and the pFs (arrows). Voxels that showed the highest selectivity for faces within the pFs are marked as the fusiform face area (FFA) (Kanwisher et al. 1997). Except for this clear preference for face-related activation, both LO and pFs tended to exhibit a rather balanced activation for both faces and objects, as indicated by the blue color in the vicinity of these areas. Voxels that showed preferential activation for objects versus faces (green) appeared mainly in the collateral sulcus. In addition, such balanced activation was also found in several brain regions typically stretching from area V3A, V7 and the IPS dorsally to area Vp, V4/V8 ventrally. Figure 5B depicts results that were obtained by using the objects > faces test and the faces > objects test. Activation profiles from the objects > faces test were sampled from the collateral sulcus (object coll.) and activation profiles from the faces > objects test were sampled from LO and the pFs (face LO, face pFs). Similar to the results shown in Fig. 2B for higher-order areas, contrast invariance existed for faces and to a lesser degree for objects (see Table 2 for contrast invariance ratios).
Psychophysical experiment
An issue of interest is the correspondence between brain activation and human performance. To explore this relationship to the contrast response, all 10 subjects from experiment 1 also participated in a psychophysical experiment that was conducted in the magnet at the end of the scanning session under the same viewing conditions. In this experiment, subjects were shown the same set of images as in experiment 1 (except for the pattern stimuli), only this time they were asked to overtly name each stimulus. The recognition performance of the subjects is presented in Fig. 6. The averaged recognition performance (percent correct) for each contrast level across the 10 subjects was: faces: 100%: 100 ± 0%; 10%: 96.6 ± 2.2%; 6%: 73.6 ± 6.2%; 4%: 23.9 ± 7.9%; objects: 100%: 99.7 ± 0.3%; 10%: 97.5 ± 0.7%; 6%: 61.9 ± 6.73%; 4%: 26.6 ± 7.9%; means ± SE.
To compare the fMRI signal of the subjects to their recognition performance, we normalized the fMRI signal for each subject (see METHODS). This was done separately for the signal for faces and objects in areas V1, LO, pFs, and the Coll. A distance measure between the norm. fMRI signal and recognition performance was calculated for each subject for the face and object stimuli (METHODS). A t-test revealed that fMRI signal in the high-order object related areas LO, pFs, and the Coll. was significantly (paired t-test, P < 0.05) more correlated to recognition performance compared with area V1.
Control experiments
In addition to the main experiment (experiment 1), we conducted several control experiments. Because the contrast response function of LO and the pFs was not significantly different (ANOVA: 2-factor analysis, P < 0.75), for simplicity of presenting the control data in this section we averaged them together to a combined focus termed the lateral occipital complex (LOC).
Experiment 2
While in LOC the activation caused by face stimuli was invariant to changes from 100 to 10% contrast, activation for the object images did not reach the same degree of invariance. A main difference between these two types of stimuli is that the object images, unlike faces, included a diverse set of shapes. To examine the impact of shape diversity, we conducted another experiment (experiment 2) in which we used two well-defined object categories, houses and cars, which have a narrower shape diversity compared with common objects. In addition, we included the face images used in the original experiment (experiment 1). Each image was presented in 4 contrast levels (4, 6, 10, and 100%). The results of this experiment are summarized in Table 2. Three different statistical tests were used: visual > blank, faces > houses, and houses > faces. In agreement with the results obtained in experiment 1, the response to faces was highly invariant to contrast changes within the LOC. In the collateral sulcus and LOC, the contrast invariance ratio for house stimuli was indeed higher than the one obtained in experiment 1 when using objects from various categories (see Table 2). On the other hand, the contrast invariance ratio in LOC for the second category of images (cars) was not substantially different from the results obtained for the mixed object stimuli in experiment 1 (see Table 2). Thus it seems that shape diversity was not the only factor contributing to the lower contrast invariance for objects compared with faces.
Experiment 3
It could be argued that the contrast invariance measured in high-order visual areas is a result of a saturation ("ceiling") of the fMRI hemodynamic signal and thus does not reflect a neuronal contrast invariance. To rule out this possible confound, we conducted another experiment in which we used the 100 and 10% face and object images from experiment 1. However, this time we used an event-related presentation paradigm, which reduces the signal by approximately an order of magnitude, thus ensuring that it would not saturate. The results of that experiment are depicted in Fig. 7 which shows the activation profiles for faces (red) and for objects (green) of V1 and the LOC (see Table 2 for exact ratios). As in the block-design experiment, area V1 was highly sensitive to contrast changes, while the LOC showed a high degree of contrast invariance for object stimuli and complete invariance to contrast for face images. These results match the results of the block-design experiment from the anatomical point of view as well.
Experiments 4 and 5
Attention and task demands were previously shown to modulate the activation in high-order visual areas. It could be argued that the contrast invariance measured in LOC is a result of such effects and thus does not reflect contrast invariance of the neurons in that area. To rule out this possible confound, we conducted two additional experiments in which we used 100 and 10% face stimuli that were presented in a block-design fashion (see METHODS for details). The aim of the first experiment (experiment 4) was to explore whether attention could be the source for the contrast invariance found in LOC. This was done by instructing the subjects to perform an attention demanding foveal task and thus focusing their attention away from the face stimuli presented in the experiment. Specifically, the fixation point, centered on each image, changed its color once or twice in each visual epoch from light to darker gray. Subjects had to perform a one-back memory task and to report via pressing on a response box whether the fixation point changed its color or not. Note that the task was identical during the 10 and 100% contrast epoch. The aim of the second experiment was to explore whether changing task demands could affect the contrast response found in LOC. In that experiment (experiment 5), subjects had to perform a one-back memory task on the identity of the face stimuli. Note that this task is markedly different from the covert-naming task used in experiment 1.
As in the analysis of experiment 1, LOC voxels in experiments 4 and 5 were sampled from a statistical test searching for all visually active voxels (visual > blank). The results of these two experiments are shown in Fig. 8. The bar graph shows the contrast invariance ratio (left y axis) calculated for the LOC in experiment 1 (original, for the face stimuli), experiment 4 (attention to fixation), and experiment 5 (attention to faces). In addition, subjects' performance in all three tasks is presented on the right y axis, plotted separately for the 100% contrast epochs and the 10% contrast epochs. Note the similarity of the contrast invariance ratio obtained in the three different experiments. This implies that the contrast invariance found in the LOC is not a result of specific task demands or attentional modulation and that it is actually immune to such manipulations.
Overall activation level (averaged percent signal change across subjects) was slightly reduced in the attention-to-fixation task (experiment 4) compared with the attention-to-faces task (experiment 5; attention to fixation: 100% contrast: 1.14 ± 0.08%, 10% contrast: 1.11 ± 0.08%; attention to faces: 100% contrast: 1.24 ± 0.14%, 10% contrast: 1.28 ± 0.13%). Regarding subjects' performance: in both experiments (experiments 4 and 5), task performance was not significantly different (P < 0.15) during the 100% versus 10% contrast epochs [experiment 4: 100% contrast: 84 ± 10% (mean ± SD), 10% contrast: 80 ± 12%; experiment 5: 100% contrast: 94 ± 5%, 10% contrast: 91 ± 5%]. It is important to note that performance was high for all three tasks tested, although in the attention-to-fixation task (experiment 4) performance was somewhat lower, which implies that this task was the most demanding.
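As a rough illustration of how similar the two control conditions are, the group-averaged LOC signal values reported above can be turned into a low-to-high contrast ratio. The sketch below assumes that the ratio is the response at 10% contrast divided by the response at 100% contrast; the exact definition behind Fig. 8 and Table 2 is given in METHODS and may be computed per subject, so these numbers are only indicative. The function and variable names are hypothetical.

```python
# Group-averaged percent signal change in LOC, as reported in the text.
# Keys and structure are illustrative, not taken from the original analysis.
loc_signal = {
    "attention to fixation (experiment 4)": {"10%": 1.11, "100%": 1.14},
    "attention to faces (experiment 5)": {"10%": 1.28, "100%": 1.24},
}


def invariance_ratio(low_contrast_response, high_contrast_response):
    """Assumed definition: response at low contrast / response at high contrast.

    A value near 1 means the response barely changes with contrast.
    """
    if high_contrast_response == 0:
        raise ValueError("high-contrast response must be non-zero")
    return low_contrast_response / high_contrast_response


ratios = {
    task: invariance_ratio(values["10%"], values["100%"])
    for task, values in loc_signal.items()
}

for task, ratio in ratios.items():
    print(f"{task}: {ratio:.2f}")
```

Under this assumed definition both tasks give ratios close to 1 (about 0.97 and 1.03), in line with the similarity noted for Fig. 8.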
| |
DISCUSSION |
|---|
|
|
|---|
Hierarchical processing reflected in the contrast sensitivity of visual areas
Our results show that the contrast response profile of visual areas changes along the cortical hierarchy, moving from strong contrast dependence in early visual areas to a contrast invariance of varying degree in high-order object areas. Is this transformation along the ventral visual pathway a gradual process, or does it involve an abrupt transition at particular visual areas? In the present experiment, we took advantage of the large coverage offered by the fMRI method and obtained a detailed analysis of the contrast sensitivity across the entire constellation of human visual areas for an identical set of stimuli. In our previous backward-masking experiment (Grill-Spector et al. 2000), the visual mask employed to limit image exposure itself activated early visual areas and thus precluded the analysis of their object-related signal. The present study provides a comprehensive comparison of a specific functional response across the various visual areas. Comparisons across different visual areas were also performed in other studies, but with respect to factors other than the one studied here (e.g., Polonsky et al. 2000; Tootell et al. 1998). The main question that such analysis allowed us to answer is whether the transition from early contrast-sensitive areas to high-order invariant regions was a gradual, monotonic process or whether it happened in a single large step. Our results (Figs. 2B and 4) clearly point to a gradual process, which closely follows the putative cortical hierarchy (i.e., V1, V2, Vp, V4/V8, and finally LOC).
Another related question is whether the transformation in the sensitivity to contrast changes achieves its highest level at the LOC, or whether it continues at more frontal cortical regions. This is particularly relevant in the case of the object images, which showed lower contrast invariance effects compared with faces. Interestingly, our analysis of frontal cortical regions did not show a significantly enhanced invariance, so it appears that the contrast invariance effect reaches its highest degree already at the LOC level.
Object-selective heterogeneity and the contrast invariance levels
Although our results clearly show a gradual increase in contrast invariance as one moves toward occipito-temporal cortex, we did find substantial differences in the level of this invariance for different image categories within LOC. More specifically, activation to face images, as well as activation in face-related regions, was much more invariant to contrast changes than activation elicited by common objects. The source of such heterogeneity is not clear at this stage. One possibility is that the higher stages of the cortical hierarchy are better activated by face images compared with other objects. In this sense, the movement from object activation to face-specific activation is an extension of the general hierarchical trend to increased contrast invariance discussed earlier.
An alternative possibility is that the face images were more similar to each other within a block compared with the mixed-objects epochs and that this similarity affected the level of contrast invariance. To test this possibility, we ran experiment 2, in which the invariance to three specific object categories (faces, cars, and houses) was compared. The results of that experiment were mixed: we found an elevation of the contrast invariance in the collateral sulcus for the house stimuli compared with the case when a diverse set of objects was used (mixed-objects condition in experiment 1, see Table 2). The activation for the car images, however, was very similar to that obtained for the mixed-objects condition. Thus it seems that the level of the contrast invariance is determined by a complex interaction of various factors, and shape diversity was certainly not the only factor contributing to the lower contrast invariance for objects compared with faces.
Finally, it should be noted that several lines of evidence have suggested that face recognition may be a special process, stressing the importance of the holistic representation of faces compared with other object categories (Farah et al. 1998; Kanwisher 2000; Moscovitch and Moscovitch 2000). Hence, it may be that the unique properties of face processing are the source of the greater contrast invariance obtained for faces compared with other object categories. This, however, requires further investigation.
Correlation to object-recognition performance
Our results show a clear transition in the activation of cortical visual areas from strong contrast dependence in primary visual areas toward substantial contrast invariance in higher order occipito-temporal visual areas. A similar trend was found in the recognition performance of the subjects measured on the same stimuli (Fig. 6). Such correlation to recognition performance was found previously using other manipulations that degrade object recognition (Grill-Spector et al. 2000; James et al. 2000). The correlation between fMRI activation and recognition performance may seem surprising given the indirect relationship between neuronal activity and the MRI signal (Logothetis et al. 2001). However, both in the backward-masking experiments and in the present contrast experiment, the manipulation involved crossing the recognition threshold. Thus it is plausible that neuronal populations were increasingly recruited as the contrast level was manipulated across recognition threshold, concomitantly with recognition performance, leading to the positive correlation between the two. It should be emphasized that under different experimental situations, such as fMR-adaptation, this correlation does not hold (Grill-Spector and Malach 2001). Furthermore, the correlation between psychophysical performance and fMRI signal in LOC was found when the subjects performed a specific task, i.e., object recognition. Different task requirements and different stimulus types may show tighter correlation to activity in other brain regions. Indeed, it has been shown that when the task and stimuli were tailored for optimally activating other areas such as primary visual cortex, V1 activity was more correlated with performance than in our case (Boynton et al. 1999; Huk and Heeger 2000; Ress et al. 2000). Following this rationale, the present results further emphasize the involvement of the LOC in human object recognition.
From a broader perspective of the object recognition processes, the transformation toward contrast invariance that was found along the human visual ventral stream (Fig. 9) is yet another example of a visual process enabling object constancy (Grill-Spector et al. 1999; Gross 1972; Ito et al. 1995; Sary et al. 1993). In this respect, the present results extend our previous findings of position and size invariance in the LOC (Grill-Spector et al. 1999). A common theme to all these processes is that the cortical representation departs from the variable retinal activity patterns caused by changes in the viewing conditions (such as retinal size, retinal position, etc.) and becomes more attuned to the invariant, intrinsic properties of objects in the real environment. Such transformation of object representation is an essential characteristic of visual perception.
Could hemodynamic nonlinearities account for the contrast invariance?
The hemodynamic signal is assumed to be an indirect measure of the neuronal response. Thus it is important to establish that the hemodynamic activation profile obtained by fMRI mirrors the average activity of the neurons in the same brain area. This has been recently suggested by Heeger et al. (2000), who showed that the contrast-response function obtained using fMRI in human V1 is closely correlated with the average single-unit activity measured in V1 of the macaque monkey. In two other recent papers, the close correlation between fMRI and neuronal activity was shown for human and monkey MT (Heeger et al. 1999; Rees et al. 2000).
In our experiment, we found that the fMRI signal in high-order visual areas (LOC) reached an asymptotic level at contrast levels above 10%. A major concern is that in the LOC, the hemodynamic signal may reach saturation while the neuronal response would continue to increase with elevated contrast. Thus it could be that the contrast invariance measured in high-order visual areas is a result of hemodynamic signal saturation and not a characteristic feature of these areas.
The fact that the contrast invariance was found in specific cortical regions and not in others argues against a generalized hemodynamic effect, which presumably should not show such highly localized heterogeneity. However, to address this issue directly, we compared the results obtained by blocks of stimuli to an event-related paradigm. Using such a paradigm reduces the fMRI signal substantially, thus preventing it from reaching the putative hemodynamic "ceiling." The results of that experiment were comparable with the results of the original, block-design experiment (experiment 1, compare Figs. 2B and 7A). Area V1 exhibited strong contrast dependence for both faces and objects, whereas LOC showed strong contrast invariance for the object stimuli and full invariance for the face stimuli. These results demonstrate that the contrast invariance in high-order visual areas is not a result of hemodynamic signal saturation; rather, it reflects a true characteristic feature of neuronal activity in high-order object areas.
Could attention effects account for the contrast invariance?
It could be argued that attention effects might contribute to the contrast invariance found in the LOC. Thus if subjects attended more to the stimuli that were difficult to recognize and if attention produces enhanced activation in the LOC (Wojciulik et al. 1998), this might lead to "flattening" of the contrast response because the lowered activation due to reduced contrast will be compensated by the increase in activation due to attention. Our control experiments, in which the subject's task required attending to the faces at different contrasts or, alternatively, attending to the fixation point that had an unrelated contrast level, clearly rule out this possibility. Thus, even when subjects did not attend to the face stimuli, the contrast invariance level remained the same as in the cases where they were required to recognize the face images or to remember their shape (see Fig. 8).
Comparison of the contrast response function in other animal and human studies
Our findings of contrast invariance in higher visual areas are compatible with previous single-unit studies in primates. Thus Rolls and Baylis (1986) reported that responses of neurons in the superior temporal sulcus (STS) were relatively invariant to contrast changes of face stimuli. The contrast response function was also characterized physiologically for extrastriate visual areas such as area MT (Cheng et al. 1994; Sclar et al. 1990) and V4 (Cheng et al. 1994). Using sinusoidal luminance gratings, Reynolds et al. (2000) showed that the neuronal response in area V4 increased with log contrast. The contrast response function obtained in area V4 in our experiment (see Fig. 2B) is comparable with these physiological findings.
Form perception is considered to be a faculty mediated by the ventral stream, which was thought to receive its major input from the parvocellular pathway (Livingstone and Hubel 1988). The magno- and parvocellular pathways have markedly different contrast response functions, with the magnocellular system showing higher sensitivity and early contrast saturation (Merigan and Maunsell 1990; Merigan et al. 1991). To test this view, Ferrera et al. (1992, 1994) studied the responses of neurons in area V4 after inactivating the magno- or parvocellular layers within the LGN. They found no evidence for a clear dominance of one of the two pathways in this area. Neither was there a clear spatial segregation of the two inputs within V4. Thus it is plausible that visual areas within the inferotemporal cortex, which receive major ascending inputs from area V4 (Nakamura et al. 1993), would also have mixed magnocellular and parvocellular contributions. However, one should not conclude from the contrast invariance observed in LOC in our study that it is due to magnocellular input; it may well be that this effect is produced intrinsically at the level of the LOC itself.
Several neuroimaging studies characterized the contrast response function of human visual areas. Studying attentional effects, Kastner et al. (2000) found a monotonic increase in the contrast response function in areas V1, V2/Vp, V4, V3A, and MT. These findings are consistent with our findings for ventral retinotopic visual areas (i.e., V1, V2, Vp, and V4; see Fig. 2B). Tootell et al. (1995) studied areas MT and V1 and showed that, similar to the physiological findings obtained in monkeys, human area MT exhibits high sensitivity to contrast and its activity saturates at low contrast levels. The fMRI activation in area V1, on the other hand, increased as a function of log contrast without obvious saturation.
Neuronal mechanisms responsible for contrast invariance
While the sensitivity to contrast changes observed in area V1 and even in the retina is fairly well understood physiologically (Kaplan and Shapley 1986; Ohzawa et al. 1982; Sclar et al. 1990), the mechanisms responsible for the contrast invariance observed in higher order areas, such as the present results and those reported by Rolls and Baylis (1986) for cells in the monkey's STS, are still not clear.
A simple mechanism that could produce such invariance is a high sensitivity to low contrast combined with saturation nonlinearity in the neuronal response (i.e., a ceiling effect). Enhanced contrast sensitivity in higher-order visual areas may be a consequence of the large receptive field size, characteristic of neurons in these areas (Sclar et al. 1990). This simply follows from the assumption that spatial summation of inputs will increase sensitivity in successive visual areas. The gradual increase in receptive-field size in the ventral stream (Amir et al. 1993; Tootell et al. 1997; Van Essen 1985), reaching its highest level in LOC with a bilateral-visual field activation pattern (Grill-Spector et al. 1998b), may therefore be the reason for the contrast invariance observed in LOC.
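The saturation account can be illustrated with a simple descriptive model. The sketch below uses the Naka-Rushton function, a standard description of neuronal contrast responses that is not taken from the present study. The semisaturation contrasts (30% and 3%), the exponent, and the "V1-like" and "LOC-like" labels are illustrative assumptions rather than fitted values, and the ratio is assumed here to be the response at 10% contrast divided by the response at 100% contrast.

```python
def naka_rushton(contrast, c50, n=2.0, r_max=1.0):
    """Saturating contrast response; contrast and c50 are in percent.

    c50 is the semisaturation contrast: the contrast at which the
    response reaches half of r_max. A lower c50 means higher sensitivity.
    """
    return r_max * contrast**n / (contrast**n + c50**n)


def invariance_ratio(c50, low=10.0, high=100.0):
    """Assumed ratio: response at low contrast / response at high contrast."""
    return naka_rushton(low, c50) / naka_rushton(high, c50)


# Hypothetical parameter choices, for illustration only.
v1_like_ratio = invariance_ratio(c50=30.0)   # low sensitivity, late saturation
loc_like_ratio = invariance_ratio(c50=3.0)   # high sensitivity, early saturation

print(f"V1-like unit:  {v1_like_ratio:.2f}")
print(f"LOC-like unit: {loc_like_ratio:.2f}")
```

With these assumed parameters the low-sensitivity unit gives a ratio of about 0.11, whereas the high-sensitivity unit gives about 0.92. The same saturating nonlinearity therefore yields strong contrast dependence or near invariance between 10 and 100% contrast, depending only on where the semisaturation point falls.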
An alternative mechanism that could produce such invariance is a nonspecific contrast gain control operating on a fast time scale. Such a mechanism will tend to shift the dynamic range of the contrast response function so that it will optimally register small changes from the adapting contrast (Muller et al. 1999; Ohzawa et al. 1982). Contrast gain control effects predict a higher degree of invariance for blocks of images compared with single presentations, and this was not found in our single-event experiment. However, a more direct comparison of these conditions (block design vs. single event) should be conducted to properly explore this possibility.
A simple sensitivity of LOC to low contrast like that observed in MT/MST is unlikely because the level of the contrast invariance we observed was not identical for all stimulus types, as would be expected from such sensitivity. While full invariance was observed for the face stimuli, it was weaker for the object stimuli, and for the pattern stimuli it was not significantly different from V1 (see Fig. 4).
| |
ACKNOWLEDGMENTS |
|---|
We thank M. Behrmann, U. Hasson, and I. Levy for fruitful discussions and comments. We thank E. Okon for technical assistance.
This study was funded by Israel Academy Grant 8009/00-1 and German-Israeli Foundation Grant I-0576-040.01/98.
| |
FOOTNOTES |
|---|
Address for reprint requests: R. Malach (E-mail: Bnmalach{at}wisemail.weizmann.ac.il).
Received 8 August 2001; accepted in final form 8 February 2002.
| |
REFERENCES |
|---|
|
|
|---|