Extensive psychophysical and computational work proposes that the perception of coherent and meaningful structures in natural images relies on neural processes that convert information about local edges in primary visual cortex to complex object features represented in the temporal cortex. However, the neural basis of these mid-level vision mechanisms in the human brain remains largely unknown. Here, we examine functional MRI (fMRI) selectivity for global forms in the human visual pathways using sensitive multivariate analysis methods that take advantage of information across brain activation patterns. We use Glass patterns, parametrically varying the perceived global form (concentric, radial, translational) while ensuring that the local statistics remain similar. Our findings show a continuum of integration processes that convert selectivity for local signals (orientation, position) in early visual areas to selectivity for global form structure in higher occipitotemporal areas. Interestingly, higher occipitotemporal areas discern differences in global form structure rather than low-level stimulus properties with higher accuracy than early visual areas while relying on information from smaller but more selective neural populations (smaller voxel pattern size), consistent with global pooling mechanisms of local orientation signals. These findings suggest that the human visual system uses a code of increasing efficiency across stages of analysis that is critical for the successful detection and recognition of objects in complex environments.
Despite the ease with which we identify objects in complex environments, the computation of meaningful global forms from local image features on the retina is a challenging task for the visual system. A network of visual areas with selectivity for features of increasing complexity has been implicated in this task: local image features (e.g., position, orientation) are processed in primary visual cortex, whereas complex shapes and object categories (faces, bodies, places) are represented towards the end of the visual pathway in temporal cortex (Felleman and Van Essen 1991; Grill-Spector and Malach 2004; Reddy and Kanwisher 2006; Ungerleider and Mishkin 1982). However, the intermediate level mechanisms that the human brain uses for converting information about elementary features from V1 into selectivity for complex shapes in the temporal cortex remain largely unknown.
Previous psychophysical and computational studies propose that mid-level vision mechanisms mediate shape perception by combining the output of local orientation detectors into higher-order features (Barlow and Olshausen 2004; Geisler et al. 2001; Wilson and Wilkinson 1998). Testing this prediction entails studying the neural code (i.e., selectivity) for features of increasing complexity across stages of visual analysis. However, studying neural coding in the human brain is limited by the spatial resolution of conventional brain imaging approaches that average across neural populations with differential selectivity. Here, we trace selectivity for global forms in the human visual cortex using advanced multi-voxel pattern methods for functional MRI (fMRI) data analysis (Cox and Savoy 2003; Haynes and Rees 2006; Norman et al. 2006). These methods take advantage of information across brain patterns and allow us to discern selectivity for features that are encoded at a higher resolution (small-scale neural populations) than the typical size of fMRI voxels. We exploit the sensitivity of these methods to discern selectivity for global form structure beyond selectivity for elementary features (e.g., orientation, position) across human visual areas.
We use Glass patterns (Glass 1969; Glass and Perez 1973), a class of stimuli that evoke the perception of global forms (concentric, radial, translational patterns) when the orientation of local dot pairs (dipoles) is consistent with a geometric rule (e.g., rotation, expansion, translation; Fig. 1A). Using Glass patterns for tracing the neural code for global forms allows controlled variations of global structure (e.g., concentric, radial, translational) while keeping the local stimulus statistics on average the same across stimuli. Despite extensive behavioral work on Glass patterns, little is known about the neural mechanisms that mediate their processing across stages of visual analysis. Previous neurophysiological studies (Smith et al. 2002, 2007) have concentrated on the local integration of dot pairs into oriented dipoles in V1 and V2. We focus on the neural basis of the integration of orientation signals to global forms in the human brain. We test the hypothesis that the perception of global form in Glass patterns is mediated by selectivity for higher-order (global structure) features in occipitotemporal areas that pool local orientation signals from early visual areas. We provide novel evidence that a continuum of form integration processes in the human visual cortex contributes to the perception of global form in Glass patterns: from local orientation analysis in early visual areas to the processing of global configurations in higher occipitotemporal areas critical for the perception of shapes and complex objects.
Eleven students from the University of Birmingham participated in the experiments (6 males and 5 females; median age, 27 yr; age range, 22–32 yr). Eight observers participated in each experiment (5 observers performed both experiments). All observers had normal or corrected-to-normal vision and gave written informed consent, and the study was approved by the local ethics committee.
Four different stimulus pattern types defined by dot dipoles were used in the experiments (Fig. 1A): concentric, radial, translational, and random. White dots were presented on a black background (100% contrast), the dot density was 0.4% and the Glass shift (distance between dots in a dipole) was 14 arc min. These parameters were chosen based on pilot psychophysical studies and in accordance with previous studies (Wilson and Wilkinson 1998). Dot dipoles were generated by creating a pattern of randomly placed dots and pairing each dot in this seed pattern with a partner dot that was shifted based on a particular geometric rule (Glass 1969; Glass and Perez 1973). A new seed pattern was used for each stimulus presented in a trial, resulting in stimuli that were locally jittered in their position. In particular, in experiment 1 for each stimulus pattern type, we generated 100 unique stimuli starting from a new seed pattern (i.e., random dipole pattern) for each stimulus. All stimuli were displayed within a circular aperture of 10.8° visual angle. For the translational Glass patterns, all dipoles were rotated to the same orientation in steps of 18° from 0 to 180°, i.e., 10 stimuli per orientation (100 translational stimuli). We generated the concentric and radial Glass patterns by placing dipoles tangentially (concentric stimuli) or orthogonally (radial stimuli) to the circumference of an ellipse (eccentricity: 0.77) centered at the fixation dot. To match the 10 orientations used for the translational Glass patterns, the major axis of the ellipse was oriented between 0 and 180° in steps of 18° resulting in 10 stimuli for each of the 10 axis orientations (100 concentric, 100 radial stimuli). For the random patterns, we assigned a random orientation between 0 and 180° to each dipole (100 random stimuli). Evaluating the distributions of the orientations of the dot dipoles (Fig. S1) showed similar orientation profiles across all patterns (e.g., mean and SD: concentric, 90.05 ± 51.97°; radial, 90.07 ± 51.97°; translational, 81 ± 54.49°; random, 89.6 ± 51.84°; see Fig. S1 for additional parameters). A hyperbolic Glass pattern (generated by inverting the local orientation rule for the concentric stimulus) was used as the target for the detection task performed by the observers during scanning in experiment 1.
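The dipole construction described above can be illustrated with a minimal Python sketch. It is not the authors' stimulus code. The function names, the number of dipoles per stimulus, and the reading of the 10.8° aperture as a diameter are assumptions made for illustration, as is the interpretation that each dipole is aligned with (or orthogonal to) the ellipse of the stated eccentricity that passes through its seed dot.

```python
import math
import random

KINDS = ("concentric", "radial", "translational", "random")

def glass_pattern(kind, n_dipoles=200, radius=5.4, shift=14 / 60,
                  axis_deg=0.0, eccentricity=0.77, rng=None):
    """Return dipoles ((x1, y1), (x2, y2)) in degrees of visual angle.

    n_dipoles and radius (half of a 10.8 deg aperture) are assumed values.
    """
    if kind not in KINDS:
        raise ValueError(f"unknown pattern type: {kind!r}")
    rng = rng or random.Random()
    ratio = math.sqrt(1.0 - eccentricity ** 2)  # minor/major axis ratio
    axis = math.radians(axis_deg)
    dipoles = []
    while len(dipoles) < n_dipoles:
        # Seed dot: uniform in a square, rejected if outside the aperture.
        x, y = rng.uniform(-radius, radius), rng.uniform(-radius, radius)
        if x * x + y * y > radius * radius:
            continue
        if kind == "random":
            theta = rng.uniform(0.0, math.pi)
        elif kind == "translational":
            theta = axis
        else:
            # Seed coordinates in the ellipse frame (major axis along `axis`).
            u = x * math.cos(axis) + y * math.sin(axis)
            v = -x * math.sin(axis) + y * math.cos(axis)
            # The gradient of u^2 + (v / ratio)^2 is normal to the ellipse.
            normal = math.atan2(v / ratio ** 2, u) + axis
            theta = normal if kind == "radial" else normal + math.pi / 2
        partner = (x + shift * math.cos(theta), y + shift * math.sin(theta))
        dipoles.append(((x, y), partner))
    return dipoles

def dipole_orientation_deg(dipole):
    """Orientation of a dipole in the range [0, 180) degrees."""
    (x1, y1), (x2, y2) = dipole
    return math.degrees(math.atan2(y2 - y1, x2 - x1)) % 180.0
```

Calling `glass_pattern("concentric", axis_deg=36)` and `glass_pattern("radial", axis_deg=36)` with identically seeded generators yields the same seed dots with dipoles that differ by 90° in orientation. As a point of reference for the reported statistics, orientations drawn uniformly between 0 and 180° have a mean of 90° and an SD of 180/√12 ≈ 51.96°.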
In the control experiment (experiment 2), we tested three stimulus conditions (random 1, random 2, and random 2–90°) and generated 80 stimuli per condition. For condition random 1, we generated 80 stimuli with random dot dipole positions displayed within a circular aperture of 10.8° visual angle. To generate stimuli for condition random 2, for each stimulus in condition random 1, we fixed one of the two dots of each dipole and rotated the other randomly between 0 and 180° around the first dot. That is, the stimuli in conditions random 1 and random 2 had the same distribution of orientations but differed in the local position and orientation (90° on average) of the dot dipoles. To generate stimuli for condition random 2–90°, for each stimulus in condition random 2, we fixed one of the two dots of each dipole and rotated the other one clockwise by 90°. Thus stimuli in this condition (random 2–90°) and condition random 2 differed in the position and local orientation of the dot dipoles by 90°. Note that the 90° orientation difference in the random conditions is the same as the local orientation difference between concentric and radial Glass patterns. Evaluating the distributions of the orientations of the dot dipoles (Fig. S1) showed similar orientation profiles across all patterns (e.g., mean and SD: random 1, 89.5 ± 51.7°; random 2, 89.7 ± 52.2°; random 2–90°, 91.1 ± 51.8°).
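The derivation of the three control conditions from one another can be sketched as follows. This is an illustrative Python sketch only: all function names, the number of dipoles, and the aperture radius (5.4°, assuming the 10.8° aperture is a diameter) are assumptions, and a y-up coordinate system is assumed so that a clockwise rotation corresponds to a negative angle.

```python
import math
import random

SHIFT_DEG = 14 / 60  # Glass shift of 14 arc min, in degrees of visual angle

def rotate_partner(dipole, angle_deg):
    """Rotate the second dot of a dipole around the first (fixed) dot."""
    (x1, y1), (x2, y2) = dipole
    a = math.radians(angle_deg)
    dx, dy = x2 - x1, y2 - y1
    return ((x1, y1),
            (x1 + dx * math.cos(a) - dy * math.sin(a),
             y1 + dx * math.sin(a) + dy * math.cos(a)))

def make_random1(n_dipoles, rng, radius=5.4):
    """Randomly placed dipoles with random orientations (condition random 1)."""
    dipoles = []
    while len(dipoles) < n_dipoles:
        x, y = rng.uniform(-radius, radius), rng.uniform(-radius, radius)
        if x * x + y * y > radius * radius:
            continue  # keep seed dots inside the circular aperture
        theta = rng.uniform(0.0, math.pi)
        dipoles.append(((x, y), (x + SHIFT_DEG * math.cos(theta),
                                 y + SHIFT_DEG * math.sin(theta))))
    return dipoles

def make_random2(random1, rng):
    """Rotate each partner dot by a random angle between 0 and 180 deg."""
    return [rotate_partner(d, rng.uniform(0.0, 180.0)) for d in random1]

def make_random2_90(random2):
    """Rotate each partner dot clockwise by a fixed 90 deg."""
    return [rotate_partner(d, -90.0) for d in random2]

def orientation_deg(dipole):
    """Orientation of a dipole in the range [0, 180) degrees."""
    (x1, y1), (x2, y2) = dipole
    return math.degrees(math.atan2(y2 - y1, x2 - x1)) % 180.0
```

In this sketch, paired stimuli in random 2 and random 2–90° share the fixed dot of every dipole and differ by exactly 90° in local orientation, whereas random 1 and random 2 differ by a rotation whose mean across dipoles is 90°.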
Design and procedure
Concentric, radial, translational, and random stimuli were presented in blocks of 16 s (blocked fMRI design). In each block, 20 different stimuli of one type (concentric, radial, translational, random) were presented, each with a different random arrangement of dot pairs. Stimuli in each block were randomly sampled twice from the 10 orientations for each stimulus type. That is, 80 of a total of 100 unique stimuli per stimulus type were randomly sampled and presented per run. This design ensured that differences in the fMRI responses across stimulus types could not be attributed to differential adaptation effects related to stimulus repetition. Furthermore, local adaptation across stimuli was controlled and equated across stimulus conditions by generating each stimulus based on a different seed pattern and introducing different global orientation axes across stimuli. Finally, this variability in the local structure of the stimuli for each condition ensured that classification of stimulus categories from fMRI data could not simply be achieved based on stimulus regularities that could have formed if the same limited set of stimuli had been presented repeatedly.
Each stimulus was presented for 332 ms followed by a 468-ms fixation interval (trial onset asynchrony 800 ms). Each observer was scanned on eight experimental runs. Each run lasted 5.6 min and was comprised of the four stimulus conditions four times each in counterbalanced order and five fixation blocks (1 block in the beginning of each run, 1 in the end, and 3 interleaved after every set of 4 experimental blocks). Observers were instructed to fixate a central fixation dot and performed a target (hyperbolic pattern) detection task by pressing a button. The target stimulus was presented on average twice per 16-s block. This task ensured that observers maintained attention to the global configuration of the stimuli across all experimental conditions.
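The timing figures above are mutually consistent, as the following short arithmetic check shows. It is a sketch under one assumption that is not stated explicitly in the text: that each fixation block has the same 16-s duration as a stimulus block. The constant names are illustrative.

```python
# Trial, block and run durations in milliseconds (integer arithmetic).
STIMULUS_MS = 332
FIXATION_INTERVAL_MS = 468
STIMULI_PER_BLOCK = 20
CONDITIONS = 4
REPEATS_PER_RUN = 4
FIXATION_BLOCKS = 5  # assumed to last as long as a stimulus block

trial_ms = STIMULUS_MS + FIXATION_INTERVAL_MS        # 800 ms onset asynchrony
block_ms = STIMULI_PER_BLOCK * trial_ms              # 16,000 ms = 16 s
stimulus_blocks = CONDITIONS * REPEATS_PER_RUN       # 16 blocks per run
stimuli_per_condition = REPEATS_PER_RUN * STIMULI_PER_BLOCK  # 80 per run
run_ms = (stimulus_blocks + FIXATION_BLOCKS) * block_ms      # 336,000 ms
run_min = run_ms / 60_000                            # 5.6 min

print(trial_ms, block_ms, stimulus_blocks, stimuli_per_condition, run_min)
```

Under this assumption, 21 blocks of 16 s give the stated run length of 5.6 min, and four blocks of 20 stimuli give the 80 stimuli per condition presented in each run.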
We used a similar blocked design as in experiment 1 with three stimulus conditions: random 1, random 2, and random 2–90°. In each block, 20 different stimuli of one type were randomly sampled without replacement; that is, all 80 stimuli for each condition were shown per run (4 blocks per condition). The observers fixated and performed a target (translational pattern) detection task, i.e., they detected a translational pattern that was presented at randomly sampled orientations (0–180°). Fixation epochs were confined to the beginning and the end of each experimental run.
MRI data acquisition
The experiments were conducted at the Birmingham University Imaging Centre (3-T Philips Achieva scanner). T2*-weighted functional and T1-weighted anatomical (1 × 1 × 1 mm) data were collected with an eight channel SENSE head coil. EPI data (gradient echo-pulse sequences) were acquired from 33 slices (2.5 × 2.5 × 3-mm resolution, TR: 2,000 ms, TE: 35 ms) providing whole brain coverage.
fMRI data analysis
FMRI DATA PREPROCESSING.
fMRI data were processed using the Brain Voyager QX (Brain Innovation, Maastricht, The Netherlands) software package (Goebel et al. 2006). Preprocessing of all functional data included slice-scan time and head movement correction, temporal high-pass filtering (3 cycles), and removal of linear trends. No spatial smoothing was performed on the functional data used for the multivariate analysis. Anatomical data were transformed into Talairach space, three-dimensionally reconstructed, inflated, and flattened. The functional images were aligned to anatomical data resulting in spatially standardized four-dimensional volume-time-course data.
MAPPING REGIONS OF INTEREST.
For each observer we identified the following: 1) retinotopic areas, 2) the lateral occipital complex (LOC), and 3) Glass pattern-responsive regions (GPRRs). We localized early visual areas based on standard retinotopic mapping procedures (supplementary material) (DeYoe et al. 1996; Engel et al. 1994; Sereno et al. 1995). We defined the LOC as the set of contiguous voxels in the ventral occipitotemporal cortex that showed significantly stronger activation (t(158) > 4.0, P < 0.001) for intact than scrambled images of objects (Kourtzi and Kanwisher 2000) (supplementary material).
To identify cortical regions that responded more strongly to global pattern stimuli (concentric, radial, translational Glass patterns) compared with randomly oriented dot dipole stimuli (random stimuli), we compared responses of individual voxels to these stimuli using the general linear model. We performed this analysis across subjects (Fig. 1B, group analysis) and for individual observers (Fig. S2). For the group analysis, smoothed volume-time-course data (Gaussian kernel of 6-mm full-width at half maximum) were z-transformed and modeled with five regressors of interest (4 stimulus conditions and fixation baseline) convolved with a canonical hemodynamic response function and six additional covariates of no interest (the movement parameters obtained during motion correction for x-, y-, and z-translations and rotations). For individual observer data, we analyzed unsmoothed, z-transformed volume-time-course data using the same general linear model approach.
MULTI-VOXEL PATTERN ANALYSIS.
For each region of interest (ROI), we sorted the voxels according to their response (t-statistic) to all stimulus conditions compared with fixation baseline across all experimental runs. We selected the same number of voxels across ROIs and observers by restricting the pattern size to those voxels that showed a t value >0 for the “all conditions versus fixation” contrast. This procedure resulted in the selection of 130 voxels per ROI, comparable to the dimensionality used in previous studies (Haynes and Rees 2005; Kamitani and Tong 2005). At this pattern size, all voxels across subjects and ROIs had t values that were >0; that is, all voxels in the analyzed patterns were responsive to all stimulus types. We normalized (z-score) each voxel's time course separately for each experimental run to minimize baseline differences across runs. The data vectors for the multivariate analysis were generated by shifting the fMRI time series by 4 s to account for the delay of the hemodynamic response and averaging all time series data points of one experimental block. We used a Support Vector Machine (SVMlight toolbox, supplementary material) for classification and performed an eightfold cross-validation leaving one run out (test sample). That is, we used data from seven runs as training patterns (112 patterns: 16 patterns per run) and data from the remaining run as test patterns (16 patterns). For each subject we averaged the accuracy rates (number of correctly assigned test patterns/total number of assignments) across cross-validations. Statistical significance across subjects was evaluated using repeated measures ANOVA. All ANOVAs were corrected (Greenhouse-Geisser) for nonsphericity (inhomogeneity of variance). 
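The construction of the data vectors and the leave-one-run-out scheme described above can be sketched in plain Python. This is an illustration under stated assumptions, not the analysis code used in the study. In particular, the classifier below is a simple nearest-centroid rule used as a stand-in and is not the Support Vector Machine (SVMlight) reported here. All function names are hypothetical, and the 4-s shift and 16-s blocks are converted to volumes using the 2-s TR given in the acquisition parameters.

```python
import statistics

TR_S = 2.0
SHIFT_VOLS = int(4 / TR_S)    # 4-s hemodynamic shift = 2 volumes
BLOCK_VOLS = int(16 / TR_S)   # 16-s block = 8 volumes

def zscore_run(run):
    """z-score each voxel's time course within one run (volumes x voxels)."""
    columns = list(zip(*run))
    means = [statistics.fmean(c) for c in columns]
    sds = [statistics.pstdev(c) or 1.0 for c in columns]  # guard flat voxels
    return [[(v - m) / s for v, m, s in zip(vol, means, sds)] for vol in run]

def block_patterns(run, onsets):
    """Average the shifted volumes of each block into one pattern vector."""
    z = zscore_run(run)
    patterns = []
    for onset in onsets:
        start = onset + SHIFT_VOLS
        vols = z[start:start + BLOCK_VOLS]
        patterns.append([statistics.fmean(c) for c in zip(*vols)])
    return patterns

def nearest_centroid(train_x, train_y, x):
    """Stand-in classifier (NOT the SVM used in the study)."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        members = [p for p, y in zip(train_x, train_y) if y == label]
        centroid = [statistics.fmean(c) for c in zip(*members)]
        dist = sum((a - b) ** 2 for a, b in zip(x, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

def leave_one_run_out(patterns_by_run, labels_by_run):
    """Mean accuracy across folds; each fold tests on one held-out run."""
    accuracies = []
    for test_run in range(len(patterns_by_run)):
        train_x = [p for r, ps in enumerate(patterns_by_run)
                   if r != test_run for p in ps]
        train_y = [y for r, ys in enumerate(labels_by_run)
                   if r != test_run for y in ys]
        test_x, test_y = patterns_by_run[test_run], labels_by_run[test_run]
        hits = sum(nearest_centroid(train_x, train_y, p) == y
                   for p, y in zip(test_x, test_y))
        accuracies.append(hits / len(test_y))
    return statistics.fmean(accuracies)
```

With eight runs of 16 block patterns each, every fold in this sketch trains on 112 patterns and tests on the remaining 16, and the returned value is the per-subject accuracy averaged across the eight cross-validations.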
For the multi-voxel pattern analysis (MVPA) of voxels in the Glass pattern responsive regions (GPRRs), we selected voxels that showed significantly stronger activation for Glass patterns than random patterns using only the set of training runs included in each cross-validation. We compared accuracy across all areas for a three-way classification (concentric, radial, translational patterns) and pairwise classifications of interest (experiment 1: each Glass pattern compared with random stimuli, concentric vs. radial Glass patterns; experiment 2: random 1 vs. random 2, random 2 vs. random 2–90°).
To study the processing of global structure in Glass patterns across visual areas, we identified in each individual observer retinotopic areas, the LOC, and GPRRs that responded significantly stronger to Glass than random patterns. We used pattern classification analyses (MVPA) previously used successfully for the decoding of elementary visual features (Haynes and Rees 2005; Kamitani and Tong 2005, 2006) and object categories (Hanson et al. 2004; Haxby et al. 2001; O'Toole et al. 2005; Williams et al. 2007). We conducted three main MVPA analyses to study selectivity for features that define global structure in Glass patterns across stages of analysis in the human visual cortex. We first tested which of the cortical regions of interest contain information that allows reliable discrimination of the different Glass patterns (3-way classification: concentric vs. radial vs. translational). Second, we tested whether selectivity for each of the different Glass pattern types differed across regions of interest (binary classifications: concentric, radial, or translational patterns vs. random patterns). Finally, we tested whether selectivity for global forms in these regions reflects differences in global structure (concentric vs. radial) rather than low-level features (local position, orientation signals).
Classification of fMRI signals for different Glass pattern types
We tested whether we could predict the Glass pattern type (3-way classification: concentric vs. radial vs. translational) presented to the observers based on fMRI signals in retinotopic areas and the LOC. As shown in Fig. 2 (Table S1 for statistics), the mean classification accuracy across Glass patterns was significantly higher than chance in all regions of interest, suggesting selectivity for different global form patterns across visual areas. This finding suggests that neural populations across voxels in all visual areas contain information that allows us to differentiate between the Glass pattern types. However, this selectivity was enhanced in higher occipitotemporal areas (LOC) that showed significantly higher classification accuracy than retinotopic areas (F(3,22) = 3.4, P < 0.05). Next, we compared classification accuracy for each of the Glass pattern types (concentric, radial, translational) across areas. A repeated-measures ANOVA showed a significant effect of ROI (F(3.2,22.5) = 3.4, P < 0.05) but not of stimulus type (concentric, radial, translational; F(1.5,10.5) = 1.1, P = 0.36). A significant interaction between stimulus type and ROI (F(4.7,33.1) = 2.5, P < 0.05) suggested differences in pattern classification for the different Glass pattern types across visual areas. Follow-up contrasts showed higher accuracy for radial Glass patterns in intermediate visual areas that reached significance in V3a (radial vs. concentric, t(7) = 1.9, P < 0.05; radial vs. translational, t(7) = 2.2, P < 0.05). These findings are consistent with previous studies showing a radial bias in retinotopic visual areas (Sasaki et al. 2006) and activations for dynamic radial patterns in dorsal visual areas (Braddick et al. 2000; Koyama et al. 2005; Krekelberg et al. 2005). 
In contrast, no significant differences were observed across stimulus types in V1 (F(1.7,11.7) = 0.43, P = 0.62) or LOC (F(1.2,8.3) = 0.008, P = 0.96), indicating that classification was similar across Glass pattern types in these areas but significantly higher in the LOC consistent with global integration mechanisms of local signals in higher occipitotemporal areas.
It is important to note that our stimulus generation procedures aimed to match the different Glass pattern types at the level of local orientation signals, allowing us to compare across stimulus types. In particular, we varied the global orientation axis (major axis of the elliptical stimulus configuration) of concentric and radial Glass patterns in a similar manner as for the translational stimuli (from 0 to 180° in steps of 18°). As a result, the local dipoles in each stimulus were rotated along the major axis. This manipulation allowed us to match more closely the local orientation signals across stimulus types, as each voxel was stimulated by multiple orientations across the stimuli presented in each condition. The histograms of local dipole orientations (Fig. S1) show that the only difference across stimulus types was in the sampling of local orientations. Although for the concentric and radial patterns, local orientation distributions were uniform (i.e., orientations were sampled randomly within a range of 0–180°), for the translational patterns, orientations were sampled in discrete steps defined by the stimulus global axis (i.e., 10 discrete orientations were sampled within a range of 0–180°). Thus it is possible that concentric and radial patterns stimulate larger neural populations that are highly selective for the finely sampled orientations and result in increased fMRI responses at a single voxel. However, the lack of significant differences in accuracy across stimulus types in V1 and LOC suggests that this difference in the local orientation distributions across patterns did not significantly affect the selectivity for different stimulus types, as measured by pattern classification of fMRI responses. Furthermore, any differences in orientation sampling across Glass pattern types could not account for differences in the classification accuracies across areas (e.g., higher classification accuracy for radial patterns in intermediate visual areas).
Rather, global pooling mechanisms of local orientation signals in the LOC may support better discrimination of different global stimulus patterns than early visual areas.
Further analysis of the functional signal-to-noise ratio across all voxels included in the multivariate analysis (Fig. S3) showed that these findings reflect differences in selectivity for global form patterns rather than simply the overall responsiveness to the stimuli. In particular, this analysis showed that the high classification accuracy for Glass patterns in the LOC was not simply caused by high fMRI responses in this area; in contrast, higher occipitotemporal areas showed lower responsiveness than early retinotopic areas (experiment 1: F(2,18) = 31.6, P < 0.05; experiment 2: F(2,9) = 21.7, P < 0.05). Finally, our findings could not be simply caused by differences in eye movements, attention, or task difficulty. Analysis of the eye movement measurements showed that observers could maintain fixation while performing the target detection task. Eye movement measurements did not differ significantly across conditions (Fig. S4; Table S2), suggesting that it is unlikely that classification accuracy reflected activation differences caused by differential eye movements across conditions. Observers’ performance (accuracy, reaction times) in detecting the target stimulus (hyperbolic pattern) during scanning did not differ significantly across conditions (supplementary material), suggesting that observers attended similarly to all stimuli across conditions.
Comparing fMRI selectivity for different Glass pattern types
Previous psychophysical studies have proposed that the perception of concentric and radial Glass patterns entails processing of global feature configurations, whereas perception of translational patterns entails local orientation processing (Li and Westheimer 1997; Olzak and Thomas 1992; Wilson and Wilkinson 1998). In particular, concentric or radial patterns are easier to discriminate from noise than translational patterns (Wilson and Wilkinson 1998). Furthermore, several studies have shown a behavioral advantage for the perception (i.e., detection, discrimination) of circular patterns across a range of stimuli (Glass patterns, collinear Gabor-defined patterns, gratings, radial frequency patterns) (Achtman et al. 2003; Hess et al. 1999; Kovacs and Julesz 1993, 1994; Kurki and Saarinen 2004; Levi and Klein 2000; McGraw et al. 2004; Regan and Hamstra 1992; Seu and Ferrera 2001; Wilkinson et al. 1998; Wilson and Wilkinson 1998, 2003; Wilson et al. 1997, 2004).
We tested whether fMRI selectivity for Glass pattern types (concentric, radial, translational) across visual areas differed in accordance with these behavioral effects. We computed fMRI selectivity for each Glass pattern type by conducting pairwise classifications between activation patterns in retinotopic areas and the LOC for each Glass pattern type (concentric, radial, translational) and the random dot dipole stimuli. First, we compared classification accuracy for each stimulus type against chance (Table S1). This analysis showed that early and higher visual areas contain information that allows us to reliably discern neural responses selective to coherent patterns with global form structure (concentric, radial patterns) or global texture structure (translational patterns) from fMRI signals. Second, we compared selectivity across global form and texture patterns by comparing classification accuracies across Glass pattern types. A repeated-measures ANOVA (Greenhouse-Geisser correction for inhomogeneity of variance) showed a significant main effect of comparison (each Glass pattern type vs. random; F(1,10) = 7.1, P < 0.05) and ROI (F(3,21) = 3.5, P < 0.05) but no significant interaction between these factors (F(4.3,29.9) = 1.66, P = 0.18). As shown in Fig. 3, the significant effect of ROI indicates increasing classification accuracy from earlier to higher visual areas, consistent with the role of higher visual areas in pooling local signals to support better discrimination of both global form and texture patterns. However, the main effect of comparison indicates differences in the selectivity between stimulus types with different global structure. That is, accuracies were significantly higher for radial than concentric patterns (P = 0.03, pairwise comparison) and radial than translational patterns (P = 0.01, pairwise comparison).
This higher classification accuracy for radial patterns compared with random stimuli is consistent with a bias for the cortical representation of radial patterns (Braddick et al. 2000; Koyama et al. 2005; Krekelberg et al. 2005; Sasaki et al. 2006). The lack of a significant interaction was probably due to the fact that most of the areas (i.e., all the retinotopic areas) shared a similar pattern of results (a trend for higher accuracy for radial patterns), whereas only the LOC showed a trend for increased accuracy for concentric patterns (Fig. 3). Preplanned comparisons (paired sample t-test) motivated by previous imaging and behavioral findings showed that this radial bias was more evident in early visual areas (radial vs. concentric, P < 0.05; radial vs. translational, P < 0.05) than the LOC where a trend for higher accuracies for concentric patterns was observed (concentric vs. translational t(7) = 2.00, P < 0.05; concentric vs. radial, t(7) = 1.14, P = 0.07). These results suggest potentially higher selectivity for global forms (concentric, radial Glass patterns) than translational patterns in higher occipitotemporal areas, consistent with the role of these areas in representing the perceived global shape (Grill-Spector et al. 2000; Kourtzi and Kanwisher 2001).
Comparing fMRI selectivity for global versus low level stimulus features
Is it possible that classification accuracy for the discrimination of different Glass patterns from fMRI signals was caused by low-level differences across stimuli (i.e., random differences in the position or orientation of the dot dipoles)? We tested whether the differential selectivity observed for different Glass pattern types could be simply caused by local differences in the orientation and position of the dot dipoles. We compared (pairwise classification) concentric and radial Glass patterns (experiment 1) because these stimuli evoke different global form percepts but also differ in the local position and orientation (90°) of the dot dipoles. In addition, we conducted pattern classification on fMRI data recorded when observers viewed sets of random stimuli that did not evoke the perception of global forms but differed in local position and orientation by 90° (experiment 2: random 1 vs. random 2; random 2 vs. random 2–90°). That is, we introduced a 90° difference in local orientation in paired stimuli across conditions (random 2 vs. random 2–90°) and on average across dipoles and stimuli (random 1 vs. random 2). In particular, pairs of stimuli between conditions random 2 and random 2–90° differed by a fixed clockwise rotation of the local dipoles by 90°. In contrast, stimuli in conditions random 1 and random 2 differed in the orientation of the local dipoles randomly across stimuli between 0 and 180°, resulting in a mean 90° orientation difference between stimuli in the two conditions. These two procedures for simulating the local differences between concentric and radial patterns using stimuli with random structure complemented each other. That is, any regularities introduced by the fixed 90° rotation in random 2–90° were controlled by the 90° mean orientation difference across stimuli in conditions random 1 versus random 2.
We reasoned that similar accuracies for the Glass pattern and the random stimuli classification would indicate selectivity for local orientation differences across neural populations. In contrast, higher accuracy for the classification of Glass patterns would indicate selectivity for the perceived differences between stimuli in their global structure. Consistent with this prediction (Fig. 4, data from 5 subjects that participated in both experiments; Fig. S5 for data from all subjects that participated in the experiments), the highest classification accuracy was observed for concentric versus radial Glass patterns in the LOC. Comparison of classification accuracies between experiments (experiment 1: concentric vs. radial Glass patterns; experiment 2: random 1 vs. random 2; random 2 vs. random 2–90°) showed overall higher classification accuracy for comparison of global form patterns (experiment 1) than local orientation differences in random patterns (experiment 2). A repeated-measures ANOVA for comparison (concentric vs. radial, random 1 vs. random 2, random 2 vs. random 2–90°) and ROI (retinotopic areas vs. LOC) showed a significant main effect of comparison (F(1.2,4.7) = 19.75, P < 0.01) and a significant interaction between comparison and ROI (F(1.7,6.8) = 5.6, P < 0.05). Further contrast analysis showed significant differences between comparisons in the LOC (F(1.4,5.4) = 13.6, P < 0.05) with higher accuracy for the classification between concentric and radial patterns than the classification of random stimuli (concentric vs. radial compared with random 1 vs. random 2: P < 0.05; random 2 vs. random 2–90°: P < 0.01). In contrast, no significant differences were observed across comparisons in the early visual areas (F(1.8,7.4) = 1.23, P = 0.34). Moreover, the lack of significant differences (F(1,4) = 2.0, P = 0.2) across areas between the comparisons of random stimuli (random 1 vs. random 2, random 2 vs. random 2–90°) indicates that any differences between these stimulus conditions did not result in significant differences in the classification of random stimuli. These results suggest that the high classification accuracy for concentric versus radial patterns in higher occipitotemporal areas could not be simply attributed to local position and orientation differences between the stimuli. Furthermore, comparing the classification accuracy for concentric versus radial patterns with the classification accuracy across Glass pattern types (Fig. 2) showed similar results (relative to chance) across areas. That is, no significant differences were observed (F(1,7) = 0.38, P = 0.56) between classification accuracies for global patterns in Figs. 2 and 4 (data normalized to respective chance levels). This comparison provides converging evidence across analyses for the role of higher occipitotemporal regions in the integration of local position and orientation signals for the perception of global form and texture patterns.
Finally, we compared classification accuracy for each stimulus comparison (concentric vs. radial Glass patterns; random 1 vs. random 2; random 2 vs. random 2–90°) to chance (Table S1). This analysis showed that classification accuracy for concentric versus radial patterns was significantly different from chance across early and higher visual areas. However, classification accuracy for random patterns was significantly different from chance in the early visual areas rather than higher occipitotemporal areas. These results suggest that differences at the local (position and orientation) and global structure of Glass patterns and random stimuli can be decoded reliably across visual areas using MVPA methods. Interestingly, comparing the average fMRI response across voxels in each area (percent signal change from fixation baseline) for concentric and radial patterns showed no significant differences (F(1,7) = 0.14, P = 0.71). This result (Fig. S6) corroborates previous evidence that multivariate analyses of fMRI signals across activation patterns are more sensitive in showing selectivity for features encoded at fine resolution (small neural populations). Although our findings are consistent with these previous studies (Haynes and Rees 2005; Kamitani and Tong 2005), the classification accuracies in early visual areas observed in our study were weaker than those previously reported, potentially because of stimulus-related noise. Specifically, previous studies used gratings that stimulate the receptive fields measured in each voxel with a single orientation, whereas the dot dipoles in Glass patterns and random stimuli stimulate a given voxel with multiple orientations. However, our results show that differences in position and orientation (90°) of a small number of dipoles that stimulate a single voxel seem sufficient to decode differences in feature selectivity across stimuli. 
Furthermore, spurious groupings of closely located dipoles that do not match the global structure in Glass patterns may introduce noise into the integration process. However, because stimuli across conditions were generated from different seed patterns, such groupings would be equally probable across stimulus types and affect the fMRI responses in a similar manner. Despite these possible sources of stimulus-related noise in the MVPA, our results show that differences across stimulus types can be decoded reliably from fMRI signals in visual areas. Importantly, MVPA (1,000 iterations) on shuffled data (i.e., when we assigned labels randomly to the data) showed accuracies very close to chance for all comparisons (Table S3). This procedure ensured that the classification accuracies were not simply caused by the power of the classification algorithm, which could use random statistical regularities in the data for classification; rather, the accuracies reflect information across voxels that allows the discrimination between stimuli based on their features.
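The logic of the shuffled-label control can be sketched as follows. This is an illustration under stated assumptions rather than the analysis pipeline of the study: the classifier is a simple nearest-centroid rule with leave-one-out cross-validation (swapped in for brevity), the voxel patterns are simulated, and the example uses 200 rather than 1,000 shuffles to keep it fast.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two voxel patterns."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def loo_accuracy(patterns, labels):
    """Leave-one-out accuracy of a nearest-centroid classifier."""
    classes = sorted(set(labels))
    correct = 0
    for i, test in enumerate(patterns):
        centroids = {}
        for c in classes:
            members = [p for j, p in enumerate(patterns) if j != i and labels[j] == c]
            centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        predicted = min(classes, key=lambda c: sq_dist(test, centroids[c]))
        correct += predicted == labels[i]
    return correct / len(patterns)

def permutation_test(patterns, labels, n_iter, rng):
    """Compare the observed accuracy with accuracies obtained on shuffled labels."""
    observed = loo_accuracy(patterns, labels)
    null = []
    for _ in range(n_iter):
        shuffled = labels[:]
        rng.shuffle(shuffled)          # break the pattern-label relationship
        null.append(loo_accuracy(patterns, shuffled))
    p_value = (1 + sum(a >= observed for a in null)) / (1 + n_iter)
    return observed, null, p_value

# Simulated data (assumption): 10 voxels, 20 patterns per class, unit noise.
rng = random.Random(1)
N_VOXELS, N_PER_CLASS = 10, 20
patterns, labels = [], []
for label, shift in (("concentric", 0.0), ("radial", 1.0)):
    for _ in range(N_PER_CLASS):
        patterns.append([rng.gauss(shift, 1.0) for _ in range(N_VOXELS)])
        labels.append(label)

observed, null, p_value = permutation_test(patterns, labels, n_iter=200, rng=rng)
mean_null = sum(null) / len(null)
print(f"observed = {observed:.2f}, shuffled mean = {mean_null:.2f}, p = {p_value:.3f}")
```

If a classifier exploited only chance regularities, the shuffled-label accuracies would be as high as the observed one; accuracies near chance on shuffled data indicate that the observed accuracy depends on the true stimulus labels.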
Taken together, these findings suggest that both retinotopic and higher occipitotemporal areas (LOC) contain information across voxels that allows discrimination between global form patterns (e.g., concentric vs. radial Glass patterns). However, only the LOC contains information about the perceived differences in the global structure of different Glass pattern types beyond their local position and orientation differences, whereas retinotopic visual areas resolve this discrimination based on information about local position and orientation differences.
MVPA in Glass pattern responsive regions
We tested whether regions beyond the independently defined ROIs (retinotopic areas and the LOC) are involved in the processing of Glass patterns. We compared fMRI responses for Glass patterns (concentric, radial, translational) and random stimuli to identify regions that respond significantly stronger to stimuli with coherent than random structure. We identified [GLM group analysis: fixed effects (P < 0.05, Bonferroni corrected); random effects (P < 0.05)] three anatomically separable voxel clusters (Glass pattern responsive regions, GPRRs) that responded significantly stronger to Glass patterns than random stimuli (Figs. 1B and S2): 1) dorsal in the occipital cortex and inferior to V3a (Larsson and Heeger 2006), overlapping with V3B/KO and V4d (Hansen et al. 2007; Tootell and Hadjikhani 2001) (dorsal GPRR, Talairach coordinates: right hemisphere [29, −82, 8], left hemisphere [−28, −83, 8]), 2) ventral lateral in the occipitotemporal cortex (VOT; Brewer et al. 2005), overlapping with LO (ventral-lateral GPRR, Talairach coordinates: right hemisphere [42, −63, −12], left hemisphere [−39, −68, −7]), and 3) ventral medial in the occipitotemporal cortex anterior to V4 (ventral-medial GPRR, Talairach coordinates: right hemisphere [26, −59, −11], left hemisphere [−25, −62, −9]). Similar Glass pattern responsive regions were identified in individual observers (Figs. 1B and S2). Furthermore, comparison of fMRI responses between global patterns (concentric, radial) and random stimuli showed similar activation patterns, ensuring that similar cortical regions were activated when only stimuli with global form structure were considered in the analysis, rather than all stimuli with coherent organization. These results are consistent with previous studies (Chen et al. 2004; Wade et al. 2003) showing stronger activation for Glass patterns than random dipole patterns in ventral and dorsal regions anterior to retinotopic visual areas.
To characterize the representations in the Glass pattern responsive regions, we performed the same MVPA for all comparisons of interest, as described for the retinotopic areas and the LOC (Fig. 5). It is important to note that voxels in these regions were identified based on their response to all Glass pattern types rather than their responses to individual stimulus types. This procedure ensured that the voxels included in MVPA comparisons between different stimulus types were identified based on an independent comparison. However, for MVPA comparisons of each stimulus type versus random, fMRI responses (based on univariate or multivariate analysis) are expected to be higher for Glass patterns than random dot dipole stimuli in these Glass pattern-responsive regions, as these areas were defined based on voxel clusters that respond significantly higher to Glass patterns than random stimuli. We conservatively selected the voxels for the MVPA by defining GPRR voxels based only on the training runs, excluding the test run for each MVPA cross-validation. That is, the data on which we tested the classifications of interest were independent from those used to identify the Glass pattern-responsive regions. Using this methodology, we provide novel evidence for distinct functional roles of the different Glass pattern responsive regions. In particular, pattern classification in the dorsal region overlapping with V3B/KO and V4d reflects discrimination of global form patterns based on local signals (position, orientation), whereas classification in the ventral lateral region overlapping with LO reflects representation of global forms independent of low-level stimulus properties.
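The independence between voxel selection and testing described above can be sketched as a leave-one-run-out loop in which the Glass-versus-random contrast is recomputed from the training runs in every fold. The sketch makes several assumptions for illustration: a mean-difference contrast stands in for the GLM used in the study, a nearest-centroid rule stands in for the classifier, and all data and names are simulated or hypothetical.

```python
import random

def select_voxels(train_samples, k):
    """Rank voxels by mean(Glass) - mean(random) using training samples only."""
    glass = [p for lab, p in train_samples if lab != "random"]
    rand = [p for lab, p in train_samples if lab == "random"]
    n_vox = len(train_samples[0][1])
    contrast = [
        sum(p[v] for p in glass) / len(glass) - sum(p[v] for p in rand) / len(rand)
        for v in range(n_vox)
    ]
    return sorted(range(n_vox), key=lambda v: contrast[v], reverse=True)[:k]

def centroid(patterns):
    return [sum(col) / len(patterns) for col in zip(*patterns)]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def leave_one_run_out(runs, k, classes=("concentric", "radial")):
    """Classify held-out runs using voxels chosen from the training runs alone."""
    correct = total = 0
    for held_out in range(len(runs)):
        train = [s for r, run in enumerate(runs) if r != held_out for s in run]
        voxels = select_voxels(train, k)   # the test run is never used here
        cents = {
            c: centroid([[p[v] for v in voxels] for lab, p in train if lab == c])
            for c in classes
        }
        for lab, p in runs[held_out]:
            if lab not in classes:
                continue
            x = [p[v] for v in voxels]
            predicted = min(classes, key=lambda c: sq_dist(x, cents[c]))
            correct += predicted == lab
            total += 1
    return correct / total

# Simulated data (assumption): 30 voxels, 6 runs, 4 repetitions per condition.
rng = random.Random(2)
N_VOXELS, N_RUNS, N_REPS = 30, 6, 4

def simulate_pattern(label):
    pattern = [rng.gauss(0.0, 1.0) for _ in range(N_VOXELS)]
    if label != "random":
        for v in range(10):                # voxels 0-9 respond to any Glass pattern
            pattern[v] += 1.0
        preferred = range(0, 5) if label == "concentric" else range(5, 10)
        for v in preferred:                # weak preference for one global form
            pattern[v] += 1.0
    return pattern

runs = [
    [(lab, simulate_pattern(lab))
     for lab in ("concentric", "radial", "random") for _ in range(N_REPS)]
    for _ in range(N_RUNS)
]
accuracy = leave_one_run_out(runs, k=10)
print(f"leave-one-run-out accuracy = {accuracy:.2f}")
```

Selecting voxels inside the cross-validation loop costs one extra contrast per fold, but it prevents the held-out run from influencing which voxels enter the classifier.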
As shown in Fig. 5A, the three-way classification between different Glass pattern types showed classification accuracies higher than chance for all regions, suggesting that activation patterns in these regions contain information that allows us to discriminate between the patterns. Mean classification accuracies did not differ significantly between the dorsal and ventral GPRR (F(1,5) = 0.42, P = 0.54). However, mean classification accuracy was significantly higher for the lateral than the medial ventral subregions (F(1,5) = 6.7, P < 0.05). Analysis of the classification accuracies for each Glass pattern type in the dorsal and ventral GPRRs (2-way repeated-measures ANOVA) showed no significant differences for ROI (F(1,5) = 1.3, P = 0.31) or stimulus type (F(1.2,5) = 4.2, P = 0.09) and no significant interaction (F(1.6,6.5) = 1.43, P = 0.30). Interestingly, a trend for higher classification accuracy for radial patterns was observed in the ventral medial subregion similar to the radial bias observed in intermediate visual areas; however, this effect did not reach significance (F(1.1,5.0) = 2.4, P = 0.19).
Next, the pairwise classifications (Fig. 5B) for each Glass pattern type versus random showed accuracies significantly higher than chance for all GPRRs. Comparing classification accuracies between the dorsal and ventral GPRRs (2-way repeated-measures ANOVA) showed significantly higher classification accuracy in the ventral than dorsal GPRR (F(1,4) = 20.5, P = 0.01). No significant main effect of comparison (F(1.8,9.0) = 2.2, P = 0.17) or interaction between ROI and comparison (F(1.8,9.0) = 0.67, P = 0.52) was observed. Furthermore, no significant differences were found for classification accuracies in the ventral lateral and ventral medial subregions (main effect of ROI: F(1,5) = 0.16, P = 0.7, main effect of comparison: F(1.8,9.1) = 0.88, P = 0.43, interaction of ROI × comparison: F(1.7,8.4) = 1.3, P = 0.30).
Finally, Fig. 5C shows higher classification accuracy when discriminating between activations for concentric and radial patterns than activations between random stimuli that differ locally by 90° rotation of the dot dipoles (random 1 vs. random 2, random 2 vs. random 2–90°). In particular, analysis of the classification accuracies in the dorsal and ventral GPRRs (2-way repeated-measures ANOVA) showed a significant main effect of comparison (concentric vs. radial, random 1 vs. random 2, random 2 vs. random 2–90°) (F(1.7,5.1) = 10.1, P = 0.02), no significant main effect of ROI (F(1,3) = 0.003, P = 0.96), and a trend for a significant interaction between comparison and ROI (F(1.4,4.2) = 5.5, P = 0.07). Following this interaction trend, repeated-measures ANOVAs for the individual ROIs showed significant effects of comparison for the ventral (F(1.6,4.9) = 10.4, P = 0.02) but not the dorsal GPRRs (F(1.6,4.8) = 2.4, P = 0.18). In particular, classification accuracy in the ventral GPRR was higher for the concentric versus radial comparison than random 1 versus random 2 (t(3) = 3.8, P = 0.03) or random 2 versus random 2–90° (t(3) = 3.5, P = 0.04). Comparing classification accuracies in the ventral lateral and ventral medial GPRR showed a significant effect of comparison in the ventral lateral GPRR (F(1.9,5.6) = 10.3, P = 0.01) but not the ventral medial GPRR (F(1.7,5.1) = 1.8, P = 0.26). This result suggests that the higher classification accuracy for global patterns than random stimuli in the ventral GPRRs was driven by higher accuracy for concentric versus radial than random stimuli in the ventral lateral region (t(3) = 5.3, P = 0.01).
Taken together, these findings provide evidence for a continuum of integration processes for the perception of global forms in Glass patterns across stages of visual analysis. In particular, classification in the medial ventral region anterior to V4 (VOT cortex) and a dorsal region anterior to V3a and corresponding to V3B/KO and V4d suggests that neural populations in these regions discriminate between global form patterns based on local signals (position, orientation). In contrast, classification of Glass patterns in a ventral lateral region overlapping with LO showed overall higher accuracies than more posterior (dorsal, ventral medial) regions and higher accuracy for global patterns (concentric vs. radial) than for random stimuli. These findings suggest that neural populations in this ventral lateral region integrate local signals into global configurations and represent the global form structure independent of low-level stimulus properties.
Our findings are consistent with previous work attributing global integration processes to areas V4 and IT. Regarding area V4, previous physiological studies have shown population selectivity for curvature (Pasupathy and Connor 1999, 2001, 2002) and neurons with selective tuning for global form patterns (concentric, radial, hyperbolic) compared with translational patterns (Gallant et al. 1993, 1996). Human brain imaging (fMRI, EEG, intracranial recordings, lesion) studies show that global integration processes for circular patterns occur at later stages of analysis corresponding to human ventral V4 (Allison et al. 1999; Dumoulin and Hess 2007; Gallant et al. 2000; Ohla et al. 2005; Pei et al. 2005; Wilkinson et al. 2000). However, our results show differential selectivity across different types of Glass patterns (i.e., higher selectivity for concentric and radial than translational patterns) primarily in dorsal regions anterior to V3a that have been suggested to correspond to V3B/KO and dorsal V4 rather than in ventral V4. It is possible that this is because of differences in the stimuli used (e.g., full field gratings used in previous studies vs. patterns defined by sparse dot pairs used in our study) or differences in the functional organization of V4 in the human and monkey brain (Tootell and Hadjikhani 2001). Further imaging studies in both humans and monkeys (Tse et al. 2002) using the same stimuli would be necessary to resolve the role of ventral and dorsal V4 as an intermediate stage in the integration of global forms. Finally, previous physiological studies show that neurons in posterior IT integrate local contour fragments as defined by combinations of curvature, orientation, and position information into shape configurations (Brincat and Connor 2004, 2006) that provide the basis for object recognition at more anterior IT regions. 
Interestingly, a previous imaging study showed higher responses for concentric versus radial patterns in anterior temporal areas specialized for face processing (human area FFA), suggesting that face perception may rely on the processing of circular feature configurations (Wilkinson et al. 2000). Our findings show that selectivity for higher-order features that mediates the discrimination of global forms (concentric vs. radial patterns) starts at more posterior sites of visual analysis in the ventral lateral occipital cortex.
Classification accuracy and selectivity as a function of pattern size
To gain further insight into the amount of information across voxels necessary for the classification of global form structure, we evaluated classification accuracy for concentric versus radial Glass patterns as a function of pattern size. We randomized the order of voxels across the pattern size (i.e., 200 voxels that responded significantly stronger to all stimuli than fixation in each region of interest) and performed 500 MVPA iterations with a different randomized order of voxels per iteration. As observed in previous studies (Haynes and Rees 2005; Kamitani and Tong 2005), classification accuracy increased with increasing pattern size to a saturation value at which the inclusion of more dimensions/voxels does not increase the classification accuracy further (Fig. 6). Fitting the accuracy across pattern size for each area with a power law function that has been previously used for modeling neural data (Bonin et al. 2005; Li and Freeman 2007) confirmed that classification accuracy increased nonlinearly with pattern size for all regions of interest (i.e., exponents closer to 0 than 1; e.g., V1: k = 0.04, r² = 0.99; LOC: k = 0.04, r² = 0.91; ventral GPRR: k = 0.05, r² = 0.94).
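A power-law fit of accuracy against pattern size can be sketched as a least-squares regression in log-log space. The exact fitting procedure used in the study is not specified here, so the log-log approach, the function name, and the noise-free synthetic accuracies below are assumptions for illustration; note that r² is computed on the log-transformed values in this sketch.

```python
import math

def fit_power_law(sizes, accuracies):
    """Fit accuracy = a * size**k by linear regression on log-log values.

    Returns (a, k, r2), where r2 is computed in log space.
    """
    xs = [math.log(s) for s in sizes]
    ys = [math.log(acc) for acc in accuracies]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    k = sxy / sxx                      # exponent = slope in log-log space
    log_a = mean_y - k * mean_x        # intercept = log of the scale factor
    ss_res = sum((y - (log_a + k * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return math.exp(log_a), k, 1.0 - ss_res / ss_tot

# Synthetic, noise-free accuracies (percent correct) generated with k = 0.04;
# the scale factor 55 is an arbitrary illustrative value.
sizes = list(range(10, 201, 10))
accuracies = [55.0 * s ** 0.04 for s in sizes]

a, k, r2 = fit_power_law(sizes, accuracies)
print(f"a = {a:.2f}, k = {k:.3f}, r2 = {r2:.3f}")
```

An exponent near 0 corresponds to a curve that rises quickly for the first few voxels and then flattens, whereas an exponent near 1 would correspond to accuracy growing almost linearly with pattern size.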
To quantify the rate with which classification accuracy increased with pattern size, we used a saturation model that allows us to determine the pattern size (voxel constant, υ) for which the classification accuracy reaches 63% of the difference between the maximum accuracy (asymptote) and chance (supplementary material). The goodness of fit was high and significant (χ², P < 0.05) for all areas. Figure 6 compares accuracy across pattern size for the classification of concentric versus radial Glass patterns for early (V1) and higher (LOC, vGPRR) visual areas, showing higher accuracy for smaller pattern size in higher visual areas than V1. To quantify differences across areas, we computed the ratio of accuracy relative to chance (ΔA = A_max − 50%) over the voxel constant (υ) as a measure of the rate with which information for the classification of global form accumulates across voxels. This ratio was significantly higher in occipitotemporal than early visual areas (Table S4), indicating higher classification accuracy at smaller pattern sizes in higher visual areas. These results suggest that higher occipitotemporal areas integrate information about global form structure across a smaller pattern size (i.e., smaller neural populations) than early visual areas, consistent with the neural properties of higher visual areas (i.e., larger receptive fields tuned to the global stimulus structure). Could these results reflect simply differences in the amount of information (i.e., number of dipoles) stimulating voxels corresponding to neural populations with different receptive field sizes across visual areas? Previous fMRI studies (Dumoulin and Wandell 2008) have determined the size of population receptive fields (V1: 0.5–1° radius; LOC: 4–8° radius) based on the cortical magnification factor for the same voxel size as used in our study (2.5 × 2.5 × 3 mm). 
Based on these estimates, the size (10.8° visual angle) and the density (0.4% dot density) of our stimuli, we estimated that 1–5 dipoles stimulated a single voxel in V1, whereas the entire stimulus (∼150 dipoles) stimulated a single voxel in the LOC. These differences in the amount of information per voxel (i.e., number of dipoles) across areas were similar for global (i.e., concentric vs. radial patterns) and random (e.g., random 2 vs. random 2–90°) stimuli. However, differences in the rate with which classification accuracy increased between early and higher visual areas were specific to global form stimuli (concentric vs. radial Glass patterns); that is, classification accuracy across pattern size for random patterns was similar across early and higher visual areas (Fig. S7). Furthermore, these differences in the rate of information necessary for classification of global form structure across areas could not be simply attributed to differences in the magnitude of responses across voxels in the pattern size or possible spatial correlations between voxels with similar responsiveness to the stimuli because the order of the voxels in the pattern size was randomized for a large number of MVPA iterations.
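The voxel constant υ and the rate measure ΔA/υ used in this analysis can be made concrete with a short sketch. The model itself is given only in the supplementary material, so the exponential form below is an assumption, chosen because it reaches 1 − 1/e ≈ 63% of the asymptotic gain exactly at a pattern size of υ; the parameter values for the two areas are hypothetical.

```python
import math

CHANCE = 50.0  # percent correct for a two-way classification

def saturating_accuracy(n_voxels, a_max, voxel_constant, chance=CHANCE):
    """Exponential saturation (assumed form): chance + dA * (1 - exp(-n / v))."""
    delta_a = a_max - chance
    return chance + delta_a * (1.0 - math.exp(-n_voxels / voxel_constant))

def accumulation_rate(a_max, voxel_constant, chance=CHANCE):
    """Ratio dA / v: accuracy gained above chance per unit of voxel constant."""
    return (a_max - chance) / voxel_constant

# Hypothetical parameters, not values from the study.
early = {"a_max": 65.0, "voxel_constant": 40.0}
higher = {"a_max": 75.0, "voxel_constant": 20.0}

# Fraction of the asymptotic gain reached when the pattern size equals v.
at_v = saturating_accuracy(early["voxel_constant"], **early)
fraction_at_v = (at_v - CHANCE) / (early["a_max"] - CHANCE)

rate_early = accumulation_rate(**early)
rate_higher = accumulation_rate(**higher)
print(f"fraction of gain at v: {fraction_at_v:.3f}")
print(f"rate early = {rate_early:.3f}, rate higher = {rate_higher:.3f}")
```

Under this form, an area with a higher asymptote and a smaller voxel constant has a larger ΔA/υ, which corresponds to reaching high accuracy with fewer voxels.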
To further study these differences in selectivity for global forms across visual areas, we computed the bias (i.e., z-score normalized difference of fMRI responses) for concentric versus radial Glass patterns for each voxel included in the MVPA. Comparison of these voxel-bias distributions (i.e., population histograms) across areas showed narrower distributions in early visual areas than higher occipitotemporal areas (LOC, vGPRR). As shown in Fig. 7, comparison of voxel-bias distributions between V1 and LOC (Fig. 7A) and V1 and vGPRR (Fig. 7B) showed a larger number of voxels (proportion of total voxel population in each ROI) with higher bias values in the LOC and vGPRR than V1 (Fig. S8). Comparison of the distribution SD showed significantly higher variance in LOC (F(1,7) = 16.4, P < 0.05) and vGPRR (F(1,7) = 5.4, P = 0.05) than V1, consistent with a trend for higher kurtosis (i.e., higher variance caused by extreme values) in higher than early visual areas (Table S5). This analysis suggests that a smaller but more selective population of voxels in higher occipitotemporal areas than primary visual cortex contains information that supports the discrimination of features defining global forms (concentric vs. radial patterns).
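A voxel-bias analysis of this kind can be sketched as follows. The sketch assumes that each voxel's responses are z-scored across trials before the concentric-minus-radial difference is taken (one plausible reading of the normalization, not confirmed by the text), and it uses simulated regions whose only difference is the spread of voxel preferences; all names and parameters are hypothetical.

```python
import random
from statistics import mean, pstdev

def zscore(values):
    """Normalize one voxel's responses to zero mean and unit SD across trials."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]

def voxel_bias(responses, labels):
    """Mean z-scored response to concentric minus radial patterns for one voxel."""
    z = zscore(responses)
    concentric = [v for v, lab in zip(z, labels) if lab == "concentric"]
    radial = [v for v, lab in zip(z, labels) if lab == "radial"]
    return mean(concentric) - mean(radial)

def excess_kurtosis(values):
    """Fourth standardized moment minus 3 (0 for a normal distribution)."""
    m = mean(values)
    second = mean([(v - m) ** 2 for v in values])
    fourth = mean([(v - m) ** 4 for v in values])
    return fourth / second ** 2 - 3.0

# Simulated data (assumption): 200 voxels per region, 40 trials per voxel.
rng = random.Random(3)
N_VOXELS, N_TRIALS = 200, 40
labels = ["concentric", "radial"] * (N_TRIALS // 2)

def simulate_roi(selectivity_sd):
    """Bias of each voxel when preferences are drawn with the given spread."""
    biases = []
    for _ in range(N_VOXELS):
        preference = rng.gauss(0.0, selectivity_sd)
        responses = [
            rng.gauss(preference / 2 if lab == "concentric" else -preference / 2, 1.0)
            for lab in labels
        ]
        biases.append(voxel_bias(responses, labels))
    return biases

early = simulate_roi(0.1)    # weakly selective voxels
higher = simulate_roi(0.5)   # more strongly selective voxels
sd_early, sd_higher = pstdev(early), pstdev(higher)
print(f"SD early = {sd_early:.2f}, SD higher = {sd_higher:.2f}")
print(f"excess kurtosis higher = {excess_kurtosis(higher):.2f}")
```

A wider bias distribution in this sketch arises from more voxels with strong preferences for one pattern type, which is the property that the SD and kurtosis comparisons are intended to capture.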
Importantly, this difference between higher and early visual areas was shown to be specific to global form features rather than to local orientation differences. In particular, voxel-bias distributions for the control conditions (i.e., z-score normalized difference of fMRI responses for random 2 and random 2–90°) were similar across areas (Table S5). Comparison of the SD of the voxel-bias distributions for Glass patterns (concentric vs. radial) and control stimuli (random 2 and random 2–90°) for V1 and LOC yielded a significant interaction (F(1,7) = 11.9, P < 0.05). These findings are consistent with differences in neural code across areas specific to global form features rather than differences in local position and orientation signals.
Taken together, these findings suggest that neural populations involved in the analysis of global forms are smaller but more selective in higher occipitotemporal than retinotopic areas. These findings are consistent with the proposal that the neural code for global shapes becomes sparser and more efficient across stages of analysis in the visual cortex (Baker et al. 2002; Brincat and Connor 2004, 2006; Fujita et al. 1992; Riesenhuber and Poggio 1999; Tsunoda et al. 2001). It is important to note that the spatial resolution of fMRI limits us in characterizing the neural code for global forms at the level of neural populations within voxels rather than single neurons. Further studies manipulating shape properties (e.g., stimulus curvature or complexity) are necessary for evaluating the neural properties (i.e., selectivity and sparseness) with which more complex shape features are encoded in the human temporal cortex.
Our findings provide novel evidence for distinct neural mechanisms involved in the integration of global form structure in Glass patterns across stages of visual analysis in the human brain. In particular, both retinotopic and higher occipitotemporal areas contain information across voxels that allows us to discriminate between stimuli that evoke the perception of global form in Glass patterns. However, only higher areas in the ventral lateral occipital cortex show selectivity for higher-order (global structure) features beyond low-level stimulus differences. Specifically, early visual areas discriminate between global forms by sampling local information about elementary features (e.g., position, orientation). In contrast, ventral lateral regions (LO) are involved in the integration of local signals into global form structure that facilitates the discrimination of concentric from radial patterns, consistent with the behavioral advantage for the detection of concentric compared with radial and translational patterns. Importantly, the analysis of higher-order stimulus features that define global forms in Glass patterns is supported by smaller but more selective neural populations in higher occipitotemporal areas, consistent with global pooling of local orientation signals from earlier stages of analysis (Wilson and Wilkinson 1998). In contrast, retinotopic areas integrate local position and orientation signals across larger neural populations, which is consistent with a summation process within and outside the classical receptive field of neurons in these areas (Smith et al. 2002, 2007).
Extending beyond understanding selectivity for global forms in Glass patterns in the human brain, these findings advance our understanding of the neural basis of visual form analysis along the human visual cortex in several respects. First, our work provides a systematic study of selectivity for global form structure across stages of analysis in the human visual cortex. We test the hypothesis proposed by previous psychophysical and computational work that early stages of visual analysis resolve the processing of homogeneous texture fields (e.g., translational Glass patterns), whereas later stages resolve the perception of more complex forms (radial, concentric patterns) (Cardinal and Kiper 2003; Dakin 1997a,b; Dakin and Bex 2001; De Valois and Switkes 1980; Glass and Switkes 1976; Maloney et al. 1987; Mandelli and Kiper 2005; Prazdny 1984, 1986; Wilson and Wilkinson 1998; Wilson et al. 1997). Using advanced and sensitive multivariate methods for the analysis of fMRI data, we show selectivity for elementary visual features (position, orientation) in retinotopic areas and selectivity for global form structure (i.e., concentric vs. radial patterns) beyond local signal differences in higher occipitotemporal cortex.
Second, our methods allow for systematic comparisons across areas and characterization of their differential functional roles. In contrast to previous studies that have used stimuli varying in complexity and local statistics (gratings, dot patterns, Gabor patches) for characterizing different visual areas based on their preferable stimulus tuning, we use the same stimulus (Glass patterns) to trace neural processing along the visual cortex. Our choice of stimulus was motivated by the global form properties of Glass patterns that have been studied extensively in previous psychophysical and modeling studies. In particular, Glass patterns 1) evoke the perception of different global forms while being on average matched for their local statistics, 2) entail integration of local orientation signals to global form percepts, and 3) represent basic forms (e.g., circular, radial) that form the primitives of biologically relevant complex objects (e.g., faces). Furthermore, the use of fMRI allows us to test for selectivity across all visual areas simultaneously in the same subjects with the same experimental paradigm. Using this methodology, our findings show a functional architecture for the integration of global shapes in the human visual cortex that is consistent with previous neurophysiological studies. In particular, neurons in V1 and V2 (Hegde and Van Essen 2000, 2003, 2004; Ito and Komatsu 2004; Peterhans and von der Heydt 1993; Smith et al. 2002, 2007) have been suggested to compute local orientation signals (i.e., dot dipoles in Glass patterns) within the area of their receptive fields and integrate collinear edges into contours within their extended receptive field or through a circuit of local (facilitative and suppressive) and recurrent (feedback from higher areas) interactions (Allman et al. 1985; Fitzpatrick 2000; Gilbert 1992, 1998; Lamme et al. 1998; Murray et al. 2004). 
However, global spatial integration of multiple orientation signals has been attributed to V4 neurons that have larger receptive fields than neurons at earlier stages of processing (Desimone and Schein 1987) and show selectivity for higher-order features of moderate complexity (e.g., curvature, angles) that define shape parts (Gallant et al. 1993, 1996; Kobatake and Tanaka 1994; Pasupathy and Connor 1999, 2001, 2002). Finally, information about object parts is converted based on both linear and nonlinear integration mechanisms into sparser representations of complex shapes (multipart configurations) at the posterior inferior temporal cortex (Baker et al. 2002; Brincat and Connor 2004, 2006; Fujita et al. 1992; Riesenhuber and Poggio 1999; Tsunoda et al. 2001). These shape configurations provide the basis for object recognition at more anterior IT regions where neurons selective even for entire objects have been identified (Gross et al. 1972; Logothetis and Sheinberg 1996; Rolls and Tovee 1995; Tanaka 1996; Tsao et al. 2006; Young and Yamane 1992).
Third, the use of multivariate analysis for fMRI signals (MVPA) allows us to study in depth the cascade of processes involved in converting information about image edges from V1 to selectivity for global forms and objects in the temporal cortex. In contrast to univariate analysis, MVPA allows us to discern selectivity for stimulus features that are encoded by neural ensembles at a higher resolution than the typical size of fMRI voxels (Haynes and Rees 2005; Kamitani and Tong 2005; Norman et al. 2006). In particular, our previous imaging work has shown that early visual areas integrate local elements to contours within the neighborhood of their receptive field, whereas higher visual areas represent the perceived global shape by comparing responses to collinear versus random patterns (Altmann et al. 2003; Kourtzi et al. 2003). However, only when using multivariate approaches could we discern selectivity for higher-order features that mediate the discrimination of global forms (e.g., concentric vs. radial Glass patterns). In contrast, comparing univariate responses (average percent signal change across voxels) for concentric versus radial patterns did not show any significant differences. This analysis suggests that weak biases in individual voxels may show information about feature selectivity when the bias of several voxels across multi-voxel patterns is considered. Previous studies have used fMRI adaptation for showing differential responses to global form patterns (e.g., concentric vs. radial dynamic Glass patterns) from weak biases in individual voxels (Krekelberg et al. 2005). These studies suggest that higher occipitotemporal areas represent the perceived global structure rather than local stimulus differences. Despite similarities in the findings between these previous fMRI adaptation studies and the MVPA results presented here, important differences remain. 
Specifically, fMRI selective adaptation reveals differential stimulus-specific responses in single voxels related to neural adaptation (i.e., decreased responses caused by prolonged stimulus presentation or repetition and recovery from adaptation), whereas MVPA shows stimulus-specific responses based on fMRI signals pooled across multi-voxel patterns. Further combined fMRI and physiology studies using the same stimuli and tasks are necessary to shed more light on the similarities and/or differences between these approaches and the relation of fMRI findings to neural selectivity.
Interestingly, studying the extent of the spatial brain pattern necessary for the classification of global forms based on fMRI signals provides novel insights in understanding the nature of the neural code for representing visual information across cortical areas. This analysis provides evidence that higher occipitotemporal areas discern differences in global form structure (i.e., concentric vs. radial Glass patterns) with higher accuracy than early visual areas while relying on information from smaller but more selective neural populations (smaller pattern size). These findings are consistent with neurophysiological evidence for smaller neural populations with large receptive fields tuned to the global stimulus properties in higher occipitotemporal areas. Thus our study shows that multivariate approaches are a useful and sensitive tool for studying the neural code that the human brain uses for translating sensory information to global percepts. However, cautious interpretation of the MVPA results in relation to neural processing is necessary given the power of the algorithms for optimal classification and the limitations imposed by the spatial resolution of hemodynamic measurements. Taking into account these limitations, we performed a series of stringent tests (e.g., voxel selection based on independent data sets, classification of shuffled data, comparison of the same pattern size across visual areas, comparison of classification accuracies for different sets of stimuli and data within each area) that allow us to characterize conservatively selectivity for global forms along the human visual pathway.
In conclusion, our study investigates the neural basis of mid-level vision mechanisms that convert local signals to global form percepts in the human brain. Combining parametrically defined stimuli and multivariate fMRI methods, we show differential selectivity for global form features across stages of visual analysis. Our results are consistent with computational models of vision (Barlow and Olshausen 2004; Geisler et al. 2001; Sigman et al. 2001), proposing that the perception of higher-order structure in natural images relies on the combination of the output of local orientation filters in primary visual cortex to form higher-order features represented in temporal cortex. Our findings show that coding of global forms in the human brain proceeds from analysis of local orientation signals in V1 to selective representations of higher-order (global structure) features (e.g., curvature, junctions) in the temporal cortex. These findings are consistent with the proposal that the visual system uses a code of increasing efficiency across stages of processing that is advantageous in many respects: it reduces redundancy in the sensory input by integrating local signals into coherent percepts (Barlow 1972; Olshausen and Field 2004; Simoncelli and Olshausen 2001; Willmore and Tolhurst 2001), provides the building blocks for the representation of complex shapes and biologically important object categories (e.g., faces) (Biederman 1987; Marr and Vaina 1982), and supports fast readout of task-relevant information at different levels of shape description. As such, this code constitutes a fundamental computational principle for the analysis of sensory input in the human brain and is critical for successful detection and recognition of targets in cluttered environments.
The work was supported by Biotechnology and Biological Sciences Research Council Grant BB/D52199X/1 and Cognitive Foresight Initiative Grant BB/E027436/1 to Z. Kourtzi.
We thank K. Humphreys for help with data collection; O. Rosenthal for help with stimulus generation; J. A. Movshon and A. Norcia for insightful discussions of the work; and A. Bagshaw, C. Connor, B. Krekelberg, M. Smith, and A. Welchman for helpful comments and suggestions.
The online version of this article contains supplemental data.
Copyright © 2008 by the American Physiological Society