Department of Psychology and Center for Neural Science, New York University, New York, New York
Submitted 27 June 2005; accepted in final form 4 October 2005
INTRODUCTION
Models of second-order visual processing postulate the existence of a three-stage mechanism variously referred to as the filter-rectify-filter (FRF), linear-nonlinear-linear (LNL), or "back pocket" model (see Landy and Graham 2004 for a review). In this model, the outputs of linear units tuned to the spatial frequency of the carrier stimulus are rectified and pooled by a larger linear filter tuned to the orientation and spatial frequency of the second-order modulation (Fig. 1D). FRF models provide a relatively simple mechanism to account for second-order pattern perception that could be implemented in neuronal circuitry. The properties of the first-stage filters in these models are similar to those of simple cells in V1. The outputs of several such neurons are assumed to be pooled by neurons corresponding to the second-stage filter, the rectification stage being implemented by the spiking threshold of the first-order neurons.
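The three stages described above can be sketched in one spatial dimension. This is a minimal illustration, not the model of any cited study: the ideal band-pass filters, the full-wave rectifier, the sampling rate, and all frequency bands are assumptions chosen for the example, and the helper names `bandpass` and `frf_response` are hypothetical.

```python
import numpy as np

def bandpass(signal, fs, f_lo, f_hi):
    """Ideal linear band-pass filter: zero all Fourier components outside [f_lo, f_hi]."""
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(signal.size, d=1.0 / fs)
    spectrum[(freqs < f_lo) | (freqs > f_hi)] = 0.0
    return np.fft.irfft(spectrum, n=signal.size)

def frf_response(stimulus, fs, first_band, second_band):
    """Filter-rectify-filter: returns the RMS output of the second-stage filter."""
    first = bandpass(stimulus, fs, *first_band)      # first-stage filter, tuned to the carrier
    rectified = np.abs(first)                        # full-wave rectification (assumed form)
    second = bandpass(rectified, fs, *second_band)   # second-stage filter, tuned to the modulation
    return float(np.sqrt(np.mean(second ** 2)))

# Illustrative 1-D stimuli (all values are assumptions, not the study's parameters).
rng = np.random.default_rng(0)
fs = 64.0                                   # samples per degree of visual angle
x = np.arange(1024) / fs                    # 16 deg of visual angle
carrier = bandpass(rng.standard_normal(x.size), fs, 4.0, 8.0)   # band-limited noise carrier
modulator = 0.8 * np.sin(2 * np.pi * 1.0 * x)                   # 1 cpd contrast modulation
contrast_modulated = (1.0 + modulator) * carrier

r_mod = frf_response(contrast_modulated, fs, (3.0, 9.0), (0.75, 1.25))
r_flat = frf_response(carrier, fs, (3.0, 9.0), (0.75, 1.25))
print(f"modulated: {r_mod:.3f}, unmodulated: {r_flat:.3f}")
```

The contrast modulation has no Fourier energy at 1 cpd in the stimulus itself; it becomes visible to the second-stage filter only after rectification, which is why the modulated carrier yields the larger output in this sketch.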
Neurons with properties consistent with the second-stage filter have been found in cat and monkey visual cortex. Single-unit studies have identified neurons selective for a variety of second-order stimuli, such as patterns defined by modulations of carrier orientation (Olavarria et al. 1992; Rossi et al. 2001), contrast (Leventhal et al. 1998; Mareschal and Baker 1998a,b, 1999; O'Keefe and Movshon 1998; Zhou and Baker 1993, 1994, 1996), spatial frequency (Leventhal et al. 1998), temporal frequency (Albright 1992; Chaudhuri and Albright 1997), and phase shifts of abutting gratings (a type of illusory contour) (Grosof et al. 1993; von der Heydt and Peterhans 1989; von der Heydt et al. 1984). Most of these neurons were found in extrastriate visual areas (area 18 in the cat, V2 and MT in macaque monkey), although a small number of neurons selective for second-order patterns were found in V1. Interestingly, most neurons selective for second-order patterns also responded to first-order patterns having similar properties to the second-order modulating stimulus (e.g., orientation and spatial frequency), suggesting that such neurons encode a cue-invariant representation of the stimuli. Cue-invariant neurons responding to shapes and gratings defined by texture, luminance, motion, or color have also been found in inferotemporal cortex of monkeys (Sary et al. 1993, 1995). Although cue-invariance for different types of second-order patterns is a property of the second-stage filters in some FRF models, these models do not predict that the same neurons should respond selectively to both first- and second-order patterns.
The neurophysiology of human second-order vision has been studied with functional MRI (fMRI) and PET. Most studies to date have focused on second-order visual motion processing (Dumoulin et al. 2003; Dupont et al. 2003; Nishida et al. 2003; Seiffert et al. 2003; Smith et al. 1998; Wenderoth et al. 1999), with somewhat conflicting results. While all of these studies found that first- and second-order motion evoked responses in largely the same regions of visual cortex, only some of them found stronger activation by one stimulus type in any cortical area (as might be expected if the neurons within a particular cortical area were predominantly responsive to a single stimulus category). Smith et al. (1998) reported that second-order motion evoked stronger responses in areas V3 and V3A/B than did first-order motion. Wenderoth et al. (1999), using PET, reached a similar conclusion. In contrast, Dumoulin et al. (2003) found no difference between the fMRI responses to first- and second-order motion in V3 and V3A/B, but instead observed significantly stronger responses to first-order motion in V1 and stronger responses to second-order motion in lateral occipital cortex anterior to V3 (but not including V5/MT). Other studies failed to find any significant differences between first- and second-order motion (Dupont et al. 2003; Nishida et al. 2003; Seiffert et al. 2003).
With the exception of the study by Nishida et al. (2003), the studies cited above relied on finding overall response differences between two physically different stimulus categories (e.g., stimulus plus carrier vs. carrier alone, or 1st- vs. 2nd-order stimuli). A lack of a difference in the overall level of response, however, does not necessarily imply a lack of stimulus selectivity. For example, if one subpopulation of neurons responds only to first-order stimuli and a separate subpopulation of intermingled neurons responds equally strongly only to second-order stimuli, there will be no difference in the overall response (averaged across both subpopulations) to first- and second-order stimuli. The two subpopulations could, however, be distinguished using an adaptation protocol. Response adaptation provides a means for revealing separate subpopulations of neurons selectively tuned for different stimuli even when these neurons are intermingled at a spatial scale that is smaller than the sampling resolution (voxel size) of the measurements. A sampled region of tissue containing neurons selectively tuned for one stimulus category will adapt, i.e., respond less, after repeated presentation of these neurons' preferred stimulus. If the same tissue contains a second, separate subpopulation of neurons selectively tuned for a different stimulus category, that tissue will also adapt to that stimulus category. Repeated presentation of one stimulus category will not, however, affect the postadaptation responses to the other stimulus category. Selective adaptation to a particular stimulus category thus provides a measure of the stimulus selectivity of a subpopulation of neurons that is unaffected by the stimulus selectivity of other neurons in the same region of tissue. The only neuroimaging study of second-order motion perception that used adaptation is that by Nishida et al. (2003), who measured direction-selective fMRI response adaptation to identify neurons tuned to first- and second-order motion direction. Although the authors found evidence of adaptation as early as V1, they failed to find any difference between the amount of adaptation to first- and second-order motion, leading them to conclude that the two types of motion are analyzed in the same cortical regions. They did not, however, test whether adapting to first-order motion influenced the responses to second-order motion or vice versa.
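The logic of using adaptation to separate intermingled subpopulations can be illustrated with a toy voxel. This is a sketch under strong simplifying assumptions: two equally sized subpopulations, each responding only to its preferred stimulus type, and a hypothetical `adapted_gain` of 0.5 applied only after adaptation to that type.

```python
def voxel_response(probe, adapter=None, adapted_gain=0.5):
    """Summed response of two intermingled subpopulations to a probe stimulus.

    Each subpopulation responds only to its preferred stimulus type, and its
    gain drops to `adapted_gain` only after adaptation to that same type.
    """
    total = 0.0
    for preferred in ("first_order", "second_order"):
        drive = 1.0 if probe == preferred else 0.0        # selective tuning
        gain = adapted_gain if adapter == preferred else 1.0
        total += gain * drive
    return total

# Without adaptation the voxel cannot tell the two stimulus types apart.
print(voxel_response("first_order"), voxel_response("second_order"))
# After first-order adaptation only the first-order response is reduced.
print(voxel_response("first_order", adapter="first_order"),
      voxel_response("second_order", adapter="first_order"))
```

In this toy case the unadapted responses are identical, so a comparison of overall response levels would show no selectivity, whereas the adapted responses differ.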
The pathways and visual areas processing motion are not identical to those processing static patterns and complex shapes. The study by Nishida et al. (2003), for instance, was designed to localize neurons selective for the direction of second-order motion; neurons selective for static properties of second-order patterns (e.g., orientation), but not selective for motion direction, would not have been identified by the experimental procedure used. For these reasons, studies of second-order motion are of limited use for understanding the mechanisms underlying static second-order pattern perception. Only a small number of human neuroimaging studies have investigated perception of static second-order textures. Using fMRI, Kastner et al. (2000) found that stimuli containing texture boundaries evoked stronger responses in higher-tier extrastriate visual areas than did stimuli with uniform texture. Grill-Spector et al. (1998a) studied cue invariance in object-selective cortical regions using stimuli defined, among other cues, by second-order texture boundaries. Cue-invariant responses were observed in the lateral occipital complex (LOC), but except for a small region anterior to V3, not in early retinotopic areas.
Although these earlier studies of static texture perception identified brain areas that responded to second-order patterns, they did not reveal whether any of these areas are selective for second-order stimuli. For instance, as mentioned above, psychophysical studies have shown that second-order (and 1st-order) mechanisms are orientation-selective, implying that the underlying neuronal mechanisms are also orientation-selective. As a consequence, it should be possible to localize the neuronal populations mediating second-order pattern perception by identifying brain regions that exhibit orientation selectivity for second-order stimuli.
In this study, we used orientation-selective adaptation as a tool to localize populations of neurons selective for the orientation of first-order patterns (defined by luminance modulations), neurons selective for the orientation of second-order patterns (defined by carrier contrast or orientation), or cue-invariant neurons selective for the orientation of both first- and second-order patterns. We used an experimental design similar to that previously used to measure adaptation in electrophysiology and psychophysics experiments (Bradley et al. 1988; Carandini et al. 1997, 1998; De Valois 1977; Hammett and Snowden 1995; Kohn and Movshon 2003; Pantle and Sekuler 1968; Sclar et al. 1989; Snowden and Hammett 1996; Solomon et al. 2004). Importantly, we used a highly demanding foveal task to divert attention away from the stimuli and thereby equate spatial attention across stimulus conditions. We found orientation-selective response adaptation to both first- and second-order patterns in multiple visual areas, including V1, with no single visual area specialized for either stimulus type. Most of the response adaptation we observed for first-order stimuli could be accounted for by adaptation in V1 neurons. Response adaptation to second-order stimuli, on the other hand, was significantly stronger in several extrastriate visual areas than in V1, particularly ventral area VO1, implying that the proportion of neurons selective for second-order pattern orientation was greater in these areas than in V1. We did not find convincing evidence for cue-invariant orientation-selective response adaptation; adaptation to first-order stimuli did not consistently reduce the responses to second-order stimuli in any visual area examined. Our results are consistent with an FRF model in which the second filtering stage is mediated by neurons primarily in ventral extrastriate visual areas.
METHODS
Stimulus conditions
We measured the postadaptation fMRI responses to presentations of vertical or horizontal grating patterns (see Visual stimuli for a detailed description) with modulations of either luminance (condition LM:LM; 1st-order), carrier contrast (condition CM:CM; 2nd-order), or carrier orientation (conditions OM:OM and LM:OM; 2nd-order). In the unimodal adaptation conditions LM:LM, CM:CM, and OM:OM, the adapter and probe patterns were of the same stimulus type. In the cross-modal adaptation condition LM:OM, the adapter pattern was first-order but the probe pattern second-order. The conditions differed only in the types of stimuli used, but were otherwise identical in design. Each subject underwent two scanning sessions per condition, one for each adapter orientation (horizontal or vertical). The results for different adapter orientations were pooled to compensate for any orientation bias in the responses.
Adaptation protocol
We used an event-related design modeled on psychophysical and electrophysiological adaptation protocols (Bradley et al. 1988; Carandini et al. 1997, 1998; De Valois 1977; Hammett and Snowden 1995; Kohn and Movshon 2003; Pantle and Sekuler 1968; Sclar et al. 1989; Snowden and Hammett 1996; Solomon et al. 2004) to measure the average response to single presentations of an intermediate-contrast probe stimulus after adaptation to a high-contrast adapter stimulus (the actual contrasts used are listed below). The trial structure is shown in Fig. 2A. Before scanning, at the beginning of each experiment, subjects passively viewed the adapter stimulus for 100 s. Each subsequent trial had a duration of 7.2 s. Adaptation was maintained by showing a "top-up" adapter during the first 4 s of each trial. The top-up adapter was followed by a blank screen for 1 s, which was in turn followed by presentation of the probe stimulus for 1 s. The trial ended with a 1.2-s display of a blank screen (uniform gray except for the fixation point). Throughout the trial, subjects performed a highly attention-demanding task at fixation, thus ignoring the adapter and probe stimuli. A single adapter orientation was used for each scanning session. (A note on terminology: we use the term "scan" to refer to a single fMRI data collection run, typically about 5 min long, and "session" to refer to a set of scans run in direct succession.) The spatial phase of the adapter and probe stimuli was varied randomly at 4 Hz. In one-third of the trials, the probe stimulus had the same orientation as the adapter (parallel or adapted trials); in one-third, the probe stimulus was perpendicular to the adapter (orthogonal or unadapted trials); and in one-third of the trials, the screen remained blank throughout the probe stimulus presentation phase (blank trials; Fig. 2B). A scanning session comprised 2 localizer scans and 10 adaptation scans; each adaptation scan consisted of 42 trials (14 trials of each of the 3 trial types). The trials were pseudorandomized such that the sequence of trials preceding and following any trial was equally likely to contain any of the three trial types. Specifically, the 14 trials of each trial type were presented in seven blocks of 6 trials each, with trial order randomized within blocks. The large number of trials (420 trials per session) made any systematic biases caused by trial order very unlikely.
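The trial timing and block-wise pseudorandomization described above can be sketched as follows. The sketch assumes that each block of 6 trials contains exactly two trials of each type, which is the simplest reading of "seven blocks of 6 trials each"; the function name `make_scan_sequence` and the `seed` parameter are hypothetical.

```python
import random

TRIAL_TYPES = ("parallel", "orthogonal", "blank")
# Top-up adapter, blank, probe, blank: 4 + 1 + 1 + 1.2 = 7.2 s per trial.
TRIAL_DURATION_S = 4.0 + 1.0 + 1.0 + 1.2

def make_scan_sequence(n_blocks=7, reps_per_block=2, seed=None):
    """Return one scan's trial order, shuffled independently within each block.

    Assumption: every block holds `reps_per_block` trials of each trial type.
    """
    rng = random.Random(seed)
    sequence = []
    for _ in range(n_blocks):
        block = list(TRIAL_TYPES) * reps_per_block   # 6 trials per block
        rng.shuffle(block)                           # randomize order within the block
        sequence.extend(block)
    return sequence

scan = make_scan_sequence(seed=1)
scan_duration_s = len(scan) * TRIAL_DURATION_S       # 42 trials of 7.2 s each
```

With 42 trials of 7.2 s, one adaptation scan lasts about 302 s, consistent with the stated scan length of roughly 5 min.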
It is well known that spatial attention can strongly modulate the neuronal responses to visual stimuli measured with fMRI in a spatially specific manner, confounding the interpretation of the results (Huk et al. 2001). To control and equate attentional load across conditions, we used a highly attention-demanding task at the center of fixation that was identical across trials and conditions (Fig. 2A). The attentional control task required subjects to count the number of target letters (Xs) shown in a stream of rapidly presented distractor letters (10 repeated cycles of Z-L-N-T, in that order, each presented for 150 ms) and report the number of targets observed at the end of each trial. The letters were shown continuously for 6 s throughout each trial from the beginning of presentation of the adapter until the end of presentation of the probe stimuli. After 6 s, the letters were replaced by a fixation cross, cueing subjects to respond by pressing one of four keys corresponding to the number of target Xs presented (1–4). The targets could appear at any time throughout the trial, but two targets could not appear in direct succession. Although in theory this implied that, on a small proportion of trials, all targets could appear at the beginning of a trial, making it unnecessary for subjects to maintain attention at the center of gaze for the remainder of the trial, in practice the task was so difficult that subjects were never certain that they had seen all targets. Despite the great attentional demands of the task, subjects' performance after practice was well above chance level (Fig. 2C). Informally, we could verify that the task was highly effective at diverting attention from the periphery, because subjects reported that they were unable to perceive the orientation of the probe stimuli while performing the task, whereas with attention directed to the probe stimuli, discriminating the orientation of the probe stimuli was trivial.
Psychophysical measurement of adaptation
To verify that the adapter stimuli were effective in eliciting adaptation, we measured psychophysical postadaptation detection thresholds to stimuli under conditions similar to those in the fMRI experiment. For this experiment, the attentional control task was replaced by a two-interval forced choice task that involved determining which of two sequentially presented probe patterns contained a (1st- or 2nd-order) target stimulus. Before each session, subjects viewed the adapter stimulus for 100 s, analogous to the fMRI experiments. Each trial was 6.6 s long and began by presenting a top-up adapter for 4 s, followed by a blank screen for 0.5 s, followed by two stimuli for 0.5 s each, separated by an interstimulus interval (ISI) of 0.5 s. On one-half of the trials, the first interval contained the target, whereas on the other half of trials, the target was presented in the second interval. Nontarget stimuli were generated in the same way as the target stimuli, but with modulation amplitude set to zero. At the end of the trial, subjects had 1.1 s to indicate which interval contained the target. The modulation depth of the target stimulus was varied by a one-up, two-down staircase procedure, with two interleaved staircases. A single experimental session consisted of 10 blocks, each consisting of 20 trials. Target orientation alternated from block to block but was constant within a block to minimize perceptual interactions between the carrier stimulus and the modulator (Morgan et al. 2000). A single adapter orientation was used for each session, and sessions were run on different days to avoid potential confounding effects of long-term adaptation. One hundred trials were run for each stimulus condition (LM:LM, CM:CM, OM:OM, and LM:OM), trial type (adapt orthogonal, adapt parallel), and adapter orientation (vertical and horizontal). The results were pooled across adapter orientations, and psychometric functions were fitted to the data using a bootstrap procedure (Wichmann and Hill 2001a,b). Detection thresholds were defined as the modulation contrast corresponding to 75% correct detection.
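The one-up, two-down rule used to set the modulation depth can be sketched as below. The multiplicative step size, the starting level, and the class name `OneUpTwoDownStaircase` are assumptions for illustration; the text does not specify the step rule or how the two interleaved staircases were initialized.

```python
class OneUpTwoDownStaircase:
    """Lower the level after two consecutive correct responses, raise it after any error."""

    def __init__(self, start_level, step_factor=1.25, max_level=1.0):
        self.level = start_level
        self.step_factor = step_factor    # multiplicative step (assumed, not from the text)
        self.max_level = max_level        # modulation depth cannot exceed 100%
        self._correct_streak = 0

    def update(self, correct):
        """Register one trial outcome and return the level for the next trial."""
        if correct:
            self._correct_streak += 1
            if self._correct_streak == 2:             # two in a row: make the task harder
                self.level /= self.step_factor
                self._correct_streak = 0
        else:                                         # any error: make the task easier
            self.level = min(self.level * self.step_factor, self.max_level)
            self._correct_streak = 0
        return self.level

staircase = OneUpTwoDownStaircase(start_level=0.2, step_factor=2.0)
levels = [staircase.update(c) for c in (True, True, False)]
```

A one-up, two-down rule concentrates trials near the 70.7%-correct point of the psychometric function, which places most measurements close to the 75%-correct level later read off the fitted function.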
Visual stimuli
The stimuli were sinusoidally modulated horizontal or vertical gratings presented within an annulus with inner radius 1.5° and outer radius 5° around the center of fixation (Fig. 3). The modulation spatial frequency was 1.5 cpd for all stimulus types.
![]() | (1) |
![]() | (2) |
, and AM is the modulation amplitude (peak contrast). The modulation amplitudes of the adapter and probe stimuli were 80 and 10%, corresponding to root-mean-square (RMS) contrasts of 57 and 7%, respectively.
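The stated RMS contrasts follow from the modulation amplitudes if RMS contrast is taken as the root-mean-square of a sinusoidal modulation over a whole number of cycles (an assumption about the definition used; the symbol $A_M$ is the modulation amplitude named in the text, and $C_{\mathrm{RMS}}$ is introduced here only for illustration):

$$
C_{\mathrm{RMS}} = \frac{A_M}{\sqrt{2}}, \qquad
\frac{0.80}{\sqrt{2}} \approx 0.566 \approx 57\%, \qquad
\frac{0.10}{\sqrt{2}} \approx 0.071 \approx 7\%.
$$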
In condition CM:CM, grating patterns (Fig. 3B) were generated by modulating the luminance contrast of a noise carrier pattern N(x,y)
![]() | (3) |
In condition OM:OM, two noise carriers were used that had the same spatial frequency and contrast as in condition CM:CM, but were oriented either horizontally (Nh) or vertically (Nv) with an orientation bandwidth of 30°. Second-order grating patterns (Fig. 3C) were generated by mixing the two oriented carriers and modulating the relative amount of each carrier in a sinusoidal fashion (Landy and Oruç 2002)
![]() | (4) |
In condition LM:OM, the (2nd-order) probe stimulus was identical to that of condition OM:OM, but the (1st-order) adapter stimulus (Fig. 3D) was generated by superimposing a luminance grating on an equal mix of horizontal and vertical noise carriers (identical to those used in condition OM:OM)
![]() | (5) |
The same stimuli were used both in the psychophysical and in the fMRI experiments, except that to determine detection thresholds, the modulation depth of the probe stimuli was varied over a sevenfold range spanning the detection thresholds as verified by pilot runs. Stimuli were presented at 800 × 600-pixel resolution on an electromagnetically shielded analog NEC2110 LCD display (for the fMRI experiments) or a Nokia 446 XPro CRT display (for psychophysics) using the Psychophysics Toolbox (Brainard 1997; Pelli 1997) and a 10-bit graphics card. Both displays had a refresh rate of 60 Hz. The displays were carefully calibrated to minimize potential first-order artifacts caused by nonlinearities in the display hardware.
Definition of visual area regions of interest
Nine regions of interest (ROIs) were defined based on retinotopy, and two additional ROIs (V5/MT+ and LOC) were defined by a combination of retinotopy and functional properties. Standard traveling wave methods for retinotopic mapping were used to identify boundaries between retinotopically organized visual areas (DeYoe et al. 1996; Engel et al. 1994; Sereno et al. 1995). Area boundaries were identified as the phase reversals (corresponding to representations of the horizontal and vertical visual field meridians) in a map of the polar angle representation of the visual field, measured by the phase of the fMRI response to a slowly (0.04 Hz) rotating wedge stimulus (45° wide) extending from the center of gaze to 6° eccentricity. Eccentricity was mapped in a similar fashion by measuring the phase of the response to a slowly expanding or contracting stimulus annulus (width 1.5°, corresponding to a duty cycle of 25%). The retinotopic mapping scans were carried out in separate scanning sessions for each subject.
We identified nine retinotopically organized areas in every subject (shown for the right hemisphere of subject S2 in Fig. 4A). We have reliably identified these nine cortical areas in a total of 12 subjects, including the 3 subjects participating in this study. Most of these areas (V1, V2, V3, V3A, V3B, V7, hV4, and VO1) have been previously described in the literature (DeYoe et al. 1996; Press et al. 2001; Sereno et al. 1995; Smith et al. 1998; Tootell et al. 1995, 1997, 1998a; Wade et al. 2002; Wandell et al. 2005). VO1 is a coarsely retinotopic area anterior and lateral to hV4 (Wandell et al. 2005), which may correspond to area TEO as described by Kastner et al. (2000). In addition, we have identified two retinotopic areas in the lateral occipital cortex between dorsal V3 and V5/MT+ not previously described in the literature, although one of these areas may overlap V3B as originally described (Smith et al. 1998). The detailed retinotopic organization and functional characteristics of these areas, which we have termed LO1 and LO2 (for lateral occipital areas 1 and 2), will be reported elsewhere. For the purposes of this study, we will only provide a brief description of their location and retinotopic organization, focusing instead on their response properties in the context of second-order pattern perception. We have provided the retinotopic maps of the three subjects as supplementary on-line material.1
The area that we have named LO1 is a complete map of the contralateral visual hemifield extending from the anterior boundary of dorsal V3 and sharing the central foveal confluence of areas V1, V2, and V3 (Fig. 4A). Visual field eccentricity in this map extended in the same direction as in dorsal V3, i.e., with peripheral locations represented anteriorly and dorsally. This map differed from V3B described by Smith et al. (1998) in two significant ways. First, the original description of V3B did not explicitly comment on the representation of eccentricity within the map, making exact comparison with later studies difficult. Second, Smith et al. (1998) found only a representation of the lower quadrant in this area, whereas we found a representation of the entire contralateral hemifield. Because this map did not match any previously described visual area, we have named this area LO1, for lateral occipital area 1. We have used the nomenclature proposed by Wandell et al. (2005), by which areas are named by gross anatomical location and a unique number.
In addition to LO1, we reliably identified in every subject a previously undescribed retinotopic area anterior to LO1 which we have named area LO2 (Fig. 4A). Like LO1, LO2 contained a full map of the contralateral visual hemifield and shared the central foveal representation of V1, V2, and V3. The polar angle representation in LO2 was the mirror image of that in LO1, with the upper vertical meridian (defining the boundary with LO1) represented caudally and the lower vertical meridian represented rostrally. Visual field eccentricity in LO2 was mapped parallel to that of LO1, with the visual field periphery mapped anteriorly and dorsally. LO2 was posterior to and did not overlap with functionally defined area V5/MT+. Based on retinotopic criteria, we suggest LO2 should be considered a separate visual area. Furthermore, responses in LO2 to images of intact objects and faces were larger than responses to scrambled images and faces, suggesting that LO2 formed part of the object-selective LOC. Our results also show that the response properties of LO2 with regard to first- and second-order patterns differed markedly from those of LO1 and more posterior visual areas.
We also ran a separate session for each subject to delineate ROIs comprising functionally defined area V5/MT+, based on its stronger response to random moving dots than to stationary dots (Huk et al. 2002; Tootell et al. 1995; Watson et al. 1993), as well as the LOC, based on its stronger response to images of intact objects and faces than to images of scrambled objects and faces (Grill-Spector et al. 1998b; Kourtzi and Kanwisher 2000, 2001; Lerner et al. 2002; Malach et al. 1995). These scans were run using a block design, alternating ten 12-s-long stimulus blocks (moving dots for V5/MT+, images of intact objects for LOC) with ten 12-s baseline blocks (stationary dots for V5/MT+, scrambled images of objects for LOC). Because both of these contrasts also activated parts of retinotopic areas, we restricted the V5/MT+ and LOC ROIs to exclude retinotopic visual areas. Most of the anterior part of the LOC ROI did not overlap with regions activated by the stimuli in the adaptation scans (as assessed with independent localizer scans), and so we further restricted this ROI to include only the posterior section (excluding retinotopic areas). We refer to this ROI as posterior LOC or pLOC in the following. Consistent with previous work (Huk et al. 2002), we found that functionally defined V5/MT+ could be subdivided into a posterior retinotopically organized part (putative human V5/MT) and an anterior nonretinotopic part (putative human MST), but for the purposes of this study, we did not consider these subdivisions separately.
Localizer scans
In each scanning session, before and after the series of adaptation scans, we measured responses to the probe stimuli presented alone (i.e., without the adapter stimuli), to independently identify the cortical regions responding to the stimulus patterns. The stimuli in the localizer scans were the same as the probe stimuli used in the adaptation scans for a given session. Thus the localizer scan stimuli in the LM:LM condition were the LM probe gratings, whereas in the LM:OM condition, the stimuli were the OM probe gratings. For these scans, we used a block design, alternating 9.6-s ON blocks of the intermediate-contrast probe stimuli, randomly changing in orientation and phase at 4 Hz, with 9.6-s OFF blocks of blank screen (uniform gray except for the fixation point). Subjects were instructed to maintain their gaze on a fixation marker throughout the scan. Each localizer scan consisted of 10 stimulus-blank alternations.
MRI acquisition
Experiments were carried out on a Siemens Allegra 3T scanner, equipped with a four-channel phased-array surface coil covering the back of the head (NM-011 transmit head coil and NMSC-021 receive coil, Nova Medical, Wakefield, MA). A custom-fitted bitebar was used to minimize subject head motion. Standard echoplanar imaging methods were used to measure the blood oxygenation level-dependent (BOLD) signal (Ogawa et al. 1990) in T2*-weighted images. Functional data in the adaptation scanning sessions were acquired using the following parameters: TR 1,200 ms; TE 30 ms; flip angle 75°; 64 × 64 matrix size; 19 slices oriented perpendicular to the calcarine sulcus; voxel size 3 × 3 × 3 mm. For the retinotopy, V5/MT+, and LOC sessions, we used the same imaging parameters with the following exceptions: 24 slices and 1,500-ms TR. At the beginning of each session, we also acquired an anatomical T1-weighted MPRAGE image that covered the same volume as the functional scans, but with twice the in-plane resolution (voxel size 1.5 × 1.5 × 3 mm). This image was used to compute the alignment between the functional volumes and the high-resolution anatomical image used to extract cortical surfaces, using an automated robust image registration method (Nestares and Heeger 2000). The alignment parameters obtained were used to project the visual area ROIs (defined in the high-resolution image space) into the image space of each functional scan. To visualize the fMRI responses from the localizer and retinotopy measurements, the statistical data were projected onto the flattened occipital cortex, but no quantitative analyses were performed on the flattened data. By analyzing our data in the native functional image space rather than aligning the data itself to a standard space, we minimized blurring that would have been introduced through interpolation.
fMRI data analysis
The time series data for each scan were corrected for motion within and between scans using MCFLIRT (Jenkinson et al. 2002). The estimated head movements were consistently <1 mm in any direction. We also manually inspected each time series to ensure there were no sudden movements or artifacts in the data.
Data from the localizer scans, the V5/MT+ scans, and the LOC scans were analyzed separately for every voxel by correlating the time series data with a sinusoid at the stimulus alternation frequency. The time series were first normalized by dividing by the mean (to compensate for variations in intensity with distance from the receiver coil) and detrended with a high-pass filter to remove low-frequency noise and drift (Biswal et al. 1995, 1997a,b; Purdon and Weisskoff 1998; Smith et al. 1999; Zarahn et al. 1997). For each voxel, we computed the correlation (technically coherence) between the best-fit sinusoid and the measured time series. This analysis also yielded a response phase and amplitude, allowing us to distinguish stimulus-correlated significant increases (activations) in the BOLD response from significant decreases (deactivations). Details of this analysis method have been published elsewhere (Backus et al. 2001; Huk and Heeger 2002; Neri et al. 2004; Zenger-Landolt and Heeger 2003).
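A minimal sketch of this kind of single-voxel analysis is given below. It is not the published implementation: the coherence is computed here as the Fourier amplitude at the stimulus frequency divided by the root sum of squared amplitudes over all nonzero frequencies (for a mean-removed series this is essentially the correlation with the best-fitting sinusoid), the high-pass detrending step is omitted, and the phase convention and the function name `coherence_analysis` are assumptions.

```python
import numpy as np

def coherence_analysis(timeseries, n_cycles):
    """Coherence, amplitude, and phase of a voxel time series at the stimulus frequency.

    `n_cycles` is the number of stimulus cycles per scan, i.e. the index of the
    stimulus frequency in the discrete Fourier transform.
    """
    ts = np.asarray(timeseries, dtype=float)
    ts = ts / ts.mean() - 1.0                  # divide by the mean, express as fractional change
    spectrum = np.fft.rfft(ts)
    amplitudes = np.abs(spectrum)
    amplitudes[0] = 0.0                        # ignore any residual DC component
    coherence = amplitudes[n_cycles] / np.sqrt(np.sum(amplitudes ** 2))
    amplitude = 2.0 * amplitudes[n_cycles] / ts.size
    # Phase delay relative to a sine wave that starts rising at stimulus onset.
    phase = (-np.angle(spectrum[n_cycles]) - np.pi / 2) % (2 * np.pi)
    return coherence, amplitude, phase

# Synthetic voxel: 10 cycles of 16 time-points, 1% modulation, 1 rad response delay.
t = np.arange(160)
voxel = 100.0 + np.sin(2 * np.pi * 10 * t / 160 - 1.0)
coh, amp, ph = coherence_analysis(voxel, n_cycles=10)
```

The synthetic example uses 160 time-points because a localizer scan of 10 alternations with 9.6-s ON and 9.6-s OFF blocks spans 10 cycles of 16 time-points at a TR of 1.2 s.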
Retinotopy data were analyzed using the same procedures. Because the stimuli moved progressively through the visual field, the measured response phases corresponded to (angular or radial) locations in the visual field (Fig. 4A) (DeYoe et al. 1996; Engel et al. 1994; Sereno et al. 1995). Retinotopic visual area ROIs were drawn on computationally flattened representations (flat maps) of the occipital cortex generated from high-resolution T1-weighted anatomy images using the public domain software SurfRelax (Larsson 2001).
Event-related data from the adaptation scans were analyzed for each visual area ROI as follows. First, separately for each adaptation session, the ROI was restricted to include only those voxels showing significant activation in the localizer scans carried out in the same session. Specifically, the ROIs were restricted to include voxels with a response coherence >0.2 and a response phase between 0 and π, corresponding to stimulus ON blocks (the phase range bracketed the variation of hemodynamic delays across voxels). This ensured that our ROIs only included voxels corresponding to visual field locations within the stimulus annulus and excluded voxels that did not show an increased response to visual stimuli (e.g., deactivations). The coherence threshold was chosen to obtain a consistent size of individual ROIs across conditions and repetitions, because the strength of the evoked fMRI response varied between scanning sessions (Aguirre et al. 1998). Although we used different visual stimuli for the localizer scans for different stimulus conditions, the different stimuli covered the exact same parts of the visual field (the stimulus annulus) and the spatial extent of evoked fMRI responses was very similar for different stimulus types. Furthermore, the exact coherence threshold used was not critical. We also analyzed our data with more conservative coherence thresholds (0.3 and 0.4), and the results were consistent with those obtained with a coherence threshold of 0.2.
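The ROI restriction described above amounts to a two-criterion mask over voxels. The sketch below is illustrative only: the function name and the voxel dictionary layout are hypothetical, and the upper phase bound is assumed to be π (half a stimulus cycle, i.e., the ON block).

```python
import math


def restrict_roi(voxels, coherence_threshold=0.2,
                 phase_min=0.0, phase_max=math.pi):
    """Return ids of voxels that pass both localizer criteria.

    `voxels` maps a (hypothetical) voxel id to a tuple
    (coherence, phase_in_radians) from the localizer scan.
    The default phase_max of pi is an assumption.
    """
    return [vid for vid, (coh, phase) in voxels.items()
            if coh > coherence_threshold and phase_min <= phase <= phase_max]
```

Rerunning the same function with `coherence_threshold=0.3` or `0.4` gives the more conservative ROIs mentioned in the text.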
For each voxel within the ROI (combined across left and right hemispheres), the raw unfiltered time-course of the fMRI response was normalized to percent signal change by dividing by the mean intensity across the scan. The normalized time-courses were averaged across voxels to yield a mean ROI time-course. Responses to individual trials were extracted from the mean ROI time-course by taking the 16 time-points (19.2 s) starting with the onset of each trial. The mean response to the adapter stimulus alone, computed by averaging the responses to the blank trials, was subtracted from each adaptation trial (orthogonal and parallel), and the resulting time-courses were adjusted to a zero baseline by subtracting the mean of the first four time-points (before the onset of the probe stimulus). From the time-courses obtained by this procedure, we extracted a response vector $R_i$ for each adaptation trial $i$ by taking the eight time-points (9.6 s) from the onset of the probe stimulus. We computed a mean response vector $\bar{R}$ by averaging the responses for all $N$ adaptation trials regardless of trial type (orthogonal and parallel)

$$\bar{R} = \frac{1}{N}\sum_{i=1}^{N} R_i \tag{6}$$
![]() | (7) |
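The trial-extraction steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: all function names are hypothetical, percent signal change is assumed to be 100·(x/mean − 1), and the probe is assumed to begin at the fifth time-point of each 16-point trial (so that the first four points serve as baseline).

```python
def percent_signal_change(ts):
    """Normalize a raw time-course by its mean (assumed form: 100*(x/mean - 1))."""
    m = sum(ts) / len(ts)
    return [100.0 * (x / m - 1.0) for x in ts]


def mean_across(rows):
    """Point-by-point mean of equal-length sequences (voxels or trials)."""
    return [sum(col) / len(col) for col in zip(*rows)]


def extract_trials(roi_timecourse, onsets, length=16):
    """Cut `length` time-points from the mean ROI time-course at each onset."""
    return [roi_timecourse[t:t + length] for t in onsets
            if t + length <= len(roi_timecourse)]


def response_vectors(adapt_trials, blank_trials, n_baseline=4, n_response=8):
    """Subtract the mean blank-trial response, zero the baseline, and keep
    the n_response points that follow the (assumed) probe onset."""
    blank_mean = mean_across(blank_trials)
    vectors = []
    for trial in adapt_trials:
        diff = [a - b for a, b in zip(trial, blank_mean)]
        baseline = sum(diff[:n_baseline]) / n_baseline
        diff = [d - baseline for d in diff]
        vectors.append(diff[n_baseline:n_baseline + n_response])
    return vectors
```

The mean response vector is then `mean_across(response_vectors(adapt_trials, blank_trials))`, pooling orthogonal and parallel trials.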
We also computed an adaptation index IA, quantifying how much the measured response changed after adaptation, relative to the overall response to the stimuli in each visual area. The index was calculated as
![]() | (8) |
For the group means shown in the figures below, error estimates for the mean of the three subjects were calculated as the square root of the summed squared SE for individual subjects, divided by the number of subjects.
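As a worked illustration of this error estimate, the group SE is the root of the summed squared individual SEs divided by the number of subjects. The function name and the example values below are hypothetical.

```python
import math


def group_standard_error(subject_ses):
    """SE of the group mean from per-subject SEs:
    sqrt(sum of squared SEs) / number of subjects."""
    n = len(subject_ses)
    return math.sqrt(sum(se ** 2 for se in subject_ses)) / n
```

For example, three subjects with SEs of 0.3, 0.4, and 0.0 give a group SE of 0.5 / 3 ≈ 0.167.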
| RESULTS |
|---|
For all stimulus conditions, the localizer stimuli elicited a continuous band of activity across retinotopic visual areas at the eccentricity of the stimulus. Figure 4B shows the responses with a coherence threshold of 0.2 (the threshold used to restrict the ROIs for analyzing the event-related data) from the localizer scan (CM probe stimuli vs. blank) in subject S3 in condition CM:CM, overlaid on a flat map of the right hemisphere. The boundaries of the visual area ROIs are superimposed. Regions where activity increased in response to the localizer stimulus (activations) are shown in shades of orange; regions whose activity decreased in response to the stimulus (deactivations) are shown in shades of green. The spatial distribution of evoked responses across visual areas was very similar for different stimulus conditions and subjects. Particularly for early visual areas, we found a band of decreased activity peripheral and foveal to the region of increased activity. Such deactivations are commonly observed in fMRI and may partially be caused by purely hemodynamic effects (blood stealing), although there is evidence that they reflect true decreases in neuronal activity (Shmuel et al. 2002). As noted previously, deactivated voxels were excluded in the analysis of event-related data by restricting the ROIs to contain only voxels showing increases in activity to the stimulus in the localizer scans.
In lateral and ventral occipital cortex, the activity elicited by the localizer stimulus was largely restricted to the cortical regions within the visual area ROIs defined above. Our ROI analysis thus covered most or all ventral stream cortical regions that responded significantly to the stimuli. Dorsally, the activity extended beyond V7 into posterior parietal cortex along the intraparietal sulci, where additional topographically organized areas have been reported (Schluppeck et al. 2005; Silver et al. 2005). Because the slice prescription we used only partially covered these regions, dorsal regions anterior to V7 were not included in our analysis.
Orientation-selective elevation of detection thresholds after adaptation
Psychophysical detection thresholds were higher for the parallel than for the orthogonal probe stimuli, indicating orientation-selective adaptation (Fig. 5). Detection thresholds increased after adaptation for all stimulus conditions, but the threshold elevation in the cross-modal adaptation condition (LM:OM) was significant in only one of the three subjects. For subject S2, the threshold elevation in condition OM:OM narrowly failed to reach statistical significance (P < 0.06). The shift in the psychometric functions was equivalent to a modulation contrast decrease after adaptation of 1% for LM:LM, 5–15% for CM:CM, and 15% for OM:OM.
Robust responses to the probe stimuli in both orthogonal and parallel trials were found in most visual areas in all conditions, although responses in V5/MT+, V7, and pLOC were weak and quite variable. Responses to the probe in early visual areas were generally larger than those in downstream areas; mean response amplitudes in V1 were on the order of 0.8%, whereas the response amplitudes in ventral higher-tier areas (hV4 and VO1) were 0.5% or less, and smaller still (0.2%) in dorsal and lateral extrastriate visual areas (V3A/B, LO1, LO2). Individual subject time-courses for the V1 ROI in the LM condition are plotted in Fig. 6, together with the mean time course of the three subjects. Although the response amplitudes and the shape of the elicited response differed substantially across subjects, the response to the orthogonal (nonadapted) stimulus was stronger than the response to the parallel (adapted) stimulus in all subjects. For the V1 ROI in the LM:LM condition, the response difference to parallel versus orthogonal probe stimuli was statistically significant in two of the three subjects. Unless stated otherwise, the significant effects reported below were statistically significant both when averaged across all subjects and for at least two of the three individual subjects.
ADAPTATION TO FIRST-ORDER STIMULI (CONDITION LM:LM). Most visual areas exhibited orientation-selective response adaptation to first-order (luminance) gratings: the responses to the orthogonal (unadapted) stimulus in the LM:LM condition were significantly stronger than the responses to the parallel (adapted) stimulus (Table 1). There was no significant response adaptation in areas V5/MT+, V7, LO2, and pLOC, and these areas also showed the weakest responses to the stimuli. Figure 7 shows the group-averaged time-courses in condition LM:LM for three lower-tier visual areas (V1, V2, and V3), two ventral higher-tier areas (hV4 and VO1), and a dorsal higher-tier area (LO1). The response amplitudes for all ROIs are summarized in Fig. 10A. Among the areas that exhibited significant adaptation, the response amplitudes were larger in early visual areas (V1, V2, and V3) than in higher-tier visual areas (hV4, VO1, V3A/B, and LO1). In contrast, the adaptation indices were much the same across these areas (Fig. 11A), indicating that the relative amount of response adaptation in this condition did not vary between visual areas.
|
|
|
|
0.1) in V1 but significantly larger (
0.3) in higher-tier areas, particularly VO1 (and in the CM:CM condition, also in areas V3A/B and LO1).
The same probe stimuli were used in the unimodal condition OM:OM and in the cross-modal condition LM:OM, and the pattern of response amplitudes was similar in the two conditions, with a relatively larger response in V1–V3 than in downstream extrastriate areas (Fig. 12A). Unlike condition OM:OM, however, there was no consistent evidence of orientation-selective adaptation in the cross-modal condition (Table 1; Fig. 12B): the response to the parallel stimulus did not differ significantly from the response to the orthogonal stimulus in any visual area except V7, where one of the subjects showed significant response enhancement to the adapted condition (opposite to the predicted effect of adaptation). There was greater variability within and between subjects in this condition than in the unimodal conditions, and the effects of adaptation were not consistent across subjects.
Because we found adaptation in multiple areas, a possible interpretation is that each of these areas contains neurons selective for first- or second-order stimulus orientation. However, a reduced response to the adapted stimulus in one area could also reflect a reduced input to the area caused by adaptation elsewhere. To interpret response reductions as evidence of selective adaptation of neurons within an area, it is necessary to show that the adaptation effect cannot be accounted for by a reduced input to the area. Failure to do so may lead to erroneous conclusions about the tuning and/or adaptation properties of the area. For example, single-unit recordings in macaque MT have suggested that direction-selective response adaptation in MT neurons is caused by adaptation of the V1 neurons projecting to MT, rather than reflecting properties of the MT neurons themselves (Kohn and Movshon 2004). A similar effect was recently shown in macaque V4 (Tolias et al. 2005). Given that V1 provides the major feed-forward input to extrastriate visual cortex, any response adaptation observed in this area will be propagated to downstream extrastriate visual areas. Therefore we would expect response adaptation in V1 to be associated with response adaptation in extrastriate visual areas, even if neurons in extrastriate areas do not themselves adapt to the stimuli (e.g., because they are not selective for stimulus orientation). Note that the adaptation effect in V1 could not have originated at an earlier, subcortical stage (i.e., in the L