Neuroimaging of Direction-Selective Mechanisms for Second-Order Motion

Shin'ya Nishida, Yuka Sasaki, Ikuya Murakami, Takeo Watanabe, Roger B. H. Tootell


Psychophysical findings have revealed a functional segregation of processing for 1st-order motion (movement of luminance modulation) and 2nd-order motion (e.g., movement of contrast modulation). However, the neural correlates of this psychophysical distinction remain controversial. To test for a corresponding anatomical segregation, we conducted a new functional magnetic resonance imaging (fMRI) study to localize direction-selective cortical mechanisms for 1st- and 2nd-order motion stimuli, by measuring direction-contingent response changes induced by motion adaptation, with deliberate control of attention. The 2nd-order motion stimulus generated direction-selective adaptation in a wide range of visual cortical areas, including areas V1, V2, V3, VP, V3A, V4v, and MT+. Moreover, the pattern of activity was similar to that obtained with 1st-order motion stimuli. Contrary to expectations from psychophysics, these results suggest that in the human visual cortex, the direction of 2nd-order motion is represented as early as V1. In addition, we found no obvious anatomical segregation in the neural substrates for 1st- and 2nd-order motion processing that can be resolved using standard fMRI.


Position shifts of luminance-defined patterns are the most effective stimuli to produce a visual motion percept. According to the standard models of visual motion processing (Adelson and Bergen 1985; van Santen and Sperling 1985; Watson and Ahumada 1985), initial motion sensors have a spatiotemporal receptive field that optimally detects local shifts in luminance distribution (luminance motion energy). However, several examples have also been reported in which motion is seen without luminance-defined movements. In diverse reports based either on human psychophysics (Derrington and Badcock 1985) or single units from a wide range of animals including monkeys (Albright 1992), cats (Zhou and Baker 1993), and larval zebrafish (Orger et al. 2000), it is reportedly possible to discriminate movements of higher-order features, such as spatial modulation of luminance contrast. This type of movement was named 2nd-order motion (Cavanagh and Mather 1989) as opposed to luminance-based 1st-order motion. Such 2nd-order motion is visible even in microbalanced motion stimuli, in which the luminance motion energies are locally balanced with regard to direction and speed; thus any mechanism based on luminance motion energy computation should fail to detect coherent motion (Chubb and Sperling 1988). This finding suggests that the visual system may separately process 1st-order motion and 2nd-order motion (Cavanagh and Mather 1989; Chubb and Sperling 1988).

Although it is theoretically possible that a single mechanism detects both 1st-order motion and some classes of 2nd-order motion (Johnston et al. 1992; Taub et al. 1997), most psychophysical and single-unit electrophysiological findings support the notion of functional segregation in visual motion processing (Baker 1999; Lu and Sperling 2001b; Nishida and Ashida 2000). Because 2nd-order motion analysis requires an extra stage for detecting 2nd-order features before extracting motion signals, it is often suggested that 2nd-order motion is processed at cortical areas higher than those for 1st-order motion (Smith et al. 1998; Wilson et al. 1992).

However, it remains controversial whether there is a specialized anatomical subdivision for 2nd-order motion processing, distinct from that for 1st-order motion. In single-unit electrophysiology, neurons sensitive to the direction of 2nd-order motions are found in V1 and MT of monkey (Albright 1992; Chaudhuri and Albright 1997; O'Keefe and Movshon 1998) and areas 17 and 18 of cat (Mareschal and Baker 1998; Zhou and Baker 1993, 1994), but typically these neurons also respond to 1st-order motion stimuli. In brain-damaged human subjects, some studies indicated a substantial overlap of the cortical areas involved in 1st- and 2nd-order motion processing (Greenlee and Smith 1997), whereas others described patients whose 1st- or 2nd-order motion processing appeared to be selectively impaired (Plant and Nakayama 1993; Vaina and Cowey 1996; Vaina et al. 1999).

This controversy has not yet been settled by functional imaging studies. Smith et al. (1998) reported that 2nd-order motion stimuli produced significantly greater responses than 1st-order motion in V3/VP, whereas this tendency was less evident or absent in other visual areas. They interpreted this to suggest that 1st-order motion sensitivity arises in V1, whereas 2nd-order motion is first represented explicitly in V3/VP. Using similar patterns for 1st- and 2nd-order motion stimuli, Dumoulin et al. (2002) found relative cortical specialization for 1st- or 2nd-order motion in both occipital and parietal lobes. On the other hand, Seiffert et al. (2003) did not find significant differences between the 2 types of motion in the fMRI (functional magnetic resonance imaging) responses in a wide range of visual areas including V3/VP (V1, V2, V3, VP, V4v, V3A, LO, and MT/MST). Seiffert et al. suggested that 2nd-order motion processing occurs as early as in V1.

One limitation of the past fMRI studies lies in the link between the measured MR activity and visual motion processing. Earlier studies compared 1st- and 2nd-order motion stimuli, in a standard fMRI approach (e.g., comparing moving vs. stationary stimuli). However, even when such stimuli produce greater MR activity to motion stimuli, this bias could be generated by motion-irrelevant features involved in the stimuli, if the appropriate controls were not included. One cannot overcome this limitation by simply testing for increments in MR activity for moving stimuli relative to static stimuli: such increments might just reflect a neuron's preference for rapid stimulus changes in general—not specifically for motion. Furthermore, 2nd-order motion stimuli inevitably activate 1st-order motion mechanisms. These mechanisms do not signal a coherent direction of motion, but presumably support the percept of local texture changes. In earlier studies, the increment in MR activity in early visual cortex for 2nd-order motion stimuli could simply reflect the activity of 1st-order motion mechanisms, totally irrelevant to the perception of 2nd-order motion.

In both psychophysics and electrophysiology, direction selectivity has been recognized as direct evidence for motion processing. Here we tried to reveal human cortical areas that show direction selectivity to 2nd-order motion, and compare them with the areas showing direction selectivity to 1st-order motion, including all the necessary control conditions.

To identify direction-selective cortical areas, we used a motion adaptation technique. Tootell et al. (1995b) and others (Culham et al. 1999; He et al. 1998; Huk et al. 2001) revealed direction-selective cortical activity for 1st-order motion in motion-selective visual areas such as MT+, by measuring fMRI responses during a subject's perception of static motion aftereffect (an illusory movement of stationary pattern after prolonged viewing of a motion stimulus). However, the same technique cannot be used for the current purpose, given that adaptation to 2nd-order motion does not generate this type of motion aftereffect (Derrington and Badcock 1985; Nishida and Sato 1992; Seiffert et al. 2003).

However, adaptation to 2nd-order motion can be direction-selective (Turano 1991). For instance, after prolonged viewing of a clockwise, luminance-modulated movement, the detection threshold of contrast modulation increases for the clockwise direction more than for the anticlockwise direction. Moreover, this effect is selective with regard to motion type. That is, adaptation to 2nd-order motion produces only a small direction-selective threshold elevation for detecting 1st-order motion (Nishida et al. 1997). On the other hand, 2nd-order motion can induce direction-selective aftereffects in 1st-order test patterns based on suprathreshold contrast—for example, the speed aftereffect (Ledgeway and Smith 1997) and flicker motion aftereffect (Ledgeway 1994; McCarthy 1993; Nishida and Sato 1995).

These facts suggest that direction-selective sensitivity is reduced at an initial stage of 2nd-order motion processing that is independent of 1st-order processing, as well as at later stages where 2 processing pathways are integrated (Nishida and Ashida 2000). The anatomical correlates of these processing stages should be identifiable by localizing the cortical areas in which adaptation to 2nd-order motion gives rise to direction-selective reduction, in the fMRI response.

Selective adaptation is an important and well-established technique in visual psychophysics. Recently, adaptation techniques have been used to reveal incisive information in brain imaging studies (Culham et al. 1999; Grill-Spector et al. 1999; Hadjikhani et al. 1998; He et al. 1998; Huk and Heeger 2002; Huk et al. 2001; Kourtzi and Kanwisher 2001; Sakai et al. 1995; Tolias et al. 2001; Tootell et al. 1995b, 1998b). With regard to 1st-order motion, both classical single-unit electrophysiological studies (Barlow and Hill 1963; Petersen et al. 1985; Van Wezel and Britten 2002; Vautin and Berkley 1977) and recent fMRI studies (Huk et al. 2001; Sasaki et al. 2002; Tolias et al. 2001) have revealed a postadaptation response reduction. This supports the technical feasibility of our attempt.

It has been suggested that the fMRI response observed during the perception of the motion aftereffect (Culham et al. 1999; He et al. 1998; Huk et al. 2001; Tootell et al. 1995b) might reflect not only the cortical activity itself, produced by the motion signals, but also the attentional enhancement evoked by the aftereffect (Huk et al. 2001). According to this claim, even if we find a change in MR activity consistent with the direction-selective adaptation, we cannot exclude the possibility that a correlated change in the attentional state contributes to the measurement. To avoid this possible artifact, we constantly directed the subjects' attention to the motion stimulus by having them perform a motion-related task throughout all scans. A staircase procedure kept the task difficulty centered on the threshold throughout a scan, to maximize and stabilize the level of subjects' attention (see methods).

In psychophysics, a popular index of direction-selective adaptation is a reduction in the response for the adapted direction relative to the response for the opposite direction (Pantle and Sekuler 1969; Sekuler and Ganz 1963). We applied this analysis to the blood oxygenation level-dependent (BOLD)-based fMRI responses. That is, the MR signals in response to a motion stimulus measured after adaptation to the same direction were compared with those measured after adaptation to the opposite direction. We made measurements for 1st-order and 2nd-order motion stimuli in separate scans using the identical procedure. To our surprise, the results revealed direction-selective adaptation for 2nd-order motion in a wide range of visual cortical areas, including V1. Moreover, the topography of motion-processing activity in the visual cortex was similar, whether it was produced by either 1st- or 2nd-order motion.

Preliminary results of this study were presented in abstract form at the annual meeting of the Vision Sciences Society held in May 2002 in Sarasota, Florida.



Six healthy subjects with normal or corrected-to-normal visual acuity viewed visual stimuli in the MR scanner. Subjects were scanned in 26 1.5- to 2-h sessions to achieve a total of 16,640 functional brain volumes (349,440 slices) in this study. All subjects gave informed written consent. This study was approved by Massachusetts General Hospital Human Studies Protocol 2000p-001155.

Visual stimuli

Visual stimuli were projected into the scanner bore using an LCD projector (Sharp NoteVision 6) onto a rear projection screen (Da-tex), viewed via an adjustable mirror angled about 45° to the subject's normal line of sight. Stimuli were driven by a computer (Apple Computer, PowerBook G3) using the Vision Shell environment (ML Micro).

The main experimental stimuli were moving radial gratings presented within a circle (diameter 25°) centered about a red fixation bull's-eye. Subjects were required to maintain fixation on the bull's-eye throughout each scan. The mean luminance of the display was 48 cd/m².

Three types of motion stimuli were used. One was a 2nd-order stimulus and the other 2 were 1st-order stimuli (Fig. 1). The 2nd-order stimulus was a moving contrast modulation of a static carrier (CMS). The carrier pattern was a binary random dot array (mean luminance contrast = 50%). Each dot was a 13 × 13 min-arc square. The contrast modulation was a sinusoidal grating of 6 cycles/360°, with a modulation amplitude of 100% (i.e., luminance contrast changed between 0 and 100%). It rotated at a speed of 0.66 rotations/s (3.95 Hz in terms of temporal frequency). At a frame rate of 75 Hz, the rotation angle of a single frame step was thus 3.16° (except for occasional transient increases in step size introduced for the attentional control task; see following text). The CMS is a microbalanced stimulus, which is theoretically invisible to luminance-based motion processors (Chubb and Sperling 1988). The luminance within each dot was kept uniform rather than being modulated by the sinusoid, because within-dot modulation introduces a 1st-order artifact (Smith and Ledgeway 1997).
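The relation between the quoted temporal frequency, rotation speed, and per-frame step can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming only the values stated above; the function and constant names are illustrative and do not come from the stimulus software.

```python
# Stimulus parameters quoted in the text.
CYCLES_PER_REV = 6        # contrast-modulation cycles per 360 deg
TEMPORAL_FREQ_HZ = 3.95   # temporal frequency of the modulation
FRAME_RATE_HZ = 75.0      # display frame rate


def rotation_speed(temporal_freq_hz, cycles_per_rev):
    """Rotations per second implied by the temporal frequency."""
    return temporal_freq_hz / cycles_per_rev


def step_angle_deg(rotations_per_s, frame_rate_hz):
    """Rotation angle advanced on each video frame, in degrees."""
    return rotations_per_s * 360.0 / frame_rate_hz


speed = rotation_speed(TEMPORAL_FREQ_HZ, CYCLES_PER_REV)
step = step_angle_deg(speed, FRAME_RATE_HZ)
print(f"{speed:.2f} rotations/s, {step:.2f} deg/frame")
```

Starting from the temporal frequency rather than from the rounded rotation speed reproduces both quoted values (0.66 rotations/s and 3.16° per frame).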

fig. 1.

Three types of motion stimuli used in experiments. A: 2nd-order stimulus. Moving contrast modulation of a static random dot carrier (CMS). B: 1st-order control stimulus. Moving luminance modulation with a static random dot pattern (LMS). C: high-contrast 1st-order stimulus. Moving luminance sine wave (LM0).

Otherwise similar 1st-order motion stimuli served as controls. One 1st-order stimulus was a radial sinusoidal luminance modulation, linearly added to a uniform static random dot array of 50% contrast (LMS). The modulation amplitude was 10%, which made the visibility of LMS nearly equivalent to that of the CMS. The modulation amplitude threshold of motion detection was about 1 log unit higher for CMS than for LMS. To check the effects of stimulus contrast, we also tested the effects of a 90% luminance modulation without random dots (LM0). Other aspects of these stimuli were equated with those of the CMS.

The gamma function of the stimulus projection system was carefully corrected to ensure that the local mean luminance of CMS was not altered by the depth of contrast modulation. Additionally, we ran several psychophysical tests of luminance artifacts (Cropper and Johnston 2001; Smith and Ledgeway 1997). First, we confirmed that a movie sequence in which a CMS grating and an LMS were alternately presented with a quarter-cycle phase shift did not appear to move in a coherent direction. This ensured that the luminance artifact contained by CMS was negligibly small at least at the modulation spatial frequency (Ledgeway and Smith 1994; Lu and Sperling 2001a). Second, as in many other 2nd-order motion stimuli (Derrington and Badcock 1985; Nishida and Sato 1992), we confirmed that prolonged adaptation to rotating CMS gratings did not produce static motion aftereffects in stationary CMS gratings, or in stationary low-contrast LM0 of various spatial frequencies, whereas adaptation to LMS gratings induced the expected visible aftereffect. Third, the contrast detection threshold of rotating LM0 gratings measured after adaptation to rotating CMS was only 11-13% higher for the adapted direction than for the opposite direction, when the test spatial frequency was either matched with the adaptation frequency or doubled or halved relative to the adaptation frequency. These threshold differences were close to the values previously reported (Nishida et al. 1997). Moreover, these differences were also significantly smaller than the difference obtained after adaptation to LMS (38%), and this value would be still larger after adaptation to LM0 of equivalent contrast. The results of these adaptation experiments ensured that the luminance artifact contained by CMS was negligibly small not only at the modulation spatial frequency, but also at the higher or lower spatial frequencies.


A single scan (512 s) consisted of 16 blocks, each lasting 32 s (Fig. 2). A single motion type was used throughout a scan, and the scan order was varied between subjects. During the first 2 blocks, the stimulus rotated in one direction (clockwise or anticlockwise). During the next 2 blocks, the stimulus rotated in the opposite direction. This pattern was repeated 4 times. Thus motion direction was reversed at the onset of odd-numbered blocks (which we therefore call “Changed direction” blocks), but was held identical to the previous block in even-numbered blocks (“Repeated direction” blocks). In the analysis of the time course, we removed slow temporal changes in the raw fMRI signal irrelevant to our stimulus manipulation (components slower than one cycle per 64 s, i.e., below 0.0156 Hz).
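The block structure can be written out explicitly. The sketch below is illustrative only: the direction labels "CW" and "ACW" and the function name are hypothetical, and the starting direction is a free parameter because it was either clockwise or anticlockwise.

```python
N_BLOCKS = 16
BLOCK_S = 32


def block_schedule(n_blocks=N_BLOCKS, first_direction="CW"):
    """Return (block_number, direction, label) for each block of a scan.

    Direction flips at the onset of every odd-numbered block, so each
    pair of blocks shares one direction. The block preceding block 1 is
    taken to move in the opposite direction, which makes block 1 a
    Changed direction block like the other odd-numbered blocks.
    """
    opposite = {"CW": "ACW", "ACW": "CW"}
    direction = opposite[first_direction]  # direction before block 1
    schedule = []
    for block in range(1, n_blocks + 1):
        if block % 2 == 1:
            direction = opposite[direction]
            label = "Changed"
        else:
            label = "Repeated"
        schedule.append((block, direction, label))
    return schedule


for entry in block_schedule()[:4]:
    print(entry)
```

With the defaults, blocks 1 and 2 share one direction and blocks 3 and 4 share the other, and the 16 blocks of 32 s add up to the 512-s scan.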

fig. 2.

Time sequence of one scan. A prescan block was followed by 16 blocks, each lasting 32 s. Motion direction (clockwise and anticlockwise, as illustrated by arrows) was reversed at onset of odd-numbered blocks (Changed direction blocks, indicated by shading), but held identical to previous block in even-numbered blocks (Repeated direction blocks). Inset: luminance contrast of stimulus was slowly ramped off and on at interblock transitions.

Because direction reversal releases adaptation of direction-selective mechanisms, the response was expected to be greater in the Changed direction blocks than in the Repeated direction blocks, with their difference indicating the magnitude of direction-selective adaptation. Note that the stimulus motion in each block was used both as an adaptation pattern and a test pattern. Note also that the MRI measurement started at the beginning of the first block; thus to make the adaptation state of the 1st block equivalent to those of other Changed direction blocks, we included a dummy prescan block in which the stimulus moved in the direction opposite to the 1st block. To exclude the possibility that nonspecific responses to direction reversals artifactually elevate the responses during the Changed direction blocks, the luminance contrast of the overall stimulus was slowly ramped on (linearly increased from zero to full contrast over 1.5 s) at the beginning of each block, and slowly ramped off (linearly decreased from full to zero contrast over 1.5 s) at the end. For CMS and LMS, the same random-dot array was used throughout a single scan, to exclude the possibility that a nonspecific change in the random-dot pattern elicited a response.

To control attention, the subject was required to detect a probe stimulus that was presented at unpredictable times. The probe was a transient increase in step size made by skipping a given number of motion frames in the sequence. It was presented every 1 s with a probability of 0.2, except for the periods of onset/offset ramping. Thus on average, 5.8 probes appeared in each 32-s block. When the subject pressed a response button within 1 s after the probe, the program counted the response as a “Hit.” The absence of response within 1 s was considered a “Miss,” and a response without a corresponding probe was a “False Alarm (FA).” No feedback was given to the subject. To keep task difficulty constant across scans, stimulus types, and subjects, and to maximize the attention load directed to the stimulus, the probe step size was adaptively changed to converge to the detection threshold of that condition. That is, a single “Hit” response reduced the probe step by one frame, whereas a single “Miss” or “FA” response increased it by the same magnitude. When no probe step was presented and the subject made no response (“Correct rejection”), the probe step was not changed.
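The staircase rule and the expected probe count can be sketched as follows. This is a minimal illustration, not the experimental software: the outcome labels, the function names, and the lower bound of one frame on the probe step are assumptions added here.

```python
BLOCK_S = 32             # block duration in seconds
RAMP_S = 1.5             # contrast ramp at each end of a block
PROBE_PROB_PER_S = 0.2   # probability of a probe in each 1-s interval


def expected_probes_per_block(block_s=BLOCK_S, ramp_s=RAMP_S,
                              p=PROBE_PROB_PER_S):
    """Mean probe count per block, excluding the two ramp periods."""
    return (block_s - 2 * ramp_s) * p


def update_probe_step(step_frames, outcome, min_step=1):
    """Apply the one-frame-down / one-frame-up rule to the probe step.

    outcome is one of "hit", "miss", "fa" (false alarm), or "cr"
    (correct rejection). min_step is an assumed floor, not a value
    taken from the text.
    """
    if outcome == "hit":
        step_frames -= 1
    elif outcome in ("miss", "fa"):
        step_frames += 1
    elif outcome != "cr":
        raise ValueError(f"unknown outcome: {outcome!r}")
    return max(min_step, step_frames)


print(round(expected_probes_per_block(), 1))
```

Excluding the 1.5-s ramps at both ends leaves 29 s per block, which at 0.2 probes per second gives the 5.8 probes quoted above.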

General imaging procedures

Experimental details of the imaging procedures were similar to those described elsewhere (Hadjikhani et al. 1998; Mendola et al. 1999; Somers et al. 1999; Tootell et al. 1997). Scans were acquired using a 3T Siemens Allegra. A custom-built, quadrature-based, semi-cylindrical surface coil was used to acquire high-sensitivity MR images including occipital, parietal, and posterior temporal lobes bilaterally. Voxels were 3.1 mm2 in-plane and 3 mm thick. Functional MR images were acquired using gradient echo sequences (TE = 30 ms). Each scan comprised 128 images in 21 contiguous slices, oriented approximately orthogonal to the calcarine sulcus. The TR was 4 s for the main fMRI experiments, and each scan took 512 s. Each subject was run for 10-20 scans (3-8 scans for each motion stimulus).
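The acquisition figures quoted in this section and in the description of the subjects are mutually consistent, as the following short check shows. The constant names are illustrative.

```python
VOLUMES_PER_SCAN = 128    # images per scan
TR_S = 4                  # repetition time of the main scans, seconds
SLICES_PER_VOLUME = 21    # contiguous slices per image
BLOCK_S = 32              # block duration, seconds
TOTAL_VOLUMES = 16_640    # functional volumes reported for the study

scan_duration_s = VOLUMES_PER_SCAN * TR_S          # 128 x 4 s
blocks_per_scan = scan_duration_s // BLOCK_S       # blocks in one scan
samples_per_block = BLOCK_S // TR_S                # time points per block
total_slices = TOTAL_VOLUMES * SLICES_PER_VOLUME   # slices in the study

print(scan_duration_s, blocks_per_scan, samples_per_block, total_slices)
```

Each 32-s block is therefore sampled at 8 time points, and the total of 16,640 volumes corresponds to the 349,440 slices reported.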

Defining regions of interest (visual areas)

In each subject, visual areas were defined in a separate scan session, and those areas were treated as regions of interest (ROIs) in the subsequent analysis. Retinotopic visual area borders were mapped using phase-encoded stimuli and field sign analysis (Dale and Buckner 1997; Hadjikhani et al. 1998; Sereno et al. 1995; Somers et al. 1999; Tootell et al. 1997, 1998a,b,c). This analysis identified visual areas V1, V2, V3/VP, V3A, V4v, V7, and V8 (DeYoe et al. 1994, 1996; Engel et al. 1997; Schneider et al. 1993; Sereno et al. 1995; Tootell et al. 1997, 1998a,b,c). Low-contrast moving and stationary concentric rings were also presented to localize area MT+, which was identified as a region that responded more strongly to moving low-contrast gratings than to stationary rings (Beauchamp et al. 1997; Dupont et al. 1994; Lueck et al. 1989; McCarthy et al. 1995; Tootell et al. 1995a,b). The TR was 4 s for retinotopic scans (duration: 8 min 32 s) and 2 s for the MT+ localization (duration: 4 min 16 s).

Flattening the visual cortex

In a separate session, structural images of the whole brain were obtained at high resolution (1.0 × 1.0 × 1.3 mm³, 1.5 T) to provide data used in each subject's 3-dimensional brain reconstruction (Dale et al. 1999; Fischl et al. 1999). This allowed us to generate an unfolded and flattened cortical surface for each subject (FreeSurfer) and to identify a retinotopic map on the cortex.


To analyze cortical mechanisms responsible for 1st- and 2nd-order motion perception, we localized the fMRI response produced by direction-selective adaptation. Figures 3, 4, and 5 (respectively) show the time courses of the BOLD responses for 2nd-order motion (CMS), 1st-order motion with an equivalent visibility (LMS), and 1st-order motion with a high contrast (LM0), in 7 visual areas. In each figure, the top row shows the averaged time course of the signal modulation around the mean during the complete cycle of 64 s. The first half corresponded to a Changed direction (odd-numbered) block and the second half corresponded to a Repeated direction (even-numbered) block (see Fig. 2).

fig. 3.

Time courses of blood oxygenation level-dependent (BOLD) response to 2nd-order motion (CMS), averaged over 240 block pairs of 6 subjects, in 7 visual areas. Top row: time course of signal change around mean during complete cycle of 64 s: first half corresponded to a Changed direction (odd-numbered) block, and latter half corresponded to a Repeated direction (even-numbered) block. On abscissa, zero indicates physical onset of stimulus, and vertical dotted line at 4 s indicates known hemodynamic delay. Thick data lines indicate mean value; thin lines indicate ±1 SE. Second row: 32-s time course of BOLD response averaged over all blocks, which indicates nondirectional component of BOLD response. Note magnification of time scale in comparison with top row. Bottom row: 32-s time course of difference between Changed and Repeated direction blocks, which indicates directional component of BOLD response. Circle (and arrow) indicates directional response significantly greater than zero (P < 0.05 by 1-tailed t-test), suggesting occurrence of direction-selective adaptation in that region of interest (ROI).

fig. 4.

Time courses of BOLD response to low-contrast 1st-order motion with random-dot noise (LMS) averaged over 192 block pairs of 6 subjects.

fig. 5.

Time courses of BOLD response to high-contrast 1st-order motion (LM0), averaged over 240 block pairs of 6 subjects.

We can deconstruct the 64-s time course into 2 orthogonal components that are expected to reflect different aspects of the stimulus. The middle row shows a 32-s time course of the BOLD response averaged over all blocks [i.e., (Changed + Repeated)/2]. This indicates a nondirectional component that mainly reflects the effects of the motion stimulus, regardless of the motion direction. On the other hand, the bottom row shows a 32-s time course of the difference between Changed and Repeated direction blocks (Changed - Repeated). This differential response reflects the effects of the direction change. If the BOLD response for the Repeated direction blocks was reduced by direction-selective adaptation (compared with the response for the Changed direction block), the direction-selective response should be positive. The following analysis is mainly concerned with this directional component.
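This decomposition can be illustrated with a short sketch. The function names are hypothetical and the sample values are invented for illustration only; they are not measured data. Each 32-s half of the cycle is represented by 8 samples, one per 4-s TR.

```python
def decompose(changed, repeated):
    """Split one 64-s cycle into nondirectional and directional parts.

    changed, repeated: equal-length lists of signal values for the
    Changed and Repeated halves of the cycle.
    """
    nondirectional = [(c + r) / 2 for c, r in zip(changed, repeated)]
    directional = [c - r for c, r in zip(changed, repeated)]
    return nondirectional, directional


def rebuild_cycle(nondirectional, directional):
    """Invert decompose(): return the 64-s course, Changed then Repeated."""
    changed = [n + d / 2 for n, d in zip(nondirectional, directional)]
    repeated = [n - d / 2 for n, d in zip(nondirectional, directional)]
    return changed + repeated


# Invented example values (percent signal change), not measured data.
changed = [0.10, 0.45, 0.60, 0.40, 0.25, 0.20, 0.15, 0.10]
repeated = [0.05, 0.30, 0.40, 0.35, 0.25, 0.20, 0.20, 0.10]

nondir, direc = decompose(changed, repeated)

# Over the full cycle the two parts are [N, N] and [D/2, -D/2];
# their inner product vanishes, which is the sense of "orthogonal".
part_n = nondir + nondir
part_d = [d / 2 for d in direc] + [-d / 2 for d in direc]
inner = sum(a * b for a, b in zip(part_n, part_d))
print(abs(inner) < 1e-12)
```

A positive entry in the directional list means the response in the Changed half exceeded that in the Repeated half at that time point, which is the signature expected from direction-selective adaptation.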

Directional response

For CMS (Fig. 3), we found that the directional response was significantly larger than zero at some time points (P < 0.05 by 1-tailed t-test) within each visual area tested, ranging from V1 to MT+. This suggests that much of visual cortex has direction selectivity for 2nd-order motion.

For LMS (Fig. 4) and LM0 (Fig. 5), the directional response was significantly larger than zero at some time points in all areas except V4v in the LMS condition. This suggests that much of visual cortex also has direction selectivity in response to 1st-order motion, in agreement with previous fMRI studies (Huk et al. 2001; Tolias et al. 2001).

Directional response: area specificity

To compare the effect of directional adaptation across areas and motion types, we averaged the directional response over the whole block (Fig. 6, top panels). For CMS, the averaged value was significantly positive in all areas, and one-way ANOVA with a repeated measures design indicated a significant effect of area (P < 0.0001; see the legend of Fig. 6 for detailed results of the statistical analysis). Post hoc analysis (Tukey HSD test, alpha = 0.05) indicated that the averaged directional response was larger in V3A than in V1, V2, V3, VP, and V4v, and larger in MT+ than in V3. The effect of area was also highly significant for LMS (P < 0.0001), with a significantly larger response in V3A and MT+ than in other areas. Two-way ANOVA revealed a significant interaction between motion-type (CMS vs. LMS) and area (P < 0.0001), suggesting some difference in the pattern of area specificity between the 2nd-order and 1st-order stimuli.

fig. 6.

Time-averaged directional response in each visual area obtained for CMS, LMS, and LM0. Top row: total response averaged over whole block period. Error bar indicates ±1 SE. Directional response was not significantly greater than 0 (P > 0.05 by 1-tailed t-test) in areas indicated by an underline. Results of one-way ANOVA (with a repeated-measures design) on effect of area were CMS: F(6,1434) = 6.224, P < 0.0001; LMS: F(6,1146) = 13.849, P < 0.0001; and LM0: F(6,1434) = 3.519, P = 0.0018. Arrow indicates area pair having a significant difference in post hoc analysis (Tukey HSD, alpha = 0.05). Results of 2-way ANOVA (area, motion type: CMS vs. LMS) were F(1,430) = 0.498, P > 0.1 for main effect of motion type, and F(6,2580) = 5.673, P < 0.0001 for interaction of area and motion type. Bottom row: initial response averaged over time points from 4 to 12 s. Results of one-way ANOVA were CMS: F(6,1434) = 2.088, P = 0.0518; LMS: F(6,1146) = 4.326, P = 0.0003, and LM0: F(6,1434) = 5.315, P < 0.0001. Results of 2-way ANOVA (area, motion type: CMS vs. LMS) were F(1,430) = 0.408, P > 0.1 for main effect of motion type, and F(6,2580) = 2.162, P = 0.0439 for interaction of area and motion type.

However, time averaging over the whole block could underestimate some directional adaptation effects. Although the initial peak response to the LMS indicated a clear directional adaptation effect in V1 and V2, the subsequent decrease rendered the time average not significantly different from zero. Because adaptation effects from the previous block were expected to be most evident in the initial period of a block (Culham et al. 1999; He et al. 1998; Huk et al. 2001; Tolias et al. 2001; Tootell et al. 1995b), we concentrated on the directional responses averaged over 4 to 12 s, during which marked directional responses were observed in many areas both for CMS and LMS (Fig. 6, bottom panels). With this analysis, directional response was evident even in early visual areas, for both motion types. One-way ANOVA indicated that the effect of area was marginally significant for CMS (P = 0.0518) and highly significant for LMS (P = 0.0003). Two-way ANOVA indicated a weak interaction of motion type and area (P = 0.0439); this suggests that the pattern of area specificity was marginally different between the 2 types of motion.
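The early-window average and the test against zero can be sketched as below. This is a minimal illustration under stated assumptions: sample i of a block is taken to fall at i × TR seconds after stimulus onset, both window edges are treated as inclusive, and the function names are hypothetical. Only the t statistic and its degrees of freedom are computed; converting them to a 1-tailed P value requires the t distribution, which is not implemented here.

```python
import math
import statistics

TR_S = 4  # sampling interval of the main scans, in seconds


def window_mean(timecourse, start_s=4, end_s=12, tr_s=TR_S):
    """Mean of the samples whose time lies in [start_s, end_s].

    Assumes sample i was acquired i * tr_s seconds after block onset.
    """
    samples = [v for i, v in enumerate(timecourse)
               if start_s <= i * tr_s <= end_s]
    return statistics.fmean(samples)


def one_sample_t(values):
    """Return (t, df) for testing whether the mean exceeds zero."""
    n = len(values)
    standard_error = statistics.stdev(values) / math.sqrt(n)
    return statistics.fmean(values) / standard_error, n - 1


# Each block pair would contribute one windowed directional response;
# the list of those values across pairs is then tested against zero.
t, df = one_sample_t([1.0, 2.0, 3.0])
print(round(t, 3), df)
```

In this scheme the number of values passed to `one_sample_t` would equal the number of block pairs, for example 240 for CMS.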

One trend especially distinguished the area specificity of LMS from that of CMS: a relatively strong directional response to LMS in MT+. However, this trend was not observed in the response to high-contrast 1st-order motion (LM0). The time-averaged response of this condition showed only a small variation across areas, irrespective of whether it was measured across a whole block or just the initial peak. The exception was the negative directional responses in MT+. This area had an unexpected time course, including a negative response in the initial period, and positive responses only in the latter half of the block. Because the peak value was as large as in other areas, we do not think that our results indicated a disappearance of direction-selective adaptation in MT+ for the high-contrast stimulus. Nevertheless, this result suggests that a small discrepancy in the area specificity between the CMS and LMS should not immediately be regarded as a general difference between 1st-order and 2nd-order motion stimuli.

Directional response: time course

The time course of the directional response showed a characteristic pattern for each stimulus (Figs. 3, 4, 5). For CMS, the response peaked about 8-12 s from stimulus onset, in most areas. Although there was a small dip in the middle, the directional response was almost always positive. For LMS there was also a positive peak at around 8-12 s in all areas, but some areas showed a negative trough in the latter half of the block. For LM0, the peak and trough became sharper and appeared earlier compared with LMS, and a second peak also appeared. Additionally, a directional response was obtained only at the location of the second peak in MT+. These time course differences undoubtedly reflect differences in the dynamics of adaptation between the 1st-order and 2nd-order motion, although it is difficult to interpret them further.

Effects of block location

In our block design, the 1st block of a scan was always a Changed direction block, and the last block was always a Repeated direction block. If the response to the initial block was stronger, that could have introduced a significant bias in our estimation of direction-selective effects.

To test this possibility, we recalculated the response time courses after omitting the 1st and last blocks and pairing the penultimate block with the 2nd block. In this case, the effect of block order (if any) should be reversed or disappear. The resultant data were nearly indistinguishable from those shown in Figs. 3, 4, 5, 6, with no systematic changes in the magnitude of directional responses: the difference in the total time-averaged response was only 0.0032% for CMS, 0.0027% for LMS, and 0.0030% for LM0. Thus it is unlikely that our estimation was strongly biased by the effect of block order.
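One reading of this control analysis is sketched below. It is an interpretation rather than a description of the actual analysis code: the inner pairs are assumed to keep their original Changed/Repeated pairing, and only the penultimate (Changed) block is re-paired with the 2nd (Repeated) block. The function names are hypothetical.

```python
def pairs_full(n_blocks=16):
    """(Changed, Repeated) block pairs used in the main analysis."""
    return [(b, b + 1) for b in range(1, n_blocks, 2)]


def pairs_trimmed(n_blocks=16):
    """Pairs after dropping the first and last blocks of a scan.

    Inner pairs are unchanged; the penultimate block (Changed) is
    paired with block 2 (Repeated), so in that pair the Repeated block
    precedes the Changed block in time.
    """
    inner = [(b, b + 1) for b in range(3, n_blocks - 1, 2)]
    return inner + [(n_blocks - 1, 2)]


print(pairs_full())
print(pairs_trimmed())
```

Under this reading the trimmed analysis uses 7 pairs per scan instead of 8, and neither the first nor the last block contributes.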

Nondirectional response

The time course of the nondirectional response (second row in Figs. 3, 4, 5) revealed an interesting difference between the 1st-order and 2nd-order motions. For CMS, it varied across different areas. There were mild stimulus-driven offset/onsets in V3, VP, V3A, and MT+, but not in V1, V2, and V4v. On the other hand, for LMS, there was a response increase at stimulus offset/onset and a response reduction at the middle of the block, with little difference in the time course across different areas. For LM0, there were sharper offset/onset responses in all the areas, again with little difference between them.

Our analysis assumed that directional and nondirectional factors modulated the fMRI signals additively and independently. Although this assumption might not be completely true, the clear differences in the shape of the time course between the directional and nondirectional responses for the CMS condition (correlation coefficient: r = 0.037) and the LMS condition (r = -0.177) suggest that they were in fact nearly independent. However, the correlation was considerably higher for the LM0 condition (r = 0.545), suggesting either an accidental similarity of the time courses of the 2 components for this specific stimulus, or the existence of a nonlinear interaction between the 2 components. This may be related to other anomalies found for this condition.
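As an illustration of the additivity assumption, the sketch below separates two block time courses into a directional and a nondirectional component and correlates them. The specific decomposition (difference and mean of the Changed and Repeated time courses) and the sample values are assumptions made for this example, not a restatement of the exact analysis.

```python
import math


def decompose(changed, repeated):
    """Assumed additive model: changed = N + D/2, repeated = N - D/2."""
    directional = [c - r for c, r in zip(changed, repeated)]
    nondirectional = [(c + r) / 2 for c, r in zip(changed, repeated)]
    return directional, nondirectional


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical time courses (% signal change), for illustration only.
n_true = [0.4, 0.1, -0.2, -0.2, 0.1, 0.4]   # V-shaped nondirectional part
d_true = [0.0, 0.2, 0.1, 0.0, -0.1, 0.0]    # directional part
changed = [n + d / 2 for n, d in zip(n_true, d_true)]
repeated = [n - d / 2 for n, d in zip(n_true, d_true)]

directional, nondirectional = decompose(changed, repeated)
r = pearson_r(directional, nondirectional)
```

A value of r near zero is what independent components would be expected to give. A large r, as in the LM0 condition, would point to either a coincidental similarity or an interaction that the additive model does not capture.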

Attentional task performance

To control attention, the subject was required to detect a stimulus jump. The difficulty of this task (i.e., the jump size) was adaptively changed using a staircase procedure. The averaged jump size, hit/miss/FA rates, and the reaction time (RT) for each stimulus condition are summarized in Table 1. The results suggest that the performance during the LM0 condition was easiest, and the LMS condition was hardest initially. However, because of the adaptive procedure, the actual behavioral responses made by subjects in the scanner were similar across different stimulus conditions.
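The way an adaptive procedure equates performance across conditions can be sketched as follows. The staircase rule is not specified in this section, so the simple 1-up/1-down rule, the step size, and the deterministic observer below are assumptions for illustration only.

```python
def run_staircase(detects, start, step, n_trials, floor=0.0):
    """Assumed 1-up/1-down staircase on jump size.

    `detects(size)` returns True when the (simulated) observer reports the jump.
    """
    size = start
    history = []
    for _ in range(n_trials):
        history.append(size)
        if detects(size):
            size = max(floor, size - step)  # hit: make the task harder
        else:
            size = size + step              # miss: make the task easier
    return history


# Hypothetical observer who detects any jump at or above a fixed threshold.
threshold = 0.35
history = run_staircase(lambda s: s >= threshold, start=1.0, step=0.1, n_trials=30)
print(history[-4:])
```

Under these assumptions the jump size settles into an oscillation around the observer's threshold. A condition with a higher threshold therefore ends up with a larger jump size but a similar hit rate.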

Table 1. Attentional task performance

It is known that motion adaptation improves the detection of small speed changes, relative to the adapted speed (Bex and Baker 1997; Clifford and Langley 1996; Huk et al. 2001). Therefore one might expect some difference in the task performance between the Changed and Repeated direction blocks. However, this difference was very small (the 2nd and 3rd rows of Table 1). This is presumably because different neural mechanisms contribute to detection of transient and large speed changes (present task) and to detection of sustained and small speed changes (tasks showing postadaptation sensitization). It should also be noted that postadaptation sensitization was not found for 2nd-order motion (Kristjansson 2001).


Using a motion adaptation paradigm, we found that movement of contrast modulation (CMS, 2nd-order motion) produced direction-dependent fMRI activity in a wide range of visual cortical areas, including V1, V2, V3, VP, V3A, V4v, and MT+, with strongest activity in V3A. The distribution of brain activity did not differ greatly from that produced by movement of 1st-order (luminance) modulation of matched visibility (LMS). These results suggest that in human visual cortex, the direction of 2nd-order motion is represented as early as V1. We did not find evidence for anatomical segregation in the neural substrates for 1st- and 2nd-order motion processing, at least at the scale of measurement used here.

Second-order directional response: artifacts?

It could be argued that we found similar directional responses for conditions CMS and LMS simply because “CMS” contained a significant amount of 1st-order artifact produced by imperfect calibration of the projection system. However, if this were true, interleaved presentation of CMS and LMS gratings with a quarter-cycle phase shift should have resulted in the perception of a coherent direction of motion, and this was not found. Contrary to the predictions from the 1st-order artifact hypothesis, adaptation to CMS induced neither a static motion aftereffect nor a clear direction-selective elevation of the contrast detection threshold of 1st-order motion. In addition, our fMRI data indicate a difference in the nondirectional response, which should also have been similar if the CMS response was elicited by a 1st-order artifact. All this evidence suggests that CMS was not contaminated with 1st-order artifact.

In these experiments, subjects were required to maintain attention on the motion stimulus, to discriminate near-threshold changes in motion speed, at frequent and randomized times during all motion conditions. Thus it is highly unlikely that the directional response we obtained reflects a change in the level of attention between the Changed and Repeated direction blocks. Furthermore, the magnitude of the direction-selective component for CMS did not differ greatly between V1 and higher-tier cortical areas; this is inconsistent with the evidence that attention effects tend to be larger in higher visual areas, compared with V1 (Gomez Gonzalez et al. 1994; Gratton 1997; Hillyard and Anllo-Vento 1998; Huk et al. 2001; Tootell et al. 1998a). Also consistent with the current results, Huk et al. (2001) found directional adaptation effects in all cortical areas they tested (V1 to MT+), using a 1st-order motion stimulus and an attention-control task.

Second-order motion processing

This is the 1st functional imaging study that tests the direction selectivity of human visual cortex to 2nd-order motion. The results do not support a prevailing hypothesis that 2nd-order motion is processed at cortical areas higher than those for 1st-order motion (Wilson et al. 1992). On the other hand, the results do agree with past electrophysiological studies that reported direction-selective neural response to 2nd-order motion in a wide range of visual cortex of cat and monkey, without clear anatomical segregation of 1st- and 2nd-order motion processing (Albright 1992; Chaudhuri and Albright 1997; Mareschal and Baker 1998; O'Keefe and Movshon 1998; Zhou and Baker 1993, 1994). The direction-selective activity we found in V1 is consistent with the intriguing possibility suggested recently (Demb et al. 2001) that the initial stage of 2nd-order processing (e.g., contrast rectification) occurs precortically, and cortical neurons combine the rectified outputs to generate 2nd-order direction selectivity. The results of the CMS condition suggested that the directional adaptation effect for 2nd-order motion was not particularly strong in area MT+. Whether MT+ is “the center of processing” for 2nd-order motion, as well as for 1st-order motion, remains an intriguing open question.

In general, our results support the conclusion of a recent fMRI study on the same topic (Seiffert et al. 2003), but not the conclusion of prior studies (Dumoulin et al. 2002; Smith et al. 1998). As discussed above, however, those prior studies did not test the direction-selective response. This is a significant issue in interpreting these data, given that the MRI response to 2nd-order motion potentially includes various components irrelevant to 2nd-order motion perception. In our data, the non-direction-selective responses presumably included some of these irrelevant components. Interestingly, for the 2nd-order stimulus, we found no correlation between the time courses of the nondirectional and directional responses. In addition, the nondirectional responses showed some differences between the 1st-order and 2nd-order motion. Unless direction-selective responses are selectively examined, one could come to inappropriate conclusions about 2nd-order motion processing and its relationship with 1st-order processing.

Psychophysical studies have shown that 2nd-order motion can be detected by motion sensors that are partly monocular, spatial-frequency selective, and specialized for a given type of 2nd-order motion stimulus. Thus presumably these 2nd-order sensors are located at low levels (Chubb and Sperling 1988; Lu and Sperling 1996; Nishida 1993). Psychophysically, the reduction in sensitivity of these sensors produced direction-selective adaptation to a 2nd-order stimulus (Nishida et al. 1997). The directional response to CMS we found in V1 may arise from adaptation of these low-level motion sensors. Adaptation effects in early stages presumably propagate through subsequent areas, where additional adaptation may also occur.

On the other hand, the perception of 2nd-order motion can also be mediated by attentive tracking of salient features (Cavanagh 1992; Lu and Sperling 2001b). Moreover, the present results do not exclude the possibility that the neural substrates for high-level 2nd-order motion perception are anatomically segregated from those for 1st-order motion processing, in more anterior (higher-tier) brain regions. In fact, a network of many cortical areas (including parietal and frontal regions) may well be involved in attentive tracking (Culham et al. 1998). Cortical specialization for 1st- or 2nd-order motion processing, suggested by fMRI (Dumoulin et al. 2002) and by brain-damaged patients (Plant and Nakayama 1993; Vaina and Cowey 1996; Vaina et al. 1999), may be related to (dys)function of the high-level 2nd-order motion process.

Johnston and colleagues (Benton 2002; Johnston and Clifford 1995; Johnston et al. 1992) postulated that a common gradient-type motion mechanism processes both 1st-order motion and some 2nd-order motion, including the moving contrast modulation of the static carrier used here. The present results are superficially consistent with this single-mechanism hypothesis. However, it remains controversial whether their proposal can also be reconciled with other lines of evidence supporting segregation of the mechanisms (Lu and Sperling 2001b), including differences in the effect of motion adaptation (Derrington and Badcock 1985; Nishida and Sato 1992; Nishida et al. 1997). Overall, the currently available evidence suggests that 1st-order motion processing and 2nd-order motion processing are functionally segregated (as indicated by the psychophysics), but anatomically mixed at a scale of about 3 mm (as indicated by the current fMRI).

Nondirectional response

The nondirectional responses typically had V-shaped time courses. This pattern was most evident in the response to LMS, but it was also apparent in the response of some visual areas to CMS or LM0. The initial response increase immediately following the block transitions was presumably a response to the “time-outs,” when the stimulus, including the random-dot field, disappeared transiently. The subsequent response reduction may be in part attributable to a well-known poststimulus “undershoot” in the MRI signal (Kwong et al. 1992), which is thought to result from a temporal mismatch between the cessation of the CBF (cerebral blood flow) increase and the slower recovery of CBV (cerebral blood volume) equilibrium associated with brain activity (Buxton et al. 1998; Hoge et al. 1999; Kruger et al. 1999; Mandeville et al. 1998, 1999a,b). Recovery from this undershoot might account for the response increase before the stimulus offset.

Within each 1st-order motion condition, the time course of the nondirectional response was similar across different visual areas, showing clear offset/onset activity even in early visual cortex. For 2nd-order motion, on the other hand, offset/onset activities were not evident in V1, V2, and V4v. The reason for this discrepancy is not obvious, but it might be related to a difference in the temporal characteristics of the early visual response to different spatial-frequency patterns (Kelly 1979). Strong low spatial-frequency luminance modulations in 1st-order stimuli may effectively evoke transient neural responses in early visual cortex. Alternatively, the effect of the onset of the random-dot field may differ between CMS and LMS. For many V1 neurons having small receptive fields, the most salient changes in the LMS sequence were changes in the contrast of random dots at the block boundary. In the CMS sequence, on the other hand, the movement of contrast modulation always produced a 4-Hz contrast oscillation. Strong neural activity evoked by such a dynamic contrast change might mask the response to a slow pattern disappearance at the block boundary.

Regardless of the underlying mechanism, the involvement of strong nondirectional responses suggests that in some cases the measured fMRI responses may reflect processing of the stimulus offset/onset more than the processing of stimulus motion. Even so, the adaptation paradigm enabled us to isolate the directional responses that reflected the processing of stimulus motion.

Direction-selective adaptation

MT+ (primarily) and V3A (secondarily) are considered the most motion sensitive areas in human occipital cortex (Tootell et al. 1995a,b, 1997; Watson et al. 1993). In the present study, the directional response was larger in MT+ and V3A when we used the low-contrast 1st-order motion (LMS). However, this tendency was weak for the 2nd-order motion stimulus (CMS) and totally absent for the high-contrast 1st-order grating (LM0). Why were our measures of (motion) direction selectivity not much higher in areas MT+ and V3A?

Several possibilities come to mind. First, it is conceivable that the attention task incorporated here “flattened” the typical pattern of (passive viewing) motion-selective responses across visual areas. This possibility seems unlikely, however, because previous neuroimaging data (Corbetta et al. 1990; Huk et al. 2001) reported a relative increase in activity in MT during attention to stimulus speed (as tested here), rather than the hypothetical decrease necessary to obtain the present results. Additionally, in preliminary tests here, we tested the effects of attention to the present stimuli (relative to passive viewing) and found no obvious effect of attention in MT+ or other areas.

It is also known that MT(+) responses saturate at very low contrasts (Sclar et al. 1990; Tootell et al. 1995a). This may partially explain why the direction-selective adaptation effect was paradoxically reduced in MT+ when we increased the luminance contrast of 1st-order motion. Perhaps response saturation suppresses the apparent magnitude of directional adaptation (a ceiling effect), by increasing nonlinear interactions between the directional and nondirectional components. However, this argument might be too simple, considering the effect of contrast normalization by a gain control process (Edwards et al. 1996; Heeger 1992; Simoncelli and Heeger 1998). Additionally, previous studies found a stronger directional response in MT+ even when using high-contrast 1st-order stimuli (Huk et al. 2001; Tolias et al. 2001).
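The ceiling-effect argument can be made concrete with a minimal sketch. It assumes a Naka-Rushton contrast response function and models adaptation as a reduction of effective contrast. The semisaturation contrast, the exponent, the adaptation gain, and the two stimulus contrasts are hypothetical, and the sketch deliberately ignores the contrast normalization mentioned above.

```python
def naka_rushton(contrast, r_max=1.0, c50=0.05, n=2.0):
    """Saturating contrast response function (parameters are assumptions)."""
    return r_max * contrast ** n / (contrast ** n + c50 ** n)


def adaptation_effect(contrast, gain=0.5):
    """Response to the nonadapted direction minus response to the adapted one.

    Adaptation is modeled, as an assumption, by scaling the effective
    contrast of the adapted direction by `gain`.
    """
    return naka_rushton(contrast) - naka_rushton(gain * contrast)


low_contrast_effect = adaptation_effect(0.1)   # hypothetical low-contrast stimulus
high_contrast_effect = adaptation_effect(1.0)  # hypothetical high-contrast stimulus
print(low_contrast_effect, high_contrast_effect)
```

Under these assumptions the same change in gain produces a much smaller response difference at high contrast, because both the adapted and nonadapted responses sit near the ceiling of the function.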

Another possibility is that direction reversals (as tested here) may not produce the same topography of activity, compared with previous tests of “motion” sensitivity, which have been based on quite different stimulus comparisons. Almost all previous fMRI tests of “motion selectivity,” including MT+ “localizers” (Beauchamp et al. 1997; Dupont et al. 1994; Lueck et al. 1989; McCarthy et al. 1995; Tootell et al. 1995a) and the original fMRI tests of static motion aftereffects (Culham et al. 1999; He et al. 1998; Huk et al. 2001; Tootell et al. 1995b) were tested with moving-versus-stationary stimuli, not using direction reversals as used here and elsewhere (Huk et al. 2001; Tolias et al. 2001). For instance, static motion aftereffects did not produce significant aftereffect-related fMRI responses in V1 (Tootell et al. 1995b), although direction-selective adaptation did (Huk et al. 2001; Tolias et al. 2001). A possible account of this apparent disagreement is given by a 2-stage model of the motion aftereffect generation (see footnote 2). An alternative hypothesis is that the fMRI responses in higher-tier areas during the perception of static motion aftereffect are mainly evoked by attention (Huk et al. 2001).
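The predictions of the 2-stage model described in footnote 2 can be checked with a small numerical sketch. The drive values, the adapted sensitivity, and the half-rectified opponent stage are assumptions chosen for illustration. The footnote specifies only the qualitative structure of the model.

```python
def low_level(drive, sensitivity):
    """Responses of the two low-level units (adapted, opposite direction)."""
    return tuple(s * d for s, d in zip(sensitivity, drive))


def high_level(low):
    """Opponent units: half-rectified difference between the low-level pair."""
    a, b = low
    return (max(0.0, a - b), max(0.0, b - a))


# Hypothetical stimulus drives to (adapted-direction unit, opposite unit).
ADAPTED_MOTION = (1.0, 0.2)
OPPOSITE_MOTION = (0.2, 1.0)
STATIONARY = (0.3, 0.3)

UNADAPTED = (1.0, 1.0)  # sensitivities before adaptation
ADAPTED = (0.5, 1.0)    # adapted-direction unit desensitized (assumed value)

# Direction-selective adaptation paradigm: adapted vs. nonadapted motion.
low_same = sum(low_level(ADAPTED_MOTION, ADAPTED))
low_opp = sum(low_level(OPPOSITE_MOTION, ADAPTED))
high_same = sum(high_level(low_level(ADAPTED_MOTION, ADAPTED)))
high_opp = sum(high_level(low_level(OPPOSITE_MOTION, ADAPTED)))

# Motion aftereffect paradigm: stationary test before vs. after adaptation.
low_before = sum(low_level(STATIONARY, UNADAPTED))
low_after = sum(low_level(STATIONARY, ADAPTED))
high_before = sum(high_level(low_level(STATIONARY, UNADAPTED)))
high_after = sum(high_level(low_level(STATIONARY, ADAPTED)))
```

With these values the direction-selective adaptation paradigm yields a smaller response to the adapted direction at both stages, whereas the stationary test yields a decrease at the low level and an increase at the high level, matching the qualitative predictions in the footnote.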

However, even using direction-selective adaptation paradigms, previous studies reported stronger direction selectivity in MT(+) than suggested by our data. Some of the apparent discrepancy may be attributable to differences in the way in which the direction index was computed for each ROI. Tolias et al. (2001) examined the response rebounds evoked at abrupt direction reversals in fMRI signals in anesthetized macaques and computed the ratio of the magnitude of rebound response relative to the initial motion response (response increment at motion onset from the blank period). When the absolute magnitude of the rebound response is compared, there is only a small difference between areas. The apparently large direction index for MT (and for V4) is mainly a consequence of its small initial response. The idea behind such normalization was to take into account a difference in the responsivity to the stimulus of each area. We avoided this approach, given that one main concern in past fMRI studies of 2nd-order motion was that much of the neural activation evoked by the stimulus might be totally irrelevant to motion perception.
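A small numerical sketch shows how this kind of normalization can shape the apparent area specificity. The area names and response values are hypothetical and are not taken from Tolias et al. (2001). The index is simply the ratio described above.

```python
def direction_index(rebound, initial):
    """Rebound response at direction reversal relative to the initial motion response."""
    return rebound / initial


# Hypothetical responses (% signal change) for two illustrative areas.
areas = {
    "area_A": {"rebound": 0.50, "initial": 2.0},  # strong initial response
    "area_B": {"rebound": 0.45, "initial": 0.6},  # weak initial response
}

indices = {name: direction_index(**resp) for name, resp in areas.items()}
print(indices)
```

Here the two areas have nearly equal absolute rebound responses, yet the index for the area with the weaker initial response is three times larger. The choice between absolute and normalized measures can therefore change which area appears most direction selective.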

Huk et al. (2001) also normalized the adaptation effect by stimulus responsiveness in each area. However, in their case, this did not greatly affect the area specificity. In the experiment where they found the strongest direction-selective adaptation in human MT+ (“Adapted versus Mixed Direction Experiment”), they compared the fMRI response during blocks of trials in which the stimulus repeatedly moved in a single direction (“Adapted”) with blocks in which the stimulus direction varied among 6 directions from trial to trial (“Mixed”). The direction-selective adaptation effect, indicated by the BOLD signal change (Mixed - Adapted), was much larger in MT+ (and V3A) than in other regions. A possible factor is their use of multiple directions in the nonadapted (mixed) blocks. In fact, in another experiment (“Adapted versus opposite direction”), where they compared the fMRI response for gratings moving in the adapted and the opposite directions, they found that the direction-selective signal change was much smaller than in the mixed-direction experiment and, more important, that the signal change was not smaller in V1 than in MT+ for 2 of 3 subjects (A. C. Huk, personal communication). There are a few possible reasons why mixed-direction experiments can give stronger adaptation effects than 2-direction experiments: for example, direction-selective adaptation may be reduced in 2-direction experiments because of long-term storage of the adaptation effect from previous conditions, and/or exposure to multiple directions may be effective in releasing stored adaptation effects. However, the reason that the mixed-direction paradigm can give stronger MT+ activities is not obvious.

Although selective adaptation is potentially a powerful technique for future functional imaging studies, it is not known exactly how postadaptation changes in MR signals are related to postadaptation changes in psychophysical sensitivity or perception, nor how they are related to postadaptation changes in the actual neural responses. In fact, there are very few single-unit studies analyzing the effects of stimulus adaptation, to which the human psychophysics and fMRI could be compared. Moreover, there are also species differences (between humans and macaques) to consider when attempting such comparisons. For effective and appropriate use of the adaptation technique, it will become increasingly necessary to establish these missing links.


We thank Drs. T. Hirahara, N. Sugamura, and Y. Tohkura of Nippon Telegraph and Telephone Corporation for support.


  • 1 Averaging the response over the interval of 4 to 12 s resulted in an underestimation of the initial response for the LM0 condition because there was a dip at around 12 s for many visual areas. When the average was taken over the interval from 4 to 8 s, the directional response was significantly positive for all the areas except MT+.

  • 2 The model consists of “low-level” units, each responding to a given direction of motion, and “high-level” units, each responding to an activity difference between a pair of low-level units tuned to opposite directions (motion opponency). Suppose that motion adaptation reduces the sensitivity of low-level units tuned to the adapted direction. The direction-selective adaptation paradigm compares the postadaptation response for the adapted motion with that for the nonadapted motion. The 2-stage model predicts a response reduction for the adapted direction in both the low-level and high-level units. The model, however, predicts that the motion aftereffect paradigm, which compares the responses to a stationary pattern before and after motion adaptation, can yield a response decrease in the low level and an increase in the high level. Before adaptation, the stationary pattern equally activates the low-level units for both directions; thus high-level units remain almost silent. After adaptation, desensitization of the low-level units tuned to the adapted direction will reduce the population activity of low-level units. The response reduction for a stationary test stimulus in early visual areas was actually observed in the case of dynamic noise adaptation (Sasaki et al. 2002). However, the activity imbalance of the low-level units will yield a postadaptation response increase of the high-level units tuned to the nonadapted direction. This may account for the increase in the fMRI signals during the perception of motion aftereffect (Tootell et al. 1995b). Here we simply assume that the low- and high-level units correspond to neurons in lower-tier areas such as V1 and those in higher-tier areas such as MT+, respectively. This scheme is supported by psychophysical properties of static motion aftereffect (Cameron et al. 1992; Molden 1980; Over et al. 1973; Pantle 1974; Wohlgemuth 1911), suggesting that the adapted mechanisms are low-level motion detectors; and by electrophysiological and fMRI studies (Heeger et al. 1999; Qian and Andersen 1994, 1995), suggesting that some form of motion opponency exists at or before MT(+). Currently available single-unit data (e.g., Petersen et al. 1985; van Wezel and Britten 2002) do not exclude a possibility that the responses of MT neurons to stationary stimuli are increased after motion adaptation.

  • The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

