J Neurophysiol 94: 4373-4386, 2005. First published August 17, 2005; doi:10.1152/jn.00690.2005
0022-3077/05 $8.00

Implied Motion From Form in the Human Visual Cortex

Bart Krekelberg1, Argiro Vatakis2 and Zoe Kourtzi3,4

1The Salk Institute for Biological Studies, La Jolla, California; 2Department of Experimental Psychology, University of Oxford, Oxford, and 3School of Psychology, Birmingham University, Edgbaston, Birmingham, United Kingdom; and 4Max-Planck Institute for Biological Cybernetics, Tübingen, Germany

Submitted 1 July 2005; accepted in final form 2 August 2005


    ABSTRACT
 TOP
 ABSTRACT
 INTRODUCTION
 METHODS
 RESULTS
 DISCUSSION
 APPENDIX
 GRANTS
 ACKNOWLEDGMENTS
 REFERENCES
 
When cartoonists use speed lines—also called motion streaks—to suggest the speed of a stationary object, they use form to imply motion. The goal of this study was to investigate the mechanisms that mediate the percept of implied motion in the human visual cortex. In an adaptation functional imaging paradigm we presented Glass patterns that, just like speed lines, imply motion but do not on average contain coherent motion energy. We found selective adaptation to these patterns in the human motion complex, the lateral occipital complex (LOC), and earlier visual areas. Glass patterns contain both local orientation features and global structure. To disentangle these aspects we performed a control experiment using Glass patterns with minimal local orientation differences but large global structure differences. This experiment showed that selectivity for Glass patterns arises in part in areas beyond V1 and V2. Interestingly, the selective adaptation transferred from implied motion stimuli to similar real motion patterns in dorsal but not ventral areas. This suggests that the same subpopulations of cells in dorsal areas that are selective for implied motion are also selective for real motion. In other words, these cells are invariant with respect to the cue (implied or real) that generates the motion. We conclude that the human motion complex responds to Glass patterns as if they contain coherent motion. This, presumably, is the reason why these patterns appear to move coherently. The LOC, however, has different cells that respond to the structure of real motion patterns versus implied motion patterns. Such a differential response may allow ventral areas to further analyze the structure of global patterns.


    INTRODUCTION
 
Humans perceive motion in some images that contain no real motion. A striking example of this observation is illustrated by the speed lines (also called motion streaks) used in cartoons to suggest the speed of a stationary object (Burr 2000; Burr and Ross 2002; Geisler 1999). This perceptual phenomenon is known as implied motion (form implies motion) and shows an interaction between motion and form processing, two aspects of visual analysis often assumed to be conducted by largely independent pathways (Mishkin et al. 1983).

Some implied motion cues operate at a cognitive level (such as an athlete about to throw a javelin), and thus form and motion interactions could take place at a cognitive level. Recent neurophysiological (Jellema and Perrett 2003) and human functional magnetic resonance imaging (fMRI) studies, however, suggest that at least the end result of this interaction is represented in the prototypical motion area of the human brain (hMT+/V5). This area is more active when presented with real-life images that imply motion than when similar images are shown that do not imply motion (Kourtzi and Kanwisher 2000a; Senior et al. 2000).

Other implied motion cues operate at a lower level. In a dynamic Glass pattern sequence (Glass 1969; Ross et al. 2000), oriented elements are aligned along a common trajectory. This alignment generates a global structure in these patterns (Fig. 1) that evokes a percept of coherent motion (Ross et al. 2000). For instance, by orienting the elements along concentric trajectories, a percept of rotational motion is evoked. The direction of this rotation is ambiguously clockwise or counterclockwise, but the fact that a coherent rotation is seen at all is surprising because, on average, the motion energy in these patterns is perfectly balanced in all directions (see APPENDIX and DISCUSSION). Krekelberg et al. (2003) found a neural correlate of the motion percept in the superior temporal sulcus of the macaque, where a subpopulation of motion selective cells responds to Glass sequences as if they contain real coherent motion.



FIG. 1. Stimuli and design. A: Glass patterns are created by placing pairs of dots aligned along a common pathway at random positions on the screen (Glass 1969). Examples show a common concentric pathway (concentric Glass) and a common radial pathway (radial Glass). Open and gray circle pairs highlight 2 examples of elements that are orthogonal to each other in the 2 types of patterns. In the experiment all dots were white on a black background. In a Glass sequence, randomly chosen patterns of the same type (radial or concentric) are presented in rapid succession. B: temporal design of the trials. In each trial, 2 sequences with a duration of 300 ms and an interpattern interval of 100 ms were shown. In the "same type" conditions, 2 sequences of the same type were shown: concentric–concentric or radial–radial. In the "different type" conditions, 2 sequences of different types were shown: in these examples, a radial sequence followed by a concentric sequence or a concentric sequence followed by a radial sequence. After the visual stimuli had been extinguished, the fixation point remained on for another 2.3 s, after which the next trial started.

 
The goal of our study was threefold. First, we sought to determine which parts of human visual cortex show selective responses to the implied motion represented by Glass patterns. Second, we wanted to determine to what extent the subpopulations of cells that are selective for implied motion overlap with those selective for qualitatively similar real motion patterns. Third, we were interested in investigating to what extent the response to the global structure of the Glass patterns and its concomitant percept of motion was determined by local orientation detectors, possibly already at the level of V1, and how much of the response is a result of the global organization of the oriented elements.

Our fMRI data show that the human motion complex indeed contains subpopulations of cells that are selective for both implied and real motion. Selectivity for global patterns of implied motion and global patterns of real motion was also observed in ventral areas [VP, V4, and lateral occipital complex (LOC)]. In contrast with the dorsal areas, however, the subpopulations selective for the structure of Glass sequences did not show significant fMRI selectivity for the structure of real motion patterns. Finally, only a small part of the implied motion selectivity could be explained on the basis of selectivity for local orientation changes in primary visual cortex (V1, V2).

These findings provide insight into the representation of form and motion in the human visual cortex. Dorsal areas have a high degree of cue invariance (Albright 1992); in these areas real and implied motion patterns drive similar subpopulations of neurons. The ventral areas, which presumably respond to the global structure rather than the motion in these sequences, do not show this cue invariance. This allows ventral areas to discriminate between global structure generated by motion cues and that same structure generated by form cues. Dorsal areas, on the other hand, do not make this distinction and extract only the (implied or real) motion information.


    METHODS
 
We used an event-related adaptation fMRI paradigm. This section first describes our subjects, then the stimuli we used and how we adapted the stimuli for each subject. The last three sections are concerned with the logic and design of the experiments and how we obtained and analyzed the fMRI data.

Observers

Eleven observers participated in expt 1 (implied motion), 14 in expt 2 (real motion), 16 in expt 3 (implied and real motion interactions), and 13 in expt 4 (local vs. global implied motion). Three observers were excluded from the analysis in expt 2, three in expt 3, and two in expt 4 because of excessive head movement. All observers had normal or corrected-to-normal vision, were paid for participation, and gave their informed consent. All procedures were in accordance with international standards for research involving human subjects (Declaration of Helsinki).

Apparatus

An LCD projector (NEC GT950) displayed the visual stimuli on a tangent screen that the subjects viewed through a mirror. The refresh rate of the projector was set to 60 Hz for all stimuli. The visible screen subtended 21° of visual angle. A custom-made fiber-optics button box allowed the subjects to communicate their decisions in the perceptual tasks.

Stimuli

LOCALIZERS. For the LOC localizer scans we used grayscale images of novel and familiar objects as well as scrambled versions of each set, as described previously (Kourtzi and Kanwisher 2001). We localized hMT+/V5 by using a moving-dot pattern (expanding and contracting for 9 s at a speed of 4°/s within a 21° aperture, reversal rate 1 Hz) and a stationary random-dot field (Huk and Heeger 2002; Watson et al. 1993). Area KO was localized by using kinetic boundaries and transparent motion stimuli (Dupont et al. 1997) that consisted of a field of random black (50%) and white (50%) dots (size: 3.1 arcmin; speed: 4.44°/s). To map the borders and the eccentricity of the retinotopic visual areas, we used rotating triangular wedge stimuli and concentric rings. These stimuli consisted of either gray-level natural images or black and white objects-from-texture images that were presented at a temporal frequency of 2 Hz as described in previous studies (Kourtzi et al. 2003).

RANDOM-DOT PATTERNS. The random-dot patterns used in the main experiments were created with in-house OpenGL software. Each pattern consisted of 200 dots. A single dot subtended 0.24°, and the whole stimulus pattern was contained within an 18° circular aperture around the central fixation point.

IMPLIED MOTION. To generate implied motion, we used sequences of Glass patterns. In each Glass pattern, the dots were arranged in pairs and all pairs in a given pattern were aligned either along concentric circles around the fixation point or along radial lines emanating from the fixation point (Fig. 1A). In a sequence of Glass patterns, a new pattern of the same type, with a new set of randomly positioned pairs, was presented every 83 ms, which was close to the optimal range to generate the impression of motion (Ross et al. 2000). The duration of a single sequence was 300 ms (see Procedure, below). The distance between the dots in a pair, called the Glass shift, was set per subject to maximize their impression of motion (Ross et al. 2000) (see Procedure). When the pairs were aligned along concentric circles, they gave an impression of clockwise or counterclockwise rotational motion. They will be referred to as concentric Glass sequences. Orienting the pairs along radial lines gave an impression of expanding or contracting motion; these will be referred to as radial Glass sequences. As the APPENDIX shows, the average motion energy in these sequences is balanced. For each motion energy component in one direction, there is on average an equal component with energy in the opposite direction. Thus the distribution of motion energy in these sequences does not predict the coherent motion percept. In our terminology, stimuli with balanced motion have no coherent motion energy. In contrast, stimuli in which the motion energy does have a clear peak in some direction are referred to as real motion stimuli.
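As an illustration of the construction described above, the following Python sketch places dot pairs either along radial lines or along the tangents of concentric circles. It is not the in-house OpenGL software used in the study. The function name `glass_pattern`, the default Glass shift of 0.25° (the study set the shift per subject), the split of 200 dots into 100 pairs, and the reading of the 18° aperture as a diameter are all assumptions made for this example.

```python
import math
import random


def glass_pattern(kind, n_pairs=100, shift=0.25, radius=9.0, rng=None):
    """Return a list of dot pairs ((x1, y1), (x2, y2)) in degrees of visual angle.

    kind    -- "concentric" or "radial"
    n_pairs -- assumed: 200 dots arranged as 100 pairs
    shift   -- hypothetical Glass shift (distance between the dots of a pair)
    radius  -- assumed: half of the 18 deg aperture
    """
    if kind not in ("concentric", "radial"):
        raise ValueError("kind must be 'concentric' or 'radial'")
    if rng is None:
        rng = random.Random()
    pairs = []
    while len(pairs) < n_pairs:
        # Rejection-sample the pair centre uniformly inside the circular aperture.
        x = rng.uniform(-radius, radius)
        y = rng.uniform(-radius, radius)
        r = math.hypot(x, y)
        if r == 0.0 or r > radius:
            continue
        # Unit vector pointing away from the fixation point at this location.
        ux, uy = x / r, y / r
        if kind == "radial":
            dx, dy = ux, uy      # pair lies along the radial line
        else:
            dx, dy = -uy, ux     # pair lies along the tangent of the circle
        h = shift / 2.0
        # Dots are placed symmetrically around the centre, so a dot can
        # extend up to shift/2 beyond the aperture edge in this sketch.
        pairs.append(((x - h * dx, y - h * dy), (x + h * dx, y + h * dy)))
    return pairs


if __name__ == "__main__":
    # A Glass sequence would draw a fresh pattern of the same type for each update.
    sequence = [glass_pattern("concentric") for _ in range(4)]
    print(len(sequence), "patterns,", len(sequence[0]), "pairs each")
```

For short pair separations the tangent is a close approximation to an arc of the concentric circle, which is why the sketch uses a straight tangent segment rather than placing both dots exactly on the same circle.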

REAL MOTION. The motion sequences all had 200 dots that were identical to those in the Glass patterns. In the motion sequences, however, all dots were randomly positioned rather than arranged in pairs. Because Glass sequences implicitly contain two motion directions we used motion sequences in which the direction of motion reversed every 83 ms. The motion trajectory could either be along concentric circles (concentric motion) or along lines emanating from the fixation point (radial motion). The speed of the dots in the motion sequences was chosen to perceptually match the (high) speed in the Glass sequences (see Procedure).

Local Orientation Controls

To investigate the influence of local orientation selectivity, we devised segmented Glass patterns. In a segmented Glass pattern, the 18° aperture was divided into a square grid of 8 × 8 segments (see Fig. 5). We assigned a random-pair orientation to each segment. The orthogonal Glass sequences were chosen such that the local orientation per segment was orthogonal to the orientation of the previously presented segmented Glass sequence. Per segment, the change from a segmented sequence to the orthogonal sequence is locally the same as the switch from a radial sequence to a concentric sequence. This is most easily seen by comparing the orientation of the highlighted pairs in the example stimuli of Figs. 1 and 5. The difference between the transition from segmented to orthogonal sequences and the transition from concentric to radial Glass sequences lies solely in the global structure. Locally, at the scale of a single segment (about 2°), both transitions are from one orientation to the orthogonal orientation.
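The relationship between a segmented pattern and its orthogonal counterpart can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and drawing each segment's orientation uniformly from 0° to 180° is an assumption, because the text says only that the orientation was random.

```python
import random

APERTURE_DEG = 18.0   # aperture size from the text
GRID_SIZE = 8         # 8 x 8 segments from the text
SEGMENT_DEG = APERTURE_DEG / GRID_SIZE  # 2.25 deg, i.e. "about 2 deg"


def segment_orientations(n=GRID_SIZE, rng=None):
    """Assign one pair orientation (degrees, 0 <= theta < 180) per segment.

    Assumption: orientations are drawn uniformly at random.
    """
    if rng is None:
        rng = random.Random()
    return [[rng.uniform(0.0, 180.0) for _ in range(n)] for _ in range(n)]


def orthogonal_orientations(grid):
    """Rotate every segment's orientation by 90 deg (orientation is modulo 180)."""
    return [[(theta + 90.0) % 180.0 for theta in row] for row in grid]


if __name__ == "__main__":
    segmented = segment_orientations(rng=random.Random(1))
    orthogonal = orthogonal_orientations(segmented)
    print("segment size (deg):", SEGMENT_DEG)
    print("first segment:", segmented[0][0], "->", orthogonal[0][0])
```

Because every segment changes by exactly 90°, the local change matches a radial-to-concentric switch, while neither pattern has a global radial or concentric structure.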



FIG. 5. Segmented/random Glass sequences. These sequences were constructed to generate a stimulus that has no global structure, but clear local orientation structure. Orientation of the pairs was assigned randomly to each of the 64 segments (indicated by the dashed grid that was not visible in the stimulus). All pairs within a patch have the same orientation. Orthogonal Glass sequence is matched to a segmented Glass sequence; in each segment, the orientation is orthogonal to the corresponding segment in the associated segmented Glass sequence. For clarity only 2 segments are shown; in the real stimuli all segments were filled with pairs of dots. Open circle and gray circle pairs highlight 2 examples of elements that are orthogonal to each other. In the experiment all dots were white on a black background.

 
Procedure

Observers participated in two LOC localizer scans, one hMT+/V5 localizer scan, one KO localizer scan, two retinotopic mapping scans, and four scans for each of the four event-related experiments. Before the relevant scanning sessions the observers participated in a practice session, in which they were familiarized with the types of stimuli used in the experiment. In the practice session we also used a simple adjustment procedure to determine the Glass shift that evoked the strongest coherent motion percept as well as the speed of the real motion sequences that matched the perceived speed in the Glass sequences.

During the scan sessions, the subjects performed one of two behavioral tasks that ensured that an equal amount of attention was allocated to the stimulus in all conditions. In the first task ("matching task"), the subjects pressed a key to report whether the two stimuli in a trial were the same or different. We analyzed the percentage of trials in which the decision was correct. In the second task ("change detection task"), the central fixation point briefly (250 ms) changed its color from red to blue at unpredictable times during the trial. Subjects pressed a key to indicate that they detected this change. We analyzed the percentage of correct detections as well as the reaction time for each correct detection.

Design

LOCALIZERS. For the LOC localizer scans each stimulus condition was presented in a 16-s stimulus epoch (blocked design), as in previous studies (Kourtzi and Kanwisher 2000b). Each condition was repeated four times in a balanced order and with interleaved fixation periods. Twenty images were presented in each block, each for 300 ms with a blank interval of 500 ms between images. The observers fixated and performed a one-back matching task. In the hMT+/V5 localizer, a stationary-dot pattern was shown for 27 s and was then replaced by a moving (expanding and contracting) random-dot pattern. Each condition was repeated nine times. In the V3B/KO localizer scans, each stimulus condition (kinetic boundaries, transparent motion) was presented for seven 16-s epochs with interleaved fixation periods similar to the LOC localizer scan. For the retinotopic mapping scans, eight wedge positions and eight eccentricity rings were presented for 8 s each and repeated eight times. During the hMT+/V5, V3B/KO, and retinotopic scans observers performed the change detection task.

EVENT-RELATED ADAPTATION SCANS. Each scan started with a 16-s fixation epoch and ended with an 8-s fixation epoch. After the initial fixation epoch, the experimental trials started. We used an event-related adaptation paradigm (Buckner et al. 1998; Grill-Spector and Malach 2001; Kourtzi and Kanwisher 2000b, 2001) in which two stimuli (e.g., A and B) were presented sequentially in 3-s trials. Each stimulus was presented for 300 ms with a 100-ms blank interval between stimuli and an intertrial interval of 2,300 ms (see Fig. 1B). Thus there were four experimental conditions: 1) A followed by A, 2) B followed by B, 3) A followed by B, 4) B followed by A, and one fixation condition in which only the fixation point appeared throughout the trial. As in previous studies (Kourtzi and Kanwisher 2000b, 2001), the order of trials was counterbalanced across subjects and runs so that trials from each condition, including the fixation condition, were preceded (two trials back) equally often by trials from any of the other conditions. In each of the experiments each condition was repeated 25 times per scan (total of 125 trials across conditions per scan) and subjects were run in four scans in one scanning session.
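The timing figures in this paragraph can be checked with a short Python sketch. The constant names are hypothetical, and the scan duration is an inference that assumes a scan contains only the two fixation epochs and the 125 trials; the counterbalancing of trial order is not modeled here.

```python
# Trial structure from the text (all values in milliseconds).
STIM_MS = 300    # duration of each of the two sequences
GAP_MS = 100     # blank interval between the two sequences
ITI_MS = 2300    # fixation-only interval after the second sequence

TRIAL_MS = 2 * STIM_MS + GAP_MS + ITI_MS  # one trial

# Four adaptation conditions plus the fixation condition.
CONDITIONS = ["AA", "BB", "AB", "BA", "fixation"]
REPEATS_PER_SCAN = 25
N_TRIALS = len(CONDITIONS) * REPEATS_PER_SCAN

# Assumed scan layout: 16-s fixation epoch, all trials, 8-s fixation epoch.
START_FIXATION_S = 16
END_FIXATION_S = 8
SCAN_S = START_FIXATION_S + N_TRIALS * TRIAL_MS / 1000 + END_FIXATION_S

if __name__ == "__main__":
    print("trial length (ms):", TRIAL_MS)
    print("trials per scan:", N_TRIALS)
    print("scan length (s):", SCAN_S)
```

Under these assumptions each trial lasts 3 s, matching the 3-s trials in the text, and a scan lasts 399 s.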

EXPERIMENT 1 ("IMPLIED MOTION"). The standard event-related adaptation design was used with A = concentric Glass and B = radial Glass, resulting in four experimental conditions: 1) concentric–concentric, in which two concentric sequences were presented in a trial; 2) radial–radial, in which two radial sequences were presented in a trial; 3) concentric–radial, in which a radial sequence followed a concentric one; and 4) radial–concentric, in which a concentric sequence followed a radial one.

EXPERIMENT 2 ("REAL MOTION"). The same design as in expt 1 was used with A = real concentric motion, B = real radial motion.

EXPERIMENT 3 ("IMPLIED AND REAL MOTION INTERACTIONS"). The first stimulus in a trial was always a Glass sequence; the second was always a real motion sequence. The conditions were: 1) concentric–concentric: concentric Glass followed by concentric motion; 2) radial–radial: radial Glass followed by radial motion; 3) concentric–radial: concentric Glass followed by radial motion; 4) radial–concentric: radial Glass followed by concentric motion.

EXPERIMENT 4 ("LOCAL VS. GLOBAL IMPLIED MOTION"). The first stimulus in a trial was always a segmented Glass sequence, and we had the following four conditions: 1) random–random: segmented Glass followed by segmented Glass; 2) random–concentric: segmented Glass followed by concentric Glass; 3) random–radial: segmented Glass followed by radial Glass; 4) random–orthogonal: segmented Glass followed by orthogonal Glass.

Imaging

The experiments were recorded in a 3-Tesla Siemens scanner at the University Clinic, Tübingen, Germany. Data were collected with a head coil from 11 axial (3 × 3 × 5 mm³) slices that covered occipitotemporal regions using gradient-echo pulse sequences (localizer scans: TR = 2 s, TE = 90 ms; event-related scans: TR = 1 s, TE = 40 ms; block-design scans: TR = 2 s, TE = 90 ms).

Data analysis

The fMRI data were processed using the Brain Voyager software package. Preprocessing of all functional data included head movement correction, temporal high-pass filtering (cutoff frequency 0.0468 Hz), and removal of linear trends. The two-dimensional functional images were aligned to three-dimensional anatomical data with 1 × 1 × 1-mm resolution and the complete data set was transformed to Talairach coordinates. Anatomical data were additionally inflated and unfolded.

Regions of interest

For each individual observer, early visual areas (V1, V2, VP, V3, V3a, V4) were identified based on standard retinotopic mapping procedures (DeYoe et al. 1996; Engel et al. 1994; Sereno et al. 1995). The motion complex (hMT+/V5) was defined as the set of contiguous voxels in the ascending limb of the inferior temporal sulcus that showed significantly (P < 10⁻⁴, corrected) stronger activation for coherently moving (expanding, contracting) than stationary dots. The LOC was defined as the voxels in the ventral occipitotemporal cortex that showed significantly stronger activation (P < 10⁻⁴, corrected) to intact than scrambled images based on the averaged data of the two localizer scans. Area KO was defined as the set of voxels anterior to V3a and posterior to hMT+/V5 that showed significantly stronger activation (P < 10⁻⁴, corrected) for kinetic boundaries than transparent motion. All regions of interest (ROIs) are shown on a flattened representation of a single subject's cortex in Fig. 2.



FIG. 2. Regions of interest. Functional activation maps for one subject showing the retinotopic ventral (V1, V2, VP, V4) and dorsal (V1, V2, V3, V3a) areas, V3b/KO, hMT+/V5, and lateral occipital complex (LOC). Functional activations are superimposed on flattened cortical surfaces of the right and left hemispheres. A, anterior; P, posterior; STS, superior temporal sulcus; ITS, inferior temporal sulcus; OTS, occipitotemporal sulcus; CoS, collateral sulcus.

 
Event-related scans

For each observer, we extracted fMRI responses by averaging the data from all the voxels within each of the independently defined ROIs in the event-related scans. In each scan, we averaged the signal intensity across all the trials in each condition. We then calculated the percentage signal change for each condition in relation to the fixation baseline as described in previous studies (Kourtzi and Kanwisher 2000a, 2001). Finally, we averaged these time courses across scans and observers.
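A minimal sketch of the percentage signal change step is given below. The exact formula is described only in the cited studies, so this version assumes a pointwise comparison of the trial-averaged time course of a condition against the trial-averaged time course of the fixation condition; the function name and the example intensities are hypothetical.

```python
def percent_signal_change(condition_tc, fixation_tc):
    """Percentage signal change of a condition relative to the fixation baseline.

    Both arguments are trial-averaged signal intensities, one value per time
    point after trial onset. Assumption: the baseline is taken time point by
    time point from the fixation condition.
    """
    if len(condition_tc) != len(fixation_tc):
        raise ValueError("time courses must have the same length")
    return [100.0 * (c - f) / f for c, f in zip(condition_tc, fixation_tc)]


if __name__ == "__main__":
    # Hypothetical raw intensities for three time points.
    condition = [101.0, 102.0, 100.0]
    fixation = [100.0, 100.0, 100.0]
    print(percent_signal_change(condition, fixation))
```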

The hemodynamic response function peaks several seconds after the onset of the stimulus (Boynton et al. 1996). To identify the peak of the fMRI responses in an ROI we fitted a Gaussian model (Kruggel et al. 1999) to the average fMRI responses for each condition across observers. This analysis showed average peak responses within the same time window (3–5 s after trial onset) across ROIs and experiments. Based on this analysis the average fMRI response between 3 and 5 s after trial onset was taken as the measure of response magnitude for each condition in subsequent analyses. That is, all comparisons among conditions and fMRI response measures in the figures are based on this averaged signal.
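The response measure described here, the mean signal 3 to 5 s after trial onset, could be computed as in the following Python sketch. It assumes that sample i of a time course is acquired i × TR seconds after trial onset, with TR = 1 s as given for the event-related scans, and that both ends of the window are included; the function name is hypothetical and the Gaussian fit used to locate the peak is not reproduced.

```python
def peak_window_mean(timecourse, tr_s=1.0, start_s=3.0, end_s=5.0):
    """Average the signal over the peak window after trial onset.

    timecourse -- percent signal change, one value per volume
    tr_s       -- repetition time in seconds (1 s for the event-related scans)
    Assumption: sample i is taken at i * tr_s seconds, window ends inclusive.
    """
    samples = [v for i, v in enumerate(timecourse)
               if start_s <= i * tr_s <= end_s]
    if not samples:
        raise ValueError("no samples fall inside the peak window")
    return sum(samples) / len(samples)


if __name__ == "__main__":
    # Hypothetical time course sampled at 0, 1, ..., 6 s after trial onset.
    example = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
    print(peak_window_mean(example))  # mean of the samples at 3, 4, and 5 s
```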

From these averaged fMRI signals we derived a measure of fMRI selectivity to stimulus changes per ROI and experiment. For instance in an experiment with the four conditions (and corresponding fMRI signals) AA, BB, AB, and BA, the selectivity index (or rebound index) was defined as: SI = [(AB + BA)/(AA + BB)] – 1. This represents the enhancement of activity obtained when two different sequences are shown compared with when two sequences of the same type are shown. This measure quantifies a rebound effect (i.e., the release from adaptation). Note that, even though we refer to SI as a selectivity index, we are aware that its relationship with true neuronal selectivity has not yet been demonstrated conclusively (see DISCUSSION). Nevertheless, it has proven to be a sensitive tool to investigate the blood oxygenation level–dependent (BOLD) signal selectivity at a spatial resolution below that of the imaged voxels (Buckner et al. 1998; Grill-Spector and Malach 2001; Henson and Rugg 2003; Kourtzi and Kanwisher 2000b, 2001).
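The selectivity index defined above translates directly into code. In the Python sketch below the function name is hypothetical and the response values in the usage example are invented for illustration; they are not data from the study.

```python
def selectivity_index(aa, bb, ab, ba):
    """Selectivity (rebound) index: SI = (AB + BA) / (AA + BB) - 1.

    aa, bb -- peak responses to two sequences of the same type
    ab, ba -- peak responses to two sequences of different types
    """
    same = aa + bb
    if same == 0:
        raise ValueError("sum of same-type responses must be nonzero")
    return (ab + ba) / same - 1.0


if __name__ == "__main__":
    # Hypothetical peak responses (percent signal change).
    print(selectivity_index(aa=0.4, bb=0.4, ab=0.5, ba=0.5))
    # Equal responses in all conditions give an index of zero (no selectivity).
    print(selectivity_index(aa=0.3, bb=0.3, ab=0.3, ba=0.3))
```

An index above zero means the different-type trials evoked a larger summed response than the same-type trials, which is the release from adaptation that the text describes.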


    RESULTS
 
We first present the details of the analysis and findings in the motion complex. Second, we describe the main findings in the other visual areas of interest shown in Fig. 2. Third, we present the results of an analysis that determines whether the selectivity for Glass sequences can be explained on the basis of local orientation selectivity. The final part of the results documents that neither attentional confounds nor eye-movement artifacts could explain our results.

The human motion complex

In this section we present the analysis of the BOLD signal in the human motion complex (hMT+/V5) in some detail. This includes some of the intermediate steps in the analysis necessary to arrive at an index of selectivity for an ROI. In subsequent sections that report the results in other regions of interest we will skip these intermediate results.

Implied motion

In this experiment we presented either two Glass sequences with implied motion of the same type (e.g., concentric–concentric) or two Glass sequences with implied motion of different type (e.g., concentric–radial) (see Fig. 1B).

Figure 3A shows the fMRI time courses (averaged over all subjects) in the motion complex. At time 0 the two implied motion stimuli were presented sequentially. The BOLD signal responded with its typical delay of a few seconds and showed an undershoot after the peak of the response. Note that because of the slow time course of the BOLD response, the separate presentation of the two sequences (Fig. 1B) cannot be resolved. A comparison of the two time courses in this panel, however, shows that the response to two successive Glass sequences of the same type (concentric–concentric) was lower than the response to two successive Glass sequences of different types (concentric–radial). This was a statistically significant effect [repeated-measures ANOVA; F(1,10) = 5.06, P < 0.05]. Figure 3B shows the same analysis for the two conditions that started with a radial pattern. The BOLD response was lower when two Glass sequences of the same type (radial–radial) were shown than when two Glass sequences of different types (radial–concentric) were shown.



FIG. 3. Implied and real motion selectivity in the human motion complex. A: averaged time courses showing the blood oxygenation level–dependent (BOLD) response in trials in which 2 concentric Glass sequences were shown (solid line) and the response in trials in which a concentric Glass sequence was followed by a radial Glass sequence (dotted line). Signal changes in all panels are relative to the fixation baseline. B: averaged time courses showing the BOLD response to 2 successive radial Glass sequences (solid line) and the response to a radial Glass sequence followed by a concentric Glass sequence (dashed line). C: average BOLD response to the 4 implied motion conditions (expt 1). Calculated by averaging the BOLD signal around the peak response (gray window in A and B). Error bars represent SE. D: average BOLD response to the 4 real motion conditions (expt 2). E: average BOLD response to the 4 interaction conditions (expt 3; see main text). F: selectivity index (SI: [(AB + BA)/(AA + BB)] – 1) calculated from the response measures in C, D, and E. Similar results were observed in the MT and MST subregions of hMT+/V5 that were identified based on retinotopic mapping techniques (Huk et al. 2002). Together, these data show that cells in the human motion complex that are selective for real radial vs. concentric motion are also selective for implied motion of the same types.

 
To simplify the data presentation and for easy comparison across areas and experiments, we quantified the average response for each of the four conditions shown in Fig. 3, A and B by averaging the fMRI signal in the peak time points of the fMRI time courses (gray time window; see METHODS). This average peak response for the four conditions is shown in Fig. 3C. The bar plot summarizes what is clear from Fig. 3, A and B: the average peak BOLD response is lower in the conditions where the same pattern is shown twice than in the conditions where two different sequences are shown.

We interpret this as pattern-selective adaptation. That is, when a second sequence of the same type is presented, the response is reduced as a result of adaptation. When a second sequence of a different type does not show this reduction, it must have stimulated a different (nonadapted) set of neurons. Thus we infer from this so-called release from adaptation that separate subpopulations of cells in hMT+/V5 respond to radial and concentric Glass sequences (see DISCUSSION). In a control experiment described below we tested whether the local or the global differences between these categories could explain this selectivity.

To quantify the pattern selectivity we calculated an index that contrasts the BOLD signal in trials in which two sequences of the same type were presented with the BOLD signal in trials in which two sequences of different type were presented. This selectivity index (SI; see METHODS) condenses the analysis to a single number per experiment and ROI. This index is zero—reflecting no selectivity—when the response to two patterns of the same type equals the response to two patterns of different types. Index values significantly above zero correspond to the situation where the two patterns of different types evoke larger responses than two patterns of the same type; that is, a positive index reflects release from adaptation and thus pattern selectivity in the underlying population. The implied motion selectivity for the human motion complex is shown as the white bar in Fig. 3F.

Real motion

In this experiment, we presented either two real motion sequences of the same type (e.g., concentric–concentric) or two real motion sequences of different types (e.g., concentric–radial). The analysis was the same as that of the implied motion data. Figure 3D shows the bar plot of the average peak BOLD signal for the four conditions in this experiment. Here, as in the implied motion data, the BOLD signal in the conditions with two different types of motion sequences was significantly higher than that in the conditions with two motion sequences of the same type [F(1,10) = 42.76, P < 0.001]. Thus we infer that, not surprisingly and entirely consistent with single-cell findings, the motion complex has subpopulations of cells that are selective for the type of real motion.

From these average peak BOLD responses we determined the selectivity index of the human motion complex for real motion. The selectivity for real motion in hMT+/V5 was nearly 50% larger than that for implied motion and is represented by the black bar in Fig. 3F.

Interactions between real and implied motion

The analysis so far shows that hMT+/V5 has subpopulations selective for real motion and subpopulations selective for implied motion sequences. However, this does not necessarily mean that these subpopulations were the same. Experiment 3 was designed to test this hypothesis directly. If the same cells that respond to implied rotations also respond to real rotations, it should be possible to obtain pattern-selective adaptation for a concentric real motion pattern after the presentation of an implied rotation, and similarly for expansions. Thus in this experiment we presented concentric implied motion followed by concentric real motion and compared it to trials in which concentric implied motion was followed by radial real motion. We also compared radial implied motion followed by radial real motion to radial implied motion followed by concentric real motion.

Figure 3E shows the average peak BOLD responses for the four conditions. A real motion pattern after an implied motion pattern of the same type led to lower fMRI responses than a real motion pattern after an implied motion pattern of a different type [F(1,10) = 6.28, P < 0.05]. This was the case for both radial and concentric implied motion sequences. In terms of pattern-selective adaptation this suggests that the adaptation caused by an implied motion sequence affects the response to a real motion sequence, but only if it is of the same motion type (radial or concentric). This in turn suggests that the same subpopulations of cells respond (and therefore adapt) to both implied and real motion sequences of the same type.

From the activation measures in Fig. 3E we determined the selectivity index, shown as the gray bar in Fig. 3F. It measures the extent to which pattern-selective adaptation in the motion complex is invariant to changes in the sequence from implied motion to real motion. The fact that the index is nonzero shows that some of the cells selective for real motion were also selective for implied motion. If the gray bar were as tall as the black bar, that would mean that adapting with real motion had the same effect as adapting with implied motion (when tested with real motion). The fact that the selectivity index appears lower in this experiment than in the experiments in which only real motion was used suggests that not all cells selective for real motion are also selective for implied motion. From the relative sizes of the real motion selectivity index and interaction selectivity index we infer that about 45% of cells selective for real motion were also selective for implied motion.
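
The overlap estimate can be read as a simple ratio of two indices. The sketch below assumes that reading; the function name `overlap_fraction` and the index values are hypothetical and were chosen only so that the ratio reproduces the "about 45%" quoted above. They are not read off Fig. 3F.

```python
def overlap_fraction(si_interaction, si_real):
    """Share of real motion selectivity that survives an implied motion adapter.

    Assumes the overlap is estimated as the ratio of the interaction index
    (expt 3) to the real motion index (expt 2).
    """
    if si_real <= 0:
        raise ValueError("real motion selectivity index must be positive")
    return si_interaction / si_real


# Hypothetical index values, for illustration only.
si_real = 0.20
si_interaction = 0.09
estimate = overlap_fraction(si_interaction, si_real)
print(round(estimate, 2))
```

A ratio of 1 would correspond to the gray bar being as tall as the black bar, that is, complete overlap.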

The main points of our findings in the motion complex can be illustrated by Fig. 3F. The selectivity indices shown in this graph (together with their associated statistical tests of significance) make three points: 1) hMT+/V5 is selective for the type of implied motion (white bar), 2) hMT+/V5 is selective for the type of real motion (black bar), and 3) the selectivity for implied and real motion sequences is subserved by overlapping subpopulations (gray bar). For the other visual areas, we will present only these final steps of the analysis.

Retinotopic ventral and dorsal visual areas and the LOC

The same analysis shown in detail for hMT+/V5 in Fig. 3 was applied to the data from the other ROIs in the ventral and dorsal pathway.

Implied motion

The responses for two Glass sequences of the same type were significantly lower than the responses to two Glass sequences of different types in all early areas (V1, V2), dorsal retinotopic areas (V3, V3a, V3b/KO), as well as ventral retinotopic areas (Vp, V4) and the higher occipitotemporal area (LOC). Thus all areas showed some selectivity for these sequences; their selectivity indices are represented by the white bars in Fig. 4. Statistical analysis of selectivity gave the following results: V1: F(1,10) = 43.36, P < 0.001; V2: F(1,10) = 37.15, P < 0.001; V3: F(1,16) = 20.86, P < 0.01; V3a: F(1,16) = 30.33, P < 0.001; V3b/KO: F(1,16) = 53.76, P < 0.001; Vp: F(1,20) = 68.61, P < 0.001; V4: F(1,20) = 72.41, P < 0.001; LOC: F(1,20) = 70.21, P < 0.001.



FIG. 4. Implied and real motion selectivity in early visual areas, V3b/KO and the LOC. A: selectivity index for implied motion (white), real motion (black), and interactions between real and implied motion (gray). These indices were calculated from the intermediate steps shown for the motion complex in Fig. 3 and are based on the data from expts 1, 2, and 3, respectively. Error bars represent SE; they are large because we used a conservative method that incorporates the error estimates of both numerator and denominator. Similar results were observed in 2 subregions of the LOC: the LO (lateral occipital) at the posterior part of the inferior-temporal sulcus and the pFs (posterior Fusiform) in the posterior fusiform gyrus. These results show that all areas contained some subpopulations that were selective for concentric vs. radial real and implied motion, but that only in V3, V3a, and V3b/KO the same subpopulations were selective for real and implied motion.

 
Real motion

We also analyzed the responses to the real motion sequences (expt 2) and again found that there was significant, albeit small, selectivity in early visual areas (V1, V2, V3), ventral areas (Vp, V4, LOC), and the dorsal stream (V3a, V3b/KO). Statistical analysis gave the following results: V1: F(1,10) = 122.12, P < 0.001; V2: F(1,10) = 97.35, P < 0.001; V3: F(1,12) = 5.79, P < 0.05; V3a: F(1,12) = 20.88, P = 0.001; V3b/KO: F(1,12) = 9.02, P = 0.01; Vp: F(1,20) = 12.76, P < 0.01; V4: F(1,20) = 6.89, P = 0.01; LOC: F(1,20) = 6.19, P < 0.05. The selectivity index per area is represented by the black bars in Fig. 4. Note that this selectivity does not necessarily imply motion selectivity; the direction of motion is one feature that distinguishes radial real motion patterns from concentric motion patterns, but each of these patterns also has a distinct structure or form. Given the known properties of cells in the ventral stream, the selectivity observed in V4 and the LOC is most likely attributable to selectivity for the form suggested by the motion, consistent with previous studies that show enhanced fMRI responses in ventral visual areas for moving compared with static forms (Grill-Spector et al. 1998; Kourtzi et al. 2002).

Interactions between real and implied motion

As in hMT+/V5, we then tested whether pattern-selective adaptation to an implied motion pattern transferred to a subsequent real motion pattern, reducing the response to it. The aim of this experiment was to determine whether the neural subpopulations selective for implied motion sequences were also selective for real motion sequences. We found (Fig. 4) that the selectivity indices in the dorsal areas (V3, V3a, V3b/KO) were typically at least twice as large as those in V1 and V2 and the ventral visual areas (Vp, V4, LOC). This suggests that the overlap between the neural populations responding to implied and real motion stimuli was much larger in the dorsal areas than in early or ventral areas.

Specifically, the BOLD response for a real motion sequence after an implied motion sequence of the same type was lower than the response to a real motion sequence after an implied motion sequence of a different type in the dorsal visual areas [V3: F(1,14) = 27.55, P < 0.001; V3a: F(1,14) = 52.07, P < 0.001; V3b/KO: F(1,14) = 2.64, P = 0.05]. However, no significant differences were observed between the fMRI responses for same or different types of sequences in V1 and V2 [F(1,10) = 1.11, P = 0.31] or ventral visual areas [Vp: F(1,20) = 1.61, P = 0.21; V4: F(1,20) <1, P = 0.85; LOC: F(1,20) = 1.68, P = 0.21].

Sensitivity to local versus global changes

Figure 4 shows that some selectivity for implied motion sequences is already present at the level of V1. This leads to two questions. First, how can area V1, with its small receptive fields, be selective for such large patterns? Second, if V1 is selective, does that mean that all selectivity in higher areas is simply inherited from V1, or can selectivity be observed in higher areas with patterns for which no selectivity is observed in V1?

The first question is easily answered; the sequences (concentric vs. radial) differ not only at the global scale, but they also have systematic differences at the scale of a V1 receptive field. In fact, if a given part of the screen contains a pair of dots oriented one way in a concentric pattern, that same part of the screen will contain the orthogonal orientation in a subsequent radial pattern. The highlighted pairs in Fig. 1A illustrate this. Thus (local) orientation selectivity in an area could be enough to lead to the selective adaptation we observed. Experiment 4 was designed to test this hypothesis (Fig. 5). We divided the screen into 64 segments and assigned a random orientation to each segment (see Fig. 5 and METHODS). In one condition, two segmented Glass sequences, with the same random orientation per segment, were shown in succession. In the other condition, a segmented Glass sequence was followed by another segmented sequence in which the orthogonal orientation was assigned to each segment. Locally (at the scale of the segments), the difference between these two sequences was a 90° orientation change, just like the transition from a concentric to a radial Glass pattern.
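
The two conditions of expt 4 can be sketched in terms of per-segment orientations. This is a minimal sketch, assuming that each segment's orientation is drawn uniformly from 0 to 180°; the function names and the seed are hypothetical, and the actual stimulus construction (dot density, pair separation, segment layout) is described in METHODS and not modeled here.

```python
import random

N_SEGMENTS = 64  # number of segments stated in the text


def random_segment_orientations(n_segments=N_SEGMENTS, seed=None):
    """Draw one orientation per segment, in degrees between 0 and 180."""
    rng = random.Random(seed)
    return [rng.uniform(0.0, 180.0) for _ in range(n_segments)]


def orthogonal(orientations):
    """Rotate every segment orientation by 90 degrees (orientations wrap at 180)."""
    return [(theta + 90.0) % 180.0 for theta in orientations]


def orientation_difference(a, b):
    """Smallest angle between two orientations, in [0, 90] degrees."""
    d = abs(a - b) % 180.0
    return min(d, 180.0 - d)


first = random_segment_orientations(seed=1)
same_trial = (first, list(first))              # same orientation per segment
orthogonal_trial = (first, orthogonal(first))  # 90 degree change per segment

local_changes = [orientation_difference(a, b) for a, b in zip(*orthogonal_trial)]
print(min(local_changes), max(local_changes))
```

In the orthogonal trial every segment changes by 90°, which is the same local change as a transition from a concentric to a radial pattern, while the global arrangement stays a random patchwork.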

Figure 6 shows that the selectivity index in V1 was about as large for these local changes as it was for the global changes documented in Fig. 4. In particular, fMRI responses were significantly stronger for segmented Glass sequences with orthogonal orientations than for sequences with the same orientation [V1: F(1,30) = 7.16, P = 0.01; V2: F(1,30) = 2.13, P = 0.05]. Thus it seems likely that the selectivity to the global changes of expt 1 in V1 was in fact a result of local orientation selectivity. This is in agreement with single-cell data demonstrating that the orientation signals present in Glass sequences are enough to drive orientation-selective V1 cells (Smith et al. 2002). As Fig. 6 shows, selectivity for local orientation was also found in other areas, both dorsal [V3: F(1,54) = 6.59, P = 0.01; V3a: F(1,54) = 6.27, P = 0.01; V3b/KO: F(1,54) = 2.98, P = 0.05; hMT+/V5: F(1,10) = 8.49, P < 0.01] and ventral [Vp: F(1,60) = 16.72, P < 0.01; V4: F(1,60) = 22.54, P < 0.001; LOC: F(1,60) = 12.21, P < 0.001]. It is possible that cells in areas such as hMT+/V5 picked up some of the implied motion signals at the scale of the segments, or that this reflects true local orientation tuning in these areas (Albright 1984). Nevertheless, the most parsimonious explanation of this selectivity is that it was inherited from the differential responses in V1.



FIG. 6. Local and global pattern selectivity. Black bars show selectivity for a change from a segmented Glass pattern to an orthogonal Glass pattern. This selectivity is most likely attributable to simple orientation selectivity. White bars show the selectivity for a change from a segmented Glass sequence to a rotation Glass sequence. Error bars represent the SE. These results show that local orientation selectivity in V1 may have contributed to the selectivity observed in higher areas (black bars). Some selectivity in higher areas, however, occurs without a measurable selectivity in V1 and V2 (white bars), suggesting that this global pattern selectivity arises in areas beyond V1 and V2.

 
Finally, we tested whether there is any selectivity that can only be ascribed to global and not the local organization of the stimuli. The perfect control would be to show two stimuli that are identical locally but different globally. Such a control does not exist because the global structure is entirely defined by the local elements. Thus the next best control uses two sequences with minimal differences at the local scale but large differences at the global scale. We used trials in which a segmented Glass sequence was followed by a concentric or radial Glass sequence. In these trials the local orientation change is 45° on average; this is on the order of the bandwidth of V1 orientation cells (Ringach et al. 2002) and thus we expect these two sequences to be minimally different to V1. Globally, on the other hand, this change is a very clear transition from a random jumble of orientations to a structured pattern that appears to rotate or expand. Figure 6 shows that if this stimulus change did lead to changes in V1 or V2, those changes were too small to be resolved with BOLD signals. In particular, no significant differences were observed in the fMRI responses between two successive segmented sequences of the same orientation and sequences in which a Glass pattern followed a segmented pattern [V1: F(1,30) < 1, P = 0.88; V2: F(1,30) < 1, P = 0.87]. In dorsal visual areas (V3, V3a, V3b/KO, hMT+/V5), however, two successive segmented Glass sequences did lead to significantly less activation than a segmented Glass pattern followed by a radial or concentric Glass pattern. The statistical analysis resulted in: V3: F(1,54) = 13.01, P < 0.001; V3a: F(1,54) = 5.76, P < 0.05; V3b/KO: F(1,54) = 8.94, P < 0.01; hMT+/V5: F(1,30) = 10.27, P < 0.01. For the ventral visual areas, we observed similar selectivity in V4 [F(1,60) = 6.98, P = 0.01] and LOC [F(1,60) = 20.83, P < 0.001] but not in Vp [F(1,60) = 1.81, P = 0.18].
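
The 45° average can be checked numerically. The sketch below assumes that the random orientation of a segment is uniformly distributed over 0 to 180° and that orientation differences are folded into the range 0 to 90°; the function names and the sampling resolution are hypothetical.

```python
def orientation_difference(a, b):
    """Smallest angle between two orientations, in [0, 90] degrees."""
    d = abs(a - b) % 180.0
    return min(d, 180.0 - d)


def mean_change_to_structured(target_orientation, n_steps=1800):
    """Average change from a uniformly random orientation to a fixed one.

    `target_orientation` stands for the local orientation that the concentric
    or radial pattern has at the position of a given segment.
    """
    step = 180.0 / n_steps
    # Midpoint sampling of the random orientation across 0 to 180 degrees.
    thetas = [(i + 0.5) * step for i in range(n_steps)]
    total = sum(orientation_difference(t, target_orientation) for t in thetas)
    return total / n_steps


mean_vs_horizontal = mean_change_to_structured(0.0)
mean_vs_oblique = mean_change_to_structured(37.0)
print(round(mean_vs_horizontal, 3), round(mean_vs_oblique, 3))
```

Under this assumption the average does not depend on the target orientation, so it is the same at every position of the structured pattern.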

This shows that higher areas in both the dorsal and the ventral stream but not the early areas V1 and V2 were selective (as measured by the BOLD signal) to sequences that have similar local properties but distinct global structure. These findings suggest that at least part of the selectivity for the global organization of Glass sequences arises in areas beyond V1 and V2.

Control experiments and analysis

It is conceivable that attention could be engaged more during trials in which two different stimuli were presented than during trials in which the same stimulus was shown twice. Because attention is known to modulate responses in visual areas, this could then have influenced our results. To control for this possibility, subjects performed a matching task during all scans (see METHODS). This task drew the subject's attention toward the stimuli, even in trials in which the same stimulus was shown twice. A Kruskal–Wallis ANOVA on ranks of the subjects' performance on this task showed that there was no significant difference in the percentage correct across conditions in any of the experiments (expt 1: P = 0.92; expt 3: P = 0.32, expt 4: P = 0.98; because of a software error, the behavioral responses from expt 2 could not be analyzed). The constant level of performance in these experiments indicates that a similar amount of attention was always devoted to the stimuli, regardless of condition. Moreover, it is highly unlikely that observers could selectively choose to attend to particular conditions because trials were presented in quick succession and were randomly interleaved. However, the matching task was not particularly difficult, as witnessed by the high average level of performance (96% correct). This implies that some attentional capacity may have been left for the subject to allocate differently in different conditions.

We therefore performed an additional control experiment. We repeated the implied motion experiment, with the instruction to the subjects to detect a change in the color of the fixation point (see METHODS). This task was subjectively much more difficult than the matching task. Analysis of the behavioral data showed that the number of undetected changes in the fixation point did not vary significantly with condition [F(2,4) = 0.93, P = 0.49]. Moreover, the reaction times for the correctly detected changes in the fixation spot were also not significantly modulated by the stimulus condition [F(2,4) = 2.28; P > 0.05]. These behavioral data suggest that attention was allocated similarly (to the fixation point) in all five conditions. Analysis of the fMRI data and the corresponding selectivity indices obtained in these sessions confirmed the adaptation effects that we reported above for implied motion in expt 1 across all areas [V1: F(1,3) = 58.08, P < 0.01, V2: F(1,3) = 104.97, P < 0.01, V3: F(1,3) = 16.46, P < 0.05, V3a: F(1,3) = 12.24, P < 0.05, Vp: F(1,3) = 134.04, P < 0.001, V4: F(1,3) = 119.92, P < 0.001, LOC: F(1,3) = 62.98, P < 0.001]. This control suggests that our findings were not confounded by a differential allocation of attention across conditions.

Eye movements

During the scans, observers were instructed to fixate the central fixation point. Eye movements of three subjects in each experiment were recorded (Eye-Link video–based system, 250-Hz sample rate). We compared eye position and saccades across conditions in each experiment. This analysis showed no significant differences in the average saccade number, the vertical or horizontal eye position, or the vertical or horizontal saccade amplitude between experimental conditions and the fixation condition. This analysis was applied to all four experiments and none of the statistical tests reached a value of P < 0.11 [F(2,8) ≤ 2.58]. This shows that the subjects were able to fixate for long periods of time and that it is unlikely that our findings could be confounded by differential eye movements across conditions.

Block design analysis

Region of interest analyses are sometimes criticized for preselecting areas and ignoring the rest of the brain. Our study tested very specific hypotheses about the involvement of typical motion and form areas in implied motion perception. In that context we believe the ROI approach to be appropriate. Nevertheless, to investigate whether other brain areas beyond the measured ROIs respond to implied motion sequences we ran block design scans that presented real, implied, and random motion. Each scan consisted of 16-s blocks with a given visual pattern, presented in a counterbalanced order. The seven conditions were: fixation only, concentric Glass, radial Glass, segmented/random Glass, concentric motion, and radial motion. Thus all stimuli that were used in the event-related paradigms were presented here in separate blocks. In each block 20 sequences were presented, each for 300 ms followed by a 500-ms blank. During this experiment, subjects performed the fixation point color-change detection task (five color changes of the fixation point per block).
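
The block timing given above is internally consistent, as the following check shows. The constant names are hypothetical; the numbers are taken from the text.

```python
STIMULUS_MS = 300         # duration of each sequence
BLANK_MS = 500            # blank interval after each sequence
SEQUENCES_PER_BLOCK = 20  # sequences presented per block

block_duration_s = SEQUENCES_PER_BLOCK * (STIMULUS_MS + BLANK_MS) / 1000
print(block_duration_s)  # 16.0
```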

Confirming our ROI analysis, we observed significantly stronger activations for Glass sequences than segmented/random (P < 0.01) in hMT+/V5, V3b/KO, and LOC. These activations did not cover the full extent of these areas as defined by independent localizers. Moreover, consistent with previous studies of motion-related areas, we also observed significantly stronger (P < 0.01) activations for real and implied than random motion sequences anterior to hMT+/V5 (Senior et al. 2000; Zeki et al. 1993) and along the intraparietal sulcus (Claeys et al. 2003).


    DISCUSSION
 
Our study demonstrates the following main findings that advance our understanding of interactions between form and motion processing in the human visual cortex. First, subpopulations in the human motion complex were shown to be selective for different types of global Glass pattern sequences that imply motion. Second, this selectivity in hMT+/V5 to implied motion was about 50% of the selectivity for real motion sequences. Importantly, we provide evidence for significant overlap between the subpopulations that encode implied motion and those that encode real motion. A similar pattern of selectivity was found in other dorsal areas, notably, V3, V3a, and V3b/KO. Taken together we interpret the selectivity for global implied motion patterns and the significant overlap of the populations that are selective for real and implied motion as true selectivity for implied motion. Third, in early visual areas (V1, V2) we observed selectivity for local orientation, but no selectivity for the global structure of Glass sequences. Thus a significant part of the selectivity observed in the dorsal areas could not be attributed to selectivity for local orientation changes inherited from V1, but must be attributed to the global arrangement of the sequences. Finally, ventral areas (V4 and LOC) were also selective for the global structure of Glass sequences and real motion sequences. The main difference with the dorsal areas was that in the ventral areas there was much less overlap between the populations selective for the global structure of Glass sequences and the populations selective for the global structure of real motion sequences (Fig. 4). In sum, our studies provide novel evidence that both dorsal and ventral visual areas in the human brain contribute to the selective processing of global motion from form. Dorsal motion-related areas represent global pattern sequences independent of the motion cue that defines them (real or implied) and thus may mediate the perception of implied motion from static forms. In contrast, this cue invariance was not evident in ventral areas, suggesting that these areas are better able to distinguish real from implied sequences; this may support the fine discrimination of the global structure in such patterns.

fMRI adaptation

The fMRI adaptation paradigm (Grill-Spector and Malach 2001) is being used in an increasing number of studies to uncover selective subpopulations of neurons at a resolution below that of the typical human fMRI voxel. The main assumption behind this paradigm is that if a neuron responds to a pattern (pattern selectivity) it will respond less to the second presentation of that pattern than to the first. This assumption has considerable support in early sensory areas (e.g., V1: Movshon and Lennie 1979; MT: Petersen et al. 1985). In psychophysics (Blakemore and Campbell 1969) and functional imaging (Grill-Spector and Malach 2001) a noninvasive measure of pattern-selective adaptation is used to infer neuronal selectivity. Specifically, to infer whether a population can distinguish pattern A from B, one presents two identical patterns successively (AA as well as BB) and measures whether this leads to a smaller response than presenting two different patterns successively (AB and BA). If both the AA and BB responses are smaller than the AB and BA responses, then there must be two separate mechanisms that respond and adapt to A and B.
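
The logic of this inference can be made concrete with a toy voxel model. This is a sketch under strong simplifying assumptions, not a model of the BOLD signal: each subpopulation responds with unit strength to the patterns it is tuned to, an adapted subpopulation responds at half strength (the value of `ADAPTATION` is arbitrary), and the voxel response is the sum over subpopulations. All names are hypothetical.

```python
ADAPTATION = 0.5  # assumed: a subpopulation's response halves on immediate repetition


def voxel_response(first, second, tuning):
    """Summed response of a voxel to the second stimulus of a pair.

    `tuning` maps each subpopulation name to the set of patterns it responds to.
    A subpopulation that already responded to `first` is adapted when `second`
    arrives, so its contribution is scaled by ADAPTATION.
    """
    total = 0.0
    for patterns in tuning.values():
        if second not in patterns:
            continue
        total += ADAPTATION if first in patterns else 1.0
    return total


def same_and_different(tuning):
    """Average second-stimulus response for same (AA, BB) and different (AB, BA) pairs."""
    same = (voxel_response("A", "A", tuning) + voxel_response("B", "B", tuning)) / 2
    diff = (voxel_response("A", "B", tuning) + voxel_response("B", "A", tuning)) / 2
    return same, diff


selective = {"pop_A": {"A"}, "pop_B": {"B"}}               # separate mechanisms
nonselective = {"pop_1": {"A", "B"}, "pop_2": {"A", "B"}}  # one shared mechanism

print("selective:", same_and_different(selective))
print("nonselective:", same_and_different(nonselective))
```

In this toy model only the voxel with separate subpopulations shows a larger response for different pairs than for same pairs, even though both voxels respond equally to A and B when each is presented alone.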

The validity of adaptation fMRI has not yet been confirmed by detailed comparisons of adaptation effects at the neuronal and imaging levels. Initial reports (Sawamura et al. 2004), however, suggest that although successive presentation of identical stimuli does typically reduce responses, there are more complicated sequence effects. Some of these properties may be area specific, as witnessed by the qualitatively different adaptation effects found in, for instance, area MT compared with area V1 (Kohn and Movshon 2004; B Krekelberg, RJ Van Wezel, and TD Albright, unpublished observations). Moreover, there are still many uncertain steps relating even neuronal activity to the BOLD response (Logothetis and Wandell 2004). Thus even if a reduced BOLD response is observed in an adaptation paradigm, the underlying mechanism need not be neuronal pattern-selective adaptation.

These caveats notwithstanding, adaptation fMRI has been successful in various contexts in which the results could be verified at least indirectly with intracortical recordings. Examples are orientation selectivity in visual areas (Boynton and Finney 2003; Kourtzi et al. 2003; Tootell et al. 1998) and motion selectivity in area MT (Huettel et al. 2004; Huk et al. 2001, 2002; Tolias et al. 2001). Our study is another example; monkey single-cell recordings showed implied motion selectivity in MT (Krekelberg et al. 2003) and our adaptation fMRI revealed selectivity in the human motion complex. However, although some studies, including the current one, could demonstrate orientation selectivity in V1 with adaptation techniques (Kourtzi et al. 2003; Tootell et al. 1998), others could not (Boynton and Finney 2003). These discrepancies may be attributable to differences in the stimuli (Boynton and Finney used low spatial frequency stimuli that were nonoptimal for typical V1 cells), but they also point to possible area differences in adaptation and the effect it has on the BOLD signal. Thus until a fuller validation of the adaptation fMRI paradigm has been obtained with intracortical recordings, statements regarding underlying neuronal selectivity made on the basis of imaging data alone should be treated with caution. Imaging data, however, certainly can be suggestive of and consistent with neuronal selectivity. Treated as a piece of evidence in favor of such selectivity (not the final and conclusive answer), they can be highly valuable.

Glass patterns and implied motion

The percept of motion in a sequence of Glass patterns is so convincing that many observers find it difficult to believe that the motion energy in these sequences is just as balanced as the motion energy in a sequence of random-dot patterns. The APPENDIX provides a mathematical proof of this theorem, which states that in both random-dot sequences and in Glass sequences, the average motion energy spectrum is symmetric: The motion energy in any direction is on average matched by the motion energy in the opposite direction.

Motion energy detectors rely on asymmetries in the stimulus power spectrum (Adelson and Bergen 1985). Because the average power spectrum of a Glass pattern is symmetric, a motion energy detector will not assign a consistent direction of motion to it. On the basis of the motion energy distribution alone such a stimulus is therefore not expected to lead to a consistent, coherent motion percept. This is our reason for referring to these balanced motion energy patterns as containing no coherent motion. When we speak of implied motion in Glass patterns, we refer to the percept of coherent motion generated in the absence of coherent motion signals. By contrast, when we speak of real motion, we refer to the percept generated by stimuli whose average power spectrum is asymmetric and that therefore contain unambiguous coherent motion signals.
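
The difference between balanced and unbalanced motion energy can be illustrated with a toy computation. This sketch is not the proof given in the APPENDIX and does not use Glass patterns: it uses one-dimensional random binary frames, and it assumes that successive frames of a balanced sequence are generated independently. The grid size, the number of sequences, the seed, and all function names are hypothetical.

```python
import cmath
import random

N = 8  # number of spatial positions and of frames (toy size)


def power(seq, k, w):
    """Power of a space-time sequence seq[t][x] at spatial frequency k and temporal frequency w."""
    total = 0j
    for t in range(N):
        for x in range(N):
            total += seq[t][x] * cmath.exp(-2j * cmath.pi * (k * x + w * t) / N)
    return abs(total) ** 2


def directional_bias(sequences, k=1, w=1):
    """Normalized difference in summed power between the two opposite directions.

    Values near 0 mean balanced motion energy; values near +1 mean that
    drift toward increasing x dominates at this frequency.
    """
    p_pos = sum(power(s, k, w) for s in sequences)
    p_neg = sum(power(s, k, -w) for s in sequences)
    return (p_neg - p_pos) / (p_neg + p_pos)


rng = random.Random(0)


def random_frame():
    """One frame of random binary luminance values."""
    return [rng.choice([0.0, 1.0]) for _ in range(N)]


def drifting_sequence():
    """One random frame shifted by one position toward increasing x on every frame."""
    base = random_frame()
    return [[base[(x - t) % N] for x in range(N)] for t in range(N)]


# Balanced case: every frame is a fresh, independent random pattern.
independent = [[random_frame() for _ in range(N)] for _ in range(2000)]
# Unbalanced case: coherent drift.
drift = [drifting_sequence() for _ in range(200)]

bias_independent = directional_bias(independent)
bias_drift = directional_bias(drift)
print(bias_independent, bias_drift)
```

Averaged over many sequences, the bias for independent frames should be close to zero, whereas the drifting sequences should give a bias close to one. Any single independent sequence can still show a nonzero bias by chance.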

The theorem in the APPENDIX, however, applies to the average motion energy in all Glass pattern sequences of a particular type. Because of the stochastic nature of the placement of the dots, the motion energy at any given time can be larger in any given direction, just as in random-dot sequences. This leaves open the possibility that any individual Glass sequence does contain an asymmetry and that this asymmetry is detected by a simple motion energy detector. However, if these accidental coherent motion components were the underlying cause of the perceived direction of motion, one would expect that a reversal of the sequence would lead to a reversal of the perceived direction of motion. Krekelberg et al. (2003) demonstrated that this prediction is not borne out by the data. We conclude that the implied motion percept is not generated by stochastic fluctuations in the motion energy of the stimulus.

Some stage of processing beyond motion energy extraction is therefore required to explain why Glass patterns appear to move coherently. In this context it is instructive to note that some stimuli with balanced motion energy fail to evoke a coherent motion percept. In a random-dot sequence, for instance, the balanced motion energy does not lead to the percept of a globally coherent direction of motion. Instead, it results in a percept of directed motion that changes rapidly over time and is presumably driven by the stochastic fluctuations in dot placement. In the Glass pattern sequences, on the other hand, a globally coherent motion percept is evoked that alternates between only two directions over time (e.g., clockwise and counterclockwise for concentric Glass patterns). This perceptual difference is the phenomenon whose neural mechanisms we wish to understand. The difference between random sequences and Glass sequences is one of form: both global form (concentric patterns) and local form (oriented elements). Thus in this descriptive sense, the form of the Glass patterns generates the implied motion. Whether neural mechanisms make explicit use of this form information to enhance motion processing (Geisler 1999) is a topic of ongoing investigation.

Implied motion selectivity in the human motion complex

A recent imaging study reported finding no selectivity for Glass sequences in hMT+/V5 (Wade et al. 2004). We believe this arose primarily from two factors. First, the authors used a restrictive definition of hMT+/V5; they chose voxels that responded better to coherent motion than to random motion. Many MT cells in the macaque, however, respond vigorously to random motion (Churan and Ilg 2002; Krekelberg and Albright 2005; Thiele et al. 2000). Our definition of hMT+/V5 included these cells. Second, the block design of Wade et al. looked for voxels that responded better to Glass sequences than to random motion. However, if cells responsive to random motion and implied motion are spatially interleaved, no difference in activation between the blocks would be expected. The adaptation technique, on the other hand, can resolve selectivity at a spatial resolution below that of a single voxel. Thus the findings of Wade et al. do not contradict ours; at the scale of MRI voxels there is no part of hMT+/V5 specialized for the detection of implied motion, although our study shows that hMT+/V5 does contain spatially intermingled subpopulations of neurons selective for implied motion.

Mechanisms for implied motion perception

Our data confirm and extend the findings of Krekelberg et al. (2003), who showed that a subpopulation of cells in the macaque superior temporal sulcus (MT and MST) responds to implied motion as if it is real motion. These cells responded best to those Glass patterns that evoked the strongest sense of motion and the cells' selectivity for real motion carried over to implied motion. This generalization suggests that these cells extract motion signals independent of the real or implied motion cue that delivers the signal. This cue invariance was also evident in the imaging data obtained from dorsal motion areas, but not ventral areas.

The implied motion selectivity in the dorsal stream could arise from direct judicious sampling of neurons in the early visual areas (V1, V2). Both the single-cell data (Smith et al. 2002) and our imaging data show that these areas contain the necessary local orientation information. Moreover, in MT a subset of cells responds to oriented features that are aligned along the preferred direction of motion of the cell (Albright 1984; Maunsell and Van Essen 1983). The oriented form features of Glass patterns could activate these cells and thereby signal motion along those oriented features. In line with the percept of motion in plaids, these type II MT cells respond to plaid stimuli containing two directions of motion as if they contain a single direction of motion (Rodman and Albright 1989). Thus they are promising candidates for the computation needed to extract coherent motion from the multiple balanced motion signals in Glass patterns. At the same time, however, it is important to note that the majority of MT cells (nearly 80%) respond preferentially to oriented features orthogonal to their preferred direction of motion (Albright 1984). Activation of these cells by Glass patterns would signal motion orthogonal to the motion implied by the pattern. Thus even though some MT cells are expected to respond simply to the oriented features in the Glass patterns, this by itself does not explain why motion is perceived along, and not orthogonal to, the oriented features.

Alternatively, implied motion selectivity in dorsal areas could arise from interactions with early form areas (V4, LOC). Our data support this view in that V4 and LOC contain subpopulations selective for the global structure in these patterns. Such an explicit use of orientation information could improve motion processing at high velocities (Geisler 1999). The temporal resolution of current functional imaging is not high enough to test whether selective responses in form areas precede those in the motion areas, and thus we cannot resolve whether implied motion selectivity arises directly from V1 and V2 or through an interaction with V4 and the LOC. However, our study does provide electrode guidance for intracortical recordings that could address this issue. Area V4 and the inferotemporal cortex (IT) of the monkey have form selectivity that is comparable to that of V4 and the LOC in humans (Denys et al. 2004; Gallant et al. 2000; Kobatake and Tanaka 1994). Using simultaneous single-cell recordings from areas that respond to complex form (such as V4 or IT) and MT, it should be possible to test the hypothesis that an interaction between form and motion areas underlies our perception of implied motion. It would be particularly interesting to determine whether that same interaction also underlies the perception of motion in the more cognitive implied motion images, such as a cup about to fall from a table (Kourtzi and Kanwisher 2000a; Senior et al. 2000).

Mechanisms for global form perception

We concentrated on the implied motion percept generated by Glass pattern sequences. These patterns, however, also generate a strong perception of global form. In fact, most of the work on Glass patterns has concentrated on how the local elements that carry the orientation information are bound together to generate the global form percept. Behavioral (Wilson and Wilkinson 1998) and recent ERP studies (Pei et al. 2005) show that the detection of structure in radial or concentric Glass patterns involves pooling of local information beyond the scale of typical V1 receptive fields. A case study by Gallant et al. (2000) strongly implicated V4 because a lesion involving V4 significantly disrupted a patient's ability to detect the global structure in Glass patterns. Single-cell studies in monkeys also point to areas beyond V1 because selectivity for so-called non-Cartesian gratings and complex object features arises in V4 (Gallant et al. 1993, 1996; Kobatake and Tanaka 1994). Moreover, a recent psychophysical study (Clifford and Weston 2005) provides evidence that adaptation to the global structure of Glass patterns also has two components, one likely to originate in the local orientation detectors of V1, the other likely to originate in cells with much larger receptive fields and therefore presumably in extrastriate areas. Our data are compatible with this view. Local orientation selectivity (even for the noisy oriented elements in a Glass pattern) was evident in V1, but selectivity for the global organization of a Glass pattern was observed only in later ventral areas (V4, LOC). In agreement with previous models, this suggests that global form selectivity starts to arise in V4 from appropriate pooling of V1 orientation detectors.

Interestingly, ventral areas were also selective for the structure of global motion patterns. Previous work has also documented this overlap of sensitivity for motion and form (Braddick et al. 2000; Denys et al. 2004). Our data suggest, however, that the selectivity for real motion is largely carried by a different subpopulation of cells than the one selective for the structure of Glass patterns. In other words, whereas the representation of motion information in the human motion complex has a significant degree of invariance with respect to implied and real motion cues, the representation of the form of those same patterns in V4 and LOC is not cue invariant. The reason for this may be that, although motion processing is essentially complete once the velocity is estimated, form information may have to be processed in much more detail to extract further information on the objects that generate it. For such a detailed analysis, an early stage of cue-invariant processing would be detrimental.


    APPENDIX
 
Let P be a sequence in which a new random-dot pattern is chosen every time step. P is a function of two spatial dimensions (x, y) and time (t). The Fourier transform of the sequence, denoted by F(P), is a function of spatial frequencies (k_x, k_y) and temporal frequency (ω). The power spectrum of P is written as F(P)F*(P), where * denotes the complex conjugate. Because dot patterns are chosen randomly in each frame of the sequence, there are, on average, no space–time correlations in P. Averaged over time, each direction of motion is equally likely to occur, and the motion signals are balanced. In Fourier space, the balance in directional signals results in a stimulus power spectrum that is symmetric around the ω-axis in k–ω space. We refer to such a stimulus as one that has no coherent motion energy. (By contrast, a stimulus whose power spectrum is consistently oriented in k–ω space is referred to as a real motion stimulus.)
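The distinction between balanced and coherent motion energy can be illustrated numerically. The following is a minimal sketch, not the analysis used in this study: it assumes a single spatial dimension, circular boundaries, and a hypothetical imbalance index (the function names, sizes, dot density, and the index itself are all assumptions made for illustration). The index compares the power in the two pairs of opposite quadrants of k–ω space; it should be close to zero for a sequence of independent random-dot frames and far from zero for a rigidly drifting pattern.

```python
import numpy as np

def motion_energy_imbalance(seq):
    """Hypothetical index of directional imbalance for seq of shape (time, space).

    Compares power in the k-omega quadrants where k and omega have the same
    sign with power in the quadrants where they have opposite signs.
    Returns a value in [-1, 1]; values near 0 mean balanced motion signals.
    """
    power = np.abs(np.fft.fft2(seq)) ** 2
    omega = np.fft.fftfreq(seq.shape[0])[:, np.newaxis]  # temporal frequency
    k = np.fft.fftfreq(seq.shape[1])[np.newaxis, :]      # spatial frequency
    same = power[(omega * k) > 0].sum()
    opposite = power[(omega * k) < 0].sum()
    return (same - opposite) / (same + opposite)

def random_dot_sequence(n_frames=64, n_pixels=64, density=0.1, seed=0):
    """A new, independent random-dot pattern in every frame (assumed 1-D)."""
    rng = np.random.default_rng(seed)
    return (rng.random((n_frames, n_pixels)) < density).astype(float)

def drifting_sequence(n_frames=64, n_pixels=64, density=0.1, seed=0):
    """One random-dot pattern that shifts by one pixel per frame (real motion)."""
    rng = np.random.default_rng(seed)
    frame = (rng.random(n_pixels) < density).astype(float)
    return np.stack([np.roll(frame, t) for t in range(n_frames)])

balanced = motion_energy_imbalance(random_dot_sequence())
coherent = motion_energy_imbalance(drifting_sequence())
print(f"random-dot sequence: {balanced:+.3f}, drifting pattern: {coherent:+.3f}")
```

The sign of the index for the drifting pattern depends on the drift direction and on the sign convention of the discrete Fourier transform; only its magnitude matters here.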

To construct a Glass pattern sequence from a sequence of random-dot patterns, each element in the original sequence P is shifted along some one-dimensional coordinate. For translational Glass patterns this is a simple spatial translation; for concentric and radial Glass patterns this is a translation in polar coordinates. We will represent this translation/rotation/expansion operation by the operator S, which shifts a pattern by an amount s. With this notation, a Glass pattern (G) is simply the sum of a random-dot pattern sequence and that same sequence after the operation S

$$G = P + SP$$

To assess the motion energy in the Glass pattern, we determine its power spectrum

$$F(G)F^*(G) = F(P + SP)\,F^*(P + SP)$$

Because the Fourier transform is a linear operation, we can write this as

$$F(G)F^*(G) = [F(P) + F(SP)]\,[F^*(P) + F^*(SP)]$$

Because the operator S is a simple translation in the space–time domain, it can be written as a multiplication by $e^{iks}$ in the Fourier domain. The spatial frequency k here refers to the dimension along which the operator S operates, and s is the amount by which operator S shifts P (the Glass shift). With the transformation $F(SP) = e^{iks}F(P)$, we can write the above formula as

$$F(G)F^*(G) = |F(P)|^2\,(1 + e^{iks})(1 + e^{-iks}) = 2\,|F(P)|^2\,[1 + \cos(ks)]$$

In words: the power spectrum of the Glass pattern is equal to the power spectrum of the underlying random-dot pattern [$|F(P)|^2$] times a factor that depends only on the spatial frequency k. Importantly, because cos(ks) = cos(–ks), this term is symmetric in k.
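The relation between the power spectrum of a Glass pattern and that of the underlying random-dot pattern can be checked numerically. The sketch below is an illustration under stated assumptions rather than the stimulus code used in the experiments: it handles a single translational Glass frame, uses a circular shift so that the discrete shift theorem holds exactly, and expresses k in radians per pixel. The function name, image size, dot count, and shift are hypothetical choices.

```python
import numpy as np

def glass_power_check(n=64, n_dots=200, shift=3, seed=0):
    """Compare the power spectrum of G = P + SP with |F(P)|^2 * 2[1 + cos(ks)].

    Assumes a translational Glass pattern with a circular shift of `shift`
    pixels along x (illustrative choice, not from the original methods).
    """
    rng = np.random.default_rng(seed)

    # Random-dot frame P: unit-intensity dots at random pixel positions.
    p = np.zeros((n, n))
    rows = rng.integers(0, n, n_dots)
    cols = rng.integers(0, n, n_dots)
    np.add.at(p, (rows, cols), 1.0)

    # Glass pattern G = P + SP, where S shifts the pattern along x.
    g = p + np.roll(p, shift, axis=1)

    power_p = np.abs(np.fft.fft2(p)) ** 2
    power_g = np.abs(np.fft.fft2(g)) ** 2

    # Spatial frequency along the shift dimension, in radians per pixel.
    k = 2.0 * np.pi * np.fft.fftfreq(n)
    factor = 2.0 * (1.0 + np.cos(k * shift))

    # The factor depends only on k along x, so it broadcasts over rows.
    predicted = power_p * factor[np.newaxis, :]
    return power_g, predicted, factor

power_g, predicted, factor = glass_power_check()
print("identity holds:", np.allclose(power_g, predicted))
```

Because the multiplicative factor is an even function of k, it cannot by itself introduce a directional asymmetry into the spectrum of the random-dot pattern.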