We recorded responses to apparent motion from directionally selective neurons in primary visual cortex (V1) of anesthetized monkeys and middle temporal area (MT) of awake monkeys. Apparent motion consisted of multiple stationary stimulus flashes presented in sequence, characterized by their temporal separation (Δt) and spatial separation (Δx). Stimuli were 8° square patterns of 100% correlated random dots that moved at apparent speeds of 16 or 32°/s. For both V1 and MT, the difference between the response to the preferred and null directions declined with increasing flash separation. For each neuron, we estimated the maximum flash separation for which directionally selective responses were observed. For the range of speeds we used, Δx provided a better description of the limitation on directional responses than did Δt. When comparing MT and V1 neurons of similar preferred speed, there was no difference in the maximum Δx between our samples from the two areas. In both V1 and MT, the great majority of neurons had maximal values of Δx in the 0.25–1° range. Mean values were almost identical between the two areas. For most neurons, larger flash separations led to both weaker responses to the preferred direction and increased responses to the opposite direction. The former mechanism was slightly more dominant in MT and the latter slightly more dominant in V1. We conclude that V1 and MT neurons lose direction selectivity for similar values of Δx, supporting the hypothesis that basic direction selectivity in MT is inherited from V1, at least over the range of stimulus speeds represented by both areas.
As visual information moves through the brain, the representation of the original retinal stimulus undergoes a series of transformations. For example, orientation selectivity emerges in the transformation between the lateral geniculate nucleus (LGN) and primary visual cortex (V1) (Hubel and Wiesel 1965, 1968). It is common to assume that similar transformations occur between V1 and the secondary visual areas to which it projects. For instance, V1 provides a major ascending input (Maunsell and Van Essen 1983b; Ungerleider and Desimone 1986) to the middle temporal area (MT), and it would be reasonable to assume that the directional motion selectivity found in MT neurons is the outcome of transformations that occur between V1 and MT. However, anatomical and physiological evidence argues that the inputs to MT derive selectively from the minority of V1 neurons that are themselves strongly direction-selective (Lund et al. 1976; Maunsell and Van Essen 1983b; Movshon and Newsome 1996; Shipp and Zeki 1989; Ungerleider and Desimone 1986). Apart from their larger receptive fields (Britten and Heuer 1999; Dubner and Zeki 1971; Maunsell and Van Essen 1983a), MT neurons respond to the motion of simple stimuli in much the same manner as do direction-selective V1 neurons.
Current knowledge highlights several transformations that might occur between V1 and MT. Direction selectivity could become more closely tied to global object motion, an idea supported by responses to plaid patterns (Movshon et al. 1986; Rodman and Albright 1989; Stoner and Albright 1992). Motion could be extracted from higher-order features of the stimulus, an idea supported by more prominent responses to second-order motion in MT (Albright 1992; O'Keefe and Movshon 1998). Finally, the larger receptive fields of MT neurons could allow directional mechanisms to work over larger spatial scales (Mikami et al. 1986b). The last of these proposals appears the most mundane, but is perhaps the most profound. It implies that the most basic selectivity of MT neurons is not entirely dependent on that in V1 but is created anew either by the V1 to MT projection or by MT-specific mechanisms.
Stimuli that provide “apparent motion” allow parameterized degradation of motion, and quantitative analysis of the temporal and spatial limits of directional responses. Apparent motion is composed of a sequence of flashes of a stimulus, with apparent speed defined as the ratio of the spatial and temporal intervals between flashes: Δx/Δt. For a range of small flash separations, directionally selective neurons in both V1 and MT respond to apparent motion as if it was real motion. Once flash separation increases beyond a critical point, neurons in both V1 and MT lose direction selectivity (Churchland and Lisberger 2001; Mikami et al. 1986a, b). Roughly speaking, direction selectivity is limited by a maximum value of Δt for slow apparent speeds, and a maximum value of Δx for faster speeds. Mikami et al. (1986b) reported that V1 neurons lost direction selectivity for smaller values of Δx than did MT neurons. If MT neurons retain directional selectivity for values of Δx where V1 neurons do not, then it follows that MT must benefit from directional mechanisms that do not depend on directional responses in V1.
There are a number of reasons to wish to re-examine this conclusion. First, an MT neuron receives input from multiple V1 neurons, and there is likely a nonlinear relationship between the spike rate of an MT neuron and the spike rates of its inputs (Anderson et al. 2000; Carandini and Ferster 2000; Carandini et al. 1997; Heeger 1992). An MT neuron could therefore be considerably more directional than its average input, even without assuming additional mechanisms of directionality. Second, Mikami et al. did not use the same visual stimuli when recording neural responses in V1 and MT. They presented a narrower bar when recording in V1 with the intent of driving more robust responses. This would seem appropriate: the alternative strategy could have raised the criticism that V1 neurons had reduced resistance to apparent motion only because they were responding to nonoptimal stimuli. However, models of direction selectivity predict that bar width may affect the spatial separation at which direction selectivity is lost (Adelson and Bergen 1985; Borst and Egelhaaf 1989; Watson et al. 1986). Last, Livingstone et al. (2001) analyzed MT responses to white-noise stimuli and found that the spatial scale of directional interactions was smaller than a typical V1 receptive field, in contrast to the conclusion of Mikami et al. To re-examine this issue, we performed an experiment similar to that of Mikami et al. but used the same stimuli for both areas. We found that neurons in MT and V1, when matched for eccentricity and preferred speed, lose directionality over a similar range of spatial flash separations. It is thus likely that MT neurons inherit their basic directional selectivity from V1. However, like Mikami et al., we found that preferred speeds were typically much higher in MT. This may indicate either the presence of MT-specific directional mechanisms dedicated to fast speeds or a substantial overrepresentation of fast speeds in the V1 to MT projection.
Neural recordings were made in V1 and MT of four Old World monkeys. The data from MT were recorded from two awake rhesus macaques (Macaca mulatta). Other data from these recordings were reported by Churchland and Lisberger (2001) as were our basic methods for recording from awake, behaving monkeys. Area MT was located based on the well described response properties of MT neurons (e.g., Maunsell and Van Essen 1983a), the response properties of neurons in surrounding areas V4 and MST (e.g., McAdams and Maunsell 2000; Newsome et al. 1988), and the progression of white matter, gray matter, and lumen of the superior temporal sulcus encountered prior to reaching MT. It is unlikely but possible that directional neurons in V4 near the MT/V4 border may rarely have been identified as MT neurons. A small number of such errors would not affect our interpretation of the data. Recordings from V1 were made in two anesthetized, paralyzed macaques (1 M. fascicularis and 1 M. mulatta) using techniques that are reported in detail elsewhere (Priebe et al. 2002). Area V1 was visualized before inserting electrodes so that recording locations were quite certain. Spikes from each individual neuron were discriminated by two time/amplitude windows (Bak, DDIS-1) that triggered a logic pulse. Accepted waveforms were stored on a storage oscilloscope to verify that only one waveform shape was present and that there was the expected refractory period between spikes. The time of each logic pulse was recorded by the computer with a time resolution of 10 μs. All experimental methods had been approved in advance by the Institutional Animal Care and Use Committee at University of California San Francisco.
Stimuli were square patches of moving dots presented on 12- and 19-in diagonal analog oscilloscopes. Patches contained an average of 24 randomly spaced dots, bounded by an invisible 8° square aperture, centered on the receptive field of the neuron under study. Individual dots were roughly 0.2° across, and their luminance was 1.6 cd/m². The dots disappeared on reaching the far edge of the aperture and were replaced by dots that appeared on the opposite edge. The control signals for the oscilloscopes were provided by the D/A converters of a digital signal processing (DSP) board that ran in a Pentium computer. All stimuli provided apparent motion: the motion of each dot was created by producing sequential flashes with a given spatial and temporal separation, referred to as Δx and Δt, respectively. Apparent speed is defined as Δx/Δt. Due to hardware constraints, the minimum value of Δt was 4 ms, and larger values of Δt were required to be integer multiples of 4 ms. Spatial resolution was determined by the D/A converters on the DSP board, which provided 65,536 pixels across each dimension of the video screen.
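The relationship among apparent speed, Δt, and Δx can be made concrete with a short sketch. Only the definition speed = Δx/Δt and the 4-ms granularity of Δt come from the text; the function name and the particular list of Δt values are illustrative assumptions and are not the contents of Table 1.

```python
def flash_separation_deg(speed_deg_per_s, dt_ms):
    """Spatial flash separation (deg) implied by an apparent speed and a flash interval."""
    # Hardware constraint from the text: dt is an integer multiple of 4 ms, minimum 4 ms.
    if dt_ms < 4 or dt_ms % 4 != 0:
        raise ValueError("dt must be a positive integer multiple of 4 ms")
    return speed_deg_per_s * dt_ms / 1000.0


# Illustrative dt values only (assumed; the values actually tested are in Table 1).
example_dts_ms = [4, 8, 16, 32, 64]
for speed in (16, 32):
    for dt in example_dts_ms:
        dx = flash_separation_deg(speed, dt)
        print(f"{speed} deg/s, dt = {dt} ms -> dx = {dx:.3f} deg")
```

Under these definitions, a speed of 16°/s with Δt = 64 ms corresponds to Δx of about 1.02°, and a speed of 128°/s with Δt = 4 ms corresponds to Δx of about 0.51°.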
To maintain a constant mean luminance of the target, the luminance of each dot flash was varied linearly with the value of Δt: if Δt was doubled, so was the luminance of each flash. For stimulus speeds <64°/s, we assumed that we were providing smooth motion when Δt was 4 ms and that the neural responses were the same as they would have been with truly smooth motion. This assumption is justified by previous research (Churchland and Lisberger 2000, 2001) and by the finding that the neural response was always very similar when Δt was 4 and 12 ms. Each individual flash had a duration of 160–2,560 μs, depending on luminance. The presentation of dots during a flash was essentially synchronous, followed by an interval when no dots were present until the next flash. The specifications of the display oscilloscopes indicate that the phosphor decayed to 10% of its maximal level in 10 μs to 1 ms.
Visual stimuli were presented in “trials,” where each trial provided target motion at a given speed and Δt. Each experiment used a list of trials. The presentation order of the list was shuffled randomly, and each trial was presented once. After completion of all trials, the list was re-shuffled and presented again. During each trial, the stimulus was present for 500 ms and was always moving during this time. For MT experiments using awake monkeys, the animal was required to fixate a small dot during the course of the trial to allow the stimulus to be placed in the receptive field of the neuron under study. The fixation point appeared 800 ms before the stimulus and was extinguished 300 ms after stimulus offset. A juice reward was delivered if fixation was accurate within 4–5° throughout the course of the trial. Actual fixation was typically much more accurate, with the exception that fast stimuli presented near the fovea evoked small smooth eye movements that the monkey was unable to suppress completely. If the monkey broke fixation, then the trial was aborted and placed at the end of the list to be completed before the list was shuffled and repeated. Churchland and Lisberger (2001) analyzed eye movements during fixation for these experiments and showed that smooth eye velocities were very small (0.1–0.6°/s) compared with stimulus velocity, especially for larger values of Δt: for values of Δt ≥32 ms, eye velocity ranged from 0.01 to 0.22°/s, amounting to <1.5% of retinal image velocity. From the standpoint of the present study, the small smooth eye velocity present during fixation would reduce the retinal Δx slightly and would result in a very slight overestimation of the maximum spatial separation tolerated by a given MT neuron.
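The trial-sequencing rule described above (shuffle the list, present each trial once, and move an aborted trial to the end of the list) can be sketched as follows. This is a minimal illustration and not the laboratory's actual control software; `run_block` and the `run_trial` callback are hypothetical names, and `run_trial` is assumed to return True when fixation was maintained.

```python
import random
from collections import deque


def run_block(trial_list, run_trial, rng=random):
    """Present every trial once in shuffled order; aborted trials go to the end of the list."""
    order = list(trial_list)
    rng.shuffle(order)          # new random order for each pass through the list
    queue = deque(order)
    completed = []
    while queue:
        trial = queue.popleft()
        if run_trial(trial):    # hypothetical callback: True if fixation held
            completed.append(trial)
        else:
            queue.append(trial)  # aborted: retry before the list is reshuffled
    return completed
```

Calling `run_block` repeatedly on the same list reproduces the shuffle-and-repeat structure; each call returns the trials in the order in which they were successfully completed.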
For each neuron, we began by presenting a series of trials designed to estimate receptive field location and preferred direction. We then tested the response to apparent motion with the dot aperture centered on the receptive field and the axis of motion aligned with the preferred-null axis of the neuron. Apparent motion was presented at 16 and 32°/s at a variety of flash separations. Table 1 shows the presented values of Δx and Δt for the two speeds. Apparent motion was presented in both the preferred and null directions of the neuron under study. Speed tuning was assessed in trials that were interleaved with the apparent motion stimuli. For these trials, speeds ranged from 0 to 128°/s while Δt was fixed at 4 ms (Δx = 0–0.51°). An exception was the first experiment on V1, for which the trials used to test speed tuning were presented in a separate block that preceded the apparent motion stimuli. Also for this monkey, the apparent motion stimuli moved for 800 ms rather than the 500 ms used for the other three monkeys.
Analysis of neural responses
Neural responses were analyzed by computing peristimulus time histograms of spike rate and the average spike rate from the time of stimulus onset until 600 ms later. This latter limit was 100 ms after stimulus offset for most experiments but was actually 200 ms before stimulus offset for the first V1 experiment. Apart from increasing statistical variability slightly, limiting the window to 600 ms had no effect on the results of this experiment. We therefore chose to retain the same analysis window for all experiments to facilitate comparison.
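As an illustration of the windowed measure described above, the following sketch computes the average spike rate over a fixed 600-ms window beginning at stimulus onset. The data layout (one list of spike times per trial, in ms relative to stimulus onset) and the function name are assumptions made for the example.

```python
def mean_rate(spike_times_ms_per_trial, window_ms=600.0):
    """Average firing rate (spikes/s) from stimulus onset until window_ms later."""
    if not spike_times_ms_per_trial:
        raise ValueError("need at least one trial")
    # Count spikes falling inside the analysis window on each trial.
    counts = [
        sum(1 for t in trial if 0.0 <= t < window_ms)
        for trial in spike_times_ms_per_trial
    ]
    mean_count = sum(counts) / len(counts)
    return mean_count / (window_ms / 1000.0)
```

Using the same window for every experiment, as the text describes, keeps rates comparable even though the stimulus lasted 800 ms in the first V1 experiment and 500 ms in the others.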
For each neuron, we estimated two scalar quantities. The first was preferred speed. Average spike rate in response to motion in the preferred direction was plotted as a function of speed for effectively smooth motion (Δt = 4 ms). Data were fitted with the function (1) where R(s) fits the response at speed s, Rbase sets the baseline firing rate, Rmod defines the firing rate modulation, p is the preferred speed, σ is the tuning width, and ζ is the skew. The parameters of Eq. 1 were adjusted to achieve an optimal least squared fit to the data.
The second quantity we estimated was Δxmax, the largest flash separation for which the neuron gave a response that was directionally selective. We define the “opponent response” for a given stimulus as the difference between the average responses in the preferred and null directions. For each neuron, the opponent response was plotted as a function of Δx and fitted with a sigmoid function (2) where R(Δx) fits the opponent response at a given value of Δx, Ro defines the maximum opponent response, g is the gain of the decline in directional firing, and Δxmax is the value of Δx where R(Δx) has declined to 50% of its maximum value. Values of Δxmax were computed separately for stimulus speeds of 16 and 32°/s. We chose to use the opponent response as our primary measure of neural responses because it directly assesses the magnitude of the difference in preferred and null direction responses. The central question being asked is whether, for large values of Δx, there is sufficient difference between the preferred and null responses in V1 to account for the difference in MT. Note that we are not asking if V1 neurons are as stringently directional as MT neurons. In general they are not, even for smooth motion, but it is assumed that an opponent computation between V1 and MT is sufficient to create more stringently directional neurons in MT. For this reason, we chose to use the opponent response, rather than the directional index.
Nevertheless, to allow direct comparison of our data with those of Mikami et al. (1986b), we also computed the directional index as they did: (Rpref − Rnull)/Rpref, where Rpref and Rnull are the average firing rates during stimulus motion in the preferred and null directions. A directional index of unity indicates that a neuron responds during motion in the preferred direction and is silent during motion in the null direction, whereas a directional index of zero indicates no directional tendency. Note that the directional index cannot become greater than one, but it becomes negative if the response to motion in the null direction exceeds that for motion in the preferred direction, and it can fall below negative one if the null response is more than twice the preferred response, as occasionally happens when responses become very weak at large flash separations. To avoid this problem, we limited the directional index to values between one and minus one. We then fitted the relationship between the directional index and Δx with the sigmoid defined by Eq. 2 and compared the values of Δxmax for V1 and MT.
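The two response measures can be written compactly. The sketch below follows the definitions given in the text (opponent response Rpref − Rnull; directional index (Rpref − Rnull)/Rpref, limited to the range from minus one to one). The function names are illustrative, and the handling of a preferred-direction rate of exactly zero is an assumption, because the text does not say how that case was treated.

```python
def opponent_response(r_pref, r_null):
    """Difference between mean rates in the preferred and null directions (spikes/s)."""
    return r_pref - r_null


def directional_index(r_pref, r_null):
    """(Rpref - Rnull) / Rpref, limited to values between -1 and 1."""
    if r_pref == 0:
        # Assumption: not specified in the text. Treat a silent neuron as
        # non-directional and a null-only response as fully reversed.
        return 0.0 if r_null == 0 else -1.0
    di = (r_pref - r_null) / r_pref
    return max(-1.0, min(1.0, di))
```

For example, rates of 20 and 0 spikes/s give an index of 1, equal rates give 0, and rates of 2 and 10 spikes/s give an unclipped value of −4 that is limited to −1.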
Comparison of methods with those of prior reports
Mikami et al. (1986a, b) examined a variety of issues related to apparent motion, including the mechanisms of direction selectivity, and the relationships of Δxmax and Δtmax to receptive field eccentricity, receptive field size, and preferred speed. We did not design our study to repeat theirs but rather optimized the stimuli, neural selection criteria, and analysis methods to address the single issue of whether MT responses to apparent motion at large values of Δx could be inherited from V1 without the need for additional directional mechanisms. We thought it was important to repeat this aspect of their study because they had used different visual stimuli in V1 and MT, and we wished to verify their conclusion using the same visual stimulus in the two areas. To address this question, it was essential to use relatively high stimulus speeds, and so we chose to test apparent motion only at 16 and 32°/s. As demonstrated by Mikami et al. (1986a, b), directional responses at lower speeds are limited by Δtmax rather than Δxmax, and Δtmax did not vary between the two areas. An unpleasant side effect of the use of higher stimulus speeds is that we had to find and select V1 neurons that responded to those speeds, meaning that many V1 neurons with low preferred speeds were excluded from our study. Had we included such neurons in our population (and used slower stimuli to test them) the data we obtained would have been ambiguous from the standpoint of the value of Δxmax, as Δtmax would have increasingly become the limiting factor.
Comparison of basic response properties in MT and V1
We recorded from 108 directional neurons in area MT of two awake monkeys and 30 directional neurons in area V1 of two anesthetized monkeys. For area V1, many more neurons were isolated and recorded briefly but were not studied in detail because they did not emit directional responses to dot patches that moved at speeds of 16 or 32°/s. The majority of neurons in V1 are not strongly directional, and only a subset of directional neurons respond well to sparse dot patterns moving at these relatively high speeds. Thus the data we report from V1 constitute a heavily screened subset of neurons. This was true to a much lesser degree in area MT, where the vast majority of neurons responded directionally to sparse dot patterns that moved at speeds of 16 or 32°/s.
For area MT, receptive field eccentricities varied from 2.7 to 8.9° with a mean of 6.0°. For area V1, eccentricities varied from 3.2 to 17.0° with a mean of 6.5°. With the exception of the two V1 neurons with the most eccentric receptive fields (centered at 17.0° and 15.6°), every V1 neuron we studied had a receptive field that overlapped in eccentricity with many of the MT neurons in our sample.
Despite intentional selection of the most directional neurons in V1, the responses of V1 neurons to effectively smooth motion (Δt = 4 ms) were less directional on average than were the responses of MT neurons. For effectively smooth motion at 16°/s, MT neurons with preferred speeds between 8 and 32°/s (n = 42) had a mean directional index of 0.92. V1 neurons with preferred speeds in the same range (n = 18) had a mean directional index of 0.71. The average opponent response, defined as the difference in spike rate between the responses to motion in the preferred and null directions, was 70 spikes/s for MT and 24 spikes/s for V1. The difference in spike rates observed for the two areas is consistent with previous findings that random-dot textures are more effective stimuli for MT than for V1 neurons (Skottun et al. 1988), although the difference also could result from our use of anesthesia for V1 recordings.
In agreement with Mikami et al. (1986b), neurons recorded in area MT tended to have higher preferred speeds (mean = 27°/s, range = 0.0–140°/s) than did neurons recorded in V1 (mean = 11°/s, range = 2.1–29°/s, P < 0.001). The true difference is probably even larger. For V1, recordings were aborted if the neuron showed no directional response for stimulus motion at speeds of 16 or 32°/s. Thus many V1 neurons with slow preferred speeds were excluded from our sample.
Estimation of maximum spatial separation
Mikami et al. (1986a) found that, for high stimulus speeds, the maximum spatial separation between target flashes (Δxmax) was the primary factor limiting directional responses and was similar across speeds. For slow stimulus speeds, the maximum temporal separation between target flashes (Δtmax) was the primary limiting factor, and was similar across speeds. Our pilot studies agreed with these findings. To estimate Δxmax, we therefore used relatively high stimulus speeds: 16 and 32°/s. There is little point in testing responses to slower speeds; at those speeds Δtmax would become increasingly important as the limiting factor, making it impossible to estimate Δxmax. Conversely, it would have been difficult to use still faster speeds, because very few V1 neurons at the visual field eccentricities we tested emitted directional responses at speeds faster than 32°/s.
Figure 1 shows the responses of a representative V1 neuron to apparent motion stimuli and illustrates how we estimated preferred speed and Δxmax. Each subpanel of Fig. 1, A, C, and E, contains a pair of peristimulus time histograms (PSTHs) that show responses for motion in the preferred and null directions. For motion in the null direction, increasing firing rates are plotted as downward bars. Figure 1A illustrates how we assessed speed tuning using effectively smooth motion. For this neuron, responses to motion in the preferred direction increased as stimulus speed increased from 4 to 16°/s and then decreased as speed was increased further. The neuron did not respond to motion in the null direction at any speed. Plotting the response to motion in the preferred direction as a function of stimulus speed (Fig. 1B) and fitting with Eq. 1 yields a preferred speed of 17°/s.
For a given stimulus speed, increases in the flash separation eventually led to a decline in directionality. Figure 1C shows the response of the same neuron as in A to apparent motion at 16°/s. The response to motion in the preferred direction declined as the value of Δx increased, whereas the response to motion in the null direction was unaffected. We define the opponent response as the difference between the responses in the preferred and null directions: Rpref − Rnull. Plotting mean opponent response as a function of Δx (Fig. 1D) and fitting with Eq. 2 allows us to compute Δxmax, the value where the fitted sigmoid fell to half its maximal value. For this neuron, Δxmax was 0.71° for a stimulus speed of 16°/s. Figure 1, E and F, uses the same approach to show data from the same neuron for a stimulus speed of 32°/s, for which Δxmax was 1.02°. Comparing Fig. 1, C and E, illustrates that the point at which the directional response is lost is not simply tied to the value of Δt. The neuron responded strongly to motion with a Δt of 32 ms when the stimulus moved at 16°/s but not when it moved at 32°/s. Thus Δt cannot be the primary limiting factor. Conversely, at both stimulus speeds strongly directional responses were lost for values of Δx somewhere between 0.7 and 1°, suggesting that Δx is the limiting factor. Across V1 neurons, the value of Δxmax was on average 1.21 times larger when measured at 32°/s than when measured at 16°/s. Across MT neurons the ratio was on average 1.15. Note that the ratio would be 1 if Δxmax was the sole limiting factor and 2 if Δtmax was the sole limiting factor. Thus for stimulus speeds of 16 and 32°/s, directionality is limited largely but not entirely by Δxmax, in both MT and V1.
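The expected values of this ratio follow directly from the definition of apparent speed. As a sketch under idealized assumptions, write v₁ = 16°/s and v₂ = 32°/s, and suppose that directionality is lost either at a fixed spatial limit or at a fixed temporal limit (the symbol Δt_max below denotes that hypothetical fixed temporal limit):

```latex
\[
\text{fixed spatial limit:}\qquad
\frac{\Delta x_{\max}(v_2)}{\Delta x_{\max}(v_1)} = 1,
\]
\[
\text{fixed temporal limit:}\qquad
\frac{\Delta x_{\max}(v_2)}{\Delta x_{\max}(v_1)}
  = \frac{v_2\,\Delta t_{\max}}{v_1\,\Delta t_{\max}}
  = \frac{32}{16} = 2 .
\]
```

The observed mean ratios of 1.21 (V1) and 1.15 (MT) lie much closer to 1 than to 2, which is the basis for describing the limit as largely spatial.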
The data in Fig. 1 are representative of the variability present in our recordings from V1 neurons. Although each data point is based on 23–24 trials, the SE is moderately large, and the average values of firing rate show some variability when plotted as a function of Δx (Fig. 1, D and F). We see two sources of variation: the finite number of spike trains recorded and the sparseness of the random dot patterns themselves. The dot patterns varied from trial to trial and were sparse enough so that at any given moment there was not always a dot present in the small receptive fields of V1 neurons. Variability in the data, and the resulting uncertainty in the estimate of Δxmax, is of particular concern for neurons with very high or low preferred speeds, which often responded weakly to stimulus motion at both 16 and 32°/s. To avoid unreliable estimates, neurons were excluded from the analysis of Δxmax if, for the smallest flash separation at the speed in question, the ratio of the mean spike rate to the standard error was <4. The example neuron in Fig. 1 had ratios of 7.6 and 5.4 for target motion at 16 and 32°/s. Across V1 neurons, the average ratio was similar, 7.7 for 16°/s (range = 1.1–23, 24/30 neurons included) and 5.2 for 32°/s (0.8–13, 10/30 neurons included). MT neurons had higher firing rates and larger receptive fields over which variations in the random dot pattern would tend to average out, and typically produced more reliable estimates of mean firing rate. Across MT neurons, the average ratio of the mean to SE was 11.8 for a stimulus speed of 16°/s (range = 2.6–48, 100/108 neurons included) and 10.9 for a speed of 32°/s (0.2–36, 96/108 neurons included).
Comparison of Δxmax in V1 and MT
Table 2 shows a comparison of mean values of Δxmax for V1 and MT. Contrary to the findings of Mikami et al. (1986b), the mean values of Δxmax across our samples of neurons were similar for MT and V1. Note, however, that previous studies have found that Δxmax correlates with preferred speed for both V1 and MT (Churchland and Lisberger 2001; Mikami et al. 1986b). Therefore it is most appropriate to compare Δxmax between MT and V1 neurons with similar preferred speeds. To do so, we grouped our samples of MT and V1 neurons according to their preferred speeds, normalized and then averaged the opponent response across neurons at each value of Δx, and plotted the averaged opponent responses as a function of Δx for each range of preferred speeds (Fig. 2). For each group, the mean opponent response was fitted with a sigmoid to estimate the value of Δxmax (gray and black vertical lines). Within each group of neurons, the value of Δxmax for neurons recorded in V1 was similar to or larger than that for MT. As expected, the value of Δxmax was slightly higher for the groups of neurons with higher preferred speeds.
The analysis in Fig. 2 is based on the opponent response: Rpref − Rnull. The analysis of Mikami et al. (1986a, b) was based on the directional index: (Rpref − Rnull)/Rpref. Figure 3 shows that an analysis of Δxmax based on the directional index yields the same result as that in Fig. 2. The mean directional index remained flat across a range of values of Δx and then declined toward zero at the highest values of Δx. The directional index for smooth motion was on average lower for V1 than MT. However, comparison of V1 and MT neurons with the same range of preferred speeds reveals little difference in the value of Δxmax. Thus MT and V1 show similar declines in directionality as a function of Δx, and similar values of Δxmax, regardless of whether the data are analyzed using the opponent response or the directional index.
To compare the values of Δxmax across individual neurons in MT and V1, Fig. 4 plots Δxmax versus preferred speed for each neuron. Data are plotted separately for stimulus speeds of 16 and 32°/s. Over the range of preferred speeds common to both V1 and MT, values of Δxmax were similar. In particular, all V1 neurons plotted within the range of the larger sample recorded in MT. If there is a difference between the samples from the two areas, Δxmax is slightly larger in V1 than in MT, although this would not withstand tests of statistical significance. Thus Figs. 2–4 show that when comparisons are made across neurons with similar preferred speeds, our samples recorded from V1 and MT have similar values of Δxmax. Figure 4 also illustrates that neurons with high preferred speeds were observed more commonly in MT than in V1.
Apparent motion and null direction responses
The example neuron in Fig. 1 showed a strong response to motion in the preferred direction that decreased with flash separation. As Δx increased from 0.06 to 1.02° at 16°/s, the opponent response fell from ∼19 spikes/s to near 0 as a result of a declining response to motion in the preferred direction. Consistent with prior observations (Mikami et al. 1986a; Churchland and Lisberger 2001), it also was common for neurons to exhibit increases in the response to stimulus motion in the null direction. Figure 5 shows an example of such behavior. At the largest flash separations, this neuron responded strongly, but equally, to stimulus motion in the preferred and null directions. Thus for large flash separations, directionality can be lost due to decreases in the preferred direction response and/or increases in the null direction response. To quantify this range of behavior, we focused on a stimulus speed of 16°/s, and defined the “index of null-facilitation” as (3) where P0.06 and N0.06 are the responses to stimulus motion in the preferred and null directions when Δx was 0.06° (Δt = 4 ms). P1.02 and N1.02 are the responses when Δx was 1.02°, the largest value we used. The index represents the proportion of the loss in opponent response that is accounted for by an increase in the response to motion in the null direction. Values of zero indicate that any loss of directionality is due entirely to a decrease in the preferred direction response, while values near one indicate the reverse. The example neuron in Fig. 1 had an index of −0.06 and the example neuron in Fig. 5 had an index of 0.71. Across all neurons recorded from area V1, the mean index was 0.57 with an SD of 0.47. For MT, the mean index was 0.35 with an SD of 0.26. The difference was statistically significant (P < 0.002). Thus in spite of considerable overlap between V1 and MT, there was a significantly greater tendency in V1 for large values of Δx to produce increased responses to the null direction.
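The verbal definition of the index can be turned into a small calculation. The expression for Eq. 3 is not reproduced in this text, so the form below is a reconstruction from the description alone (the increase in the null response divided by the loss in opponent response) and should be read as an assumption rather than as the published equation. The function and argument names are illustrative.

```python
def null_facilitation_index(p_small, n_small, p_large, n_large):
    """Fraction of the lost opponent response attributable to a rise in the null response.

    p_small, n_small: preferred and null responses at the smallest dx (0.06 deg).
    p_large, n_large: preferred and null responses at the largest dx (1.02 deg).
    Assumed form, reconstructed from the verbal description of the index.
    """
    loss = (p_small - n_small) - (p_large - n_large)  # drop in opponent response
    if loss == 0:
        raise ValueError("opponent response did not change; index is undefined")
    return (n_large - n_small) / loss
```

With this form, an unchanged null response gives an index of 0, an unchanged preferred response gives an index of 1, and a small decrease in the null response gives a slightly negative index, consistent with the value of −0.06 reported for the neuron in Fig. 1.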
Our study was designed to ask whether the responses of any given MT neuron could be inherited from its V1 inputs or whether we must posit new directional mechanisms extrinsic to V1. More specifically, we assume that an MT neuron of a given preferred direction and speed receives excitatory inputs from a large number of V1 neurons with similar preferred directions and speeds and inhibitory inputs from V1 neurons with the opposite directional tuning. Can MT responses to apparent motion be accounted for by these assumptions or must MT contain neural mechanisms to create directional responses where there were none in V1? Our approach was to record from V1 and MT using the same visual stimuli and to compare responses of neurons with similar preferred speeds and receptive field eccentricities. V1 neurons were on average less responsive and less directional than MT neurons, but the spatial displacement (Δxmax) at which direction selectivity was lost during apparent motion was similar for the two areas. Thus comparison of the responses of V1 and MT to the same stimuli argues that direction selectivity in MT could be inherited from the responses of directional neurons in V1 at least over the range of preferred speeds common to both areas.
Possible reasons for discrepancies with prior studies
Our results run contrary to those of Mikami et al. (1986b), who found that values of Δxmax were higher in MT than in V1. A likely source of the discrepancy is the different bar widths used by Mikami et al. in testing V1 and MT (0.1 vs. 0.3°). The spatial frequency content of a narrow bar is skewed higher than that of a wide bar. Motion energy models predict that direction selectivity will be lost during apparent motion because of aliasing between the stimulus and the spatiotemporal receptive fields of the motion detectors, a situation that will be more pronounced at higher spatial frequencies (Adelson and Bergen 1985; Watson et al. 1986). This prediction is supported by the finding that spatial frequency tuning correlates with the preferred value of Δx for a single cycle of apparent motion (Baker and Cynader 1986). It is not yet clear if motion-energy models provide an accurate account of the mechanism for direction selectivity, but other candidate models, such as Reichardt detectors, also share the property of being less resistant to apparent motion for higher spatial frequencies (Adelson and Bergen 1985; Borst and Egelhaaf 1989). Thus we think that the choice of stimuli for V1 and MT explains why Mikami et al. found smaller values of Δxmax in V1.
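The aliasing argument can be illustrated with a deliberately simplified sketch. It assumes a single sinusoidal component of spatial frequency f (cycles/°) that is displaced by Δx between flashes, and it takes the signed directional signal of an idealized correlation-type or opponent-energy detector to be proportional to sin(2πfΔx). Under that assumption the signal keeps the correct sign only while Δx is below half a spatial period, so the largest unambiguous displacement is 1/(2f) and shrinks as spatial frequency rises. The function names and example frequencies are hypothetical; this is not a model fitted to the data in this study.

```python
import math


def directional_signal(spatial_freq_cpd, dx_deg):
    """Signed directional signal for one sinusoidal component.

    Assumption: an idealized detector whose opponent output is
    proportional to sin(phase step), where the phase step between
    successive flashes is 2*pi*f*dx.
    """
    return math.sin(2.0 * math.pi * spatial_freq_cpd * dx_deg)


def max_unambiguous_dx(spatial_freq_cpd):
    """Largest displacement (deg) before the phase step reaches half a cycle."""
    return 1.0 / (2.0 * spatial_freq_cpd)


# Hypothetical spatial frequencies in cycles/deg.
for f in (0.5, 1.0, 2.0):
    print(f, max_unambiguous_dx(f))

# At 1 cycle/deg, a 0.25 deg step gives the correct sign,
# whereas a 0.75 deg step reverses it.
print(directional_signal(1.0, 0.25), directional_signal(1.0, 0.75))
```

On this simplified account, a stimulus dominated by higher spatial frequencies, such as a narrower bar, would lose its directional signal at smaller flash separations than a stimulus dominated by lower spatial frequencies.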
It is also possible that MT-specific mechanisms of long-range motion detection explain why Mikami et al. found larger values of Δxmax in MT. Mechanisms of long-range motion detection could be engaged effectively by a single bar, like that used by Mikami et al., but perhaps much less effectively by our random dot stimuli, which moved within a stationary aperture. Psychophysical evidence supports the existence of such long-range mechanisms (Braddick 1980; Tyler 1973), and it seems plausible they might influence responses in MT. In the data of Mikami et al., MT neurons occasionally had values of Δxmax larger than the receptive field of any recorded V1 neuron at the same eccentricity (e.g., at 10° eccentricity, the largest V1 receptive field recorded was ∼3°, whereas a few MT neurons had values of Δxmax as large as 5°). This suggests that, for discrete stimuli, MT may benefit from long-range mechanisms absent in V1. However, given the difficulty of accurately defining and measuring receptive field width, and the variability inherent in estimating Δxmax, nothing concrete can be concluded. In pilot studies, we recorded from a handful of MT neurons in anesthetized animals using stimuli that consisted of a single spot and saw no evidence of long-range mechanisms that could drive robust responses at values of Δx larger than Δxmax (Churchland and Priebe, unpublished observations). In addition, long-range motion has very minor effects on the responses of MT neurons recorded in anesthetized animals, at least when put in conflict with short-range motion (Priebe et al. 2001). Of course, long-range motion may drive stronger responses in MT of awake animals. In summary, it is unclear whether such effects exist at all, and whether they would be robust enough to account for the discrepancy between our data and those of Mikami et al. However, the fact that responses to long-range motion are minimal or absent in the anesthetized animal suggests that if they do exist, they likely depend on feedback from other extra-striate areas and are not a product of the V1 to MT projection.
In general, our results could have been influenced by the fact that we compared recordings from MT in awake monkeys with recordings from V1 in anesthetized animals. Anesthesia might reduce responsiveness to suboptimal stimuli, such as apparent motion stimuli with long flash separations, causing us to underestimate Δxmax. Thus one might expect our methods to result in smaller values of Δxmax in V1 than in MT. As we in fact found that values of Δxmax were similar in the two areas, there is little reason to be concerned regarding the use of anesthesia in V1. Indeed, if anything, one would expect neurons in V1 of awake monkeys to have still larger values of Δxmax, strengthening our conclusions. Further, in pilot experiments recording from MT of anesthetized animals, we found values of Δxmax similar to those found in the awake animals (Churchland and Priebe, unpublished observations). Thus it seems unlikely that anesthetic state has created our results.
A final concern is that we recorded from fewer directional neurons in V1 than in MT due to the difficulty of finding V1 neurons that responded directionally to our stimuli. However, enough neurons were recorded to show that some V1 neurons remained directional at all values of Δx for which MT neurons responded directionally. Conversely, many MT neurons were recorded, enough to render unlikely the possibility that a subset of MT neurons has unusually high values of Δxmax. Herein lies the crux of our argument. For target motion at 16 and 32°/s, we do not see evidence that MT neurons remain directional when V1 neurons have lost their directional responses. Directionality deteriorates in parallel in the two areas. Therefore at least over this range of speeds, it is not necessary to assume that directionality is created anew in MT.
Although our findings contradict those of Mikami et al., they are in agreement with those of Livingstone et al. (2001), who mapped MT receptive-field structure using white-noise stimuli. They found that directional interactions occurred on spatial scales smaller than V1 receptive fields and found no evidence for directional mechanisms operative across larger distances.
Relationship to models of direction selectivity
Models of motion selectivity developed by Adelson and Bergen (1985) and Watson and Ahumada (1985) provide an appealing account of direction selectivity in V1 and MT (Heeger 1992; Simoncelli and Heeger 1998). They indicate that the Δxmax for a given stimulus is an inevitable consequence of a neuron's temporal and spatial frequency tuning. In the context of motion energy models, our results suggest that neurons in V1 and MT operate over similar ranges of preferred temporal and spatial frequencies. In particular, our results suggest that the range of preferred spatial frequencies for MT largely overlaps that in V1 despite the much larger receptive fields of MT neurons.
Our finding of similar spatial limits for responses to apparent motion in MT and V1 is surprising, not only because the opposite has been previously reported, but also because the contrast-sensitivity of responses in MT and V1 provides an a priori reason to expect Δxmax to be larger in MT, even in the absence of any MT-specific directional mechanism. A given MT neuron presumably receives input from a large number of V1 neurons. Because MT has a steeper contrast response curve than V1, one can infer that it responds strongly when its inputs from V1 are responding weakly (Sclar et al. 1990). If MT neurons give strong responses when V1 responses are weak due to low contrast, then they should also give strong responses when V1 responses are weak because of large spatial separations. This logic implies that an MT neuron should have a value of Δxmax at least as large as the largest found in its V1 inputs. Yet, the values of Δxmax were similar across the two areas. We are unsure of the answer to this seeming paradox. Perhaps, given the suddenness of the decline in response with Δx, any “boost” given to MT neurons by the preceding mechanism has a small effect that is lost in the overall variability.
In the context of the preceding paradox, it is worth pointing out that, from a design standpoint, there is little to be gained by constructing neurons with large values of Δxmax. Essentially all real-world motion is smooth or smooth with episodic disappearances of the stimulus. As an example, a deer running behind a fence is moving smoothly when visible and provides real rather than apparent motion. It would thus seem to make little sense for the brain to develop motion processing circuits with a goal of maximizing Δxmax. It would be most adaptive to minimize values of Δxmax (and Δtmax) rather than allowing them to expand between V1 and MT.
How are motion signals transformed between V1 and MT?
The similar values of Δxmax in V1 and MT imply that the directional responses of MT neurons could be inherited from V1 with no need for any additional directional mechanisms. This conclusion may be limited, however, to the range of speeds over which we found responsive neurons in both areas. We found no V1 neurons with preferred speeds >29°/s, whereas such preferred speeds are common in MT (Churchland and Lisberger 2001; Liu and Newsome 2003). Tuning for high speeds may be unique to MT and created by its own directional mechanisms. Alternatively, it is possible that high preferred speeds are present but rare in V1 and that this small minority of V1 neurons projects heavily to MT. A related possibility is suggested by the typically broad nature of speed tuning. Area V1 may contain very few neurons that prefer speeds in the 30- to 128°/s range, but it certainly contains some neurons that respond to that range. Tuning for high speeds in MT could be created via excitatory projections from V1 neurons tuned for moderate speeds and inhibitory projections from V1 neurons tuned for slower speeds. Further experiments will be needed to discriminate between these possibilities. We note that any explanation will have to take into account the fact that MT neurons that are tuned for high speeds exhibit values of Δxmax that fall on a continuum with those of their slower tuned fellows in both MT and V1.
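One of the possibilities above, excitation from V1 neurons tuned for moderate speeds combined with inhibition from V1 neurons tuned for slower speeds, can be made concrete with a toy calculation. The sketch assumes Gaussian tuning on a log2 speed axis, equal tuning widths, and arbitrary preferred speeds and weights; none of these values come from the recordings, and all names are hypothetical. Under those assumptions the rectified difference peaks at a speed above the preferred speed of the excitatory input.

```python
import math


def log_gaussian_tuning(speed, preferred_speed, width_octaves):
    """Gaussian speed tuning on a log2 speed axis (an assumed tuning shape)."""
    offset = math.log2(speed / preferred_speed)
    return math.exp(-0.5 * (offset / width_octaves) ** 2)


def toy_mt_response(speed, exc_pref=16.0, inh_pref=4.0, width=1.5, inh_weight=0.8):
    """Rectified excitation minus weighted inhibition; all parameters are illustrative."""
    excitation = log_gaussian_tuning(speed, exc_pref, width)
    inhibition = log_gaussian_tuning(speed, inh_pref, width)
    return max(0.0, excitation - inh_weight * inhibition)


# Sample speeds from 1 to 256 deg/s in steps of 0.1 octave.
speeds = [2.0 ** (i / 10.0) for i in range(81)]
responses = [toy_mt_response(s) for s in speeds]

# Speed at which the toy MT response is largest.
toy_preferred_speed = speeds[responses.index(max(responses))]
print(round(toy_preferred_speed, 1))
```

The upward shift arises because inhibition from the slower-tuned input removes more of the response on the low-speed flank of the excitatory tuning curve than on its high-speed flank. Whether real V1 to MT projections are organized this way is exactly what the further experiments mentioned above would need to establish.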
We did note a number of other differences between the responses of neurons in MT and V1. Neurons in area MT were more responsive and tended to have a higher directional index than neurons in V1 despite the fact that we selected for V1 neurons with directional responses. MT neurons were also less likely than V1 neurons to show increased responses to motion in the null direction at large flash separations. To account for these observations, either MT receives selective input from the more stringently directional V1 neurons (e.g., Movshon and Newsome 1996) or directionality is sharpened by an opponent computation that lies either within MT or between V1 and MT.
Our findings add to a trend in which some response properties previously thought to arise within MT are in fact plausibly inherited from V1 inputs. We have demonstrated here that the basic motion selectivity of MT neurons is most plausibly inherited from V1, with the possible exception of neurons tuned for high speeds. The form invariance of speed tuning is now known to be similar both in V1 complex cells and in MT (Lisberger et al. 2003; Priebe et al. 2003). Directional responses to second order motion are more common in MT but are certainly present in V1 (O'Keefe and Movshon 1998). It is worth contrasting the V1 to MT projection with the LGN to V1 projection. The LGN to V1 projection in cat embodies a fundamental transformation: orientation selectivity is created by the pattern of projections that converge on individual neurons (Ferster et al. 1996; Kara et al. 2002; Reid and Alonso 1995). It is unclear whether the V1 to MT projection embodies a similarly fundamental transformation. It is entirely possible that the goal of this projection is simply to concentrate motion-selective responses in one area and create larger receptive fields. This is not to deny that some important new properties emerge in MT. For example, neurons that respond to the direction of a pattern rather than its component gratings are found in MT but not V1 (Movshon et al. 1986; Rodman and Albright 1989; although see Guo et al. 2004), and effects of attention are larger in MT than V1 (Treue and Maunsell 1996, 1999). However, such “higher level” properties presumably depend on descending feedback and/or intra-MT connectivity. They are probably not produced simply by the V1 to MT projection. A more likely product of that projection is the structured nature of many MT receptive fields (Allman et al. 1985; Born et al. 2002), although even this is not known.
In summary, it is still unclear whether the V1 to MT projection embodies a fundamental transformation, akin to that observed between the LGN and V1. Our results argue against the most obvious candidate for such a transformation, the creation of direction selectivity between V1 and MT.
This research was supported by the Howard Hughes Medical Institute, by National Institutes of Health Grants R01-EY-03878 and T32-EY-07120, and by a National Defense Science and Engineering Graduate predoctoral fellowship.
We are grateful to S. Ruffner for creating the target presentation software and to S. Tokiyama, K. MacLeod, and E. Montgomery for assistance with animal preparation and maintenance.
Copyright © 2005 by the American Physiological Society