J Neurophysiol (May 1, 2003). 10.1152/jn.00550.2002
Submitted 11 July 2002; accepted in final form 3 December 2002
Department of Neurobiology, Harvard Medical School, Boston, Massachusetts 02115
ABSTRACT
Conway, Bevil R. and Margaret S. Livingstone. Space-Time Maps and Two-Bar Interactions of Different Classes of Direction-Selective Cells in Macaque V-1. J. Neurophysiol. 89: 2726-2742, 2003. We used one-dimensional sparse noise stimuli to generate first-order spatiotemporal maps and second-order two-bar interaction maps for 65 simple and 124 complex direction-selective cells in alert macaque V1. Spatial and temporal phase differences between light and dark space-time maps clearly distinguished simple and complex cell populations. Complex cells usually showed similar direction preferences to light and dark bars, but many of the directional simple cells were much more direction selective to one sign of contrast than the reverse. We show that this is predicted by a simple energy model. Some of the direction-selective simple cells showed multiple space-time-slanted subregions, but others (previously described as S1 cells) had space-time maps that looked like just one subregion of an ordinary simple cell. Both simple and complex cells showed directional interactions (nonlinearities) to pairs of flashed bars (a 2-bar apparent-motion stimulus). The space-time slant of the simple cells correlated with the optimum dX/dT (velocity) of the paired-bar interactions. Some complex cells also showed a space-time slant; the direction of the slant usually correlated with the preferred direction of motion, but the degree of slant correlated with the inferred velocity tuning only when measured by a weighted-centroid calculation. Principal components analysis of the simple-cell space-time maps yielded one fast temporally biphasic component and a slower temporally monophasic component. We saw no consistent pattern for the spatial phase of the components, unlike previous reports; however, we show that principal components analysis may not distinguish between spatial offsets and phase offsets.
INTRODUCTION
In 1959 David Hubel reported
that some neurons in the cat's visual cortex respond to a visual
stimulus only if it is moved in one direction and not the reverse.
Direction-selective cells were later found in the retinas of other
mammals, like the rabbit, but they have not been found in the retinas
of primates. The mechanism for direction selectivity in retinal
ganglion cells is as yet unresolved, although most evidence suggests an
asymmetry between excitatory and inhibitory inputs (Taylor et al. 2000; Yoshida et al. 2001). The mechanism
for cortical direction selectivity is also not well understood. One
might suppose that cortical directional cells inherit their
directionality from direction-selective retinal ganglion cells, but
direction-selective cells are not found in the primate lateral
geniculate nucleus (Dreher et al. 1976; Wiesel and Hubel 1966). Therefore it is more likely that cortical
direction selectivity is generated de novo.
One popular model for cortical direction selectivity proposes that at
the first stage of direction selectivity, the time course of the
response of a directional cell changes progressively across the
receptive field in such a way that the response is slower on the
"preferred" side and faster on the "null" side (Fig. 1A). That
motion perception is based on such linear spatiotemporal filters was
first proposed in theoretical papers (Adelson and Bergen 1985; Watson and Ahumada 1985). Response maps of
such cells when plotted in space and time coordinates would show a
slant, and the slant would correspond to the unit's preferred
velocity. Because of the space-time slant, stimulating the slower part
of the receptive field shortly before stimulating the faster part results in greater overlap in the two responses than when two stimuli
are presented in the opposite sequence (Fig. 1B). A purely linear filter would give directionality only for peak firing rate in
response to a moving bar or response modulation to gratings but not for
total spikes in response to a moving bar (Fig. 1B).
Direction-selective simple cells in both cat and primate visual cortex
can show space-time-slanted receptive fields (DeAngelis et al. 1993; De Valois and Cottaris 1998; De Valois et al. 2000; McLean and Palmer 1989; Movshon et al. 1978a); for many simple cells, the
presence of direction selectivity, the direction of preferred motion,
and the preferred velocity all correlate well with the receptive-field
space-time slant. This correlation has been taken as evidence that the
spatiotemporal slant underlies directionality. However, direction
selectivity is usually stronger than is predicted by linear mechanisms,
which can account for only one-fifth to one-half of the observed
directionality (Albrecht and Geisler 1991; McLean and Palmer 1989; Reid et al. 1991; Tolhurst and Dean 1991). Moreover, some directional
simple cells do not show a space-time slant (Baker and Cynader 1988; Baker 2001; Murthy et al. 1998), so different or additional mechanisms are needed to
account for their directionality. A static nonlinearity (a squaring or
a thresholding) is usually proposed to enhance the directionality
generated by linear mechanisms.
Here we have generated high-resolution space-time maps and sequential two-bar interaction maps for a large number of direction-selective cells in alert macaque V1. We explore how these maps fit with various models for direction selectivity.
METHODS
We recorded single units in V1 of three alert fixating macaque
monkeys. The monkeys were prepared for chronic recording from V1 under
general anesthesia using sterile techniques (Livingstone 1998). The monkeys were trained to keep their gaze within 1°
of a small spot to receive a juice reward. Spikes were used for mapping only if the monkey's eyes were within the fixation window at the time
of stimulus onset; eye movements during a stimulus presentation are
unlikely given that the stimulus durations were 13 ms. Eye position was
determined with a scleral eye coil monitored with a magnetic-field coil
(CNC Engineering, Seattle, WA). To prevent slippage, the eye coils were
sewn to the sclera using fine absorbable sutures. The eye-position
monitor has a spatial resolution of 0.05° (mean noise in the absence
of a monkey). The eye-position monitoring was calibrated at the
beginning of each recording session by having the monkeys look in
random order at tiny dots in the center of the monitor and at the
corners of a 3-10° square (depending on the eccentricity of the
cells recorded). The monkeys had to keep their gaze within the fixation
window for 2-4 s to receive a juice reward. During periods of stable
fixation, average residual eye movements were less than 0.25°. These
residual eye movements were compensated for using eye-position
correction (Conway 2001; Livingstone 1998; Livingstone and Tsao 1999; Livingstone et al. 1996). As shown by the maps here and
in our previous papers, this technique affords mapping of simple-cell
subunit organization in parafoveal V1 and gives reproducible
substructure in independent maps obtained from the same cell
(Conway 2001; Livingstone and Tsao 1999).
All procedures were approved by the Harvard Medical Area Standing
Committee on Animals.
Neuronal responses were recorded extracellularly using fine
electropolished tungsten electrodes coated with vinyl lacquer (Frederick Haer, Bowdoinham, ME) (Hubel 1957). Units
were isolated using a dual-window discriminator (BAK Electronics,
Germantown, MD) after they were amplified and band-pass filtered (1-10
kHz). Only well-isolated single units were analyzed. Spikes were
recorded at 1,000 Hz, eye position at 250 Hz.
Cortical cells were tested for orientation and direction selectivity
using fields of moving oriented bars of optimal length and width.
Twenty directions evenly distributed around 360° were presented in
pseudorandom order, with at least four presentations of each direction,
more if the tuning curve was unclear. A direction index (DI) was
calculated from responses to optimally oriented single bars moving back
and forth at the optimum speed and orientation.
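As a rough illustration of how a direction index can be computed from moving-bar responses, the sketch below uses DI = 1 − (null response / preferred response). This formula is an assumption (a common convention); the equation itself is not reproduced in the text above. The function name and the example firing rates are hypothetical.

```python
def direction_index(preferred_rate, null_rate):
    """Direction index, ASSUMED here to be 1 - null/preferred.

    Both rates are mean responses (spikes/s) to a bar moving in the
    preferred and in the opposite (null) direction.
    DI = 0 means equal responses; DI = 1 means no null-direction response.
    """
    if preferred_rate <= 0:
        raise ValueError("preferred_rate must be positive")
    return 1.0 - null_rate / preferred_rate


# Hypothetical example: 40 spikes/s preferred, 10 spikes/s null.
example_di = direction_index(preferred_rate=40.0, null_rate=10.0)
```

Under this assumed definition, a cell responding equally in both directions has a DI of 0, and a cell silent in the null direction has a DI of 1.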
Stimulus sequences from which we determined space-time maps and paired-stimulus interactions
While the monkey fixated, pairs of optimally oriented bars, one
black and one white, were flashed simultaneously on a gray background,
at random positions along a stimulus range that was perpendicular to
the bar orientation (Fig. 1C). This stimulus was designed to
allow us to obtain both the space-time maps (1st-order analyses) and
paired-bar interaction maps (2nd-order analyses) from a single stimulus
run. This is the sparsest stimulus that allows us to do this. The
monitor was 75 or 100 cm from the monkey, depending on the set up, and
had a 75-Hz refresh rate. Stimulus presentation rate was 75 Hz. Stimuli
were presented monocularly to the dominant eye [using colored stimuli
and filters (Livingstone and Tsao 1999)] or binocularly
(without colored filters) if there was no difference in the receptive
fields in the two eyes. White and black stimuli were
19 cd/m² above and below the mean background
luminance of 20 cd/m². The overlap of the
black-and-white stimuli appeared as the background gray. For each map,
between 5,000 and 50,000 spikes were collected over a 5- to 30-min
period. Using this stimulus configuration, we calculate, from a single
spike train, space-time maps to each contrast, sequential interaction
maps, and dX/dT maps (Fig. 1, D-F).
Space-time maps (Fig. 1D) were smoothed with a
two-dimensional Gaussian low-pass filter with a sigma of 3 ms in the
temporal dimension and 0.1° in the spatial dimension. Sequential
interaction maps (Fig. 1E; which are space-space maps) were
smoothed with a 0.1° wide Gaussian in both spatial dimensions.
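The smoothing step can be sketched as a separable Gaussian low-pass filter. This is a minimal illustration and not the authors' code: the function names, the (time, space) array layout, the 4-sigma kernel truncation, and the sampling steps `dt_ms` and `dx_deg` are assumptions.

```python
import numpy as np


def gaussian_kernel(sigma_samples):
    """Normalized 1-D Gaussian kernel, truncated at +/- 4 sigma (assumed)."""
    half = max(1, int(np.ceil(4 * sigma_samples)))
    x = np.arange(-half, half + 1)
    k = np.exp(-0.5 * (x / sigma_samples) ** 2)
    return k / k.sum()


def smooth_space_time(map_tx, dt_ms, dx_deg, sigma_t_ms=3.0, sigma_x_deg=0.1):
    """Smooth a (time, space) map with a separable 2-D Gaussian."""
    kt = gaussian_kernel(sigma_t_ms / dt_ms)
    kx = gaussian_kernel(sigma_x_deg / dx_deg)
    if kt.size > map_tx.shape[0] or kx.size > map_tx.shape[1]:
        raise ValueError("map is smaller than the smoothing kernel")
    # Filter along time (axis 0), then along space (axis 1).
    out = np.apply_along_axis(lambda v: np.convolve(v, kt, mode="same"), 0, map_tx)
    out = np.apply_along_axis(lambda v: np.convolve(v, kx, mode="same"), 1, out)
    return out


# Hypothetical check: a single impulse spreads out but keeps its total and peak.
raw = np.zeros((100, 81))
raw[50, 40] = 1.0
smoothed = smooth_space_time(raw, dt_ms=1.0, dx_deg=0.05)
```

A space-space interaction map could be smoothed the same way by applying the spatial kernel along both axes.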
Space-time maps
From a continuous record of stimulus position, eye position, and
neural activity, we calculated the average response to one contrast
bar, at every spatial position along the stimulus range, independent of
the location of the opposite-contrast bar. All the space-time maps are
oriented so that rightward on the space (horizontal) axis corresponds
to the preferred direction of stimulus motion (Fig. 1D).
Each stimulus position was corrected for the monkey's eye position at
stimulus onset (Livingstone 1998).
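The first-order mapping described above amounts to averaging the spike train at a range of delays after each bar onset, sorted by bar position. The sketch below illustrates this for one contrast. It is a simplified, hypothetical version: positions are assumed to be already corrected for eye position and binned to integer indices, frames are assumed to start every 13 ms, and all names are illustrative.

```python
import numpy as np


def space_time_map(positions, onsets_ms, spike_times_ms, n_positions, max_lag_ms=100):
    """Mean spikes per 1-ms bin as a function of delay after onset (rows)
    and bar position (columns), for one stimulus contrast."""
    positions = np.asarray(positions, dtype=int)
    onsets_ms = np.asarray(onsets_ms, dtype=int)
    spike_times_ms = np.asarray(spike_times_ms, dtype=int)
    n_bins = int(onsets_ms.max()) + max_lag_ms + 1
    spikes = np.bincount(spike_times_ms, minlength=n_bins)  # 1-ms spike counts
    total = np.zeros((max_lag_ms, n_positions))
    count = np.zeros(n_positions)
    for pos, t0 in zip(positions, onsets_ms):
        total[:, pos] += spikes[t0:t0 + max_lag_ms]
        count[pos] += 1
    return total / np.maximum(count, 1)


# Toy cell: fires one spike exactly 50 ms after every bar at position 3.
rng = np.random.default_rng(0)
demo_positions = rng.integers(0, 8, size=2000)
demo_onsets = np.arange(2000) * 13
demo_spikes = demo_onsets[demo_positions == 3] + 50
demo_map = space_time_map(demo_positions, demo_onsets, demo_spikes, n_positions=8)
```

In the toy map the peak sits at a 50-ms delay and position 3, and entries elsewhere show a lower, nonzero background, because earlier or later frames occasionally fell at position 3.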
There is often a dark band apparent to either side of the excitatory
response in the space-time maps; these dark bands do not necessarily
represent inhibitory side bands. The dark bands reflect the firing rate
at the peak latency produced by frames when the stimulus was outside
the activating region, whereas the higher firing at earlier and later
times corresponds to the firing rate averaged over the entire stimulus
range, including the activating region. In other words, in the
space-time maps, activity is mapped as a function of stimulus position
at time = 0, but there are many stimulus presentations before and
after that stimulus, and some of them land in the cell's receptive
field. So, if one calls the stimulus latency L and optimal
position P, because the maps are keyed to the stimulus
presented at time 0, the response at time L and
positions other than P will be baseline. But at all other
times and positions, there is a finite probability that the stimulus
occurring L ms previously was the optimal one, and so there
is a finite, noisy background at these times. Difference space-time
maps were generated by subtracting the dark-bar map from the light-bar
map (Movshon et al. 1978a).
To categorize cells as simple or complex, we quantified the space-time
receptive-field organization. We determined the temporal and spatial
receptive-field profiles for the responses to dark and light bars. The
temporal profile was averaged over 0.12° of visual angle; the spatial
profile was averaged over 10 ms, both centered on the peak of the
response (the peak of the dark or the light-bar response, whichever was
bigger). The spatial and temporal profiles for each contrast were fit
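to a Gabor function (a Gaussian times a cosine wave)

$$G(x) = A \exp\left[-\frac{(x - x_0)^2}{2\sigma^2}\right] \cos\left[2\pi f (x - x_0) + \phi\right] + B$$

where $A$ is the amplitude, $x_0$ is the center, $\sigma$ is the SD of the Gaussian, $f$ is the spatial frequency of the sinewave, $\phi$ is the phase of the cosine wave, and $B$ is the baseline offset.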
The fits were constrained so that the Gabors for both light and dark
maps had the same center and spatial frequency. The center and spatial
frequency actually used were the average of the optima for each profile
fit independently. (Unless, as in the 1-subunit simple cells, see
following text, a profile for one contrast was so flat that it was
essentially noise, in which case the stronger profile was used to
determine the center and spatial frequency.) The SD of the Gaussian
was at least one full cycle of the sinewave (so the width of the
Gaussian was at least 2 full cycles). Although the Gabor used was much
wider than the best-fitting Gabor would be, it allowed us to determine
the relative positions of the light and dark subregions with a single
parameter, the phase of the cosine. We fit the profiles to a wide Gabor
rather than to just a sinewave so that the fit would be driven
predominantly by the activating region rather than by the regions
outside the receptive field. The fits were optimized via a
least-squares criterion with the Levenberg-Marquardt algorithm in
Matlab (Mathworks, Natick, MA). The phases of the sinusoids of the
best-fitting Gabors were compared for the light and dark stimuli. Phase
is defined as the phase of a cosine wave having the same center as the
Gaussian. We recorded from five cells that responded to only one sign
of contrast; these cells were not included in the population because we
could not distinguish them as simple or complex.
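The phase comparison can be sketched as follows. Once the center, frequency, and Gaussian width are fixed, a Gabor with free amplitude, phase, and offset is linear in three coefficients, so this sketch uses ordinary linear least squares in place of the Levenberg-Marquardt fit named above. That substitution, the function names, and the example parameters are assumptions made for illustration.

```python
import numpy as np


def fit_gabor_phase(x, profile, center, freq, sigma):
    """Fit amplitude, phase (deg), and offset of a Gabor whose center,
    frequency, and SD are held fixed.

    Uses A*cos(theta + phi) = (A cos phi) cos(theta) - (A sin phi) sin(theta).
    """
    theta = 2 * np.pi * freq * (x - center)
    env = np.exp(-0.5 * ((x - center) / sigma) ** 2)
    design = np.column_stack([env * np.cos(theta), env * np.sin(theta), np.ones_like(x)])
    (a, b, offset), *_ = np.linalg.lstsq(design, profile, rcond=None)
    amplitude = np.hypot(a, b)
    phase_deg = np.degrees(np.arctan2(-b, a))
    return amplitude, phase_deg, offset


def phase_difference_deg(p1, p2):
    """Absolute phase difference folded into 0..180 degrees."""
    d = abs(p1 - p2) % 360
    return min(d, 360 - d)


# Hypothetical light and dark spatial profiles that are exactly complementary.
x = np.linspace(-1.0, 1.0, 81)              # position, deg
center, freq, sigma = 0.0, 1.0, 1.0         # SD equals one full cycle


def _gabor(phase_deg, amp=10.0, offset=5.0):
    env = np.exp(-0.5 * ((x - center) / sigma) ** 2)
    return amp * env * np.cos(2 * np.pi * freq * (x - center) + np.radians(phase_deg)) + offset


_, light_phase, _ = fit_gabor_phase(x, _gabor(20.0), center, freq, sigma)
_, dark_phase, _ = fit_gabor_phase(x, _gabor(-160.0), center, freq, sigma)
demo_phase_diff = phase_difference_deg(light_phase, dark_phase)
```

A phase difference near 180° corresponds to complementary light and dark profiles (simple-like), and a difference near 0° to overlapping profiles (complex-like).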
Some cells had space-time maps that looked like one subregion of a
multiple-subunit simple cell. We treat these cells separately in our
analyses, even though we observed a range of subunit number, because
they do not fit the original definition of simple cells (Hubel and Wiesel 1962). Cells were categorized as one-subunit simple
cells if they showed complementarity between light and dark-bar
responses but throughout the response had only one spatial subregion.
Whether a cell had more than one subregion was determined by comparing
the magnitudes (positive or negative) of the peaks in the spatial
profile; a peak counted as another subunit if it was at least 10% of the
magnitude of the major peak. These one-subunit simple cells correspond
to the S1 cells described by Schiller et al. (1976), who
categorized cells using moving light and dark edges.
Principal components analysis was done on the space-time
maps of simple cells using the Matlab code for singular value
decomposition. For the first two principal components of each simple
cell, the magnitude of the largest peak (positive or negative) and the
magnitude of the following peak were measured. The timing of the major
peaks was compared for the first two principal components, and the
component whose major peak reached maximum first was considered the
"fast" component, and the other was the "slow" component. The
biphasic index for each principal component is the ratio of the second temporal peak to the first (De Valois et al. 2000). A biphasic index near 1 indicates a biphasic
temporal profile and indices near 0 correspond to monophasic profiles.
To measure the spatial organization of the first two components for
each cell, the spatial profile through the largest peak was fit with a
Gabor. The phase of the cosine wave was taken as the spatial phase;
even symmetric profiles had phases near 0° and odd symmetric profiles had phases near 90°.
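A minimal sketch of the decomposition and the biphasic index, using NumPy's singular value decomposition rather than Matlab. The function names, the (time, space) layout, and the toy map are assumptions. The "following peak" is interpreted here as the largest opposite-signed deflection after the major peak.

```python
import numpy as np


def principal_components(map_tx, n=2):
    """First n components of a (time, space) map as
    (singular value, temporal profile, spatial profile) tuples."""
    u, s, vt = np.linalg.svd(map_tx, full_matrices=False)
    return [(s[i], u[:, i], vt[i]) for i in range(n)]


def biphasic_index(temporal):
    """Magnitude of the opposite-signed peak that follows the largest peak,
    divided by the magnitude of the largest peak (sign-invariant)."""
    temporal = np.asarray(temporal, dtype=float)
    i = int(np.argmax(np.abs(temporal)))
    first = temporal[i]
    if first == 0:
        return 0.0
    following = -np.sign(first) * temporal[i + 1:]
    second = following.max() if following.size and following.max() > 0 else 0.0
    return second / abs(first)


def order_fast_slow(comp_a, comp_b):
    """Return (fast, slow): the component whose major temporal peak comes first is 'fast'."""
    ta = int(np.argmax(np.abs(comp_a[1])))
    tb = int(np.argmax(np.abs(comp_b[1])))
    return (comp_a, comp_b) if ta <= tb else (comp_b, comp_a)


# Toy rank-2 map built from two hypothetical space-time separable parts.
t = np.arange(100.0)
x = np.linspace(-1.0, 1.0, 41)
part1 = np.outer(np.exp(-0.5 * ((t - 40) / 6) ** 2) - 0.6 * np.exp(-0.5 * ((t - 60) / 8) ** 2),
                 np.exp(-0.5 * (x / 0.3) ** 2))
part2 = np.outer(np.exp(-0.5 * ((t - 55) / 10) ** 2),
                 x * np.exp(-0.5 * (x / 0.3) ** 2))
demo_components = principal_components(part1 + part2)
demo_fast, demo_slow = order_fast_slow(*demo_components)
```

Because the sign of each singular vector is arbitrary, the index is computed relative to the largest peak of whichever sign.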
Two-bar interaction maps (Wiener-like kernels)
From the same spike train used to generate the first-order
space-time maps, we could also map activity as a function of
paired-stimulus sequences (Fig. 1E). This is a novel way of
looking at directional interactions at a single interstimulus interval.
In describing the sequences, the second stimulus of each pair is called
the reference stimulus. Each stimulus position was corrected for the monkey's eye position at stimulus onset (Livingstone et al. 1996). We plotted interactions as a function of reference
stimulus position (horizontal axis) and preceding stimulus position
(vertical axis). Thus nonlinear interactions are mapped in space/space
coordinates, but the coordinates are not two-dimensional (2-D) visual
space but rather paired-bar position coordinates.
Interactions were plotted at a temporal delay corresponding to the time
to peak response for the reference stimulus (between 45 and 60 ms). In the sequential interaction maps, the +45° diagonal indicates
occasions when the two sequential stimuli appeared at the same location (no motion). The maps are oriented so that rightward on the horizontal axis and upward on the vertical axis correspond to the preferred direction of stimulus motion. Therefore regions below/right of the
green diagonal represent two-bar sequences in which the stimuli were
presented in the preferred direction, and regions above/left of the
diagonal represent null-direction sequences. Increasing distance from
the +45° diagonal corresponds to increasing interstimulus distance.
From one spike train, we mapped responses to all four combinations of
two-bar sequences: white-to-white, black-to-black, white-to-black, and
black-to-white. To look at interactions between pairs of
stimuli, we want to know what aspects of a paired-bar response depend
on the pairing of the stimuli and which depend on the responses to the
two stimuli presented independently. Interactions are those parts of
the paired-bar response that depend on pairing, so they are obtained by subtracting the linear
(independent) responses to the two stimuli from the
paired-stimulus response (Fig. 1B). One way to do this is to
subtract the inverting-sequence maps (white-to-black and black-to-white) from the summed same-contrast sequences (white-to-white and black-to-black) (Emerson et al. 1987); these
difference maps show only those aspects of the sequential response that
depend on stimulus order or position. Subtracting the
inverting-contrast responses from the same-contrast responses
eliminates the first-order (linear) responses, leaving only the
nonlinear interactions. This is equivalent to calculating a
second-order Wiener-like kernel (Emerson et al. 1987) in
two-bar position coordinates. Any deviation from 0 in the interaction
maps must be attributable to the fact that the stimuli were presented
sequentially and not independently; these maps are equivalent to the
difference between the observed response to two-bar apparent motion and
the linear sum of the responses to the two stimuli presented
independently. The mapping technique is a generalization of the two-bar
interaction maps of Movshon et al. (1978b), and the maps
are analogous to the binocular interaction maps of Ohzawa et al. (1997), except that interactions are mapped as a function of the
position of two sequentially presented bars rather than as a function
of stimulus position in each eye. In our maps, positive interactions
(same-contrast facilitation and inverting-contrast suppression) are
indicated in red and negative interactions (same-contrast suppression
and inverting-contrast facilitation) in blue.
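The subtraction described above can be illustrated with a toy example in which each paired-bar map is a baseline plus first-order terms plus an interaction term. The toy assumes that the interaction term reverses sign when the contrast of one bar is inverted (as it would for a squaring nonlinearity). That assumption, and all names and values below, are hypothetical.

```python
import numpy as np


def interaction_map(ww, bb, wb, bw):
    """Same-contrast maps minus inverting-contrast maps.

    Each input is indexed [preceding position, reference position].
    """
    return (ww + bb) - (wb + bw)


rng = np.random.default_rng(1)
n = 6
baseline = 12.0
# First-order responses to each contrast, as preceding and as reference bar.
w_prev, w_ref = rng.normal(size=n), rng.normal(size=n)
b_prev, b_ref = rng.normal(size=n), rng.normal(size=n)
interaction = rng.normal(size=(n, n))       # toy nonlinear term

ww = baseline + w_prev[:, None] + w_ref[None, :] + interaction
bb = baseline + b_prev[:, None] + b_ref[None, :] + interaction
wb = baseline + w_prev[:, None] + b_ref[None, :] - interaction
bw = baseline + b_prev[:, None] + w_ref[None, :] - interaction

recovered = interaction_map(ww, bb, wb, bw)
```

In this toy the baseline and every first-order term cancel, leaving four times the interaction term.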
dX/dT maps
To show the evolution of the nonlinear directional interactions at
a series of interstimulus intervals in a single
dX/dT map (Fig. 1F), we first
calculated (for each interstimulus interval) a two-bar interaction
profile from two-bar interaction maps just as described in the
preceding text. For simultaneously-presented stimuli, we have only
inverting-contrast sequences and no same-contrast sequences, so the
nonlinear simultaneous interactions were calculated by subtracting maps
generated using long (250 ms) interstimulus intervals from the 0-ms
maps (Livingstone et al. 2001). For each interstimulus
interval, we generated one-dimensional profiles of interaction strength
as a function of interstimulus distance by averaging the interactions
for every interstimulus distance (Livingstone et al. 2001) across a slice running parallel to the
45° diagonal,
encompassing the interaction region. This gives the average
paired-stimulus interactions as a function of interstimulus distance,
and the slices are then stacked to give interactions as a function of
interstimulus distance and interstimulus interval. This analysis is
similar to that used by Emerson et al. (1987); they
refer to their dX/dT maps as "motion kernels."
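Collapsing each two-bar interaction map into a profile over interstimulus distance can be sketched as averaging along lines of constant (reference − preceding) position. As a simplification, this sketch averages along the full length of each diagonal rather than over a slice restricted to the interaction region. Names and the toy map are hypothetical.

```python
import numpy as np


def distance_profile(interaction, max_shift):
    """Mean interaction at each shift d = reference index - preceding index.

    interaction is indexed [preceding position, reference position];
    np.diagonal(..., offset=d) returns the entries interaction[i, i + d].
    """
    shifts = np.arange(-max_shift, max_shift + 1)
    profile = np.array([np.diagonal(interaction, offset=int(d)).mean() for d in shifts])
    return shifts, profile


def dxdt_map(interaction_maps_by_isi, max_shift):
    """Stack one distance profile per interstimulus interval (rows = intervals)."""
    return np.vstack([distance_profile(m, max_shift)[1] for m in interaction_maps_by_isi])


# Toy map whose value equals the shift itself, so the profile is easy to check.
idx = np.arange(9)
toy_interaction = (idx[None, :] - idx[:, None]).astype(float)
toy_shifts, toy_profile = distance_profile(toy_interaction, max_shift=3)
toy_dxdt = dxdt_map([toy_interaction, 2 * toy_interaction], max_shift=3)
```

With the map orientation described above, positive shifts correspond to preferred-direction sequences.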
Measuring the slant of space-time maps
We compared three ways of calculating the space-time slope and
settled on using a weighted centroid calculation, similar to the
calculation used by McLean and Palmer (1989), but with
each point weighted by the response magnitude. In RESULTS,
we discuss why this is an appropriate calculation for the paired-bar
stimulus we used. We calculated the weighted mean position (centroid)
in space for each millisecond within ±15 ms of the peak response using the following equation (McLean and Palmer 1989)

$$\bar{x}(t) = \frac{\sum_i x_i \, R(x_i, t)}{\sum_i R(x_i, t)}$$

where $x_i$ is the $i$th spatial position and $R(x_i, t)$ is the response magnitude at that position and time $t$.
We calculated the best fitting line for the centroids in a different
way, however, from McLean and Palmer (1989)
in that we used a maximum likelihood estimation to fit a line, and used the width
(sigma) of the best-fitting Gaussian for each millisecond-long time
slice ±15 ms from the peak as an estimate of the SD of the peak value.
We wanted to weight each centroid by the magnitude of its response.
Maximum likelihood estimation fits a line weighting each data point by
its SD, with points with small SDs being weighted more. For our time
slices, the width of a best-fitting Gaussian will be inversely related
to the peak height.
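A sketch of the centroid-based slant estimate. It assumes Gaussian errors, in which case a maximum likelihood line fit reduces to weighted least squares with weights 1/SD; `np.polyfit` takes such weights through its `w` argument. The function names, the toy map, and the per-slice SD values are assumptions and not the authors' implementation.

```python
import numpy as np


def weighted_centroids(map_tx, x_deg):
    """Response-weighted mean position for each time slice of a
    baseline-subtracted (time, space) map; negative values are ignored."""
    w = np.clip(map_tx, 0.0, None)
    return (w * x_deg[None, :]).sum(axis=1) / w.sum(axis=1)


def slant_deg_per_ms(t_ms, centroids, sd):
    """Slope of the weighted least-squares line through the centroids.

    Slices with a small SD are weighted more (weights = 1/SD).
    """
    slope, _intercept = np.polyfit(t_ms, centroids, 1, w=1.0 / np.asarray(sd))
    return slope


# Toy map: a Gaussian blob drifting at 0.02 deg/ms, +/-15 ms around the peak.
t = np.arange(-15, 16, dtype=float)                  # ms relative to peak
x = np.linspace(-2.0, 2.0, 401)                      # deg
height = np.exp(-0.5 * (t / 8.0) ** 2)               # response peak height per slice
toy_map = height[:, None] * np.exp(-0.5 * ((x[None, :] - 0.02 * t[:, None]) / 0.2) ** 2)
toy_centroids = weighted_centroids(toy_map, x)
toy_sd = 0.2 / height                                # hypothetical: SD grows as height falls
toy_slope = slant_deg_per_ms(t, toy_centroids, toy_sd)
```

For the toy map the recovered slope matches the 0.02 deg/ms drift that was built in.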
The second way we calculated the slant of the space-time maps was to
find the angle giving the maximum peak in a Radon transform (Deans 1983) of the white or dark bar response map,
whichever was larger. To do this, we first subtracted the baseline
activity and set to zero any below-baseline regions. The Radon
transform gives the image intensity summed along image slices taken at
different angles. Thus the peak of the Radon transform gives the angle
of a line that maximizes image intensity. The third way we calculated the slant of space-time maps was from the peak of the Fourier transform
of the space-time map.
Modeling
To generate the models in Figs. 8, 10, and 11, two
nondirectional cells (a fast one and a slow one) were calculated by
multiplying a spatial Gabor function and a temporal function. The
spatial Gabor for the slow nondirectional cell was 90° phase-shifted
from the fast component in Figs. 8 and 11, and in Fig. 10, the center position was shifted, but not the phase. We used temporal functions similar to those used by Adelson and Bergen (1985).
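The construction described above can be sketched as two space-time separable filters that are summed to give a space-time-oriented filter. The temporal function below is the form given by Adelson and Bergen (1985), $(kt)^n e^{-kt}[1/n! - (kt)^2/(n+2)!]$. The parameter values, the function names, and the Fourier-quadrant asymmetry measure are assumptions made for illustration and are not the parameters used for the figures.

```python
import numpy as np
from math import factorial


def temporal_ab(t, k, n):
    """Adelson-Bergen style temporal function; larger n gives a slower response."""
    kt = k * t
    return kt ** n * np.exp(-kt) * (1.0 / factorial(n) - kt ** 2 / factorial(n + 2))


def spatial_gabor(x, freq, sigma, phase_deg):
    return np.exp(-0.5 * (x / sigma) ** 2) * np.cos(2 * np.pi * freq * x + np.radians(phase_deg))


def quadrant_asymmetry(rf_tx):
    """Normalized difference in Fourier power between the two pairs of
    space-time frequency quadrants; 0 for a nondirectional filter."""
    power = np.abs(np.fft.fft2(rf_tx)) ** 2
    ft = np.fft.fftfreq(rf_tx.shape[0])[:, None]
    fx = np.fft.fftfreq(rf_tx.shape[1])[None, :]
    same = power[(ft * fx) > 0].sum()
    opposite = power[(ft * fx) < 0].sum()
    return (same - opposite) / (same + opposite)


t = np.arange(200.0)                      # ms (assumed)
x = np.linspace(-2.0, 2.0, 161)           # deg (assumed)
# Fast and slow nondirectional cells; the slow one is 90 deg phase-shifted in space.
fast = np.outer(temporal_ab(t, k=0.1, n=3), spatial_gabor(x, 1.0, 0.5, 0.0))
slow = np.outer(temporal_ab(t, k=0.1, n=5), spatial_gabor(x, 1.0, 0.5, 90.0))

asym_fast = quadrant_asymmetry(fast)           # separable: no directional bias
asym_sum = quadrant_asymmetry(fast + slow)     # slanted one way
asym_diff = quadrant_asymmetry(fast - slow)    # slanted the other way
```

The sign of the asymmetry indicates the direction of the space-time slant; reversing the sign of the slow component reverses it.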
RESULTS
We recorded the activity of single units in primary visual cortex
(V1) of three macaque monkeys. For each cell, we first determined the
preferred stimulus orientation and direction using moving bars. A
direction index, DI, was calculated for both light and dark bars. Cells
were considered directional if they had a DI > 0.2 for one
stimulus contrast and did not show the opposite direction preference
for the opposite stimulus contrast. In an earlier study from this
laboratory (Livingstone 1998), a more stringent
criterion was used to accept cells as directional (DI > 0.5 for
both stimulus contrasts), and almost no simple cells meeting
that criterion were found. As discussed in the following text, this
difference in directionality for black-and-white bar stimuli is to be
expected from the energy model. By relaxing our definition of direction
selectivity, we were able to identify directional simple cells with
slanted space-time maps similar to those previously described in the
cat (DeAngelis et al. 1993; McLean and Palmer 1989; Murthy and Humphrey 1999) and monkey
(De Valois and Cottaris 1998; De Valois et al. 2000). After identifying a unit as direction selective, we then
mapped the receptive field using the sparse noise stimulus illustrated
in Fig. 1C.
Space-time maps; simple versus complex cells
From a single spike train recorded during the presentation of
20,000-130,000 stimulus frames (5-30 min), we averaged the response at each retinal position to white bars, disregarding the black bars, or
to black bars, disregarding the white bars. Figure
2 shows responses to moving bars and
space-time maps for a typical directional complex cell and a typical
directional simple cell. The maps for these two cells show several
features that seemed to consistently distinguish simple and complex
directional cells in our population. We will first describe these
differences for this pair of cells, then quantify the differences for
the population. First, the complex cell showed strong direction
selectivity to both light and dark moving bars (Fig. 2, A
and B). The simple cell, in contrast, was directional to
light bars (G) but not to dark bars (H). Second,
the space-time maps of the simple cell showed rough complementarity of
light and dark responsive regions (I and J), but
the complex cell did not (C and D) (Hubel and Wiesel 1962). Third, the simple-cell space-time maps showed
an overall slant, also as previously described (DeAngelis et al. 1993; De Valois and Cottaris 1998; McLean and Palmer 1989; Murthy and Humphrey 1999). For
the complex cell, the space-time maps were not clearly slanted,
although there was an asymmetry along the spatial axis with the
response being more transient on the right (null) side of the receptive
field, with a delayed suppressive response (the darker blue blobs below
and slightly to the right of the red blobs) to both light and dark
stimuli that was slightly spatially offset, toward the null side of the
receptive field, as previously described (Livingstone 1998). Thus in complex directional cells, ON and OFF regions were co-extensive, but excitatory and
suppressive regions were not necessarily. Fourth, the firing rate was
higher for the complex cell than for the simple cell. Last, the complex cell's response was more transient than the simple cell's response.
Simple and complex cells were originally distinguished by Hubel and
Wiesel only in the spatial domain (Hubel and Wiesel 1962): complex cells have coextensive light and dark excitatory
regions; simple cells have complementary light- and dark-response
organization. But simple cells can also be described in the
spatiotemporal domain, as having complementary light and dark
organization at any point in time (Adelson and Bergen 1985; McLean and Palmer 1989; Movshon et al. 1978a; Watson and Ahumada 1985). To quantify
this, we generated spatial and temporal receptive-field profiles
(slices in time or space) by determining the activity along the spatial
and temporal dimension at the peak response (see METHODS).
As shown in Fig. 2, E and F, for the complex
cell, the spatial and temporal profiles for white and black stimuli
were similar, but, for the simple cell, the light and dark profiles
were complementary (K and L). For each cell, we
fit the light and dark spatial and temporal profiles with a
wide Gabor function (Fig. 2, E, F, K, and
L, · · · see METHODS) and
measured the phase difference between the sinusoidal components for the
light and dark profiles. For the two cells in Fig. 2, the phase
difference between the white-bar and black-bar spatial profiles was
11° for the complex cell and 178° for the simple cell; the phase
difference between the white-bar and black-bar temporal profiles was
8° for the complex cell and 146° for the simple cell.
We fit Gabor functions to the light and dark temporal and spatial profiles for the entire population of 189 cells. We had qualitatively categorized 124 of these cells as complex and 65 as simple based on the spatial complementarity of ON and OFF responses. The cells that we qualitatively classified as simple had large spatial and temporal phase differences, whereas the cells we categorized as complex had small phase differences (Fig. 3A). Some of the cells categorized as simple did not show clear spatial complementarity between light and dark responses (Fig. 3B, middle) because the response at any point in time was predominantly to only one stimulus contrast and therefore had a well-modulated spatial profile to only one stimulus contrast (1-subunit simple cells, or S1 cells, discussed in the following text). Nevertheless, the temporal phase differences in these one-subunit cells clearly distinguish them as simple, according to the criterion of complementarity between light and dark response regions (Fig. 3B).
The complex cell in Fig. 2 was directional to both light and dark
moving bars, but the simple cell was not. This was generally true in
our population (Fig. 3, C and D). Figure
3C is a scatter plot of the DIs for each cell for black and
white moving bars. The fact that the simple-cell data points tend to
cluster along the axes (where 1 or the other DI is 0), indicates that
many of the simple cells were directional to only one stimulus
contrast, while most of the complex cells were directional to both
contrasts. Figure 3D shows histograms of the ratio of the DI
for the poorer contrast to the DI for the better contrast. Relatively
more of the simple cells had a DI for one contrast that was much lower than the DI for the other contrast, and relatively more complex cells
had balanced DIs for the two contrasts. A similar distinction has been
reported in the cat (Goodwin and Henry 1975; Henry 1977). The fact that many directional simple cells
in primate V1 are strongly directional to only one stimulus contrast
explains why we (Livingstone 1998), and perhaps others
(Hubel and Wiesel 1968), missed them in previous
studies. Later, we show that a simple model predicts this result.
The simple cells also had, on average, lower baseline firing rates than
the complex cells [11 ± 7 (SD) spikes/s vs. 40 ± 56.0 spikes/s]. This rate is not the completely unstimulated firing rate
but reflects the average response to the stimuli over the entire
stimulus range, most of which was outside the cells' activating region. This confirms an observation of Schiller et al. (1976). Because our study was done in alert animals, we can
conclude that this difference in baseline firing cannot be attributed
to differential sensitivities of simple and complex cells to anesthetic.
Direction-selective simple cells
We observed a range of simple and complex cells that correspond
with cells previously described in cat and monkey. Because it has been
suggested that complex cells are generated by combining inputs from
several simple cells (Hubel and Wiesel 1962) and because receptive-field size varies with eccentricity, we thought it might be
informative to compare simple and complex cell receptive fields from a
small range of eccentricities. The following figures show examples of
each of these kinds of cells, all from the same eccentricity. Of the
total population, 40 simple cells and 59 complex cells having
eccentricities between 1.5 and 3° were recorded from one 2-mm
craniotomy in one monkey. The cells in Figs. 4-7 were from this subset
of the entire population.
Figure 4 shows five "conventional"
directional simple cells with multiple, spatiotemporally slanted
subregions. Forty of the 65 simple cells were like this, showing two or
more adjacent ON and OFF subregions over most
of the response duration. Light-bar excitation and dark-bar suppression
occurred in one set of subregions with dark-bar excitation and
light-bar suppression in the other, spatially and temporally
complementary, subregions (Ferster 1988; Hubel and Wiesel 1962). The light-minus-dark maps (3rd
column) make clear the overall slant in the space-time maps as
previously described in cat and monkey (DeAngelis et al. 1993; De Valois and Cottaris 1998; De Valois et al. 2000; McLean and Palmer 1989; Murthy and Humphrey 1999).
The first three columns in Fig. 4 show first-order
space-time maps; the fourth and fifth columns
show second-order interaction maps, maps that show the nonlinear
interactions between pairs of stimuli. The fourth column in
Fig. 4 shows the directional interaction maps at 13-ms interstimulus
intervals for each cell. As discussed in the preceding text,
directionality to two-bar sequences must be entirely nonlinear even if
it is based on nonlinear amplification of underlying linear processes.
Therefore the directionality must be reflected in the nonlinear
interactions between pairs of stimuli. As described in
METHODS, in the interaction maps, activity is reverse
correlated with the position of each reference stimulus (mapped along
the horizontal axis) and with the position of an immediately preceding
(13 ms) bar position (mapped along the vertical axis). Positions in
this two-bar space along the +45° diagonal represent occasions when
the two sequentially presented bars fell on the same retinal location
(no motion); positions to the right of the diagonal represent
preferred-direction sequences, and positions to the left represent
null-direction sequences. All of the cells in Fig. 4 showed a
preponderance of facilitatory interactions in the preferred-direction
region of the map and suppressive interactions in the null-direction
region. Many of the maps show a checkerboard arrangement of
facilitatory and suppressive interactions, which, as discussed in the
following text, is predicted by several models for direction
selectivity. In the binocular interaction maps of Ohzawa et al. (1997),
which are similar in principle to our sequential
interaction maps, simple cells similarly show checkerboard interaction
patterns. The checks represent interactions between and within
individual subunits of the simple cells.
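To make the reverse-correlation step concrete, here is a minimal Python sketch that accumulates the mean response for every (preceding position, reference position) pair in a sparse-noise sequence. It is not the analysis code used in this study: the function name `interaction_map`, the toy cell, and the baseline (the grand mean, rather than the contrast-based subtraction described in METHODS) are assumptions made only for illustration.

```python
import random

def interaction_map(positions, spikes, n_pos):
    """Mean response for each (preceding, reference) bar-position pair.

    positions[t] is the bar position on frame t; spikes[t] is the spike
    count attributed to frame t (assumed already shifted by the response
    latency). Rows index the preceding bar (vertical axis), columns the
    reference bar (horizontal axis). The grand mean is subtracted as a
    crude baseline (an assumption, not the subtraction used in METHODS).
    """
    total = [[0.0] * n_pos for _ in range(n_pos)]
    count = [[0] * n_pos for _ in range(n_pos)]
    for t in range(1, len(positions)):
        prev, ref = positions[t - 1], positions[t]
        total[prev][ref] += spikes[t]
        count[prev][ref] += 1
    grand_mean = sum(spikes) / len(spikes)
    return [[total[p][r] / count[p][r] - grand_mean if count[p][r] else 0.0
             for r in range(n_pos)] for p in range(n_pos)]

# Toy directional cell (hypothetical): fires only when the bar steps one
# position in the preferred direction between consecutive frames.
random.seed(0)
positions = [random.randrange(8) for _ in range(20000)]
spikes = [0] + [1 if positions[t] == positions[t - 1] + 1 else 0
                for t in range(1, len(positions))]
m = interaction_map(positions, spikes, 8)
```

In this toy map, entries one step to one side of the main diagonal (`m[p][p + 1]`) are positive and the mirror-image entries (`m[p + 1][p]`) are negative, which is the signature of a directional two-bar interaction.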
The sequential interaction maps in the fourth column of Fig. 4 show
interactions at one interstimulus interval, 13 ms (at every possible
position of each stimulus), because they are generated by
correlating activity with pairs of bars in sequential frames. But we
can also generate similar maps correlating activity with pairs of bars
separated by two, three, or more frames. To look at interaction
strength as a function of both interstimulus interval and interstimulus
distance, we averaged a slice of each interaction map running parallel
to the
45° diagonal for a series of interstimulus intervals
(5th column, Fig. 4). In these graphs, negative
interstimulus distances correspond to the preferred direction. At 0 ms
(simultaneously presented stimuli), there is, by definition, no
directionality, and the interactions are symmetrical about 0. The
interaction pattern shifts abruptly from symmetry (at 0 ms) to showing
facilitation for preferred-direction sequences and suppression for
null-direction sequences in the 13-ms interstimulus interval slice. The
dX/dT maps show an overall slant, although it is
not necessarily a straight line: in the slice corresponding to an
interstimulus interval of two frames (27 ms), the facilitatory peak is
at a larger interstimulus distance than it is at 13 ms, although for
most of the cells this distance is less than twice the optimum
interstimulus distance for 13-ms interstimulus intervals. This slant
should correspond to the cells' velocity preferences, but we did not
actually test this using moving bars. If direction selectivity (a
2nd-order property) derives from the slant of simple cell
spatiotemporal response functions (a 1st-order property), the two
should be correlated (Adelson and Bergen 1985). Below we
compare the slant of the (1st-order) space-time maps with the initial
slant of the (2nd-order) dX/dT maps.
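The collapsing step that turns a two-bar interaction map into one row of a dX/dT plot can be sketched as follows. This is an illustrative sketch only: the function name `dxdt_profile`, the toy map, and the sign convention (offset = reference column minus preceding row) are assumptions and need not match the convention used in Fig. 4.

```python
import numpy as np

def dxdt_profile(imap):
    """Average an interaction map along each diagonal.

    imap[prev, ref] holds the interaction for one interstimulus interval.
    Every diagonal has a fixed interstimulus distance (ref - prev), so the
    diagonal means give interaction strength as a function of distance.
    """
    n = imap.shape[0]
    offsets = np.arange(-(n - 1), n)
    profile = np.array([np.diagonal(imap, offset=int(d)).mean()
                        for d in offsets])
    return offsets, profile

# Toy map (hypothetical): facilitation one step to one side of the
# diagonal, suppression one step to the other side.
toy = np.diag(np.ones(6), k=1) - np.diag(np.ones(6), k=-1)
offsets, profile = dxdt_profile(toy)
```

Stacking one such profile per interstimulus interval (one frame, two frames, and so on) gives a two-dimensional map of interaction strength against interstimulus distance and interval.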
Figure 5 shows cells that had spatially
antagonistic light and dark responses yet would not fit the original
definition of simple cells of having separate ON and
OFF subdivisions (Hubel and Wiesel 1962). However, Schiller et al. (1976)
extended the definition of simple cells to include cells with only one
contrast-opponent region, which they called S1 cells (Henry 1977; Schiller
et al. 1976). Twenty-five
directional cells that fell into the simple-cell category based on
spatial and temporal phase differences (Fig. 3A) did not
have multiple subregions. These cells showed complementarity between
light and dark responses but had only one spatial region at most
latencies. This single region is organized in a push/pull fashion and
looks like a single subregion of a conventional simple cell. We, and
others (Schiller et al. 1976), regard them as the simplest form of directional simple cell because it is easy to imagine
constructing a conventional, multisubunit, simple cell from several S1
cells.
Like the other simple cells in this study, the single-subunit simple
cells were direction selective, but, like the S1 cells described
previously (Schiller et al. 1976), they were usually responsive only to one stimulus contrast. The cells in Fig. 5 were
directional to moving white bars but unresponsive or nondirectional to
moving black bars. (To flashed black bars, these cells showed suppression followed by an OFF discharge; thus they were
less responsive to moving dark bars than to flashed dark bars.) The contrast to which a given S1 simple cell was most directional was
always the same as the contrast that gave the early excitatory response.
The space-time maps of the S1 simple cells were usually not as clearly
slanted as the simple cells with more subunits, yet these cells were
direction selective. The nonlinear directional interactions of the S1
simple cells were dominated by null-direction suppressive interactions
unlike the conventional simple cells, which generally showed both
preferred-direction facilitation and null-direction suppression. The
second-order dX/dT maps of the S1 cells were
slanted, even though the space-time maps were not clearly slanted. Some
directional cells in the cat also have been reported to show slanted
interaction maps without having slanted space-time receptive fields
(Baker 2001).
We have also mapped directional cells that were intermediate between one-subunit simple (S1) cells and conventional simple cells in that they had one dominant spatial subunit and a second, much weaker subunit. The idea that simple cells form a continuum in subunit number, rather than falling into discrete classes, is supported by the unimodal distribution of subunit number (data not shown).
We saw many simple cells that showed opposite direction preferences for
light and dark bars. Their space-time maps consisted of a temporally
biphasic ON region adjacent to a temporally biphasic OFF region. These simple cells, similar to those originally
described by Hubel and Wiesel (1962), preferred movement
of a white bar from the OFF region into the ON
and movement of a black bar from the ON region into the
OFF region. We did not include these cells in our
population of directional cells because they showed opposite directionality for light and dark stimuli. Nevertheless it is worth
pointing out that both the light-stimulus map and the dark-stimulus map, considered individually, showed a space-time slant appropriate to
the direction preference to that stimulus contrast.
Complex direction-selective cells
We recorded from 124 complex direction-selective cells. Some of
the complex cells had narrow receptive fields and a response duration
around 25 ms or longer (Fig. 6), whereas
other complex cells had wider receptive fields and more transient
responses (Fig. 7). These probably do not
represent two distinct categories, as there was not a bimodal
distribution of receptive-field widths (data not shown). The cell in
Fig. 2, C and D, and the five cells in
Fig. 6 had receptive-field widths about the same size as the width of a
single subunit of a conventional simple cell in the population of cells
recorded at the same receptive-field eccentricity, 1.5-3°. This
group of complex cells had overlapping light and dark excitatory
responses, although the light and dark responses were not always of the
same magnitude. These cells often showed a region of suppression
slightly offset toward the null side of the receptive field
(Livingstone 1998). The two-bar interactions showed
direction selectivity, which consisted of preferred-direction facilitatory interactions and null-direction suppressive interactions (Figs. 6 and 7, 4th and 5th columns).
Some of the complex cells mapped at this eccentricity had relatively
wide receptive fields and very transient responses (Fig. 7). Some of
these wide complex cells also showed a very shallow slant that was in
the same direction as their direction preference, but the slant
corresponded to velocities faster than the range to which V1 cells are
responsive (Livingstone 1998). The two-bar interactions
showed optimum interactions at about the same interstimulus distance as
those of the simple cells (Figs. 4 and 5) and the complex cells in Fig.
6, but they were even more elongated (along the +45° diagonal),
consistent with their wider receptive fields. Because the range of
activation along the vertical or horizontal axes corresponds to the
receptive-field width, and the distance from the +45° diagonal
corresponds to interstimulus distance, the fact that the directional
interactions are narrower perpendicular to the
45° than along the
horizontal or vertical axes indicates that the directional interactions
must take place in subunits that are narrower than the width of the
entire receptive field.
How do our maps fit with various models for direction selectivity?
The idea that the first stage of motion perception is based on
space-time slanted filters, such as the simple cells in Fig. 4, was
first proposed in two theoretical papers (Adelson and Bergen 1985;
Watson and Ahumada 1985). These papers
also proposed that such filters could be generated in a linear fashion
by summing the responses of nondirectional simple cells (or filters)
having spatially and temporally offset receptive fields (Fig.
8, top). Although
physiological studies have described space-time slanted simple cells,
it has been difficult to determine which cell types might represent the
nondirectional input cells with different time courses required by the
model. The original theoretical papers supposed that the slow component
would have the same biphasic time course as the fast component, just
delayed by a quarter cycle, as in Fig. 8, top.
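The linear construction described above can be illustrated numerically. The sketch below sums two space-time separable filters whose spatial profiles and temporal profiles are each a quarter cycle apart, and checks that the sum is directional while each separable part is not. All parameter values, variable names, and the Fourier-quadrant `direction_index` are assumptions chosen for illustration; they are not taken from the original models or from our data.

```python
import numpy as np

# Hypothetical sampling grid and tuning values, for illustration only.
x = np.linspace(-2.0, 2.0, 64, endpoint=False)   # space (deg)
t = np.linspace(0.0, 0.2, 64, endpoint=False)    # time (s)
fx, ft = 1.0, 10.0                               # cycles/deg and Hz

env_x = np.exp(-x**2 / (2 * 0.5**2))
env_t = np.exp(-(t - 0.1)**2 / (2 * 0.03**2))
s_even = env_x * np.cos(2 * np.pi * fx * x)
s_odd = env_x * np.sin(2 * np.pi * fx * x)           # quarter cycle in space
t_lead = env_t * np.cos(2 * np.pi * ft * (t - 0.1))
t_lag = env_t * np.sin(2 * np.pi * ft * (t - 0.1))   # quarter cycle in time

# Each outer product is space-time separable (nondirectional);
# their sum is space-time slanted.
separable = np.outer(t_lead, s_even)
slanted = separable + np.outer(t_lag, s_odd)

def direction_index(rf):
    """Imbalance of Fourier power between the two pairs of quadrants."""
    power = np.abs(np.fft.fft2(rf)) ** 2
    kt = np.fft.fftfreq(rf.shape[0])[:, None]
    kx = np.fft.fftfreq(rf.shape[1])[None, :]
    same = power[(kt * kx) > 0].sum()
    opposite = power[(kt * kx) < 0].sum()
    return (opposite - same) / (opposite + same)

di_separable = direction_index(separable)
di_slanted = direction_index(slanted)
```

With these values the index is essentially zero for the separable filter and close to one in magnitude for the summed filter, which is the sense in which a purely linear sum of nondirectional inputs yields a slanted, directionally biased filter.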
Two other physiologically reasonable candidates for nondirectional
inputs with different time courses have been proposed, one in cats and
a different one in primates: De Valois and colleagues (De Valois
and Cottaris 1998; De Valois et al. 2000) found
in primate V1 two populations of nondirectional simple cells: fast biphasic ones and slower monophasic ones. They determined from principal component analysis that a linear sum of these two classes of
nondirectional cells could account for the slant of the space-time maps
of the directional cells they mapped. De Valois and Cottaris proposed
that magnocellular cells, which are fast and transient, provide the
inputs to the fast biphasic inputs in their model, while parvocellular
cells, which are slower and more sustained (Dreher et al. 1976; Marrocco
1976; Schiller and Malpeli 1978; Schmolesky et al. 1998), provide the
inputs to the slower monophasic component. This model differs from the
original energy model in that the slower nondirectional component is
temporally monophasic (Fig. 8, bottom).
In the cat, geniculate cells with appropriately different response time
courses have been described: the lagged and nonlagged cells
(Mastronarde 1987a,b; Saul and Humphrey 1992; Wolfe and Palmer 1998).
Lagged cells are
delayed relative to nonlagged cells not because they are slower, as
parvo are compared with magno in the monkey, but because, to a
sustained stimulus, they reverse response polarity after a short time.
However, Wolfe and Palmer (1998) have since suggested
that lagged and nonlagged cells are not distinct categories but rather
that nonlagged cells are center dominated and lagged cells are surround
dominated. Because the initial response in lagged cells is negligible,
the response is essentially monophasic, so the lagged-cell model would
be similar in principle to the magno/parvo model proposed by De Valois
and Cottaris.
Thus physiological findings from anesthetized cat and monkey could be compatible with the original energy model if the model were modified so that the slower component is temporally monophasic and the faster component biphasic. We asked if our recordings in alert monkey also support this idea using principal components analysis.
Principal component analysis of direction-selective simple cells
We analyzed the shape of the simple-cell space-time maps using
principal components analysis, as was previously done by De Valois et al. (2000) in anesthetized monkeys. Principal
components analysis computes the space-time separable component of the
map that accounts for most of its 2-D shape and the amount of
additional orthogonal components needed to account for the entire
space-time map. De Valois et al. found that their directional
simple-cell receptive fields could be decomposed into a temporally
biphasic, spatially even-symmetrical, fast component and a slower,
temporally monophasic, spatially odd-symmetrical component.
When we did the same analysis on our simple cells recorded in alert
monkeys (Figs. 9 and
10), we also found that
the receptive fields could be decomposed into a fast, temporally
biphasic, component, and a slower, temporally monophasic, component,
which together accounted for more than 89% of the shape of each of the
maps. However, unlike De Valois et al. (2000), we did
not find that the fast component was usually spatially even symmetric
and the slow component was usually spatially odd symmetric; indeed, in the five cells analyzed in Fig. 9, which are the same simple cells as
in Fig. 4, the fast components are spatially odd-symmetric, and the
slower components are even. Figure 10, top, shows a summary of the spatial and temporal characteristics of the first two principal components for all the conventional simple cells in our population. (We excluded the single-subunit simple cells from this analysis because their space-time maps did not show much slant, so they did not decompose into two temporally distinct principal components.) A and B show that in general the faster component
was indeed usually temporally biphasic, while the slower component was
usually temporally monophasic. A also shows that we did not
find that the biphasic components were predominantly even symmetric, as
reported by De Valois et al. (2000); rather, all spatial
phases were represented in both components across the population.
C shows that the spatial phases of the two components tended
to be inversely correlated, but this may be an artifact of using
principal components analysis, which requires that each component
be orthogonal to the others. Figure 10, bottom, shows that
this result could indeed be an artifact of the analysis: two
nondirectional model cells, one fast and temporally biphasic
(D) and the other slower and temporally monophasic (E) were summed to generate a model space-time slanted
simple cell (F). Both the fast and the slow component were
spatially odd symmetric (phase angle: 90°) but were spatially offset
by a quarter cycle. Principal components analysis of the
"directional" cell (F) yielded two components
(G and H) whose time courses corresponded well
with the original inputs but whose spatial organization did not
correspond to the original inputs. In particular, the slower principal
component (H) had a 37° phase angle and was thus more even
symmetric than odd even though the starting slower monophasic input had
a 90° phase angle. This shows that although principal components
analysis can accurately decompose temporal components, it may not be
able to distinguish between a spatial position shift and a spatial
phase shift.
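The orthogonality constraint behind this artifact can be reproduced in a few lines. The sketch below is not the model of Fig. 10; it is a hypothetical stand-in that uses a singular value decomposition in place of principal components analysis, with made-up Gaussian temporal profiles (`bump`) and unit-normalized inputs. Both spatial inputs are odd-symmetric Gabors offset by a quarter cycle, one paired with a fast biphasic time course and the other with a slower monophasic one.

```python
import numpy as np

def bump(t, mean, sd):
    """Unit-height Gaussian bump, used to build toy temporal profiles."""
    return np.exp(-(t - mean) ** 2 / (2 * sd ** 2))

def unit(v):
    return v / np.linalg.norm(v)

x = np.linspace(-2.0, 2.0, 128)          # space (deg), hypothetical
t = np.linspace(0.0, 0.3, 64)            # time (s), hypothetical
freq, sigma, offset = 1.0, 0.5, 0.25     # offset = quarter cycle at 1 c/deg

# Two odd-symmetric spatial inputs, spatially offset by a quarter cycle.
s_fast = unit(np.exp(-x**2 / (2 * sigma**2)) * np.sin(2 * np.pi * freq * x))
s_slow = unit(np.exp(-(x - offset)**2 / (2 * sigma**2))
              * np.sin(2 * np.pi * freq * (x - offset)))
# Fast biphasic and slower monophasic time courses (not orthogonal).
t_fast = unit(bump(t, 0.05, 0.015) - 0.8 * bump(t, 0.09, 0.02))
t_slow = unit(bump(t, 0.09, 0.03))

rf = np.outer(t_fast, s_fast) + np.outer(t_slow, s_slow)   # slanted map
u, sv, vt = np.linalg.svd(rf, full_matrices=False)

explained = (sv[:2] ** 2).sum() / (sv ** 2).sum()   # two components suffice
recovered_overlap = abs(vt[0] @ vt[1])              # forced to be orthogonal
temporal_overlap = abs(t_fast @ t_slow)             # inputs overlap in time
match_to_input = abs(vt[0] @ s_fast)                # well below 1
```

Because the two temporal inputs overlap, the decomposition cannot return them unchanged while also keeping its components orthogonal, so each recovered spatial profile is a mixture of the two spatial inputs rather than a copy of either one.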
Therefore our result that direction-selective simple-cell space-time
maps can be decomposed into a fast temporally biphasic component and a
slower monophasic component is in agreement with the results of
De Valois et al. (2000), but we believe that the spatial
phase of the components is not well determined by principal components analysis.
Relationship between first and second-order maps: space-time slant versus velocity
Space-time slant is a property of the first-order maps, that is,
it is a property that depends on only the position and time after each
stimulus presentation. Directionality and velocity selectivity are
higher-order properties that depend on the spatial and temporal
separation of at least two stimuli. The simple cells in Fig. 4 show
both space-time slanted first-order maps and slanted second-order
dX/dT maps. The direction and slant of the
dX/dT maps are similar to the direction and slant
of the space-time maps. This similarity may be causal: the slant of the
space-time map may be (or may reflect) the mechanism underlying the
direction selectivity indicated by the dX/dT map
(Adelson and Bergen 1985). To look at the correlation
between the first-order space-time (receptive-field) maps and the
second-order (direction/velocity) interaction maps, the most obvious
question is whether the slants of the space-time maps match the
velocity selectivity described by the interaction maps, as they should
if the slant produces the direction and velocity selectivity.
This question has previously been asked in anesthetized cat and
anesthetized monkey, and in general, the direction and velocity selectivities of cells correlate well with the slants of the space-time maps, although the direction selectivity predicted from the first-order maps usually underestimates the measured magnitude of the direction selectivity. This comparison has been made in several different ways.
1) Static gratings are presented at different spatial
positions to measure first-order responses, which are shifted and
summed (superposition) to predict the response to moving gratings
(Jagadeesh et al. 1997; Murthy and Humphrey 1999; Reid et al. 1991;
Tolhurst and Dean 1991). 2) Space-time maps are generated
using flashed bars, and the slant of the space-time maps is measured
using a centroid calculation. This slant is compared with velocity
tuning to moving bars (McLean et al. 1994).
3) Space-time maps are generated using flashed bars, and the
predicted velocity tuning is estimated by taking the FFT. This
predicted velocity tuning is compared with velocity tuning to moving
gratings (Gaska et al. 1994). 4) Both first- and second-order maps
are generated using flashed bars, and they are
compared by taking the FFT of each to generate a predicted velocity;
both are compared with grating velocity tuning (Baker 2001).
5) First- and second-order maps are generated using flashed bars, and they are compared by generating a predicted response to moving bars using a half-squaring model; both first- and
second-order maps are compared with velocity tuning to moving bars (Emerson 1997).
Thus there are various ways to measure
first-order space-time slant and various ways to measure second-order
spatiotemporal interaction slant, and which one is optimal depends on a
number of things, most particularly whether the stimuli were bars or gratings.
Here we compare the space-time slant of the first-order maps
of all the cells in our population with the optimum interstimulus distance in the second-order maps for sequentially presented
stimulus pairs (13-ms intervals), with both first- and second-order
parameters calculated from the same spike train. For the first-order
space-time maps, we measured the slant in several different ways, and
here we evaluate each of these methods using a simple model. The three ways we compared for measuring first-order space-time slant are the
peak of the FFT of the map, which gives the spatiotemporal frequency with the most power; a Radon transform (Deans 1983), which maps the line integral of the image as a function of orientation; and a weighted centroid calculation (a modification of
the calculation used by McLean and Palmer 1987) (see
METHODS).
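Of the three measures, the FFT peak is the simplest to sketch. The example below builds a hypothetical slanted space-time map whose ridges have a known slope and recovers that slope from the spatiotemporal frequency with the most power. The function name `fft_peak_velocity`, the grid, and the tuning values are assumptions for illustration; the Radon-transform and weighted-centroid measures are defined in METHODS and are not reproduced here.

```python
import numpy as np

def fft_peak_velocity(rf, dt, dx):
    """Slope (deg/s) of a space-time map from the peak of its 2-D FFT.

    rf[time, space] is the map; dt (s) and dx (deg) are the sample
    spacings. The slope is minus the ratio of the temporal to the spatial
    frequency at the peak.
    """
    power = np.abs(np.fft.fft2(rf)) ** 2
    power[0, :] = 0.0      # ignore zero temporal frequency
    power[:, 0] = 0.0      # ignore zero spatial frequency
    it, ix = np.unravel_index(np.argmax(power), power.shape)
    f_t = np.fft.fftfreq(rf.shape[0], d=dt)[it]
    f_x = np.fft.fftfreq(rf.shape[1], d=dx)[ix]
    return -f_t / f_x

# Hypothetical map: a Gabor whose ridges follow x = true_v * t.
dt, dx = 0.2 / 64, 4.0 / 64
t = np.arange(64) * dt
x = np.arange(64) * dx - 2.0
T, X = np.meshgrid(t, x, indexing="ij")
true_v = 10.0                                   # deg/s, assumed
rf = (np.exp(-X**2 / (2 * 0.5**2))
      * np.exp(-(T - 0.1)**2 / (2 * 0.03**2))
      * np.cos(2 * np.pi * 1.0 * (X - true_v * (T - 0.1))))

estimated_v = fft_peak_velocity(rf, dt, dx)
```

The estimate is quantized by the frequency resolution of the map, so for short or coarsely sampled maps this measure can only take a limited set of values.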
For the second-order (interaction) maps, we measured the optimum
interstimulus distance for pairs of sequentially presented stimuli
divided by 13 ms (the interstimulus interval); this is equivalent to
the initial slant (0-13 ms) of the dX/dT plots. The optimum interstimulus separation for sequentially presented stimulus pairs at some interstimulus interval is a definition of
velocity tuning to a minimal motion stimulus. Velocity selectivity is
usually measured as the optimal dX/dT for a
continuously presented moving stimulus. Of course video monitors cannot
present a continuous stimulus: a "moving" stimulus on a video
monitor consists of flashed stimuli at a series of positions; the
optimal velocity is the average interstimulus separation divided by the
average interstimulus interval. We measure the optimal interstimulus
separation for a 13-ms interval using a minimal motion stimulus (2
flashed bars), which may or may not accurately reflect conventional
velocity tuning. A different optimum dX/dT to a
series of flashed bars than to a pair of bars would represent an
interaction higher than second order.
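Written out, the second-order measure used here is simply the optimum interstimulus distance divided by the interstimulus interval. The symbols below and the worked number (an optimum distance of 0.2°) are hypothetical and serve only to show the units.

```latex
v_{\mathrm{opt}} = \frac{\Delta x_{\mathrm{opt}}}{\Delta t},
\qquad \Delta t = 13\ \mathrm{ms}
% Hypothetical example: \Delta x_{\mathrm{opt}} = 0.2^{\circ} gives
% v_{\mathrm{opt}} = 0.2^{\circ} / 0.013\ \mathrm{s} \approx 15.4^{\circ}/\mathrm{s}
```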
Figure 11A shows a model
simple cell we used to explore the correlation between receptive-field
space-time slant (1st order) and velocity and direction selectivity
(2nd order); this is similar to the model cell in Fig. 8,
bottom, but with a slightly different spatial phase.
A shows this model cell's space-time map calculated by
subtracting dark responses from light. We have to assume some nonlinearity in this cell's output, otherwise it could not give a
direction-selective response to pairs of flashed bars, as our cells do.
The nonlinearity we assume for this model is a rectification followed
by a squaring (Adelson and Bergen 1985), that is,
a linear filter followed by a static nonlinearity. If we look at this model cell's response to pairs of flashed stimuli presented t1 and
t2 ms (as indicated in Fig. 11A) prior to the
time of the spikes mapped, we can calculate the nonlinearity due to
stimulus pairing (that is the direction and velocity selectivity) by
subtracting the inverting-contrast responses from the same contrast
(see METHODS). Figure 11B shows the pattern of
paired-bar interactions for this cell t2 ms after the
presentation of the second stimulus. This cell shows predominantly
positive interactions for preferred-direction sequences (same-contrast
facilitation and inverting-contrast suppression) and the reverse for
null-direction sequences. Note that these facilitatory and suppressive
interactions are arranged in a checkerboard pattern, as observed in
some real simple cells (Fig. 4, top 2 cells). Collapsing
(averaging) a slice parallel to the
45° diagonal of B
gives us a one-dimensional dX/dT plot of
interaction strength as a function of interstimulus distance and
interstimulus interval (Fig. 11C). This
dX/dT map is slanted (that is, the overall
Wiener-like kernel is slanted, indicating direction and velocity
selectivity). For this model cell, the slant of the light-minus-dark
space-time map, measured either by taking its FFT or its Radon
transform, corresponds to the average velocity reflected by the
dX/dT map. This is to be expected because, in
this model, the only mechanism for generating direction selectivity
depends on the first-order map.
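A short numerical sketch shows why a half-squaring output stage produces a checkerboard of paired-bar interactions. If a and b are the linear responses to the two bars, and the same-contrast and inverting-contrast responses are each pooled over both polarities (an assumption about the pooling, made here for illustration), then the interaction is (a + b)^2 - (a - b)^2 = 4ab, the product of the first-order kernel at the two bar positions and delays. The kernel, delays, and function names below are hypothetical.

```python
import numpy as np

def half_square(v):
    """Rectification followed by squaring."""
    return np.maximum(v, 0.0) ** 2

def pair_interaction(a, b):
    """Same-contrast minus inverting-contrast response for a bar pair.

    a and b are the linear (first-order) responses to each bar at unit
    contrast; both polarities of each pairing are pooled (assumed).
    """
    same = half_square(a + b) + half_square(-a - b)
    inverting = half_square(a - b) + half_square(b - a)
    return same - inverting

def kernel(x, lag):
    """Hypothetical slanted first-order kernel (space in deg, lag in s)."""
    return np.exp(-x**2 / (2 * 0.5**2)) * np.cos(2 * np.pi * (x - 10.0 * lag))

x = np.linspace(-2.0, 2.0, 41)
t1 = 0.060                 # reference bar shown t1 before the spike
t2 = t1 + 0.013            # preceding bar shown 13 ms earlier

a = kernel(x, t2)[:, None]     # preceding bar: rows (vertical axis)
b = kernel(x, t1)[None, :]     # reference bar: columns (horizontal axis)
imap = pair_interaction(a, b)

below = np.tril(imap, -1).sum()   # one direction of bar sequence
above = np.triu(imap, 1).sum()    # the opposite direction
```

Because the map is an outer product of two slices of the kernel taken at different delays, its sign alternates in a checkerboard, and it is asymmetric about the diagonal whenever the kernel is slanted, so the summed interaction differs between the two directions of bar sequence.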
Figure 11, A-C, shows space-time maps and paired-stimulus interactions for light and dark stimuli combined. For the real cells, we measured differences in direction selectivity using light and dark bars independently. Therefore in our model (Fig. 11, D-I) we also looked at interactions between pairs of light or pairs of dark stimuli separately. D and G show the space-time maps for light and dark stimuli independently, assuming the cell rectifies (cannot show negative firing rate). E and H show the interaction maps for pairs of s