Motion in the visual scene is processed by direction-selective neurons in primary visual cortex. These cells receive inputs that differ in space and time. What are these inputs? A previous single-unit recording study in anesthetized monkey V1 proposed that the two major streams arising in the primate retina, the M and P pathways, differed in space and time as required to create direction selectivity. We confirmed that cortical cells driven by P inputs tend to have sustained responses. The M pathway, however, as assessed by recordings in layer 4Cα and from cells with high contrast sensitivity, is not purely transient. The diversity of timing in the M stream suggests that combinations of M inputs, as well as of M and P inputs, create direction selectivity.
Direction-selective (DS) neurons discriminate the direction of moving stimuli by obtaining inputs with receptive fields that differ in space and in time. For one direction of motion, the inputs fire at about the same time (in phase) because the spatial and temporal differences cancel each other. In the opposite direction, the input activity is not as synchronous (out of phase). That is, stimulus direction is translated into relative timing. A wide variety of mechanisms can convert these timing differences into divergent postsynaptic activities.
Robust direction selectivity depends on approximately quarter-cycle phase differences (“spatiotemporal quadrature”; in one direction the quarter cycles subtract to 0, and in the other they sum to a half-cycle). Ideally, a DS cell would obtain inputs that differ by a quarter cycle. Spatially, such inputs exist in the form of simple cells with overlapping receptive fields that differ purely in phase, for example, even and odd symmetric fields with on-off-on and on-off arrangements, respectively. This has led some investigators to examine whether DS cells might receive inputs from non-DS cells that are in spatiotemporal quadrature. DeValois et al. (2000) showed that non-DS cells of these types exist in macaque V1. On the other hand, Peterson et al. (2004) argued that this scheme does not seem to work for cat V1.
Actual, as opposed to ideal, DS cells do not receive inputs from just two non-DS cells that are in approximate spatiotemporal quadrature. Instead, multiple inputs converge on DS cells, including both inhibitory and excitatory cortical inputs as well as direct excitation from the LGN. The spatial relationships among these inputs could vary in position as well as in spatial phase. We understand how these positional differences among receptive fields originate in the spatial distribution of retinal cells. The more compelling question is, where do the different timings arise?
In the cat, these timings originate in the retina, where a range of sustained and transient responses is generated. The distribution of timing is extended in the LGN, where additional temporal phase differences are created at low temporal frequencies (Saul and Humphrey 1990). Low temporal frequencies are of interest because a given phase difference corresponds to longer time differences as frequency decreases. For example, a quarter-cycle is 250 ms at 1 Hz and 1 s at 0.25 Hz. Furthermore, we expect single mechanisms to generate this wide range of long delays. What neuronal mechanisms can provide these long and variable delays across low temporal frequencies? Lagged cells in the cat LGN solve this problem via circuitry involving feedforward inhibition (Mastronarde 1987).
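The arithmetic behind the quarter-cycle example can be sketched as follows. The function name is hypothetical and is used only for illustration; the conversion itself is just one quarter of the stimulus period.

```python
def quarter_cycle_delay_ms(temporal_frequency_hz: float) -> float:
    """Time in ms spanned by a quarter cycle at the given temporal frequency."""
    if temporal_frequency_hz <= 0:
        raise ValueError("temporal frequency must be positive")
    period_ms = 1000.0 / temporal_frequency_hz  # one full cycle, in ms
    return period_ms / 4.0


# The same phase difference corresponds to longer delays at lower frequencies.
for f_hz in (0.25, 1.0, 4.0, 16.0):
    print(f"{f_hz:5.2f} Hz -> {quarter_cycle_delay_ms(f_hz):7.1f} ms")
```

A mechanism that maintains quadrature from 0.25 to 1 Hz must therefore produce delays ranging from 1 s down to 250 ms.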
In the monkey, it is less clear where the temporal phase differences originate. DeValois et al. (2000) provided evidence that DS neurons receive input from the two major streams arising in the primate retina, the P and M pathways. The P pathway tends to have receptive fields with sustained responses. The M pathway has more transient responses and therefore differs from the P pathway in temporal phase by about a quarter cycle, suggesting that combining these two streams could lead to direction selectivity. This proposal diverges from the standard thinking that DS neurons are dominated by the M pathway (Livingstone and Hubel 1988). DeValois et al. (2000) also concluded that simple cells with transient temporal responses were restricted to a narrow range of spatial phases. This bolstered their argument that it is the combination of M and P inputs that generates direction selectivity.
Conway and Livingstone (2003) found that DS neurons in alert monkeys appear to obtain inputs with sustained and transient timing but did not confirm the spatial correlations observed by DeValois et al. (2000). They speculated that the inputs might correspond to excitatory and inhibitory types with the time course of the inhibition differing from that of the excitation. In the work presented here, we presume that much of the timing information underlying DS in V1 is inherited from the inputs to the DS cells. The fundamental question of the extent to which response properties are derived from inputs versus generated in place has been discussed (Humphrey and Saul 2002; Reid et al. 2002). If intrinsic processes such as transformations of timing that differ between excitation and inhibition are important or if a combination of spatially similar inputs produced a novel type of timing, cells could show timing not present in their inputs. However, we have previously noted that timing in cat visual cortex resembles that in cat LGN (Saul and Humphrey 1992a,b) and interpreted those results as we will do here in terms of inheritance of timing.
We reinvestigated how spatial and temporal responses are structured in single neurons in visual cortex of anesthetized monkeys, to test the hypothesis of DeValois et al. (2000) that direction selectivity reflects the convergence of inputs from M and P pathways. We confirmed many of their findings, but correlations with laminar position and contrast sensitivity suggest that the M pathway is diverse, containing both transient and sustained responses. Results have been presented in abstract form (Saul et al. 2001).
All animals were obtained from licensed providers through the Division of Laboratory Animal Resources at the University of Pittsburgh, and all procedures were in accordance with the guidelines of the University of Pittsburgh and National Institutes of Health. Monkeys (Macaca nemestrina) were initially given 16 mg/kg ketamine and 0.1 mg atropine followed by anesthesia with halothane (3–4% as needed). The radial veins were cannulated for delivery of intravenous fluids, and a femoral artery was cannulated to enable monitoring of arterial pressure. A tracheal tube was inserted to permit artificial respiration. After stereotaxic fixation, using a headpost and removing the earbars, a small craniotomy was made above occipital cortex, and a chamber was mounted on the skull and filled with warm agar.
Heart rate, blood pressure, and electroencephalograms (EEGs) were continuously monitored throughout the experiment. During surgery, halothane (2–3% in 70% N₂O-30% O₂) or Saffan (Essex Animal Health, Friesoythe, Germany; 6-mg intravenous boluses approximately every 10 min) was used to maintain deep anesthesia. Once all surgery was completed, the monkey was paralyzed with an initial bolus of Pancuronium bromide (0.35 mg iv), and artificial respiration was initiated. End-tidal CO₂ was kept near 4% by adjusting stroke volume, and airway pressure was monitored to avoid tracheal clogging. Intravenous paralytic (0.1–0.3 mg · kg⁻¹ · h⁻¹ Pancuronium bromide in 4 ml/h saline) was continued, and anesthesia was slowly transferred to Sufentanil citrate (4–8 μg · kg⁻¹ · h⁻¹ in 6 ml/h lactated Ringer, intravenous). Prophylactic antibiotic (3 mg/lb Ampicillin) and B vitamin complex (0.6 ml) were administered intramuscularly daily.
The eyes were covered with gas-permeable contact lenses and were periodically flushed with hypertonic saline. The eyes were refracted and lenses were placed in front of the eye to focus at a distance of 114 cm. The optic nerve heads were projected onto a tangent screen, and the positions of the foveae were estimated as being 17° temporal.
At the end of each electrode penetration, HRP was ejected from the electrode tip by passing 500-nA positive current on a 7 s on/14 s off schedule over 10 min. As the electrode was slowly withdrawn, several other marks were made at intervals of 1–2 mm.
At the end of the experiment, the monkey was administered a lethal dose (∼150 mg/kg iv) of pentobarbital sodium, which produced an immediate flattening of the EEG. The animal was perfused transcardially with cold saline followed by 2 l of 2% paraformaldehyde. Cortical tissue was sectioned (100 μm thickness), stained for cytochrome oxidase, and reacted with DAB chromogen to reveal the HRP marks.
Reconstruction of penetrations relied on estimating laminar borders based on cytoarchitecture (Fig. 1), in most cases, and a combination of cytoarchitecture and cytochrome oxidase staining in a number of cases. Most recording sites could be assigned to a specific laminar location, as penetrations were oblique and stayed in individual layers for a long distance.
Stimulation and recording
Computer-controlled stimuli were presented on a BARCO monitor driven at 69.829 Hz by a Piranha video card inside a PC. Custom software on the PC was controlled in turn by code written in Igor Pro (WaveMetrics, Lake Oswego, OR) running on a Macintosh computer.
Single neurons in V1 were recorded extracellularly with beveled glass pipettes filled with 0.2 M KCl in Tris buffer and 6% HRP, having impedances >50 MΩ. Spikes were isolated with a window discriminator, and pulses were timed relative to the stimulus at a resolution of 1 ms. Each cell was tested in several ways. We present here data primarily from responses to “sparse and dense noise” as described in the following text. All neurons were tested extensively with other standard stimulus protocols, including hand-plotting, orientation tuning, spatial frequency tuning and, in many cases, length-tuning. Color preferences were assessed with red, green, and blue stimuli. Oriented cells were classified as simple or complex based on the spatial overlap of excitatory responses to bright and dark stimuli.
Spatiotemporal maps were computed from responses to noise stimuli consisting of several bars (typically 16 for dense and 32 for sparse noise), the luminance of which was reset every frame (that is, at ∼70 Hz or every 14.3 ms). Luminance values were chosen based on an m-sequence from either two values (binary: bright and dark) or three values (ternary: bright, mean, and dark). Mean luminance was 20 cd/m², and contrast was 80–100%. In the case of sparse noise, a single bright or dark bar appeared in one of the 32 positions during each frame. In addition, for sparse noise the position and luminance were always fixed for three consecutive frames, so each bar was actually presented for ∼43 ms. For dense noise, a randomly chosen luminance value was assigned to each of the 16 bars during each frame.
These spatiotemporal maps are functions of space and time and can be thought of as the average firing rate of a neuron following a brief presentation of a bar at each position. A less intuitive but more useful way to think about these maps, however, is that they represent the average stimulus preceding a spike. For instance, the first-order map is represented by the function R(x, τ) = Σ_i S(x, t_i − τ), where x is the bar position, τ is the latency between the stimulus and a spike, t_i are the recorded spike times, and S is the luminance of the noise stimulus (−1 for dark and +1 for bright). In practice, this sum is normalized by the entire stimulus set (by dividing by the number of times a bright or dark bar was presented at each position), but the form in the preceding equation illustrates that these maps represent the stimuli that tended to cause spikes.
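The spike-triggered sum described above can be sketched in a few lines. This is a minimal illustration, not the analysis code used in the study. It works in units of stimulus frames rather than the 5-ms bins used for the real maps, and the function name, the toy stimulus, and the model cell are all assumptions made for the example.

```python
import random
from typing import List, Sequence


def first_order_map(stimulus: Sequence[Sequence[int]],
                    spike_frames: Sequence[int],
                    n_lags: int) -> List[List[float]]:
    """Spike-triggered sum over spikes of S(x, t_i - lag), normalized per position.

    stimulus[t][x] is the luminance of bar x on frame t (-1 dark, 0 mean, +1 bright).
    spike_frames holds the frame index of each spike.
    Each position is divided by the number of bright or dark presentations there.
    """
    n_bars = len(stimulus[0])
    counts = [sum(1 for frame in stimulus if frame[x] != 0) for x in range(n_bars)]
    rf = [[0.0] * n_lags for _ in range(n_bars)]
    for t in spike_frames:
        for lag in range(n_lags):
            if t - lag < 0:          # no stimulus before the first frame
                break
            frame = stimulus[t - lag]
            for x in range(n_bars):
                rf[x][lag] += frame[x]
    for x in range(n_bars):
        if counts[x]:
            rf[x] = [v / counts[x] for v in rf[x]]
    return rf


# Toy sparse noise (assumption): one bright or dark bar per frame, 4 positions.
rng = random.Random(0)
n_frames, n_bars = 2000, 4
stimulus = []
for _ in range(n_frames):
    frame = [0] * n_bars
    frame[rng.randrange(n_bars)] = rng.choice((-1, 1))
    stimulus.append(frame)

# Hypothetical model cell: one spike exactly 2 frames after a bright bar at position 1.
spike_frames = [t + 2 for t, frame in enumerate(stimulus)
                if frame[1] == 1 and t + 2 < n_frames]

rf = first_order_map(stimulus, spike_frames, n_lags=5)
print([round(v, 3) for v in rf[1]])  # the map at position 1 should peak at lag 2
```

In this toy case the map recovers the stimulus that drove the model cell: a bright bar at position 1, two frames before each spike.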
First-order spatiotemporal receptive-field maps were generated for each cell by reverse correlating the responses with the noise stimulus. Space was sampled by the number of bar positions, 16 or 32, used for the stimulus. Time was sampled in 5-ms steps, with a duration of 1.28 s. A singular value decomposition (Kontsevich 1995) of these receptive field maps provided a single separable component for non-DS cells and two separable components for most DS cells. These separable components consisted of the product of a temporal and a spatial profile. We analyzed each of these profiles as follows (Fig. 2).
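As a rough illustration of what a separable component is, the sketch below extracts the dominant one from a small map. It substitutes power iteration for a full singular value decomposition (an assumption made to keep the example dependency-free), and the function name and toy profiles are hypothetical.

```python
import math
from typing import List, Tuple


def leading_separable_component(rf_map: List[List[float]],
                                n_iter: int = 200
                                ) -> Tuple[float, List[float], List[float]]:
    """Return (weight, spatial, temporal) so that rf_map[x][t] is approximated
    by weight * spatial[x] * temporal[t], with both profiles of unit norm.

    Power iteration stands in here for the SVD; it yields only the first component.
    """
    n_x, n_t = len(rf_map), len(rf_map[0])
    # Start from the strongest row so the initial guess overlaps the true profile.
    temporal = list(max(rf_map, key=lambda row: sum(v * v for v in row)))
    if not any(temporal):
        raise ValueError("map is all zeros")
    spatial = [0.0] * n_x
    weight = 0.0
    for _ in range(n_iter):
        spatial = [sum(rf_map[x][t] * temporal[t] for t in range(n_t))
                   for x in range(n_x)]
        norm = math.sqrt(sum(v * v for v in spatial))
        spatial = [v / norm for v in spatial]
        temporal = [sum(rf_map[x][t] * spatial[x] for x in range(n_x))
                    for t in range(n_t)]
        weight = math.sqrt(sum(v * v for v in temporal))
        temporal = [v / weight for v in temporal]
    return weight, spatial, temporal


# Toy separable map (assumption): an even-symmetric spatial profile times a
# biphasic temporal profile.
spatial_true = [0.0, 1.0, -2.0, 1.0]
temporal_true = [0.0, 3.0, 1.0, -1.0]
toy_map = [[s * t for t in temporal_true] for s in spatial_true]

weight, spatial, temporal = leading_separable_component(toy_map)
print(round(weight, 4))
```

A second component, as found for most DS cells, could be obtained by subtracting the first component from the map and repeating the procedure. The signs of the two profiles are individually arbitrary; only their product is determined.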
The first two peaks in the temporal profile (also termed the impulse response) were found and the ratio of the sizes of the second to the first peak was taken as the biphasic index (Fig. 2A). Peaks were found automatically, by first examining the absolute value of the impulse response. The first point that exceeded the SD (computed across the entire 1.28 s of the absolute value of the impulse response) was used as a starting point to search for the first peak using Igor's built-in FindPeak routine that uses a simple algorithm based on the first two derivatives. If no peak was found, the data were not used, accounting for cases where only a single component was derived from DS cells. If a peak was found, the same algorithm searched for a second peak occurring ≥20 ms after the first peak. If a second peak was not found, a minimal value (0.02) was assigned to the biphasic index. The automatic peak-finding was always checked for accuracy by one of the authors.
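The peak-finding logic can be sketched as below. This is a simplified stand-in: it uses a plain local-maximum search rather than Igor's FindPeak routine, and the function names, the population SD, and the toy impulse responses are assumptions for illustration only.

```python
import math
from typing import List, Optional, Sequence


def _next_peak(abs_h: Sequence[float], start: int) -> Optional[int]:
    """Index of the first local maximum of abs_h at or after start, else None."""
    for i in range(max(start, 1), len(abs_h) - 1):
        if abs_h[i] >= abs_h[i - 1] and abs_h[i] > abs_h[i + 1]:
            return i
    return None


def biphasic_index(impulse_response: Sequence[float],
                   dt_ms: float = 5.0,
                   min_gap_ms: float = 20.0,
                   floor: float = 0.02) -> Optional[float]:
    """Ratio of second to first peak of |impulse response|.

    Returns None if no first peak is found, and `floor` if no second peak is found.
    """
    abs_h: List[float] = [abs(v) for v in impulse_response]
    mean = sum(abs_h) / len(abs_h)
    sd = math.sqrt(sum((v - mean) ** 2 for v in abs_h) / len(abs_h))
    # The search starts at the first sample exceeding the SD of |h|.
    start = next((i for i, v in enumerate(abs_h) if v > sd), None)
    if start is None:
        return None
    first = _next_peak(abs_h, start)
    if first is None:
        return None
    gap = math.ceil(min_gap_ms / dt_ms)  # second peak must be >= 20 ms later
    second = _next_peak(abs_h, first + gap)
    if second is None:
        return floor
    return abs_h[second] / abs_h[first]


# Toy impulse responses sampled at 5 ms (assumption).
biphasic = [0, 0, 1, 4, 1, 0, 0, 0, -0.5, -2, -0.5, 0, 0, 0, 0, 0]
monophasic = [0, 0, 1, 4, 1, 0, 0, 0, 0, 0, 0, 0]
print(biphasic_index(biphasic), biphasic_index(monophasic))
```

Real impulse responses are noisy, which is why the automatic result was always checked by eye in the procedure described above.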
A strongly biphasic cell has a biphasic index near 1 because the two peaks have similar amplitude (solid curve in Fig. 2). Profiles with small biphasic indices are monophasic, with one dominant peak (dotted curve). Note that in some cases the second peak is larger than the first peak, in which case the biphasic index is >1 (dashed curve). Such an impulse response is typical of lagged responses (Saul and Humphrey 1990).
The biphasic index, like other measurements of temporal processes in the time domain, suffers from serious shortcomings discussed in the following text. Because response timing, in contrast to response amplitude, has been demonstrated to behave somewhat linearly (DeAngelis et al. 1993; Lee et al. 1994; Saul 2004; Saul and Humphrey 1990), measuring phase is useful. We therefore transformed the temporal profile, via FFT, to the frequency domain (Fig. 2B). The response phase versus temporal frequency data were fit by a line to give a slope, which we call latency, and an intercept, absolute phase. These fits were weighted by the amplitude data so that frequencies where the response was minimal were ignored. Only frequencies <16 Hz were included. Absolute phase and biphasic index correspond to each other as illustrated by the three artificial examples in Fig. 2 and for the real population data in Fig. 3. Sustained responses with biphasic index near 0 have absolute phase values near 0 (dotted curves in Fig. 2 and points near the origin in the polar plot of Fig. 3). As the second peak of the impulse response grows, it increasingly suppresses activity generated by the first peak; responses become more transient; the biphasic index increases; and absolute phase advances (becomes more negative). The solid curves in Fig. 2 show an example of such a biphasic, transient response. When the two peaks have the same size, biphasic index is 1 and absolute phase approaches −0.25 c (leftward in the polar plot). The biphasic index can increase further when the second peak is larger than the first peak, corresponding to absolute phase lags (positive values). The dashed curves in Fig. 2 show an example of this. Note that if the “second” peak becomes far larger than the “first” peak, one would eventually ignore the vanishing peak and return to the sustained, monophasic case. Absolute phase captures this cycle explicitly, as the values lie on a circle over a half-cycle of phase (as indicated in Fig. 3, where there are points with absolute phase near 0 for both small and large values of biphasic index). The half cycle, rather than full cycle, arises here because we equate responses to bright or dark contrasts for purposes of measuring their timing.
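The amplitude-weighted line fit that yields latency and absolute phase can be sketched as follows. This is an illustrative weighted least-squares fit under stated assumptions: phases are already unwrapped and expressed in cycles, so the slope comes out in seconds; the function name and the synthetic data are hypothetical.

```python
from typing import Sequence, Tuple


def fit_latency_and_absolute_phase(freqs_hz: Sequence[float],
                                   phases_cycles: Sequence[float],
                                   amplitudes: Sequence[float],
                                   max_freq_hz: float = 16.0
                                   ) -> Tuple[float, float]:
    """Weighted fit of phase = absolute_phase + latency * frequency.

    Returns (latency in s, absolute phase in cycles). Weights are the response
    amplitudes; only frequencies below max_freq_hz are used.
    """
    pts = [(f, p, w) for f, p, w in zip(freqs_hz, phases_cycles, amplitudes)
           if f < max_freq_hz and w > 0]
    if len(pts) < 2:
        raise ValueError("need at least two usable frequencies")
    sw = sum(w for _, _, w in pts)
    mean_f = sum(w * f for f, _, w in pts) / sw
    mean_p = sum(w * p for _, p, w in pts) / sw
    s_ff = sum(w * (f - mean_f) ** 2 for f, _, w in pts)
    s_fp = sum(w * (f - mean_f) * (p - mean_p) for f, p, w in pts)
    if s_ff == 0:
        raise ValueError("frequencies must not all be equal")
    latency_s = s_fp / s_ff
    absolute_phase = mean_p - latency_s * mean_f
    return latency_s, absolute_phase


# Synthetic transient response (assumption): 67-ms latency, -0.2 c absolute phase.
freqs = [1.0, 2.0, 4.0, 8.0, 12.0, 20.0]
phases = [-0.2 + 0.067 * f for f in freqs[:5]] + [3.0]  # 20-Hz point is excluded
amps = [1.0, 2.0, 3.0, 2.0, 1.0, 5.0]

latency_s, abs_phase = fit_latency_and_absolute_phase(freqs, phases, amps)
print(round(latency_s * 1000, 1), "ms;", round(abs_phase, 3), "c")
```

Because the weights come from the amplitude spectrum, frequencies at which the cell barely responds have little influence on the fitted line.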
This frequency-domain procedure eliminates the need to identify peaks, which can be ambiguous. Small peaks can be difficult to distinguish from noise. Furthermore, more than two peaks can occur in an impulse response function, but the biphasic index only considers two of them. The absolute phase measure has the advantage of taking into account the entire time course of the impulse response rather than the two points (the peaks) measured with the biphasic index. The biphasic index becomes extremely noisy at values away from 1 for these reasons (Fig. 3). As a ratio measure, biphasic index scales logarithmically (Figs. 3, 5, 8, and 9), although this has often been neglected by other authors (Cai et al. 1997; DeAngelis et al. 1993; DeValois et al. 2000; Peterson et al. 2004). Note that the biphasic index is least sensitive in the range from 0.5 to 2, where absolute phase varies across half of its range. These are the values for transient responses, which the biphasic index tends to lump together, not distinguishing shades of transiency. Impulse response functions for several cells are illustrated in Fig. 3 to reinforce these points. The examples at the bottom of this figure have similar biphasic indices but do not have the same timing. The absolute phase values distinguish these and other timing differences much better. Taking into account the latencies of the peaks along with the biphasic index (DeValois et al. 2000; Peterson et al. 2004) helps, but, as can be seen in these examples, shorter latencies do not necessarily correspond to more transient responses.
We follow the convention of DeValois et al. (2000) in referring to impulse response functions with biphasic indices <0.3 as monophasic and indices >0.5 as biphasic. As seen in Fig. 5, these are arbitrary cutoffs of a distribution that may or may not be bimodal. We will further call profiles with absolute phase values between −0.1 and 0.1 c sustained, and the rest transient, approximately matching the cutoffs between monophasic and biphasic cells. We illustrate both measures in the following text, although they are somewhat redundant, as illustrated in Fig. 3. The biphasic index is familiar and directly comparable to previous results in the literature. However, absolute phase more accurately reflects timing for the reasons given in the preceding text.
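These cutoffs amount to two simple classification rules, sketched below. The function names are hypothetical, the label "intermediate" for indices between 0.3 and 0.5 is an assumption (the text leaves that range unnamed), and treating the ±0.1 c boundary as sustained is likewise an assumption.

```python
def classify_biphasic_index(biphasic_index: float) -> str:
    """Label a temporal profile by its biphasic index."""
    if biphasic_index < 0.3:
        return "monophasic"
    if biphasic_index > 0.5:
        return "biphasic"
    return "intermediate"  # assumption: range left unlabeled in the text


def classify_absolute_phase(absolute_phase_cycles: float) -> str:
    """Label a temporal profile by its absolute phase (in cycles)."""
    if -0.1 <= absolute_phase_cycles <= 0.1:
        return "sustained"
    return "transient"


for bi, phase in ((0.05, -0.03), (0.8, -0.22), (1.6, 0.18)):
    print(bi, classify_biphasic_index(bi), phase, classify_absolute_phase(phase))
```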
The latency parameter derived from the phase versus temporal frequency data describes how phase varies across temporal frequency and is thus important for determining how direction selectivity depends on temporal frequency (Saul and Humphrey 1990, 1992a). The phase of the response at any given frequency is the sum of the absolute phase and the product of the latency and the frequency. Unlike latency measures derived in the time domain, the frequency-domain latency is independent of the shape of the impulse response (that is, it is independent of absolute phase). The three examples in Fig. 2 have the same latency. They also share an amplitude tuning curve, so are distinguished purely by absolute phase. We use the term latency for this parameter that is often called “integration time” because it measures the delay between stimulus and response (Saul and Humphrey 1990). We caution that it differs from onset latency measures and is closer to latencies measured with respect to the envelope of an impulse response function (Cai et al. 1997).
The spatial profile (Fig. 2C) was analyzed entirely in the frequency domain (Fig. 2D). First, the measured profile was subsampled, then it was expanded to four times the original width by zero-padding. Following slight smoothing and windowing, it was transformed and the phase was unwrapped. The phase versus spatial frequency data were fit by a line weighted by the amplitude, and the intercept was taken as the spatial phase value (the slope of this line is not important here; it represents the offset between the receptive field and an arbitrary point). This process is unambiguous, in contrast to space-domain methods of estimating spatial phase (e.g., fitting a Gabor function), which become ambiguous when a spatial profile is dominated by a single subzone. Note that the Fourier transform of a Gabor function is Gaussian in its amplitude tuning. This is not a good assumption for most cells. We avoid such assumptions with the more general method of simply fitting the phase data weighted by the amplitude data.
Direction selectivity was assessed by stimulating with optimal gratings drifting in each direction at a range of temporal frequencies (typically 0.25–32 Hz). A cell was considered DS if the mean response amplitude in the preferred direction was at least twice that in the nonpreferred direction over an octave around the optimal temporal frequency. The difference between the directions also had to be significant (t-test, P < 0.01).
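The amplitude part of this criterion can be sketched as below. The sketch assumes that "an octave around the optimal temporal frequency" means the band from optimal/√2 to optimal·√2, and it omits the t-test; the function name and the example tuning data are hypothetical.

```python
import math
from typing import Sequence


def meets_ds_amplitude_criterion(freqs_hz: Sequence[float],
                                 preferred: Sequence[float],
                                 nonpreferred: Sequence[float],
                                 optimal_hz: float,
                                 ratio: float = 2.0) -> bool:
    """True if the mean preferred amplitude is at least `ratio` times the mean
    nonpreferred amplitude within one octave centered on optimal_hz.

    The significance test (t-test, P < 0.01) is not included in this sketch.
    """
    lo, hi = optimal_hz / math.sqrt(2), optimal_hz * math.sqrt(2)
    idx = [i for i, f in enumerate(freqs_hz) if lo <= f <= hi]
    if not idx:
        raise ValueError("no tested frequency falls within the octave")
    mean_pref = sum(preferred[i] for i in idx) / len(idx)
    mean_nonpref = sum(nonpreferred[i] for i in idx) / len(idx)
    return mean_pref >= ratio * mean_nonpref


# Hypothetical tuning data (spikes/s) at five temporal frequencies.
freqs = [1.0, 2.0, 3.0, 4.0, 8.0]
pref = [5.0, 10.0, 20.0, 18.0, 4.0]
print(meets_ds_amplitude_criterion(freqs, pref, [5.0, 9.0, 8.0, 6.0, 4.0], 3.0))
print(meets_ds_amplitude_criterion(freqs, pref, [5.0, 9.0, 12.0, 10.0, 4.0], 3.0))
```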
Contrast response functions were measured with optimal gratings, testing 20 contrasts between ∼0.01 and 0.9 (Fig. 10). Amplitudes were fit with a sigmoidal function, and phase with a piecewise linear function that was constant below threshold. These fits were performed in the complex plane, so amplitude and phase were fit simultaneously, taking advantage of the phase data, which leads to better estimates of threshold. Contrast threshold was taken as the contrast at 10% of the upper asymptote of the sigmoid.
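To make the threshold definition concrete, the sketch below assumes a hyperbolic-ratio (Naka-Rushton) sigmoid. The text does not specify the sigmoid's form, so this choice, along with the function and parameter names, is an assumption; the joint amplitude and phase fit in the complex plane is not reproduced here.

```python
def hyperbolic_ratio(contrast: float, r_max: float, c50: float, n: float) -> float:
    """Assumed sigmoid: R(c) = r_max * c**n / (c**n + c50**n)."""
    return r_max * contrast ** n / (contrast ** n + c50 ** n)


def contrast_threshold(c50: float, n: float, fraction: float = 0.1) -> float:
    """Contrast at which the assumed sigmoid reaches `fraction` of its asymptote.

    Solving c**n / (c**n + c50**n) = fraction gives
    c = c50 * (fraction / (1 - fraction)) ** (1 / n).
    """
    if not 0 < fraction < 1:
        raise ValueError("fraction must lie strictly between 0 and 1")
    return c50 * (fraction / (1.0 - fraction)) ** (1.0 / n)


# Hypothetical fitted parameters: semisaturation contrast 0.3, exponent 2.
threshold = contrast_threshold(c50=0.3, n=2.0)
print(round(threshold, 4), round(hyperbolic_ratio(threshold, 50.0, 0.3, 2.0), 4))
```

Under this assumed form the threshold depends only on the semisaturation contrast and the exponent, not on the response asymptote.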
We also tested some cells with sinusoidally-modulated bars at a series of positions across the receptive field (Fig. 11) (Saul and Humphrey 1992b). This stimulus maps the receptive field much like the noise stimulus but provides stronger stimulation at low temporal frequencies.
Data shown here were obtained from 144 simple cells in V1 from eight monkeys. Of these, 61 were DS (42%). Receptive field spatiotemporal maps were analyzed by decomposing them into separable components that yield spatial and temporal profiles (DeValois et al. 2000; Kontsevich 1995). Non-DS cells provided a single component. Seventeen of the DS cells only yielded one component for analysis because the other had low signal to noise, and 44 DS cells provided two components.
Figure 4 shows four examples of non-DS simple-cell receptive fields. These cells illustrate the combinations of 2 types of temporal profile (sustained and transient) and 2 types of spatial profiles (even and odd symmetry). We strongly caution, however, that these dichotomies represent opposite points on circles and that responses actually cover the full ranges of temporal and spatial phases as shown by population figures in the following text. Spatiotemporal maps are represented by bright-excitatory/dark-inhibitory (bright areas) and dark-excitatory/bright-inhibitory (dark areas) plots with the spatial and temporal profiles shown above and to the right, respectively. The cell in Fig. 4A had a highly sustained response with a single off subzone. The cell in Fig. 4B was biphasic and had even spatial phase, again having a single (off) subzone. Another monophasic receptive field is illustrated in Fig. 4C, this time with an odd-symmetric spatial profile. Finally, a biphasic odd-symmetric receptive field is shown in Fig. 4D.
A full range of temporal profiles was observed across the population, but sustained responses predominated. The distribution of the biphasic index for non-DS cells is shown in Fig. 5A, where we also replot the data from DeValois et al. (2000) for comparison. Of the 83 non-DS cells in this study, 46% were highly sustained (biphasic index < 0.1). The results are similar to those of DeValois et al. (2000), although they sampled transient responses more often. Only 7 of these 83 cells had biphasic index values >0.5. Of those seven cells, three had biphasic index values >1.0. Thus few of the non-DS cells had biphasic indices between 0.5 and 1.0, the range one expects for “classic” biphasic profiles. In contrast, DeValois et al. reported 92 monophasic and 27 biphasic non-DS cells. They argued that this distribution was bimodal. We did not observe two modes.
DeValois et al. (2000) did not report absolute phase values, but we show this measure of timing for our data in Fig. 5B. The solid line represents the percentage of non-DS cells with different values of absolute phase. Almost all of these cells were sustained, with absolute phase centered around −0.03 c. Only 16% (13 of 83) of the non-DS cells had absolute phase values less than −0.1 c or greater than +0.1 c and were thus considered to be transient. Are transient responses observed almost exclusively in DS cells or are few transient responses ever observed? The gray line and filled region in Fig. 5B show the distribution of absolute phase for DS cells, including both components where they were significant. Transient responses are more frequent than for the non-DS cells, but sustained responses still dominate. Of the 105 DS receptive field components, 40 (38%) were transient. The distribution of absolute phase for DS cells has three modes. Thus neither absolute phase nor biphasic index suggests that timing is bimodal.
The spatial profiles of the non-DS simple cells were mostly even symmetric. This was true not only of the few transient cells but also held for the sustained cells. Figure 6 plots the frequency of non-DS cells, with either sustained (A) or transient (B) responses, against spatial phase. The small number of transient non-DS simple cells prevents us from drawing strong conclusions, but we did find cells with transient odd-symmetric responses (e.g., Fig. 4D) that the DeValois et al. (2000) study did not observe. The distribution of the sustained cells across spatial phase differed between the studies (P < 0.01, χ² test). Even symmetry was much more common than odd symmetry in our sample, whereas DeValois et al. found a fairly uniform distribution. Similar results were found when the sustained/transient distinction was replaced by the monophasic/biphasic distinction.
Figure 6, C and D, shows the data from DS cells, including both components where they were significant. Even spatial phases dominated for both the sustained and transient groups. The spatial phase distributions did not differ significantly based on timing (P > 0.64, χ² test).
We wondered whether the predominance of even symmetry depended on recording from simple cells with a single subzone. In Fig. 7 we break down the population according to the qualitatively judged number of subzones in the receptive field. The histograms in A indicate how the number of subzones relates to the spatial phase measurement. As expected, single-subzone cells always have even spatial phase, and cells with two subzones tended to have odd spatial phase. This correspondence is much stronger for one than for three subzones. Spatial phase depends not just on the number of zones but also on their relative strengths, and those relative strengths only exist when more than one subzone is present. Figure 7B shows the frequency with which we found cells with different numbers of subzones. Single-subzone cells made up 34% of the population. They were distributed across all layers, with no difference from the distribution for cells with two subzones (data not shown). Timing did not differ significantly across receptive fields with different numbers of subzones (geometric means of the biphasic index were 0.18, 0.20, and 0.29 for 1, 2, and 3 subzones, respectively, P = 0.78 for the difference between 1 and 3 zones).
DeValois et al. (2000) attributed the dichotomy they observed between biphasic even symmetric fields and monophasic fields to M and P pathways. We provide three types of evidence relevant to this hypothesis: laminar correlations, contrast sensitivity, and the dependence of direction selectivity on contrast.
Magnocellular LGN projects most strongly to layer 4Cα, and parvocellular LGN to 4Cβ (Blasdel and Lund 1983; Hendrickson et al. 1978; Hubel and Wiesel 1972), so we examined temporal and spatial profiles for these sublayers (Fig. 8). In 4Cβ (filled squares), timing was uniformly monophasic and sustained. Timing was much more heterogeneous in 4Cα (open circles): some receptive fields had biphasic and transient timing, but others were monophasic and sustained. In both sublaminae, spatial phases were broadly distributed, with less bias toward even symmetry than in other layers (especially 2/3, 4A, and 5). To the extent that 4Cα is dominated by M input, these results suggest that the M pathway is diverse in its timing, with both sustained and transient responses.
Correlations with contrast sensitivity confirm this idea. Magnocellular neurons tend to have higher contrast sensitivity than parvocellular neurons (Blasdel and Fitzpatrick 1984; Derrington and Lennie 1984; Hawken and Parker 1984; Hubel and Livingstone 1990; Tootell et al. 1988). Figure 9 shows the distribution of contrast thresholds for fields with different timings. Higher thresholds (e.g., >0.1) are associated with sustained timing, as would be expected for the P pathway. Low thresholds were found for cells with both sustained and transient responses. Again, this suggests uniformly sustained/monophasic timing in the P pathway and diverse timing in the M pathway.
If cells depended on mandatory combinations of M and P input for their direction selectivity, they should become much less DS at low contrasts, where the weak responses from P inputs would be overwhelmed by the responses of M inputs. According to this hypothesis, those M inputs would drive the cell similarly in both directions at low contrasts. This behavior was not found for any cells. Figure 10 shows six representative counterexamples, cells that retained their direction selectivity at low contrasts. Neurons that display direction selectivity do so down to threshold. At a contrast of ∼0.1 (indicated by arrowheads in the graphs of DS index vs. contrast), the M pathway should dominate the P pathway at low-moderate spatial frequencies (Hubel and Livingstone 1990; Kaplan and Shapley 1982; Tootell et al. 1988). All of the cells in Fig. 10 are DS at, and usually below, this contrast. This is clearest in the simple cell shown in Fig. 10C, which never fired for stimuli that were below threshold or drifted in the nonpreferred direction. The complex cell in Fig. 10E might be thought to show a loss of direction selectivity at low contrast because the response in the nonpreferred direction increases with decreasing contrast. However, the activity at low contrasts does not represent evoked responses but instead is spontaneous activity that persists to the lowest contrasts in both directions. Above threshold, activity is suppressed in the nonpreferred direction.
One could propose that DS cells receive far more inputs from the P pathway than from the M pathway and that this numerical superiority compensates for the low response amplitudes of P cells at low contrasts. However, one would then expect that at high contrasts, where the difference in response amplitudes between the M and P pathways is less severe, the P inputs would come to dominate the M inputs and DS would decline. Some cells may in fact show this sort of behavior, as in examples A and F in Fig. 10, but others clearly do not show evidence of such an imbalance. Ideally, one would examine contrast-response functions in DS neurons distinguished by the presence or absence of inputs from the P pathway, e.g., pyramidal versus stellate cells in 4B (Yabuta et al. 2001).
In addition to the measurements that integrate over the entire receptive field, we also measured the timing at each position within the field. These data probably reflect the spatially localized inputs more closely than the analyses based on the whole field. Timing was characterized by the intercept and slope of the phase versus temporal frequency plot, the absolute phase and latency. Positions with unreliable timing values were discarded according to the criteria of Saul and Humphrey (1992b). Each cell contributed one to seven points, depending on receptive field size and complexity; receptive fields in 4Cβ yielded only one to four points because of their simpler structures. Absolute phase values in 4Cβ were relatively uniform (Fig. 11), reflecting the sustained responses observed in these cells. Absolute phase was much more heterogeneous in 4Cα. Latencies were nearly identical in these two layers (geometric means of 67 ms for 4Cα and 68 ms for 4Cβ). These sublayers (and presumably their LGN inputs) are distinguished by their temporal properties and in particular by their response timing. However, this distinction does not consist of a latency difference. Instead, the distinction lies in the variance of their distributions of absolute phase. Absolute phase has a small variance in 4Cβ and a large variance in 4Cα, reflecting the respective uniformity and diversity of timing in these layers.
This comparison between responses in layer 4C could reflect differences between cells in parvocellular and magnocellular layers of LGN and/or might be based on cortical processing. Intracortical effects are reduced, although certainly not eliminated, in this experiment. Direct measurement of the temporal differences between P and M cells will be required to confirm that timing in the M pathway is diverse.
M and P inputs
The parallel pathways that arise in the primate retina display several functional distinctions. Previous studies have indicated that timing is one such property. The results of this study suggest that this distinction is subtle. Almost all cells in layer 4Cβ, and cells with low contrast sensitivity, which are probably dominated by P input, had sustained timing. In contrast, cells in 4Cα and those with higher contrast sensitivity showed a wide range of timing.
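Because absolute phase is a cyclic quantity, the contrast between uniform and diverse timing can be quantified with a circular statistic. The sketch below uses circular variance (one minus the length of the mean resultant vector). This is an assumption about a reasonable measure, not a statement of the statistic used in the analysis, and the phase values are hypothetical numbers chosen only to show the behavior of the measure.

```python
import numpy as np

def circular_variance(phases_cycles):
    """Circular variance of phases given in cycles.

    0 means all phases are identical; values near 1 mean the phases
    are spread around the whole cycle.
    """
    phases = np.asarray(phases_cycles, dtype=float)
    resultant = np.mean(np.exp(2j * np.pi * phases))
    return float(1.0 - np.abs(resultant))

# Hypothetical values, not data from this study.
tightly_clustered = [0.95, 0.00, 0.05, 0.02]   # uniform timing
widely_spread = [0.00, 0.15, 0.30, 0.45, 0.70]  # diverse timing

print(circular_variance(tightly_clustered))
print(circular_variance(widely_spread))
```

Unlike an ordinary variance, this measure treats phases of 0.95 and 0.05 cycles as close neighbors, which matters when values cluster near 0.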
Our laminar data could be interpreted as indicating that 4Cβ receives inputs exclusively from parvocellular LGN, whereas 4Cα receives inputs from both magnocellular and parvocellular LGN, or at least P inputs relayed from 4Cβ. This might explain the diversity of timing observed in 4Cα. However, anatomic studies (Blasdel and Lund 1983; Hubel and Wiesel 1972) have shown that terminals from parvocellular LGN inputs do not invade 4Cα. Neurons near the middle of 4C extend dendrites into the neighboring sublayer and can thus receive both P and M input. Notably, some of the most DS cells in V1 are found in the middle of 4C (Gur et al. 2004).
Boyd et al. (2000) proposed a tripartite division of layer 4C based on the distinct ascending projection patterns of these neurons, which they suggest correspond to distinct input patterns. Others have also suggested that the inputs to 4Cα are diverse. Bauer et al. (1999) argued that two distinct populations of M cells project to 4Cα. One population projects throughout the depth of the sublayer, and another population projects exclusively to upper 4Cα and is responsible for generating direction selectivity, which they argue is found predominantly in upper 4Cα and 4B. DS cells in our sample were found throughout the depth of 4Cα. Our findings are consistent with the notion of multiple populations of M cells but suggest that these populations differ in timing.
The M pathway's hypothesized diversity of timing would mean that it is capable of creating direction selectivity either on its own or in combination with P inputs. Many cells in layer 4B are DS and receive input from both the M and P streams (Sawatari and Callaway 1996). However, many other DS 4B neurons, as well as DS neurons in upper 4Cα, are probably dominated by M inputs and would therefore be expected to obtain temporally and spatially differing inputs from those M inputs alone (Yabuta et al. 2001). Furthermore, cells that are DS at low contrasts presumably receive spatially and temporally disparate inputs from the M pathway.
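How a pair of inputs that differ in space and time yields direction selectivity can be sketched with a simple phase calculation. The sketch makes several assumptions that go beyond the text: the two inputs respond sinusoidally to a drifting grating, have equal amplitudes, and sum linearly; reversing the direction of motion flips the sign of the spatial phase offset; and direction selectivity is summarized by a (preferred − null)/(preferred + null) index. The function names are hypothetical.

```python
import numpy as np

def summed_amplitude(spatial_offset, temporal_offset, direction):
    """Amplitude of the sum of two unit-amplitude sinusoidal inputs.

    Offsets are in cycles; direction is +1 or -1. The relative phase of
    the second input is the temporal offset plus or minus the spatial one.
    """
    relative_phase = temporal_offset + direction * spatial_offset
    return float(np.abs(1.0 + np.exp(2j * np.pi * relative_phase)))

def direction_index(spatial_offset, temporal_offset):
    """(pref - null) / (pref + null) for the summed response."""
    a = summed_amplitude(spatial_offset, temporal_offset, +1)
    b = summed_amplitude(spatial_offset, temporal_offset, -1)
    pref, null = max(a, b), min(a, b)
    if pref + null == 0.0:
        return 0.0  # no response in either direction
    return (pref - null) / (pref + null)

# Quarter-cycle offsets in space and time (quadrature): the offsets cancel
# in one direction and add to a half cycle in the other.
print(direction_index(0.25, 0.25))
# Same spatial offset with a smaller temporal offset: weaker selectivity.
print(direction_index(0.25, 0.10))
# No temporal offset: both directions give the same amplitude.
print(direction_index(0.25, 0.00))
```

Under these assumptions, any pair of inputs with nonzero spatial and temporal offsets produces some direction bias, so a continuum of timing within the M stream would suffice in principle, with quadrature giving the strongest selectivity.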
Comparisons with DeValois et al.
DeValois et al. (2000) argued that the M pathway contained only biphasic, even-symmetric receptive fields. Our results, which, we emphasize, were on the whole highly consistent with what they found, differed in two main ways. We recorded clear examples of biphasic odd fields. We also recorded many examples of monophasic responses that we attribute to the M pathway. A smaller disagreement lies in the spatial phase distribution of monophasic receptive fields: our sample was dominated by even spatial phases as opposed to being uniform.
These discrepancies might arise for many reasons, including sampling, different methods of computing spatial phase, different criteria for calling cells DS, and interpretation of M and P origins. The micropipettes we used have been previously shown to record from small cells that are typically missed by low-impedance electrodes like those employed in the DeValois et al. study. We may have sampled cells not included in their results. Whether such a hypothetical population might be biased toward sustained M-dominated cells is speculation. We note that DeValois et al. (2000) and other labs (Dreher et al. 1976; Schiller and Malpeli 1978; Sherman et al. 1976; Usrey and Reid 2000) report that magnocellular LGN contains a uniform population of cells that give transient responses. Some authors find little difference in the timing of M and P cells, however (Blakemore and Vital-Durand 1986; Levitt et al. 2001; Spear et al. 1994). Maunsell and Gibson (1992) did not observe changes in cortical sustained or transient firing after lesions of magno- or parvocellular LGN layers, whereas they did see changes in onset latencies. We speculate that the smaller cells in magnocellular LGN (Liu and Wong-Riley 1990; Norden and Kaas 1978) might be undersampled in many studies and could have more sustained responses that underlie the cortical timing observed in this study. Whether the heterogeneous populations of K cells contribute to direction selectivity remains an open question.
The organization of spatial phase lacked the structure observed by DeValois et al. (2000). Most receptive fields had spatial phase values near 0 (even symmetry), but spatial profiles varied across both sustained and transient populations. The prevalence of even symmetry is due in part to the many single-zone simple receptive fields. With the method used here, such fields usually are assigned spatial phase values near 0. With other methods, such as fitting Gabor functions, nonzero phase values can arise because of the dominance of the Gaussian factor in the best-fitting Gabor function. Even though quadrature spatial phase relationships are not common among pairs of non-DS cells, these non-DS cells could still contribute to the generation of DS receptive fields based on spatial location differences.
DeValois et al. (2000) do not report how many of their receptive fields had a single subzone, and it is not clear to what extent their Gabor fits might have yielded different values than the frequency-domain method used here. They only show examples of multiple-subzone cells and make the strong claim that virtually no biphasic receptive fields had odd symmetry. This difference could be a matter of emphasis, but we would temper their conclusions about the distributions of spatial phase differing between the sustained and transient populations.
We found that latencies did not differ between 4Cα and 4Cβ (Fig. 11). Other authors (Maunsell and Gibson 1992; Nowak et al. 1995; Schmolesky et al. 1998) have reported significant differences in onset latencies between these layers, as well as between the inputs in M and P geniculate layers (Marrocco 1976; Schmolesky et al. 1998; cf. Spear et al. 1994). For instance, DeValois et al. (2000) show latencies as time to peak for M and P cells as well as for cortical cells with biphasic and monophasic temporal profiles. Median onset latencies were 38 ms for M cells and 48 ms for P cells, and peak responses occurred at ∼60 ms for M cells and 80 ms for P cells. However, they also show that the second peak in M cell and biphasic cortical cell responses occurs at a longer latency than the single peak in P and monophasic cortical cells. Taking into account both peaks in biphasic response profiles, one obtains a latency measure more similar to the one we use, and monophasic and biphasic cells have similar latencies.
Asymmetry in inputs to DS cells
A striking feature of our data is that cells exhibiting transient timing are overwhelmingly DS. These cells were recorded primarily in layers 4B, 4Cα, and 6. The data leave the distinct impression that most cells with transient input, presumably via the M pathway, also receive sustained input and are DS as a result. This is not unlike the situation in cat primary visual cortex, where 80% of the cells are DS. In monkey, there are many more non-DS cells, but they nearly always have sustained timing and, according to our results, even spatial phase. Many of these non-DS cells may be dominated by the P pathway. As proposed by Kaplan and Shapley (1982), the M pathway mirrors to some extent the entire pathway from the A layers of cat LGN, including the preponderance of direction selectivity.
The situation in cat has been studied in greater depth. Inseparable simple cell receptive fields have been described in detail (Albrecht and Geisler 1991; Dean and Tolhurst 1986; DeAngelis et al. 1993; Jagadeesh et al. 1997; McLean and Palmer 1989; McLean et al. 1994; Movshon et al. 1978; Murthy et al. 1998; Reid et al. 1987; Saul and Humphrey 1992b). Several clear results have emerged. First, it is not latency that varies across space in inseparable fields but rather the shape of the temporal profile. Second, the temporal profiles seen in cortical receptive fields resemble those recorded in the LGN inputs (Alonso et al. 2001). V1 cells are not always directly driven by LGN afferents, but timing is inherited in large part even when relayed via other cortical neurons. Third, Peterson et al. (2004) focused directly on the issues considered by DeValois et al. (2000) and in the present work, challenging a strong form of the hypothesis that cortical direction selectivity emerges from spatiotemporal quadrature inputs. They found that simple cell direction selectivity in cat V1 can be best explained in terms of combinations of inputs with diverse timing that do not fit into a bimodal distribution of, for example, purely transient and sustained types. Fourth, in analogy to the asymmetry noted in the preceding text where transient responses are strongly associated with direction selectivity in primate V1, Saul and Humphrey (1992b) showed that lagged responses are strongly associated with DS in cat V1.
Direction selectivity could arise without the sorts of variation across inputs considered here. However, all proposals have in common an asymmetry resembling that between M and P inputs. Maex and Orban (1996) and Suarez et al. (1995) modeled the generation of DS based on temporal differences among neurotransmitter receptors, with further elaborations by the cortical network. Chance et al. (1998) proposed that synaptic depression produced the temporal shifts. These mechanisms might be applied to the predominantly sustained responses observed in macaque V1 to study whether their predictions match physiological data. Livingstone (1998) and Conway and Livingstone (2003) argued that timing differences emerge from distinctions between excitatory and inhibitory processes. Specifically, they proposed that inhibitory inputs have an inherent delay. Others have argued that the most orientation- and direction-selective cells receive the strongest inhibitory inputs (Kagan et al. 2002; Lund et al. 2003; Shapley et al. 2003). This might occur for several reasons: the inhibition could simply help to eliminate responses to nonpreferred stimuli; as suggested by Conway and Livingstone (2003), the inhibitory synapses could alter the timing of their inputs; and the inhibitory interneurons could receive inputs with timing that differs from what the excitatory neurons receive. As argued by Saul (1995, 1999), the spatiotemporal structure of inhibitory inputs to simple cells often has a relationship with the excitatory inputs that ranges from pure push-pull to quadrature (Martinez et al. 2005), and the inhibitory and excitatory inputs share a preferred direction (Priebe and Ferster 2005). The relationships between synaptic interactions in cortex and the inputs from the LGN need to be studied carefully in direction-selective cells.
Of particular interest would be an examination of differences between stellate and pyramidal neurons in layer 4B, because these pyramidal cells appear to receive convergent P and M input, whereas the stellate cells may be dominated by the M pathway (Yabuta et al. 2001). Thus some form of the hypothesis advanced by DeValois et al. (2000) could apply to at least a subset of cortical DS cells.
This work was supported by National Eye Institute Grants EY-06459 and EY-08098.
Copyright © 2005 by the American Physiological Society