Schmolesky, Matthew T., Youngchang Wang, Doug P. Hanes, Kirk G. Thompson, Stefan Leutgeb, Jeffrey D. Schall, and Audie G. Leventhal. Signal timing across the macaque visual system. J. Neurophysiol. 79: 3272–3278, 1998. The onset latencies of single-unit responses evoked by flashing visual stimuli were measured in the parvocellular (P) and magnocellular (M) layers of the dorsal lateral geniculate nucleus (LGNd) and in cortical visual areas V1, V2, V3, V4, middle temporal area (MT), medial superior temporal area (MST), and in the frontal eye field (FEF) in individual anesthetized monkeys. Identical procedures were carried out to assess latencies in each area, often in the same monkey, thereby permitting direct comparisons of timing across areas. This study presents the visual flash-evoked latencies for cells in areas where such data are common (V1 and V2), and are therefore a good standard, and also in areas where such data are sparse (LGNd M and P layers, MT, V4) or entirely lacking (V3, MST, and FEF in anesthetized preparation). Visual-evoked onset latencies were, on average, 17 ms shorter in the LGNd M layers than in the LGNd P layers. Visual responses occurred in V1 before any other cortical area. The next wave of activation occurred concurrently in areas V3, MT, MST, and FEF. Visual response latencies in areas V2 and V4 were progressively later and more broadly distributed. These differences in the time course of activation across the dorsal and ventral streams provide important temporal constraints on theories of visual processing.
Accurate knowledge of the response latency of neurons across the visual system is necessary for the development of effective models of visual system function. Visual information processing begins in the retina with the different classes of ganglion cells, continues in the different layers of the dorsal lateral geniculate nucleus [LGNd; magnocellular (M), parvocellular (P), and koniocellular (K) layers], and proceeds into the visual cortex along largely parallel streams (Casagrande and Norton 1991; Merigan and Maunsell 1993; Shapley and Perry 1986; Stone et al. 1979). At the cortical level, processing begins in area V1 and is hypothesized to then proceed along three streams (termed M, P, and K for the specific LGNd layer inputs), gaining complexity at progressively higher cortical levels that are regarded as being organized in an anatomic hierarchy (Felleman and Van Essen 1991; Hilgetag et al. 1996; Van Essen et al. 1992). An implication of such an organization is that “higher” visual areas in both streams display longer visual response latencies than do “lower” ones as a result of the time required for the transfer of information from one stage of processing to the next. To date, this hypothesis could be evaluated only indirectly by comparing data collected in different laboratories (reviewed by Nowak and Bullier 1998) or directly in only a small number of cortical areas (see Maunsell 1987; Nowak et al. 1995; Raiguel et al. 1989). However, the more commonly made indirect comparisons are often confounded due to differences in experimental and analytic methodology. In addition, the visual response latencies of several key areas in macaque, such as the M and P layers of LGNd, middle temporal area (MT), and V4 have gone largely unstudied (but see Maunsell 1987; Raiguel et al. 1989). 
To our knowledge, no visual onset latency data have been reported for macaque area V3 or for medial superior temporal area (MST) or frontal eye field (FEF) in the anesthetized preparation (for awake macaque data, see Kawano et al. 1994; Schall 1991; Thompson et al. 1996). This study was conducted to obtain visual response latencies in areas that have received little or no attention to date and to provide a direct comparison of single-unit visual response latencies recorded from multiple cortical areas of individual monkeys under the same stimulus presentation and animal preparation conditions.
The activity of 558 single units was recorded in the M and P layers of the LGNd, and in cortical visual areas V1, V2, V3, V4, MT, MST, and FEF in four paralyzed, anesthetized macaque monkeys using standard surgical and single-unit recording techniques consistent with Society for Neuroscience and National Institutes of Health guidelines (Leventhal et al. 1995). The areas studied for each monkey were V1, V2, and FEF in monkey 1; V2 in monkey 2; LGNd M and P layers, V2, V4, MT, and MST in monkey 3; and V2, V3, MT, and MST in monkey 4. Anesthesia was maintained via artificial ventilation with a mixture of nitrous oxide (75%) and oxygen (25%) containing halothane (0.25–1.0% as needed). The small variations made in halothane concentration did not appear to alter responsivity. Animals were studied for as long as stable, reliable recording was possible (2–9 days; see physiological criteria for data inclusion below). Optics were routinely checked, and deterioration was minimal in even the longest experiment. The proportion of cells meeting the data inclusion criteria did not appear to decrease over time. The order in which areas were studied was varied from animal to animal, thereby reducing the impact that this factor could have on any interarea latency differences found.
Flashing visual stimuli were generated on a Tektronix 608 display driven by a Picasso image synthesizer (Innisfree). The Picasso was controlled by a PC in conjunction with specially designed hardware and software (Cambridge Electronics Design, LTD). Our system is able to randomly generate a broad spectrum of visual stimuli under computer control, collect the data, and perform on-line statistical analyses. A perimeter apparatus was used to position an oscilloscope display at any point in the animal's visual field, maintaining a fixed distance between the display and the animal's retina.
For these studies the center of the display screen was 57 or 228 cm from the animal's retina, depending on the size of the receptive field (RF) being studied. Stimulation was binocular in the majority of FEF cells (23 of 26). In areas outside FEF, stimulation was monocular for all but 12 cells that demonstrated clear binocular summation. The RF center eccentricities were 15–33° from the central area for FEF cells. The vast majority of non-FEF cells displayed RF center eccentricities between 3 and 10°.
For each cell, pretesting was conducted using a hand-held pantascope to determine the preferred stimulus configuration (spot or bar), orientation, size, phase, and/or color. In cases where the preferred parameters were clear the nonpreferred parameters were not assessed quantitatively. For any parameter where pretesting did not clearly demonstrate the preferred attribute (e.g., white, red, or green), each parameter attribute was presented via computer, and the determination of the optimal stimulus was deferred for off-line analysis. In general, each cell provided quantitative latencies to at least two stimuli and some to as many as six.
Each computer-generated flashing stimulus was presented 50 times with an on period of 0.5 s and an off period of 3 s. The stimulus that elicited the optimal response was determined and the latency of the response to that stimulus included in the data set. The optimal response was the one judged to be the greatest in magnitude (based on peak firing rate and ratio of peak to baseline) and the lowest in response variability (based on percentage of trials that had a significant response and the variability of response latencies from trial to trial). For 512 of the cells (91.8%), the optimal response was obtained while stimulating with a white (8.37 cd/m2) or black (0.91 cd/m2) spot or bar with a contrast of 80% [(8.37 − 0.91 cd/m2)/(8.37 + 0.91 cd/m2)]. All but two of the remaining cells (7.9%) responded optimally to a red (1.29 cd/m2) or green (1.51 cd/m2) spot with a contrast of 84% (backgrounds were 0.11 cd/m2 and 0.13 cd/m2, respectively). The optimal response was obtained while stimulating with a blue spot for one LGNd P cell and while stimulating with an annulus for one V1 cell. The size of spots and bars/squares was varied to match RF size and optimize response. Generally, larger stimuli were used in higher order areas such as MST and MT relative to V1. However, even with the use of the maximum stimulus size, due to the limited size of the monitor, the proportion of the RF stimulated was actually smaller in higher order areas than in lower order areas. Wavelength stimuli were generated by fixing Kodak Wratten filters to the monitor.
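The contrast values quoted above follow the bracketed formula, (Lmax − Lmin)/(Lmax + Lmin), which is the Michelson definition. As an arithmetic check only, the short Python sketch below recomputes the three contrasts from the reported luminances. The function name `michelson_contrast` is illustrative and is not part of the original methods.

```python
def michelson_contrast(l_max, l_min):
    """Return (Lmax - Lmin) / (Lmax + Lmin) for two luminances in cd/m2."""
    return (l_max - l_min) / (l_max + l_min)


# Luminances as reported in the text (cd/m2).
white_black = michelson_contrast(8.37, 0.91)  # white vs. black
red = michelson_contrast(1.29, 0.11)          # red spot vs. its background
green = michelson_contrast(1.51, 0.13)        # green spot vs. its background

for name, value in [("white/black", white_black), ("red", red), ("green", green)]:
    print(f"{name}: {value:.0%}")
```

Rounded to whole percentages, the results agree with the 80% and 84% figures stated in the text.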
Spike train analysis
Times of onset of visually evoked activity were determined for each spike train using an adaptation of the Poisson spike train analysis originally described by Legendy and Salcman (1985) and modified by Hanes et al. (1995) and Thompson et al. (1996). Examples of the raster plots used to determine the visual onset response latencies of cells in the areas studied are shown in Fig. 1.
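The general idea behind a Poisson spike train analysis can be illustrated with a deliberately simplified Python sketch. It is not the procedure of Hanes et al. (1995) or Thompson et al. (1996), whose details are not reproduced here. It assumes that the "surprise" of a run of spikes is −ln P, where P is the probability that a Poisson process firing at the baseline rate would emit at least that many spikes in the same interval, and it takes the first post-stimulus spike that starts a sufficiently surprising run as the onset. The function names, the run length `n_window`, the criterion `p_crit`, and the spike times are all hypothetical.

```python
import math


def poisson_surprise(n_spikes, duration_s, rate_hz):
    """Return -ln P(N >= n_spikes) for a Poisson process with the given mean rate."""
    mean = rate_hz * duration_s
    # P(N >= n) = 1 - sum_{k=0}^{n-1} exp(-mean) * mean**k / k!
    p_fewer = sum(math.exp(-mean) * mean**k / math.factorial(k)
                  for k in range(n_spikes))
    p_at_least = max(1.0 - p_fewer, 1e-300)  # guard against log(0)
    return -math.log(p_at_least)


def onset_latency(spike_times_s, baseline_rate_hz, n_window=4, p_crit=0.001):
    """Time of the first post-stimulus spike that begins a surprising run.

    Simplifying assumption: a run is n_window consecutive spikes, and the
    n_window - 1 spikes after the first are the ones counted in the interval
    that the first spike opens. Stimulus onset is at t = 0.
    """
    threshold = -math.log(p_crit)
    post = sorted(t for t in spike_times_s if t >= 0.0)
    for i in range(len(post) - n_window + 1):
        duration = post[i + n_window - 1] - post[i]
        if duration <= 0.0:
            continue
        if poisson_surprise(n_window - 1, duration, baseline_rate_hz) >= threshold:
            return post[i]
    return None  # no significant response on this trial


# Hypothetical single trial: sparse spontaneous spikes, then a burst near 72 ms.
trial = [-0.30, -0.05, 0.010, 0.072, 0.075, 0.079, 0.084, 0.30]
onset = onset_latency(trial, baseline_rate_hz=5.0)
print(f"onset: {onset * 1000:.0f} ms")
```

In this toy trial the isolated spike at 10 ms does not qualify as an onset because the run it opens is not improbable enough at a 5-Hz baseline, whereas the tight cluster beginning at 72 ms is.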
Histology and histochemistry
At the conclusion of each experiment, the animal was deeply anesthetized and perfused through the heart with 700 ml of lactated Ringer solution containing 0.1% heparin, followed by 1,000 ml of 1% paraformaldehyde and 2.5% glutaraldehyde in 0.1 M phosphate buffer at pH 7.4, followed by 600 ml of lactated Ringer solution containing 5% dextrose. Brains were removed, and the locations of the electrode tracks relative to specific sulci and gyri were determined. Portions of cortex containing electrode tracks were blocked, and alternating coronal sections (90–120 μm) were stained for cell bodies (Nissl) or myelin (Gallyas 1979). The surface position of electrode entrance (all electrodes were aligned perpendicular to the cortical surface), and/or the reconstruction of the electrode track itself was used to confirm the earlier classifications made of each cortical area based on comparisons of physiological recordings with well-documented RF properties (e.g., size, eccentricity, stimulus selectivity, progression of RFs relative to vertical meridian).
The areas studied include the M (n = 52) and P (n = 78) layers of the LGNd, and cortical areas V1 (n = 74), V2 (n = 61), V3 (n = 100), V4 (n = 29), MT (n = 79), MST (n = 59), and FEF (n = 26). A number of V1 cells were classified as 4Cα (n = 13) or 4Cβ (n = 9) based on penetration depth and response characteristics (e.g., nonoriented, small RF). The peripheral RF eccentricity of the FEF cells studied suggests a correspondence to area 8Ac (Schall et al. 1995). For areas that were studied in more than one monkey, interanimal comparisons of average response latencies taken from relatively equal and large sample sizes did not reveal statistically significant differences.
The earliest visual responses measured were in the M layers of the LGNd (see Figs. 1 and 2) in which there was very little latency spread (33 ± 3.8 ms, mean ± SD). P cells in the LGNd exhibited longer, more variable latencies, ranging from 31 to 76 ms (50 ± 8.7 ms). The modal latencies of M and P LGNd cells did not overlap and were, in fact, 10 ms removed; M cell 25–75 percentile = 31–34 ms, P cell 25–75 percentile = 44–56 ms (see Fig. 2).
The shortest latencies in visual cortex were found in layer 4Cα of V1. These cells had latencies as short as 34 ms. Even though the number of cells we identified as being in layer 4C was small, the latencies of 4Cα cells were, on average, significantly shorter than those for 4Cβ, t(1, 20) = 2.66, P = 0.02. Thus the latency difference found between M and P LGNd layers is maintained in the geniculo-recipient layers of V1. Overall, the latencies of V1 cells ranged from 34 to 97 ms (66 ± 10.7 ms). V2 cells exhibited latencies with an average of 82 ms and a large variance (SD, 21.1 ms). Previous research has shown that V2 latencies increase from thick to pale to thin bands (Munk et al. 1995) and when all three subdivisions are included, as is the case here, a large latency spread is to be expected. Figure 1 gives examples of responses of individual neurons showing how the putative M (A, C, and E) and P (B, D, and F) streams could pass staggered but parallel signals with 10- to 15-ms delays between each stage of activation. V4 cells exhibited the longest and most varied latencies of any area recorded from in this study (104 ± 23.4 ms).
The latencies of cells recorded in areas associated with the dorsal stream of visual processing were shorter and more uniform. The latencies of V3 cells ranged from 55 to 101 ms (72 ± 8.6 ms). Cells in area MT had an identical average latency of 72 ± 10.3 ms. The MT latency data gathered by Maunsell (1987) were obtained from awake monkeys in response to very different stimuli (high contrast square wave gratings) and were presented in a population response format that does not provide a range of latencies. Thus a comparison of the Maunsell (1987) data with our own is difficult. However, note that the earliest response reported by Maunsell (1987) was 39 ms, 10 ms faster than the earliest MT latency reported here. This difference is most likely due to lack of anesthesia and/or differences in stimulus presentation and data analysis. The only other study of macaque MT visual onset latencies (Raiguel et al. 1989) cites a much slower median latency (94 ms) and a particularly wide range of latencies (35–272 ms). Because the V1 latencies reported by Raiguel et al. (1989) are also considerably longer and more varied than our own or those reported by others (Celebrini et al. 1993; Knierim and Van Essen 1992; Maunsell and Gibson 1992; Nowak et al. 1995), the MT data differences are likely due to the use of moving, as opposed to flashing, visual stimuli in the Raiguel et al. (1989) study or differences in analysis techniques. The onset latency data presented are the first reported for V3 and the first reported for MT under anesthetized preparation, flash stimulus conditions and indicate coincident activation timing in the two areas.
Cells in MST exhibited latencies effectively equivalent to those in V3 and MT, averaging 74 ± 16.1 ms. FEF cell response latencies have not been recorded previously in the anesthetized monkey. We found that cells were visually responsive in arcuate frontal cortex of the anesthetized, paralyzed monkey and that the latencies of a sample of FEF cells gave an average of 75 ± 13.0 ms. This distribution of FEF cell visual latencies agrees with the distribution of latencies measured in awake, behaving monkeys using approximately equal strength stimuli (Schall 1991; Thompson et al. 1996).
A Kruskal-Wallis one-way analysis of variance (ANOVA) on ranks was performed to statistically compare the distributions of latencies across the layers of the LGNd and the different visual areas. Significant variation in latency was confirmed [H(8,488) = 336.9, P < 0.001]. The results of multiple Mann-Whitney two-way rank sum comparisons corrected by the Bonferroni method are shown in Table 1. These results suggest that, based on response onset latency alone, there is a functional sequence in the ventral stream, wherein LGNd P layers, V1, V2, and V4, demonstrate successively longer latencies. In contrast, although the dorsal stream does show progressively longer latencies from LGNd M cells to V1 to V2, there is simultaneous onset of firing in V3, MT, MST, and FEF (see Fig. 1, G–J for representative responses).
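For readers unfamiliar with the rank-based test named above, the following Python sketch shows how a Kruskal-Wallis H statistic is computed and how a Bonferroni correction scales with the number of pairwise comparisons. It is a minimal illustration under stated assumptions: the H formula omits the tie correction, the latencies are invented toy values (not data from this study), and the family-wise alpha of 0.05 is assumed rather than taken from the text.

```python
from itertools import combinations


def average_ranks(values):
    """Rank values from 1..N, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """H statistic (no tie correction) for a dict of group name -> samples."""
    pooled = [x for samples in groups.values() for x in samples]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    weighted, start = 0.0, 0
    for samples in groups.values():
        rank_sum = sum(ranks[start:start + len(samples)])
        weighted += rank_sum ** 2 / len(samples)
        start += len(samples)
    return 12.0 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)


# Invented toy latencies (ms) for three hypothetical groups.
toy = {"group_a": [1, 2, 3], "group_b": [4, 5, 6], "group_c": [7, 8, 9]}
h_toy = kruskal_wallis_h(toy)

# Nine groups were compared in the study (LGNd M and P layers plus seven areas).
areas = ["LGNd M", "LGNd P", "V1", "V2", "V3", "V4", "MT", "MST", "FEF"]
n_comparisons = len(list(combinations(areas, 2)))
alpha_per_test = 0.05 / n_comparisons  # assumed family-wise alpha of 0.05

print(f"H (toy data) = {h_toy:.2f}")
print(f"{n_comparisons} pairwise comparisons, per-test alpha = {alpha_per_test:.5f}")
```

With nine groups there are 36 possible pairwise comparisons, so a Bonferroni correction makes each individual Mann-Whitney comparison considerably more conservative than an uncorrected test.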
Within the limits of methodology and analysis, our findings are generally consistent with estimates of response latencies throughout the visual system (Givre et al. 1995; Maunsell and Gibson 1992; Nowak et al. 1995; Schroeder et al. 1991; for review see Nowak and Bullier 1998). However, the fact that the present data from multiple stations of the visual system were collected in individual monkeys using common stimulus presentation and analysis techniques significantly improves the reliability of conclusions drawn about the relationships between the latencies of cells across the visual system.
Latency differential between M and P streams
One salient finding of the present study was that the onset latency distributions of the M and P layers of the LGNd are almost entirely separated, with the P cells being nearly 20 ms slower. There are only two other studies of macaque LGNd single-unit latencies reported to date. Spear et al. (1994) found no difference between M and P cell latencies, reporting an average latency of 77 ms. Marrocco (1976) found a latency difference of ∼20 ms between broadband, transient cells and color-opponent cells. However, all of the cells in the Marrocco study were classified as parvocellular, and the actual values for the transient broadband (range 23–80 ms) and color opponent (range 33–108 ms) cells are difficult to reconcile with the present data. Research has shown a 10- to 20-ms latency difference between Y and X cells in cat retina (Bolz et al. 1982) and between 4Cα and 4Cβ cells of primate V1 (Nowak et al. 1995; present research). LGNd data in other species also show a 10- to 20-ms difference between M/Y and P/X cell onset latencies (galago, Irvin et al. 1986; cat, Sestokas and Lehmkuhle 1986). We conclude that macaque LGNd M and P cell latencies are in all likelihood separated by 10–20 ms as shown by our data.
A second major finding of this study was that the cortical cells innervated by the P stream were activated later than those innervated by the M stream. This confirms earlier observations (Maunsell and Gibson 1992; Nowak et al. 1995). The difference in visual latencies across the dorsal and ventral streams indicates the possibility (suggested by Nowak and Bullier 1998) that M stream cells could modulate the responses of P stream cells through feed-forward, lateral or feedback connections. Likewise, modulatory effects on later phases of visual activation in lower order areas may arise from feedback from earlier activated higher order areas (e.g., Knierim and Van Essen 1992; Zipser et al. 1996). Because anesthesia was used in the present experiment, differential effects of anesthesia on the ventral and dorsal streams could feasibly play a role in the differences found between the timing of M stream and P stream activation. However, neither the existence nor the magnitude of such differential effects between the two streams has yet been demonstrated, and such effects therefore remain speculative until further research is conducted.
The third major finding of the present study is that the dorsal stream signals travel through tiers of the anatomic hierarchy rapidly. Indeed, the fastest layer 4Cα cell encountered exhibited a latency that was only 6 ms longer than the fastest LGNd M cell studied. The average responses of areas V3, MT, MST, and FEF were, in turn, only 6–9 ms longer than the average V1 response. One factor that is likely contributing to this rapid information transfer is the heavy myelination and relatively large fiber diameter of axons projecting to dorsal stream areas (e.g., V1 to MT) (Movshon and Newsome 1996). It is noteworthy that a relatively small number of geniculate neurons project to extrastriate cortical regions (Benevento and Yoshida 1981; Bullier and Kennedy 1983; Yukie and Iwai 1981) and thus these neurons could play a role in the earliest activation of the geniculo-recipient areas. However, this possibility is questionable because many of these projections are thought to involve the slow-activating S/K layers and interlaminar regions and could also require preactivation of a retino-colliculo-geniculate path (Bullier and Kennedy 1983). In any case, as a result of the very rapid transfer of information throughout the dorsal stream, most cells in middle tier dorsal stream cortical areas exhibit almost completely overlapping latencies (see Fig. 2).
In contrast, the onset firing in ventral stream cortical areas more closely represents a classical hierarchical progression, from V1 to V2 to V4. The onset latencies determined for several inferotemporal (IT) cells we recorded from are consistent with sequential progression but can only be taken as suggestive until a larger sample is obtained (but see Mikami et al. 1994; Nakamura et al. 1994; and Vogels and Orban 1994 for IT latencies). The V2 and V4 latencies still do appear to have considerable overlap. However, it is known that V2 fast latency pale bands and V2 slow latency thin bands both project to V4 (Munk et al. 1995; Nakamura et al. 1993). Thus the large V4 latency spread probably reflects a combination of fast, pale band–recipient and slow, thin band–recipient V4 modules as suggested by Nowak and Bullier (1998). Functionally then, the ventral stream is still sequential but has split into two staggered, sequential substreams. Response latency data must be gathered from additional ventral stream areas and subarea compartments before any strong conclusions can be drawn regarding the true extent of response onset simultaneity present in this stream.
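The contrast between the two streams can be made concrete with simple arithmetic on the mean onset latencies reported in the Results. The Python sketch below uses only those published means (rounded to whole milliseconds); the variable names and the grouping of areas into "dorsal" and "ventral" lists are illustrative choices that follow the text's usage, and differences of means say nothing about the overlap of the underlying distributions.

```python
# Mean onset latencies (ms) as reported in the Results.
mean_latency_ms = {
    "LGNd M": 33, "LGNd P": 50,
    "V1": 66, "V2": 82, "V4": 104,
    "V3": 72, "MT": 72, "MST": 74, "FEF": 75,
}

# Dorsal stream areas: delay of each mean relative to the V1 mean.
dorsal_areas = ["V3", "MT", "MST", "FEF"]
dorsal_delay_ms = {
    area: mean_latency_ms[area] - mean_latency_ms["V1"] for area in dorsal_areas
}

# Ventral stream: step between successive stages.
ventral_chain = ["LGNd P", "V1", "V2", "V4"]
ventral_step_ms = {
    f"{a} -> {b}": mean_latency_ms[b] - mean_latency_ms[a]
    for a, b in zip(ventral_chain, ventral_chain[1:])
}

# Gap between the two geniculate populations.
lgn_gap_ms = mean_latency_ms["LGNd P"] - mean_latency_ms["LGNd M"]

print("dorsal delays re V1:", dorsal_delay_ms)
print("ventral steps:", ventral_step_ms)
print("LGNd P minus M:", lgn_gap_ms, "ms")
```

The dorsal delays fall in the 6–9 ms range quoted above, whereas each ventral step from V1 onward adds 16 ms or more, and the geniculate gap reproduces the 17-ms average difference stated in the abstract.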
Anatomic and functional hierarchies
Anatomic evidence has been employed to argue for a hierarchy of visual areas (e.g., Felleman and Van Essen 1991; Hilgetag et al. 1996). However, examination of the present data as well as published results from different laboratories for individual areas reveals a number of inconsistencies. For example, FEF is at level 8 of the Felleman and Van Essen (1991) anatomic hierarchy. However, cells from this area exhibit visual latencies comparable with those in V2 (level 2), V3 (level 3/4), MT (level 5), and MST (level 7), and sometimes even as early as some cells in V1 (level 1). Conversely, many V2 cells exhibit longer response latencies than most MT or MST cells. These inconsistencies are not resolved by alternative hierarchical schemes (Hilgetag et al. 1996).
Clearly an anatomic substrate for the time course of visual activation must exist. Our data simply indicate that the rules of connectivity used to produce the anatomic hierarchies fail to account for the initial flow of signals in the visual system and therefore may not accurately represent the “functional” hierarchy of the visual system (see also Nowak and Bullier 1998). In fact, the results indicate that in many cases the short latencies of cells in higher tier areas can only be accounted for if multiple tiers of processing are bypassed entirely during the transfer of initial information from V1. Anatomic studies do support many bypass routes (e.g., V1 to MT to FEF) (Maunsell and Van Essen 1983; Ungerleider and Desimone 1986), but hierarchical models rarely weigh such paths heavily when assigning areas to tiers. Thus the sequence of neural activation in different areas highlights the limitations of interpretations provided by hierarchical schemes derived solely from anatomic data. Continued studies of the timing of information processing in different cortical areas, layers, and functional cell types are necessary to expand our understanding of the mechanisms of visual perception.
This work was supported by National Institutes of Health Grants EY-04951 to A. G. Leventhal, F31-MH-11178 to D. P. Hanes, and R01-EY-08890 to J. D. Schall.
Address for reprint requests: A. G. Leventhal, Dept. of Neurobiology, University of Utah School of Medicine, Salt Lake City, UT 84132.