Rules by which V1 neurons combine signals originating in the cone photoreceptors are poorly understood. We measured cone inputs to V1 neurons in awake, fixating monkeys with white-noise analysis techniques that reveal properties of light responses not revealed by purely linear models used in previous studies. Simple cells were studied by spike-triggered averaging that is robust to static nonlinearities in spike generation. This analysis revealed, among heterogeneously tuned neurons, two relatively discrete categories: one with opponent L- and M-cone weights and another with nonopponent cone weights. Complex cells were studied by spike-triggered covariance, which identifies features in the stimulus sequence that trigger spikes in neurons with receptive fields containing multiple linear subunits that combine nonlinearly. All complex cells responded to nonopponent stimulus modulations. Although some complex cells responded to cone-opponent stimulus modulations too, none exhibited the pure opponent sensitivity observed in many simple cells. These results extend the findings on distinctions between simple and complex cell chromatic tuning observed in previous studies in anesthetized monkeys.
At photopic light levels, the sensitivity of visual neurons to the spectral content of light depends on the input they receive, via intermediate neurons, from the three classes of cone photoreceptor. Neurons in the lateral geniculate nucleus (LGN) can be assigned to three groups on the basis of cone inputs, suggestive of discrete chromatic pathways (Derrington et al. 1984). Whether such neuronal types exist in area V1 is less clear and is the question addressed in this report.
Some studies have assigned V1 neurons to mutually exclusive categories on the basis of spectral sensitivity, for instance into red/green and blue/yellow categories (Livingstone and Hubel 1984; Roe and Ts'o 1999; Ts'o and Gilbert 1988). These studies, however, used relatively few colored stimuli, leaving open the possibility that the categories thus established resulted from the small number of stimuli used. These studies did not estimate cone inputs quantitatively and thus do not reveal whether these cell groups would have segregated on this basis.
Other studies used wider ranges of visual stimulus patterns and data analysis techniques designed to estimate cone inputs quantitatively. These studies found that, for the most part, V1 neurons do not fall into discrete clusters on the basis of cone weights (Johnson et al. 2001, 2004; Lennie et al. 1990; Solomon and Lennie 2005; Solomon et al. 2004). One possibility is that the apparent disorder in the distribution of cone inputs to V1 neurons derives from strong assumptions embedded in data analysis techniques used to estimate cone weights; in some cases these assumptions may be incorrect. For example, simple cells have responses that can be described as resulting from a linear combination of cone inputs followed by a spiking nonlinearity (Movshon et al. 1978). Cone weight estimates that ignore this spiking nonlinearity are likely to be distorted.
We estimated cone weights of simple cells using spike-triggered averaging that is robust to spiking nonlinearities and provides an unbiased estimate of the linear filter that is assumed to relate the stimulus sequence to a time-varying spiking probability (Citron and Emerson 1983; de Boer and Kuyper 1968; DeAngelis et al. 1993; Marmarelis and Marmarelis 1978; Reid et al. 1997; Sakai 1992). Complex cells do not sum cone inputs linearly but can be described as having receptive fields composed of multiple linear subunits. Because different subunits may receive distinct types of cone input, the linear techniques used in previous studies may provide incomplete descriptions of the chromatic response properties of these cells. We estimated the cone weights of complex cells by spike-triggered covariance, which allows estimation of an ensemble of linear filters and is more precise than spike-triggered averaging for neurons with responses that are invariant to contrast sign (Aguera y Arcas et al. 2003; de Ruyter van Steveninck and Bialek 1988; Felsen et al. 2005; Rust et al. 2004, 2005; Schwartz et al. 2002; Simoncelli et al. 2004; Slee et al. 2005; Touryan et al. 2002, 2005).
Visual stimuli and data collection methods were identical to those described in Horwitz, Chichilnisky, and Albright (2005). The 158 cells analyzed in that study were also analyzed in this study.
Five alert rhesus monkeys (Macaca mulatta), weighing between 8 and 10 kg, served as subjects in these experiments. Four of these monkeys were used in a previous study (Horwitz et al. 2005). Experimental protocols were approved by the Salk Institute Animal Care and Use Committee and conform to US Department of Agriculture regulations and to the National Institutes of Health guidelines for the humane care and use of laboratory animals.
Surgical procedures were similar to those described previously (Dobkins and Albright 1994). In an initial surgery performed under aseptic conditions and with general anesthesia, each monkey was implanted with a stainless-steel head post, recording chamber (Crist Instruments), and monocular scleral search coil. After recovery, monkeys were trained to maintain visual fixation in the presence of a peripheral visual stimulus. In a second surgical procedure, a craniotomy was made inside the recording chamber through which electrodes could be inserted. All units were isolated from the occipital operculum (area V1).
Single units were recorded with transdurally inserted platinum/iridium electrodes (Fredrick Haer) with impedances of 1–3 MΩ at 1 kHz. Electrical signals were amplified and spikes were sorted by template matching algorithms either on-line (Alpha-Omega) or off-line (Plexon).
Monkeys were trained to maintain their eye position within a 1 × 1° electronically defined window surrounding a 0.2 × 0.2° black fixation point at the center of the computer screen. Eye movements within this window were small as shown in Horwitz et al. (2005) and did not affect the conclusions of this study. Six hundred milliseconds after fixation, a dynamic, colorful stimulus appeared at the receptive field (RF) of one, or occasionally two, individually isolated V1 neurons. Stimulus presentation persisted until the monkey broke fixation or until 10 s had elapsed. Liquid rewards were provided on a random schedule during periods of stable fixation.
Stimuli were presented on a Sony F500 CRT monitor driven at 100 Hz by a GLoria Synergy graphics card in an IBM-compatible personal computer. On isolating a single unit, RF boundaries were mapped with bars of light and the stimulus, shown in Fig. 1, was positioned at the estimated center of the RF. The stimulus was a square 8 × 8 checkerboard grid that subtended 1.8° of visual angle at a viewing distance of 60 cm. During periods of stable fixation, the color of each 0.22° pixel in the grid changed randomly and independently on every screen refresh. These color changes were implemented by modulating phosphor intensities in accordance with quantized and truncated Gaussian distributions (Horwitz and Albright 2003). The space-time averaged intensity of each phosphor was equal to its contribution to the background, which was metameric with an equal-energy white at 65 cd/m². Fixation breaks suspended data collection and extinguished the stimulus.
Orientation and spatial frequency tuning were measured for 129 neurons with sinusoidal gratings. Gratings were presented within a 3°-diam circular aperture, had a spectrum identical to the background, and were modulated in intensity only. Temporal frequency was fixed at 5 Hz, and spatial frequency varied logarithmically from 0.5 to 8 cycle/°. Responses were averaged across 8–12 stimulus cycles. Grating responses of three of the 129 neurons tested are presented in this report.
Data were analyzed off-line with custom software written in Matlab 6.1 (The MathWorks). Stimulus movies were represented numerically as phosphor intensity differences from the background at each pixel and frame. Spike trains were aligned to the stimulus reconstruction, and each spike provided an index to the stimulus frame on which it occurred. Segments of the stimulus movie preceding each spike were extracted and averaged to derive the spike-triggered average stimulus (STA) by the following formula

$$\mathrm{STA} = \frac{1}{N}\sum_{i=1}^{N} s_i \tag{1}$$

where $N$ is the total number of spikes recorded, and $s_i$ is the stimulus preceding the $i$th spike. In this report, we consider only the 4th–12th frame preceding each spike because none of the cells we studied had a response latency of <40 ms and few integrated visual information over >120 ms. Each $s_i$ vector therefore had 1,728 elements (3 phosphors × 64 pixels × 9 frames). This representation incorporates information about color, space, and time.
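The averaging step can be sketched in a few lines. This is an illustrative Python/NumPy sketch, not the authors' Matlab code: the function name, the array layout, and the simulated neuron (a single-pixel filter acting at a 6-frame lag, followed by half-wave rectification) are all assumptions made for the example.

```python
import numpy as np

def spike_triggered_average(stimulus, spike_frames, first_lag=4, last_lag=12):
    """Average the stimulus frames that preceded each spike.

    stimulus     : (n_frames, n_elements) intensity differences from background
    spike_frames : frame index on which each spike occurred
    Returns (n_lags, n_elements); row 0 is the frame `first_lag` frames back.
    """
    stimulus = np.asarray(stimulus, dtype=float)
    spike_frames = np.asarray(spike_frames)
    lags = np.arange(first_lag, last_lag + 1)
    # Discard spikes that occurred before a full stimulus history was available.
    spikes = spike_frames[spike_frames >= last_lag]
    # segments[i, k] is the frame that preceded spike i by lags[k] frames.
    segments = stimulus[spikes[:, None] - lags[None, :]]
    return segments.mean(axis=0)

# Simulated example (hypothetical neuron, one pixel x three phosphors).
rng = np.random.default_rng(0)
n_frames = 20000
stimulus = rng.standard_normal((n_frames, 3))
true_weights = np.array([1.0, 1.0, 1.0]) / np.sqrt(3)
drive = np.zeros(n_frames)
drive[6:] = stimulus[:-6] @ true_weights          # response lags stimulus by 6 frames
spike_prob = 0.2 * np.maximum(drive, 0.0)         # half-wave rectified firing probability
spike_frames = np.flatnonzero(rng.random(n_frames) < spike_prob)

sta = spike_triggered_average(stimulus, spike_frames)   # shape (9, 3)
```

In this simulation the row of `sta` for the 6-frame lag (index 2) points along `true_weights` even though the spiking nonlinearity is not linear, which is the robustness property the text relies on.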
In the following text, we describe statistics that become noisy when the number of elements in each $s_i$ is large. To avoid this source of noise, we limited most of our analyses to small numbers of stimulus elements. In Figs. 2–5, our central results are illustrated in the spatial domain, so for these analyses each $s_i$ had 192 elements (3 phosphors × 64 pixels × 1 frame). Population analyses (Figs. 6–11) were conducted at a single pixel close to the receptive field center, which was selected according to the algorithm described in Horwitz et al. (2005). For these analyses, each $s_i$ had 27 elements (3 phosphors × 1 pixel × 9 frames). To determine whether the results thus obtained were robust to the single pixel selection, we performed additional analyses, shown in Figs. 8B and 10B, in which data were pooled across all pixels and frames.
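The stimulus dimensionalities and the analysis window quoted above follow directly from the 100-Hz refresh rate and the 8 × 8 grid. A short Python check of the arithmetic (variable names are illustrative):

```python
FRAME_RATE_HZ = 100            # monitor refresh rate given in the text
N_PHOSPHORS = 3
N_PIXELS = 8 * 8
LAGS = range(4, 13)            # 4th through 12th frame preceding each spike

frame_ms = 1000 / FRAME_RATE_HZ
window_ms = (LAGS[0] * frame_ms, LAGS[-1] * frame_ms)

n_full = N_PHOSPHORS * N_PIXELS * len(LAGS)       # color x space x time
n_spatial = N_PHOSPHORS * N_PIXELS * 1            # all pixels, one frame
n_single_pixel = N_PHOSPHORS * 1 * len(LAGS)      # one pixel, all frames

print(window_ms, n_full, n_spatial, n_single_pixel)
# prints: (40.0, 120.0) 1728 192 27
```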
If a neuron's stimulus selectivity can be described as arising from the output of a linear filter, the STA provides an unbiased estimate of the filter, irrespective of any static nonlinearity that relates the output of the filter to a firing rate (Chichilnisky 2001). The shape of such a static nonlinearity, however, has a profound effect on the signal-to-noise ratio of the STA. If the nonlinearity is monotonic, the cell responds preferentially to stimuli that resemble (have a large projection onto) the underlying filter, and the STA tends to be relatively noise-free. A cell with a nonmonotonic nonlinearity, on the other hand, responds to stimuli that elicit either a large positive or large negative response from the underlying linear filter. These stimuli cancel when averaged, leading to a noisy STA. In this case, analysis of spike-triggered covariance is preferable to spike-triggered averaging (Paninski 2003). The spike-triggered covariance matrix is computed as follows

$$\mathrm{STC} = \frac{1}{N-1}\sum_{i=1}^{N} (s_i - \mathrm{STA})(s_i - \mathrm{STA})^T \tag{2}$$

The eigenvector of this matrix with the largest eigenvalue is the first principal component (PC1) of the ensemble of spike-triggered stimuli. The eigenvector with the second largest eigenvalue is the second principal component (PC2), and so on.
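A minimal Python/NumPy sketch of the covariance analysis follows. The 1/(N − 1) normalization, the function names, and the simulated neuron (a single filter followed by a squaring nonlinearity) are assumptions made for illustration, not details taken from the authors' software.

```python
import numpy as np

def spike_triggered_covariance(triggered):
    """Covariance of spike-triggered stimuli; rows are the s_i vectors."""
    triggered = np.asarray(triggered, dtype=float)
    centered = triggered - triggered.mean(axis=0)      # subtract the STA
    return centered.T @ centered / (len(triggered) - 1)

def principal_components(stc):
    """Eigenvalues (descending) and matching eigenvectors (as rows)."""
    eigvals, eigvecs = np.linalg.eigh(stc)             # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]
    return eigvals[order], eigvecs[:, order].T

# Simulated cell whose response is invariant to contrast sign.
rng = np.random.default_rng(1)
stimuli = rng.standard_normal((50000, 27))             # 3 phosphors x 9 frames
true_filter = rng.standard_normal(27)
true_filter /= np.linalg.norm(true_filter)
projection = stimuli @ true_filter
spiked = rng.random(len(stimuli)) < 0.1 * projection ** 2

triggered = stimuli[spiked]
sta = triggered.mean(axis=0)
eigvals, pcs = principal_components(spike_triggered_covariance(triggered))
pc1 = pcs[0]
```

Because stimuli of opposite sign drive this simulated cell equally, `sta` is close to zero along `true_filter`, whereas `pc1` recovers the filter up to an arbitrary sign.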
For the purpose of estimating cone inputs, a complex cell can be modeled as a linear filter followed by a rectifying nonlinearity (Lennie et al. 1990; Solomon and Lennie 2005; Solomon et al. 2004). Under this model, the PC1 provides a low-variance, unbiased estimate of the underlying linear filter (Paninski 2003). Even if the nonlinearity is monotonic, the PC1 may provide a reasonable estimate of the underlying linear filter, although the STA generally has lower variance in this case.
The STA is preferable to the PC1 for studying simple cells, and the PC1 is preferable to the STA for studying complex cells. Where possible in this report, both analyses are performed on every neuron. Elsewhere, we classify cells into simple and complex categories to determine the more appropriate analysis. We make this classification on the basis of a nonlinearity index (NLI), described in the following text.
The NLI is based on an analysis developed by Pillow and Simoncelli (2006), which uses the STA and the STC to find the maximally informative stimulus dimension under a multivariate Gaussian assumption. For each cell, we calculate the maximally informative stimulus dimension at the pixel selected for analysis. We then project the stimuli shown in the experiment onto this dimension and bin the projections, excluding the upper and lower 5% to avoid the influence of outliers. We calculate the average firing rate across the stimuli within each bin to estimate the static nonlinearity (also called the “feature contrast-response function”) (Chichilnisky 2001; Rust et al. 2005; Touryan et al. 2005). The relationship between firing rate and stimulus projection was fit with three regression equations

$$r = a + bx \tag{3}$$

$$r = a + cx^2 \tag{4}$$

$$r = a + bx + cx^2 \tag{5}$$

where $r$ is the firing rate, $x$ is the stimulus projection, and $a$, $b$, and $c$ are fitted coefficients. The goodness of fit of each regression was quantified by the $R^2$ statistic. The nonlinearity index is defined as

$$\mathrm{NLI} = \frac{R^2_{quad} - R^2_{lin}}{R^2_{full}} \tag{6}$$

where the subscripts refer to the linear (Eq. 3), quadratic (Eq. 4), and full (Eq. 5) regressions. The NLI attains its theoretical maximal value of 1 when the inclusion of a linear term does not improve the regression fit as would be the case for a linear cell with a quadratic output nonlinearity. It attains its theoretical minimum value of −1 when the inclusion of a quadratic term does not improve the regression fit as would be the case for a purely linear cell. Importantly, the results described in this manuscript are robust to the particulars of this index: similar results were obtained when cells were sorted on the basis of a Quadratic Index of Nonlinearity (Nykamp 2003) or on the basis of the ratio of the squared, summed STA elements to the squared, summed spike-triggered variance elements. For the sake of readability, we use the terms “simple” and “complex” to refer to cells with NLI <0 and NLI >0, respectively, with the acknowledgment that these definitions may not agree perfectly with classifications based on other criteria.
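The index can be illustrated with a short Python/NumPy sketch. It assumes the NLI is the difference between the quadratic-only and linear-only R² values divided by the R² of the full model, which matches the limiting values of 1 and −1 described above. The number of bins (10) and all function names are hypothetical choices for the example.

```python
import numpy as np

def binned_rate(projection, spike_count, n_bins=10):
    """Mean response per projection bin, excluding the upper and lower 5%."""
    projection = np.asarray(projection, dtype=float)
    spike_count = np.asarray(spike_count, dtype=float)
    lo, hi = np.percentile(projection, [5, 95])
    keep = (projection >= lo) & (projection <= hi)
    edges = np.linspace(lo, hi, n_bins + 1)
    which = np.clip(np.digitize(projection[keep], edges) - 1, 0, n_bins - 1)
    centers = 0.5 * (edges[:-1] + edges[1:])
    rate = np.array([spike_count[keep][which == b].mean() for b in range(n_bins)])
    return centers, rate

def r_squared(y, design):
    """R^2 of an ordinary least-squares fit of y on the columns of design."""
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

def nonlinearity_index(x, rate):
    x = np.asarray(x, dtype=float)
    rate = np.asarray(rate, dtype=float)
    ones = np.ones_like(x)
    r2_lin = r_squared(rate, np.column_stack([ones, x]))
    r2_quad = r_squared(rate, np.column_stack([ones, x ** 2]))
    r2_full = r_squared(rate, np.column_stack([ones, x, x ** 2]))
    return (r2_quad - r2_lin) / r2_full

# Noiseless limiting cases on a symmetric range of projections.
x = np.linspace(-2, 2, 21)
nli_linear = nonlinearity_index(x, 3 + 2 * x)      # purely linear response
nli_quadratic = nonlinearity_index(x, x ** 2)      # purely quadratic response
```

With these idealized inputs `nli_linear` is −1 and `nli_quadratic` is 1 to within floating-point error; real data fall between these limits.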
TESTS OF STATISTICAL SIGNIFICANCE.
The statistical significance of principal components was assessed with nonparametric randomization tests (Rust et al. 2005). To test the significance of the PC1, we randomly shifted spike trains in time, recalculated the spike-triggered covariance matrix, and recorded the largest eigenvalue. This procedure was performed 2,000 times. If the largest eigenvalue from the nonrandomized data exceeded 95% of the largest eigenvalues from the randomized datasets, the PC1 was deemed significant at the 0.05 level. To test the significance of the PC2, we projected the spike-triggered stimuli into the subspace orthogonal to the PC1 and performed the randomization test described in the preceding text. This procedure was repeated for each PC (projecting onto progressively lower dimensional subspaces) until significance was no longer attained. Small eigenvalues were not tested for statistical significance because the goal of this procedure was to characterize complex cells, which can be well characterized by the PCs with large eigenvalues (Felsen et al. 2005; Rust et al. 2005; Touryan et al. 2002, 2005).
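The randomization test for the PC1 can be sketched as follows. This Python/NumPy sketch assumes circular shifts of the spike train (the text specifies only random time shifts) and uses a simulated sign-invariant neuron. All names are illustrative, and the demonstration uses 200 shuffles rather than 2,000 to keep it fast.

```python
import numpy as np

def largest_eigenvalue(stimuli, spike_counts):
    """Largest eigenvalue of the spike-triggered covariance matrix."""
    triggered = np.repeat(stimuli, spike_counts, axis=0)   # one row per spike
    return np.linalg.eigvalsh(np.cov(triggered, rowvar=False))[-1]

def pc1_randomization_test(stimuli, spike_counts, n_shuffles=2000, rng=None):
    """Compare the observed largest eigenvalue with time-shifted spike trains."""
    rng = np.random.default_rng() if rng is None else rng
    observed = largest_eigenvalue(stimuli, spike_counts)
    null = np.empty(n_shuffles)
    for k in range(n_shuffles):
        shift = rng.integers(1, len(spike_counts))         # assumed circular shift
        null[k] = largest_eigenvalue(stimuli, np.roll(spike_counts, shift))
    # Fraction of randomized eigenvalues at least as large as the observed one.
    return observed, np.mean(null >= observed)

def project_out(stimuli, direction):
    """Remove one direction before testing the next PC (nested test)."""
    direction = direction / np.linalg.norm(direction)
    return stimuli - np.outer(stimuli @ direction, direction)

rng = np.random.default_rng(2)
stimuli = rng.standard_normal((5000, 6))
true_filter = np.array([1.0, -1.0, 0.0, 0.0, 0.0, 0.0]) / np.sqrt(2)
spike_counts = (rng.random(5000) < 0.2 * (stimuli @ true_filter) ** 2).astype(int)

observed, p_value = pc1_randomization_test(stimuli, spike_counts,
                                           n_shuffles=200, rng=rng)
```

A PC1 is deemed significant when `p_value` is below 0.05. To test the PC2, the same function would be applied to `project_out(stimuli, pc1)`, and so on for subsequent components.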
CALCULATION OF CONE WEIGHTS.
Red, green, and blue monitor phosphors modulated independently in the checkerboard stimulus, and these independent modulations produced correlated activity in the L-, M-, and S-cones. The covariance matrix of the stimulus in cone space is provided in Table 1. These correlations would have led to inaccurate cone weight estimates had the spike-triggered stimuli been transformed to cone activations prior to calculation of the STAs or PCs. Instead, STAs and PCs were computed in the space of monitor phosphors, and then converted to cone weights by the following procedure.
The STA (or PC) was represented as a 3 × n matrix, M. The three rows of M corresponded to phosphor intensity relative to the background level. In the single-pixel analyses and the multiple-pixel analysis of PC1s, the nine columns of M corresponded to frames preceding a spike. In the multiple-pixel analysis of the STA, the columns corresponded to all 576 (= 9 × 64) possible combinations of frames and pixels. We then modeled M as the product of a color-weighting function and a temporal (or spatiotemporal) weighting function using least-squares regression (singular-value decomposition)

$$M \approx [R, G, B]^T f(t) \tag{7}$$

where $f(t)$ is the temporal (or spatiotemporal) weighting function, and $[R, G, B]^T$ is the color-weighting function. This method implicitly assumes color/time separability, which was approximately true for STAs and PC1s calculated at single pixels in the checkerboard (0.22 × 0.22°) but is not true when the stimulus is large with respect to the receptive field (Cottaris and De Valois 1998; Horwitz et al. 2004). The color-weighting function is the triplet of gun intensity values that best describes M in the mean squared error sense, and thereby provides an estimate of the cell's preferred color direction. We converted $[R, G, B]^T$ to cone weights by the following formula

$$[L, M, S]^T = (A^T)^{-1} [R, G, B]^T \tag{8}$$

where A is a 3 × 3 matrix with elements that are the pairwise inner products of 10° cone fundamentals (Stockman et al. 1993) and phosphor emission spectra measured from the monitor. Note that A is the matrix that converts phosphor intensities to cone excitations, but Eq. 8 uses $(A^T)^{-1}$, because STAs and PCs are interpreted as visual mechanisms, which exist in the dual space of lights and are transformed accordingly (Knoblauch and D'Zmura 2001).
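The two steps described above, a rank-1 factorization of M followed by conversion with the inverse transpose of A, can be sketched in Python/NumPy. The entries of `A` below are made-up placeholder values, not measurements from the monitor used in the study, and the function names are illustrative.

```python
import numpy as np

def separate_color_and_time(M):
    """Best rank-1 (least-squares) fit M ~ outer(color, f) via SVD."""
    U, s, Vt = np.linalg.svd(np.asarray(M, dtype=float), full_matrices=False)
    return U[:, 0], s[0] * Vt[0]        # the overall sign is arbitrary

def phosphor_to_cone_weights(rgb_weights, A):
    """Apply (A^T)^-1, the transform appropriate for mechanisms, not lights."""
    return np.linalg.solve(A.T, rgb_weights)

# Hypothetical matrix: rows are L, M, S cones; columns are R, G, B phosphors.
A = np.array([[0.25, 0.60, 0.10],
              [0.10, 0.65, 0.15],
              [0.01, 0.08, 0.80]])

cone_true = np.array([1.0, -1.0, 0.0])        # an L-M opponent mechanism
rgb_true = A.T @ cone_true                    # same mechanism in phosphor space
time_course = np.array([0.0, 0.2, 0.8, 1.0, 0.5, 0.0, -0.3, -0.2, -0.1])
M = np.outer(rgb_true, time_course)           # perfectly separable 3 x 9 matrix

rgb_est, f_est = separate_color_and_time(M)
cone_est = phosphor_to_cone_weights(rgb_est, A)
```

Here `cone_est` is proportional to `cone_true`, up to the sign ambiguity of the decomposition. The inverse transpose is used because a mechanism's response to a light must be the same in both spaces: `rgb_true @ g` equals `cone_true @ (A @ g)` for any gun vector `g`.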
The resultant [L,M,S] vector provides the cone weight estimates. To compare cone weights across cells, we express each weight as a fraction of total cone weight to a given cell by the following formulas (Lennie et al. 1990)

$$l = \frac{L}{|L| + |M| + |S|}, \quad m = \frac{M}{|L| + |M| + |S|}, \quad s = \frac{S}{|L| + |M| + |S|} \tag{9}$$
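As a small worked example, the normalization can be written as a Python function. It assumes that "total cone weight" means the sum of the absolute values of the three weights, following the convention of Lennie et al. (1990); the function name is illustrative.

```python
def normalize_cone_weights(L, M, S):
    """Express each cone weight as a signed fraction of the total absolute weight."""
    total = abs(L) + abs(M) + abs(S)
    if total == 0:
        raise ValueError("at least one cone weight must be nonzero")
    return L / total, M / total, S / total

l, m, s = normalize_cone_weights(2.0, -1.0, 1.0)
print(l, m, s)   # prints: 0.5 -0.25 0.25
```

After normalization the absolute values of l, m, and s sum to 1, so a cell is fully described by two of the three values together with the sign of the third.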
Early in the data collection we recorded preferentially from cells with clear STAs and thus were biased against sampling complex cells. Importantly, however, the complex cells we recorded are an unbiased sample of those we encountered. The proportion of cone-opponent neurons in the dataset is also artificially inflated beyond the biases expected of extracellular recording because we often moved past relatively common nonopponent cells to find rarer “chromatically interesting” cells.
We recorded from 244 V1 neurons with RFs ranging in eccentricity from 1.6 to 7.9° (mean: 5.6°). One hundred and fifty-eight of these neurons provided the data set for a previous study (Horwitz et al. 2005); the additional 86 neurons were recorded subsequently. Below we present data from three example cells, emphasizing similarities and differences in their stimulus tuning. As illustrated with these cells, simple and complex cells require different analyses: spike-triggered averaging reveals structure in the light responses of simple cells and spike-triggered covariance is preferable for complex cells. To select the more appropriate analysis for each cell, we introduce a nonlinearity index to quantify the complexity of the neural response and thereby classify cells as simple or complex. We then show that the distributions of cone weights are nonrandom and that some complex cells respond to stimulus modulations in multiple color directions.
Spatial receptive field
EXAMPLE CELL 1.
The STA of a representative V1 neuron appears in Fig. 2A. The STA suggests that the preferred orientation of this cell was slightly clockwise from vertical and that the preferred spatial frequency was ∼1 cycle/°. Both predictions were confirmed by direct measurement with drifting gratings (Fig. 2, B and C).
Two features of these data suggest a “simple” classification for this cell. First, the cell was excited by increases in light intensity in one portion of the RF (bright pixels) and by decrements in light intensity in the flanking regions (dark pixels) unlike the sign-invariant response expected of complex cells (Hubel and Wiesel 1962). Second, the response of this cell modulated at the temporal frequency of the drifting grating as evidenced by the single peak and trough in the peri-stimulus time histograms over repeated stimulus cycles (Fig. 2, B and C, insets) (Skottun et al. 1991).
EXAMPLE CELL 2.
Data from a second example cell appear in Fig. 3. This cell's STA, shown in Fig. 3A, reveals a preference for horizontal edges and a spatial frequency of ∼2 cycle/°. Both predictions were confirmed with drifting gratings (Fig. 3, B and C). Like the cell in Fig. 2, this cell gave a modulated response to the drifting grating, consistent with a “simple” classification.
EXAMPLE CELL 3.
Data from a final example cell are shown in Fig. 4. This cell responded vigorously to the white noise stimulus, firing 120 spike/s during stimulation over a baseline of only 1 spike/s. The STA in Fig. 4A is unstructured, however, demonstrating that spike-triggered averaging is not a useful technique for analyzing the stimulus selectivity of this cell. When tested with gratings, this cell was well tuned for orientation and spatial frequency, as shown in Fig. 4, B and C, respectively, and had a sustained, unmodulated response consistent with a “complex” classification (insets).
Spike-triggered covariance analysis
The STA of a complex cell stimulated with white noise is uninformative. Previous studies have used spike-triggered covariance (eigenvector decomposition of the spike-triggered covariance matrix) to measure the direction selectivity (Rust et al. 2005; Touryan et al. 2002) and spatial selectivity (Touryan et al. 2005) of V1 complex cells. Here we use it to measure the spatial and spectral tuning of the three example cells.
STAs from the three example cells have been replotted in Fig. 5, top, to facilitate comparison. The middle row of panels shows the first principal components of the spike-triggered stimuli (PC1s). PC1s of example cells 1 and 2 resembled noisy versions of their STAs, consistent with their classification as simple cells (see methods). The PC1 of example cell 3, however, contained important structure that was not apparent in the STA.
The unstructured STA and structured PC1 of example cell 3 are expected of a linear cell with a static nonlinearity that rises for both positive and negative contrast. Such a cell responds to two complementary classes of stimuli—those that elicit a positive response from the underlying filter and those that elicit a negative response. Because these classes of stimuli are mirror images of each other, they cancel when averaged, accounting for their absence in the STA. This canceling is also reflected in the quadratic component of the NLI (see methods).
The preferred orientation and spatial frequency of cell 3 (Fig. 4, B and C) were not predictable from the STA but were predictable from the PC1. The PC1 revealed spatially offset on-off subregions the spatial configuration and widths of which were roughly consistent with the observed preferred orientation of 60° and the observed preferred spatial frequency of ∼0.5 cycle/°.
Figure 5, bottom, shows the second principal component of the spike-triggered stimuli (PC2) for each of the three example cells. PC2s of cells 1 and 2 were unstructured, but the PC2 of cell 3 revealed sensitivity to obliquely oriented edges as did the PC1. The PC1 and PC2 of cell 3 have the same spatial orientation, but they differ in spatial phase. This relationship between PCs suggests that the cell would respond to an edge of the appropriate orientation, irrespective of contrast polarity or precise position in the RF as expected of a complex cell (Rust et al. 2005; Touryan et al. 2005).
Cone weight estimation
In the rendering of Fig. 5, the STA of cell 1 and the PC1 of cell 3 have similar colors, and this color differs from that of the STA of cell 2. Qualitatively, this implies that cells 1 and 3 had similar cone weights and that these cone weights differed from those of cell 2. To estimate cone weights quantitatively, we calculated the STA of cells 1 and 2 and the PC1 of cell 3 at a single pixel across nine frames including the one shown in Fig. 5. Cone weights for the three example cells are provided in Table 2. Cells 1 and 3 were nonopponent (all of their cone weights had the same sign), but cell 2 was cone-opponent (2 cone weights differed in sign).
Spike-triggered averaging is a useful tool for studying simple cells, and spike-triggered covariance is a superior tool for analyzing complex cells. Classifying cells into simple and complex categories is therefore useful for determining the more appropriate analysis.
We defined an NLI that has a value of −1 for a purely linear cell and 1 for a linear cell with a quadratic output nonlinearity. The value of the index indicates which regression model, linear or quadratic, better describes a cell's “output nonlinearity” or “feature contrast response function” (Chichilnisky 2001; Touryan et al. 2005). The quality of the regression fits is expressed relative to the fit of a full model that contains both linear and quadratic terms. The full model fit the data quite well: the average $R^2_{full}$ was 0.95 ± 0.06 (SD). Means and SDs of the estimated output nonlinearities are shown for the 183 simple cells (NLI < 0) in Fig. 6A and for the 61 complex cells (NLI > 0) in Fig. 6B. As expected, output nonlinearities of cells with low NLI values are roughly linear, and output nonlinearities of cells with high NLI values are roughly quadratic.
Figure 7A shows the distribution of NLI values across the dataset. The prevalence of simple cells in the data set is likely a result of our recording bias toward cells with clear STAs. NLI values for example cells 1–3 were −0.90, −0.87, and 0.96, respectively.
Measurements of luminance tuning
Example cell 1 had nonopponent cone weights and thus a preferred color direction close to photometric luminance. Example cell 2 had opponent cone weights of similar magnitude and thus a preferred color direction further from luminance. To examine the prevalence of luminance tuning across the data set, a preferred color direction was calculated from each cell's STA and correlated with the color direction given by photometric luminance.
The correlation coefficient is the cosine of the angle between the two color directions, and this angle depends on the color space in which the analysis is performed. In most color spaces, errors are correlated across color channels, complicating interpretation of the analysis (e.g., the transformation from phosphor weights to cone weights forces a negative correlation between estimated L- and M-cone weights). This analysis was thus performed in the space of monitor phosphors, in which sampling errors are independent across color channels (because the phosphors modulated independently). In what follows, we do not draw conclusions that depend on this arbitrary choice.
Results of the analysis appear in Fig. 7B. Positive numbers on the y axis indicate on responses and negative numbers indicate off responses. Luminance tuning was common among simple cells as indicated by the clusters of points at the left of the plot near 1 or −1 on the y axis; 88/183 (48%) of simple cells had |STA·lum| > 0.8. Nevertheless, an appreciable number of simple cells were tuned for color directions other than luminance, as indicated by the points near 0 on the y-axis; 95/183 (52%) of simple cells had |STA·lum| < 0.8. We conclude that the cells we identified as simple are tuned for a variety of color directions, including directions close to photometric luminance.
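The luminance-tuning measure is a cosine between two directions in phosphor space, with 0.8 as the criterion used above. A minimal Python sketch follows. The luminance direction used here is a placeholder, not the value for the monitor in the study, and the function names are illustrative.

```python
import math

def color_direction_cosine(u, v):
    """Cosine of the angle between two color directions."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def is_luminance_tuned(sta_rgb, lum_rgb, criterion=0.8):
    """True when |STA . lum| exceeds the criterion, for on or off responses."""
    return abs(color_direction_cosine(sta_rgb, lum_rgb)) > criterion

# Placeholder luminance direction (relative weights for R, G, B guns).
lum = (0.3, 0.6, 0.1)

on_cell = color_direction_cosine((0.3, 0.6, 0.1), lum)       # on response, near 1
off_cell = color_direction_cosine((-0.3, -0.6, -0.1), lum)   # off response, near -1
chromatic = color_direction_cosine((1.0, -0.5, 0.0), lum)    # orthogonal, near 0
```

Because the cosine depends on the color space in which it is computed, the same cell could fall on either side of the criterion in a different space, which is why the text avoids conclusions that hinge on this choice.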
Cells with large NLI values are nonlinear, so their STAs provide unreliable estimates of preferred color direction. For these cells, analysis of the PC1 is preferable to analysis of the STA. Figure 7C shows the results of an analysis that is identical to that shown in B but is based on PC1s instead of STAs. Importantly, we never observed a cell with a large NLI value that had a PC1 indicating anything other than luminance tuning.
Cone weight estimation
The representation of color preferences in Fig. 7 is incomplete because luminance is a single dimension of a fundamentally three-dimensional color space. Ideally, we would show estimated cone weights and NLI values for each cell in a single plot, but such a plot would require an unwieldy multidimensional data representation. Thus for the sake of figure clarity, we assigned cells to simple and complex categories on the basis of the NLI. Cone weights of simple cells were derived from STAs, and cone weights of complex cells were derived from PC1s.
Figure 8 shows the distribution of normalized L- and M-cone weights across the population of simple cells. A shows results from a single pixel analysis, and B shows results from an analysis in which every pixel contributed to cone weight estimation. S-cone weights are indicated by the distance of each point from the origin and whether it is filled or open (see legend). Interestingly, two more-or-less discrete groups of neurons emerged. One group had nonopponent L- and M-cone weights (top-right and bottom-left edges). Using a criterion of |l + m| > 0.8, 64/183 (35%) of simple cells belonged to this group. The second group had approximately equal and opposite L- and M-cone weights (bottom-right and top-left edges). Using a criterion of |l + m| < 0.2, 61/183 (33%) of simple cells belonged to this group.
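The two grouping criteria can be stated compactly in Python. The label "intermediate" for cells that meet neither criterion is our own shorthand for this sketch, not a category used in the analysis.

```python
def classify_lm(l, m):
    """Group a cell by its normalized L- and M-cone weights."""
    total = abs(l + m)
    if total > 0.8:
        return "nonopponent"      # L and M share a sign and dominate the input
    if total < 0.2:
        return "opponent"         # L and M are roughly equal and opposite
    return "intermediate"

print(classify_lm(0.50, 0.45))    # prints: nonopponent
print(classify_lm(0.45, -0.50))   # prints: opponent
print(classify_lm(0.60, -0.10))   # prints: intermediate
```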
Neurons classified as blue/yellow opponent in Horwitz et al. (2005) are shown as ▴ and ▵ in Fig. 8. These neurons have S-cone weights the sign of which is opposite that of a weighted sum of L- and M-cone weights, and they have STAs that appear either blue or yellow when rendered on a CRT. In Fig. 8, L- and M-cone weights are shown separately for these neurons. By this analysis, most of these neurons exhibited LM opponency, with S- and M-cone weights having the same sign. They did not have the S-(L+M) cone weight signature characteristic of blue/yellow neurons in the retina and lateral geniculate nucleus, but appear to be tuned more closely to a perceptual blue/yellow axis (DeValois et al. 2000b; Werner and Wooten 1979; Wuerger et al. 2005). Note, however, that because these neurons are not linear (see Horwitz et al. 2005), the cone weights shown in Fig. 8, which were derived from their STAs only, are incomplete descriptions of their color tuning.
The spread of data points along the upper-right and lower-left edges of Fig. 8 is consistent with the idea that LM nonopponent cells receive a diverse balance of L- and M-cone input. Alternatively, this spread may simply reflect random errors in estimated cone weights. To measure the variability of estimated cone weights, we computed standard errors by bootstrap as a function of angle in the LM plane for each cell. Spline fits delimiting 1 SE are shown for 20 example neurons in Fig. 9.
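A simplified version of the bootstrap can be sketched in Python/NumPy. The analysis described above computed standard errors as a function of angle in the LM plane. The sketch below instead resamples spike-triggered stimuli and reports a standard error for each normalized weight, so it illustrates the resampling idea rather than reproducing the procedure. The simulated data and all names are hypothetical.

```python
import numpy as np

def normalized_weights(triggered):
    """Weights from the mean spike-triggered stimulus, scaled to unit absolute sum."""
    sta = triggered.mean(axis=0)
    return sta / np.abs(sta).sum()

def bootstrap_standard_error(triggered, n_boot=500, rng=None):
    """Resample spikes with replacement and recompute the weights each time."""
    rng = np.random.default_rng() if rng is None else rng
    n = len(triggered)
    estimates = np.empty((n_boot, triggered.shape[1]))
    for b in range(n_boot):
        resampled = triggered[rng.integers(0, n, size=n)]
        estimates[b] = normalized_weights(resampled)
    return estimates.std(axis=0, ddof=1)

# Simulated spike-triggered stimuli: 2,000 spikes, three color channels.
rng = np.random.default_rng(3)
triggered = rng.normal([0.7, 0.6, 0.1], 1.0, size=(2000, 3))
se = bootstrap_standard_error(triggered, rng=rng)
```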
Measurement error in the proportion of L- to M-cone weight for nonopponent cells was large, as evidenced by the fact that cells at the top-right and bottom-left edges of the plot had long and obliquely oriented standard error regions parallel to the edges of the bounding box. This variability stems from the fact that L- and M-cone spectral absorption functions are very similar, so positively weighted sums of these functions are difficult to distinguish. Weighted differences between L- and M-cone spectral absorption functions, on the other hand, depend steeply on relative cone weight so long as the weights are of similar magnitude. For this reason, cells exhibiting L-M antagonism (those near the top-left and bottom-right edges of Figs. 8 and 9) yielded relatively reliable L- and M-cone weight estimates.
To estimate the cone weights of complex cells, we examined PC1s in the same way that we examined the STAs of simple cells. Results of this analysis are shown in Fig. 10A. Figure 10B shows the results of a closely related analysis in which spike-triggered covariance matrices were computed within, and then averaged across, all 64 pixels prior to eigenvector decomposition. As presaged by the analysis in Fig. 7C, nearly every complex cell we studied had nonopponent L- and M-cone weights.
This result appears to conflict with previous studies that have reported that V1 complex cells are dominantly (Lennie et al. 1990) or partially (Gouras and Kruger 1979; Johnson et al. 2004) cone-opponent. How can we resolve this apparent discrepancy? If the cone inputs to a complex cell can be summarized as a single set of cone weights, these weights can be derived from the PC1. It is easy to imagine, however, that this “single mechanism” hypothesis could be wrong: some complex cells may respond to modulations in several different directions in color space that are not predictable from a single linear combination of cone signals.
Different models of cone signal integration predict qualitatively different outcomes from the spike-triggered covariance analysis. A cell that sums linear inputs with different cone weights prior to rectification will have a PC1 that reflects the weighted average of these inputs and will have higher-order PCs that are uninformative. On the other hand, a cell that receives individually rectified linear inputs, each of which has a different set of cone weights, will have a PC1 dominated by the mechanism that drives the cell most strongly and higher-order PCs that are dominated by the mechanisms that drive the cell more weakly.
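The two predictions can be illustrated with a small simulation. The sketch below is only an illustration: it assumes a single-pixel Gaussian stimulus, squaring as the form of rectification, and two hypothetical orthogonal mechanisms (`w_non`, nonopponent, and `w_opp`, opponent) with arbitrary relative gains.

```python
import numpy as np

def stc_eig(stim, spikes):
    """Eigenvalues (descending) and eigenvectors of the spike-triggered covariance."""
    w = spikes / spikes.sum()
    centered = stim - w @ stim                   # subtract the STA
    cov = (centered * w[:, None]).T @ centered
    evals, evecs = np.linalg.eigh(cov)
    order = np.argsort(evals)[::-1]
    return evals[order], evecs[:, order]

rng = np.random.default_rng(0)
stim = rng.standard_normal((50000, 3))           # white noise in L, M, S
w_non = np.array([1.0, 1.0, 0.0]) / np.sqrt(2.0) # nonopponent mechanism
w_opp = np.array([1.0, -1.0, 0.0]) / np.sqrt(2.0)# opponent mechanism

# Model A: inputs are summed, then rectified (squared) once.
combined = w_non + 0.5 * w_opp
rate_a = (stim @ combined) ** 2
evals_a, evecs_a = stc_eig(stim, rng.poisson(0.5 * rate_a))

# Model B: each input is rectified (squared) before summation.
rate_b = (stim @ w_non) ** 2 + 0.5 * (stim @ w_opp) ** 2
evals_b, evecs_b = stc_eig(stim, rng.poisson(0.5 * rate_b))

combined_unit = combined / np.linalg.norm(combined)
print("Model A eigenvalues:", evals_a)
print("Model A |PC1 . combined|:", abs(evecs_a[:, 0] @ combined_unit))
print("Model B eigenvalues:", evals_b)
print("Model B |PC1 . w_non|:", abs(evecs_b[:, 0] @ w_non))
print("Model B |PC2 . w_opp|:", abs(evecs_b[:, 1] @ w_opp))
```

Under these assumptions, model A should yield a single eigenvalue well above the stimulus variance, with PC1 along the weighted sum of the inputs. Model B should yield two elevated eigenvalues, with PC1 aligned to the stronger mechanism and PC2 to the weaker one.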
To test the single mechanism hypothesis, we examined every significantly large PC from each nonlinear cell (nested randomization tests, P < 0.05). Results of this analysis appear in Fig. 11. Most of the complex cells we studied (52/61, 85%) had two or more significantly large PCs, and seven cells had more than four. Most (74/97, 76%) of the higher-order PCs had nonopponent LM weights, like the PC1s did. Nineteen cells, however, had at least one significant cone-opponent PC with normalized L- and M-cone weights of similar magnitude (l + m < 0.5). These cells thus had multiple preferred directions in color space, inconsistent with the single linear mechanism hypothesis and consistent with nonlinear combination of distinct cone-opponent and nonopponent inputs.
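One common form of nested randomization test can be sketched as follows. Whether it matches the procedure used in this study in every detail is an assumption. The sketch shuffles spike counts across frames to build a null distribution for the largest eigenvalue, and if the observed value is significant it projects that PC out and repeats the test in the remaining subspace. The simulated cell and all names are hypothetical.

```python
import numpy as np

def stc_matrix(stim, spikes):
    """Spike-triggered covariance of stim (n_frames, dim) given spike counts."""
    w = spikes / spikes.sum()
    centered = stim - w @ stim
    return (centered * w[:, None]).T @ centered

def count_significant_pcs(stim, spikes, n_shuffles=200, alpha=0.05, seed=0):
    """Count significantly large PCs with a nested shuffle test."""
    rng = np.random.default_rng(seed)
    basis = np.eye(stim.shape[1])        # columns span the subspace under test
    n_sig = 0
    while basis.shape[1] > 0:
        proj = stim @ basis
        evals, evecs = np.linalg.eigh(stc_matrix(proj, spikes))
        observed = evals[-1]             # largest eigenvalue in this subspace
        null = np.empty(n_shuffles)
        for i in range(n_shuffles):
            shuffled = rng.permutation(spikes)   # break stimulus-spike pairing
            null[i] = np.linalg.eigvalsh(stc_matrix(proj, shuffled))[-1]
        p = (1 + np.sum(null >= observed)) / (1 + n_shuffles)
        if p >= alpha:
            break
        n_sig += 1
        basis = basis @ evecs[:, :-1]    # remove the significant PC and retest
    return n_sig

# Synthetic cell with two individually squared mechanisms (hypothetical).
rng = np.random.default_rng(2)
stim = rng.standard_normal((20000, 3))
w_non = np.array([1.0, 1.0, 0.0]) / np.sqrt(2.0)
w_opp = np.array([1.0, -1.0, 0.0]) / np.sqrt(2.0)
rate = (stim @ w_non) ** 2 + 0.5 * (stim @ w_opp) ** 2
spikes = rng.poisson(0.5 * rate)

n_sig = count_significant_pcs(stim, spikes)
print("significantly large PCs:", n_sig)
```

Permuting spike counts across frames is adequate here only because the simulated response depends on a single stimulus frame. With temporally extended receptive fields, a shift-based shuffle that preserves spike train structure would be a more natural choice.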
Fixational eye movements
Monkeys in this study were trained to maintain stable visual fixation, and trials were aborted if the eye position left a 1 × 1° electronically defined window surrounding the fixation point. Within this window, however, small eye movements occurred frequently. We recorded eye position with high temporal and spatial precision and attempted to compensate for changes in eye position by shifting the reconstructed stimulus patterns relative to each other during off-line analysis. This approach failed because over the duration of a typical experiment eye position measurements became contaminated by slow drifts in the eye coil signal that were unrelated to actual eye movements (Horwitz et al. 2005; Read and Cumming 2003; Tsao et al. 2003).
Fortunately, eye movements tended to be small relative to the size of the pixels in the stimulus (as shown in Horwitz et al. 2005). Within individual trials, the SD of eye position was 0.06° in the horizontal channel and 0.09° in the vertical channel, which is small relative to an individual stimulus pixel (0.22 × 0.22°). Changes in eye position across trials were more difficult to measure due to slow drifts, but, even including this source of variability, fixation positions on sequential trials tended to be fairly consistent: the difference in median eye position between subsequent trials was inside the single pixel boundary 90% of the time. These analyses, in conjunction with the observation that many of the RFs we studied had clearly resolvable and distinct subunits (e.g., Fig. 5), suggest that eye movements were not a significant source of spatial blurring.
We estimated cone inputs to V1 neurons in awake macaques with white noise analysis techniques. Both simple and complex cells responded to cone-opponent and nonopponent modulations, complex cells responded particularly well to nonopponent modulations, and a large population of simple cells had L- and M-cone weights with roughly equal magnitude and opposite sign. These results are largely consistent with those obtained previously using periodic stimuli in anesthetized monkeys.
We obtained two unexpected results. First, the cone weights of simple cells were more clearly clustered than has been reported previously: one cluster had roughly equal and opponent L- and M-cone weights (and a variety of S-cone weights) and another cluster had nonopponent cone weights the magnitude of which could not be estimated precisely. Second, none of the complex cells we studied was purely cone-opponent. In what follows, we compare our findings to those in the literature, and we speculate on the significance of our results for visual function.
Cone weight estimates based on strict linear assumptions are distorted by nonlinearities that follow cone signal summation. Standard regression analyses are based on such strict linear assumptions. In contrast, the spike-triggered analyses used in this study are robust to these nonlinearities (Chichilnisky 2001). We speculate that the robustness to nonlinearities was important for revealing clusters of cells in cone weight space. Other factors that distinguish our study from previous studies include the use of awake monkeys and spatiotemporally broadband visual stimuli.
Additional nonlinearities (beyond a static output nonlinearity) undoubtedly distorted our cone weight estimates. We have shown previously that some cone-opponent neurons respond to luminance contrast of either polarity, a nonlinear response that distorts cone weight estimates based solely on the STA (Horwitz et al. 2005). Cells with this response property are neither purely simple nor complex (in fact, their existence reveals the inadequacy of the simple/complex dichotomy), but for the sake of this study, we needed a parsimonious, if incomplete, description of their cone inputs. All but one of these cells had NLI values <0 and were thus classified as simple, so we calculated their cone weights from their STAs.
Many V1 neurons are suppressed by stimuli appearing outside their classical RFs, and potentially inside their classical RFs as well (Albrecht and Geisler 1991; Heeger 1992; Solomon and Lennie 2005; Solomon et al. 2004; Wachtler et al. 2003). This suppression could distort cone weight estimates based on the STA if the cone weights of a rectified linear suppressive mechanism differ from those of the center mechanism (Solomon and Lennie 2005; Solomon et al. 2004). In fact, some of the dominantly linear cells we studied had highly structured, small principal components with cone weights that differed clearly from those estimated from the STA, consistent with a rectified suppressive mechanism (Rust et al. 2005; Schwartz et al. 2002).
Most of the LM nonopponent simple cells we studied received significant S-cone input with the same sign as L- and M-cone inputs. This result is consistent with the idea that macaques are more sensitive to short wavelength light than predicted by the human photopic luminous efficiency function (Conway and Tsao 2006; DeValois 1965; Dobkins and Albright 1994) and may be related to evidence that S-cones contribute to responses in neurons in the magnocellular layers of the LGN (Chatterjee and Callaway 2002; but see Sun et al. 2006) and in area MT (Barberini et al. 2005; Seidemann et al. 1999).
Cone-opponent neurons tended to have S-cone weights with the same sign as M-cone weights (and thus of opposite sign to L-cone weights). This tendency was noted previously by Conway (2001) and Conway and Livingstone (2006), who suggested that these cells may implement a red/cyan axis. Cone weight signs, however, do not define a color direction uniquely, and the range of directions in cone space consistent with cone weight signs −L, +M, +S is associated with a variety of hues (Conway and Livingstone 2005). Indeed, many of the cells we described as blue/yellow in a previous report had these cone weight signs, in qualitative agreement with the −130L +95M +35S blue/yellow mechanism postulated by DeValois and DeValois (1992). Conway (2001) and Conway and Livingstone (2006) also found that many cone-opponent V1 neurons had a double-opponent receptive field organization, and this was true in our data set as well (Fig. 3 and data not shown).
S-cone weights reported in this paper are larger than those reported by others (Johnson et al. 2001, 2004; Lennie et al. 1990; Solomon and Lennie 2005; Solomon et al. 2004; but see DeValois et al. 2000a). Several factors likely contribute to this discrepancy. First, V1 neurons have nonlinear contrast-response functions, so traditional cone weight estimates depend on the experimenters' choice of stimulus intensities. The technique we used, in contrast, is immune to nonlinearities following cone signal summation. Second, no universally accepted convention exists for scaling the cone fundamentals, and different conventions lead to different cone weight estimates. The cone fundamentals we used in this study were scaled to equal peak sensitivity (and therefore unequal integral). Third, the model on which our analysis is based asserts that each V1 neuron takes a weighted sum of cone signals after their respective means have been subtracted. A slightly more complex model assumes that each V1 neuron takes a weighted sum of cone contrast signals (assuming a specific form of von Kries adaptation). Cone weights calculated under this model are simply the cone weights reported in this manuscript multiplied by the background excitation of each cone type, which in arbitrary units are: L:1, M:0.8, S:0.6. This multiplication reduces the magnitude of S-cone weights relative to L- and M-cone weights. Fourth, nonlinearities beyond a static output nonlinearity play significant roles in the responses of some cone-opponent V1 neurons, and for these neurons, the linear model may be a sufficiently poor approximation that different linear analysis methods provide grossly dissimilar cone weight estimates.
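The effect of the cone contrast rescaling can be seen in a short worked example. The background excitations (L:1, M:0.8, S:0.6) are those given above. The example weights are hypothetical, and the renormalization (absolute values summing to 1) is an assumption made for illustration.

```python
import numpy as np

# Hypothetical cone weights (L, M, S) under the mean-subtracted model.
weights = np.array([0.4, -0.4, 0.2])

# Background excitation of each cone type, arbitrary units (from the text).
background = np.array([1.0, 0.8, 0.6])

# Cone-contrast-model weights: multiply by background, then renormalize.
contrast_weights = weights * background
contrast_weights = contrast_weights / np.sum(np.abs(contrast_weights))

print("original:     ", weights)
print("cone contrast:", np.round(contrast_weights, 3))
```

For this example the rescaled weights are approximately (0.476, −0.381, 0.143), so the S-cone share of the normalized weight falls from 0.20 to about 0.14.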
We speculate that the neurons classified as red/green or blue/yellow in qualitative studies of color tuning in V1 had L- and M-cone weights of opposite sign and/or S-cone weights that were strong relative to L- and M-cone weights. Our data did not reveal distinct populations of red/green and blue/yellow V1 neurons based on these criteria.
By the single-pixel analysis, none of the complex cells we studied responded exclusively to opponent modulations. By the multiple-pixel analysis, a few complex cells appeared to be more clearly cone opponent. This result may be related to the fact that some complex cells exhibit an enhanced cone-opponent response when stimulated with stimulus patches that are large with respect to the RF (data not shown). On the other hand, it may simply be due to statistical noise: incorporating pixels that did not impinge on the RF increases the error in the estimate of preferred color direction, and the transformation from phosphor space to cone space causes these errors to manifest as LM opponency.
The dominance of nonopponent signals in complex cells is consistent with some other studies (Johnson et al. 2004; Lennie et al. 1990) but is inconsistent with reports of pure color-opponent complex cells (Michael 1978). We cannot exclude the possibility that some of the complex cells we studied that appeared exclusively nonopponent carried an opponent signal that was below our statistical threshold.
Cells that responded to both opponent and nonopponent stimulus features cannot be characterized by a single set of cone weights; at least two sets are required. Studies that have assumed a single set of cone weights for each complex cell have reported a range of preferred color directions (DeValois et al. 2000a; Johnson et al. 2004; Lennie et al. 1990). This heterogeneity may arise in part from the single linear mechanism assumption. For instance, a complex cell that receives both opponent and nonopponent rectified inputs could respond to a wide array of chromatic and achromatic stimuli. In a classical stimulation paradigm, this breadth of tuning could lead to a poorly constrained linear model fit and thus a variable estimate of preferred color direction. These cells may belong to the class of “universal edge detectors” that have been described previously (Gouras and Kruger 1979; Hubel and Livingstone 1990; Thorell et al. 1984; Yoshioka et al. 1996). A rigorous test of this hypothesis will require stimulating these neurons with bars or gratings that vary in luminance and chromaticity.
Functional significance of nonopponent complex cells
The prevalence of luminance tuning in complex cells is appropriate for a role in scene segmentation. Two abutting objects, or two objects in occlusion, form an edge between them the polarity of which (bright to dark, or dark to bright) depends on the position of the light source and the relative positions of the objects. These factors, although important for a complete three-dimensional reconstruction of the scene, are unimportant for delimiting object boundaries. Thus dropping the sign of luminance differences provides an invariance that is useful for finding edge positions. Some of the complex cells that we studied may be involved in such a computation. Dropping the sign of a chromatic difference is less useful because spatial chromatic differences are largely invariant to the depth order of objects and their position relative to the light source. It is thus reasonable that V1 may not contain purely cone-opponent complex cells.
G. D. Horwitz was supported by the Helen Hay Whitney Foundation. E. J. Chichilnisky was supported by the McKnight Foundation and Sloan Foundation. T. D. Albright was supported by the Howard Hughes Medical Institute.
We thank B. Krekelberg, A. Schlack, X. Huang, G. Field, and J. Mitchell for useful discussions and J. Costanza, D. Diep, and D. Woods for excellent technical assistance during the course of the study. J. Mitchell and several anonymous reviewers provided valuable comments on the manuscript.
The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Copyright © 2007 by the American Physiological Society