Previous research has established that orientation selectivity depends to a great extent on suppressive mechanisms in the visual cortex. In this study, we investigated the spatial organization and the time-course of these mechanisms. Stimuli were presented in circular windows of “optimal” and “large” radii. The two stimulus sizes were chosen based on an area-response function measured with drifting gratings at high contrast. The “optimal” size was defined as the smallest radius that elicited the peak response (average value of 0.45°), whereas “large” was defined as two to five times the optimal size. We found that the peak amplitude of tuned enhancement and untuned suppression varied <10% on average with stimulus radius, indicating that they are mainly concentrated in the classical receptive field. However, tuned suppression—in those cells that showed it—was significantly stronger with large stimuli, indicating that this component has a contribution from beyond the classical receptive field. These results imply that spatial context (in large stimuli) enhances orientation selectivity by increasing tuned suppression. We also characterized the time evolution of enhancement, of untuned suppression, and of tuned suppression. The time-course of tuned suppression was markedly slower in time-to-peak and longer in its persistence than untuned suppression. Therefore tuned suppression is likely to be generated by long-range recurrent connections or cortico-cortical feedback, whereas untuned suppression is mainly generated locally in V1.
A number of models of orientation selectivity include a tuned (feedforward) excitatory component combined with intracortical suppressive (or inhibitory) components (Ben-Yishai et al. 1995; McLaughlin et al. 2000; Shelley et al. 2002; Somers et al. 1995; Troyer et al. 1998; Wielaard et al. 2001). Experimental evidence from a variety of approaches supports the idea that the proposed suppressive components play a major role in orientation selectivity (Bonds 1989; Nelson and Frost 1978; Ringach et al. 2002a,b; Shapley et al. 2003; Sillito et al. 1980; Volgushev et al. 1993). Recently, we have studied the dynamics of orientation tuning in macaque primary visual cortex (V1) to uncover these different components (Ringach et al. 1997, 2003; see Shapley et al. 2003 for a review). In our previous studies we used spatially extensive stimuli covering both the classical receptive field (CRF) and its surround. We found features in the orientation tuning dynamics that corresponded to excitatory and inhibitory processes proposed in models.
In this study, one of our goals was to study the dependence of orientation tuning dynamics on stimulus size. The results provide an initial characterization of the spatial extent of the different excitatory and suppressive mechanisms that influence orientation selectivity. To obtain the data, we ran reverse correlation experiments in the orientation domain with two different stimulus sizes. The optimal size stimulus had a diameter equal to that of a cell's CRF defined by the peak or saturation point of an area summation curve (Sceniak et al. 1999). The large size stimulus had a diameter that was two to five times that of the optimal. Both the CRF of the cell and the nonclassical receptive field surround (Levitt and Lund 2002; Sceniak et al. 2001) were stimulated by the large size stimulus configuration.
We previously developed a descriptive model to account for orientation dynamics (Ringach et al. 2003) where the neuron's response was a linear combination of tuned enhancement, tuned suppression, and an untuned signal. Here we modified this model, assigning the early untuned signal (which was always excitatory) as well as the early tuned enhancement to a single excitatory process. We interpreted the change in sign of the untuned signal later in the response as the onset of untuned suppression. Interpreted in this way, our experimental results showed that the earliest input to a V1 neuron is enhancement (or excitation) that is very broadly tuned for orientation, and this is followed by a rapid untuned suppression. In the large size condition, we also observed tuned suppression in more than one-half of V1 neurons. The peak amplitude of tuned suppression grew markedly when the stimulus was enlarged from the optimal size to the large size condition. The time-course of the tuned suppression was also distinctly slower than that of untuned suppression.
As considered in the discussion, the new results offered in this paper support network models that propose that very broadly tuned cortical inhibition may be generated locally in the cortex (cf. McLaughlin et al. 2000; Troyer et al. 1998) and that this local inhibition is important for orientation selectivity. Tuned suppression is a mechanism to enhance orientation selectivity, and our results on its spatial range and dynamics indicate that this mechanism is likely to be the result of cortico-cortical feedback or long-range connections.
A preliminary report of this work was presented at the 2003 meeting of the Society for Neuroscience (Xing et al. 2003).
Acute experiments of several days duration were performed on adult old-world monkeys (Macaca fascicularis) in compliance with National Institutes of Health and New York University (NYU) guidelines. Animal preparation and recording were done as described previously (Hawken et al. 1996; Ringach et al. 2002b). Animals were initially tranquilized with acepromazine (50 μg/kg). After the tranquilizer, the animal was anesthetized by ketamine (30 mg/kg, im). After cannulation and tracheotomy, the animal was placed in a stereotaxic frame for craniotomy and subsequent visual experiments. A craniotomy (≤5 mm diam) was made in one hemisphere posterior to the lunate sulcus (∼15 mm anterior to the occipital ridge) and between 5 and 20 mm lateral from the midline. A small opening in the dura was made (<1 mm radius) to provide access for the electrode. During the whole duration of the acute experiment, anesthesia was continued with sufentanyl (6–18 μg/kg/h, iv), and the animal was paralyzed with pancuronium bromide (0.1 mg/kg/h, iv). Anesthetic level was monitored by measuring the EEG, heart rate, and blood pressure. Expired CO2 was maintained close to 5%. Temperature was kept at a constant 37°C. A broad spectrum antibiotic (Bicillin, 50,000 iu/kg, im) and anti-inflammatory steroid (dexamethasone, 0.5 mg/kg, im) were given on the first day of the experiment and every other day during the recording period. Experiments were terminated with a lethal dose of pentobarbital (60 mg/kg, iv).
Three to six electrolytic lesions (2–3 μA for 2–3 s, tip negative) were made along the length of each electrode penetration. The angle of the electrode track, relative to the surface normal, was ∼60°. A typical electrode track extended for about 4–5 mm. Consecutive lesions were spaced about 1 mm apart. Our electrode tracks resembled the one shown in Hawken and Parker (1984). At the end of the experiment, the animal was killed with an overdose of anesthetic and perfused through the heart. The fixation, sectioning, staining, and reconstruction of electrode tracks are described in detail in Hawken et al. (1988).
For the earlier experiments, visual stimuli were generated on a Silicon Graphics O2 R5000 computer. Stimuli were displayed on a Sony Multiscan 17se II color monitor (31.4 cm wide and 23.5 cm high) with a resolution of 800 × 600 pixels. The monitor's mean luminance was 53 cd/m². The viewing distance was 90–120 cm. The CRT refresh rate was 60 Hz for some of the earlier experiments and 100 Hz for experiments thereafter. For the later two-thirds of the data we collected, the visual stimuli were generated by custom software on a PC running Linux. Stimuli were displayed on a Sony GDM-F520 Trinitron Color Graphic Display (40.38 cm wide and 30.22 cm high) with 1,024 × 768 pixels, running at 100-Hz frame refresh. The mean luminance of the screen was 72.3 cd/m², and the viewing distance was 115 cm.
Each cell was stimulated monocularly through the dominant eye and characterized by measuring its steady-state response to conventional drifting gratings (the nondominant eye was occluded). Drifting gratings were presented for 2–4 s, and steady-state responses were calculated as the mean firing rate during this period. Using this method, we recorded basic attributes of the cell in response to drifting sinusoidal gratings. These include spatial and temporal frequency tuning, orientation tuning, contrast, and color sensitivity, as well as area summation curves. Receptive fields were located at eccentricities between 1 and 6° from the fovea.
Size tuning curves
After measuring the optimal orientation, spatial frequency, and temporal frequency tuning for a cell, we measured size tuning by varying the radius of the stimulus patch from 0.05 to 5° for a sinusoidal grating of optimal spatio-temporal parameters. The center of a cell's receptive field was carefully located by a small circular patch (usually 0.2° radius or smaller) of drifting grating. The center of the stimulus was put at the center of the cell's receptive field. The optimal size for a cell was defined as the peak or saturation point in the size-tuning curve (Sceniak et al. 1999 and see Fig. 1A).
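As a rough illustration of how an optimal size might be read off a size-tuning curve, the sketch below picks the smallest radius whose response reaches the peak, or comes within a small fraction of it for curves that saturate rather than peak. The `tolerance` parameter, the function name, and the example firing rates are assumptions made for this sketch; they are not taken from the experiments.

```python
import numpy as np

def optimal_radius(radii, responses, tolerance=0.05):
    """Smallest radius whose response is within `tolerance` (fractional) of the peak.

    `tolerance` is a hypothetical parameter of this sketch; with tolerance=0
    the function returns the smallest radius at the exact peak response.
    """
    radii = np.asarray(radii, dtype=float)
    responses = np.asarray(responses, dtype=float)
    order = np.argsort(radii)                      # make sure radii are ascending
    radii, responses = radii[order], responses[order]
    threshold = (1.0 - tolerance) * responses.max()
    # np.argmax on a boolean array returns the index of the first True,
    # i.e. the smallest radius that reaches the threshold.
    return radii[np.argmax(responses >= threshold)]

# Made-up size-tuning data (radius in deg, response in spikes/s) for a
# surround-suppressed cell: the response peaks and then declines.
radii = [0.05, 0.1, 0.2, 0.45, 1.0, 2.0, 5.0]
rates = [2.0, 8.0, 20.0, 40.0, 30.0, 25.0, 24.0]
r_opt = optimal_radius(radii, rates)
```

For a curve that saturates instead of peaking, the same function returns the first radius on the plateau, which is one way to operationalize the "saturation point".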
Reverse correlation in the orientation domain
Figure 1B shows the reverse correlation method in the orientation domain (Ringach et al. 1997). Sinusoidal gratings of 18 different orientations equally spaced from 0 to 180°, plus “blanks” (defined as uniform frames having the same luminance as the mean luminance of the grating images) were used. For each orientation, spatial phase was also varied: each orientation in the set was presented at eight different spatial phases, equally spaced from 0 to 360°. Other parameters (spatial frequency, optimal for the cell, 80–99% contrast) of the gratings were fixed based on previous measurements on each cell. A total of 152 possible frames (18 orientations × 8 spatial phases + 8 blanks) composed each sequence.
Each stimulus in a sequence was randomly chosen from the 152 types of stimuli with replacement and presented on the screen for two refresh frames (20 ms). The length of a random sequence of the stimuli was 30 s for each trial. Thirty trials were run for each experiment; this took 15 min altogether. The sequences of the stimuli were saved in the computer, and the cell's spike times were recorded with 1-ms resolution.
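The sequence construction described above can be sketched as follows. The counts (18 orientations, 8 phases, 8 blanks, 20-ms frames, 30-s trials) come from the text; the integer coding of frame types and the use of the value 18 to mark a blank are hypothetical choices made for this illustration.

```python
import numpy as np

N_ORI, N_PHASE, N_BLANK = 18, 8, 8           # from the text: 18*8 + 8 = 152 frame types
N_TYPES = N_ORI * N_PHASE + N_BLANK
FRAME_MS, TRIAL_S = 20, 30                   # two refreshes at 100 Hz; 30-s trials

def make_sequence(rng):
    """Draw one trial: frame types sampled uniformly with replacement."""
    n_frames = TRIAL_S * 1000 // FRAME_MS    # 1500 stimuli per trial
    types = rng.integers(0, N_TYPES, size=n_frames)
    # Hypothetical coding: indices 0..143 are gratings, 144..151 are blanks.
    is_blank = types >= N_ORI * N_PHASE
    ori_index = np.where(is_blank, N_ORI, types // N_PHASE)   # 18 marks "blank"
    phase_index = np.where(is_blank, -1, types % N_PHASE)     # -1: no phase for blanks
    return ori_index, phase_index

rng = np.random.default_rng(0)
ori_index, phase_index = make_sequence(rng)
```

Collapsing the frame type to an orientation index at this stage matches the later analysis, in which gratings of the same orientation but different spatial phase are treated as the same stimulus.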
The dynamics of orientation tuning were calculated as follows. First, we initialized a matrix N(θ, τ) to be all zeros. θ is an index of the different stimuli (18 orientations plus 1 blank), regardless of their phases, and τ is the time delay. Given a specific time delay τ, we went back τ ms before each action potential, identified the stimulus θi presented at that time, and added one to matrix entry N(θi, τ). Gratings of different spatial phases but the same orientation were treated as the same stimulus. At the end of this calculation, each matrix entry N(θi, τ) in matrix N was normalized by the actual number of stimuli θi that appeared in the sequence. This provides an estimate of the number of spikes that the cell will fire in a window (τ, τ + T) ms after stimulus θi is shown (where T is the duration of 1 frame). Once the number of spikes in response to an oriented pattern, p(θ, τ), and the blank, p(blank, τ), were estimated, we calculated R(θ, τ) = log10[p(θ, τ)/p(blank, τ)], which we refer to as the tuning curve at a time lag τ. Oriented patterns that generate responses identical to the “blank” are mapped to R(θ, τ) = 0; stimuli that enhance a cell's response are mapped to R(θ, τ) > 0, whereas stimuli that suppress a cell's response are mapped to R(θ, τ) < 0. A statistical justification for the log transform in the definition of R(θ, τ) was provided in Ringach et al. (2002a).
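A minimal sketch of this reverse correlation calculation is given below. It assumes that the stimulus sequence is stored as one orientation index per 20-ms frame (0 to 17 for orientations, 18 for the blank) and that spike times are in milliseconds from sequence onset; the function name and the synthetic data are illustrative only.

```python
import numpy as np

def reverse_correlate(stim_index, spike_times_ms, delays_ms, frame_ms=20, n_stim=19):
    """Return R[theta, tau] = log10(p(theta, tau) / p(blank, tau)).

    stim_index     : stimulus shown in each frame (0..17 orientations, 18 = blank)
    spike_times_ms : spike times relative to sequence onset, in ms
    """
    stim_index = np.asarray(stim_index)
    spike_times_ms = np.asarray(spike_times_ms, dtype=float)
    counts = np.zeros((n_stim, len(delays_ms)))            # N(theta, tau), all zeros
    shown = np.bincount(stim_index, minlength=n_stim)      # presentations per stimulus
    for j, tau in enumerate(delays_ms):
        # Frame that was on the screen tau ms before each spike
        frame = np.floor((spike_times_ms - tau) / frame_ms).astype(int)
        valid = (frame >= 0) & (frame < len(stim_index))
        counts[:, j] = np.bincount(stim_index[frame[valid]], minlength=n_stim)
    p = counts / shown[:, None]                            # spikes per presentation
    return np.log10(p[:-1] / p[-1])                        # orientations relative to blank

# Synthetic check: every frame evokes one spike 65 ms after its onset, and
# frames showing orientation 5 evoke a second spike 70 ms after onset.
stim = np.tile(np.arange(19), 100)
onsets = np.arange(stim.size) * 20.0
spikes = np.concatenate([onsets + 65.0, onsets[stim == 5] + 70.0])
R = reverse_correlate(stim, spikes, delays_ms=[60])
```

In this toy data set orientation 5 drives twice as many spikes as the blank at a 60-ms delay, so its entry in `R` is log10(2), while all other orientations map to 0. Real data would need a guard against stimuli or delays with zero counts, which this sketch omits.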
We ran the reverse correlation experiments in the orientation domain at two different radii. The optimal size stimulus had a diameter equal to that of a cell's CRF defined by the peak or saturation point of an area summation curve (Sceniak et al. 1999). The large size stimulus had a diameter that was two to five times that of the optimal. We used two to three times that of the optimal radius as large stimuli on 20 cells in our earlier experiments. In the later experiments, the large size was set to be not less than four times the optimal. We did not observe a significant difference between the results on the earlier 20 cells and the 81 cells studied later, so we pooled all data together. Each reverse correlation was run in a block design with a random sequence of stimuli all of the same size for each correlation experiment. We found that population average p(blank, τ) did not change significantly between sizes, and therefore the comparisons we offer below between responses at different sizes were based on the same baseline.
Curve fitting of orientation tuning
Given the measured dynamics of a cell's orientation selectivity, we first smoothed the data in time domain by a rectangular time window (5 ms long and 0.2 high). Then at each time delay τ, we did a parametric analysis by fitting Eq. F1 to the data

$$R(\theta,\tau) = a(\tau)F(\theta) - b(\tau)G(\theta) + c(\tau) \tag{F1}$$

where F(θ) and G(θ) are von Mises “distributions” (Mardia 1972) normalized between 0 and 1. Fitting of R(θ, τ) was done in the following way. In the first step, we assumed starting values for the parameters (θe, κe, θt, κt) of F(θ) and G(θ) and found the best fitting parameters (a, b, c) at each time delay independently (under the constraints a, b ≥ 0 and θe, κe, θt, κt within 10% of the starting values). In the second step, we estimated best-fitting values of (θe, κe, θt, κt) using the fitted (a, b, c) parameters from the first step. The process was repeated until there was <0.1% change in the parameters (θe, κe, θt, κt) from one iteration to the next. The result of such smoothing by curve-fitting is shown in Fig. 2A, which shows a cell's fitted curve with its raw data at one particular time-lag. The profiles of F(θ) and G(θ) are also shown by red and blue curves. The von Mises distribution has been shown previously to provide very good fits to such empirical data (Ringach et al. 2003; Swindale 1998).

To evaluate the goodness of fit for each set of data, we calculated fractional errors as defined in Eq. F2. The computation of fractional error was done over the time interval from τ1 to τ2 when the response was large enough to be out of the noise. τ1 is defined as the first time when the response variance RV(τ) = Σθ R(θ, τ)² reaches 5% of the peak RV(τ), and τ2 is defined as the time when RV(τ) finally relaxes back to 5% of its peak.
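The temporal smoothing and the choice of the analysis interval can be sketched as below. The sketch assumes that R is stored as an array with orientation along the first axis and delay along the second, sampled in 1-ms steps, and it approximates τ2 by the last sample at which the response variance is still at or above 5% of its peak. Function names and the toy response are illustrative assumptions.

```python
import numpy as np

def smooth_in_time(R, width=5):
    """Boxcar-smooth R[theta, tau] along the delay axis (1-ms steps assumed).

    A 5-sample window of height 0.2 has unit area, so it acts as a running mean.
    """
    kernel = np.full(width, 1.0 / width)
    return np.apply_along_axis(lambda r: np.convolve(r, kernel, mode="same"), 1, R)

def analysis_window(R, frac=0.05):
    """Indices (tau1, tau2) bounding where RV(tau) = sum_theta R^2 >= frac * peak RV."""
    rv = np.sum(R ** 2, axis=0)
    above = np.flatnonzero(rv >= frac * rv.max())
    return above[0], above[-1]

# Toy response: a fixed tuning profile multiplied by a Gaussian time-course
delays = np.arange(200)
envelope = np.exp(-0.5 * ((delays - 60) / 15.0) ** 2)
tuning = np.cos(np.linspace(0, np.pi, 18, endpoint=False))
R_toy = np.outer(tuning, envelope)
tau1, tau2 = analysis_window(smooth_in_time(R_toy))
```

Note that `mode="same"` pads with zeros, so the first and last two samples of the smoothed trace are biased toward zero; this matters little when the response is confined to the middle of the delay range.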
Figure 3 shows one example cell's fitted data (Fig. 3A), raw data (Fig. 3B), and the difference between them (Fig. 3C). The mean fractional error is 0.03 for the fitting in the large size condition and 0.04 for the fitting at optimal size. This cell has a Mexican-hat orientation tuning curve at later times in the large size condition (Fig. 3B, left). The fitting captured this characteristic very well. For this example, the fractional error is 0.1 for the large size condition and 0.04 for the optimal size condition.
The parameter θe determines the preferred orientation of the excitatory component F(θ), and κe its width. Similarly, the suppressive component was parameterized by θt, κt. Given a cell's orientation tuning curve at time τ, we defined a cell's (orientation) modulation depth A(τ) as the difference of maximum response and minimum response across all orientations. Figure 2B shows the definition of a cell's modulation depth (A), response at its preferred orientation (Rpref), response at its orthogonal orientation (Rorth), and its minimum response (Rmin) at time τ. Note that all of these response measures are functions of time offset τ.
Orientation dynamics at two stimulus sizes: a description
We recorded from 101 cells and observed all of the phenomena previously described in experiments done with large stimuli (Ringach et al. 1997, 2003). In this section, we will describe the orientation tuning dynamics we observed in V1, and how they changed with stimulus size. Descriptive data will be shown in Figs. 4–8. Then we will present an analysis of the data with a model that includes tuned excitation and tuned and untuned suppression (Ringach et al. 2003).
Orientation modulation depth
Orientation selectivity is measured by the modulation depth, A(τ), the difference between the magnitudes of the maximum and minimum responses across orientation at time τ. Formally

$$A(\tau) = \max_{\theta} R(\theta,\tau) - \min_{\theta} R(\theta,\tau)$$

Therefore A(τ), the modulation depth, is a dimensionless estimate of selectivity. This quantity had a unique maximum at a time delay usually around 60 ms (population mean = 62 ms). A scatter plot between the peak values of the modulation depth at two size conditions, and their distributions, are shown in Fig. 4. The maximal modulation depth was systematically bigger for large stimuli as can be seen in the scatter plot in Fig. 4. Most of the data are on or below the diagonal line indicating larger values of modulation depth for the large stimulus size. Fifty-one cells had a significantly larger modulation depth for large stimulus size (squares in Fig. 4, P < 0.01), and three cells had a significantly larger modulation depth for optimal stimulus size (diamonds, P < 0.01). The distribution of the differences is shown in the histogram on the diagonal; the mean difference is about 0.2 (P < 0.01). Thus orientation selectivity, which increases with modulation depth, is better for larger stimuli.
We also wanted to study dynamic orientation selectivity in different layers of V1, because it is well known that functional connectivity differs between cortical layers. We found only a weak dependence of a cell's orientation modulation depth on the cell layer, as seen in Fig. 5. This result indicates that cells in the input layers (4C) have roughly the same average modulation depth for orientation as cells located in output layers like 2/3, 4B, and 6. This finding resembles previous findings of a weak laminar dependence of steady-state orientation selectivity (Ringach et al. 2002b). Figure 5 also shows that the effect of stimulus size on the peak of the orientation modulation depth was similar in all layers: a small but systematic increase in orientation selectivity with larger stimuli.
Orientation tuning dynamics for large and optimal stimuli
Figure 6, A and B, shows orientation tuning dynamics for two cells at the two stimulus sizes, large and optimal. The top row displays graphs of the orientation dynamics for cell A recorded in layer 4B. The graphs for cell B, recorded in V1 layer 2/3, are below. In each graph, three response versus orientation curves are plotted corresponding to three times: 1) the peak time τpeak at which the modulation depth A(τ) was maximal; 2) the time τdev, which is the earliest time at which A = 1/2 Apeak; and 3) the time τdec, which is the first time, after the peak time, at which A = 1/2 Apeak. Also shown in each graph is the orientation tuning curve derived from reverse correlation at zero time offset (as the solid thick lines), to estimate the noisiness of the reverse correlation tuning curves. One expects that the orientation tuning curve at zero time offset should be totally flat with R(θ,0) = 0. The deviations of the graphed tuning curves from 0 are a measure of the variability in the reverse correlation estimates, and they are acceptably small.
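The three time points used in these graphs can be computed from the modulation depth as in the following sketch. It assumes R is stored with orientation along the first axis and delay along the second; the function names and the Gaussian test time-course are illustrative and not taken from the data.

```python
import numpy as np

def modulation_depth(R):
    """A(tau) = max over theta of R minus min over theta of R, for R[theta, tau]."""
    return R.max(axis=0) - R.min(axis=0)

def half_peak_times(A, delays_ms):
    """Return (tau_dev, tau_peak, tau_dec) for a modulation-depth time-course A."""
    A = np.asarray(A, dtype=float)
    i_peak = int(np.argmax(A))
    half = 0.5 * A[i_peak]
    i_dev = int(np.argmax(A >= half))               # earliest sample at or above half peak
    after = np.flatnonzero(A[i_peak:] <= half)      # samples from the peak on, at or below half
    i_dec = i_peak + int(after[0]) if after.size else len(A) - 1
    return delays_ms[i_dev], delays_ms[i_peak], delays_ms[i_dec]

# Toy time-course peaking at 62 ms, sampled in 1-ms steps
delays = np.arange(200)
A_toy = np.exp(-0.5 * ((delays - 62) / 10.0) ** 2)
tau_dev, tau_peak, tau_dec = half_peak_times(A_toy, delays)
```

With sampled data the half-peak crossings fall between samples, so this sketch reports the first sample on the far side of each crossing; interpolation would give finer estimates.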
If we consider only the data from large stimuli, we observe many of the same features that were noted by Ringach et al. (2003). In cell A data, for instance, at the early time τdev, there is a very broadly tuned excitation of response above baseline at all orientations. Later in the response, there is suppression that has maximal effect at orthogonal orientations and that increases selectivity by suppressing responses to nonpreferred orientations. In cell B, the early excitation is more tuned for orientation than in cell A. In cell B responses, the suppression that reduces the responses to nonpreferred orientations at later times for the large size stimulus seems to be somewhat tuned around the preferred orientation, because the minimum responses are less than the response to the orthogonal-to-preferred angle. In cell A, there is little effect of stimulus size on the pattern of responses: the orientation tuning curves at the three time slices are similar for both large and optimal size stimuli. In cell B, the features of the tuning curves that are consistent with tuned suppression when the stimulus is large are not apparent in the data obtained with the optimal size stimulus. These effects of size are more apparent when we examine the time course of responses to preferred, orthogonal-to-preferred, and minimum orientations, as in Fig. 7.
From dynamical orientation tuning curves like those in Fig. 6 (also see Fig. 3B), we derived the time-courses of the responses to preferred and nonpreferred stimuli. Rpref(τ) denotes the time-course of a cell's responses to the preferred orientation. Rorth(τ) denotes the time-course of a cell's responses to the orientation orthogonal to its preferred orientation. For each time we also find the minimum response (across orientations) and it is designated Rmin(τ). Figure 7, A and B, shows time-courses of Rpref (red curves), Rorth (green curves), and Rmin (blue curves) for the same two cells whose data are shown in Fig. 6. Solid curves are for large size and dashed curves are for the optimal size. For the population average, we lined up each tuning curve by the early half-peak time of its Rpref(τ). The population averages of the three responses are shown separately in Fig. 7C. For most cells, we see an early, broadly tuned enhancement indicated by large positive values of Rpref and also positive values of Rorth. This is followed by a relatively strong, rapid negative shift of Rorth. In cells like cell A in Figs. 6 and 7, there is little change in the time course of the three response curves with stimulus size, and this is indicated in Fig. 7 by the near overlap of the solid and dashed curves for cell A. However, in cells like cell B in Figs. 6 and 7, there is a difference in the time-course of Rmin between large and optimal stimuli, as can be seen in Fig. 7B, in the right panel. The biggest change with size in the population average, shown in the graphs of Fig. 7C, is also in the magnitude of Rmin.
Population average time-courses
Figure 8 shows the population average time courses of responses at preferred, orthogonal, and minimally effective orientations, with the large size condition, on the same time axis. We aligned cells' responses (shown in Fig. 7) at the time when their responses to preferred orientations (red curves in Fig. 7) were one-half of the peak (Rpref, Rorth, and Rmin are shifted together so that the relative position of them is the same as that of the raw data), and averaged the time-courses of their Rpref (red curve), Rorth (green curve), and Rmin (blue curve). Rorth increased at early times (before 45 ms) and then decreased before Rpref reached its peak response. On average, Rorth was suppressed below 0 at later time. The curves plotted in Fig. 8 show that the response to orthogonal orientations changes from enhancement (+) to suppression (−) just around the time that the preferred response reaches its peak. Furthermore, they show that the average orthogonal response is the minimum response up to around the time of the peak response to preferred, but at later times, there is a divergence between orthogonal and minimum, as they both go negative. This means that there must be a minimum response at an orientation closer than 90° to the preferred, on the average, and that it is a suppressive response. However, there is also significant suppression at orthogonal orientations. This is all confirmatory of previous observations (Ringach et al. 2003). The black dashed curve in Fig. 8 is a scaled down version of the red curve, Rpref(τ). We will describe the scaling procedure a little later when we discuss data analysis. Here we wish only to point out that a scaled version of Rpref is a good fit to the rising phase of Rorth and Rmin. This suggests that a similar mechanism might account for all three functions early in the response, as we propose in the data analysis section.
Theory: three-mechanism model
We previously proposed a three-mechanism model to explain orientation tuning dynamics (Ringach et al. 2003). In this model, the time-courses of Rpref, Rorth, and Rmin are determined by a combination of the three different mechanisms: tuned enhancement (or excitation), an untuned constant signal, and tuned suppression. We postulate that the proposed mechanisms overlap in time and orientation, so we need some way to separate them based on analysis of the measured responses.
Here we modified this model, assigning the early, untuned signal as well as the early tuned enhancement to a single excitatory process. Then we interpreted the change in sign of the untuned signal later in the response as the onset of untuned suppression. Tuned suppression was estimated by fitting R(θ, τ) = a(τ)F(θ) − b(τ)G(θ) + c(τ) to the raw data, as in Eq. F1 above, where b(τ) and G(θ) are the time-course and orientation-tuning profile of T(θ, τ) respectively, as in Ringach et al. (2003).
To estimate enhancement and untuned suppression, we assume that R(θ, τ) is a linear combination of tuned enhancement, untuned suppressive, and tuned suppressive components [E(θ, τ), U(τ), and T(θ, τ), respectively] as in Eq. 1 below. To estimate the profile and the dynamics of the enhancement process, E(θ, τ), we adopted the same separability assumption as Ringach et al. (2003), namely that orientation tuning of the excitatory input remained unchanged in shape but was scaled in amplitude with time. In symbols, this means E(θ, τ) = Eθ(θ)Eτ(τ), and we used the normalization max[Eθ(θ)] = 1. The shape of Eθ(θ), which represents the tuning of (putative) feedforward input, was calculated from the response R(θ, τ) at early times, by the following procedure.
Recall (methods) that we fit R(θ, τ) as R(θ, τ) = a(τ)F(θ) − b(τ)G(θ) + c(τ), from Eq. F1. We set the early enhancement Eθ(θ)Eτ(τ) = a(τ)F(θ) + c(τ) for those early times where τ < 50 ms. This is equivalent to the approximation that there is no suppression, tuned or untuned, early in the response. This is a good approximation because the shape of the orientation tuning does not vary with time early in the response, as if a single enhancement process were being measured. Because F is normalized to 1 at its peak, Eτ(τ) = a(τ) + c(τ) for this early process, and the profile of Eθ(θ) is proportional to F(θ) + c(τ)/a(τ); we checked this at a number of values of early time offsets to verify that we got approximately the same orientation tuning function at each time offset. If, as we assume, there is a single orientation tuning curve for enhancement, Eθ(θ), the excitatory component at orthogonal orientations is equal to a scale factor α = Eθ(θorth)/Eθ(θpref) = c(τ)/[a(τ) + c(τ)] multiplied by the peak excitation E(θpref, τ) at each time (Eq. 3). The factor α was estimated from the data as described below.
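Our goal was to analyze the time-courses of Eτ(τ), U(τ), and T(τ), and we needed the following equations for the analysis. Note that E(θ, τ), U(τ), and T(θ, τ) are all positive in the equations, but because U(τ) and T(θ, τ) enter with subtractive operations, they still represent suppression

$$R(\theta,\tau) = E(\theta,\tau) - U(\tau) - T(\theta,\tau) \tag{1}$$

$$R_{\mathrm{pref}}(\tau) = E(\theta_{\mathrm{pref}},\tau) - U(\tau) - T(\theta_{\mathrm{pref}},\tau) \tag{2}$$

$$E(\theta_{\mathrm{orth}},\tau) = \alpha\,E(\theta_{\mathrm{pref}},\tau) \tag{3}$$

$$R_{\mathrm{orth}}(\tau) = \alpha\,E(\theta_{\mathrm{pref}},\tau) - U(\tau) - T(\theta_{\mathrm{orth}},\tau) \tag{4}$$

$$T(\theta,\tau) = b(\tau)\,G(\theta) \tag{5}$$

The time-course of tuned suppression was estimated by the fitted function b(τ) in Eq. 5. Then we use Eqs. 1–5 to derive the time-courses of excitation and untuned suppression. The time course of excitation Eτ(τ) = E(θpref,τ) is estimated by Eq. 6, which is obtained from Eqs. 2, 4, and 5. Untuned suppression time-course was estimated by Eq. 7, also obtained from Eqs. 2, 4, and 5 by substitution

$$E(\theta_{\mathrm{pref}},\tau) = \frac{R_{\mathrm{pref}}(\tau) - R_{\mathrm{orth}}(\tau) + b(\tau)\,[G(\theta_{\mathrm{pref}}) - G(\theta_{\mathrm{orth}})]}{1-\alpha} \tag{6}$$

$$U(\tau) = \frac{\alpha R_{\mathrm{pref}}(\tau) - R_{\mathrm{orth}}(\tau) + b(\tau)\,[\alpha G(\theta_{\mathrm{pref}}) - G(\theta_{\mathrm{orth}})]}{1-\alpha} \tag{7}$$

A bootstrapping method was used to estimate the SE of the three components E, T, and U in the population average curves and the distribution of modulation depth E, T, and U for individual cells. For population average curves, 101 cells were randomly chosen with replacement from our database. We estimated E, T, and U from Eqs. 5–7 as described. We repeated this procedure 1,000 times and calculated the SE of E, T, and U. For each cell, 30 trials were randomly chosen with replacement from the original 30 trials, and each resample was used for reverse correlation and estimation of modulation depth E, T, and U. This procedure was also repeated 1,000 times to get the distribution of estimated values and to do a significance test.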
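To make the algebra concrete, the sketch below solves for the excitation and untuned suppression time-courses under one reading of the linear model described in the text: Rpref = E − U − T(θpref) and Rorth = αE − U − T(θorth), with E denoting excitation at the preferred orientation. These two relations, the function name, and the toy time-courses are assumptions of this illustration; T would come from the fitted tuned-suppression term.

```python
import numpy as np

def decompose(R_pref, R_orth, T_pref, T_orth, alpha):
    """Solve the two assumed linear relations for E(theta_pref, tau) and U(tau).

    Assumed model (a reading of the text, not a quoted formula):
        R_pref = E - U - T_pref
        R_orth = alpha * E - U - T_orth
    """
    R_pref, R_orth = np.asarray(R_pref, float), np.asarray(R_orth, float)
    T_pref, T_orth = np.asarray(T_pref, float), np.asarray(T_orth, float)
    # Subtracting the two relations eliminates U and isolates (1 - alpha) * E
    E = (R_pref - R_orth + T_pref - T_orth) / (1.0 - alpha)
    # Substituting E back into the first relation gives U
    U = (alpha * R_pref - R_orth + alpha * T_pref - T_orth) / (1.0 - alpha)
    return E, U

# Round-trip check on made-up component time-courses
t = np.arange(0, 150, dtype=float)
E_true = np.exp(-0.5 * ((t - 60) / 12.0) ** 2)
U_true = 0.4 * np.exp(-0.5 * ((t - 65) / 12.0) ** 2)
T_pref_true = 0.05 * np.exp(-0.5 * ((t - 85) / 20.0) ** 2)
T_orth_true = 0.20 * np.exp(-0.5 * ((t - 85) / 20.0) ** 2)
alpha = 0.3
R_pref = E_true - U_true - T_pref_true
R_orth = alpha * E_true - U_true - T_orth_true
E_est, U_est = decompose(R_pref, R_orth, T_pref_true, T_orth_true, alpha)
```

The division by 1 − α means the estimates become unstable for cells whose excitation is nearly untuned (α close to 1), which is worth keeping in mind when applying such a decomposition to individual cells.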
The procedure for calculating the α factor is shown in Fig. 8, which shows the procedure with population average data. α was estimated for each neuron by the ratio of the two areas (purple area under green curve, Rorth, and orange area under red curve, Rpref), based on the assumption that the two curves are identical in shape at early times and are simply scaled versions of the same curve. The black dash-dot curve is Rpref rescaled by α. This black curve closely matches the early time-course of Rorth (green curve), consistent with our Eq. 3 and with our working hypothesis that excitation arrives earlier than untuned suppression. Recall that untuned suppression is U in Eqs. 1, 2, and 4, but the earlier, positive, untuned component is c in Eq. F1 or Rorth in Eq. 4. Their relationship is shown in Eq. 4, which also indicates that excitation is separable into a product of orientation tuning Eθ(θ) and time-course Eτ(τ).
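The area-ratio estimate of α can be sketched as follows. The limits of the early window (`t_start`, `t_end`) are hypothetical parameters of this sketch, since the text does not state the exact interval used, and the toy responses are made up so that the orthogonal response is exactly 0.3 times the preferred response at early times.

```python
import numpy as np

def estimate_alpha(R_pref, R_orth, delays_ms, t_start, t_end):
    """Ratio of the areas under R_orth and R_pref over the window [t_start, t_end].

    The window limits are assumptions of this sketch, not values from the text.
    """
    delays_ms = np.asarray(delays_ms, dtype=float)
    early = (delays_ms >= t_start) & (delays_ms <= t_end)
    # With uniformly spaced delays, the ratio of sums equals the ratio of
    # rectangle-rule areas, because the common step size cancels.
    return np.sum(np.asarray(R_orth)[early]) / np.sum(np.asarray(R_pref)[early])

# Toy data: the orthogonal response follows 0.3 * R_pref until 45 ms, then drops
delays = np.arange(0, 100, dtype=float)
R_pref_toy = np.clip(delays - 30.0, 0.0, None) / 30.0
R_orth_toy = np.where(delays <= 45, 0.3 * R_pref_toy, 0.3 * R_pref_toy - 0.02 * (delays - 45))
alpha_hat = estimate_alpha(R_pref_toy, R_orth_toy, delays, t_start=30, t_end=45)
```

Extending the window past the point where suppression sets in would bias the estimate downward, which is why the ratio is restricted to early times.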
Time-courses of the three mechanisms E, U, and T
Using Eqs. 5–7, we can take the orientation dynamics data and estimate the time-courses of the underlying processes, E, U, and T. This was the goal of the preceding analysis, and the results are shown in Fig. 9. There is significant tuned suppression in about two-thirds of the cells tested (70%, 70/101; P < 0.01). Therefore we separated cells into two groups (nonsignificant and significant tuned suppression) and plotted their population average curves of E, U, and T separately: nonsignificant tuned suppression (shown in Fig. 9, A–C) and significant tuned suppression (shown in Fig. 9, D–F). Figure 9 shows the time-courses of estimated excitation at preferred orientation [E(θpref, τ) in Fig. 9, A and D], untuned suppression [U(τ) in Fig. 9, B and E], and tuned suppression [T(τ) in Fig. 9, C and F] for optimal (dashed) and large (solid) sizes. These are population averages, and the gray zone around each curve is ±SE. In Fig. 9, time-courses and profiles of E, U, and T are plotted as positive curves, but note that larger U or T means stronger untuned or tuned suppression.
Consider the important features of these new results on the time-courses of these neural mechanisms. By construction, untuned suppression begins later than the enhancement that arrives presumably from feedforward and local recurrent excitation. Nevertheless, the population averages of untuned suppression rise quickly toward their peak values. As is evident in Fig. 9 and also in Figs. 10 and 11, the time to peak of untuned suppression U(τ) is almost the same as for enhancement E(τ). It is also evident that tuned suppression is slower to rise to its peak.
If one uses Eq. F1, one can derive an expression for the orientation profile of excitation

$$E_{\theta}(\theta) = \frac{a(\tau)F(\theta) + c(\tau)}{a(\tau) + c(\tau)}, \qquad \tau < 50\ \mathrm{ms}$$

The insets in Fig. 9, A and D, show the normalized orientation tuning curves [E(θ)] for tuned enhancement (solid line is for large size, dashed line is for optimal size) averaged across the same populations of neurons for which the timing data are graphed. To calculate the population average tuning curves, all preferred orientations were set to 90°. The average tuning curves show that the tuned enhancement process was nonzero at orthogonal orientations, on the average across the population. The broad tuning of the excitatory process derived here is consistent with theoretical predictions of relatively weak orientation selectivity of the feedforward excitation from the LGN to V1 cells (McLaughlin et al. 2000; Troyer et al. 1998). The inset in Fig. 9F shows the normalized, population average orientation tuning curve of tuned suppression T(θ, τ) for the large size condition.
The population in which tuned suppression was significant has an excitatory component that is more sharply tuned than the population without significant suppression (cf. Fig. 9, D and A). This is similar to our previously reported finding that dynamic sharpening in orientation bandwidth occurred for cells that were more sharply tuned to begin with (Ringach et al. 2003). It is possible that tuned suppression is also present in the group in which we do not observe significant suppression. The bandwidths for excitation and tuned suppression could be so similar for these cells that there are no major changes in the shape of the tuning curve with time that would indicate the presence of tuned suppression.
Excitation at preferred orientation
The individual model components allow us to estimate the importance of the spatial extent of each component separately. We compared the magnitude of the enhancement E(θpref, τ) for the two size conditions across the whole population (Fig. 10A; n = 101). Interestingly, for many cells, the maximum tuned enhancement is larger for the large than for the optimal stimulus size (P < 0.01, Fig. 10A; also see average curves in Fig. 9, A and D). Twenty-seven cells had significantly larger tuned enhancement for the large stimulus size (squares in Fig. 10A, P < 0.01); only three cells had significantly larger tuned enhancement for the optimal stimulus size. This is different from what is usually observed in drifting grating experiments, where cells' responses are usually smaller with large size stimuli (Levitt and Lund 2002; Sceniak et al. 2001). We think this result is a validation that the analysis technique we used isolated the enhancement or excitatory process that simply grows monotonically with stimulus size. The mean peak times of excitation are 60.2 ± 9.5 (SD) ms (Fig. 10B) for the large size condition and 61.6 ± 8.9 ms for the optimal size condition. The observation that there is a small increase in amplitude when the large size stimulus is used suggests that the excitatory region extends beyond the optimal size estimated using drifting grating stimuli of high contrast. Furthermore, the similarity of the peak times for excitation between the two size conditions suggests that large and optimal stimuli are activating the same excitatory process.
We also estimated the magnitude of untuned suppression and its time to peak for the two conditions (Fig. 11). Eleven cells had significantly larger untuned suppression for the large stimulus size (marked as squares in Fig. 11A, P < 0.01) and six cells had significantly larger untuned suppression for the optimal stimulus size (marked as diamonds, P < 0.01). For most cells, the maximum value of untuned suppression is unchanged between the two size conditions. For the whole population, most points are around the diagonal line in Fig. 11A (t-test for pair difference, P > 0.05). Figure 11B shows the distribution of the peak time of untuned suppression for the large size condition (mean, 63.1 ± 10.5 ms). The peak time of untuned suppression with optimal size stimuli has a similar distribution (mean, 65.5 ± 10.0 ms). The similarity of untuned suppression under the two size conditions suggests that this component is mainly due to visual stimulation of a cell's CRF.
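The population comparison above rests on a t-test for pair difference, in which each cell contributes one value per size condition. A minimal sketch of the test statistic follows, assuming paired per-cell peak amplitudes; the function name and the toy values are hypothetical, and the P value would still have to be read from a t distribution with n − 1 degrees of freedom.

```python
import math
from statistics import mean, stdev

def paired_t_statistic(large, optimal):
    """Return (t, degrees of freedom) for paired per-cell measurements.

    Raises ZeroDivisionError if all pairwise differences are identical.
    """
    diffs = [l - o for l, o in zip(large, optimal)]
    n = len(diffs)
    se = stdev(diffs) / math.sqrt(n)  # standard error of the mean difference
    return mean(diffs) / se, n - 1

# Toy peak amplitudes for three hypothetical cells (made-up values).
toy_large = [1.5, 2.5, 4.0]
toy_optimal = [0.5, 0.5, 1.0]
t_stat, dof = paired_t_statistic(toy_large, toy_optimal)
```

A t statistic near zero corresponds to points scattered evenly around the diagonal of a large-versus-optimal scatter plot.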
There is a significant difference for the tuned suppression under the two size conditions: tuned suppression is stronger for larger size stimuli (t-test for pair difference, P < 0.01; and most points are below the diagonal line in Fig. 12A). Fifty-two cells showed a significant change of tuned suppression (P < 0.01), among which 51 cells showed an increase of tuned suppression for large size (squares in Fig. 12A) and only 1 cell showed a decrease for the large size (diamond). The distribution of the peak time of the tuned suppression is shown in Fig. 12B for the large stimulus size. On average, the peak time of the tuned suppression was 77.0 ± 12.2 ms after stimulus onset. Note that this is around 17 ms later than the mean peak time for excitation and 14 ms later than the mean peak time for untuned suppression.
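The relative timing of the three components follows directly from the mean peak times reported for the large size condition. A short sketch of the arithmetic, using only the values given in the text (the dictionary keys are labels chosen here for illustration):

```python
# Mean peak times (ms) for the large size condition, as reported in the text.
peak_ms = {"E": 60.2, "U": 63.1, "T": 77.0}

lag_T_vs_E = round(peak_ms["T"] - peak_ms["E"], 1)  # tuned suppression vs. excitation
lag_T_vs_U = round(peak_ms["T"] - peak_ms["U"], 1)  # tuned vs. untuned suppression
lag_U_vs_E = round(peak_ms["U"] - peak_ms["E"], 1)  # untuned suppression vs. excitation

print(lag_T_vs_E, lag_T_vs_U, lag_U_vs_E)  # 16.8 13.9 2.9
```

These are differences of population means and say nothing about the cell-by-cell spread, which is given by the SDs quoted with each mean.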
There is considerable diversity across the population in the orientation dynamics data, as indeed has been observed before in the steady-state data (Ringach et al. 2002b). Therefore, for completeness, we report that some cells (15/101) do not have significant untuned suppression. They show a similar time-course for Rpref and Rorth but rescaled. We also see some cells (11/101) without any response at the orthogonal orientation. All results above hold for both simple and complex cells, and we did not see a significant difference between the simple and complex groups.
Laminar variation of orientation selectivity could be a clue to mechanisms because functional connectivity varies in different V1 layers. We separated cells into layers 2/3 (28/88), 4B (15/88), 4C (17/88), 5 (12/88), and 6 (16/88) cells based on histological reconstructions and compared the mean peak time and mean peak amplitude of excitation (Fig. 13, A and B), untuned suppression (Fig. 13, C and D), and tuned suppression (Fig. 13, E and F) under the two size conditions across different layers. For the graph and calculation of the peak times of untuned suppression, we only included cells with peak untuned suppression significantly larger than its untuned suppression at time 0 [layers 2/3 (14/58), 4B (13/58), 4C (11/58), 5 (8/58), 6 (12/58)]. For the peak time of tuned suppression, we only chose cells with tuned suppression significantly larger than tuned suppression at time 0 [layers 2/3 (19/59), 4B (8/59), 4C (12/59), 5 (9/59), 6 (11/59)]. Therefore the total numbers of cells in each of these laminar comparisons of peak times are not the same as the number of cells in the entire database.
There are interesting differences between the layers in the values of peak excitation, untuned suppression, and tuned suppression, but here we are focused on the effects of stimulus size on these mechanisms. There is little effect of stimulus size on the time course of the estimated functions and on most of the peak amplitude values, although differences in the peak amplitude of excitation and untuned suppression are noticeable in the layer 5 and layer 6 data. The most striking effect of stimulus size is again on the amplitude of tuned suppression; there is a significant difference between large and optimal in all layers (P < 0.01 for all layers). The details of the laminar analysis are given in Table 1.
As shown earlier, in Figs. 4–5, orientation modulation depth depends on stimulus size, being systematically larger for larger size stimuli. In seeking to understand the causes of this greater selectivity, we analyzed the relationship between the change of peak modulation depth with stimulus size versus the change with size of untuned suppression (Fig. 14A) and versus the change with size of tuned suppression (Fig. 14B). The change of tuned suppression is moderately correlated with the change of modulation depth (r = 0.54, P < 0.001), whereas the change of untuned suppression with stimulus size is only weakly correlated with the change of modulation depth with size (r = 0.28, P = 0.005). The implication is that large size stimuli increase orientation selectivity partly because of increased tuned suppression.
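The comparison above amounts to a correlation between two per-cell size effects. A minimal sketch follows, assuming that the change with size is the large-minus-optimal difference for each cell and that r is the Pearson coefficient; both are assumptions made for illustration, and the toy values are made up.

```python
import math

def size_change(large, optimal):
    """Per-cell change with stimulus size (large minus optimal)."""
    return [l - o for l, o in zip(large, optimal)]

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Toy values for four hypothetical cells (not from the paper).
d_mod = size_change([0.9, 0.7, 0.8, 0.6], [0.6, 0.6, 0.5, 0.5])  # modulation depth
d_T = size_change([0.5, 0.2, 0.6, 0.3], [0.2, 0.2, 0.2, 0.2])    # tuned suppression
r = pearson_r(d_mod, d_T)
```

Applying the same function to the change in untuned suppression gives the second coefficient, so the two mechanisms can be compared on equal footing.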
One major result of our previous dynamics experiments was that suppression of nonoptimal responses was highly correlated with global measures of selectivity for both orientation and spatial frequency (Bredfeldt and Ringach 2002; Ringach et al. 2002a). This result implied that the neuronal processes in the visual cortex that cause suppression of nonoptimal responses contribute to the selectivity for orientation and spatial frequency in the cortical cells' responses to dynamical stimuli (see also Xing et al. 2004). Another result of the previous dynamics experiments suggested that many V1 cells have tuned suppression.
Here we have studied the effect of stimulus size on excitation, untuned suppression, and tuned suppression. We set up a three-component (tuned enhancement, untuned suppression, and tuned suppression) model to interpret the dynamics of V1 cells' orientation tuning. In reality, there might be more than three mechanisms involved in the dynamics of a V1 cell's orientation tuning. More work needs to be done to link these different components to other physiology results and specific circuit mechanisms. However, the different size dependences of the different kinds of suppression support our dissection of suppression into distinct subprocesses. Our results imply that the presence of spatial context significantly modulates the orientation selectivity of V1 neurons. The principal contributor to this effect is tuned suppression because this is the neuronal mechanism that is most affected by stimulus size. A substantial part of the tuned suppression signal appears to arise from the region outside the CRF, the nonclassical surround (Allman et al. 1985).
Classical receptive field
Study of the time-course of the excitatory process reveals an important feature of spatial interactions in V1. In response to gratings flashed briefly (20 ms) within the rapid stimulus sequence as shown in Fig. 1, at early times Rpref and the excitation at preferred orientation were somewhat larger in amplitude for stimuli of large size than they were for optimally sized stimuli (cf. Figs. 7 and 9, A and D). This result holds for most cells, whether the cell had strong steady-state surround suppression or not. It suggests that a stimulus larger than a cell's optimal size can still have an excitatory effect on the cell that is larger than that of the “optimal” stimulus and that excitation does not get weaker when the stimulus size is increased. This is very different from what one observes in cells' responses to drifting gratings, where stimuli larger than optimal usually elicit a weaker response than the optimal size stimulus. This new result of the analysis of the dynamics indicates that surround inhibition is what causes the weakening of responses to large stimuli. That we only see this effect in experiments in which the stimulus is flashed might be simply because in the drifting grating experiments the response is a mixture of excitatory and inhibitory effects of the stimulus that overlap in time. It is possible to dissect apart excitatory and suppressive processes more easily in the responses to dynamical stimuli. This result also suggests that the optimal size measured by using a drifting grating at high contrast is likely to be smaller than the actual spatial summation region of the CRF, presumably because inhibition makes the summation region appear smaller than it really is. Angelucci et al. (2002) have suggested that V1 horizontal connections might contribute to the spatial summation region of a cell's receptive field center, and that the center's summation area might overlap the region of the same cell's much larger suppressive surround.
The results on the inferred orientation tuning profile of the enhancement process, displayed in Fig. 9, A and D, are revealing about mechanisms of orientation tuning. The broad orientation tuning of excitation shown in Fig. 9, A and D with significant positive responses at orthogonal orientations supports the conclusion that feedforward excitation provides only fairly coarse orientation selectivity that must be sharpened by cortical inhibition (McLaughlin et al. 2000; Sompolinsky and Shapley 1997; Troyer et al. 1998). There is little evidence in our data for the augmentation of orientation selectivity by orientation-tuned cortico-cortical excitation, as postulated in the models of Ben-Yishai et al. (1995) and Somers et al. (1995) among others. One would expect cortico-cortical excitation to appear as a somewhat delayed additional excitatory process, but such a fourth dynamical process was not required to fit our data for either size condition.
Our present results may help to clear up an apparent discrepancy between our earlier work (Ringach et al. 1997, 2003) and the paper by Gillespie et al. (2001), who reported only tuned excitation and untuned suppression in intracellular experiments in cat V1. However, Gillespie et al. (2001) designed their experiments so that their stimuli were confined to the CRF. Under such stimulus conditions, we also observed little tuned suppression, so there is no disagreement between us and Gillespie et al. on this issue.
We dissected out the neuronal process we called untuned suppression based on a descriptive model. The untuned suppression, estimated according to the additive model expressed in Eqs. 1–7, is rapid (its mean peak time is 2.9 ms later than the peak time of the excitation). In a few cells, the peak amplitude of untuned suppression increased with stimulus size, but no significant change was found across the population under the two size conditions. When we stimulated a cell with a stimulus of optimal size (0.45° radius on average in our data), we most likely activated a compact region of V1 (Tootell et al. 1988; Van Essen et al. 1984). This patch in the primary visual cortex corresponds to the cell's local neighborhood in V1 (Angelucci et al. 2002). That we see a strong untuned suppression even with a stimulus of optimal size suggests that the untuned suppression mainly comes from the center mechanism and the local circuitry. This is consistent with the recent anatomical findings (Angelucci et al. 2002; Lyon et al. 2003) that a V1 cell gets most of its inhibitory synaptic input from a local area in the cortex of approximate diameter of 100–200 μm. The untuned suppression exists in all layers as well as in simple and complex cell groups. This suggests that untuned suppression is a general mechanism in primary visual cortex (Bredfeldt and Ringach 2002; Ringach et al. 2002a,b; Shapley et al. 2003; Xing et al. 2004). Broadly tuned cortico-cortical inhibition that arises locally in the cortical circuitry is the likely source of the untuned suppression we have measured (McLaughlin et al. 2000; Tao et al. 2004; Troyer et al. 1998). There are other candidate mechanisms for untuned suppression in V1 including synaptic depression (Carandini et al. 2002).
The fact that untuned suppression is stronger in layers 4B and 5 than in the main thalamo-recipient layers (layers 4C and 6) suggests that the untuned suppression is mainly from cortico-cortical effects instead of from thalamo-cortical effects. Furthermore, the untuned suppression we measured had short persistence, whereas rapid synaptic depression has a 200- to 600-ms recovery time (Abbott et al. 1997). The time-course of untuned suppression is thus unlike what has been assumed for synaptic depression (e.g., Carandini et al. 2002). Therefore a likely possibility is that fast cortical inhibition is the source of the untuned suppression.
We also observed tuned suppression for many cells. In most cases, the magnitude of tuned suppression increases with stimulus size. When we stimulated a cell with a stimulus two to five times the radius of the CRF (2° on average in our data), not only the local area in V1 around the cell but also many hypercolumns outside the hypercolumn where the cell was located were activated. Therefore under the large size condition, V1 cells in a large area are activated. A larger stimulus may provide more effective drive for receptive fields of neurons in extrastriate visual cortex that are known to have progressively larger receptive fields (Angelucci and Bullier 2003). The slower time-course of tuned suppression (its mean peak time is roughly 14 to 17 ms later than those of untuned suppression and excitation) and its size-dependence might be due to long distance connections in primary visual cortex, feedback from extrastriate visual cortex, or a combination of both. It has been suggested that long distance connections are orientation-specific in such a way that cells with similar orientation preference but located in different hypercolumns tend to connect to each other (Ts'o et al. 1986). Feedback from extrastriate visual cortex to V1 can also be orientation specific (Angelucci et al. 2003). Even though the direct synaptic connection might be excitatory, the effect of long distance connections or feedback could be net-suppressive in a local region via inhibitory cells. It will be important to determine the cellular and biophysical basis of tuned suppression to understand the effects of spatial context on orientation selectivity.
This work was supported by National Eye Institute Grants EY-01472, EY-08300, EY-12816, and Core Grant EY-P031-13079.
We thank E. Johnson, J. A. Henrie, P. Williams, and S. Joshi for help with experiments and M. Jazayeri for helpful discussion.
- Copyright © 2005 by the American Physiological Society