This study characterizes the spatial organization of excitation and inhibition that influences the visual responses of neurons in macaque monkey's primary visual cortex (V1). To understand the spatial extent of excitatory and inhibitory influences on V1 neurons, we performed area-summation experiments with suprathreshold contrast stimulation. The extent of spatial summation and the magnitude of surround suppression were estimated quantitatively by analyzing the spatial summation experiments with a difference of Gaussians (DOG) model. The average extent of spatial summation is approximately the same across layers except for layer 6 cells, which tend to sum more extensively than cells in the other layers. On average, the extent of length and width summation is approximately equal. Across the population, surround suppression is greatest in layer 4B and weakest in layer 6. Estimates of summation and suppression are compared for the DOG (subtractive) model and a normalization (divisive) model. The two models yield quantitatively similar estimates of the extent of excitation and inhibition. However, the normalization (divisive) model predicts weaker surround strength than the DOG model.
Spatial summation in primary visual cortex
This paper is a report of experimental measurements of the visual spatial properties of a population of neurons in macaque primary visual cortex (V1) studied with high-contrast visual patterns. With increasing evidence that receptive fields of individual V1 neurons depend strongly on the overall network activity (Allman et al. 1985; Blakemore and Tobin 1972; Kapadia et al. 1995; Levitt and Lund 1997a; Maffei and Fiorentini 1976; Nelson and Frost 1978; Polat et al. 1998; Walker et al. 1999), it is important to know the spatial limits for excitation and inhibition. This is particularly relevant for understanding how a cell will respond when the receptive field is stimulated with complex stimuli that impinge on the classical receptive field as well as the suppressive surround. Because there are few previous studies of the spatial extent of excitation and inhibition in primate V1 (Born and Tootell 1991; Dow et al. 1981; Schiller et al. 1976a; Snodderly and Gur 1995; von der Heydt 1992), our goal was to obtain quantitative data about spatial properties for a population of V1 neurons.
Our experiments also aimed to characterize the spatial properties of neurons in the different cell layers of V1. There are pronounced differences in receptive field properties across cortical layers in Old World monkeys (Blasdel and Fitzpatrick 1984; Dow 1974; Hawken and Parker 1984; Hawken et al. 1988; Hubel and Wiesel 1968, 1977; Leventhal et al. 1995; Livingstone and Hubel 1984; Sato et al. 1996). Some studies that have measured the extent of receptive fields in V1 have combined cells across all cortical layers (Dow et al. 1981) while others give relative estimates of size across layers (Hubel and Wiesel 1977; Schiller et al. 1976a). Some studies report that the smallest receptive fields are found among the nonoriented cells in layer 4Cβ (Livingstone and Hubel 1984) and 4A (Blasdel and Fitzpatrick 1984), while other studies report the smallest fields are in layers 4B and 5 (Snodderly and Gur 1995). The latter study also reports that receptive fields in the input layers (4A, 4C, and 6) have a diversity of receptive field (RF) widths. No study has reported systematic measures of the spatial extent of the classical receptive field and the suppressive surround among cells in different layers.
Our approach to studying spatial properties of visual neurons differs from earlier receptive-field mapping techniques. Early studies in cat visual cortex estimated receptive field size by mapping the minimum response field (Gilbert 1977; Henry et al. 1978; Hubel and Wiesel 1962). These studies estimated receptive field size by placing light or dark bars across the receptive field and measuring the minimum area of visual field necessary to evoke an increase in the neuronal discharge. For a linear system, this method will yield an accurate estimate of the total integration area of the receptive field. However, because of neuronal response thresholds, small responses from stimuli in the periphery of the receptive field far from the center will not be revealed with this method (DeAngelis et al. 1992). In addition, mapping the minimum response field is insufficient for mapping the strength and extent of surrounding inhibition. Because surround suppression is known to be a prevalent feature of receptive fields in monkey V1 (Dow et al. 1981; Schiller et al. 1976a; Sillito et al. 1995), it is important to use a method that accurately describes both excitation and inhibition.
Our approach is to study spatial summation with spatially extensive stimuli that the neuron integrates to give a suprathreshold response. DeAngelis et al. (1994) were the first to perform a systematic analysis of the spatial properties of neurons in cat visual cortex with such spatially extensive stimuli. By using drifting sinusoidal gratings extended along either the receptive field's length or width, they were able to make accurate estimates of excitation and inhibition while minimizing the effects of response thresholds. We use this technique to measure area-summation as well as length and width summation and also the amount of surround suppression.
Surround suppression in primary visual cortex
Previously, studies of modulation from regions beyond the excitatory “classical” receptive field focused on suppressive effects along the receptive field ends and sides (Bodis-Wollner et al. 1976; Bolz and Gilbert 1986; Born and Tootell 1991; De Valois et al. 1985; Dreher 1972; Foster et al. 1985; Gilbert 1977; Hubel and Wiesel 1965; Maffei and Fiorentini 1976; Orban and Vandenbussche 1979; Orban et al. 1979; Rose 1977; von der Heydt et al. 1992; Yamane et al. 1985). Hubel and Wiesel (1965) coined the term hypercomplex to describe cells that exhibit end suppression or end-stopping. The degree of side versus end suppression has not been investigated systematically in primate V1, and this is one major issue addressed by our experiments. Our estimates of side and end suppression were analyzed with respect to cortical layers to determine anatomical trends in these response properties (Levitt and Lund 1997b; Schiller et al. 1976).
The area-summation profiles derived from experiments done with drifting sine wave gratings in cat visual cortex are well described by a difference of integrated Gaussians (DeAngelis et al. 1994). In this model, the center Gaussian corresponds to the excitatory or classical receptive field while the surround Gaussian corresponds to the suppressive component of the receptive field. The excitatory or classical receptive field of simple cells has been modeled as either a Gabor filter or the difference of overlapping Gaussians (Daugman 1985; Field and Tolhurst 1986; Hawken and Parker 1987; Jones and Palmer 1987; Marcelja 1980; Stork and Wilson 1990). It is the envelope of these models that corresponds to the center Gaussian in the DeAngelis et al. model. Complex cells' overall sensitivity profiles can also be approximated with Gaussian spatial profiles (Ohzawa et al. 1990). For either simple or complex cells, the extent of spatial summation can be taken directly from the excitatory space constants from fits with the DOG model.
Studies of modulation from beyond the “classical” receptive field have an implicit assumption that the classical excitatory receptive field and the surround are distinct regions separated in visual space. Rather than separate regions that correspond to the classical receptive field (CRF) and regions beyond the CRF, the DOG model treats both regions as one response unit with spatially overlapping antagonistic Gaussian-shaped mechanisms. In this formulation, facilitation from surround context might reflect input from the tail of the central excitatory Gaussian mechanism. Similarly, “contextual” suppression might be related to the inhibitory surround Gaussian mechanism that is not restricted to the periphery but might peak in the center along with the excitatory Gaussian mechanism.
The DOG model proposes that excitation and inhibition interact in an additive manner. Other studies have shown that cortical inhibition might act through a divisive rather than a subtractive mechanism (Albrecht and Geisler 1991; Bonds 1989; Robson et al. 1988). A number of recent studies have considered a model of cortical processing that achieves suppression through division rather than subtraction (normalization model) (Carandini and Heeger 1994; Carandini et al. 1997; Heeger 1992; Nestares and Heeger 1997; Simoncelli and Heeger 1998; Tolhurst and Heeger 1997a,b). A computational model by Somers et al. (1998) described the cortex as a neural network of center-surround units. Inhibition in this model was also divisive in nature. To compare these two types of interaction, we fit our summation data with the center-surround DOG model and a center-surround normalization model. The important issue that we addressed is whether the estimates of the center and surround Gaussian sizes differ significantly depending on the type of interaction between the center and the surround mechanisms. We compared the estimated spatial properties obtained from the divisive model to the spatial properties from the subtractive (DOG) model. Good agreement was found between the spatial parameters derived from the two models.
Standard electrophysiological recording techniques were used in acute preparations of macaque monkeys (Hawken et al. 1988, 1996; Kaplan and Shapley 1982). Extracellular action potentials were collected from isolated single neurons using extracellular microelectrodes. Visual stimuli and spike arrival times were synchronized under computer control. Spikes were analyzed both during experiments and off-line using standard software packages and custom software written specifically for this purpose. Details of the procedures used in the experiments and the data analysis are given in the following text.
Acute experiments were performed on adult Old World monkeys (Macaca fascicularis) in strict compliance with the guidelines for humane care and use of laboratory animals published by National Institutes of Health and Public Health Service. Animals were initially tranquilized with acepromazine (50 μg/kg im). After administering the tranquilizer (∼15 min), we anesthetized the animal with ketamine (10 mg/kg im). Additional ketamine was given as needed during the initial phase of surgery. Venous cannulation and tracheotomy were carried out under ketamine, then we transferred to an opioid anesthetic sufentanyl (sufentanyl citrate, 6 μg · kg−1 · h−1 iv) and maintained anesthesia throughout the experiment with sufentanyl. A broad spectrum antibiotic (Bicillin, 50,000 IU/kg im) and antiinflammatory steroid (dexamethasone, 0.5 mg/kg im) were given at the initial surgery and every other day during the recording period. Anesthesia level was determined by analysis of the electroencephalographic (EEG) waveform, heart rate, blood pressure, and CO2 output. Anesthetic state was judged to be satisfactory if there was predominant slow wave EEG activity and if potentially mildly noxious stimuli produced no change in EEG, heart rate, or blood pressure. Expired CO2 was maintained close to 5%. Rectal temperature was monitored and kept at a constant 37.5°C. A small craniotomy was performed over the striate cortex for recording. Anesthesia was administered throughout the recording period with sufentanyl (6 μg · kg−1 · h−1 iv), and paralysis was induced with pancuronium bromide (0.1 mg · kg−1 · h−1 iv). The anesthetic and paralytic were administered in balanced physiological solution at a rate to maintain fluid volume of 5–10 ml · kg−1 · h−1 . Experiments were terminated by intravenous injection of the animal with a lethal dose of pentobarbital (60 mg/kg). 
The animal was then perfused through the heart with a mixture of heparinized saline followed by 2 l of fixative (4% paraformaldehyde in phosphate buffer, pH 7.4) for later histological reconstruction. Histological reconstruction was performed using the same methods as described in Hawken et al. (1988, 1996).
The eyes were initially treated with 1% atropine sulfate solution to dilate the pupils. The eyes were protected by gas-permeable contact lenses. Prior to adding the lenses, the eyes were treated with a topical antibiotic (gentamicin sulfate, 3%). Foveae were mapped onto a tangent screen using a reversing ophthalmoscope (Eldridge 1979). The visual receptive fields of isolated neurons were then mapped on the same tangent screen, keeping reference to the foveae. Proper refraction was achieved by placing corrective lenses mounted in front of the eyes on custom-designed lens holders. Refraction adjustments were made during the recording session by stimulating a responsive cell with a grating composed of a spatial frequency near the cutoff frequency. The lens power was adjusted to produce a maximal response.
The electrode was advanced through the gray matter via a stepping motor (1-μm step size) mounted to a microdrive (Narishige, Japan). Single-unit recordings were made with glass-coated tungsten microelectrodes with exposed tips of 5–15 μm (Merrill and Ainsworth 1972). The signal was amplified using a Dagan (Minneapolis, MN) EX4–400 differential amplifier and band-pass filtered (0.1–10 kHz). This analog signal was then sent to an A/D signal processing board of a digital computer (SGI). Spikes were discriminated and time-stamped using software custom designed for this purpose and running on a Silicon Graphics O2 computer. Single spikes were isolated from the recording using tailored waveform windowing. Spikes were time-stamped with an accuracy of 1 ms. Strict criteria for single-unit recording included fixed shape of the action potential and the absence of spikes during the absolute refractory period.
The visual stimuli were generated on either a Silicon Graphics Elan R4000 or a Silicon Graphics O2 R5000 computer. In the Elan configuration, the screen (Barco) measured 34.3 cm wide and 27.4 cm high, and the resolution was 1,280 × 1,024 pixels. The refresh rate of the monitor was 60 Hz, and its mean luminance was 56 cd/m2. In the O2 configuration, the screen (Sony Multiscan 17se II color monitor) measured 31.4 cm wide and 23.5 cm high, and the resolution was 800 × 600 pixels operating at 100 Hz frame refresh. The mean luminance of the display was 53 cd/m2. The screen viewing distance was 115 cm for both setups.
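The screen dimensions and viewing distance given above determine how much of the visual field each display covers. The following is a minimal sketch of that conversion, assuming a flat screen centered on the line of sight; the function name and the dictionary layout are illustrative and not part of the original methods.

```python
import math

def visual_angle_deg(size_cm, distance_cm):
    """Full visual angle (deg) of an extent centered on the line of sight."""
    return 2.0 * math.degrees(math.atan((size_cm / 2.0) / distance_cm))

VIEWING_DISTANCE_CM = 115.0  # viewing distance stated in the text

# (width_cm, height_cm, horizontal_pixels) for the two setups in the text
setups = {
    "Elan/Barco": (34.3, 27.4, 1280),
    "O2/Sony": (31.4, 23.5, 800),
}

for name, (w_cm, h_cm, w_px) in setups.items():
    w_deg = visual_angle_deg(w_cm, VIEWING_DISTANCE_CM)
    h_deg = visual_angle_deg(h_cm, VIEWING_DISTANCE_CM)
    # Average pixel density across the screen width (small-angle approximation)
    print(f"{name}: {w_deg:.1f} x {h_deg:.1f} deg, ~{w_px / w_deg:.0f} pixels/deg")
```

Under these assumptions the two displays span roughly 16 to 17° horizontally and 12 to 14° vertically.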
Optimal drifting sinusoidal gratings
Each cell was stimulated monocularly through the dominant eye with the nondominant eye occluded. Receptive fields were located at eccentricities between 2 and 5°. Using a small circular patch of drifting grating (mean luminance 53–56 cd/m2), we estimated the center location of the receptive field by listening for an elevation in the mean firing rate. This was done by hand under computer mouse control. Then we measured tuning for orientation, spatial frequency, temporal frequency, and contrast to get the optimal parameters for the area-summation experiments. A contrast response function was obtained for each neuron. Ten different luminance contrasts were tested ranging from 2 to 90% in logarithmic steps. Stimulus contrasts were tested in sequential order from low (2%) to high (90%) contrast. Each stimulus was presented for a total of 4 s and repeated twice. Contrast was defined as $\mathrm{Contrast} = (L_{\max} - L_{\min})/(2L_{\mathrm{mean}})$.
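As an illustration of the contrast series described above, the sketch below generates ten logarithmically spaced contrasts between 2 and 90% and evaluates the contrast definition for a sinusoidal grating. The helper names are hypothetical, and the mean luminance of 53 cd/m2 is taken from the display description.

```python
def log_spaced(lo, hi, n):
    """n values from lo to hi with a constant ratio between neighbors."""
    ratio = (hi / lo) ** (1.0 / (n - 1))
    return [lo * ratio ** i for i in range(n)]

def contrast(l_max, l_min, l_mean):
    """Contrast as defined in the text: (Lmax - Lmin) / (2 * Lmean)."""
    return (l_max - l_min) / (2.0 * l_mean)

def grating_extremes(c, l_mean):
    """Peak and trough luminance of a sinusoidal grating of contrast c."""
    return l_mean * (1.0 + c), l_mean * (1.0 - c)

# Ten contrasts from 2% to 90%, presented from low to high
contrasts = log_spaced(0.02, 0.90, 10)

for c in contrasts:
    l_max, l_min = grating_extremes(c, 53.0)
    print(f"contrast {100 * c:5.1f}%  Lmax {l_max:6.1f}  Lmin {l_min:6.1f} cd/m2")
```

For a grating whose mean luminance is the average of its peak and trough, this definition coincides with Michelson contrast.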
All of the area-summation experiments were conducted using drifting sinusoidal gratings with the optimal stimulus parameters. The center of the receptive field was carefully located using a small (0.2° diam) circular grating patch. Once the center was located, circular patches of drifting sinusoidal grating were presented, centered over the receptive field. Each grating patch size was presented for 4 s. Four-second blanks of the same mean luminance as the grating stimuli were presented interleaved with grating stimuli to determine the spontaneous firing rate and to avoid response adaptation. The patch sizes were presented in a random order. The radius ranged from 0.1 to 5° of visual angle in logarithmic steps. Each summation curve consisted of 10 radii with two repeats at each size. Contrast levels were held constant during repeats to avoid effects of adaptation. Outside each patch, the rest of the screen (12 × 17° visual angle) was kept at the mean luminance of 53 cd/m2.
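The presentation sequence for one area-summation curve can be sketched as follows. The text specifies ten logarithmically spaced radii, two repeats, random order, 4-s presentations, and interleaved blanks; shuffling separately within each repeat, placing one blank after every grating, and the fixed seed are assumptions made here for illustration.

```python
import random

def log_spaced(lo, hi, n):
    """n values from lo to hi with a constant ratio between neighbors."""
    ratio = (hi / lo) ** (1.0 / (n - 1))
    return [lo * ratio ** i for i in range(n)]

def build_schedule(radii, repeats=2, seed=0):
    """Randomized grating presentations, each followed by a blank (assumed layout)."""
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    schedule = []
    for _ in range(repeats):
        block = list(radii)
        rng.shuffle(block)  # random order within each repeat (assumption)
        for r in block:
            schedule.append(("grating", r))
            schedule.append(("blank", None))  # mean-luminance screen
    return schedule

PRESENTATION_S = 4.0                   # duration of each grating or blank
radii = log_spaced(0.1, 5.0, 10)       # patch radii in degrees
schedule = build_schedule(radii)
total_s = PRESENTATION_S * len(schedule)
print(f"{len(schedule)} presentations, {total_s:.0f} s in total")
```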
The contrast level was chosen such that the given contrast elicited a response that was near but less than 90% of the saturating response. Saturating response was determined to be the point where increasing the contrast produced no increase in the response rate.
The summation experiment was repeated using rectangular patches that extend independently in the length or width dimension. The patch length was varied randomly in the same manner described in the preceding text for the circular patch summation experiments. Then we conducted a similar experiment by varying the width in a similar random fashion. During the length or width summation experiments, the parameter that did not vary was kept at the optimal value obtained from the area-summation experiment.
Each summation curve was fitted using the following empirical function

$$R(s) = R_0 + K_e \int_{-s/2}^{s/2} e^{-(2y/a)^2}\,dy - K_i \int_{-s/2}^{s/2} e^{-(2y/b)^2}\,dy$$

Here, R0 is the spontaneous rate, and each integral represents the relative contribution from putative excitatory and inhibitory components, respectively (DeAngelis et al. 1994). Values of Ke, a, Ki, and b were optimized to provide the least mean squared error (MSE) to the data. Excitatory space constant measures are taken as the parameter a from the fitted curves for the first harmonic response of simple cells and the DC response of complex cells. A suppression index (SI) measure was also estimated from the fitted curves. This measure is the ratio of the area under the inhibitory Gaussian to that under the excitatory Gaussian

$$\mathrm{SI} = \frac{K_i b}{K_e a}$$

We also considered a normalization model of the following form where Lc and Ls are linear responses estimated from the integral of a Gaussian profile similar to the DOG model discussed in the preceding text. The center and surround gains are kc and ks, respectively, C corresponds to the stimulus contrast, and β is an arbitrary exponent. The normalization model uses divisive rather than subtractive inhibition. Otherwise both models are composed of spatially overlapping Gaussian subunits for excitation and inhibition. The excitatory and inhibitory space constants for the DOG model and the normalization model (a, ςc, b, and ςs, respectively) and the gains (Ke, kc, Ki, and ks) are given unique labels so as not to confuse estimates from the two models.
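A difference of integrated Gaussians of this kind can be evaluated in closed form with the error function. The sketch below assumes Gaussian profiles of the form exp(-(2y/a)^2) integrated over the stimulus extent, in the style of DeAngelis et al. (1994); that profile, the function names, and the parameter values are illustrative assumptions, not fitted results.

```python
import math

def gaussian_integral(s, space_const):
    """Integral of exp(-(2y/c)^2) for y from -s/2 to s/2 (assumed profile)."""
    if space_const <= 0:
        return 0.0
    return 0.5 * space_const * math.sqrt(math.pi) * math.erf(s / space_const)

def dog_response(s, r0, ke, a, ki, b):
    """Spontaneous rate plus excitatory integral minus inhibitory integral."""
    return r0 + ke * gaussian_integral(s, a) - ki * gaussian_integral(s, b)

def suppression_index(ke, a, ki, b):
    """Area under the inhibitory Gaussian over that of the excitatory one."""
    return (ki * b) / (ke * a)

# Hypothetical parameters, chosen only to show a suppressed summation curve
params = dict(r0=2.0, ke=100.0, a=1.0, ki=30.0, b=2.2)

for s in (0.1, 0.5, 1.5, 3.0, 10.0):
    print(f"extent {s:5.1f} deg -> {dog_response(s, **params):6.1f} spikes/s")
print("SI =", suppression_index(params["ke"], params["a"], params["ki"], params["b"]))
```

With these values the response rises, peaks at an intermediate extent, and then declines toward a lower plateau as the broader inhibitory Gaussian is recruited.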
All fitting procedures were done with the MATLAB optimization toolbox, using the constrained nonlinear minimization functions CONSTR and FMINCON to minimize the squared error. Cells were classified as simple if the ratio of the first harmonic response to the DC response was greater than 1, indicating modulation at the fundamental frequency (De Valois et al. 1982; Movshon et al. 1978a,b; Skottun et al. 1991). Complex cells showed a ratio less than or equal to 1.
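The simple/complex criterion can be illustrated with a short calculation of the first harmonic (F1) and mean (DC) response from a binned firing-rate record. This is a sketch under stated assumptions: the record spans a whole number of grating cycles, no spontaneous rate is subtracted, and the function names and synthetic responses are hypothetical.

```python
import cmath
import math

def f1_dc_ratio(rates, n_cycles):
    """Ratio of first-harmonic amplitude to mean rate for a binned response."""
    n = len(rates)
    dc = sum(rates) / n
    if dc <= 0:
        raise ValueError("mean rate must be positive")
    # Fourier coefficient at the stimulus temporal frequency
    coeff = sum(r * cmath.exp(-2j * math.pi * n_cycles * i / n)
                for i, r in enumerate(rates))
    f1 = 2.0 * abs(coeff) / n
    return f1 / dc

def classify(rates, n_cycles):
    """Simple if F1/DC > 1, complex otherwise (criterion from the text)."""
    return "simple" if f1_dc_ratio(rates, n_cycles) > 1.0 else "complex"

N_BINS, N_CYCLES = 400, 4  # hypothetical: 4 s record, 4 grating cycles
phase = [2 * math.pi * N_CYCLES * i / N_BINS for i in range(N_BINS)]

# Half-wave rectified modulation, typical of a simple-cell-like response
simple_like = [max(0.0, 40.0 * math.sin(p)) for p in phase]
# Elevated but weakly modulated rate, typical of a complex-cell-like response
complex_like = [20.0 + 5.0 * math.sin(p) for p in phase]

print(classify(simple_like, N_CYCLES), classify(complex_like, N_CYCLES))
```

A half-wave rectified sinusoid has an F1/DC ratio close to π/2, which is why strongly modulated responses fall above the criterion of 1.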
Spatial summation characterization
We obtained a full set of data on 138 neurons where all of our selection criteria were met. Electrode penetrations were made consistently into the region of the striate cortex that represents the visual field at 2–5° retinal eccentricity. This allows us to make statements about the spatial scale of units within a fairly well constrained region of the retinotopic map of V1. Spatial frequency, temporal frequency, and orientation of the gratings were optimized for each neuron through the use of the standard set of experiments outlined in methods. Then area-summation curves were collected. Drifting sinusoidal gratings were presented in a circular aperture centered over the receptive field, and the aperture area varied as described in methods. For a subset of the total population (n = 50), we collected summation tuning curves for the length or width of the receptive field. The response was measured as a function of the patch radius, half-length, or half-width.
Summation profiles display a stereotypical form. All cells show an initial increase in responsiveness as the size of the patch is increased from the smallest value (0.1°). After hitting a maximum, some cells display response suppression while others' responses asymptote. The optimal radius of summation, taken either as the radius at maximum response or at 95% of the maximum for cells that asymptote, is synonymous with the size of the “classical” or excitatory receptive field. Suppressive zones surrounding the central receptive field have been shown to be tuned (although more broadly) to stimulus parameters that are similar to the cell's excitatory center tuning (DeAngelis et al. 1994; Li and Li 1994; see Allman et al. 1985 for a review). Because the center and surround display similar optimal stimulus tuning, a grating extending in space is ideally suited to uncover the spatial characteristics of both regions simply by varying the extent of the stimulus.
To explore the spatial mechanisms underlying area summation, we adapted the difference of Gaussians (DOG) model from retinal physiology (Enroth-Cugell and Robson 1966; Rodieck and Stone 1965). Although the center might be approximated by a single Gabor filter, the tuning of the surround suggests that it is composed of pools of Gabor filters, and the net effect of the surround pool can be summarized by a single inhibitory Gaussian. Here we consider the center and surround to be spatially overlapping Gaussians (Fig. 1 A). The summation profile can be represented as the linear sum of the area under each Gaussian profile or, equivalently, the difference of the integral of Gaussians (Fig. 1 B). Although both the center and surround have significant spatial subunit structure, each mechanism can be summarized by the size and strength of the Gaussian envelope describing that mechanism (Fig. 1 B). The excitatory Gaussian is described by its gain, Ke, and space constant, a, and the inhibitory Gaussian by its gain, Ki, and space constant, b.
A statistical analysis of the DOG model reveals the uniqueness of individual parameters in the fitting procedure (Linsenmeier et al. 1982). Small deviations in the excitatory gain, Ke, and space constant, a, produce significant errors. The inhibitory gain, Ki, and space constant, b, are less uniquely defined. However, the error surface for the product of the inhibitory gain and space constant, Kib, indicates that the product is uniquely defined, although the slope of the error versus Kib function may be relatively shallow at times (Linsenmeier et al. 1982). When fitting the summation profiles, the value of the inhibitory space constant was constrained to be the lowest estimate that yielded an acceptable fit. The inhibitory space constant, b, was also constrained to be either greater than the excitatory space constant, a, or zero if there was no inhibition present. Individual gain parameters were constrained to be within the range predicted by the contrast response functions. For example, if the low-contrast response was 20 spikes/s and the high-contrast response was 100 spikes/s, then the high-contrast gain was constrained to be no more than five times the maximum response at low contrast.
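The constraint that b is either larger than a or zero can be made concrete with a small fitting sketch. The original fits used constrained optimization in MATLAB; the grid search below is a substitute technique chosen for transparency, the Gaussian profile exp(-(2y/a)^2) is an assumed form, and the data are synthetic. The gain limits derived from contrast response functions and the preference for the lowest acceptable b are not implemented.

```python
import math

def gauss_int(s, c):
    """Integral of exp(-(2y/c)^2) over [-s/2, s/2] (assumed Gaussian form)."""
    return 0.5 * c * math.sqrt(math.pi) * math.erf(s / c)

def predict(s, r0, ke, a, ki, b):
    surround = ki * gauss_int(s, b) if b > 0 else 0.0
    return r0 + ke * gauss_int(s, a) - surround

def fit_gains(sizes, resp, r0, a, b):
    """Least-squares Ke, Ki for fixed a, b; Ki = 0 if a gain would go negative."""
    e = [gauss_int(s, a) for s in sizes]
    y = [r - r0 for r in resp]
    see = sum(v * v for v in e)
    sey = sum(v * w for v, w in zip(e, y))
    if b > 0:
        i = [gauss_int(s, b) for s in sizes]
        sii = sum(v * v for v in i)
        sei = sum(v * w for v, w in zip(e, i))
        siy = sum(v * w for v, w in zip(i, y))
        det = see * sii - sei * sei
        if det > 1e-12:
            # Normal equations for y = Ke*e - Ki*i
            ke = (sey * sii - siy * sei) / det
            ki = (sey * sei - siy * see) / det
            if ke >= 0 and ki >= 0:
                return ke, ki
    return max(0.0, sey / see), 0.0

def fit_dog(sizes, resp, r0, a_grid, b_grid):
    """Grid search over a and b with b > a, or b = 0 for no inhibition."""
    best = None
    for a in a_grid:
        for b in [0.0] + [v for v in b_grid if v > a]:
            ke, ki = fit_gains(sizes, resp, r0, a, b)
            if ki == 0.0:
                b = 0.0
            err = sum((predict(s, r0, ke, a, ki, b) - r) ** 2
                      for s, r in zip(sizes, resp)) / len(sizes)
            if best is None or err < best["mse"]:
                best = {"mse": err, "ke": ke, "a": a, "ki": ki, "b": b}
    return best

# Synthetic, noise-free curve from hypothetical parameters
sizes = [0.1 * 50 ** (k / 9) for k in range(10)]
truth = dict(r0=2.0, ke=100.0, a=1.0, ki=30.0, b=2.2)
resp = [predict(s, **truth) for s in sizes]
a_grid = [k / 10 for k in range(2, 31, 2)]
b_grid = [k / 10 for k in range(2, 61, 2)]
fit = fit_dog(sizes, resp, truth["r0"], a_grid, b_grid)
print(fit)
```

Because the model is linear in the two gains once a and b are fixed, only the space constants need to be searched, which keeps the constraint on b easy to enforce.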
The integral of the DOG model provides good fits to the area-summation response functions. To quantify how well this model fits the data, we estimated the error between the empirical model and the actual data. The error is calculated as the mean fractional error which is the MSE normalized by the average response across all radii. Across the population, the mean fractional errors are small (〈E〉 = 0.02, Fig. 1 C).
From the fitted curves of 138 single isolated units in primary visual cortex, we estimated the excitatory and inhibitory spatial spread. The spatial estimates come directly from the fitted curves and indicate the Gaussian spread (radius) of the excitatory and inhibitory components (Fig. 1 A). Across the population, the mean excitatory radius is 1.0° and the mean inhibitory radius is 2.2° (Figs. 2 and 3). Cells with no surround suppression (SI = 0) have not been included in the population of cells from which inhibitory space constants are estimated.
Anatomical trends in the spread of excitation and inhibition
To address the conflicting accounts of receptive field size across cortical layers, we examined the laminar distribution of our summation data. The laminar position of each single unit was estimated from a histological reconstruction of the electrode track performed after each experiment. Cells are further categorized by their responses to sinusoidal input as either simple cells (dominant 1st harmonic modulation) or complex cells (unmodulated response) (De Valois et al. 1982; Skottun and Freeman 1984; Skottun et al. 1991). Complex cells' summation area (mean = 0.90°) is on average smaller than simple cells' summation area (mean = 1.16°), and this difference is statistically significant (P < 0.05, Wilcoxon test). Across the cortical layers, there is a trend toward larger summation size within the lower layers (Figs. 2 B and 4 A).
To determine population trends across cortical layers independent of predefined categorization, we performed a local regression fit to the excitatory space constants across cortical layer (local neighborhood = ±15% of the total population along the cortical depth, Fig. 4 A). The regression fit shows deviations from the mean (- - -, Fig. 4 A) that are consistently smaller in the upper layers and larger in layer 6. To test the significance of these trends, we divided the responses according to the anatomical layer and estimated the means and 95% confidence intervals for each layer using an ANOVA factorial analysis with a Fisher post hoc test (Fig. 4 B). Layer 3B shows excitatory space constant estimates that are significantly smaller than the mean (P < 0.0005), and layer 6 shows estimates that are significantly larger (P < 0.05). The upper layers (2/3A–4B) are also significantly smaller than layer 6.
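The idea of smoothing estimates over cortical depth with a neighborhood of ±15% of the population can be sketched as below. The text does not specify the regression kernel or weighting, so an unweighted local linear fit over rank-ordered neighbors is assumed here; the function name and the synthetic data are hypothetical.

```python
def local_linear_smooth(depths, values, frac=0.15):
    """Unweighted local linear fit over +/- frac*N nearest cells in depth order."""
    order = sorted(range(len(depths)), key=lambda i: depths[i])
    x = [depths[i] for i in order]
    y = [values[i] for i in order]
    n = len(x)
    half = max(1, int(round(frac * n)))  # neighbors on each side
    smoothed = []
    for k in range(n):
        lo, hi = max(0, k - half), min(n, k + half + 1)
        xs, ys = x[lo:hi], y[lo:hi]
        m = len(xs)
        mx, my = sum(xs) / m, sum(ys) / m
        sxx = sum((v - mx) ** 2 for v in xs)
        sxy = sum((v - mx) * (w - my) for v, w in zip(xs, ys))
        slope = sxy / sxx if sxx > 0 else 0.0
        smoothed.append(my + slope * (x[k] - mx))  # fitted value at this depth
    return x, smoothed

# Synthetic example: normalized depth (0 = surface, 1 = white matter)
depths = [i / 19 for i in range(20)]
space_constants = [0.8 + 0.6 * d + (0.1 if i % 2 else -0.1)
                   for i, d in enumerate(depths)]
sorted_depths, trend = local_linear_smooth(depths, space_constants)
for d, t in zip(sorted_depths[::5], trend[::5]):
    print(f"depth {d:.2f}: smoothed space constant {t:.2f} deg")
```

The smoothed curve can then be compared against the population mean to see where along the depth axis the estimates deviate.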
There is a significantly smaller estimate for the inhibitory space constant in the upper layers (2/3) than in layer 6 (P < 0.01, ANOVA factorial analysis and Fisher post hoc test). However, no layer varies significantly from the population mean (mean = 2.2°, Fig. 3 B). A local regression of the inhibitory space constant across cortical layer shows a trend similar to that of the excitatory space constant, with larger estimates for the lower layers than the upper layers. This results in a constant ratio of the spatial extent of inhibition to excitation, b/a, across cortical layers (data not shown). The average inhibitory space constant, b, is not significantly different between simple and complex cells (〈b〉 = 2.2°, Fig. 3, A and B).
Surround suppression estimates
Previously, surround suppression has been estimated as the percent reduction of the peak or maximum response (DeAngelis et al. 1994; Schiller et al. 1976; Sillito et al. 1995). This estimate of surround inhibition is based solely on the relative magnitudes of the two regions without considering their spatial scale. While this estimate does provide one characterization of the strength of the surround, the DOG model can be used to obtain a quantitative characterization of surround strength compared with that of the center (or classical RF). We can estimate the ratio of area under the inhibitory (Kib) to excitatory (Kea) components directly by comparing the excitatory and inhibitory Gaussians from the fitted DOG model. The ratio of inhibition to excitation gives an index of inhibitory strength that varies between 0 and 1, with 0 indicating no suppression and 1 indicating suppression equal in strength to that of the center.
Across the population of V1 neurons studied, there is strong surround suppression with a mean value of 0.63 (Fig. 5 A). Despite a number of cells exhibiting no surround suppression (15/138) at high contrast, the distribution mean is strongly weighted toward suppression (SI > 0.5, Fig. 5 A). The mean SI estimate for complex cells (0.64) is not significantly different from simple cells (0.62). However, a strong trend in SI magnitude emerges when considered across cortical depth and anatomical layer (Fig. 5 B). Suppression appears strongest in layers 4B and 4Cα. There are also many cells in layers 2/3 and 5 that show moderate to strong suppression (SI > 0.5). Neurons in layer 6 stand out as being weakly suppressed with most of the cells that have no suppression located in this layer. However, layer 6 has a bimodal distribution of SI with one population showing no surround suppression and the other population showing higher estimates near the total population average (0.63). The SI strength estimates were smoothed over cortical depth using local regression (local neighborhood = ±15% of the total population along the cortical depth) to determine trends across cortical depth independent of predefined categorization of the cortex (Fig. 6 A). The superficial layers through layer 4B show SI estimates greater than the mean (- - -, Fig. 6, A and B). Cells in layer 6 show SI estimates that are well below the mean (Fig. 6, A and B). Grouping of the cortical depth into anatomical layers shows that layer 4B and layer 6 do indeed vary significantly from the population mean (layer 4B, P < 0.0001; layer 6, P < 0.005; Fig. 6 B).
Characterization of length and width spatial summation
By considering the summation profiles for length and width summation separately, one can characterize more fully the spatial organization of the receptive field. For example, we can determine the degree of symmetry for summation along the dominant axes for both the excitatory and inhibitory components. Comparing length and width summation in monkey cortex allows us to determine whether inhibition is located predominantly along the ends of the receptive field (Fig. 7, A and B), the sides of the receptive field (Fig. 7, C and D), or both.
For a subpopulation (n = 50) of the total number of cells studied (n = 138), we obtained complete and satisfactory length and width summation tuning curves in addition to the area-summation data. Summation tuning curves for length and width were fit using the same procedures used to quantify the area-summation data. The space constants for the one-dimensional data represent the half-length and half-width estimates for length and width summation profiles, respectively. Across the population, the mean length space constant is 0.82° and the mean width space constant is 0.72° (Fig. 7, A and C, respectively). Estimates of length and width summation indicate that the summation profiles are on average longer than they are wide. For this population of cells (n = 50), the mean excitatory space constant taken from the circular area-summation data is 0.90°.
The inhibitory space constants for length and width summation are 1.94 and 2.10°, respectively (Fig. 7, B and D). Across the cortical layers, there is not a significant difference between length and width summation (Fig. 8 A). Both estimates show similar trends toward greatest summation in layer 6 and least in the superficial layers (layer 2/3 through layer 4B). There is no significant difference for the inhibitory space constant between length (〈blength〉 = 1.9°) and width (〈bwidth〉 = 2.1°) summation (Fig. 8 B). Cells with no surround suppression (SI = 0, 15/138 cells) have not been included in the population of inhibitory space constants.
Length and width surround suppression
Surround suppression for length and width summation was estimated identically to that for total surround suppression in the circular area-summation tuning curves. It is the ratio of the total area under the inhibitory Gaussian (Kib) to the area under the excitatory Gaussian (Kea), with the gains and space constants estimated from the DOG model fits to the one-dimensional summation tuning curves. Across the population, the average SI strength is less for both length (〈SIlength〉 = 0.51) and width (〈SIwidth〉 = 0.42) than estimates from the complete surround taken from the circular area-summation data (〈SI〉 = 0.63, Fig. 9). Because there is significant suppression along the length and width dimensions on average, it is likely that these regions sum to give greater suppression when stimulated simultaneously as in the circular patch experiments. There is no significant difference between end and side suppression across cortical layers (Fig. 10 A). However, there are too few neurons to draw a strong conclusion about the anatomy.
By comparing length and width summation directly, we can draw conclusions about the spatial symmetry of summation and suppression with respect to the dominant axis of orientation. Across the population of cells examined, the ratio of width to length summation (〈a width/a length〉 = 0.98) is broadly distributed around unity (r² = 0.2, Fig. 7 E). RF profiles that are wider than long are just as likely as RF profiles that are longer than wide. On the other hand, there is a significant bias in surround suppression when considered for length and width summation together (Fig. 10, B and C). Although there are equal numbers of cells with suppression only along one dimension (Fig. 10, B and C), the mean ratio of width suppression to length suppression is significantly less than unity. Strong end-stopping is more prevalent than strong side-stopping in our sample of macaque V1 neurons.
Comparing models with subtractive versus divisive inhibition
Recent models have proposed divisive feedback as a means of explaining contrast gain and a number of other physiological properties (Albrecht and Geisler 1991; Carandini et al. 1997; Heeger 1992). A normalization feedback signal that derives from a pool of surrounding neurons is built on the basic feed-forward linear filter (Fig. 11 A). As it has been used previously, the normalization model has ignored space. A variation of the normalization model has been formulated in an attempt to fill the spatial void (Cavanaugh et al. 1999). In a variation very similar to that proposed by Cavanaugh et al. (1999), we propose that excitation and inhibition are represented by spatially overlapping Gaussian components. Each component has a gain (kc and ks for the center and surround, respectively) and a space constant (ςc for the center and ςs for the surround, Fig. 11 A). These parameters are essentially the same as those used in the DOG model, with the symbols changed to avoid confusion. The major difference between the two models is that the surround inhibits through division in the normalization model rather than through subtraction as in the DOG model. The normalization model has an additional parameter, β, which represents the nonlinear threshold imposed by the spike rate encoding mechanism.
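The difference between subtractive and divisive inhibition can be sketched in a few lines of Python. The example assumes that both mechanisms are Gaussians of the form exp(−(y / space constant)²) integrated over the stimulus extent, and that the divisive surround enters as 1 + (surround drive) in the denominator. These forms, the function names, and the parameter values are illustrative assumptions, and β is left out because its exact role in the equation is not specified in this section.

```python
import math


def summed_drive(radius, space_const):
    """Integral of exp(-(y / space_const)**2) over [-radius, radius] (assumed form)."""
    return space_const * math.sqrt(math.pi) * math.erf(radius / space_const)


def dog_response(radius, ke, a, ki, b):
    """Subtractive model: the surround drive is subtracted from the center drive."""
    return ke * summed_drive(radius, a) - ki * summed_drive(radius, b)


def normalization_response(radius, kc, sigma_c, ks, sigma_s):
    """Divisive model: the center drive is divided by 1 + surround drive."""
    center = kc * summed_drive(radius, sigma_c)
    surround = ks * summed_drive(radius, sigma_s)
    return center / (1.0 + surround)


# Hypothetical parameters; radii in degrees of visual angle.
for radius in (0.5, 1.0, 2.0, 5.0):
    sub = dog_response(radius, ke=1.0, a=1.0, ki=0.3, b=2.1)
    div = normalization_response(radius, kc=1.0, sigma_c=0.9, ks=0.3, sigma_s=2.3)
    print(f"radius {radius:.1f}: DOG {sub:.3f}, normalization {div:.3f}")
```

Both functions share the same center and surround drives and differ only in how the surround acts on the center, which is the distinction drawn between the two models.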
For the same population (n = 138), we fit the circular area-summation tuning curves with the center-surround normalization model and compared the parameters of the fits to the normalization model and the DOG model directly. A comparison of the mean fractional error of the fitted curves to the data reveals that both models fit about equally well (linear regression, r² = 0.6, P < 0.001, Fig. 11 B). The error estimates for the normalization model are lower than those for the DOG model. This might result from the additional parameter, β, used in the normalization model. If the absolute error were adjusted to take into consideration the number of free parameters, then both models would probably fit equally well. Estimates of the excitatory space constant for the DOG model, a, and the normalization model, ςc, show similar mean values across the population (〈a〉 = 1.0 and 〈ςc〉 = 0.9, Fig. 12, A and B). Differences in the absolute estimates result from differences in the interactions between the center and surround mechanisms of the two models (see Fig. 13).
The inhibitory space constants were compared between the two models (Fig. 14). On average, the inhibitory space constant is marginally less for the DOG model (〈b〉 = 2.1°, Fig. 14 A) than for the normalization model (〈ςs〉 = 2.3°, Fig. 14 B). However, the estimates for excitation and inhibition show similar trends across cortical layers despite small differences in the absolute values (data not shown).
Because the normalization model modulates a divisive component against an excitatory component, the SI can be taken directly as 1 minus the fractional reduction of center response by suppression resulting from normalization. This gives a quantity that varies from 0 for no surround suppression to 1 for maximum suppression (Fig. 15 A). Comparing this estimate of surround suppression to the DOG model shows a strong difference between models in the values of the estimates. On average, the divisive normalization model predicts less suppression strength than the subtractive DOG model for the same fits (Fig. 15 B). The average SI across the population is 0.44 for the normalization model and 0.63 for the subtractive DOG model (Fig. 16, A and B). Inhibition in the normalization model acts through division rather than subtraction. Therefore it is not surprising that less absolute suppression can elicit the same amount of response reduction when compared with a subtractive inhibition model such as the DOG model.
Center and surround
Many recent studies have shown that responses of neurons in primary visual cortex depend on the activity of surrounding cortical neurons (Bodis-Wollner et al. 1976; Bolz and Gilbert 1986; Born and Tootell 1991; De Valois et al. 1985; Dreher 1972; Foster et al. 1985; Gilbert 1977; Hubel and Wiesel 1965; Knierim and Van Essen 1992; Levitt and Lund 1997a; Maffei and Fiorentini 1976; Orban and Vandenbussche 1979; Orban et al. 1979; Rose 1977; Sillito et al. 1995; von der Heydt et al. 1992; Yamane et al. 1985). Responses can be modulated by stimulation from beyond the “classical” excitatory receptive field paired with central stimulation. Stimulation of the region beyond the classical receptive field alone would otherwise not evoke a response. Therefore the response properties of cortical neurons are not isolated but depend on the activity in the network. But once cortico-cortical interactions are considered, one has to re-evaluate whether or not it is valid to view a cortical neuron's receptive field as divided into a classical receptive field and “beyond.” This is the main motivation for our measurements of the spatial extent of excitation and suppression: to test whether a model with overlapping spatial mechanisms for excitation and suppression, which might extend different distances in visual space but are centered on the same locus, could account for the spatial summation data. For this purpose, it is necessary to have precise spatial estimates of excitation. Previous reports of receptive field size for neurons in V1 give quite a broad range of estimates (Dow et al. 1981; Hubel and Wiesel 1977; Livingstone and Hubel 1984; Schiller et al. 1976; Snodderly and Gur 1995). This might result partly from the fact that many of these studies used mapping techniques that do not reveal the full extent of the excitatory region. 
For example, most of these studies used light bar stimuli flashed on a dark background at different locations to map the extent of excitation. This method tends to underestimate excitation because it fails to provide sufficient excitatory drive to uncover small responses far from the receptive field center. Even quantitative studies using flashing light and dark bars on a uniform background fail to show the full spatial extent of subregions of the classical receptive field that can be revealed with extended stimuli such as sinewave gratings (Field and Tolhurst 1986).
To get a full estimate of the spatial extent of the excitatory and suppressive receptive field regions, we performed experiments using suprathreshold drifting sine wave grating stimuli in a spatial summation paradigm. Furthermore, by sampling spatial summation into the far regions of the receptive field surround (±5° of visual angle from the RF center), we were able to characterize the extent of excitation and, in particular, suppression over at least 10° of visual angle. This spatial extent is thought to be large for cells in V1. Presenting an optimal stimulus over the excitatory center, while stimulating the surround, keeps the excitatory drive to the cell high. This allows for accurate estimation of the extent of excitatory spatial summation out into the periphery of the receptive field, and of surround strength, because it avoids the effects of a threshold masking small responses that are evoked by stimuli placed far from the receptive field center. Contrasts were carefully controlled such that they were in the suprathreshold regime but well below response saturation (less than 90% of response saturation). This is important because response saturation could cause errors in the estimation of the extent of excitation.
Empirical estimates of spatial summation taken directly from the data provide useful information about the spatial organization of receptive fields (DeAngelis et al. 1994; Schiller et al. 1976; Sillito et al. 1995), but they need to be interpreted carefully because they can result in inappropriate estimates of receptive field size. For example, a change in the optimal radius of summation might result from a change in the extent of excitatory summation or it might result from a change in the balance between the strength of excitation and inhibition. Using estimates of center and surround strength based on spatial spread parameters in the DOG model goes some way to overcoming the limitations imposed by the direct empirical estimates and, in addition, allows for a systematic comparison of model parameters to theoretical models of neuronal function.
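The ambiguity of the empirical optimal radius can be made concrete with a short sketch. It assumes a DOG summation curve built from Gaussians of the form exp(−(y / space constant)²) integrated over [−x, x], for which the curve peaks where K_e exp(−(x/a)²) = K_i exp(−(x/b)²). The functional form, the function name, and the parameter values are illustrative assumptions rather than the fits reported here.

```python
import math


def optimal_radius(ke, a, ki, b):
    """Radius at which the assumed DOG summation curve reaches its peak.

    Solves ke * exp(-(x / a)**2) = ki * exp(-(x / b)**2) for x, which
    requires a stronger, narrower center (ke > ki > 0 and b > a > 0).
    """
    if not (ke > ki > 0 and b > a > 0):
        raise ValueError("need ke > ki > 0 and b > a > 0 for a peaked curve")
    return math.sqrt(math.log(ke / ki) / (1.0 / a**2 - 1.0 / b**2))


# Same excitatory and inhibitory extents (a, b); only the inhibitory gain changes.
weak = optimal_radius(ke=1.0, a=1.0, ki=0.3, b=2.0)
strong = optimal_radius(ke=1.0, a=1.0, ki=0.6, b=2.0)
print(f"optimal radius with weak inhibition:   {weak:.2f} deg")
print(f"optimal radius with strong inhibition: {strong:.2f} deg")
```

In this sketch the peak moves inward when the inhibitory gain is raised even though the excitatory space constant is unchanged, which is why the model parameters are easier to interpret than the optimal radius alone.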
DOG model and overlap of center and surround
An important feature of the DOG model is that the inhibitory surround is larger than and overlaps the excitatory center. In theory, the inhibitory surround could overlap the center, or it could surround excitation with a hole in the area over excitation, or it could be anywhere in between (Fig. 1 C). For either an overlapping inhibitory surround or a surround that abuts excitation without overlapping, the spatial summation profile will display a sharp peak or transition between excitation and inhibition. An additive model like the DOG model would predict that if the inhibitory surround had a hole over a region larger than the excitatory center, then the spatial summation profile would display a plateau at maximum response rather than a peak. Summation profiles with a plateau at maximum response would require an additional parameter to describe the data adequately [this parameter is equivalent to the phase offset, ςi, introduced into the DOG model by DeAngelis et al. (1994)]. In our population of neurons, this additional parameter was not needed to provide acceptable fits to the data. In fact, for some of the cells that show a relatively sharp peak in summation followed by a rapid suppression of the response, our results, together with the shapes of the functions predicted by the different types of model, rule out a center-surround mechanism in which the surround region is spatially separate from the central region.
Studies of cortical inhibition provide additional evidence that the excitatory center and the inhibitory surround overlap. Several studies have reported inhibitory effects for superimposed stimuli such as two gratings of different orientation or spatial frequency (Bauman and Bonds 1991; Bonds 1989; DeAngelis et al. 1992; De Valois and Tootell 1983; Morrone et al. 1982; Petrov et al. 1980). DeAngelis et al. (1994) showed that these inhibitory effects originate from regions within the excitatory receptive field but are not limited to the excitatory region. This would be consistent with the overlapping center and surround regions for both the subtractive DOG and the normalization models.
We compared the subtractive DOG model with the modified normalization model to test whether the spatial parameters were dependent on the nature of the center-surround interaction. Both models gave a very good account of the data: the residual error for both models was very small. Both models have overlapping neural mechanisms. The derived center and surround sizes were approximately the same for both models (Fig. 13). It can be concluded that the spatial parameters that are estimated for the center and surround are not dependent on assumptions about the nature of the interactions between center and surround mechanisms (subtractive vs. divisive).
The results of the data analysis suggest that the receptive field excitatory and inhibitory regions in V1 neurons result from considerable integration across the cortical network. Both DOG and normalization models yielded similar trends of spatial summation in V1 neurons at the relatively high suprathreshold contrasts used in these experiments. At the eccentricity represented by the region of cortex that we recorded from (2–5° of visual angle, perifoveal), afferent LGN inputs would be restricted to a consistent small spatial spread. Neurons in the input layers (layer 4Cα and 4Cβ) display a broad distribution in the extent of excitatory spatial summation. Many neurons summate to an extent beyond that predicted by convergence of thalamic inputs that are known to have small radii at this eccentricity (0.2°) (Blasdel and Lund 1983; Derrington and Lennie 1984; Hendrickson et al. 1978). The spatial extent of excitatory (mean excitatory diameter = 2.0° of visual angle) and inhibitory summation (mean inhibitory diameter = 4.4° of visual angle) is large compared with the spatial extent of thalamic inputs. At perifoveal eccentricities of 2° (3.75 mm/° magnification factor) to 5° (2 mm/° magnification) (Dow et al. 1981; Van Essen et al. 1984) of visual angle, one hypercolumn (approximately 400–500 μm for each ocular dominance column or 0.8–1 mm of cortex per hypercolumn) would occupy 0.3–0.5° of visual angle. Therefore spatial summation greater than 0.5° would require cortical area greater than one hypercolumn. This suggests that there is considerable integration across cortical hypercolumns. Even for neurons in layer 4Cα that could potentially receive divergent input from LGN M cells because the axonal arborization of M cells can span up to three hypercolumns (Fitzpatrick et al. 1985), the spatial spread would be maximally three hypercolumns or up to 1.5° of visual angle. 
Excitatory and inhibitory inputs resulting from mechanisms on the order of 4.4° would require cortico-cortical connections other than simply thalamic afferents. The spatial extent of inhibition observed (4.4°) surrounding the excitatory receptive field is larger than that which would be predicted based solely on the anatomical spread of the dendritic arbors of inhibitory neurons (Fitzpatrick et al. 1985).
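The conversion from cortical distance to visual angle used in this argument can be checked with a few lines of Python. The sketch simply divides cortical distance by the magnification factor, using the hypercolumn widths and magnification factors quoted above; the function and variable names are hypothetical.

```python
def cortex_mm_to_degrees(distance_mm, magnification_mm_per_deg):
    """Visual angle (deg) covered by a given distance across cortex (mm)."""
    return distance_mm / magnification_mm_per_deg


# Values quoted in the text: hypercolumn width range and magnification factors.
HYPERCOLUMN_MM = (0.8, 1.0)
MAGNIFICATION_MM_PER_DEG = {2.0: 3.75, 5.0: 2.0}  # eccentricity (deg) -> mm/deg

for eccentricity, magnification in MAGNIFICATION_MM_PER_DEG.items():
    low, high = (cortex_mm_to_degrees(w, magnification) for w in HYPERCOLUMN_MM)
    print(f"{eccentricity:.0f} deg eccentricity: one hypercolumn spans "
          f"{low:.2f}-{high:.2f} deg, three span up to {3 * high:.2f} deg")
```

With these inputs one hypercolumn comes out at roughly 0.2–0.3° at 2° eccentricity and 0.4–0.5° at 5°, so the lower end is somewhat below the rounded range given in the text, while three hypercolumns at 5° reach the 1.5° upper bound. Either way, the measured summation diameters of 2.0° and 4.4° exceed the span of a single hypercolumn.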
The spatial extent of excitation across the population of cells in this study is large on average (1° radius of visual angle representation). Not only is this size too large to be explained by feedforward thalamic inputs at these eccentricities (2–5° eccentricity) but also for half of the cells the estimates are too large to be explained by the known spread of intracortical connections (Angelucci et al. 1998) or long range horizontal connections (Rockland and Lund 1983). In addition, the spatial scale of the suppressive surround is much larger (2.2° of visual angle) than the known spread of inhibitory local connections (Callaway 1998). These estimates of the spatial scale of excitation and inhibition in V1 neurons suggest a possible role for feedback projections from extrastriate neurons. The spatial extent of feedback projections from extrastriate cortex is sufficiently large to explain the extent of excitatory summation and surround suppression (Walton et al. 1999), but further work is needed to establish the connection between extrastriate feedback and the V1 surround mechanism.
We showed previously (Sceniak et al. 1999) that the spatial extent of excitatory summation grows on average by a factor of 2 or more at low contrast. This conclusion followed from analyzing area-summation and length- and width-summation data, like those we have presented here, at both high and low contrast. This result implies that the net effect of excitatory signals from cortico-cortical lateral excitatory connections is even more powerful, relative to feedforward excitatory signals, at low contrast. This is another line of evidence suggesting that intracortical connections are important in establishing the properties within the classical receptive field.
Length and width
The spatial layout of summation appeared to be the same along length and width axes of cortical receptive fields, but there was an asymmetry of suppression. Our analysis of length and width summation reveals that there is no significant difference in the extent of excitatory summation along the length and width of the receptive field. Many neurons show equal summation along the two dimensions. There are also equal numbers of neurons showing greater summation along one or the other axis. This is consistent with a previous study of length and width summation in cat visual cortex (DeAngelis et al. 1994). However, suppression strength along the receptive field ends tends to be greater than side suppression. There are approximately equal numbers of neurons with suppression along only the ends or only the sides of the receptive field. DeAngelis et al. (1994) reported equal inhibition along the receptive field ends and sides in cat visual cortex.
There were significant differences in summation and suppression in different cortical laminae. Excitatory spatial summation tends to be greatest in layer 6 and smallest in the upper layers (2/3A–3B). Consistent with estimates of receptive field width in awake-behaving primates (Snodderly and Gur 1995), receptive fields in the input layers (4Cα and 4Cβ) show a broad distribution of receptive field sizes.
Relatively strong surround suppression is evident throughout all cortical layers (even in layer 6, where overall suppression is weakest, there are cells that have strong suppression). Surround suppression is greatest in layer 4B, being significantly greater there than the mean suppression across all cortical layers. This agrees with previous work on the prevalence of end-stopping in layer 4B (Hawken et al. 1988; Livingstone and Hubel 1984; Movshon and Newsome 1996; Snodderly and Gur 1995). Although the average surround suppression strength in the upper layers (2/3–3B) is high, it is most likely an underestimate of the true surround strength for these layers. Because of the high degree of surround inhibition in some layer 2/3 cells, we were on occasion unable to characterize these neurons sufficiently using standard drifting grating stimuli. They were not included in the results, so it is possible that we have underestimated the level of surround suppression among the whole population in layers 2/3. Both layer 2/3 and layer 4B are output layers to extrastriate cortex. The high prevalence of surround suppression in the neurons in these layers indicates that V1 input to extrastriate areas retains a high degree of spatial specificity because of the strong surround suppression.
We thank D. Ringach, E. Johnson, I. Mareschal, and A. Henrie for help in data collection and L. Smith for assistance in the histological reconstruction work and for help during physiology experiments. We also thank J. Cavanaugh for helpful discussions.
This work was supported by National Eye Institute Grants EY-13079, EY-01472, and EY-08300 and by National Science Foundation-Learning and Intelligent Systems Grant IBN-9720305.
Present address and address for reprint requests: M. P. Sceniak, Center for Neuroscience, University of California, Davis, 1544 Newton Court, Davis, CA 95616.
Copyright © 2001 The American Physiological Society