To examine the role of primary visual cortex in visuospatial integration, we studied the spatial arrangement of contextual interactions in the response properties of neurons in primary visual cortex of alert monkeys and in human perception. We found a spatial segregation of opposing contextual interactions. At the level of cortical neurons, excitatory interactions were located along the ends of receptive fields, while inhibitory interactions were strongest along the orthogonal axis. Parallel psychophysical studies in human observers showed opposing contextual interactions surrounding a target line with a similar spatial distribution. The results suggest that V1 neurons can participate in multiple perceptual processes via spatially segregated and functionally distinct components of their receptive fields.
An important task of the visual cortex is to integrate local information from different parts of a visual image into global percepts such as contours, surfaces, and three-dimensional shapes. Although the process of visuospatial integration has been traditionally ascribed to high-order cortical areas, there is a growing body of evidence that suggests that the primary visual cortex (V1) may play an important role. While the receptive fields (RFs) of V1 neurons, as measured by a simple stimulus, are quite small, stimuli outside of this region can have powerful modulatory influences when presented concurrently with stimuli inside the receptive field. The modulatory influences allow neurons to integrate information from large parts of the visual field and may allow neurons at this early stage of visual processing to participate in complex perceptual tasks such as contour integration and surface segmentation.
The existence of surround effects in V1 neurons, especially inhibitory effects, has been known for many years (Bishop et al. 1973; Gilbert 1977; Gulyas et al. 1987; Hubel and Wiesel 1965; Knierim and Van Essen 1992; Li and Li 1994; Maffei and Fiorentini 1976), but it is now clear that excitatory interactions may play an equally important role in neural responses (Allman et al. 1985; Kapadia et al. 1995; Nelson and Frost 1985; Polat et al. 1998; Sillito et al. 1995).
These findings at the cellular level have their counterpart in psychophysical studies where the perception of an object's attributes is dependent on the context in which a stimulus is presented. Contextual interactions affect many perceptual attributes including the detection of low contrast objects (Dresp 1993; Kapadia et al. 1995; Polat and Sagi 1993) and the perception of brightness (Heinemann 1955; Ito et al. 1998; Rossi et al. 1996), depth (Westheimer 1986), position (Badcock and Westheimer 1985), and orientation (Gibson and Radner 1937; Tyler and Nakayama 1983; Westheimer 1990).
While excitatory contextual interactions have been postulated to play a role in contour integration and saliency (Field et al. 1993), inhibitory interactions are thought to be important in the segmentation of surfaces and textures (Knierim and Van Essen 1992). If opposing neural interactions are needed for different perceptual processes, the question arises as to how these processes can exist in V1 neurons without canceling each other out. One possibility is that excitatory and inhibitory interactions are present in the same neurons but are located at different parts of the RF. To test this hypothesis, we designed physiological and psychophysical experiments to map the RF surround.
Experiments were performed using two alert macaque monkeys (Macaca mulatta). The monkeys were comfortably seated in a primate chair 1.5 m from a computer monitor with a resolution of 1,200 × 800 pixels refreshed at 60 Hz. Experiments were performed under photopic conditions with ambient light. All procedures complied with the National Institutes of Health Guide for the Care and Use of Laboratory Animals and were approved by the institutional animal review boards.
Experimental and surgical procedures were as previously described (Kapadia et al. 1995). Briefly, stimuli were presented in the near periphery while animals performed a foveal dimming task to help them maintain tight fixations. The animals received a juice reward if they held fixation within a 0.7–1.0° window and indicated the dimming by releasing a lever at the appropriate time. Neural recordings were obtained in 600-ms epochs. The stimulus was turned on 200 ms after the start of the recording and presented for 100 ms. Each fixation trial consisted of three to five recording epochs separated by at least 300 ms.
The time window used to calculate the evoked response from each recording site was adjusted according to that cell's composite temporal response profile over the entire set of experiments performed on that cell. The number of spikes that occurred in this window was converted to spikes/second by dividing the response by the window length. Spontaneous activity for each experiment was defined as the neural activity in the 200 ms prior to stimulus onset, averaged across all stimuli within an experiment and converted to spikes/second. The displayed results represent the mean evoked response of the unit over the 8–15 trials of each stimulus after subtracting spontaneous activity. Error bars show one standard error of the mean above and below this value.
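The response calculation described above can be sketched as follows. This is only an illustration under stated assumptions: the function names, the example spike times, and the 240–340 ms response window are hypothetical (the actual window was adjusted per cell), and the sample standard deviation is assumed for the standard error.

```python
from math import sqrt
from statistics import mean, stdev

def firing_rate(spike_times_ms, start_ms, end_ms):
    """Spikes/s from the spike count inside [start_ms, end_ms)."""
    count = sum(start_ms <= t < end_ms for t in spike_times_ms)
    return count / ((end_ms - start_ms) / 1000.0)

def spontaneous_rate(all_trials, stim_onset_ms=200.0):
    """Mean pre-stimulus rate (spikes/s) over every trial of every stimulus."""
    return mean(firing_rate(t, 0.0, stim_onset_ms) for t in all_trials)

def evoked_response(trials, window_ms, baseline):
    """Mean evoked rate minus spontaneous activity, and its standard error."""
    rates = [firing_rate(t, window_ms[0], window_ms[1]) - baseline for t in trials]
    sem = stdev(rates) / sqrt(len(rates))  # assumes the sample SD (n - 1)
    return mean(rates), sem

# Hypothetical spike times (ms from the start of a 600-ms recording epoch).
trials = [[50, 250, 260, 270], [255, 265], [100, 251, 259, 281, 290]]
baseline = spontaneous_rate(trials)
response, sem = evoked_response(trials, (240.0, 340.0), baseline)
```

In a full experiment the baseline would be pooled over the trials of all stimuli, not only the one stimulus shown here.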
Neural activity was recorded from the operculum of striate cortex using glass-coated platinum-iridium electrodes. Neural signals were amplified, band-pass filtered between 300 and 3,000 Hz and fed through a time-amplitude window discriminator to isolate individual units and small clusters of two to three units. The results from the single-unit recordings were indistinguishable from the multi-unit recordings. All recording sites had complex cell-like properties and were from superficial cortical layers, the neurons of which form the main output to extrastriate visual areas.
At each recording site, we quickly estimated the basic properties of the unit under study such as its orientation tuning and RF center by displaying bars at different positions and orientations and assessing the neuron's response on an audio monitor. The neuron's optimal orientation was then determined quantitatively by recording neural responses to long oriented bars (measuring 120 × 3′) spaced 10–20° apart over a full range of 180°. Recording sites that showed poor orientation tuning were omitted from the study. The RF center was determined by presenting small optimally oriented bars (measuring 15 × 3′) at adjacent positions along the length and width axes of the RF. The center was defined as the location that produced the largest response. The remainder of the experiments, including the two-dimensional mapping studies, was done with bars measuring 30 × 3′. The length of the stimulus corresponds to the approximate size of RFs at this eccentricity. The stimuli were not scaled according to the dimensions of individual RFs. We use the word “target” to represent a stimulus placed at the center of a neuron's RF and “flank” to represent peripheral stimuli.
The eccentricities of RF centers ranged from 2 to 7° from the fovea, averaging around 4°. The central target stimulus was always presented at the optimal orientation at the center of the units' RFs. Since the experiments were performed over a narrow range of eccentricities, RF sizes tended to be similar to one another, and we chose not to scale the size of the stimuli between individual recording sites.
The stimulus set used to create the two-dimensional maps consisted of 82 unique stimuli interleaved in a random block format. Thus a minimum of 656 stimulus presentations (82 stimuli × 8 trials for each stimulus) were needed to create each set of maps under each set of contrast conditions. Since the subtraction techniques used to create the context and nonlinearity maps depended strongly on the response to the central target presented alone, we often interspersed an additional 8–32 presentations of this stimulus to minimize the uncertainty in its mean response.
Although the fixation window provided limits for continuing a trial, the animals maintained much more precise fixation than that allowed by the window. Eye positions were monitored and recorded at 100 Hz using a scleral search coil system (C-N-C Engineering). An analysis of eye position data for 33 experiments is shown in Fig. 1. The data were summarized by calculating the mean eye position during each 100-ms stimulus presentation for a given experiment and calculating the mean and standard deviation of the distribution. The variability in eye position between trials (measured as 1 SD of the mean) for each of the 33 experiments is shown in Fig. 1 A, and the average of the 33 values in Fig. 1 B. The standard deviation in eye position, averaged across all experiments, was 2.7 min of arc in the horizontal direction and 2.9 min of arc in the vertical direction. This is much smaller than the grain of the grids used in the two-dimensional receptive field maps (15–30′).
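A minimal sketch of this eye-position summary, assuming each presentation is stored as a list of (horizontal, vertical) samples in minutes of arc. The function name, the data layout, and the use of the sample standard deviation are illustrative assumptions, not details taken from the recording software.

```python
from statistics import mean, stdev

def fixation_variability(presentations):
    """Return (SD_horizontal, SD_vertical) of per-presentation mean eye position.

    presentations: list of presentations, each a list of (x, y) samples
    in arcmin (about 10 samples for a 100-ms presentation at 100 Hz).
    """
    mean_x = [mean(x for x, _ in samples) for samples in presentations]
    mean_y = [mean(y for _, y in samples) for samples in presentations]
    return stdev(mean_x), stdev(mean_y)

# Hypothetical data: three presentations of two samples each.
example = [[(0, 0), (2, 2)], [(2, 0), (4, 2)], [(4, 0), (6, 2)]]
sd_x, sd_y = fixation_variability(example)
```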
Psychophysical experiments were performed on human observers using a nulling technique to measure the perceived orientation of a vertical line when a pair of additional, tilted lines was presented simultaneously. In these experiments, we use the word “target” to refer to the stimulus that the subject is asked to make a judgement about and “flank” to refer to the additional, contextual lines presented in the surround. Stimuli were presented on a CRT monitor with a resolution of 1,024 × 768 pixels refreshed at a rate of 60 Hz. Stimulus presentation was controlled by a Matrox Millennium video card and a PC-compatible computer. No error feedback was provided. Observation was binocular with normal pupils and a free head. All human experiments were approved by the institutional committees on human experiments.
The foveal tilt illusion experiments were performed at an observation distance of 6 meters. The central target and flanks consisted of white vertical lines, 8 × 1′ in size. The line lengths used in this and other psychophysical experiments were chosen to approximate the size of V1 receptive fields. Unless otherwise noted, the Michelson contrast of the stimuli was 99%. All data points presented here are based on at least 450 total trials for both flank orientations distributed over a minimum of 2 days.
Each trial consisted of a 2,000-ms cycle. The target and a pair of tilted flanks were presented for 100 ms and followed by a 1,900-ms interval during which the subject reported whether the target appeared tilted clockwise or counterclockwise by pressing the appropriate button on a computer mouse. The target line was accompanied by two flanks, which were symmetrically positioned with respect to the target. To concentrate on the orientation signal and to obviate any position clues, the stimulus was designed so that the ends of the flanks had the same relationship to the end of the target for all target and flank orientations. During each session, the location and orientation of the flanking lines remained constant but by changing flank orientations and positions from session to session a full range of values of these parameters could be explored.
During each presentation, the target was shown randomly at one of seven equally spaced orientations centered on vertical, and the observer reported, if necessary by guessing, whether it appeared tilted clockwise or not. A psychometric curve was fitted to the proportion of “yes” responses at the seven orientations by the method of probits (Finney 1952), where the mean value, i.e., the orientation at which “clockwise” and “counterclockwise” responses are equally probable, provides an estimate of the target orientation at which it has no apparent tilt (see Fig. 3). To eliminate possible biases, stimuli with clockwise and counterclockwise flanks were randomly interleaved in each experiment. The induced tilt, defined as half the difference between the means under conditions of clockwise and counterclockwise flanks, is a bias-free measure of the tilt illusion. It is obtained by a nulling method based entirely on the observers' “yes” or “no” responses to the question of whether the target line appeared to have a clockwise tilt, in a minimum of 600 presentations with randomly distributed values of target line orientation and direction of flank tilt.
In general, there were no important qualitative differences between the fovea and the periphery in this class of psychophysical studies. To assure ourselves that the pattern of attractive and repulsive tilt illusions exists also in the near periphery, one observer performed the experiments described above at an eccentricity of 4° (view distance: 1 m, line lengths: 30′). Care was exercised to factor out asymmetries depending on retinal areas by accumulating data across eight meridia.
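The nulling computation can be illustrated with a simplified sketch. It fits a straight line to the probit-transformed proportions by unweighted least squares, which is a stand-in for, not a reproduction of, the full iterative weighted procedure of Finney (1952). The function names, the clipping constant, the example values, and the sign convention of the induced tilt are all assumptions made for illustration.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()

def probit_mean(orientations, p_clockwise, eps=1e-3):
    """Estimate (mean, sd) of a cumulative-Gaussian psychometric curve.

    Proportions are clipped to [eps, 1 - eps] so the probit is finite,
    then z = (x - mean) / sd is fitted by unweighted least squares.
    """
    z = [_STD_NORMAL.inv_cdf(min(max(p, eps), 1.0 - eps)) for p in p_clockwise]
    n = len(orientations)
    mx = sum(orientations) / n
    mz = sum(z) / n
    sxx = sum((x - mx) ** 2 for x in orientations)
    sxz = sum((x - mx) * (zi - mz) for x, zi in zip(orientations, z))
    slope = sxz / sxx
    intercept = mz - slope * mx
    return -intercept / slope, 1.0 / slope

def induced_tilt(mean_cw_flanks, mean_ccw_flanks):
    """Half the difference between the null points for the two flank tilts."""
    return 0.5 * (mean_cw_flanks - mean_ccw_flanks)

# Hypothetical observer: seven target orientations (deg), null point at 0.5 deg.
orients = [-3, -2, -1, 0, 1, 2, 3]
p_yes = [NormalDist(0.5, 1.5).cdf(x) for x in orients]
mu, sd = probit_mean(orients, p_yes)
tilt = induced_tilt(0.5, -0.3)
```

Because any constant response bias shifts both null points equally, it cancels in the half-difference, which is why the induced tilt is described as bias-free.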
Two-dimensional contextual maps
The physiological experiments were designed to study the spatial distribution of excitatory and inhibitory surround interactions around a neuron's RF. After quantitatively mapping the center of the RF and finding the units' optimal orientations, we presented combinations of stimuli inside and outside the RF. The total stimulus set was composed of 82 stimuli positioned in a 9 × 9 grid centered on the RF (Fig. 2 A). The dimensions of this grid were fixed and were not scaled to the size of individual RFs. The stimulus set was designed to minimize the number of stimuli required to map the RF and its surround and therefore minimize the time required to maintain isolation of the recorded units. All stimuli were presented at the optimal orientation for each cell under study. The stimulus set consisted of conditions where either the central target was presented alone, the target was presented in conjunction with a pair of flanks located symmetrically with respect to the RF center, or the pair of flanks was presented in isolation.
The main findings of these experiments are depicted in Fig. 2 B. The central target, presented in isolation, produced a response of 8.5 spikes/s. When collinear flanks were presented simultaneously, the response grew to more than three times the baseline level even though the flanks alone elicited little response. When flanks were presented side by side with the center stimulus, they inhibited the unit's response below the baseline level even though when presented alone, they elicited a strong response. The influence of surround stimuli thus showed significant nonlinearities: the response to the three-bar stimulus could not be predicted from the sum of the responses to the component stimuli. The responses showed further that excitatory and inhibitory inputs from the RF surround were spatially segregated. Excitatory inputs were present along the neurons' collinear axis, while inhibitory inputs were present at the sides of the RF.
We used the neural responses to the entire stimulus set to construct two-dimensional maps of the RF and its interaction with its surround (Fig. 2 C). The stimulus conditions in which the flanks were presented in isolation were used to create the RF map. This technique tended to overestimate the size of the RF compared with more traditional measurements of RF size, such as the minimum response field (not shown) since each presentation consisted of a pair of lines instead of a single line. Nonetheless this technique produced a detailed map of the parts of the visual field from which action potentials could be directly elicited. Similar to traditional methods used to map the RF, there is a strong response at the RF center, which falls off rapidly in all directions.
The three-bar map shows the magnitude of the responses, plotted as a function of the flank position, when the central target and flanks were presented simultaneously. This map is shown here for completeness but is omitted for brevity in the rest of the paper. The strongest responses to the three-bar stimuli occurred when the flanking lines were presented along the collinear axis close to the RF center.
By subtracting the center-alone response from the neural response in the three-bar condition at each grid position, we constructed “context maps.” These maps display how the addition of flanking lines at each grid position influences the neuron's response to the central target. To allow comparison of these maps across different cells, each map was normalized to the cell's response to the central target presented in isolation. Facilitatory interactions were located along the collinear axis, while inhibitory interactions were strongest in discrete regions at the sides of the RF. The four lobes of facilitation and inhibition were superimposed on a larger, weak field of diffuse inhibition. The context map reflects the total contextual influence of the flanking stimuli without subdividing this influence into linear and nonlinear components.
We isolated the nonlinear part of the contextual interaction in “nonlinearity maps,” created by subtracting both the center-alone and flanks-alone responses from the three-bar response at each grid position. This map highlights the neural responses that could not be explained by simple addition of the responses of the unit to the stimulus components, flanks and target, acting alone. This map was also normalized to the center-alone response. Compared to the context map, there was stronger inhibition at the sides of the RF, while the collinear excitation appeared qualitatively similar. The difference between the context and nonlinearity maps at the sides of the RF arises because at these grid positions, the flanks-alone condition elicited a positive response from the cell, but when the same flanks were presented in conjunction with the central target, their effect was inhibitory (see Fig. 2 B).
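The two subtraction maps can be sketched as below. The sign convention (positive values mean facilitation), normalization by simple division by the center-alone response, the function names, and the example firing rates are assumptions made for illustration.

```python
def context_map(three_bar, center_alone):
    """(three-bar - center-alone) / center-alone at each grid position."""
    return [[(tb - center_alone) / center_alone for tb in row]
            for row in three_bar]

def nonlinearity_map(three_bar, flanks_alone, center_alone):
    """(three-bar - flanks-alone - center-alone) / center-alone per position."""
    return [[(tb - fa - center_alone) / center_alone
             for tb, fa in zip(row_tb, row_fa)]
            for row_tb, row_fa in zip(three_bar, flanks_alone)]

# Hypothetical responses (spikes/s) at two grid positions:
# column 0 = collinear flanks, column 1 = side-by-side flanks.
center = 10.0
three_bar = [[30.0, 5.0]]
flanks = [[2.0, 12.0]]
ctx = context_map(three_bar, center)               # [[2.0, -0.5]]
nonlin = nonlinearity_map(three_bar, flanks, center)  # [[1.8, -1.7]]
```

In this made-up example the side-by-side position is more strongly negative in the nonlinearity map than in the context map, because the flanks alone drive the cell there while the three-bar response falls below the center-alone response.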
The spatial segregation of excitatory and inhibitory contextual interactions at the neural level led us to search for a similar dichotomy at the level of visual perception. One type of contextual interaction that has been studied extensively at the psychophysical level is the tilt illusion. In the classical description of this illusion, the presence of tilted, flanking lines causes a target to appear tilted in a direction opposite to the orientation of the flanking lines, which is a “repulsive” effect (Gibson and Radner 1937; Westheimer 1990). We suspected that if opposing contextual interactions were segregated in different positions around the receptive field, a perceptual effect observed when a context is placed along a target line's axis might be reversed for contextual stimuli placed along the orthogonal axis. Furthermore, plotting the direction and magnitude of induced tilts with the flanks in different spatial positions should produce two-dimensional maps of contextual interactions similar to those seen in the physiological experiments.
We studied the influence of contextual stimuli on the perception of an object's orientation by measuring the perceived orientation of a vertical line (target) when a pair of tilted, flanking lines was presented simultaneously with the target. All stimuli were presented at high contrast. The flanking lines caused the target to appear tilted in a direction that depended on the spatial position of the flanking lines. When the flanking lines were positioned in a side-by-side arrangement with the central target, the target appeared tilted away from the orientation of the flanks (Fig. 3 A). This is the conventional repulsive tilt illusion. However, when the flanking lines were positioned in a collinear or end-to-end arrangement, the target appeared tilted toward the flanks, which we call an attractive tilt illusion (Fig. 3 B). The direction of induced tilts in the two conditions was consistent across the five subjects studied (Fig. 3 C).
In the next experiments, we sought to characterize the dependency of induced tilts on the orientation of the flanks and their separation from the central target. Collinear flanks induced attractive tilts when the orientation difference between the target and flanks was 5–10° and repulsive tilts at larger orientation differences (Fig. 4 A). The strongest attractive tilt occurred at an orientation difference of 5°. Lateral flanks produced repulsive tilts for most orientation differences, with the largest effect occurring at an orientation difference of 20° (Fig. 4 B). A significant attractive tilt illusion was observed at an orientation difference of 75°. This “paradoxical” tilt illusion has been observed in previous studies (e.g., Westheimer 1990), but was only seen in the lateral case here. Results averaged across subjects are shown in Fig. 4 C.
Because the main purpose of these experiments was to study the positional dependency of contextual interactions, we concentrated the remainder of the psychophysical experiments on exploring the spatial distribution of the opposing tilt effects with an orientation difference between target and surround of 5°. This is the smallest orientation difference at which we could observe consistent attractive and repulsive tilts; it keeps the target and flanks at similar orientations and illustrates the shifting balance between the two directions of induced tilt. For collinear flanks, the strongest attractive tilts were observed when the target and flanks were closest together. As the separation distance was increased, the attractive tilt declined, crossed zero near 20′ separation and became repulsive at larger separations (Fig. 4 D).
For lateral flanks, induced tilts were predominantly repulsive and became smaller as target-flank separation was increased (Fig. 4 E). Significant repulsive tilts were observed over distances at least as large as 30′. Two of the four subjects showed small attractive tilts when the target and flanks were close together. However, both of these subjects reported difficulty in perceiving the tilts of the three lines of the stimulus pattern as separate (a crowding effect); this was never a problem in the collinear condition. Results averaged across subjects are shown in Fig. 4 F.
To study in a systematic manner the distribution of flank positions that result in attractive and repulsive tilts, we measured induced tilts when the flanks were positioned at a number of locations along a two-dimensional grid (Fig. 5 A) analogous to the physiological experiments. Attractive tilts were found in a limited area along the collinear axis (Fig. 5 B). Repulsive tilts were strongest along an axis orthogonal to the target line orientation, and weaker effects were observed in a diffuse distribution surrounding the target line. Interestingly, there was a second region of repulsive tilts, intermediate in strength, that was offset from and parallel to the collinear axis. Results from a second subject are shown in Fig. 5 C. This subject showed a more diffuse pattern of repulsive tilts than the first subject but a similar pattern of attractive tilts.
To compare the spatial scales of the psychophysical and neurophysiological results, we examined tilt effects in the near periphery in three subjects at an eccentricity of 4° (Fig. 6 A), which is similar to the mean eccentricity of RFs in the physiological studies. The dependence of induced tilts on orientation and flank position was similar to the results in the fovea. We constructed a two-dimensional map in one subject similar to the maps in the fovea (Fig. 6 C). While the basic pattern of attractive and repulsive tilts is the same as in the fovea, the effects are seen over much longer distances. For example, attractive tilts were observed over separation distances of 8–16′ in the fovea but extended to separations >2° in the near periphery.
In the next set of experiments, we examined the effects of changing stimulus contrast on the pattern of contextual interactions. Initially, the contrasts of the target and flanks were changed simultaneously. Figure 7 shows the results of experiments performed at one recording site using stimuli at three different contrasts. The raw neural response to the five main stimulus configurations at each contrast is shown in Fig. 7 A. At a contrast of 20%, the central target presented in isolation produced a small response from the cell, but this response was greatly enhanced by the addition of collinear flanks. When the central target was presented with side-by-side flanks, there was little change in the neuron's firing rate. At a contrast of 50%, collinear flanks increased the target-alone response by a much smaller amount, while side-by-side flanks were inhibitory. The experiments at 30% contrast show intermediate results. Both collinear facilitation and lateral inhibition were evident at this contrast.
The pattern of contextual interactions at different contrasts was examined further in Fig. 7 B, which shows the results of the two-dimensional map analysis at all three contrasts. At the lowest contrast tested (20%), collinear excitatory interactions were quite strong. Inhibition was weak and seen only in the nonlinearity map. As the contrast was increased to 30%, excitatory interactions became weaker and inhibitory interactions became stronger. Both excitatory and inhibitory interactions were observed in the context map. Finally, as the contrast was increased to 50%, only very weak excitatory interactions were seen in the context map, and there was no excitation seen in the nonlinearity map. Consistent with previous results, neurons' minimum response fields are larger at high contrasts than at low contrasts (Kapadia et al. 1999). Several trial experiments were done at higher stimulus contrasts, but there appeared to be little difference in the results as target contrast was increased above 50%. These results were typical of all the recording sites studied. The findings show that excitatory interactions predominate at low contrasts and inhibitory interactions predominate at high contrasts.
To examine the effects of stimulus contrast at the population level, we averaged across all cells studied at the same spatial scale to create composite maps that reflect the entire neural population (Fig. 8). The effects were remarkably constant from one recording site to the next, and the two-dimensional map averaged over the population closely matched the individual example shown in Fig. 7 B. At low contrasts, there was strong excitation in the collinear position and weaker inhibition in the lateral position, while inhibitory interactions dominated at high contrasts.
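A minimal sketch of the population averaging, assuming each cell contributes one normalized map on the same fixed grid and that the composite is an unweighted element-wise mean. The function name and the example values are hypothetical.

```python
def composite_map(cell_maps):
    """Element-wise mean of same-sized 2-D maps, one map per cell."""
    rows, cols = len(cell_maps[0]), len(cell_maps[0][0])
    for m in cell_maps:
        if len(m) != rows or any(len(r) != cols for r in m):
            raise ValueError("all maps must share the same grid dimensions")
    n = len(cell_maps)
    return [[sum(m[i][j] for m in cell_maps) / n for j in range(cols)]
            for i in range(rows)]

# Two hypothetical normalized context maps on a 1 x 2 grid.
population = composite_map([[[1.0, -0.5]], [[3.0, -1.5]]])
```

Averaging is only meaningful here because the grid dimensions were fixed across recording sites and each map was first normalized to its own center-alone response.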
In the next set of experiments, we studied the effects of stimulus contrast by varying the contrast of the target and flanks independently. The results from a single recording site are shown in Fig. 9 A. When the target and flanks were either both at high contrast or both at low contrast, the results were similar to those in the previous figure. However, when the target was kept at a low contrast and the flanks were presented at high contrast, there was a shift in the position of maximal facilitation toward more distant parts of the RF surround. The position of optimal facilitation was shifted slightly from the collinear axis at this recording site (bright blue region in nonlinearity map). Note the larger scale at which these experiments were performed compared with the maps in the previous figures (see scale bar).
A similar trend was observed across the population of cells. Composite maps created by averaging the results of the six cells studied under these conditions are shown in Fig. 9 B. Excitatory interactions were seen in a region located close to the collinear axis, but at a greater distance from the RF center and in a more diffuse pattern than seen in the other parts of this study. We tended to observe more variability between recording sites for this experiment with the peak facilitatory position often falling slightly off of the collinear axis.
The separation distance between target and surround at which one could see the maximum attractive tilt effect could be changed by altering the relative contrast of the target and surround. When the contrast of the target was lowered to 15% and the surround maintained at high contrast, the optimum separation for obtaining the tilt effect was 20–24′, and the effect was maintained out to 32′ (Fig. 10 A). This is quite different from the condition in which all three lines were at high contrast, where the maximal effect was obtained with an 8–10′ separation and disappeared beyond 16′ separation (Fig. 10 B). In the same manner, for both the physiology and the psychophysics, manipulating contrast influenced the spatial extent over which contextual influences were exerted.
The principal conclusion that comes from this work is that contextual influences, both at the physiological and psychophysical levels, are not uniform but rather are highly dependent on the spatial positioning of the surrounding stimuli relative to the receptive field or to the target. While there are somewhat contradictory studies on the sign (facilitatory vs. inhibitory) and character of contextual interactions, our findings show that the RF surround is composed of both excitatory and inhibitory regions and that individual neurons are capable of displaying both types of interactions. We show further that the balance between facilitation and inhibition changes in a stimulus-dependent fashion.
The two levels of analysis (physiology and psychophysics) show that interactions of opposing sign are found in lobes located along orthogonal axes. Excitatory neural interactions and attractive tilt illusions were observed along the orientation axis of line segments, while inhibitory interactions and repulsive tilt illusions were observed at the sides of lines. The physiological and psychophysical interactions show similar dependencies on contrast, such that as one changes the relative contrast of center and surround, the facilitatory surrounds and attractive tilt influences extend farther from the central line.
One might go further to relate the psychophysical and neurophysiological parts of this study via a simple, population-coding model of orientation discrimination, which can explain how tilt illusions can arise from excitatory and inhibitory contextual interactions at the neural level (Fig. 11). The neural basis of the tilt illusion, as supported by population coding models, may be the pattern of contextual interactions shown here. Previous implementations of this model have suggested iso-orientation inhibition as a possible basis for repulsive interactions in the orientation domain (Gilbert and Wiesel 1990). Physiological experiments support the feasibility of this class of models (Gilbert and Wiesel 1990; Knierim and Van Essen 1992; Li and Li 1994; Li et al. 2000; Nothdurft et al. 1999).
As indicated in the model, the orientation tuning curves of neurons are sufficiently broad that a single oriented line activates cells optimally tuned to a wide range of orientation preferences. In this model, each neuron acts as a labeled line, signaling the presence of a stimulus at its own preferred orientation. The perceived orientation of a stimulus is derived from the activity of all neurons that are activated by that line. In this model, at the neural level, attractive tilts arise from iso-orientation facilitation and repulsive tilts arise from iso-orientation inhibition (Fig. 11, B and C). The flanking lines facilitate or inhibit neighboring neurons with similar orientation preferences, which skews the population vector in a manner that induces a tilt in one direction or another.
Whether the dominant influence of contextual interactions is a change in the height of orientation tuning curves, as used in this model, or also involves changes in the tuning curves' peak orientation or width requires further study. Once a population coding model of orientation is accepted (and there is little choice in view of the broad orientation tuning of neurons and the high precision of orientation perception of contours), one can think of several other ways of implementing interactions to yield the tilt illusion, but distinguishing between them requires detailed knowledge of the changes in parameters of orientation tuning curves. The model presented here lends plausibility to the idea that the opposing tilt effects can be directly mapped onto the excitatory and inhibitory contextual zones around the receptive field and provides a suggestion for why the patterns of interactions observed in both parts of the study were so similar.
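The logic of such a model can be illustrated with a toy simulation. This is a sketch under stated assumptions, not the model of Fig. 11: Gaussian tuning curves, a 20° tuning width, a multiplicative gain change centered on the flank orientation, a population-vector readout on doubled angles, and all parameter values and function names are hypothetical choices.

```python
import math

PREFERRED = list(range(-90, 90))  # preferred orientations (deg), 1-deg steps

def circ_diff(a, b):
    """Signed orientation difference a - b wrapped into [-90, 90) deg."""
    return (a - b + 90.0) % 180.0 - 90.0

def gauss(d, sigma):
    return math.exp(-0.5 * (d / sigma) ** 2)

def perceived_orientation(target_deg, flank_deg=None, gain=0.0,
                          tuning_sigma=20.0, context_sigma=20.0):
    """Population-vector estimate of target orientation (deg).

    gain > 0 facilitates and gain < 0 inhibits neurons tuned near the
    flank orientation by scaling the height of their tuning curves.
    """
    sx = sy = 0.0
    for pref in PREFERRED:
        r = gauss(circ_diff(pref, target_deg), tuning_sigma)
        if flank_deg is not None:
            r *= 1.0 + gain * gauss(circ_diff(pref, flank_deg), context_sigma)
        ang = math.radians(2.0 * pref)  # doubled angle: orientation is 180-deg periodic
        sx += r * math.cos(ang)
        sy += r * math.sin(ang)
    return 0.5 * math.degrees(math.atan2(sy, sx))

no_flank = perceived_orientation(0.0)
attractive = perceived_orientation(0.0, flank_deg=5.0, gain=0.5)
repulsive = perceived_orientation(0.0, flank_deg=5.0, gain=-0.5)
```

With these assumed parameters, facilitation pulls the decoded orientation toward the flank and inhibition pushes it away, which is the qualitative pattern the model is meant to capture; the magnitudes carry no quantitative meaning.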
In the physiological experiments, we found that the balance between excitation and inhibition in the RF was not fixed but was regulated in a dynamic, stimulus-dependent manner. Low contrast stimuli tended to invoke strong excitatory interactions and weak inhibition, while the opposite was true at high contrasts. Contrast did not seem to change the positioning of the excitatory and inhibitory subregions but instead seemed to alter their strength and dimensions. Excitatory contextual interactions were observed at both high and low contrasts as evident in the context maps in Fig. 7. At low contrasts, the excitatory drive came from outside the neuron's receptive field; the neuron's response to the target and flanks presented simultaneously was more than the sum of the responses to the individual stimuli. At high contrasts, the excitation came from within the neuron's minimum response field; the neuron's response to the target and flanks presented simultaneously was more than the response to the target alone but less than the sum of the responses to the individual stimuli. Thus high and low contrast stimuli lead to a similar pattern of excitatory interactions in the context maps, but the nonlinearity maps look quite different under the two stimulus conditions. Also, while strong excitatory interactions were only observed close to the RF when the contrast of the flanks was low, excitation could be seen over much longer distances for higher contrast flanks.
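The distinction drawn above between the context maps and the nonlinearity maps can be made concrete with a small sketch. The two measures below are assumed simplifications (plain differences of mean firing rates); the exact normalization used for the maps in this study is not restated here, and the function name and the example rates are hypothetical.

```python
def classify_interaction(r_target, r_flanks, r_both):
    """Return (context_effect, nonlinearity, label) for one flank position.

    r_target: response to the target alone
    r_flanks: response to the flanks alone
    r_both:   response to target and flanks presented together
    """
    # Context effect: what the flanks add to (or remove from) the target response.
    context_effect = r_both - r_target
    # Nonlinearity: deviation of the combined response from the linear sum.
    nonlinearity = r_both - (r_target + r_flanks)
    if context_effect < 0:
        label = "inhibition"
    elif context_effect == 0:
        label = "no effect"
    elif nonlinearity > 0:
        label = "supralinear facilitation"
    else:
        label = "sublinear facilitation"
    return context_effect, nonlinearity, label

if __name__ == "__main__":
    # Hypothetical rates (spikes/s): low-contrast case, flanks silent on their own.
    print(classify_interaction(10.0, 0.0, 25.0))
    # Hypothetical rates: high-contrast case, flanks drive the cell on their own.
    print(classify_interaction(40.0, 30.0, 55.0))
```

Both hypothetical cases have the same positive context effect, which is why they would look alike in a context map, but the sign of the nonlinearity differs, as described for low and high contrasts.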
Inhibitory contextual interactions were strongest at high contrasts. Under these conditions, there were lobes of strong inhibition along the orientation and orthogonal axes of the RF, which are consistent with the well-known properties of end-inhibition (Hubel and Wiesel 1965) and side-band inhibition (Bishop et al. 1973), respectively. The current results lend further credence to the idea that end-inhibition is contrast dependent. High contrast stimuli invoke strong end-inhibition, while at low contrasts, the same regions at the ends of RFs are excitatory and little or no end-inhibition is observed (Kapadia et al. 1999; Sceniak et al. 1999).
While stimulus contrast is one factor that can influence the strength and sign of contextual interactions, there are likely to be many others. One factor that has been implicated in other studies is the complexity of the environment in which a stimulus is presented (Kapadia et al. 1995, 1999). When a high contrast stimulus is embedded in a complex surround, the neural response to that stimulus is often much less than the response to the same stimulus presented in isolation. The suppression induced by the complex surround also alters the pattern of contextual interactions for the stimulus. In effect, the suppression makes the neuron behave as if the central stimulus were at a lower contrast, and excitatory contextual interactions become prominent at even the highest contrasts tested. This finding suggests that under real-world conditions, where objects are likely to be embedded in complex scenes, excitatory contextual interactions are likely to be prominent at all levels of stimulus contrast, not just at the relatively low contrasts at which they were found with the simple stimuli used here.
A strong candidate for the anatomical substrate of the contextual interactions observed here is the intrinsic, long-range horizontal connections formed by pyramidal cells in V1 (Gilbert and Wiesel 1979, 1983; Martin and Whitteridge 1984; Rockland and Lund 1982), although feedback from extrastriate areas may also play a role. Long-range horizontal connections can extend over distances as large as 6–8 mm and tend to connect cells with similar orientation preferences (Bosking et al. 1997; Gilbert and Wiesel 1989; Kisvarday et al. 1997; Ts'o et al. 1986). The extent of long-range connections at the anatomical level correlates well with the extent of contextual interactions observed in the psychophysical and physiological experiments. At an eccentricity of 4°, the average eccentricity of the RFs in this study, the cortical magnification factor is 2.5 mm/° (Dow et al. 1981). This means that horizontal connections can integrate information over 2–3° of visual space, which is similar to the extent of interactions we observed here.
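As a quick check of this conversion, dividing the 6–8 mm cortical extent of the horizontal connections by the magnification factor of 2.5 mm/° gives the corresponding extent in visual space:

```latex
\[
\frac{6\ \text{mm}}{2.5\ \text{mm}/^{\circ}} = 2.4^{\circ},
\qquad
\frac{8\ \text{mm}}{2.5\ \text{mm}/^{\circ}} = 3.2^{\circ}
\]
```

These values correspond to the approximate 2–3° range quoted above.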
Another important aspect of the long-range horizontal connections is their ability to provide both excitatory and inhibitory inputs to their postsynaptic neurons. Since the long-range connections arise from glutamatergic pyramidal neurons, they provide excitatory input directly but can also exert inhibitory effects through a disynaptic circuit involving inhibitory interneurons (McGuire et al. 1991). The dynamic change in long-range inputs from excitatory to inhibitory as a function of stimulus contrast is reminiscent of similar effects observed in intracellular recordings in cortical slices, where synaptic potentials evoked by a stimulating electrode change from excitatory to inhibitory depending on the intensity of electrical stimulation (Hirsch and Gilbert 1991; Weliky et al. 1995). In these experiments, low intensity stimulation evoked excitatory synaptic potentials, while higher intensity stimulation also activated inhibitory interneurons, resulting in strong, negative synaptic potentials that overwhelmed the smaller, excitatory potentials. Changing the contrast of a visual stimulus is likely to invoke cortical mechanisms similar to those invoked by changing the intensity of an electrical stimulus.
The experiments in this study help delineate the basic structure of contextual interactions but likely reveal only a component of the full complexity of the interactions. The stimulus set used here was chosen to minimize the number of stimuli because of the difficulty in maintaining stable neural recordings in alert animals for long periods of time. For example, the stimulus imposes a mirror symmetry on the data since each experiment uses a pair of flanks instead of a single flank. Experiments with single flanks would be able to reveal asymmetries in the distribution of contextual interactions in individual cells. Also, the influence of a surround stimulus on a central target is dependent on the presence of additional contextual elements (Kapadia et al. 1995, 1999), and these cascading nonlinearities may explain higher-order aspects of visual processing. The experiments in this study were also limited to iso-orientation interactions; to characterize fully the role of V1 cells in analyzing complex visual scenes, one must study interactions between stimuli that differ in orientation.
A simple, conceptual model of how the observed pattern of contextual interactions might form the neural basis of contour integration and surface segmentation is shown in Fig. 12. In this model, the saliency of a stimulus element results from the neural response to that element relative to other elements in the surround. Collinear, excitatory interactions enhance the neural activity of stimulus elements that form smooth contours, resulting in enhanced saliency of these elements relative to other features in their surrounds. Lateral inhibitory interactions suppress the neural response of stimulus elements whose neighbors have the same orientation. The loss of this suppression in areas where there is a change in orientation results in enhanced saliency of the texture boundary.
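The texture-boundary part of this conceptual model can be illustrated with a toy one-dimensional example. This is a sketch under simplifying assumptions and not the model of Fig. 12: elements lie in a single row, only nearest neighbors interact, suppression is a fixed decrement, and the function name and parameters (`suppression`, `iso_tolerance`) are hypothetical. Collinear facilitation is omitted.

```python
import numpy as np

def texture_saliency(orientations_deg, suppression=0.2, iso_tolerance=15.0):
    """Relative response of each element in a row of oriented elements.

    Each element starts with a response of 1.0 and loses `suppression`
    for every immediate neighbor whose orientation lies within
    `iso_tolerance` degrees of its own (iso-orientation lateral inhibition).
    """
    ori = np.asarray(orientations_deg, dtype=float)
    resp = np.ones_like(ori)
    for i in range(len(ori)):
        for j in (i - 1, i + 1):
            if 0 <= j < len(ori):
                # Orientation difference with a period of 180 degrees.
                diff = abs((ori[i] - ori[j] + 90.0) % 180.0 - 90.0)
                if diff <= iso_tolerance:
                    resp[i] -= suppression
    return resp

if __name__ == "__main__":
    row = [0, 0, 0, 0, 90, 90, 90, 90]  # two textures meeting in the middle
    print(texture_saliency(row))
```

Elements on either side of the orientation change lose one source of suppression, so their responses exceed those of elements inside a uniform texture. The two end elements also have only one neighbor, which is an artifact of the finite row rather than a feature of the model.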
The results of the current study suggest that the contextual influences may allow cells at early stages in cortical visual processing to mediate complex processes in intermediate level vision. The spatial segregation of excitatory and inhibitory inputs may allow individual V1 neurons to participate in multiple perceptual tasks that require opposing neural interactions. Our results suggest further that there is no clear separation between the stages of visual processing that serve for the analysis of simple stimulus attributes, such as orientation, and those involved in higher order mechanisms of visuospatial integration.
We thank S. Kane for help with eye coil implantation and A. Glatz and J. Lopez for expert technical assistance.
This work was supported by National Institutes of Health Grants EY-07968 to C. D. Gilbert and MH-11394 to M. K. Kapadia.
Address for reprint requests: C. D. Gilbert, The Rockefeller University, 1230 York Ave., New York, NY 10021.
The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Copyright © 2000 The American Physiological Society