Our visual system can link components of contours and segregate contours from complex backgrounds based on geometric grouping rules. This is an important intermediate step in object recognition. The substrate for contour integration may be based on contextual interactions and intrinsic horizontal connections seen in primary visual cortex (V1). We examined the perceptual rules governing contour saliency to determine whether the spatial extents of contextual interactions and horizontal connections match those mediating saliency. To quantify these rules, we used stimuli composed of randomly oriented nonoverlapping line segments. Salient contours within this complex background were formed by colinear alignment of nearby segments. Contour detectability was measured using a 2-interval-forced-choice design. Contour detectability deteriorated with increasing spacing between contour elements and improved as the number of colinear line elements was increased. At short contour spacing, the detectability reached a plateau with alignment of a few line segments that together formed a contour subtending several visual degrees. At intermediate spacing, saliency built up progressively with a greater number of colinear lines, extending up to 30°. When contour spacing was beyond a critical range (about 2°), however, the detectability dropped to chance levels, regardless of the number of colinear lines. Contour detectability was found to be a function not only of the relative spacing of contour elements with respect to the noise elements but also of the average density of the overall pattern. Furthermore, training significantly improved contour detection, increasing the critical spacing of line elements beyond which contours were no longer detectable. Our data suggest that global contour integration is based on mechanisms of limited spatial extent, comparable to the interactions observed in V1. 
These interactions can cascade over larger distances provided the spacing of stimulus elements is kept within a limited range.
The visual world is perceived as organized objects, and the percept of a visual object is influenced by the global organization of visual scenes. Early last century, Gestalt psychologists investigated this perceptual phenomenon and summarized the rules governing perceptual organization, including proximity, similarity, continuity, and closure (Wertheimer 1923). The Gestalt grouping rules can be successfully used to explain a series of perceptual phenomena that fall under the rubric of contextual interactions.
Contextual interactions in visual perception have been observed in a number of stimulus domains, including brightness (brightness induction), color (simultaneous color contrast), orientation (tilt illusions and saliency from orientation contrast), spatial scale (size illusions), and depth (assimilation and repulsion of perceived depth of nearby objects). Similar contextual interactions have been reported at the neuronal level in the primary visual cortex (V1). For example, the optimal orientation of V1 neurons is affected by the orientation of contextual lines in a way similar to the tilt illusion (Gilbert and Wiesel 1990). Neuronal responses in V1 are strongly affected by orientation contrast of local stimuli, comparable to the perceptual pop-out based on orientation contrast (Kastner et al. 1997; Knierim and Van Essen 1992; Li et al. 2000). Responses of V1 neurons to an optimally oriented stimulus are facilitated by colinear flanks (Kapadia et al. 1995; Nelson and Frost 1985; Polat et al. 1998), a neuronal correlate of colinear brightness induction (Dresp 1993; Kapadia et al. 1995; Polat and Sagi 1993, 1994b). Contextual interactions seen in V1 are also related to aspects of intermediate level vision such as figure-ground segregation and surface brightness perception (Lamme 1995; Rossi and Paradiso 1999; Rossi et al. 1996; Zipser et al. 1996).
The close similarity between the contextual interactions observed in perception and in response properties of V1 neurons suggests that V1 mediates these perceptual phenomena. Evidence from anatomical (Gilbert and Wiesel 1979, 1983, 1989; Rockland and Lund 1983; Rockland et al. 1982; Schmidt et al. 1997), electrophysiological (Ts'o and Gilbert 1988; Ts'o et al. 1986), and imaging (Das and Gilbert 1995; Malach et al. 1993) studies reveals that in V1 intrinsic horizontal connections formed by the axons of pyramidal cells can link cells with nonoverlapping receptive fields (RFs) and similar orientation preference. This intra-cortical circuitry enables V1 cells to integrate information over a relatively large portion of the visual field, and is suitable to mediate a wide variety of contextual influences that show orientation dependency.
Line segments with specific geometric relationships are perceptually grouped to form visual contours (Wertheimer 1923) that are salient and “pop out” even if embedded in complex environments. The saliency of a contour, and the connectivity of its elements, has been found to follow the Gestalt rule of good continuation (Field et al. 1993). Despite the global nature of contours, the interactions mediating the saliency effect may be more local. It has been proposed that contour integration can be performed by the same neural mechanism underlying local colinear brightness induction (Dresp 1993; Kapadia et al. 1995; Polat and Sagi 1993, 1994b). Computational modeling has revealed that interactions between locally connected V1-neuron-like filters suffice to extract globally salient contours from noise context based on geometrical attributes like smoothness (Li 1998; Pettet et al. 1998; Ullman 1992; VanRullen et al. 2001; Yen and Finkel 1998). Based on their psychophysical observations, Field et al. (1993) coined the term “association field” to account for the local interactions between contour elements. Only colinearly and smoothly arranged elements are strongly associated. With respect to these functions the association field would map well onto the horizontal connections found in V1 if the spatial extent of the interactions underlying the association field matches that of the lateral interactions within V1.
To make such a comparison, we examined the perceptual rules governing contour saliency to quantify the extent of the visuospatial interactions underlying saliency. To this end, we tested the effect of changing separation between stimulus elements on the perceived saliency of contours.
Stimulus generation and data collection
Stimuli were generated by a VSG2/5 visual stimulus generator (Cambridge Research Systems) under computer control on a 21-inch Sony FD Trinitron monitor (model GDM-F500) with a resolution of 1536 × 1152 pixels and a refresh rate of 60 Hz. The viewing distance was 35 cm. Each pixel subtended 0.04°, and the overall display area was 58.4° × 43.8°. The experiments were conducted in a dimly lit room.
Stimuli consisted of an array of randomly oriented line segments (see Fig. 1, B, D, and F for examples). Each line, which was anti-aliased by the built-in function of the commercial software library (VSL version 6.085), was about 0.4° long and 0.08° wide. The positions of line segments were defined by geometric rules that allowed precise control of stimulus parameters, as illustrated in Fig. 1. A circular area of 43.8° in diameter was divided into small square areas of designated size (Fig. 1 A). Each grid position contained a line segment whose position was jittered within the square compartment except when it was part of a contour (Fig. 1 B). A red fixation point (FP) was drawn in the center of the circular patch. The invisible dividing grids controlled the average density of line segments and thus the average spacing between them. The average density was defined as the total number of grids (and thus line segments) within the whole stimulus patch, and the average spacing was defined as the width or height of an individual grid box. By colinearly aligning nearby elements along a diagonal of the grids, a straight contour was generated within the otherwise random background. The length of the contour was determined by the number of colinear lines, and its location was controlled by the positions of those grid boxes whose line segments were aligned along a diagonal. The eccentricity of the contour was defined as the radius of the circle around the FP that was tangent to the contour path, and the center element of the contour was located at the center of the grid box whose distance to the FP was the shortest among all the grid boxes along the alignment axis so that the contour extended about the same extent to either side in the periphery. In this study the eccentricity at which the center of the contour was located ranged between 3.6° and 6.4°. 
The two endpoints of the contour extended into the periphery of larger eccentricities, which depended on the contour length and could be up to 18°.
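The background layout described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the VSG/VSL code used in the experiments: the function name `background_elements` is hypothetical, a grid box is assumed to belong to the patch when its center lies inside the circle, and the jitter is assumed to be uniform and small enough that a 0.4° line at any orientation stays inside its own box (which keeps segments nonoverlapping). The embedded contour and the fixation point are omitted.

```python
import math
import random

def background_elements(patch_diam=43.8, box=0.8, line_len=0.4, seed=0):
    """Return (x, y, orientation_deg) for one jittered line per grid box.

    Assumptions (not taken from the paper): a box is used when its center
    falls inside the circular patch, and jitter is uniform within a range
    that keeps the whole line inside its box.
    """
    rng = random.Random(seed)
    radius = patch_diam / 2.0
    n = math.ceil(radius / box)              # boxes per half-width of the patch
    max_jitter = (box - line_len) / 2.0      # assumed jitter limit (deg)
    elements = []
    for i in range(-n, n):
        for j in range(-n, n):
            cx, cy = (i + 0.5) * box, (j + 0.5) * box   # box center (deg)
            if math.hypot(cx, cy) > radius:
                continue                     # outside the circular patch
            x = cx + rng.uniform(-max_jitter, max_jitter)
            y = cy + rng.uniform(-max_jitter, max_jitter)
            theta = rng.uniform(0.0, 180.0)  # random orientation (deg)
            elements.append((x, y, theta))
    return elements

elements = background_elements()
print(len(elements), "line segments in the patch")
```

With the default 0.8° boxes the count is close to the patch area divided by the box area, which is the sense in which the grid size sets the average density.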
The orientation of the contour was controlled by rotating the whole stimulus pattern around the FP, which also changed the position of the contour. Using this design a straight contour could be generated along the tangent of an invisible circular path at any given combination of position and orientation while keeping the contour eccentricity unchanged. The orientation and position of the contour were randomly generated in each trial in this manner.
To control the spacing between contour elements while keeping the density of line segments unchanged, a skew angle was introduced to the square grids described in Fig. 1 A. The skew angle could be either clockwise (angle “a” in Fig. 1 C) or counterclockwise (angle “a” in Fig. 1 E). The clockwise skew (a > 0) increased the spacing between contour elements (Fig. 1, C and D), while the counterclockwise skew (a < 0) reduced the spacing between contour elements (Fig. 1, E and F). This transformation of dividing grids did not alter the width and height of each grid box, so the area of each compartment box remained unchanged at all skew angles. The number of grid boxes within the whole circular patch was also constant. By choosing proper skew angles, the spacing between contour elements could be precisely adjusted without changing the overall density of line segments. For example, if one compares the three stimulus patterns in Fig. 1, B, D, and F, no discernible difference is introduced in the noise context while the spacing between colinear lines is changed. This feature in stimulus design is crucial for the purpose of this study. Another important feature of the stimulus array is that the contour was embedded by aligning some of the array elements within the stimulus pattern. This ensured that no extra density cue was introduced within the complex stimuli. The skew angles used in this study ranged from −45.0° to +43.8°, which defined relative contour spacing from 1.0 to 2.2 with respect to the average spacing between background elements. In this paper the term relative contour spacing is used to denote the spacing between contour elements relative to the average spacing between background elements (as mentioned previously, the average spacing is defined as the width or height of an individual grid box). The absolute center-to-center distance between two adjacent contour elements is referred to as contour spacing in visual angle. 
A relative spacing of 1.0 indicates that the spacing between contour elements is equal to the average spacing between background elements. This is the minimum relative spacing that can be generated with this stimulus design. We emphasize that no density cue was introduced when the relative contour spacing was adjusted.
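The relation between skew angle and relative contour spacing can be illustrated with a short calculation. The sketch below assumes that the skew acts as a shear that moves grid node (i, j) to (i + j·tan a, j) in units of the box size, and that the contour runs along the diagonal joining nodes (0, 0) and (1, 1); neither detail is stated explicitly in the text, and the function name is hypothetical. Under these assumptions the spacing is sqrt((1 + tan a)² + 1), which reproduces the quoted range of 1.0 to 2.2 for skew angles of −45.0° to +43.8°.

```python
import math

def relative_contour_spacing(skew_deg: float) -> float:
    """Center-to-center spacing of diagonal neighbours on a sheared unit grid.

    Assumption: a shear that leaves box width and height at 1, so the
    diagonal neighbour of (0, 0) sits at (1 + tan(a), 1).
    """
    t = math.tan(math.radians(skew_deg))
    return math.hypot(1.0 + t, 1.0)

for a in (-45.0, 0.0, 43.8):
    s = relative_contour_spacing(a)
    print(f"skew {a:+6.1f} deg -> relative contour spacing {s:.2f}")
```

Because a shear preserves the area of each box, the number of boxes in the patch, and hence the overall density, does not depend on the skew angle in this sketch.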
Unless otherwise indicated, the whole stimulus pattern was 43.8° in diameter and was divided by invisible grid boxes of 0.8° in width and height. The luminance of the composing line segments was 82.1 cd/m². The luminance of the background was 4.1 cd/m², against which the line segments were presented.
A 2-interval-forced-choice method was employed (Fig. 2) to measure contour detectability. Two successive stimulus intervals were presented in each trial. Before each stimulus interval there was a leading blank period of 500 ms within which only the FP was displayed. In each stimulus interval a stimulus pattern was presented for 150 ms and a mask was presented for 300 ms immediately after that. An embedded contour was present during only one of the two intervals, and the stimulus pattern in the other interval was just noise (i.e., similar complex background without any contour). The mask following both stimulus intervals also consisted of only noise. The subjects had to indicate by pressing one of two buttons which stimulus interval contained a straight contour. The inter-trial interval was about 3 s. Different stimulus conditions in each experiment were randomized.
The instructions to the subjects were to 1) fixate on the center red dot (FP), 2) attend to the surrounding area, 3) detect which of the two stimulus intervals in a trial contained a straight contour composed of strictly aligned line segments, and 4) make an arbitrary choice if uncertain. Before collecting data in each new experiment, the subjects underwent some operational training in which the contours to be detected were highlighted in red (see Fig. 3 D for example) while all the other parameters were exactly the same as in real test conditions. The experiment began after the subjects were familiarized with the stimulus patterns as well as the display timing and knew exactly what they were going to do. Error feedback was given by the computer with a beep sound.
Contour detectability was measured under various test conditions and psychometric functions were generated by logistic regression of the data points. Each data point was based on 70–420 responses collected in different sessions on different days. For each data point, the responses were randomly and evenly distributed into seven groups. The correct detection ratio (total number of correct responses divided by total number of trials) was calculated for each of the seven groups. Based on the seven ratios, a final mean detection ratio (r) was calculated along with SE. A detection ratio of r = 0.5 represents the chance level, and r = 1.0 represents 100% correct detection of contours. Finally, based on the mean detection ratio (r), contour detectability (p) was calculated for each data point using the following formula: p = (r − 0.5)/0.5. Using this calculation, the chance levels are represented by p ≤ 0, and 100% correct detection is represented by p = 1.0. Detectability of p = 0.5, which corresponds to 75% detection ratio (r = 0.75), is defined as the threshold for reliable detection of the contours. In fitting the data points with psychometric curves by logistic regression, all data points with p ≤ 0 were treated as p = 0. The value of the stimulus parameter at threshold was calculated by interpolation at the detectability level of p = 0.5.
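The bookkeeping for a single data point can be sketched as follows. This is a minimal illustration and not the analysis code used in the study: the function name `detectability` is hypothetical, and the mapping from detection ratio to detectability is assumed to be the linear rescaling p = (r − 0.5)/0.5, which is the linear form consistent with the anchor values given in the text (r = 0.5 gives p = 0, r = 0.75 gives p = 0.5, r = 1.0 gives p = 1.0).

```python
import random
import statistics

def detectability(responses, n_groups=7, seed=0):
    """Return (mean detection ratio r, SE of r, detectability p).

    responses: sequence of 0/1 values, 1 meaning a correct response.
    Assumption: p = (r - 0.5) / 0.5, a linear rescaling of r.
    """
    rng = random.Random(seed)
    shuffled = list(responses)
    rng.shuffle(shuffled)                            # random assignment
    groups = [shuffled[i::n_groups] for i in range(n_groups)]  # even split
    ratios = [sum(g) / len(g) for g in groups]       # per-group detection ratio
    r = statistics.mean(ratios)
    se = statistics.stdev(ratios) / n_groups ** 0.5  # SE across the groups
    p = (r - 0.5) / 0.5
    return r, se, p

# Example with made-up responses: 140 trials, 75% correct.
r, se, p = detectability([1] * 105 + [0] * 35)
print(f"r = {r:.3f}, SE = {se:.3f}, p = {p:.3f}")
```

In this example the response vector is invented for illustration; a 75% detection ratio maps onto the threshold detectability of 0.5.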
Three naïve subjects (HY, MK, MU) and one author (WL) were tested in this study. All four subjects were adults and had normal vision (with optical corrections where necessary).
Experiment 1: contour saliency as a function of number of colinear line segments tested at various contour spacings
A set of stimulus examples is given in Fig. 3 to demonstrate the effects of spacing between colinear lines on contour saliency. The overall density of line segments was the same. A straight contour popped out immediately over a range of contour spacings (Fig. 3 A). Saliency decreased with increasing spacing between contour elements (Fig. 3 B). When the spacing was beyond some range the contour disappeared (Fig. 3 C). Without scrutiny or serial search it is difficult to spot the contour in Fig. 3 C. The whole stimulus pattern appears to be just an array of randomly oriented lines, but a stack of embedded colinear lines is in fact present, and the contour becomes visible when highlighted in red (Fig. 3 D).
In addition to the spacing between colinear elements, the number of colinear lines was also varied. The contour was generated at random points along the tangent of an invisible circular path that fixed the eccentricity of all contours at 4.0°. The density of line segments in all stimuli was kept unchanged as the spacing between contour elements was varied (see methods for details about stimulus generation).
Results from two subjects are shown in Fig. 4, A and B. Each curve (logistic regression of each set of data points) shows contour detectability as a function of number of colinear lines. The family of curves represents detectability at different contour spacings. For both subjects, contour detectability improved as the number of colinear line elements was increased, and detectability deteriorated with increasing spacing between contour elements. At short relative contour spacing (1.0–1.4), contour detectability saturated with the alignment of about nine line segments. At intermediate spacing (relative spacing, 1.6–1.8), detectability increased progressively with a larger number of colinear lines, suggesting a progressive buildup of local colinear interactions. When the separation between colinear elements was beyond a certain range (relative spacing > 1.8), however, no matter how many colinear line segments were embedded in the noise, no contour could be detected, indicating the breakdown of propagation of local colinear interactions.
The same set of data is re-plotted for both subjects in Fig. 4, C and D, respectively. Instead of using the number of colinear lines, contour detectability is plotted against the absolute contour length, end to end, in visual degrees. The integration of contour elements extended over a very large spatial distance, up to 30° under our test conditions. This was subject to the requirement, for both subjects, that the distance between the colinear elements be kept under a separation (center-to-center) of about 2°.
Experiment 2: effects of density of line segments on contour saliency
Results from experiment 1 showed that, when the overall density of line segments remained unchanged, the interactions between colinear segments decreased with increasing contour spacing. In the next experiment, the effects of density of line segments on contour saliency were investigated. Contour detectability was measured as a function of spacing between colinear line segments at four different densities.
As described in methods (Fig. 1), we designed our stimuli to allow independent control of the global density of line segments as well as the relative spacing between contour elements. The general rules for stimulus generation in this particular experiment were the same except that some extra considerations were taken.
The 43.8° circular display area was divided by invisible grids of 0.8°, 1.6°, 3.2°, and 6.4° spacing (see methods for details about stimulus generation). The resulting relative density of line segments in these four stimulus conditions was 64:16:4:1 (see Fig. 5 for example). The line segments of which the stimulus arrays were composed were 0.4° × 0.08° in size for all conditions. The number of colinear lines was 25, 13, 7, and 4, respectively, for the four density conditions. With this design the end-to-end length of each contour was kept constant at all densities for any given relative contour spacing. The contour length was 19.6° at a relative spacing of 1.0. In addition to background density, the relative contour spacing was also varied independently at each density to measure the critical spacing at which the saliency was disrupted. Increasing the relative contour spacing also increased the contour length for all densities.
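The arithmetic behind the four density conditions can be checked with a brief sketch. It assumes that the end-to-end contour length equals the (n − 1) center-to-center gaps plus one line length of 0.4°, an assumption that reproduces the stated 19.6° at a relative spacing of 1.0; the function names are hypothetical.

```python
def contour_length(n_lines, grid_deg, relative_spacing=1.0, line_len_deg=0.4):
    """End-to-end contour length in degrees of visual angle.

    Assumption: (n - 1) center-to-center gaps plus one full line length.
    """
    return (n_lines - 1) * relative_spacing * grid_deg + line_len_deg

def relative_density(grid_deg, sparsest_grid_deg=6.4):
    """Density relative to the sparsest condition (one line per grid box)."""
    return (sparsest_grid_deg / grid_deg) ** 2

# (grid spacing in deg, number of colinear lines) for the four conditions
conditions = [(0.8, 25), (1.6, 13), (3.2, 7), (6.4, 4)]
for grid, n in conditions:
    print(f"grid {grid:.1f} deg: density x{relative_density(grid):.0f}, "
          f"contour length {contour_length(n, grid):.1f} deg")
```

Halving the number of gaps each time the grid spacing doubles (24, 12, 6, 3) is what keeps the contour length constant across densities, and the same function shows that the length grows in proportion to the relative contour spacing.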
A subset of stimulus examples is shown in Fig. 5, where the conditions of relative contour spacing 1.0 at four different densities are demonstrated. It can be seen that at the same relative contour spacing the saliency of contours deteriorated with decreasing density of line segments.
The experiments were conducted on two subjects and the results are presented in Fig. 6. Each curve shows contour detectability as a function of relative contour spacing. The different curves represent different context densities.
At a given context density, contour saliency decreased with increasing contour spacing (Fig. 6, A and B), consistent with the results from experiment 1. The data also show that at any given relative contour spacing contour detectability decreased with decreasing density of line segments. If one takes detectability of 0.5 (75% detection ratio) as the threshold for reliable contour detection, the threshold relative contour spacing decreased progressively as the density of lines decreased. It is noteworthy that when the context density dropped below a certain level (or the average context spacing was increased above a certain value), regardless of the relative contour spacing, contour detectability was far below the threshold and very close to the chance levels for both subjects. For subject WL, contour detectability was close to chance levels at the density corresponding to 6.4° average context spacing, and for subject HY, the detectability already dropped to chance levels at the density corresponding to 3.2° average spacing.
To measure the absolute contour spacing at detection threshold (detectability 0.5, or 75% detection ratio) for different context densities, the same data are re-plotted in Fig. 6, C and D for the two observers. Contour detectability is plotted as a function of the absolute contour spacing in visual degrees. Four curves correspond to data obtained at four context densities. By taking into account only those context densities giving suprathreshold detectability, the contour spacing at threshold for subject WL was between 1.50° and 3.46°, depending on the context density. For subject HY, the spacing at threshold was between 1.45° and 1.95°.
The data shown in Fig. 6 suggest that contour detectability is a function not only of the relative spacing of contour elements but also of the average density of the overall pattern. At a given context density, there exists a critical spacing between contour elements beyond which no salient contour can be detected. Similarly, at a given relative contour spacing, there also exists a critical context density (or average spacing between background elements) for the generation of salient contours.
It should be kept in mind that, although smaller contour spacing produces more salient contours, when the spacing between the contour elements is smaller than the average spacing between the background noise elements (that is, relative contour spacing smaller than 1.0), subjects can use the cue of density for detection of colinear lines rather than the percept of a contour based on geometric grouping processes (Kovács et al. 1999). At contour spacings that are higher than the average spacing of the overall pattern (that is, relative contour spacing larger than 1.0), the percept becomes more of a contour, and it depends on geometric attributes like colinearity and iso-orientation of the line elements rather than density cues. In other words, a relative contour spacing of 1.0 (the contour and noise elements are equally spaced) defines the most salient contours for all context densities without introducing any density cue.
In another experiment, by fixing the relative contour spacing at 1.0, we compared the detectability of contours embedded in the four noise backgrounds of different densities (see Fig. 5 for demonstration of stimuli). This enables us to estimate the maximum spacing between contextual stimuli that disrupts contour saliency and thus provides an estimate of the breakdown or maximal distance of local interactions involved in the formation of perceptual contours embedded in noise.
A total of four subjects were tested (Fig. 7). Contour detectability is plotted as a function of the average spacing between line segments. Keep in mind that the contour and noise elements were equally spaced in this experiment. As shown in Fig. 7, contour detectability decreased with increasing average context spacing. If one takes detectability of 0.5 as the threshold for reliable detection of the contour, the threshold average spacing is between 1.63° and 3.61° for the four subjects tested, or 2.3° ± 0.9° (mean ± SD) when averaged across the subjects.
Experiment 3: oblique effect in contour detection
It has been reported that in V1, far more cells prefer orientations close to horizontal and vertical than oblique (Celebrini et al. 1993; De Valois et al. 1982; Kennedy and Orban 1979). If contour integration is based on interactions of V1 cells via intrinsic horizontal connections which link cells with similar orientation preference, one would expect to see some anisotropy of saliency of contours in the orientation domain.
Viewing Fig. 8, one can observe the effect of orientation on contour saliency. Except for the orientation of each contour, all other parameters are the same in these two patterns. The horizontal contour in Fig. 8 A is more salient than the oblique contour in Fig. 8 B. By simply tilting the figure clockwise or counterclockwise back and forth, one perceives a change in the relative saliency of the two contours.
We measured contour detectability as a function of contour orientation and position. Stimuli were similar to those used in experiment 1. The whole stimulus pattern extended 43.8° in diameter and was divided by invisible grid boxes of 0.8° in size. A straight contour was generated along the tangent of an invisible circular path of 4.0° in radius around the FP (see methods for details about stimulus generation), fixing the eccentricity of the contour at 4.0°. The number of line segments comprising the contour was fixed at 9, and the relative contour spacing was fixed at 1.6. From the results previously shown in Fig. 4 we know that contour detectability was not saturated at these settings for the two subjects tested. For the same two subjects, a combination of sixteen orientations and positions was tested (Fig. 9). The stimuli were presented in an interdigitated fashion.
Contour saliency showed strong anisotropy (Fig. 9), where horizontal and vertical contours were much more easily detected than oblique ones.
Experiment 4: training effects
In this study, subject WL was an experienced observer. WL had a higher detection ratio of contours than naïve observers under the same test condition and was able to detect contours with larger line spacing (Figs. 4, 6, and 7). This suggested some learning in contour detection. To test for learning effects in the contour saliency task, we started with naïve observers and measured their performance over time.
The data previously shown in Fig. 7 for subject MU were based on the average of responses collected over 12 days. In Fig. 10, the same data from MU were separated into six groups chronologically and demonstrated a strong learning effect. Under the same test conditions the performance in contour detection improved with training (Fig. 10 A). This improvement was most prominent at intermediate spacing between stimulus elements (solid arrow in Fig. 10 A). Note that the training had little effect when the stimulus elements were far apart (open arrow in Fig. 10 A). Due to a significant increase of detectability at intermediate spacing, the critical spacing between contextual lines measured at threshold (detectability of 0.5) also increased by about a factor of 2 for this subject (Fig. 10 B).
Global contour integration and local interactions within V1
Our results show that integration of contour elements can carry over very large distances as long as the spacing between stimulus elements is kept below a certain distance. The critical spacing between colinear stimulus elements (Fig. 6) and the critical spacing between contextual noise elements (Fig. 7) provide a dimension for the expected local interactions underlying global contour integration, approximately 2°. This range of local interactions is in close correspondence with the spatial extent of horizontal interactions and connectivity between V1 neurons. The contextual interactions observed for neurons in the superficial layers of V1 in behaving monkeys show the same spatial extent (Kapadia et al. 1995, 2000). The visuotopic representation of intrinsic horizontal connections, mapped at comparable visual field eccentricity of the stimuli used in the current study, also originates from sites as far as 2° from either side of the target neurons (Stettler et al. 2002).
Our results are consistent with the idea that global contour saliency is mediated by local interactions of intermediate range that can cascade over multiple stages. Based on our findings together with the increasingly converged results from psychophysical, physiological, anatomical, and computational studies, a simple schematic diagram of neural connections in V1 can be used to explain the perceptual phenomenon of saliency of contours embedded in complex environments (Fig. 11). The local interactions mediating colinear brightness induction and other contextual modulation effects can cascade across many links in the horizontal pathway via intrinsic horizontal connections. When more adjacent links are activated by more stimulus elements having certain geometric properties, the horizontal interactions will be reinforced, resulting in the pop-out of a contour from the noisy environment. The diagram in Fig. 11 illustrates the spatial constraint of these interactions. If the distance between two stimulus elements is too large, out of reach of horizontal connections between two adjacent links, the propagation of horizontal interactions will be disrupted. Consequently, the colinear elements will blend into the noisy environments, and no contour will pop out. This diagram only illustrates the connectivity along the contour path. The influence of background density suggests a more complex pattern of interactions (for examples of models see Li 1998; Pettet et al. 1998; Ullman 1992; VanRullen et al. 2001; Yen and Finkel 1998).
There are other points of similarity between the current psychophysical findings and the functional properties of neurons in V1. First, our data show that contour saliency is a function not only of the geometry of the contour elements but also of the surrounding elements in the visual scene. That is, the colinear interactions between two contour elements are further modified by the greater context within which the contour elements are embedded. This suggests a cascade of nonlinearities, where the mutual influences of the line elements within the contour depend in turn on the placement of elements in the surround. In noisy contextual environments, colinear brightness induction (Polat and Bonneh 2000) and vernier discrimination (Herzog and Fahle 2002) follow the same pattern of interactions, and these interactions are very similar to the contextual modulation effects observed in V1 neurons (Kapadia et al. 2000). Second, our results show that the absolute critical spacing between colinear lines increased with decreasing noise density (Fig. 6, C and D). This correlates well with the contextual modulation seen in V1, where contextual modulation of neuronal responses by noise background depends on the density of the context. Sparser noise produces weaker contextual inhibition (Knierim and Van Essen 1992). Finally, a strong oblique effect was observed in contour detection (Fig. 9). This perceptual phenomenon closely correlates with the anisotropy observed in distribution of optimal orientations of V1 neurons (Celebrini et al. 1993; De Valois et al. 1982; Kennedy and Orban 1979). All these observations are consistent with the connectivity and functional properties of cells in V1.
The spatial scale constraint of the local interactions underlying contour integration, about 2°, is quite close to that observed for other contextual interactions. For example, the tilt illusion diminishes with the distance between the target and inducing lines, and is very weak when the distance is larger than 1° (Virsu and Taskinen 1975; Wenderoth and Johnstone 1988; Westheimer 1990). Perceptual pop-out based on orientation contrast of local stimuli peaks at line spacing between 1° and 2° (Nothdurft 2000). Colinear brightness induction drops off as the target and inducing lines are separated, and the effect disappears when the separation is larger than about 2° (Dresp and Grossberg 1997; Kapadia et al. 1995). The similarity in effective spatial range of interactions suggests that these different contextual phenomena might have a common neural substrate. The capability for V1 neurons to multiplex their function may come in part from their RF structure. The existence of spatially segregated and functionally distinct components in the RF of V1 neurons (Kapadia et al. 2000) indicates that contextual influences on V1 neuronal responses are highly dependent on the geometric relationship between the stimulus elements. The segregation around the RF of compartments with opposing contextual influences enables the same V1 neurons to participate in multiple perceptual processes mediating different perceptual phenomena, such as the tilt illusion and brightness induction. Different higher-level visual tasks, such as contour integration and texture segregation, could also be accomplished by the same neural apparatus based on distinct contextual modulation processes within the same cell population. In addition to the spatial organization of contextual modulation, top-down influences like attention and perceptual task contribute to the dynamics of neural circuits and thus greatly increase the versatility of V1 neurons in multiplexing visual information processing (Crist et al. 2001; Ito and Gilbert 1999). In these respects one can speculate how the same cortical area or even the same neurons can be involved in multiple perceptual phenomena and participate in multiple perceptual tasks.
Some discrepancies between the phenomena of colinear brightness induction and contour saliency have been observed and led to questions about the commonality of the underlying mechanisms (Williams and Hess 1998). One issue is that contour integration is much more robust than colinear brightness induction to spatial jitter (e.g., orientation and phase jitter) of colinear elements. This can be accounted for by the cascading mechanism in contour integration (Fig. 11), where more horizontal links are activated and thus the colinear facilitation is reinforced along the contour path, resulting in more tolerance to spatial jitter of contour elements. A second issue is the lower contrast levels at which colinear brightness induction takes place. Nevertheless, it has been shown that induction works at high contrast as well as at threshold levels (Ito and Gilbert 1999), and is modified by attentional state (Freeman et al. 2001; Ito and Gilbert 1999) and other contextual stimuli (Ito and Gilbert 1999). A third issue is that contours pop out of the complex background but do not appear brighter than the context, as distinct from colinear brightness induction.
This raises the controversy in contour integration about the form of the neural code for saliency, and how this might be distinguished from the signals underlying brightness perception. Some studies suggest that contour saliency is derived from a general increase of neuronal responses that results from facilitatory horizontal interactions (Kapadia et al. 1995, 1999; Pettet et al. 1998; Polat and Bonneh 2000), while other studies suggest that temporal encoding could be involved in saliency effects (Gray 1999; Gray et al. 1989; Li 1998; Singer 1999; Yen and Finkel 1998). This discussion must also incorporate the prominent role of top-down influences in early processing, and the possibility for multiplexing of signals for different perceptual tasks (Crist et al. 2001). Brightness and saliency might be distinguished by such top-down influences, and by the difference in attentional state, which may provide an efference copy along with the higher levels of activity.
Learning and top-down influences
Learning exerts strong effects on contour detection, as reflected in the evidence that training significantly increases the detection ratio for the same test condition and increases the spatial range of local interactions. An increase in the spatial range of colinear interactions with training has also been reported in brightness induction. Polat and Sagi (1994a) found that training can increase the effective range of colinear facilitation by a factor of 3. They proposed that this effect reflects an increased range of interactions via a cascade of filters that are locally connected. We speculate that the increase of critical spacing in contour detection shares similar neural mechanisms. A parsimonious explanation is that perceptual learning involves strengthening of existing local interactions of intermediate range. This is consistent with the observation that the most prominent improvement in contour detection with training was observed for intermediate spacing (Fig. 10A), and that contour interactions for experienced and naïve observers asymptote at longer spacings of similar extent (Fig. 7). Kovács et al. (1999) have proposed that the learning in contour detection takes place in an early cortical area, based on the observation that learning in contour detection is visual cue specific, with no transfer between contours defined by orientation and contours defined by color.
Our findings suggest that global contour saliency is based on local integration mechanisms of intermediate spatial extent, comparable to the interactions observed in colinear facilitation and other contextual modulation phenomena both in perception and in neuronal responses in V1. Our data show that these interactions can cascade over very large distances as long as the spacing of stimulus elements is kept within a limited range. Our results also indicate that learning and top-down influence can enhance these local interactions. We propose that intrinsic horizontal connections in V1 are well suited to mediate the global contour saliency in a cascading manner because of their extent and orientation specificity. The Gestalt rules of perceptual organization are already represented at least in part in the functional properties of neurons in the primary visual cortex.
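The cascading idea summarized above can be illustrated with a deliberately simplified toy sketch. It is not the model used in this study: it assumes a hard cutoff at the critical spacing of about 2° (the data show a graded falloff), treats the contour as a one-dimensional row of colinear elements, and ignores background noise density, orientation jitter, and training. The function name `longest_linked_span` and its parameters are hypothetical, introduced only for illustration.

```python
def longest_linked_span(positions_deg, critical_spacing_deg=2.0):
    """Toy cascade: neighbouring colinear elements link only if their
    spacing does not exceed the critical spacing (assumed hard cutoff).
    Returns the longest visual extent (deg) covered by one linked chain."""
    positions = sorted(positions_deg)
    if not positions:
        return 0.0
    best = 0.0
    chain_start = positions[0]
    for prev, cur in zip(positions, positions[1:]):
        if cur - prev > critical_spacing_deg:
            # Gap exceeds the local interaction range: the chain breaks here.
            chain_start = cur
        best = max(best, cur - chain_start)
    return best


# 31 elements spaced 1 deg apart: local links cascade across the whole row.
near = [i * 1.0 for i in range(31)]
# 13 elements spaced 2.5 deg apart: no pair falls within the local range.
far = [i * 2.5 for i in range(13)]

print(longest_linked_span(near))  # 30.0
print(longest_linked_span(far))   # 0.0
```

In this sketch, purely local links chain together to span 30° when the element spacing stays within the assumed range, whereas the same total extent with 2.5° spacing yields no integration at all, regardless of how many elements are added.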
We thank the volunteer subjects. We also thank G. Westheimer and M. Sigman for comments on the manuscript.
This work was supported by National Eye Institute Grant EY-07968.
Address for reprint requests: C. D. Gilbert, The Rockefeller University, 1230 York Ave., New York, NY 10021.
Copyright © 2002 The American Physiological Society