J Neurophysiol (November 1, 2002). 10.1152/jn.00289.2002
Submitted on 17 April 2002
Accepted on 28 June 2002
The Rockefeller University, New York, New York 10021
ABSTRACT
Li, Wu and Charles D. Gilbert. Global Contour Saliency and Local Colinear Interactions. J. Neurophysiol. 88: 2846-2856, 2002. Our visual system can link components of contours and segregate contours from complex backgrounds based on geometric grouping rules. This is an important intermediate step in object recognition. The substrate for contour integration may be based on contextual interactions and intrinsic horizontal connections seen in primary visual cortex (V1). We examined the perceptual rules governing contour saliency to determine whether the spatial extents of contextual interactions and horizontal connections match those mediating saliency. To quantify these rules, we used stimuli composed of randomly oriented nonoverlapping line segments. Salient contours within this complex background were formed by colinear alignment of nearby segments. Contour detectability was measured using a 2-interval-forced-choice design. Contour detectability deteriorated with increasing spacing between contour elements and improved as the number of colinear line elements was increased. At short contour spacing, the detectability reached a plateau with alignment of a few line segments that together formed a contour subtending several visual degrees. At intermediate spacing, saliency built up progressively with a greater number of colinear lines, extending up to 30°. When contour spacing was beyond a critical range (about 2°), however, the detectability dropped to chance levels, regardless of the number of colinear lines. Contour detectability was found to be a function not only of the relative spacing of contour elements with respect to the noise elements but also of the average density of the overall pattern. Furthermore, training significantly improved contour detection, increasing the critical spacing of line elements beyond which contours were no longer detectable. 
Our data suggest that global contour integration is based on mechanisms of limited spatial extent, comparable to the interactions observed in V1. These interactions can cascade over larger distances provided the spacing of stimulus elements is kept within a limited range.
INTRODUCTION
The visual world is perceived as
organized objects, and the percept of a visual object is influenced by
the global organization of visual scenes. Early last century, Gestalt
psychologists investigated this perceptual phenomenon and summarized
the rules governing perceptual organization, including proximity,
similarity, continuity, and closure (Wertheimer 1923).
The Gestalt grouping rules can be successfully used to explain a series
of perceptual phenomena that fall under the rubric of contextual interactions.
Contextual interactions in visual perception have been observed in a
number of stimulus domains, including brightness (brightness induction), color (simultaneous color contrast), orientation (tilt illusions and saliency from orientation contrast), spatial scale (size
illusions), and depth (assimilation and repulsion of perceived depth of
nearby objects). Similar contextual interactions have been reported at
the neuronal level in the primary visual cortex (V1). For example, the
optimal orientation of V1 neurons is affected by the orientation of
contextual lines in a way similar to the tilt illusion (Gilbert and Wiesel 1990). Neuronal responses in V1 are strongly
affected by orientation contrast of local stimuli, comparable to the
perceptual pop-out based on orientation contrast (Kastner et al. 1997; Knierim and Van Essen 1992; Li et al. 2000). Responses of V1 neurons to an optimally oriented
stimulus are facilitated by colinear flanks (Kapadia et al. 1995; Nelson and Frost 1985; Polat et al. 1998), a neuronal correlate of colinear brightness induction
(Dresp 1993; Kapadia et al. 1995; Polat and Sagi 1993, 1994b). Contextual interactions
seen in V1 are also related to aspects of intermediate level vision
such as figure-ground segregation and surface brightness perception
(Lamme 1995; Rossi and Paradiso 1999; Rossi et al. 1996; Zipser et al. 1996).
The close similarity between the contextual interactions observed in
perception and in response properties of V1 neurons suggests that V1
mediates these perceptual phenomena. Evidence from anatomical (Gilbert and Wiesel 1979, 1983, 1989; Rockland and Lund 1983; Rockland et al. 1982; Schmidt et al. 1997), electrophysiological (Ts'o and Gilbert 1988; Ts'o et al. 1986), and
imaging (Das and Gilbert 1995; Malach et al. 1993) studies reveals that in V1 intrinsic horizontal
connections formed by the axons of pyramidal cells can link cells with
nonoverlapping receptive fields (RFs) and similar orientation
preference. This intra-cortical circuitry enables V1 cells to integrate
information over a relatively large portion of the visual field, and is
suitable to mediate a wide variety of contextual influences that show
orientation dependency.
Line segments with specific geometric relationships are perceptually
grouped to form visual contours (Wertheimer 1923) that are salient and "pop out" even if embedded in complex environments. The saliency of a contour, and the connectivity of its elements, has
been found to follow the Gestalt rule of good continuation (Field et al. 1993). Despite the global nature of
contours, the interactions mediating the saliency effect may be more
local. It has been proposed that contour integration can be performed by the same neural mechanism underlying local colinear brightness induction (Dresp 1993; Kapadia et al. 1995; Polat and Sagi 1993, 1994b). Computational
modeling has revealed that interactions between locally connected
V1-neuron-like filters suffice to extract globally salient contours
from noise context based on geometrical attributes like smoothness
(Li 1998; Pettet et al. 1998; Ullman 1992; VanRullen et al. 2001; Yen and Finkel 1998). Based on their psychophysical
observations, Field et al. (1993) coined the term "association field" to account for the local interactions between contour elements. Only colinearly and smoothly arranged elements are
strongly associated. With respect to these functions the association field would map well onto the horizontal connections found in V1 if the
spatial extent of the interactions underlying the association field
matches that of the lateral interactions within V1.
To make such a comparison, we examined the perceptual rules governing contour saliency to quantify the extent of the visuospatial interactions underlying saliency. To this end, we tested the effect of changing separation between stimulus elements on the perceived saliency of contours.
METHODS
Stimulus generation and data collection
Stimuli were generated by a VSG2/5 visual stimulus generator (Cambridge Research Systems) under computer control on a 21-inch Sony FD Trinitron monitor (model GDM-F500) with a resolution of 1536 × 1152 pixels and a refresh rate of 60 Hz. The viewing distance was 35 cm. Each pixel subtended 0.04°, and the overall display area was 58.4° × 43.8°. The experiments were conducted in a dimly lit room.
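The display figures quoted above can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: it assumes a uniform angular size per pixel across the screen (ignoring flat-screen distortion at large eccentricities), and the function names are hypothetical, not part of the VSG software.

```python
# Display parameters as reported in the text.
H_PIXELS, V_PIXELS = 1536, 1152
H_DEG, V_DEG = 58.4, 43.8


def degrees_per_pixel(extent_deg: float, n_pixels: int) -> float:
    """Average visual angle per pixel, assuming uniform angular pixel size."""
    return extent_deg / n_pixels


def to_pixels(size_deg: float, deg_per_px: float) -> int:
    """Nearest whole number of pixels for a stimulus dimension in degrees."""
    return round(size_deg / deg_per_px)


h_res = degrees_per_pixel(H_DEG, H_PIXELS)
v_res = degrees_per_pixel(V_DEG, V_PIXELS)

# Both axes give roughly 0.038 deg per pixel, which rounds to the quoted 0.04.
line_width_px = to_pixels(0.08, h_res)
```

Under this approximation the horizontal and vertical resolutions agree, and a 0.08° line width corresponds to about two pixels.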
Stimuli consisted of an array of randomly oriented line segments (see Fig. 1, B, D, and F for examples). Each line, which was anti-aliased by the built-in function of the commercial software library (VSL version 6.085), was about 0.4° long and 0.08° wide. The positions of line segments were defined by geometric rules that allowed precise control of stimulus parameters, as illustrated in Fig. 1. A circular area of 43.8° in diameter was divided into small square areas of designated size (Fig. 1A). Each grid position contained a line segment whose position was jittered within the square compartment except when it was part of a contour (Fig. 1B). A red fixation point (FP) was drawn in the center of the circular patch. The invisible dividing grids controlled the average density of line segments and thus the average spacing between them. The average density was defined as the total number of grids (and thus line segments) within the whole stimulus patch, and the average spacing was defined as the width or height of an individual grid box. By colinearly aligning nearby elements along a diagonal of the grids, a straight contour was generated within the otherwise random background. The length of the contour was determined by the number of colinear lines, and its location was controlled by the positions of those grid boxes whose line segments were aligned along a diagonal. The eccentricity of the contour was defined as the radius of the circle around the FP that was tangent to the contour path, and the center element of the contour was located at the center of the grid box whose distance to the FP was the shortest among all the grid boxes along the alignment axis so that the contour extended about the same extent to either side in the periphery. In this study the eccentricity at which the center of the contour was located ranged between 3.6° and 6.4°. 
The two endpoints of the contour extended into the periphery of larger eccentricities, which depended on the contour length and could be up to 18°.
The orientation of the contour was controlled by rotating the whole stimulus pattern around the FP, which also changed the position of the contour. Using this design a straight contour could be generated along the tangent of an invisible circular path at any given combination of position and orientation while keeping the contour eccentricity unchanged. The orientation and position of the contour were randomly chosen in this manner on each trial.
To control the spacing between contour elements while keeping the
density of line segments unchanged, a skew angle was introduced to the
square grids described in Fig. 1A. The skew angle could be
either clockwise (angle "a" in Fig. 1C) or
counterclockwise (angle "a" in Fig. 1E). The
clockwise skew (a > 0) increased the spacing between
contour elements (Fig. 1, C and D), while the counterclockwise skew (a < 0) reduced the spacing
between contour elements (Fig. 1, E and F). This
transformation of dividing grids did not alter the width and height of
each grid box, so the area of each compartment box remained unchanged
at all skew angles. The number of grid boxes within the whole circular
patch was also constant. By choosing proper skew angles, the spacing
between contour elements could be precisely adjusted without changing the overall density of line segments. For example, if one compares the
three stimulus patterns in Fig. 1, B, D and
F, no discernible difference is introduced in the noise
context while the spacing between colinear lines is changed. This
feature in stimulus design is crucial for the purpose of this study.
Another important feature of the stimulus array is that the contour was
embedded by aligning some of the array elements within the stimulus
pattern. This ensured that no extra density cue was introduced
within the complex stimuli. The skew angles used in this study
ranged from −45.0° to +43.8°, which defined relative contour
spacing from 1.0 to 2.2 with respect to the average spacing between
background elements. In this paper the term relative contour
spacing is used to denote the spacing between contour elements
relative to the average spacing between background elements (as
mentioned previously, the average spacing is defined as the
width or height of an individual grid box). The absolute
center-to-center distance between two adjacent contour elements is
referred to as contour spacing in visual angle. A relative
spacing of 1.0 indicates that the spacing between contour elements is
equal to the average spacing between background elements. This is the
minimum relative spacing that can be generated with this stimulus
design. We emphasize that no density cue was introduced when the
relative contour spacing was adjusted.
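The relation between skew angle and relative contour spacing can be sketched as follows. This is an illustrative model, not the authors' stimulus code: it assumes the skew is a pure shear in which each grid row is shifted by w·tan(a) relative to the row below, so box width, height, and area are unchanged, and that contour elements sit at the centers of diagonally adjacent boxes. Under that assumption the relative spacing is sqrt(1 + (1 + tan a)^2), which reproduces the reported range (1.0 at −45.0° and about 2.2 at +43.8°). The function names are hypothetical.

```python
import math


def relative_contour_spacing(skew_deg: float) -> float:
    """Distance between diagonal neighbours divided by the grid-box width.

    Assumes a shear: moving up one row shifts the box by tan(skew) box
    widths, so diagonal neighbours are offset by (1 + tan(skew), 1).
    """
    t = math.tan(math.radians(skew_deg))
    return math.hypot(1.0 + t, 1.0)


def skew_for_relative_spacing(rel: float) -> float:
    """Inverse mapping, valid for rel >= 1.0 (skew >= -45 deg)."""
    if rel < 1.0:
        raise ValueError("relative spacing below 1.0 is not reachable")
    return math.degrees(math.atan(math.sqrt(rel * rel - 1.0) - 1.0))
```

With an unskewed square grid (a = 0) the same expression gives sqrt(2), the ordinary diagonal of a square box, which is a useful sanity check on the assumed geometry.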
Unless otherwise indicated, the whole stimulus pattern was 43.8° in diameter and was divided by invisible grid boxes of 0.8° in width and height. The luminance of the composing line segments was 82.1 cd/m². The luminance of the background was 4.1 cd/m², against which the line segments were presented.
The method of 2-interval-forced-choice was employed (Fig. 2) to measure contour detectability. Two successive stimulus intervals were presented in each trial. Before each stimulus interval there was a leading blank period of 500 ms within which only the FP was displayed. In each stimulus interval a stimulus pattern was presented for 150 ms and a mask was presented for 300 ms immediately after that. An embedded contour was only present during one of the two intervals, and the stimulus pattern in the other interval was just noise (i.e., similar complex background without any contour). The mask following both stimulus intervals also consisted of only noise. The subjects had to indicate by pressing one of two buttons which stimulus interval contained a straight contour. The inter-trial interval was about 3 s. Different stimulus conditions in each experiment were randomized.
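The trial structure described above can be laid out as a simple event schedule. The sketch below is illustrative only: the durations come from the text, but the equal-probability choice of the contour interval and all names (`build_trial`, the event labels) are assumptions, since the text does not state how the target interval was selected.

```python
import random

# Durations in milliseconds, as given in the text.
BLANK_MS, STIM_MS, MASK_MS = 500, 150, 300


def build_trial(rng: random.Random):
    """Return (target_interval, events) for one 2-interval trial.

    Each event is (onset_ms, label, duration_ms). The contour appears in
    exactly one interval; choosing it with equal probability is assumed.
    """
    target = rng.choice((1, 2))
    events, t = [], 0
    for interval in (1, 2):
        pattern = "contour+noise" if interval == target else "noise"
        for label, duration in (("blank", BLANK_MS),
                                (pattern, STIM_MS),
                                ("mask", MASK_MS)):
            events.append((t, label, duration))
            t += duration
    return target, events


target, events = build_trial(random.Random(1))
trial_ms = events[-1][0] + events[-1][2]  # total stimulus-sequence duration
```

The two intervals together occupy 2 × (500 + 150 + 300) ms, excluding the response period and the inter-trial interval.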
The instructions to the subjects were 1) fixate on the center red dot (FP), 2) attend to the surrounding area, 3) detect which of the two stimulus intervals in a trial contained a straight contour composed of strictly aligned line segments, and 4) make an arbitrary choice if uncertain. Before collecting data in each new experiment, the subjects underwent some operational training in which the contours to be detected were highlighted in red (see Fig. 3D for example) while all the other parameters were exactly the same as in real test conditions. The experiment began after the subjects were familiar with the stimulus patterns and the display timing and understood the task. Error feedback was given by the computer with a beep sound.
Data analysis
Contour detectability was measured under various test conditions
and psychometric functions were generated by logistic regression of the
data points. Each data point was based on 70-420 responses collected
in different sessions on different days. For each data point, the
responses were randomly and evenly distributed into seven groups. The
correct detection ratio (total number of correct responses divided by
total number of trials) was calculated for each of the seven groups.
Based on the seven ratios, a final mean detection ratio (r)
was calculated along with SE. A detection ratio of r = 0.5 represents the chance level, and r = 1.0 represents 100% correct detection of contours. Finally, based on the mean detection ratio (r) contour detectability (p) was
calculated for each data point using the following formula

$$p = \frac{r - 0.5}{0.5} = 2r - 1$$

The chance level is represented by p ≤ 0, and 100% correct detection is represented by
p = 1.0. Detectability of p = 0.5, which corresponds to 75% detection ratio (r = 0.75),
is defined as the threshold for reliable detection of the
contours. In fitting the data points with psychometric curves by
logistic regression, all data points with p ≤ 0 were treated as p = 0. The value of stimulus parameter at
threshold was calculated by interpolation at detectability of
p = 0.5 level.
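The analysis steps in this section can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' analysis code: the linear mapping from r to p is inferred from the anchor points given in the text (r = 0.5 is chance, r = 0.75 gives p = 0.5, r = 1.0 gives p = 1.0), SE is taken as the standard deviation of the seven group ratios divided by sqrt(7), and all function names are hypothetical.

```python
import math
import random
from statistics import mean, stdev


def detectability(r: float) -> float:
    """Map detection ratio r to detectability p (linear mapping, assumed)."""
    return (r - 0.5) / 0.5


def clamp_for_fit(p: float) -> float:
    """Values at or below chance are treated as p = 0 before curve fitting."""
    return max(0.0, p)


def grouped_ratio(responses, n_groups: int = 7, seed: int = 0):
    """Randomly split correct/incorrect responses into n_groups groups.

    Returns the mean of the per-group correct ratios and its standard error.
    """
    shuffled = list(responses)
    random.Random(seed).shuffle(shuffled)
    groups = [shuffled[i::n_groups] for i in range(n_groups)]
    ratios = [sum(g) / len(g) for g in groups]
    return mean(ratios), stdev(ratios) / math.sqrt(n_groups)
```

For example, `detectability(grouped_ratio(responses)[0])` would give the detectability for one data point, where `responses` is a list of booleans marking correct trials.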
Three naïve subjects (HY, MK, MU) and one author (WL) were tested in this study. All four subjects were adults and had normal vision (with optical corrections where necessary).
RESULTS
Experiment 1: contour saliency as a function of number of colinear line segments tested at various contour spacings
A set of stimulus examples is given in Fig. 3 to demonstrate the effects of spacing between colinear lines on contour saliency. The overall density of line segments was the same in all examples. A straight contour popped out immediately over a range of contour spacings (Fig. 3A). Saliency decreased with increasing spacing between contour elements (Fig. 3B). When the spacing was beyond some range the contour disappeared (Fig. 3C). Without scrutiny or serial search it is difficult to spot the contour in Fig. 3C. The whole stimulus pattern appears to be just an array of randomly oriented lines, but a stack of embedded colinear lines is in fact present, and the contour becomes visible when highlighted in red (Fig. 3D).
In addition to the spacing between colinear elements, the number of colinear lines was also varied. The contour was generated at random points along the tangent of an invisible circular path that fixed the eccentricity of all contours at 4.0°. The density of line segments in all stimuli was kept unchanged as the spacing between contour elements was varied (see METHODS for details about stimulus generation).
Results from two subjects are shown in Fig. 4, A and B. Each curve (logistic regression of each set of data points) shows contour detectability as a function of number of colinear lines. The family of curves represents detectability at different contour spacings. For both subjects, contour detectability improved as the number of colinear line elements was increased, and detectability deteriorated with increasing spacing between contour elements. At short relative contour spacing (1.0-1.4), contour detectability saturated with the alignment of about nine line segments. At intermediate spacing (relative spacing, 1.6-1.8), detectability increased progressively with a larger number of colinear lines, suggesting a progressive buildup of local colinear interactions. When the separation between colinear elements was beyond a certain range (relative spacing > 1.8), however, no matter how many colinear line segments were embedded in the noise, no contour could be detected, indicating the breakdown of propagation of local colinear interactions.
The same set of data are re-plotted for both subjects in Fig. 4, C and D, respectively. Instead of using the number of colinear lines, contour detectability is plotted against the absolute contour length, end to end, in visual degrees. The integration of contour elements extended over a very large spatial distance, up to 30° under our test conditions. This was subject to the requirement, for both subjects, that the distance between the colinear elements be kept under a separation (center-to-center) of about 2°.
Experiment 2: effects of density of line segments on contour saliency
Results from experiment 1 showed that, when the overall density of line segments remained unchanged, the interactions between colinear segments decreased with increasing contour spacing. In the next experiment, the effects of density of line segments on contour saliency were investigated. Contour detectability was measured as a function of spacing between colinear line segments at four different densities.
As described in METHODS (Fig. 1), we designed our stimuli to allow independent control of the global density of line segments as well as the relative spacing between contour elements. The general rules for stimulus generation in this particular experiment were the same except that some extra considerations were taken.
The 43.8° circular display area was divided by invisible grids of 0.8°, 1.6°, 3.2°, and 6.4° spacing (see METHODS for details about stimulus generation). The resulting relative density of line segments in these four stimulus conditions was 64:16:4:1 (see Fig. 5 for example). The line segments of which the stimulus arrays were composed were 0.4° × 0.08° in size for all conditions. The number of colinear lines was 25, 13, 7, and 4, respectively, for the four density conditions. With this design the end-to-end length of each contour was kept constant at all densities for any given relative contour spacing. The contour length was 19.6° at a relative spacing of 1.0. In addition to background density, the relative contour spacing was also varied independently at each density to measure the critical spacing at which the saliency was disrupted. Increasing the relative contour spacing also increased the contour length for all densities.
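The numbers in this design can be checked directly. The sketch below assumes that end-to-end contour length equals the center-to-center span of the elements plus one line length (0.4°), and that density scales with the inverse square of grid spacing; the function names are hypothetical.

```python
LINE_LENGTH_DEG = 0.4  # length of each line segment, from the text


def contour_length(n_lines: int, grid_deg: float,
                   relative_spacing: float = 1.0) -> float:
    """End-to-end contour length in degrees of visual angle."""
    spacing = grid_deg * relative_spacing  # center-to-center distance
    return (n_lines - 1) * spacing + LINE_LENGTH_DEG


def relative_density(grid_deg: float, reference_grid_deg: float = 6.4) -> float:
    """Density relative to the sparsest grid (one element per grid box)."""
    return (reference_grid_deg / grid_deg) ** 2


# Grid spacing (deg) paired with the number of colinear lines at that density.
conditions = [(0.8, 25), (1.6, 13), (3.2, 7), (6.4, 4)]
lengths = [contour_length(n, g) for g, n in conditions]
densities = [relative_density(g) for g, _ in conditions]
```

Each condition yields the same 19.6° contour at a relative spacing of 1.0, and the densities come out in the ratio 64:16:4:1, matching the values stated above.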
A subset of stimulus examples is shown in Fig. 5, where the conditions of relative contour spacing 1.0 at four different densities are demonstrated. It can be seen that at the same relative contour spacing the saliency of contours deteriorated with decreasing density of line segments.
The experiments were conducted on two subjects and the results are presented in Fig. 6. Each curve shows contour detectability as a function of relative contour spacing. The different curves represent different context densities.
At a given context density, contour saliency decreased with increasing contour spacing (Fig. 6, A and B), consistent with the results from experiment 1. The data also show that at any given relative contour spacing contour detectability decreased with decreasing density of line segments. If one takes detectability of 0.5 (75% detection ratio) as the threshold for reliable contour detection, the threshold relative contour spacing decreased progressively as the density of lines decreased. It is noteworthy that when the context density dropped below a certain level (or the average context spacing was increased above a certain value), regardless of the relative contour spacing, contour detectability was far below the threshold and very close to the chance levels for both subjects. For subject WL, contour detectability was close to chance levels at the density corresponding to 6.4° average context spacing, and for subject HY, the detectability already dropped to chance levels at the density corresponding to 3.2° average spacing.
To measure the absolute contour spacing at detection threshold (detectability 0.5, or 75% detection ratio) for different context densities, the same data are re-plotted in Fig. 6, C and D for the two observers. Contour detectability is plotted as a function of the absolute contour spacing in visual degrees. Four curves correspond to data obtained at four context densities. By taking into account only those context densities giving suprathreshold detectability, the contour spacing at threshold for subject WL was between 1.50° and 3.46°, depending on the context density. For subject HY, the spacing at threshold was between 1.45° and 1.95°.
The data shown in Fig. 6 suggest that contour detectability is a function not only of the relative spacing of contour elements but also of the average density of the overall pattern. At a given context density, there exists a critical spacing between contour elements beyond which no salient contour can be detected. Similarly, at a given relative contour spacing, there also exists a critical context density (or average spacing between background elements) for the generation of salient contours.
It should be kept in mind that, although smaller contour spacing
produces more salient contours, when the spacing between the contour
elements is smaller than the average spacing between the background
noise elements (that is, relative contour spacing smaller than 1.0),
subjects can use the cue of density for detection of colinear lines
rather than the percept of a contour based on geometric grouping
processes (Kovács et al. 1999). At contour spacings that are larger than the average spacing of the overall pattern (that is, relative contour spacing larger than 1.0), the percept becomes more of a contour, and it depends on geometric attributes like colinearity and iso-orientation of the line elements rather than density cues. In other words, a relative contour spacing of 1.0 (the contour and noise elements are equally spaced) defines the
most salient contours for all context densities without introducing any
density cue.
In another experiment, by fixing the relative contour spacing at 1.0, we compared the detectability of contours embedded in the four noise backgrounds of different densities (see Fig. 5 for demonstration of stimuli). This enables us to estimate the maximum spacing between contextual stimuli that disrupts contour saliency and thus provides an estimate of the breakdown or maximal distance of local interactions involved in the formation of perceptual contours embedded in noise.
A total of four subjects were tested (Fig. 7). Contour detectability is plotted as a function of the average spacing between line segments. Keep in mind that the contour and noise elements were equally spaced in this experiment. As shown in Fig. 7, contour detectability decreased with increasing average context spacing. If one takes detectability of 0.5 as the threshold for reliable detection of the contour, the threshold average spacing is between 1.63° and 3.61° for the four subjects tested, or 2.3° ± 0.9° (mean ± SD) when averaged across the subjects.
Experiment 3: oblique effect in contour detection
It has been reported that in V1, far more cells prefer
orientations close to horizontal and vertical than oblique
(Celebrini et al. 1993; De Valois et al. 1982; Kennedy and Orban 1979). If contour
integration is based on interactions of V1 cells via intrinsic horizontal connections which link cells with similar orientation preference, one would expect to see some anisotropy of saliency of
contours in the orientation domain.
Viewing Fig. 8, one can observe the effect of orientation on contour saliency. Except for the orientation of each contour, all other parameters are the same in these two patterns. The horizontal contour in Fig. 8A is more salient than the oblique contour in Fig. 8B. By simply tilting the figure clockwise or counterclockwise back and forth, one perceives a change in the relative saliency of the two contours.
We measured contour detectability as a function of contour orientation and position. Stimuli were similar to those used in experiment 1. The whole stimulus pattern extended 43.8° in diameter and was divided by invisible grid boxes of 0.8° in size. A straight contour was generated along the tangent of an invisible circular path of 4.0° in radius around the FP (see METHODS for details about stimulus generation), fixing the eccentricity of the contour at 4.0°. The number of line segments comprising the contour was fixed at 9, and the relative contour spacing was fixed at 1.6. From the results previously shown in Fig. 4 we know that contour detectability was not saturated at these settings for the two subjects tested. For the same two subjects, a combination of sixteen orientations and positions was tested (Fig. 9). The stimuli were presented in an interdigitated fashion.
Contour saliency showed strong anisotropy (Fig. 9), where horizontal and vertical contours were much more easily detected than oblique ones.
Experiment 4: training effects
In this study, subject WL was an experienced observer. WL had a higher detection ratio of contours than naïve observers under the same test conditions and was able to detect contours with larger line spacing (Figs. 4, 6, and 7). This suggested some learning in contour detection. To test for learning effects in the contour saliency task, we started with naïve observers and measured their performance over time.
The data previously shown in Fig. 7 for subject MU were based on the average of responses collected over 12 days. In Fig. 10, the same data from MU were separated into six groups chronologically and demonstrated a strong learning effect. Under the same test conditions the performance in contour detection improved with training (Fig. 10A). This improvement was most prominent at intermediate spacing between stimulus elements (solid arrow in Fig. 10A). Note that the training had little effect when the stimulus elements were far apart (open arrow in Fig. 10A). Due to a significant increase of detectability at intermediate spacing, the critical spacing between contextual lines measured at threshold (detectability of 0.5) also increased by about a factor of 2 for this subject (Fig. 10B).
DISCUSSION
Global contour integration and local interactions within V1
Our results show that integration of contour elements can carry
over very large distances as long as the spacing between stimulus elements is kept below a certain distance. The critical spacing between
colinear stimulus elements (Fig. 6) and the critical spacing between
contextual noise elements (Fig. 7) provide a dimension for the expected
local interactions underlying global contour integration, approximately
2°. This range of local interactions is in close correspondence with
the spatial extent of horizontal interactions and connectivity between
V1 neurons. The contextual interactions observed for neurons in the
superficial layers of V1 in behaving monkeys show the same spatial
extent (Kapadia et al. 1995, 2000). The visuotopic
representation of intrinsic horizontal connections, mapped at
comparable visual field eccentricity of the stimuli used in the current
study, also originates from sites as far as 2° from either side of
the target neurons (Stettler et al. 2002).
Our results are consistent with the idea that global contour
saliency is mediated by local interactions of intermediate range that
can cascade over multiple stages. Based on our findings together with
the increasingly convergent results from psychophysical, physiological, anatomical, and computational studies, a simple schematic diagram of
neural connections in V1 can be used to explain the perceptual phenomenon of saliency of contours embedded in complex environments (Fig. 11). The local interactions
mediating colinear brightness induction and other contextual modulation
effects can cascade across many links in the horizontal pathway via
intrinsic horizontal connections. When more adjacent links are
activated by more stimulus elements having certain geometric
properties, the horizontal interactions will be reinforced, resulting
in the pop-out of a contour from the noisy environment. The diagram in
Fig. 11 illustrates the spatial constraint of these interactions. If
the distance between two stimulus elements is too large, out of reach
of horizontal connections between two adjacent links, the propagation
of horizontal interactions will be disrupted. Consequently, the
colinear elements will blend into the noisy environments, and no
contour will pop out. This diagram only illustrates the connectivity
along the contour path. The influence of background density suggests a
more complex pattern of interactions (for examples of models see
Li 1998; Pettet et al. 1998; Ullman 1992; VanRullen et al. 2001; Yen and Finkel 1998).
There are other points of similarity between the current psychophysical
findings and the functional properties of neurons in V1. First, our
data show that contour saliency is a function not only of the geometry
of the contour elements but also of the surrounding elements in the
visual scene. That is, the colinear interactions between two contour
elements are further modified by the greater context within which the
contour elements are embedded. This suggests a cascade of
nonlinearities, where the mutual influences of the line elements within
the contour depend in turn on the placement of elements in the
surround. In noisy contextual environments, colinear brightness
induction (Polat and Bonneh 2000) and vernier discrimination (Herzog and Fahle 2002) follow the same
pattern of interactions, and these interactions are very similar to the contextual modulation effects observed in V1 neurons (Kapadia et al. 2000). Second, our results show that the
absolute critical spacing between colinear lines increased
with decreasing noise density (Fig. 6, C and D).
This correlates well with the contextual modulation seen in V1, where
contextual modulation of neuronal responses by noise background depends
on the density of the context. Sparser noise produces weaker contextual
inhibition (Knierim and Van Essen 1992). Finally, a
strong oblique effect was observed in contour detection (Fig. 9). This
perceptual phenomenon closely correlates with the anisotropy observed
in the distribution of optimal orientations of V1 neurons (Celebrini et al. 1993; De Valois et al. 1982; Kennedy and Orban 1979). All these observations are consistent with the connectivity and functional properties of cells in V1.
The spatial scale constraint of the local interactions underlying
contour integration, about 2°, is quite close to that observed for
other contextual interactions. For example, the tilt illusion diminishes with the distance between the target and inducing lines, and
is very weak when the distance is larger than 1° (Virsu and Taskinen 1975; Wenderoth and Johnstone 1988; Westheimer 1990). Perceptual pop-out based on
orientation contrast of local stimuli peaks at line spacing between
1° and 2° (Nothdurft 2000). Colinear brightness
induction drops off as the target and inducing lines are separated, and
the effect disappears when the separation is larger than about 2°
(Dresp and Grossberg 1997; Kapadia et al. 1995). The similarity in effective spatial range of
interactions suggests that these different contextual phenomena might
have a common neural substrate. The capability for V1 neurons to
multiplex their function may come in part from their RF structure. The
existence of spatially segregated and functionally distinct components
in the RF of V1 neurons (Kapadia et al. 2000) indicates
that contextual influences on V1 neuronal responses are highly
dependent on the geometric relationship between the stimulus elements.
The segregation around the RF of compartments with opposing contextual
influences enables the same V1 neurons to participate in multiple
perceptual processes mediating different perceptual phenomena, such as
the tilt illusion and brightness induction. Different higher-level visual tasks, such as contour integration and texture segregation, could also be accomplished by the same neural apparatus based on
distinct contextual modulation processes within the same cell population. In addition to the spatial organization of contextual modulation, top-down influences like attention and perceptual task
contribute to the dynamics of neural circuits and thus greatly increase
the versatility of V1 neurons in multiplexing visual information
processing (Crist et al. 2001; Ito and Gilbert 1999). In these respects one can speculate how the same
cortical area or even the same neurons can be involved in multiple
perceptual phenomena and participate in multiple perceptual tasks.
Some discrepancies between the phenomena of colinear brightness
induction and contour saliency have been observed and have led to questions
about the commonality of the underlying mechanisms (Williams and Hess 1998). One issue is that contour integration is much more
robust than colinear brightness induction to spatial jitter (e.g.,
orientation and phase jitter) of colinear elements. This can be
accounted for by the cascading mechanism in contour integration (Fig.
11) where more horizontal links are activated and thus the colinear
facilitation is reinforced along the contour path, resulting in more
tolerance to spatial jitter of contour elements. A second issue is the
lower contrast levels at which colinear brightness induction takes
place. Nevertheless, it has been shown that induction works at high
contrast as well as at threshold levels (Ito and Gilbert 1999), and is modified by attentional state (Freeman et al. 2001; Ito and Gilbert 1999) and other
contextual stimuli (Ito and Gilbert 1999). A third issue
is that contours pop out of the complex background but do not appear
brighter than the context, as distinct from colinear brightness induction.
This raises the controversy in contour integration about the form of
the neural code for saliency, and how this might be distinguished from
the signals underlying brightness perception. Some studies suggest that
contour saliency is derived from a general increase of neuronal
responses that results from facilitatory horizontal interactions
(Kapadia et al. 1995, 1999; Pettet et al. 1998; Polat and Bonneh 2000), while other
studies suggest that temporal encoding could be involved in saliency
effects (Gray 1999; Gray et al. 1989; Li 1998; Singer 1999; Yen and Finkel 1998). In this discussion, one has to also incorporate
the prominent role of top-down influences in early processing, and the
possibility for multiplexing of signals for different perceptual tasks
(Crist et al. 2001). Brightness and saliency might be
distinguished by such top-down influences, and by the difference in
attentional state which may provide an efference copy along with the
higher levels of activity.
Learning and top-down influences
Learning exerts strong effects on contour detection, as reflected
in the evidence that training significantly increases detection ratio
for the same test condition and increases the spatial range of local
interactions. A training-induced increase in the spatial range of
colinear interactions has also been reported in brightness induction.
Polat and Sagi (1994a) found that training can increase the effective range of colinear facilitation by a factor of 3. They
proposed that this effect reflects an increased range of interactions
via a cascade of filters that are locally connected. We speculate that
the increase of critical spacing in contour detection shares similar
neural mechanisms. A parsimonious explanation is that perceptual
learning involves strengthening of existing local interactions of
intermediate range. This is consistent with the observation that the
most prominent improvement in contour detection with training was
observed for intermediate spacing (Fig. 10A), and that
contour interactions for experienced and naïve observers
asymptote at longer spacings of similar extent (Fig. 7).
Kovács et al. (1999) have proposed that the
learning in contour detection takes place in an early cortical area
based on the observation that learning in contour detection is visual
cue specific, with no transfer between contours defined by orientation
and those defined by color.
| |
SUMMARY |
|---|
|
|
|---|
Our findings suggest that global contour saliency is based on local integration mechanisms of intermediate spatial extent, comparable to the interactions observed in colinear facilitation and other contextual modulation phenomena both in perception and in neuronal responses in V1. Our data show that these interactions can cascade over very large distances as long as the spacing of stimulus elements is kept within a limited range. Our results also indicate that learning and top-down influence can enhance these local interactions. We propose that intrinsic horizontal connections in V1 are well suited to mediate the global contour saliency in a cascading manner because of their extent and orientation specificity. The Gestalt rules of perceptual organization are already represented at least in part in the functional properties of neurons in the primary visual cortex.
| |
ACKNOWLEDGMENTS |
|---|
We thank the volunteer subjects. We also thank G. Westheimer and M. Sigman for comments on the manuscript.
This work was supported by National Eye Institute Grant EY-07968.
| |
FOOTNOTES |
|---|
Address for reprint requests: C. D. Gilbert, The Rockefeller University, 1230 York Ave., New York, NY 10021 (E-mail: gilbert{at}rockefeller.edu).
| |
REFERENCES |
|---|
|
|
|---|