Laboratorium voor Neuro- en Psychofysiologie, K.U. Leuven Medical School, Leuven, Belgium
Submitted 18 July 2006; accepted in final form 20 January 2007
| ABSTRACT |
|---|
| INTRODUCTION |
|---|
In contrast with its extensive use in behavioral research (e.g., Chun and Potter 1995; Potter and Levi 1969; Subramaniam et al. 2000), the rapid serial visual presentation (RSVP) paradigm has rarely been used in combination with single-cell recordings in the higher visual cortex. In RSVP, images are presented sequentially and continuously (with no interstimulus interval, ISI), with each image replacing the previous one at the same location on the screen. Keysers et al. (2001) and Földiák et al. (2004) pioneered the use of the RSVP paradigm to examine the selectivity for complex images of neurons in the superior temporal sulcus (STS). Although an increasing presentation rate resulted in a flattening of the neuronal tuning, the stimulus-coding ability of the population of STS neurons recorded was preserved even at the highest presentation rates (14 ms/image), suggesting that RSVP is a useful technique for studying the stimulus selectivity of STS neurons with a large number of stimuli. However, in those studies and in another study using RSVP (Kiani et al. 2005), the stimuli were highly complex and differed sharply from one another.
In studying the stimulus selectivity of inferior temporal (IT) cells, several researchers (e.g., Brincat and Connor 2004; Kayaert et al. 2005a; Op de Beeck et al. 2001; Sigala and Logothetis 2002) opted for the use of parametric shape configurations, principally because this allows examining how the responses of IT neurons to complex stimuli are related to the parametric variation built into the stimulus sets. The use of all shapes to search for and test responsive neurons is a prerequisite for obtaining an unbiased measure of the responses and tuning of an IT neuron population to a set of parameterized shapes. However, the total experimental time available is limited when using the conventional presentation techniques, thus limiting the number of stimuli that could be presented in most of these studies. This drawback of using parametric shape sets can be overcome by applying the RSVP paradigm because it allows the presentation of many stimuli. Thus one aim of the present study was to determine whether the RSVP technique is useful for studying the shape selectivity of IT neurons using parameterized sets of shapes (Kayaert et al. 2005a; Op de Beeck et al. 2001). The validity of the RSVP technique for studying shape selectivity with parametric sets is not obvious because the differences among the shapes in such sets are much smaller than the stimulus differences employed in the previous IT studies using RSVP (Földiák et al. 2004; Keysers et al. 2001; Kiani et al. 2005).
Kayaert et al. (2005a) recorded from IT cells while showing simple shapes (e.g., a rectangle or triangle) parametrically manipulated along simple shape dimensions (e.g., taper or axis curvature). They found systematic response modulation along these simple dimensions with the largest response, on average, to the extreme values along a given dimension. The findings of Kayaert et al. (2005a), which suggest monotonic tuning in IT cortex for simple shapes, raise questions concerning the tuning curves for other, more complex shapes. Op de Beeck et al. (2001) used more complex shapes, but their square configurations, consisting of merely eight shapes, did not allow disentangling preferences for extreme values on a given dimension from preferences for intermediate values along that dimension. Because of the nature of their configuration, every shape corresponded a priori to an extreme value on a dimension. Hence in a first experiment, our primary aim was to examine the neural representation in IT cortex of complex shapes that vary systematically along shape dimensions using the RSVP paradigm. The application of the RSVP method provided an opportunity to increase the number of shapes per configuration. As a result, we could also examine the extent to which the similarity among complex shapes is represented on an ordinal and metrical level in IT cells when configurations are used that consist of a larger number of shapes than those employed by Op de Beeck et al. (2001).
As in the Kayaert et al. (2005a) study, the major portion of the neurons recorded in the present study showed a preference for the extreme values of the parametric space, at least within the range of values tested. One important issue, discussed by Kayaert et al. (2005a), is the degree to which the tuning for the extremities of the space is due to the repeated presentation of a restricted set of stimuli, i.e., the statistics of the stimulation history. Recent work has demonstrated that early visual and auditory neurons adapt to recent stimulus statistics so that information transmission is enhanced (e.g., Brenner et al. 2000; Dean et al. 2005; Fairhall et al. 2001; Sharpee et al. 2006; Smirnakis et al. 1997). Similar adaptive mechanisms might be operating at higher levels of the visual system, or the effects of such adaptive mechanisms at earlier levels of the visual system could be inherited in IT. It is known that the responses of IT neurons can depend on previous stimulus presentations, e.g., stimulus repetition commonly reduces the responses of IT neurons (Baylis and Rolls 1987; Gross et al. 1967, 1969; Miller et al. 1991a; Riches et al. 1991; Sobotka and Ringo 1993), even with intervening presentations of other stimuli (e.g., Brown et al. 1987; Miller et al. 1991b; Sawamura et al. 2006). Because a large number of stimuli are presented repeatedly in RSVP, this paradigm might be more sensitive to adaptive effects than classical testing paradigms in which one stimulus is presented per trial after acquisition of fixation and the intertrial interval is relatively long. This also implies that RSVP might be a useful technique with which to demonstrate the effect of stimulus statistics on neuronal tuning. Given recent demonstrations of an adaptive rescaling of neuronal responses in the fly visual system (Brenner et al. 2000) and guinea pig inferior colliculus (Dean et al. 2005) when the variance of the stimulus distribution is altered, we used RSVP to examine whether the tuning of IT neurons adapts to the properties of the stimulus distribution other than the mean. Note that the manipulation of stimulus distribution properties is possible only when parameterized sets of stimuli are used. Thus in a second experiment, we measured the responses of IT neurons to a set of shapes that varied along a single dimension. The neurons were tested in two successive blocks using stimuli that differed in stimulus variance and density between blocks. Parts of the results of the present study have been published in abstract form (De Baene and Vogels 2005).
| METHODS |
|---|
Two male rhesus monkeys (Macaca mulatta; monkeys J and B) served as subjects. Before conducting the experiments, aseptic surgery under isoflurane anesthesia was performed to attach a fixation post to the skull and to stereotactically implant a plastic recording chamber. The recording chambers were positioned dorsal to IT, allowing a vertical approach, as described by Janssen et al. (2000). During the course of the recordings, we took a structural magnetic resonance image (MRI), with a copper sulfate-filled tube inserted in the grid at one of the recording positions. The recording positions were estimated by comparing this MRI with depth readings of the white and gray matter transitions and of the skull base during the recordings.
All animal care and experimental and surgical procedures followed national and European guidelines and were approved by the K.U. Leuven Ethical Committee for animal experiments.
Stimuli
All shapes (maximum size: 7.1°) were filled with pixel noise and were presented foveally in continuous rapid random sequences at a rate of 100 ms/image on a gray background on a monitor positioned 60 cm from the monkeys (60-Hz frame rate; 1,024 x 768 pixels). A trial started after 250 ms of stable fixation and ended when the monkey broke fixation or when every stimulus had been presented twice (for experiment 1) or 20 times (for experiment 2). Different visual stimuli were used in each experiment (see following text).
EXPERIMENT 1. We generated five parametric sets in which shapes were varied systematically along two dimensions, yielding 25 combinations of values of the two dimensions per set in a circular configuration (see Fig. 1). The range of the dimensions of the parametric configurations was restricted in two ways. First, within a set, the pixel-based dissimilarities (see following text, Eq. 3) between the shapes along a given dimension had to be largely similar to the pixel-based dissimilarities between the shapes along the other dimension. Second, these pixel-based dissimilarities, as calculated per set, had to be similar across sets. For sets 1 and 2, the dimensions taper and aspect ratio were manipulated. The amplitudes of two radial frequency components were varied in sets 3 and 4, whereas the taper of the bottom and the top part of a two-part shape were independently manipulated in set 5. The isolated upper parts of the stimuli on the vertical axis of set 5 as well as the isolated lower parts of the stimuli on the horizontal axis of set 5 constituted a sixth set (Fig. 1). The shapes of set 6 were presented at the same positions as when presented in combination (set 5). The resulting 135 stimuli were presented in a random order.
In both experiments, eye position was monitored through the pupil position using an infrared eye tracking system (ISCAN, EC-240A) at a sampling rate of 120 Hz. Monkeys were rewarded with a drop of fruit juice at an increasing pace as long as they kept their gaze within 3° (monkey J) or 1.5° (monkey B) of a black fixation target (0.17° diam) in the center of the display.
EXPERIMENT 1. We searched for responsive neurons by presenting the 135 stimuli in a random order at a continuous rate (no ISI) of 3.3 images/s (27 neurons) or 10 images/s (57 neurons) in trials of maximally 270 stimuli while the monkey was passively fixating. We visualized the responses of the cell in a peristimulus time histogram (PSTH) averaged over trials and all stimuli. If this PSTH showed that the cell responded, we continued presenting the stimuli for 15 min. If the PSTH indicated that the cell did not respond to any of the stimuli, we abandoned this cell and searched for another. All 27 neurons that were initially tested with the slow presentation rate of 3.3 images/s (stimulus presentation duration = 300 ms) were subsequently tested with the short, standard 100-ms presentation times, allowing a comparison of the responses for the two presentation rates.
EXPERIMENT 2. Search test. We searched for responsive neurons with the nine stimuli of every shape set presented for 300 ms, randomly intermixed, with an ISI of 700 ms. For half of the recorded cells, the wide-range configurations were used in the search test; for the other half, the narrow-range configurations were used. When we found a neuron responsive to at least 1 of these 45 stimuli (based on the visual inspection of the PSTHs), the shape set with the largest neuronal responses was selected for the subsequent test.
RSVP test. The stimuli from the selected shape set with the same range configuration as that in the search test were presented randomly intermixed at a rate of 10 images/s (with no ISI) in trials of no more than 180 stimuli for a total duration of 15 min. Afterward, the stimuli of the other range configuration of the same shape set were presented in a similar fashion for 30 min. The second range was thus presented for twice as long as the first range, which enabled us to examine the temporal evolution of the adaptation to the stimulus statistics even when this adaptation process was slow. The order of the wide- and narrow-range configurations was counterbalanced across neurons.
Recordings
Standard extracellular recordings were performed with tungsten microelectrodes, lowered in a guiding tube, into the lower bank of the superior temporal sulcus and lateral convexity of IT during a passive fixation task. The signals of the electrode were amplified and filtered using conventional single-cell recording equipment. Spikes from individual neurons were isolated on-line using Plexon software (Plexon, Dallas, TX). The timing of the single units and the stimulus and behavioral events were stored with 1-ms resolution on a personal computer for later off-line analysis.
Data analysis and tests
In both experiments 1 and 2, the response of the neuron was defined as the mean number of spikes in a 50- to 200-ms analysis window relative to stimulus onset. Alternative analysis windows (50–250 ms and 100–200 ms) showed highly similar results. The first three stimuli of every trial were excluded from all analyses because the responses of the majority of the recorded neurons were characterized by a burst at the start of every trial, lasting for 300 ms (i.e., the total presentation duration of 3 stimuli at a rate of 100 ms/image). The last stimulus of every trial was also excluded from all analyses. This last stimulus differed from all others in the trial in that it was not followed by any other stimulus, excluding any potential backward masking effect that could have been present for all other stimuli.
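The response definition and the exclusion rule described above can be sketched as follows. This is a minimal illustration, not the analysis code used in the study: the function names are hypothetical, spike and onset times are assumed to be in milliseconds, and treating the window as closed at 50 ms and open at 200 ms is an assumption.

```python
def window_count(spike_times_ms, onset_ms, start=50, stop=200):
    """Count spikes falling in [onset + start, onset + stop) ms.

    The half-open boundary is an assumption; the text only gives the
    50- to 200-ms window.
    """
    lo, hi = onset_ms + start, onset_ms + stop
    return sum(1 for t in spike_times_ms if lo <= t < hi)


def analyzable_onsets(onsets_ms):
    """Drop the first three stimuli and the last stimulus of a trial."""
    return onsets_ms[3:-1]
```

For a trial of six stimuli at 100 ms/image, only the fourth and fifth stimuli would enter the analysis.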
EXPERIMENT 1. All analyses were performed on those neurons showing shape selectivity within one or more shape sets. The shape selectivity of the neurons was examined by assessing the statistical significance of the observed variance of neuronal mean responses to stimuli within a shape set by using a permutation test. The range of variances per shape set expected by chance was determined by calculating new variances from the data after permuting the order of the stimuli within each trial while maintaining the actual spike counts. A distribution of 1,000 permuted variances was generated, representing the distribution of variances that would have been expected to occur by a chance association between stimulus and neuronal firing. If the observed variance of the neuronal mean responses to stimuli within a shape set was larger than the 95th percentile of the values in its own variance distribution, that neuron was considered to be shape selective within that shape set (P < 0.05, 1-tailed). The shape selectivity for the 10 shapes of set 6 (the isolated parts of the shapes on the axes of set 5) was also tested with this permutation test.
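A permutation test of this kind can be sketched as below. The data layout (a list of trials, each a list of `(stimulus, spike_count)` pairs), the function names, and the use of the population variance are assumptions made for illustration; only the within-trial shuffling, the 1,000 permutations, and the 95th-percentile criterion come from the description above.

```python
import random
from statistics import mean, pvariance


def variance_of_means(trials):
    """Variance across stimuli of the mean spike count per stimulus."""
    by_stim = {}
    for trial in trials:
        for stim, count in trial:
            by_stim.setdefault(stim, []).append(count)
    return pvariance([mean(counts) for counts in by_stim.values()])


def permutation_test(trials, n_perm=1000, seed=0):
    """Return (observed variance, 95th-percentile null variance, selective?)."""
    rng = random.Random(seed)
    observed = variance_of_means(trials)
    null = []
    for _ in range(n_perm):
        shuffled = []
        for trial in trials:
            # Shuffle stimulus labels within the trial, keep spike counts.
            stims = [stim for stim, _ in trial]
            rng.shuffle(stims)
            shuffled.append([(s, c) for s, (_, c) in zip(stims, trial)])
        null.append(variance_of_means(shuffled))
    null.sort()
    threshold = null[(95 * n_perm) // 100 - 1]
    return observed, threshold, observed > threshold
```

Fixing the seed makes the sketch reproducible; a real analysis would choose the random source to suit its own needs.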
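As a measure of the selectivity for the shapes of a set, we computed the depth of selectivity (DOS) (Rainer and Miller 2000) for every neuron and shape set. This measure of the degree of tuning of a neuron to a given stimulus set is defined as

$$\mathrm{DOS} = \frac{n - \sum_{i=1}^{n} R_i / R_{\max}}{n - 1} \tag{1}$$

where n is the number of stimuli in the set, R_i is the mean response to stimulus i, and R_max is the largest mean response to any stimulus of the set.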
To estimate the reliability of our procedures in measuring the degree of selectivity, we employed an odd-even split-half method. First we computed for each neuron the DOS indices separately for the odd and even repetitions of the stimuli and correlated these DOS indices across neurons. Because the split-half correlation is based on only half of the data, this was corrected by computing the Spearman-Brown split-half coefficient (Lord and Novick 1968)

$$r_{SB} = \frac{2r}{1 + r} \tag{2}$$

where r is the split-half correlation.
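As an illustration, the two indices can be computed as follows. This is a minimal sketch assuming the standard form of the depth-of-selectivity index from Rainer and Miller (2000) and the standard Spearman-Brown formula; the function names are hypothetical, and the DOS function assumes at least two stimuli and a positive maximum response.

```python
def depth_of_selectivity(responses):
    """DOS = (n - sum(R_i / R_max)) / (n - 1); 0 = no tuning, 1 = one stimulus only."""
    n = len(responses)
    r_max = max(responses)
    return (n - sum(r / r_max for r in responses)) / (n - 1)


def spearman_brown(r_half):
    """Step up a split-half correlation to full-length reliability."""
    return 2 * r_half / (1 + r_half)
```

A neuron responding equally to all stimuli gets a DOS of 0, and a neuron responding to a single stimulus gets a DOS of 1.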
To investigate whether the population responses of IT neurons can reveal low-dimensional representations of similarity in the parametrically configured shapes, we compared the within-set configurations obtained from position-corrected pixel-based similarities with the neuronal representation space in IT cortex. Pixel-based similarities between two shapes were computed for each of 99 x 99 relative positions. The positions of one shape corresponded to a 99 x 99 square grid (step size = 1 pixel) that was centered on the other shape. As the pixel-based dissimilarity measure, we computed the Euclidean distance between the gray-level values of the pixels of two shapes. This procedure was done for each of the 99 x 99 relative positions, according to the formula
![]() | (3) |
The representation of shape similarities at the neuronal population level was analyzed by computing the Euclidean distance between a pair of stimuli i and j in the multidimensional space spanned by the responses of all neurons
![]() | (4) |
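The neural distance described above can be sketched as a plain Euclidean distance between response vectors. The layout `responses[n][s]` (mean response of neuron n to stimulus s) and the function names are assumptions for illustration, and any normalization applied in the study is not reproduced here.

```python
import math


def neural_distance(responses, i, j):
    """Euclidean distance between stimuli i and j across all neurons."""
    return math.sqrt(sum((r[i] - r[j]) ** 2 for r in responses))


def distance_matrix(responses):
    """Symmetric stimulus-by-stimulus matrix of neural distances."""
    n_stim = len(responses[0])
    return [[neural_distance(responses, i, j) for j in range(n_stim)]
            for i in range(n_stim)]
```

A matrix of this form is the kind of input that a multidimensional scaling routine takes.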
For each stimulus group, the different sets of similarity data were analyzed with nonmetric multidimensional scaling (MDS) using Statistica software.
To determine whether a systematic relationship existed between responses to the nine stimuli on the axes of set 5 (further referred to as the compound stimuli; there are 5 stimuli per axis, but the 2 orthogonal axes have the central stimulus in common, producing 9 distinct stimuli) and responses to the isolated parts of these stimuli (further referred to as the constituent stimuli; set 6), we performed a linear regression analysis per cell with the responses to the compound stimuli as the dependent variable and with the sums of the responses to the respective constituent stimuli as the predictor variable. Additionally, we ran the same regression analysis after pooling the responses of all cells showing shape selectivity for the 10 constituent stimuli to examine the relationship between the responses to the constituent stimuli and to the compound stimuli at a population level. In these regression analyses, a slope of 0.5 would indicate that the responses to the compound stimuli are the averages of the responses to the constituent stimuli, presented in isolation, whereas a slope of 1.0 indicates that the responses to the compound stimuli are the sum of the responses to the constituent stimuli.
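The interpretation of the slope can be made concrete with an ordinary least-squares fit. The sketch below is illustrative only: the function name and the toy numbers are hypothetical, with the summed constituent responses as predictor and the compound responses as dependent variable.

```python
def regression_line(x, y):
    """Ordinary least-squares slope and intercept of y on x."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


# Toy data: compound responses equal to the average of the two parts.
summed_parts = [2.0, 4.0, 6.0, 8.0]
compound = [1.0, 2.0, 3.0, 4.0]
slope, intercept = regression_line(summed_parts, compound)
```

With these toy values the fitted slope is 0.5, the averaging case; replacing `compound` with `summed_parts` itself would give a slope of 1.0, the additive case.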
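We examined the temporal dynamics of both the selectivity among the shapes from different stimulus sets and discrimination among the shapes within one stimulus set at a population level using an information-theoretic approach (cf. Sugase et al. 1999). According to this approach, each predictable piece of information associated with an occurrence of a neuronal response [I(S;R)] is quantified as the decrease in entropy of the stimulus occurrence [H(S)]

$$I(S;R) = H(S) - \langle H(S \mid r) \rangle_r = -\sum_s p(s)\log_2 p(s) + \Big\langle \sum_s p(s \mid r)\log_2 p(s \mid r) \Big\rangle_r \tag{5}$$

where p(s) is the probability of stimulus s, p(s|r) is the probability of stimulus s given response r, and the angle brackets denote the average over responses.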
The use of a small number of trials induces an upward bias in the estimation of transmitted information. To correct for this, we subtracted the first-order correction term (C1) from the value calculated using Eq. 5, as C1 represents almost all the error due to limited sampling (Panzeri and Treves 1996)

$$C_1 = \frac{1}{2N \ln 2}\left[\sum_s B_s - B - (S - 1)\right] \tag{6}$$

where N = total number of stimulus presentations, B_s = number of nonzero response bins for the presentations of stimulus s, B = total number of bins, and S = number of stimuli. Thus the corrected transmitted information, Ic, is defined as follows
$$I_c = I(S;R) - C_1 \tag{7}$$
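The computation of the corrected information can be sketched from a table of counts. This is a minimal sketch under stated assumptions: `counts[s][r]` is the number of presentations of stimulus s that produced response bin r, the function names are hypothetical, and the correction uses the commonly cited form of the Panzeri and Treves (1996) first-order term, with B taken as the total number of bins in the table.

```python
import math


def mutual_information(counts):
    """I(S;R) in bits from a stimulus-by-response-bin count table."""
    n_total = sum(sum(row) for row in counts)
    n_bins = len(counts[0])
    p_s = [sum(row) / n_total for row in counts]
    p_r = [sum(row[r] for row in counts) / n_total for r in range(n_bins)]
    info = 0.0
    for s, row in enumerate(counts):
        for r, c in enumerate(row):
            if c:
                p_sr = c / n_total
                info += p_sr * math.log2(p_sr / (p_s[s] * p_r[r]))
    return info


def first_order_correction(counts):
    """C1 = (sum_s B_s - B - (S - 1)) / (2 N ln 2)."""
    n_total = sum(sum(row) for row in counts)
    n_stim, n_bins = len(counts), len(counts[0])
    occupied = sum(sum(1 for c in row if c) for row in counts)
    return (occupied - n_bins - (n_stim - 1)) / (2 * n_total * math.log(2))


def corrected_information(counts):
    """Transmitted information after subtracting the bias term."""
    return mutual_information(counts) - first_order_correction(counts)
```

The correction shrinks as the number of presentations N grows, which is why it matters most for short recordings.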
Stimulus sets were chosen for this selectivity time course analysis based on a cluster analysis (Ward's method; Statistica) of the neural similarity matrix of all neurons and all stimuli (see RESULTS). This clustering algorithm starts from a configuration with as many clusters as stimuli and groups similar stimuli in a series of steps (starting with the most similar ones) until all stimuli are clustered together.
EXPERIMENT 2. All analyses were performed on those neurons showing shape selectivity in the RSVP test for at least one of the two range configurations, as tested by a one-way ANOVA (P < 0.05) for each range configuration (of 9 stimuli each).
To compare the neuronal selectivity at the population level for the shapes from the narrow- and wide-range configurations, the stimuli were first ranked based on the difference between the mean response to stimuli A and B, averaged across the two ranges, and the mean response to stimuli D and E of both ranges (see Fig. 2 for definitions of stimuli A, B, etc.). If the former average response was larger than the latter, the stimuli were ranked in ascending order (i.e., A B C D E). If the opposite was true, a descending ranking was used (i.e., E D C B A). The same ranking was used for the nine stimuli. The responses to the nine stimuli of the wide- and narrow-range configurations were fitted with a second-order polynomial least-squares fit.
For all further analyses, we focused on the shapes common to the two ranges, i.e., the A to E stimuli of Fig. 2, to study the effect of stimulus context on the responses and selectivity of our cells. For every cell, the depth of selectivity index (DOS, Eq. 1) was calculated for both the narrow and wide range to quantify the degree of selectivity for the common shapes. To show the evolution of the neuronal selectivity over time, the data were subdivided into blocks of 10 presentations per stimulus and for each range configuration, DOS indices were calculated per block.
To quantify the neuronal ability to discriminate among shapes, we employed receiver operating characteristic (ROC) analysis (e.g., Cohn et al. 1975; Vogels and Orban 1990). For each neuron, ROC curves were generated by computing the distribution of the responses in the different presentations of a stimulus and then computing the proportion of responses that exceeded a particular response criterion (in steps of 1 spike). The ROC analysis was done using the middle stimulus C and one of the extreme stimuli, either A or E (i.e., the one ranked as having the maximum response for that cell), of experiment 2. The area under the ROC curve generated by the neuronal response distributions for this pair of stimuli yields a score for the neuronal discrimination ability. Perfect discrimination results in an area of 1; random discrimination produces an area of 0.5. Because we were interested in discriminability per se, the lowest value (chance performance) is 0.5. Thus values <0.5 were corrected by subtracting these from 1.
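The area under an ROC curve built by stepping a criterion through integer spike counts equals the probability that a response to one stimulus exceeds a response to the other, with ties counted as one half. The sketch below uses that pairwise formulation rather than constructing the curve explicitly, so it is an equivalent shortcut and not the procedure described above; the function name is hypothetical.

```python
def roc_area(resp_a, resp_b):
    """Area under the ROC curve for two lists of spike counts.

    Values below 0.5 are folded by subtracting them from 1, so the
    result always lies between 0.5 (chance) and 1 (perfect).
    """
    wins = sum((a > b) + 0.5 * (a == b) for a in resp_a for b in resp_b)
    area = wins / (len(resp_a) * len(resp_b))
    return max(area, 1 - area)
```

Because of the folding step, the order of the two arguments does not affect the result.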
To examine whether the stimulus statistics altered the amount of information carried by the responses of the cells, the information transmitted by the neuronal responses regarding the presentation of shapes A to E (in a 50- to 200-ms time window) was quantified per cell for both the narrow and the wide range using Eq. 7.
| RESULTS |
|---|
We recorded from 84 neurons (67 from monkey J; 17 from monkey B) using the RSVP procedure with a 100 ms/image presentation rate. Eighty neurons showed shape selectivity within one or more shape sets as measured with a permutation test (see METHODS), resulting in a significant response modulation for a total of 240 shape sets. For each cell, there was an average of 76.52 presentations per stimulus (minimum = 12, maximum = 151).
Across animals, the recording positions were estimated to range from 12 to 16 mm anterior to the external auditory meatus and included the lower bank of the superior temporal sulcus and the cortical convexity lateral to the anterior middle temporal sulcus.
COMPARISON OF SHORT AND LONG PRESENTATION RATES. For 27 of the 80 cells that were selective at a 100-ms presentation rate, stimuli were first presented in an RSVP procedure using 300-ms presentation duration before using the regular 100-ms/image presentation rate. Thus for this sample of neurons, one can correlate the shape selectivity for the fast, 100-ms and slow, 300-ms presentation durations. For both presentation rates, responses were computed using the 50- to 200-ms analysis window. We analyzed the responses to the shapes of those sets (n = 83) for which there was a significant modulation for the standard, fast presentation rate. To qualitatively compare the responses for the two presentation rates, for each neuron, the 25 stimuli within a set were ranked according to the size of their responses in the 300-ms presentation condition. The same ranking was then used to order the stimuli presented at 100 ms/image. As shown in Fig. 3A, the mean response in the 100-ms/image presentation rate condition, averaged across neurons, decreased as a function of stimulus rank. A general linear model (GLM) repeated-measures ANOVA with rank as within-neuron factor showed that this modulation for the 100-ms/image sequences was significant [F(24,1968) = 29.71, P < 0.001], indicating that the overall ranking was preserved at this fast presentation rate. A similar overall preservation of shape rank was obtained when the stimuli were ranked using the responses of the fast presentation rate [Fig. 3B; F(24, 1968) = 30.39, P < 0.001]. To quantify the similarity of the selectivity patterns under both presentation duration conditions, we correlated for each neuron the responses to the 25 shapes in the fast presentation rate with those to the same shapes in the slow presentation rate. 
The median proportion of variance in the response pattern measured at the fast presentation rate that could be explained by the response pattern measured at the slow rate was 0.32 (1st quartile = 0.13; 3rd quartile = 0.56), which corresponds to a correlation coefficient of 0.57. The preceding correlation analysis (as well as the ranking) was performed on the 25 stimuli of a set for which the neuron responded selectively. When the responses to all 135 stimuli were correlated instead, the median correlation between responses to the fast and slow rates was even higher: r = 0.75 (explained variance = 0.57).
Note that the ranking curves for the fast and slow presentation rates in Fig. 3 are more similar when the fast rate is used as the reference ranking (Fig. 3B) than when the slow rate is used as the reference (Fig. 3A). This can be explained as follows. Given the imperfect correlation of the responses in the two presentation conditions, the ranking curve for the slow rate will flatten if the shape ranks are based on the fast rate (compare the slow-rate curves in Fig. 3, A and B). Also, the ranking curve for the fast rate will flatten when the slow rate is used as the reference (compare the fast-rate curves in Fig. 3, B and A). Given the higher selectivity for the slow compared with the fast rate, the flattened curve for the slow rate will become somewhat similar to the ranking curve for the fast rate when the latter is used as reference (Fig. 3B), whereas the curves for the two rates become more dissimilar when the slow rate is used as a reference (Fig. 3A).
TUNING WITHIN PARAMETRIC SHAPE SPACES. Most of the neurons recorded showed a preference (i.e., the largest response) for the extreme values of the parametric space (see Fig. 4 for a representative neuron). Instead of showing a uniform distribution across the parametric shape space, the neuronal shape preferences across the population of recorded neurons were concentrated at the extremes of the stimulus dimensions (Fig. 5A). This preference for the extremes was significantly higher than expected by chance: in 215 of the 240 tested shape sets, the maximum response was for an extreme value of the parametric configuration [P < 0.001 as tested with a Binomial test over all sets; null hypothesis with expected relative frequency = 0.64 (16/25 extreme stimuli); all P < 0.002 for separate Binomial tests per parametric set]. We further tested whether the mean normalized responses of the cells showing shape selectivity for that shape set were uniformly distributed using a one-way ANOVA per shape set. For each set, the analyses showed that the mean responses were not uniformly distributed over cells (P < 0.01 for sets 2 and 4; P < 0.001 for sets 1, 3, and 5). Indeed Fig. 5B shows that the neurons responded more strongly to the extreme values of a shape configuration.
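As a rough check on the binomial test reported above, the upper-tail probability of observing at least 215 extreme-preferring sets out of 240 under a chance rate of 16/25 can be computed directly. The sketch assumes a one-sided test, which the text does not state, and the function name is hypothetical.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i)
               for i in range(k, n + 1))


# 215 of 240 shape sets peaked at an extreme; chance rate is 16/25 = 0.64.
p_value = binomial_upper_tail(215, 240, 16 / 25)
```

Under these assumptions the tail probability falls far below the 0.001 threshold, in line with the reported result.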
We used the responses of the 80 neurons to those shape sets with significant modulation (240 sets) to compute the neuron-based (dis)similarity between each pair of stimuli. To determine how the neurons represent the similarities among the shapes, we analyzed the neural-based Euclidean distances (see METHODS) between the stimuli using MDS and compared the obtained neural-based configurations with the parametric, position-corrected pixel-based stimulus configurations (Fig. 6). Note that the pixel-based configurations preserved the stimulus order of the parametric configurations (Fig. 1) and, in addition, showed that the physical distances among the shapes were similar along the two dimensions in each of the five configurations. Two-dimensional configurations explained most of the variance in the neural similarities for stimulus sets 1–5 [averaged across monkeys: 86% (n = 32 neurons), 91% (n = 52), 88% (n = 59), 87% (n = 61) and 92% (n = 36) for stimulus sets 1–5, respectively].
The stimulus order in the neuron-based two-dimensional (2D) configurations matched the order of the pixel-based configurations for sets 2 and 3. The neuron-based 2D configuration for stimulus set 1 deviated from that of the pixel-based configuration for two stimulus pairs, again demonstrating a good overall fit between physical and neural similarities. Note that for set 3, the neurons represented the horizontal dimension, i.e., "indentation", more acutely than the vertical one, which fits the distribution of the tuning shown in Fig. 5A. An even more striking difference in sensitivity for the two radial frequency dimensions was present for stimulus set 4. For the latter stimulus set, the sensitivity along the horizontal dimension was much weaker than along the vertical, indentation dimension, resulting in a highly anisotropic distribution of the stimuli in the two-dimensional space. However, note that along the vertical dimension, the stimulus order is relatively well preserved, indicating that the neurons represent variations along this dimension at the ordinal level.
Stimulus set 5 is a special case because the shapes were compound stimuli consisting of two shapes, each of which was varied systematically along one dimension. Figure 6E shows the 2D configuration for shape set 5 and indicates that there was an overall correspondence between the parametric space and the neural configuration, albeit not as clear as that for sets 1–3. This set also displayed a strong difference in sensitivity for the two stimulus dimensions: the neurons were more sensitive to variations along the vertical "star" dimension than along the horizontal "pentagon" dimension.
COMPARISON OF RESPONSES TO TWO-PART SHAPES AND THEIR SINGLE PARTS. An important question regarding the shapes in set 5 is how the responses to these two-part shapes relate to the responses to the single parts, i.e., the constituent stimuli. We quantified this relationship for 59 neurons that showed significant shape selectivity for the 10 constituent stimuli (set 6; see METHODS) or for the compound stimuli (set 5) as tested with a permutation test. Figure 7A shows a scatter plot for one of these cells in which the responses to the nine compound stimuli are plotted against the sum of the responses to the constituent stimuli presented in isolation. The responses to the compound stimuli were much smaller than the sum of the responses to the constituent stimuli, indicating a strongly nonadditive relationship between responses to the two-part and single shapes. The slope of the regression line relating the sum of the responses to the constituent stimuli and the responses to the compound stimuli was 0.51. Thus for this neuron, the responses to the compound stimuli were very close to the average of the responses to the constituent stimuli presented in isolation. As expected from such averaging, the responses to the two-part shape were significantly lower than the responses to the part eliciting the best response when presented alone [paired t-test; t(8) = 2.34, P < 0.05; Fig. 7B] and significantly higher than the responses to the part eliciting the worst response when presented alone [paired t-test; t(8) = 7.14, P < 0.001; Fig. 7C].
As discussed in detail by Zoccolan et al. (2005), who reported a similar averaging effect, shifts in attention might explain such an effect. According to this attention hypothesis, the mean response to the compound stimuli will correspond to the average of the responses to the constituent stimuli presented in isolation if the attention of the monkey was directed toward one part (e.g., upper part) for approximately half of the presentations and the other part (e.g., lower part) for the rest of the presentations. This hypothesis assumes that the response distribution across all presentations of each compound stimulus is drawn more or less equally from the distributions of responses to the two constituent shapes. Thus the attention hypothesis predicts that the variance of the distribution of responses to each compound stimulus equals the variance of the distribution that is obtained by combining the response distributions to the constituent shapes when the latter are presented in isolation. Following Zoccolan et al. (2005), we computed the Fano factors, i.e., the ratio of the response variance to the mean response, for the response distributions of each of the compound shapes (observed Fano factor) and compared those to the Fano factors computed for the distributions obtained by combining the responses to the constituent shapes, i.e., the response distributions predicted by the attention hypothesis. In performing this analysis for all 59 neurons, we found that the observed mean Fano factor (1.42) was significantly smaller than the value predicted by the attention hypothesis [1.69; paired t-test, t(530) = 12.16, P < 0.001]. When only those compound stimuli were included for which the responses of the neuron to the respective two constituent stimuli differed by a factor of two or more (i.e., where 1 constituent stimulus was much more effective than the other), similar results were found [observed mean Fano factor = 1.34 vs. Fano factor predicted by the attention hypothesis = 1.87; paired t-test, t(101) = 8.84, P < 0.001]. These results indicate that the reported averaging effect is not likely to be merely the result of shifts in attention.
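The logic of this Fano-factor comparison can be sketched as follows. This is a minimal illustration rather than the analysis code used in the study: the spike counts are invented, the function names are hypothetical, and pooling the two constituent distributions with equal weight is an assumption that corresponds to attention being split evenly between the two parts.

```python
from statistics import mean, variance

def fano(counts):
    """Fano factor: variance of the spike counts divided by their mean."""
    return variance(counts) / mean(counts)

def predicted_fano(counts_a, counts_b):
    """Fano factor of the pooled single-part responses.

    Under the attention hypothesis, each compound presentation yields a
    response drawn from one of the two single-part distributions, so the
    pooled distribution is the predicted compound distribution.
    """
    return fano(list(counts_a) + list(counts_b))

# Hypothetical spike counts per presentation.
effective_part = [10.0, 12.0, 8.0, 11.0, 9.0]
ineffective_part = [2.0, 3.0, 1.0, 2.0, 2.0]
compound = [6.0, 7.0, 5.0, 6.0, 6.0]   # intermediate and tightly clustered

observed = fano(compound)
predicted = predicted_fano(effective_part, ineffective_part)
```

Pooling two distributions with different means inflates the variance, which is why the attention hypothesis predicts a larger Fano factor than genuine averaging on every presentation.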
TIME COURSE OF SHAPE SELECTIVITY.
The RSVP paradigm is similar to reverse correlation paradigms that have been used to examine the time course of, e.g., orientation and spatial frequency selectivities in earlier visual areas such as V1 (e.g., Bredfeldt and Ringach 2002; Ringach et al. 1997, 2003). As in the latter studies, the present RSVP data can be used to examine the time course of shape selectivity. Because we used different shape sets, we can compare the time course of the selectivity for the shapes belonging to different sets with that of the selectivity for the shapes of a single set. If shapes from different sets are, at least on average, less similar than shapes from the same set, we could use this characteristic to address the question of whether the onset of selectivity depends on shape similarity because we would then expect earlier between-set than within-set selectivity.
To assess whether this prerequisite of greater within-set versus between-set similarity had been met, we first performed a hierarchical cluster analysis on the neuron-based similarities that were computed between all possible pairs of stimuli using the responses of all 80 neurons (neural-based Euclidean distances; see METHODS). As shown in Fig. 8, all stimuli from set 1 were assigned to the same cluster before being clustered with stimuli from other shape sets. The stimuli from set 2 were likewise clustered together first, as were the stimuli from set 5. However, the stimuli of sets 3 and 4 were not cleanly assigned to two separate clusters. Some stimuli from both sets were clustered with one another before being clustered with other stimuli from their respective sets. This is illustrated in more detail in Fig. 8, inset, in which the shapes of the different sets that were clustered are indicated by colored squares. Interestingly, the clustered shapes of the two sets are similar regarding their "blobby" nature. When either set 3 or 4 was excluded from the cluster analysis, the results showed the expected clustering: all stimuli from a set were first assigned to the same cluster before being clustered with stimuli from the other three sets. Therefore, to examine whether the onset of selectivity is earlier for the between-set compared with the within-set selectivity and to meet the prerequisite of larger within-set versus between-set similarity, we excluded shape set 4 from further analyses (exclusion of set 3 instead of set 4 resulted in highly similar findings).
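The prerequisite itself can be checked more directly than with a full cluster analysis. The sketch below is not the hierarchical clustering reported here; it only compares the mean neural Euclidean distance for stimulus pairs from the same set with that for pairs from different sets. The stimulus labels, response vectors, and function name are hypothetical.

```python
from itertools import combinations
from math import dist
from statistics import mean

def within_between(responses, set_of):
    """Mean neural Euclidean distance for same-set and different-set pairs.

    responses: stimulus name -> list of mean responses, one per neuron
    set_of:    stimulus name -> shape-set label
    """
    within, between = [], []
    for s, t in combinations(responses, 2):
        d = dist(responses[s], responses[t])
        (within if set_of[s] == set_of[t] else between).append(d)
    return mean(within), mean(between)

# Hypothetical population responses of three neurons to four stimuli.
responses = {
    "s1a": [10.0, 2.0, 5.0],
    "s1b": [11.0, 3.0, 5.0],
    "s2a": [2.0, 9.0, 1.0],
    "s2b": [3.0, 10.0, 1.0],
}
set_of = {"s1a": 1, "s1b": 1, "s2a": 2, "s2b": 2}

w, b = within_between(responses, set_of)
```

A smaller within-set than between-set distance is what the clustering in Fig. 8 expresses for sets 1, 2, and 5; the overlap of sets 3 and 4 would show up here as comparable within-set and between-set distances for those two sets.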
Similar results were obtained if the within-sets transmission rates were computed on only those cells showing significant response modulation (instead of on all 80 recorded cells). The latencies for the within-sets transmitted information were 101, 77, 85, and 109 ms (mean within-sets latency = 93 ms), whereas the peaks were reached after 173, 149, 157, and 165 ms for sets 1–3 and 5, respectively (mean within-sets peak = 161 ms). When we compare these values to the between-set latency and peak of 69 and 149 ms, respectively, it is clear that although the within-set selectivity overall takes somewhat longer to develop than the between-set selectivity, this difference can be rather small [minimum latency difference of 8 ms (set 2) and minimum peak difference of 0 ms (set 2)].
Experiment 2
We recorded from 46 neurons (26 from monkey J; 20 from monkey B), 40 of which showed shape-selectivity within at least one of the two range configurations. For 22 neurons, the wide-range configuration was presented first. For the remaining 18 cells, the narrow-range configuration was presented first. For the stimuli of the first-presented configurations, an average of 808.31 presentations per stimulus was obtained per cell (minimum = 595, maximum = 961). For the stimuli of the configurations presented second, 1,513.48 presentations were shown on average per cell (minimum = 590, maximum = 4,027).
Across animals, the recording positions were estimated to range from 12 to 16 mm anterior to the external auditory meatus. All neurons, except one that was recorded from the lower bank of the STS, were from the cortical convexity lateral to the anterior middle temporal sulcus or from the lip of the STS (area TEm) (Seltzer and Pandya 1978).
COMPARISON OF RSVP AND SLOW, INTERMITTENT STIMULUS PRESENTATION.
As in experiment 1, we wanted to check whether stimulus selectivity is preserved at the fast RSVP rate of 100 ms/image. Each neuron was tested in the search test using the same stimuli as in the subsequent RSVP condition. In that search test, stimuli were presented for 300 ms with an ISI of 700 ms. These are standard procedures for testing the responses of visual neurons. As for experiment 1, we used a ranking procedure to qualitatively compare the selectivity in the two testing procedures. It is important to note that in the search test, only a minimal number of trials was presented per stimulus, just enough to check which shape set elicited the best responses. We compared the responses in the two testing procedures for those neurons (n = 39) for which the stimuli were presented at least twice in the search test (median number of presentations = 3; 1st quartile = 2; 3rd quartile = 4). For both testing procedures, the responses were quantified using an identical 50- to 200-ms analysis window. We then ranked the responses to the stimuli of a given shape set in the RSVP test according to the responses in the search test. As shown in Fig. 10, the responses to the stimuli presented in the RSVP test monotonically decreased as a function of the stimulus rank determined in the search test. This effect of stimulus rank was significant [repeated-measures ANOVA; F(8,304) = 10.01, P < 0.001], demonstrating that, at the population level, the stimulus preference at the 100-ms/image RSVP rate was overall similar to that obtained when using the intermittent presentations. Note that the average response level was considerably lower for the RSVP than for the intermittent presentations. The stronger forward and backward masking and stronger repetition suppression (see following text) in the RSVP compared with the slow intermittent presentations are the most likely factors contributing to this difference in the overall responsiveness.
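The ranking procedure can be sketched as follows. The sketch assumes that each cell contributes one mean response per stimulus in each test; the numbers and function names are hypothetical. Because the ranks come from the search test and the plotted responses from the independent RSVP test, a decrease across ranks cannot arise from the ranking itself.

```python
def rsvp_by_search_rank(search, rsvp):
    """Reorder RSVP responses by the stimulus rank found in the search test.

    Rank 1 (index 0) is the stimulus with the largest search-test response.
    """
    order = sorted(range(len(search)), key=lambda i: search[i], reverse=True)
    return [rsvp[i] for i in order]

def mean_by_rank(cells):
    """Average the rank-ordered RSVP responses across cells.

    cells: list of (search_responses, rsvp_responses) pairs, one per cell
    """
    ranked = [rsvp_by_search_rank(search, rsvp) for search, rsvp in cells]
    return [sum(col) / len(ranked) for col in zip(*ranked)]

# Two hypothetical cells tested with three stimuli each (spikes/s).
cells = [
    ([30.0, 10.0, 20.0], [12.0, 4.0, 9.0]),
    ([5.0, 25.0, 15.0], [3.0, 10.0, 6.0]),
]

population = mean_by_rank(cells)  # [11.0, 7.5, 3.5]
```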
The DOS indices obtained in the search test correlated significantly with those obtained in the RSVP test [r = 0.35 (P < 0.05; n = 39)]. The Spearman-Brown split-half reliability coefficients for the DOS indices were 0.87 and 0.97 for the search test and RSVP test, respectively. Thus the low correlation between the DOS of the search and RSVP tests is not due to a low reliability of the DOS measure in the tests but reflects a genuine change in the degree of selectivity. The mean DOS for the search test was significantly higher than for the RSVP test (0.32 and 0.18, respectively; Wilcoxon matched pairs test, P < 0.001).
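The reliability argument can be made explicit with two standard psychometric formulas, sketched below. The Spearman-Brown step-up formula is the textbook one; the attenuation ceiling (the square root of the product of the two reliabilities) is added here only as an illustration and is not reported in the text. The function names are hypothetical.

```python
from math import sqrt

def spearman_brown(r_half):
    """Step a split-half correlation up to the reliability of the full test."""
    return 2 * r_half / (1 + r_half)

def attenuation_ceiling(rel_x, rel_y):
    """Largest correlation two measures can show given their reliabilities."""
    return sqrt(rel_x * rel_y)

# Reliabilities reported for the search test and the RSVP test.
ceiling = attenuation_ceiling(0.87, 0.97)  # about 0.92
```

With a ceiling of about 0.92, measurement noise alone cannot account for an observed correlation of 0.35.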
EFFECT OF STIMULUS STATISTICS ON STIMULUS SELECTIVITY. We performed analyses on the stimuli common to the two range configurations (stimuli A–E in Fig. 2) to study the effect of the stimulus distribution statistics, i.e., stimulus context, on the responses and selectivity of all tested cells. First, we performed a repeated-measures ANOVA on the neuronal responses for these common shapes with order of sets as a between-neurons variable (wide range first or narrow range first) and stimulus rank and range (wide or narrow) as within-neurons variables. Stimulus rank was determined as described in METHODS. As expected from the stimulus ranking procedure, the main effect of rank was significant [F(4,152) = 24.77, P < 0.001]. There was no main effect of range (F < 1), but more importantly, the interaction range × rank was significant [F(4,152) = 7.53, P < 0.001], indicating a difference in selectivity between the sets with different ranges.
To elaborate on this result, we compared the slopes of the polynomial fits to the population responses (n = 40 neurons) for the nine stimuli (Fig. 11A): the slope of the polynomial fit for the narrow range was steeper than the slope of the polynomial fit for the wide range (narrow range: y = 0.29x² − 3.87x + 32.03; wide range: y = 0.10x² − 1.82x + 27.35), indicating that the narrower range increased the selectivity of the same shapes. The polynomial fits for the two set orders are shown separately in Fig. 11A, insets. For both orders, the slope of the polynomial fit for the narrow range was steeper than the slope of the polynomial fit for the wide range (narrow range first: narrow range: y = 0.21x² − 2.89x + 28.67; wide range: y = 0.04x² − 1.15x + 25.93; wide range first: narrow range: y = 0.38x² − 4.86x + 35.39; wide range: y = 0.16x² − 2.50x + 28.77). Additional repeated-measures ANOVAs on the neuronal responses for the common shapes for both orders separately with stimulus rank and range (wide or narrow) as within-neurons variables confirmed that, for both orders, the main effect of rank and the interaction range × rank were significant [wide range first: rank: F(4,84) = 19.25, P < 0.001, range × rank: F(4,84) = 4.78, P < 0.002; narrow range first: rank: F(4,68) = 8.07, P < 0.001, range × rank: F(4,68) = 3.31, P < 0.02], whereas the main effect of range did not reach significance [F < 1 and F(1,17) = 2.42, P > 0.10 for, respectively, the wide range and narrow range first]. Note that the overall response level appears to depend on the order (compare Fig. 11A, insets a and b), but statistical testing showed that this interaction between range and order did not reach significance [range × order: F(1,38) = 2.77, P > 0.10]. In fact, there was no significant main effect of order of sets (F < 1), and no interaction effects with order were significant (rank × order: F < 1; range × rank × order: F < 1).
To show that not only is selectivity increased for the narrower range configuration but that the discriminability of the shapes is also improved, we employed ROC analysis. The goal of the ROC analysis was to measure the ability of the neurons to discriminate between the middle stimulus C and one of the extreme stimuli A or E (Fig. 2), depending on which of the two stimuli was ranked as having the best response. The ROC analysis takes into account not only mean differences in firing rate, as the DOS index does, but also the variability of the spike counts for the presentations of a stimulus. For each range configuration, this analysis was applied to the responses of each cell to the stimuli C and A or E of Fig. 2. The ROC values obtained for the narrow-range configuration were significantly larger than for the wide-range configuration (Wilcoxon matched pairs test, P < 0.001), indicating that the discriminability of the same shapes was larger for the narrow-range stimulus distribution than for the wide-range stimulus distribution (Fig. 11C). A separate analysis for the two orders excluded the possibility that the difference in the ROC values of the two range configurations is merely a presentation order effect: for each presentation order, the mean ROC value for the wide-range configuration was significantly smaller compared with the narrow range (Wilcoxon matched pairs test, both orders: P < 0.01; Fig. 11C).
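An ROC value of this kind can be computed directly from single-presentation spike counts. The sketch below assumes that the ROC value is the area under the ROC curve, which equals the probability that a randomly drawn response to the better stimulus exceeds a randomly drawn response to the other stimulus (ties counted as one half). The spike counts and function name are hypothetical.

```python
def roc_area(preferred, other):
    """Area under the ROC curve for two sets of spike counts.

    Computed as the fraction of (preferred, other) pairs in which the
    preferred count is larger, with ties contributing one half.
    """
    wins = 0.0
    for p in preferred:
        for o in other:
            if p > o:
                wins += 1.0
            elif p == o:
                wins += 0.5
    return wins / (len(preferred) * len(other))

# Hypothetical spike counts per presentation for an extreme stimulus (A or E)
# and for the middle stimulus C.
extreme = [5, 7, 6, 8]
middle = [4, 6, 5, 3]

area = roc_area(extreme, middle)  # 0.875
```

A value of 0.5 means that the two response distributions cannot be told apart, and a value of 1 means that they do not overlap. Two cells with the same mean difference can therefore yield different ROC values if their trial-to-trial variability differs.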
If discriminability of the shapes is improved for the narrow-range configuration over the wide range, one would expect the neurons to transmit more information about the stimuli in the narrow-range configuration as well. For both range configurations, we calculated the information available in the neuronal responses to the five stimuli common to the two configurations in a 50- to 200-ms time window. A Wilcoxon matched pairs test showed that significantly more information was indeed transmitted by the neurons in the narrow-range compared with the wide-range configuration (P < 0.02; mean Ic narrow range = 0.025 bits; mean Ic wide range = 0.018 bits; Fig. 11D).
If the increase in selectivity for the narrow compared with the wide range results from an adaptation to the stimulus statistics, one would expect this difference to evolve during stimulus exposure. Thus we computed DOS indices in succeeding blocks of 10 presentations per stimulus and did this for the first 30 blocks. These analyses, examining the evolution of the DOS index over time, were performed on the stimuli common to the two range configurations (A–E of Fig. 2). A GLM repeated-measures ANOVA with range (wide or narrow) and presentation block as within-neurons variables showed a significant main effect of range [F(1,39) = 5.51, P < 0.05], confirming a selectivity difference between the two range sets. The main effect of presentation block was also significant [F(29,1131) = 26.72, P < 0.001], showing a decrease in selectivity with an increasing number of presentations of the stimuli. Importantly, the interaction between range and presentation block was significant [F(29,1131) = 2.54, P < 0.001], revealing a different evolution of the neuronal selectivity over time for the two range configurations. To evaluate this interaction, we performed post hoc comparisons (Bonferroni test) between the two ranges for every presentation block. Initially, there were no significant differences between the ranges (blocks 1–7: P = 1). After 70 presentations/stimulus (block 8), the first difference began to appear: the DOS for the narrow-range set was larger than for the wide-range set (P < 0.05). Up to 150 presentations, most blocks showed this pattern (block 10: P < 0.001; block 12 up to 14: P < 0.01; but block 9: P > 0.5 and block 11: P > 0.14). After >150 presentations, the differences in DOS between the two ranges failed to reach significance (block 15: P > 0.07; block 16 up to 30: P = 1), although the DOS indices were still consistently greater for the narrow compared with the wider range. Thus as shown in Fig. 12, the higher DOS for the narrow compared with the wide range was not present from the start but evolved during the successive presentations, suggesting that this effect indeed resulted from an adaptation to the stimulus statistics.
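The block-wise analysis can be sketched as follows. The DOS index is defined in METHODS; the sketch assumes the commonly used depth-of-selectivity formula, DOS = (n − Σ Rᵢ/Rmax)/(n − 1), which may differ in detail from the definition used here. The data layout, function names, and example values are hypothetical.

```python
def dos(mean_rates):
    """Depth of selectivity (assumed formula): 0 = no selectivity,
    1 = response to a single stimulus only."""
    n = len(mean_rates)
    r_max = max(mean_rates)
    if r_max <= 0:
        return 0.0  # no response to any stimulus, treat as unselective
    return (n - sum(r / r_max for r in mean_rates)) / (n - 1)

def dos_per_block(trials, block_size=10):
    """DOS in successive blocks of block_size presentations per stimulus.

    trials: one list per stimulus with the response on each presentation,
            in presentation order
    """
    n_blocks = min(len(t) for t in trials) // block_size
    result = []
    for b in range(n_blocks):
        start, stop = b * block_size, (b + 1) * block_size
        means = [sum(t[start:stop]) / block_size for t in trials]
        result.append(dos(means))
    return result

# Two hypothetical stimuli, four presentations each, blocks of two.
trials = [[10.0, 10.0, 6.0, 6.0], [0.0, 0.0, 6.0, 6.0]]
blocks = dos_per_block(trials, block_size=2)  # [1.0, 0.0]
```

Computing the index separately for each range configuration and each block yields the two curves that are compared block by block in the post hoc tests.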