1Computational Neuroscience Laboratory, The Salk Institute, La Jolla, California; 2Department of Neurobiology and Anatomy, University of Texas-Houston Medical School, Houston, Texas; and 3Division of Neuroscience, Baylor College of Medicine, Houston, Texas
Submitted 16 February 2006; accepted in final form 26 September 2006
| ABSTRACT |
|---|
|
|
|---|
| INTRODUCTION |
|---|
|
|
|---|
|
In this study, we examine this issue at the level of single-unit neurophysiology. The approach here is to compare shape selectivity in neurons in a high-level ventral area, the anterior inferotemporal cortex (AIT, generally corresponding to area TE), with shape selectivity in a high-level dorsal area, the lateral intraparietal cortex (LIP). Sereno and Maunsell (1998) have previously shown that shape selectivity is indeed a property of macaque monkey LIP neurons. Here we extend that work with additional data comparing responses in LIP and AIT using the same set of simple two-dimensional shapes.
Anterior inferotemporal cortex is the highest predominantly visual area within the ventral pathway. It is believed to be involved in visual object recognition (Tanaka 1996) and visual memory of objects and patterns (Miyashita 1993). As a late stage within the ventral pathway, it shows high selectivity to complex object shapes (Desimone 1984; Kobatake and Tanaka 1994; Rolls and Tovee 1995; Tamura and Tanaka 2001). Paradoxically, neurons in AIT are also involved in visual categorization (Freedman et al. 2003; Hung et al. 2005; Sigala and Logothetis 2002; Vogels 1999), which requires neurons to generalize across stimuli. The high selectivity required for object identification accompanied by an ability to generalize required for object categorization together define two key aspects of the problem of object recognition.
The lateral intraparietal cortex is a high-level area within the dorsal pathway concerned with issues related to eye movements (Colby and Goldberg 1999). Depending on the behavioral context, LIP neurons show mixed response components to the motor, sensory, mnemonic, attentional, and intentional aspects of the task (Colby and Goldberg 1999; Sereno and Amador 2006). However, they can respond briskly to flashed visual stimuli under a simple fixation task (Colby et al. 1996; Robinson et al. 1978; Sereno and Maunsell 1998), indicating that visual input in the absence of any actual or anticipated motor action is sufficient to drive them. Sereno and Maunsell (1998) have shown shape selectivity in LIP to two-dimensional patterns with single-unit recordings in behaving monkeys. Cue invariant responses to three-dimensional shapes were observed by Sereno et al. (2002) in anesthetized monkey using fMRI, and three-dimensional shape selectivity was shown by Shikata et al. (1996) and Nakamura et al. (2001) with single-unit recordings in behaving monkeys.
If one accepts the view that an important function of visual processing in the dorsal pathway is the guidance of skilled action, shape selectivity within parietal structures is not surprising, but occurs as a necessary step in directing appropriate motor responses to a given visual target. The question of interest here is whether shape encoding within the dorsal pathway has similar characteristics to shape encoding in the ventral pathway. Possibly, given the different functionality of the two pathways, different shape information is extracted. Alternatively, the same shape information may serve both pathways, being communicated along the strong neuroanatomical projections known to connect them (Webster et al. 1994).
| METHODS |
|---|
|
|
|---|
Two male macaque monkeys (Macaca mulatta, 10 kg, identified as monkey 1, and M. nemestrina, 8 kg, identified as monkey 2) were implanted with a head post and scleral search coil in an aseptic surgery. The animals were trained to perform a passive fixation task as well as three other behavioral tasks with these shape stimuli. Each animal had extensive daily training with these shape stimuli across a 6- to 12-mo period. After training, recording chambers were implanted. The chambers for LIP were implanted first and centered 3–5 mm posterior and 10–12 mm lateral, and the chambers for AIT were implanted after recording from LIP and were centered 18 mm anterior and 18–21 mm lateral. For all surgeries, animals were first restrained with ketamine and maintained on 1–3% isoflurane anesthetic. Throughout the surgical procedure, the animal was administered 5% dextrose in lactated Ringer solution, and the level of anesthesia was monitored and recorded. At the end of surgery and daily after surgery as needed, the animal was administered analgesics and antibiotics. All experimental protocols were approved by the University of Texas, Rutgers University, and Baylor College of Medicine Animal Welfare Committees and were in accordance with the National Institutes of Health Guidelines.
Tasks and procedures
Two macaque monkeys performed a passive fixation task with the stimulus shape positioned so that it fell within the receptive field. Recording procedures were identical in the two animals and the two areas. While the electrode was slowly advanced with a hydraulic micropositioner in search of cells, monkeys typically performed two matching tasks (see Sereno and Amador 2006 for details). In brief, we recorded from any neuron that we could isolate well and appeared stable. Before the start of data collection, preliminary testing was conducted to determine the stimulus position producing the strongest response, using a grid of locations over a range of stimulus eccentricities and angles within a polar coordinate system.
The stimulus for each trial in the fixation task was selected from a set of eight different shapes (Sereno and Maunsell 1998) (Table 1). Each shape was a simple two-dimensional black and white geometric form that fit within the same-sized square region and had an equal number of white pixels. All eight shapes were centered on the same position within the unit's receptive field. Stimulus size across the population of recorded units ranged from 0.65 to 2.00° of visual angle, with stimulus size increasing as a function of eccentricity to remain easily discriminable as acuity declined toward the periphery.
|
) or tungsten microelectrodes (1–2 MΩ; Microprobe).
Histology
In one animal, histological reconstruction using cresyl violet Nissl-stained samples (see Fig. 2) showed that the units recorded in posterior parietal cortex lay within area LIP in the lateral bank of the intraparietal sulcus. Units recorded in AIT were in the lower bank of the superior temporal sulcus (STS) and convexity of the middle temporal gyrus. A few perirhinal cells were included at our most anterior recording positions.
|
To determine which neurons were selective for shape, an F-test (ANOVA) was performed on the average rate of firing for the eight different shapes. Average firing rate was calculated starting 50 ms after stimulus until 50 ms after offset (collapsed across repetitions within and across trials). For these ANOVA tests of shape selectivity, a significance criterion level of P < 0.05 was used. Cells identified as shape selective in this manner were picked out for further analysis as detailed below, with response to stimuli always defined as average firing rate. This analysis and subsequent analyses pooled responses from multiple stimulus repetitions within each trial, with the exception of the calculations for the Fano factor described below.
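The selectivity test described above can be sketched as a one-way ANOVA across the eight shapes. This is a minimal illustration and not the authors' analysis code: the function name `is_shape_selective`, the input layout (one array of per-presentation firing rates per shape), and the synthetic data are all assumptions introduced here.

```python
import numpy as np
from scipy.stats import f_oneway

def is_shape_selective(rates_by_shape, alpha=0.05):
    """One-way ANOVA (F-test) on firing rates grouped by shape.

    rates_by_shape: sequence of 1-D arrays, one per shape, each holding the
    average firing rate (spikes/s) for every presentation of that shape.
    Returns (selective, p_value).
    """
    f_stat, p_value = f_oneway(*rates_by_shape)
    return bool(p_value < alpha), float(p_value)

# Synthetic illustration only (hypothetical numbers, not data from the study):
# 8 shapes, 6 presentations each.
rng = np.random.default_rng(0)
flat = [rng.normal(10.0, 2.0, size=6) for _ in range(8)]              # no tuning
tuned = [rng.normal(10.0 + 5.0 * k, 2.0, size=6) for k in range(8)]   # strong tuning

print(is_shape_selective(flat))
print(is_shape_selective(tuned))
```

A cell would be passed on to the later analyses only when the first element of the returned tuple is `True`.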
SHAPE SELECTIVITY MEASURES.
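Shape selectivity of each neuron was quantified by two measures. The first was the contrast measure of selectivity

$$S_C = \frac{r_{\max} - r_{\min}}{r_{\max} + r_{\min}} \tag{1}$$

where $r_{\max}$ and $r_{\min}$ were the maximum and minimum responses of the cell across the eight shapes.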
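The second selectivity measure was the kurtosis of the probability density function of responses of each cell to the eight shapes

$$S_K = \frac{\langle (r_i - \bar{r})^4 \rangle}{\sigma^4} - 3 \tag{2}$$

where $r_i$ was the response to the ith shape, $\bar{r}$ was the mean response over all shapes, $\sigma$ is the SD of the responses, and $\langle \cdot \rangle$ was the mean value operator (Lehky et al. 2005).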
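The two selectivity indices can be sketched in a few lines. This is an illustration under stated assumptions: the contrast index is taken to be (max − min)/(max + min), which matches the 0 to 1 range and the use of maximum and minimum responses reported in RESULTS, and the kurtosis is taken as excess kurtosis (3 subtracted, population SD). The function names are hypothetical.

```python
import numpy as np

def contrast_selectivity(responses):
    """Contrast index S_C from one cell's mean responses to each shape.

    Assumed form: (r_max - r_min) / (r_max + r_min), bounded by 0 and 1
    for non-negative firing rates.
    """
    r = np.asarray(responses, dtype=float)
    return (r.max() - r.min()) / (r.max() + r.min())

def kurtosis_selectivity(responses):
    """Kurtosis index S_K: <(r - mean)^4> / sigma^4 - 3.

    Assumptions: population SD (ddof=0) and subtraction of 3 so that a
    Gaussian response distribution gives 0.
    """
    r = np.asarray(responses, dtype=float)
    dev = r - r.mean()
    return np.mean(dev ** 4) / r.std() ** 4 - 3.0

# Hypothetical cell that fires to one shape only versus a broadly tuned cell.
narrow = [0, 0, 0, 0, 0, 0, 0, 8]
broad = [1, 2, 3, 4, 5, 6, 7, 8]
print(contrast_selectivity(narrow), kurtosis_selectivity(narrow))
print(contrast_selectivity(broad), kurtosis_selectivity(broad))
```

Unlike the contrast index, the kurtosis index uses the whole response distribution, so the two cells above are separated by it even where their maximum and minimum responses are similar.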
FANO FACTOR.
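We quantified the relative magnitude of signal and noise in the responses of AIT and LIP neurons. This was measured as the Fano factor F

$$F = \frac{\operatorname{var}(r)}{\operatorname{mean}(r)} \tag{3}$$

where the variance and mean of the response r were taken across trials for a particular stimulus.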
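A minimal sketch of the Fano factor as this paper describes it (response variance divided by response mean across trials for one stimulus). The use of the unbiased variance (`ddof=1`) and the helper `geometric_mean`, included because RESULTS summarizes Fano factors by their geometric mean, are assumptions made for illustration.

```python
import numpy as np

def fano_factor(trial_responses):
    """Variance-to-mean ratio of one cell's responses to one stimulus.

    trial_responses: one value per trial. Assumption: unbiased sample
    variance (ddof=1); the source does not state which estimator was used.
    """
    r = np.asarray(trial_responses, dtype=float)
    return r.var(ddof=1) / r.mean()

def geometric_mean(values):
    """Geometric mean, computed as the exponential of the mean log value."""
    v = np.asarray(values, dtype=float)
    return float(np.exp(np.mean(np.log(v))))

# Hypothetical responses over three trials.
print(fano_factor([2, 4, 6]))        # variance 4, mean 4
print(geometric_mean([1.0, 4.0]))
```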
SPARSENESS. In addition to looking at the responses of individual neurons to the shapes, we examined the collective population response of all neurons within our data set (either AIT or LIP) to each shape. Although each neuron was recorded one at a time, for the purpose of doing these population-coding analyses, we treated them as responding in parallel to the presentation of a stimulus.
We first determined the sparseness of the population representation for each shape (Lehky et al. 2005). This was calculated as the kurtosis of the probability density function of response magnitudes of all neurons within the population. Calculating population kurtosis (sparseness) used the same formula as the cell kurtosis (selectivity) above

$$K = \frac{\langle (r_i - \bar{r})^4 \rangle}{\sigma^4} - 3 \tag{4}$$

where $r_i$ was now the response of the ith neuron to a given shape and $\bar{r}$ was the mean response of all neurons in a population to that shape. Under some theoretical interpretations (Field 1994
For graphical purposes, the probability density functions (PDFs) of population responses were determined using kernel density estimation methods (Silverman 1986). Smoothing was carried out using a Gaussian kernel with a bandwidth of 3 spikes/s (half-width at half height). A separate PDF was calculated for each of the eight stimulus shapes and averaged together to produce the overall AIT or LIP population response PDF.
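The smoothing step can be sketched as a plain Gaussian kernel density estimate. This is an illustration only: it assumes the stated 3 spikes/s is the kernel's half-width at half maximum, which converts to a Gaussian SD of 3/√(2 ln 2) ≈ 2.55 spikes/s, and the function name and example responses are hypothetical.

```python
import numpy as np

def gaussian_kde_pdf(responses, grid, hwhm=3.0):
    """Kernel density estimate of a population response distribution.

    responses: one firing rate per neuron (spikes/s) for a single shape.
    grid: firing-rate values at which the density is evaluated.
    hwhm: kernel half-width at half maximum, converted to a Gaussian SD.
    """
    sigma = hwhm / np.sqrt(2.0 * np.log(2.0))
    responses = np.asarray(responses, dtype=float)
    grid = np.asarray(grid, dtype=float)
    z = (grid[:, None] - responses[None, :]) / sigma
    kernels = np.exp(-0.5 * z ** 2) / (sigma * np.sqrt(2.0 * np.pi))
    return kernels.mean(axis=1)  # average of one kernel per neuron

# Hypothetical responses of three neurons to one shape.
grid = np.linspace(-50.0, 100.0, 3001)
pdf = gaussian_kde_pdf([5.0, 10.0, 20.0], grid)
```

Averaging the per-shape outputs of this function over the eight shapes would give the pooled population PDF described in the text.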
CLUSTER ANALYSIS AND MULTIDIMENSIONAL SCALING. We further characterized population responses to different shapes by performing cluster analysis and multidimensional scaling. As a first step for both those procedures, it was necessary to define a distance or dissimilarity matrix, providing the value of some suitable scalar metric that indicated the difference between population responses to every pair of stimulus shapes. As that distance metric, d, we chose d = 1 − r, where r was the correlation coefficient between the components of two vectors (x1, x2, ..., xn) and (y1, y2, ..., yn) defining the population responses to two shapes. Here each vector component (xi or yi) represents the response of the ith neuron to a particular shape.
A correlation-based distance metric was chosen instead of Euclidean distance to emphasize the pattern of relative firing rates within a neural population rather than absolute differences. We regard different shapes as being coded by different directions of the population vector, and not the vector length, which may be affected by such factors as the contrast and luminance of shape stimuli. The Euclidean distance metric, and related metrics such as d', suffer from the disadvantage that they do not distinguish between changes in vector length and vector direction. On the other hand, the correlation metric picks out just changes in vector direction and ignores length, which we regard as a desirable characteristic. For example, if population response vectors for two stimuli had identical directions and differed only in length, the Euclidean metric would report that as having nonzero distance (different shapes), whereas the correlation metric would report that as zero distance (the same shape).
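The contrast between the two metrics can be checked numerically. The sketch below implements the correlation-based distance (one minus the Pearson correlation) and compares it with Euclidean distance for two population vectors that differ only in length; the function names and the numbers are hypothetical. Note that the Pearson correlation also removes the mean of each vector, so this distance ignores a uniform offset in firing rate as well as a change in scale.

```python
import numpy as np

def correlation_distance(x, y):
    """Distance between two population response vectors: 1 - Pearson r."""
    r = np.corrcoef(x, y)[0, 1]
    return 1.0 - r

def distance_matrix(responses):
    """Pairwise distances for an array of shape (n_shapes, n_neurons)."""
    return 1.0 - np.corrcoef(responses)

# Two hypothetical population vectors with the same direction but different length.
x = np.array([2.0, 10.0, 4.0, 7.0])
y = 3.0 * x

print(correlation_distance(x, y))   # approximately zero: treated as the same shape
print(np.linalg.norm(x - y))        # nonzero: Euclidean distance sees a difference
```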
We performed a hierarchical cluster analysis based on the distance matrix defined above. The cluster analysis was carried out using the single linkage (nearest neighbor) method. This procedure grouped together stimuli that produced population response patterns that showed high correlations. In other words, shapes that clustered together indicated that the population of neurons fired similarly (relative firing rates) to these shapes.
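A minimal sketch of single-linkage clustering on a correlation-based distance matrix, using SciPy. The synthetic responses (two groups of four "shapes" built around two hypothetical response prototypes across 40 model neurons) are an assumption for illustration and are not data from the study.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

# Synthetic population responses: 8 shapes x 40 neurons, two underlying groups.
rng = np.random.default_rng(1)
prototypes = rng.uniform(5.0, 30.0, size=(2, 40))
responses = np.vstack([
    prototypes[0] + rng.normal(0.0, 0.5, size=(4, 40)),
    prototypes[1] + rng.normal(0.0, 0.5, size=(4, 40)),
])

# Correlation-based distance matrix, d = 1 - r, cleaned up for SciPy.
D = 1.0 - np.corrcoef(responses)
D = (D + D.T) / 2.0          # remove rounding asymmetry
np.fill_diagonal(D, 0.0)

# Single linkage (nearest neighbor) on the condensed distance vector.
Z = linkage(squareform(D, checks=False), method="single")
labels = fcluster(Z, t=2, criterion="maxclust")  # cut the tree into 2 groups
print(labels)
```

The linkage matrix `Z` is what `scipy.cluster.hierarchy.dendrogram` would take to draw a tree of the kind shown in the cluster-analysis figure.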
We also performed a classical multidimensional scaling (MDS) analysis based on the same distance matrices. This served to reduce the dimensionality of the space representing each stimulus shape from potentially up to n dimensions, where n is the neural population size (although less if given a limited data sample size), to a smaller number of dimensions that capture most of the variance in the data. Such a low dimensional representation allows easier visualization of patterns within the data. Furthermore, to the extent that MDS shows that it is possible to form low-dimensional neural representations of shape compatible with the data indicates a possible advantage from a computational perspective because that may reduce the computational load required for the representation and recognition of visual stimuli.
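Classical (Torgerson) MDS can be sketched directly with an eigendecomposition of the double-centered squared-distance matrix. This is a generic textbook implementation, not the authors' code; the function name and the eigenvalue threshold used to decide which dimensions count as positive are assumptions.

```python
import numpy as np

def classical_mds(D, tol=1e-10):
    """Classical MDS of a symmetric distance matrix D (m x m).

    Returns (coords, eigenvalues). coords has one row per stimulus and one
    column per dimension with a positive eigenvalue, ordered by decreasing
    eigenvalue.
    """
    D = np.asarray(D, dtype=float)
    m = D.shape[0]
    J = np.eye(m) - np.ones((m, m)) / m      # centering matrix
    B = -0.5 * J @ (D ** 2) @ J              # double-centered Gram matrix
    evals, evecs = np.linalg.eigh(B)
    order = np.argsort(evals)[::-1]
    evals, evecs = evals[order], evecs[:, order]
    keep = evals > tol
    coords = evecs[:, keep] * np.sqrt(evals[keep])
    return coords, evals

# Check on four points whose true configuration is a 3 x 4 rectangle.
pts = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 4.0], [3.0, 4.0]])
D = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
coords, evals = classical_mds(D)
print(coords.shape)
```

For a correlation-based dissimilarity matrix some eigenvalues can be negative; only the dimensions with positive eigenvalues are retained here, in line with the later description of the Procrustes step.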
To better compare the MDS results for AIT and LIP, we carried out a Procrustes mapping (Borg and Groenen 1997) of the LIP configuration onto the AIT configuration. This involved finding a linear transform (scaling, rotation, translation, reflection) of the LIP configuration that minimized the mean square distance between the LIP points and AIT points. This procedure was carried out incorporating all dimensions generated by the MDS analysis that had positive eigenvalues (6 in AIT, 4 in LIP, plus 2 additional dimensions in LIP padded with 0s to match the dimensionality of AIT), with the results projected down to two dimensions for plotting purposes. Because we were interested in the relative configuration of points within the multidimensional shape space extracted by the MDS analysis and not their absolute locations, we used the Procrustes procedure to give an estimate of the degree of similarity between the shape encoding spaces used by AIT and LIP.
To determine the statistical significance of the goodness-of-fit value for the Procrustes mapping (mean square error), we performed a permutation test in which the goodness-of-fit value was repeatedly calculated for random permutations of the MDS configuration matrix. Random matrices were generated by permuting both the row and column indices of the original MDS matrix. The fraction of random matrices producing a goodness-of-fit measure better than the original observed value was calculated. If this fraction was greater than 0.05, it was concluded that the observed fit was no better than chance, that is, that the two inputs to the Procrustes mapping were not significantly similar.
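The Procrustes fit and its permutation test can be sketched with `scipy.spatial.procrustes`. Several details are assumptions made for illustration: SciPy standardizes both configurations and reports a sum of squared differences ("disparity") where the paper reports a mean square error, the permuted matrix is taken to be the configuration being mapped (LIP in the paper), and the function names and synthetic data are hypothetical.

```python
import numpy as np
from scipy.spatial import procrustes

def pad_columns(X, n_cols):
    """Zero-pad a configuration matrix so that it has n_cols dimensions."""
    X = np.asarray(X, dtype=float)
    out = np.zeros((X.shape[0], n_cols))
    out[:, : X.shape[1]] = X
    return out

def procrustes_permutation_test(target, source, n_perm=2000, seed=0):
    """Fit source onto target; return (disparity, fraction of better random fits)."""
    rng = np.random.default_rng(seed)
    k = max(np.shape(target)[1], np.shape(source)[1])
    target = pad_columns(target, k)
    source = pad_columns(source, k)
    _, _, observed = procrustes(target, source)
    better = 0
    for _ in range(n_perm):
        rows = rng.permutation(source.shape[0])   # shuffle stimulus labels
        cols = rng.permutation(k)                 # shuffle dimensions
        _, _, d = procrustes(target, source[rows][:, cols])
        better += int(d < observed)
    return observed, better / n_perm

# Synthetic check: the source is a rotated, scaled, shifted copy of the target,
# so the fit should be near-perfect and rarely matched by permuted data.
rng = np.random.default_rng(3)
target = rng.normal(size=(8, 3))
c, s = np.cos(0.7), np.sin(0.7)
R = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
source = 2.0 * target @ R + 5.0
print(procrustes_permutation_test(target, source, n_perm=200))
```

A small returned fraction means the observed fit is better than nearly all fits to shuffled configurations; a fraction above 0.05 means the observed fit cannot be distinguished from chance.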
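We also calculated the two-dimensional correlation coefficient between the two-dimensional MDS configuration for AIT and the corresponding Procrustes mapping for LIP

$$r_{2D} = \frac{\sum_{i=1}^{m}\sum_{j=1}^{n}(A_{ij}-\bar{A})(L_{ij}-\bar{L})}{\sqrt{\left[\sum_{i=1}^{m}\sum_{j=1}^{n}(A_{ij}-\bar{A})^{2}\right]\left[\sum_{i=1}^{m}\sum_{j=1}^{n}(L_{ij}-\bar{L})^{2}\right]}} \tag{5}$$

where $A_{ij}$ and $L_{ij}$ were the coordinates of the ith shape along the jth MDS dimension for AIT and LIP, $\bar{A}$ was the mean value of AIT coordinates of the stimulus shapes in the MDS space, $\bar{L}$ was the mean value for LIP, m = 8 is the number of shapes, and n = 2 was the number of MDS dimensions included in the calculation.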
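A two-dimensional correlation coefficient of this kind can be sketched as below. The sketch assumes the standard definition in which each matrix is centered on its own grand mean and the summed cross-product is normalized by the two summed squares; the function name `corr2d` and the example coordinates are hypothetical.

```python
import numpy as np

def corr2d(A, B):
    """Correlation coefficient between two equally shaped coordinate matrices.

    A, B: arrays of shape (m, n), for example m = 8 shapes by n = 2 MDS
    dimensions. Each matrix is centered on its own grand mean.
    """
    a = np.asarray(A, dtype=float)
    b = np.asarray(B, dtype=float)
    a = a - a.mean()
    b = b - b.mean()
    return (a * b).sum() / np.sqrt((a ** 2).sum() * (b ** 2).sum())

# Hypothetical 8 x 2 configuration compared with scaled and mirrored copies.
rng = np.random.default_rng(2)
A = rng.normal(size=(8, 2))
print(corr2d(A, 2.0 * A + 1.0))
print(corr2d(A, -A))
```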
CONTRIBUTION OF NOISE TO POPULATION SHAPE REPRESENTATIONS.
Differences in noise between the two cortical areas could artificially lead to differences in the dissimilarity matrices (i.e., greater noise, greater distances). Hence in addition to performing a Fano factor analysis of the responses, we split the data in half (even and odd trials) to directly estimate the contribution of noise to the dissimilarity matrix described above. We calculated the vector distance between even trials and odd trials for the same shape, which ideally should be zero. Because this procedure cut the volume of data in half for each vector from six to three trials, the contribution of noise was increased, leading to an overestimate of vector distances. We corrected for that taking into account the following two considerations. First, noise amplitude in the full-set data should be smaller than in the half-set by a factor of $\sqrt{2}$. Second, Monte Carlo simulations on a population of model neurons indicated that the (correlation-based) vector distance between noise-free and noise-degraded responses was approximately proportional to the square of the noise amplitude for small noise amplitudes. Combining those two factors, we estimated that vector distances in the full-set data would be one half those calculated from the half-set data.
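The correction can be written out as a short calculation. The symbols are introduced here for illustration only: $\sigma$ is the noise amplitude of a trial-averaged response, $d$ is the same-shape vector distance, and $c$ is an unspecified proportionality constant. The first relation assumes that averaging over twice as many trials reduces the noise amplitude by $\sqrt{2}$.

```latex
\sigma_{\text{full}} = \frac{\sigma_{\text{half}}}{\sqrt{2}},
\qquad
d \approx c\,\sigma^{2}
\quad\Longrightarrow\quad
d_{\text{full}} \approx c\,\sigma_{\text{full}}^{2}
= c\,\frac{\sigma_{\text{half}}^{2}}{2}
= \frac{d_{\text{half}}}{2}
```

Applied to the median half-set self-distances reported in RESULTS, this gives 0.08/2 = 0.04 for AIT and 0.02/2 = 0.01 for LIP.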
| RESULTS |
|---|
|
|
|---|
Neurons in LIP had substantially higher average firing rates than AIT neurons in response to the various shapes. This difference is apparent in plots of peristimulus responses as a function of time shown in Fig. 3, averaging over all shape-selective cells and all shape stimuli. Mean firing rate of LIP units to shape stimuli was 22 spikes/s compared with 12 spikes/s in AIT. This difference was significant at the P < 0.001 level. The same difference between LIP and AIT was apparent in the individual data from each of the two monkeys, with mean firing rates for monkey 1 being 11 spikes/s in AIT and 20 spikes/s in LIP, whereas for monkey 2, the mean firing rates were 13 spikes/s in AIT and 26 spikes/s in LIP. These data did not show significant changes in firing rate as a function of stimulus eccentricity (P > 0.1). Even when stimulus eccentricities were taken into account using an analysis of covariance (ANCOVA) procedure, the difference in activity between the two areas remained significant at the P < 0.001 level.
|
There were multiple presentations of the same stimulus within each trial (usually 2–4). Figure 3 shows both the response for the first presentation (solid line) and the average response over all repetitions (dashed line). Clearly stimulus repetition causes a response decrement in both areas. The data for all repetitions were normalized relative to the first repetition and plotted in Fig. 4. The decrement in activity caused by stimulus repetition is clearly visible in that plot. A two-way ANOVA was performed on the normalized data, with the two factors being number of repetitions and brain area (AIT or LIP), using an unbalanced ANOVA design because the sizes of the different groups were unequal. This analysis showed that the repetition decrement was significant (P < 0.02) but that there was no significant difference in repetition effects between the two brain areas (P > 0.1). Because repetition effects were not significantly different for the two areas, subsequent analysis pooled data from intratrial stimulus repetitions to increase the data sample size.
|
400 ms after the latency-shifted stimulus offset. The long-duration maintained activity, typically associated with some sort of cognitive processing such as memory, occurred even though the monkey was only performing a fixation task. LIP neurons were "noisier" than AIT neurons in the sense that individual neurons showed greater variability from trial to trial in response to a particular stimulus. The Fano factor (ratio of firing rate variance to firing rate mean; Eq. 3) for LIP neurons had a geometric mean of 3.29 and multiplicative SE factor of 1.05. For AIT neurons, the mean Fano factor was 2.81 with a SE of 1.04. The difference was significant (t-test, P < 0.02). In the context of measuring visual responses, "noise" could include any firing rate changes caused by other sources, including drifts in the cognitive state of the animal or adaptation to the stimuli.
The degree of shape selectivity for each cell was quantified by two different measures. The first was the contrast shape-selectivity index SC (Eq. 1), and the second was the kurtosis shape-selectivity index SK (Eq. 2). This latter measure incorporated information across the entire distribution of responses to all shape stimuli, not just maximum and minimum responses used with SC. In each case, larger values indicate greater shape selectivity.
Under both measures, AIT neurons showed higher average shape selectivity than neurons in LIP. For the contrast shape-selectivity index, SC(AIT) = 0.63 ± 0.09 (SE), whereas SC(LIP) = 0.45 ± 0.08, on a scale that ran from 0.0 to 1.0. For the kurtosis shape-selectivity index, SK(AIT) = 0.45 ± 0.50 and SK(LIP) = 0.30 ± 0.37, on an unbounded scale. The higher shape selectivity of AIT versus LIP neurons was significant at the P < 0.001 level under a two-sample t-test for both SC and SK measures.
An ANCOVA test on the stimulus selectivity indices indicated that eccentricity was highly nonsignificant as a factor affecting them (P > 0.8 for both indices). For the contrast selectivity measure SC, the difference between AIT and LIP remained significant after taking eccentricity into account as a nuisance variable (P < 0.02). The kurtosis selectivity index SK slightly missed significance under the same procedure (P = 0.07). However, because eccentricity was not a significant factor affecting SK, that result was likely caused by a loss in statistical power when the data were subdivided as a function of eccentricity, combined with the relatively high sensitivity of SK to noise. (Calculating kurtosis involves raising the data to the fourth power, which means noise is also raised to the fourth power.)
Calculating shape selectivity after subtracting average baseline activity did not change the observation of higher selectivity in AIT. It led to an increase in the mean of the contrast shape-selectivity measure for both AIT and LIP [SC(AIT) = 0.74 and SC(LIP) = 0.55] and produced no changes in the kurtosis shape-selectivity measure.
The higher shape selectivity in AIT held true for each monkey individually. For monkey 1, under the contrast selectivity measure SC(AIT) = 0.68 and SC(LIP) = 0.45, whereas for monkey 2, SC(AIT) = 0.52 and SC(LIP) = 0.45. Under the kurtosis selectivity measure, for monkey 1, SK(AIT) = 3.34 and SK(LIP) = 3.00, whereas for monkey 2, SK(AIT) = 3.74 and SK(LIP) = 2.51.
Histograms of shape-selectivity values for AIT and LIP neurons are shown in Fig. 5. The overall shapes of the histograms are broadly similar for both the SC and SK indices, suggesting that they are indeed both picking out similar aspects of neural responsiveness. The correlation coefficient between SC and SK was 0.53 for AIT and 0.44 for LIP, calculated from selectivity values taken on a cell-by-cell basis.
|
|
|
|
To quantify sparseness in the population coding of shape in AIT and LIP, we calculated the kurtosis, K, of the population response PDF for each stimulus shape. The results were K(AIT) = 1.3 ± 0.2 and K(LIP) = 4.8 ± 0.6. LIP exhibited much higher sparseness than AIT, and this difference was significant at the P < 0.001 level under a two-sample t-test.
Next, we quantified how population responses differed when presented with different stimulus shapes, using the distance metric defined in METHODS. Calculating the response distance between all pairs of shape stimuli produced a distance matrix (Table 1). Clearly, the distances in LIP are much smaller. Thus the pattern of responses within the LIP population to different shapes tends to be more highly correlated than it is in AIT. Subtracting average baseline activity had no effect on these correlation-based distance measures. Data from each individual monkey also showed the same pattern of smaller distances in LIP. For monkey 1, the average distance $\bar{d}$(AIT) = 0.37 and $\bar{d}$(LIP) = 0.18, whereas for monkey 2, $\bar{d}$(AIT) = 0.33 and $\bar{d}$(LIP) = 0.03.
Although we used a correlation-based definition of distance as the basis of our presentation here, smaller distances in LIP also occur if we use the related Mahalanobis distance measure. For the Mahalanobis measure, mean distance in AIT was 20.8, and in LIP was 11.5, with the difference being significant at P < 0.001 under a t-test.
As differences in noise could affect the distance matrices, the magnitude of noise within population representations of shape was estimated by splitting the data in half (even and odd trials) and calculating population vector distances between the two halves for the same shape (analogous to the distances in Table 1). If the system were noise-free, the distances would all be zero. In fact, we calculated the median self-distance of the shape stimuli to be 0.08 in AIT and 0.02 in LIP. However, splitting the data in this manner reduces the sample size from six to three trials. If reduced sample size is compensated for, the distance measures drop in half (see justification for this in METHODS), so that distance in AIT is 0.04 and distance in LIP is 0.01. These last numbers can be compared with those in Table 1 to give a rough measure of the contribution of noise. The small value of the AIT same-shape distance indicates that noise within the AIT population encoding of shape is insufficient to account for the fact that AIT distances in Table 1 are much greater than those of LIP.
In both AIT and LIP, average response distance between pairs of shapes declined as stimulus eccentricity increased (Fig. 9). These distances are analogous to those shown in Table 1, but are calculated separately for neurons stimulated at different eccentricities. Even when eccentricity was taken into account as a nuisance variable in an ANCOVA analysis, response vector distances in AIT remained significantly greater than in LIP (P < 0.05). Stimulus size was increased as a function of eccentricity, thus this analysis unavoidably confounded those two variables.
|
Cluster analysis dendrograms based on the distance matrices in Table 1 are shown in Fig. 10. As expected from the larger response distances in AIT compared with LIP, the dendrogram for AIT is spread over a larger vertical scale than that of LIP. Interestingly, for AIT, the cluster analysis seems to have divided the eight shape stimuli into three groups, based on similarities in the populations' (relative) response patterns. The first group (yellow) consists of those shapes with strong vertical and horizontal features in their configuration. The second group (green) contains hollow, doughnut-like shapes. Finally, the third group (purple) has shapes that are triangular-like. In LIP, on the other hand, there is a much weaker differentiation of stimulus shapes into clusters (other than the outlier H-shape).
|
A three-dimensional plot of the neural shape space produced by MDS is presented in Fig. 11A. The representations of the eight stimulus shapes within the AIT shape space are shown by the yellow, green, and purple dots, whose colors indicate the three groups of shapes previously picked out by cluster analysis (Fig. 10A). The three groups remain clearly separated for the MDS analysis, as they were for the cluster analysis. For LIP, the representations within its shape space are shown by blue dots. These are clumped near the origin because of the compressed scale of the LIP shape space relative to AIT (as would be expected from the relative magnitudes of the AIT and LIP distance matrices in Table 1). Data from each individual monkey also showed the same pattern of three clusters in AIT and a clustering of all shapes near the origin in LIP. Although both AIT and LIP are plotted on common axes for convenience, there is no necessity that the coordinate dimensions in the plot represent the same thing in each case, because the MDS analyses were carried out independently for the two brain areas.
|
The two-dimensional correlation coefficient (Eq. 5) between the AIT and LIP configurations in Fig. 11B is r2D = 0.80. However, a permutation test on the goodness-of-fit value for the Procrustes mapping rejected the hypothesis that the AIT and LIP shape spaces were identical (P > 0.25). This permutation test showed that the Procrustes fit between the AIT and LIP coordinates in shape space was not significantly better than could be obtained by performing a Procrustes fit between the AIT coordinates and a random permutation of the LIP coordinates. Moreover, the average distance of the eight shapes from the origin remained much greater for AIT than LIP even after the Procrustes transform of the LIP data, with mean distances of 0.26 for AIT and 0.14 for LIP (P = 0.02, paired t-test). Because the Procrustes transform included a scaling of the LIP shape space to maximize its congruency with the AIT space, the fact that a large difference in scale remained indicates that the Procrustes fit was constrained by major nonscale factors that differed between the two areas. Therefore despite some similarity indicated by the correlation coefficient, it seems that shape encoding in LIP is not simply an attenuated copy of shape encoding in AIT but includes notable differences that may reflect different goals within dorsal and ventral visual processing.
| DISCUSSION |
|---|
|
|
|---|
The majority of neurons in both AIT and LIP showed significant shape selectivity. Quantifying this effect further, we found that AIT neurons on average had higher shape selectivity than those of LIP (Fig. 5). In other words, AIT neurons were more narrowly tuned within neural shape space. Because neural responses to shape were measured under a simple fixation task, motor responses and cognitive factors such as attention were constant for the two cortical areas.
While many units in AIT showed low to moderate selectivity, there was a substantial subpopulation with quite high selectivity, a pattern of results similar to that seen in the AIT data of Op de Beeck et al. (2001). LIP, on the other hand, had few highly selective cells. The higher shape selectivity we observed in AIT neurons indicates that different shapes are represented within AIT more distinctively than in LIP.
Cognitive effects
In addition to shape selectivity, we observed two effects that may be related to cognitive processes. These were both apparent in the peristimulus response plots (Fig. 3). The first was a repetition suppression effect. When the stimulus was presented multiple times within a trial, the response declined (Fig. 4). The second was maintained activity during the time periods between stimulus repetitions, which was much more prominent in LIP. Because the monkeys were performing only a fixation task, these effects were likely to be associated with reflexive, stimulus-driven, "bottom-up" cognitive processing, rather than goal-directed or "top-down" processing.
That the repetition suppression was not purely a low-level biophysical adaptation effect, but rather a more complex, possibly cognitive phenomenon, is indicated by the fact that response decrements were reset between trials. The suppression was not cumulative over extended periods of repeated exposure. Repetition effects have been widely reported in AIT (Brown and Bashir 2002; Fahy et al. 1993; Miller et al. 1991, 1993; Xiang and Brown 1998). The reset property has also been noted previously in AIT (Holscher and Rolls 2002; Miller et al. 1991, 1993). Here we report that similar repetition effects occur in LIP.
Repetition effects in AIT have been associated with various types of memory, including priming, long-term recognition memory, and visual working memory (Miller et al. 1991; Xiang and Brown 1998). It is possible that such effects in LIP may similarly be associated with some form of memory, perhaps in a manner that reflects its role in spatial processing and attention. One possibility is that repetition decrements in LIP are involved in an attentional phenomenon called "inhibition of return" (IOR) (Itti and Koch 2002; Klein 2000; Posner and Cohen 1984; Sereno et al. 2006; Tipper et al. 1991), which biases attention away from returning to recently examined locations or objects in favor of novel stimuli. Inhibition of return correlates with attenuated activity in the superior colliculus (SC) (Bell et al. 2004; Dorris et al. 2002). However, this may be secondary to attenuated input to SC from parietal cortex (Dorris et al. 2002). If that is the case, the attenuation we observed in LIP activity during stimulus repetition may be an upstream source of the inhibition of return effects observed in SC.
Sparseness of shape representations
The probability density functions of neural responses within AIT and LIP populations are shown in Fig. 8. Both PDFs have high kurtosis (Eq. 4), with mean values K(AIT) = 1.3 and K(LIP) = 4.8 over all stimulus shapes. In particular, the LIP kurtosis is vastly larger than anything reported in the visual literature. Population response distributions with high kurtosis are said to have high sparseness. Under some theories of neural encoding, high sparseness has been interpreted as an indicator of statistically efficient coding of visual stimuli, based on information theoretic arguments (Field 1994
; Simoncelli and Olshausen 2001
). However, Lehky et al. (2005)
have disputed any necessary connection between high sparseness and efficient coding, arguing that high sparseness may reflect deterministic nonlinearities in the system involved in implementing visual algorithms. Rather than efficient coding, the unusually high sparseness seen in LIP populations may be related to the implementation of visuomotor algorithms.
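As a rough illustration of how kurtosis tracks sparseness, the sketch below computes the excess kurtosis of a set of responses. It assumes the standard moment-based definition (fourth central moment divided by squared variance, minus 3); the exact normalization of Eq. 4 may differ, and the response values are made up for illustration.

```python
def excess_kurtosis(values):
    """Excess kurtosis m4 / m2**2 - 3 (assumed definition; Eq. 4 may differ)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    return m4 / (m2 ** 2) - 3.0

# Hypothetical firing rates, not data from the study.
dense_responses = [4, 5, 6, 5, 4, 6, 5, 5, 4, 6]    # most units respond moderately
sparse_responses = [0, 0, 0, 0, 0, 0, 0, 0, 0, 10]  # one unit responds strongly

print(excess_kurtosis(dense_responses))
print(excess_kurtosis(sparse_responses))
```

A distribution in which a few units respond strongly and most respond weakly has a heavy tail and therefore a larger kurtosis than one in which all units respond at similar levels.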
Population encoding of shape
Each shape can be treated as a point within a high-dimensional representation space. This may be a high-dimensional shape parameter space in psychophysics experiments (Cutzu and Edelman 1996
; Edelman and Duvdevani-Bar 1997
; Sugihara et al. 1998
) or a high-dimensional neural space in neurophysiology experiments (for example, Kayaert et al. 2005
; Op de Beeck et al. 2001
), in which the size of the neural encoding population defines the dimensionality of the representation. The encodings of two shapes within the AIT (n = 62) and LIP (n = 43) populations are shown in Fig. 7, where the heights of the histogram bars give the coordinate values along each dimension of the n-dimensional neural representation space.
Once the shapes are embedded in this n-dimensional space, the distances separating them can be calculated, in our case using a correlation-based measure of distance, d = 1 − r, where r is the correlation between the population responses to two shapes. The separations between all pairs of shapes are given in Table 1, which shows that LIP distances are substantially smaller than those of AIT (i.e., LIP population responses to different shapes are more correlated than in AIT).
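The distance measure can be sketched in a few lines. This is a minimal illustration that assumes r is the Pearson correlation coefficient computed across neurons; the function names and response vectors are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length response vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def correlation_distance(x, y):
    """d = 1 - r: 0 for perfectly correlated patterns, 2 for anticorrelated ones."""
    return 1.0 - pearson_r(x, y)

def distance_matrix(responses):
    """Pairwise distances between population response vectors, one per shape."""
    return [[correlation_distance(a, b) for b in responses] for a in responses]

# Hypothetical population responses (one value per neuron) to three shapes.
shapes = [[1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.5], [4.0, 1.0, 3.0, 0.5]]
for row in distance_matrix(shapes):
    print([round(d, 3) for d in row])
```

Because the measure depends only on correlation, two population patterns that differ by an overall scaling of firing rates are treated as identical.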
The relatively small vector distances separating LIP responses to different shapes at the population level may in part result from the lower shape selectivity of individual LIP neurons. However, the differences between LIP and AIT in Table 1 appear too large to be fully accounted for by the relatively modest differences in shape selectivity seen in Fig. 5, and we suspect there may be another aspect to the matter.
The high correlations observed in the LIP population may reflect the fact that LIP is involved in sensorimotor integration and thus has both a sensory and a motor aspect to its function. Neural responses to different shapes would be expected to contain a large component related to motor response as well as the sensory component we are primarily concerned with here. Both the motor and sensory components would in general be stimulus dependent (shape-dependent in this case), and we shall therefore call them the "shape motor response" and the "shape sensory response." If the monkey is in a fixed motor response state (e.g., simply fixating, as in this experiment), the shape motor response component acts as a constant background that is modulated by the shape sensory response component. The presence of this background dilutes the shape sensory responses and leads to high correlation between the overall responses to different shapes.
In mathematical terms, the LIP sensory response to shape X is given by an n-dimensional vector $\mathbf{R}_X = (x_1, x_2, \ldots, x_n)$, and the sensory response to shape Y is $\mathbf{R}_Y = (y_1, y_2, \ldots, y_n)$. In addition, there is shape motor response background activity denoted by the vector $\mathbf{R}_B = (b_1, b_2, \ldots, b_n)$. $\mathbf{R}_B$ remains the same for different shapes in the present task, but in other tasks may depend on the motor response required for a particular shape and the cognitive behavior required of the monkey. Total neural activities for the two shapes are given by

$$\mathbf{R}_X^{\mathrm{tot}} = \mathbf{R}_X + \mathbf{R}_B$$

$$\mathbf{R}_Y^{\mathrm{tot}} = \mathbf{R}_Y + \mathbf{R}_B \qquad (6)$$
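The dilution argument can be checked numerically. The sketch below adds a common background vector to two dissimilar sensory response vectors and compares the correlation-based distance before and after. All numbers are invented for illustration, and Pearson correlation is assumed as the correlation measure.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length response vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical shape sensory responses of four neurons to shapes X and Y.
r_x = [1.0, 0.0, 2.0, 0.0]
r_y = [0.0, 2.0, 0.0, 1.0]
# Hypothetical shape motor background: large, identical for both shapes,
# but different from neuron to neuron.
r_b = [20.0, 5.0, 12.0, 1.0]

# Total activity is the sum of the sensory response and the background.
total_x = [s + b for s, b in zip(r_x, r_b)]
total_y = [s + b for s, b in zip(r_y, r_b)]

d_sensory = 1.0 - pearson_r(r_x, r_y)
d_total = 1.0 - pearson_r(total_x, total_y)
print(round(d_sensory, 3), round(d_total, 3))
```

Note that the background must vary across neurons to have this effect. A background that is the same for every neuron would only shift both vectors by a constant and leave the Pearson correlation unchanged.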
Postulating a large shape motor response component to LIP responses seems plausible and attractive given the diverse array of factors affecting LIP activity (Colby and Goldberg 1999
; Colby et al. 1996
). Furthermore, this account would leave open the possibility that the shape-selective responses that are seen in a majority of LIP units may become structured differently in another behavioral context. Such a behavioral effect on the grouping of visual stimuli was found in an adjacent parietal region, AIP, where the similarity structure was based on the specific and different behavioral responses associated with those visual stimuli (Murata et al. 2000
) and not visual similarity per se. In contrast to Murata et al. (2000)
, in this study no response was allowed, and hence the different shapes were not associated with explicit and different behaviors. Comparing AIT with LIP, the shape motor response background within AIT would presumably be much smaller and perhaps negligible, so that AIT responses in a passive fixation task would be expected to show greater response distances (lower correlations) for different shapes.
Another factor to consider in seeking to explain the low discriminability (high correlations) among different shape responses in LIP relative to AIT is the possibility that discriminability in LIP neurons is better for three-dimensional shapes than for the flat ones used here. Such a suggestion is plausible in light of the particular importance of three-dimensional space in parietal structures for purposes of sensorimotor coordination. However, this issue is not straightforward because AIT neurons (Janssen et al. 2000
; Sereno et al. 2002
; Tanaka et al. 2001
) as well as LIP neurons (Gnadt and May 1995
; Nakamura et al. 2001
; Sereno et al. 2002
) are sensitive to three-dimensional representations. Finally, it is possible that in a different context (e.g., a shape discrimination task), AIT responses may change and show even greater response distances than reported here for a passive fixation task.
For the AIT neural population, the cluster analysis divided the eight stimulus shapes into three groups (Fig. 10A). The members of each group clearly resembled one another ("yellow shapes": dominated by horizontal and vertical edges; "green shapes": variants of a hollow, doughnut-like ring; "purple shapes": triangle-like). In LIP (Fig. 10B), on the other hand, given the small distances separating population responses to different shapes (Table 1), the cluster analysis produced a compressed, poorly differentiated hierarchy, with a response distance of less than 0.05 separating seven of the eight shapes.
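To make the grouping step concrete, the sketch below runs a simple agglomerative clustering on a matrix of pairwise distances. Average linkage is an assumption made for illustration, since the linkage rule used for Fig. 10 is not stated in this section, and the toy distance matrix is invented.

```python
def average_linkage_clusters(dist, n_clusters):
    """Repeatedly merge the two closest clusters until n_clusters remain.

    dist is a symmetric matrix of pairwise distances (e.g., d = 1 - r).
    Cluster-to-cluster distance is the mean pairwise distance between
    members (average linkage, assumed here).
    """
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                total = sum(dist[i][j] for i in clusters[a] for j in clusters[b])
                avg = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or avg < best[0]:
                    best = (avg, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]  # merge b into a
        del clusters[b]
    return [sorted(c) for c in clusters]

# Hypothetical distances among four shapes: 0 and 1 are similar, 2 and 3 are similar.
toy_dist = [
    [0.0, 0.1, 0.9, 0.9],
    [0.1, 0.0, 0.9, 0.9],
    [0.9, 0.9, 0.0, 0.2],
    [0.9, 0.9, 0.2, 0.0],
]
print(average_linkage_clusters(toy_dist, 2))
```

When all pairwise distances are small and nearly equal, as in the LIP data, the merge order is determined by very small differences, which is why the resulting hierarchy is compressed and poorly differentiated.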
The cluster analysis results show that, in AIT, roughly similar shapes produce similar patterns of activity within a neural population encoding shape. Thus we see in AIT both the potential ability to identify shapes based on a particular pattern of activity in a population and the capability to generalize and group shapes based on correlations in population activities. Indeed, the close connection between identification and generalization of patterns is emphasized by formal models within the experimental psychology literature (Ashby and Lee 1991
; Nosofsky 1986
).
The form of generalization we observe reinforces a population coding approach to categorization. Other studies in AIT that have grouped stimuli based on patterns of population responses include Hung et al. (2005)
, Rolls and Tovee (1995)
, and Tsao et al. (2006)
. A number of AIT studies have focused on categorization effects at the level of single neurons rather than the population level (Freedman et al. 2003
; Sigala and Logothetis 2002
; Vogels 1999
). However, if one views a shape category as a particular region within a multidimensional shape space, examination of the properties of single cells in isolation rather than as populations becomes an inadequate approach to characterizing the system. We are not aware of any previous studies related to shape categorization in LIP, at either the population level or the single-unit level.
If categorization is indeed built on top of the sort of grouping we observed in AIT, LIP would be expected to be poor at visual categorization, given the relatively undifferentiated results of the cluster analysis for that area (Fig. 10B). That is, differences in responses to different shapes are so small that, in a noisy system, the shape space would need to be carved up into much coarser chunks to be reliably differentiated. Therefore, given the small distances separating LIP population responses to different shapes (Table 1), LIP is expected to do worse than AIT not only in object identification but also in object categorization.
The MDS analysis reinforced the results of the cluster analysis. Projection of the MDS configuration to three dimensions (Fig. 11A) picked out the same three groups of shapes in AIT as did the cluster analysis. In the same figure, the LIP configuration appears bunched near the origin, again because of the small distances separating LIP responses to different shapes.
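For readers unfamiliar with MDS, the sketch below implements classical (Torgerson) scaling, which recovers low-dimensional coordinates from a distance matrix by double centering the squared distances and extracting the leading eigenvectors. This is an illustrative assumption: the MDS variant used for Fig. 11 is not specified in this section, and correlation-based distances are not guaranteed to be exactly Euclidean. Power iteration is used only to keep the example free of external libraries.

```python
import math

def classical_mds(dist, n_dims=2, n_iter=200):
    """Classical MDS of a symmetric distance matrix; returns n x n_dims coordinates."""
    n = len(dist)
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in sq]
    grand_mean = sum(row_mean) / n
    # Double centering: B = -1/2 * J * D^2 * J
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
          for j in range(n)] for i in range(n)]
    coords = [[0.0] * n_dims for _ in range(n)]
    for k in range(n_dims):
        v = [math.sin(i + 1.0 + k) for i in range(n)]  # deterministic start vector
        eigval = 0.0
        for _ in range(n_iter):  # power iteration for the leading eigenpair
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:  # nothing left to extract
                eigval = 0.0
                break
            v = [x / norm for x in w]
            eigval = sum(v[i] * sum(b[i][j] * v[j] for j in range(n))
                         for i in range(n))  # Rayleigh quotient
        if eigval <= 1e-12:  # remaining dimensions carry no positive variance
            break
        scale = math.sqrt(eigval)
        for i in range(n):
            coords[i][k] = scale * v[i]
        for i in range(n):  # deflate so the next pass finds the next eigenpair
            for j in range(n):
                b[i][j] -= eigval * v[i] * v[j]
    return coords

# Three hypothetical points on a line at positions 0, 1, and 3.
line_dist = [[0.0, 1.0, 3.0], [1.0, 0.0, 2.0], [3.0, 2.0, 0.0]]
print(classical_mds(line_dist, n_dims=2))
```

If all entries of the distance matrix are small, as for LIP, the recovered coordinates are correspondingly small, which is what produces a configuration bunched near the origin.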
Most of the variance in the data for both LIP and AIT was accounted for by the three dimensions plotted in Fig. 11A (in fact, most variance can be accounted for by just 2 dimensions). While this is consistent with previous reports that the visual system is encoding shapes within a low-dimensional space (Cutzu and Edelman 1996
; Edelman and Duvdevani-Bar 1997
; Op de Beeck et al. 2001
; Sugihara et al. 1998
), we cannot place too much significance on this low-dimensionality aspect of the analysis, given the limited number of shapes (n = 8) in our sample. Regardless of the nominally large dimensionality of a shape parameter space or neural population encoding space, N points (shapes) cannot possibly occupy more than an (N − 1)-dimensional space. With a small shape sample, even that limited space may not be homogeneously filled. Therefore the dimensionality reductions are somewhat less impressive than they might seem. While the reports cited above were aware of this "small sample" issue and constrained their experimental designs to deal with it, the issue of low-dimensional shape representation could still benefit from being re-examined with a larger sample of shapes.
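The bound on dimensionality follows from a short linear-algebra argument, sketched here with notation introduced only for this illustration ($\mathbf{r}_i$ for the population response vector to shape $i$, $\mathbf{c}_i$ for its mean-centered version).

```latex
% N response vectors in an n-dimensional neural space, with n >= N.
\bar{\mathbf{r}} = \frac{1}{N}\sum_{i=1}^{N}\mathbf{r}_i ,
\qquad
\mathbf{c}_i = \mathbf{r}_i - \bar{\mathbf{r}}
\quad (i = 1,\ldots,N)

% The centered vectors sum to zero, so they are linearly dependent:
\sum_{i=1}^{N}\mathbf{c}_i = \mathbf{0}
\;\;\Longrightarrow\;\;
\mathbf{c}_N = -\sum_{i=1}^{N-1}\mathbf{c}_i

% Hence the points lie in an affine subspace of dimension at most N - 1:
\dim\,\mathrm{span}\{\mathbf{c}_1,\ldots,\mathbf{c}_N\} \le N - 1
```

With the eight shapes used here, the configuration can occupy at most seven dimensions, whether the encoding population contains 43 or 62 neurons.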
Overall, it seems that AIT and LIP, in the context of a passive fixation task, do not represent these simple, two-dimensional shapes within the same shape space. Specifically, under identical conditions, we found that, compared with LIP, the patterns of activity within neural populations in AIT show an enhanced capacity not only to discriminate between shapes but also to generalize and group shapes based on correlations in population activities. These differences between the two pathways suggest that the shape spaces may be tailored to the different purposes for which they are being used and are not simply mirror copies of each other.
As discussed earlier, we suggest that the high correlation among LIP responses to different shapes could be caused by the shape sensory response component being diluted by a large, invariant shape motor response component. The invariance of the motor response component to different shapes in turn reflects the task conditions of this particular experiment. Given the important role of the parietal pathway in sensorimotor integration, it is possible that under different task conditions, the LIP shape space would look quite different (see Murata et al. 2000
for behavioral effects on visual responsiveness in the AIP area, adjacent to LIP).
In summary, in a first comparison of shape selectivities between the two visual pathways, we observed lower selectivity for two-dimensional shapes in LIP than in AIT neurons, suggesting that LIP may be capable of less precise and subtle identification of objects. When population activities were examined as a whole, responses in the LIP population to different shapes were less distinct than those in AIT and showed a more poorly differentiated grouping of similar shapes. The striking attenuation in the shape modulation of population responses in LIP points to a reduced capability for precise object identification and categorization, at least within the present behavioral context. The attenuation of the shape signal in LIP may reflect its mixture with other signals, perhaps related to LIP's role in sensorimotor integration. These findings support the idea that shape selectivities in the dorsal pathway are in some measure independent and are not merely the duplication of those formed in the ventral pathway.
| GRANTS |
|---|
|
|
|---|