The Journal of Neurophysiology Vol. 88 No. 2 August 2002, pp. 929-941
Copyright ©2002 by the American Physiological Society
1Center for Learning and Memory, 2The Institute of Physical and Chemical Research-Massachusetts Institute of Technology Neuroscience Research Center, 3Center for Biological and Computational Learning and 4McGovern Institute for Brain Research, 5Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139
ABSTRACT
Freedman, David J., Maximilian Riesenhuber, Tomaso Poggio, and Earl K. Miller. Visual Categorization and the Primate Prefrontal Cortex: Neurophysiology and Behavior. J. Neurophysiol. 88: 929-941, 2002. The ability to group stimuli into meaningful categories is a fundamental cognitive process. To explore its neuronal basis, we trained monkeys to categorize computer-generated stimuli as "cats" and "dogs." A morphing system was used to systematically vary stimulus shape and precisely define a category boundary. Psychophysical testing and analysis of eye movements suggest that the monkeys categorized the stimuli by attending to multiple stimulus features. Neuronal activity in the lateral prefrontal cortex reflected the category of visual stimuli and changed with learning when a monkey was retrained with the same stimuli assigned to new categories. Further, many neurons showed activity that appeared to reflect the monkey's decision about whether two stimuli were from the same category or not. These results suggest that the lateral prefrontal cortex is an important part of the neuronal circuitry underlying category learning and category-based behaviors.
INTRODUCTION
Our perception of the environment is not a faithful registration of its physical attributes. Instead, we carve the world into meaningful groupings or categories. This process of abstracting and storing the commonalities among like-themed individuals is fundamental to cognitive processing because it imparts knowledge. For example, knowing that a new gadget is a "camera" instantly and effortlessly provides a great deal of information about its relevant parts and functions and spares us from having to learn anew each time we encounter a new individual. The ability to categorize stimuli is a cornerstone of complex behavior. Categories are evident in all sensory modalities and range from relatively simple (e.g., color perception) to the most abstract human concepts.
Because perceptual categories often group together very different-looking things, their representation must involve something beyond the sort of neuronal tuning that typifies encoding of physical appearance: gradual changes in neuronal activity as features gradually change (e.g., shape, orientation, direction). In fact, evidence that a human or animal has stored a category is that behavior does not track smoothly with changes in physical appearance: categories have sharp boundaries (not gradual transitions) between them and members of the same category are treated as equivalent even though their physical appearances may vary widely. A simple example is crickets sharply dividing (at 16 kHz) a continuum of pure tones into "mate" versus "bat" (a predator) (Wyttenbach et al. 1996). Other examples include humans' perception of the phonemes "b" versus "p" (Lieberman et al. 1967) and the facial expressions of emotion (Beale and Keil 1995).
The elaborate behavioral repertoire of advanced animals naturally depends on more elaborate categorization abilities. Their mental lexicon includes categories that are characterized along multiple dimensions and are often difficult to precisely define. In addition, advanced animals have an enormous capacity to learn and adapt. Monkeys, for example, have been taught categories such as animal versus nonanimal (Roberts and Mazmanian 1988), food versus nonfood (Fabre-Thorpe et al. 1998), tree versus nontree, fish versus nonfish (Vogels 1999), and ordinal numbers (Orlov et al. 2000). Such categories could be processed in brain areas involved in object recognition such as the inferior temporal cortex (ITC) (Desimone et al. 1984; Gross 1973; Logothetis and Sheinberg 1996; Tanaka 1996) as well as those involved in orchestrating voluntary, visually guided behaviors, such as the prefrontal cortex (PFC) (Fuster 1997; Goldman-Rakic 1987; Miller 2000; Miller and Cohen 2001). The PFC and ITC are directly connected (Ungerleider et al. 1989; Webster et al. 1994) and both contain neurons that often exhibit highly specific responses to complex stimuli such as trees, fishes, faces, brushes, etc. (Bruce et al. 1981; Desimone et al. 1984; Gross et al. 1972; Miller et al. 1996; Perrett et al. 1982; Scalaidhe et al. 1999; Tanaka et al. 1991) and are influenced by experience (Booth and Rolls 1998; Kobatake et al. 1998; Logothetis et al. 1995; Miyashita 1988; Rainer and Miller 2000). Whether or not their activity reflects stimulus categories has not been clear. These neurons have not been tested for the diagnostic characteristics of categories (e.g., sharp boundaries and within-category generalization); their specificity might reflect similarities and differences in physical appearance of the stimuli, not necessarily their category membership.
To evaluate the role of the PFC in visual categorization, we trained monkeys to categorize computer-generated stimuli into two categories, cats and dogs. A novel three-dimensional (3-D) morphing system was used to create a large set of parametric blends of six prototype images (3 species of cats and 3 breeds of dogs) (Beymer and Poggio 1996; Shelton 2000). This allowed us to establish a sharp category boundary between stimuli that were physically similar yet include in the same category stimuli that were visually dissimilar. A brief report of these results appeared previously (Freedman et al. 2001).
METHODS
Subjects
Two female adult rhesus monkeys (Macaca mulatta) weighing 6.0 and 7.5 kg were used in this study. Using previously described methods (Miller et al. 1993), they were implanted with a head bolt to immobilize the head during recording and with recording chambers. Eye movements were monitored and stored using an infrared eye-tracking system (Iscan, Cambridge, MA). All surgeries were performed under sterile conditions while the animals were anesthetized with isoflurane. The animals received postoperative antibiotics and analgesics and were handled in accord with National Institutes of Health guidelines and the recommendations of the Massachusetts Institute of Technology Animal Care and Use Committee.
Recording techniques
Electrode penetration sites were determined using magnetic resonance imaging scans obtained prior to surgery. The recording chambers were positioned stereotactically over the lateral prefrontal cortex such that the principal sulcus and ventrolateral prefrontal cortex were readily accessible. Neuronal activity was isolated using arrays of four to eight independently moveable tungsten microelectrodes (FHC Instruments, Bowdoinham, ME). The electrodes were advanced using custom-made screw-driven mini-microdrives (Nichols et al. 1998) mounted on a plastic grid (Crist Instruments, Damascus, MD). Neuronal activity was amplified, filtered, and stored for off-line sorting into individual neuron records (Plexon Systems, Dallas, TX). This allowed us to isolate an average of nearly two neurons per electrode. We did not prescreen neurons for task-related activity such as visual responsiveness or stimulus selectivity. Rather, we randomly selected neurons for study by advancing each electrode until the activity of one or more neurons was well isolated and then began data collection. This procedure was used to ensure an unbiased estimate of prefrontal activity.
Stimuli
A large continuous set of images was generated from three prototype cats and three prototype dogs (Fig. 1) with a novel algorithm (Shelton 2000). It found corresponding points between one of the prototypes and the others and then computed their differences as vectors. Morphs were created by linear combinations of these vectors added to that prototype. For more information see http://www.ai.mit.edu/people/cshelton/corr/. By morphing different amounts of the prototypes, we could generate thousands of unique images, continuously vary shape, and precisely define one or more arbitrary category boundaries. For most of the experiments, the images were divided into two groups, cats and dogs, with the boundary at an equal blend of cat and dog. Thus category membership was defined by whichever category contributed more (>50%) to a given morph. As a result, stimuli that were close to but on opposite sides of the boundary were visually similar, while stimuli that belonged to the same category could be visually dissimilar [e.g., the "housecat" (C1) and "cheetah" (C2) prototypes]. The stimuli differed along multiple features and were smoothly morphed, i.e., without sudden appearance or disappearance of any feature. They were 4.2° in diameter, had identical color, shading, orientation and scale, and were presented at the center of gaze.
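The morph construction and the >50% category rule can be illustrated with a minimal sketch. It assumes that each prototype is reduced to a flat list of numbers and that the correspondence step has already produced difference vectors relative to one base prototype. The function names, the `D1`-style dog labels, and the toy two-element vectors are hypothetical and are not taken from the Shelton (2000) implementation.

```python
from typing import Dict, List, Sequence


def make_morph(base: Sequence[float],
               offsets: Dict[str, Sequence[float]],
               weights: Dict[str, float]) -> List[float]:
    """Return base + sum_k weights[k] * offsets[k].

    `offsets[k]` is the difference vector (prototype k minus base prototype),
    assumed to be already computed by a correspondence algorithm.
    """
    morph = list(base)
    for name, w in weights.items():
        for i, delta in enumerate(offsets[name]):
            morph[i] += w * delta
    return morph


def category_of(contributions: Dict[str, float]) -> str:
    """Label a morph by whichever category contributes more than 50%.

    Keys starting with "C" are treated as cat prototypes and keys starting
    with "D" as dog prototypes (a naming assumption for this sketch).
    """
    cat = sum(w for k, w in contributions.items() if k.startswith("C"))
    dog = sum(w for k, w in contributions.items() if k.startswith("D"))
    if cat == dog:
        raise ValueError("an exact 50:50 blend lies on the category boundary")
    return "cat" if cat > dog else "dog"


# A 60:40 blend of base prototype C1 and a hypothetical dog prototype D1.
toy_offsets = {"D1": [10.0, 20.0]}          # D1 minus C1, toy values
blend = make_morph([0.0, 0.0], toy_offsets, {"D1": 0.4})
label = category_of({"C1": 0.6, "D1": 0.4})
```

Because the weights vary continuously, a 51:49 and a 49:51 blend are nearly identical vectors but receive different labels, which is the property the task relies on.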
We confirmed that the morphs did indeed vary smoothly by using an image correlation analysis. This analysis was used merely to ensure that the morphing system functioned as designed and generated stimuli that had no a priori discontinuities that the monkeys could exploit to solve the task. A two-dimensional (2-D) correlation coefficient was calculated for neighboring images at six levels of blends of cat and dog (cat:dog: 100:0, 80:20, 60:40, 40:60, 20:80, 0:100) along each of the nine between-category morph lines. The correlation was calculated by computing the 2-D correlation coefficient separately for each color plane and then averaging across planes. The correlation coefficient between neighbors remained constant and high (~0.9) across the entire morph space. The coefficients between stimuli directly across the cat/dog boundary did not differ from the coefficients calculated between adjacent morphs within the same category (1-way ANOVA, P = 0.44).
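The per-plane correlation described above can be sketched as follows. This is an illustrative version that assumes each image is a list of color planes and each plane is a list of pixel rows; the function names are hypothetical, and the original analysis software is not specified in the text.

```python
import math
from typing import List, Sequence

Plane = Sequence[Sequence[float]]


def corr2(a: Plane, b: Plane) -> float:
    """2-D correlation coefficient between two equally sized planes."""
    xs = [v for row in a for v in row]
    ys = [v for row in b for v in row]
    if len(xs) != len(ys) or not xs:
        raise ValueError("planes must be non-empty and the same size")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = math.sqrt(sum((x - mx) ** 2 for x in xs) *
                    sum((y - my) ** 2 for y in ys))
    if den == 0.0:
        raise ValueError("correlation is undefined for a constant plane")
    return num / den


def image_correlation(img_a: List[Plane], img_b: List[Plane]) -> float:
    """Average the per-plane coefficients across color planes."""
    if len(img_a) != len(img_b) or not img_a:
        raise ValueError("images must have the same number of planes")
    per_plane = [corr2(pa, pb) for pa, pb in zip(img_a, img_b)]
    return sum(per_plane) / len(per_plane)
```

Applying `image_correlation` to each pair of neighboring morphs along a morph line gives one coefficient per step, which can then be compared for within-category and across-boundary steps.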
Behavioral tasks
The monkeys performed a delayed match-to-category task that required them to judge whether two successive stimuli were from the same category (Fig. 2). The trial began when the monkey grasped a metal bar and fixated a small (0.3°) white spot at the center of a CRT screen. They were required to maintain gaze within a ±2° square window around the fixation spot for the entire trial. After the initial 500 ms of fixation, a sample image was presented at the center of the screen for 600 ms, followed by a 1,000-ms delay. Then a choice image appeared. If the sample and choice stimuli were from the same category (a category match), the monkeys were required to release the lever before the stimulus disappeared 600 ms after its onset to receive a juice reward. If the choice image was from a different category (a category nonmatch), there was an additional brief delay (600 ms) followed by another image that was always a match and thus required a lever release. As a result, a category judgment was only required for the first choice image. The second delay and match image were used so that a behavioral response would be required on every trial. This ensured that the monkeys were always paying attention. Because a decision was only required for the first choice image and the forthcoming behavioral response was predictable from the second delay onwards, that delay and subsequent match image will not be considered further. Note that with this design, the behavioral response (lever release) is not uniquely associated with a category (it was used to signal "match," not cat or dog) and, further, the monkeys could not predict whether the first choice stimulus would require a response. Thus any differential activity to the sample categories could not be related to the behavioral response. A 2,000- to 3,000-ms inter-trial interval followed correct trials. 
An error was defined as a lever release to a nonmatch or failure to release to a match; breaks of fixation were not counted among the error rates in behavioral analyses. An additional 3,000-ms "time-out" was added to the inter-trial interval following an error. Monkeys typically performed >700 correct trials per day.
The monkeys were gradually trained to categorize the images as cats and dogs by beginning with a delayed match-to-sample task in which the prototypes were used as samples, the match was always identical to it, and the nonmatches were a prototype from the other category. We then gradually included more and more morphs as samples and chose images at increasing distances from the prototypes. In parallel, matches were chosen from an increasingly greater distance of morph space around the sample while respecting the category boundary. Nonmatches were always from the other category.
During the course of training, >1,000 sample stimuli were used from all over the morph space. This prevented monkeys from solving the task by simply remembering specific stimulus-response contingencies. Neurophysiological recording, however, requires that a limited number of stimuli be used so that each can be repeated multiple times and neuronal variability can be assessed. Thus for recording experiments, we limited the samples to 54 images. This included the six prototype images and morphs from equally spaced intervals across each of the nine morph lines that connected each cat prototype to each dog prototype (Fig. 1A). There were six levels of blends of cat and dog (cat:dog, 100:0, 80:20, 60:40, 40:60, 20:80, 0:100) along the nine morph lines that crossed the two-category boundary (the red lines in Fig. 1A) and two levels along the six within-category morph lines (60:40, 40:60; the blue lines in Fig. 1A). To prevent monkeys from learning to memorize specific stimulus-response contingencies during the recording experiments, the choice stimuli were 100 randomly generated morphs from each category that were randomly paired with sample stimuli of the appropriate category. To ensure that category judgment errors were due to confusion over the sample category, the choice stimuli unambiguously belonged to a given category: they were always chosen to be at a distance of 20% from the boundary.
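The count of 54 sample images can be checked by enumerating the set. The sketch below assumes that the 100:0 and 0:100 levels on the between-category lines are the prototypes themselves (so each prototype is counted once) and uses hypothetical labels `C1`-`C3` and `D1`-`D3`; this is one reading of the description that reproduces the stated total.

```python
from itertools import combinations, product

CATS = ["C1", "C2", "C3"]
DOGS = ["D1", "D2", "D3"]


def build_sample_set():
    """Enumerate sample stimuli as tuples of (prototype, percent) pairs."""
    samples = set()
    # Six prototypes (the 100:0 and 0:100 ends of every between-category line).
    for proto in CATS + DOGS:
        samples.add(((proto, 100),))
    # Nine between-category lines, four interior blend levels each.
    for cat, dog in product(CATS, DOGS):
        for cat_pct in (80, 60, 40, 20):
            samples.add(((cat, cat_pct), (dog, 100 - cat_pct)))
    # Six within-category lines (3 cat pairs + 3 dog pairs), two levels each.
    for group in (CATS, DOGS):
        for first, second in combinations(group, 2):
            for first_pct in (60, 40):
                samples.add(((first, first_pct), (second, 100 - first_pct)))
    return samples


n_samples = len(build_sample_set())  # 6 + 9*4 + 6*2
```

Under these assumptions the three groups contribute 6, 36, and 12 images, respectively.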
The monkeys' categorization abilities were further examined with separate psychophysical tests employing an additional 14 morphs that were equally and tightly spaced (6.67% inter-stimulus distance) along each of the morph lines that crossed a category boundary. This allowed for a more precise description of the monkeys' ability to categorize stimuli near the category boundary. This task was identical in all timing and behavioral events except that the monkeys were randomly rewarded on trials in which the sample stimulus was very close to the category boundary (<10% difference). As monkeys were not shown morphs of closer than 10% distance from the category boundary during training, feedback during psychophysical testing was withheld on trials where such stimuli were presented as samples to discourage learning and changes in performance during those sessions.
To test the effects of learning on neuronal activity, we trained a monkey to re-categorize the cat and dog images into three new categories. Two new category boundaries were defined that were orthogonal to the original two-category boundary (Fig. 1A). This resulted in three new classes that each contained morphs centered around one cat prototype and one dog prototype. The same 54 sample stimuli were used for neurophysiological recording under the two and three-category schemes. As in the two-category experiment, the choice stimulus set consisted of 100 randomly generated morphs from each category that had a maximum component of 20% from each of the other two categories.
Data analysis
Neuronal activity level was calculated in four time epochs: baseline, sample presentation, first delay, and first choice stimulus presentation. Baseline neuronal activity was averaged over the 500 ms of fixation preceding sample presentation. Sample period activity was averaged over an 800-ms epoch beginning 100 ms after sample onset to account for the latency of PFC neuronal responses and included the first 300 ms following sample offset to include any activity related to that event. Delay activity was assessed over an 800-ms epoch beginning 300 ms after sample offset and ending 100 ms after first choice stimulus onset. Activity to that choice stimulus was averaged over an epoch that began 100 ms after its onset and ended 2 SD before the monkeys' average reaction time during each recording session to exclude any effects related to the execution of the behavioral response.
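The four analysis epochs can be written out explicitly. The sketch below expresses all times in milliseconds relative to sample onset, using the 600-ms sample and 1,000-ms delay from the task description. It assumes that reaction time is measured from choice onset; the function names are hypothetical.

```python
SAMPLE_MS = 600    # sample duration
DELAY_MS = 1000    # delay duration


def epoch_windows(mean_rt_ms, sd_rt_ms):
    """Return (start, stop) windows in ms relative to sample onset.

    The choice window ends 2 SD before the mean reaction time, with the
    reaction time assumed to be measured from choice onset.
    """
    choice_on = SAMPLE_MS + DELAY_MS
    return {
        "baseline": (-500, 0),
        "sample": (100, SAMPLE_MS + 300),            # 800-ms epoch
        "delay": (SAMPLE_MS + 300, choice_on + 100),  # 800-ms epoch
        "choice": (choice_on + 100, choice_on + mean_rt_ms - 2 * sd_rt_ms),
    }


def mean_rate_hz(spike_times_ms, window):
    """Mean firing rate (spikes/s) within a half-open window [start, stop)."""
    start, stop = window
    if stop <= start:
        raise ValueError("window must have positive duration")
    n = sum(1 for t in spike_times_ms if start <= t < stop)
    return 1000.0 * n / (stop - start)


windows = epoch_windows(mean_rt_ms=300, sd_rt_ms=50)   # toy reaction times
rate = mean_rate_hz([150, 200, 850, 950], windows["sample"])
```

With these definitions the sample and delay epochs abut at 900 ms and each spans 800 ms, matching the durations given above.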
Category information in neuronal activity was assessed using several methods. We computed an index of category tuning by calculating each neuron's average firing rate difference to pairs of sample morphs from the same category (within-category difference, WCD) and its average difference to samples from different categories (between-category difference, BCD) using images from the morph lines that crossed the category boundary. The WCD was defined by computing the absolute difference between the 100 and 80% morphs and between the 80 and 60% morphs for both categories and averaging these values. The BCD was computed by averaging the across-boundary differences between the 60% cats and 60% dogs. As a result, the distance between stimuli in morph space was identical (20%) for the BCD and WCD comparisons. A standard index was computed for each neuron by dividing the difference between their BCD and WCD values by their sum. This index can have values ranging from -1 to 1. Positive values indicate a larger difference between categories, whereas negative values reflect larger differences within the categories than between categories. BCD and WCD values were computed for neurons recorded during the three-category task in a similar fashion by determining differences in activity to samples that differed by 20% along the morph lines that crossed the three-category boundaries (Fig. 1). To ensure that the previously learned two-category scheme did not contribute to the values obtained when calculating category effects in the three category scheme, we excluded from this analysis the morph lines that crossed both the two- and three-category boundaries (e.g., the morph line connecting cat prototype 1 and dog prototype 2).
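The category index for the two-category scheme can be sketched in a few lines. The example assumes that each morph line is summarized by six mean firing rates ordered from the 100% cat to the 100% dog level, and that the absolute across-boundary difference is used for the BCD; the function name and the toy rates are hypothetical.

```python
def category_index(morph_lines):
    """(BCD - WCD) / (BCD + WCD) for one neuron.

    `morph_lines` is a list of 6-element sequences of mean firing rates,
    ordered cat 100%, 80%, 60%, dog 60%, 80%, 100% along each morph line.
    """
    wcd_terms, bcd_terms = [], []
    for r in morph_lines:
        if len(r) != 6:
            raise ValueError("each morph line needs six levels")
        # Within-category steps (20% apart) on both sides of the boundary.
        wcd_terms += [abs(r[0] - r[1]), abs(r[1] - r[2]),
                      abs(r[3] - r[4]), abs(r[4] - r[5])]
        # Across-boundary step (60% cat vs. 60% dog, also 20% apart).
        bcd_terms.append(abs(r[2] - r[3]))
    wcd = sum(wcd_terms) / len(wcd_terms)
    bcd = sum(bcd_terms) / len(bcd_terms)
    if bcd + wcd == 0:
        raise ValueError("index is undefined when all differences are zero")
    return (bcd - wcd) / (bcd + wcd)


step_like = category_index([[10, 10, 10, 2, 2, 2]])   # sharp boundary
linear = category_index([[10, 8, 6, 4, 2, 0]])        # gradual tuning
```

A neuron whose rate changes only at the boundary yields an index of 1, whereas a neuron whose rate changes by the same amount at every step yields 0.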
In addition to computing an index, we also compared between and within category differences by using a receiver-operating characteristics (ROC) analysis (Green and Swets 1966; Tolhurst et al. 1983; Vogels and Orban 1990). The ROC analysis measures the degree of overlap between two distributions of values. It has several advantages. First, it makes no assumptions about the two distributions, A and B, and thus returns an unbiased estimation of overlap. Second, it can be interpreted as the performance of an ideal observer in a two-way forced choice task; values of 0.5 indicate 50% correct classification (guessing) while values of 0 or 1 indicate error-free classification. Third, it is independent of neuronal firing rate and number of observations. While the category index described above explicitly tests for sharp tuning across the category boundary, the ROC value gives a general measure of the degree of category selectivity.
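The area under the ROC curve for two sets of firing rates can be computed directly from pairwise comparisons. The sketch below uses the standard equivalence between the ROC area and the probability that a value drawn from A exceeds a value drawn from B (ties counted as one half); it is an illustration, not the authors' code.

```python
def roc_area(a, b):
    """Area under the ROC curve for distributions A and B.

    Equals P(x > y) + 0.5 * P(x == y) for x drawn from `a` and y from `b`.
    """
    if not a or not b:
        raise ValueError("both distributions need at least one value")
    wins = 0.0
    for x in a:
        for y in b:
            if x > y:
                wins += 1.0
            elif x == y:
                wins += 0.5
    return wins / (len(a) * len(b))


separated = roc_area([5, 6, 7], [1, 2, 3])    # no overlap
overlapping = roc_area([1, 2], [1, 2])        # identical distributions
```

The ROC ranges reported in RESULTS begin at 0.50, which would be consistent with folding values below 0.5 as `max(area, 1 - area)`, although that step is an inference rather than something stated in the text.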
To determine the time course of category information in neuronal activity, we computed the ROC area within a time bin of 200 ms that was slid in 10-ms steps. We began 500 ms prior to sample stimulus onset and ended 100 ms following the first choice stimulus onset. This was computed for all neurons that were "category selective" (according to a 2-tailed t-test comparing the average response to cats and dogs, evaluated at P < 0.01) during the sample and/or delay epochs.
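A sliding-window version of the ROC analysis can be sketched as follows. The example assumes that each trial is a list of spike times in milliseconds relative to sample onset, that spike counts within each 200-ms window are compared between cat and dog trials, and that windows must lie entirely within the analysis period; these details and the function names are assumptions.

```python
def roc_area(a, b):
    """P(x > y) + 0.5 * P(x == y) for x in a, y in b."""
    wins = sum(1.0 if x > y else 0.5 if x == y else 0.0
               for x in a for y in b)
    return wins / (len(a) * len(b))


def window_starts(t_begin, t_end, width, step):
    """Start times of all windows that fit entirely in [t_begin, t_end]."""
    starts = []
    t = t_begin
    while t + width <= t_end:
        starts.append(t)
        t += step
    return starts


def spike_count(spike_times, start, width):
    """Number of spikes in the half-open window [start, start + width)."""
    return sum(1 for t in spike_times if start <= t < start + width)


def sliding_roc(trials_a, trials_b, t_begin=-500, t_end=1700,
                width=200, step=10):
    """Return (window center, ROC area) pairs across the trial.

    The default end time (1700 ms) is 100 ms after choice onset, given a
    600-ms sample and a 1,000-ms delay.
    """
    curve = []
    for start in window_starts(t_begin, t_end, width, step):
        counts_a = [spike_count(tr, start, width) for tr in trials_a]
        counts_b = [spike_count(tr, start, width) for tr in trials_b]
        curve.append((start + width / 2, roc_area(counts_a, counts_b)))
    return curve


# Toy data: two "cat" trials with early spikes, two silent "dog" trials.
toy = sliding_roc([[50, 60], [55]], [[], []],
                  t_begin=0, t_end=400, width=200, step=100)
```

With the default parameters this produces 201 overlapping windows per neuron.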
The latency for neuronal activation (irrespective of category information) was determined by compiling the average histogram of firing rate values for all responsive neurons (i.e., neurons that showed significantly different activity during the sample and/or delay periods compared with baseline, evaluated by 2-tailed t-test at P < 0.01.) This average histogram was smoothed with a 30-ms Gaussian window, and the latency was defined as the point of maximum inflection (determined by computing the 2nd derivative at all points along the histogram) of this curve following sample onset.
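The latency estimate can be illustrated with a small sketch. It assumes a population histogram with one value per time bin, treats the Gaussian width as a standard deviation expressed in bins (the text does not say whether "30-ms" refers to the standard deviation or the full width), and takes "maximum inflection" to mean the largest discrete second derivative after sample onset. All names are hypothetical.

```python
import math


def gaussian_kernel(sigma_bins):
    """Normalized Gaussian kernel truncated at +/- 3 sigma."""
    half = int(math.ceil(3 * sigma_bins))
    k = [math.exp(-0.5 * (i / sigma_bins) ** 2) for i in range(-half, half + 1)]
    total = sum(k)
    return [v / total for v in k]


def smooth(values, sigma_bins):
    """Convolve with a Gaussian, clamping indices at the edges."""
    kernel = gaussian_kernel(sigma_bins)
    half = len(kernel) // 2
    n = len(values)
    out = []
    for i in range(n):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = min(max(i + j - half, 0), n - 1)
            acc += w * values[idx]
        out.append(acc)
    return out


def latency_bin(histogram, onset_bin, sigma_bins):
    """Bin index of the largest second derivative at or after onset_bin."""
    s = smooth(histogram, sigma_bins)
    best_value, best_index = None, None
    for i in range(max(onset_bin, 1), len(s) - 1):
        d2 = s[i - 1] - 2 * s[i] + s[i + 1]   # discrete 2nd derivative
        if best_value is None or d2 > best_value:
            best_value, best_index = d2, i
    return best_index


# Toy histogram: 5 spikes/s until bin 150, then a step up to 25 spikes/s.
toy_hist = [5.0] * 150 + [25.0] * 150
toy_latency = latency_bin(toy_hist, onset_bin=50, sigma_bins=10)
```

For a smoothed step, the second derivative peaks roughly one standard deviation before the step, so the estimate depends on the smoothing width; with 1-ms bins, the stated 30-ms window would correspond to `sigma_bins=30` under the assumption above.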
Because neurons have a wide range of firing rates, these firing rates were normalized when computing histograms of average effect size across the population. For each neuron, the mean firing rates at each of the six steps from the cat to dog prototypes were computed. Then, the range of firing rates for each neuron were rescaled according to the minimum and maximum values across those six groups such that each neuron's minimum and maximum rate was 0.0 and 1.0, respectively. This allowed each neuron's range of modulation to contribute equally to the population average. Similar results were obtained by conducting these analyses using raw firing rates. Category information was not limited to a specific range of activity; as our single neuron examples will illustrate, it was evident in neurons exhibiting both low and high firing rates.
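The rescaling step amounts to min-max normalization of each neuron's six mean rates, followed by averaging across neurons. A minimal sketch, with hypothetical function names and toy rates:

```python
def normalize_rates(rates):
    """Rescale one neuron's mean rates so that min -> 0.0 and max -> 1.0."""
    lo, hi = min(rates), max(rates)
    if hi == lo:
        raise ValueError("cannot normalize a neuron with constant rates")
    return [(r - lo) / (hi - lo) for r in rates]


def population_average(neurons):
    """Average the normalized rates across neurons at each morph level."""
    normalized = [normalize_rates(r) for r in neurons]
    n_levels = len(normalized[0])
    return [sum(n[i] for n in normalized) / len(normalized)
            for i in range(n_levels)]


# Two toy neurons with very different firing rates but the same tuning shape.
low_rate = [2, 2, 2, 4, 4, 4]
high_rate = [20, 20, 20, 60, 60, 60]
average = population_average([low_rate, high_rate])
```

After normalization the two toy neurons contribute identically to the average, even though their raw rates differ by an order of magnitude.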
RESULTS
Behavioral data
CATEGORY JUDGMENTS. The monkeys were very accurate at the two-category judgments. During the recording sessions, performance was high (~90% correct), even when the samples were close to the category boundary; the monkeys classified dog-like cats (60:40 cat:dog) correctly ~90% of the time, and misclassified them as dogs only 10% of the time and vice-versa (Fig. 3). The results of psychophysical tests with more closely spaced morphs are shown in Fig. 4A. Even very near the boundary, when stimuli were very similar to (only 3% different from) the other category (i.e., a 53.3% cat or dog), the monkeys still performed significantly above chance (~65%, chance = 50%). Thus even with closely spaced morph images, the sudden change in behavior characteristic of category representations was evident.
STIMULUS FEATURES USED FOR CATEGORIZATION. To explore which features monkeys tended to focus on when categorizing the images into cats and dogs, we conducted further psychophysical testing. In one set of experiments, we removed the requirement to maintain central fixation (by removing the fixation point) and allowed the monkeys to freely gaze at the images. Given the close link between attention and gaze during unconstrained viewing, the assumption was that monkeys would spend more time gazing at the features that they were using to define the categories.
It seemed that the monkeys were not focusing on a single feature to categorize the images. Even though the sample presentation was brief (600 ms), they typically made several saccades while viewing the stimulus. One monkey made an average of 3.45 saccades and the other monkey averaged 2.25 saccades during sample presentation (defined as the number of eye movements exceeding 50°/s, equivalent to 0.5° of movement in adjacent 10-ms time bins). Interestingly, the two monkeys seemed to use different combinations of features to categorize the images. One monkey tended to look toward the tail of the sample images; its gaze was on average 1.46° to the left and 0.60° below the center of the screen. The other monkey tended to direct its gaze toward the head region; on average, its gaze was 0.57° to the right and 1.16° above central fixation. Figure 5A shows representative traces from one trial for each monkey. The gaze patterns for the two monkeys were significantly different from one another (along both the horizontal and vertical axes, t-tests, P < 0.01).
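The velocity criterion for saccades can be sketched as a simple threshold on gaze displacement between adjacent time bins. The example assumes gaze positions in degrees sampled once per 10-ms bin and counts each contiguous run of supra-threshold bins as one saccade; the grouping rule and the function name are assumptions.

```python
import math


def count_saccades(x_deg, y_deg, bin_ms=10.0, threshold_deg_per_s=50.0):
    """Count eye movements whose speed exceeds the velocity threshold.

    A threshold of 50 deg/s corresponds to 0.5 deg of movement between
    adjacent 10-ms bins. Consecutive fast bins count as a single saccade.
    """
    if len(x_deg) != len(y_deg):
        raise ValueError("x and y traces must have the same length")
    min_step = threshold_deg_per_s * bin_ms / 1000.0
    count = 0
    in_saccade = False
    for i in range(1, len(x_deg)):
        step = math.hypot(x_deg[i] - x_deg[i - 1], y_deg[i] - y_deg[i - 1])
        fast = step > min_step
        if fast and not in_saccade:
            count += 1
        in_saccade = fast
    return count


# Toy horizontal trace: a two-bin rightward movement, then a jump back.
toy_x = [0, 0, 0, 1, 2, 2, 2, 2, 0, 0]
toy_y = [0] * 10
n_saccades = count_saccades(toy_x, toy_y)
```

Slow drifts, such as 0.1° per bin, stay below the threshold and are not counted.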
Neuronal data
BASIC PROPERTIES. A total of 395 lateral prefrontal cortex neurons were recorded from three hemispheres of two monkeys during performance of the two-category DMC task (130 from monkey A, 265 from monkey B, Fig. 6). Visual responsiveness was evaluated by comparing the activity in the sample and delay intervals to baseline activity using two-tailed t-tests (evaluated at P < 0.01). Based on this criterion, 259/395 (66%, 113 from monkey A, 146 from monkey B) of neurons were activated during one or more task intervals. The onset of neuronal responsiveness across the population of responsive PFC neurons occurred at ~100 ms following sample stimulus onset (METHODS).
QUANTIFICATION OF CATEGORY EFFECT. To quantify the effect of category membership on the neuronal population, we computed a category index that reflected each neuron's average difference in activity to samples across the category boundary versus its difference to samples that were from the same category (see METHODS). Positive values indicate greater differences across the category boundary than within each category and negative index values indicate the opposite.
We examined all stimulus-selective neurons irrespective of whether they were category selective per se (n = 78, 67 sample, 32 delay). The distributions of category index values for the sample and delay periods are shown in Fig. 9. During both epochs, mean category index values were significantly greater than zero, i.e., the distribution was shifted toward category tuning (sample: 0.09, delay: 0.16, 1-tailed t-test, P < 0.001). For the subset of stimulus selective neurons that were category selective (n = 46 sample, 21 delay), the category indices were significantly greater (more shifted toward category tuning) during the delay than the sample interval (index = 0.12 sample, 0.21 delay, 2-tailed t-test, P = 0.04). Similar comparisons were also made by computing ROC values, which reflect how well an ideal observer would do at categorization using each neuron's firing rate (see METHODS). Across the population of stimulus selective neurons, the average ROC value was 0.59 (range: 0.50-0.75) in the sample interval and 0.59 in the delay (range: 0.50-0.82).
TEMPORAL CHARACTERISTICS OF CATEGORY INFORMATION. To examine the temporal dynamics of the representation of category information in PFC activity, we used a sliding ROC analysis (see METHODS). For this analysis, we included neurons whose average activity in the sample and/or delay intervals was significantly category-selective (t-test on activity to all cats vs. all dogs, evaluated at P < 0.01, n = 96 neurons).
Figure 10A shows the ROC values for each neuron in 10-ms time steps. The ROC values are sorted by their magnitude separately for each time bin to better illustrate the number of neurons exhibiting ROC values >0.5 (chance) at each moment in time. This revealed that, in general, more neurons conveyed category signals late in the sample epoch than during the delay interval but that the strongest category signals occurred in the late delay and early choice presentation epoch. Figure 10 indicates that there was a greater number of neurons with moderate or small ROC values for the time bins during the sample epoch (i.e., there are more "foothills" leading up to the "peaks") but that the highest ROC values occurred during the late delay/choice presentation (the "peaks" are highest then).
EFFECTS OF LEARNING ON CATEGORY REPRESENTATIONS. As our monkeys had no prior experience with cats and dogs, it seemed likely that the category information in the PFC was acquired through learning. To test the effects of learning on category representations, we retrained one monkey with the samples reassigned to three new categories (see Fig. 1 and METHODS). We then recorded from 103 neurons at similar depths and locations as those recorded during the two-category task. The incidence of neuronal responsiveness and stimulus selectivity during the three-category task was similar to that during the two-category task: ~63% (65/103) of neurons were visually responsive (t-test vs. baseline, as in the preceding text, P < 0.01) and ~23% (24/103, 14 sample, 14 delay) were stimulus selective (ANOVA with stimulus as factor, P < 0.01).
An example of a neuron recorded during the three-category task is shown in Fig. 11. It showed a significant effect of category during the delay period when the data were sorted according to the new, currently relevant, three-category scheme (ANOVA, P < 0.001); it distinguished one of the categories from the other two (Fig. 11A). By contrast, when the data were sorted according to the old (now irrelevant) cats and dogs category scheme (Fig. 11B), there was no significant difference (ANOVA, P = 0.74).
0.10, 1-tailed t-test, P = 0.9). However, when the category index was computed using the new (relevant) three-category boundaries, a significant category effect was observed in the delay (3-category index = 0.16, 1-tailed t-test, P = 0.008). As we found for the two-category task, three-category tuning was stronger during the delay than the sample interval (2-tailed t-test, P = 0.04). In fact, we did not detect significant category tuning across the population of stimulus selective neurons during the sample interval (3-category index = 0.01, P = 0.5), although it was detected when we computed the index for all neurons recorded during the three-category task (n = 103, see following text).
The same pattern of effects was observed across the entire population of neurons. Figure 12 shows the distribution of the category indices for all 103 (randomly sampled) cells recorded during the three-category task. The indices computed using the three-category scheme revealed significant category information (i.e., the distribution was shifted to the right) for both the sample interval (Fig. 12A, 3-category index = 0.065, 1-tailed t-test, P = 0.0007) and for the delay (Fig. 12B, 3-category index = 0.08, 1-tailed t-test, P = 0.0005). By contrast, when the indices were computed using the two-category scheme, there were no significant category effects during the sample (Fig. 12C, 2-category index = -0.02, 1-tailed t-test, P = 0.83) or the delay interval (Fig. 12D, 2-category index = -0.03, 1-tailed t-test, P = 0.82). Thus while information about the three-category scheme was evident in the population of PFC neurons, we could no longer detect information about the previously learned, now-irrelevant, cat and dog categories.
CATEGORY MATCH/NONMATCH EFFECTS. When the choice stimulus was presented, the monkeys needed to categorize it and then decide whether or not its category matched that of the sample. Both signals were present in neuronal responses to the choice stimulus. We evaluated activity in this interval with a two-way ANOVA (factor 1: choice stimulus category, factor 2: match vs. nonmatch, evaluated at P < 0.01). Just more than 9% (37/395) of the entire population of PFC neurons reflected the category of the choice stimulus while 11% (43/395) reflected its match/nonmatch status. More than two-thirds of the latter neurons (29/43) showed an effect of matching/nonmatching that was similar regardless of whether the choice stimulus was a cat or dog (main effect of match/nonmatch, no interaction with choice stimulus category). An example of a neuron that exhibited greater activity to matches is shown in Fig. 13A, and an example of a neuron showing greater activity to nonmatches is shown in Fig. 13B. This activity could have encoded the monkeys' decisions about the match/nonmatch status of stimuli and/or the motor aspects of the task (the lever release to matches). The remaining third of these neurons (14/43) showed an interaction between the match/nonmatch status and the category of the choice stimulus (P < 0.01). In other words, they showed match/nonmatch effects that were limited to, or much stronger for, one of the categories. An example of a "cat match" neuron is shown in Fig. 13C. Among match/nonmatch selective neurons, similar numbers preferred matches (22/43 or 51%) and nonmatches (21/43 or 49%).
|
ANALYSIS OF ERROR TRIALS. For insight into neuronal correlates of the monkeys' errors, we compared category effects and match/nonmatch effects on correctly performed trials versus those in which the monkeys made errors in category judgments. For these analyses, we included neurons that showed significant effects on correct trials. Figure 14 shows the results of these comparisons. Category information was evident during the sample interval on both correct and error trials; the average activity to the preferred versus nonpreferred category was significantly different for both types of trials (t-test, P < 0.001, Fig. 14A). But category information seemed to be lost in the delay. A significant difference between the average activity to the two categories was evident on correct trials (P < 0.001) but not on error trials (P = 0.79, Fig. 14B). Match/nonmatch effects also depended on whether the trial was correctly performed or not. For these analyses, the choice stimulus status (match or nonmatch) that elicited the greater activity on correct trials was termed the "preferred condition." For all neurons that showed pure match versus nonmatch effects (n = 25, i.e., match vs. nonmatch factor: P < 0.01, choice-category and interaction factors: P > 0.01), there was a significant difference (P < 0.001) in average activity to the preferred and nonpreferred conditions on correct trials. On error trials, however, the pattern reversed; there was an increase in activity to nonpreferred over preferred conditions that reached significance at P < 0.05 (Fig. 14C). This is presumably because the monkeys mistakenly responded to nonmatches as if they were matches.
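The logic of the "preferred condition" comparison can be sketched as follows: the preferred condition is fixed from correct trials only, and the same label is then applied to error trials, so a reversal shows up as a negative difference. The function names and firing rates below are hypothetical and serve only to illustrate the bookkeeping; they are not data from this study.

```python
from statistics import mean


def preferred_condition(correct_rates):
    """Condition ('match' or 'nonmatch') with the higher mean rate
    on correctly performed trials."""
    return max(correct_rates, key=lambda cond: mean(correct_rates[cond]))


def preference_difference(rates, preferred):
    """Mean rate for the preferred condition minus the other condition."""
    other = "nonmatch" if preferred == "match" else "match"
    return mean(rates[preferred]) - mean(rates[other])


# Hypothetical firing rates (spikes/s) for one neuron.
correct = {"match": [12.0, 14.0, 13.0], "nonmatch": [8.0, 9.0, 7.0]}
error = {"match": [8.0, 9.0], "nonmatch": [11.0, 12.0]}

pref = preferred_condition(correct)                 # defined on correct trials
diff_correct = preference_difference(correct, pref)  # positive by construction
diff_error = preference_difference(error, pref)      # negative if the pattern reverses
```

Because the preferred condition is chosen from correct trials, the difference on correct trials is positive by construction; only the error-trial difference is free to take either sign.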
|
| |
DISCUSSION |
|---|
|
|
|---|
We report that neurons in the PFC, a brain region central to many visual behaviors, exhibited properties that mirrored the behavioral characteristics of perceptual categories. They made sharper distinctions between stimuli from different categories than between stimuli from the same category, irrespective of their relative physical similarity. This explicit encoding of category membership in the activity of single neurons did not have to be the case. In principle, categories might have only been reflected on the ensemble level, as an emergent property of neurons encoding different defining features. Our results illustrate instead that familiar categories are reflected on the single-neuron level, much as physical attributes of stimuli are. This ability to carve category membership into the tuning of single neurons might allow for the quick and effortless classification of familiar objects. We also observed neuronal correlates of category match/nonmatch effects, suggesting a role for the PFC in these judgments and/or in issuing resulting motor commands. Finally, the observation that neuronal correlates of categories and category judgments waned or changed on error trials suggests that PFC activity was directly related to task performance.
The presence of category information in the PFC makes sense given its position at the apex of the perception-action cycle (Fuster 1990). Categories are defined by their functional relevance. Therefore they might be strongly represented in a brain area that mediates the functions needed to transform perceptions into voluntary actions, functions such as the integration of temporally separated events (Fuster et al. 2000), the acquisition and representation of behavior-guiding rules (Asaad et al. 1998; Wallis et al. 2001; White and Wise 1999) and visuomotor decisions (Kim and Shadlen 1999). The relative specialization of PFC in guiding behavior is reflected in the fact that its damage or reversible inactivation in monkeys causes deficits in performance of tasks demanding attention, working memory and response inhibition (Dias et al. 1996; Funahashi et al. 1993; Goldman and Rosvold 1970; Goldman et al. 1971; Gross and Weiskrantz 1962; Mishkin 1957; Mishkin and Manning 1978; Mishkin et al. 1969; Passingham 1975) but usually spares more purely perceptual functions such as object recognition, visual long-term memory, and "high level" visual analysis of form.
But an important contribution must also come from brain areas that mediate these visual functions, such as the ITC. Its damage causes deficits in visual discrimination, recognition, and learning (Blum et al. 1950; Kluver and Bucy 1938, 1939; Mishkin 1954, 1966; Mishkin and Pribram 1954) and category-specific agnosias (e.g., for faces) in humans (Damasio et al. 1982). Since the seminal work of Gross and coworkers, who reported a small population of "face cells," numerous studies have shown that ITC neurons show selectivity for objects that cannot be explained by sensitivity to low-level features, such as orientation or color (Desimone et al. 1984; Gross et al. 1972; Perret et al. 1992; Kobatake and Tanaka 1994; Tanaka et al. 1991). There has even been some recent evidence that suggests that these neurons play a direct role in categorization. Vogels (1999) recorded from the ITC in monkeys trained to categorize stimuli as tree versus nontree or fish versus nonfish and found that many neurons were selectively activated by the trained class (photographs of trees or fish) but not by distracter objects (photos of household objects or scenes containing neither trees nor fish). Kreiman et al. (2000) recorded from medial temporal lobe neurons in epileptic human patients while they classified stimuli into nine categories (e.g., faces, cars, food) and found neurons that selectively responded to stimuli from one of the categories. However, it has not been clear whether ITC neuronal selectivity encodes the category membership of stimuli, their physical appearance or some combination of these two factors. With a large, amorphous set of stimuli (such as trees or food), the category boundaries are unknown and the sharp transitions that are diagnostic of categories cannot be evaluated independently of stimulus similarity. Hence, neuronal selectivity for, say, trees could reflect the fact that trees look more like one another than other stimuli. Our results indicate that PFC neurons can convey information about the category of stimuli largely irrespective of their physical appearance.
The relative roles of the PFC and ITC in perceptual categorization remain to be determined. A recent theory of object recognition suggests that category tuning in the PFC could arise from converging inputs from ITC neurons that are stimulus, but not category, tuned (Riesenhuber and Poggio 2000). In this model, category-tuned neurons perform a weighted sum of the inputs from neurons broadly tuned for individuals followed by a thresholding operation. This suggests a greater role for the PFC in the explicit representation of categories. Another possibility is that category information is "loaded" into the PFC from long-term storage in the ITC. A recent study by Tomita et al. (1999) suggested that recall of long-term visual memories involved top-down signals from the PFC that activate representations stored in the ITC. Similar mechanisms might mediate the retrieval of category information stored in the ITC.
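The weighted-sum-and-threshold idea mentioned above can be illustrated with a one-dimensional toy example. This is only a sketch under stated assumptions, not the model of Riesenhuber and Poggio: it assumes a single morph axis running from 0 (cat prototype) to 1 (dog prototype) with the boundary at 0.5, Gaussian stimulus-tuned input units, and hand-picked weights, tuning width, and threshold. All names and values are hypothetical.

```python
import math

# Preferred morph positions of stimulus-tuned input units
# (0.0 = cat prototype, 1.0 = dog prototype; assumed layout).
CENTERS = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
# Positive weights from units on the cat side, negative from the dog side.
WEIGHTS = [1.0, 1.0, 1.0, -1.0, -1.0, -1.0]
SIGMA = 0.2       # tuning width of the input units (assumed)
THRESHOLD = 0.0   # firing threshold of the category unit (assumed)


def stimulus_tuned_responses(x):
    """Graded Gaussian responses of the input units to morph position x."""
    return [math.exp(-((x - c) ** 2) / (2 * SIGMA ** 2)) for c in CENTERS]


def category_unit(x):
    """Weighted sum of the inputs followed by a threshold: 1 = 'cat', 0 = 'dog'."""
    drive = sum(w * r for w, r in zip(WEIGHTS, stimulus_tuned_responses(x)))
    return 1 if drive > THRESHOLD else 0
```

In this toy, positions 0.1 and 0.4 produce the same output even though they differ physically, whereas 0.4 and 0.6, which are closer together, produce different outputs because they fall on opposite sides of the boundary. The input units themselves respond in a graded way and carry no such step.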
In sum, our results have provided insight into how perceptual categories and category-related behaviors are encoded in the PFC, a brain area that receives the outputs of sensory cortex and helps mediate voluntary action. How and whether category membership is encoded in sensory systems and the respective roles of the PFC and visual areas like the ITC in representing and storing category information remain to be determined.
| |
ACKNOWLEDGMENTS |
|---|
We thank C. Shelton for the morphing software and K. Anderson, D. Applewhite, W. Asaad, M. Machon, M. Mehta, A. Nieder, A. Pasupathy, J. Wallis, and M. Wicherski for valuable comments, help, and discussions.
This work was supported by a National Institute of Mental Health grant, a National Science Foundation-Knowledge and Distributed Intelligence grant, RIKEN-MIT Neuroscience Research Center, a McDonnell Pew Fellowship (M. Riesenhuber), the Whitaker Chair (T. Poggio), and the Class of 1956 Chair (E. K. Miller).
| |
FOOTNOTES |
|---|
Address for reprint requests: E. K. Miller, Bldg. E25, Room 236, Massachusetts Institute of Technology, Cambridge, MA 02139 (E-mail: ekm{at}ai.mit.edu).
Received 18 January 2002; accepted in final form 26 March 2002.
| |
REFERENCES |
|---|
|
|
|---|