Department of Psychology and Center for Cognitive Neuroscience, University of Pennsylvania, Philadelphia, Pennsylvania
Submitted 2 January 2007; accepted in final form 15 March 2007
| ABSTRACT |
|---|
| INTRODUCTION |
|---|
In the current study, we tested this idea by using functional magnetic resonance imaging (fMRI) to measure the neural response to familiar and unfamiliar scenes. By "visual scene" we mean a section of the world that is potentially visible from a single vantage point, such as a view of a room, a landscape, a city street, or an image of such a section of the world (Henderson and Hollingworth 1999; Intraub 1997) (see Fig. 1). In this usage, the term "scene" contrasts with the term "object," which we use to refer to decontextualized compact entities such as faces, cars, and chairs (Epstein 2005). We hypothesized that scenes from familiar environments might engage orientational or memory systems not engaged by scenes from unfamiliar environments or engage qualitatively different representations within these systems. We further hypothesized that these differences might be relatively automatic, occurring even when subjects do not explicitly attempt to use the scenes for spatial orientation.
Although these results suggest the possibility that scene processing in the PPA, RSC, and TOS might be affected by familiarity with the environment from which the scene is drawn, previous studies have not found clear evidence for this idea. For example, an earlier study from our group observed no significant main effect of environmental familiarity on the response to scenes in the PPA (Epstein et al. 1999). However, the number of subjects was relatively small (n = 8) and response in the TOS and RSC was not examined. Indeed, somewhat counter to our results, a recent study by Rosenbaum and colleagues (2004) found greater response to familiar landmarks than to unfamiliar buildings in a posterior parahippocampal/lingual region that may adjoin the PPA. However, the data in this study were analyzed using a whole-brain analysis rather than a region of interest analysis, so the overlap between the activated region and the PPA was unclear. Furthermore, the possibility of familiarity effects in the RSC and TOS could not be excluded.
fMRI studies have also examined the degree to which representations in scene processing regions are viewpoint specific (i.e., different views of a scene evoke different representations) versus viewpoint invariant (i.e., different views of a scene evoke the same representation). An initial experiment with unfamiliar tabletop scenes indicated that scene processing within the PPA is largely viewpoint specific (Epstein et al. 2003), consistent with behavioral results (Chua and Chun 2003). More recent results indicate that some degree of viewpoint invariance might develop within the PPA as subjects become familiar with the scenes over the course of an experimental session (Epstein et al. 2005) or if the differences between viewpoints are relatively small (Ewbank et al. 2005). These results suggest that familiarity with scenes obtained through real-world experience with a familiar environment might lead to the formation of viewpoint-invariant representations that might facilitate the recognition of real-world locations from different views. Alternatively, these results might simply reflect a temporary within-session facilitation of scene processing that has little to do with long-term changes caused by real-world navigational experience. The current study was intended to distinguish between these possibilities within the PPA and also to extend the previous results to scene-processing regions outside of the PPA.
Subjects in the current study were students from the University of Pennsylvania and Temple University, and stimuli were photographs of locations on the two university campuses. Subjects were highly familiar with their own college campus but had only minimal experience with the other college campus. We tested for effects of environmental familiarity in two ways. First, the overall magnitude of the fMRI response to photographs of the familiar college campus was compared with the magnitude of response to photographs of the unfamiliar college campus. We reasoned that cortical regions involved in spatial orientation would be more strongly engaged when viewing images of the familiar campus than when viewing images of the unfamiliar campus because information about the world extending beyond the boundaries of the photograph is only available for the familiar campus. Second, the reduction of response observed on repetition of a scene was compared for scenes obtained from familiar and unfamiliar environments. These repetition suppression (RS) effects (sometimes referred to as fMRI adaptation effects) are believed to index processing overlap between the original and repeated item (Grill-Spector and Malach 2001). In particular, reduction in response observed on repetition of the same item from a different viewpoint is taken as evidence for processing that has at least some degree of viewpoint invariance, whereas reduction of response observed only on repetition of the same item from the same viewpoint is taken as evidence for processing that has at least some degree of viewpoint specificity (Epstein et al. 2003, 2005; Ewbank et al. 2005; Grill-Spector et al. 1999; James et al. 2002; Vuilleumier et al. 2002). We hypothesized that the degree to which scenes are processed in a viewpoint-invariant versus viewpoint-specific manner might vary as a function of environmental familiarity.
We present data from two experiments. Experiment 1 examined the effects of long-term (i.e., real world) familiarity on scene processing, whereas experiment 2 examined the effects of both long-term familiarity with a college campus and short-term (i.e., within-scan-session) familiarity with specific scene images. To anticipate, we find that scene processing regions respond more strongly to familiar locations than to unfamiliar locations and that the viewpoint invariance of the processing depends on short-term (within-scan-session) familiarity; however, we find little evidence that long-term familiarity with a location leads to more viewpoint-invariant processing.
| METHODS |
|---|
Healthy right-handed volunteers (28) were recruited from the University of Pennsylvania and Temple University communities and scanned with fMRI after giving written informed consent according to procedures approved by the University of Pennsylvania institutional review board. Of these 28 volunteers, 14 (7 from Penn; 7 from Temple) were run in experiment 1, and 14 (7 from Penn; 7 from Temple) were run in experiment 2. All subjects had normal or corrected-to-normal vision and were highly familiar with their home campus (average length of experience 3.0 ± 1.0 yr) but had at most minimal familiarity with the other campus.
MRI acquisition
Scanning was performed at the Hospital of the University of Pennsylvania on a 3 Tesla Siemens Trio equipped with a Siemens body coil and a four-channel head coil. T2*-weighted images sensitive to blood-oxygenation-level-dependent contrasts were acquired using a gradient-echo echo-planar pulse sequence (TR = 2,000 ms, TE = 30 ms, matrix size = 64 x 64, voxel size = 3 x 3 x 3 mm or 2.9688 x 2.9688 x 3 mm, 33 axial slices). Stimuli were rear projected onto a Mylar screen at the head of the scanner with an Epson 8100 3-LCD projector equipped with a Buhl long-throw lens and viewed through a mirror mounted to the head coil.
Stimuli
A digital camera was used to collect images of various locations from the University of Pennsylvania and Temple University campuses. Three images of each location were taken from different views. View 2 was a head-on view of the scene, whereas views 1 and 3 were viewpoint shifts of 60-70° to the left and the right of the central view, respectively (Fig. 1). Stimuli were normalized for familiarity by a group of students at each school (Penn: n = 12; Temple: n = 55). The students rated the pictures on a scale of 1 to 4 in response to the question: "Do you recognize this place?" with 1 indicating the response "Yes, and I am pretty sure where it is," 2 indicating "Yes, but I don't know where it is," 3 indicating "Maybe, it looks familiar, but I am not sure," and 4 indicating "No." The final stimulus set consisted of 144 pictures from each school (3 images each of 48 locations). Within this set, ratings on the normalization ranged from 1 to 2. The average score for Penn was 1.25 ± 0.27 and for Temple was 1.37 ± 0.30.
Procedure
EXPERIMENT 1.
Scan sessions consisted of six experimental scans followed by two functional localizer scans. Experimental scans were 9 min 16 s long and were divided into 80 6-s stimulus trials interspersed with 30 2-s "null" trials and a 16-s fixation period at the end of the scan. Functional localizer scans were 8 min 12 s in length and were divided into 16-s epochs during which subjects viewed digitized color photographs of faces, common objects, scenes, and other stimuli presented at a rate of 1.25 pictures/s in a blocked design as described previously (Epstein et al. 2005).
Each stimulus trial (Fig. 2) began with a 500-ms fixation cross followed by a 500-ms gray screen with a black outline, which alerted subjects to the forthcoming presentation of the visual scenes. After a 500-ms interval, two scenes were sequentially presented for 500 ms each with a 500-ms interstimulus interval. This was followed by a 3,000-ms poststimulus interval in which a fixation cross appeared on the screen and subjects used a button box to report whether the two scenes depicted the same location or different locations (irrespective of viewpoint). Response latencies were measured after the onset of the second stimulus. In null trials, the fixation cross remained on the screen for 2 s, and subjects made no response. In each trial, the two stimuli could either be identical (no-change trials), different views of the same location (viewpoint-change trials) or different locations from the same campus (place-change trials). These three trial types were crossed with environmental familiarity (Penn vs. Temple) in a 3 x 2 design.
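The timing described above can be checked with a short calculation. The following is a minimal sketch rather than the presentation script actually used: the event labels are our own shorthand, and the only inputs are the durations stated in the text (including the 2-s TR from the acquisition section).

```python
# Event durations (ms) for one experiment 1 stimulus trial, as stated in the text.
# The labels are illustrative shorthand, not names used by the authors.
TRIAL_EVENTS_MS = [
    ("fixation cross", 500),
    ("gray screen with black outline", 500),
    ("interval before first scene", 500),
    ("first scene", 500),
    ("interstimulus interval", 500),
    ("second scene", 500),
    ("poststimulus fixation and response", 3000),
]

TR_S = 2.0  # repetition time in seconds (TR = 2,000 ms)


def trial_duration_s(events):
    """Sum the event durations of one trial and convert to seconds."""
    return sum(ms for _, ms in events) / 1000.0


def scan_duration_s(n_stim, stim_s, n_null, null_s, end_fixation_s):
    """Total scan length: stimulus trials + null trials + final fixation."""
    return n_stim * stim_s + n_null * null_s + end_fixation_s


trial_s = trial_duration_s(TRIAL_EVENTS_MS)
scan_s = scan_duration_s(n_stim=80, stim_s=trial_s, n_null=30, null_s=2.0,
                         end_fixation_s=16.0)
minutes, seconds = divmod(int(scan_s), 60)
n_volumes = int(scan_s / TR_S)

print(f"trial = {trial_s} s, scan = {minutes} min {seconds} s, volumes = {n_volumes}")
```

The event durations sum to the stated 6-s trial length, and 80 stimulus trials, 30 null trials, and the final fixation period sum to the stated 9 min 16 s.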
After completion of the experiment, participants completed a computer survey in which they had to rate all 96 images of the stimulus set in terms of real-world familiarity with the locations depicted. The pictures were rated on a scale of 1 to 4; 1 being "I know where this place is," 2 being "I recognize the place but am not sure where it is," 3 being "It looks somewhat familiar," and 4 being "I have never seen this place before today."
EXPERIMENT 2. The procedure for experiment 2 was similar to the procedure for experiment 1 with the following exceptions. A primary goal of this experiment was to measure the effect of within-scan-session experience on scene processing. In particular, we aimed to compare the effects of short-term familiarity gained from displaying scenes multiple times within a scan session with the effects of long-term familiarity gained from multiple real-world encounters with the locations depicted in the scenes. To maximize the effects of within-session experience, we increased the number of exposures to each image beyond the five exposures in the preceding experiment. Given constraints on total scan session length, this was done by reducing the size of the stimulus set presented to each subject. For each subject, two views each of 16 Penn and 16 Temple locations were chosen from the larger stimulus set to serve as stimuli in scans 1-6. The choice of these locations was counterbalanced across subjects. Images of half of the chosen locations (8 Penn; 8 Temple) were used to construct trials in scans 1, 3, and 5, whereas images of the other half of the chosen locations were used to construct trials in scans 2, 4, and 6. All told, subjects saw 64 different images (2 campuses x 16 places x 2 views), each of which was shown 15 times. Each image was presented for 700 ms.
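The exposure count follows from the design arithmetic, as the sketch below illustrates. It assumes that each of the six experimental scans retained the 80 two-image stimulus trials used in experiment 1; the text states only that the procedure was "similar," so that figure is an assumption rather than a reported value.

```python
# Stimulus set for one subject in experiment 2 (values from the text).
N_CAMPUSES = 2
N_PLACES_PER_CAMPUS = 16
N_VIEWS_PER_PLACE = 2

# Scan structure. TRIALS_PER_SCAN is an ASSUMPTION carried over from experiment 1.
N_SCANS = 6
TRIALS_PER_SCAN = 80
IMAGES_PER_TRIAL = 2

n_images = N_CAMPUSES * N_PLACES_PER_CAMPUS * N_VIEWS_PER_PLACE
total_presentations = N_SCANS * TRIALS_PER_SCAN * IMAGES_PER_TRIAL
exposures_per_image = total_presentations / n_images

print(f"{n_images} images, {total_presentations} presentations, "
      f"{exposures_per_image:.0f} exposures per image")
```

Under that assumption the calculation gives 64 images and 15 exposures per image, in agreement with the figures stated above.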
Scans 1-6 were followed by two additional scans (7 and 8), which were intended to measure the net effect of within-session familiarity on scene processing. Each of these scans was 6 min 26 s long and was divided into 64 4-s-long stimulus trials interspersed with 64 2-s null trials and a 12-s fixation period at the end of the scan. Stimulus trials consisted of a 500-ms fixation cross, followed by the presentation of a single scene for 500 ms and then a 3,000-ms poststimulus fixation interval. Subjects used a button box to report whether or not the scene depicted a famous world landmark. Subjects were not informed of the identities of the famous landmarks beforehand, but all were easily identifiable (e.g., the Taj Mahal; Big Ben), and none were from the local Philadelphia area. Famous landmarks were presented in 16 of the 64 stimulus trials of each run, scenes from the Penn campus in 24 trials, and scenes from the Temple campus in 24 trials. Of the 24 scenes from each campus, 8 were images that had been presented in scans 1-6 (old view condition), 8 were previously unseen views of the campus locations presented in scans 1-6 (new view condition), and 8 were images of locations that had not been presented in scans 1-6 (new place condition). These three trial types were crossed with environmental familiarity (Penn versus Temple) in a 3 x 2 design.
In sum, the design of experiment 2 allowed us to examine how repeated exposure to two views of each location during scans 1-6 affected subsequent processing of these views and also a previously unseen third view in scans 7 and 8. It also allowed us to simultaneously measure the effects of real-world familiarity with these locations. Note that the use of different behavioral tasks in scans 1-6 and 7-8 ensured that any cross-scan repetition effects could be attributed to repetition of the view/place itself rather than to repetition of the response (Dobbins et al. 2004). However, a disadvantage of this design was that inconsistencies between the scans 1-6 and 7-8 repetition effects could arise for two reasons: first, because different repetition intervals were used in scans 1-6 and 7-8 (within-trial vs. between-trial); second, because different tasks were used in scans 1-6 and 7-8 (same/different place vs. famous/nonfamous).
Data analysis
Functional images were corrected for differences in slice timing by resampling slices in time to match the first slice of each volume, realigned with respect to the first image of the scan, spatially normalized to the Montreal Neurological Institute (MNI) template, resampled into 3-mm isotropic voxels, and spatially smoothed with an 8-mm FWHM Gaussian filter. Data were analyzed using the general linear model as implemented in VoxBo (www.voxbo.org) including an empirically derived 1/f noise model, filters that removed high and low temporal frequencies, regressors to account for global signal variations, and nuisance regressors to account for between-scan differences. Each stimulus condition was modeled as an impulse response function (experimental scans) or a boxcar function (functional localizer scans) convolved with an estimate of the hemodynamic response function (HRF). Subject-specific HRFs were used in experiment 1; however, as the choice of HRF appeared to make little difference to the results, we simplified the data analysis procedure by using a canonical HRF in experiment 2. Regressors reflecting the first and second derivatives of the predicted hemodynamic response to each stimulus condition were also included. Both region of interest (ROI) and whole-brain analyses were performed.
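To make the modeling step concrete, the sketch below builds an impulse-based regressor (as for the experimental scans) and a boxcar regressor (as for the localizer scans) by convolution with an HRF. It is not the VoxBo implementation: the double-gamma HRF and its parameters are a commonly used form adopted here purely as an assumption, the function names are hypothetical, and the derivative, noise-model, and nuisance regressors are omitted.

```python
import math

TR_S = 2.0  # repetition time in seconds, from the acquisition parameters


def gamma_pdf(t, shape, scale=1.0):
    """Gamma probability density; zero for non-positive t."""
    if t <= 0:
        return 0.0
    return (t ** (shape - 1) * math.exp(-t / scale)) / (math.gamma(shape) * scale ** shape)


def double_gamma_hrf(t):
    """Illustrative HRF (ASSUMED parameters): early peak minus a small late undershoot."""
    return gamma_pdf(t, 6.0) - gamma_pdf(t, 16.0) / 6.0


def sample_hrf(tr_s=TR_S, length_s=32.0):
    """Sample the HRF once per volume over length_s seconds."""
    return [double_gamma_hrf(i * tr_s) for i in range(int(length_s / tr_s))]


def convolve(signal, kernel):
    """Causal discrete convolution, truncated to the length of the signal."""
    out = [0.0] * len(signal)
    for i, s in enumerate(signal):
        if s == 0.0:
            continue
        for j, k in enumerate(kernel):
            if i + j < len(out):
                out[i + j] += s * k
    return out


def impulse_regressor(onsets_s, n_volumes, tr_s=TR_S):
    """One unit impulse per trial onset, convolved with the HRF."""
    impulses = [0.0] * n_volumes
    for onset in onsets_s:
        impulses[int(round(onset / tr_s))] += 1.0
    return convolve(impulses, sample_hrf(tr_s))


def boxcar_regressor(epochs_s, n_volumes, tr_s=TR_S):
    """Ones during each (start, duration) epoch, convolved with the HRF."""
    boxcar = [0.0] * n_volumes
    for start, duration in epochs_s:
        first = int(round(start / tr_s))
        last = int(round((start + duration) / tr_s))
        for v in range(first, min(last, n_volumes)):
            boxcar[v] = 1.0
    return convolve(boxcar, sample_hrf(tr_s))


single_event = impulse_regressor([0.0], n_volumes=20)
event_reg = impulse_regressor([0.0, 6.0, 14.0], n_volumes=20)  # hypothetical onsets
block_reg = boxcar_regressor([(0.0, 16.0)], n_volumes=20)      # one 16-s epoch
```

Each column produced this way would form one predictor in the design matrix; the parameter estimate fitted to it is the beta value for that condition.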
For ROI analyses, data from the functional localizer scans were used to identify subject-specific regions responding more strongly to scenes than to common objects in the PPA, RSC, and TOS. Thresholds were set for each region in a subject-by-subject manner so that the ROIs were consistent with those identified in previous studies; thresholds ranged from t > 2.5 to t > 5.0. Using these criteria, the PPA was identified in both cerebral hemispheres and the RSC in the left hemisphere in all subjects. Right RSC was identified in 12/14 subjects of experiment 1 and 14/14 subjects of experiment 2, left TOS in 14/14 subjects of experiment 1 and 13/14 subjects of experiment 2, and right TOS in 14/14 subjects of experiment 1 and 13/14 subjects of experiment 2. Mean sizes for each ROI were: left PPA 3.0 ± 1.6 cm³, right PPA 4.1 ± 2.0 cm³, left RSC 1.3 ± 1.0 cm³, right RSC 2.4 ± 1.6 cm³, left TOS 2.3 ± 1.3 cm³, right TOS 3.0 ± 1.8 cm³. The time course of MR response during the main experimental scans was extracted from each ROI (averaging over all voxels) and entered into the general linear model to calculate parameter estimates (beta values) for each condition that were used as the dependent variables in a second-level random-effects ANOVA. We also explored a more anatomically restrictive method for defining ROIs in which voxels were included if they responded more strongly to scenes than to objects at t > 2.5 and were within 3 mm of the voxel showing the strongest value for this contrast. This method of defining the ROIs gave substantially identical results, so these data are not reported.
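For a sense of scale, the mean ROI volumes can be converted into approximate voxel counts, and the averaging step can be written in a few lines. This is a sketch under the assumption that ROI volumes were measured in the resampled 3-mm isotropic space; the function names are hypothetical.

```python
VOXEL_EDGE_MM = 3.0                              # isotropic voxel edge after resampling
VOXEL_VOLUME_CM3 = VOXEL_EDGE_MM ** 3 / 1000.0   # 27 mm^3 = 0.027 cm^3

# Mean ROI volumes reported in the text (cm^3).
MEAN_ROI_VOLUME_CM3 = {
    "left PPA": 3.0,
    "right PPA": 4.1,
    "left RSC": 1.3,
    "right RSC": 2.4,
    "left TOS": 2.3,
    "right TOS": 3.0,
}


def approx_voxel_count(volume_cm3):
    """Approximate number of 3-mm voxels in a region of the given volume."""
    return round(volume_cm3 / VOXEL_VOLUME_CM3)


def roi_mean_timecourse(voxel_timecourses):
    """Average equal-length voxel time courses into a single ROI time course."""
    n_voxels = len(voxel_timecourses)
    return [sum(values) / n_voxels for values in zip(*voxel_timecourses)]


for roi, volume in MEAN_ROI_VOLUME_CM3.items():
    print(f"{roi}: about {approx_voxel_count(volume)} voxels")
```

On this assumption a 3.0-cm³ ROI corresponds to roughly 111 voxels, so each ROI time course is an average over on the order of 50 to 150 voxels.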
For whole-brain analyses, subject-specific t-maps were calculated for contrasts of interest and then smoothed to 12-mm FWHM to facilitate between-subject averaging before entry into a random effects analysis. Voxels were considered to be sensitive to environmental familiarity if the significance of this effect exceeded P < 0.001, uncorrected. Voxels were considered to exhibit either a viewpoint-specific or viewpoint-invariant repetition effect when the following two conditions were met: the significance of the tested effect exceeded P < 0.001, uncorrected and the response when all information was repeated (no change or old view condition) was significantly less (P < 0.001) than the response when no information was repeated (place change or new place condition). The second condition restricted the analysis to voxels showing the predicted ordered reduction of response when an increasing fraction of information is repeated (e.g., place change > viewpoint change > no change or new place > new view > old view), as the response of voxels not showing this pattern is not easily interpretable in terms of fMRI adaptation. Clusters containing seven or more above-threshold voxels are reported. Note that insofar as these tests were not corrected for multiple comparisons across voxels, the results should be considered exploratory.
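The two-part voxel criterion and the cluster-extent rule can be expressed as a simple filter. The sketch below is illustrative only: the function names are hypothetical, and face-adjacent (6-neighbor) connectivity is an assumption, since the text does not state how clusters were defined.

```python
from collections import deque

P_THRESHOLD = 0.001        # uncorrected voxelwise threshold from the text
MIN_CLUSTER_VOXELS = 7     # minimum reported cluster size from the text

# ASSUMPTION: voxels belong to the same cluster if they share a face.
FACE_NEIGHBORS = ((1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1))


def passes_adaptation_criteria(p_effect, p_full_vs_none, resp_full, resp_none):
    """Both tests significant, and full repetition gives the lower response."""
    return (p_effect < P_THRESHOLD
            and p_full_vs_none < P_THRESHOLD
            and resp_full < resp_none)


def find_clusters(voxels):
    """Group (x, y, z) voxel coordinates into face-connected clusters."""
    remaining = set(voxels)
    clusters = []
    while remaining:
        seed = remaining.pop()
        cluster = [seed]
        queue = deque([seed])
        while queue:
            x, y, z = queue.popleft()
            for dx, dy, dz in FACE_NEIGHBORS:
                neighbor = (x + dx, y + dy, z + dz)
                if neighbor in remaining:
                    remaining.remove(neighbor)
                    cluster.append(neighbor)
                    queue.append(neighbor)
        clusters.append(cluster)
    return clusters


def reportable_clusters(voxels):
    """Keep only clusters with at least MIN_CLUSTER_VOXELS voxels."""
    return [c for c in find_clusters(voxels) if len(c) >= MIN_CLUSTER_VOXELS]


# Hypothetical example: a row of seven suprathreshold voxels plus an isolated pair.
example_voxels = [(i, 0, 0) for i in range(7)] + [(10, 10, 10), (10, 10, 11)]
kept = reportable_clusters(example_voxels)
```

In this example only the seven-voxel row survives the extent rule; the isolated pair is discarded.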
| RESULTS |
|---|
BEHAVIORAL DATA. On each trial, the subjects' task was to report whether the two presented images depicted the same place or different places. The correct response was "same" for viewpoint-change and no-change trials and "different" for place-change trials. The two images in the viewpoint-change trials depict the same objects and surfaces (from different views), whereas the two images in the place-change trials depict different objects and surfaces. As such, it is possible to perform the task solely by using visual information locally available in the images, although knowledge about the environment from which the images are drawn can potentially facilitate performance. Accuracies and reaction times are plotted in Fig. 3.
Reaction times for correct trials were affected by change type [F(2,26) = 40.0, P < 0.001] but not familiarity [F(1,13) = 3.2, P = 0.10, n.s.]. Specifically, responses were faster in no-change trials (M = 888 ms) than in viewpoint-change (M = 1,039 ms) and place-change trials (M = 1,040 ms), consistent with results from previous studies (Epstein et al. 2003, 2005). Post hoc t-tests confirmed that responses were faster to no-change than to viewpoint-change [t(13) = 6.2, P < 0.001] and place-change [t(13) = 7.8, P < 0.001] trials, but response times to place- and viewpoint-change trials did not differ (t < 1, n.s.). No interaction of change type and familiarity was observed (F < 1, n.s.).
In the computer survey after the experiment, subjects reported that they were highly familiar with the locations from their campus and highly unfamiliar with the locations from the other campus. The subjectwise average rating for familiar locations ranged from 1 to 2.02 with a mean of 1.18 ± 0.28. (These numbers do not include 1 subject, who was dropped from the analysis due to a failure to follow instructions.) For the unfamiliar locations, the subjectwise average responses ranged from 1.8 to 4 with a mean of 3.60 ± 0.35. The rating difference for familiar and unfamiliar locations was significant [F(1,11) = 483.9, P < 0.0001]. There was no interaction between rating of familiarity or unfamiliarity and type of student [F(1,11) = 0.79, P = 0.39, n.s.], reflecting the fact that Penn students did not report greater relative familiarity with Penn versus Temple images than Temple students did with Temple versus Penn images.
FUNCTIONAL ROIS. Data from the functional localizer scans were used to define scene-responsive regions of interest in PPA, RSC, and TOS. ANOVA revealed significant main effects of familiarity [left PPA F(1,13) = 7.2, P < 0.05; right PPA F(1,13) = 10.7, P < 0.01; left RSC F(1,13) = 34.0, P < 0.0001; right RSC F(1,11) = 30.3, P < 0.0002; left TOS F(1,13) = 3.6, P = 0.08; right TOS F(1,13) = 4.9, P < 0.05] and within-trial repetition [left PPA F(2,26) = 30.9; right PPA F(2,26) = 28.2; left RSC F(2,26) = 24.8; right RSC F(2,22) = 38.2; left TOS F(2,26) = 21.2; right TOS F(2,26) = 21.0; all P's < 0.00001] in all three regions during the main experimental scans. The familiarity effects were most dramatic in RSC, where response to familiar locations was >50% higher than response to unfamiliar locations (Fig. 4).
We performed additional ANOVAs to analyze these two effects separately. The viewpoint-specific repetition effect (viewpoint-change > no-change) was highly significant in both hemispheres for all three regions [all Fs > 24, all Ps < 0.001]. In contrast, there was no evidence for a viewpoint-invariant repetition effect in any region, except for a marginal effect in right RSC [F(1,11) = 3.8, P = 0.08]. These patterns are also apparent from visual inspection of the data (Fig. 4). Interestingly, the viewpoint-specific effect did not vary with environmental familiarity in any region, but there was a significant familiarity x repetition interaction for the viewpoint-invariant effect in right RSC [F(1,11) = 5.6, P < 0.05]. Specifically, scene processing in the right RSC was more viewpoint invariant for scenes drawn from familiar environments than for scenes drawn from unfamiliar environments.
In sum, these results indicate that all three scene-processing regions respond more strongly to scenes drawn from familiar environments than to unfamiliar scenes with the strongest effects in RSC. We found little evidence that familiar and unfamiliar scenes are processed with different degrees of viewpoint specificity except for a weak effect of greater viewpoint invariance for familiar scenes in the right RSC.
WHOLE-BRAIN ANALYSES.
Exploratory whole-brain analyses were performed to determine whether the familiarity and repetition effects observed in the PPA, RSC, and TOS were specific to these regions, or also found in other regions of the brain. The results (Table 1) indicated that the effects were largely restricted to the target regions. Notably, the only three areas responding more strongly to familiar versus unfamiliar locations were posterior parahippocampal cortex (overlapping the PPA), RSC, and a parietal-occipital region adjoining but not completely overlapping the TOS. Similar results were observed for the whole-brain analysis of the viewpoint-specific adaptation effect: outside of the target ROIs, the only region showing a significant repetition effect was the left inferior frontal gyrus, a region that has been demonstrated to exhibit repetition reduction effects in a number of studies using a variety of different stimulus materials (Buckner et al. 1998; Demb et al. 1995). No region exhibited a viewpoint-invariant adaptation effect that exceeded the significance threshold.
This experiment was motivated by an apparent discrepancy between the results of experiment 1 and the results of a previous study (Epstein et al. 2005). Experiment 1 found little evidence for viewpoint-invariant processing in any cortical region, except for a marginal viewpoint-invariant adaptation effect for familiar scenes in right RSC. In contrast, the earlier study found evidence for viewpoint-invariant processing in the PPA after subjects viewed images of unfamiliar locations multiple times over the course of a scan session. These results led us to predict that real-world experience with familiar locations might lead to viewpoint-invariant processing in the PPA (as well as other cortical regions). Although the failure to find this pattern in experiment 1 might simply reflect the fact that the viewpoint changes examined were fairly large (60-70° vs. 35° in the previous experiment), an alternative possibility is that the viewpoint-invariant adaptation observed in the earlier study might reflect a short-term facilitation of scene processing that generalizes across views rather than long-term learning of viewpoint-invariant place representations. To test this idea, experiment 2 simultaneously measured the effects of both long-term (i.e., real world) and short-term (i.e., within scan session) familiarity.
The effects of within-scan-session familiarity were measured in two ways. First, the same small set of images was used to construct all of the trials in scans 1-6; this meant that each image was presented 15 times. This allowed us to determine whether scene processing became more viewpoint invariant as the images became more familiar by monitoring the evolution of viewpoint-specific and -invariant within-trial adaptation effects over the course of the scan session. Second, two scans were appended at the end of the session (scans 7 and 8) in which subjects made famous/nonfamous judgments on scenes that were either repeated from scans 1-6 (old view), repeated but from a different view (new view), or not previously seen in the experiment (new places). This allowed us to measure the net effect of within-session familiarity by comparing response to "old" views of a place with response to new views of the same place and also to new places. Note that these manipulations are similar to those used in Epstein et al. (2005).
BEHAVIORAL DATA. Accuracies and reaction times for the same/different place task (scan runs 1-6) and the famous/nonfamous landmark task (scan runs 7 and 8) are reported in Fig. 3. The pattern of errors in the same/different place task was similar to that observed in experiment 1. ANOVA revealed significant main effects of familiarity [F(1,13) = 9.5, P < 0.01] and change type [F(2,26) = 15.0, P < 0.0001] and a significant familiarity x change type interaction [F(2,26) = 3.7, P < 0.05]. Specifically, subjects were more accurate when performing the task on images of the familiar college campus than on images of the unfamiliar college campus, and more accurate on no-change and place-change trials than on viewpoint-change trials. As in experiment 1, familiarity with the locations depicted in the stimuli was associated with improved performance on viewpoint-change trials [t(13) = 2.7, P < 0.02] but not on place-change or no-change trials [both t's < 1.2, n.s.].
Reaction times for correct trials on the same/different place task were significantly affected by campus familiarity [F(1,13) = 8.9, P < 0.02] and change type [F(2,26) = 46.2, P < 0.00001], and there was also a significant familiarity x change type interaction [F(2,26) = 4.2, P < 0.05]. Specifically, response times were faster for images of familiar locations (M = 757 ms) than for images of unfamiliar locations (M = 804 ms), and faster for no-change trials (M = 659 ms) than for viewpoint-change trials (M = 810 ms), which were in turn faster than responses on place-change trials (M = 869 ms). Post hoc t-tests confirmed that both viewpoint-specific and -invariant priming effects were observed [viewpoint-change vs. no-change t(13) = 6.2, P < 0.0001; place-change vs. viewpoint-change t(13) = 3.6, P < 0.01], which contrasts with the results of experiment 1 in which only viewpoint-specific priming effects were found. Interestingly, the viewpoint-invariant priming effect was larger with images of the familiar campus than with images of the unfamiliar campus [F(1,13) = 5.4, P < 0.05], suggesting that campus familiarity speeded the matching of locations across views (in addition to improving accuracy in the viewpoint-change condition). In contrast, the magnitude of the viewpoint-specific priming effect did not vary as a function of campus familiarity [P > 0.15, n.s.].
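The two priming effects are simple differences between condition means, which the sketch below computes from the mean reaction times reported for the same/different place task in experiments 1 and 2. The function name and dictionary keys are our own labels, not terms from the analysis software.

```python
def priming_effects(no_change_ms, viewpoint_change_ms, place_change_ms):
    """Return the two within-trial priming effects (ms) from condition means.

    viewpoint_specific:  viewpoint-change minus no-change
    viewpoint_invariant: place-change minus viewpoint-change
    """
    return {
        "viewpoint_specific": viewpoint_change_ms - no_change_ms,
        "viewpoint_invariant": place_change_ms - viewpoint_change_ms,
    }


# Mean reaction times (ms) as reported in the text.
experiment_1 = priming_effects(no_change_ms=888, viewpoint_change_ms=1039,
                               place_change_ms=1040)
experiment_2 = priming_effects(no_change_ms=659, viewpoint_change_ms=810,
                               place_change_ms=869)

print("Experiment 1:", experiment_1)
print("Experiment 2:", experiment_2)
```

The viewpoint-specific effect is 151 ms in both experiments, whereas the viewpoint-invariant effect rises from 1 ms in experiment 1 to 59 ms in experiment 2, matching the pattern of significance described above.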
An additional analysis in which run was added as a factor found little of interest. Accuracy effects did not vary significantly by run, with the exception of the familiarity advantage [F(2,26) = 3.7, P < 0.05], which for unclear reasons was largest on runs 5 and 6 and smallest on runs 3 and 4. Nor did reaction time effects vary significantly by run, with the exception of the main effect of familiarity, which was larger at the beginning of the experiment (runs 1 and 2) than on subsequent runs [F(2,26) = 4.1, P < 0.05].
Accuracies and reaction times for the famous/nonfamous task used in runs 7 and 8 are also plotted in Fig. 3. We analyzed the trials in which images of Penn and Temple were shown, ignoring the responses to the famous lures that were not of theoretical interest. Accuracy was quite high (M = 96.4%) and was significantly modulated by between-trial repetition [F(2,26) = 4.1, P < 0.05] but not campus familiarity [P > 0.1, n.s.]. In particular, accuracy on old view trials (M = 98.2%) was higher than accuracy on new view trials (M = 96.2%), which was in turn higher than accuracy on new place trials (M = 94.8%). Reaction times were modulated in a similar way by between-trial repetition [F(2,26) = 8.4, P < 0.002]: specifically, responses on old view trials (M = 734 ms) were slightly faster than responses on new view trials (M = 749 ms), which were in turn faster than responses on new place trials (M = 800 ms). Note that this means that the viewpoint-invariant between-trial priming effect (new place vs. new view) was relatively large (51 ms), whereas the viewpoint-specific between-trial priming effect (new view vs. old view) was relatively small (15 ms); indeed, only the viewpoint-invariant priming effect was significant [t(13) = 3.1, P < 0.01]. Subjects responded marginally faster to images of the familiar campus than to images of the unfamiliar campus [F(1,13) = 3.6, P = 0.08], but there was no significant interaction of campus familiarity and between-trial repetition [F < 1, n.s.].
In sum, behavioral performance was facilitated by both real-world familiarity with the locations and within-scan-session familiarity with the images. The strongest effects of real-world familiarity were observed in scans 1-6, where performance on viewpoint-change trials was facilitated by real-world experience with the depicted location. The strongest effects of within-scan-session familiarity were observed in scans 7 and 8, where performance on old view and new view trials was facilitated by previous exposure to images of the locations during scans 1-6.
FUNCTIONAL ROIS. The results from scans 1-6 broadly replicated the pattern observed in experiment 1, although some of the effects of campus familiarity were not significant in all ROIs (Fig. 6, top). In particular, greater response to images of the familiar campus was observed in the left PPA [F(1,13) = 4.9, P < 0.05], left RSC [F(1,13) = 12.6, P < 0.01], and right RSC [F(1,13) = 8.0, P < 0.02] but not in right PPA, right TOS, or left TOS [P's > 0.2, n.s.]. As before, within-trial repetition effects were significant in all regions [all Fs > 18, all Ps < 0.0001]. Separate analyses of viewpoint-specific and -invariant repetition effects found highly significant viewpoint-specific adaptation effects [all Fs > 36, all Ps < 0.0001] but no significant viewpoint-invariant effects except for a marginal effect in left PPA [F(1,13) = 3.6, P = 0.08].
Greater response to images of the familiar campus than to images of the unfamiliar campus was observed in the left and right PPA and the left and right RSC in scans 7 and 8 (F's > 4.8, P's < 0.05) but not in the TOS (both hemispheres, P > 0.2, n.s.). Along with the results from scans 1–6 and from experiment 1, these data provide a third replication of the finding that the PPA and RSC are sensitive to environmental familiarity. There was no interaction between environmental familiarity and either viewpoint-specific or -invariant between-trial repetition in either region (PPA: P > 0.05, n.s.; RSC: F < 1, n.s.).
WHOLE-BRAIN ANALYSES. Results of exploratory whole-brain analyses were largely consistent with those observed in experiment 1, although, as noted in the preceding text, the effects of environmental familiarity were not significant in all scene-responsive regions (see Table 2). Specifically, greater response to familiar locations than to unfamiliar locations was observed in RSC and a parietal-occipital region adjoining TOS in scans 1–6 and in RSC and right parahippocampal cortex in scans 7 and 8. Viewpoint-specific adaptation effects were observed in RSC, PPA, and TOS (as well as several additional frontal-parietal regions) during scans 1–6, whereas viewpoint-invariant adaptation effects were observed in RSC and PPA during scans 7 and 8. Figure 8 shows the striking degree of overlap between the PPA and RSC voxels showing viewpoint-specific adaptation in scans 1–6 and those showing viewpoint-invariant adaptation in scans 7 and 8.
| DISCUSSION |
|---|
|
|
|---|
Main effects of environmental familiarity
Greater response to images of familiar locations than to images of unfamiliar locations was observed in the PPA, RSC, and TOS in experiment 1 and in RSC and left PPA in experiment 2. The sensitivity to environmental familiarity was most striking in RSC, which responded 50% more strongly to photographs of familiar environments than to photographs of unfamiliar environments. In contrast, the familiarity effects in PPA and TOS were weaker and not always reliable. Exploratory whole-brain analyses confirmed that familiarity effects were largely restricted to the PPA, RSC, and TOS. We draw two conclusions from these results.
First, the sensitivity to environmental familiarity observed in the PPA, RSC, and TOS supports the claim that these regions mediate processes important for spatial orientation. We initially made this claim based on the fact that these regions respond more strongly to visual stimuli that have the potential to convey information about one's spatial whereabouts (i.e., images of scenes and landmarks) than to visual stimuli that are less likely to convey such information (i.e., images of objects and faces). However, the possibility remained open that this preferential response to scenes might simply reflect low-level physical differences between scenes and objects, such as the fact that scene images often extend further into the periphery of the visual field (Levy et al. 2001). The results of the current experiments indicate that the PPA, RSC, and TOS show the following three effects: they respond more strongly to scenes than to nonscene objects during the functional localizer runs, they respond more to scenes drawn from familiar environments than to scenes drawn from unfamiliar environments, and they show response reduction when scenes are repeated. In other words, these regions respond more strongly when navigationally relevant visual information is present versus absent, more strongly when the visual stimulus conveys information about one's location within a larger, familiar space than when it simply conveys information about one's location within the immediate environment, and more strongly when navigationally relevant information is novel than when it is repeated. Furthermore, these are the only regions of the brain that show all three of these effects. We conclude that the previous association of the PPA, RSC, and TOS with spatial orientation processes is not spurious as multiple different tests designed to isolate navigation-related processes converge on the same network of brain regions (see also Aguirre and D'Esposito 1999; Burgess et al. 2001; Janzen and van Turennout 2004; Maguire et al. 1998).
Second, our results suggest that there is a certain amount of functional differentiation within the PPA-RSC-TOS cortical network. In particular, the large size of the familiarity effect in RSC suggests that this region may be especially involved in retrieving information about the spatial environment that extends beyond the immediate horizon as this kind of information can only be retrieved for the familiar scenes (Epstein and Higgins 2006; Park et al. 2006). In contrast, the familiarity effects in PPA and TOS were smaller and less reliable, suggesting that these regions might be primarily involved in perception of the local scene. The general pattern observed in PPA/TOS is approximately equal response to both familiar and unfamiliar locations (Epstein et al. 1999) but with a slight boost for familiar locations under some circumstances. The factors driving the presence or absence of this familiarity effect in PPA/TOS are currently unclear, although one possibility is that stronger response to familiar locations is found when familiarity with the depicted location facilitates the interpretation of local spatial geometry. In any case, the current results are consistent with previous neuroimaging and neuropsychological studies that strongly implicate RSC in retrieval of spatial information that extends beyond the currently visible scene (Aguirre and D'Esposito 1999; Ino et al. 2002; Katayama et al. 1999; Park et al. 2006; Takahashi et al. 1997; Wolbers and Buchel 2005) and PPA/TOS in perception of the current scene (Epstein 2005; Epstein and Higgins 2006; Mendez and Cherrier 2003). Thus PPA/TOS and RSC appear to play distinct but complementary roles in spatial navigation.
Results from two recent studies are particularly relevant for interpreting the current results. First, Sugiura and colleagues (2005) reported greater fMRI response in posterior cingulate/RSC when subjects viewed places and objects that were personally familiar to them than when they viewed places and objects that were unfamiliar, suggesting that RSC may be a subset of a larger complex that is generally involved in linking the current stimulus to the broader spatial or episodic context from which it was drawn (cf. Bar 2004). This linking process may operate automatically, even when subjects do not explicitly retrieve information about the context or familiarity of the visible scene, as in the same/different place task of the present experiment. Importantly, Sugiura and colleagues found a subset of the posterior cingulate/RSC region that only showed a familiarity effect for places, which may correspond to the RSC as defined by the functional localizer in the current experiment. Second, Cabeza and colleagues (2004) reported greater activity in the medial prefrontal cortex, hippocampus, parahippocampal cortex, and the cuneus/RSC when subjects viewed photographs of a familiar campus that were taken by themselves than when they viewed photographs of a familiar campus taken by other subjects. These earlier results indicate that the PPA and RSC play a role in episodic retrieval, perhaps because spatial codes mediated by these regions are a critical component of any remembered episode.
Effects of familiarity on viewpoint invariance
The second hypothesis we tested was that familiarity would cause scenes to be processed in a more viewpoint-invariant manner. This prediction was based in part on the intuitive notion that view-invariant representations are particularly useful for recognition (Biederman 1987; Marr 1982; Tarr et al. 1998) but can only be acquired after experience with multiple specific views (Booth and Rolls 1998; Eger et al. 2005). We used two fMRI adaptation paradigms to test the view specificity of scene processing: within-trial repetition (experiments 1 and 2, runs 1–6) and between-trial repetition (experiment 2, runs 7 and 8). In both cases, we interpreted reduced response when a location was repeated from a different viewpoint as evidence for viewpoint-invariant processing and reduced response when a location was repeated from the same viewpoint as evidence for viewpoint-specific processing. We examined how these effects were modulated by familiarity with the environment (experiments 1 and 2) and by familiarity with specific scene images (experiment 2).
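The logic of the two adaptation measures can be written compactly. The notation below is an assumption introduced here for illustration and is not taken from the analysis itself: $R$ denotes the mean fMRI response of a region in a given repetition condition, using the between-trial condition labels (old view, new view, new place), and the baselines mirror those stated for the behavioral priming effects (new view vs. old view; new place vs. new view).

```latex
\begin{align*}
A_{\text{specific}}  &= R_{\text{new view}} - R_{\text{old view}}, \\
A_{\text{invariant}} &= R_{\text{new place}} - R_{\text{new view}}.
\end{align*}
```

Under this reading, $A_{\text{specific}} > 0$ indicates additional response reduction when the same view is repeated, and $A_{\text{invariant}} > 0$ indicates response reduction that survives a change of viewpoint. For within-trial repetition, the analogous conditions would presumably be trials with no change, a viewpoint change, and a place change.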
Contrary to our expectations, familiarity with the depicted environment did not have strong effects on the viewpoint invariance of scene processing within the target cortical regions. Consistent with most of our previous results, within-trial repetition effects were entirely viewpoint specific in the PPA and TOS (Epstein et al. 2003, 2005). These effects were not modulated by environmental familiarity. In RSC, there was a marginally significant viewpoint-invariant repetition effect in the right hemisphere in experiment 1 that was significantly larger for images of the familiar campus than for images of the unfamiliar campus. Although interesting, this interaction was not replicated in experiment 2. In general, our results suggest that images of familiar and unfamiliar environments are processed with the same degree of viewpoint specificity, at least when relatively large viewpoint changes (>60°) are considered.
These results were somewhat surprising to us because an earlier study suggested that familiarity with specific scenes could lead to more viewpoint-invariant processing (Epstein et al. 2005). However, this earlier study examined familiarity with scene images acquired within a scan session rather than familiarity with locations acquired from real-world experience. Experiment 2 was designed to simultaneously measure the effects of real-world and within-scan-session experiences on the viewpoint specificity of scene processing. Both viewpoint-specific and -invariant repetition reductions relating to within-scan-session familiarity were observed in the PPA and RSC in scans 7 and 8 of this experiment, replicating and extending our previous results. These results suggest that within-scan-session familiarity can lead to a temporary facilitation of processing (and corresponding reduction of fMRI response) that generalizes to some extent across views (see also Ewbank et al. 2005). However, these within-scan-session facilitation effects (both behavioral and neural) do not seem to be the precursor to long-term changes in the quality of scene processing, as the effects were equally viewpoint invariant for images of familiar and unfamiliar locations rather than being more viewpoint invariant for familiar locations as we originally predicted.
One possible interpretation of these results is that scene representations in the PPA, RSC, and TOS are as important for orienting and localizing the observer within the scene as they are for place recognition (Byrne et al. 2007). Although viewpoint invariance is desirable for place recognition, viewpoint specificity is necessary if orientation and within-scene position are to be computed. If this is the case, then we might expect real-world interaction with the physical environment to cause an increase in the richness of both the viewpoint-specific and -invariant aspects of scene representations (hence increasing the effectiveness of both orientation/localization and recognition processes) but no proportional increase in viewpoint invariance. In contrast, within-scan-session experience might lead to priming for image features (or representations of scene geometry) that are at least partially invariant across views.
Two kinds of fMRI repetition suppression?
An assumption often made in fMRI adaptation/priming studies is that behavioral priming effects, within-trial fMRI adaptation effects, and between-trial fMRI adaptation effects all index identical representations and thus should give similar results (Buckner et al. 1998; Schacter and Buckner 1998). Several studies have identified cases where such correspondences do indeed exist (Henson et al. 2000; Wig et al. 2005). However, to understand the origins of these three effects, it is equally important to identify situations where they do not correspond (Sayres and Grill-Spector 2006). We observed several disjunctions between these effects in the current experiment.
First, there was a disjunction between behavioral priming effects and fMRI adaptation effects. The behavioral results from experiment 1 and scans 1–6 of experiment 2 suggest that subjects were able to use viewpoint-invariant representations to facilitate the matching of different views of the same familiar location. This facilitation was evidenced by greater accuracy on viewpoint-change trials for familiar than for unfamiliar locations in both experiments and a larger viewpoint-invariant priming effect (i.e., faster RT for viewpoint changes than for place changes) in experiment 2. Yet little evidence was observed for larger viewpoint-invariant fMRI adaptation effects for familiar locations than for unfamiliar locations during scans 1–6 in either experiment, except for a small effect in right RSC in experiment 1 that was not replicated in experiment 2. Nor did the whole-brain analysis reveal viewpoint-invariant adaptation effects outside of the targeted ROIs (although it is possible that extra-ROI effects simply failed to exceed the more stringent significance threshold used for whole-brain analyses).
Second, and perhaps more importantly, there was a disjunction between the within- and between-trial fMRI adaptation effects in experiment 2. The within-trial adaptation effects observed in scans 1–6 were entirely viewpoint specific, consistent with previous reports (Epstein et al. 2003). In contrast, the between-trial adaptation effects observed in scans 7 and 8 were largely viewpoint invariant. In a previous study, we observed evidence that viewpoint-specific within-trial adaptation effects can become more viewpoint invariant over the course of a scan session (Epstein et al. 2005); however, no similar evolution of within-trial viewpoint invariance was observed here. The behavioral priming effects observed in scans 7 and 8 were consistent with the between-trial fMRI adaptation effects insofar as both were largely viewpoint invariant.
What can account for these apparently discrepant results? Intuitively, one might suppose that the within-trial repetition effects index representations that are more "perceptual," whereas the between-trial repetition effects index representations that are more "mnemonic." It is a general feature of declarative memory that stored representations of complex stimuli such as narratives, episodes, or scenes tend to abstract away perceptual details (e.g., Brewer and Treyens 1981). In the current study, this might correspond to loss of viewpoint specificity in the mnemonic representations. However, we are still left with the challenge of understanding how these different representations are encoded at the neural level. In particular, the whole-brain analyses found no evidence that the between-trial repetition effects were driven by top-down modulation from areas associated with memory (such as the hippocampus or prefrontal cortex), nor did they find evidence that the within-trial repetition effects were driven by bottom-up modulation from areas associated with perception (such as early visual cortex). Rather, both repetition effects appear to be coterminous within the same parahippocampal and retrosplenial regions.
One possibility is that within- and between-trial repetition effects may be modulated by different neural mechanisms that operate within the same cortical territory (Grill-Spector 2006). In particular, within-trial fMRI adaptation effects might reflect modulation of the inputs to a region, whereas between-trial adaptation effects might reflect modulation of neural processing within a region. In this account, the viewpoint specificity observed in the within-trial adaptation effects indicates that information about which view corresponds to which place is not present in the inputs to the PPA, RSC, and TOS. Processing within these regions leads to extraction of (partially) viewpoint-invariant place representations from viewpoint-specific inputs, and this intraregional processing is reflected in the viewpoint invariance observed in the between-trial adaptation effects. The within-trial adaptation effect might be caused by synaptic depression (Abbott et al. 1997), which is believed to act on a relatively short time scale of <2 s (Muller et al. 1999), whereas the between-trial adaptation effect might be caused by within-region changes of connectivity leading to more efficient processing (and faster reaction times). Although not observed in the current experiment, the fMRI correlates of a third adaptation mechanism, neuronal adaptation caused by tonic hyperpolarization (Carandini and Ferster 1997), might be observable when the adapting stimuli are shown for much longer presentation times of several seconds (Fang and He 2005; Fang et al. 2005).
Although admittedly speculative, this account is consistent with several results from neurophysiology. Li and colleagues (1993) recorded from neurons in inferior temporal (IT) cortex while monkeys viewed objects that were repeated within trials and between different trials. They observed separate, independent effects for within- and between-trial repetition that they referred to as matching (within-trial) and familiarity (between-trial) effects. They hypothesized that between-trial effects might be caused by a "sharpening" of object representations as neurons that encode nonessential features drop out of the representation, leading to reduced response (and faster reaction times) for repeated items (Desimone 1996; Wiggs and Martin 1998). Complementary to this, results from a recent study (Sawamura et al. 2006) suggest that within-trial repetition effects may reflect adaptation at the synaptic inputs rather than changes in neuronal selectivity. Recordings were made from IT neurons while either identical items or distinct items that elicit nearly identical responses (when presented in isolation) were repeated within a trial. Despite the fact that the neuronal response to each stimulus was equivalent when they were presented singly, more adaptation was observed within a two-item sequence when the second item was the same as the first than when it was different (but "equivalent"). This finding indicates that the adaptation effects caused by immediate repetition show greater selectivity than the neuron itself. One possible explanation of this result is that immediate repetition effects reflect adaptation at the synaptic inputs to the neuron (which could differ for the two different stimuli) rather than adaptation at the level of the neuron itself (which treats the two stimuli as equivalent).
Alternatively, the disjunction between the within- and between-trial adaptation effects might relate to the use of different behavioral tasks in runs 1–6 and runs 7 and 8. These tasks have different goals that may have led to the adoption of different performance strategies. In particular, during performance of the same/different place task in runs 1–6, subjects might have attended to image details such as the relationship between the observer and the scene to identify viewpoint changes and distinguish them from place changes. In contrast, during performance of the famous/nonfamous task in runs 7 and 8, subjects might have attended to more general place features that are likely to be somewhat invariant across views. As such, the same/different place task may have tapped representations that were more viewpoint specific than those used for the famous/nonfamous task. Consequently, the neural adaptation effects in runs 1–6 may have been primarily driven by view repetition while the neural adaptation effects in runs 7 and 8 may have been primarily driven by place repetition, leading to more viewpoint-specific adaptation in runs 1–6 than in runs 7 and 8. Note that this account assumes that viewpoint-invariant place representations were primed in runs 1–6 but did not cause fMRI repetition suppression effects until they were tapped for the behavioral task in runs 7 and 8. In other words, this account postulates that fMRI repetition suppression effects are driven not by repetition per se but by the speeded response that results when a repeated representation is employed in the service of a behavioral task.
This account is consistent with James and Gauthier's "accumulator" model of fMRI repetition suppression, in which the reduced fMRI response observed after repetition is attributed to faster accumulation of information necessary to successfully perform a behavioral task (James and Gauthier 2006). Although not discussed by these authors, a prediction of their model is that fMRI repetition suppression effects should be at least somewhat task dependent because different forms of information are necessary to complete