Humans build representations of objects and their locations by integrating imperfect information from multiple perceptual modalities (e.g., visual, haptic). Because sensory information is specified in different frames of reference (i.e., eye- and body-centered), it must be remapped into a common coordinate frame before integration and storage in memory. Such transformations require an understanding of body articulation, which is estimated through noisy sensory data. Consequently, target information acquires additional coordinate transformation uncertainty (CTU) during remapping because of errors in joint angle sensing. As a result, CTU creates differences in the reliability of target information depending on the reference frame used for storage. This paper explores whether the brain represents and compensates for CTU when making grasping movements. To address this question, we varied eye position in the head, while participants reached to grasp a spatially fixed object, both when the object was in view and when it was occluded. Varying eye position changes CTU between eye and head, producing additional uncertainty in remapped information away from forward view. The results showed that people adjust their maximum grip aperture to compensate both for changes in visual information and for changes in CTU when the target is occluded. Moreover, the amount of compensation is predicted by a Bayesian model for location inference that uses eye-centered storage.
When humans reach out and grasp an object, information about the object's location arrives at different times through multiple sensory modalities, each in its own frame of reference. Maintaining an accurate representation of the object's location requires both integrating these sources of information and updating the stored (remembered) location after changes in body configuration (e.g., eye, head, or hand movements). Recent physiological data support the idea that visual, auditory, and touch information are remapped and combined in multiple coordinate frames. In particular, the parietal reach region uses eye-centered coordinates (Batista et al. 1999; Buneo et al. 2002), whereas premotor cortex uses body-centered coordinates (Graziano et al. 1994, 1997).
There is also evidence in sensorimotor control that the brain stores object locations for reaching in eye-centered coordinates (Batista et al. 1999; Henriques et al. 1998; Pouget et al. 2002). Eye-centered storage requires remapping locations after every eye movement (Duhamel et al. 1992; Goldberg and Bruce 1990; Henriques et al. 1998; Walker et al. 1995), whereas using this representation for reaching involves remapping to body-centered coordinates (Crawford et al. 2004). Although remapping eye-centered locations after saccades is quite accurate (Blouin et al. 2002; Hallett and Lightstone 1976; Herter and Guitton 1998; Israel and Berthoz 1989; McKenzie and Lisberger 1986; Ohtsuka 1994; Pelisson et al. 1989; Schlag et al. 1990; Sparks and Mays 1983; Zivotofsky et al. 1996), other types of eye movements may introduce substantial error (Baker et al. 2003). Moreover, the precision of eye-centered coordinates for reaching depends on accurate knowledge of the relative positions of the body segments between the eye and the hand. However, the brain's encoding of body articulation is imprecise, and the quality of the encoding is position dependent (e.g., Niemeier et al. 2003; van Opstal and van Gisbergen 1989). We term the uncertainty about the relationship between body segments that arises from imperfect sensory knowledge coordinate transformation uncertainty (CTU; Fig. 1). When there is CTU in a system, estimates are degraded by each transformation made between coordinate frames.
Recent work by Sober and Sabes (2005) suggests that people select coordinate representations that minimize the impact of errors caused by remapping. Using a virtual display, the authors introduced a systematic discrepancy between the visual and proprioceptive coordinates of target locations. They showed that visual information is weighted more heavily than proprioceptive information in reach plans to visual targets but is weighted less for proprioceptive targets, where it would introduce larger errors. They interpret these findings as evidence that the coordinate frame used to compute target location is flexible and is selected to minimize the impact of coordinate transformation errors.
Sober and Sabes (2005) showed that people use knowledge about CTU when integrating current sensory information. The goal of this paper is to study whether the brain represents and compensates for CTU when making grasping movements to remembered (i.e., stored) target locations. To experimentally test for CTU compensation, head-fixed participants repeatedly reached to an occluded cylindrical target while fixating targets that spanned an 80° range of eye positions. To manipulate CTU between the eyes and head, we took advantage of the fact that error in eye position encoding varies with both saccade magnitude and the eye's eccentricity away from forward view. In addition, we show that grasping movements to visual targets compensate for object location uncertainty by increasing maximum grip aperture (MGA). Therefore if the brain stores targets in eye-centered coordinates and compensates for the effects of CTU, MGA should also vary as a function of eye position, even when the target is occluded.
After signing an informed consent form, four subjects (2 males, 2 females) participated in the visual condition of this experiment, and five subjects (3 males, 2 females) participated in the target occluded condition. The subjects were all right-handed students from the University of Minnesota and were given monetary compensation for their participation in the study. The subjects ranged from 19 to 32 yr of age, and all had normal or corrected-to-normal vision. Our protocol was approved by the University of Minnesota IRB.
Trajectory data were acquired by attaching three infrared-emitting diodes (IREDs) to the fingernails of both the forefinger and the thumb, which were tracked with an Optotrak 3020 sampling at 100 Hz (Fig. 2E). Cylinder viewing distance was 52 cm, and forward viewing distance to the middle fixation point ("E") was 62 cm. Fixation letters (A–I) corresponded to degrees away from the cylinder (80–0°, in 10° increments). Subjects began each reach from a starting block located 35 cm from a 2.2-cm-diam, 11.5-cm-long cylinder. The starting block was located 95 cm above the ground plane and ∼1 cm below the cylinder plane. A trial was initiated once subjects moved >2 mm from the starting block; subjects then had 1,200 ms to successfully lift the cylinder. The timer stopped when a switch was tripped on the bottom of the target resting block, which required subjects to lift the cylinder ∼5 mm. The cylinder's position was maintained using a 2-cm-tall clear plastic tube that was just large enough to allow smooth cylinder movement.
Occluders were used to block vision of the hand (visual condition) or of both the hand and the target (target occluded condition). In addition, liquid crystal shutter glasses (Milgram 1987) were used to remove visual information at different moments during the reach, depending on the condition (see Procedures for details).
Head-fixed subjects made repeated reaches to the same spatially fixed target located 40° to the right of the subject (0° in cylinder coordinates; Fig. 2). The cylinder was not moved across the workspace, to ensure that any changes in reaching behavior resulted from changes in eye position/viewing eccentricity and not from kinematic demands. Moreover, the cylinder's location was selected to allow the maximal range of eye positions away from the target (0–80°) and hence the greatest range of uncertainty.
At the beginning of each trial, the fixation point was announced (e.g., "E"). After subjects fixated, they were allowed to make their reach to the target. They were instructed to reach as quickly (<1,200 ms) and accurately as possible. Once they successfully lifted and replaced the target, their hand had to be returned to within 0.66 mm of its original starting position before the next fixation point was announced. If subjects took too long or did not lift the cylinder correctly, they heard an error message and the trial was repeated. Fixation points were randomly ordered across trials. In the full range condition, each of eight possible fixations (A–H), corresponding to 80, 70, 60, 50, 40, 30, 20, and 10° of target eccentricity, was repeated seven times per block. The fixation point corresponding to 0° target eccentricity (I) was excluded to keep fixations similar for both the visual and target occluded participants (because the occluder blocks both the target and the 0° fixation letter). Full range subjects ran seven blocks per day over 3 days, for a total of 1,176 trials (7 repetitions × 8 fixations × 7 blocks × 3 days). In contrast, partial range subjects were cued to fixate one of three possible letters (H, E, or B), corresponding to 10, 40, or 70° away from the target. Each fixation was repeated 21 times per block for seven blocks, resulting in a total of 441 trials (21 repetitions × 3 fixations × 7 blocks × 1 day). Note that the partial and full range conditions both have the same number of repetitions (147) per fixation point. However, the full range condition allowed us to capture the full functional form of the MGA profile, whereas the partial range condition allowed us to gain statistical power over a shorter period of time. Therefore in the target occluded condition, two subjects were run in the full range condition and three subjects were run in the partial range condition. In the visual condition, all subjects were run in the partial range condition.
Subjects in the target occluded condition were never allowed to see their hand or the target, because both were blocked by an occluder (Fig. 2, C and D). Moreover, all remaining visual information was removed after reach onset by the shutter glasses. Subjects were instructed to maintain their eye position after the shutter lenses were closed. As a result, the only information available to subjects during a reach was stored haptic information from previous reaches to the target. If target location is stored in eye-centered coordinates, an estimate of eye position is also necessary to remap the stored target location to body-centered coordinates. This manipulation allowed us to vary the amount of CTU in the task (i.e., the eye position uncertainty; see Modeling) while keeping the reliability of the haptic information constant. Therefore if subjects are estimating their CTU, we expect maximum grip aperture to vary with eye position.
Subjects in the visual condition were allowed to see the target throughout the duration of their reach; however, at no point were they allowed to see their hand. An occluder allowed subjects to view the target without seeing their hand until it was within 1 cm of the target. In addition, the liquid crystal shutter glasses were triggered to block vision of both the hand and the target during that last centimeter, where the fingers would be visible beyond the occluder (Fig. 2, A and B). Subjects were instructed to maintain gaze fixation once the shutters were closed. This manipulation ensured that changes in viewing eccentricity influenced only target uncertainty, as opposed to uncertainty about both the hand and the target. The visual condition allowed us to verify that MGA encodes target uncertainty for the task.
Cubic interpolating splines were fit to the trajectories to provide an analytic description of each trajectory and to compensate for marker occlusions. Trajectories with total occlusion time >40 ms were discarded from the dataset; only ∼5% of trajectories were discarded, and the occlusions tended to occur at the beginning of the reach (i.e., launch off the starting block) and on the return path. Grip aperture was computed as the distance between the center points of the sensors on the fingernails. MGA is the maximum distance between the thumb and forefinger during a reach, and it typically occurs at 75–80% of the distance to the target object (Sivak and MacKenzie 1990). Its importance is that it serves as a measure of target uncertainty (Wing et al. 1986) and scales linearly with actual object size (Paulignan et al. 1997). MGAs were averaged across subjects and blocks to produce Fig. 5.
Change in MGA was computed by subtracting the global mean of each block (for each subject, across eccentricities) from the mean MGA at each eccentricity. This was done to reduce the effects of intersubject differences, remove drift in grip width across blocks caused by fatigue, and remove the effects of differences in IRED placement across sessions. Data were then pooled across subjects, and mean MGA was computed for each eccentricity.
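This normalization can be sketched as follows (a minimal numpy sketch; the array layout and function name are ours, not from the original analysis code):

```python
import numpy as np

def mga_change(mga, eccentricity):
    """Per-block change in MGA.

    mga: MGA values (mm) for one block of one subject.
    eccentricity: matching fixation eccentricities (deg).
    Subtracts the block's global mean (removing subject/block offsets
    and drift), then averages the demeaned values per eccentricity."""
    mga = np.asarray(mga, dtype=float)
    demeaned = mga - mga.mean()
    eccentricity = np.asarray(eccentricity)
    return {ecc: demeaned[eccentricity == ecc].mean()
            for ecc in np.unique(eccentricity)}
```

Pooling across subjects then amounts to averaging these per-block dictionaries for each eccentricity.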
We developed a probabilistic model of target location inference with two distinct goals in mind. First, we wanted to provide concrete predictions for the effect of target location uncertainty on maximum grip aperture, relying on previous results to provide values for the model's parameters. Second, we wanted to extend these predictions to include the possible effects of coordinate transformation uncertainty on MGA. We present separate predictions for eye-centered and head-centered storage of target information. Other possible storage schemes, such as body-centered storage or storage in multiple coordinate frames (Avillac et al. 2005), make the same predictions as head-centered storage. The equations and assumptions used for data modeling are presented here, whereas derivations are presented in the appendix.
In the model, reach plans depend on an inference of target location using information remapped to body-centered coordinates. Target information consists of recent visual and haptic information combined with information stored in memory. Memory is represented by a probability distribution on target coordinates, and operations on this distribution are used to represent the effects of sensory information and coordinate transformation uncertainty. We present two sets of expressions for target inference, one for eye-centered and one for head-centered coordinates, to compare the effects of combining and storing information in different coordinate frames in the presence of CTU.
The principal variables in the model are target location in eye- and head-centered coordinates and the visual, haptic, and eye position signals. For simplicity, target location is represented in spherical coordinates with origin at the center of the head or at the midpoint between the two eyes, for head- or eye-centered coordinates, respectively. Modeling is restricted to the azimuthal angular component of target position (the angle in the plane containing both eyes and the origin), because it is sufficient to account for changes in grasping and pointing behavior and to discuss optimality. Azimuthal target coordinates are represented by x in eye- and y in head-centered coordinates. For mathematical simplicity, the transform between x and y is approximated by y = x + r, where r is the azimuthal angle of eye position with respect to forward view. The approximation results from ignoring the offset between the origins of eye- and head-centered coordinates and affects only the model of the pointing data.
Visual, haptic, and eye position signals provide information specifying target location and the appropriate coordinate transformations. The azimuthal angle of the retinal projection, given eye position, provides a noisy visual signal v for target location in eye-centered coordinates. The noise in v is modeled as zero mean (for convenience, and because our predictions do not depend on this quantity), with a signal-dependent variance that models the effect of decreased spatial resolution on visual sensing of peripherally viewed targets. Following two-point discrimination studies (Burbeck 1987; Burbeck and Yap 1990; Levi and Klein 1996; Whitaker and Latham 1997), visual uncertainty σv increases linearly with the eccentricity of the visual information (in mm)

σv(v) = (v/30)°, expressed in millimeters at the viewing distance (1)

This value was derived from the threshold (v/30)° reported in Levi and Klein (1996) and converted to millimeters. Note that similar models of visual uncertainty are used in both Niemeier et al. (2003) and Saunders and Knill (2004).
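The noise model can be sketched numerically as follows (assuming the Levi and Klein threshold of (v/30)° and the 52-cm cylinder viewing distance from Methods; the helper name is ours):

```python
import math

VIEW_DIST_MM = 520.0  # cylinder viewing distance from Methods (52 cm)

def sigma_v_mm(ecc_deg, view_dist_mm=VIEW_DIST_MM):
    """Visual position uncertainty (mm) for a target viewed at
    ecc_deg degrees of retinal eccentricity.

    Uses the (ecc/30)-degree discrimination threshold (Levi and
    Klein 1996) and converts the angle to millimeters at the
    viewing distance."""
    sigma_deg = ecc_deg / 30.0
    return view_dist_mm * math.tan(math.radians(sigma_deg))

# Uncertainty grows approximately linearly with eccentricity:
# at 30 deg of eccentricity the SD is roughly 9 mm at 52 cm.
```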
Touching the object provides a haptic signal h for target location in body-centered coordinates, available at the end of each reach. Haptic information from previous trials was included in the model because recent studies have shown that haptic and visual information are optimally combined (Ernst and Banks 2002; van Beers et al. 1999, 2003) and that haptic experience affects visual judgments (Atkins et al. 2001). We consider h to provide information about y because the relationship between the shoulder and head was held constant. As a result, any effect of the CTU between head- and body-centered coordinates is similar for all data and can safely be ignored. The noise in h is modeled as zero mean with constant variance σ²h. Based on haptic noise estimates derived from data in Ernst and Banks (2002), we set σh = 15 mm. This is slightly larger than the 10- to 12-mm SD found in van Beers et al. (1999); however, we found that the precise value of this parameter did not affect our predictions.
Noisy eye position signals e summarize the available information about the eye's position in the head, including efference copy of motor commands (Lewis et al. 1998), proprioception (Steinbach 1986), and the retinal location of familiar visual landmarks (Niemeier et al. 2003). Uncertainty on e is modeled as zero mean with signal-dependent noise σe(e) (van Opstal and van Gisbergen 1989). Although the noise is assumed unbiased, we show that Bayesian inference of eye position r from signals e produces estimates biased toward forward view (Fig. 3). The bias results from the combination of eye position signals with a prior on eye position p(r) that encodes the assumption that the angle between eye position and head direction is maintained around zero (Stahl 2001). For our simulations, the prior was modeled as Gaussian with zero mean and constant uncertainty (σr = 10.0°), which biases eye position estimates by a gain factor

we = σ²r/(σ²r + σ²e(e)) (2)
The resulting posterior distribution is used to estimate eye position and takes the form N(r; μpe, σ²pe), where μpe = we·e and σ²pe = we·σ²e(e).
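A minimal numerical sketch of this inference (the standard product of two Gaussians; the function name is ours, and σe must be supplied directly because its signal-dependent form is specified in the appendix rather than here):

```python
def eye_posterior(e_deg, sigma_e_deg, sigma_r_deg=10.0):
    """Bayesian estimate of eye position r from a noisy signal e.

    Prior: N(0, sigma_r^2), centered on forward view (Stahl 2001).
    Likelihood: N(e, sigma_e^2).
    Returns (posterior mean, posterior SD)."""
    # Shrinkage gain, Eq. 2: pulls estimates toward forward view.
    w_e = sigma_r_deg**2 / (sigma_r_deg**2 + sigma_e_deg**2)
    mu_pe = w_e * e_deg
    sigma_pe = (w_e ** 0.5) * sigma_e_deg  # SD of N(r; mu_pe, w_e * sigma_e^2)
    return mu_pe, sigma_pe
```

For example, a 40° eye position sensed with 4° of noise is estimated at roughly 34.5°, i.e., biased toward forward view, as in Fig. 3B.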
The decision to put signal-dependent noise on eye position rather than on eye movements deserves comment. Although remapping data between eye- and head-centered coordinates requires specifying absolute eye position, not just eye displacement, it is not completely clear how to characterize this error. Because subjects saccade to the fixation points from forward view in the experimental data presented here, we cannot distinguish uncertainty due to saccade magnitude from uncertainty due to eye position. However, we believe both introduce uncertainty, for the following reasons. For eye displacement tasks, eye position uncertainty is well predicted by saccade scatter (Niemeier et al. 2003), which varies with saccade magnitude. Moreover, the force required to maintain eye eccentricity is roughly linear in eccentricity, and the signal-dependent noise in force generation is also linear (Jones et al. 2002). Therefore signal-dependent noise should also result from eye position per se, independent of the saccade that brought the eye to that location.
MODELING GRASPING DATA.
We assume reach trajectories in our grasping experiments are planned to grasp the object based on the best current estimates of object location and diameter while avoiding object-finger collision. By assuming MGA scales with location uncertainty to avoid object-finger collision, MGA is modeled as proportional to the object's diameter D (Paulignan et al. 1997) plus the uncertainty σy in the target's location in head-centered coordinates

MGA ∝ D + σy (3)
Because the hand is occluded throughout the reach, the observed changes in MGA cannot be attributed to an on-line feedback control strategy. We instead attribute changes in MGA to the value of σy during movement planning. Note the assumption of proportionality: the model does not require that people use a particular collision avoidance criterion. If the brain represents location uncertainty and uses it for reach planning, MGA should vary in proportion to σy. Below we test whether MGA follows the functional trends predicted by σy under different storage and sensory conditions.
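Under this proportionality assumption (Eq. 3), the mapping from location uncertainty to predicted MGA can be sketched as follows (the gain k is illustrative; the model commits only to proportionality, not to a particular value):

```python
def predicted_mga(diameter_mm, sigma_y_mm, k=1.0):
    """Predicted maximum grip aperture, Eq. 3: proportional to the
    object's diameter plus a safety margin that scales with the
    head-centered location uncertainty sigma_y.

    k is an illustrative proportionality gain, not a fitted value."""
    return diameter_mm + k * sigma_y_mm
```

For the 22-mm cylinder used in the experiment, any manipulation that raises σy (e.g., larger CTU at eccentric eye positions) widens the predicted aperture.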
Reach planning with collision avoidance requires estimates of both object location and its uncertainty. These estimates are assumed to result from a Bayesian computation that combines the information available at the time of movement planning. The available information consists of both visual information (in the visual condition) and a memory distribution that provides a means of accumulating sensory feedback across trials. Estimates of the object's location and uncertainty are formed and updated by the following sequence of events. At the end of the (t − 1)th trial, new haptic data from grasping the object are appropriately remapped (depending on whether storage is in eye- or head-centered coordinates) and combined with memory. At the beginning of the tth trial, a fixation saccade changes eye position, and new visual information (when available) is combined with memory. The updated memory distribution is transformed to body-centered coordinates, and a reach plan is generated. After reach execution, new haptic information is acquired, completing the cycle. Although the memory distribution adds modeling complexity, we believe that the ability to accumulate information across time serves an important function in normal visuomotor behaviors. Maintaining a memory of both the location and uncertainty of an out-of-view object allows a person to interact with the object without seeking current sensory information (e.g., a driver can keep their eyes on the road while reaching for a cup of coffee).
Figure 4 shows the differences in storage strategies when combining haptic information (from a previous trial) with newly acquired visual information. Note that this diagram omits remapping targets to hand-centered coordinates, which we assume introduces additional coordinate transformation uncertainty. Uncertainty caused by head-to-hand remapping is assumed constant because the target, hand, and head were fixed during reach planning. However, targets stored in eye-centered coordinates acquire additional eye position CTU when target position is remapped to make a reach.
In the appendix we derive expressions for σy for four different conditions: target visible with eye-centered storage (σy,vis,eye), target occluded with eye-centered storage (σy,occ,eye), target visible with head-centered storage (σy,vis,head), and target occluded with head-centered storage (σy,occ,head), and we show that, because of CTU, the reliability of Bayesian inference depends on the storage strategy. We present expressions for σy for each condition below. To quantitatively predict MGA, eye position information was converted from degrees to millimeters.
Bayesian inference with eye-centered storage maintains a distribution on target location x that captures both a target estimate μeye and its uncertainty σ²eye. The distribution is corrected by sensory information remapped to eye-centered coordinates (when available) and adjusted for the effects of each eye movement. When the target is visible, visual and haptic information are combined with memory in eye-centered coordinates and passed to head-centered coordinates for reach planning. The uncertainty in target location resulting from this computation is given by

σ²y,vis,eye = [1/σ²v + 1/σ²eye + 1/(σ²h + σ²pe,k)]⁻¹ + σ²pe (4)

where haptic information acquires additional uncertainty σ²pe,k when brought into eye-centered coordinates (CTU caused by errors in eye position sensing on the previous trial), and the whole expression acquires additional uncertainty σ²pe from the transformation from eye- to head-centered coordinates that is necessary for making a reach. Target occlusion removes any visual information, resulting in the following expression

σ²y,occ,eye = [1/σ²eye + 1/(σ²h + σ²pe,k)]⁻¹ + σ²pe (5)
Bayesian inference in head-centered coordinates is similar, except that a memory distribution is maintained on y with mean μhead and variance σ²head. Visual information is remapped to head-centered coordinates (acquiring additional uncertainty σ²pe) and combined with memory before reach planning, resulting in the following uncertainty expressions when the target is visible

σ²y,vis,head = [1/(σ²v + σ²pe) + 1/σ²head + 1/σ²h]⁻¹ (6)

and when the target is occluded

σ²y,occ,head = [1/σ²head + 1/σ²h]⁻¹ (7)
Although Eqs. 4–7 seem complicated, they simplify dramatically when the memory variance is much larger than that of the current information (σ²eye, σ²head ≫ σ²v, σ²h). By assuming large memory uncertainty, the predictions depend only on the models for visual, haptic, and eye position uncertainty. More specifically, Eq. 4 reduces to

σ²y,vis,eye = [1/σ²v + 1/(σ²h + σ²pe,k)]⁻¹ + σ²pe (8)

whereas Eq. 6 simplifies to

σ²y,vis,head = [1/(σ²v + σ²pe) + 1/σ²h]⁻¹ (9)

and Eqs. 5 and 7 reduce to σ²y,occ,eye = σ²h + σ²pe,k + σ²pe and σ²y,occ,head = σ²h, respectively. The simplified equations were used to make the predictions shown in Fig. 5. For large memory variance there is no learning across trials, and the integration of information becomes a more general form of cue combination, one that incorporates coordinate transformation uncertainty and its impact on the reliability of information under different storage strategies. Note that inaccurate location memory is not unreasonable: memory SDs >4° occur in tasks with similar delays involving humans (Elliott and Madalena 1987; Karn et al. 1997; Sheth and Shimojo 2001) and monkeys (Baker et al. 2003; White et al. 1993). In addition, we could not reject a limited memory model on the basis of cross-trial analyses; any influences of fixations from previous trials were too small to be reliably measured.
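The simplified predictions for all four conditions can be collected in one sketch (reliability-weighted combination of independent Gaussian cues; the function names and example values are ours, and all inputs are assumed to be in millimeters):

```python
def combine(*variances):
    """Reliability-weighted (Bayesian) combination of independent
    Gaussian cues: inverse variances add."""
    return 1.0 / sum(1.0 / v for v in variances)

def sigma_y(sigma_v, sigma_h, sigma_pe, sigma_pe_k,
            storage="eye", target_visible=True):
    """Head-centered target SD in the large-memory-variance limit.

    sigma_pe / sigma_pe_k: eye-position CTU for the current and
    previous trial, respectively."""
    v2, h2 = sigma_v ** 2, sigma_h ** 2
    pe2, pek2 = sigma_pe ** 2, sigma_pe_k ** 2
    if storage == "eye":
        # Haptic information enters eye coordinates carrying CTU from
        # the previous trial's eye position ...
        var = h2 + pek2
        if target_visible:
            var = combine(v2, var)  # fuse with current visual input
        # ... and the estimate re-acquires CTU when remapped out
        # to head-centered coordinates for the reach.
        var += pe2
    else:
        # Head-centered storage: only visual information crosses the
        # eye/head boundary; haptic information stays put.
        var = combine(v2 + pe2, h2) if target_visible else h2
    return var ** 0.5
```

Eye-centered storage thus predicts occluded-target uncertainty that grows with eye-position CTU, whereas head-centered storage predicts a flat profile, which is the contrast tested in Fig. 5A.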
MODELING POINTING DATA.
To show the effect of CTU on pointing behavior, we modeled data from a previously published experiment (Lewald and Ehrenstein 2000). Just as MGA adjusts for the uncertainty associated with the target location, we expect pointing behavior to change with the mean of the target location distribution. As previously mentioned, CTU biases target location estimates toward forward view as a result of the prior distribution on eye position (Fig. 3). Therefore if people are estimating their CTU, we would also expect their pointing behavior to change with these biases.
Head-fixed participants in the Lewald and Ehrenstein (2000) experiment were required to fixate a target that could appear within ±30° of the midline of the head. After the target had been extinguished for a period of time, subjects were asked to point to the remembered location of the target. In this task, pointing to the target required subjects to point in the direction given by their estimated eye position, because eye position and target location were coupled at presentation. Pointing biases were recorded as the difference between subject settings and true target direction. Data were pooled across subjects, and medians (with 95% CIs) were reported. We extracted these data from Fig. 2D of Lewald and Ehrenstein (2000) using a computer program and replotted them with model predictions superimposed in Fig. 5C. Full descriptions of data collection and methods are found in the original paper.
Because visual information is constant (v = 0) for all targets in the pointing experiment, pointing errors must result from biases in eye position estimates (like those shown in Fig. 3B). Our model predicts biased eye position estimates as a consequence of the use of priors in Bayesian inference. Previous work suggests the brain uses a prior belief that saccade magnitudes are small (Niemeier et al. 2003). Here we make a similar assumption: that the brain holds a prior belief that the highest-probability eye position is forward view, with eccentric eye positions increasingly less probable. This prior has the effect of biasing eye position estimates toward forward view. When remembered visual targets are remapped to body-centered coordinates, the presence of eye position CTU is therefore expected to bias target location estimates toward forward view (Fig. 3). If the brain incorporates knowledge of the effects of CTU on remapped object location estimates, we would expect pointing behavior to mirror these biases in eye position estimates.
We use the eye-centered target storage model with the same parameter values to generate predictions, except that haptic data are excluded. Neglecting the influence of any systematic motor biases, we assume pointing direction will match the target estimate μy,vis,eye. Based on the experimental design, we assume v = 0 and μeye = 0, which reduces μy,vis,eye to we·e. Predicted bias in pointing direction is computed as the difference between the modeled pointing direction (we·e) and the actual target direction (given by e), resulting in

bias(e) = (we − 1)·e (10)

which produces the curve shown in Fig. 5C.
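Equation 10 amounts to a single shrinkage term. It can be sketched as follows, assuming an illustrative linear signal-dependent noise model σe(e) = a + b·|e| (the paper's exact noise model is given in the appendix, so the values of a and b here are hypothetical):

```python
def pointing_bias(e_deg, sigma_r_deg=10.0, a=1.0, b=0.1):
    """Predicted pointing bias, Eq. 10: (w_e - 1) * e.

    sigma_e(e) = a + b*|e| is an illustrative signal-dependent noise
    model, not the paper's fitted one."""
    sigma_e = a + b * abs(e_deg)
    w_e = sigma_r_deg**2 / (sigma_r_deg**2 + sigma_e**2)  # gain, Eq. 2
    return (w_e - 1.0) * e_deg

# Bias vanishes at forward view and pulls eccentric targets back
# toward the midline (sign opposite to e), as in Fig. 5C.
```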
Previous work shows that maximum grip aperture scales with object size (Paulignan et al. 1991, 1997) and also increases when visual information is degraded (Sivak and MacKenzie 1990; Wing et al. 1986). We tested whether reaching behavior similarly adjusts for changes in target location uncertainty introduced by varying CTU. Specifically, participants reached to the remembered locations of occluded targets while eye position was varied. The experimental logic is that if we find that maximum grip aperture changes with eye position when the target is occluded, it suggests the brain uses knowledge of CTU for reach planning that results from remapping stored target location information between eye- and head-centered coordinates.
To make our predictions more precise, we modeled the amount of target uncertainty that should be introduced by varying eye position. We developed a Bayesian model of eye position sensing, using published data to provide realistic values for the model parameters. Figure 3 shows the behavior of Bayesian inference of eye position. Because the biases and uncertainty that arise in the inference of eye position propagate to all information passed between eye- and head-centered coordinates, the figure also shows the predicted consequences of CTU for estimates of target location. Examples of Bayesian inference of eye position are shown in Fig. 3A. Because of signal-dependent noise, eye position uncertainty (measured by the SD of the posterior distribution) increases away from forward view, as shown in Fig. 3C. The prior shown in Fig. 3A biases estimates of eye position toward forward view, as shown in Fig. 3B. Because remapping from eye- to head-centered coordinates involves adding an estimate of eye position, biases in target estimates should mirror those for eye position. In addition, the uncertainty in remapped target information should vary with eye position like the curve shown in Fig. 3C. In essence, these profiles are fingerprints for identifying whether information has been transformed between eye- and head-centered coordinates. That is, if the brain estimates CTU and uses that information for reach plans, we would expect MGA to vary with eye position as in Fig. 3C and pointing biases to vary as in Fig. 3B.
Experimental data are shown in Fig. 5, with model predictions superimposed. At reach onset in the target occluded condition, all vision was extinguished (Fig. 2, C and D). Although neither the fixation point nor the target was visible during a reach, subjects' maximum grip aperture increased for eye positions away from forward view (Fig. 5A). Moreover, the smallest MGA occurred near forward view rather than at the target location, showing that the effect is a consequence of eye position and not target location. Although subjects made hundreds of identical reaches in this experiment over the course of many hours across several days, uncertainty about object location still varied with eye position, as predicted by storage of information in an eye-centered reference frame (Fig. 5A). In contrast, if object location were stored in a body-centered coordinate frame (e.g., head or hand), there should be no effect of eye position (Fig. 5A).
To verify that MGA is a measure of object location uncertainty, we subsequently ran the same experiment on a different group of subjects with a visual occluder that allowed view of the target, but not the hand (Fig. 2, A and B). Visual location uncertainty increases linearly with the eccentricity of the target because of the decline in retinal acuity in the periphery (Burbeck 1987; Burbeck and Yap 1990; Levi and Klein 1996; Whitaker and Latham 1997). The data show an almost linear change in MGA with eccentricity, as expected from retinal acuity, verifying MGA as a measure of target location uncertainty. Moreover, the deviation from linearity is in the direction predicted by eye-centered storage but not by head-centered storage (Fig. 5B).
The MGA results summarize systematic changes in subjects' approach trajectories that occur when the amount of target uncertainty increases, for both occluded and visual data (Fig. 6). Trajectories from the visual condition show that the effect of increased uncertainty is to widen the excursion between finger and thumb and to make the finger trajectories become more "hooked" (Fig. 6A). Trajectory changes in the target occluded condition are similar, but the widening pattern is reordered such that the trajectory associated with forward view has the least amount of hook (Fig. 6B). The similarity of the trajectory changes in the visual and occluded conditions suggests that both are the result of target uncertainty.
We also studied whether our model could account for biases in pointing data that have been used to infer target storage in eye-centered coordinates (see Crawford et al. 2004 for review). In these studies, target direction is briefly presented using auditory (Lewald 1998; Pouget et al. 2002), visual (Admiraal et al. 2003, 2004; Henriques et al. 1998; Lewald and Ehrenstein 2000; Mergner et al. 2001), or proprioceptive (Pouget et al. 2002) information while eye position is varied, after which subjects point to the remembered target direction. These studies show systematic changes in pointing error with eye position, as qualitatively predicted by eye-centered storage. Most of these studies have complicated designs (e.g., multiple fixations per trial; Henriques et al. 1998) and/or mix biases caused by coordinate transformation effects with those caused by visual sensing and motor production. However, our model generates simple predictions for the data presented in Lewald and Ehrenstein (2000) because eye position is the only available source of information about target direction. We found good agreement between data and model predictions, shown in Fig. 5C.
The experimental data show that participants adjust their grasping behavior to compensate for target location uncertainty introduced by errors both in visual sensing and in eye position sensing (CTU). These results suggest that the brain represents CTU and incorporates its effects into estimates of target location. Previous studies have shown that the brain represents the reliability of sensory data both within (Landy et al. 1995; Saunders and Knill 2004) and between modalities (Atkins et al. 2001; Battaglia et al. 2003; Ernst and Banks 2002), in addition to representing the endpoint variance of motor movements (Harris and Wolpert 1998; Trommershauser et al. 2003, 2005). By representing CTU, the brain has all the components required to predict performance of complex sensorimotor behaviors involving perception-action cycles.
Although we attribute the change in grasping behavior in the occluded condition to target uncertainty, it may be that holding eye position away from forward view creates an attentional load that affects grasping. We believe this possibility is implausible for two reasons. First, it is unclear why an attentional load would produce the same kind of trajectory changes as visual uncertainty, yet we showed that eye position affects finger trajectories in the same way as changes in the amount of visual information. Second, reaching normally requires focal attention to be shifted to the target (Song and Nakayama 2006); however, we argue that allocating focal attention to the target should be easier when fixating the target than when looking straight ahead.
In addition, the effect of CTU on grasping in the target occluded condition is consistent with an eye-centered memory representation for object location. The only sensory information in this condition is the brief haptic contact with the object at the end of each reach. Despite the lack of visual information, grasping varies with eye position in a manner consistent with bringing haptic information into eye-centered coordinates through a noisy transformation. These results extend previous neural (Batista et al. 1999; Pouget et al. 2002) and psychophysical evidence from pointing (Crawford et al. 2004; Henriques et al. 1998; Lewald and Ehrenstein 2000; Mergner et al. 2001) for eye-centered target storage by showing that eye-centered storage persists in a task for which there is no visual information specifying target location. Note that our results do not preclude the possibility that target information is stored in multiple coordinate frames (Avillac 2005). In that case, reach planning may not be exclusively based on an eye-centered target representation; however, our results suggest that an eye-centered representation is both updated without vision and incorporated in reach plans.
Although it seems intuitive that an eye-centered coordinate frame is used when visual information is present, it is less clear why nonvisual information is also stored in eye-centered coordinates (i.e., in the target occluded condition). One possibility is that the brain uses a robust strategy for information storage. If the brain assumes that the loss of visual information is temporary, eye-centered storage allows rapid prediction of the target's location, which is useful for error correction if the target reappears. It may be that people who have experienced extended periods of poor visual information (e.g., those with low vision or blindness) do not use eye-centered target representations. Another possibility is that the visual fixation marks promoted an eye-centered storage strategy. In particular, fixation marks may have been used to maintain an accurate representation of the body's configuration with respect to the apparatus. Computing hand and target location relative to a fixation point would be equivalent to computing hand and target locations in eye-centered coordinates. In future work, we plan to study whether MGA changes across eye position remain when a tactile fixation point is used in the target occluded condition.
In summary, we showed that grasping behavior adjusts for coordinate transformation uncertainty introduced by errors in eye position sensing, suggesting the brain has an internal model capable of predicting the consequences of CTU. In addition, we provided a Bayesian model that quantitatively describes the impact of CTU on both the reliability and bias of the posterior distribution on target location. The model was used to provide an explanation for previously reported biases in pointing, in addition to predicting our own psychophysical data. Together, these results suggest that CTU may affect behavior in tasks where multimodal cue combination is being performed and body articulation is not fixed. CTU could safely be ignored in previous research on multimodal cue combination because it primarily focused on tasks with constant body articulation (Atkins et al. 2001; Battaglia et al. 2003; Ernst and Banks 2002). However, in tasks with variable body articulation, the coordinate transformation uncertainty introduced by errors in joint sensing may cause behavioral changes that will be inexplicable if CTU is not taken into account.
APPENDIX: DATA MODELING
We developed a probabilistic model that simulates how target location information is combined and stored to compare with the grasping data presented here, as well as the pointing data presented in Lewald and Ehrenstein (2000). In this section, the computations are structured to clearly define the dependence of reaches on the most recent sensory information and to show the effects of coordinate transformation uncertainty.
The basic computations and assumptions of the model are as follows. We assume that remembered target location can be represented by a probability density function. For example, the memory distribution for eye-centered coordinates is represented by p_mem(x | m_{k−1}), where x is the target location in eye-centered coordinates and m_{k−1} summarizes visual and haptic experience up to the (k − 1)th trial. On each trial, before the kth reach, perceived target location is computed by combining recent sensory data with target information in memory. The perceived target location is represented by a density function p_x(x | v_k, h_{k−1}, e_{k−1}, m_{k−1}), which makes explicit the dependence on the current trial's visual information v_k (when available) and the previous trial's haptic information h_{k−1} (which introduces a dependence on eye position information e_{k−1}). Finally, we assume that reach actions are based on perceived target location transformed to body-centered coordinates, represented by p_y(y | v_k, h_{k−1}, e_{k−1}, m_{k−1}, e_k), and potentially corrupted by CTU because of errors in eye position sensing (for eye-centered storage).
First we derive expressions for perceived target location and for the effect of CTU. To derive explicit formulas for perceived object location, Gaussian approximations are used for all distributions, where N(x; μ, σ²) denotes a Gaussian density on x with mean μ and variance σ². The distributions are realistic insofar as their parameters are set from published data. Specific assumptions about the form of these distributions are described in Modeling.
Perceived target location in eye-centered coordinates
To form the distribution on perceived target location, visual information from the current trial p(v_k | x) and haptic information from the previous trial p(h_{k−1} | x, e_{k−1}) are combined with memory using probabilistic inference, resulting in the following expression when the target is visible

p_x(x | v_k, h_{k−1}, e_{k−1}, m_{k−1}) = p(v_k | x) p(h_{k−1} | x, e_{k−1}) p_mem(x | m_{k−1}) / p(v_k, h_{k−1} | e_{k−1}, m_{k−1})    (11)

whereas the following results when the target is occluded

p_x(x | h_{k−1}, e_{k−1}, m_{k−1}) = p(h_{k−1} | x, e_{k−1}) p_mem(x | m_{k−1}) / p(h_{k−1} | e_{k−1}, m_{k−1})    (12)

where v_k denotes the retinal location of the target, h_{k−1} is the target location information conveyed by touching the object on the previous trial, e_{k−1} are the eye position signals used to bring haptic information into eye-centered coordinates, and the probabilities in the denominators are normalization constants that do not affect inference.
We model the visual and haptic distributions as Gaussian

p(v_k | x) = N(v_k; x, σ_v²)    (13)

p(h_{k−1} | y) = N(h_{k−1}; y, σ_h²)    (14)

p(h_{k−1} | x, e_{k−1}) = N(x; h_x, σ_{h,x}²)    (15)

where h_x and σ_{h,x}² are the mean and variance of the haptic distribution remapped into eye-centered coordinates. Remapping has the effect of introducing bias and additional uncertainty caused by errors in eye position sensing

h_x = h_{k−1} − w_e e_{k−1},   σ_{h,x}² = σ_h² + σ_{pe,k−1}²    (16)
The terms w_e e_{k−1} and σ_{pe,k−1}² are the mean and variance of the distribution describing eye position inference, and w_e is a gain factor that encodes a bias in eye position estimation toward forward view. Both are described in the next section.
The posterior distributions for target location with and without visual information are Gaussian, with mean and variance given by

σ_{x,k}² = (1/σ_v² + 1/σ_{h,x}² + 1/σ_mem²)^{−1},   μ_{x,k} = σ_{x,k}² (v_k/σ_v² + h_x/σ_{h,x}² + x_mem/σ_mem²)    (17)

σ̃_{x,k}² = (1/σ_{h,x}² + 1/σ_mem²)^{−1},   μ̃_{x,k} = σ̃_{x,k}² (h_x/σ_{h,x}² + x_mem/σ_mem²)    (18)

where x_mem and σ_mem² are the mean and variance of the memory distribution p_mem(x | m_{k−1}).
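These posteriors are instances of standard precision-weighted cue combination: each cue contributes in proportion to its reliability, and dropping the visual cue widens the posterior. A minimal sketch, with illustrative numbers rather than the model's fitted parameters:

```python
def fuse_gaussians(means, variances):
    """Combine independent Gaussian cues by precision weighting.

    Returns the posterior mean and variance, as in the combination of
    visual, remapped haptic, and memory information for perceived
    target location (sketch only; cue values below are made up).
    """
    precisions = [1.0 / v for v in variances]
    var_post = 1.0 / sum(precisions)                          # variances combine harmonically
    mu_post = var_post * sum(m * p for m, p in zip(means, precisions))
    return mu_post, var_post

# Target visible: visual, remapped haptic, and memory cues (deg, deg^2).
mu, var = fuse_gaussians([1.0, 0.0, 0.5], [1.0, 4.0, 2.0])
# Target occluded: drop the visual cue; the posterior variance grows.
mu_occ, var_occ = fuse_gaussians([0.0, 0.5], [4.0, 2.0])
```

Because the remapped haptic variance σ_{h,x}² includes the CTU term σ_{pe,k−1}², any increase in eye position uncertainty feeds directly into a wider posterior here.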
Models for visual and haptic information are presented in Modeling.
Coordinate transformations and eye position sensing
Modeling coordinate transformations using probability theory involves defining a joint distribution that relates the position of the target object in the two coordinate frames using the parameters of the transformation. Although this is complicated in general, we chose to represent target location in angular coordinates so that the transformation between head and eye coordinates becomes approximately linear¹: y = x + r, where r denotes the azimuthal (angular) coordinate of the eye with respect to forward view. Probabilistically, the transformation is a transition kernel expressing the relation between y and x for every value of eye position r: p(y | x, r). A transition kernel for the linear transformation above can be represented by a Dirac delta function: p(y | x, r) = δ[y − (x + r)]. Information about eye position r is represented by p(r | e), where e is a vector that summarizes efference copy of motor commands (Lewis et al. 1998), proprioception (Steinbach 1986), and the retinal location of the fixation point (Niemeier et al. 2003). Coordinate transformation uncertainty arises from marginalizing the transition kernel over eye position

p(y | x, e) = ∫ p(y | x, r) p(r | e) dr    (19)
The effect of CTU is to introduce uncertainty in the relationship between x and y. In particular, assuming p(r | e_k) = N(r; w_e e_k, σ_{pe,k}²) is approximately Gaussian (modeled below), the above integral results in

p(y | x, e_k) = N(y; x + w_e e_k, σ_{pe,k}²)    (20)
Using D to symbolize an arbitrary set of data, coordinate transformations from eye- to head-centered coordinates (and vice versa) integrate across the transition kernel and incorporate available eye position signals, from eye to head and from head to eye, respectively

p(y | D, e_k) = ∫∫ p(y | x, r) p(x | D) p(r | e_k) dx dr,   p(x | D, e_k) = ∫∫ p(y | x, r) p(D | y) p(r | e_k) dy dr / p(D | e_k)    (21)

where p(D | e_k) in the second expression is a normalization constant. These transformations arise when information in eye-centered coordinates is transformed to body-centered coordinates before a reach and when haptic information is brought into eye-centered coordinates. Because the first expression in Eq. 21 is a convolution integral given the form of the density in Eq. 20, the effect of CTU is to shift the mean of the transformed distribution by w_e e_k and to add σ_{pe,k}² to its variance. For haptic information brought into eye-centered coordinates, the effects of CTU were shown in Eq. 16. Using Eq. 21 to modify the expressions in Eqs. 17 and 18 for perceived target location in eye-centered coordinates, the distribution for perceived target location in head-centered coordinates is given with vision by

p_y(y | v_k, h_{k−1}, e_{k−1}, m_{k−1}, e_k) = N(y; μ_{x,k} + w_e e_k, σ_{x,k}² + σ_{pe,k}²)    (22)

and without vision by

p_y(y | h_{k−1}, e_{k−1}, m_{k−1}, e_k) = N(y; μ̃_{x,k} + w_e e_k, σ̃_{x,k}² + σ_{pe,k}²)    (23)

where μ_{x,k}, σ_{x,k}² and μ̃_{x,k}, σ̃_{x,k}² are the means and variances of the eye-centered posteriors in Eqs. 17 and 18.
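The effect of the eye-to-head transformation, a shift by the eye position estimate plus added CTU variance, can be checked with a small Monte Carlo sketch (all numbers are illustrative values in degrees, not experimental parameters):

```python
import numpy as np

rng = np.random.default_rng(0)

# Eye-centered estimate of the target and the eye position posterior
# (means and variances are made-up illustrative values).
mu_x, var_x = 5.0, 2.0     # perceived target location, eye-centered
mu_r, var_r = 18.0, 9.0    # eye position posterior: w_e * e_k, sigma_pe^2

# Analytic remapping: y = x + r, so means add and variances add.
mu_y, var_y = mu_x + mu_r, var_x + var_r

# Monte Carlo check that the convolution behaves as claimed.
n = 100_000
samples = (rng.normal(mu_x, np.sqrt(var_x), n)
           + rng.normal(mu_r, np.sqrt(var_r), n))
assert abs(samples.mean() - mu_y) < 0.1
assert abs(samples.var() - var_y) < 0.2
```

The head-centered estimate inherits the bias of the eye position estimate (the w_e e_k shift) and is strictly less reliable than the eye-centered one, which is the source of the eye-position-dependent MGA changes in the occluded condition.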
Next we model eye position sensing, after which we assemble the model components into the form used to generate data predictions.
Modeling eye position sensing
We assume that the distribution encoding eye position p(r | e) is derived from e and prior information p(r) according to Bayes' formula

p(r | e) = p(e | r) p(r) / p(e)    (24)

where p(e | r) = N(e; r, σ_e(r)²) (Niemeier et al. 2003; van Opstal and van Gisbergen 1989) models the eye position signals as unbiased but with signal-dependent noise that varies as a function of eye position. The prior on eye position p(r) reflects the fact that the eyes are usually directed forward with respect to the head (Stahl 2001). For our simulations, we fixed the Gaussian eye position prior to be p(r) = N(r; 0, σ_0²). The resulting posterior distribution is used to estimate eye position and takes the form N(r; μ_pe, σ_pe²), where μ_pe = w_e e and σ_pe² = w_e σ_e².
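The gain w_e and the posterior parameters follow from standard Gaussian conjugacy; a short derivation using the symbols above:

```latex
p(r \mid e) \;\propto\; N(e;\, r,\, \sigma_e^2)\, N(r;\, 0,\, \sigma_0^2)
  \;\propto\; N\!\left(r;\;
      \underbrace{\frac{\sigma_0^2}{\sigma_0^2 + \sigma_e^2}}_{w_e}\, e,\;
      \frac{\sigma_0^2\, \sigma_e^2}{\sigma_0^2 + \sigma_e^2}\right)
```

so that μ_pe = w_e e and σ_pe² = w_e σ_e². As the signal-dependent noise σ_e grows away from forward view, w_e shrinks, pulling the estimate more strongly toward the forward-view prior while the posterior variance increases.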
Perceived object location in head-centered coordinates
Expressions for the mean and variance of the perceived object distribution in head-centered coordinates are generated below. Predictions for pointing and grasping based on these results are found in RESULTS.
Let x̄_mem and σ̄_mem² denote the average mean and variance of the memory distribution in eye-centered coordinates.
Target visible, eye-centered coordinates
Target occluded, eye-centered coordinates
Storing information in head-centered coordinates
To model storage in head-centered coordinates, all the computations above remain the same, except the memory update is performed in head-centered coordinates y, with memory distribution p(y|mk−1). When visual information is available the head-centered memory distribution is updated as follows:
Target visible, head-centered coordinates
Target occluded, head-centered coordinates
Without vision, integrating across v and x removes all dependence between y and r and hence there is no eye position dependence for memory distributions stored in head-centered coordinates.
This project was funded by National Eye Institute Grant NEI R01 EY-015261 and Office of Naval Research Grant ONR N00014-05-1-0124.
1 The approximation stems from the fact that the eye is offset from the center of the head and that the eye does not rotate around its center. However, the offset is constant and hence does not affect our results, and the effect of off-axis eye rotation is negligible compared with the rotation itself.
The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
- Copyright © 2007 by the American Physiological Society