1Departments of Psychology and 2Computer Science and Engineering, University of Minnesota, Minneapolis, Minnesota
Submitted 12 February 2007; accepted in final form 1 April 2007
| ABSTRACT |
|---|
| INTRODUCTION |
|---|
There is also evidence in sensorimotor control that the brain stores object locations for reaching in eye-centered coordinates (Batista et al. 1999; Henriques et al. 1998; Pouget et al. 2002). Eye-centered storage requires remapping locations after every eye movement (Duhamel et al. 1992; Goldberg and Bruce 1990; Henriques et al. 1998; Walker et al. 1995), whereas using this representation for reaching involves remapping to body-centered coordinates (Crawford et al. 2004). Although remapping eye-centered locations after saccades is quite accurate (Blouin et al. 2002; Hallet and Lightstone 1976; Herter and Guitton 1998; Israel and Berthoz 1989; McKenzie and Lisberger 1986; Ohtsuka 1994; Pelisson et al. 1989; Schlag et al. 1990; Sparks and Mays 1983; Zivotofsky et al. 1996), other types of eye movements may introduce substantial error (Baker et al. 2003). Moreover, the precision of eye-centered coordinates for reaching depends on accurate knowledge of the relative positions of body segments between the eye and the hand. However, the brain's encoding of body articulation is imprecise, and the quality of the encoding is position dependent (e.g., Niemeier et al. 2003; van Opstal and van Gisbergen 1989). We term the uncertainty about the relationship between body segments caused by imperfect sensory knowledge coordinate transformation uncertainty (CTU; Fig. 1). When there is CTU in a system, estimates are degraded by each transformation made between coordinate frames.
|
Sober and Sabes (2005) showed that people use knowledge about CTU when integrating current sensory information. The goal of this paper is to study whether the brain represents and compensates for CTU when making grasping movements to remembered (i.e., stored) target locations. To experimentally test for CTU compensation, head-fixed participants repeatedly reached to an occluded cylindrical target while fixating targets that spanned an 80° range of eye positions. To manipulate CTU between the eyes and head, we took advantage of the fact that error in eye position encoding varies both with saccade magnitude and with the eye's eccentricity away from forward view. In addition, we show that grasping movements to visual targets compensate for object location uncertainty by increasing maximum grip aperture (MGA). Therefore, if the brain stores targets in eye-centered coordinates and compensates for the effects of CTU, we predict that MGA should also vary as a function of eye position, even when the target is occluded.
| METHODS |
|---|
After signing an informed consent form, four subjects (2 males, 2 females) participated in the visual condition of this experiment, whereas five subjects (3 males, 2 females) participated in the target occluded condition of the experiment. The subjects were all right-handed students from the University of Minnesota and were given monetary compensation for their participation in the study. The subjects ranged from 19 to 32 yr of age, and all had normal or corrected to normal vision. Our protocol was approved by University of Minnesota IRB.
Apparatus
Trajectory data were acquired by attaching three infrared emitting devices (IREDs) to the fingernails of both the forefinger and the thumb; the IREDs were tracked with an Optotrak 3020 sampling at 100 Hz (Fig. 2E). Cylinder viewing distance was 52 cm, and forward viewing distance to the middle fixation point ("E") was 62 cm. Fixation letters (A–I) corresponded to degrees away from the cylinder (80–0°, in 10° increments). Subjects began each reach from a starting block located 35 cm from a 2.2-cm-diam cylinder, 11.5 cm in length. The starting block was located 95 cm off the ground plane and 1 cm below the cylinder plane. Once subjects initiated movement (>2 mm) from the starting block, the trial was initiated. Subjects had 1,200 ms to successfully lift the cylinder. The timer was stopped when a switch was tripped on the bottom of the target resting block, requiring subjects to lift the cylinder 5 mm. The cylinder's position was maintained using a 2-cm-tall clear plastic tube that was just large enough to allow smooth cylinder movement.
|
Procedures
Head-fixed subjects made repeated reaches to the same spatially fixed target located 40° to the right of the subject (0° in cylinder coordinates; Fig. 2). The cylinder was not moved across the workspace to ensure that any changes in reaching behavior resulted from changes in eye position/viewing eccentricity and not kinematic demands. Moreover, the cylinder's location was selected such that there was the maximal range of eye positions away from the target (0–80°) to allow for the greatest range of uncertainty.
At the beginning of each trial, the fixation point was announced (e.g., E). After subjects fixated, they were allowed to make their reach to the target. They were instructed to reach as quickly (<1,200 ms) and accurately as possible. Once they successfully lifted and replaced the target, their hand had to be returned within 0.66 mm of their original starting position before the next fixation point was announced. If subjects took too long or did not lift the cylinder correctly, they heard an error message and the trial was repeated. Fixation points were randomly assigned throughout the trial. In the full range condition, each of eight possible fixations A–H, corresponding to 80, 70, 60, 50, 40, 30, 20, and 10° target eccentricity, was repeated seven times per block. The fixation point corresponding to 0° target eccentricity (I) was excluded to keep similar fixations for both the visual and target occluded participants (because the occluder blocks both the target and the 0° fixation letter). Full range subjects ran seven blocks per day over 3 days, for a total of 1,176 (7 repetitions x 8 fixations x 7 blocks x 3 days) trials. Conversely, partial range subjects were cued to fixate one of three possible letters (H, E, or B) corresponding to 10, 40, or 70° away from the target. Each fixation was repeated 21 times per block for seven blocks, resulting in a total of 441 trials (21 repetitions x 3 fixations x 7 blocks x 1 day). Note that the partial and full range conditions both have the same number of repetitions (147) per fixation point. However, the full range condition allowed us to capture the full functional form of the MGA profile, whereas the partial range condition allowed us to gain statistical power over a shorter period of time. Therefore, in the target occluded condition, two subjects were run in the full range condition and three subjects were run in the partial range condition. In the visual condition, all subjects were run in the partial range condition.
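The trial counts quoted above can be checked with a few lines of arithmetic. The sketch below is only a bookkeeping aid; the function names are illustrative and not part of the original protocol.

```python
def total_trials(reps_per_block, n_fixations, blocks_per_day, days):
    """Total number of reaches a subject performs in a condition."""
    return reps_per_block * n_fixations * blocks_per_day * days


def reps_per_fixation(reps_per_block, blocks_per_day, days):
    """Number of reaches made at each individual fixation point."""
    return reps_per_block * blocks_per_day * days


# Full range: 7 repetitions x 8 fixations x 7 blocks x 3 days
full_total = total_trials(7, 8, 7, 3)
# Partial range: 21 repetitions x 3 fixations x 7 blocks x 1 day
partial_total = total_trials(21, 3, 7, 1)

full_per_fixation = reps_per_fixation(7, 7, 3)
partial_per_fixation = reps_per_fixation(21, 7, 1)

print(full_total, partial_total)                # 1176 441
print(full_per_fixation, partial_per_fixation)  # 147 147
```

Both designs give 147 reaches per fixation point, which is why the two conditions can be compared directly.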
Subjects in the target occluded condition were never allowed to see their hand or the target as they were blocked by an occluder (Fig. 2, C and D). Moreover, all of the remaining visual information was removed after reach onset by shutter glasses. Subjects were instructed to maintain their eye position after the shutter lenses were closed. As a result, the only information subjects had available to them during a reach was stored haptic information from previous reaches to the target. If target location is stored in eye-centered coordinates, an estimate of their eye position is also necessary to remap the stored target location to body-centered coordinates. This manipulation allowed us to vary the amount of CTU in the task (i.e., the eye position uncertainty; see Modeling) while keeping the reliability of the haptic information constant. Therefore if subjects are estimating their CTU, we expect maximum grip aperture to vary with eye position.
Subjects in the visual condition were allowed to see the target throughout the duration of their reach. However, at no point were they allowed to see their hand. An occluder was used that allowed subjects to view the target but not see their hand until within 1 cm of the target. In addition, liquid crystal shutter glasses were triggered to block vision of both the hand and target during the last centimeter, where the fingers would be visible from the occluder (Fig. 2, A and B). Subjects were instructed to maintain gaze fixation once the shutters were closed. This manipulation ensured that changes in viewing eccentricity only influenced target uncertainty, as opposed to uncertainty about both the hand and target. The visual condition allowed us to verify that MGA is encoding target uncertainty for the task.
Analysis
Cubic interpolating splines were fit to trajectories to allow for an analytic description of the trajectory and to compensate for occlusions. If the trajectory had total occlusion time >40 ms, it was discarded from the dataset. Only 5% of the trajectories were discarded from the data because of occlusions that tended to occur at the beginning (i.e., launch off starting block) and return path of the reach. Grip aperture was computed as the distance between the center points of the sensors on the fingernails. MGAs were averaged across subjects and blocks to produce Fig. 5. MGA is the maximum distance between the thumb and forefinger during a reach, and it typically occurs 75–80% of the distance to the target object (Sivak and MacKenzie 1990). Its importance is that it serves as a measure of target uncertainty (Wing et al. 1986) and scales linearly with actual object size (Paulignan et al. 1997).
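The grip aperture computation described above can be sketched as follows. This is a simplified illustration, not the analysis code used in the study: it skips occluded frames instead of fitting cubic splines, it applies the 40-ms limit to each marker separately (an assumption), and all function and constant names are hypothetical.

```python
import math

SAMPLE_PERIOD_MS = 10   # 100-Hz sampling gives one frame every 10 ms
MAX_OCCLUSION_MS = 40   # trajectories with more occlusion are discarded


def occlusion_ms(samples):
    """Total time (ms) during which a marker position is missing (None)."""
    return sum(1 for s in samples if s is None) * SAMPLE_PERIOD_MS


def max_grip_aperture(thumb, finger):
    """Largest thumb-finger distance over a reach, or None if discarded.

    thumb and finger are equal-length lists of (x, y, z) positions in mm,
    with None marking an occluded frame.
    """
    if (occlusion_ms(thumb) > MAX_OCCLUSION_MS
            or occlusion_ms(finger) > MAX_OCCLUSION_MS):
        return None
    apertures = [math.dist(t, f)
                 for t, f in zip(thumb, finger)
                 if t is not None and f is not None]
    return max(apertures) if apertures else None


# Toy reach: the aperture opens to 70 mm mid-reach, then closes again.
thumb = [(0, 0, 0), (10, 0, 0), (20, 0, 0)]
finger = [(0, 30, 0), (10, 70, 0), (20, 40, 0)]
mga = max_grip_aperture(thumb, finger)
print(mga)  # 70.0
```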
|
Modeling
We developed a probabilistic model of target location inference with two distinct goals in mind. First, we provided concrete predictions for the effect of target location uncertainty on maximum grip aperture, relying on previous results to provide values for the parameters in the model. Second, we wanted to extend these predictions to include the possible effects of coordinate transformation uncertainty on MGA. We present separate predictions for eye-centered and head-centered storage of target information. Other possible storage schemes like body-centered or storage in multiple coordinate frames (Avillac 2005) make the same predictions as head-centered storage. The equations and assumptions used for data modeling are presented here, while derivations are presented in the APPENDIX.
In the model, reach plans depend on an inference of target location using information remapped to body-centered coordinates. Target information consists of recent visual and haptic information combined with information stored in memory. Memory is represented by a probability distribution on target coordinates, and operations on this distribution are used to represent the effects of sensory information and coordinate transformation uncertainty. We present two sets of expressions for target inference, one for eye-centered and one for head-centered coordinates, to compare the effects of combining and storing information in different coordinate frames in the presence of CTU.
The principal variables in the model are target location in eye- and head-centered coordinates and visual, haptic, and eye position signals. For simplicity, target location is represented in spherical coordinates with origin at the center of the head or at the midpoint between the two eyes, for head- or eye-centered coordinates, respectively. Modeling is restricted to the azimuthal angular component of target position (angle in the plane containing both eyes and the origin), because it is sufficient to account for changes in grasping and pointing behavior and to discuss optimality. Azimuthal target coordinates are represented by x in eye- and y in head-centered coordinates. For mathematical simplicity, the transform between x and y is approximated by y = x + r, where r is the azimuthal angle of eye position with respect to forward view. The approximation results from ignoring the offset between the origins in eye and head centered coordinates and only affects the model of pointing data.
Visual, haptic, and eye position signals provide information specifying target location and appropriate coordinate transformations. The azimuthal angle of the retinal projection from eye position provides a noisy visual signal v for target location in eye-centered coordinates. The noise in v is modeled as zero-mean (for convenience and because our predictions do not depend on this quantity), with a signal-dependent variance that models the effects of decreased spatial resolution on visual sensing of peripherally viewed targets. Using results from two-point discrimination studies (Burbeck 1987; Burbeck and Yap 1990; Levi and Klein 1996; Whitaker and Latham 1997), visual uncertainty $\sigma_v$ linearly increases with the eccentricity of the visual information (in mm)
![]() | (1) |
This value was derived from the threshold (v/30)° reported in Levi and Klein (1996), and converted to millimeters. Note that similar models for visual uncertainty are used in both Niemeier et al. (2003) and Saunders and Knill (2004).
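As a rough illustration of the visual noise model, the sketch below evaluates the (v/30)° threshold and converts it to millimeters. The conversion is an assumption: the text does not state how degrees were mapped to millimeters, so the code uses the lateral extent subtended at the 52-cm cylinder viewing distance. All names are hypothetical.

```python
import math

VIEWING_DISTANCE_MM = 520.0  # assumed: cylinder viewing distance of 52 cm


def visual_sd_deg(eccentricity_deg):
    """Visual uncertainty in degrees: the (v/30) deg threshold."""
    return abs(eccentricity_deg) / 30.0


def deg_to_mm(angle_deg, distance_mm=VIEWING_DISTANCE_MM):
    """Assumed conversion: extent subtended by angle_deg at distance_mm."""
    return distance_mm * math.tan(math.radians(angle_deg))


def visual_sd_mm(eccentricity_deg):
    """Visual uncertainty in mm for a target at the given eccentricity."""
    return deg_to_mm(visual_sd_deg(eccentricity_deg))


for ecc in (10, 40, 70):  # the three partial range fixations
    print(ecc, round(visual_sd_mm(ecc), 2))
```

Under these assumptions the uncertainty grows almost linearly over the tested range, because the angles involved (at most a few degrees) are small.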
Touching the object provides a haptic signal h to target location in body-centered coordinates available at the end of each reach. Haptic information from previous trials was included in the model because recent studies have shown haptic and visual information are optimally combined (Ernst and Banks 2002; van Beers et al. 1999, 2003) and that haptic experience affects visual judgments (Atkins et al. 2001). We consider h to provide information about y because the relationship between shoulder and head was held constant. As a result, any effect of the CTU between head- and body-centered coordinates is similar for all data and can safely be ignored. The noise in h is modeled as zero-mean and constant variance. Based on haptic noise estimates derived from data in Ernst and Banks (2002), we set k = 15 mm. This is slightly larger than the 10- to 12-mm SD found in van Beers et al. (1999); however, we found the precise value of this variable did not affect our predictions.
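For readers unfamiliar with optimal cue combination, the following sketch shows the standard reliability-weighted rule for two independent Gaussian cues, which is the sense in which haptic and visual information are said to be optimally combined. It is a generic textbook form, not the full model of this paper (which also includes memory and CTU); the function name and example values are illustrative.

```python
def combine_gaussian_cues(mu_a, sd_a, mu_b, sd_b):
    """Combine two independent Gaussian cues about the same quantity.

    Each cue is weighted by its reliability (inverse variance); the
    combined estimate is never less reliable than the better cue.
    """
    var_a, var_b = sd_a ** 2, sd_b ** 2
    w_a = var_b / (var_a + var_b)          # weight on cue a
    mu = w_a * mu_a + (1.0 - w_a) * mu_b   # combined estimate
    sd = (var_a * var_b / (var_a + var_b)) ** 0.5
    return mu, sd


# Example: a 15-mm haptic cue and an equally reliable visual cue
# that disagree by 10 mm.
mu, sd = combine_gaussian_cues(0.0, 15.0, 10.0, 15.0)
print(mu, round(sd, 2))  # 5.0 10.61
```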
Noisy eye position signals e summarize available information about the eye's position in the head, including efference copy of motor commands (Lewis et al. 1998), proprioception (Steinbach 1986), and the retinal location of familiar visual landmarks (Niemeier et al. 2003). Uncertainty on e is modeled as zero-mean with a signal-dependent noise $\sigma_e$ (van Opstal and van Gisbergen 1989). While the noise is assumed unbiased, we show that Bayesian inference of eye position r from signals e produces estimates biased toward forward view (Fig. 3). The bias results from the combination of eye position signals with a prior on eye position p(r) that encodes the assumption that the angle between eye position and head direction is maintained around zero (Stahl 2001). For our simulations, the prior was modeled as Gaussian with zero mean and constant uncertainty ($\sigma_r$ = 10.0°), which biases eye position estimates by a gain factor
$w_e = \dfrac{\sigma_r^2}{\sigma_r^2 + \sigma_e^2}$ | (2) |
The resulting posterior on eye position is Gaussian with mean $\mu_{pe}$ and variance $\sigma_{pe}^2$, where $\mu_{pe} = w_e e$ and $\sigma_{pe}^2 = w_e \sigma_e^2$.
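The eye position inference can be sketched numerically. The code assumes the standard result for a zero-mean Gaussian prior combined with a Gaussian likelihood, in which the gain is the prior variance divided by the sum of prior and signal variances. The linear noise function `eye_signal_sd` and its parameters are hypothetical placeholders, since the text does not give the signal-dependent noise values.

```python
PRIOR_SD_DEG = 10.0  # prior on eye position, as stated in the text


def eye_signal_sd(e_deg, base_sd=1.0, slope=0.1):
    """Hypothetical signal-dependent noise: SD grows with eccentricity."""
    return base_sd + slope * abs(e_deg)


def infer_eye_position(e_deg, prior_sd=PRIOR_SD_DEG):
    """Posterior gain, mean, and SD of eye position given signal e_deg."""
    sd_e = eye_signal_sd(e_deg)
    w_e = prior_sd ** 2 / (prior_sd ** 2 + sd_e ** 2)  # gain factor
    mu_pe = w_e * e_deg                # estimate shrinks toward 0 deg
    sd_pe = (w_e * sd_e ** 2) ** 0.5   # posterior SD
    return w_e, mu_pe, sd_pe


for e in (0.0, 20.0, 40.0):
    w_e, mu_pe, sd_pe = infer_eye_position(e)
    print(e, round(w_e, 3), round(mu_pe, 2), round(sd_pe, 2))
```

With these placeholder values, the estimate at 40° is pulled back to 32° and the posterior SD grows with eccentricity, which is the qualitative pattern the model relies on.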
The decision to put signal-dependent noise on eye position rather than eye movements deserves comment. Although remapping data between eye- and head-centered coordinates requires specifying absolute eye position, not just eye displacement, it is not completely clear how to characterize this error. Because subjects saccade to the fixation points from forward view for the experimental data presented here, we cannot distinguish uncertainty due to saccade magnitude from uncertainty caused by eye position. However, we believe both introduce uncertainty for the following reasons. For eye displacement tasks, eye position uncertainty is well predicted by saccade scatter (Niemeier et al. 2003), which varies with saccade magnitude. Moreover, the force required to maintain eye eccentricity is roughly linear in eccentricity, and signal-dependent noise in force generation is also linear (Jones et al. 2002). Therefore signal-dependent noise should also result from eye position per se, independent of the saccade that brought the eye to that location.
MODELING GRASPING DATA.
We assume reach trajectories in our grasping experiments are planned to grasp the object based on the best current estimates of object location and diameter while avoiding object-finger collision. By assuming MGA scales with location uncertainty to avoid object-finger collision, MGA is modeled as proportional to the object's diameter D (Paulignan et al. 1997) plus the uncertainty $\sigma_y$ in the target's location in head-centered coordinates
$\mathrm{MGA} \propto D + \sigma_y$ | (3) |
Because the hand is occluded throughout the reach, the observed changes in MGA cannot be attributed to an on-line feedback control strategy. We attribute changes in MGA to the value of $\sigma_y$ during movement planning. Note the assumption of proportionality: the model does not specify that people use a particular collision avoidance criterion. If the brain represents location uncertainty and uses it for reach planning, MGA should vary in proportion to $\sigma_y$. Below we test whether MGA follows the functional trends predicted by $\sigma_y$ under different storage and sensory conditions.
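A minimal sketch of the proportionality assumption is given below. The proportionality constant `gain` is a free, hypothetical parameter (the text specifies only proportionality), and the diameter is the 2.2-cm cylinder from the apparatus description.

```python
CYLINDER_DIAMETER_MM = 22.0  # 2.2-cm-diameter cylinder


def predicted_mga(sigma_y_mm, gain=1.0, diameter_mm=CYLINDER_DIAMETER_MM):
    """MGA modeled as proportional to object diameter plus uncertainty.

    gain is an unspecified scale factor; only the trend across
    conditions is predicted, not the absolute aperture.
    """
    return gain * (diameter_mm + sigma_y_mm)


# Larger location uncertainty predicts a wider maximum grip aperture.
for sigma_y in (15.0, 20.0, 25.0):
    print(sigma_y, predicted_mga(sigma_y))
```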
Reach planning with collision avoidance requires estimates both of object location and uncertainty. These estimates are assumed to result from a Bayesian computation that combines the information available at the time of movement planning. The available information consists of both visual information (in the visual condition) and a memory distribution that provides a means to accumulate sensory feedback across trials. Estimates of the object's location and uncertainty are formed and updated by the following sequence of events. At the end of the (t − 1)th trial, new haptic data from grasping the object are appropriately remapped (depending on whether storage is in eye- or head-centered coordinates) and combined with memory. At the beginning of the tth trial, a fixation saccade changes eye position, and new visual information (when available) is combined with memory. The updated memory distribution is transformed to body-centered coordinates and a reach plan is generated. After reach execution, new haptic information is acquired, completing the cycle. Although the memory distribution adds modeling complexity, we believe that the ability to accumulate information across time serves an important function in normal visuomotor behaviors. Maintaining a memory of both the location and uncertainty of an out-of-view object allows a person to interact with the object without seeking current sensory information (e.g., a driver can keep their eyes on the road while reaching for a cup of coffee).
Figure 4 shows the differences in storage strategies when combining haptic information (from a previous trial) with newly acquired visual information. Note that this diagram omits remapping targets to hand-centered coordinates, which we assume introduces additional coordinate transformation uncertainty. Uncertainty caused by head-to-hand remapping is assumed constant because the target, hand, and head were fixed during reach planning. However, targets stored in eye-centered coordinates acquire additional eye position CTU when target position is remapped to make a reach.
|
We compute $\sigma_y$ for four different conditions: target visible with eye-centered storage ($\sigma_{y,vis,eye}$), target occluded with eye-centered storage ($\sigma_{y,occ,eye}$), target visible with head-centered storage ($\sigma_{y,vis,head}$), and target occluded with head-centered storage ($\sigma_{y,occ,head}$), and show that, because of CTU, the reliability of Bayesian inference depends on storage strategy. We present expressions for $\sigma_y$ for each condition below. To quantitatively predict MGA, degrees were converted to millimeters for eye-position information.
Bayesian inference with eye-centered storage maintains a distribution on target location x that captures both a target estimate $\mu_{eye}$ and its uncertainty $\sigma_{eye}$. The distribution is corrected by sensory information remapped to eye-centered coordinates (when available) and adjusted for the effects of each eye movement. When the target is visible, visual and haptic information are combined with memory in eye-centered coordinates and passed to head-centered coordinates for reach planning. The uncertainty in target location resulting from this computation is given by
![]() | (4) |
Haptic information acquires additional uncertainty $\sigma_{pe,k-1}$ when brought into eye-centered coordinates, a result of CTU caused by errors in eye position sensing on the previous trial, and the whole expression acquires additional uncertainty $\sigma_{pe,k}$ from the transformation from eye- to head-centered coordinates, which is necessary for making a reach. Target occlusion removes any visual information, resulting in the following expression
![]() | (5) |
Bayesian inference in head-centered coordinates is similar, except a memory distribution is maintained on y with mean $\mu_{head}$ and variance $\sigma_{head}^2$. Visual information is remapped to head-centered coordinates (acquiring additional uncertainty $\sigma_{pe,k}$) and combined with memory before reach planning, resulting in the following uncertainty expressions when the target is visible
![]() | (6) |
![]() | (7) |
Although Eqs. 4–7 seem complicated, they simplify dramatically when the memory variance is much larger than the variance of the current information. By assuming large memory uncertainty, the predictions only depend on models for visual, haptic, and eye position uncertainty. More specifically, Eq. 4 reduces to
![]() | (8) |
![]() | (9) |
In the occluded conditions, the expressions reduce to $\sigma_{y,occ,eye} = \sigma_h + \sigma_{pe,k}$ and $\sigma_{y,occ,head} = \sigma_h$ for eye- and head-centered storage, respectively. The simplified equations were used to make the predictions shown in Fig. 5. For large memory variance, there is no learning across trials and the integration of information is a more general form of cue combination that incorporates coordinate transformation uncertainty and its impact on the reliability of information resulting from different storage strategies. It should be noted that inaccurate location memory is not unreasonable: memory SDs >4° occur in tasks with similar delays involving humans (Elliott and Madalena 1987).
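The qualitative difference between the two storage schemes in the occluded condition can be illustrated as follows. Several ingredients are assumptions made only for illustration: the haptic uncertainty is taken as 15 mm, the signal-dependent eye position noise is a hypothetical linear function, degrees are converted to millimeters at the 52-cm viewing distance, and fixation "E" (40° from the target) is taken as forward view.

```python
import math

HAPTIC_SD_MM = 15.0            # assumed haptic uncertainty
PRIOR_SD_DEG = 10.0            # prior on eye position from the text
VIEWING_DISTANCE_MM = 520.0    # assumed distance for deg-to-mm conversion
FORWARD_VIEW_ECC_DEG = 40.0    # fixation "E": eyes straight ahead


def eye_signal_sd_deg(eye_pos_deg, base_sd=1.0, slope=0.1):
    """Hypothetical signal-dependent noise on the eye position signal."""
    return base_sd + slope * abs(eye_pos_deg)


def eye_posterior_sd_mm(eye_pos_deg):
    """Posterior SD of eye position, converted from degrees to mm."""
    sd_e = eye_signal_sd_deg(eye_pos_deg)
    w_e = PRIOR_SD_DEG ** 2 / (PRIOR_SD_DEG ** 2 + sd_e ** 2)
    sd_pe_deg = math.sqrt(w_e) * sd_e
    return VIEWING_DISTANCE_MM * math.tan(math.radians(sd_pe_deg))


def sigma_y_occluded(target_ecc_deg, storage):
    """Target uncertainty (mm) when the target is occluded."""
    if storage == "head":
        return HAPTIC_SD_MM  # no dependence on eye position
    eye_pos_deg = FORWARD_VIEW_ECC_DEG - target_ecc_deg
    return HAPTIC_SD_MM + eye_posterior_sd_mm(eye_pos_deg)


eccentricities = [80, 70, 60, 50, 40, 30, 20, 10]  # fixations A to H
eye_pred = {ecc: sigma_y_occluded(ecc, "eye") for ecc in eccentricities}
head_pred = {ecc: sigma_y_occluded(ecc, "head") for ecc in eccentricities}

for ecc in eccentricities:
    print(ecc, round(eye_pred[ecc], 1), head_pred[ecc])
```

Under eye-centered storage the predicted uncertainty is smallest at forward view and grows in both directions, whereas head-centered storage predicts a flat profile.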
MODELING POINTING DATA.
To show the effect of CTU on pointing behavior, we modeled data from a previously published experiment (Lewald and Ehrenstein 2000). Similar to how MGA adjusts for uncertainty associated with the target location, we also expect pointing behavior to change with the mean of the target location distribution. As previously mentioned, CTU biases target location estimates toward forward view as a result of the prior distribution on eye position (Fig. 3). Therefore if people are estimating their CTU, we would also expect their pointing behavior to change with these biases.
Head-fixed participants in the Lewald and Ehrenstein (2000) experiment were required to fixate a target that could appear within ±30° of the midline of the head. After the target was extinguished for a period of time, subjects were asked to point to the remembered location of the target. In this task, pointing to the target required the subjects to point to the mean location of their estimated eye position, because eye position and target location were coupled at presentation. Pointing biases were recorded as the difference between subject settings and true target direction. Data were pooled across subjects, and medians (with 95% CIs) were reported. We extracted these data from Fig. 2D of Lewald and Ehrenstein (2000) using a computer program, replotting them with model predictions superimposed in Fig. 5C. Full descriptions of data collection and methods are found in the original paper.
Because visual information is constant (v = 0) for all targets in the pointing experiment, pointing errors must result from biases in eye position estimates (like those shown in Fig. 3B). Our model predicts biased eye position estimates as a consequence of the use of priors in Bayesian inference. Previous work suggests the brain uses a prior belief that saccade magnitudes are small (Niemeier et al. 2003). Here we make a similar assumption that the brain has a prior belief that the highest probability eye position is forward view, with eccentric eye positions increasingly less probable. This prior has the effect of biasing eye position estimates toward forward view. When remembered visual targets are remapped to body-centered coordinates, the presence of eye position CTU is expected to bias target location estimates toward forward view as a result of the prior distribution on eye position (Fig. 3). If the brain incorporates knowledge of the effects of CTU on remapped object location estimates, we would expect their pointing behavior to mirror biases in eye position estimates.
We use the eye-centered target storage model and the same parameter values to generate predictions, except haptic data are excluded. Neglecting the influence of any systematic motor biases, we assume pointing direction will match the target estimate $\mu_{y,vis,eye}$. Based on the experimental design, we assume v = 0 and $\mu_{eye} = 0$, which reduces $\mu_{y,vis,eye}$ to $w_e e$. Predicted bias in pointing direction is computed as the difference between the observed pointing directions (modeled as $w_e e$) and the actual target direction (given by e), resulting in
$\mathrm{bias} = w_e e - e = (w_e - 1)\,e$ | (10) |
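The predicted pointing bias, computed as the modeled pointing direction minus the true target direction, can be sketched over the ±30° range of the pointing experiment. The gain is assumed to take the standard Gaussian form (prior variance over the sum of prior and signal variances), and the linear noise function is again a hypothetical placeholder, not the fitted model.

```python
PRIOR_SD_DEG = 10.0  # prior on eye position from the text


def eye_signal_sd(e_deg, base_sd=1.0, slope=0.1):
    """Hypothetical signal-dependent noise on the eye position signal."""
    return base_sd + slope * abs(e_deg)


def pointing_bias(e_deg, prior_sd=PRIOR_SD_DEG):
    """Predicted bias (deg): modeled pointing w_e * e minus target e."""
    sd_e = eye_signal_sd(e_deg)
    w_e = prior_sd ** 2 / (prior_sd ** 2 + sd_e ** 2)
    return (w_e - 1.0) * e_deg


targets = [-30.0, -15.0, 0.0, 15.0, 30.0]
biases = [pointing_bias(e) for e in targets]
for e, b in zip(targets, biases):
    print(e, round(b, 2))
```

Because the gain is below 1, the bias always has the opposite sign to the target direction, so pointing is pulled toward forward view and more strongly so for eccentric targets.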
| RESULTS |
|---|
To make our predictions more precise, we modeled the amount of target uncertainty that should be introduced by varying eye position. We developed a Bayesian model of eye position sensing, using published data to provide realistic values for model parameters. Figure 3 shows the behavior of Bayesian inference of eye position. Because the biases and uncertainty that arise in the inference of eye position propagate to all information passed between eye- and head-centered coordinates, the figure also shows the predicted consequences of CTU on estimates of target location. Examples of Bayesian inference of eye position are shown in Fig. 3A. Because of signal-dependent noise, eye position uncertainty (measured by the SD of the posterior distribution) increases away from forward view, shown in Fig. 3C. The use of the prior shown in Fig. 3A biases estimates of eye position toward forward view as shown in Fig. 3B. Because remapping from eye- to head-centered coordinates involves adding an estimate of eye position, biases in target estimates should mirror those for eye position. In addition, the uncertainty in remapped target information should vary with eye position like the curve shown in Fig. 3C. In essence, these profiles are fingerprints for identifying whether information has been transformed between eye- and head-centered coordinates. That is, if the brain is estimating CTU and using that information for reach plans, we would expect MGA to change similar to Fig. 3C and pointing to change with Fig. 3B.
Experimental data are shown in Fig. 5, with model predictions superimposed. At reach onset in the target occluded condition, all vision was extinguished (Fig. 2, C and D). Although both the fixation point and the target were not visible during a reach, subjects' maximum grip aperture increased for eye positions away from forward view (Fig. 5A). Moreover, the smallest MGA occurred near forward view, rather than at the target location, showing that the effect is a consequence of eye position and not target location. Although subjects made hundreds of identical reaches in this experiment over the course of many hours across several days, uncertainty about object location varies with eye position as predicted by storing information in an eye-centered reference frame (Fig. 5A). In contrast, if object location were stored in a body-centered coordinate frame (e.g., head or hand), there should be no effect of eye position (Fig. 5A).
To verify that MGA is a measure of object location uncertainty, we subsequently ran the same experiment on a different group of subjects with a visual occluder that allowed view of the target, but not the hand (Fig. 2, A and B). Visual location uncertainty increases linearly with the eccentricity of the target, because of changes in retinal acuity in the periphery (Burbeck 1987; Burbeck and Yap 1990; Levi and Klein 1996; Whitaker and Latham 1997). The data show an almost linear change in MGA with retinal acuity, verifying MGA as a measure of target location uncertainty. Moreover, the deviation from linearity is in the direction predicted by eye-centered storage but not head-centered (Fig. 5B).
The MGA results summarize systematic changes in subjects' approach trajectories that occur when the amount of target uncertainty increases, for both occluded and visual data (Fig. 6). Trajectories from the visual condition show that the effect of increased uncertainty is to widen the excursion between finger and thumb and make the finger trajectories become more "hooked" (Fig. 6A). Trajectory changes in the target occluded condition are similar, but the widening pattern is reordered such that the trajectory associated with forward view has the least amount of hook (Fig. 6B). The similarity in the trajectory changes in the visual and occluded conditions suggests that both are the result of target uncertainty.
|
| DISCUSSION |
|---|
Although we attribute the change in grasping behavior in the occluded condition to target uncertainty, it may be that holding eye position away from forward view creates an attentional load that affects grasping. We believe this possibility is implausible for two reasons. First, it is unclear why decreased attention would create the same kind of trajectory changes as visual uncertainty; however, we showed eye position affects finger trajectories in the same way as changes in the amount of visual information. Second, reaching normally requires focal attention to be shifted to the target (Song and Nakayama 2006). However, we argue allocating focal attention to the target should be easier when fixating the target than when looking straight ahead.
In addition, the effect of CTU on grasping in the target occluded condition is consistent with an eye-centered memory representation for object location. The only sensory information in this condition is the brief haptic contact with the object at the end of each reach. Despite the lack of visual information, grasping varies with eye position consistent with bringing haptic information into eye-centered coordinates using a noisy transformation. These results extend previous neural (Batista et al. 1999; Pouget et al. 2002) and psychophysical evidence (from pointing) (Crawford et al. 2004; Henriques et al. 1998; Lewald and Ehrenstein 2000; Mergner et al. 2001) for eye-centered target storage by showing that eye-centered storage persists in a task for which there is no visual information specifying target location. Note that our results do not preclude the possibility that target information is stored in multiple coordinate frames (Avillac 2005). In this case, reach planning may not be exclusively based on an eye-centered target representation; however, our results suggest that an eye-centered representation is both updated without vision and is incorporated in reach plans.
Although it seems intuitive that an eye-centered coordinate frame is used when visual information is present, it is less clear why nonvisual information is also being stored in eye-centered coordinates (i.e., target occluded condition). One possibility is that the brain uses a robust strategy for information storage. If the brain assumes that the loss of visual information is temporary, eye-centered storage allows rapid prediction of the target's location, which is useful for error correction if the target reappears. It may be the case that people who have had extended periods of poor visual information (i.e., low-vision or blind) will not use eye-centered target representations. Another possibility is that visual fixation marks promoted an eye-centered storage strategy. In particular, fixation marks may have been used to maintain an accurate representation of the body's configuration with respect to the apparatus. Computing hand and target location relative to a fixation point would be equivalent to computing hand and target locations in eye-centered coordinates. In future work, we plan to study if MGA changes across eye position remain when a tactile fixation point is used in the target occluded condition.
In summary, we showed that grasping behavior adjusts for coordinate transformation uncertainty introduced by errors in eye position sensing, suggesting the brain has an internal model capable of predicting the consequences of CTU. In addition, we provided a Bayesian model that quantitatively describes the impact of CTU on both the reliability and bias of the posterior distribution on target location. The model was used to provide an explanation for previously reported biases in pointing, in addition to predicting our own psychophysical data. Together, these results suggest that CTU may affect behavior in tasks where multimodal cue combination is being performed and body articulation is not fixed. CTU could safely be ignored in previous research on multimodal cue combination because it primarily focused on tasks with constant body articulation (Atkins et al. 2001; Battaglia et al. 2003; Ernst and Banks 2002). However, in tasks with variable body articulation, the coordinate transformation uncertainty introduced by errors in joint sensing may cause behavioral changes that will be inexplicable if CTU is not taken into account.
| APPENDIX: DATA MODELING |
|---|
|
|
|---|
The basic computations and assumptions of the model are as follows. We assume remembered target location can be represented by a probability density function. For example, the memory distribution for eye-centered coordinates is represented by $p_{mem}(x \mid m_{k-1})$, where $x$ is the target location in eye-centered coordinates and $m_{k-1}$ summarizes visual and haptic experience up to the $(k-1)$th trial. At each trial before the $k$th reach, perceived target location is computed by combining recent sensory data with target information in memory. The perceived target location is represented by a density function $p_x(x \mid v_k, h_{k-1}, e_{k-1}, m_{k-1})$, which makes explicit the dependence on the current trial's visual information $v_k$ (when available) and the previous trial's haptic information $h_{k-1}$ (which introduces dependence on eye position information $e_{k-1}$). Finally, we assume reach actions are based on perceived target location transformed to body-centered coordinates, represented by $p_y(y \mid v_k, h_{k-1}, e_{k-1}, m_{k-1}, e_k)$, and potentially corrupted by CTU because of errors in eye position sensing (for eye-centered storage).
First we derive expressions for perceived target location and for the effect of CTU. To derive explicit formulas for perceived object location, Gaussian approximations are used for all distributions, where $N(x; \mu, \sigma^2)$ denotes a Gaussian density on $x$ with mean $\mu$ and variance $\sigma^2$. However, all distributions in the model are realistic insofar as the parameters are known. Specific assumptions about the form of these distributions are described in Modeling.
Perceived target location in eye-centered coordinates
To form the distribution on perceived target location, visual information from the current trial $p(v_k \mid x)$ and haptic information from the previous trial $p(h_{k-1} \mid x, e_{k-1})$ are combined with memory using probabilistic inference, resulting in the following expression when the target is visible
![]() | (11) |
![]() | (12) |
where $v_k$ denotes the retinal location of the target, $h_{k-1}$ is the target location information conveyed by touching the object on the previous trial, $e_{k-1}$ are the eye position signals used to bring haptic information into eye-centered coordinates, and the probabilities in the denominators are normalization constants that do not affect inference.
We model visual and haptic distributions as Gaussian
![]() | (13) |
![]() | (14) |
![]() | (15) |
where $\mu_{h,x}$ and $\sigma_{h,x}^2$ are the mean and variance of the haptic distribution remapped into eye-centered coordinates. Remapping has the effect of introducing bias and additional uncertainty caused by errors in eye position sensing
![]() | (16) |
The terms $e_{k-1}$ and $\sigma_{pe,k-1}^2$ are the mean and variance of a distribution describing eye position inference, and $w_e$ is a gain factor that encodes a bias in eye position estimation toward forward view. Both are described in the sections below.
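As a minimal illustrative sketch of the remapping step (not the fitted model), the Python fragment below assumes that haptic information arrives in head-centered coordinates as a Gaussian and that inferred eye position is Gaussian with mean $w_e e_{k-1}$. Under those assumptions, subtracting the inferred eye position shifts the haptic mean and adds the eye position variance. The names `Gaussian` and `remap_haptic_to_eye`, the sign convention, and all numerical values are hypothetical.

```python
from typing import NamedTuple


class Gaussian(NamedTuple):
    """Mean and variance of a one-dimensional Gaussian (degrees, degrees^2)."""
    mean: float
    var: float


def remap_haptic_to_eye(h_prev: float, var_h: float,
                        e_prev: float, var_pe_prev: float,
                        w_e: float) -> Gaussian:
    """Bring a head-centered haptic estimate into eye-centered coordinates.

    Assumes the linear relation y = x + r, so x = y - r, with the eye
    position r inferred as N(w_e * e_prev, var_pe_prev).
    """
    mean = h_prev - w_e * e_prev   # bias: eye position is underestimated when w_e < 1
    var = var_h + var_pe_prev      # CTU: eye position variance adds to haptic variance
    return Gaussian(mean, var)


# Hypothetical values: target felt at 10 deg (head-centered), eye signal at 20 deg.
remapped = remap_haptic_to_eye(h_prev=10.0, var_h=4.0,
                               e_prev=20.0, var_pe_prev=1.0, w_e=0.9)
print(remapped)
```

With a gain below one the remapped mean is displaced relative to the unbiased value $h_{k-1} - e_{k-1}$, and the remapped variance always exceeds the haptic variance alone.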
The posterior distributions for target location with and without visual information are Gaussian
![]() |
![]() |
![]() |
![]() | (17) |
![]() |
![]() | (18) |
Models for visual and haptic information are presented in Modeling.
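The Gaussian posteriors can be illustrated with the standard precision-weighted product of Gaussians. The sketch below is only an illustration under the assumption that the visual, remapped haptic, and memory terms are conditionally independent Gaussians over the same eye-centered variable; the function name `fuse_gaussians` and the example numbers are hypothetical and are not taken from the fitted model.

```python
from typing import Tuple

Cue = Tuple[float, float]  # (mean, variance)


def fuse_gaussians(*cues: Cue) -> Cue:
    """Combine independent Gaussian cues about the same quantity.

    The product of Gaussians is Gaussian with precision equal to the sum of
    the cue precisions and mean equal to the precision-weighted average.
    """
    if not cues:
        raise ValueError("at least one cue is required")
    precision = sum(1.0 / var for _, var in cues)
    mean = sum(mu / var for mu, var in cues) / precision
    return mean, 1.0 / precision


# Hypothetical eye-centered cues (degrees, degrees^2).
visual = (-5.0, 0.5)       # retinal location of the target
haptic_eye = (-4.0, 2.0)   # haptic estimate after remapping (includes CTU)
memory = (-4.5, 1.0)       # stored target location

visible = fuse_gaussians(visual, haptic_eye, memory)   # target visible
occluded = fuse_gaussians(haptic_eye, memory)          # target occluded
print("visible: ", visible)
print("occluded:", occluded)
```

Dropping the visual term can only lower the total precision, so the occluded posterior is never more reliable than the visible one in this sketch.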
Coordinate transformations and eye position sensing
Modeling coordinate transformations using probability theory involves defining a joint distribution that relates the position of the target object in both coordinate frames using the parameters of the transformation. Although this is complicated in general, we chose to represent target location in angular coordinates so that the transformation between head and eye coordinates becomes approximately linear¹: $y = x + r$, where $r$ denotes the azimuthal (angular) coordinate of the eye with respect to forward view. Probabilistically the transformation is a transition kernel expressing the relation between $y$ and $x$ for every value of eye position $r$: $p(y \mid x, r)$. A transition kernel for the linear transformation above can be represented by a Dirac delta function: $p(y \mid x, r) = \delta[y - (x + r)]$. Information about eye position $r$ is represented by $p(r \mid e)$, where $e$ is a vector that summarizes efference copy of motor commands (Lewis et al. 1998), proprioception (Steinbach 1986), and the retinal location of the fixation point (Niemeier et al. 2003). Coordinate transformation uncertainty arises from marginalizing the transition kernel over eye position $r$
![]() | (19) |
The effect of CTU is to introduce uncertainty in the relationship between $x$ and $y$. In particular, assuming $p(r \mid e_k) = N(r; w_e e_k, \sigma_{pe,k}^2)$ is approximately Gaussian (modeled below), the above integral results in
![]() | (20) |
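For readers following the derivation, the marginalization can be written out as a short worked step. This is a sketch under the stated assumptions only (a delta-function transition kernel for $y = x + r$ and a Gaussian eye position distribution with mean $w_e e_k$ and variance $\sigma_{pe,k}^2$); the symbols $\mu_x$ and $\sigma_x^2$ for an uncertain eye-centered estimate are introduced here for illustration.

```latex
\begin{aligned}
p(y \mid x, e_k)
  &= \int \delta\big[y - (x + r)\big]\, N\big(r;\, w_e e_k,\, \sigma_{pe,k}^2\big)\, dr \\
  &= N\big(y;\, x + w_e e_k,\, \sigma_{pe,k}^2\big), \\[4pt]
p(y \mid e_k)
  &= \int N\big(y;\, x + w_e e_k,\, \sigma_{pe,k}^2\big)\, N\big(x;\, \mu_x,\, \sigma_x^2\big)\, dx \\
  &= N\big(y;\, \mu_x + w_e e_k,\, \sigma_x^2 + \sigma_{pe,k}^2\big).
\end{aligned}
```

The delta function selects $r = y - x$, so the eye position density is simply re-expressed as a density on $y$; convolving with a Gaussian estimate of $x$ then adds the two variances.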
Using D to symbolize an arbitrary set of data, coordinate transformations from eye- to head-centered coordinates (and vice versa) integrate across the transition kernel and incorporate available eye position signals (from eye to head)
![]() |
![]() | (21) |
$\sigma_{pe,k}$. For haptic information brought into eye-centered coordinates, the effects of CTU were shown previously in Eq. 16. Using Eq. 22 to modify the expressions in Eqs. 17 and 18 for perceived target location in eye-centered coordinates, the distributions for perceived target location in head-centered coordinates are given by
![]() |
![]() | (22) |
![]() |
![]() | (23) |
Next we model eye position sensing, after which we pull the model components into the form used to generate data predictions.
Modeling eye position sensing
We assume that the distribution encoding eye position p(r|e) is derived from e and prior information p(r) according to Bayes' formula
![]() | (24) |
), and (Niemeier et al. 2003). The resulting posterior distribution is used to estimate eye position and takes on the form $N(r; \mu_{pe}, \sigma_{pe})$, where $\mu_{pe} = w_e e$ and $\sigma_{pe} = w_e \sigma_e$.
Perceived object location in head-centered coordinates
Expressions for the mean and variance of the perceived object distribution in head-centered coordinates are generated below. Predictions for pointing and grasping based on these results are found in RESULTS.
Let
) denote the average mean and variance of the memory distribution in eye-centered coordinates.
Combining Eqs. 22 and 23 with Eqs. 17 and 18 results in the following expressions:
Target visible, eye-centered coordinates
![]() |
![]() |
Target occluded, eye-centered coordinates
![]() |
![]() |
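To make the chain of computations concrete for the target occluded case with eye-centered storage, the following Python sketch strings the steps together: infer eye position, remap the previous haptic sample into eye-centered coordinates, combine it with memory, and transform the result to head-centered coordinates. It is an illustration under assumptions that are ours rather than the fitted model: a zero-mean Gaussian prior on eye position (forward view), a Gaussian eye position likelihood, all spread parameters treated as variances, and hypothetical function names and numbers.

```python
from typing import Tuple


def eye_position_posterior(e: float, var_e: float, var_r: float) -> Tuple[float, float]:
    """Posterior over eye position r given signal e.

    Assumes likelihood N(e; r, var_e) and prior N(r; 0, var_r). The gain
    w_e = var_r / (var_r + var_e) pulls the estimate toward forward view.
    """
    w_e = var_r / (var_r + var_e)
    return w_e * e, w_e * var_e  # posterior mean and variance


def perceived_head_centered_occluded(h_prev: float, var_h: float,
                                     e_prev: float, e_now: float,
                                     var_e: float, var_r: float,
                                     mem_mean: float, mem_var: float) -> Tuple[float, float]:
    """Mean and variance of perceived target location in head-centered coordinates."""
    r_prev, var_pe_prev = eye_position_posterior(e_prev, var_e, var_r)
    r_now, var_pe_now = eye_position_posterior(e_now, var_e, var_r)

    # Remap the head-centered haptic sample into eye-centered coordinates.
    hx_mean, hx_var = h_prev - r_prev, var_h + var_pe_prev

    # Precision-weighted combination with the eye-centered memory.
    precision = 1.0 / hx_var + 1.0 / mem_var
    x_mean = (hx_mean / hx_var + mem_mean / mem_var) / precision
    x_var = 1.0 / precision

    # Transform to head-centered coordinates for the reach (adds CTU again).
    return x_mean + r_now, x_var + var_pe_now


print(perceived_head_centered_occluded(h_prev=10.0, var_h=0.25,
                                       e_prev=20.0, e_now=20.0,
                                       var_e=1.0, var_r=3.0,
                                       mem_mean=-5.0, mem_var=1.0))
```

In this sketch eye position uncertainty enters twice, once when haptic information is brought into eye-centered coordinates and once when the stored estimate is read out for the reach.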
Storing information in head-centered coordinates
To model storage in head-centered coordinates, all the computations above remain the same, except the memory update is performed in head-centered coordinates $y$, with memory distribution $p(y \mid m_{k-1})$. When visual information is available the head-centered memory distribution is updated as follows:
Target visible, head-centered coordinates
![]() |
![]() |
Target occluded, head-centered coordinates
![]() |
![]() |
Without vision, integrating across v and x removes all dependence between y and r and hence there is no eye position dependence for memory distributions stored in head-centered coordinates.
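The contrast between the two storage schemes in the occluded condition can be illustrated numerically. The sketch below is a hedged toy comparison, not the model fit: it assumes Gaussian distributions throughout, a hypothetical eye position noise variance that grows linearly with eccentricity (`eye_noise_var`), a broad zero-mean prior on eye position, and arbitrary haptic and memory variances.

```python
def eye_noise_var(e_deg: float, base: float = 0.5, slope: float = 0.05) -> float:
    """Hypothetical eye position signal variance, growing with eccentricity."""
    return base + slope * abs(e_deg)


def eye_posterior_var(e_deg: float, var_r: float = 100.0) -> float:
    """Posterior variance of eye position under a zero-mean Gaussian prior."""
    var_e = eye_noise_var(e_deg)
    return var_r * var_e / (var_r + var_e)


def head_var_eye_centered_storage(e_deg: float, var_h: float = 1.0,
                                  mem_var: float = 2.0) -> float:
    """Reach-relevant variance when memory is kept in eye-centered coordinates."""
    var_pe = eye_posterior_var(e_deg)
    hx_var = var_h + var_pe                         # haptic remapped to eye-centered
    x_var = 1.0 / (1.0 / hx_var + 1.0 / mem_var)    # combined with memory
    return x_var + var_pe                           # transformed back for the reach


def head_var_head_centered_storage(e_deg: float, var_h: float = 1.0,
                                   mem_var: float = 2.0) -> float:
    """Reach-relevant variance when memory is kept in head-centered coordinates."""
    # No coordinate transformation is needed, so e_deg does not enter.
    return 1.0 / (1.0 / var_h + 1.0 / mem_var)


for e in (-40, -20, 0, 20, 40):
    print(e, round(head_var_eye_centered_storage(e), 3),
          round(head_var_head_centered_storage(e), 3))
```

Under these assumptions only the eye-centered scheme produces uncertainty that changes with eye position, which is the qualitative signature the MGA measurements are used to test.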
| GRANTS |
|---|
|
|
|---|
| FOOTNOTES |
|---|
1 The approximation stems from the fact that the eye is offset from the center of the head and that the eye does not rotate around its center. However, the offset is constant and hence does not affect our results, and the effect of off-axis eye rotation is negligible compared with the rotation.
Address for reprint requests and other correspondence: E. J. Schlicht, Univ. of Minnesota, N218 Elliott Hall, 75 East River Rd., Minneapolis, MN 55455
| REFERENCES |
|---|
|
|
|---|