Multiple Sensory Cues Underlying the Perception of Translation and Path

N. Au Yong, G. D. Paige, S. H. Seidman


The translational linear vestibuloocular reflex compensates most accurately for high frequencies of head translation, with response magnitude decreasing with declining stimulus frequency. However, studies of the perception of translation typically report robust responses even at low frequencies or during prolonged motion. This inconsistency may reflect the incorporation of nondirectional sensory information associated with the vibration and noise that typically accompany translation, into motion perception. We investigated the perception of passive translation in humans while dissociating nondirectional cues from actual head motion. In a cue-dissociation experiment, interaural (IA) motion was generated using either a linear sled, the mechanics of which generated noise and vibration cues that were correlated with the motion profile, or a multiaxis technique that dissociated these cues from actual motion. In a trajectory-shift experiment, IA motion was interrupted by a sudden change in direction (±30° diagonal) that produced a change in linear acceleration while maintaining sled speed and therefore mechanical (nondirectional) cues. During multi-axis cue-dissociation trials, subjects reported erroneous translation perceptions that strongly reflected the pattern of nondirectional cues, as opposed to nearly veridical percepts when motion and nondirectional cues coincided. During trajectory-shift trials, subjects' percepts were initially accurate, but erroneous following the direction change. Results suggest that nondirectional cues strongly influence the perception of linear motion, while the utility of cues directly related to translational acceleration is limited. One key implication is that “path integration” likely involves complex mechanisms that depend on nondirectional and contextual self-motion cues in support of limited and transient otolith-dependent acceleration input.


Effective spatial behavior (e.g., navigation and orientation) entails interacting with, and controlling motion through, a cluttered environment. Multiple sensory systems are utilized in this process. The idiothetic senses detect physical forces imparted on the body and its parts due to motion, whereas the exteroceptive senses relay reorientation and displacement of the environment relative to ourselves. It remains unclear how the neural mechanisms underlying spatial behavior integrate sensory information from different modalities to yield accurate perception of both self and environmental motion. One area of particular confusion is the role of vestibular input in forming accurate perceptions of self-motion. In this study, we specifically explored how idiothetic cues influence motion perception during linear motion, a common form of movement in daily life.

Idiothetic cues, consisting of both inertial and substratal information (Mittelstaedt and Mittelstaedt 2001), convey how the body moves through space. The vestibular system, having evolved specifically as a self-motion and orientation detector, no doubt plays an important role in this process (Guedry 1974; Young 1984). The vestibular endorgans, composed of the otoliths and semi-circular canals, detect linear and angular acceleration, respectively, of the head. Recent studies have demonstrated that humans can robustly determine linear displacement even during passive translations (i.e., not self-generated), an ability termed “path integration.” Many have suggested that this ability is dependent on a double integration of translational acceleration transduced by the otolith organs (Israël and Berthoz 1989; Israël et al. 1993; Mittelstaedt 1999; Mittelstaedt and Mittelstaedt 2001; Wallace et al. 2002). However, other investigators have found the influence of otolith information on self-motion perception (Ivanenko et al. 1997a) and spatial navigation (Glasauer et al. 1994, 2002) to be limited. Further, studies of the linear vestibuloocular reflex (LVOR) in response to purely translational motion, a behavior also driven by the otoliths, show that eye movements compensatory for translational motion reflect high-pass dynamics and are robust only for motion above ∼0.5 Hz (Paige and Tomko 1991; Paige et al. 1998; Telford et al. 1997). In particular, the LVOR response to translation is weak during constant velocity stimuli of the type often used in studies that report robust path integration, suggesting that vestibular information reflecting translation may not be as accurate or useful as observations from path integration studies imply.

Motion cues

Although otolith input, or its processing, might support or influence path integration, there are a variety of other cues related to linear motion that might underlie the behaviors reported in earlier studies. Directional cues provide vectorial information about translational motion and may arise from both inertial and noninertial sources. Inertial information is transduced directly by the otolith organs but also by tactile and pressure cues due to the body's (or organs therein) inertial forces against the stimulus apparatus that convey acceleration magnitude, direction, and time of onset. Thus the entire body may serve as an accelerometer of sorts. Noninertial directional cues result from the direct interaction of the body with the medium that it is moving through, such as wind sensed by the skin when riding in an open vehicle. Unlike inertial cues, which are present only during periods of acceleration and not during constant velocity travel, these noninertial directional cues can be present during all nonzero velocity motion. However, a cue such as wind may be caused by external sources while stationary as well, and this can be directionally confusing when combined with wind arising from self-motion in a different direction.

Nondirectional motion cues consist of mechanical vibration, transduced by the otolith organs and a variety of somatic mechanoreceptors, and sound, which is sensed through auditory mechanisms but may also be transduced by the otolith organs at levels exceeding 100 dB (Young et al. 1977). These cues are often produced as an artifactual byproduct of the apparatus used to provide motion. The nature of these cues will typically be a function of the magnitude of acceleration and speed (as opposed to the directional velocity) of movement. There is evidence that such cues influence speed perception in automobiles (Horswill and McKenna 1999).

Although it is tempting to attribute path integration to the processing of inertial signals as transduced by the otolith organs, all of the cues described in the preceding text are available during most translation, and it seems likely that all sensory information available will be used to optimally achieve a specific task. Inertial and noninertial sensory information must be processed differently to calculate path, however. Directional acceleration must be temporally integrated twice to directly generate path information, whereas directional velocity information must be integrated only once. Nondirectional information alone cannot be used in isolation to calculate path and must thus be processed in conjunction with some directional cue. It is also quite likely that path calculations can be supplemented by cognitive influences (Wertheim et al. 2001), conscious or otherwise, such as processes that use estimates of time of travel and speed of motion to derive an estimate of path.

To study and parse the simultaneous contributions of available cues on the perception of passive linear translation, we developed two experimental paradigms, a cue-dissociation paradigm and a trajectory-shift paradigm. In both, we assessed our subjects' percept of translation velocity derived from nonvisual somatosensory cues in darkness. The cue-dissociation paradigm examined the perception of translational motion under conditions where directional cues remained fixed while the predominant nondirectional cues, sound and vibration, were either correlated with, or dissociated from, translational motion. We hypothesize that responses thus elicited will show an influence of nondirectional cues. In addition, the trajectory-shift paradigm was used to assess how subjects perceive a sudden change in direction (i.e., change in otolith and directional somatic input) while maintaining linear speed and its associated nondirectional cues. We expect that less-than-perfect integration of head acceleration will manifest as perceptual errors during this type of motion.



Twelve healthy adult subjects between 21 and 31 yr in age participated in these experiments. Subjects had no history of visual, vestibular, oculomotor, or other neurological dysfunction with most (8 of 12 subjects) tested with laboratory methods (e.g., caloric responses and audiogram) and by clinical examination (Paige). All subjects were naive to the nature of the experiments. Five of the 12 subjects previously participated in experiments using the same apparatus but with different tasks and goals. Experiments were conducted in accordance with the 1964 Declaration of Helsinki, and were performed after obtaining informed consent from subjects with approval from the internal Research Subject Review Board at the University of Rochester.

Stimulus apparatus and control

A previously described multi-axis sled/rotator (Contraves USA and JA Design, Pittsburgh, PA) (Paige et al. 1998; Seidman and Paige 1996; Seidman et al. 1998) was used to generate translational motion. The apparatus consists of a linear sled sandwiched between two rotational axes (chair and base). The chair axis is composed of a custom chair mounted on a motorized angular rate table capable of generating ≤45 ft lb of torque. The chair axis is mounted on a linear sled driven by a motorized lead screw mechanism, capable of generating translation ≤5 Hz with peak accelerations of 0.5 g. The chair and sled axes are in turn mounted on an angular base axis capable of generating 325 ft lb of torque at frequencies ≤5 Hz. Voltages proportional to position and velocity of all three axes are provided by internal motion transducers.

Each axis of the apparatus could be moved independently or in combination to produce complex motion. The sled axis, due to the lead-screw drive and rails, produces noise and vibration, whereas both rotation axes function nearly silently and without perceived vibration. Thus nondirectional motion cues are highly correlated with the motion of the sled axis but not the rotation axes. We deliberately exploited these factors in this study when designing our motion stimuli.

Motion generation and profiles

Analog signals driving all three axes were generated using a custom real-time PC-based control system with 12-bit resolution operating at 1 kHz developed under LabVIEW RT (National Instruments, Austin, TX). Two types of linear motion trials were used in this study: pure interaural (IA) translations in a trapezoidal velocity profile, generated in two ways to dissociate translational acceleration cues from nondirectional cues (the cue-dissociation experiment); and similar linear velocity trapezoids that suddenly changed direction by 30° from IA during the constant velocity period, constituting a trajectory-shift without a change in sled-related mechanical cues. The integrity and reproducibility of each motion profile were confirmed with a three-axis accelerometer (Entran EGAL312S-10D, or Kistler 8393A2) fastened to the base of the chair. Peak accelerations of all stimuli were suprathreshold for the detection of passive linear movement (Benson et al. 1986; Gianna et al. 1996) with the possible exception of a small midtrial IA deceleration during the trajectory-shift protocol. We assessed the nature of nondirectional cues of motion in separate trials in which vibration during motion was measured with the Kistler accelerometer attached to the chair near the point of bitebar attachment, and audible noise was measured with a sound level meter (Quest model 1900) with the microphone placed at a location approximating the subjects' ears. For these trials, the outputs of the accelerometer and the sound meter were acquired at 25 kHz.


Motion stimuli during cue-dissociation experiments consisted of purely IA velocity trapezoids generated using two methods (sled-only and R-theta translation), each producing different relationships between nondirectional motion cues and actual motion. During sled-only translation (Fig. 1A), motion was generated using only the sled axis with the subject's IA axis aligned parallel to sled motion. Thus nondirectional cues were tightly correlated with actual motion (sled speed).

FIG. 1.

Motion stimuli using the sled/rotator device (diagrams at top), and signals from base rotation axis and the linear axis of motion as well as actual head translation as calculated from accelerometer recording (sets of traces below). Sensory cues associated with the operation of each axis and motion profiles are listed at the right. A: sled-only motion: interaural (IA) translation is produced only through moving the linear sled. B: R-theta paradigm: IA translation is generated through sled motion along with counterrotation of both the base and chair axes (chair velocity = −base velocity). C: trajectory-shift experiment: Sled-translation is initiated with an IA trajectory but is then altered mid-trial with a sudden counterrotation of both the base and chair axes, resulting in a diagonal path while maintaining the subjects' angular orientation in space.

During R-theta translation (Fig. 1B), a combination of sled motion and counter-rotation of the chair and base axes was used to translate the subjects along the chord of a circle while maintaining angular orientation in space. The chair was initially positioned eccentrically from the sled's center. To generate pure IA translation, three events took place simultaneously: the base axis rotated, the chair axis counter-rotated with respect to the base axis to maintain angular orientation, and the sled axis concurrently retracted and then extended the chair so as to maintain the chair's (and head's) trajectory in line with a 75-cm chord of a circle. By precisely controlling all axes, a linear velocity trapezoid was produced as in the sled-only case. Once again, the predominant nondirectional cues during R-theta translation were produced by sled motion. Because the speed of the sled during retraction and extension was now bimodal, noise and vibration cues behaved accordingly, whereas IA head velocity followed a trapezoidal form. Thus the noise and vibration cues during R-theta motion were dissociated from translation velocity.
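The geometry behind this dissociation can be sketched numerically. The following is a minimal illustration, not the control code used in the study: it assumes the chord lies at a perpendicular distance `d_cm` from the base axis (a hypothetical value, as none is given here) and that the head moves along the chord at velocity `v_cm_s`. The function name and its parameters are likewise illustrative.

```python
import math

def r_theta_state(x_cm, v_cm_s, d_cm):
    """Axis state for a head at position x_cm along a straight chord.

    x_cm   : head position along the chord, 0 at the chord midpoint
    v_cm_s : head velocity along the chord (the IA velocity)
    d_cm   : perpendicular distance from base axis to chord (ASSUMED value)
    """
    r = math.hypot(d_cm, x_cm)            # sled extension measured from the base axis
    theta = math.atan2(x_cm, d_cm)        # base-axis angle (rad)
    sled_speed = abs(x_cm * v_cm_s) / r   # |dr/dt|, the source of noise and vibration
    base_rate = d_cm * v_cm_s / (r * r)   # d(theta)/dt (rad/s); the chair counter-rotates equally
    return r, theta, sled_speed, base_rate

# Illustrative values only: 75-cm chord, 25 cm/s head velocity, assumed d = 50 cm.
for x in (-37.5, -20.0, 0.0, 20.0, 37.5):
    r, theta, sled_speed, base_rate = r_theta_state(x, 25.0, 50.0)
    print(f"x = {x:6.1f} cm   sled speed = {sled_speed:5.2f} cm/s")
```

Under these assumptions the sled retracts toward the chord midpoint and extends afterward, so its speed passes through zero at mid-chord even while head velocity is constant. Together with zero velocity at the start and end of the trapezoid, this yields the bimodal sled-speed pattern described above.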

Trapezoidal velocity profiles of various shapes (Fig. 2) were used, all producing an equal displacement of 75 cm with peak velocities of 10–45 cm/s and peak accelerations of 10–100 cm/s² (Table 1). A small amount of artifactual naso-occipital (NO) motion (0.17–0.6 cm/s² RMS, <6% of the slowest stimulus, at worst) was present during R-theta translation, as it is difficult to attain exact dynamic synchrony of all three axes. Rotational velocities during R-theta motion were <1°/s in magnitude. In addition to trapezoidal profiles, we also included a dual trapezoidal velocity profile with the sled-only method, consisting of two brief “V25A10” stimuli presented in succession (Fig. 2D). This profile was designed to replicate the characteristics of extra-otolith cues present during R-theta translation but now during sled-only motion. Unlike R-theta trials, nondirectional cues during this dual trapezoidal profile were tightly correlated with actual head velocity as with all sled-only trials.
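The timing of each single trapezoid follows from its plateau velocity, peak acceleration, and the fixed 75-cm displacement. The sketch below assumes symmetric constant-acceleration ramps and reads a label such as V25A10 as a 25 cm/s plateau with 10 cm/s² peak acceleration; both are assumptions made for illustration, and `trapezoid_timing` is a hypothetical helper.

```python
def trapezoid_timing(peak_v, peak_a, displacement=75.0):
    """Return (ramp time, plateau time, total time) in seconds.

    peak_v       : plateau velocity (cm/s)
    peak_a       : ramp acceleration magnitude (cm/s^2), ASSUMED constant
    displacement : total travel (cm)
    """
    ramp_t = peak_v / peak_a                   # time to reach the plateau
    ramp_d = 0.5 * peak_v * ramp_t             # distance covered by one ramp
    plateau_d = displacement - 2.0 * ramp_d    # distance left for the plateau
    if plateau_d < 0:
        raise ValueError("plateau velocity is not reached within the displacement")
    plateau_t = plateau_d / peak_v
    return ramp_t, plateau_t, 2.0 * ramp_t + plateau_t

print(trapezoid_timing(25.0, 10.0))   # V25A10 -> (2.5, 0.5, 5.5)
print(trapezoid_timing(10.0, 50.0))   # V10A50: brief ramps, long plateau
```

On this reading, V25A10 spends 5 of its 5.5 s accelerating or decelerating, consistent with its nearly triangular shape, whereas V10A50 is dominated by constant velocity travel.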

FIG. 2.

Examples of actual IA velocity profiles presented in cue-dissociation experiments generated with sled-only (solid) and R-theta (dashed) methods. A: V10A50; short acceleration phase with long periods of constant velocity. B: V25A10; triangular shaped profile with 2 predominant periods of constant acceleration. C: V25A25; Moderate period of acceleration followed by a moderate period of constant velocity. Profiles V25A50, V45A50, and V45A100 are similar. D: bimodal IA velocity stimulus generated with only the sled axis, approximately simulating the vibration and noise present during R-theta trials.

TABLE 1.

Nomenclature, plateau velocity, and peak acceleration of trapezoidal velocity stimuli employed in cue-dissociation experiments


Trajectory-shift experiments employed a stimulus consisting of a single linear velocity trapezoid along the sled axis, interjected midflight by a counterrotation of 30° by the chair and base axes (Fig. 1C). The subject and chair were initially displaced eccentrically on the sled and oriented with the IA axis parallel to the sled/head motion. The chair was then translated with a maximum acceleration of 100 cm/s² to achieve a 25-cm/s plateau. Once the chair reached the center (directly over the base axis), a sudden 30° counterrotation of the base and chair axes was introduced (completed within 0.8 s). The counterrotation maintained the subject's angular orientation in space while changing the translation direction from IA to a diagonal trajectory (±30° from IA). This effectively introduced a brief IA deceleration coupled with a novel forward or backward (NO) acceleration. The initial 25 cm/s IA head velocity was thereafter decreased to 21.7 cm/s but now coupled with a 12.5 cm/s NO component. Note that sled speed remained unaltered, and therefore so did nondirectional cues, during the entire plateau period of these trials. Acceleration-dependent inertial cues were, however, present during the change in trajectory, constituting novel but transient linear acceleration cues that were uncorrelated with noise and vibration.
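The plateau components quoted above follow from resolving the unchanged 25 cm/s sled speed along a path 30° from IA. As a quick check (the averaged accelerations assume, purely for illustration, that the velocity change is spread evenly over the 0.8-s counterrotation, which the actual profile need not satisfy):

```latex
\begin{aligned}
v_{IA} &= 25\cos 30^\circ \approx 21.65 \approx 21.7\ \text{cm/s}, &
v_{NO} &= \pm 25\sin 30^\circ = \pm 12.5\ \text{cm/s},\\
\bar{a}_{IA} &\approx \frac{21.65 - 25}{0.8} \approx -4.2\ \text{cm/s}^2, &
\bar{a}_{NO} &\approx \frac{\pm 12.5}{0.8} \approx \pm 15.6\ \text{cm/s}^2 .
\end{aligned}
```

The resultant speed, $\sqrt{21.65^2 + 12.5^2} = 25$ cm/s, is unchanged, which is why sled-related noise and vibration stay constant across the shift.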

The midtrial deceleration in the IA direction is near threshold for direction detection and will not necessarily influence perceived motion. The overall stimulus, however, is sufficient to examine the psychophysical response to the novel NO acceleration. If the subject correctly incorporates the novel NO component with a correct estimate of IA velocity, the result would be a report of motion in an oblique direction, with both NO and IA components. This is true whether the midtrial IA deceleration is robustly detected or not.

The mid-trial counter-rotation that provided the change in translational direction occurred over a 0.8 s interval, with a peak velocity of 60°/s. During this interval, dynamic differences between the base and chair axes precluded a perfect counter-rotation, but overall rotation only exceeded 3°/s for the first and last 0.1 s intervals of the 0.8 s interval of rotation.

Psychophysical task

Subjects were provided with a spring-centered X-Y joystick to report their perceptions of translational velocity in the IA and NO directions. The spring mechanism ensured that an indication of nonzero velocity is an intentional on-going action. Subjects were told that motion could be in any direction and were instructed to report their perception of motion in space by scaling the appropriate joystick channel with their perceived velocity, with higher velocities indicated by larger joystick deflection. Calibration and normalization of responses are described in the following text.

Experimental protocols

Subjects were secured to the chair with restraining belts at the chest, lap, and legs. The head was held rigidly to the chair using a customized bite-bar of dental compound over a steel bite-plate (Paige et al. 1998; Seidman and Paige 1996; Seidman et al. 1998), which might well play a role in enhancing the transmission of nondirectional cues of motion. Prior to the experiments, subjects underwent brief practice runs under light and dark conditions. Subjects were translated using sled-only motion along their IA axis, twice for each single-motion profile and once for the dual trapezoidal profile, affording the operator the opportunity to confirm the subjects' understanding of the task. Further, these practice runs exposed the subjects to the range of velocities employed in the study and the physical range of joystick motion, thus providing subjects with the information necessary to effectively scale their joystick responses. Extensive psychophysical training designed to ensure precise calibration of responses or to prevent calibration drift was not performed; the analysis methods described below were therefore selected to assess overall response morphology while remaining insensitive to response scale. The experiment ensued immediately after the practice runs. The subjects were kept in complete darkness for the entire duration of the experiment. Communication with the subjects was continuously maintained with an intercom.


Motion profiles were presented consecutively in pseudo-random order during a single session that lasted 60–90 min. A small rotation of the chair axis was made during each inter-trial period to realign the subject's IA axis parallel to the direction of motion. Subjects were naive to the presentation order. Postexperiment interviews confirmed that all subjects were unable to assess how the motion was generated and where their instantaneous location was at any point during the experiment.


Trajectory-shift trials were conducted immediately after the cue-dissociation experiment on a subset of five subjects. Trajectory-shift trials were presented consecutively in a single session lasting ∼3 min, with a 10 s intertrial interval. The direction of the trajectory-shift was pseudo-randomized so that subjects could not predict whether a subsequent trial shifted forward (i.e., +30° from the IA axis) or backward (i.e., −30° from IA). After each trial, the chair axis was slowly rotated to reorient the subject's IA axis parallel to the sled. For each subject, at least eight forward and backward trajectory-shift trials were performed.

Data collection and analysis

Recordings were conducted using a PC running custom data-acquisition software developed in LabVIEW RT (National Instruments, Austin, TX). Voltages from the joystick (corresponding to reports of both IA and NO velocity) as well as voltages proportional to the position and velocity of each motion axis were digitized with 12-bit resolution at 100 Hz and stored for off-line analysis. Sound pressure and acceleration during motion profiles were sampled at 25 kHz in separate trials.


Responses from the second set of initial practice trials in the light were used to derive normalization factors for each subject's joystick voltages. For each subject, the steady-state joystick responses were linearly regressed against actual translational velocity to generate a calibration factor corresponding to a mean unity response gain in the light. This psychophysical calibration may well have drifted during the course of the experiment, and thus any analysis related to response scale is minimal. The analysis tools used to investigate similarities between response morphology and motion cues, on which we base our conclusions, are insensitive to scale and are thus robust despite potential calibration drift.
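One way such a calibration factor could be obtained is sketched below. This is an illustration under stated assumptions rather than the procedure actually used: it fits an ordinary least-squares line with an intercept (the form of the regression is not specified here) and takes the reciprocal of the slope, so that calibrated responses have unity gain on average. The function name and the example voltages are hypothetical.

```python
def calibration_factor(velocities, voltages):
    """Return cm/s per volt from a least-squares fit of voltage on velocity.

    velocities : actual plateau velocities (cm/s)
    voltages   : steady-state joystick voltages for the same trials (V)
    ASSUMPTION : ordinary least squares with an intercept term.
    """
    n = len(velocities)
    mean_v = sum(velocities) / n
    mean_u = sum(voltages) / n
    sxx = sum((v - mean_v) ** 2 for v in velocities)
    sxy = sum((v - mean_v) * (u - mean_u) for v, u in zip(velocities, voltages))
    slope = sxy / sxx          # volts per (cm/s)
    return 1.0 / slope         # multiply joystick volts by this to obtain cm/s

# Made-up example: a subject who deflects 0.2 V per cm/s.
print(calibration_factor([10.0, 25.0, 45.0], [2.0, 5.0, 9.0]))
```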

To analyze the effects of extra-otolith and otolith cues on translation perception, we compared perceived velocity to actual IA velocity and sled speed. The latter signal is proportional to the intensity of noise and vibration cues associated with mechanics of sled motion (all velocity-dependent). For sled-only trials, where IA velocity equals sled velocity, response profiles resembled sled signals. Correlation analysis served to quantify the linear relationship between perceived head velocity and actual sled velocity. Correlation coefficients were calculated between perceived velocity and a time-delayed version of sled velocity. These delays, ranging from 0 to 2.5 s in 0.01 s intervals, were introduced to compensate for psychophysical delays inherent in reported percepts. The delay associated with maximum correlation coefficient for each trial was recorded and used for further analysis.
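The delay search described above can be sketched as follows. This is a minimal illustration, assuming 100-Hz samples (so that 0.01-s steps correspond to single-sample shifts) and assuming that non-overlapping samples are simply discarded at each shift, a detail not specified here. `pearson` and `best_delay` are hypothetical helper names.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0                      # correlation is undefined for a constant signal
    return sxy / math.sqrt(sxx * syy)

def best_delay(stimulus, response, fs=100.0, max_delay_s=2.5):
    """Return (delay in s, correlation) for the stimulus delay that fits best."""
    max_shift = int(round(max_delay_s * fs))
    best_corr, best_shift = -2.0, 0
    for shift in range(max_shift + 1):
        # Pair stimulus[t - shift] with response[t]; drop samples without a partner.
        s = stimulus[:len(stimulus) - shift]
        r = response[shift:]
        c = pearson(s, r)
        if c > best_corr:
            best_corr, best_shift = c, shift
    return best_shift / fs, best_corr
```

Applied to one trial, `best_delay(sled_velocity, perceived_velocity)` would return the delay and the maximum correlation coefficient retained for further analysis; both argument names are placeholders for a subject's recorded signals.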

During R-theta trials, in which sled velocity (and associated nondirectional cues) differed from actual IA velocity, response profiles became complex, requiring a different analytic approach. While perceived velocity often appeared morphologically quite similar to sled speed (cf. Figs. 6 and 7), correlation analysis was not always sufficiently sensitive to demonstrate this relationship. Therefore we employed a measure of mutual information, an information theory tool from communication systems engineering (Shannon 1948), which has previously been successfully applied to biomedical research (Dayan and Abbott 2003; Jin et al. 2004).

FIG. 3.

Spectrograms of auditory (A) and vibrational cues (B) during motion profiles used in these experiments. The abscissa shows time, and the ordinate shows frequency with magnitude represented by shade. For auditory cues, the intensity scale is in units of V/Hz. This scale is dictated by the audiometer for which 0 V reflects a floor of 50 dB SPL, and 3.16 V is the maximum audiometer output, corresponding to 110 dB SPL. Intensity units for vibration are dB/Hz. C, bottom traces: velocity of the sled axis during sled only (left) and R-theta (right) V25A25 motion with further details of the V25A25 profile available in Fig. 1. Actual motion along the IA axis was nearly identical for both cases. During sled-only motion, noise and vibration cues are fairly constant during the constant velocity portion of motion, whereas during R-theta motion, these cues are substantially modulated by the speed of the sled axis.

FIG. 4.

Representative data from 1 subject during sled-only translation trials. A: all trapezoidal IA velocity profiles (dotted black lines) are shown and are labeled as defined in Table 1. The mean response profiles of 1 subject for each stimulus type are shown (heavy black line) with response SE reflected by the thinner black lines. B: all of the responses of the subject shown in A elicited by the V25A25 sled-only stimulus. The stimulus profile is shown as response 0 with responses shown in the order they occurred as indicated on the “trial” axis. C: grand average of sled-only responses for all subjects and trials.

FIG. 5.

Representative responses from 2 subjects during bimodal sled-only motion trials. A: responses from this subject resemble those of 7 of 12 subjects who reported an accurate percept of IA head velocity. Inset: average response for this subject. B: responses from a different subject are representative of 5 of 12 subjects, who reported a reversal in the direction of translation perception during the onset of the 2nd half of the stimulus, particularly obvious in this subject for trials 1, 5, 6, and 8. Conventions are the same as in Fig. 4. Because of the intermittent reversals in direction reported for the 2nd half of the motion profile, the 2nd half of the average response (inset) is attenuated and shows larger SE. C: grand average of bimodal sled-only responses (left) across all subjects, and the average of the absolute value of these responses, thus eliminating the effects of directional errors.

FIG. 6.

Representative responses from 1 subject during R-theta trials. Nearly all responses show nontrapezoidal, mostly bimodal, response morphologies resembling sled speed, not IA head velocity. A: mean responses of this subject to every R-theta stimulus type, with the IA velocity shown as the stimulus (dashed line). B: individual responses to the V25A25 stimulus type with the velocity of the sled axis shown for reference (trial 0).

FIG. 7.

Representative responses during R-theta trials from a different subject than in Fig. 6. This subject perceived a reversal of translational motion during the latter half of many R-theta trials. Similar response reversals were observed in 7 other subjects. Conventions are the same as in Fig. 5.

An understanding of mutual information in this context can be facilitated by imagining two message sources in our system, one transmitting IA velocity information and the other sled speed (nondirectional). Both of these messages are passed to a receiver through a nonlinear noisy channel. The message appearing at the receiver, in this case represented by the report of perceived translational motion, should reflect at least one of the sent messages. To determine which of the two sources is best reflected in the received message, we can calculate the similarity in information, or mutual information, between the messages sent by the sources and the message received at the destination. Mutual information between the message received and the message sent ($MI_{RS}$) is a stochastic measure of the dependence between these two waveforms, quantifying the information common to both. It can be interpreted as the information contained in the message sent ($H_S$; i.e., sled velocity or IA velocity), plus the information in the message received ($H_R$), minus the uncertainty between the message received and that sent ($H_{RS}$): $MI_{RS} = H_R + H_S - H_{RS}$.

To calculate each of these arguments, the probability density functions of the response and the stimulus, plus the joint probability density function of the response and stimulus, are required. Larger values of MI indicate stronger relationships between percept and stimulus characteristics (sled speed or IA velocity). A more mathematically rigorous discussion of information theory in neuroscience applications can be found in Dayan and Abbott (2003).

Mutual information was calculated for each trial using a method applicable to continuous distributions (Moddemeijer 1989). The probability density functions (PDF) were estimated using a two-dimensional histogram, facilitating the calculation of mutual information. This histogram was limited to the range of the stimulus and response with the number of bins set to the cube root of the number of 100-Hz samples analyzed rounded to the next highest integer value (thus 9–11 bins in each dimension). For each trial, response latency was estimated by calculating mutual information between percept (i.e., the signal received) and time-delayed (0–2.5 s, incremented at 0.01 s) versions of both sled speed and IA velocity (i.e., the 2 transmitted signals). The time delay for which the MI value was maximal was used as the estimate of response latency and was incorporated into subsequent analyses.
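A histogram-based estimate of this kind can be sketched as follows. This is a plain plug-in estimate offered for illustration: it bins both signals over their own ranges with the cube-root rule described above and combines the two marginal entropies with the joint entropy, but it does not reproduce any further refinements of the cited method. All function names are hypothetical.

```python
import math
from collections import Counter

def n_bins(n_samples):
    """Cube root of the sample count, rounded up to an integer."""
    return math.ceil(n_samples ** (1.0 / 3.0))

def _bin_index(value, lo, hi, k):
    """Index of the equal-width bin (0..k-1) that holds value on [lo, hi]."""
    if hi == lo:
        return 0
    return min(int((value - lo) / (hi - lo) * k), k - 1)

def _entropy(counts, n):
    """Entropy in bits of a histogram given as a Counter of bin counts."""
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def mutual_information(stimulus, response):
    """Plug-in estimate of mutual information (bits) from a 2-D histogram."""
    n = len(stimulus)
    k = n_bins(n)
    s_lo, s_hi = min(stimulus), max(stimulus)
    r_lo, r_hi = min(response), max(response)
    pairs = [(_bin_index(s, s_lo, s_hi, k), _bin_index(r, r_lo, r_hi, k))
             for s, r in zip(stimulus, response)]
    h_s = _entropy(Counter(p[0] for p in pairs), n)   # stimulus (sent) entropy
    h_r = _entropy(Counter(p[1] for p in pairs), n)   # response (received) entropy
    h_rs = _entropy(Counter(pairs), n)                # joint entropy
    return h_r + h_s - h_rs
```

To estimate latency, the same function would be evaluated on time-shifted copies of sled speed and of IA velocity, keeping the shift that gives the largest value for each.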


Velocity percepts reported during trajectory-shift trials were qualitatively analyzed in the IA and NO directions. IA and NO percepts were derived from joystick voltages. For each trial, data were normalized such that the integral of the report of perceived IA velocity (thus the IA displacement) prior to the introduction of the NO component of motion was 50 cm. A “path” of displacement was then calculated by integrating each component of the scaled velocity percepts to yield linear position responses. Although this integration of perceived velocity yields a path in the mathematical sense (i.e., the results have units of displacement in two dimensions), the resulting displacement might differ from the actual perceived path. A close relationship between integrated perceived velocity and perceived path has been verified for rotational motion (Mergner et al. 1996) but not for translational motion. Until such a relationship is better established, this construct should be considered a convenient tool for examining the effects of the change of the direction of motion on the motion perception, but not necessarily an accurate portrayal of perceived path. Paths calculated in this fashion were compared with the actual displacement to determine if any trends in self-motion perception were present among subjects.
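The normalization and integration steps can be sketched as below. This is an illustration under assumptions: trapezoidal integration of 100-Hz samples, a single scale factor applied to both components, and a known sample index `onset_index` marking the introduction of the NO component. None of these details, nor the function names, are specified in the description above.

```python
def cumulative_integral(values, dt):
    """Running trapezoidal integral of values sampled every dt seconds."""
    out = [0.0]
    for a, b in zip(values, values[1:]):
        out.append(out[-1] + 0.5 * (a + b) * dt)
    return out

def perceived_path(ia_vel, no_vel, onset_index, fs=100.0, target_cm=50.0):
    """Return a list of (IA, NO) positions in cm from reported velocities.

    onset_index : sample at which the NO component begins (ASSUMED known)
    target_cm   : IA displacement assigned to the pre-shift interval
    """
    dt = 1.0 / fs
    ia_disp = cumulative_integral(ia_vel, dt)
    pre_shift = ia_disp[onset_index]
    if pre_shift == 0:
        raise ValueError("no IA motion was reported before the shift")
    scale = target_cm / pre_shift                 # same factor for both components (ASSUMED)
    no_disp = cumulative_integral(no_vel, dt)
    return [(scale * a, scale * b) for a, b in zip(ia_disp, no_disp)]
```

Because only the pre-shift IA displacement is fixed, the direction of the resulting path after the shift depends solely on the ratio of the reported NO and IA velocities, not on the overall joystick scale.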



In our efforts to establish the influence of nondirectional cues on the perception of translation, we first quantify these cues and establish the relationship between sled speed and detectable noise and vibration, demonstrating that nondirectional cues are indeed dissociated from IA translational velocity during R-theta motion. We then present the results of the cue-dissociation experiment, which show that despite high correlation between responses and translational velocity during sled-only stimulation, responses to R-theta stimulation show more similarity to nondirectional cues than to the inertial otolith cues. Finally, we show data from a trajectory-shift experiment that shows surprisingly poor assessment of path—much inferior to what one would predict if a perfect double-integration of acceleration were used to generate a path perception.

Nondirectional cues of motion

For both sled-only and R-theta motion, peak noise and vibration increased with increasing velocity of IA motion. For sled-only motion, the peak audible noise level ranged between 56.6 dB SPL for 10 cm/s peak velocity and 71.2 dB SPL at 45 cm/s. For the same IA velocities elicited through R-theta motion, peak noise ranged between 54.8 and 56 dB SPL. Peak vibration was generally largest in the dorsoventral (DV) direction, ranging between 0.21 g for 10 cm/s and 0.80 g for 45 cm/s sled-only motion, and 0.10–0.60 g for corresponding R-theta motion. For both types of motion, naso-occipital (NO) and IA vibration values were often lower than in the dorsoventral direction, with the greatest differences occurring at the highest velocities of motion (see Table 2).

TABLE 2.

Nondirectional cues for 10 cm/s and 45 cm/s peak translational velocities

Spectrographic analysis of noise and vibration (Fig. 3) confirms a strong morphological similarity between these nondirectional cues and the speed of the sled axis. During sled-only trials (Fig. 3, left), noise and vibration were strongly related to the actual motion of the subject. However, during R-theta motion (Fig. 3, right) auditory noise and vibration more closely reflect sled speed and thus reach a minimum halfway through the motion profile.

The relationship between sled speed and nondirectional motion cues showed some nonlinearity, with threshold effects at low velocity, and soft saturation effects at high velocities of sled motion. Thus moderate sled velocities produced much stronger nondirectional cues than low velocity sled movement, and the fastest velocities caused modest enhancement of noise and vibration. Changes in sled velocity caused changes in both the amplitude and frequency content of nondirectional cues.

Cue-dissociation paradigm

Correlation coefficients and mutual information between stimulus and estimated translation velocity were calculated for each trial, with analyses carried out using both IA translational velocity of the subject and sled speed used to generate R-theta motion (representing nondirectional motion cues) for the stimulus signal. The response latency was defined as the delay between the stimulus signal (actual translational velocity, or sled speed as a representation of nondirectional cues) and the response signal that yielded maximum mutual information. The mean latency was 325 ms, with no significant difference between latencies calculated using actual IA velocity and those calculated using sled speed (paired t-test, P > 0.05). Salient response features did not vary between left- and rightward stimulus direction (latency, correlation coefficient, mutual information: t-test, P > 0.05; percept in incorrect direction: chi-square, P > 0.05); therefore all data are sign-corrected for display to appear as rightward motion stimuli.
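The latency definition above (the delay that maximizes mutual information between stimulus and response) can be illustrated with a short sketch. It uses a simple histogram estimate of mutual information as a stand-in; it is not the estimator used in the study, and the bin count, the 1-s search window, the function names, and the synthetic trapezoid with its 300-ms delay are all assumptions chosen for illustration.

```python
import numpy as np

def mutual_information(x, y, bins=16):
    """Histogram estimate of mutual information (in bits) between two signals."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    p_xy = joint / joint.sum()
    p_x = p_xy.sum(axis=1, keepdims=True)
    p_y = p_xy.sum(axis=0, keepdims=True)
    nz = p_xy > 0  # skip empty cells to avoid log(0)
    return float(np.sum(p_xy[nz] * np.log2(p_xy[nz] / (p_x @ p_y)[nz])))

def latency_by_max_mi(stimulus, response, dt, max_lag_s=1.0):
    """Return the response delay (s) that maximizes MI with the stimulus."""
    max_lag = int(round(max_lag_s / dt))
    best_lag, best_mi = 0, -np.inf
    for lag in range(max_lag + 1):
        # Pair stimulus[t] with response[t + lag].
        s = stimulus[:len(stimulus) - lag]
        r = response[lag:]
        mi = mutual_information(s, r)
        if mi > best_mi:
            best_lag, best_mi = lag, mi
    return best_lag * dt, best_mi

# Synthetic trapezoidal velocity (cm/s) and a copy delayed by 30 samples.
dt = 0.01
stimulus = np.concatenate([
    np.zeros(100),
    np.linspace(0.0, 25.0, 100, endpoint=False),
    np.full(200, 25.0),
    np.linspace(25.0, 0.0, 100, endpoint=False),
    np.zeros(100),
])
delay = 30
response = np.concatenate([np.zeros(delay), stimulus[:-delay]])
latency, mi_at_latency = latency_by_max_mi(stimulus, response, dt)
```

The same search can be run with either IA velocity or sled speed as `stimulus`, which is how the two latency estimates compared in the text would be obtained under these assumptions.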


For trials using sled-only translation (sled speed correlated with IA velocity), all 12 subjects reported motion percepts that morphologically resembled IA velocity, as exemplified in Fig. 4. Responses displayed little systematic inter-trial or -subject variability. For stimulus V25A10, which was nearly triangular in shape, most subjects reported a trapezoidal-like plateau that differed from the actual velocity profile near its peak (see Fig. 4A, middle top). This was reflected in a modest mean correlation coefficient of 0.88 between response magnitude and sled speed, and is perhaps related to the previously described nonlinearities associated with noise and vibration during low sled velocity. For all other velocity profiles, correlation coefficients exceeded 0.94.

During dual-trapezoid sled-only trials, all subjects correctly reported percepts of two distinct periods of motion (Fig. 5). Although seven subjects reported two similar and correctly directed movements (see Fig. 5A), five subjects perceived the second period of motion to be in the wrong direction on many trials (Table 3; Fig. 5B).

TABLE 3.

Percentage of trials for which subjects perceived a reversal in the direction of motion


For trials using R-theta translation, subjects typically reported percepts of motion velocity that morphologically differed from actual IA velocity profiles. The vast majority of responses displayed two sequential periods of perceived motion, with an acute attenuation mid-trial (Figs. 6–8), even though IA velocity persisted. Thus perceived velocity more closely reflected sled speed and its bimodal profile of sled motion than true IA motion, even though matching the latter was the goal of the task. All subjects accurately reported the direction of the motion during the first half of the R-theta profile, but 7 of the 12 subjects sporadically reported erroneous perceptions of movement in the opposite direction during the last half of motion (Table 3; Fig. 7, A and B). These erroneous reports of direction corresponded with the reinitiation of sled motion (and associated noise and vibration) during the second half of these trials even though the IA velocity persisted through the entire plateau of translation. Five of the seven subjects who reported directional changes here also reported direction reversals during dual-trapezoid sled-only trials noted in the preceding text. Postexperiment interviews indicate that subjects' perceptions were limited to translations, and they did not experience overt or obvious perceptions of rotation or tilt.

FIG. 8.

A: grand average of the responses elicited during R-theta trials across subjects. B: grand average of the absolute values of the same responses as shown in A, thus eliminating the effects of reports of direction reversals. Thin lines show SEs.

Mutual information measures between percept and sled speed (MISS), as well as between percept and IA velocity (MIIA), were calculated. MI closely followed our qualitative assessment of these subjects' perceptions. For each stimulus type, each subject's mean MISS was larger than mean MIIA (Fig. 9A, paired t-test, P < 0.05). Thus the perceived motion proved more closely related to sled speed (i.e., nondirectional motion cues) than to the actual translational motion. A similar analysis that employed correlation coefficients instead of MI showed similar results; however, the MI analysis proved more robust in the presence of the erroneous direction changes reported by more than half of the subjects. Table 4 shows MI and CC calculations for the data shown in Figs. 6 and 7, calculated using both sled speed and IA velocity as the input signal. MISS was larger than MIIA, regardless of whether a subject had a tendency to perceive reversals (as did subject 5) or not (subject 7). CCSS, however, was only larger than CCIA for the subject who did not experience perceived reversals, unless the absolute value of perceived velocity was used for analysis (CCIA* and CCSS*). As MI did not require using the absolute value of the perception report, we base our conclusions on this analysis.
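Why MI tolerates direction reversals while the correlation coefficient does not can be seen in a small synthetic example. The sketch below is illustrative only: the bimodal "sled speed" waveform, the idealized percept whose second half is reported in the wrong direction, and the histogram MI estimator (a stand-in for the estimator used in the study) are all assumptions.

```python
import numpy as np

def mutual_information(x, y, bins=16):
    """Histogram estimate of mutual information (in bits) between two signals."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    p_xy = joint / joint.sum()
    p_x = p_xy.sum(axis=1, keepdims=True)
    p_y = p_xy.sum(axis=0, keepdims=True)
    nz = p_xy > 0  # skip empty cells to avoid log(0)
    return float(np.sum(p_xy[nz] * np.log2(p_xy[nz] / (p_x @ p_y)[nz])))

t = np.linspace(0.0, 1.0, 400, endpoint=False)
# Bimodal "sled speed": two periods of motion with a minimum mid-trial.
sled_speed = np.abs(np.sin(2 * np.pi * t))
# Idealized percept that tracks sled speed but reverses sign in the second half.
percept = np.sin(2 * np.pi * t)

cc_signed = np.corrcoef(sled_speed, percept)[0, 1]       # reversal cancels the correlation
cc_abs = np.corrcoef(sled_speed, np.abs(percept))[0, 1]  # rectifying restores it
mi_signed = mutual_information(sled_speed, percept)      # MI remains high without rectifying
```

In this idealized case the signed correlation is essentially zero although the percept follows sled speed perfectly in magnitude, whereas MI stays well above zero because the percept still carries most of the information about sled speed.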

FIG. 9.

Mutual information analysis of R-theta motion trials. A: Tukey box plot of mutual information values when the response is compared with actual IA velocity (gray fill), the speed of the sled axis (diagonal lines), and the paired difference between the 2 for each subject (white). Horizontal lines on the boxes indicate the distribution of the data, indicating the 5th, 25th, 50th, 75th, and 95th percentiles, respectively. Data points outside this range are indicated as black dots. B: percentage differences (PDs) between responses vs. actual IA velocity and responses versus sled speed are plotted as a function of mutual information between IA velocity and sled speed (MISI). For PD >0, the reported percept resembles sled speed more than IA velocity (MISS > MIIA). The opposite is true for PD <0. MISI values represent the similarity between the morphology of sled speed and IA velocity for each stimulus.

TABLE 4.

Mutual information (MI) and cross correlation (CC) results for the V25A25 R-theta stimulus type in representative subjects with and without perceived direction reversals

In Fig. 9B, the percent difference (PD) between MISS and MIIA is plotted against the mutual information between IA velocity and sled speed (MISI). For all stimuli, PD was greater than zero, indicating that percepts were more similar to sled speed than to IA velocity (i.e., MISS > MIIA). In addition, a regression between PD and MISI shows that PD increased as the similarity between sled speed and actual IA velocity (MISI) declined (P < 0.001); i.e., the greater the difference between sled speed and IA velocity, the more apparent the similarity between the perceived motion and sled speed. Both observations confirm our qualitative assessment that percepts more closely followed extra-otolith cues (sled speed) than actual IA velocity.

Trajectory-shift paradigm

Figures 10 and 11 show responses and the perceived paths of travel reported by subjects during the trajectory-shift experiment. Figure 10 displays the actual joystick reports from two individual trials that typify the two general categories of observed responses. In addition, a derived version of the “path” is displayed, calculated by integrating the velocity of travel in the IA and NO directions to generate a graph of position in space (with caveats on its interpretation, as described in methods). Each symbol along this calculated path represents a 150-ms time interval. Figure 11 shows these path calculations for the last four trials in each direction (forward or backward trajectory shift) for each of the five subjects used in this experiment.

FIG. 10.

Representative responses showing the 2 types of morphologies observed during trajectory-shift trials. In each panel, the IA (—) and NO (- - -) components of the stimulus are shown in the top trace with the corresponding perceptual responses indicated immediately below. Bottom trace: estimate of path traveled (open circles, with each symbol reflecting a 150-ms time interval) and the actual motion (solid lines) as calculated by a mathematical integration of the IA and NO components of the reported and actual motion profiles, respectively. A: morphologically accurate report of motion consisting of an initial horizontal motion followed by an oblique movement as indicated by simultaneous joystick reports of IA and NO components. In B, the subject reports IA motion until the direction change, at which point a diagonal motion is reported. Before the cessation of motion, the perception of an NO component diminishes, and the subject reports purely IA motion.

FIG. 11.

Trajectory-shift results for all 5 subjects tested. Each row contains responses from a single subject. Left: responses from forward trajectory shifts, and right: backward shifts. Different shades of gray reflect different trials, and each symbol reflects a 150-ms time interval. The actual path is shown as a solid black line. Although trials can consist of either right- or leftward motion, data have been transformed for display purposes to a left-to-right motion. Subjects begin at the right side of the trajectory path, facing toward the top of the figure. IA motion to the right ensues, with a direction change occurring near the middle of the path, such that the subject moves obliquely forward (left column) or backward (right column).

In general, subjects perceived the initial period of IA translation accurately with respect to direction, much like in sled-only trials reported in the preceding text. All five subjects reported purely IA motion in the appropriate direction for the stimulus period preceding the change in path direction. However, after the mid-trial change in motion trajectory, nearly all subjects failed to report the altered motion accurately. This included errors in excursion, direction, or both.

Responses after the change in direction were somewhat variable but shared some salient features. Directional errors, in which subjects did not correctly identify whether the mid-trial trajectory shift was in the forward or backward direction, were quite common. For the last four forward and backward trials, only one subject correctly identified the direction (Fig. 11, subject 6). One subject identified direction shifts in either direction as backward trajectory shifts (Fig. 11, subject 10), and one subject appeared to determine the direction quite randomly (Fig. 11, subject 2).

Responses that closely reflected the actual path (a motion along the IA axis for the first half of the trial, followed by a motion along a diagonal path for the remainder of the trial) were quite rare, even ignoring direction. Subject 2 displayed this behavior on some trials in the backward direction (Figs. 10A and 11, top right), although with sporadic directional errors and unusually long latency for the report of the NO component of travel (which, given our normalization procedures, are observable in Figs. 10 and 11 as excursions in the purely IA direction beyond the midpoint of travel).

The other four subjects all appropriately reported IA motion during the first half of travel, and then a brief period of an NO component of motion associated with the change in trajectory direction, though individual subjects presented distinctive variations of this rough response morphology. Subject 10 (Figs. 10B and 11, 4th row) reported a perception of nearly constant IA velocity throughout the entire motion stimulus, with a brief 0.5-s presence of an NO component that occurred with the direction change. Thus this subject reported an IA motion, followed by a diagonal motion. Prior to the actual cessation of the diagonal motion, the subject's perception of an NO motion component diminished to zero, and the subject ended the trial with a report of purely IA motion.

Subject 3 shows similar response morphology; however, his perception of an NO motion component is not accompanied by an IA component (Fig. 11, row 2). Thus the overall perception is a purely IA motion prior to the direction change, with the change in direction causing a brief period of purely NO motion, which then reverts to purely IA motion prior to the cessation of motion. Subjects 2 and 12 experience similar periods of purely NO motion perception, although in many trials the perception never reverts to a period of IA motion.

Our data indicate that subjects perform remarkably poorly and very rarely replicate the actual path of motion during trajectory-shift trials.



The primary goal of this study was to determine the relative contributions of various cues that typically arise during translational motion in the generation of the perception of translation. These include direction-specific cues related to linear acceleration or velocity, and nondirectional cues related to the practical generation of linear motion (noise and vibration from a vehicle or a sled). Two different experimental paradigms were used to assess how these cues affect the translational percepts of human subjects during passive linear displacement. In a cue-dissociation experiment, we compared subjects' perception of whole-body motion when noise and vibration cues resulting from sled mechanics were correlated with (sled-only trials), or dissociated from (R-theta trials), actual translational velocity. Subjects typically reported consistent and accurate percepts of translational velocity during sled-only trials, where percepts were highly correlated with linear velocity as well as sled speed (R² ≥ 0.88), in accord with previously reported results from similar studies (Berthoz et al. 1995; Israël and Berthoz 1989; Israël et al. 1993, 1997). However, during R-theta trials, in which a different mechanical strategy was used to produce the same linear motion as sled-only trials, subjects' perception of linear velocity was more strongly influenced by nondirectional mechanical cues (vibration and noise) related to sled speed than by actual translational motion and was therefore highly inaccurate.

Inaccuracies during R-theta trials most often consisted of errors in which the morphology of the perceived motion did not reflect the velocity of actual IA translation but rather the amplitude of noise and vibration cues. Thus the report of motion decreased to near zero velocity mid-trial. In addition, directional errors were observed in 7 of 12 subjects. In each case, these directional errors coincided with the onset of noise and vibration during the second of two distinct periods of sled movement within R-theta trials, even though actual linear velocity of the subject remained unaltered. Such errors did not occur during sled-only trials consisting of one velocity trapezoid, but five of the seven subjects who reported reversal errors during the R-theta trials also did not correctly identify the direction of the second motion period during bimodal sled-only trials. There is a possibility that the two subjects with R-theta direction errors were less prone to misidentify direction during the bimodal sled-only trials because of the more robust mid-trial inertial information present, although further studies would be necessary to confirm this interpretation.

In a second experiment, the trajectory-shift paradigm was used to assess translation perception. An initial period of IA translation was altered mid-trial to a new trajectory in the forward or backward direction but without changing the subjects' speed or angular orientation in space. Because translation was generated entirely by the sled, nondirectional translation-related cues were uniform throughout these trials, except for the brief period when sled direction was changed and subjects received a transient NO acceleration and small IA deceleration. Subjects performed well during the early period of motion but were generally confused and highly inconsistent in their percepts of motion after the change in direction. In response to the trajectory shift, subjects often reported a brief NO motion component only near the transition period and then reverted to pure IA motion or no motion. Further, directional errors were commonplace.
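The size of the inertial change at the trajectory shift follows from simple trigonometry. The sketch below assumes the ±30° diagonal described for this paradigm and an illustrative sled speed of 25 cm/s (the speed value and the function name are assumptions made for the example, not values taken from this section).

```python
import math

def shift_components(speed_cm_s, angle_deg):
    """Velocity components after a direction change of angle_deg at constant speed.

    Positive angles denote a forward shift, negative angles a backward one.
    """
    theta = math.radians(angle_deg)
    v_ia = speed_cm_s * math.cos(theta)  # interaural component
    v_no = speed_cm_s * math.sin(theta)  # naso-occipital component
    return v_ia, v_no

speed = 25.0  # cm/s, illustrative value only
v_ia, v_no = shift_components(speed, 30.0)
delta_ia = v_ia - speed  # change in IA velocity across the shift
delta_no = v_no - 0.0    # change in NO velocity across the shift
```

With these numbers the NO component steps from 0 to 12.5 cm/s while the IA component drops by only about 3.3 cm/s, consistent with the description of a transient NO acceleration accompanied by a small IA deceleration while overall speed is unchanged.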

The dramatic differences between sled-only and R-theta trials, and the clear errors during the trajectory-shift trials, suggest that the perception of translational motion (velocity or path) is not derived solely from, or even dominated by, linear acceleration. Perception must include the integration of motion cues across sensory modalities, likely augmented by cognitive skill and expectation. Inertial cues might even be overruled by strong nondirectional cues. This is not an outlandish conclusion given that otolith input provided to the brain by its peripheral afferents (Fernández and Goldberg 1976c) responds faithfully to linear head acceleration and not velocity. Thus during the plateau portions of our stimulus waveforms, afferent input related to acceleration drops rapidly (within milliseconds) to zero. Directional otolith cues during typical linear accelerations (natural or simulated) are brief, yet the literature often relies on a “path integration” process based on a presumed otolith input that is perfectly integrated (Israël et al. 1993; Mittelstaedt 1999a; Mittelstaedt and Mittelstaedt 2001; Wallace et al. 2002). Our data demonstrate that this is not the case.
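The contrast between a perfectly integrated acceleration signal and an imperfect one can be illustrated numerically. The sketch below is a toy model under stated assumptions, not a model proposed in this study: the trapezoidal profile, the first-order leak, and its 1-s time constant are all hypothetical choices.

```python
import numpy as np

dt = 0.01  # s
t = np.arange(0.0, 6.0, dt)
# Trapezoidal velocity: 1-s ramp up, 4-s plateau at 25 cm/s, 1-s ramp down.
velocity = np.interp(t, [0.0, 1.0, 5.0, 6.0], [0.0, 25.0, 25.0, 0.0])
accel = np.gradient(velocity, dt)  # signal an ideal accelerometer would report

def integrate(accel, dt, tau=None):
    """Euler integration of acceleration; a finite tau adds a leak (dv/dt = a - v/tau)."""
    v = np.zeros_like(accel)
    for i in range(1, len(accel)):
        leak = 0.0 if tau is None else v[i - 1] / tau
        v[i] = v[i - 1] + (accel[i - 1] - leak) * dt
    return v

v_perfect = integrate(accel, dt)         # perfect integrator
v_leaky = integrate(accel, dt, tau=1.0)  # hypothetical 1-s leak time constant
mid = len(t) // 2                        # sample at t = 3 s, mid-plateau
```

Under these assumptions `v_perfect[mid]` stays near 25 cm/s, whereas `v_leaky[mid]` has decayed to a small fraction of that value because acceleration is zero throughout the plateau. The leaky estimate also turns negative during the final deceleration, which is one way a reversal of perceived direction could arise from inertial input alone.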

Other cues that typically accompany translation (e.g., sound and vibration) prove influential. In this regard, correlated combinations of different sensory cues are commonplace in daily life, although nuances in the relationships between translational motion and the accompanying set of nondirectional cues can be quite ambiguous, particularly in the absence of vision. For example, when cruising at constant velocity in an automobile, there is no translational acceleration to activate the otolith endorgans. However, noise and vibration cues persist, whether in the form of the humming of an engine or vibration from the tires, and these cues govern perceptions of prolonged linear velocity (along with vision if available), supplementing an accurate perception of motion. This is exemplified by our simple sled-only motion stimulus, where nondirectional cues are highly correlated with motion, and subjects' percepts accurately portray motion despite brief otolith afferent activity.

Nondirectional cues cannot help in every case, however. When these cues change in relation to the same motion, as when transitioning from driving a loud vibrating small car to a large and smooth luxury vehicle, errors occur in speed judgments (Matthews and Cousins 1980), even in the presence of robust visual motion cues. Our R-theta stimulus was designed to decorrelate nondirectional cues from actual motion, and this served to make the perception of motion inaccurate. For our trajectory-shift trials, nondirectional cues are fairly constant throughout motion, despite a fairly large change in the direction of travel and thus cannot help generate more accurate motion perceptions. In this case, subjects do not accurately perceive motion but do indicate that some change has occurred and sometimes correctly show the fore/aft direction of the change.

Other studies have also reported a limited and sometimes subtle role of otolith information in self-motion perception and spatial navigation. Gianna et al. (1996) report that the acceleration threshold for some subjects with bilateral vestibular loss is the same as that of normal subjects and attribute this finding to nonvestibular somatosensory cues. A study that employed a virtual display field consistent with a motion containing both angular and translational components found that the perception of translation was enhanced if the visual presentation was accompanied by a short-duration linear translation (Bertin and Berthoz 2004). Ivanenko et al. (1997b) found that during complex motion with concurrent angular and linear motion, subjects were significantly more sensitive and accurate in detecting the angular component of motion than the linear components. Similarly, Glasauer and colleagues reported that length of travel estimation during locomotion tasks was adequate despite otolith dysfunction, but direction while walking in a triangular path was dependent on proper semicircular canal function (Glasauer et al. 1994, 2002).

Another consideration confounds otolith function. From a kinematic perspective, otolith information (or any linear acceleration signal) is inherently unsuitable as the primary source of translation information. This is partly because indistinguishable changes in linear acceleration can be produced by tilt (i.e., reorientation with respect to gravity) or translation—a consequence of Einstein's “equivalency principle.” As a result, these equivalent acceleration vectors, regardless of their mode of production, are identically coded by otolith afferents. In addition, linear acceleration in one direction cannot be differentiated from deceleration in the opposite direction if the neural integration of the acceleration information is leaky. To use linear acceleration alone to accurately determine self-motion, these two types of ambiguities must be resolved because the demands of translational acceleration on postural correction, eye movement compensation, and spatial orientation mechanisms can be confounded by these ambiguities.
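The tilt-translation equivalence can be made concrete with a one-line calculation: a static roll tilt of angle θ projects a component g·sin θ of gravity onto the interaural axis, so an IA acceleration a is matched by a tilt of arcsin(a/g). The sketch below assumes an illustrative acceleration of 25 cm/s²; that value and the function name are chosen for the example only.

```python
import math

G_CM_S2 = 980.665  # standard gravity in cm/s^2

def equivalent_tilt_deg(accel_cm_s2):
    """Roll tilt (deg) whose gravity component along the IA axis equals accel_cm_s2."""
    return math.degrees(math.asin(accel_cm_s2 / G_CM_S2))

tilt = equivalent_tilt_deg(25.0)  # illustrative IA acceleration of 25 cm/s^2
```

An acceleration of this size is therefore indistinguishable, at the level of the otolith afferents, from a roll tilt of roughly 1.5°.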

Studies of both reflex behavior and perception indicate that the CNS has developed strategies to address ambiguities related to linear acceleration. Although linear head acceleration is accurately transduced by the otoliths (Fernández and Goldberg 1976a–c), studies of perception and the LVOR during purely linear motion suggest that otolith input is subjected to a central frequency-parsing mechanism, possibly serving to partly resolve tilt-translation ambiguity through differing dynamic properties (Guedry 1974; Mayne 1974; Merfeld et al. 2005; Paige and Seidman 1999; Paige and Tomko 1991). In this scheme, a low-pass tilt-related pathway yields responses that are most robust during low-frequency (<0.1 Hz) or prolonged linear accelerations, whereas a high-pass translation-related pathway produces responses that are most robust during high-frequency or transient ones.
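The frequency-parsing scheme can be caricatured with a pair of complementary first-order filters. This is a deliberately simplified sketch: the first-order form, the shared corner frequency, and its value of 0.3 Hz are assumptions chosen for illustration and are not taken from the studies cited above.

```python
import math

def parsing_gains(freq_hz, corner_hz):
    """Gains of complementary first-order low-pass (tilt) and high-pass (translation) paths."""
    ratio = freq_hz / corner_hz
    denom = math.sqrt(1.0 + ratio ** 2)
    tilt_gain = 1.0 / denom           # low-pass: dominates at low frequency
    translation_gain = ratio / denom  # high-pass: dominates at high frequency
    return tilt_gain, translation_gain

CORNER_HZ = 0.3  # hypothetical corner frequency
gains = {f: parsing_gains(f, CORNER_HZ) for f in (0.03, 0.3, 3.0)}
```

In this toy version the same otolith signal is attributed almost entirely to tilt at 0.03 Hz and almost entirely to translation at 3 Hz, while near the corner frequency both pathways respond appreciably, so the two interpretations can coexist.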

An example of incomplete resolution arises during motion at modest frequency (between 0.1 and 1 Hz), where perceptions of both tilt and translation may coexist. This is the “hilltop illusion” (Glasauer 1995). Fortunately, evolution has provided other sensory modalities to operate synergistically with otolith input in guiding navigation, notably including the semicircular canals and vision (Glasauer et al. 2002; Harris et al. 2000). Given multisensory inputs under natural conditions, the brain may indeed solve or counteract most quandaries of otolith and other sensory ambiguities, although not all (Angelaki et al. 1999; Merfeld et al. 2001; Paige and Tomko 1991; Raphan and Cohen 2002).

During natural behavior, where self-motion is generated through active movement, translation information provided by motor command and control, in conjunction with multiple sensory modalities, helps further to overcome the ambiguities of otolith input. Substratal information, such as reafference, proprioceptive cues, vision, and audition, presumably all supplement vestibular information (otolith and canal) in generating accurate percepts of motion. Shortcomings of one modality (e.g., otolith input) are countered by others (e.g., vision). Nevertheless, limitations remain, and their manifestations arise whenever the richness of coherent cross-sensory input is reduced, compromised, or in conflict. In such cases, and particularly for ideotrophic cues, inaccuracy in the interpretation of translational motion is likely.

Aside from sensory cues, cognitive factors assuredly contributed to self-motion perception in our subjects. Cognitive processes related to self-motion perception remain somewhat uncharted but support physiological mechanisms with factors such as prior experience, context, attention, and prediction (Wertheim et al. 2001). The processes involved in deriving motion perception likely rely most heavily on what is cognitively registered as the most reliable set of motion cues. Even so, what is increasingly clear is that the perception of linear motion is tenuous—it cannot, and does not, rely primarily on path-related otolith input or integrated versions of it over time.

There are, however, conditions under which directional inertial information, such as that provided by the otoliths, might prove a more important cue for the perception of translation. The stimuli employed in this study are of relatively modest frequency, and VOR responses to stimuli of this nature are not as robust as for higher frequency motion (Paige et al. 1998; Telford et al. 1997). Natural motion may contain higher frequency content, and translational otolith pathways may thus prove more influential in the generation of motion perception. Further, perceptual mechanisms might well assign more weight to inertial cues when other sensory cues are not present. An early investigation of translation perception in the vertical direction (Walsh 1964) used a fairly noise- and vibration-free seesaw-like mechanism to provide motion and showed large phase leads in the perception of velocity and displacement at frequencies <0.33 Hz (the highest frequency employed in the study). Such phase leads are observed in the VOR to low-frequency translation, and suggest that otolith cues played a larger role in the earlier work than in the current study, possibly because of minimal noise and vibration cues.

The stimuli employed in the current study, which are of modest frequency content and contain ample nondirectional cues, are not optimal for a thorough investigation of the contribution of otolith pathways to the perception of translational motion. However, they provide compelling evidence that nondirectional cues might have played an important, or even dominant, role in some path integration studies that have attributed robust perceptions during similar stimuli to otolith mechanisms. Otolith and other signals with directional information might still play a role in determining path for low-frequency stimuli such as those employed in this study, but this cue might be most influential at the onset or offset of motion, where erroneous reports of direction are extremely rare. There is also the intriguing possibility that there is a continuous assessment of all motion cues, along with a determination of which are the most reliable at any given moment. Appropriate weights for each sensory modality can then be assigned to generate the percept of motion. Thus when inertial information is the only cue present, that cue plays the major role in translation perception. When compelling nondirectional cues are available, we make use of our lifelong experience of a robust relationship between such cues and motion and thus use these cues as a major contribution to motion perception. Bertin and Berthoz show us a more subtle combination of cues, where a small otolith signal enhances the percept of a translational motion component while viewing a virtual visual display consistent with translation along an arc (Bertin and Berthoz 2004). This sensory reweighting process is well established in models of postural control (Forssberg and Nashner 1982) and has recently been adapted to models of vestibular function and spatial orientation (Zupan et al. 2002).
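One common way to formalize such reweighting is an inverse-variance (reliability-weighted) average of cue-specific estimates. The sketch below uses that scheme purely as an illustration; it is not the model of the studies cited above, and the function name, cue estimates, and variances are hypothetical values chosen to mimic a mid-plateau moment at which the inertial estimate has decayed.

```python
import numpy as np

def weighted_percept(estimates, variances):
    """Combine cue-specific velocity estimates using inverse-variance weights."""
    estimates = np.asarray(estimates, dtype=float)
    reliabilities = 1.0 / np.asarray(variances, dtype=float)
    weights = reliabilities / reliabilities.sum()
    return float(weights @ estimates), weights

# Hypothetical mid-plateau estimates (cm/s): inertial cue ~0, nondirectional cue ~25.
percept, weights = weighted_percept([0.0, 25.0], [16.0, 4.0])

# With no usable nondirectional cue (infinite variance), the inertial cue dominates.
percept_inertial_only, weights_inertial_only = weighted_percept([0.0, 25.0], [16.0, np.inf])
```

With these numbers the nondirectional cue receives a weight of 0.8 and the combined percept is 20 cm/s. When that cue carries no reliability, the percept collapses onto the inertial estimate, in line with the suggestion that inertial information plays the major role only when it is the sole cue present.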

In summary, we conclude that nondirectional cues of motion are employed in the generation of motion perception. Further, poor performance during the trajectory-shift protocol shows an imperfect integration of head acceleration to yield estimates of head velocity. Earlier reports of robust otolith-based path integration likely reflect the artifactual influence of nondirectional cues of motion. Our conclusions are further supported by an alternative strategy to the current study, which is to assess translational motion perception when nonotolith cues are eliminated (or nearly so), thereby isolating otolith influences as well as any cognitive processes that might exploit them. We recently have done just that at NASA Ames Research, using a linear sled riding on air bearings, which eliminates sound and vibration cues (Seidman et al. 2003). During trapezoidal and sinusoidal linear motion, translational velocity perception typically displayed high-pass dynamics that more closely resembled simultaneously recorded LVOR responses than actual motion profiles. These findings further illustrate the limited utility of otolith input to guide the perception of translation for anything but brief transients of motion. The notion of “path integration” must derive from complex processes that have little to do with ongoing otolith influences.


This work was supported by National Institutes of Health Grants DC-04153, DC-01935, DC-005409, and EY-01389 (to the Center for Visual Science).


We acknowledge M. Gira and L. Nagy for technical support. We thank Dr. R. Moddemeijer whose mutual information analysis source codes (∼rudy/matlab/) were employed in our data analysis.


  • The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

