## Abstract

The monitoring of one's own spatial orientation depends on the ability to estimate successive self-motion cues accurately. This process has come to be known as path integration. A general feature of sequential cue estimation is that the history of previously experienced stimuli, or priors, biases perception. Here, we investigate how, during angular path integration, the prior imparted by the displacement path dynamics affects the translation of vestibular sensations into perceptual estimates. Subjects received successive whole-body yaw rotations and were instructed to report their position within a virtual scene after each rotation. The overall movement trajectory either followed a parabolic path or was devoid of explicit dynamics. In the latter case, estimates were biased toward the average stimulus prior and were well captured by an optimal Bayesian estimator model fit to the data. However, the use of parabolic paths reduced perceptual uncertainty, and a decrease in the average size of the bias, and thus in the weight of the average stimulus prior, was observed over time. The produced estimates were, in fact, better accounted for by a model in which a prediction of rotation magnitude is inferred from the underlying path dynamics on each trial. Therefore, when passively displaced, we seem able to build, over time, an internal model of the vehicle's movement dynamics from sequential vestibular measurements. Our findings suggest that in ecological conditions, vestibular afference can be internally predicted, even when self-motion is not actively generated by the observer, thereby augmenting both the accuracy and precision of displacement perception.

- Bayesian modeling
- internal prediction
- self-motion
- vestibular perception

Information from our senses is used by the central nervous system to drive behavior. Noisy sensory signals create conditions of perceptual uncertainty, which are translated into uncertain thoughts, decisions, or actions. In elaborate, real-life situations, sensory cues are not processed in isolation; they have a history and occur within a context. This implies that perceptual uncertainty can be handled by having noisy inputs shaped by experience-dependent internal representations of the world. Bayesian theory provides a probabilistic instantiation of this concept and offers a practical framework for studying sensory perception [for a review, see Ma (2012)]. It dictates that percepts result from a probabilistic combination of a likelihood function, which reflects the reliability of the sensory measurement of a physical quantity, with an internal estimate of the same quantity derived from prior experience.

Various experimental findings suggest that humans behave as Bayesian estimators in a variety of perceptual tasks (Adams et al. 2004; Berniker et al. 2010; Girshick et al. 2011; Kersten et al. 2004; Kording et al. 2004; Miyazaki et al. 2005; Stocker and Simoncelli 2006; Tassinari et al. 2006; Weiss et al. 2002). However, inferred perceptual priors, such as light coming from above, the cardinal orientation of objects, smooth visual motion, or upright body posture, are acquired over long time scales and are not easily modifiable in experimental settings. Yet, it has also been demonstrated that the recent history of experienced stimuli during a sequential sensory estimation task biases perception and can be modeled within the framework of Bayesian inference (Cicchini et al. 2012; Jazayeri and Shadlen 2010; Petzschner and Glasauer 2011). Therefore, Bayesian inference does not seem to be a sophisticated, metacognitive capacity but instead seems to fall under the rubric of unconscious perception and can occur on very short time scales (i.e., tens of seconds).

In path integration, an initial reference and a continuous estimation of self-motion cues allow tracking one's own location in space when passively displaced. Such an iterative updating could thus also be biased by an endogenous prior, namely, the one incurred from the vestibular stimuli experienced along the displacement path. To examine this possibility, we devised a novel, self-motion estimation task that involved judging one's own orientation within a virtual visual scene subsequent to a passive whole-body yaw rotation in darkness. Position judgments over consecutive rotations of randomized size revealed that vestibular measurements were indeed converted into self-motion estimates biased toward the average rotation magnitude of recent stimulus history. In line with findings pertaining to visual self-motion cues (Petzschner and Glasauer 2011), we suggest that an optimal Bayesian estimator (OBE) model accounts for these experience-dependent errors and can be extended to the vestibular modality as well. However, in ecological conditions, the spatiotemporal sequence of natural motion stimuli is not random; it is dynamically generated [e.g., Carriot et al. (2014); Dobs et al. (2014)]. The internalization of the structure of these dynamics makes the sensory events predictable. A prior based on this prediction could compete with and override the average stimulus prior and result in unbiased estimates. We demonstrate that by having the self-motion cue sequence follow a dynamic path (i.e., a nonrandom motion profile), subjects indeed seem to use these intrinsic path properties to resolve both the perceptual uncertainty and estimation biases imparted by sensory noise and Bayesian priors, respectively.

Therefore, we suggest that the particular kind of angular path integration task performed in our experiments is not uniquely carried out by a continuous integration of head angular velocity signals from the vestibular system (McNaughton et al. 2006; Taube et al. 1990; Zhang 1996). In addition to these bottom-up sensory inputs, a variety of endogenous estimates is also included in the computation of head direction along the displacement journey.

## MATERIALS AND METHODS

### Participants

Seven healthy adults, naïve to the aims of the study, with normal or corrected-to-normal vision and no history of inner-ear disease, participated in the first experiment (two women, aged 22–25 yr), eight different subjects participated in the second experiment (four women, aged 23–26 yr), and seven additional subjects participated in the eye-tracking measurements (two women, aged 20–31 yr). All participants gave informed consent and received monetary compensation at 20 Swiss francs/hour. The studies were approved by a local ethics committee and were conducted in accordance with the Declaration of Helsinki.

### Experimental Setup

Yaw rotation vestibular stimuli were delivered in complete darkness by a cockpit-style chair, digitally servo controlled (PCI-7352) with highly precise positioning (±0.1°). Subjects were restrained comfortably with a five-point racing harness, feet straps, and additional cushioning. The head was stabilized using a neck pillow and chin and forehead rests. Rotation profiles were precomputed and specified the chair's instantaneous angular position at a rate of 100 Hz. The rotations' velocity profile *v*(*t*) was a single cycle of a 0.77-Hz raised cosine function

$$v(t) = \frac{d}{T}\left[1 - \cos\left(\frac{2\pi t}{T}\right)\right] \tag{1}$$

where *d* is rotation size and *T* its duration (*T* = 1.3 s, the period of the 0.77-Hz cycle, in our case). Instantaneous angular position *p*(*t*) is then specified as

$$p(t) = \frac{d}{T}\left[t - \frac{T}{2\pi}\sin\left(\frac{2\pi t}{T}\right)\right] \tag{2}$$

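The motion profile can be sketched numerically. This is an illustrative reconstruction (assuming the raised-cosine form above and taking *T* as the period of the quoted 0.77-Hz cycle, i.e., ∼1.3 s), not the actual chair-control code; the velocity command starts and ends at zero, peaks at 2*d*/*T*, and integrates exactly to the commanded rotation size *d*:

```python
import numpy as np

def rotation_profile(d, T=1.3, rate=100):
    """Raised-cosine velocity command (Eq. 1) and its integral, the
    angular position command (Eq. 2), sampled at `rate` Hz."""
    t = np.arange(0.0, T, 1.0 / rate)
    v = (d / T) * (1.0 - np.cos(2.0 * np.pi * t / T))                      # Eq. 1
    p = (d / T) * (t - (T / (2.0 * np.pi)) * np.sin(2.0 * np.pi * t / T))  # Eq. 2
    return t, v, p

t, v, p = rotation_profile(d=30.0)
# v rises smoothly from zero and returns to zero; p ends at ~30 degrees
```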
During response periods, a 360° panoramic scene, a real-world image, was presented stereoscopically on a large chair-fixed display. The 22-in. display faced the subject at a distance of ∼29 cm, thus covering ∼80° of horizontal and ∼56° of vertical visual angle. The subject and the display were physically enclosed so as to eliminate any visual cues that might emanate from the stationary surroundings during rotations. The stereoscopic stimulus was generated by an Nvidia Quadro FX 3800 graphics card using the OpenGL quad-buffer mechanism. The stimulus was programmed in Python and viewed with the Nvidia 3D Vision Kit (active shutter glasses), paired with a Samsung SyncMaster 2233RZ display (120-Hz refresh rate) via an infrared transmitter. Masking white noise was delivered over earphones at all times.

Monocular eye movements were tracked in total darkness (infrared illumination at 940-nm wavelength) with the EyeLink 1000 system (SR Research, Kanata, Ontario, Canada) at a 1-kHz sampling rate. The eye tracker was mounted on a custom-built, chair-fixed frame that allowed placement of the optical objective at adjustable distances and heights relative to the subject's line of sight. The subject's eyes were, however, positioned ∼10 cm posterior to the rotation axis, which resulted in attenuated vestibulo-ocular reflex (VOR) gains (Crane et al. 1997). A liquid-crystal display monitor was used for calibrating eye position and was then removed from the setup at the start of each experimental session.

### Experimental Procedures

Across experiments, we rotated participants stepwise through a virtual 360° panoramic scene by asking them, after each step, to estimate their new spatial orientation within that scene (Fig. 1).

#### Self-motion estimation task.

Participants sat head fixed in a chair that delivered successive whole-body yaw rotations. Each trial began by providing the initial position cue in the virtual scene (Fig. 1) for 2 s, as indicated by a central fixation dot. The scene was then turned off, the fixation point remained, and a whole-body yaw rotation in darkness followed. The visual scene then reappeared at a random reset location (Fig. 1) without the fixation point, and the subjects were instructed to align the visual scene, using a joystick, with their new straight-ahead position (Fig. 1), based on the estimated size of the experienced whole-body rotation. The response was time limited to 4 s, after which the central fixation point reappeared and provided the initial position cue for the next trial. No feedback about correct performance was given. The subjects therefore effectively provided an indirect size estimate of the vestibular self-motion cue on each trial (see Fig. 1 for details). The random reset avoided the emergence of an automatized matching strategy between rotation size and the amount of joystick and/or scene displacement. In the eye-tracking experiment, a version of the same task without a visual scene was used. After each rotation, the subjects were instructed to reproduce the size of the angular displacement by sliding their finger along a circular boundary on a touchpad (Logitech, Newark, CA).

#### Self-motion cue sequence.

In the first experiment, rotation sizes were drawn from three different uniform distributions of angles: a low (1.5–36.5°), medium (18.5–53.5°), and high (35.5–70.5°) range. Each range comprised 11 equally spaced angular values, each tested 50 times in randomized order. The three experimental conditions were separated by at least 24 h, and their order was counterbalanced across subjects. The total of 550 trials in each condition was divided into 10 experimental daily sessions of 55 trials, lasting ∼7.5 min each. Subjects were given self-managed, short breaks after each session. The starting position in the scene and the direction of rotation (left or right) were assigned randomly at the start of each session. Rotation direction switched half-way through the 55 trials. In the second experiment, the sequence of rotation sizes was generated according to

$$d_i = u\left(i - \frac{N}{2}\right), \qquad i = 1, \ldots, N \tag{3}$$

where *d*_{i} is the angular displacement on trial *i* (its magnitude giving the rotation size), *u* is the rate of change of *d* over trials, and *N* is the number of trials in a given session. Therefore, the absolute position over trials follows a parabolic path according to

$$p_i = \sum_{j=1}^{i} d_j = \frac{u}{2}\,i\,(i + 1 - N) \tag{4}$$

(see Fig. 4*A*). We tested 10 sequences with *N* = 50, each corresponding to a different rate *u* (1.44, 1.6, 1.76, 1.92, 2.08, 2.24, 2.4, 2.56, 2.72, and 2.88) in random order. Different rate values yield different rotation sizes in each session and thus prevent memorizing or building a look-up table of the displacement sequence. The same 10 sessions, each lasting ∼7 min, were then repeated, once again, on a second day. As a control condition, the temporal order of the same trials was shuffled to produce identical overall displacements but devoid of intrinsic dynamics (see Fig. 4*A*). The 10 shuffled paths were also tested twice on separate days in the same group of subjects, either before or after the 20 sessions with parabolic paths. The order was counterbalanced across subjects. In the eye-tracking experiment, the same 10 parabolic and shuffled paths were tested once with each subject on separate days.
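One reading of this path-generation rule can be sketched as follows (a minimal illustration under the assumption that step sizes change linearly by *u* per trial and reverse direction at mid-session; the exact indexing of the original sequences is not preserved here). The cumulative displacement then traces a parabola whose second difference is constant and equal to *u*:

```python
import numpy as np

def parabolic_path(u, N=50):
    """Signed rotation steps that shrink linearly toward the turning
    point and then grow again; their cumulative sum (the absolute
    position over trials) follows a parabolic path."""
    i = np.arange(1, N + 1)
    d = u * (i - N / 2)        # hypothetical indexing, not the original's
    position = np.cumsum(d)    # parabolic displacement path
    return d, position

d, pos = parabolic_path(u=1.6)
# the mean rotation size equals u*N/4, the factor used later to
# standardize sizes across sessions (see Data Analysis)
```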

#### Training and bias correction.

All subjects underwent familiarization with the experimental setup and extensive training on the task before starting the actual experiment in each daily session. They were first given unlimited time to learn the visual scene by moving through it with the joystick on the stationary chair. The learning was interrupted regularly by 2-min sessions, during which stepwise chair rotations were paired with a congruent visual rotation of the scene. The visual scene was rotated on the chair-fixed display synchronously by an equal amount in the opposite direction, using the same rotation profile as the chair rotation (i.e., it remained earth stationary). The subjects were instructed to maintain visual fixation on a central dot, passively observe the visual motion, and focus on the size of each rotation step. These sessions were intended to consolidate the mapping between chair-rotation size and the corresponding amount of visual-scene displacement. Each 2-min session comprised 65 rotations of random size (range: 5–65°) and spanned the complete 360° scene six times (three times in each direction). The subjects were then trained on the actual task. In the first experiment, training sessions were identical to the experimental sessions, unbeknownst to the subject. The SD of the difference between estimated and veridical rotation size was calculated for each session, and training was deemed complete when this value did not decrease any further on two consecutive sessions. The training typically lasted between four and eight sessions, and data collected on these training trials were discarded. Because of the absolute size estimation nature of the task, each subject still had an idiosyncratic tendency to consistently underestimate or overestimate the true magnitude of the vestibular stimulus.
We corrected this intrinsic error in the first experiment by subtracting the average difference between actual rotation size and reported estimate in a given session from all estimates in that session (Cicchini et al. 2012). The underlying assumptions are that all rotation sizes are subject to the same intrinsic error (i.e., the error is independent of stimulus magnitude) and that the experience-dependent biases (see results) are symmetric with respect to the mean of the tested range of angles in each of the three conditions. However, in our modeling approach (see below), Weber's law impinges on both the sample variance and the sensory weight when computing the posterior estimate (i.e., the same Weber ratio parameter). This predicts, among other things, that perceptual biases within individual test sets are asymmetric around the mean test size. The smaller the range, the more symmetric the predicted estimation bias is about the average of the tested range. Therefore, the symmetric assumption of the corrective procedure used is approximately valid in the first experiment, due to the limited ranges (i.e., 35°) tested separately in each condition, but invalid for the wide range of stimuli tested in the same sessions in the second experiment (as high as 70°). We therefore corrected the idiosyncratic biases experimentally during the training sessions in the second experiment. Training sessions consisted of a random order of rotation sizes, each repeated four times in succession before moving to the next. Following the response given by the subject about the estimated new position, the scene was reset to the true new position at the start of the next trial. The subjects were instructed to use this feedback to correct their overestimation or underestimation tendency on the subsequent repetitions of the same stimulus.
The training was deemed complete when both the SD and the absolute mean value of the error did not decrease on two consecutive sessions, which typically took between four and eight sessions. Single top-up training sessions were introduced after the second, fifth, and eighth sessions of the actual experiment. Training was always done with a random sequence of rotation sizes, regardless of whether parabolic or shuffled paths were being tested on that particular day. The subjects never experienced the parabolic paths outside of the 20 test sessions included in the data analysis. The analytic correction of intrinsic bias in the first experiment mitigates the asymmetry of estimate bias around the mean test angle predicted by Weber's law. This corrective procedure seems to explain why the asymmetry was not observed in the data in *experiment 1* (Figs. 2, *A* and *B*, and 3*B*) but clearly transpired in the data of *experiment 2* (see Fig. 5, *A* and *C*), when the intrinsic biases were corrected experimentally instead.

### OBE Model

On each trial *i*, the vestibular organs provide sensory information *I*_{s} about the true angular displacement stimulus *d*_{i}. This noisy information *I*_{s} encodes a measured displacement size *d*_{m,i}. The variability over repeated noisy measurements *d*_{m,i} is specified as a conditional probability distribution *p*(*d*_{m,i}|*d*_{i}) = *N*(*d*_{i},*σ*_{m,i}^{2}), assumed to be normal with mean *d*_{i} (i.e., assuming unbiased measurements) and with variance
(5)

which implements Weber's law, since it grows linearly with its mean (i.e., larger displacements *d*_{i} yield noisier measurements). The free parameters *w*_{m} and *c* are, respectively, the Weber ratio and a constant term that is needed to account for measurement noise at low *d*_{i} values. The neural representation of a particular vestibular measurement, when considered as a function of *d*_{i}, is known as a sensory likelihood function *p*(*I*_{s}|*d*_{i}) = *N*(*d*_{m,i}, *σ*_{m,i}^{2}). The most likely value of this conditional probability density is *d*_{m,i}, and the uncertainty associated with it is given by *σ*_{m,i}^{2}. Therefore, the likelihood function is not fixed for a given stimulus but varies from trial to trial, because it is computed from a different noisy measurement each time. We have previously demonstrated that human observers have access to this likelihood function on each trial, because they can dynamically reweigh visual and vestibular estimates of rotatory self-motion, according to their relative uncertainties on a trial-by-trial basis (Prsa et al. 2012). The prior probability distribution *p*_{obe}(*d*_{i}) = *N*(*d̂*_{pobe,i}, *σ*_{pobe,i}^{2}) specifies the probability of encountering on trial *i* a vestibular stimulus of any particular size. The most likely prior estimate

$$\hat{d}_{pobe,i} = \frac{1}{i-1}\sum_{j=1}^{i-1} d_j \tag{6}$$

is defined as a running average of all rotation sizes experienced in previous trials of a given experimental session. The uncertainty associated with the OBE prior estimate *σ*_{pobe,i}^{2} is the third free parameter of the model. There is indeed evidence that human observers use information from many previous trials to maintain an internal estimate of the average stimulus size whose variance is different from that of sensory measurements (Morgan et al. 2000). Furthermore, this prior mean can be acquired on a trial-by-trial basis (Verstynen and Sabes 2011) and relatively fast (Berniker et al. 2010).

The modern framework of probabilistic models of perception is based on Bayes' rule, which posits that the statistically optimal observer should select the most likely value of the posterior distribution

$$p(d_i|I_s) = \frac{p(I_s|d_i)\,p_{obe}(d_i)}{p(I_s)} \tag{7}$$

which is a normalized product of the prior and the sensory likelihood, as the most reliable (i.e., least uncertain) estimate of *d*_{i}. With the assumed Gaussian distributions, this most likely posterior estimate becomes a weighted average of *d̂*_{pobe,i} and *d*_{m,i}

$$\hat{d}_{robe,i} = W d_{m,i} + (1 - W)\,\hat{d}_{pobe,i} \tag{8}$$

where the sensory weight *W* represents the relative uncertainty of the prior and sensory likelihood

$$W = \frac{\sigma_{pobe,i}^2}{\sigma_{pobe,i}^2 + \sigma_{m,i}^2} \tag{9}$$

The width of the posterior distribution (i.e., the internal uncertainty associated with the posterior estimate) is hence reduced relative to the prior and sensory likelihood uncertainties

$$\sigma_{robe,i}^2 = \frac{\sigma_{m,i}^2\,\sigma_{pobe,i}^2}{\sigma_{m,i}^2 + \sigma_{pobe,i}^2} \tag{10}$$

Assuming that *d̂*_{pobe,i} is noiseless, the predicted sample variance of *d̂*_{robe,i} is

$$\mathrm{Var}\left(\hat{d}_{robe,i}\right) = W^2\,\sigma_{m,i}^2 \tag{11}$$

For *M* experimentally observed samples *X*_{j}, we obtained model parameters that best fit the sample mean and variance by maximizing the data log-likelihood function

$$\log L = \sum_{j=1}^{M} \log N\!\left(X_j;\ \hat{d}_{robe,j},\ \mathrm{Var}\left(\hat{d}_{robe,j}\right)\right) \tag{12}$$

using the *fminsearch* function in Matlab (MathWorks, Natick, MA). We always verified that the fitting procedure was stable with respect to initial parameter values.
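A single trial of the estimator can be sketched as follows. This is a minimal illustration of the precision weighting in *Eqs. 8*–*10*, not the authors' Matlab fitting code, and the parameter values (*w*_{m} = 0.3, *c* = 1, the prior moments) are placeholders:

```python
import numpy as np

def obe_trial(d_true, prior_mean, prior_var, w_m=0.3, c=1.0, rng=None):
    """One OBE trial: draw a noisy vestibular measurement whose
    variance grows with stimulus size (Weber-like noise, Eq. 5),
    then fuse it with the running-average prior (Eqs. 8-10)."""
    if rng is None:
        rng = np.random.default_rng(0)
    meas_var = w_m * d_true + c                   # measurement variance
    d_m = rng.normal(d_true, np.sqrt(meas_var))   # noisy measurement
    W = prior_var / (prior_var + meas_var)        # sensory weight (Eq. 9)
    d_hat = W * d_m + (1.0 - W) * prior_mean      # posterior estimate (Eq. 8)
    post_var = meas_var * prior_var / (meas_var + prior_var)  # Eq. 10
    return d_hat, post_var

# a larger-than-average rotation is, on average, pulled down toward
# the prior mean, reproducing the regression-to-the-mean bias
d_hat, post_var = obe_trial(d_true=60.0, prior_mean=36.0, prior_var=50.0)
```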

In the context of iterative estimation of sensory cue magnitudes, a Kalman filter version of the OBE model has also been proposed (Petzschner and Glasauer 2011). In the Kalman filter implementation, an explicit prior is substituted by an iterative two-step update of the posterior estimate and its variance by the estimate and variance of the previous trial. In the first step, the posterior is predicted according to

$$\hat{d}^{-}_{i} = \hat{d}_{r,i-1} \tag{13}$$

$$\sigma^{-2}_{i} = \sigma^{2}_{r,i-1} + q \tag{14}$$

where *q* is a free parameter and represents the noise of the prediction process. In the second step, this prediction is updated with the sensory measurement, according to statistical optimality, as

$$K_i = \frac{\sigma^{-2}_{i}}{\sigma^{-2}_{i} + \sigma^{2}_{m,i}} \tag{15}$$

$$\hat{d}_{r,i} = \hat{d}^{-}_{i} + K_i\left(d_{m,i} - \hat{d}^{-}_{i}\right) \tag{16}$$

$$\sigma^{2}_{r,i} = (1 - K_i)\,\sigma^{-2}_{i} \tag{17}$$

where *K* is called the Kalman gain and is analogous to the sensory weight. The predicted sample variance becomes

$$\mathrm{Var}\left(\hat{d}_{r,i}\right) = K_i^{2}\,\sigma^{2}_{m,i} + (1 - K_i)^{2}\,\mathrm{Var}\left(\hat{d}_{r,i-1}\right) \tag{18}$$

We have fit both the OBE model and its Kalman filter version to our data, always by maximizing the data log-likelihood function (*Eq. 12*), and compared their relative goodness of fit using the Akaike information criterion (AIC)

$$\mathrm{AIC} = 2k - 2\ln L \tag{19}$$

where *k* is the number of free model parameters, and *L* is the maximized value of the log-likelihood function. The preferred model is the one with the lowest AIC value.
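The two-step recursion and the model comparison can be sketched as follows (an illustrative implementation of *Eqs. 13*–*17* and *19* with hypothetical inputs, not the fitting code used in the study):

```python
import numpy as np

def kalman_estimates(measurements, meas_vars, q, init_mean, init_var):
    """Kalman-filter estimator: the previous posterior, inflated by
    process noise q, serves as the prior for the next trial
    (predict, Eqs. 13-14; update, Eqs. 15-17)."""
    est, var = init_mean, init_var
    out = []
    for d_m, s2 in zip(measurements, meas_vars):
        pred, pred_var = est, var + q        # prediction step
        K = pred_var / (pred_var + s2)       # Kalman gain (Eq. 15)
        est = pred + K * (d_m - pred)        # measurement update (Eq. 16)
        var = (1.0 - K) * pred_var           # posterior variance (Eq. 17)
        out.append(est)
    return np.array(out)

def aic(k, log_likelihood):
    """Akaike information criterion (Eq. 19): lower is better."""
    return 2 * k - 2 * log_likelihood

est = kalman_estimates([30.0] * 20, [10.0] * 20, q=1.0,
                       init_mean=0.0, init_var=100.0)
# with a constant stimulus, the estimate converges onto the true size
```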

### Motion Dynamics Estimator Model

In the case of parabolic paths, we could extract from *Eq. 4* the equations of step-wise chair motion. We could subsequently hypothesize that the subjects obtain from these equations an estimate of the state (a vector of position and velocity) of self-motion on each trial *i*. An internal model of self-motion dynamics, similar to that for hand movements in the context of sensorimotor integration (Wolpert et al. 1995), could then be proposed. If the subjects do actually use the dynamic properties of the parabolic paths to predict internally the position within the visual scene on a given trial, then this task is also solvable by merely internalizing the rate of change *u* to predict *d*_{i}. Therefore, we opted for this more parsimonious alternative and define the most likely motion dynamics estimator (MDE) prior estimate as
$$\hat{d}_{pmde,i} = u\left(i - \frac{N}{2}\right) \tag{20}$$

with prior variance *σ*_{pmde,i}^{2} as a free parameter. The MDE model implementation is otherwise identical to that of the OBE model described above.

### Mixture Model

A mixture model that combines OBE and MDE priors, according to statistical optimality, postulates the existence of a mixed prior

$$p_{mix}(d_i) = N\!\left(\hat{d}_{pmix,i},\ \sigma^{2}_{pmix,i}\right) \tag{21}$$

$$\hat{d}_{pmix,i} = W_{obe}\,\hat{d}_{pobe,i} + (1 - W_{obe})\,\hat{d}_{pmde,i} \tag{22}$$

$$W_{obe} = \frac{\sigma^{2}_{pmde,i}}{\sigma^{2}_{pobe,i} + \sigma^{2}_{pmde,i}} \tag{23}$$

$$\sigma^{2}_{pmix,i} = \frac{\sigma^{2}_{pobe,i}\,\sigma^{2}_{pmde,i}}{\sigma^{2}_{pobe,i} + \sigma^{2}_{pmde,i}} \tag{24}$$

where *W*_{obe} is the relative weight of the OBE prior estimate (*W*_{mde} = 1 − *W*_{obe}), and *σ*^{2}_{pmix,i} is the internal uncertainty associated with the most likely mixed prior estimate *d̂*_{pmix,i}. The posterior estimate and its sample variance are then derived by a statistically optimal combination of *p*_{mix}(*d*_{i}) and the sensory likelihood *p*(*I*_{s}|*d*_{i}), as described previously for the OBE model.
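The prior-mixing step can be illustrated with a small numeric sketch (hypothetical values; the fusion rules are the standard Gaussian precision weighting, as used for the likelihood and prior above):

```python
def mix_priors(obe_mean, obe_var, mde_mean, mde_var):
    """Statistically optimal mixture of the OBE and MDE priors: each
    prior is weighted by the variance of the other (relative
    precision), and the mixed variance shrinks below both."""
    w_obe = mde_var / (obe_var + mde_var)       # weight of the OBE prior
    mix_mean = w_obe * obe_mean + (1.0 - w_obe) * mde_mean
    mix_var = obe_var * mde_var / (obe_var + mde_var)
    return mix_mean, mix_var

# when the dynamics-based (MDE) prediction is much more reliable than
# the running-average (OBE) prior, the mixed prior sits close to the
# MDE prediction, reducing the regression-to-the-mean bias
m, v = mix_priors(obe_mean=36.0, obe_var=50.0, mde_mean=60.0, mde_var=5.0)
```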

### Data Analysis

The estimate of rotation size was calculated as the difference between estimated new position and initial scene position (Fig. 1) on each trial. For each experimental session, we removed outliers using an iterative Grubbs' test with significance level 0.1. The test identified overall <2% of estimates as outliers.
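An iterative Grubbs' procedure of this kind can be sketched as follows (a generic implementation with the standard two-sided critical value, not necessarily the exact routine used in the study):

```python
import numpy as np
from scipy import stats

def iterative_grubbs(x, alpha=0.1):
    """Repeatedly apply a two-sided Grubbs' test and drop the most
    extreme value until no further outlier is detected; returns the
    cleaned sample and the removed outliers."""
    x = np.asarray(x, dtype=float)
    outliers = []
    while x.size > 2:
        n = x.size
        idx = int(np.argmax(np.abs(x - x.mean())))
        G = abs(x[idx] - x.mean()) / x.std(ddof=1)
        # two-sided Grubbs critical value at significance level alpha
        t = stats.t.ppf(1.0 - alpha / (2 * n), n - 2)
        G_crit = ((n - 1) / np.sqrt(n)) * np.sqrt(t**2 / (n - 2 + t**2))
        if G <= G_crit:
            break
        outliers.append(x[idx])
        x = np.delete(x, idx)
    return x, outliers
```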

In the second experiment, rotation sizes and their estimates were standardized across different sessions by dividing them by *uN*/4. This yields the same set of standardized rotation sizes, with an average size of 1, for each of the 10 parabolic and shuffled paths. The standardized data were used for model fitting and all further analysis.

Eye-position measurements sampled at 1 kHz were low-pass filtered with a Savitzky-Golay filter (window = 10 points; polynomial degree = 3); eye-velocity traces were computed by taking digital derivatives and aligned to rotation onset of each trial. The slow-phase components of the VOR response were extracted using a custom algorithm based on velocity and acceleration thresholds. Missing values were then extrapolated by fitting a cubic smoothing spline to the slow-phase segments using the *csaps* function in Matlab (MathWorks). Rotations smaller than 10° were not included in the analysis. Each slow-phase eye-velocity trace was normalized with respect to the peak velocity of the corresponding rotation. The peak values of the normalized traces (i.e., the VOR gains) were compared between the parabolic and shuffled path conditions using bootstrap statistics. The latter consisted of calculating the mean VOR gain in each condition 9,999 times on a different subset of trials each time. The different subsets were formed by taking, at random, with replacement, *N* trials from the total set of *N*. Statistical tests were made by assessing the amount of overlap between the bootstrap iterations of two measures. If the measure of interest is *x*, and *x*_{1}^{j} and *x*_{2}^{j} are its estimates in the two conditions obtained from the *j*^{th} bootstrap sample, then the one-tailed bootstrap probability of (*x*_{1} > *x*_{2}) is
$$p(x_1 > x_2) = \frac{1}{B}\sum_{j=1}^{B} I\!\left(x_1^j > x_2^j\right) \tag{25}$$

where *B* = 9,999, and *I* is the indicator function, which is equal to 1 when its argument is true and 0 otherwise. The inequality would be reversed for the probability of (*x*_{1} < *x*_{2}). Therefore, the one-tailed bootstrap *P* value is simply the proportion of (*x*_{1}^{j} − *x*_{2}^{j}) values that are more extreme than 0. The same test was used to compare the AIC values that evaluate the goodness of fit of two competing models (see above).
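A sketch of this resampling comparison (illustrative only; the measure of interest here is the mean, and *B* is reduced for brevity):

```python
import numpy as np

def bootstrap_prob_greater(x1, x2, B=9999, seed=0):
    """Bootstrap probability of (x1 > x2) in the sense of Eq. 25:
    resample each condition with replacement B times, compute the
    measure of interest (the mean) on every resample, and count how
    often the condition-1 value exceeds the condition-2 value."""
    rng = np.random.default_rng(seed)
    x1, x2 = np.asarray(x1, float), np.asarray(x2, float)
    m1 = np.array([rng.choice(x1, x1.size).mean() for _ in range(B)])
    m2 = np.array([rng.choice(x2, x2.size).mean() for _ in range(B)])
    return float(np.mean(m1 > m2))

# for two well-separated samples the probability approaches 1
p = bootstrap_prob_greater([5, 6, 7, 5, 6, 7, 6, 5, 7, 6],
                           [1, 2, 1, 2, 1, 2, 1, 2, 1, 2], B=500)
```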

## RESULTS

### Bayesian Account of Experience-Dependent Biases in Self-Motion Estimation

All subjects tested in the first experiment (*n* = 7) were successfully trained to align the virtual scene with their perceived, new, straight-ahead position subsequent to a passive whole-body rotation on each trial. The angular difference between the estimated postrotation position and initial cue position provided the a posteriori estimate *d̂*_{r,i} of the actual rotation size *d*_{i} (Fig. 1). The variability over trials of *d̂*_{r,i} was positively correlated with increasing *d*_{i} magnitudes (linear regression, mean ± SD: *r*^{2} = 0.62 ± 0.26, *P* < 0.01 for all subjects), suggesting that larger rotation sizes were associated with higher perceptual uncertainty, as implied by Weber's power law (Figs. 2, *A* and *B*, and 3*A*). For each of the three overlapping and separately tested distributions of rotation angles, estimates were consistently biased toward the mean of the distribution; smaller-than-average rotation sizes were overestimated, and larger-than-average were underestimated (Figs. 2, *A* and *B*, and 3*B*). The size of bias was dependent on stimulus size and the tested stimulus range (two-way ANOVA, main effect of rotation size and distribution condition: *P* < 10^{−12}, and *P* < 10^{−4}, respectively). Specifically, absolute bias size increased as test rotations became more dissimilar to the average value of each underlying distribution of stimuli (Figs. 2, *A* and *B*, and 3*B*), and this regression to the mean was accentuated for conditions with higher stimulus magnitudes (fitted slopes to the mean estimates in the small, medium, and big rotation conditions, mean ± SD: 0.71 ± 0.14, 0.62 ± 0.22, and 0.46 ± 0.27, respectively).

The described pattern of biased self-motion estimates can be accounted for by an OBE model, as demonstrated recently for time interval (Cicchini et al. 2012; Jazayeri and Shadlen 2010) and visual displacement (Petzschner and Glasauer 2011) estimates. Accordingly, based on previously experienced stimuli, an OBE internally generates an a priori estimate *d̂*_{pobe,i} (*Eq. 6*) of the current rotation size *d*_{i} and combines it with the sensory measurement *d*_{m,i} of *d*_{i}, according to statistical optimality, to produce the a posteriori estimate *d̂*_{r,i} (*Eqs. 7* and *8*) that dominates the percept. In a given experimental session, as subjects experience, trial by trial, one of the three tested ranges of stimuli, *d̂*_{pobe,i} quickly converges to the average range value of the experimental session. As depicted in Fig. 2*C*, posterior estimates are biased away from their sensory measurements toward this most likely prior value; therefore, rotation sizes greater than *d̂*_{pobe,i} are underestimated, and those smaller than *d̂*_{pobe,i} are overestimated (Fig. 2*C*). Due to Weber's power law (*Eq. 5*) and because the OBE weighs *d̂*_{pobe,i} and *d*_{m,i} according to the relative variances of the prior distribution and sensory likelihood (*Eq. 9*), bigger rotation sizes are measured with more uncertainty and result in larger biases. Simulations of the best OBE model fits to single-trial estimates in a typical subject (Fig. 2*A*) and pooled data points from all subjects (Fig. 2*B*) indeed capture all of the observed data features.

### Characteristics of the Prior

In our OBE model, the prior distribution is not derived from the true distribution of stimuli. It is instead assumed to be Gaussian, and its most likely value *d̂*_{pobe,i} is iteratively updated based on trial history (*Eq. 6*), but its width is not fixed: it is a free parameter that best fits the data of each subject. If the prior did indeed arise from the actual distribution of rotation sizes, then all subjects would have the same prior, and therefore, subjects with more variable estimates would inevitably exhibit larger biases. However, we found no significant correlation between the average variance of all estimates and the average absolute bias of all estimates across our limited set of subjects (linear regression, *r*^{2} = 0.09, *P* = 0.51). Therefore, it seems that the prior distribution, in our case, arises from a noisy process estimating the average rotation size, similar to that proposed by Cicchini et al. (2012). The uncertainty associated with this average estimate is different for every subject and independent of the uncertainty of sensory measurements. If the variance of the prior depended on the sensory noise associated with measuring the rotation sizes included in the average, then the relative uncertainties of the prior and likelihood would be the same across subjects and stimulus ranges and translate into equal amounts of estimation bias. However, the average absolute bias was more than twice as large in certain subjects as in others [mean (lowest, highest) = 4.4° (1.7°, 5.9°)], and the bias increased with higher stimulus magnitudes (see above). It should be mentioned that output noise associated with the production of the motor response was not included in our model; if different for every subject and dependent on stimulus size, such noise could theoretically account for these discrepancies. Furthermore, the prior mean is apparently acquired substantially faster than the prior variance [fewer than 10 trials vs. hundreds of trials (Berniker et al. 2010)], and different learning rates across subjects might also provide an explanation for these results. We abstain from any further derivation of the exact nature of the prior, as it is beyond the scope of this paper.

### Kalman Filter Model

A Kalman filter version of the OBE model has been used to describe self-motion estimates from visual cues in an analogous experimental design (Petzschner and Glasauer 2011). The model iteratively updates, on each trial, the rotation-size estimate and its uncertainty based on the estimate of the previous trial and the current sensory measurement (see materials and methods). Simulations of the best-fit Kalman filter model to our data also captured all of the characteristic features described above (data not shown). The comparison of the AIC of the two models revealed that one subject preferred the Kalman filter version, three subjects showed no significant difference, and the remaining three subjects preferred the OBE model (one-tailed bootstrap test, significance level *P* < 0.05; see materials and methods for details). Therefore, it is inconclusive, at least in our first experiment, whether human subjects use a neural implementation of a Kalman filter to estimate the size of successive self-motion cues, despite the small preference observed for the OBE model.
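The trial-by-trial logic of such a Kalman filter can be sketched as a scalar filter over successive rotation sizes. This is a hedged illustration, not the authors' exact implementation: the process-noise variance `q` and all numeric values are assumptions for the sake of the example.

```python
def kalman_rotation_estimates(measurements, q, w_m, c):
    """Scalar Kalman filter over successive rotation sizes.

    State: the current rotation size, assumed to drift between trials with
    process variance q. Measurement noise follows Weber's law. Each trial
    updates the previous estimate toward the current sensory measurement.
    """
    x, p = measurements[0], 1e6        # diffuse initial uncertainty
    estimates = []
    for d_m in measurements:
        p += q                         # predict: uncertainty grows
        r = (w_m * d_m + c) ** 2       # Weber-scaled measurement variance
        k = p / (p + r)                # Kalman gain
        x += k * (d_m - x)             # update toward new measurement
        p *= (1 - k)                   # posterior variance shrinks
        estimates.append(x)
    return estimates
```

With an increasing stimulus sequence, each estimate lags behind the current measurement by an amount set by the gain, which is how this scheme produces biases similar to the OBE model when trial order is random.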

### Nonrandom Cue Sequences Reduce the Weight of Priors

In the second experiment, a different group of subjects (*n* = 8) completed the same sequential estimation task, but the cue sequence was nonrandom. Successive rotation sizes were linearly decreasing and then increasing over a range of angles covering all three ranges separately tested in the first experiment; the overall movement trajectory through the visual scene thus followed a parabolic path (Fig. 4*A*). As a control condition, the temporal order of the same rotation steps was shuffled to produce trajectories with identical overall displacements but devoid of intrinsic dynamics (Fig. 4*A*). In the shuffled paths, subjects again produced the pattern of estimation biases consistent with the OBE prediction (Fig. 5, *A* and *C*). Sensory weights *W*, which quantify the relative weighting of *d̂*_{pobe,i} and *d*_{m,i} (*W* = 1: unbiased estimates; *W* = 0: complete regression to the mean), were calculated for each standardized rotation size (see materials and methods) from the best-fit OBE parameters for each subject (Fig. 4*B*). The sensory weights were different from unity for six out of eight subjects and, as constrained by the model, decreased with increasing rotation size due to increasingly uncertain vestibular measurements. For the two remaining subjects, best-OBE fits yielded prior variance parameters σ_{p,i}^{2} >> σ_{m,i}^{2} for every *d*_{i}, as they produced highly unbiased estimates across the entire stimulus range (Fig. 5*B*). Sensory weights obtained from the best-OBE model fits to rotation size estimates in the parabolic paths (Fig. 4*B*) were consistently greater across subjects compared with the randomized trial sequence (Fig. 4*B*).

Parameters of the OBE fits obtained in the shuffled condition were used to predict the estimates of the same rotation sizes when their sequence produced parabolic paths (Fig. 5, *A–C*). In this case, a different pattern of biases is predicted, as *d̂*_{pobe,i} does not converge to the mean stimulus value, due to the identical nonrandom trial sequence in each session. As observed in the data of the example subject S8 (Fig. 5*A*) and of the group average (Fig. 5*C*), rotation-size estimates were closer to veridical values (i.e., zero bias) than the OBE prediction. To quantify the amount of bias reduction compared with shuffled data and the OBE prediction and also to simplify the graphical representation, we divided the trial sequence in both experimental conditions into two intervals: bias− and bias+ (Fig. 5, *A* and *C*). Each interval comprises trials in which the OBE model predicts an overestimation (bias+) or an underestimation (bias−) relative to true rotation size. Summing and plotting the errors relative to true values in the two intervals (∑*bias*+ and ∑*bias*−) reveal that biases are reduced considerably in the nonshuffled condition and, moreover, are overestimated by the OBE prediction (Fig. 6*A*). If *d̂*_{pobe,i} is not reset at the start of each session, thus assuming that the prior is based on stimuli experienced in all previous sessions (including training), then the OBE-predicted biases would be similar to those observed for shuffled data (Figs. 5 and 6*A*). Therefore, resetting *d̂*_{pobe,i} at the start of each session provides a more conservative prediction of bias size. For the six subjects with nonzero prior weights (1 − *W*), the Euclidean distances between the origin and data points defined by the average ∑*bias*+ and ∑*bias*− values (Fig. 6*B*) were reduced significantly compared with the shuffled trials (paired two-tailed *t*-test, *P* < 0.01) and the OBE-predicted distances (*P* < 0.05). Because the remaining two subjects were already unbiased in the shuffled condition, the OBE prediction yields zero bias in the parabolic condition, and therefore, no conclusions can be drawn about an eventual reduction of estimate bias due to the presence of displacement dynamics.

These results are consistent with a reduction in the weight of the prior and could theoretically be caused by more reliable vestibular measurements (i.e., narrower sensory likelihoods), higher uncertainty about the prior, or a combination of both. A comparison of parameters fitted to standardized rotation sizes/estimates between the two conditions (Table 1) indeed reveals that both larger *σ*_{pobe,i} values and reduced sensory noise (smaller *w*_{m} and/or *c* values) contributed to a reduction in prior weights and thus smaller perceptual biases.
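The dependence of the prior weight on these two factors follows directly from the precision-weighting rule. A short sketch, with symbols mirroring those in the text and purely illustrative parameter values:

```python
def sensory_weight(d, sigma_p, w_m, c):
    """Sensory weight W for a rotation of size d (degrees).

    W = sigma_p^2 / (sigma_p^2 + sigma_m^2), with the measurement SD
    given by Weber's law: sigma_m = w_m * d + c. W = 1 corresponds to
    unbiased estimates; W = 0 to complete regression to the prior mean.
    """
    sigma_m2 = (w_m * d + c) ** 2
    return sigma_p**2 / (sigma_p**2 + sigma_m2)
```

Larger rotations are measured less reliably, so *W* decreases with rotation size, while a wider prior (larger *σ*_{p}) pushes *W* back toward unity, mirroring both effects reported in Table 1.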

### Prediction of Rotation Size from Motion Dynamics

Prior uncertainty might indeed depend on the nature of the stimulus sequence, but it remains unclear why ordered rotation sizes, as opposed to a shuffled, trial-by-trial presentation of the exact same stimuli, would produce less-noisy vestibular sensations. A reduction of sensory noise is indeed unlikely, given the prevailing view that sensory systems are already optimally adapted by evolution to biologically relevant stimuli (Laughlin and Sejnowski 2003; Machens et al. 2005; Simoncelli 2003). Constraining the OBE model by fixing the *w*_{m} and *c* parameters to their values obtained with shuffled data allows testing whether reduced perceptual biases can be explained solely by higher prior uncertainty *σ*_{pobe,i}. In addition to smaller biases, wider priors entail that self-motion estimates become more variable in the case of parabolic paths. To assess how well the constrained OBE model fits the data in terms of estimate bias and variance, we compared the actual and predicted values of cumulated average bias (∑*bias*+ and ∑*bias*−) as well as mean estimate variance of standardized rotation sizes (Fig. 6*C*). Compared with OBE predictions based on fits to shuffled data, which significantly overestimate the subjects' cumulative biases (*P* = 0.001, paired two-tailed *t*-test), the constrained OBE model unsurprisingly better accounts for their reduction (Fig. 6*C*) but still overestimates them (*P* = 0.02). Also shown are the two unbiased subjects, who were not included in the group average or the statistical tests, as their data do not allow for a comparison of estimation bias between the two conditions. Furthermore, the parabolic paths reduced the perceptual variance, as best exemplified by the data of subject S1 (Fig. 5*B*). The OBE fit with the scenario of modified prior uncertainty cannot account for this reduction (Fig. 6*C*), as revealed by a significant overestimation (*P* = 0.008) of the computed average estimate variance. This latter observation suggests that an additional source of information about rotation size is available and used by the subjects when experiencing the dynamic stimulus sequence.

Therefore, an alternative explanation for reduced perceptual biases is that subjects internalize the constant rate of change *u* in successive rotation magnitudes. They use this additional source of information to predict current stimulus size, according to *Eq. 20*, which partly overrides the prior belief *d̂*_{pobe,i}. Because we varied the value of *u* across sessions (see materials and methods), each path comprised different rotation sizes. Therefore, the only common denominator between the sessions was the nature of the motion dynamics (i.e., parabolic paths). The MDE model (see materials and methods) is constrained to predict zero bias (Fig. 6*C*), and assuming that the sensory noise remains unchanged (fixed *w*_{m} and *c* obtained from OBE fits to shuffled data), the only free parameter *σ*_{pmde,i}, corresponding to the uncertainty of *u*, is fit to the data for the model to match true estimate variance. The best-fit MDE prediction indeed provides a closer match to the experimental data than the OBE model in terms of both average estimate bias (Fig. 6*C*) and average sample variance (Fig. 6*C*).
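The MDE computation can be sketched in the same style as the OBE. Here the dynamics-based prediction, a simplified stand-in for *Eq. 20* (previous estimate plus the internalized rate *u*), plays the role of the prior; names and values are illustrative assumptions, not the fitted model.

```python
def mde_estimate(d_m, prev_estimate, u, sigma_pmde, w_m, c):
    """Combine a dynamics-based prediction of rotation size with the
    current measurement. The prediction is the previous estimate plus
    the internalized constant rate of change u (a stand-in for Eq. 20);
    sigma_pmde reflects the uncertainty associated with u.
    """
    d_pred = prev_estimate + u              # predicted rotation size
    sigma_m2 = (w_m * d_m + c) ** 2         # Weber-scaled sensory variance
    w = sigma_pmde**2 / (sigma_pmde**2 + sigma_m2)
    # An accurate prediction leaves the estimate unbiased while reducing
    # its variance relative to the measurement alone.
    return w * d_m + (1 - w) * d_pred
```

When the prediction matches the stimulus, the combined estimate is veridical on average regardless of the relative weights, which is why the MDE model is constrained to predict zero bias.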

### Perceptual Estimates are Best Explained by a Mixture Model

Contrary to the MDE prediction, the parabolic paths did not reduce the perceptual biases to zero. Across individual experimental sessions, it is apparent that the biases were scattered between those predicted by the OBE and MDE fits (Fig. 6*A*). Therefore, it seems more likely that both *d̂*_{pobe,i} and *d̂*_{pmde,i} are estimated and combined on any given trial to yield a mixed prior *d̂*_{pmix,i}, according to statistical optimality (*Eqs. 21* and *22*). This mixture model, with free parameters *σ*_{pobe,i} and *σ*_{pmde,i} (*w*_{m} and *c* remained fixed as previously), logically provided closer matches to the experimental data (Fig. 6*C*) for subjects with positive cumulative average ∑*bias*+ and ∑*bias*− values. Negative ∑*bias*+ and ∑*bias*− values mean that, as was the case with the two unbiased subjects (Fig. 6*C*), the rotation sizes tended to be underestimated in the bias+ interval and overestimated in the bias− interval, which is opposite to the OBE prediction. The MDE model alone can account for such features if we assume that subjects S1 and S2 simply overestimate the rate *u*. This would provide evidence, in addition to the reduced perceptual variance, for the use of motion dynamics even in the case of initially unbiased subjects, as can be deduced by qualitatively comparing the data in Fig. 5*B*.
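For Gaussian estimates, the optimal mixing of the two priors (the operation *Eqs. 21* and *22* formalize) reduces to precision weighting. A minimal sketch, with illustrative values:

```python
def mixed_prior(d_pobe, sigma_pobe, d_pmde, sigma_pmde):
    """Statistically optimal combination of the stimulus-history (OBE)
    and motion-dynamics (MDE) priors into a mixed prior (a sketch of
    the operation formalized by Eqs. 21 and 22 for Gaussian estimates).
    """
    p_obe = 1.0 / sigma_pobe**2            # precision of the OBE prior
    p_mde = 1.0 / sigma_pmde**2            # precision of the MDE prior
    d_pmix = (p_obe * d_pobe + p_mde * d_pmde) / (p_obe + p_mde)
    var_pmix = 1.0 / (p_obe + p_mde)       # mixed-prior variance
    return d_pmix, var_pmix
```

As *σ*_{pmde} shrinks with learning, the mixed prior is pulled toward the dynamics-based prediction, consistent with the progressive weight shift reported below.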

### A Kalman Filter Provides a Poor Account of the Data

In the first experiment, the Kalman filter version of the OBE model (see materials and methods) showed a small tendency to give a worse account of the data. However, in the case of parabolic paths, the Kalman filter implementation can, in principle, account for reduced biases in and of itself. Because of the ordered sequence of rotation sizes, the sensory estimate on a given trial is very similar to the estimate of the previous trial, and biases close to zero will thus be predicted. The best-fit OBE Kalman filter model, as well as its prediction based on fits to the shuffled data, does indeed account for the reduced biases in the parabolic condition (nonsignificant differences with absolute average bias size; data not shown). A comparison of the AIC of the MDE and Kalman filter models revealed that four subjects showed no significant difference, and the remaining four preferred the MDE model (one-tailed bootstrap test, significance level *P* < 0.05; see materials and methods for details). Moreover, constraining the Kalman filter model, as previously, by fixing *w*_{m} and *c* parameters to their values obtained with shuffled data and fitting the data of parabolic paths resulted in a significant overestimation of the average sample variance (*P* = 0.03; Fig. 6*C*). Therefore, only the MDE and mixture models can simultaneously account for the smaller biases and reduced perceptual variance observed in the subjects' estimates of rotation sizes when sequentially displaced along a dynamic path.

### Progressive Learning of Motion Dynamics

Finally, we also fit the mixture model separately to each of the 10 successive experimental sessions, which were repeated on a second day (Fig. 7*A*). On average, the relative weight of the MDE prior progressively increased (i.e., that of the OBE prior decreased) as subjects were exposed to the parabolic paths over time (linear regression: *r*^{2} = 0.41, *P* < 0.05 on *day 1*; *r*^{2} = 0.23, *P* = 0.16 on *day 2*). This result is suggestive of progressive learning of the dynamics underlying the overall motion through the visual scene. Furthermore, the fact that on *day 2* the subjects, on average, reached the putative asymptotic *W*_{mde} level faster than on *day 1* indicates that the learned dynamics might be partially retained over time. For comparison purposes, we also plot the sensory weight *W* corresponding to the largest standardized rotation size obtained by separately fitting the OBE model to the 10 shuffled paths (Fig. 7*B*). Analogous changes over consecutive test sessions were not observed on either day in that case (linear regression for average data points: *r*^{2} = 0.1, *P* = 0.38 on *day 1*; *r*^{2} = 0.04, *P* = 0.6 on *day 2*). Moreover, smaller average *W*_{mde} weights were obtained when fitting the mixture model across all experimental sessions to the first *N*/2 trials (*W*_{mde} = 0.59) rather than to the last *N*/2 trials (*W*_{mde} = 0.84) of each session separately (paired two-tailed *t*-test, *P* = 0.06). The same was not observed for the sensory weight *W* in the OBE fit to shuffled data between the first (*W* = 0.72) and last half (*W* = 0.67) of trials in each session (*P* = 0.18). This suggests that the motion dynamics were also being learned as the subjects experienced successive vestibular cues within single parabolic paths.

### Predictable Rotation Size Does Not Attenuate the VOR

In an alternative version of the same task (see materials and methods), we tracked the reflexive eye movements produced by seven subjects as they were displaced stepwise along the parabolic and shuffled paths. As previously, they had to estimate the displacement magnitude of each step. The goal was to reveal a possible physiological correlate of the internal prediction signal. We extracted the slow-phase eye-velocity components (see materials and methods) of the recorded eye-movement trace for every trial and normalized it by the peak velocity of the corresponding chair rotation. As depicted by the slow-phase VOR responses of a typical subject (Fig. 8), reflexive eye movements did not differ significantly (*P* > 0.05, one-tailed bootstrap test on the VOR gains) between the two conditions for all subjects. Therefore, being able to predict the magnitude of the vestibular stimulus, by presumably internalizing cue sequence dynamics, seems not to impinge on the magnitude of compensatory ocular reflexes and the associated responses of vestibular nuclei neurons.

## DISCUSSION

In the present study, with the use of a novel path-integration task, we quantified how vestibular estimates of self-motion are influenced by internal signals. We found that human subjects exhibit systematic, experience-dependent biases when estimating self-motion from vestibular cues (Figs. 2 and 5) but also that such biases are mitigated when the overall displacement trajectory is characterized by quantifiable dynamics (i.e., is nonrandom; Figs. 5 and 6). Our Bayesian modeling approach demonstrates that these results can be explained by a probabilistic combination of sensory inputs with top-down predictions based on perceptual priors, as well as an internal representation of path dynamics. These novel findings extend previous demonstrations that vestibular cues pertaining to self-motion perception are processed according to statistical optimality (Butler et al. 2010; Fetsch et al. 2010; Laurens and Droulez 2007; MacNeilage et al. 2008; Prsa et al. 2012; Vingerhoets et al. 2009).

### Internal Model of Passive Displacement Dynamics

In individual sessions of our task, the totality of successive rotations produced a coherent, overall displacement trajectory in the virtual environment. Even though very different from a smooth displacement that would occur in ecological situations, an internal representation of the dynamics characterizing this discretized motion is possible and provides a likely explanation of our results. The equations of such stepwise chair motion could be extracted from *Eq. 4*, and we could hypothesize that on each trial, the subjects maintain a state (a vector of position and velocity) of their body. This would entail that an internal model of passive whole-body movement dynamics, analogous to a forward model of arm (Miall and Wolpert 1996; Shadmehr and Mussa-Ivaldi 1994; Wolpert et al. 1995) or eye plant (Galiana and Outerbridge 1984; Glasauer 2003; Prsa and Galiana 2007) dynamics, was used by the subjects in our experiments. Sensory predictions would not be derived from motor efference copies but from prior sensory inputs. However, in our MDE model implementation, we assumed more parsimoniously that the subjects internalize the rate of change of successive rotation sizes (*Eq. 20*). In either case, this mechanism would fall under the rubric of internal models that represent a solution to a specific equation rather than the dynamic properties of a motor plant. Such internal models have been postulated for visual target motion (Zago et al. 2004), the earth's gravitational force (McIntyre et al. 2001), or for dissociating gravitational from inertial vestibular cues (Angelaki et al. 2004; Merfeld et al. 1999). Internal prediction of vestibular re-afference during active head turns inhibits vestibular neurons otherwise active during passive head-in-space rotations (Angelaki and Cullen 2008).
The possibility that vestibular afference could be internally predicted even in the case of passive self-motion, as, for instance, during passive displacements along a dynamic path, implies that the associated neural processing might, in those instances, parallel that of active head turns.

### Novel Estimation Task

Recent demonstrations of a Bayesian account of perceptual biases that stem from sequentially estimating the magnitude of a physical quantity all used a similar cue reproduction task. The latter consists of producing a time interval with the push of a button (Cicchini et al. 2012; Jazayeri and Shadlen 2010) or a visual displacement by moving a joystick (Petzschner and Glasauer 2011) that matches the length or size of a perceived stimulus presented to the subject on each trial. Therefore, the response is based on a second perceptual estimate—that of the reproduced cue—which is presumably also subjected to experience-dependent biases. Yet, all three studies assume the production stage to be an unbiased estimation, and Jazayeri and Shadlen (2010) explicitly model it as such. If the biases associated with the initial estimate are the same as those associated with the estimate of the reproduction, then no regression to the mean should be observed. In that case, to reproduce an estimate *d*_{i} + *e* of the stimulus *d*_{i}, the subject would respond when a size *d*_{i} + *e* is perceived, which would correspond to a true reproduced size *d*_{i}. For example, the reproduction by walking of an initially walked path length was unbiased when the walking speeds of the two stages were matched (Mittelstaedt and Mittelstaedt 2001). Estimates that regress to the average stimulus size would transpire only if the production stage involves less noisy measurements. Indeed, the reproduction of perceived time intervals separating two visual or auditory events relied on measuring time from tactile and proprioceptive cues associated with holding down a key (Cicchini et al. 2012) or no sensory cue at all (Jazayeri and Shadlen 2010). Visual cues in the self-motion, production-reproduction task of Petzschner and Glasauer (2011) were, on average, the same between the two stages. 
Motion was generated by moving the joystick in both cases, but motion end was predefined in the production stage and user controlled in the reproduction stage. Therefore, during reproduction, the subjects additionally benefited from proprioceptive and motor efference cues congruent with the visual stimulus. Less noisy cues and reduced variance, resulting from optimal multisensory integration (Ernst and Banks 2002), were probably responsible for smaller biases in the reproduced estimates in those studies. Nevertheless, the reported regression-to-the-mean effects were most likely still underestimated. Our novel task design circumvents this issue because the response about current position is not in itself a displacement reproduction. Both the amount and direction of visual-scene rotation in the response period are decoupled from the preceding self-motion stimulus. Therefore, the results reported here provide an accurate account of self-motion perceptual biases. Moreover, by mapping consecutive orientation judgments onto the common panoramic visual scene, we were able to use different sensory modalities for perceiving and reporting the size of self-motion stimuli, thereby accentuating the independence of the two stages.

We believe that our improved task design allowed us to constrain our models to fit the sample mean and sample variance of the estimates with the same set of parameters. The same likelihood functions reflecting Weber's law were used to account simultaneously for increasing variability with stimulus size and for the relative weighting of sensory evidence and prior. Petzschner and Glasauer (2011) fit their parameters only to the mean of the estimates and derive the sample variance independently from the observed linear relationship between mean and SD in the experimental data. Jazayeri and Shadlen (2010) separate their model into two independent stages: an estimation stage and a production stage. The estimation stage simulates the estimate bias and the production stage, the sample variance by fitting two separate Weber ratio parameters. Independently fitting these two features of the experimental data was likely enforced by a task design that resulted in misestimating the true extent of perceptual biases, as explained above.

### Combining Multiple Bayesian Priors

When subjects were displaced along the parabolic paths, their self-motion estimates were best accounted for by a mixture model combining two types of Bayesian priors (Fig. 6*C*). A motion dynamics-based prior was progressively weighted more as subjects were exposed to these dynamics over time, and the prior, based on the average rotation size of the stimulus history, was weighted less (Fig. 7*A*). Human observers can indeed acquire multiple prior distributions by having symbolic contextual cues identify a stimulus as belonging to one of two prior categories (Nagai et al. 2012; Petzschner et al. 2012). These priors might be combined on each trial, according to the relative probabilities of the two categories (Petzschner et al. 2012). However, our results suggest that nonunique prior distributions can arise without an explicit categorization of stimuli. They can be derived from, if available, independent measures of different contextual properties of the sensory cues and recursively updated with experience. For example, nonhuman primates apparently use independent speed and direction priors to initiate a smooth eye-movement pursuit of a visual-motion stimulus (Yang et al. 2012). If one of the prior estimates is an unbiased prediction of true stimulus magnitude, then its contribution reduces both perceptual uncertainty and estimation biases imparted by other priors.

We would argue that such predictive priors are readily available in many circumstances, because natural sensory events are not generated by random mechanisms. Whether it is the position of the eyes on a human face or the displacement of an accelerating vehicle, an ordered spatial-temporal structure of a stimulus renders it predictable. Therefore, the characteristic and widely assumed variance-bias tradeoff of sensory perception, resulting from optimal Bayesian integration, can be remedied by acquiring an internal model of this dynamic structure. The demonstration of the existence of these predictive signals further underscores the growing tendency to think about sensorimotor processes in terms of a complex interplay of sensory and motor signals with multifaceted internal states.

## GRANTS

Support for this study was provided by the Swiss National Science Foundation.

## DISCLOSURES

The authors declare no competing financial interests.

## AUTHOR CONTRIBUTIONS

Author contributions: M.P. conception and design of research; M.P. performed experiments; M.P. and D.J-R. analyzed data; M.P., D.J-R., and O.B. interpreted results of experiments; M.P. prepared figures; M.P. and O.B. drafted manuscript; M.P. and O.B. edited and revised manuscript; M.P., D.J-R., and O.B. approved final version of manuscript.

- Copyright © 2015 the American Physiological Society