To determine the direction of object motion in external space, the brain must combine retinal motion signals with information about the orientation of the eyes in space. We assessed the accuracy of this process in eight laterally tilted subjects who aligned the motion direction of a random-dot pattern (30% coherence, moving at 6°/s) with their perceived direction of gravity (motion vertical) in otherwise complete darkness. For comparison, we also tested the ability to align an adjustable visual line (12° length) to the direction of gravity (line vertical). For small head tilts (<40°), systematic errors in either task were almost negligible. In contrast, tilts >60° revealed a pattern of large systematic errors (often >30°) that was virtually identical in both tasks. Regression analysis confirmed that mean errors in the two tasks were closely related, with slopes close to 1.0 and correlations >0.89. Control experiments ruled out that motion settings were based on processing of individual single-dot paths. We conclude that the conversion of both motion direction and line orientation on the retina into a spatial frame of reference involves a shared computational strategy. Simulations with two spatial-orientation models suggest that the pattern of systematic errors may be the downside of an optimal strategy for dealing with imperfections in the tilt signal that is implemented before the reference-frame transformation.
The topic of this study concerns the integration of visual motion and vestibular signals for spatial perception. One line of investigation in this broad field has concentrated on the visual contribution to the percept of egomotion and posture (Brandt et al. 1974; Lappe et al. 1999; Wertheim 1994). This work showed that persistent large-field visual motion, initially perceived as motion in external space, induces a slowly developing percept of self-motion (vection). It is now widely accepted that large-field optic flow signals in the low-frequency range complement vestibular motion cues covering the high-frequency domain. This account is further supported by neurophysiological studies (Brandt et al. 1998; Henn et al. 1974) and has been incorporated in spatial-orientation models (Merfeld 1995; Robinson 1977; Zupan et al. 2002). Other studies have provided a better understanding of the brain areas involved in the integration of visual motion signals with nonvisual cues for the perception of self-motion (Angelaki and Hess 2005; Bremmer et al. 2002; Fetsch et al. 2007; Gu et al. 2006, 2007; Müller et al. 2005; Page and Duffy 2003). As a result, we now have a good understanding of how optic flow signals contribute to the percept of body posture and egomotion.
Much less is known about the opposite perspective: the involvement of postural information in the perception of visual object motion in space. To estimate the direction of visual motion in an earth-centric reference frame, visual signals coding the direction of motion on the retina must be combined with extraretinal signals coding eye position in space. Much previous work has investigated this issue for voluntary eye movements in head-restrained subjects. Studies on visual motion perception during smooth pursuit, for example, have shown that the brain can take extraretinal pursuit signals into account, despite certain imperfections in this integration process (Freeman et al. 2000; Souman et al. 2005; Wertheim 1994). Recently, neurophysiological studies have elucidated the major neural pathways involved in visuooculomotor integration during smooth pursuit (Ilg et al. 2004; Inaba et al. 2007; Lindner et al. 2006; Newsome et al. 1988; for reviews see Krauzlis 2004; Lisberger et al. 1987). This work has identified brain areas with access to both visual-motion and gaze-motion signals. Thus with respect to this aspect of external motion perception, there is now at least a basic concept of how sensory and motor signals can merge into a coherent percept.
We investigated to what extent laterally tilted subjects use vestibular information about body posture when estimating the direction of visual motion in external space. A further question was whether performance in this task involves the same intriguing misjudgments found in tests of line verticality. Numerous studies on the subjective line vertical have shown that human observers, tilted sideways in darkness, make substantial systematic errors when aligning a luminous line with the direction of gravity (for review see Mittelstaedt 1983). Generally, at large tilts (>60°) these errors suggest underestimation of head tilt and are known as the Aubert or A-effect, as first described by Aubert (1861). In the smallest tilt range (<30°), errors are typically small, although errors of overestimation (the E-effect) have also been reported. Although line settings mostly suggest an underestimation of body tilt, systematic errors in the tilt signal cannot be held accountable since body tilt perception is almost veridical in these conditions (Kaptein and Van Gisbergen 2004; Mast and Jarchow 1996; Mittelstaedt 1983; Van Beuzekom and Van Gisbergen 2000; Van Beuzekom et al. 2001). Uncorrected eye torsion, causing errors in the opposite direction, cannot explain the A-effect either (Howard 1982). Instead, it has been suggested that these large systematic errors are the downside of a computational strategy to improve verticality perception at small tilts (Eggert 1998; Mittelstaedt 1999).
Mittelstaedt (1983) proposed a model that implements this computational strategy by means of an internal bias signal, which serves to correct for the distortion caused by a putative imbalance in the tilt signal due to the unequal numbers of hair cells in the two otolith organs. This internal signal, called the idiotropic vector, is a head-fixed vector that is added vectorially to the estimated direction of gravity derived from the otoliths. Although the idiotropic contribution compensates for the distortion at small tilts, it worsens verticality perception at large tilts, where it accounts for the A-effect. Recently, it has been pointed out that the effect of the idiotropic component of Mittelstaedt's model is equivalent to the role of prior knowledge in the optimal evaluation of a noisy head tilt signal in a Bayesian framework (Eggert 1998). MacNeilage et al. (2007) also suggested that the A-effect in spatial perception could be explained from a Bayesian perspective, but this has never been tested explicitly.
In our experiments, subjects adjusted the direction of visual motion to the perceived vertical (motion vertical) at a range of different body tilts. For comparison, the same subjects were also tested in a classical luminous-line task (line vertical). We hypothesized that errors in the two tasks would be the same, indicating that these errors do not reflect errors in the neural processing of the visual cues themselves, but rather errors in the neural processing that yields the reference frame (gravity perception). We indeed found virtually identical performance in both tasks, with considerable systematic errors at the larger tilt angles, suggesting that there is a shared mechanism in the computation of the motion vertical and the line vertical.
Simulations with two spatial-orientation models—Mittelstaedt's original idiotropic-vector model and a novel Bayesian model—show that the shared pattern of systematic errors probably reflects central handling of imperfections in the sensory tilt signal, before it is combined with signals from each visual subsystem.
Eight subjects (seven male, one female), aged between 23 and 62 yr (mean ± SD: 31 ± 13 yr), provided written informed consent to participate in the experiments. All subjects, including the three authors who were familiar with the purposes of the experiments, took part in the adjustment experiment (see following text). An additional control experiment involved four subjects (one author). All participants had normal or corrected-to-normal visual acuity and were free of known vestibular or other neurological disorders.
Subjects were seated in a computer-controlled vestibular chair with nested gimbals that allowed whole-body rotation about any axis in space. They were firmly secured in the chair using seat belts, adjustable shoulder and hip supports, a footrest, and Velcro straps to restrain feet and legs. The head was firmly fixed in a natural upright position for looking straight ahead, using a padded adjustable helmet. During the experiments, subjects were tilted in complete darkness by rotation about the nasooccipital (roll) axis to a stationary lateral tilt position. The seat was adjusted so that the axis of rotation was aligned with the cyclopean eye. Chair orientation was measured using a digital position encoder with an angular resolution of 0.04° and was recorded on disk.
Visual stimuli for testing verticality percepts were presented on a chair-fixed Philips 15-in. LCD screen with a refresh rate of 60 Hz, mounted at eye level in the frontoparallel plane at a distance of 90 cm from the subject. Due to computational restrictions, movie frames were shown at an effective frame rate of 20 Hz. Visual stimuli were generated in Matlab (The MathWorks) using the Psychophysics Toolbox (Brainard 1997; Pelli 1997). To exclude visual cues about the direction of gravity, a mask with a circular aperture of 14° was mounted in front of the screen. As a further precaution against spurious tilt cues, stray light was reduced by a 2.7 log unit neutral density filter, which kept background luminance of the screen to <0.001 cd/m2. The intensity of test stimuli was 0.2 cd/m2. Vision was always binocular and subjects were allowed to move their eyes freely at all times.
Starting from upright, subjects were roll-rotated in darkness to a tilt angle ρ, with right-ear-down rotations taken positive (see Fig. 1). The chair rotated with a peak acceleration of 50°/s2 to a constant velocity of 30°/s, which was reached within 1 s. After rotation to the tilt position had been completed, a 30-s waiting period followed to allow dissipation of putative canal effects. Then, the subjective motion vertical (Fig. 1A) or the subjective line vertical (Fig. 1B) was tested in a run of 12 sequential trials in the adjustment experiment or 66 trials in the control experiment (see following text). Next, subjects were rotated back to the upright position where they remained for 30 s, with the room lights on, until the next rotation started. Tests for positive and negative tilt angles were alternated regularly. Subjects never received feedback about their performance.
We used an adjustment paradigm to determine both the motion vertical and the line vertical at tilt angles ranging from −120 to +120° at 20° intervals. Subjects performed these tasks, tested in random order, as follows.
In the motion-adjustment task (Fig. 1A), subjects viewed a moving random-dot pattern with 0.3° diameter dots. We used a random-dot pattern to ensure that the motion vertical was based on global motion mechanisms and to discourage the application of a line strategy based on single-dot trajectories (see Global versus local motion in results). The pattern contained 50 dots on average, equivalent to a dot density of 0.3 dot/deg2, of which 30% behaved as signal dots, moving coherently in steps of 0.3° at a speed of 6°/s (for a similar approach see Newsome and Paré 1988). The other 70%, which were noise dots, shifted to a random location in the circular 14° aperture. At each movie frame, dots had a 30% chance of being treated as a signal dot (as in Newsome and Paré 1988). As a result, signal-dot lifetime was limited, with a 9% chance that a signal dot would survive two movie frames, a chance of 2.7% for surviving three subsequent frames, and so on. The subject's task was to use a joystick to adjust the direction of the noisy moving dot pattern so that it moved toward the floor, parallel to the perceived direction of gravity.
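The per-frame dot-update rule described above can be sketched in a few lines. Aperture size, dot count, coherence, and step size are taken from the text; the uniform relocation of noise dots within the aperture and the function and parameter names are our own illustrative assumptions:

```python
import numpy as np

RADIUS = 7.0      # aperture radius (deg); 14 deg diameter
STEP = 0.3        # signal-dot displacement per movie frame (deg)
COHERENCE = 0.3   # per-frame probability that a dot is a signal dot

def update_dots(xy, direction_deg, rng):
    """Advance one movie frame: each dot is independently treated as a
    signal dot (coherent step) with p = COHERENCE, else as a noise dot
    (random relocation inside the circular aperture)."""
    n = xy.shape[0]
    is_signal = rng.random(n) < COHERENCE
    theta = np.deg2rad(direction_deg)
    step = np.array([np.cos(theta), np.sin(theta)]) * STEP
    new_xy = xy.copy()
    new_xy[is_signal] += step
    # noise dots jump to a uniform random position inside the aperture
    n_noise = int((~is_signal).sum())
    r = RADIUS * np.sqrt(rng.random(n_noise))
    phi = 2.0 * np.pi * rng.random(n_noise)
    new_xy[~is_signal] = np.column_stack([r * np.cos(phi), r * np.sin(phi)])
    return new_xy

rng = np.random.default_rng(1)
dots = np.zeros((50, 2))                 # 50 dots on average, as in the text
frame1 = update_dots(dots, 90.0, rng)
```

Because dots are reassigned independently on every frame, a dot survives as a signal dot across two frames with probability 0.3² = 0.09, matching the 9% figure above.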
In the line-adjustment task (Fig. 1B), subjects viewed a luminous line (length 12°, width 0.3°) that was polarized by a bright dot at one end. Subjects used a joystick to adjust the orientation of the line parallel to the perceived direction of gravity so that the dot pointed downward in space. The rotation axis of the line coincided with the subject's roll axis.
The time available for completing each adjustment was 10 s. Line and motion stimuli had random orientations at trial onset. Each combination of task and tilt angle involved 12 successive trials in a single run. In total, there were 26 combinations of tilt angle and task, each of which was tested once in random order in two experimental sessions of about 40 min each. Before testing began, subjects were given sufficient practice trials to get used to the tasks.
When it was found that performance in the line- and motion-adjustment tasks was very similar, the question arose whether subjects had in fact performed the motion task by transforming it into a line task. If this scenario applied, the subject would have derived a percept of motion verticality by temporally integrating the extrapolated motion paths of individual signal dots. The similarity in results would be trivial if this strategy had actually been used. Therefore we designed a control experiment to rule out this possible confound. We used a forced-choice paradigm to quantify the motion vertical and line vertical psychometrically at a large tilt angle where the adjustment experiment had shown substantial systematic errors in the two tasks. In testing the subjective motion vertical, we compared two paradigms: 1) one was designed to impose reliance on a global motion percept for solving the task; and 2) the other enforced a single-dot strategy, which precluded spatial integration. We used the following tasks.
GLOBAL MOTION FORCED-CHOICE TASK.
As in the motion-adjustment experiment, subjects viewed a 30%-correlated random-dot pattern, but motion directions of signal dots were now drawn from a normal distribution with SD of 15° around a mean. Recall that in the motion-adjustment paradigm, all signal dots moved in the same direction. This modification, inspired by the previous work of Dakin et al. (2005), was introduced to deter reliance on local motion cues for solving the task. As a further measure, exposure duration was limited to 200 ms per trial, corresponding to four shifts of the random-dot pattern (five movie frames). Subjects indicated whether the motion direction of the pattern in space was clockwise (CW) or counterclockwise (CCW) from their perceived direction of gravity, using a toggle switch, within a 1.5-s response interval after the stimulus.
LOCAL MOTION FORCED-CHOICE TASK.
This task enforced a single-dot strategy to test whether this would degrade performance relative to the global motion task. The stimulus consisted of a single 0.3°-diameter dot, with the same motion statistics as the dots in the global motion forced-choice task. Thus the stimulus dot moved to a new position in each movie frame, with a 30% probability that it moved like a signal dot and with a 70% chance that it moved like a noise dot. Accordingly, if subjects had in fact used a single-dot strategy in the global motion forced-choice task, performance in both motion forced-choice tasks would be identical. The stimulus was shown for 200 ms, equivalent to four shifts of the single dot, followed by a 1.5-s response period. Using a toggle switch, subjects indicated whether the motion direction of the dot was CW or CCW from their sense of gravity.
LINE FORCED-CHOICE TASK.
In this task, subjects viewed the same luminous line as in the line-adjustment task but only for a brief period of 200 ms, followed by a 1.5-s response period. Using a toggle switch, they had to indicate whether the presented line orientation was CW or CCW from the perceived direction of gravity. The objective behind this test was to verify whether the strong similarity between the global motion vertical and the line vertical, as revealed in the adjustment tasks, was retained in the altered conditions of the psychophysical experiments.
Care was taken to familiarize subjects with all tasks. The local motion task was first practiced at a 100% coherence level (with the single dot behaving as a signal dot). Stimulus coherence was then gradually reduced to the 30% level used in the actual experiment. The global motion and line tasks were practiced using the same characteristics as in the actual tests.
The three control tasks were performed at three different tilt angles: −100°, +100°, and 0° (upright). Psychometric data were collected using the method of constant stimuli, which is based on multiple presentations of test stimuli, in random order, in a predetermined range above and below the perceptual threshold (see Ehrenstein and Ehrenstein 1999). We centered the test range on the subjective vertical estimated from the prior adjustment results. This value was determined from a third-order polynomial curve that characterized the best-fit relationship of the mean verticality settings in the adjustment tasks as a function of tilt angle. In all three tasks we presented the same set of 11 directions, at 0, ±3, ±6, ±9, ±15, and ±25° relative to this value. Each stimulus direction was presented 12 times, yielding a set of 132 responses for each psychometric function. Data for each combination of task and tilt angle were collected in two 112-s runs of 66 forced-choice trials. In each subject, the 18 runs (3 tasks × 3 tilt angles × 2 runs) were tested in random order in two experimental sessions of about 45 min each.
Data analyses were performed off-line using Matlab software (Matlab 7.0, The MathWorks). Response error γ in the motion-adjustment task was defined as the difference between the adjusted motion direction of the dot pattern and the actual direction of gravity (see Fig. 1A). Likewise, the response error in the line-adjustment task was computed as the angular difference between the line setting and the true vertical (see Fig. 1B). Compensation angle β was defined as the motion or line setting relative to the subject's vertical head axis. Response averages and their SDs were calculated using circular statistics (Batschelet 1981). Data points >3 SDs from the mean, considered outliers, were excluded from further analysis. Differences in the results among different experimental conditions were considered statistically significant at P < 0.05.
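The circular statistics used for the response averages reduce to standard resultant-vector computations (Batschelet 1981). A minimal sketch, with function names of our own choosing:

```python
import numpy as np

def circ_mean_deg(angles_deg):
    """Circular mean of angles in degrees: the direction of the mean
    resultant vector of the unit vectors (cos a, sin a)."""
    a = np.deg2rad(np.asarray(angles_deg, dtype=float))
    return np.rad2deg(np.arctan2(np.sin(a).mean(), np.cos(a).mean()))

def circ_sd_deg(angles_deg):
    """Circular SD in degrees, computed from the mean resultant
    length R as sqrt(-2 ln R)."""
    a = np.deg2rad(np.asarray(angles_deg, dtype=float))
    R = np.hypot(np.sin(a).mean(), np.cos(a).mean())
    return np.rad2deg(np.sqrt(-2.0 * np.log(R)))
```

Unlike an arithmetic mean, the circular mean of, e.g., settings at 350° and 10° is 0°, not 180°, which is why it is the appropriate average for angular response data.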
Psychometric data from the global motion forced-choice and the line forced-choice control experiments were analyzed in a standard manner. We calculated the proportion (P) of CW responses for each stimulus direction and fitted a cumulative Gaussian curve using the method of maximum likelihood (Wichmann and Hill 2001). This curve is given by the following function

P(x) = λ + (1 − 2λ) ∫_{−∞}^{x} 1/(σ√(2π)) exp[−(y − μ)²/(2σ²)] dy    (1)

in which x represents global motion direction or line orientation of the stimulus and y is an integration variable that runs within the stimulus domain. The mean value of the cumulative Gaussian μ represents the subject's subjective vertical (motion or line); the SD σ, which determines the slope of the curve, reflects the noise in the subjective vertical, which serves as a measure of the subject's uncertainty. Parameter λ, the lapse rate, which accounts for stimulus-independent errors caused by subject lapses or mistakes, was restricted to small values (0 < λ < 0.06). Lapses may be due to a temporary lack of attention and cannot be attributed to stimulus properties (for details see Klein 2001; Wichmann and Hill 2001). The expected poor performance in the local motion forced-choice task made fitting a psychometric function to the respective data unfeasible. Therefore we compared these data to the global motion data by comparing the deviations of both data sets to the global motion–fit curve (see Global versus local motion in results).
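A minimal sketch of such a maximum-likelihood fit, using SciPy on synthetic counts from a hypothetical observer (μ = 4°, σ = 6°, λ = 0.02); the starting values and bounds are illustrative assumptions, not the procedure used in the actual analysis:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def neg_log_likelihood(params, x, n_cw, n_total):
    """Negative log-likelihood of the CW counts under the psychometric
    curve P(x) = lam + (1 - 2*lam) * Phi((x - mu) / sigma)."""
    mu, sigma, lam = params
    p = lam + (1.0 - 2.0 * lam) * norm.cdf((x - mu) / sigma)
    p = np.clip(p, 1e-9, 1.0 - 1e-9)   # guard the log against 0 and 1
    return -np.sum(n_cw * np.log(p) + (n_total - n_cw) * np.log(1.0 - p))

def fit_psychometric(x, n_cw, n_total):
    """ML fit of (mu, sigma, lambda), with lambda restricted to [0, 0.06]."""
    res = minimize(neg_log_likelihood, x0=[0.0, 5.0, 0.02],
                   args=(x, n_cw, n_total),
                   bounds=[(-30.0, 30.0), (0.5, 50.0), (0.0, 0.06)],
                   method="L-BFGS-B")
    return res.x

# Synthetic data: 11 test directions x 12 repetitions, as in the experiment.
x = np.array([-25, -15, -9, -6, -3, 0, 3, 6, 9, 15, 25], dtype=float)
p_true = 0.02 + 0.96 * norm.cdf((x - 4.0) / 6.0)
n_cw = np.rint(12 * p_true).astype(int)
mu_hat, sigma_hat, lam_hat = fit_psychometric(x, n_cw, 12)
```

The fitted μ then estimates the subjective vertical and σ the subject's uncertainty, as described above.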
As we will subsequently see in results, our subjects made considerable errors at the larger tilt angles in both the motion task and the line task. Based on the assumption of a central mechanism that biases the internal representation of verticality toward the long body axis, Mittelstaedt (1983) proposed a widely accepted model that can account for such errors in visual verticality perception. In the following, we will first briefly describe this model and then introduce an alternative framework that can explain biased verticality percepts with a different mechanism.
Mittelstaedt's idiotropic-vector model
Mittelstaedt (1983, 1986) suggested that systematic errors in verticality perception reflect a central mechanism that compensates for putative systematic errors in the tilt estimate derived from the otoliths. A schematic representation of his model is shown in Fig. 2. The basic idea is that the brain reconstructs tilt angle ρ by combining the signals from the two otolith organs, utricle and saccule, which are arranged in two orthogonal planes. However, since these signals are mediated by unequal numbers of afferents (Rosenhall 1972, 1974), combining them in the straightforward manner implemented in the model yields a distorted estimate of ρ (see Fig. 2, bottom left). To minimize the effect of this distortion at small tilt angles, which are most frequently encountered in daily life, the model invokes an internal bias signal. This internal signal, called the idiotropic vector, is a head-fixed vector that is added vectorially to the estimated direction of gravity derived from the otoliths. The addition of the idiotropic vector biases the tilt estimate toward the head axis, thereby effectively canceling the distortion at small tilts (see Fig. 2, right). The downside of this computational strategy, however, is to worsen performance at large tilts. In a sense, Mittelstaedt's concept echoes earlier ideas formulated in the seminal study by Aubert (1861), which interprets the subjective visual vertical as a compromise between the sensory input providing information about head tilt and the vertical retinal meridian.
In the model, the internal representation of the roll-tilt angle, denoted as β, is specified as a function of ρ by the following relation

β = arctan[Nĝy/(Nĝz + Mz)]    (2)

In this equation, ĝy = sin(ρ) and ĝz = S cos(ρ) represent the neurally encoded gravity components along the head's y-axis and z-axis, as provided by utricle and saccule, respectively. S denotes the saccular gain (S < 1), linked to the ratio of the number of saccular and utricular hair cells, and ρ is the physical tilt angle of the head. Furthermore, Mz is the z-component of the tilt-independent head-fixed idiotropic vector (M) and N = 1/√(ĝy² + ĝz²) is a normalization factor to ensure that the internal representation of the gravity vector has a fixed length. Thus the model has two parameters to determine β: the size of the M vector, an idiosyncratic value, and the normalized saccular gain S. In the model, it is assumed that the visual signal φ̃r is unbiased and can be simply added to the internal tilt representation to obtain the required output, i.e., the line orientation (or motion direction) in space φ̃s. As a result, the systematic errors in the subjective vertical task merely reflect the bias in the internal tilt representation

γ̄ = β − ρ    (3)

which implies that, according to the model, systematic errors in the motion task and line task must be identical.
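Equation 2 is straightforward to evaluate numerically. In the sketch below, the parameter values S = 0.7 and Mz = 0.25 are illustrative choices, not values fitted to our data; with them the model yields near-zero errors at small tilts and a growing A-effect at large tilts:

```python
import numpy as np

def mittelstaedt_beta(rho_deg, S=0.7, Mz=0.25):
    """Internal tilt estimate beta (deg) from Eq. 2: otolith components
    g_y = sin(rho), g_z = S*cos(rho) are normalized to unit length and
    summed with the head-fixed idiotropic vector (0, Mz)."""
    rho = np.deg2rad(np.asarray(rho_deg, dtype=float))
    gy, gz = np.sin(rho), S * np.cos(rho)
    N = 1.0 / np.hypot(gy, gz)          # normalization factor of Eq. 2
    return np.rad2deg(np.arctan2(N * gy, N * gz + Mz))

rho = np.array([0.0, 30.0, 90.0, 120.0])
bias = mittelstaedt_beta(rho) - rho     # systematic error, Eq. 3
```

With these parameter values the bias is zero at upright, small at 30°, and increasingly negative (tilt underestimation, i.e., an A-effect) at 90° and beyond.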
Furthermore, Mittelstaedt's model predicts that the noise in the internal tilt representation σβ is inversely proportional to the length of the resultant vector, which is obtained by summation of the idiotropic vector and the normalized gravity vector (Mittelstaedt 1983). Thus in analytical terms

σβ = C/√[(Nĝy)² + (Nĝz + Mz)²]    (4)

For the normal range of S and Mz parameter values, noise in the internal tilt signal increases with tilt angle, with C a proportionality constant, the third free parameter in the model. Qualitatively, Eq. 4 implies that subjects with a stronger idiotropic vector show reduced scatter at modest tilt angles but increased scatter at large tilts, compared with subjects with a smaller idiotropic vector. In the simulations, we assumed that the noise in the output error σγ depends not only on the noise in the internal tilt estimate σβ, but also on the visual noise σv, according to standard statistical rules for noise combination

σγ = √(σβ² + σv²)    (5)

in which visual noise for lines σvl and visual noise for motion σvm may be different, but independent of line orientation or direction of visual motion on the retina, ignoring the oblique effect (Krukowski et al. 2003; Löffler and Orbach 2001; Luyat et al. 2001).
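Equations 4 and 5 can be evaluated in the same way; the parameter values for S, Mz, C, and σv below are again purely illustrative:

```python
import numpy as np

def mittelstaedt_scatter(rho_deg, S=0.7, Mz=0.25, C=5.0, sigma_v=2.0):
    """Predicted output scatter (deg): sigma_beta = C / |resultant| (Eq. 4),
    combined with visual noise as sigma_gamma = sqrt(sigma_beta^2 +
    sigma_v^2) (Eq. 5)."""
    rho = np.deg2rad(np.asarray(rho_deg, dtype=float))
    gy, gz = np.sin(rho), S * np.cos(rho)
    N = 1.0 / np.hypot(gy, gz)
    # resultant of the unit-length gravity estimate and the idiotropic vector
    resultant = np.hypot(N * gy, N * gz + Mz)
    sigma_beta = C / resultant
    return np.sqrt(sigma_beta**2 + sigma_v**2)

scatter = mittelstaedt_scatter(np.array([0.0, 90.0, 120.0]))
```

Because the resultant vector shortens as the head tilts away from upright, the predicted scatter grows monotonically with tilt angle, as stated above.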
Mittelstaedt's model assigns a major role to a deficiency in the accuracy of the sensory tilt signal to explain the response biases. In the following, we explore an alternative explanation, a so-called Bayesian model, which rather focuses on the precision of the tilt signal. Although the Bayesian account is similar to Mittelstaedt's vector-averaging model in various aspects (Eggert 1998; MacNeilage et al. 2007), some of its basic assumptions are different, as we subsequently explain. In the literature, Bayesian models are used to deal with various sources of information to optimize performance in the context of optimal observer theory (e.g., Knill and Pouget 2004; Körding and Wolpert 2004). These frameworks have been applied successfully in studies reporting perceptual biases. For example, in visual speed perception, a Bayesian model has been used to explain the finding that subjects systematically underestimate object speed when stimulus contrast is reduced (Stocker and Simoncelli 2006). Recently, Niemeier et al. (2003) provided evidence that optimal handling of noisy efference copy signals could account for the reduced ability to detect object motion during a saccade. In other words, as these examples show, an apparent shortcoming of the system may actually reflect optimal Bayesian processing.
We designed an optimal observer model, which is schematically illustrated in Fig. 3, to test whether our results would fit a Bayesian framework. This model is designed to combine noisy signals in an optimal fashion, which means that it deals with probability distributions rather than with deterministic signals. Inputs to the model are head orientation in space ρ and the orientation of the visual line (or the direction of visual motion) with respect to the retina φr. These inputs are measured by the (noisy) sensors, which provide two sensory signals (ρ̂ and φ̂r) to the observer. A major source of the sensory head-tilt signal ρ̂ is the otolith system, but other sensory systems, like somatosensory afferents (Bronstein 1999) and the semicircular canals (Jaggi-Schwarz and Hess 2003; Kaptein and Van Gisbergen 2006; Pavlou et al. 2003), may contribute as well. The model assumes that the sensory tilt signal is veridical on average, but rather noisy in comparison with the sensory visual signal φ̂r. Thus computing the orientation of the line in space φs, simply by a straightforward combination of incoming sensory signals, would yield a noisy spatial percept. The key feature of Bayesian models is that, along with sensory information, prior knowledge is taken into account to obtain a statistically optimal estimate. Application of this notion in the present model means that the observer uses prior knowledge about head tilt, implying that small tilts are most likely, to improve the internal tilt signal. This computational strategy comes at a price: although the reliance on prior knowledge has the beneficial effect of noise reduction at small tilts, this also causes systematic errors at larger tilts (further explanation is subsequently presented). Since Bayesian processing of the tilt signal applies to a stage before the computation of the line vertical and of the motion vertical, the model predicts the same tilt-dependent response bias in both tasks.
A further interesting feature is that the model simultaneously makes quantitative predictions about both response bias and scatter. Before describing the mathematical structure of the model in detail, we will first list its assumptions and approximations.
ASSUMPTIONS AND APPROXIMATIONS.
First, sensory signals from the visual system and from the head-tilt detecting system are contaminated with independent Gaussian noise. Second, for the sake of simplicity, we ignored the oblique effect (Krukowski et al. 2003; Löffler and Orbach 2001; Luyat et al. 2001) and assumed that visual noise is independent of line orientation or direction of visual motion on the retina. Third, errors due to imperfect compensation for ocular countertorsion (Curthoys 1996) were ignored. Fourth, the model uses a priori information by assuming that the head is mostly oriented near upright, which is implemented by a Gaussian probability distribution that peaks at zero tilt. We purposely used Gaussians to find the analytical solutions of the model, but we are aware that they do not account for the periodic nature of spatial orientations. Space periodicity, however, can be neglected in the modeled tilt range (−120 to 120°), provided that the width of the Gaussian distribution is kept at a moderate level.
INTERNAL TILT REPRESENTATION.
Guided by Fig. 3, we now describe the sequential computational steps to obtain β, the central tilt signal that ultimately transforms retinal signals to spatial coordinates. We assume that the signal ρ̂, provided by head-tilt sensors, is accurate but contaminated by noise. The noise parameters of the tilt signal will be subsequently specified (see Eq. 10). For a proper understanding of the model, it is of interest to look at this relation from two opposite perspectives. The forward perspective, indicated by the vertical dashed line in the left bottom panel, specifies the distribution of ρ̂ signals that is produced at the various tilt angles. This is the viewpoint of the neurophysiologist who varies the head-tilt angle and records the sensory signal. The CNS, however, must adopt the inverse perspective. When it receives the sensory signal indicated by the horizontal dashed line, for example, the brain must find out which tilt angle may have been responsible. Because the tilt signal is noisy, this inverse problem has no unique solution, so a statistical approach is required. The Bayesian scheme applies knowledge of the forward ρ–ρ̂ relationship to compute the probability that any particular tilt angle produced the incoming sensory signal. The result of this computation, the likelihood function P(ρ̂|ρ), is based exclusively on the sensory evidence ρ̂. Note that the likelihood is a function of tilt angle ρ and that more sensory noise yields a broader distribution and thus an increased uncertainty about which tilt angle may have caused the sensory signal. To optimize the tilt estimate, the observer should take into account which tilt angles are likely on an a priori basis, as expressed by the prior distribution P(ρ), which is shown in the middle part of Fig. 3. 
The resulting probability of any particular tilt angle, given the combination of sensory evidence and prior knowledge, is termed the posterior probability distribution P(ρ|ρ̂), which follows from the product of likelihood and prior

P(ρ|ρ̂) = P(ρ̂|ρ) P(ρ)/P(ρ̂)    (6)

where the probability P(ρ̂) in the denominator serves a normalization purpose. Note that the posterior peaks in between the prior and the likelihood (compare panels in Fig. 3). The exact location of the peak depends on the relative widths of the prior and the likelihood (Carandini 2006). Finally, once the posterior distribution has been computed, the brain needs a decision rule (δ) to obtain β. We assumed that the observer selects the tilt angle with highest probability, the maximum a posteriori (MAP) estimate. Note that the internal tilt signal β will vary in repeated trials due to sensory noise. Because the prior and likelihood are approximated as Gaussian distributions, the model predicts that the distribution of internal tilt angles β is also a normal distribution centered at a mean value

β̄ = [σp²/(σp² + σtilt²)] ρ    (7)

and with SD

σβ = σp σtilt/√(σp² + σtilt²)    (8)

with σp and σtilt, the SDs of the prior distribution and the sensory tilt signal, respectively. A more detailed derivation of Eqs. 7 and 8 can be found in the appendix. Equation 7 quantifies how the bias in β (the difference between β and ρ), caused by the prior, depends on prior width and tilt noise. Note that the ratio of prior width and tilt noise determines the bias in β. A further effect of the prior is that the noise in the internal tilt estimate σβ is smaller than the noise introduced by the tilt sensors σtilt, as shown in Eq. 8. Thus the narrower the prior, the larger the bias and the smaller the scatter in the internal tilt representation. Processing of the visual signal φ̂r in the model involves a likelihood function, P(φ̂r|φr), but no prior knowledge (flat prior, not shown).
As a consequence, application of the MAP decision rule means that the most likely retinal orientation φ̂r simply equals the peak of the likelihood function.
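For Gaussian prior and likelihood, the MAP computation described above reduces to a simple shrinkage of the sensory signal toward the prior peak at upright. The following is a minimal Python sketch of this step (the original analyses used Matlab; the function names and parameter values here are illustrative, not fitted values from the study):

```python
def map_tilt_estimate(rho_hat, sigma_tilt, sigma_p):
    """MAP tilt estimate for a single noisy sensory sample rho_hat,
    given a zero-centered Gaussian prior of width sigma_p (cf. Eq. A1)."""
    w = sigma_p**2 / (sigma_p**2 + sigma_tilt**2)   # shrinkage toward upright
    return w * rho_hat

def beta_statistics(rho, sigma_tilt, sigma_p):
    """Mean (Eq. 7) and SD (Eq. 8) of the internal tilt estimate beta
    over repeated trials at true tilt angle rho (all in degrees)."""
    w = sigma_p**2 / (sigma_p**2 + sigma_tilt**2)
    return w * rho, w * sigma_tilt

# Illustrative values: 90 deg tilt, 20 deg tilt noise, 40 deg prior width
mean_beta, sd_beta = beta_statistics(90.0, 20.0, 40.0)   # approx. (72.0, 16.0)
```

With these hypothetical values the estimate is biased 18° toward upright, mimicking the undercompensation seen at large tilts, while the trial-to-trial scatter (16°) remains below the raw tilt noise (20°).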
PREDICTED TILT DEPENDENCE OF SYSTEMATIC AND RANDOM ERRORS.
The output of the model is the space-centered orientation of the visual stimulus φ̂s, which follows from the linear combination of tilt representation β and retinal orientation estimate φ̂r (see Fig. 3, right panel). Since the visual signal is assumed to be unbiased on average, the model predicts that the systematic errors in the output γ̄ simply reflect the bias in the internal tilt estimate β with respect to the actual tilt angle ρ (see Eq. 3). If σtilt has a constant, tilt-independent value, the systematic error γ̄, given by

γ̄ = β̄ − ρ = −σtilt²/(σp² + σtilt²)·ρ (9)

will depend linearly on the actual tilt angle ρ. However, to account for the nonlinear relationship between γ̄ and tilt angle ρ observed in the actual results (see Fig. 5), we allowed σtilt to increase rectilinearly with tilt angle

σtilt = a0 + a1·|ρ| (10)

with parameter a0, the offset, representing the noise (SD) in the tilt signal at 0° head tilt and parameter a1, the slope (a dimensionless parameter), specifying how σtilt increases with tilt angle. Note that the tilt dependence in Eq. 10 causes a slight skewness in the likelihood function, which we neglected in fitting the experimental data to enable a straightforward analytical fitting procedure. The model also makes a quantitative prediction of the random errors σγ, which depend on the combination of noise in the visual signal σv and noise in the internal tilt estimate σβ, as specified by Eq. 5 (σγ² = σβ² + σv²). Due to the tilt dependence of the noise in the sensory tilt signal (Eq. 10), output scatter also increases with tilt angle.
In conclusion, the Bayesian model contains three parameters (a0, a1, and σp) to determine the mean value and scatter of the internal tilt estimate β; each of these three parameters affects both. Two additional parameters, σvl and σvm, which represent the visual noise in the line task and motion task, respectively, are required to fit the scatter in the output error σγ in the two tasks.
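The full forward prediction of the systematic and random errors (Eqs. 9, 10, and 5) can be sketched as follows; the parameter values are illustrative placeholders, not the best-fit values reported in the study:

```python
import numpy as np

def predicted_errors(rho, a0, a1, sigma_p, sigma_v):
    """Predicted systematic error (Eq. 9, with tilt-dependent noise per
    Eq. 10) and response scatter (Eq. 5) at tilt angle rho (degrees)."""
    sigma_tilt = a0 + a1 * np.abs(rho)                  # Eq. 10
    w = sigma_p**2 / (sigma_p**2 + sigma_tilt**2)       # Bayesian shrinkage
    gamma_mean = (w - 1.0) * rho                        # Eq. 9: bias of beta re ρ
    sigma_beta = w * sigma_tilt                         # Eq. 8
    sigma_gamma = np.sqrt(sigma_beta**2 + sigma_v**2)   # Eq. 5
    return gamma_mean, sigma_gamma

# Illustrative parameters: 5 deg noise offset, slope 0.2, 40 deg prior,
# 3 deg visual noise; evaluated at four tilt angles
rho = np.array([0.0, 40.0, 90.0, 120.0])
bias, scatter = predicted_errors(rho, a0=5.0, a1=0.2, sigma_p=40.0, sigma_v=3.0)
```

With a1 > 0, the bias is zero at upright and grows nonlinearly with tilt (undercompensation), and the predicted scatter increases with tilt angle, reproducing the qualitative shape of the data in Fig. 5.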
Both models make predictions about the relationship between the error in the space-centered orientation of the visual stimulus γ and the physical roll-tilt angle ρ. We used the experimental response errors from both the motion- and line-adjustment tasks to obtain best-fit parameters for the two models (see results). Motion and line data from all subjects were fitted simultaneously. For both models, a maximum-likelihood estimation (MLE) procedure was applied, which has the advantage of fitting systematic and random errors at the same time.
MITTELSTAEDT'S IDIOTROPIC-VECTOR MODEL.
We obtained the best-fit values of M, S, C, σvl, and σvm by minimizing the negative log-likelihood using the fmincon routine (Matlab 7.0; The MathWorks) and a multistart procedure with different initial parameters. The log-likelihood function is given by L(θ) = ∑ log[P(γi|θ)], where the sum runs over all n response errors γi and P(γi|θ) represents the probability of obtaining error γi given a particular parameter set θ. We reduced the degrees of freedom by allowing only parameter M to account for intersubject differences. This approach resulted in a total of 12 free parameters (8 + 4) to fit the line and motion data from all eight subjects. Confidence intervals of the best-fit parameters were determined by 100 bootstrap resamples (Press et al. 1992).
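The MLE step can be illustrated with a toy Python version. The study used Matlab's fmincon; here a coarse grid search stands in for the optimizer, the data are synthetic Gaussian errors at a single tilt angle, and all values are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic adjustment errors at one tilt angle: mean -15 deg, SD 5 deg
data = rng.normal(loc=-15.0, scale=5.0, size=2000)

def neg_log_likelihood(mu, sigma, x):
    """Gaussian negative log-likelihood (additive constants dropped),
    the quantity minimized in the MLE fits."""
    return 0.5 * np.sum(((x - mu) / sigma) ** 2) + x.size * np.log(sigma)

# Grid search over mean and SD (a stand-in for fmincon's iterative search)
mus = np.linspace(-30.0, 0.0, 121)     # 0.25 deg steps
sigmas = np.linspace(1.0, 16.0, 61)    # 0.25 deg steps
nll = np.array([[neg_log_likelihood(m, s, data) for s in sigmas] for m in mus])
i, j = np.unravel_index(np.argmin(nll), nll.shape)
mu_hat, sigma_hat = mus[i], sigmas[j]  # close to (-15, 5)
```

Because the likelihood scores both the location and the spread of the errors, a single minimization fits systematic and random errors simultaneously, which is the advantage noted in the text.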
BAYESIAN OBSERVER MODEL.
Best-fit values for a0, a1, σp, σvl, and σvm were found by minimizing the negative log-likelihood, using the same approach as in the fits of the idiotropic-vector model. Confidence intervals of the best-fit parameter values were again obtained with bootstrapping. Since leaving all five parameters free in each subject caused overfitting, we reduced the degrees of freedom by allowing only a single parameter to account for intersubject differences. This approach resulted in a total of 12 free parameters (8 + 4) to fit the line and motion data from all eight subjects. We explored three fit versions of the Bayesian model, in which either the offset a0, the slope a1, or the prior width σp was chosen as the free parameter.
We investigated the ability to compensate for static body tilt, when judging the spatial direction of visual motion, by determining the motion vertical. For comparison, we also tested the sense of line verticality by using the classical luminous-line task. The main body of results was obtained with an adjustment method in which subjects aligned the direction of a moving-dot pattern or the orientation of a luminous line to the perceived direction of gravity. In a further control experiment, we used a psychometric approach to verify that subjects relied on a global motion percept rather than single-dot motion vectors, when judging the earth-centric motion direction of the random-dot pattern.
Compensation for body tilt is incomplete
The adjustment task measured the ability to set the direction of a moving-dot pattern or the orientation of a visual line parallel to the direction of gravity. Both the motion vertical and the line vertical were tested at various body tilt angles, ranging from −120 to 120°. If subjects were to compensate perfectly for their body orientation, compensation angle β (see Fig. 1) would be equal to body tilt angle ρ. If they did not compensate for tilt, always taking the long body axis as the direction of gravity, compensation angle β would be zero. Figure 4 shows the actual degree of compensation in the two tasks for each tested tilt angle. The top panels, which illustrate the results from a typical subject, immediately convey the impression of a strikingly similar pattern of compensation in the two tasks. Compensation is nearly flawless for absolute tilts <60°, with misalignments remaining <10°. Furthermore, in both tasks, compensation does fall substantially short for tilt angles >60°, as if the amount of body tilt is underestimated, with errors ranging up to 40° in this subject. The only noticeable difference in task performance is that settings in the motion task are noisier than those in the line task. This phenomenon, however, need not reflect a difference in the actual spatial computation. It probably indicates a visual factor in the sense that detecting the direction of motion was less precise, partly due to our particular task design and partly because this task requires more temporal integration than estimating the orientation of a line.
The other subjects also show a similar compensation pattern in both tasks, as demonstrated in the bottom panels of Fig. 4, although there is some intersubject variability. Thus at first sight, the motion vertical and line vertical are quite comparable, with both tasks showing a pattern of errors that agrees quite well with previous reports about the perception of line verticality (Mittelstaedt 1983; Van Beuzekom et al. 2001).
To investigate the similarity of performance in the two tasks in more detail, Fig. 5 makes a direct comparison by showing response error (mean ± SD) as a function of tilt angle for each subject separately. As shown, error profiles for the motion task (black line) and line task (gray line) show strong resemblance within each subject, with only subtle systematic differences. Although only small errors occur at tilt angles <60°, verticality misjudgments become quite substantial at larger body tilts, with clear differences across subjects. For example, subject JG makes errors of about 50° in both tasks when the body is tilted to 120°, whereas the errors of subject JM remain limited to 30°. Although most errors represent an undercompensation for body tilt (β < ρ), there are occasional signs of overcompensation at smaller tilt angles. For instance, subjects PM and MV produce errors in the direction opposite to body tilt at the smallest body tilt angles (±20 and ±40°). Subject PB overcompensates only in the motion task. The mean pattern of errors across subjects (Fig. 5, bottom) further underlines the similarity of performance in the two tasks. A repeated-measures two-way ANOVA confirmed this by showing no significant main effect of task [F(1,7) = 1.41, P = 0.27]. Not surprisingly, the effect of tilt angle is highly significant [F(12,84) = 67.9, P ≪ 0.001], consistent with the general increase in systematic errors as a function of tilt angle. The slight overcompensation for small tilt angles (±20 and ±40°) in the motion task is mainly due to subject PB, and to a lesser extent to subjects RV and CT. Across subjects, however, the interaction between task and tilt angle was not significant [F(12,84) = 1.72, P = 0.08].
To further illustrate the comparable performance in the two tasks, Fig. 6A plots the mean error in the motion task against the mean error in the line task, lumped across all subjects and tilt angles. Data points scatter about the identity line. Because both variables are subject to measurement errors, a type II regression (also referred to as a major-axis regression) was used to determine their relationship for each subject. Slope and confidence limits were estimated by the bootstrap method. Correlation coefficients range between 0.89 and 0.98 and slopes vary between 0.91 and 1.35 across subjects (Fig. 6B). In all subjects but one (MV), a slope of 1 is within the 95% confidence limits. The average slope (±SD) across subjects is 1.02 ± 0.04, suggesting a one-to-one relationship between the errors in the two tasks. Thus we conclude that systematic errors in the two tasks are virtually identical.
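Type II (major-axis) regression differs from ordinary least squares in treating both variables as noisy. A Python sketch with synthetic data (the actual analysis used the subjects' mean errors; variable names and noise levels here are illustrative):

```python
import numpy as np

def major_axis_slope(x, y):
    """Type II (major-axis) regression slope, appropriate when both
    variables carry measurement error, unlike ordinary least squares."""
    sxx = np.var(x, ddof=1)
    syy = np.var(y, ddof=1)
    sxy = np.cov(x, y, ddof=1)[0, 1]
    return (syy - sxx + np.sqrt((syy - sxx) ** 2 + 4 * sxy**2)) / (2 * sxy)

def bootstrap_ci(x, y, n_boot=1000, seed=0):
    """Percentile bootstrap 95% confidence interval for the slope."""
    rng = np.random.default_rng(seed)
    n = len(x)
    slopes = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)       # resample pairs with replacement
        slopes.append(major_axis_slope(x[idx], y[idx]))
    return np.percentile(slopes, [2.5, 97.5])

# Synthetic example: motion errors equal line errors (slope 1) plus noise
rng = np.random.default_rng(2)
line_err = rng.normal(0.0, 20.0, 200)
motion_err = line_err + rng.normal(0.0, 5.0, 200)
slope = major_axis_slope(line_err, motion_err)
lo, hi = bootstrap_ci(line_err, motion_err)
```

For data generated with a true one-to-one relationship, the fitted slope lies near 1 and the bootstrap interval brackets it, the same logic used to test whether each subject's slope differs from unity.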
Global versus local motion
Can the quite similar biases in the motion vertical and line vertical be taken as evidence that the alignment with gravity in the two adjustment tasks was mediated by the same computational strategy? A potential caveat emerges if performance in the motion task was not based on the processing of global motion by spatial integration, as the experimental paradigm intended. Instead, subjects might have derived a percept of visual direction in space by temporally integrating the extrapolated motion paths of individual signal dots. The similarity in results that we found earlier would be trivial if this surrogate line strategy, however unlikely it may seem, had actually been used in the motion task. To exclude this possibility, we tested four subjects in a further experiment on both a global and a local motion task, using a forced-choice paradigm (see methods). The local motion forced-choice task enforced a strategy of judging the motion of a single dot, which was subject to the same motion statistics as the dots in the global motion task. We may conclude that subjects made use of a global motion percept if their performance in the local motion forced-choice task was significantly worse than that in the global motion forced-choice task.
To compare performance in the two tasks for a typical subject at each of the three tested tilt angles, Fig. 7 shows the proportion of clockwise responses, P(CW), as a function of motion direction of the stimulus relative to gravity. The results are clear: performance in the local motion task is substantially worse than that in the global motion task. The fraction of CW responses in the global motion task (▪) covers the whole range between 0 and 1, indicating that stimulus levels were placed correctly in the relevant tilt range. Data from the local motion task (□) show a very different pattern. Subjects never reach optimal response levels, showing response rates that remain close to the 0.5-chance level for all stimulus directions. In a first step to further quantify these results, we fitted psychometric curves to the global motion data. These curves, shown by the solid lines, provide a good description of the data, with R² >0.86. Each curve is characterized by two parameters: the threshold and the SD, which is inversely related to the slope. The threshold, which is the mean of the cumulative Gaussian function, is a measure of the subjective motion vertical. Its value corroborates the results from the motion-adjustment task at these tilt angles. At the large tilt angles, the sense of motion verticality deviates from the actual direction of gravity by an amount ranging from about −18 to −43° at −100° tilt and from 16 to 37° at 100° tilt. At 0° tilt, the motion vertical is quite veridical, with errors <5°. The SD of the fitted Gaussian, which represents the subject's uncertainty about motion verticality, is relatively constant across the three tilt angles. It ranges from 8 to 16° at −100° tilt, from 6 to 12° at 100° tilt, and from 6 to 8° at 0° tilt.
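Fitting a cumulative Gaussian to forced-choice data of this kind can be sketched as follows. The data are synthetic (a hypothetical threshold of −30° and SD of 10°), and a grid search over the binomial likelihood replaces the actual fitting routine:

```python
import numpy as np
from math import erf, sqrt

def phi(x, mu, sigma):
    """Cumulative Gaussian: P(CW) vs. stimulus direction; mu is the
    psychometric threshold (subjective vertical), sigma its spread."""
    return 0.5 * (1.0 + erf((x - mu) / (sigma * sqrt(2.0))))

# Synthetic forced-choice data: 7 directions, 200 trials each
rng = np.random.default_rng(3)
dirs = np.linspace(-60.0, 0.0, 7)
n_trials = 200
p_true = np.array([phi(d, -30.0, 10.0) for d in dirs])
n_cw = rng.binomial(n_trials, p_true)

def nll(mu, sigma):
    """Binomial negative log-likelihood of the CW counts."""
    eps = 1e-9
    p = np.clip([phi(d, mu, sigma) for d in dirs], eps, 1.0 - eps)
    return -np.sum(n_cw * np.log(p) + (n_trials - n_cw) * np.log(1.0 - p))

# Grid search for the maximum-likelihood threshold and SD
mus = np.linspace(-45.0, -15.0, 121)
sigmas = np.linspace(4.0, 20.0, 65)
grid = np.array([[nll(m, s) for s in sigmas] for m in mus])
i, j = np.unravel_index(np.argmin(grid), grid.shape)
threshold, sd = mus[i], sigmas[j]   # near the generating (-30, 10)
```

The recovered threshold plays the role of the subjective motion vertical and the recovered SD that of the subject's uncertainty, as in the analysis of Fig. 7.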
Because asymptotic performance in the local motion task remained <100%, fitting a psychometric curve through these data would be questionable. We therefore assessed the difference in performance with the global motion task by examining the deviation of the local motion data points from the psychometric curve fitted through the global motion data. A likelihood ratio test confirmed that the local motion response frequencies, P(CW), were significantly different from the Gaussian probability fitted through the global motion data [χ²(12,11) > 49.4, P ≪ 0.01] for each test performance (4 subjects, 3 tilt angles). Taken together, this analysis firmly rules out that subjects used a single-dot strategy in the global motion task.
Comparison of global motion and line results
To complete the analysis of the control experiments, we investigated whether the similarity between motion vertical and line vertical in the adjustment tasks was upheld in the psychometric experiments. Figure 8 shows results of the global motion forced-choice and line forced-choice tasks from the same subject as in Fig. 7. Psychometric curves of the line data (dashed lines) had R² values >0.82. Psychometric line and motion thresholds are rather similar at each tilt angle. As indicated by the steeper slopes of the line fits, subjects were less certain about the motion vertical than about the line vertical.
Figure 9 depicts the threshold and SD values derived from the psychometric fits in all subjects. Thresholds in the line and global motion tasks are not significantly different (paired t-test, P = 0.40). Moreover, thresholds in both tasks are not significantly different from the mean errors found in the adjustment experiment (paired t-test, motion task: P = 0.09, line task: P = 0.47). Standard deviations are significantly larger in the motion task, for all tilt angles and subjects (paired t-test, P ≪ 0.01). This can be explained, at least partly, by assuming that the visual noise was more pronounced in the global motion task than in the line task.
Model predictions and fits
In methods, we described two models to account for the error patterns described earlier: Mittelstaedt's (1983) idiotropic-vector model and a new Bayesian model. Since both schemes imply that systematic errors reflect handling of the compensatory tilt signal, rather than processing in the visual pathway, both models predict identical error patterns in the line and motion tasks. We fitted the two models to the adjustment data from our eight subjects (see methods for details); the results are described next.
MITTELSTAEDT'S IDIOTROPIC-VECTOR MODEL.
Figure 10 shows fit results of Mittelstaedt's idiotropic-vector model (gray line) to the observed systematic errors in the line vertical and motion vertical. The model fits the data quite accurately, with goodness-of-fit values of R² >0.82. Note that the model accounts for both underestimation and overestimation errors (see JM and PM). Best-fit parameter values are listed in Table 1. Parameter S has a best-fit value of 0.64, which is roughly comparable to the value (S = 0.58) found by Mittelstaedt (1983). As expected, subjects with larger systematic errors also have higher M-values (e.g., M = 0.65 for subject JG vs. M = 0.24 for JM), in line with the fact that a larger idiotropic vector has a stronger biasing effect. We conclude that the model can account very well for the pattern of systematic errors in both tasks with just a single free parameter for each subject.
Can the model account equally well for the random errors in the data? Figure 11 (top panels) shows the model fits for the line task (Fig. 11A) and the motion task (Fig. 11B) together with the mean scatter curve across subjects (gray line). As can be seen, Mittelstaedt's model accounts for the increase in response scatter with tilt angle, seen in the data. However, the shape of the scatter–tilt relation is not well captured by the model, which also falls short in fitting the stereotyped scatter level in upright. The most glaring discrepancy between the data and the fit—the fact that the predicted scatter exceeds the actual scatter roughly twofold—calls for an explanation. Why did the MLE fit procedure not simply adopt a smaller C value (see Eq. 4) to prevent this problem in the first place? To understand why this would not improve model performance, it should be recalled that the scatter level in the MLE fit reflects two factors: not just the actual data scatter at each tilt angle, but also the subtle discrepancies between the fit line and the local data average. Since the former is shown in the panels, but the latter is not, the discrepancy between predicted and actual scatter levels should not immediately be taken as a weak spot in the model. Since it is much more difficult to determine the precision characteristics of the system than its degree of accuracy, the issue cannot be resolved quickly. Only future testing of response scatter on a more massive scale (multiple trials, separate runs) can reveal whether the gap between predicted and actual scatter level is apparent or real.
For reasons outlined earlier in methods, we tested three fit versions of the model, each imposing a different set of constraints on the fit. Version 1 allowed tilt noise parameter a0 (see Eq. 10) to vary among subjects, whereas the remaining four parameters were determined as a best-fit value across subjects. The approach in the other fit versions was similar, except that now only tilt noise parameter a1 (version 2) or the prior width σp (version 3) was free to vary among subjects, whereas the remaining parameters were fit as a single value across subjects. The fit was applied simultaneously to both systematic and random errors from the line and motion tasks, using MLE (see methods).
Figure 10 presents the results of this analysis for the systematic errors, showing the fits of all three versions superimposed on the data. The fitted curves are practically indistinguishable. All three versions had goodness-of-fit values R² >0.80 (see Table 2). In contrast to Mittelstaedt's model, the Bayesian model cannot account for errors of overestimation, as seen in subjects PM and JM. Table 2 lists the best-fit parameter values for the three Bayesian model fit versions. Clearly, it would be pointless to prefer any of these fit versions based on their account of the systematic errors. In other words, according to the model, the pattern of errors may equally well be caused by a higher tilt noise offset, a0 (version 1); a steeper increase of tilt noise with tilt angle, a1 (version 2); or a narrower prior, σp (version 3). That being said, there is an interesting observation to make across the three fit versions: the best-fit value of parameter a1 is invariably positive (see Table 2). The important implication of this result is that tilt noise must increase with tilt angle if our Bayesian model is to account for the present data. Not surprisingly, all three fit versions indicate that visual noise in the motion task σvm, which ranged from 8 to 9°, is larger than that in the line task σvl, where it reached values ≤3.2°.
Following the demonstration that the three fit versions account about equally well for the systematic errors in the data, Fig. 11, C–H compares their ability to predict the random errors. In general, all three versions predict scatter levels above those in the data (gray line), for similar reasons as mentioned previously for Mittelstaedt's model. Interestingly, all fit versions predict scatter levels to increase with tilt angle, which is consistent with the positive values for a1, seen in Table 2. However, only versions 2 and 3 seem to match the finding that individual scatter levels at 0° tilt are practically identical (see Fig. 11, I and J).
In conclusion, both models account very well for the virtually identical pattern of systematic errors in the motion vertical and line vertical, which they interpret as the result of central processing of the compensatory tilt signal. Neither model fully matches the random errors in the data, although it appears that the Bayesian model accounts slightly better for the tilt dependence of the variable errors in the data. In this respect, the Bayesian model, which sheds a new light on the origin of biased verticality percepts, has emerged as a viable alternative that deserves further exploration.
Recapitulation of main findings
We investigated the brain's ability to account for head tilt when estimating the direction of visual motion in space. We found that incomplete compensation for head tilt at larger tilt angles caused systematic errors in the motion vertical that were virtually identical to those in the line vertical. A trivial explanation of this similarity, implying that the motion vertical might have been based on extrapolated motion paths of single dots, was firmly ruled out by psychometric control experiments. Taken together, our results suggest reliance on a common reference frame for spatial motion and pattern vision during lateral body tilt. Fit results of two spatial orientation models suggest that the pattern of systematic errors in the two tasks may be the downside of a strategy for dealing with imperfections in the sensory tilt signal, which is implemented at a stage preceding the conversion of visual signals from retinal to spatial coordinates. In the following, we explore the merits of these two modeling approaches in a broader context and in more depth.
Both Mittelstaedt's idiotropic-vector model and the Bayesian observer model proposed here link errors in the motion vertical and line vertical at large tilts to a strategy for dealing with imperfections in the sensory tilt signal (MacNeilage et al. 2007). The two models propose that the resulting biased tilt representation is used to convert the retinal signals into an earth-centric reference frame, which explains the strong resemblance of the systematic errors in both visual subsystems.
RATIONALE BEHIND THE BIASING MECHANISMS IN THE TWO MODELS DIFFERS.
Both schemes assume that the raw sensory tilt signal is subject to imperfections, but their ideas about what is imperfect are rather different. Mittelstaedt's model assumes that combining the signals from the two otolith organs, utricle and saccule, which contain different numbers of hair cells, is a nontrivial problem leading to systematic errors. The Bayesian scheme makes no assumptions about the precise contributions of the otolith organs and allows for the possibility that other sensors may contribute to the raw tilt signal. According to this scheme, the problem with the sensory tilt signal is that it is noisy. Reduction of this noise by using prior knowledge causes systematic errors at large tilt angles. To account for the pattern of systematic errors, it was necessary to extend the standard Bayesian model by the additional parameter a1, which describes an increase of the noise in the sensory tilt signal with increasing tilt angle.
ARE THE SUPPOSED IMPERFECTIONS OF THE RAW TILT SIGNAL REALISTIC?
Although Mittelstaedt's scheme has found wide acceptance in the literature since it can nicely account for the pattern of systematic errors in the subjective vertical, some questions about its basic assumptions can be raised. First of all, the notion that the brain would have problems coping with unequal numbers of utricle and saccule afferents is not immediately convincing. Similar challenges occur in other sensory systems, like vision and the somatosensory system, which show little sign of major distortion in their representation of spatial relationships at the perceptual level. Differences in the number of sensory afferents have well-established perceptual correlates, although these concern primarily differences in discrimination thresholds (e.g., foveal vs. peripheral vision). Second, if the otolith signal is distorted, one would expect this to show up in body tilt estimates as well. Mittelstaedt (1995, 1999) was the first to discover that this was not the case. He showed that subjects, rotated sideways, make distinct errors in the classic line-verticality task, but show virtually no bias in their estimate of body orientation. Other studies have reported similar observations (Bortolami et al. 2006; Kaptein and Van Gisbergen 2004; Mast and Jarchow 1996). This clear dissociation between body tilt percept and the visual upright poses an intriguing and nontrivial paradox: Why would errors in the subjective vertical occur when the visual and tilt signals, from which this percept must be derived, are virtually unbiased? Mittelstaedt argued that the idiotropic vector plays a role only in the perception of the subjective vertical but not in the perception of body tilt, which is supported by other sensory cues, such as graviceptors in the trunk.
According to the Bayesian model, the sensory tilt signal is accurate but contaminated by noise, which increases with tilt angle. The suggested dependence of tilt noise on tilt angle may appear controversial since single-unit studies by Fernandez and Goldberg (1976) provide no direct evidence that this is the case. However, at a slightly higher level, the fact that utricle and saccule have unequal numbers of hair cells (Rosenhall 1972, 1974) may be a relevant factor. As was first shown by Eggert (1998), this arrangement may yield tilt-dependent noise since the utricle, which is most sensitive to head tilts around 0° (upright), would provide a more precise signal than the saccule, which is most sensitive at 90° roll tilt. Further evidence for the dependence of tilt noise on tilt angle comes from perceptual studies showing that the effect of optokinetic stimulation on the subjective vertical (Dichgans et al. 1974) and the sense of body tilt (Young et al. 1975) is stronger at larger tilt angles.
In principle, the Bayesian model can solve the above-mentioned paradox that a rather accurate tilt signal appears not to be used as such in visual verticality judgments. The crucial point is that the visual signal is very precise (Orban and Vogels 1998; Westheimer 2003) in comparison with the sensory tilt signal (Bisdorff et al. 1996; Day and Fitzpatrick 2005; Mast and Jarchow 1996). From a Bayesian perspective, simply adding the noisy tilt signal to the precise retinal signal may not be the optimal strategy for verticality perception. Indeed, Mast and Jarchow (1996) have provided evidence against the notion of simple noise propagation, by showing that scatter in line-verticality judgments was less than the noise in the body tilt judgments. The addition of a prior, centered on 0° tilt, suppresses noise in verticality perception at the expense of a systematic bias at larger tilt angles. In this sense, the Bayesian strategy amounts to a precision–accuracy trade-off. Viewed from this perspective, the trade-off appears to have a different outcome in the perception of body tilt where different optimality criteria may apply. The fact that body-tilt percepts show better accuracy but poorer precision compared with the line vertical (see earlier text) suggests that the involvement of prior information is minimal when estimating the orientation of the body in space.
In a balanced assessment, it should be noted that Bayesian inference depends heavily on the assumption that the brain is adapted to the noise properties of the sensory tilt signal, which is essential to compute the corresponding likelihood function. It is not a trivial matter to validate this assumption, which can be seen as a weakness of all Bayesian models.
PERFORMANCE OF THE TWO TESTED MODELS.
Both models performed very well in explaining the systematic errors in the adjustment experiments, although Mittelstaedt's model did slightly better, due to its provision to also account for errors of overcompensation (E-effects), seen in some subjects. As an explanation for such errors, Mittelstaedt's model allowed the idiotropic vector to fall short in the full compensation of the inaccuracies in the raw tilt signal. A further factor—that errors of overcompensation may reflect uncompensated eye torsion (Curthoys 1996)—was ignored in both models.
With respect to the random errors, both models predict an increase of scatter with tilt angle. Recall that, in Mittelstaedt's model, a special parameter (C) was dedicated to this aspect of the data, whereas in the Bayesian model there is a tight coupling between random and systematic errors. The Bayesian model generally allowed for a better qualitative match to the observed scatter data than did the Mittelstaedt model, even though both models systematically overestimate the scatter levels. In this respect, the exact combination of systematic and variable errors cannot be accounted for by either model, which may be partly explained by the fact that differences between the local data average and the model fit must also be accounted for by the models' scatter prediction. We cannot exclude that the scatter in the data was underestimated due to our approach of collecting all responses in a single run, which may have caused some dependence between trials. In terms of systematic errors, the Bayesian model suggests a dependence of the accuracy on the precision of the subjective vertical. If the computation of the a posteriori probability of the gravity direction is adapted to the width of the individual likelihood of the vestibular signal, then subjects with high precision are also expected to show better accuracy (see Eqs. 9 and 10). One way to proceed would be to measure psychometric curves of perceived self-tilt at 0 and 90° tilts, to test whether subjects with large systematic errors in the motion vertical and line vertical at 90° also have a less-precise tilt percept at this tilt angle.
It remains speculative as to where these Bayesian computations may be encoded in the brain, but a brief discussion about the neurophysiological implications of our results seems pertinent. At the peripheral level, the otoliths detect the gravitoinertial force (GIF), meaning that they cannot distinguish head translation from head tilt relative to gravity. Although errors may occur in the disambiguation of the GIF signal under dynamic conditions (Vingerhoets et al. 2007), we tacitly assumed that the brain can accurately disambiguate the GIF signal under static conditions. Support for this assumption comes from several modeling studies (Laurens and Droulez 2007; MacNeilage et al. 2007; Merfeld et al. 1999). As a correlate of disambiguation, a recent neurophysiological study found that Purkinje cell activity in the cerebellar vermis reflects the transformation of afferent canal and otolith information into earth-referenced self-motion and spatial-orientation signals (Yakusheva et al. 2007). These findings suggest that the brain indeed isolates a head-in-space signal.
Our results raise interesting questions about the neural locus where such a signal may interact with visual signals to solve the spatial constancy problem during lateral tilt. In the early stages of visual processing, up to area V1, information appears to be coded in a retinal frame. A first attempt to look for signs of orientation constancy in the visual cortex has been made by Sauvan and Peterhans (1999). As far as we know, a comparable investigation has not yet been performed in middle temporal (MT) and middle superior temporal (MST) areas, which are key players in the analysis of visual motion. MSTd neurons, which are involved in the coding of self-motion, are sensitive to both visual and vestibular motion cues (Gu et al. 2006), but these signals are not coded in a common spatial frame of reference (Fetsch et al. 2007).
Another area, most closely associated with the dorsal stream, the parietoinsular vestibular cortex (PIVC), has received attention in the context of vestibular processing. PIVC is a multisensory region, responding to vestibular, somatosensory, and visual motion stimuli (Grüsser et al. 1990). It has been reported that patients with lesions in the human homologue of area PIVC show abnormalities in the perceived visual line vertical (Brandt and Dieterich 1999; Brandt et al. 1994). Given our results, it would be interesting to investigate whether these patients show similar abnormalities in their visual motion vertical as well. Paradoxically, such lesions leave the percept of body posture unaffected and do not lead to loss of lateral balance. In view of these findings, it seems that area PIVC may play an important role in the implementation of the computational mechanisms that subserve spatial perception.
We have shown that the conversion of motion direction and line orientation from a retinal to a world-centered frame of reference during lateral tilt is subject to the same pattern of errors. This shared pattern suggests that the bias arises at the level of the compensatory tilt signal used in the reference-frame transformation. Modeling efforts, showing that both Mittelstaedt's idiotropic-vector model and a new Bayesian observer model can account for the pattern of systematic errors, suggest that these errors are the downside of a strategy to compensate for imperfections in the sensory tilt signal.
In the case of a single trial, Bayes' rule implies that the optimal estimate of tilt angle β, given sensory signal ρ̂ and prior information (a Gaussian prior centered on upright with variance σ²_prior and a Gaussian likelihood with variance σ²_ρ), is specified by

β̂ = ρ̂ · σ²_prior / (σ²_prior + σ²_ρ)  (A1)

which is obtained by setting the derivative of the posterior distribution to zero. For the case of many repeated trials, the mean value of β̂ is given by

μ_β = ρ · σ²_prior / (σ²_prior + σ²_ρ)  (A2)

which equals Eq. 7 (see METHODS). The variance of β̂ depends on the variance of ρ̂ according to

σ²_β = [σ²_prior / (σ²_prior + σ²_ρ)]² · σ²_ρ  (A3)

which is equivalent to Eq. 8.
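The single-trial MAP estimate and its mean and variance over repeated trials (Eqs. A1–A3) can be checked numerically. The sketch below assumes a Gaussian prior on tilt centered on upright and a Gaussian likelihood around the sensory signal; all widths and the true tilt are hypothetical values chosen for illustration:

```python
import numpy as np

# Hypothetical parameters (deg): prior width, likelihood width, true tilt.
SIGMA_PRIOR, SIGMA_RHO, TRUE_TILT = 20.0, 10.0, 60.0

rng = np.random.default_rng(1)
rho_hat = rng.normal(TRUE_TILT, SIGMA_RHO, 200_000)  # noisy sensory signals

# Per-trial MAP estimate (Eq. A1): shrink the sensory signal toward upright.
gain = SIGMA_PRIOR**2 / (SIGMA_PRIOR**2 + SIGMA_RHO**2)
beta_hat = gain * rho_hat

# Sample mean should match Eq. A2; sample variance should match Eq. A3.
print(beta_hat.mean(), gain * TRUE_TILT)
print(beta_hat.var(), gain**2 * SIGMA_RHO**2)
```

With these values the gain is 0.8, so the mean estimate underestimates the true 60° tilt, reproducing the systematic undercompensation at large tilts discussed above.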
This work was supported by Nijmegen Institute for Cognition and Information and Faculteit der Natuurwetenschappen, Wiskunde en Informatica of Radboud University Nijmegen and by grants from the Netherlands Organization for Scientific Research and the Human Frontier Science Program to W. P. Medendorp.
We thank H. Kleijnen, G. van Lingen, S. Martens, and G. Windau for technical support; F. Verstraten for helpful suggestions on the visual stimuli; T. Dijkstra, T. Heskes, and O. Zoeter for valuable advice on Bayesian methods; and T. Eggert for useful discussions about spatial-perception models.
The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Copyright © 2008 by the American Physiological Society