## Abstract

Using vestibular sensors to maintain visual stability during changes in head tilt, crucial when panoramic cues are not available, presents a computational challenge. Reliance on the otoliths requires a neural strategy for resolving their tilt/translation ambiguity, such as canal–otolith interaction or frequency segregation. The canal signal is subject to bandwidth limitations. In this study, we assessed the relative contribution of canal and otolith signals and investigated how they might be processed and combined. The experimental approach was to explore conditions with and without otolith contributions in a frequency range with various degrees of canal activation. We tested the perceptual stability of visual line orientation in six human subjects during passive sinusoidal roll tilt in the dark at frequencies from 0.05 to 0.4 Hz (30° peak to peak). Because subjects were constantly monitoring spatial motion of a visual line in the frontal plane, the paradigm required moment-to-moment updating for ongoing ego motion. Their task was to judge the total spatial sway of the line when it rotated sinusoidally at various amplitudes. From the responses we determined how the line had to be rotated to be perceived as stable in space. Tests were taken both with (subject upright) and without (subject supine) gravity cues. Analysis of these data showed that the compensation for body rotation in the computation of line orientation in space, although always incomplete, depended on vestibular rotation frequency and on the availability of gravity cues. In the supine condition, the compensation for ego motion showed a steep increase with frequency, compatible with an integrated canal signal. The improvement of performance in the upright condition, afforded by graviceptive cues from the otoliths, showed low-pass characteristics. Simulations showed that a linear combination of an integrated canal signal and a gravity-based signal can account for these results.

## INTRODUCTION

An intriguing aspect of spatial vision, known as *orientation constancy*, is our ability to maintain at least a roughly correct percept of allocentric visual orientations despite changes in head orientation (Bischof 1974; Sauvan and Peterhans 1999). To what extent, and how, external space perception can be maintained during self-motion is still an unresolved issue. In the dark, the vestibular system plays an important role in this process, although its sensors are imperfect and partly ambiguous. The semicircular canals, sensing angular velocity, exhibit high-pass-filter characteristics with poor responses during low-frequency or constant-velocity rotations. The otoliths, measuring gravitoinertial force (GIF), cannot distinguish between tilt and linear acceleration for elementary physical reasons (Young 1984). How does the brain handle this sensory information for spatial vision? Current interest in the solution of the ambiguity problem focuses on two main approaches. The first approach is to use frequency filtering of the otolith signal to differentiate between tilt and translation (e.g., Paige and Tomko 1991). The second approach relies on canal–otolith interactions to resolve the ambiguities (e.g., Angelaki et al. 1999; Glasauer 1992; Mayne 1974; Merfeld 1995; Zupan et al. 2002). Recently it has been suggested that human oculomotor performance is governed by frequency filtering whereas ego-motion percepts reflect canal–otolith interaction (Merfeld et al. 2005).

Whether this notion of differential disambiguation mechanisms for perception and action can be generalized to the domain of visual space perception remains to be studied. The contribution of the canals, whether in an indirect interactive role or in the form of an additive signal, has remained unclear. Previous studies on the role of vestibular mechanisms have yielded seemingly conflicting results concerning the relative importance of canal and otolith signals in the maintenance of visual stability. Classical visual orientation studies, testing the sense of verticality under static tilt conditions, have strongly emphasized the role of the otoliths (Mittelstaedt 1983). Other studies, using rapid tilt paradigms, found canal-related effects, but still considered the otoliths as the main source of information (Jaggi-Schwarz et al. 2003; Keusch et al. 2004; Udo de Haes and Schöne 1970). Also a recent study by Klier et al. (2005), which compared saccadic updating after upright and supine roll tilts involving vigorous canal stimulation, concluded that the contribution from graviceptive signals is critical. In contrast, Jaekl et al. (2005) found no evidence for any otolith involvement at all in a study on the effect of active head movements on visual stability.

The present study was designed to test to what extent the otoliths and the canals can maintain a degree of visual orientation constancy during passive roll rotation, even in the absence of panoramic visual cues. Making spatial stability judgments under such circumstances requires an analysis of whether the incoming visual motion signals match the change of head orientation in space detected by the vestibular system. When head rotation is underestimated under such reduced conditions, there will be no match between the vestibular signals and the visual signals evoked by an Earth-fixed line that will therefore be seen as moving. Our approach was to present a variety of visual line motions in the frontal plane of the subject, coupled to the motion of the vestibular chair, to determine which combination of visual motion and vestibular motion would yield the pair of matching signals required for a stable visual percept (no line rotation in space).

In the experiments, subjects seated in a vestibular chair were subjected to sinusoidal roll rotation in a dark room. At the same time, they saw a visual line that rotated in counterphase to chair motion, at a variety of amplitudes. Expressed as a fraction of chair motion, these amplitudes varied from zero (line moving in space with the body) to 1.5 and thus included the special case (rotation fraction 1.0) in which the line was physically stationary in space (see Fig. 1). From the perceived line sway estimates in all trials, the matching condition for perceived stability could be determined. This was done at various vestibular rotation frequencies chosen in a frequency range (0.05–0.4 Hz) where current disambiguation models predict that the availability of otolith cues for the estimation of body sway will gradually diminish with frequency. Potential canal contributions, on the other hand, would be expected to increase with frequency in this range. In a further attempt to assess the relative importance of otoliths and canals, the experiment was performed in the upright and the supine condition. In the upright experiments, both canal and otolith signals are available, whereas supine rotation will provide only canal signals.

The total data set collected in all these experiments served to answer two major questions: *1*) is it possible to quantify the involvement of the canals and the otoliths in the computation of body motion underlying this task, and *2*) what are the implications for current otolith-disambiguation schemes? Before these questions could be addressed, it was necessary to sort out how various confounding factors, such as the gain of the vestibuloocular reflex (VOR), affected the outcome of the experiments. After correction for these effects, the results showed a remarkably strong involvement of canal signals that completely dominated the response at the higher frequencies. By contrast, the modest improvement in performance in the upright experiments, which reflected the otolith contribution, was most noticeable at lower frequencies. These relationships in the data could be described by a linear combination of weighted otolith and canal contributions, irrespective of which neural strategy for otolith disambiguation was incorporated in the overall model.

## METHODS

### Vestibular rotation

The subject was seated in a computer-controlled vestibular stimulator that rotated the body about the nasooccipital roll axis. Roll position was measured using a digital position encoder with an angular resolution of 0.04°. The cyclopean eye was aligned with the axis of rotation by adjusting the subject's seat in height. The subject's trunk was tightly fixated using seat belts and adjustable shoulder and hip supports. The legs and feet were restrained by Velcro straps. The head was firmly fixated in a natural upright position for looking straight ahead, using a padded helmet.

In different sessions, all subjects were rotated sinusoidally in the upright and supine positions (Fig. 1), in complete darkness, at frequencies of 0.05, 0.1, 0.2, and 0.4 Hz and an amplitude of 15° (30° peak to peak). These frequencies were chosen to be equally spaced on a log scale. The peak angular velocities at 0.05, 0.1, 0.2, and 0.4 Hz were 4.7, 9.4, 18.8, and 37.7°/s, respectively. The corresponding peak angular accelerations for these four frequencies were 1.5, 5.9, 23.7, and 94.7°/s^{2}.
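The peak velocities and accelerations quoted above follow directly from the sinusoidal profile: for amplitude A and frequency f, peak velocity is 2πfA and peak acceleration is (2πf)²A. The short check below reproduces the listed values; the function names are illustrative and not taken from the original study.

```python
import math

AMPLITUDE_DEG = 15.0                    # chair amplitude (30 deg peak to peak)
FREQUENCIES_HZ = [0.05, 0.1, 0.2, 0.4]  # tested roll frequencies


def peak_velocity(freq_hz, amplitude_deg=AMPLITUDE_DEG):
    """Peak angular velocity (deg/s) of a sinusoid A*sin(2*pi*f*t)."""
    return 2.0 * math.pi * freq_hz * amplitude_deg


def peak_acceleration(freq_hz, amplitude_deg=AMPLITUDE_DEG):
    """Peak angular acceleration (deg/s^2) of the same sinusoid."""
    return (2.0 * math.pi * freq_hz) ** 2 * amplitude_deg


for f in FREQUENCIES_HZ:
    print(f"{f:4.2f} Hz: {peak_velocity(f):5.1f} deg/s, "
          f"{peak_acceleration(f):5.1f} deg/s^2")
```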

To avoid discontinuities in velocity and acceleration at motion onset, angular velocity increased linearly over an integer number of sinusoidal periods (see Merfeld et al. 2005). The number of such ramp-up cycles was frequency dependent: one for 0.05 Hz, two for 0.1 and 0.2 Hz, and three for 0.4 Hz. The number of steady-state cycles was also frequency dependent and ranged from about 10 to about 75, such that the total steady-state period available for testing was always about 3 to 3.5 min.
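As a rough consistency check on these timing figures, the sketch below converts the ramp-up cycle counts into durations and the 3 to 3.5 min steady-state window into cycle counts. The helper names are hypothetical; only the cycle counts and durations come from the text.

```python
# Ramp-up cycles per frequency, as stated in the text.
RAMP_CYCLES = {0.05: 1, 0.1: 2, 0.2: 2, 0.4: 3}


def ramp_duration_s(freq_hz):
    """Duration of the ramp-up phase in seconds (cycles divided by frequency)."""
    return RAMP_CYCLES[freq_hz] / freq_hz


def steady_state_cycles(freq_hz, duration_s):
    """Number of sinusoidal cycles that fit in a steady-state window."""
    return duration_s * freq_hz


for f in sorted(RAMP_CYCLES):
    low = steady_state_cycles(f, 180.0)   # 3 min window
    high = steady_state_cycles(f, 210.0)  # 3.5 min window
    print(f"{f:4.2f} Hz: ramp {ramp_duration_s(f):4.1f} s, "
          f"steady state {low:.1f} to {high:.1f} cycles")
```

At 0.05 Hz this gives roughly 9 to 10.5 steady-state cycles and at 0.4 Hz roughly 72 to 84, in line with the stated range of about 10 to about 75.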

In total, runs lasted between 3 and 4 min. Between runs, there was a resting period of 1 min, with the room lights on. Vision was always binocular and subjects were allowed to move their eyes freely.

### Subjects

Six subjects (five male, one female) gave written informed consent to participate in the experiments. Three of them had no knowledge about the purpose of the experiments. Age ranged from 25 to 61. Subjects were always given a few practice runs to get used to the experiment and never received feedback about their performance.

### Experiments

Spatial orientation constancy was tested in the upright and supine positions. Testing was done using a luminous line mounted on the frame of the vestibular chair, 90 cm in front of the subject (angular subtense 20°). The line was polarized by a bright dot at one end and could rotate in the frontoparallel plane about an axis aligned with the vestibular rotation axis (see Fig. 1).

During the steady-state cycles, the luminous line switched on for two consecutive cycles in the 0.05-, 0.1-, and 0.2-Hz conditions and for three cycles in the 0.4-Hz condition. In the course of the experiment, we applied various amplitudes of line rotation, always in counterphase to the vestibular motion, using the same frequency. Line rotation was defined relative to the subject and its amplitude was expressed as a fraction of chair-rotation amplitude (rotation fraction, for short). When there was no counterrotation (rotation fraction 0), the line moved in space and remained aligned with the subject's long body axis (Fig. 1*B*). By contrast, if the countermotion was very large (e.g., rotation fraction 1.5), the line rotated in space in a direction opposite to body motion (Fig. 1*C*). At a rotation fraction of 1.0, the line remained physically stationary in space. The goal of the experiment was to determine the critical rotation fraction where the line would be perceived as stable in space.
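Because the line always moved in counterphase to the chair, its physical sway in space is the chair amplitude scaled by one minus the rotation fraction. The minimal sketch below illustrates this; the sign convention (positive means the line moves in space with the body) and the function name are assumptions made for the example.

```python
def line_sway_in_space(rotation_fraction, chair_amplitude_deg=15.0):
    """Physical amplitude of line rotation in space (deg).

    Positive: the line rotates in space in the same direction as the body.
    Negative: the line rotates in space opposite to the body.
    """
    return (1.0 - rotation_fraction) * chair_amplitude_deg


for fraction in (0.0, 0.5, 1.0, 1.5):
    print(fraction, line_sway_in_space(fraction))
```

A rotation fraction of 0 leaves the line moving with the body, 1.0 yields a physically stationary line, and 1.5 makes the line rotate in space against the body at half the chair amplitude.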

The subject's task was to judge the peak-to-peak movement of the line in an Earth-fixed reference frame. A toggle switch, mounted near the right hand, served to indicate whether the line was seen moving clockwise or counterclockwise, in space. This signal, which was recorded on disk, provided us with a rough indication of perceived phase. After the line was switched off, subjects verbally reported its perceived peak-to-peak sway in space, using a clock scale (see Van Beuzekom et al. 2001). Subjects could fully concentrate on their percept of line motion in space, which did not require a conscious awareness of the imposed body movements. After practicing a few trials, all subjects reported that they could perform this task without problem. Subjects were instructed that it was not relevant, in the upright condition, whether the line was perceived as Earth vertical. In each run, four (at 0.05 Hz) to 15 (at 0.4 Hz) different rotation fractions could be tested. Between trials the line was switched off and the subject had at least 3 s to verbally report a response, before the line switched on for the next trial. On average, four sessions of approximately 45 min were needed to collect the data from each subject for all conditions.

To check whether the continuous visibility of the rotating line had an optokinetic effect, we performed a control experiment. This control experiment was the same as the standard experiment described above, except that the line now flashed at a rate of 5 Hz, with a duration of 1 ms. At this frequency, the number of flashed lines in each cycle was still sufficient to get a clear percept of its sinusoidal movement, even at the highest frequency. The control experiment was performed in the upright and supine conditions, at the two extreme rotation frequencies (0.05 and 0.4 Hz). Three subjects participated in the control experiment, which took one to two sessions to collect the data from each subject.

### Data analysis

The purpose of the experiments was to characterize updating for visuospatial orientation during vestibular rotation. By testing at various rotation-fraction values (see previous section), we determined the amount of line counterrotation required for visuospatial orientation stability. The verbal response was used to express the perceived amplitude of line sway and the toggle-switch responses specified the perceived rotation direction. When the toggle-switch responses in a given trial implied that the line was perceived as moving in the direction of ego motion, the verbal sway estimate was given a positive sign. If these responses indicated that the line was perceived as moving in the opposite direction, the verbal estimate was given a negative sign. Plots of these responses versus the amount of counterrotation (i.e., rotation fraction) were found to be linear. An example can be seen in Fig. 2. The intersection of a linear regression line with the *x*-axis determined the amount of counterrotation necessary for spatial stability. This critical rotation fraction will be called the “null fraction,” represented by the symbol *F*_{0}. Note that a veridical response requires *F*_{0} = 1.
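The null-fraction estimate can be sketched as an ordinary least-squares line fit followed by a zero-crossing computation. The code below is an illustration only: the function name is hypothetical and the sample responses are synthetic (generated from an assumed slope of −1.7 and null fraction of 0.48), not measured data. Sway estimates are assumed to be signed and normalized to chair sway, so that a veridical observer would produce slope −1 and intercept 1.

```python
def fit_null_fraction(rotation_fractions, signed_sway):
    """Least-squares line through (rotation fraction, signed sway estimate).

    Returns (null_fraction, slope, intercept), where the null fraction is
    the x-axis intercept of the fitted line.
    """
    n = len(rotation_fractions)
    mean_x = sum(rotation_fractions) / n
    mean_y = sum(signed_sway) / n
    sxx = sum((x - mean_x) ** 2 for x in rotation_fractions)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(rotation_fractions, signed_sway))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return -intercept / slope, slope, intercept


# Synthetic, noise-free responses (assumed values for illustration only).
fractions = [0.0, 0.25, 0.5, 0.75, 1.0, 1.25, 1.5]
responses = [-1.7 * (x - 0.48) for x in fractions]

f0, slope, intercept = fit_null_fraction(fractions, responses)
print(f"null fraction = {f0:.2f}, slope = {slope:.2f}")
```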

### Model simulations

Model simulations in the discussion were performed using Matlab 6.0 and Simulink 4.0 (The MathWorks). Details on model simulations can be found in the appendix. Best-fit parameters of the two models in the discussion were obtained using nonlinear least-squares data fitting by the Gauss–Newton method (routine lsqcurvefit in Matlab) in combination with a multistart procedure using different initial parameters. Results were double-checked with the Nelder–Mead algorithm. SDs of the parameters were computed from the Jacobian matrix and the residuals, using the Matlab routine nlparci. We used two goodness-of-fit measures: the adjusted *R*-square (*R*_{adj}^{2}) and the root mean squared error (*rmse*). The latter is defined as the square root of the mean squared distance between a data point and the corresponding model prediction. We included the *rmse* measure because *R*_{adj}^{2} was not always a proper indication of goodness of fit.

## RESULTS

To illustrate how we determined the amount of counterrotation necessary for spatial stability from the sway estimates, Fig. 2 shows the responses from a typical subject in the upright experiment. The dashed line denotes the theoretical case of perfect responses (i.e., slope −1, intercept 1). The solid line is a linear regression through the data points. At all frequencies, the intersection of the regression line with the horizontal axis is to the left of the dashed line, implying that the amount of counterrotation necessary for spatial constancy (the null fraction, *F*_{0}) was less than the ego-motion amplitude. For example, the null fraction of 0.48 in Fig. 2*A* means that line counterrotation amounting to only 48% of chair rotation (instead of 100%) was already sufficient for visual stability. Thus what looked visually stable to the sinusoidally rotating subject was far from stable in a physical sense, as if the changes in body orientation were severely underestimated with respect to the visual orientation changes. We will now characterize the responses by the slopes and the *x*-axis intercepts that follow from the linear regressions on the sway estimates.

### Comparison of upright and supine results

#### Slopes.

The regression lines in Fig. 2 are all steeper than the veridical response (dashed line), which appeared to be a general phenomenon. In the upright condition, all regression lines had slopes beyond −1, ranging from −1.1 to −2.9, with a mean (±1 SD) of −1.7 ± 0.5. In the supine condition, slopes ranged between −0.7 and −2.0, with an average of −1.4 ± 0.4. In both conditions, there were considerable differences among subjects. The frequency dependency of the slope is shown in Fig. 3, for the upright (*A*) and supine (*B*) conditions. It can be seen that there is no clear frequency dependency. A repeated-measures ANOVA, with frequency (0.05, 0.1, 0.2, and 0.4 Hz) and condition (upright and supine) as factors, revealed no significant effect of frequency [*F*(3,35) = 1.24; *P* = 0.33], but showed a significant effect of condition [*F*(1,35) = 18.2; *P* < 0.01].

To avoid cluttering, Fig. 3 does not contain SDs. The mean SD for the upright condition (Fig. 3*A*) was 0.12 ± 0.05 for 23 of the 24 data points. There was one outlier with SD = 0.69. For the supine condition (Fig. 3*B*), the mean SD was 0.08 ± 0.03 for 23 of the 24 data points, with one outlier (SD = 0.33).

A control experiment measuring peak-to-peak sway of moving lines in four stationary subjects showed gains closer to 1.0 but still with a significant bias. From this result, it seems unlikely that the unexplained bias phenomenon is related to the visual stability task. The effect was not explored further because our conclusions will entirely be based on the null fractions (see discussion).

#### Null fractions.

The null fractions for all subjects (i.e., the intersections of the linear regressions with the *x*-axis) are shown in Fig. 4 as a function of frequency, for the upright (*A*) and the supine (*B*) conditions. Recall that perfect performance would yield null fractions of 1.0 and note the horizontal logarithmic scale.

Figure 4*A* shows that five of the six subjects exhibited very similar behavior in the upright condition. Null fractions increased significantly with frequency from about 0.5 to about 0.8. A linear regression of null fraction on log (frequency), using the data of these five subjects, yielded *r* = 0.93 (*P* < 0.0001; *n* = 20). Subject BB, with clearly better performance, did not conform to this picture. This subject showed rather a frequency-dependent decay from a null fraction of about 1.0 to about 0.75 that was difficult to substantiate, however, because of the small number of tested frequencies (*n* = 4).

Compared with upright, the supine experiments (Fig. 4*B*) always yielded poorer performance, especially at the lower frequencies. Five of the six subjects again showed similar behavior, and subject BB was again different. In the homogeneous group of five subjects, the mean null fraction increased with frequency from about 0.15 to about 0.6. A linear regression of null fraction on log (frequency) yielded *r* = 0.90 (*P* < 0.0001; *n* = 20). Subject BB again showed no clear frequency dependency in this condition. A repeated-measures ANOVA, with frequency and condition as factors, indeed revealed significant effects of frequency [*F*(3,35) = 15.4; *P* < 0.001] and condition [*F*(1,35) = 97.6; *P* < 0.01].

As in the previous figure, SDs are not shown in Fig. 4. The SDs for the two conditions were comparable and showed no frequency dependency. The mean SD was 0.03 ± 0.02 for 47 of the 48 data points. The *top left* data point (0.05 Hz) in the upright condition (Fig. 4*A*) is an outlier with SD = 0.27.

### Results of control experiment

We performed a control experiment to investigate the possibility of an optokinetic effect induced by the continuously visible moving line. In the control experiment, we used a stroboscopic line, flashed at a frequency of 5 Hz. Sway estimates (not shown) were very similar to those of the standard experiment (Fig. 4), in showing the same frequency dependency and the same supine–upright difference. Indeed, a three-way ANOVA, with frequency (0.05 and 0.4 Hz), condition (upright and supine), and line type (continuous and stroboscopic) as factors, revealed no significant effect of line type [*F*(1,20) = 0.59; *P* > 0.05].

## DISCUSSION

We investigated visuospatial orientation perception during sinusoidal rotation with and without gravity cues. The results of the standard experiment (Fig. 4) show that performance was imperfect in that the null fraction (i.e., the fraction of counterrotation yielding spatial constancy) was nearly always well below 1.0. In the upright condition performance was always better than that in the supine condition, and this difference was most pronounced at low frequencies. The interpretation of these results will proceed in two steps. First, we will discuss how the null fractions can be interpreted as the expression of visual and vestibular processing, and how the vestibular contribution to visuospatial orientation updating can be determined. Second, we will discuss two different schemes for how this vestibular gain could be obtained from canal and otolith signals.

### What determines spatial orientation constancy?

In our experiments, the two important physical variables were head orientation in space (*H*_{S}) and line orientation relative to the head (*L*_{H}). The task of the subject was to judge peak-to-peak variations of line orientation in space (*ΔL̂*_{S}). To estimate *L̂*_{S}, the subject needs estimates of head orientation in space (*Ĥ*_{S}), of eye orientation relative to the head (*Ê*_{H}), and of line orientation relative to the eye (*L̂*_{E}).

A simple scheme of how the two inputs can be transformed into the estimated line orientation, based on intermediate internal estimates, can be seen in Fig. 5. The purpose of this scheme is to be explicit about the various factors that may have contributed to the nonveridical performance seen in the experiments (see Fig. 4). These factors, indicated by five triangular gain boxes in Fig. 5, include imperfect coding of retinal line orientation (*G*_{vis}) and of head tilt (*G*_{vest}). Although it has been suggested that somatosensory cues may also contribute to signal *Ĥ*_{S} (e.g., Anastasopoulos and Bronstein 1999), this system was not included because its role at small tilt angles (<30°) appears negligible (Trousselard et al. 2004). A complicating factor in the interpretation of the data is the torsional vestibuloocular reflex (VOR), with gain *G*_{vor}, which causes discrepancies between *L*_{H} and *L*_{E}. The scheme allows for the possibility that these visual consequences may be compensated centrally, if only partially, with gain *G*_{comp}. Finally, determining *ΔL̂*_{S} from the time-varying signal *L̂*_{S} may also be subject to error (i.e., if *G*_{scal} ≠ 1).

Our first objective is to determine the vestibular gain (*G*_{vest}) as a function of frequency, both for the upright and supine condition. In view of the many factors involved, this may seem impossible at first sight. To make the problem tractable, we translated Fig. 5 into analytical terms (see *VOR compensation* in the appendix). Based on the resulting expression (*Eq. A4*), it could be shown that the slope of the regression line through the sway estimates (see Fig. 2) has no relation to *G*_{vest} and depends only on the product of *G*_{vis} and *G*_{scal}.

The expression for the null fraction (*F*_{0}) reads (see *Eq. A6* in the appendix) (1) Accordingly, *G*_{vest} can be computed from the null-fraction data if we can substitute plausible values for *G*_{vis}, *G*_{vor}, and *G*_{comp}. We will now show that this is possible for *G*_{vis} and *G*_{vor}, but less so for *G*_{comp}. It will be shown later that the remaining uncertainty about *G*_{comp} has only limited effects on *G*_{vest} with no consequences for our overall final conclusions.
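To make the structure of this argument concrete, the following is a sketch of how expressions of this kind can arise from a linear reading of the scheme in Fig. 5. It is not a reproduction of *Eq. A4* or *Eq. A6*, which may use different conventions. The assumptions, all ours, are: the line rotates relative to the head in counterphase with rotation fraction *F*; the VOR rotates the eye relative to the head with a negative gain *G*_{vor}; each internal estimate is the corresponding physical signal scaled by its gain; and the estimated line orientation in space is the sum of the three estimates.

```latex
% Assumed signal relations (one possible reading of the scheme in Fig. 5)
\begin{align*}
L_H &= -F\,H_S, & E_H &= G_{vor}\,H_S \quad (G_{vor} < 0), & L_E &= L_H - E_H,\\
\hat{H}_S &= G_{vest}\,H_S, & \hat{E}_H &= G_{comp}\,E_H, & \hat{L}_E &= G_{vis}\,L_E .
\end{align*}
% Estimated line orientation in space as the sum of the three estimates
\begin{align*}
\hat{L}_S &= \hat{H}_S + \hat{E}_H + \hat{L}_E
           = \bigl[\,G_{vest} + (G_{comp} - G_{vis})\,G_{vor} - G_{vis}\,F\,\bigr]\,H_S ,\\
\Delta\hat{L}_S &= G_{scal}\,\bigl[\,G_{vest} + (G_{comp} - G_{vis})\,G_{vor} - G_{vis}\,F\,\bigr]\,\Delta H_S .
\end{align*}
% Slope of the normalized sway estimate with respect to F, and its zero crossing
\begin{align*}
\frac{\partial\,(\Delta\hat{L}_S / \Delta H_S)}{\partial F} &= -\,G_{vis}\,G_{scal}, &
F_0 &= \frac{G_{vest} + (G_{comp} - G_{vis})\,G_{vor}}{G_{vis}} .
\end{align*}
```

Under these assumptions the slope involves only the product of *G*_{vis} and *G*_{scal}, whereas the zero crossing depends on *G*_{vest}, *G*_{vis}, *G*_{vor}, and *G*_{comp} but not on *G*_{scal}. With *G*_{vis} = 1 and *G*_{comp} = 0.5, the sketch reduces to *F*_{0} = *G*_{vest} − 0.5*G*_{vor}, so that vestibular gain and null fraction differ by 0.5*G*_{vor}.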

#### Torsional VOR parameters.

Figure 5 shows that instead of separate processing of visual and vestibular signals, the two interact through the torsional VOR. The brain could compensate for this effect if it has access to inflow or outflow signals (Carpenter 1988; Wurtz and Sommer 2004) about the eye movements. When the torsional VOR is fully taken into account by the brain (i.e., *G*_{comp} = 1), this interaction has no effect on *L̂*_{S}. However, it has been suggested that *G*_{comp} < 1, meaning that the torsional VOR is only partially compensated (Balliet and Nakayama 1978; Pavlou et al. 2003; Wade and Curthoys 1997). Values for *G*_{comp} suggested in the literature are about 0.24 (Pavlou et al. 2003), about 0.67 (Balliet and Nakayama 1978), and 1.0 (Mast 2000). We used *G*_{comp} = 0.5. Given this uncertainty, we will later explore the effect of using different values for this parameter.

The VOR icon in the scheme represents the total gain of the canal- and otolith-driven VOR subsystems. For simplicity, these subsystems have been approximated as a lumped linear system and the role of head velocity signals in the canal-driven VOR has not been made explicit. The point emphasized in the scheme is that torsional eye movements affect the relation between line orientation relative to the head and line orientation relative to the retina, irrespective of which subsystem caused them. The overall gain of the human torsional VOR (*G*_{vor}) that we used, taken from Schmid-Priscoveanu et al. (2000), did take into account that the VOR gain depends on rotation frequency and on whether the subject is rotated in upright or supine posture. Schmid-Priscoveanu et al. (2000) found that, in subjects in the upright condition, gain increased from about 0.15 to about 0.30, between 0.05 and 0.4 Hz. In the supine condition, gain increased from about 0.05 to about 0.20. Our *G*_{vor} gains are based on the model of Schmid-Priscoveanu et al. (2000) (depicted in Fig. 6 of their paper), which describes their experimental VOR gains very well. In our analysis, *G*_{vor} was taken as negative to express the compensatory nature of the VOR.

#### Visual and scaling gain.

The visual gain *G*_{vis} and the scaling gain *G*_{scal} were introduced in the scheme to consider two further possible sources of error (i.e., gains different from 1). In a pure mathematical sense, either term could explain why the slopes of the regression lines (Fig. 2) differ systematically from −1 (see *Eq. A5*). However, that *G*_{vis} would be responsible can be safely excluded on functional grounds.

The visual gain *G*_{vis} represents the coding of retinal line orientation by the visual system. It is very unlikely that *G*_{vis} ≠ 1 because the full range of orientations (0–360°) has to be coded, which is possible only when *G*_{vis} = 1. Further indications come from line-orientation estimates (Kaptein and Van Gisbergen 2005; Van Beuzekom et al. 2001). Based on these considerations we fixed *G*_{vis} at 1, which implies that the slope of the regression lines reflects *G*_{scal} (*Eq. A5*). In results (see Fig. 3) we saw that there was some dependency of the slopes on pitch orientation (upright vs. supine). We cannot explain this effect, but, because the difference was relatively small (mean slope upright: −1.7 ± 0.5; supine: −1.4 ± 0.4) and lacked frequency dependency, we ignored it in further analyses.

#### Reconstructed vestibular gains.

Using these assumed values for *G*_{vor}, *G*_{comp}, and *G*_{vis}, we can apply *Eq. 1* to the null-fraction data in Fig. 4 to obtain the vestibular gain. Note that if we had fitted the complete expression (*Eq. A4*) to the raw data instead, the results would have been identical. The derived vestibular gains differ from the null fractions by a value of 0.5*G*_{vor}, which is frequency dependent. This value ranges from 0.14 at 0.05 Hz to 0.17 at 0.4 Hz in the upright condition and from 0.09 at 0.05 Hz to 0.12 at 0.4 Hz in the supine condition. Because this frequency dependency of the correction factor is only weak, the transformation from null fraction to vestibular gain can be approximated by a frequency-independent vertical shift of 0.16 for the upright condition and 0.11 for the supine condition. Using this approximation, the vestibular gain *G*_{vest} is indicated along additional vertically shifted axes on the *right* of Fig. 4.

#### No evidence for optic-flow effects.

We are aware that the scheme in Fig. 5 may not be complete. For example, large-field visual motion may have an effect on *Ĥ*_{S} (Dichgans et al. 1974; Wertheim 1987). It has also been shown that a rotating line can affect ocular torsion (Mezey et al. 2004), which means that *L*_{H} can influence *E*_{H}.

Inspired by Wertheim (1987), we performed a control experiment using a stroboscopic line, to check for such potential optic-flow effects. In this way, a percept of continuous movement was prevented, but subjects could still estimate the peak-to-peak sway by comparing successive line orientations. Because the results were very similar to those of the standard experiment (see results) we conclude that the effect of these factors on our results may be safely ignored.

### Role of rotational and gravitational cues

We now proceed to consider how central processing of canal and otolith signals in the gray-marked zone of Fig. 5 could account for the frequency dependency of the total vestibular gains (*G*_{vest}) in upright and supine conditions (see Fig. 4). Maintaining orientation constancy during self-motion requires information about changes in head orientation with respect to a spatial reference frame. This information could be obtained through two very different mechanisms. One method, based on the use of rotational cues, requires mathematical integration of head angular velocity (path integration). A second approach uses graviceptive cues to detect changes in head position. Although the first mechanism could work in both the upright and the supine conditions, the graviceptive mechanism was available only in the upright experiment. Thus performance in the supine experiment must have been based exclusively on the processing of rotational cues.

In the upright condition, however, the availability of both cues suggests three potential strategies to perform the task. The first possibility is that the brain uses only rotational cues, just as in the supine experiments. According to the second scenario, the brain shifts strategy in the upright condition by exploiting the availability of gravitational cues and ignoring the rotational cues. The third option is that performance in the upright experiments reflects a combination of both graviceptive and rotational cues. Interestingly, as will be shown shortly, the first two scenarios can be discounted on the basis of the data, whereas the third appears quite feasible.

The first above-mentioned strategy for performing the task in the upright position implies that subjects always relied exclusively on rotational cues, both in upright and in supine conditions. An explanation of the fact that performance was better in the upright condition would then require that rotational cues should be different in these two conditions. In theory, the canal–otolith interaction model (Merfeld 1995; Merfeld and Zupan 2002) allows for this possibility, in that gravity cues can improve the estimate of angular velocity and thereby have an effect on rotational cues. In fact, when using parameter values proposed earlier for ego-motion perception under similar conditions (Merfeld et al. 2005) the effect of gravity on the estimate of angular velocity is marginal and substantially insufficient to explain the upright–supine difference in our data (not shown).

The second option—to use graviceptive cues in the upright position and to rely on rotational cues in the supine position—is not realistic either. The problem is that neither the canal–otolith interaction model nor the frequency-segregation model can account for the high-pass characteristics of the upright data. This is obvious for the frequency-segregation model, which relies on low-pass filtering of the otolith signal to obtain a gravity-based estimate. When using parameter values proposed earlier, the canal–otolith interaction model predicts an almost veridical gravity estimate, which similarly lacks the required high-pass characteristics (see Merfeld et al. 2005).

We are thus left with the third possibility: to use both gravity cues and rotational cues in parallel, whenever the occasion arises. That rotational cues can influence visual space perception in the upright condition has been suggested before (Jaggi-Schwarz and Hess 2003; Jaggi-Schwarz et al. 2003; Keusch et al. 2004; Udo de Haes and Schöne 1970). Furthermore, it is known that optic-flow stimulation with a rotating random-dot pattern about the line of sight has a clear effect on the visual subjective vertical (Dichgans et al. 1972; Held et al. 1975; Mittelstaedt 1995). Our working hypothesis, then, can be formulated as follows. The total vestibular response as a function of frequency *G*_{vest}(*f*), seen in Fig. 4, is the sum of a component derived from rotational cues, *R*(*f*), and a component based on graviceptive cues, *G*(*f*). In the simplest version of this linear-summation model, to be worked out next, component *R*(*f*) is independent of condition and contributes equally in both the upright and the supine positions.

Reconstruction of the two contributing terms (*R* and *G*), based on this assumption of independence, is shown in Fig. 6. The thin lines show component *R*(*f*), measured in isolated form in the supine condition where contribution *G*(*f*) is zero. Note the high-pass characteristics, compatible with the expectation that this signal must have a canal origin (Young 1984). The thick lines show contribution *G*(*f*), which is the gain improvement in the upright condition, obtained by subtracting the supine from the upright data. It can be seen that the gravity effect decreases with frequency (low-pass behavior) in all subjects.
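The reconstruction described here amounts to a pointwise subtraction per frequency, which the following minimal sketch illustrates. The function name `decompose_gains`, the frequencies, and the gain values are hypothetical placeholders chosen for illustration; they are not the data of Fig. 4 or Fig. 6.

```python
def decompose_gains(upright, supine):
    """Split total vestibular gains into rotational (R) and graviceptive (G) parts.

    Assumes the linear-summation hypothesis with a condition-independent R:
    supine gain = R(f), upright gain = R(f) + G(f).
    Both arguments map frequency (Hz) to total gain.
    """
    if upright.keys() != supine.keys():
        raise ValueError("upright and supine gains must share the same frequencies")
    r = dict(supine)                                  # R(f): seen in isolation when supine
    g = {f: upright[f] - supine[f] for f in upright}  # G(f): improvement when upright
    return r, g


# Placeholder gains for illustration only (not values from the study).
upright_gain = {0.05: 0.40, 0.1: 0.50, 0.2: 0.60, 0.4: 0.70}
supine_gain = {0.05: 0.10, 0.1: 0.25, 0.2: 0.45, 0.4: 0.62}

R, G = decompose_gains(upright_gain, supine_gain)
for f in sorted(R):
    print(f"{f:.2f} Hz  R = {R[f]:.2f}  G = {G[f]:.2f}")
```

With placeholder values of this kind, *R* rises and *G* falls with frequency, which is the qualitative pattern described for the thin and thick lines.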

### Two implementations of linear-summation model

Our next objective is to explore in some detail how contributions *G* and *R* could be derived from the vestibular sensors (otoliths and canals). To attain better insight into this fundamental question, we performed simulations with two specific model implementations of our linear-summation hypothesis (see *Model simulations* in appendix).

Each scheme contains two branches, one for the *G* and one for the *R* component, which are added after scaling by a weighting factor (see Fig. 7). The major difference between the two models revolves around a hot topic in the literature: the question of how the *G* component can be derived from the vestibular sensors, better known as the otolith-ambiguity problem (see introduction). The solutions to the ambiguity problem that we implemented in the two models were borrowed from the existing literature. In the first model that we will consider (Fig. 7*A*), this process relies on canal–otolith interaction. The second scheme (Fig. 7*B*) obtains the *G* component by simple low-pass filtering of the raw otolith signal. In both models, the *R* contribution is obtained by path integration, after preprocessing of estimated head angular velocity (*ω̂*) by a velocity-threshold element.

We found that incorporating the threshold element was essential. In both models, path integration without this nonlinear element failed to reproduce the frequency characteristics of the supine data. In the canal–otolith interaction model, this is because the rotational output shows virtually no frequency dependency in our frequency range, a consequence of the implemented velocity-storage mechanism. In the filter model, the raw canal signal likewise has too high a gain at the lower frequencies. These observations show that additional processing is necessary to account for our data. One possibility is to include a velocity threshold (Glasauer and Mittelstaedt 1998; Mergner et al. 1991) in the *R* branch. The effect of the velocity threshold is to shift the integrated canal curve to higher frequencies, as if the cutoff frequency were higher. Without a threshold, the decay occurs at lower frequencies, below the frequency range investigated in this study. Becker et al. (2002) suggested that a velocity threshold may be functionally useful by preventing low-frequency noise in the canal signal from influencing the subsequent velocity-to-position conversion. Simulation results, presented in the next sections, show that incorporating a velocity threshold can at least partly solve the problem encountered by the simpler versions of the two models. The effect of a velocity threshold suggests that a high-pass filter in the *R* branch would be a potential alternative. This possibility is discussed in more detail in the sections *Fit results* and *Evaluation of linear-summation model*.
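Why a velocity threshold acts like a shift toward higher frequencies can be seen in a minimal sketch of thresholding followed by path integration. Several assumptions are made here for illustration: the dead zone subtracts the threshold from supra-threshold velocities (one common convention), canal dynamics are ignored so that the input is the true head velocity, and the threshold of 4°/s is an arbitrary illustrative value. The function names are hypothetical.

```python
import math


def dead_zone(omega, omega0):
    """Dead-zone element: velocities within the threshold are ignored,
    larger ones are reduced by the threshold (assumed convention)."""
    if abs(omega) <= omega0:
        return 0.0
    return omega - math.copysign(omega0, omega)


def integrated_gain(freq_hz, amplitude_deg, omega0, steps=20000):
    """Peak-to-peak gain of the integrated, thresholded velocity over one
    cycle of a head tilt amplitude_deg * sin(2*pi*f*t)."""
    w = 2.0 * math.pi * freq_hz
    dt = 1.0 / (freq_hz * steps)
    position, lowest, highest = 0.0, 0.0, 0.0
    for i in range(steps):
        t = (i + 0.5) * dt                            # midpoint rule
        velocity = amplitude_deg * w * math.cos(w * t)
        position += dead_zone(velocity, omega0) * dt  # path integration
        lowest, highest = min(lowest, position), max(highest, position)
    return (highest - lowest) / (2.0 * amplitude_deg)


for f in (0.05, 0.1, 0.2, 0.4):
    print(f"{f:.2f} Hz  gain = {integrated_gain(f, 15.0, omega0=4.0):.2f}")
```

Because the peak velocity of a fixed-amplitude sinusoid grows in proportion to frequency, the same threshold removes a large share of the signal at low frequencies and a small share at high frequencies, so the integrated gain rises with frequency even without any linear high-pass filtering.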

To emphasize the important differences in the *G* branch, the two models will be denoted as interaction and filter model, respectively. These labels should not be mistaken as indications that our two schemes are merely replicas of the existing canal–otolith interaction and frequency-segregation models in the literature. Instead, both are extended versions of these schemes by the addition of the path-integrating *R* branch and both embody our proposal of linear summation of *G* and *R* signals. The question is whether the linear-summation model can work with either otolith-disambiguation proposal.

#### Interaction model.

Figure 7*A* shows how we incorporated the canal–otolith interaction module into our linear-summation scheme. The module, taken from Merfeld and Zupan (2002), has four parameters (see appendix, canal–otolith interaction module). We used the same parameter values that Merfeld et al. (2005) found to be optimal for ego-motion perception.^{1}

The model computes internal estimates for the direction of gravity (*ĝ*), for angular velocity (*ω̂*), and for linear acceleration (*â*) from the canal and otolith signals. Output signal *â* plays no role in our model, which is limited to rotational updating. As shown, signal *ĝ* feeds the *G* pathway and *ω̂* feeds the *R* pathway. Note that all outputs, including *ω̂*, are the result of canal–otolith interaction and thus can be affected by gravity. Accordingly, this scheme allows for the possibility that the *R* contribution may differ somewhat in the two conditions (upright vs. supine). However, simulations for our experimental conditions with the chosen set of parameters have shown that *ω̂* is virtually identical in the upright and supine conditions.

Both the *G* and the *R* signals are scaled and then combined in a summing junction. Altogether, the model has three free parameters: the velocity threshold (ω_{0}) and two scaling parameters (*W*_{G} and *W*_{R}). Note that all parameters are independent of condition (upright vs. supine). See appendix, *Model simulations*, for further details.

#### Filter model.

Figure 7*B* shows that otolith disambiguation in this version of the linear-summation model is managed by low-pass filtering of the raw otolith signal (Paige and Tomko 1991; Seidman et al. 1998). This frequency-segregation module has only one free parameter, the time constant of the low-pass filter (τ_{lp}). Signal *ω̂* in this model equals the output of the canals (i.e., a high-pass-filtered head-velocity signal). Because there is no effect of gravity on *ω̂* in this model, the *R* contribution in this scheme (Fig. 7*B*) is identical for both upright and supine, as assumed in the reconstruction of *R* and *G* in Fig. 6.
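The frequency dependence that the frequency-segregation module imposes on the gravity signal can be illustrated with a short sketch. It assumes a first-order low-pass filter, 1/(τ_{lp}·s + 1), which is consistent with a single time constant but is not confirmed by the text; the function names and the two time constants are illustrative only.

```python
import math


def lowpass_gain(freq_hz, tau_lp):
    """Amplitude gain of a first-order low-pass filter 1 / (tau*s + 1)."""
    return 1.0 / math.sqrt(1.0 + (2.0 * math.pi * freq_hz * tau_lp) ** 2)


def lowpass_phase_deg(freq_hz, tau_lp):
    """Phase of the same filter in degrees (negative values are lags)."""
    return -math.degrees(math.atan(2.0 * math.pi * freq_hz * tau_lp))


for tau in (1.0, 3.0):  # illustrative time constants in seconds
    gains = [round(lowpass_gain(f, tau), 2) for f in (0.05, 0.1, 0.2, 0.4)]
    print(f"tau_lp = {tau} s -> gains {gains}")
```

A longer time constant lowers the gain across the whole tested range, whereas a shorter one leaves the lowest frequencies nearly unattenuated, so τ_{lp} controls how steeply the gravity contribution declines with frequency.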

As in the interaction model, a velocity threshold, two weighting factors, and a summing junction complete the scheme. This model has four free parameters (τ_{lp}, ω_{0}, *W*_{G}, and *W*_{R}). Again, all parameters are independent of condition.

#### Fit results.

The best-fit result of the interaction model for the homogeneous group of five subjects can be seen in Fig. 8*A*. The model fits these pooled data rather well (*R*_{adj}^{2} = 0.72, *n* = 40). However, the low-pass characteristic of *G* (see Fig. 6) was not accurately captured by the model. Figure 8*A* shows that the additional effect of gravity (solid minus dashed line) in the fit is roughly constant, with almost no sign of low-pass decay. We cannot exclude that using a different parameter set for the canal–otolith interaction module (*k*_{ω}, *k*_{fω}, *k*_{f}, and *k*_{a}) would improve model performance in this respect. The fit results for subject BB are shown separately in the *inset* of Fig. 8*A*. The model can also fit the data from this deviating subject reasonably well. Best-fit results for individual subjects can be found in Table 1, which provides two measures for the goodness of fit (*R*_{adj}^{2} and the *rmse*; see methods).

The best-fit results of the filter model for the pooled data of five subjects, shown in Fig. 8*B*, were very good (*R*_{adj}^{2} = 0.82, *n* = 40). Note that this model can better account for the contribution of gravity cues in the upright condition. The best-fit result for deviating subject BB in the *inset* shows that the model can also describe his results very well. Best-fit parameters can be found in Table 1.

Comparison of the best-fit values in Table 1 shows that *W*_{R} typically exceeds *W*_{G} in both models, by a factor of nearly 2. The fitted velocity thresholds range between 2.2 and 6.3°/s, which is in the range of values proposed by others (e.g., Glasauer and Mittelstaedt 1998; Mergner et al. 1991; Sills et al. 1978). Subject BB appears to have no significant velocity threshold. Values for the low-pass filter time constant (τ_{lp}) reported in the literature include 7 s (Seidman et al. 1998), 2.8 s (Bos and Bles 2002), and 2.3 s (Merfeld et al. 2005). Our values range between 0.9 and 4.8 s, mostly rather low compared with the values in the literature.

Earlier we made the assumption that *G*_{comp} = 0.5, admittedly a rather arbitrary choice. The main effect of changing *G*_{comp} is to shift the curves in Fig. 4 upward (*G*_{comp} > 0.5) or downward (*G*_{comp} < 0.5) with respect to the right-hand vertical scale, whereas the high-pass characteristics always stay intact. When the two models were refitted on the pooled data of the five similar subjects by setting *G*_{comp} to the theoretical limits of 0 and 1 successively, the goodness-of-fit measures for both models scarcely changed. Compared with the values from Table 1 (*G*_{comp} = 0.5), the best-fit values for *W*_{G} and *W*_{R} fell and rose by about 0.15 for the filter model and by roughly 0.10 for the interaction model, for *G*_{comp} values of 0 and 1, respectively. The parameter ω_{0} changed by about 1°/s, downward for *G*_{comp} = 1 and upward for *G*_{comp} = 0, in both models. Parameter τ_{lp} of the filter model was not significantly affected by changes in *G*_{comp}. Thus the value of *G*_{comp} has no effect on our conclusion concerning the relative strength of canal and otolith contributions.

As stated before, a high-pass filter would be a potential alternative to the velocity threshold in the *R* branch. When a high-pass filter with a single time constant τ_{hp} was used instead, both models fitted the data almost as well as the velocity-threshold versions. The best-fit parameters of the filter model for the pooled data of the five similar subjects were: *W*_{G} = 0.40 ± 0.04, *W*_{R} = 0.62 ± 0.07, τ_{lp} = 0.4 ± 0.1 s, τ_{hp} = 0.6 ± 0.1 s, and *R*_{adj}^{2} = 0.75. For the interaction model the best-fit parameters were: *W*_{G} = 0.33 ± 0.04, *W*_{R} = 0.71 ± 0.08, τ_{hp} = 0.7 ± 0.2 s, and *R*_{adj}^{2} = 0.69. Because the weighting factors of both contributions are close to the values in Table 1, it appears that this issue is not critical for our conclusions regarding the weighting of rotational and graviceptive cues.

#### Evaluation of linear-summation model.

In both versions of the model, integration of the original *ω̂* signal failed to account for the supine data. Simulations showed the necessity of additional processing that partly excludes the lower frequencies in this signal from the integration process, which can be achieved by a velocity threshold element or a high-pass filter. We also showed that, to explain the upright data, both models required a combination of a rotational and a gravity-related signal. The former, processed as in supine, was essential to account for the high-pass-frequency characteristics of the upright data. Addition of the latter signal was crucial to explain the improvement in performance at the lower frequencies, compared with the supine results. The filter model was better able to account for the low-pass characteristics of this improvement in upright performance, shown in Fig. 6, by virtue of the low-pass filter in its gravity pathway. This was a problem for the interaction model whose gravity signal, when using the parameter set from Merfeld et al. (2005), appears to have an almost flat frequency characteristic in the presently tested frequency range. Given these characteristics, the optimal fit result that can be obtained entails that the gravity pathway adds an almost constant gain factor. As a result, the supine and upright curves of the interaction model (see Fig. 8*A*) are almost vertically shifted versions of each other.

However, this finding should be interpreted with some caution. For example, we cannot exclude that the interaction model would perform better with a different set of parameters for the interaction module. Further testing under a broader range of experimental conditions, preferably including exposure to combined rotation–translation paradigms, may also change the picture. The lack of detailed phase information in our data is further reason for postponing definite judgment about the relative merits of the two implementations of the model. The rough phase indications, provided by the toggle-switch responses, were compatible with both model predictions.

We showed that use of a velocity threshold or a high-pass filter in the *R* branch gave similar results. However, these elements are not simply exchangeable under all conditions because the high-pass filter is frequency dependent, whereas the threshold is velocity dependent. To test whether the observed frequency dependency is in fact a velocity dependency, it would be useful to perform further experiments, using the same frequencies but with higher velocity amplitudes. If this results in a higher gain, especially at the lower frequencies, this would argue for a velocity threshold.
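The logic of this proposed test follows from the peak velocity of a sinusoidal tilt, worked out below for the 15° amplitude used in this study at the two ends of the tested frequency range. The doubled amplitude of 30° is a hypothetical example and not a condition that was tested.

```latex
\[
\theta(t) = A \sin(2\pi f t)
\quad\Longrightarrow\quad
\dot{\theta}_{\mathrm{peak}} = 2\pi f A
\]
\[
A = 15^{\circ},\; f = 0.05~\mathrm{Hz}:\quad
\dot{\theta}_{\mathrm{peak}} = 1.5\pi \approx 4.7^{\circ}/\mathrm{s}
\]
\[
A = 15^{\circ},\; f = 0.4~\mathrm{Hz}:\quad
\dot{\theta}_{\mathrm{peak}} = 12\pi \approx 37.7^{\circ}/\mathrm{s}
\]
\[
A = 30^{\circ},\; f = 0.05~\mathrm{Hz}:\quad
\dot{\theta}_{\mathrm{peak}} = 3\pi \approx 9.4^{\circ}/\mathrm{s}
\]
```

At the lowest frequency the peak velocity lies within the range of the fitted thresholds (2.2 to 6.3°/s), so a threshold element would suppress most of the signal there. Doubling the amplitude at the same frequency would lift the peak velocity well above that range and should raise the gain if a threshold is responsible, whereas a linear high-pass filter would yield the same gain at both amplitudes.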

Apart from these remaining questions, a major conclusion supported by these results is that the notion of additive rotational and gravitational signals is essential for either version. This concept of additive rotational and graviceptive signals was proposed in studies of the torsional VOR (Bartl et al. 2005; Schmid-Priscoveanu et al. 2000) but not yet evaluated for visuospatial perception. A further interesting conclusion is that the relative importance of gravitational and rotational contributions (weights *W*_{G} and *W*_{R}), reconstructed on the basis of the two models, yields a consistent picture (see Table 1). In virtually all subjects, and in both models, *W*_{R} exceeded *W*_{G} roughly by a factor of 2. Based on this measure, it may be argued that reliance on rotational cues dominated, even in the condition where gravitational cues were fully available. Whether *W*_{G} and *W*_{R} are really fixed weighting factors, independent of condition, is an implication of the model that deserves further testing. This would require exploration of a more extensive range of experimental conditions, including different rotation axes and a broader frequency range.

### Weights of rotational and graviceptive cues: comparison with earlier studies

Two studies that cannot be explained within the present framework were performed by Jaekl et al. (2005) and Wallach (1987). Both studied visual stability during self-generated head tilts. Wallach (1987) found null fractions close to 1.0. The values found by Jaekl et al. (2005) were clearly >1.0 but had a very large range of uncertainty. Remarkably, Jaekl et al. (2005) found no effect of gravity. By contrast, we found null fractions <1 and a clear effect of gravity. These discrepancies between our data and the data of the two previous studies may reflect differences in experimental design. Because, in contrast to our study, the head movements in both previous studies were active, neck proprioceptive information may have played an important role (Mergner et al. 2001). Also, the use of a large-field visual stimulus may have induced optokinetic effects that could have influenced the earlier results (Wertheim 1987). As noted before, optokinetic effects could not be demonstrated in our conditions.

Studies of the subjective visual vertical under static conditions have suggested a dominant otolith contribution (Kaptein and Van Gisbergen 2004; Mittelstaedt 1983). Other studies, using dynamic conditions (Jaggi-Schwarz and Hess 2003; Jaggi-Schwarz et al. 2003; Keusch et al. 2004; Udo de Haes and Schöne 1970), did find canal-related effects, but these were considered relatively small compared with the otolith contribution, or were seen as a sign of canal–otolith interaction. Unfortunately, since such experiments cannot be carried out in supine, canal and otolith contributions could not easily be distinguished.

A recent study that did compare both upright and supine rotations was performed by Klier et al. (2005). Using saccadic responses, they investigated spatial updating of allocentric positions necessitated by constant-acceleration rotations. This study found rotational updating gains around 0.3 and 0.9 in the supine and upright conditions, respectively, suggesting an important role for the otoliths. Seen from the perspective of our summation model, performance in the upright condition reflects the combined effects of *R* and *G* cues. Simulations show that reproducing the data of Klier et al. (2005) requires roughly equal *R* and *G* weights (interaction model: *W*_{G} ≈ 0.6, *W*_{R} ≈ 0.5; filter model: *W*_{G} ≈ 0.6, *W*_{R} ≈ 0.6). In other words, this interpretation of upright results suggests an important graviceptive contribution (*G*), along with a substantial rotational contribution (*R*). These data cannot be used to argue for or against either method of disambiguation. The fact that different weighting factors are necessary, compared with those derived from our data (Table 1), may relate to the fact that the Klier et al. (2005) experiment, unlike our paradigm, did not require moment-to-moment updating. Furthermore, the emergence of systematic errors at the larger tilts used in the saccadic experiment (Mittelstaedt 1983; Van Pelt et al. 2005) may have affected the results.

Thus both linear-summation models (Fig. 7), with tilt-independent weights for the rotational and graviceptive contributions, work quite well for our data and may also apply to the data of Klier et al. (2005). It remains to be seen whether this approach can be extended to other paradigms.

In conclusion, the vestibular role in visual orientation constancy during sinusoidal roll rotation was tested with and without gravity cues. Performance in the supine condition was far from perfect and showed high-pass characteristics. Gravity cues in the upright condition improved performance predominantly at lower frequencies.

Our results suggest that visuospatial rotational updating relies on a linear summation of rotational and graviceptive cues. This conclusion is supported irrespective of whether canal–otolith interaction or frequency segregation is used to solve the otolith-ambiguity problem.

## APPENDIX

### VOR compensation

On the basis of Fig. 5, *L̂*_{S} can be written as the sum of three terms: *L̂*_{E}, *Ê*_{H}, and *Ĥ*_{S}. Each of these can be written in terms of the input variables (*L*_{H} and *H*_{S}) and the gains *G*_{vis}, *G*_{vor}, *G*_{comp}, and *G*_{vest} according to

(A1)

(A2)

(A3)

The sway estimate of the subject, *ΔL̂*_{S}, can now be written as

(A4)

In all experiments, the amplitude of head motion (*H*_{S}) was kept constant at 15°. *Equation A4* then predicts a linear relation between *ΔL̂*_{S} and *L*_{H}. The slope of this linear relation corresponds to

(A5)

The null fraction (*F*_{0}), i.e., the intersection with the *x*-axis (see Fig. 2), is given by

(A6)

Ratio *L*_{H}/*H*_{S} is negative because *L*_{H} and *H*_{S} had opposite signs when the line was counterrotating relative to ego motion. Merely to ensure that *F*_{0} was positive, we included the minus sign before *L*_{H}/*H*_{S} in *Eq. A6*.

Because we assume that *G*_{vis}, *G*_{vor}, and *G*_{comp} are known, *G*_{vest} immediately follows from *F*_{0}. Note that *G*_{vest} is independent of *G*_{scal}.
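Reading off the null fraction as the *x*-axis intersection of a linear relation can be sketched as a least-squares line fit. The sketch assumes that each response is available as a signed sway estimate and that the horizontal axis is the line counterrotation expressed as a fraction of head rotation (−*L*_{H}/*H*_{S}); the function name and all data values are hypothetical.

```python
def null_fraction(fractions, sway_estimates):
    """Fit a least-squares line to (fraction, sway) pairs and return its x-intercept.

    fractions: line counterrotation as a fraction of head rotation (-L_H / H_S)
    sway_estimates: signed perceived sway of the line in space (arbitrary units)
    """
    n = len(fractions)
    if n < 2 or n != len(sway_estimates):
        raise ValueError("need at least two paired observations")
    mean_x = sum(fractions) / n
    mean_y = sum(sway_estimates) / n
    sxx = sum((x - mean_x) ** 2 for x in fractions)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(fractions, sway_estimates))
    if sxx == 0.0 or sxy == 0.0:
        raise ValueError("cannot determine an x-intercept from these data")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return -intercept / slope  # fraction at which the fitted sway is zero


# Hypothetical responses: perceived sway vanishes at half counterrotation.
example_fractions = [0.0, 0.25, 0.5, 0.75, 1.0]
example_sway = [3.0, 1.5, 0.0, -1.5, -3.0]
print(null_fraction(example_fractions, example_sway))
```

A null fraction below 1 obtained in this way would mean that the line appears stable in space when it counterrotates by less than the full head rotation, that is, that compensation for ego motion is incomplete.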

### Model simulations

To obtain the graviceptive and rotational contributions to the total vestibular updating gains of the models (Fig. 7) we began by computing the real-time predictions of both model branches (*G* and *R*) during the steady-state phase. The resulting periodic curves were added (∑) and the maximum peak-to-peak excursion of the resulting signal was used to calculate the total gain. Fits were performed simultaneously on the upright and supine data.
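The procedure just described can be sketched for the filter model as follows. This is a simplified illustration under stated assumptions, not the simulation code used in the study: the canals are a second-order high-pass filter with the 5-s and 80-s time constants quoted in this appendix, the gravity branch is a first-order low-pass filter acting directly on head tilt angle (linearized otolith signal, absent when supine), the dead zone subtracts the threshold, and the gain is the peak-to-peak excursion of the summed signal divided by the 30° peak-to-peak head tilt. All function names and parameter values are hypothetical.

```python
import cmath
import math

CANAL_TAUS = (5.0, 80.0)  # canal time constants in seconds, as quoted in this appendix


def canal_response(freq_hz):
    """Complex gain of the assumed second-order high-pass canal model."""
    s = 2j * math.pi * freq_hz
    gain = 1.0 + 0.0j
    for tau in CANAL_TAUS:
        gain *= tau * s / (tau * s + 1.0)
    return gain


def dead_zone(omega, omega0):
    """Velocity threshold: zero inside the dead zone, shifted outside (assumed)."""
    if abs(omega) <= omega0:
        return 0.0
    return omega - math.copysign(omega0, omega)


def filter_model_gain(freq_hz, upright, w_g, w_r, tau_lp, omega0,
                      amplitude_deg=15.0, steps=4000):
    """Total gain = peak-to-peak of (W_G*G + W_R*R) / peak-to-peak head tilt."""
    w = 2.0 * math.pi * freq_hz
    dt = 1.0 / (freq_hz * steps)
    canal = canal_response(freq_hz)
    lowpass = 1.0 / (1.0 + 1j * w * tau_lp)  # first-order low-pass (assumed)
    r_signal = 0.0
    summed = []
    for i in range(steps):
        t_mid, t_end = (i + 0.5) * dt, (i + 1) * dt
        # Steady-state canal output for a head tilt A*sin(w*t).
        omega_hat = (amplitude_deg * w * abs(canal)
                     * math.cos(w * t_mid + cmath.phase(canal)))
        r_signal += dead_zone(omega_hat, omega0) * dt  # path integration (R branch)
        g_signal = 0.0
        if upright:  # gravity cue assumed absent in the supine condition
            g_signal = (amplitude_deg * abs(lowpass)
                        * math.sin(w * t_end + cmath.phase(lowpass)))
        summed.append(w_g * g_signal + w_r * r_signal)
    return (max(summed) - min(summed)) / (2.0 * amplitude_deg)


# Illustrative parameter values only (not the fitted values of Table 1).
params = dict(w_g=0.3, w_r=0.6, tau_lp=2.0, omega0=4.0)
for f in (0.05, 0.1, 0.2, 0.4):
    up = filter_model_gain(f, True, **params)
    sup = filter_model_gain(f, False, **params)
    print(f"{f:.2f} Hz  upright {up:.2f}  supine {sup:.2f}")
```

Working with steady-state sinusoids for the linear stages avoids simulating the initial transient; only the dead zone and the integration are evaluated in the time domain. The arbitrary integration constant of the *R* branch does not affect the peak-to-peak measure.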

In both linear-combination models that we explored (Fig. 7), canals were represented as a high-pass filter with two time constants of 5 and 80 s (see Merfeld and Zupan 2002). Otoliths were modeled as a unity transfer function. Three parameters are common to both models: the velocity threshold (ω_{0}) and the weights for the graviceptive and rotational contributions (*W*_{G} and *W*_{R}, respectively). As in Mergner et al. (1991), the velocity threshold was implemented as a dead-zone element.

#### Canal–otolith interaction module.

The canal–otolith interaction module in our interaction model (see Fig. 7*A*) was taken from Merfeld and Zupan (2002). Briefly, the model obtains initial estimates about gravity (*g*), linear acceleration (*a*), and angular velocity of the head (ω) from the sensors. Using an optimization process involving internal feedback loops, the model then derives a set of output signals (*ĝ*, *â*, and *ω̂*) as its best guess of the underlying physical variables, based on inbuilt knowledge about the properties of its sensors and the laws of physics (internal model). The model has four parameters that scale the feedback signals before they enter the internal model. In our simulation, the parameters had the same values as in Merfeld et al. (2005): *k*_{ω} = 3, *k*_{fω} = 1, *k*_{f} = 1, *k*_{a} = −2. Both *k*_{ω} and *k*_{a} are dimensionless, whereas *k*_{f} and *k*_{fω} have units of rad · s^{−1} · rad^{−1}.

## GRANTS

This study was supported by The Netherlands Organisation for Scientific Research–Foundation for Earth and Life Sciences (NWO-ALW).

## Acknowledgments

We thank H. Kleijnen, G. Van Lingen, S. Martens, and G. Windau for excellent technical support. D. M. Merfeld provided helpful advice on the sinusoidal stimulation profile and on the fit procedures. W. P. Medendorp and S. Van Pelt gave valuable comments on an earlier version of this manuscript. Two anonymous referees provided valuable suggestions on modeling issues.

## Footnotes

^{1} Fitting these parameters as well, which would mean a total of seven parameters, appeared infeasible, mainly because of computational limitations. The model is quite robust to parameter changes (Merfeld et al. 1993), and when we took the values proposed by Merfeld and Zupan (2002) for the VOR, the results were indeed almost the same.

The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked “*advertisement*” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

- Copyright © 2006 by the American Physiological Society