## Abstract

Networks of neurons perform complex calculations using distributed, parallel computation, including dynamic “real-time” calculations required for motion control. The brain must combine sensory signals to estimate the motion of body parts using imperfect information from noisy neurons. Models and experiments suggest that the brain sometimes optimally minimizes the influence of noise, although it remains unclear when and precisely how neurons perform such optimal computations. To investigate, we created a model of velocity storage based on a relatively new technique–“particle filtering”–that is both distributed and parallel. It extends existing observer and Kalman filter models of vestibular processing by simulating the observer model many times in parallel with noise added. During simulation, the variance of the particles defining the estimator state is used to compute the particle filter gain. We applied our model to estimate one-dimensional angular velocity during yaw rotation, which yielded estimates for the velocity storage time constant, afferent noise, and perceptual noise that matched experimental data. We also found that the velocity storage time constant was Bayesian optimal by comparing the estimate of our particle filter with the estimate of the Kalman filter, which is optimal. The particle filter demonstrated a reduced velocity storage time constant when afferent noise increased, which mimics what is known about aminoglycoside ablation of semicircular canal hair cells. This model helps bridge the gap between parallel distributed neural computation and systems-level behavioral responses like the vestibuloocular response and perception.

- sensory estimation
- Bayesian
- particle filter

the brain uses networks of neurons to perform complex calculations using distributed, parallel computation, including the calculations required for dynamic motion control of the body, which, in turn, relies on estimating motion using sensory cues from afferent signals that are noisy, complementary, sometimes ambiguous, and could carry incomplete information. It has been proposed that the brain processes and combines vestibular cues using internal models (e.g., Angelaki et al. 2004; Angelaki and Cullen 2008; Borah et al. 1988; Bos and Bles 2002; Droulez and Darlot 1989; Glasauer 1992; Green et al. 2005; Laurens and Angelaki 2011; Laurens and Droulez 2007; MacNeilage et al. 2008; Merfeld et al. 1993, 1999; Merfeld and Zupan 2002; Mergner and Glasauer 1999; Newman 2009; Oman 1982; Paulin et al. 2004; Paulin 2005; Selva 2009; Selva and Oman 2012; Young 2011; Zupan et al. 2002) that reflect the influence of neural noise (Borah et al. 1988; Laurens and Angelaki 2011; Laurens and Droulez 2007; Merfeld et al. 1993; Oman 1982; Paulin et al. 2004; Paulin 2005; Selva 2009; Selva and Oman 2012) and that the brain fuses cues optimally (Borah et al. 1988; Ernst and Banks 2002; Laurens and Droulez 2007; Paulin et al. 2004; Paulin 2005; Selva 2009; Selva and Oman 2012). For example, it has been suggested that a goal of “velocity storage” (the elongation of the time constant of vestibular responses) is to reduce the effects of noise (Borah et al. 1988; Laurens and Angelaki 2011; Laurens and Droulez 2007; Oman 1982; Selva 2009; Selva and Oman 2012), a connection that we explore in this report. Furthermore, velocity storage has been implicated in sensory fusion (Green et al. 2005; Green and Angelaki 2003; Merfeld et al. 1993; Raphan et al. 1977; Raphan and Cohen 2002) of cues from organs located in the inner ear: the semicircular canals (SCCs), which transduce angular acceleration, and the otolith organs, which transduce the combined influence of linear acceleration and gravity (Fernandez and Goldberg 1976). Although psychophysical experiments have suggested that the brain does, at least under some circumstances, optimize sensory fusion based on neural noise (Ernst and Banks 2002; Gu et al. 2008), it is unclear how neurons perform these computations, especially in dynamic settings.

This report describes a new approach combining “internal models” with techniques to optimize neural computation based on noise. An internal model is a representation within the central nervous system (CNS) of sensorimotor dynamics. We extended a previous model of spatial orientation estimation that uses otolith and SCC cues (e.g., Merfeld et al. 1993; Merfeld 1995; Merfeld et al. 2005a, 2005b). We focused on velocity storage, which was originally defined by the prolongation of the decay of perceptual and oculomotor responses (Raphan et al. 1977; Robinson 1977) beyond the decay time constant of the SCC afferent response. Velocity storage was one of the emergent properties of the Merfeld et al. (1993) model, so we build on this framework.

Internal models can be broadly categorized as forward and inverse models (Jordan and Rumelhart 1992; Zupan and Merfeld 2005). A forward model predicts the output of a system for a given input, whereas an inverse model determines an input that could have caused a particular output. For example (Zupan and Merfeld 2005), a forward model of a physical relationship would use head rotation about a non-Earth-vertical axis to estimate the change in the direction of gravity relative to the head, since the head rotation is changing the relative direction of gravity. On the other hand, an inverse model could use the change in the direction of gravity relative to the head to estimate head rotation, since the latter is causing the former. A physiological example of a forward model would be one that uses head rotation about an Earth-vertical axis to estimate SCC afferent firing rate. The associated inverse model would use SCC afferent firing rate to estimate head rotation.

Dynamic vestibular modeling has been applied using both forward and inverse internal models. (See Table 1 for a comparison of some of the models described below.) While many models explain behavior by incorporating inverse models (e.g., Droulez and Darlot 1989; Laurens and Angelaki 2011; Laurens and Droulez 2007; Zupan and Merfeld 2005), using exclusively forward models has several advantages that could make them better choices for neural implementation (Jordan and Rumelhart 1992; Miall and Wolpert 1996; Wolpert and Ghahramani 2000). For example, if the physical relationship has a many-to-one mapping, inverse models may have an infinite solution space, or no solution at all (Jordan and Rumelhart 1992), whereas forward models will always have a single solution. A simple example is the human arm, where a forward mapping transforms joint positions to hand position, and an inverse mapping uses hand position to determine joint positions. In this case, there are many different places the elbow could be, resulting in a huge solution space for joint positions. Similarly, there are certain hand positions that are close together that require completely different joint positions (e.g., running a finger down one's spine). Modeling inaccuracies are less computationally disruptive to a forward model than an inverse model (Jordan and Rumelhart 1992). For example, a small amount of noise in hand position near these transition points can result in a very different solution of joint positions, or no solution at all. However, a small amount of noise in joint positions will cause only small changes in hand position. Those applying forward models alone include models based on observer theory (Merfeld et al. 1993; Newman 2009) and optimal observer theory, including Kalman filters (Bilien 1993; Borah et al. 1988; Kalman 1960; Kalman and Bucy 1961; Pommellet 1990; Selva 2009; Selva and Oman 2012). The goal of an observer is to estimate the state of a dynamic system by updating the state of an internal model of the system (the “observer”) (Luenberger 1971). It compares the noisy measurements of the actual system (i.e., afferents conveying noisy information from the vestibular periphery) with the predicted measurements provided by the internal model in the CNS and uses the comparison to update the state of the model. These differences are also known as sensory conflict signals (Oman 1982) and are often used in feedback loops after being multiplied by observer gains. These gains control the tradeoff between rejecting noise and the responsiveness of the internal model to new information. In our observer model (Merfeld et al. 1993), a single observer gain vector allows the model to correctly reproduce vestibuloocular responses (VORs) to a variety of motions (Merfeld 1995; Merfeld et al. 1999, 2005a; Merfeld and Zupan 2002).

Optimal observer theory provides a link between noise levels and the observer gain for linear systems. Specifically, Kalman (Kalman 1960; Kalman and Bucy 1961) proved that a Bayesian optimal filter gain can be calculated for a linear system, which balances the goals of attenuating noise while making the internal model update quickly. The Kalman filter minimizes the mean squared error between the estimated and actual states of the system for a linear system with Gaussian noise. Vestibular models applying Kalman filtering (Bilien 1993; Borah et al. 1988; Paulin et al. 1989; Paulin et al. 2001; Pommellet 1990; Selva 2009; Selva and Oman 2012) show how the brain could incorporate information about noise into dynamic sensory processing (Table 1, Kalman filter).

Particle filtering is a relatively new technique (e.g., Arulampalam et al. 2002; Bolic 2008; Daum 2005; Doucet et al. 2001; Gordon et al. 1993) that applies parallel computations to a distributed set of “particles.” These particles are sample values that propagate through a recursive estimator, with each particle in the set differing slightly because of randomly sampled noise values. Together the particles form and propagate a probability distribution, in contrast with the Kalman filter, which propagates only a signal with its estimated variance. Particle filters use sequential Monte Carlo techniques, since they simulate a system many times using actual sampled noise, and can approach Bayesian optimality, when they incorporate optimal Bayesian computations and parameters and techniques are well chosen (Arulampalam et al. 2002). The particle filter technique is an attractive method for modeling dynamic processing in the brain, since its distributed, parallel computations resemble the distributed, parallel nature of neural processing. This approach provides a crucial link for understanding the gap between behavior and parallel, distributed neural computation (Paulin et al. 2004; Paulin 2005). Another advantage of particle filtering is that it is usually better than Kalman filters when applied to nonlinear systems, a feature that has been applied recently to processing of vestibular dynamics (Laurens and Droulez 2007, 2008; MacNeilage et al. 2008; Paulin et al. 2004; Paulin 2005). These specific models relate responses to the modeler's selection of variances for the Gaussian motion prior distribution and measurement likelihood, loosely analogous to the specification of the two noise parameters in the Kalman filter (Table 1, Laurens particle filter and Paulin particle filter). The major disadvantage of the particle filter is that it is computationally intensive using conventional digital computing approaches, although this can be partly mitigated through careful parameter selection (Daum 2005; Daum and Huang 2002).

In this report, we describe our particle filter (Table 1, this particle filter). To our knowledge, this is the first dynamic, vestibular application of a particle filter with only forward internal models, in contrast to past reports, which incorporated one or more inverse calculations (Laurens and Droulez 2007, 2008; MacNeilage et al. 2008; Paulin et al. 2004; Paulin 2005). Another difference is that our particle filter does not explicitly implement Bayes' rule but rather uses transformations derived from the Kalman filter. Unlike other models (Laurens and Droulez 2007; Paulin and Hoffman 2011), our particle filter does not use resampling, a process that helps the set of particles accurately represent the underlying distribution by continually eliminating particles that correspond to parts of the distribution that have low probabilities and adding particles that correspond to parts of the distribution that have high probabilities (Arulampalam et al. 2002). Our particle filter simulates the vestibular observer (Merfeld et al. 1993; Oman 1982) many times in parallel. As with Kalman filters (Bilien 1993; Borah et al. 1988; Paulin et al. 1989, 2001; Pommellet 1990; Selva 2009; Selva and Oman 2012) and other particle filters (Laurens and Droulez 2007, 2008; MacNeilage et al. 2008; Paulin et al. 2004; Paulin 2005), the responses of our particle filter depend on noise properties: the distribution of the parallel particles is used to calculate the particle filter gain. This resembles neural computation because the computations are distributed and parallel in nature. Indeed, we draw on this correspondence by loosely assuming that each particle in our model represents a neuron, which allows us to constrain model parameters using published neural data. We have implemented our particle filter for Earth-vertical yaw rotation. This simple, linear model involves a single SCC plane and no dynamic otolith stimulation. The sensory dynamics in the model have a single input (physical angular velocity) and a single output (afferent firing rate). The model has three free parameters that are uniquely constrained by experimental data: we developed an iterative tuning procedure that adjusted the free parameters so that model estimate metrics match three experimental constraints. This contrasts with past vestibular modeling efforts that incorporated noises in which a family of parameters could result in a given gain (Borah et al. 1988; Laurens and Droulez 2007; MacNeilage et al. 2008; Paulin et al. 1989). However, those models could also be constrained to unique solutions by using additional experimental constraint (Laurens and Droulez 2007; MacNeilage et al. 2008; Paulin et al. 1989), such as in our approach of using the variability of the output of the model, which has also been alluded to by others (Laurens and Droulez 2007; MacNeilage et al. 2008). The model determines a time constant of velocity storage that matches experimental studies based on the noise level of sensory afferents that are derived from experimental studies. This shows that this distributed, parallel computational approach implemented using a forward internal model reproduces experimental results based on noise levels. The model's velocity storage time constant also decreases when measurement noise increases. Furthermore, we found that the particle filter gain matches the Kalman filter gain, showing that this vestibular particle filter can yield near-optimal performance.

## METHODS

#### Approach.

Our particle filter is a logical progression of both the observer and Kalman filter. To justify the design of our particle filter, we will begin by explaining the structure of the observer and Kalman filter. Since particle filters approach optimality only when appropriately designed (Arulampalam et al. 2002), we also used simulations of the optimal Kalman filter to serve as a baseline for evaluating the optimality of our particle filter. After describing the development of our particle filter model, we describe the iterative tuning procedure that was used to select the three model free parameters so that the model estimate metrics matched three experimental constraints. The iterative tuning procedure was used to constrain the model when incorporating the properties of regular or irregular neurons.

The major difference between the three models is in the way the filter gain [*K*(*t*)] is computed at each time step. In the observer model, *K*(*t*) is chosen by the designer based on dynamic considerations and/or noise, and in the Kalman filter, it is computed using the Riccati equation based on knowledge of the system dynamics and noise statistics. In contrast, the particle filter gain *K*(*t*) is computed by propagating actual sampled noise through the system.

#### Model of the sensor.

The SCCs, located in the inner ear, measure angular rotation of the head. Although three SCCs in each ear measure angular rotation in three approximately orthogonal planes, in this model we focused on yaw rotation, which is transduced by the two parallel horizontal SCCs. As in most earlier models (e.g., Borah et al. 1988; Merfeld et al. 1993; Newman 2009; Oman 1982; Selva 2009; Selva and Oman 2012), for simplicity, we modeled the two parallel SCCs as a single head-centered sensor.

SCC fluid dynamics can be modeled as a second-order band-pass filter
*y* is an output time-varying afferent firing rate scalar, *d* is an input time-varying angular velocity disturbance scalar, *s* is the Laplace variable. The slow time constant (τ_{1}) was 5.7 s (0.18 rad/s or 0.028 Hz) (Fernandez and Goldberg 1971; Jones and Milsum 1971), and the fast time constant (τ_{2}) was 0.005 s (200 rad/s or 32 Hz). Although the τ_{2} has not been measured experimentally, it has been estimated by various models to be between 0.004 and 0.010 s (Fernandez and Goldberg 1971; Groen 1957; Oman et al. 1987; Rabbitt 1999), and the specific value over that range does not affect the ability of our model to make estimates that match experimental constraints. Figure 1*A* shows the SCC frequency response characteristics for this second-order model and compares the dynamics with a commonly used first-order model. Figure 1*B* shows the responses evoked by a trapezoidal velocity input. For physiologic stimuli (e.g., stimuli below 30 Hz), these two models yield nearly indistinguishable responses.

The following state-space form of the SCC model was used for the implementation of the Kalman filter:
*x*(*t*) is the state of the actual system and *A*, *B*, and *C* define the dynamics of the system. The variables for our two-state, phase-variable, canonical state-space form of the SCC, which was derived as described in appendix a, are as follows:
*d*(*t*) is a time-varying scalar input representing a passive motion disturbance and *y*(*t*) is a time-varying scalar output representing afferent information.

#### Observer model.

The goal of an observer applied to spatial orientation is to estimate motion dynamically using afferent information from the sense organs and copies of efferent information going from the CNS to muscles (Merfeld et al. 1993). Figure 2*A* shows a general model that applies observer theory to sensory estimation during dynamic voluntary movement. The broad structure is that the output from motor planning in the brain follows two paths that eventually converge. The first causes body motion via muscle activation, which is contaminated by external and other unplanned motion, sensed by imperfect organs, and carried to the brain by noisy afferent neurons. The second path, the observer, is completely neural, consisting of internal models of body and sensor dynamics. The paths converge when the observer receives new information from the noisy afferent signal, which is compared with the expected afferent signal and used to guide the observer's estimate of motion toward the actual amount of motion. The observer gain (*K*) determines how quickly this guidance occurs; making *K* large allows nondeterministic motion to be quickly incorporated into the estimated state but also allows measurement noise to have a greater effect. Figure 2 details the model elements, which builds on our earlier observer model (Merfeld et al. 1993, 2005a, 2005b; Merfeld 1995; Oman 1982). In contrast with the earlier velocity storage observer model (Merfeld et al. 1993), which was deterministic, this model incorporates knowledge about noise, using conventions and assumptions commonly used for Kalman filters (Kalman 1960; Kalman and Bucy 1961). Measurement noise represents sensor inaccuracies. Physiologically, it can be interpreted as the inaccuracy introduced when a sensory organ transduces a signal and by afferent neurons that carry the signal to the brain. The mathematical analog of this interpretation is that noise is added to a signal. Process noise represents system perturbations, including but not limited to external disturbances. For the vestibular system, this includes body motion that was not commanded by the brain, from disturbances both internal and external to the body. Internal disturbances include motion due to motor noise and muscle inaccuracies, which are present even when there is no commanded motion. Examples of external disturbances include motion due to movement of an unstable surface, building vibration, earthquakes, and wind gusts. This implementation philosophically matches Kalman's original implementation where process noise represents disturbances like the influence of atmospheric turbulence on a rocket, which causes real changes in the trajectory of the rocket, but does not include commanded motion of the rocket, such as that controlled by intentional changes in thrust direction. Process noise is also often artificially increased to compensate for modeling errors, that is, inaccuracies in creating the internal models that mimic the dynamics of the physical system (Simon 2006). We assume that process noise is typically small compared with the distribution of typical head movement, since a large fraction of daily variability is controlled by the brain and known to the internal model. Movement of a subject by laboratory equipment is not controlled by the brain and thus is process noise. For reasons described below, we have separated the component of process noise due to the experiment as “experimental external motion.”

As in earlier observer models (Merfeld et al. 1993, 2005a, 2005b; Merfeld 1995), we made the following simplifications to the observer model shown in Fig. 2*A*, yielding the observer model for vestibular processing during passive yaw rotation shown in Fig. 2*B*. First, there is no active control of the body and thus there are no muscle commands, but rather only passive perturbations. Second, only one type of sensory organ (the SCCs responding to angular velocity) is stimulated by motion, and contributions from proprioception and the otoliths are omitted. Third, body dynamics are unity because the actual motion exactly equals the passive disturbance because of experimental head restraint. Finally, the dynamics of the internal model of the sensor are identical to the actual sensor dynamics. For this one-dimensional case, process noise [w(t)] and measurement noise [v(t)] are assumed to be zero-mean Gaussian with variances *Q* and *R*, respectively.

The observer filter gain *K* is selected by the designer and sets the time constant of the estimated angular velocity to (*K* + 1) times the SCC time constant, which corresponds to the velocity storage time constant; the gain of the response is *K*/(*K* + 1) (Merfeld et al. 1993). In experimental studies of velocity storage, ocular responses to rotation decay with an average time constant between 14 and 35 s (Dimitri et al. 2001) and ocular and perceptual responses could rely on a common velocity storage mechanism (Bertolini et al. 2011; Okada et al. 1999). Setting *K* = 3.0 yields a time constant of 23 s. Figure 3 shows the frequency and temporal responses of the observer. The temporal response demonstrates the tradeoff between noise and bandwidth. When *K* is large, the observer's estimate is closer to the actual angular velocity, but noise is greater. When *K* is small, the estimate is less noisy, but it decays more quickly. This also demonstrates that the velocity storage time constant, the rate of decay of the estimated angular velocity, is proportional to *K*.

#### Kalman filter model.

Kalman and colleagues (Kalman 1960; Kalman and Bucy 1961) developed a way to select the Kalman filter gain *K*(*t*) that is an optimal tradeoff between reducing noise in the estimate and how quickly the estimate responds, by minimizing the root mean square (RMS) difference between the actual and estimated state. The Kalman filter (Fig. 4*A*) calculates *K*(*t*) based on the process and measurement noises specified by the designer. Since the Kalman filter structure is the same as the observer (they differ only in how *K* is calculated), the velocity storage time constant is also (*K* + 1) times the SCC time constant. Thus, there is an explicit link between the velocity storage time constant and noise characteristics.

*K*(*t*) is determined in two steps. The first is to solve the following Kalman Riccati equation:
*P*(*t*) is the covariance of difference between the states of the actual system and the internal model, *A* and *B* define the dynamics of the system, *T* is the transpose operator, *Q* is the variance of process noise, and *R* is the variance of measurement noise. Second, for uncorrelated stationary process and measurement noise, *K*(*t*) = *P*(*t*)*C*^{T}*R*^{−1}, which is essentially the ratio of *P*(*t*) to *R*.

Often the estimator's parameters do not perfectly reflect the actual system. In particular, we assumed that *Q* may reflect the long-term statistics of process noise and that in the short term it may not reflect the actual statistics. This is particular true in the laboratory situation in which head-restrained subjects experience movement not controlled by the brain. Since this is outside of the usual disturbances experienced by the brain, it is not certain that the brain would incorporate these statistics into its internal model (although frequent research subjects may be an exception). To incorporate this into our simulations, we separated the component of process noise due to the experiment as experimental external motion. This allowed us to distinguish between actual process noise and *Q* specified by the modeler of which the internal model is aware but which does not include experimental external motion.

The sensor model we used has two states, which required that *K*(*t*) be a 2 × 1 vector and *P*(*t*) a 2 × 2 matrix. *x*_{2} depends directly on the input, whereas *x*_{1} is influenced by the input only via *x*_{2}, and thus the first element of *K*(*t*) is effectively zero [*K*_{11}(*t*)]; this in turn forces all but one element of *P*(*t*) to zero [i.e., *P*_{11}(*t*), *P*_{21}(*t*), and *P*_{12}(*t*) equal zero]. Thus, we propagated and stored only scalars *K*_{21}(*t*) and *P*_{22}(*t*) and set other values to zero. Simulations with and without this simplification showed no noticeable difference. Since our system dynamics and noise properties do not vary with time, *K*_{21}(*t*) converges to a steady-state scalar (*K*_{ss}).

#### Particle filter model.

Like the Kalman filter, our particle filter (Fig. 4*B*) is also based on the observer model structure. In contrast with the Kalman filter, the particle filter gain *K*(*t*) is computed by propagating actual noise through the system multiple times in parallel and measuring the distribution of the particles after they have passed through the system dynamics. Process and measurement noises are sampled for individual particles and are assumed Gaussian and mutually uncorrelated. We chose to formulate noise sources in the same form as the Kalman filter to facilitate comparisons with the Kalman filter, although with our particle filter there is flexibility in the location and number of noise sources.

More specifically, for a particle filter, the system calculations are performed many times in parallel, once for each “particle.” At each time step, the state error covariance *P*(*t*) is approximated using the covariance of the particles, yielding the estimate *P̂*(*t*). *K*(*t*) is then calculated using the following formula: *K*(*t*) = *P̂*(*t*)*C*^{T} × *R*^{−1}, as for the Kalman filter. This requires the assumption that the internal model has learned the noise properties, which is common to most, if not all, vestibular models (e.g., Borah et al. 1988; Laurens and Droulez 2007; Paulin et al. 1989). Since the particle filter transfer functions are the same as the observer and Kalman filters, the velocity storage time constant is also (*K* + 1) times the SCC time constant.

*P̂*(*t*) is required to calculate *K(t)*. As in the Kalman filter, *P̂*(*t*) is an estimate of the covariance of the state error *x̃*(*t*) = *x*(*t*) − *x̂*(*t*),which is the difference between the state of the actual system [*x*(*t*)] and internal model estimate [*x̂*(*t*)]. The Kalman filter analytically estimates *P*(*t*) = *E*[*x̃*(*t*)*x̃*(*t*)^{T}] = *E*[(*x̂*(*t*) − *x*(*t*)) *x̂*(*t*) − *x*(*t*)]^{T}, where *E* is the expected value operator. In our model, the state of the actual system is not available, nor can the error covariance be computed analytically as in the Kalman filter. As an alternative, we noted that the mean of the particles of the state estimate was the best available approximation of the state of the actual system, i.e., *x*(*t*) ≈ *E*[*x̂*(*t*)], so that
*x̂*_{i}(*t*),*x̂*_{j}(*t*), where *i*,*j* ∈ 1 …*N* and *N* is the number of particles.

*K*(*t*) converges to steady state, with the exception of small random variability related to the noise present. For comparison with the Kalman filter, scalar steady-state particle filter gain (*K*_{ss}) was calculated for a range of noise characteristics by averaging *K*_{21}(*t*) over time after steady state was attained.

A distinction between our particle filter and other particle filters (Arulampalam et al. 2002; Doucet et al. 2001; Laurens and Droulez 2007; Paulin et al. 2004; Paulin 2005) is our implementation. Most particle filters implement Bayes' rule explicitly for each particle. In our particle filter, we implicitly implemented Bayesian optimal weighting via the Kalman gain calculation *K*(*t*) = *P̂*(*t*)*C*^{T} × *R*^{−1}, since the Kalman filter is Bayesian optimal (see also Gelb 1992; Kalman 1960; Kalman and Bucy 1961; Simon 2006). Another distinguishing characteristic of our particle filter is that resampling was not used; we describe our rationale in the discussion. In addition, during simulation of our particle filter, each particle has an independently sampled process noise and measurement noise chosen from a normal distribution. Our particle filter, as with most particle filters, differs from most Monte Carlo simulations in that the parallel simulations interact with each other at each time step. Typical Monte Carlo simulations run independently of each other, and the results are compiled at the end.

#### Regular versus irregular neurons.

Experimental data on SCC afferent neural noise are available for both regular and irregular neurons (Sadeghi et al. 2007), and we were interested in how our particle filter would be impacted by the properties used. There is evidence that irregular neurons play an important role in vestibular pathways despite their higher noise, which justifies their consideration in this model: vestibular nuclei neurons that take inputs from both regular and irregular neurons have thresholds closer to irregular neurons (Massot et al. 2011). In each simulation, we included the properties of either only regular neurons or only irregular neurons, leading to lower and upper bounds for the affected free parameters. Including regular and irregular neurons together would have added an additional free parameter to the model reflecting the relative proportion of the two, and the experimental data are not available to constrain this parameter.

#### Model free parameters and iterative tuning procedure.

Our particle filter has three free parameters, which are uniquely set using an iterative tuning procedure. In this tuning procedure, the free parameters are manually adjusted by the model designer, then the model simulates estimated responses. Three model estimate metrics are computed based on the estimated responses and compared with published experimental data, which we refer to as experimental constraints (Table 2). The iterative tuning procedure continues until the three model estimate metrics match experimental constraints to at least two significant digits; to confirm an accurate correspondence, we averaged model estimate metrics across 200 simulations and also computed their SD. In addition, a broad grid-based search of the parameter space confirmed that only a single set of free parameters resulted in model estimate metrics that matched the experimental constraints. One of the experimental constraints corresponds to whether the properties of regular or irregular neurons are used; changing it and applying the iterative tuning procedure resulted in changes in all three free parameters (Table 2).

The first and second free parameters are process noise variance *Q* and measurement noise variance *R*. The third free parameter is the number of particles (*N*). Table 2 shows the values of the free parameters derived from the iterative tuning procedure.

The first of the three experimental constraints is on *K*_{ss} of the particle filter. As with the observer, we used *K*_{ss} = 3.0, which yielded a time constant of 23 s, which is approximately the population mean of measured responses for 35-yr-old normal subjects (e.g., Dimitri et al. 2001).

The second experimental constraint is the noise on an individual SCC afferent, which we relate to the noise on an individual particle. Individual afferent noises are derived from neural threshold data from macaque nonhuman primates; thresholds for individual SCC neurons were shown to average 4.0 ± 1.0°/s for regular afferents and 8.7 ± 1.0°/s for irregular afferents over frequencies of 0.5 to 15 Hz (Sadeghi et al. 2007). For the threshold criterion used (*d*′ = 1), this corresponds to noise SDs of 4.0 and 8.7°/s, respectively. In our particle filter, this value corresponded to the SD of the particle population at each time step, averaging over time, using the particles at the point after measurement noise is added [*z*(*t*) in Fig. 4*B*].

The third experimental constraint is the noise on the estimate of angular velocity from the internal model, which we derived from psychophysical measurements of thresholds. Behavioral thresholds are lower than thresholds for individual neurons, likely in part because of convergence and averaging of information from multiple afferents. Specifically, the threshold for yaw rotation recognition in humans plateaus at 0.71°/s for frequencies of 0.5 Hz and higher (Grabherr et al. 2008), when the threshold is measured using a three-down, one-up paradigm resulting in a 79.4% recognition rate (Leek 2001). For this threshold criterion (*d*′ = 0.82), this corresponds to a noise SD of 0.87°/s for behavioral responses after convergence. In our particle filter, this value corresponded to the SD over time of the angular velocity estimate [ω̂(*t*) in Fig. 4*B*].

In our particle filter, noise estimate metrics for afferents and behavioral responses were calculated using particles during the time that the system was in steady state, before the commencement of motion. The values of process noise, measurement noise, and *N* were manually varied iteratively to yield afferent and behavioral noise SDs that match those found in the literature, as outlined above.

Although we confirmed that only a single set of free parameters satisfied the constraints using a grid-based search, this can also be understood intuitively. We observed that afferent noise is basically a weighted sum of process and measurement noises and that *K* increases as *R* decreases and *Q* increases. Thus, in *R* versus *Q* space, the *R* versus *Q* solution space to constrain afferent noise is a line with a negative slope, and to constrain *K* it is a curve with a slope that varies but is always positive. There can only be one intersection between these solutions, so there are unique solutions for *R* and *Q*. With *R* and *Q* fixed, the estimated noise decreases monotonically as *N* increases, so there can only be one solution for *N*.

#### Changing measurement noise.

Certain physiological and pathological changes can be understood in the context provided by changes in afferent response. We modeled afferent changes by increasing measurement noise and simulating the model to see if the model estimates change in a similar manner to changes in patients. Although definitive experimental data correlating changes in afferent noise with behavior are not available, a few studies on vestibular hair cell ablation have shown that changes in afferent noise may cause changes in behavior.

Aminoglycosides, such as gentamicin, are used to ablate vestibular function in patients with peripheral vestibular disorders, such as Meniere's disease; vestibular loss is also sometimes a side effect of systemic aminoglycoside use to treat severe infections. In patients with bilateral aminoglycoside ablation, velocity storage time constants typically fall to well below the SCC time constant; for example, one study found an average time constant of 1.7 ± 0.9 s (Ishiyama et al. 2006). The same study found that VOR gain had also greatly decreased. For sinusoidal motion, VOR gain ranged from 0.08 at 0.05 Hz to 0.45 at 0.8 Hz. For high acceleration step stimuli (head thrusts), the average VOR gain was 0.27. Gentamicin acts by destroying SCC hair cells, which causes changes in the responses of the afferents innervated by those hair cells, although the function of the afferents themselves is not affected by gentamicin (Hirvonen et al. 2005). The reduced population of hair cells converging on each afferent could lead one to expect a change in the noise of the afferent responses, which is what occurs in chinchilla: the SD of the interspike interval lognormal distribution for irregular neurons increases (Hirvonen et al. 2005), although it does not change for regular neurons. While these studies were on different species (human vs. chinchilla) and there is no way to compare the relative doses, they do suggest that increased afferent noise may result in a lowered time constant.

We ran simulations with irregular neurons and adjusted measurement noise on each particle so that it was slightly higher. This mimicked the experimental results that showed higher noise with gentamicin treatment. Our hypothesis was a lowered *K*(*t*) and thus a lowered velocity storage time constant as well as a lowered VOR gain.

#### Simulations.

The overall algorithm for executing the particle filter is shown in Table 3. For the first second of the simulation, particle filter gain *K*(*t*) was held constant with an initial condition and then released and allowed to converge. Input angular velocity was held at zero for 10 s to allow *K*(*t*) to stabilize. We tested our particle filter with trapezoidal inputs in which the velocity increased from 0 to to 90°/s in 1 s as well as with sinusoids to characterize the frequency response.

A major concern in our simulations was numerical stability, since our simulations propagate actual sampled noise through the model, in contrast to many other techniques such as the Kalman filter (Kalman 1960; Kalman and Bucy 1961) and the unscented Kalman filter (Julier and Uhlmann 2004), which propagate only the statistics of the noise. Applying Euler integration with a large time step resulted in high-frequency random noise quickly building to instability. To improve numerical stability, white measurement and process noises were low-pass filtered with a first-order filter with a cutoff of 400 Hz. The noise SDs that we report were measured after this filtering, since this is the actual RMS noise input to the model.

Simulations were implemented in Matlab 7.8 (The Mathworks) on the Harvard Orchestra computation cluster. To improve efficiency, key sections of the simulation were implemented in C++, compiled using gcc, and incorporated into Matlab using the Mex linker. Models were derived in continuous time and implemented as discrete approximations of continuous time systems. Euler integration was used for particle filter simulations using a timestep of 1/8,000 s. *K*(*t*) was averaged over a short period of time (i.e., 50 ms) to reduce the effects of random fluctuations in the selection of noise values that could cause unstable positive feedback. While 33 particles resulted in the simulation being stable and matching experimental results, more particles would likely be required if the system dynamics had more states (Daum and Huang 2002). Simulations were run on an IBM BladeCenter HS21 XM with a 3.16-GHz Xeon processor and 8 GB of RAM, although Matlab used only a single processor core. A typical simulation took 12 s with 33 particles for an estimate lasting 100 s.

For simulations of the Kalman filter, the steady-state Kalman filter gain was computed by iteratively solving the Riccati equation. Since process and measurement noises did not vary with time, it was necessary only to allow the Riccati equation to converge before the simulation and not to continuously solve it during the simulation.

## RESULTS

#### Velocity storage is reproduced by our particle filter (regular neurons).

Figure 5*A* shows that our particle filter reproduces velocity storage for yaw rotation about an Earth-vertical axis, which stimulates the SCC without any dynamic otolith stimulation. Angular velocity rapidly ramped to a constant-velocity rotation. In Fig. 5*A*, the SCC afferent responses are represented by particles (dark gray cloud) whose distribution at each time step arises because each particle has unique noise. The responses rapidly increased during the velocity ramp, then, as the velocity plateaued, the particles, on average, decayed with a time constant of 5.7 s (i.e., the slow SCC time constant). For the specified *Q* and *R*, each SCC afferent had a SD of 4.0°/s, matching the experimental constraints for a regular afferent (Table 2). The particle filter gain (Fig. 5*B*) converged to a steady-state value in <0.25 s and then continued to have random fluctuations about an average of *K*_{ss} = 3.0. We found that the particle filter's *K*_{ss} was not dependent on the initial value of *K*(*t*) over a broad range of initial conditions, which we empirically found ranged from ∼0.55 to 79. The particles representing the brain's estimate of angular velocity (light gray cloud in Fig. 5*A*) showed a prolonged time constant of velocity storage; in this example, 23 s. This time constant was in the range of 14 to 35 s for normal 35-yr-old human subjects (Dimitri et al. 2001). It was also consistent with other models (Borah et al. 1988; Merfeld et al. 1993; Raphan et al. 1977; Robinson 1977). The overall estimated angular velocity (solid line in Fig. 5*A*) was calculated as the mean of the particles and decayed with the same time constant of 23 s. The estimated angular velocity had a SD of 0.87°/s, equal to the experimental value of 0.87°/s (Table 2). We quantified the repeatability of the particle filter by measuring the SDs of the three metrics across 200 simulations and found that they were 0.054 on *K* (1.8% of the mean), 0.009 on afferent noise (0.23% of the mean), and 0.016 on the variability of the estimated angular velocity (1.8% of the mean).

Figure 6 shows the mean gain and phase characteristics of our particle filter (circles) versus frequency and compares it with the theoretical predictions (solid line) for velocity storage modeled as a first-order high-pass filter with a time constant of 23 s. Figure 6 shows that the frequency response of our particle filter is consistent with theoretical velocity storage across a broad range of sinusoidal frequencies. It also shows that the extension of the low-frequency dynamics for the particle filter compared with the SCC afferents (dashed line).

#### Using irregular neurons increases the number of particles required.

To model how the particle filter parameters would change if it incorporated irregular, rather than regular, neurons, we changed the experimental constraint on the SD of afferent noise from 4.0 to 8.7°/s (Sadeghi et al. 2007) and applied the iterative tuning procedure to adjust the three free parameters. Figure 7 shows the response estimates for irregular neurons. The free parameters that fit the adjusted constraints (Table 2) were a process noise SD of 14°/s, a measurement noise SD of 3.6°/s, and *N* of 158. Thus, with the higher afferent noise associated with irregular neurons, five times more particles are required to fit the behavioral noise constraints, a number that is still physiologically plausible as it is well below the estimated 1,800 fibers innervating each SCC (e.g., Bergstrom 1973; Lopez et al. 2005). We quantified the repeatability of the particle filter by measuring the SDs of the three metrics across 200 simulations and found that they were 0.025 on *K* (0.82% of the mean), 0.009 on afferent noise (0.11% of the mean), and 0.014 on the variability of the estimated angular velocity (1.6% of the mean).

#### Velocity storage is reproduced by the Kalman filter.

Figure 8 shows that the Kalman filter reproduces velocity storage for yaw rotation about an Earth-vertical axis, consistent with previous work (Borah et al. 1988; Paulin et al. 1989). It does so with three depictions (Fig. 8, *A–C*) of the same simulation to demonstrate different aspects of the results. In all cases, as in Fig. 5, angular velocity rapidly ramped to a constant-velocity rotation. The process and measurement noise variances *Q* and *R* for this simulation were the same as for our particle filter (Table 2) and were used to find the Kalman gain, which is *K*_{ss} = 3.0, by finding the steady-state solution of the Riccati equation. The SCC afferent response (dark gray) rapidly increased during the velocity ramp and then, as the velocity plateaued, decayed with a time constant of 5.7 s (i.e., the slow SCC time constant). The Kalman filter's estimated angular velocity (light gray) showed the prolonged time constant of velocity storage; in this example, 23 s. This time constant fell in the range of 14 to 35 s for normal 35-yr-old human subjects (Dimitri et al. 2001), which was also consistent with that found by our particle filter and by other models (Borah et al. 1988; Merfeld et al. 1993; Raphan et al. 1977; Robinson 1977). Thus, the Kalman filter is able to reproduce the dynamics of velocity storage when its properties are determined by noise characteristics from experimental data.

The three depictions in Fig. 8 demonstrate concepts relating the observer, Kalman, and particle filters. In Fig. 8*A*, the Kalman filter gain was calculated using *Q* and *R*, but the simulation was performed with noiseless signals. This depiction resembles the observer, although it is a Kalman filter because the Kalman Riccati equation was used to compute the Kalman filter gain. In Fig. 8*B*, sampled process and measurement noises were added during the simulation, yielding an afferent with the noise properties of a regular neuron. This resulted in the estimated angular velocity also having noise. This depiction showed the simulation in a way that is closer to the conceptual framework of the Kalman filter. Figure 8*C* shows multiple parallel executions of the Kalman filter. Although this is an unusual way to depict the Kalman filter, it demonstrates the conceptual relationship of the Kalman filter and our particle filters; despite the gains being calculated in different ways, the simulations were very similar. In this simulation, each has independent sampled noise, yielding a cloud of afferent responses at each time. This cloud's distribution has a SD that matches the experimentally measured value for a regular neuron. The Kalman filter's estimate of angular velocity for each execution (light gray) formed a cloud. When the mean was taken across the estimated angular velocity at each time, the “converged” estimated angular velocity (solid line) had reduced temporal variability. For this particular example, the number of parallel executions was 33, matching the number of particles in the simulation shown in Fig. 5, and the “converged” estimate had a SD of 0.87°/s, consistent with our particle filter and experimental data. There was one key difference between this depiction and the particle filter shown in Fig. 5: while in our particle filter the gain was calculated based on all the particles and thus there was an interaction across particles, in the Kalman filter there was no interaction between the parallel simulations.

#### Our particle filter calculates optimal gains.

When the particle filter and Kalman filter have the same noise properties (i.e., *Q* and *R*), the average particle filter gain (Fig. 5) and Kalman filter gain (Fig. 8) are equal. Specifically, for both the Kalman filter and particle filter, *K*_{ss} = 3.0.

For a linear system, the Kalman filter will calculate the Bayesian optimal gain *K*_{ss} that minimizes the RMS estimation error (Kalman 1960; Kalman and Bucy 1961). That the particle filter gains are near the Kalman filter gain on average shows that the particle filter is also approximately Bayesian optimal. Thus, in addition to reproducing experimental results using a distributed, parallel system, our particle filter roughly mimics the Bayesian optimality of Kalman filtering.

#### Increased afferent noise leads to reduced time constant and gain.

We found that the velocity storage time constant decreased as measurement noise increased (Fig. 9), consistent with previous Kalman and particle filters (Borah et al. 1988; Laurens and Droulez 2007; MacNeilage et al. 2008; Paulin et al. 1989). For example, the average particle filter gain *K* decreased from 3.0 to 2.0 when measurement noise increased 8%, causing a reduction in the velocity storage time constant from 23 to 17 s and a corresponding 12% VOR gain reduction from 0.75 to 0.67. Furthermore, *K* decreased to zero when measurement noise increased beyond 10%. This yielded a VOR gain of zero. *K* = 0 also corresponds to a time constant of 5.7 s, although this time constant has little physical meaning when VOR gain is zero. Furthermore, the noise for the estimated angular velocity became zero.

These results have some similarities with data from patients suffering severe bilateral ototoxicity (Ishiyama et al. 2006). Specifically, in these patients, VOR gain was between 0.08 and 0.45 compared with between 0.50 and 0.94 for the healthy control group (Ishiyama et al. 2006), whereas for our model VOR gain dropped from 0.75 to 0.67 when measurement noise increased moderately and then went to zero when measurement noise increased further. Furthermore, in these patients, the velocity storage time constant was 1.7 ± 0.9 s compared with 12.2 ± 3.6 s for the healthy control group (Ishiyama et al. 2006), whereas for our model it decreased from 23 s to the SCC time constant. While the patient velocity storage time constant can be below the presumed SCC time constant, the lowest possible velocity storage time constant in our model is the SCC time constant; possible explanations are provided in the discussion.

## DISCUSSION

#### A distributed, parallel, forward model reproduces velocity storage.

We have implemented a distributed, parallel, forward model of a dynamic sensory process–“velocity storage”–using a particle filter approach. Our particle filter estimates angular velocity with characteristics that are consistent with experimental data about the time course of dynamic vestibular processing, the level of afferent noise, and the level of noise in behavioral responses.

The iterative tuning procedure we developed uses three experimental constraints to finds solutions for each of three free parameters (*N, Q*, and *R*). This results in the free parameters being uniquely constrained. This contrasts with some past vestibular models, which did not find unique solutions for the model parameters, although these studies could have done so if they incorporated experimental constraints of the variability of model estimates (Borah et al. 1988; Laurens and Droulez 2007; MacNeilage et al. 2008; Merfeld et al. 1993; Paulin et al. 1989). In contrast to previous particle filter models (Laurens and Droulez 2007; MacNeilage et al. 2008; Paulin et al. 2004; Paulin 2005), ours uses only forward models and does not include inverse models. As detailed in the Introduction, forward models have several computational advantages over inverse models that would apply to neural computation (Jordan and Rumelhart 1992; Miall and Wolpert 1996; Wolpert and Ghahramani 2000).

Our particle filter computes a time course of vestibular processing that is Bayesian optimal for the noise parameters *Q* and *R*. This was verified by testing whether the Kalman filter and our particle filter produce the same average time constant for the same noise parameters. While this may seem unsurprising since the particle filter incorporates optimal computations, in actuality the particle filter is an approximate, numerical technique and thus is a “suboptimal algorithm” (Arulampalam et al. 2002). In the limit the particle filter approaches optimality, but this depends on practical issues such as the number of particles and the resampling method chosen (Arulampalam et al. 2002). Thus, the similarity of the time constant to that of the Kalman filter demonstrates that, as designed, our particle filter produces Bayesian optimal results.

#### Relationship between particles and neurons.

We determined that *N* = 33 particles met the experimental constraints on the model with the noise properties of regular neurons and that *N* = 158 particles met the constraints with the noise properties of irregular neurons. These values serve as lower and upper bounds for the number of SCC afferents suggested by the model, since the brain presumably uses both for actual implementation. While these numbers are small compared with the roughly 1,800 primary afferents that innervate each SCC (e.g., Bergstrom 1973; Lopez et al. 2005), one experimental study suggested a similar range for the number of SCC afferents required functionally. Specifically, Massot et al. (2011) performed multiunit recordings on regular and irregular afferents and from vestibuloocular neurons and pooled spiking activity to determine the relationship between detection threshold and the number of afferents. They determined that ∼40 neurons are required to reduce detection thresholds to the levels found psychophysically and that little additional information is provided by adding more neurons. Recent data have shown that when the sensitivity of irregular neurons is considered, the signal-to-noise ratio may not be systematically lower for irregular neurons than for regular neurons, although further work is required to understand interspecies differences (Hoffman et al. 2010; Ramachandran and Lisberger 2006). While additional data might change the values of the free parameters in our model, this would not fundamentally alter our conclusions.

#### Bayesian optimality of our model only implies an optimal neural implementation if all assumptions match.

Experimental studies have suggested that the brain performs Bayesian optimal weighting for the integration of sensory cues (Ernst and Banks 2002; Gu et al. 2008). Despite our particle filter being a near-optimal technique and being well constrained by experimental measurements, we cannot yet argue that it supports or refutes the hypothesis that the brain optimally performs the calculations required to estimate angular velocity in the presence of noise. For example, the brain could be performing the same computations suboptimally, using 10-fold more neurons as we had particles. Providing evidence for optimal performance would require further constraints by experimental values that are not yet known, for example, the number of afferent neurons that innervate each SCC and project to the reflex or perception of interest. Our inability to prove that the neural implementation of vestibular estimation is optimal despite having a Bayesian optimal mathematical formulation is not unique to this model (e.g., Borah et al. 1988; Laurens and Droulez 2007; Paulin et al. 2004; Paulin 2005; Selva 2009; Selva and Oman 2012).

#### Comparison with prior work.

Table 1 compared the five classes of models discussed in this report: the observer model (Merfeld et al. 1993; Newman 2009; Oman 1982), the Kalman filter (Bilien 1993; Borah et al. 1988; Paulin et al. 1989, 2001; Pommellet 1990; Selva 2009; Selva and Oman 2012), the Laurens particle filter (Laurens and Droulez 2007, 2008), the Paulin particle filter (Paulin et al. 2004; Paulin 2005; Paulin and Hoffman 2011), and our particle filter. The observer model (Table 1) is able to estimate a broad range of experimental results using a forward internal model of the vestibular sensors. The modeler iteratively modifies the observer gain *K* to make the resulting estimates close to experimental results. The Kalman filter (Table 1) relates responses with the properties of noise in the system specified by the modeler using the Kalman Riccati equations, which are a linked set of quadratic differential equations often written in matrix notation. Specifically, it computes the Kalman filter gain *K*(*t*) using the Kalman Riccati equation, based on the process and measurement noises *Q* and *R*. Our particle filter (Table 1) makes the same link between responses and the properties of noise specified by the modeler but differs from the Kalman filter in that it uses a distributed, parallel computation rather than the Kalman Riccati equation to compute *K*. We believe this is an important distinction because the Riccati differential equations are challenging to solve and there is no evidence that the brain does so. On the other hand, the distributed, parallel computation is similar to the brain in two ways. First, the parallel particles mimic parallel bundles of neurons (Paulin et al. 2004; Paulin 2005; Paulin and Hoffman 2011). Second, the computations performed by the particle filter consist of simple arithmetic computations (multiplications/divisions and additions/subtractions) that neurons can implement (Koch 1999). Thus, we believe that while both the Kalman and particle filter connect neural noise with response characteristics, the particle filter does so in a manner that seems much closer to a neural implementation (MacNeilage et al. 2008; Paulin et al. 2004; Paulin 2005; Paulin and Hoffman 2011). As detailed above, unlike previous Kalman (Table 1) and particle filter (Table 1) studies for which a family of parameters yielded a particular gain (Borah et al. 1988; Laurens and Droulez 2007; MacNeilage et al. 2008; Paulin et al. 1989), our approach adds more experimental constraints so that they are equal in number to the model free parameters, yielding a unique solution for the model free parameters; other probabilistic models should also yield a unique solution if constrained with additional experimental constraints (Laurens and Droulez 2007; MacNeilage et al. 2008; Paulin et al. 1989).

There are a few important distinguishing features between the three vestibular particle filter models. At a fundamental level, the Laurens and our particle filters (Table 1) ask “what are the optimal central dynamics given noisy, continuous afferent signals?,” in contrast with the Paulin particle filter (Table 1), which elegantly broaches the neural decoding question “what is the most likely instantaneous dynamic sensory input given the discrete information contained in a single noisy neuronal spike?” As a simple example, while the Laurens and our particle filters see particles as carrying the same angular velocity signal but each with different measurement noises, the Paulin particle filter sees particles as representing the probability of a dynamic system being in a certain location in a velocity-acceleration state space, with individual neurons having a specific receptive field in this state space.

While the Laurens and Paulin particle filters (Table 1) use distributed computation based on noise properties, they contrast with our model in that their implementation incorporates both forward and inverse models and because they explicitly calculate probability distributions using Bayes' rule (as in most particle filters), whereas our particle filter implicitly performs Bayesian optimal weighting using the Kalman gain equation: *K*(*t*) = *P̂*(*t*)*C*^{T} × R^{−1}.

Unlike our particle filter, both the Laurens and Paulin particle filters (Table 1) implement resampling to make the distribution of particles better represents the overall probability distribution. In our approach, which supposes that the particles represent neurons, resampling would imply that the brain selectively ignores or boosts neurons based on the distribution of neuronal responses on a spike-by-spike basis, which seems unlikely. Laurens does not claim an analog between particles and neurons and sees resampling strictly as a way of improving computational efficiency, and thus it is not in conflict with our view. Paulin argues that since their particles represent a probability of being in a certain location in state space, removing a particle has the neural analog of ending already-low spiking activity in a part of the network that corresponds to the state space that is far from the likely solution; creating a particle is analogous to increasing activity in parts of the network corresponding to the likely input. In summary, while resampling may have a neural analog in the Paulin particle filter, we believe that not resampling is nearer to a neural implementation in our particle filter.

Our implementation differs subtly from other Kalman filters (Bilien 1993; Pommellet 1990; Selva and Oman 2012) that are designed to estimate hidden states of the system, whereas ours estimates the input to the system. While it is straightforward to add another state that equals the input, we chose not to do so because in a neural implementation this distinction would blur, so we simply used the available input to the internal model that represents the estimated angular velocity.

#### Estimates of the particle filter for increased afferent noise.

Our particle filter's velocity storage time constant decreases as measurement noise increases, which is consistent with other Kalman and particle filter models (Borah et al. 1988; Laurens and Droulez 2007; MacNeilage et al. 2008; Paulin et al. 1989). This loosely matches the evidence for an afferent noise increase and a time constant decrease due to bilateral vestibular ablation using gentamicin.

A reasonable hypothesis about the decrease in the velocity storage time constant observed in patients with bilateral vestibular hypofunction due to ototoxicity, with which our particle filter concurs, is that the time constant decrease results from an interaction between peripheral changes and central processing rather than changes isolated to either the periphery or central processing. Evidence from lesion studies suggests that velocity storage is centrally implemented (e.g., Katz et al. 1991). Likewise, evidence suggests that aminoglycoside acts peripherally. Thus, a possible conclusion is that the peripheral changes somehow modify the afferent signals in a way which causes velocity storage to function differently. Our particle filter parallels these ideas: the changes that occur in peripheral units cause the internal model of central processing to reproduce pathological behavior.

Experimental measurements found that the velocity storage time constant drops to below the SCC time constant for healthy SCCs, whereas in our particle filter it cannot be less than the SCC time constant. The difference could be explained by an aminogylcoside-induced drop in the SCC time constant, such as changes in the dynamics of individual hair cells and/or afferents. Hair cell loss due to aminogylcoside-induced apoptosis (e.g., Matsui et al. 2003) could result in decreased function of the vestibular efferent system. For example, stimulation of the vestibular efference system causes a nonlinear amplification of low-amplitude hair cell responses (Rabbitt et al. 2010). Although this work does not prove frequency dependence, it is known in the auditory system that the efferent system influences the cochlear frequency tuning (Fettiplace and Hackney 2006; Wolfe et al. 2009). Thus, it is plausible that the death of hair cells could modify the SCC time constant because hair cells are the actuators that implement peripheral mechanical tuning. Alternatively, a decrease in the peripheral time constant would result if cupular stiffness increased (Rabbitt 1999), which might occur if hair cell death had a direct mechanical effect on cupular stiffness. Finally, there is a precedent for a central origin for the decrease in the velocity storage time constant below the SCC time constant. In pigeons, there is evidence for “velocity leakage,” in which the VOR time constant is less than half that of the SCC (Anastasio and Correia 1994).

#### Limitations of this work.

Despite the success of the model, it has limitations. Most importantly, it includes only one type of sensory organ, the SCC, and is implemented for upright rotation only, which encompasses only one of many possible states. Although this is sufficient to study velocity storage, expansion of the model to include more states is both desirable and possible; indeed, the observer model from which this model was developed includes nine states (Merfeld et al. 1993).

Our model and other similar ones (Borah et al. 1988; Laurens and Droulez 2007; Merfeld et al. 1993; Newman 2009; Oman 1982; Selva 2009; Selva and Oman 2012) use a first- or second-order transfer function for the SCC, which omits afferent dynamics. With a fourth-order transfer function that includes the rate adaptation term in the transfer function numerator (Fernandez and Goldberg 1971), our particle filter was able to match the three experimental constraints after reapplying the iterative tuning procedure. (For computational reasons, we also added an additional pole at a much higher frequency than any of the other dynamics in the system.)

Since our model simulates a perceptual magnitude estimation task, ideally the experimental constraint for noise SD of the angular velocity estimate would be from a magnitude estimation task rather than from a threshold measurement task. While we are aware of no published data quantifying variability in vestibular perceptual magnitude estimate tasks, there is evidence that similar noise underlies perceptual and motor thresholds for vestibular stimuli (Haburcakova et al. 2012), and thus it is conceivable that perceptual magnitude estimation also have the same underlying noise. Small changes in the experimental constraint for noise SD of the angular velocity estimate would result in a change only in the number of particles in the model and not other free parameters. Only one study has shown evidence for signal-dependent noise in vestibular perceptual responses (Mallery et al. 2010), whereas neuronal recordings have shown that noise is independent of signal amplitude (Sadeghi et al. 2007). Since our model, like other models in the observer family, uses only additive noise, its estimates will not have signal-dependent noise, but this is an area for future investigation.

Experiments to confirm or refute the hypothesis that dynamic estimation depends on noise are conceptually simple but practically difficult. The noise properties or number of neurons would be manipulated and dynamics and output variability measured. However, current technology limits our ability to cleanly perform these manipulations over a large number of neurons, and disease models change more than one parameter simultaneously.

#### Plausibility of neural implementation.

We chose to combine the particles that represented estimated angular velocity into a single value by calculating their mean; we found that using the median did not change the results appreciably. While other computations could be used to combine particles, we believe the mean was justified based on experimental data from Massot et al. (2011). We fit a power curve to their data for the relationship between threshold and number of neurons (Fig. 9 in Massot et al. 2011) and found an exponent of −0.41. For a theoretical linear unweighted combination (i.e., the mean), the variances of the individual neurons would add, and since threshold is proportional to SD, the threshold should go down with the square root of the number of neurons, yielding an exponent of −0.5. The experimental exponent is close enough to the theoretical one to justify the use of the mean operator.

Even if we assume that the brain were able to implement a direct mathematical solution of the Kalman Riccati equation, when parallel computational elements are available it is likely more computationally efficient to implement the particle filter. For example, while numerous neurons and interconnections were required to implement the Kalman filter using biological plausible neuron models (Deneve et al. 2007) and for the Riccati equation implemented using artificial neural networks (Wang and Wu 1998), our particle filter requires a relatively small number of neurons, with the number of connections scaling linearly with the number of particles. A learned lookup table relating process and measurement noise to gain could reduce the required number of real-time computations, but training the lookup table would likely require the same sorts of computations as described above, unless a scheme could be developed that incorporated feedback-based learning. For these reasons, we think that the particle filter provides a solution that narrows the gap between behavior and neurons and improves our understanding of the role played by noise in dynamic sensory estimation.

## GRANTS

This work was supported by National Institute on Deafness and Other Communication Disorders Grant DC0-4158. The Orchestra computational cluster is supported by National Center for Research Resources Shared Equipment Grant 1-S10-RR-028832.

## DISCLOSURES

No conflicts of interest, financial or otherwise, are declared by the author(s).

## AUTHOR CONTRIBUTIONS

Author contributions: F.K. and D.M.M. conception and design of research; F.K. analyzed data; F.K. and D.M.M. interpreted results of experiments; F.K. prepared figures; F.K. drafted manuscript; F.K. and D.M.M. edited and revised manuscript; F.K. and D.M.M. approved final version of manuscript.

## ACKNOWLEDGMENTS

The authors thank Lena Valavani for reviewing a draft of this manuscript and Chuck Oman and Fred Daum for stimulating discussions.

## APPENDIX: STATE-SPACE REALIZATION OF THE SCC

Here, we provide derivations for state-space realizations of the SCC. The state-space form describes the system dynamics as a first-order matrix differential equation with multiple states that are perturbed by the inputs. There are many state-space realizations that can be chosen to represent our system. We chose to use the two-state phase-variable form.

We defined the SCC transfer function as follows:
*y*) to the input time-varying angular velocity disturbance scalar (*d*), where *s* is the Laplace variable and τ_{1} and τ_{2} are the SCC time constants determined by fluid dynamics and biomechanics.

By defining

Transforming to the time domain yielded the following formula:

We now defined

It follows that *ẋ*_{2} = *ÿ*′ and *x*_{1} = *y*′. Through substitution, we found the following:

By rearranging, we found the following:

*Equations A5* and *A7* can be written in matrix form as follows:

Recalling from *Eqs. A2* and *A5* that *y* = *y*′(*s*/τ_{2}) and *ẏ*′ = *x*_{2}, we expressed the output equation in the time domain as follows:

Thus, the state-space equations for our system are

- Copyright © 2012 the American Physiological Society