Journal of Neurophysiology

Active Control of Bias for the Control of Posture and Movement

Emmanuel Guigon

Abstract

Posture and movement are fundamental, intermixed components of motor coordination. Current approaches consider either that 1) movement is an active, anticipatory process and posture is a passive feedback process or 2) movement and posture result from a common passive process. In both cases, the presence of a passive component renders control scarcely robust and stable in the face of transmission delays and low feedback gains. Here we show in a model that posture and movement could result from the same active process: an optimal feedback control that drives the body from its estimated state to its goal in a given (planning) time by acting through muscles on the insertion position (bias) of compliant linkages (tendons). Computer simulations show that iteration of this process in the presence of noise indifferently produces realistic postural sway, fast goal-directed movements, and natural transitions between posture and movement.

INTRODUCTION

Motor behavior is a natural and continuous superimposition of movement periods, generally involving large and rapid displacements of focal body segments to subserve goal-directed actions, and posture periods, made of small and slow displacements of the whole body to achieve postural orientation and equilibrium maintenance (Massion 1992). A fundamental function of the nervous system is to coordinate movement and posture so that movement does not compromise equilibrium and postural maintenance does not resist movement initiation.

The nature of the coordination process between posture and movement is unknown and remains a highly debated issue in the field of motor control (Kurtzer et al. 2005; Massion 1992; Ostry and Feldman 2003). The controversy is centered on two possible computational schemes. On the one hand, movement would result from continuous transitions between postures (equilibrium point hypothesis; Feldman and Levin 1995). In this framework, a unique operation based on shifts in the equilibrium position of the moving limb is responsible for maintaining steady postures and creating smooth displacements. On the other hand, coordination could emerge from the combination of separate processes (Franklin et al. 2003), one that translates desired kinematics into appropriate forces (inverse dynamics) and another that creates feedback corrections based on deviations from the desired kinematics (impedance control). The two schemes (equilibrium point hypothesis and inverse dynamics/impedance control) have different qualities, but the same drawbacks. First, given that they elaborate control signals based on a desired trajectory, they fail to account for the flexibility of motor behavior (Bernstein 1967; Todorov and Jordan 2002). Second, they consider posture maintenance as a passive, impedance-based process that is likely to be scarcely robust and stable in the face of transmission delays and low levels of actuator stiffness (Bottaro et al. 2005; Loram and Lakie 2002a; Morasso and Schieppati 1999).

Two observations suggest a different coordination scheme. First, experimental data indicate that posture likely results from a high-level, active, anticipatory process, not only for anticipatory postural adjustments, but also for unperturbed quiet stance (Bottaro et al. 2008; Loram et al. 2001; Morasso and Sanguineti 2002; Morasso and Schieppati 1999), although it is a highly debated issue (Masani et al. 2003; Winter et al. 1998). Second, the parameter used to control posture could have the dimension of a position, i.e., muscle force is not translated directly into joint torque, but modifies the bias (insertion position) of a compliant linkage (tendon) that actually transmits force (Lakie et al. 2003; Loram and Lakie 2002b). A series of experimental studies has shown that anticipatory control of bias is a faithful analog of postural control (Lakie and Loram 2006; Lakie et al. 2003; Loram and Lakie 2002a,b; Loram et al. 2001, 2004, 2005a,b). If we assume that the elementary command for a movement is an optimal feedback control signal that drives the body from its estimated state to its goal (Todorov and Jordan 2002), an elementary command for posture should be a signal of the same nature, applied to the bias of a muscle–tendon unit. Here we show that optimal feedback control of bias captures characteristics of unperturbed postural control.

Experimental and computational background

Consider the following experiment. A subject has to maintain the position of an inverted pendulum near the vertical using a linkage (a spring) between the hand and the pendulum (Fig. 1; Lakie et al. 2003). The subject moves the hand to change the bias (insertion position of the spring relative to an arbitrary origin; Fig. 1), which changes the length of the spring and thus the force applied to the pendulum and the position of the pendulum relative to the vertical (sway; Fig. 1). A simplified mathematical representation of this problem is obtained by writing the dynamics of the pendulum $I\,(d^2\theta/dt^2) = mg\Lambda\sin\theta + \tau_h$, where θ is the angle of the pendulum with the vertical; I, m, and Λ are the inertia, mass, and length of the pendulum, respectively; and τh is the torque applied by the hand. A solution to this task can be obtained in the framework of classical feedback control using $\tau_h = -K\theta$ (1) where K is the stiffness of the linkage, i.e., the subject applies commands that are proportional to the deviation of the pendulum from the vertical. Stability analysis shows that this process is efficient if K > mgΛ, i.e., the stiffness of the linkage is larger than the “stiffness” of the pendulum (for a thorough analysis of classical feedback control for posture, see Bottaro et al. 2005). This mathematical result corresponds to the following intuitive description. If the linkage is rigid (K ≫ mgΛ), the task is rather easy because the subject needs only to keep the hand immobile to properly balance the pendulum. In the case of a compliant linkage (K < mgΛ), however, the torques induced by small deviations of the pendulum from the vertical are not compensated by the passive resistance of the linkage and the pendulum will eventually fall.
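The stability condition can be illustrated with a minimal sketch, assuming the small-angle approximation (sin θ ≈ θ) and ignoring damping, delays, and noise. Inserting Eq. 1 into the pendulum dynamics gives I(d²θ/dt²) = (mgΛ − K)θ, so the sign of K − mgΛ decides between oscillation and fall. The function name and the default values (borrowed from the obj1 parameters listed in methods) are chosen here for illustration only.

```python
import math

def classical_feedback_behavior(K, m=60.0, length=1.0, inertia=60.0, g=9.81):
    """Small-angle behavior of I*theta'' = (m*g*L - K)*theta.

    Illustrative sketch: undamped, no delay, no noise.
    Returns ("oscillates", frequency in Hz), ("falls", growth rate in 1/s),
    or ("marginal", 0.0).
    """
    pendulum_stiffness = m * g * length          # mgΛ, the "stiffness" of the pendulum
    net = (K - pendulum_stiffness) / inertia     # > 0: restoring, < 0: destabilizing
    if net > 0:
        return "oscillates", math.sqrt(net) / (2.0 * math.pi)
    if net < 0:
        return "falls", math.sqrt(-net)
    return "marginal", 0.0

# Stiff linkage (K = 2 mgΛ) versus compliant linkage (K = 0.5 mgΛ)
stiff = classical_feedback_behavior(2.0 * 60.0 * 9.81)
compliant = classical_feedback_behavior(0.5 * 60.0 * 9.81)
```

In this idealized sketch the oscillation frequency is sqrt((K − mgΛ)/I)/(2π), so it grows with K whenever the linkage is stiff enough to balance the pendulum.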

Fig. 1.

Definition of bias and sway for the manual control of an inverted pendulum through a spring. Gray arrows indicate possible directions of hand displacement. The bias is the insertion position of the spring (vertical plain line) measured relative to an arbitrary origin (vertical dotted line). The sway is the deviation of the pendulum from the vertical (dashed line). Note that for small displacements involved in pendulum balancing, the bias and sway can be equivalently represented by linear or angular displacements.

The model defined by Eq. 1 predicts that: 1) the task is successfully executed only for K > mgΛ and 2) for K > mgΛ, the pendulum should oscillate with a frequency that increases with K (see Fig. 2 in Lakie et al. 2003). These predictions are not consistent with the experimental observations of Lakie et al. (2003). They found that subjects can balance the pendulum for a wide range of K (from 54 to 746% of pendulum stiffness) and the frequency of pendulum oscillations (sway frequency) is independent of K (see Fig. 7B in Lakie et al. 2003). Note that for simplicity we describe the results of Lakie et al. in terms of frequency f, although they dealt with duration = 1/(2 × f) (see Data analysis in methods). They further reported characteristics of hand displacements (bias). Bias frequency was about threefold larger than sway frequency except for larger K (see Fig. 7, B and D in Lakie et al. 2003), which means that adjustments in hand position were more frequent than changes in the direction of sway. Sway and bias were negatively correlated with zero timelag for K < mgΛ, and positively correlated with negative timelag for K > mgΛ (see Fig. 6 in Lakie et al. 2003). The two latter results provide some information on the process that governs hand displacements, although they are not easy to interpret. In particular, the negative correlation between sway and bias for K < mgΛ suggests the presence of anticipatory adjustments of bias, but does not prove their existence.

Consider now a second experiment. Subjects stand freely and characteristics of sway and muscle (ankle extensor) length variations are measured (Loram et al. 2004, 2005a,b). Loram and colleagues found that muscle length and sway frequency were somewhat independent of the stiffness of the tendon (Achilles tendon) that transmits muscle force to the body and muscle length frequency was about threefold larger than sway frequency. They also found that, in some subjects, sway and bias were negatively correlated with zero timelag, corresponding to the presence of “paradoxical muscle movements,” i.e., muscle shortening with increasing sway angle and muscle lengthening with decreasing sway angle. As mentioned earlier, these paradoxical movements could correspond to anticipatory adjustments. On this basis, Loram and colleagues proposed that pendulum balancing with the hand is an analog of control during quiet stance. The hand plays the role of the muscle, the linkage is the tendon, and the pendulum is the body.

The implications of these observations are the following: 1) Quiet standing is a postural task when considered from the point of view of the inverted pendulum (the body) to be maintained in equilibrium close to the vertical (target position). From the perspective of the nervous system, however, the problem is to program displacements of the bias (insertion position of the tendon) that should produce tendon forces to displace the pendulum to its target position. This is clearly illustrated in the analog task of pendulum balancing with the hand. A similar analysis could apply to the task of stick balancing on the finger. 2) Adjustments of bias have an active, anticipatory nature that makes them similar to programmed movements. 3) Passive feedback control (Eq. 1) is not appropriate to explain postural control during quiet stance.

Accordingly, it is important to address the nature of the control process that governs the bias. A model that could reproduce characteristics of sway and bias during pendulum balancing and quiet stance could provide new insights into postural control. The central tenet of this study is that the control process for unperturbed posture is an active process that is similar to processes typically advocated for the control of movement (Todorov and Jordan 2002), i.e., a process based on an internal model of the pendulum/body and the neuromuscular system and a state estimator.

This assumption leads to a critical difficulty, which is the following. In qualitative terms, a movement is usually described as a displacement of a “well-defined” amplitude and “well-defined” duration (movement time), whereas posture involves “small,” stochastic, and more or less periodic displacements (sway) over some “undefined” time period. Amplitude and duration can be considered as desired parameters of movement, but neither size nor frequency is a desired property of sway. This means that a movement time has to be specified to displace the pendulum/body to its target position, while there is no overt temporal constraint on the displacement of the pendulum/body. To circumvent this difficulty, we considered the following approach. Assume that you perform a tracking task, such as following a moving target with your finger. At each time, you must reduce the spatial discrepancy between the finger and the target (in fact between their estimated positions). However, because you cannot expect to reduce it instantaneously, you fix a duration (planning time [PT]), corresponding to the time necessary to reach the target if it stops moving, and you compute the corresponding motor command. Then you execute this command for one timestep and you start the process again for the new (estimated) positions of the hand and the target. In this way, you generate a continuous flow of displacements that defines a tracking trajectory. The same process can be applied for postural control.

A second example is interesting to fully explain the notion of planning time. Assume now that you perform a reaching task, e.g., move your arm toward a visual target. Some well-known characteristics of this movement are an almost straight path and a bell-shaped velocity profile. To plan and execute such a movement, you need to know its duration so that your commands generate a properly scaled acceleration profile. During the movement, you must keep track of a remaining time that progressively tends to zero as the hand approaches the target. This representation of time is awkward for at least two reasons. First, when the hand actually reaches the target, time tends to “disappear.” There is no remaining time to control the arm, e.g., for residual errors or to compensate for gravity. At this point, it could be interesting to consider that the end of movement time corresponds to the beginning of a “postural period” governed by a classical feedback controller (Eq. 1). In this way, there is no need to worry about time since control is dictated by the time constant of the controller. Yet we already pointed to the fact that a classical feedback controller is not appropriate for postural maintenance (see preceding text). Second, if movement is perturbed (target jump, force applied to the arm), time could disappear before completion of the movement. Thus you need to somehow reallocate time to complete the movement. In fact, the central problem is the peculiar status of initial and final states compared with intermediate states along the trajectory, i.e., there is no time before the initial state and no time after the final state. A solution to break this asymmetry is to consider that all states are equivalent, i.e., each state is the starting point of a (new) goal-directed behavior defined by a desired final state and a duration (the planning time) to complete this behavior. 
The only constraint to guarantee the equivalence of states is that the planning time should always be nonzero. The choice of the planning time is an open issue. Generally speaking, the planning time is a function of the current state and the desired final state. For instance, it can be an affine function of the distance between the states, corresponding to the existence of amplitude/duration scaling laws observed for different types of movements (Gordon et al. 1994; Hefter et al. 1996).

The origin of scaling laws is unclear, but we have shown that it could be related to the fact that subjects allot the same amount of effort whatever the amplitude of the movement or the carried load (Guigon et al. 2007b). Here we postulated that motor control is a universal process defined along a scaling law. At each time, a displacement is planned and executed based on the estimated distance to the goal and its associated duration (planning time) is prescribed by the scaling law. This process is repeated indefinitely and implicitly defines periods of posture (in the vicinity of zero amplitude) or movement (outside this region), although the distinction is purely arbitrary. We note that a scaling law should be related not only to distance, but also to other states (e.g., velocity) of the controlled object. A more general scaling law does not change the nature of our theoretical construct. For simplicity we assumed that the planning time is constant for the small displacements encountered during quiet stance. It corresponds to a scaling law with a zero slope.

In summary, the goal of this study was to show that a control principle that is appropriate for the production of movement can also explain characteristics of unperturbed postural control. In particular, we want to test, using a mechanistic model, the proposal of Loram and colleagues on the existence of anticipatory adjustments of bias during quiet stance (which has been derived from a correlation analysis). We also want to test the proposed analogy between quiet stance and pendulum balancing with the hand. From a computational perspective, this analogy is not trivial since the two cases involve different dynamics (see description of obj1 and obj2 in methods). Simulations are provided as a technical proof of these proposals. Then we show in a simple case that the same principle can produce natural transitions between posture and movement.

METHODS

General approach

Modeling was cast in the framework of the dynamical systems approach to motor control (Wolpert and Ghahramani 2000) and exploits the theory of optimal feedback control (OFC; Todorov and Jordan 2002; see Fig. 2 in Guigon et al. 2008a). The framework and the theory are described in the appendix.

Rationale

OFC is an engineering technique (Bryson and Ho 1975; Stengel 1986) involving 1) a controller that elaborates appropriate control signals to reach a desired goal for a given state of the system and 2) a state estimator that constructs an estimated state of the system based on commands and sensory feedback. A rationale for a control/estimation architecture in the framework of motor control has been developed by Todorov and Jordan (2002). Central to their analysis is the observation that, for reaching a behavioral goal, the CNS is directly pursuing it rather than trying to reproduce a predetermined pattern that would fulfill it. OFC captures this fact through the “minimum intervention principle” (Todorov and Jordan 2002), i.e., the controller corrects deviations only when they interfere with the task goal. It has been shown that OFC can account for kinematics, kinetics, muscular, and neural characteristics of arm movements (Guigon et al. 2007a,b; Todorov and Jordan 2002), on-line movement corrections (Saunders and Knill 2004; Todorov and Jordan 2002), structure of motor variability (Guigon et al. 2008a,b; Todorov and Jordan 2002), and Fitts' law and control of precision (Guigon et al. 2008a).

Example

We consider an inertial point that can move along a line, actuated by a force generator. Its dynamics is given by $dx_1/dt = x_2(t) + n_{OBJ1}$ and $dx_2/dt = u(t)/m + n_{OBJ2}$, where x1 is the position of the point, x2 is its velocity, m is its mass, u is the control input transmitted by the force generator, and nOBJ1 and nOBJ2 are noises. The state vector is x = [x1 x2], the control vector u = [u], and the noise vector is nOBJ = [nOBJ1 nOBJ2]. This equation can be formally written as Eq. A1 in the appendix, $d\mathbf{x}/dt = \mathbf{A}\mathbf{x}(t) + \mathbf{B}\mathbf{u}(t) + \mathbf{n}_{OBJ}$, where A is a 2 × 2 matrix and B is a 2 × 1 matrix. The state x is not in general known, but can be observed only through a noisy sensor, such as $\mathbf{y}(t) = \mathbf{x}(t) + \mathbf{n}_{OBS}$, corresponding to Eq. A3, where y is the observation vector and nOBS is a noise vector. An optimal estimation of x can be obtained using a Kalman filter, $d\hat{\mathbf{x}}/dt = \mathbf{A}\hat{\mathbf{x}}(t) + \mathbf{B}\mathbf{u}(t) + \mathbf{K}(t)\,[\mathbf{y}(t) - \hat{\mathbf{x}}(t)]$, corresponding to Eq. A7, where K is the Kalman gain matrix. Applying OFC to this object means finding at each time t an optimal control input u(t), i.e., optimal relative to a criterion (Eq. A5) that can displace the inertial point from its current state to a desired final state. For this linear problem, OFC can be solved analytically (Guigon et al. 2008b; Todorov and Jordan 2002), i.e., for initial state x0 at time t0 and final state xf at time tf, a controller can be written as in Eq. A2.
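The estimation part of this example can be sketched in discrete time. The block below is an illustration only, not the implementation used in the study: it assumes an Euler discretization of the point-mass dynamics, additive Gaussian noise with hypothetical covariances Q and R, and a standard discrete Kalman filter with observation matrix equal to the identity (because y = x + nOBS). All function names and numerical values are chosen here for illustration.

```python
import numpy as np

def make_point_mass(m=1.0, dt=0.01):
    """Euler-discretized dynamics of the inertial point, x = [position, velocity]."""
    A = np.array([[0.0, 1.0], [0.0, 0.0]])
    B = np.array([[0.0], [1.0 / m]])
    return np.eye(2) + dt * A, dt * B

def kalman_step(x_hat, P, u, y, Ad, Bd, Q, R):
    """One predict/update cycle; the observation matrix is the identity."""
    x_pred = Ad @ x_hat + Bd @ u                # prediction from the command
    P_pred = Ad @ P @ Ad.T + Q
    K = P_pred @ np.linalg.inv(P_pred + R)      # Kalman gain
    x_new = x_pred + K @ (y - x_pred)           # correction by the innovation y - x_hat
    P_new = (np.eye(2) - K) @ P_pred
    return x_new, P_new

rng = np.random.default_rng(0)
Ad, Bd = make_point_mass()
Q = 1e-6 * np.eye(2)      # hypothetical object-noise covariance
R = 1e-2 * np.eye(2)      # hypothetical observation-noise covariance
x = np.zeros(2)           # true state
x_hat = np.zeros(2)       # estimated state
P = np.eye(2)             # estimate covariance
for _ in range(500):
    u = np.array([1.0])   # constant force, for illustration
    x = Ad @ x + Bd @ u + rng.multivariate_normal(np.zeros(2), Q)
    y = x + rng.multivariate_normal(np.zeros(2), R)
    x_hat, P = kalman_step(x_hat, P, u, y, Ad, Bd, Q, R)
```

The correction term `K @ (y - x_pred)` plays the role of K(t)[y(t) − x̂(t)] in the continuous-time filter; the estimate covariance `P` ends up smaller than the sensor covariance `R` because the filter combines the prediction with the observation.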

Specific use

OFC as described in the preceding example and in the appendix was applied here with a single modification. Final time tf was not fixed but, at each time t, a displacement was planned and executed based on the estimated distance to the goal [using x̂(t) and xf] and its associated duration (planning time) was prescribed by a scaling law (see Experimental and computational background in the introduction). For simplicity we assumed that PT is constant for small displacements, i.e., at each time t, tf = t + PT.
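The replanning scheme with tf = t + PT can be sketched for the inertial point of the preceding example. This is a simplified illustration under stated assumptions, not the controller of the study: the criterion is taken to be the integral of the squared force (the actual criterion, Eq. A5, is given in the appendix), there is no noise, no state estimator, and no activation dynamics. Under these assumptions the optimal force over the planning window is linear in time, and its first value is m[6(xf − x)/PT² − 4v/PT]. Only that first value is executed before the plan is recomputed. The function names are hypothetical; PT = 0.6 s and the 0.05-s timestep are taken from the text.

```python
import numpy as np

def first_command(x, v, x_goal, planning_time, mass=1.0):
    """First value of the minimum-effort force that brings a point mass
    from (x, v) to (x_goal, 0) in planning_time (assumed criterion: integral of u^2)."""
    T = planning_time
    return mass * (6.0 * (x_goal - x) / T**2 - 4.0 * v / T)

def receding_horizon(x0, x_goal, planning_time=0.6, dt=0.05, duration=10.0, mass=1.0):
    """Replan at every timestep with tf = t + PT and execute one timestep of the plan."""
    x, v = x0, 0.0
    trajectory = [x]
    for _ in range(int(round(duration / dt))):
        u = first_command(x, v, x_goal, planning_time, mass)  # plan over [t, t + PT]
        v += dt * u / mass                                    # execute for one timestep
        x += dt * v
        trajectory.append(x)
    return np.array(trajectory)

reach = receding_horizon(x0=0.0, x_goal=1.0)   # a "movement"
hold = receding_horizon(x0=1.0, x_goal=1.0)    # a "posture"
```

With a constant planning time the remaining time never reaches zero: the same rule moves the point to the goal and then holds it there, which is the property exploited in the text.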

Definition of controlled objects

The first object (obj1) was a single inverted pendulum (mass m, length Λ, inertia I) actuated by a muscle–tendon unit (MTU; Fig. 2, A and B). The MTU was a simplified linear version of the nonlinear model of van Soest and Bobbert (1993) (Fig. 2B). Assuming restricted changes in muscle length, the parabolic relationship between muscle force and muscle length was replaced by a linear relationship, the tendon was considered as a linear spring, and parallel elasticity was removed. The force–velocity relationship was also removed because it can be taken into account by the controller (Guigon et al. 2007b). The force transmitted to the pendulum was $F_T = k_T\,[L_T - L_{T0}]^+$, where LT, LT0, and kT are the length, recruitment threshold, and stiffness of the tendon, respectively, and [z]+ = z if z > 0; otherwise [z]+ = 0. The force–length relationship of the muscle was $F_M = a\,k_M\,[L_M - L_{M0}]^+$, where LM, LM0, and kM are the length, recruitment threshold, and stiffness of the muscle, respectively, and a is a dimensionless variable (activation) transmitted by the controller, derived from the control signal u through second-order low-pass filtering (van der Helm and Rozendaal 2000), $\tau\,(da/dt) = -a + e$ and $\tau\,(de/dt) = -e + u$ (2), where e is excitation and τ is the time constant of filtering. Muscle and tendon lengths were obtained using FM = FT, LMTU = LM + LT, and $L_{MTU} = [(\Lambda_i\cos\theta - \Lambda_o)^2 + (\Lambda_i\sin\theta)^2]^{0.5}$, where LMTU is the length of the muscle–tendon unit, θ is the angle of the pendulum with the vertical (sway angle), Λo is the muscle origin length, and Λi is the muscle insertion length.
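How muscle and tendon lengths follow from FM = FT and LMTU = LM + LT can be sketched as follows. When both elements are stretched beyond their thresholds, the two linear force laws give LM = [kT(LMTU − LT0) + a kM LM0]/(kT + a kM). The function below is a hypothetical helper written for illustration; it takes the MTU length as an input rather than computing it from the pendulum angle. The stiffness values are those of the Fig. 2 legend converted to N/m and the thresholds are those listed in methods; the MTU length used in the usage line is an arbitrary illustrative value.

```python
def mtu_state(l_mtu, a, k_m=45e3, k_t=200e3, l_m0=0.08, l_t0=0.30):
    """Return (muscle length, tendon length, force) for a given MTU length and activation.

    Illustrative sketch: lengths in m, stiffness in N/m, force in N.
    If the unit is slack or the muscle is inactive, no force is transmitted and
    the individual lengths are not determined by the force balance (returned as None).
    """
    if a <= 0.0 or l_mtu <= l_m0 + l_t0:
        return None, None, 0.0
    # Force balance: k_t * (l_mtu - l_m - l_t0) = a * k_m * (l_m - l_m0)
    l_m = (k_t * (l_mtu - l_t0) + a * k_m * l_m0) / (k_t + a * k_m)
    l_t = l_mtu - l_m
    force = k_t * (l_t - l_t0)
    return l_m, l_t, force

# Usage with an arbitrary MTU length of 0.40 m and full activation
l_m, l_t, force = mtu_state(0.40, 1.0)
```

In this sketch an increase in activation shortens the muscle and lengthens the tendon at fixed MTU length, which is how a change in muscle length (the bias) changes the force transmitted to the pendulum.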

Fig. 2.

A: architecture of optimal feedback control for obj1 (gray). See text for notations. B: model of a muscle–tendon unit. Schematized force–length relationship for tendon (down left) and muscle (down right). Dashed curves: schematized nonlinear model. C: simulation of a 100-s sway (θ). The task was to maintain the pendulum 3° away from the vertical. D: bias (LM). Parameters were: PT = 0.6 s; τnoise = 25 s; σSINs = 10−2; wdθ/dt = 0.06; σSDNm = 10−4; kM = 45 N/mm; kT = 200 N/mm.

Dynamics of the pendulum was $I\,(d^2\theta/dt^2) = F_g + F_T$ (3), where Fg = mgΛ sin θ. Control of obj1 is a simple analog of the control of an inverted pendulum through ankle musculature (Loram and Lakie 2002b; Loram et al. 2001) and quiet stance (Loram et al. 2004, 2005a,b). The presence of a single actuator corresponds to the fact that these tasks involve mainly the activation of ankle extensor muscles (soleus and gastrocnemius), the flexors being almost silent. In terms of the general formalism described in the appendix, the state vector was x = [θ dθ/dt a e] (n = 4) and the control vector was u = [u] (m = 1). The dynamics of obj1 is given by Eqs. 2 and 3.

The second object (obj2) was a single inverted pendulum (mass mP, length ΛP, inertia IP) actuated by the hand (inertia IH) through a spring of variable stiffness (Fig. 5E; Lakie et al. 2003). Its dynamics is expressed as $I_H\,(d^2\theta_H/dt^2) - h_H k_S\,(h_P\theta_P - h_H\theta_H) = \mu a$ and $I_P\,(d^2\theta_P/dt^2) + h_P k_S\,(h_P\theta_P - h_H\theta_H) - k_P\theta_P = 0$ (4), where θH is the angle of the hand; θP is the angle of the pendulum relative to the vertical; kP = mP g ΛP is the pendulum stiffness; kS is the spring stiffness; hH and hP represent the height of attachment of the spring to the hand and the pendulum, respectively; and a is the activation transmitted by the controller (see preceding text; μ = 1 Nm guarantees homogeneous units). We assumed that θH and θP remained small (sin θH ≈ θH, sin θP ≈ θP). When necessary, hand and pendulum positions were calculated as xH = hHθH and xP = hPθP, respectively. We note that IH represents the equivalent inertia of the biomechanical system that controls the pendulum and is, a priori, unknown. In terms of the general formalism, the state vector was x = [θH θP dθH/dt dθP/dt a e] (n = 6), and the control vector was u = [u] (m = 1). The dynamics of obj2 is given by Eqs. 2 and 4.

Task and boundary conditions

The general task of the controller was to maintain the controlled object at a reference position with zero velocity. For obj1, the state vector was x = [θ dθ/dt a e] and the boundary conditions were x0 = [θ0 0 0 0] and xf = [θf 0 ∅ ∅], where ∅ indicates the absence of boundary value for the corresponding state. For obj2, the state vector was x = [θH θP dθH/dt dθP/dt a e] and the boundary conditions were x0 = [θH0 θP0 0 0 0 0] and xf = [∅ θPf ∅ 0 ∅ ∅]. The rationale for these conditions was to consider only task constraints.

Object noise

Object noise was a multiplicative noise (or signal-dependent noise; SDNm, where m stands for motor; Guigon et al. 2008a; Harris and Wolpert 1998; Todorov 2005) defined by Eq. A8. Since m = 1, then c = 1, C1 = [0 0 0 1/τ] and Ωε = σSDNm.

Observation and observation noise

The observation functions were restricted to kinematic variables (position, velocity). For obj1, from Eq. A6, obs(x) = Hx = [θ dθ/dt], corresponding to visual/vestibular information on the position/velocity of the pendulum. For obj2, obs(x) = [θH θP dθH/dt dθP/dt], corresponding to visual information for the pendulum and proprioceptive information for the hand. The observation noise was an additive noise (or signal-independent noise; SINs, where s stands for sensory; Guigon et al. 2008a; Todorov 2005) defined by Eq. A9. For obj1, Ωω = σSINs × diag [1 wdθ/dt], where σSINs represents the SD of noise, and wdθ/dt defines the relative weight of the two observed states. In the same way, for obj2, Ωω = σSINs × diag [1 wθP wdθH/dt wdθP/dt]. The first weight is 1 so that there is no redundant parameter. The “color” of noise is likely to influence the characteristics of postural sway (Newman et al. 1996; Peterka 2000). On this basis, the assumption was made that at least information of visual or vestibular origin could be corrupted by colored noise. This assumption is necessary to explain detailed characteristics of pendulum balancing, although it is not necessary to account for general characteristics of balancing (see Supplemental Figs. S5 and S6). To simulate different types of colored noise, observation noise was low-pass filtered with a time constant τnoise. The relationship between the scaling factor of noise and τnoise is shown in Supplemental Fig. S1. The nature of noise was specified by matrix Γ that indicated the presence of colored noise (1) or white noise (0) for each element of Ω. For obj1, Γ = diag [1 1]. For obj2, Γ = diag [0 1 0 1].
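The generation of colored noise by low-pass filtering can be sketched as follows. This is a minimal illustration assuming a first-order filter integrated with an Euler step; the order of the filter actually used and the noise scaling factor (Supplemental Fig. S1) are not specified in this section, so no rescaling is applied here. The function names, the seed, and the number of samples are hypothetical; τnoise = 25 s and the 0.05-s timestep are taken from the text.

```python
import numpy as np

def colored_noise(n_steps, dt, tau_noise, sigma=1.0, seed=0):
    """Return (white, filtered) noise samples.

    The filtered signal follows tau_noise * ds/dt = -s + w (Euler step),
    an assumed first-order low-pass filter. No amplitude rescaling is applied.
    """
    rng = np.random.default_rng(seed)
    white = sigma * rng.standard_normal(n_steps)
    filtered = np.empty(n_steps)
    state = 0.0
    alpha = dt / tau_noise
    for i in range(n_steps):
        state += alpha * (white[i] - state)
        filtered[i] = state
    return white, filtered

def lag1_autocorr(x):
    """Lag-1 autocorrelation: near 0 for white noise, near 1 for slowly varying noise."""
    x = x - x.mean()
    return float(np.dot(x[:-1], x[1:]) / np.dot(x, x))

white, colored = colored_noise(n_steps=20000, dt=0.05, tau_noise=25.0)
```

The filtered signal drifts slowly relative to the timestep, which is what distinguishes it from white noise in the simulations.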

Parameters and boundary values

There are five types of parameters in the model: 1) parameters that are fixed, and common to all objects (Δ = 0.1 s, τ = 0.05 s); 2) parameters that are fixed and object-specific (obj1: m = 60 kg, I = 60 kg · m2, Λ = 1 m, Λi = 0.4 m, Λo = 0.05 m, LT0 = 0.3 m, LM0 = 0.08 m; obj2: mP = 51 kg, ΛP = 1.03 m, IP = 64.1 kg · m2, hP = 0.87 m, kS = 58, 74, 94, 106, 124, 149, 186, 249, 746% of kP); 3) parameters that can vary and are object-specific (obj1: kM, kT; obj2: IH, hH); 4) parameters that can vary and are related to the task (σSINs, σSDNm; obj1: wdθ/dt; obj2: wθP, wdθH/dt, wdθP/dt); 5) parameters that can vary and are common to all objects (τnoise, PT). The values of the three latter types of parameters were chosen to match experimental observations. The influence of these choices was assessed in a parametric study (Supplemental Figs. S2–S6). The boundary values were: θf = θ0 = 3° (obj1); θPf = θP0 = 3° (obj2).

Data analysis

Time domain and frequency domain analyses were performed on pendulum sway (variables θ) and muscle length and hand position for which the term bias was used (variables LM for obj1 and hHθH for obj2). In the time domain, a sway was defined as a unidirectional displacement of the pendulum between two extrema. Sway size was the mean magnitude of the sways, sway duration the mean duration of sways. Sway frequency was 1/(2 × sway duration). The same definitions were used for the bias. In the frequency domain, power spectral density of sway velocity (Pvv) was calculated and used to define mean sway frequency as $f_{mean} = \sum_f f \times P_{vv}(f) \,/\, \sum_f P_{vv}(f)$. Mean sway duration was then obtained as 1/(2 × fmean). The same definitions were used for the bias. Line-crossing impedance was defined as the mean slope of the torque–angle curve at peak velocities.
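These two definitions can be sketched with NumPy. The block is an illustration under simple assumptions: the power spectral density is estimated with a raw periodogram (the spectral estimator actually used is not stated here), and sways are detected from sign changes of the sample-to-sample slope. The function names and the sinusoidal test signal are hypothetical.

```python
import numpy as np

def mean_frequency(velocity, dt):
    """Power-weighted mean frequency: sum(f * Pvv) / sum(Pvv), with a raw periodogram."""
    velocity = np.asarray(velocity, dtype=float)
    power = np.abs(np.fft.rfft(velocity - velocity.mean())) ** 2
    freqs = np.fft.rfftfreq(len(velocity), d=dt)
    return float(np.sum(freqs * power) / np.sum(power))

def mean_sway_duration(position, dt):
    """Mean time between successive extrema (one sway = one unidirectional displacement)."""
    slope_sign = np.sign(np.diff(position))
    extrema = np.where(np.diff(slope_sign) != 0)[0] + 1   # sample indices of the extrema
    return float(np.mean(np.diff(extrema)) * dt)

# Test signal: a 0.5-Hz sinusoidal sway sampled every 0.05 s for 100 s
dt = 0.05
t = np.arange(2000) * dt
position = np.sin(2.0 * np.pi * 0.5 * t)
velocity = 2.0 * np.pi * 0.5 * np.cos(2.0 * np.pi * 0.5 * t)

f_mean = mean_frequency(velocity, dt)
duration = mean_sway_duration(position, dt)
sway_frequency = 1.0 / (2.0 * duration)
```

For a pure sinusoid the two routes agree: each sway lasts half a period, so 1/(2 × sway duration) and the power-weighted mean frequency both return the frequency of the sinusoid. For noisy sway they generally differ.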

Numerical methods

The optimal feedback control problem was solved numerically in the following way. Simulation time T was discretized with timestep η = 0.05 s. At each time t, an optimal control problem (Eqs. A1, A2, and A5) was formulated for proper boundary conditions (initial boundary conditions or currently estimated state and final boundary conditions) and a given planning time PT. This problem was discretized using a direct transcription method (N = 200 points; Betts 2001) and solved by a large-sparse nonlinear programming method (interior point method; Wächter and Biegler 2006; details in Tran et al. 2008). The solution (control signal over PT) was integrated with noise (using a differential equation integrator with adaptive stepsize control; odeint in Press et al. 2002) over duration η (Eqs. A1, A3, and A4) to obtain actual and estimated states at time t + η that will serve for initial boundary conditions at the next step. More details to replicate the results are given in the Supplemental material.

RESULTS

Optimal feedback control of bias (see methods) was applied to two kinds of inverted pendulum used in the study of postural control (obj1, obj2). They correspond to different tasks and different levels of control complexity (obj1: quiet stance or control of an inverted pendulum through ankle musculature/control through a muscle–tendon unit; obj2: pendulum balancing with the hand/control through a spring).

Quiet stance

The proposed mechanism is illustrated in Fig. 2A. Postural control was represented by the control of an inverted pendulum (similar in weight and inertia to a human) through a muscle–tendon unit (obj1), i.e., a muscle in series with a tendon (Fig. 2B). The control architecture consisted of an optimal feedback controller (co) that calculates the best command to the muscle (in the sense of a cost function) that allows the pendulum to be displaced from its currently estimated state to a reference state in a given time (planning time [PT]) and an optimal state estimator (est) that provides the best estimate (in the least square probabilistic sense) of the state of the pendulum (methods).

Results consist of 100-s simulations of the control of obj1 in the presence of sensory and motor noise. An example is shown in Fig. 2. The pendulum swayed regularly around its reference position (Fig. 2C) and muscle length (bias) varied more frequently than pendulum position (Fig. 2D).

For an appropriate choice of the parameters (for a parametric study, see Supplemental Figs. S2, S3, and S5), the model reproduced four main characteristics of pendulum balancing (Loram and Lakie 2002b; Loram et al. 2001) and natural postural sway (Loram et al. 2004, 2005a) (Fig. 3):

  • 1 Balance consisted in complex changes in torque with pendulum angle (Fig. 3A; also see Fig. 3 in Loram et al. 2001). The torque required for equilibrium is indicated by a gray line in Fig. 3A. There was no single equilibrium position as the torque crossed the equilibrium line over a range of angles. Balance consisted in a succession of “biphasic throw and catch” patterns (Fig. 3B; also see Fig. 5A in Loram et al. 2001).

  • 2 When different levels of noise were used, sway size varied, but sway frequency remained constant around 0.4 Hz (Fig. 3C; also see Fig. 4, B and C in Loram et al. 2001). In the same way, line-crossing impedance (mean slope of the torque–angle relationship; Fig. 3B) was about 30 Nm/deg, irrespective of sway size (Fig. 3D; also see Fig. 5B in Loram et al. 2001).

  • 3 The cross-correlation function between sway angle and bias revealed a negative correlation (r = −0.58) with zero time lag, corresponding to the presence of “paradoxical muscle movements,” i.e., muscle shortening with increasing sway angle and muscle lengthening with decreasing sway angle (Fig. 3E; also see Fig. 3, A and B in Loram et al. 2005a).

  • 4 Adjustments in bias (muscle length: 1.5 Hz) were 3.2-fold more frequent than sway movements (0.47 Hz), which reveals a form of intermittent control (Fig. 3F; also see Fig. 3 in Loram et al. 2005b), i.e., there were more adjustments of bias than changes in the direction of sway.
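The sign and time lag of the sway/bias relationship described in item 3 can be checked with a normalized cross-correlation. The following is a minimal sketch, not the analysis code of this study: the function names `pearson` and `cross_correlation`, the sampling rate, and the synthetic signals are assumptions made for illustration.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

def cross_correlation(sway, bias, max_lag):
    """Return {lag: r}, where r correlates sway[t] with bias[t + lag]."""
    out = {}
    n = len(sway)
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            a, b = sway[:n - lag], bias[lag:]
        else:
            a, b = sway[-lag:], bias[:n + lag]
        out[lag] = pearson(a, b)
    return out

# Synthetic example (hypothetical data): bias moves opposite to sway
fs = 100                                   # assumed sampling rate (Hz)
t = [k / fs for k in range(20 * fs)]
sway = [math.sin(2 * math.pi * 0.4 * tk) for tk in t]
bias = [-0.5 * s for s in sway]

xc = cross_correlation(sway, bias, max_lag=50)
peak_lag = max(xc, key=lambda lag: abs(xc[lag]))   # lag of the strongest correlation
peak_r = xc[peak_lag]                              # negative: "paradoxical" relationship
```

With these synthetic signals the strongest correlation is negative and occurs at zero lag, which is the pattern the text describes as paradoxical muscle movements.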

Fig. 3.

Simulation of obj1. A: torque vs. position for a 50-s sway taken from Fig. 2C. Gray line: pendulum torque (kP sin θ). B: mean-centered relationship between position and torque at positive peak velocities (i.e., equilibrium position; average over ±0.8 s). Dashed line: regression line (±0.1 s). The slope of the regression line is the line-crossing impedance. C: influence of sway size on sway frequency. Different levels of sway size were obtained with variations in σSINs (10⁻³, 3 × 10⁻³, 5 × 10⁻³, 7 × 10⁻³). Vertical dashed lines: range of sway from Loram et al. (2001). Horizontal solid lines: range of sway frequency from Lakie and Loram (2006). Horizontal dashed line: mean sway frequency from Loram et al. (2001) and Loram et al. (2006b). D: influence of sway size on line-crossing impedance. Same levels of sway size as in C. Horizontal lines: range of line-crossing impedance from Loram et al. (2001). E: cross-correlation between sway angle and bias. F: power spectrum of sway velocity (solid black) and bias velocity (solid gray). Solid vertical lines: mean frequency of the distributions. Dashed lines: data from Lakie and Loram (2006). Same parameters as those in Fig. 2.
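The line-crossing impedance in B is the slope of a regression of torque on angle. A minimal sketch of that slope computation follows, assuming ordinary least squares and hypothetical sample values; the windowing and mean-centering steps of the study are not reproduced, and the function name `regression_slope` is illustrative.

```python
def regression_slope(angle, torque):
    """Ordinary least-squares slope of torque against angle."""
    n = len(angle)
    mean_angle = sum(angle) / n
    mean_torque = sum(torque) / n
    sxy = sum((a - mean_angle) * (q - mean_torque) for a, q in zip(angle, torque))
    sxx = sum((a - mean_angle) ** 2 for a in angle)
    return sxy / sxx

# Hypothetical, noise-free samples around one line crossing
angle_deg = [-0.02, -0.01, 0.0, 0.01, 0.02]
torque_nm = [30.0 * a + 5.0 for a in angle_deg]   # slope of 30 Nm/deg by construction
impedance = regression_slope(angle_deg, torque_nm)
```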

The parametric study (see Supplemental material) shows that these results are highly robust across variations of the parameters. In particular, intermittency is a ubiquitous phenomenon that was observed for every tested combination of the parameters (Supplemental Fig. S2).

The preceding results have been obtained for a value of tendon stiffness (kT) that produces paradoxical muscle movements. We assessed the influence of kT on the characteristics of sway and bias for comparison with experimental results on intersubject variations in tendon stiffness (Loram et al. 2004, 2005a,b). Changes in kT produced slight variations in sway duration (Fig. 4A; also see Fig. 3D in Loram et al. 2005b), sway size (Fig. 4B; also see Fig. 3B in Loram et al. 2005b), and line-crossing impedance (not shown). Variations in bias duration and size were more complex (Fig. 4, A and B). In a lower range of stiffness, bias duration remained almost constant (see Fig. 3E in Loram et al. 2005b) and bias size decreased (see Fig. 3C in Loram et al. 2005b). In an upper range, both duration and size increased with tendon stiffness. This behavior is explained by the fact that there is a quasi-rigid link between the muscle and the pendulum at higher stiffness. This latter behavior was not reported by Loram et al. (2004, 2005a,b), probably because tendon stiffness is not so high in human subjects, but is consistent with results obtained during pendulum balancing with the hand (Lakie et al. 2003; see following text). The correlation between sway angle and bias increased with kT, i.e., the paradoxical movements disappeared at higher tendon stiffness (Fig. 4C; also see Fig. 4 in Loram et al. 2005a). The time lag between sway and bias was zero when correlation was negative, became large and negative around zero correlation, and then increased with correlation (Fig. 4D).

Fig. 4.

Influence of tendon stiffness (kT) on the characteristics of postural sway (obj1). A: sway (open square) and bias (filled square) duration as a function of kT. Horizontal dashed lines: sway and bias duration from Loram et al. (2005b) and Lakie and Loram (2006). B: sway and bias size. C: peak correlation between sway angle and bias as a function of kT. D: time lag between sway angle and bias as a function of kT. Vertical line: kT ≈ kP. Same parameters as those in Fig. 2.

These results show that active control of bias is a robust mechanistic model of postural control during quiet stance.

Pendulum balancing with the hand

According to Loram and colleagues, pendulum balancing with the hand is a faithful analog of postural control during quiet stance. The model was used to test this proposal (obj2; Fig. 5E). The analogy with obj1 is based on the following correspondence: the sway was pendulum position (xP = hPθP), the bias was hand position (xH = hHθH), and tendon stiffness was spring stiffness (kS).

Fig. 5.

Simulation of pendulum balancing with a spring (obj2). A: sway and bias size (xP and xH) and duration as a function of percentage of pendulum stiffness. Dotted lines: 95% confidence interval. Vertical line: 100% of pendulum stiffness. Gray lines and symbols: data from Lakie et al. (2003). B: sway/bias correlation. C: sway/bias time lag. Inset: replot of experimental data to show hidden parts. D: slope of the pendulum position/hand position relationship. E: scheme. Parameters were: PT = 0.5 s; τnoise = 25 s; σSINs = 2 × 10⁻³; wθP = 20; wdθH/dt = 0.1; wdθP/dt = 2; σSDNm = 10⁻³; IH = 25 kg·m²; hH = 0.85 m.

Results consist of 200-s simulations of the control of obj2 for nine values of kS (58, 74, 94, 106, 124, 149, 186, 249, and 746% of the stiffness of the pendulum kP). For an appropriate choice of the parameters (for a parametric study, see Supplemental Figs. S4 and S6), the model reproduced seven main characteristics of pendulum balancing with the hand (Lakie et al. 2003):

  • 1 Balance was successfully maintained for every value of kS.

  • 2 Sway duration was about 1 s and varied little with kS (Fig. 5A; also see Fig. 7B in Lakie et al. 2003).

  • 3 Bias duration was about 0.4 s and varied little with kS, except for large kS (Fig. 5A; also see Fig. 7D in Lakie et al. 2003).

  • 4 Sway and bias sizes decreased with kS (Fig. 5A; also see Fig. 7, A and C in Lakie et al. 2003).

  • 5 The sway/bias correlation increased with kS and was zero for kS ≈ pendulum stiffness kP (Fig. 5B; also see Fig. 6A in Lakie et al. 2003).

  • 6 The time lag between sway and bias was zero for kS < kP, became large and negative for kS ≈ kP, and increased with kS for kS > kP (Fig. 5C; also see Fig. 6B in Lakie et al. 2003).

  • 7 The slope of the pendulum position–hand position relationship increased with kS (Fig. 5D; also see Fig. 4 in Lakie et al. 2003).

It is interesting to note that sway size varied with spring stiffness in the current test (Fig. 5A), but did not vary with tendon stiffness in the preceding test (Fig. 4B), in agreement with experimental observations (Lakie et al. 2003; Loram et al. 2005b).

The parametric study (Supplemental Figs. S4 and S6) shows that these results are highly robust across variations of the parameters. Thus active control of bias is a robust mechanistic model of pendulum balancing with the hand.

Transition between posture and movement

In the preceding simulations, posture was defined by the requirement to drive a pendulum from its currently estimated position to a nearby target position, although the model is not limited to a particular range of target positions. If the final boundary conditions are suddenly modified to specify any new target position, the controlled object should be driven to this position. For simplicity, we assumed that the planning time was independent of movement amplitude. Considering again obj1 (with a pair of antagonist muscles; Fig. 6A, inset), we simulated: 1) movements of constant amplitude and variable durations (Fig. 6A): different durations were obtained by changing the planning time; and 2) movements of constant duration and variable amplitudes (Fig. 6B): a small range of movement amplitude was deliberately chosen to match the hypothesis of restricted changes in muscle length (methods). We observed that the same process maintained posture against gravity (time <0.5 s), generated a displacement with a bell-shaped velocity profile (time >0.5 s), and maintained posture at the end of movement.
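The idea that one process both holds a posture and produces a movement when the target changes can be illustrated on a much simpler object than obj1. The sketch below is a toy, not the model of this study: it assumes a frictionless, unit-inertia double integrator without gravity, noise, delay, or muscle–tendon dynamics. At every step it replans the minimum-effort command that would bring the object to rest at the target within a fixed planning time PT, and applies only the first value of that plan, which for this toy object has the closed form u = 6(target − x)/PT² − 4v/PT. The function name `simulate` and all numerical values other than PT = 0.4 s, the 6° amplitude, and the 0.5-s onset are assumptions.

```python
def simulate(pt=0.4, amplitude=6.0, onset=0.5, duration=3.0, dt=0.001):
    """Replan at each step: reach the target at rest within planning time pt."""
    n = int(round(duration / dt))
    x, v = 0.0, 0.0                 # position (deg) and velocity (deg/s)
    times, pos, vel = [], [], []
    for k in range(n):
        t = k * dt
        target = amplitude if t >= onset else 0.0
        # First value of the minimum-effort plan over a horizon of length pt
        u = 6.0 * (target - x) / pt**2 - 4.0 * v / pt
        # Semi-implicit Euler step of the toy object x'' = u
        v += u * dt
        x += v * dt
        times.append(t + dt)
        pos.append(x)
        vel.append(v)
    return times, pos, vel

times, pos, vel = simulate()
hold = [p for t, p in zip(times, pos) if t <= 0.5]   # posture period before onset
peak_velocity = max(vel)
peak_time = times[vel.index(peak_velocity)]
final_position = pos[-1]
```

In this toy, the same feedback law keeps the object at rest before the target changes, then produces a displacement with a single dominant velocity peak, and then holds the new position. Increasing `pt` slows the response, which parallels the use of planning time to set movement duration in Fig. 6A. Unlike the full model, the toy approaches the target asymptotically and does not reach it exactly at PT.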

Fig. 6.

Simulation of movement for obj1. A: movements of constant amplitude (6°) and variable durations (PT = 0.3/black, 0.4/red, 0.5/blue, 0.6/green s). Movements started at 0.5 s (solid vertical line). Planning time (relative to the beginning of the movement) is indicated by a dashed colored line. B: movements of constant duration (PT = 0.4 s) and variable amplitudes (4/black, 6/red, 8/blue, 10/green deg). Same parameters as those in Fig. 2.

DISCUSSION

The present results together with those of previous modeling studies (Guigon et al. 2007b, 2008a; Todorov and Jordan 2002) suggest that one and the same computational process can generate movement- and posture-like displacements and account for some of their experimentally observed properties. We discuss the implications of our results at three levels: 1) in the framework of the debate between passive versus active view of quiet stance; 2) in the framework of postural control in the broad sense; and 3) in the framework of the debate between common versus separate processes for the control of posture and movement.

Passive versus active view of quiet stance

The proposed theory, which has been derived from the study of unperturbed postural paradigms for single inverted pendula (pendulum and body balancing), states that, in these cases, postural control involves an active, anticipatory process—i.e., an internal model of the body and the neuromuscular system—and a state estimator. The results provided a technical proof of this fact.

A fundamental implication is that posture should be addressed within the scope of the analysis proposed by Todorov and Jordan (2002) on the nature of motor behaviors, i.e., posture is a highly coordinated and flexible behavior. This view is consistent with some experimental and theoretical arguments (Bottaro et al. 2008; Loram and Lakie 2002a; Loram et al. 2001; Morasso and Sanguineti 2002; Morasso and Schieppati 1999), yet it is not in the mainstream of studies on postural control that consider posture as a passive process (Feldman and Levin 1995; Lockhart and Ting 2007; Masani et al. 2003; Peterka 2000; van Soest and Rozendaal 2008; Winter et al. 1998). First, we note that passive stabilization is likely to be scarcely robust and stable in the face of transmission delays and low levels of actuator stiffness (Bottaro et al. 2005; Loram and Lakie 2002a; Morasso and Schieppati 1999). Second, Bottaro et al. (2005) previously showed that, in the description of posture as a fixed point of a classic feedback controller, postural sway is the result of the action of noise and not the action of the controller. In fact, noise and control signals are of the same order of magnitude, corresponding to a physiologically implausible level of noise.

The present results are not totally unexpected since many studies have successfully addressed quiet stance in the framework of optimal control and optimal state estimation (Kiemel et al. 2002; Kuo 1995; Newman et al. 1996; Qu et al. 2007; van der Kooij et al. 1999). Yet there is a fundamental difference between the previous and current approaches (except Newman et al. 1996; see following text), i.e., the optimality criteria are different. The usual optimality criterion is derived from the classic linear quadratic Gaussian formalism and involves the minimization of a combination of control and error (Bryson and Ho 1975). As applied to posture, the error term contains position and velocity terms, i.e., posture results partly from the minimization of the kinematics of sway over a definite time period. Accordingly, posture is to be construed as a trajectory following process, the trajectory being a fixed point. Our results were obtained with a criterion involving only the minimization of controls (Eq. A5), the kinematic variables being constrained by boundary conditions. This criterion was originally proposed for movement production (Guigon et al. 2007b; Harris and Wolpert 1998; Nelson 1983) and was also used for trajectory formation during postural control (Ferry et al. 2004; Martin et al. 2006; Menegaldo et al. 2003). One study (Newman et al. 1996) has shown that the two power-law scaling regimes that are typical of physiological sway movements (Collins and De Luca 1993) actually emerge with this same criterion.

Implications for postural control

A central issue is whether this theory, which accounts for the control of an unperturbed single inverted pendulum, is relevant to the more general case of multijoint redundant kinematic chains in the presence of perturbations. The fact that we considered only a single inverted pendulum could in fact be viewed as a limitation of this study. The reason for this choice is twofold. First, the model is intrinsically able to coordinate systems with multiple degrees of freedom and kinematic and muscular redundancy (Anderson and Pandy 2001; Guigon et al. 2007b; Todorov and Jordan 2002). Second, we have not found sufficiently quantitative data on the control of multiple inverted pendula that would put enough constraints on the model. The successful coordination of a double (e.g., leg/trunk) or triple (shank/thigh/trunk) inverted pendulum with the model could not be considered as a major achievement and, for lack of stringent constraints, would not add further support to the proposed theory.

The issue of perturbations is complex for any theory of motor control. In fact, a theory is in general built to be as simple as possible and to account for the largest set of experimental observations. For instance, optimal feedback control can account for trajectory formation and on-line control of movement, but because it does not include any low-level reflex operations, it cannot implement short-latency corrections induced by unexpected perturbations. At this point, we see a clear limitation of an approach (computational) that is not based on physiological processes. However, there is no reason why a more detailed model could not address postural perturbations.

In summary, the present results do not allow us to draw conclusions on the general issue of postural control, but point to a new theoretical framework for the study of posture.

Coordination of posture and movement

The present theory also contributes to the debate on the coordination between posture and movement. At the most general level, there are three ways to consider this coordination: 1) movement is posture (equilibrium point theory; Ostry and Feldman 2003); 2) posture is movement (present theory); 3) posture and movement are separate processes (Kurtzer et al. 2005; Massion 1992). We focus the discussion on the issue of common versus separate processes and we do not specifically address the equilibrium point theory, which has frequently been discussed in the literature (Feldman and Levin 1995). The central point of the debate is that the different views are based on arguments at different levels. The first two views claim, on a computational ground, that a unique process is necessary for the sake of coordination. In the scheme of Massion (1992), separate movement and posture controllers are considered and interact only through an efferent copy of the commands from the former to the latter. In this configuration, any postural adjustment is unknown to the movement controller and should lead to motor errors. In fact, arguments for separation are mainly anatomical and physiological necessities. For instance, the stretch reflex could be considered as a specific postural pathway that can maintain stable postures through a negative feedback loop (Houk and Rymer 1981), whereas supraspinal inputs to the motoneurons would convey movement-related commands. For the case of unperturbed quiet stance, there is little evidence for an involvement of the stretch reflex (Loram and Lakie 2002a). This example shows that the mere existence of separate anatomical and physiological pathways for posture and movement should not be considered as a conclusive argument in the absence of a computational framework that describes their involvement in the coordination of posture and movement.

A different view of separation is based on the idea that posture and movement processes would pursue different goals or optimize different functions, e.g., related to gravity, control of the center-of-mass or stability issues for the former, and related to velocity, accuracy, or energy savings for the latter. Although this proposal is attractive, there is no specific experimental support for it. On the contrary, results reported by Nishikawa et al. (1999) go against this view. They reasoned that, because the relative contributions of antigravity and movement-related forces vary with movement velocity, optimization in relation to gravity forces should lead to changes in terminal posture with velocity. For instance, the posture at the end of a slow movement should be chosen to minimize the influence of gravity. Their results did not back up this prediction because the terminal posture of three-dimensional redundant arm movements was independent of movement velocity.

Neural bases of posture and movement

The present theory states that postural control involves an internal model of the body and the neuromuscular system and a state estimator. The former element has not yet been formally identified at the neural level, but observations of paradoxical muscle movements and the absence of reflex contribution during postural sway (Loram and Lakie 2002a; Loram et al. 2004) suggest a supraspinal origin. In quadrupeds, activity of motor cortical neurons is closely related to the pattern necessary for postural maintenance (Beloozerova et al. 2003) and the production of anticipatory postural adjustments (Yakovenko and Drew 2009). The presence of an internal model of the motor apparatus has not been proven in these cases, but related experimental and theoretical results in the primate motor cortex concur with this idea (Guigon et al. 2007a; Scott 2007; Todorov 2000). The latter element (state estimator) is likely located in the cerebellum (Wolpert et al. 1998), which is consistent with impaired postural control in the case of cerebellar ataxia (Morton and Bastian 2004). At a more elaborate level, the theory suggests that posture and movement would result from the same anticipatory process. Evidence for common or separate neural processes for posture and movement is clearly mixed (Kurtzer et al. 2005; Sergio et al. 2005). On the one hand, many motor cortical neurons are recruited for both isometric and movement tasks (Sergio et al. 2005). On the other hand, populations of M1 neurons display load-related activity in a task-specific way, i.e., only during a posture or a movement task (Kurtzer et al. 2005). This discrepancy is difficult to settle, but might be related to the general difficulty of inferring the role of a neuron from its discharge pattern (Fetz 1992). Although the theory supports the “common” view of posture and movement, it could also be relevant for the “separate” view. In fact, as mentioned earlier, the theory has been derived from the study of axial posture and might not be adequate to describe postural control of the upper limb, e.g., posture maintenance of the forearm against gravity. A reason for this could be related to properties of tendons: shorter and stiffer tendons would render control of bias similar to control of force (i.e., muscle force is directly translated into joint torque), which is inappropriate for postural control (Ostry and Feldman 2003) and would require an additional postural controller.

Intermittency

The model displays an intermittent behavior characterized by adjustments of bias (muscle length or hand position) that are more frequent than sway (pendulum position) variations. This behavior, which fits observations on postural sway (Lakie and Loram 2006; Loram et al. 2005b), occurs in the absence of intermittent processes in the model. A parametric study reveals that the level of intermittency (ratio of bias and sway frequency) is modulated only by the planning time (i.e., the time to reach the boundary conditions as defined by the amplitude/duration scaling law) and the characteristic of sensory noise (i.e., the frequency content of sensory noise). The fundamental point is that these two parameters should remain unchanged by task conditions (e.g., instructions to the subjects). Conversely, the level of intermittency is not modulated by parameters that could change with the task conditions (e.g., level of noise). Thus intermittency appears to be an intrinsic property of the interaction between the control process and the controlled object, which would explain its ubiquitous presence and invariant nature across experimental studies (Lakie and Loram 2006; Lakie et al. 2003; Loram et al. 2005b). The emergent nature of intermittency in the model contrasts with other approaches in which an intermittent control mechanism is a built-in feature, involving a periodic or state-dependent switching process between active and inactive modes of control (Bottaro et al. 2008). The proposed view of intermittency also differs from classic accounts that ascribe intermittency to constraints that would limit the functioning of a controller (e.g., deadzone, refractory period, etc.; Miall et al. 1993).
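One simple way to quantify the level of intermittency is to compare how often bias and sway reverse direction. The sketch below is an illustration under stated assumptions: the study reports mean frequencies of velocity power spectra (Fig. 3F), whereas this code counts direction reversals, and the function names and the synthetic sinusoids (0.5 and 1.5 Hz) are hypothetical stand-ins for simulated or recorded signals.

```python
import math

def count_reversals(signal):
    """Count changes in the direction of motion (sign changes of the increments)."""
    reversals = 0
    previous = 0.0
    for a, b in zip(signal, signal[1:]):
        step = b - a
        if step == 0.0:
            continue                      # ignore flat segments
        if previous != 0.0 and (step > 0) != (previous > 0):
            reversals += 1
        previous = step
    return reversals

def intermittency_ratio(bias, sway):
    """Ratio of bias reversals to sway reversals."""
    return count_reversals(bias) / count_reversals(sway)

fs, duration = 100, 100                   # assumed sampling rate (Hz) and length (s)
t = [k / fs for k in range(fs * duration)]
sway = [math.sin(2 * math.pi * 0.5 * tk) for tk in t]
bias = [math.sin(2 * math.pi * 1.5 * tk) for tk in t]
ratio = intermittency_ratio(bias, sway)   # close to 3 for these synthetic signals
```

A ratio above 1 means that bias is adjusted more often than sway changes direction. On noisy data, the signals would need smoothing before reversals are counted.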

Colored noise

In previous models of postural control, colored noise was deemed necessary to explain characteristics of postural sway (Newman et al. 1996; Peterka 2000). The same observation was true for the present model, i.e., colored noise in sensory feedback was necessary to reproduce quantitative aspects of pendulum balancing. The exact origin and meaning of this observation are unclear, but might be related to transduction mechanisms at the level of sensory receptors that act as low-pass filters (e.g., Fitzpatrick and Day 2004). Yet the issue of noise is globally orthogonal to the central topic of this study—i.e., the problem of posture and movement control.

Representation of time

Time is central to motor control (Schöner 2002), yet its role in postural control remains unclear. Three proposals can be considered. First, in the framework of classic feedback, there is no explicit time in the control process, since the behavior in the vicinity of a fixed point is governed by a time constant and is thus determined by the parameters of the feedback controller (Supplemental Fig. S7). This view of posture is clearly at variance with experimental observations (Lakie et al. 2003; Loram et al. 2005b). Second, in usual models of movement production, a movement duration is chosen for a given goal and time decreases toward zero as the command to the controlled object unfolds. Extension to posture is not straightforward since there are no well-defined temporal boundaries for postural displacements. More generally, on-line movement perturbations induce updating of movement duration (Prablanc and Martin 1992; Shadmehr and Mussa-Ivaldi 1994), which is not easily captured in this framework. The third proposal is control along an amplitude/duration scaling law (with a nonzero intercept). The present study shows that this mechanism could provide a unified representation of time for posture and movement.

Testing the theory

A critical issue for any computational theory is to show that it can be in some way invalidated. Two directions can be proposed, corresponding to two aspects of the model that could be faulty. First, an extension of our modeling framework to multiple degrees of freedom (DOFs) including the trunk and leg should be able to explain the patterns of coordination between the trunk and leg segments during quiet stance (Creath et al. 2005; Saffer et al. 2008; Zhang et al. 2007), i.e., the angular displacements of the trunk about the hip and legs about the ankle are aligned in phase below 1 Hz and in antiphase above 1 Hz. The simultaneous measurements of ankle/hip displacements and changes in ankle muscle length during quiet stance should provide sufficiently quantitative data for a critical test of the model. Yet a difficulty is that the number of parameters (muscle parameters, muscle insertions and moment arms, parameters of the state estimator) increases with the number of DOFs. Thus a success would not be particularly significant as it could be ascribed to a clever choice of the parameters, but a failure would be highly significant. Second, the influence of the planning time on the characteristics of postural sway (Supplemental Figs. S2A and S3A) could be exploited. Amplitude/duration scaling laws can be modified by task instructions (Brown et al. 1990) and pathological states (Hefter et al. 1996). An interesting case is Parkinson's disease (PD), which induces upward shifts in amplitude/duration scaling laws (and downward shifts in amplitude/velocity scaling) compared with control subjects (see Fig. 6 in Flowers 1976, Fig. 2 in Berardelli et al. 1986, Fig. 5 in Warabi et al. 1986, Fig. 6 in Sheridan et al. 1987, Fig. 3 in Hefter et al. 1996, and Fig. 3 in Pfann et al. 2001). The same effect is also observed for unmedicated versus medicated PD patients (see Fig. 2 in Robichaud et al. 2002). According to the model, an upward shift in amplitude/duration scaling corresponds to an increased planning time that should modify motor control (in terms of velocity for movement and in terms of frequency/amplitude of sway for posture). This is a testable prediction, for instance with a comparison of postural sway between medicated and unmedicated Parkinsonian patients: medication should lead to a decrease in sway duration and sway size (Supplemental Figs. S2A and S3A). A failure to observe these effects would be significant and would invalidate the model. A related idea would be to exploit circadian variations in movement duration (Gueugneau et al. 2009) and to show that they are accompanied by corresponding variations in the characteristics of postural sway.

DISCLOSURES

No conflicts of interest, financial or otherwise, are declared by the author(s).

Footnotes

  • 1 The online version of this article contains supplemental data.

APPENDIX

The model is cast in terms of an interaction between a controlled object (obj), a controller (co), an observer (obs), and a state estimator (est). In this framework, the following variables are used: x is an n-dimensional state vector (bold indicates a vector, italic is for scalar, and underlined is for matrix) that contains position, velocity, … , of obj; u is an m-dimensional control signal provided by co; y is a p-dimensional vector provided by obs, representing observation of the state vector through sensory feedback; x̂ is an n-dimensional vector computed by est as an estimate of x.

The model is made of 1) the controlled object with dynamics $d\mathbf{x}/dt = \mathrm{OBJ}[\mathbf{x}(t), \mathbf{u}(t)] + \mathbf{n}_{\mathrm{OBJ}}(t)$ (A1), where $\mathbf{n}_{\mathrm{OBJ}}$ is an n-dimensional process noise (noise); 2) the controller defined by $\mathbf{u}(t) = \mathrm{co}[\hat{\mathbf{x}}(t), \mathbf{x}_f, t_f, \mathrm{OBJ}]$ (A2), which calculates the appropriate u to displace the object from its estimated state at time t to its goal xf at time tf (boundary conditions, bound); 3) the observer $\mathbf{y}(t) = \mathrm{OBS}[\mathbf{x}(t - \Delta)] + \mathbf{n}_{\mathrm{OBS}}(t)$ (A3), where $\mathbf{n}_{\mathrm{OBS}}$ is a p-dimensional observation noise (noise) and Δ is the time delay in sensory feedback pathways; 4) the state estimator defined by $d\hat{\mathbf{x}}/dt = \mathrm{est}[\hat{\mathbf{x}}(t), \mathbf{y}(t), \mathbf{u}(t), \mathrm{OBJ}]$ (A4), which calculates the state estimate x̂ based on u and observation y.

If co is an optimal controller for the optimality criterion (crit) $J(t) = \int_{t}^{t_f} \mathbf{u}(w)^{2}\,dw$ (A5) and est is an optimal state estimator, the ensemble {co, crit, obs, est, noise}, applied to {obj, bound}, defines an optimal feedback control architecture. To generate the movement of an object from initial state x0 at time t0 to final state xf at time tf, the optimal feedback control (OFC) calculates at each time t in [t0; tf] the best command that displaces the object from its currently estimated state $\hat{\mathbf{x}}(t)$ to its goal state xf in the remaining duration $t_f - t$.
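As a worked illustration of the criterion in Eq. A5, consider a toy object that is not the one used in this study: a unit-inertia double integrator, ẍ = u, with a scalar command u and a goal state at rest. The symbols T, d, v, a, and b are introduced here for convenience. With remaining duration T = tf − t, distance to the goal d = xf − x̂(t), and current estimated velocity v, minimizing the integral of u² under the boundary conditions gives a command that is affine in time:

```latex
\begin{aligned}
u^{*}(s) &= a + b\,s, \qquad 0 \le s \le T,\\
a &= \frac{6d}{T^{2}} - \frac{4v}{T}, \qquad
b = -\frac{12d}{T^{3}} + \frac{6v}{T^{2}}.
\end{aligned}
```

Integrating twice confirms the boundary conditions: v + aT + bT²/2 = 0 and vT + aT²/2 + bT³/6 = d. For a displacement starting at rest (v = 0), the resulting velocity is (6d/T)(s/T)(1 − s/T), a symmetric, single-peaked profile with peak 1.5 d/T at mid-movement. In a feedback implementation only the current value a is applied, and the plan is recomputed from the new state estimate at the next instant.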

To obtain a complete description of the model, obs, est, and noise must be specified. Observation is defined by $\mathrm{OBS}[\mathbf{x}(t)] = \underline{H}\,\mathbf{x}(t)$ (A6), where H is a p × n observation matrix. The estimated state is obtained using a Kalman filter $d\hat{\mathbf{x}}/dt = \mathrm{OBJ}[\hat{\mathbf{x}}(t), \mathbf{u}(t)] + \underline{K}(t)\{\mathbf{y}(t) - \mathrm{OBS}[\hat{\mathbf{x}}(t - \Delta)]\}$ (A7), where K is the n × p Kalman gain matrix (Guigon et al. 2008b). Both dynamics and observation are corrupted by noise (Guigon et al. 2008b; Todorov 2005). Object noise is a signal-dependent noise $\mathbf{n}_{\mathrm{OBJ}}(t) = \sum_{i=1}^{c} \varepsilon_i(t)\,\underline{C}_i\,\mathbf{u}(t)$ (A8), where ε = [ε1 … εc] is a zero-mean Gaussian random vector with covariance matrix Ωε and [C1 … Cc] is a set of n × m matrices (Todorov 2005). Observation noise is a signal-independent noise $\mathbf{n}_{\mathrm{OBS}}(t) = \boldsymbol{\omega}(t)$ (A9), where ω is a p-dimensional zero-mean Gaussian random vector with covariance matrix Ωω. The rationale for Eqs. A8 and A9 is the following. Signal-dependent noise on object dynamics is necessary for OFC to implement a minimum intervention principle (Guigon et al. 2008b; Todorov and Jordan 2002). Signal-independent noise on observation is the simplest form of noise on sensory feedback. Thus Eqs. A8 and A9 specify the simplest noisy environment for OFC.
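The defining property of the object noise in Eq. A8 is that it scales with the command and vanishes when the command is zero. The sketch below illustrates this under simplifying assumptions that are not taken from the study: independent unit-variance ε_i (i.e., Ωε is the identity), small hypothetical matrices C_i, and plain Python lists in place of a numerical library. The function name `signal_dependent_noise` is illustrative.

```python
import random

def signal_dependent_noise(u, c_matrices, rng):
    """Return sum_i eps_i * C_i u, with eps_i ~ N(0, 1) drawn independently."""
    n = len(c_matrices[0])                       # state dimension
    noise = [0.0] * n
    for c in c_matrices:
        eps = rng.gauss(0.0, 1.0)                # one random scalar per noise source
        for row in range(n):
            cu = sum(c[row][col] * u[col] for col in range(len(u)))
            noise[row] += eps * cu
    return noise

# Hypothetical setting: n = 2 states, m = 1 command, c = 2 noise sources
C = [
    [[0.0], [0.1]],     # C_1 (2 x 1), assumed values
    [[0.0], [0.05]],    # C_2 (2 x 1), assumed values
]

# Same seed for each call so that only the command differs
quiet = signal_dependent_noise([0.0], C, random.Random(1))
small = signal_dependent_noise([1.0], C, random.Random(1))
large = signal_dependent_noise([2.0], C, random.Random(1))
```

With identical random draws, a zero command produces no noise and doubling the command doubles the noise, which is the proportionality expressed by Eq. A8.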

The present formalism for optimal control is slightly different from the stochastic optimal control framework of Todorov and Jordan (2002). The difference is related to the optimality criterion (Eq. A5), which includes both a control term and an error term in Todorov and Jordan. The error term is used in place of the hard final boundary constraint, but in fact requires additional parameters to determine the weights of state costs (not only position, but also velocity, etc.) relative to control costs. Despite this difference, the two frameworks share similar properties (Guigon et al. 2008b).
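The difference between the two criteria can be written side by side. The first line restates Eq. A5 together with its hard final boundary condition. The second line is a generic quadratic cost with a terminal error term, written in an assumed notation: the weight matrices Q_f and R are not symbols from this study and stand for the additional parameters mentioned above.

```latex
\begin{aligned}
\text{present model:}\quad & \min_{\mathbf{u}} \int_{t}^{t_f} \mathbf{u}(w)^{2}\,dw
  \quad \text{subject to } \mathbf{x}(t_f) = \mathbf{x}_f,\\
\text{error-term form:}\quad & \min_{\mathbf{u}} \; \mathrm{E}\!\left[
  \big(\mathbf{x}(t_f)-\mathbf{x}_f\big)^{\mathsf{T}} \underline{Q}_f\, \big(\mathbf{x}(t_f)-\mathbf{x}_f\big)
  + \int_{t}^{t_f} \mathbf{u}(w)^{\mathsf{T}} \underline{R}\, \mathbf{u}(w)\,dw \right].
\end{aligned}
```

In the first form the goal is imposed exactly and no state weights are needed. In the second, the relative sizes of Q_f and R decide how closely the goal is approached.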

REFERENCES
