## Abstract

When a stimulus supports two distinct interpretations, perception alternates in an irregular manner between them. What causes the bistable perceptual switches remains an open question. Most existing models assume that switches arise from a slow fatiguing process, such as adaptation or synaptic depression. We develop a new, attractor-based framework in which alternations are induced by noise and are absent without it. Our model goes beyond previous energy-based conceptualizations of perceptual bistability by constructing a neurally plausible attractor model that is implemented in both firing rate mean-field and spiking cell-based networks. The model accounts for known properties of bistable perceptual phenomena, most notably the increase in alternation rate with stimulation strength observed in binocular rivalry. Furthermore, it makes a novel prediction about the effect of changing stimulus strength on the activity levels of the dominant and suppressed neural populations, a prediction that could be tested with functional MRI or electrophysiological recordings. The neural architecture derived from the energy-based model readily generalizes to several competing populations, providing a natural extension for multistability phenomena.

## INTRODUCTION

When observers are presented with an ambiguous stimulus that has two distinct interpretations, their perception alternates over time between the different possible percepts in an irregular manner, a phenomenon known as *perceptual bistability*. Bistability arises in many domains of perception: ambiguous figures (Necker 1832), figure–ground segregation (Rubin 1921), ambiguous motion displays (Hupé and Rubin 2003), auditory segmentation (Pressnitzer and Hupé 2006), and—the domain that has been studied most extensively—binocular rivalry (Blake 1989, 2001; Levelt 1968; Logothetis 1998; Tong 2001; Wheatstone 1838). In addition to generating much experimental work, binocular rivalry has attracted much attention theoretically, and many models have been proposed for it (Bialek and DeWeese 1995; Blake 1989; Dayan 1998; Freeman 2005; Laing and Chow 2002; Lehky 1988; Lumer 1998; Wilson 2003). Although there are fewer quantitative studies of other bistable perceptual phenomena, there is evidence that they share many properties of binocular rivalry alternations (Rubin and Hupé 2004; van Ee 2005). Thus models of binocular rivalry may be generalized to other bistable perceptual phenomena.

Although the alternations seem haphazard, for a fixed stimulus the durations are drawn from a stationary distribution that resembles a skewed Gaussian, typically fit by a gamma or log-normal function (e.g., Lehky 1995; Levelt 1968; Rubin and Hupé 2004). Importantly, alternations occur not only when the two percepts are balanced in strength (equal mean dominance durations), but also when one is significantly stronger than the other. When stimulus parameters that affect the relative strength of the two interpretations are varied continuously, the relative time spent perceiving each changes gradually (Hupé and Rubin 2003; Levelt 1968). In the domain of binocular rivalry, where the strength of each competing percept can be manipulated independently (e.g., by the contrast of the monocular images), two additional important observations were summarized by Levelt (1968). His “Proposition II” states that the imbalance in dominance time caused by weakening only one image (while keeping the other fixed) occurs mainly through an increase of the mean dominance duration of the other (unchanged) image, with little or no effect on the dominance durations of the manipulated one. (Note that this is a statement about absolute mean durations; the fractions of dominance time of each percept obviously both change because they must add up to one.) Levelt's “Proposition IV” states that when the monocular images are strengthened simultaneously, the mean durations of both eyes decrease (i.e., the rate of alternations increases, although the fraction of time spent perceiving each image remains unchanged).

What causes the alternations? Although this is a central question about perceptual bistability, the mechanisms underlying the perceptual switches are not well understood. In most current models, alternations between dominance of two or more competing neuronal populations arise from some form of slow adaptation acting on the dominant population, either in its firing rate or in its synaptic output (synaptic depression) or both, which leads to a switch in dominance to the competing population (Kalarickal and Marshall 2000; Lago-Fernandez and Deco 2002; Laing and Chow 2002; Lehky 1988; Matsuoka 1984; Stollenwerk and Bode 2003; Wilson 2003). In the absence of noise or of fluctuations induced by finite-size effects, models in which switches are caused by adaptation generate alternations with perfect periodicity; we therefore term them *oscillator models*. Importantly, in such models noise is assumed to be an inessential (albeit experimentally inevitable) component of the perceptual alternations.

An alternative possibility is that the main cause of perceptual switching is noise—external, internal, or both. Noise, in the form of unavoidable perturbations, is ubiquitous in the brain at multiple scales, from vesicular release and spiking variability to fluctuations in global neurotransmitter levels. Furthermore, external noise can cause perceptual alternations (cf. Kanai et al. 2005; Lankheet 2006) as can some internally generated noise (e.g., blinks). This raises the possibility that noise is the primary cause for alternations. In this scheme, dominance of each of the competing percepts can be viewed as a stable state of the neuronal dynamics (i.e., attractor; Hertz et al. 1991), with noise causing the system to alternate between them. We therefore term these *noise-driven attractor models*. (Additional involvement of neural adaptation may still be present, but in this scheme it would not play the primary role and would not lead to alternations in the absence of noise.) Attractor models make a fundamentally different prediction than do oscillator models about the consequence of eliminating noise from the system: rather than showing perfectly periodic alternations, they predict that the perceptual alternations would cease—i.e., the system would settle down in one of the two percepts and stay there indefinitely. Although this is a thought experiment that cannot be performed practically, exploring the distinction between the two alternatives theoretically is important for our understanding of the underlying mechanisms.

Herein we present an attractor-based framework in which alternations are induced by noise and are absent without it. The proposal that bistable transitions may be mediated by noise has been made before (e.g., Brascamp et al. 2006; Freeman 2005; Haken 1994; Kim et al. 2006; Lankheet 2006; Riani and Simonotto 1994; Salinas 2003). Our work goes beyond previous models of noise-driven bistability by constructing a neurally plausible attractor model that produces behaviors consistent with the experimental findings summarized earlier. A particular challenge is posed by Proposition II introduced by Levelt (1968) because it implies that increasing input strength to one attractor reduces the energy barrier for the other attractor (Kim et al. 2006), and such behavior does not arise in commonly used energy functions (see, e.g., Hertz et al. 1991). We therefore start by formulating a simple two-well energy function that includes coupling between the input strength to one attractor and the energy barrier of the other. We then derive from the energy function dynamical equations of a rate-based (mean-field) model. The equations suggest a novel network architecture, where information about the input strength of each percept is sent not only to the population representing it but also to the population representing the competing percept. This, in turn, leads to the novel prediction that increasing stimulus strength to one population will reduce the activity level of the competing population by recruiting more inhibition while it is dominant. Finally, we show that the model can also be realized in a spiking neuronal attractor network, using the neuronal architecture derived for the rate-based model, thus providing a more realistic description of the neuronal dynamics during perceptual bistability.

## RESULTS

We start by positing that each of two neuronal populations, labeled *A* and *B*, represents a different possible interpretation of the stimulus. The neural correlate of competition for perceptual dominance is a competition between these populations for higher activity. The activities of the populations are described by their mean firing rates, *r*_{A} and *r*_{B}. We denote by *A*_{on} and *B*_{on} the states of dominance of populations A (*r*_{A} ≫ *r*_{B}) and B (*r*_{B} ≫ *r*_{A}), respectively.

Hypothesizing noise-driven alternations implies a fundamentally different structure of the trajectories in state space (the space of neuronal activities). This is illustrated in the *left* and *right panels* in Fig. 1*A*, which visualize on the plane of population firing rates (*r*_{A}, *r*_{B}) the evolution of the two models over time. (Time is not explicit in this representation: rather, points along the trajectory correspond to snapshots of the state of the system at regular time intervals.) In both oscillator and noise-driven attractor models, perceptual alternations correspond to alternations between points *A*_{on} and *B*_{on}. However, the trajectories in state space that move the system between these states are fundamentally different in the two cases. In oscillator models (Fig. 1*A*, *left*), the alternations follow a cyclic trajectory in the plane (*r*_{A}, *r*_{B}), caused by the deterministic effect of the slow negative feedback provided by the adaptation. Thus the system is characterized by a limit cycle, with a large proportion of the time spent around *A*_{on} and *B*_{on}. In contrast, in our noise-driven model (Fig. 1*A*, *right*) *A*_{on} and *B*_{on} are stable fixed points, or attractors, like those obtained from a stimulus that activates only one of the populations (e.g., by turning off one of the monocular images in binocular rivalry). All trajectories in state space approach either *A*_{on} or *B*_{on}, depending on which side of the diagonal (separatrix) they originate from. The alternations between the two stable states are attributed to noise that occasionally allows the system to overcome the energy barrier between them. In the presence of noise, dominance alternates over time as the system visits the states *A*_{on} and *B*_{on} in turn (Fig. 1, *center plot*, oriented diagonally). In the absence of noise, the system would flow into one of the stable states (depending on initial conditions) and stay there indefinitely.

Because bistable perceptual alternations are not regular, but rather appear stochastic, many oscillatory models also assume a role for noise (Kalarickal and Marshall 2000; Lago-Fernandez and Deco 2002; Lehky 1988; another proposal is that the irregularity of alternations arises from finite-size effects; Laing and Chow 2002). It is therefore important to sharpen and clarify the distinction we make between two types of models, and the two panels in Fig. 1*A* are particularly useful for that.

We use the term “noise-driven attractor model” to refer to any system where the points *A*_{on} and *B*_{on} in state space are stable fixed points, i.e., a system that will not undergo alternations between these states in the absence of noise. Such a system may or may not also contain adaptation, as long as the adaptation is not strong enough to drive alternations by itself (i.e., when noise is eliminated). Indeed, as will be seen later, in our model we use weak adaptation to adjust the form of the distributions of dominance durations so that they resemble those observed experimentally. However, because this adaptation is too weak to drive alternations when noise is eliminated, its inclusion does not change the noise-driven nature of the model.

Conversely, by “oscillatory model” we refer to one where the points *A*_{on} and *B*_{on} are *not* stable fixed points, but rather belong to a limit cycle that is the only stable state (when both populations A and B are stimulated). Such a system may or may not also contain noise (e.g., to introduce jitter in the dominance durations) so long as the noise does not destroy the stable limit cycle. (One may construct more complex systems where state space changes over time from having attractors to oscillatory stable states; such systems would not fall into either preceding category and we do not study them here.)

### A one-variable energy model for bistability

The two-attractor structure of state space proposed in Fig. 1*A* (*right*) naturally leads to a description by an energy function with two local minima, corresponding to *A*_{on} and *B*_{on}, and a barrier corresponding to the separatrix (Fig. 1*B*). We therefore first sought to find an energy-based formalism to describe the dynamics of perceptual alternations. This formulation will later shed light on how to build more realistic rate-based and spiking neural networks. The observations summarized by Levelt (1968) about the effect of stimulation strengths on the mean dominance durations of each percept present an important challenge in the construction of an energy function. A simplistic extension of commonly used energy functions would produce a system where increasing the input strength to percept A would deepen its own minimum. This would make it harder for the system to escape from A, which in turn would increase its mean dominance durations. This is at odds with Levelt's Proposition II, which states that the main effect of increasing the input strength to A is a decrease in the mean dominance duration of B. As observed by Kim et al. (2006), the latter behavior implies that an increase of the input to A has the effect of lowering the energy barrier of the population representing percept B (Fig. 2*A*, *right*). Similarly, Proposition IV (alternation rate grows as both stimuli are strengthened) implies that increasing both inputs lowers the energy barrier (Fig. 2*A*, *left*). A simple energy function that has these two properties is

$$E(\Delta r) = \Delta r^{2}\,(\Delta r^{2} - 2) + g_A\,(\Delta r - 1)^{2} + g_B\,(\Delta r + 1)^{2} \tag{1}$$

where the single variable Δ*r* = *r*_{A} − *r*_{B} is the difference between the firing rates of the two competing populations and *g*_{A} and *g*_{B} are their input strengths. The minima are located close to Δ*r* = ±1 (states *A*_{on} and *B*_{on}, respectively; for simplicity, the firing rates are dimensionless here). The first term of the energy function ensures that there are two local minima for small values of the stimulus strengths. The next two quadratic terms are proportional to the stimulation strengths; each increases the energy of the competing minimum without changing its own minimum energy.

For a model based on an energy function the dynamic variable satisfies dΔ*r*/d*t* = −τ^{−1}d*E*(Δ*r*)/dΔ*r*. This means that the dependent variable Δ*r* moves along the horizontal axis of the energy function (Fig. 2*A*) toward the location of the closest minimum with a velocity proportional to the slope of the function. Because the slope of the energy function is zero at the minima, those points are fixed points of the dynamics. In addition to the deterministic rule specified earlier, we introduce a noise source to allow random transitions between the minima. The time evolution is therefore given by

$$\tau \frac{d\Delta r}{dt} = -\frac{dE(\Delta r)}{d\Delta r} + n(t) \tag{2}$$

Here τ (set to 10 ms in the subsequent simulations) is the timescale on which Δ*r* changes and *n*(*t*) is a colored Gaussian noise (see *Eq. A1*, appendix A). Because perception of stimulus *A* happens whenever the firing rate of population A is higher than that for population B (*r*_{A} > *r*_{B}), an alternation occurs when the variable Δ*r* crosses zero. These dynamics generate trajectories that linger for a short while around one of the fixed points and then move to the other (Fig. 1*B*, *middle*, oriented diagonally).
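The noise-driven dynamics described above can be sketched numerically. The following is a minimal illustration, not the authors' simulation code. It assumes that the double-well term of the energy is the quartic Δr²(Δr² − 2), that the quadratic input terms are *g*_{A}(Δ*r* − 1)² and *g*_{B}(Δ*r* + 1)², and that *n*(*t*) can be approximated by an Ornstein-Uhlenbeck process with a 100-ms correlation time. The noise amplitude `sigma`, the step size `dt`, and all function names are hypothetical choices made only for this example.

```python
import math
import random


def energy_slope(dr, g_a, g_b):
    """dE/d(dr) for the assumed energy dr^2 (dr^2 - 2) + g_a (dr - 1)^2 + g_b (dr + 1)^2."""
    return 4.0 * dr * (dr * dr - 1.0) + 2.0 * g_a * (dr - 1.0) + 2.0 * g_b * (dr + 1.0)


def simulate_durations(g_a, g_b, sigma=1.0, tau=0.010, tau_n=0.100,
                       dt=0.001, t_max=100.0, seed=0):
    """Euler integration of tau * d(dr)/dt = -dE/d(dr) + n(t).

    n(t) is modeled as an Ornstein-Uhlenbeck process with correlation time
    tau_n and stationary standard deviation sigma (both assumed values).
    Returns dominance durations in seconds for percepts "A" (dr > 0) and "B".
    """
    rng = random.Random(seed)
    dr, noise = 1.0, 0.0            # start in A_on with zero noise
    dominant = "A"
    t_switch = 0.0                  # the first duration is measured from t = 0
    durations = {"A": [], "B": []}
    kick = sigma * math.sqrt(2.0 * dt / tau_n)
    for step in range(1, int(t_max / dt) + 1):
        noise += -noise * dt / tau_n + kick * rng.gauss(0.0, 1.0)
        dr += (dt / tau) * (-energy_slope(dr, g_a, g_b) + noise)
        now_dominant = "A" if dr > 0.0 else "B"
        if now_dominant != dominant:    # dr crossed zero: a perceptual switch
            t = step * dt
            durations[dominant].append(t - t_switch)
            dominant, t_switch = now_dominant, t
    return durations


def mean(xs):
    return sum(xs) / len(xs)


if __name__ == "__main__":
    for g in (0.1, 0.4):
        d = simulate_durations(g, g)
        both = d["A"] + d["B"]
        print(f"g_A = g_B = {g}: {len(both)} switches, mean duration {mean(both):.2f} s")
```

Setting `sigma=0.0` in this sketch leaves Δ*r* in the well it started in, which is the defining property of a noise-driven attractor model. Raising both inputs together should shorten the mean durations, and raising only `g_a` should mainly shorten the durations of B, but the numbers depend on the assumed noise parameters.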

When alternations between two states are driven purely by noise, the lifetimes of each state are distributed exponentially (Kramers 1940). With a refractory period or other biophysical constraints, the distribution of dominance durations is nearly exponential, with the peak determined by the timescale of the noise (e.g., Kramers 1940; van Kampen 2001). As can be seen in Fig. 1*C*, this is also the case for the model as formulated in *Eq. 2*, where the peak of the distribution of dominance durations is at the timescale of the noise we used, approximately 100 ms. This is very far from experimentally observed distributions, which typically peak at timescales of seconds and have a shape resembling a skewed Gaussian (Lehky 1995; Levelt 1968). One may propose to address this by assuming that perceptual alternations are driven by noise sources that act at a slower timescale, of seconds (e.g., originating from endogenous attention modulations and/or global neurotransmitter levels). This approach is limited, however: while it can certainly lengthen the dominance durations, it is not sufficient to fit the shape of their distributions (see following text, Fig. 7). We therefore favor another approach, which can yield both realistic means and shape of distributions of dominance durations. We propose that biophysical noise sources characterized by fast timescales (∼100 ms) do play a major role in causing alternations. However, unlike in the simple model of *Eq. 2*, we suggest that in reality there are additional mechanisms that effectively reduce the probability that the system leaves an attractor right after it has settled into it, compared with the probability of later transitions. There is independent evidence for the existence of such “short-term persistence” mechanisms (see discussion and Leopold et al. 2002), but their precise nature is not well understood. In the model presented in the following sections, we achieve this tendency by adding a weak adaptation current.
The initial activity level of the dominant population (i.e., immediately after transitions) will be too high for the noise to push the system to the competing state. Over time, however, the weak adaptation will bring the activity to a slightly lower level, comparable to that of the noise amplitude, so that the probability of transition will increase. Importantly, however, in our model adaptation alone (i.e., without noise) will not be enough to cause alternations, i.e., they will still be noise driven. In terms of the state-space and the energy landscape (Fig. 2), the effect of adaptation will be to add a slow, time-dependent forcing that effectively reduces the depth of the minimum associated with the dominant percept over time. However, the adaptation will be too weak to destroy the energy minima (i.e., to destabilize the states *A*_{on} and *B*_{on}), and therefore noise will still be crucial for dominance switches.

The mean dominance durations of each attractor state, calculated from simulations of *Eq. 2* for different input strengths, show that the system indeed satisfies Levelt's propositions (Fig. 2*B*, solid lines). This is a direct consequence of our choice of energy function, specifically the dependence of the height of energy barriers on input strength. This dependence arises from the two terms where the input strengths (*g*_{A} and *g*_{B}) are multiplied with the state variable (Δ*r*). Although the effect of *g*_{B} on *T*_{A} is much larger than that on *T*_{B} (Fig. 2*B*, *right*), the effect on *T*_{B} is not negligible. This is because, although increasing *g*_{B} greatly reduces the energy barrier for *A*_{on}, it also slightly increases the barrier for *B*_{on}. This behavior is consistent with experimental results (Brascamp et al. 2006). In the next section we will see that the coupling between the input strength to one population and the energy barrier of the other, posited to obtain the experimentally observed dependencies of mean dominance durations on stimulus strength, motivates a novel network architecture and leads to novel predictions about the levels of activity of the neural populations.

### Derivation of a rate-based model and network architecture

In this section we construct a rate-based network model based on the energy description of *Eq. 1*. We first extend *Eq. 1* to a two-variable energy function (*Eq. B1*, appendix B). This energy describes the dynamics of two populations, *A* and *B*, through their firing rates *r*_{A} and *r*_{B}. We then derive from the two-variable energy function two coupled differential equations describing the dynamics of the two populations' firing rates in the presence of noise (*Eq. B4*).

The dynamics equations determine the time evolution of the firing rates of the two populations and can be interpreted as originating from an underlying neural network. Indeed, the neural populations in the architecture presented in Fig. 3*A* obey the dynamics derived from the two-population energy function. Each population has recurrent excitation and each inhibits the other through direct cross-connections. (Although the schematic indicates that both excitation and inhibition emanate from a single population, this connectivity could be achieved with excitatory and inhibitory subpopulations; not shown.) The network shares a basic feature with many other models of bistability: to ensure that only one population is active at any time (“mutual exclusivity”; Leopold and Logothetis 1999; Rubin 2003), mutual inhibition is exerted between the two populations (Blake 1989; Laing and Chow 2002; Wilson 2003). Our model, differing from some others, requires strong recurrent excitatory connections to produce robust winner-take-all behavior for relatively weak inputs. However, for very weak inputs a single low-activity resting state is the only attractor.

A novel feature of the model that is clearly visible in the architecture is that the local inhibitory subpopulations (small circles in Fig. 3*A*) are driven by the total external stimulation. A crucial point here is that the external input to these subpopulations constitutes not only a copy of the external input to “their” excitatory population, but also the input sent to the competing population (e.g., to the other eye). Moreover, the rate-based equations (*Eq. B4*) require that this total external input be gated by the activity level of the corresponding excitatory population, so that each recurrent population *k* (*k* = *A*, *B*) receives back inhibition equaling (*g*_{A} + *g*_{B})*r*_{k}. This feature is a consequence of the multiplicative terms *g*_{k}(Δ*r* ± 1)^{2} in the one-variable energy function (*Eq. 1*). Recall that those terms were required to make the model behave in accordance with propositions II and IV of Levelt (1968). Inspection of *Eq. B4* now sheds light on how the multiplicative terms give rise to these behaviors. Increasing the input to one population, say A, results in stronger inhibition to it when it is dominant (i.e., when *r*_{A} = 1) and also in stronger inhibition to population B when the latter is dominant (i.e., when *r*_{B} = 1). At the same time, the increase of *g*_{A} also provides additional excitatory input to population A, and therefore the total input to it remains largely unaffected when it is dominant. In contrast, population B does not enjoy stronger excitatory input from the increase in *g*_{A}, and therefore its total input while it is dominant is reduced by an amount *g*_{A}*r*_{B}. Consequently, the mean dominance duration of B is reduced because less noise is now required to kick it out of dominance; meanwhile, the mean dominance duration of population A remains nearly unchanged (Levelt's proposition II) because there is not much change to its total input while dominant. This argument does not apply when the input to A is so large that state *B*_{on} is close to disappearing and *A*_{on} becomes the only stable state of the system (see last subsection in results). Similarly, simultaneous increases of the input strength to both populations cause enhanced inhibition to both during dominance, and therefore an increase in alternation rate (Levelt's proposition IV). As for the question of how the multiplicative terms (*g*_{A} + *g*_{B})*r*_{k} may be implemented, they can be realized in a biophysically plausible way by a nonlinear input–output transfer function for the neurons of the inhibitory subpopulations (see, e.g., the quadratic function in the next version of the model, *Eqs. B5*–*B7*).
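The same asymmetry can be checked directly on the one-variable energy function. The estimate below is a rough worked example, not taken from the paper's appendices. It assumes the double-well term has the quartic form Δr²(Δr² − 2) and keeps only first-order terms in the inputs; to this order the small shifts of the minima and of the barrier top do not change the energies, because the slope vanishes at those points.

```latex
\begin{aligned}
E(\Delta r) &= \Delta r^{2}(\Delta r^{2}-2) + g_A(\Delta r-1)^{2} + g_B(\Delta r+1)^{2} \\
E(+1) &\approx -1 + 4g_B, \qquad E(-1) \approx -1 + 4g_A, \qquad E(0) \approx g_A + g_B \\
\text{barrier out of } A_{on}: &\quad E(0) - E(+1) \approx 1 + g_A - 3g_B \\
\text{barrier out of } B_{on}: &\quad E(0) - E(-1) \approx 1 - 3g_A + g_B \\
g_A = g_B = g: &\quad \text{both barriers} \approx 1 - 2g \quad \text{(exactly } (1-g)^{2} \text{ for } g < 1\text{)}
\end{aligned}
```

Under these assumptions, raising *g*_{A} lowers the barrier out of *B*_{on} three times faster than it raises the barrier out of *A*_{on}, in line with proposition II, and raising both inputs together lowers both barriers, in line with proposition IV.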

Finally, we modify the architecture to achieve more plausible generalization to perceptual multistability, i.e., when the number of competing percepts *N* is >2 (Rubin 2003; Suzuki and Grabowecky 2002). A simplistic generalization of the architecture in Fig. 3*A* would require each of the *N* populations to send direct inhibitory connections to all other populations, causing the number of connections to grow as *N*^{2} and implying that each population needs to have knowledge of all its potential competitors. These problems are solved by the alternative architecture shown in Fig. 3*B*, which consists of a common neural pool that is driven by all of the external inputs, and sends by excitatory connections information about the total summed input to all of the local inhibitory subpopulations, which in turn inhibit their respective excitatory populations as discussed earlier. This eliminates the need for direct connections between the neural populations representing the different percepts and reduces the number of required connections from *O*(*N*^{2}) to *O*(*N*).
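The scaling argument can be made concrete with a small count. This is only a bookkeeping sketch under an assumed tally: the function names are hypothetical, and the common-pool count assumes one connection from each external input to the pool, one from the pool to each local inhibitory subpopulation, and one from each inhibitory subpopulation to its own excitatory population.

```python
def direct_inhibition_connections(n):
    """Every one of n populations sends inhibition to each of the other n - 1."""
    return n * (n - 1)


def common_pool_connections(n):
    """Assumed tally: n inputs -> pool, pool -> n inhibitory subpopulations,
    and n inhibitory subpopulations -> their own excitatory populations."""
    return 3 * n


if __name__ == "__main__":
    for n in (2, 5, 10, 50):
        print(n, direct_inhibition_connections(n), common_pool_connections(n))
```

With this tally the common-pool design only pays off in raw connection count for *N* > 4; for small *N* its appeal is rather that no population needs to know about its competitors.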

### Dynamics of the noise-driven rate-based model and the role of weak adaptation

We have simulated a two-population rate-based model using the architecture in Fig. 3*B* with the addition of weak adaptation currents (for details see appendix B, second section). Figure 4 presents time courses for the relevant dynamical variables of an excitatory neuronal population that undergoes an alternation from the suppressed to the dominance state and back to the suppressed state. Traces are shown for two different conditions: weak (gray) and strong (black) stimulation. [Equal stimulation was applied to the two populations in each case; to facilitate direct comparison between the two conditions, we used the same noise *n*(*t*) for both simulations.]

We first use Fig. 4 to further explain the effect of the weak adaptation in our model because it is fundamentally different from that in oscillatory models. The dashed curved traces in Fig. 4, *A* and *B* show the activity level of the dominant population and of the total input to it, respectively. A slight decline over time is clearly evident in the noise-free system, but it also exists for the mean activity and mean total input in the presence of noise. This decline is caused by the gradual increase of the adaptation current (Fig. 4*D*; the adaptation does not exhibit rapid fluctuations because it integrates the activity slowly; cf. *Eq*. B*5*). Because the total input to the excitatory population is given by (cf. *Eq*. B*5*), its mean decreases as the adaptation current increases over time. Note, however, that the asymptotic value of the total input is well above the transition threshold of the system (horizontal line in Fig. 4*B*). The adaptation is therefore not sufficient to drive dominance switches by itself. Instead, transitions occur by chance, when noise-evoked fluctuations bring the total input below the threshold. Thus if noise is removed from the model, the system would never show alternations. This noise-driven switching mechanism is fundamentally different from what happens in oscillator models: in those, the adaptation is taken to be strong enough to cause switching in dominance by reducing the total input below the transition threshold, even in the absence of noise.

Although the weak adaptation does not drive alternations in our model, it serves another important purpose: it provides a mechanism to make the probability of transition time dependent. Because the mean and the amplitude of the fluctuations in our model do not change over time, without adaptation the probability that the noise would cause the total input to dip below threshold would have been constant in time. This, in turn, would have yielded exponential-like distributions of dominance durations, peaking at the timescale of the noise (recall that without weak adaptation brief dominance durations of ∼100 ms are much more likely to occur; Fig. 1*C*). The weak adaptation provides a time-varying mean input that disfavors early transitions in comparison with later transitions, thus providing a form of “short-term persistence.”
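The effect of a time-dependent transition probability on the shape of the duration distribution can be illustrated without the full network. The sketch below is a toy illustration, not the rate-based model: it draws dominance durations from a constant switching rate and from a rate that starts at zero after each switch and saturates. The saturating form, its 2-s time constant, and the 1-per-second plateau are hypothetical stand-ins for the effect of weak adaptation.

```python
import math
import random


def constant_hazard(t):
    """Switching rate (per second) that does not depend on time since the last switch."""
    return 1.0


def rising_hazard(t, h_inf=1.0, tau_a=2.0):
    """Hypothetical rate that is zero right after a switch and saturates at h_inf."""
    return h_inf * (1.0 - math.exp(-t / tau_a))


def sample_duration(hazard, rng, dt=0.01):
    """Step forward in time; in each step switch with probability hazard(t) * dt."""
    t = 0.0
    while True:
        t += dt
        if rng.random() < hazard(t) * dt:
            return t


def coefficient_of_variation(samples):
    m = sum(samples) / len(samples)
    var = sum((x - m) ** 2 for x in samples) / len(samples)
    return math.sqrt(var) / m


if __name__ == "__main__":
    rng = random.Random(0)
    for name, hazard in (("constant", constant_hazard), ("rising", rising_hazard)):
        durations = [sample_duration(hazard, rng) for _ in range(5000)]
        print(name, "CV =", round(coefficient_of_variation(durations), 2))
```

A constant rate gives an exponential-like distribution with a coefficient of variation close to one and its peak at the shortest durations. The rising rate suppresses early switches, which moves the peak away from zero and narrows the distribution relative to its mean.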

In terms of the energy landscape (Fig. 2), the weak adaptation can be thought of as causing slow changes in the shapes of the energy wells around *A*_{on} and *B*_{on} and the energy barrier between them (but without completely destabilizing the two attractors). Specifically, over time there is a decrease in size of the basin of attraction associated with the dominant percept, together with a shift of the separatrix (approximately the peak of the energy barrier) toward the same attractor. Figure 5 provides an example of an individual trajectory of the system, illustrating a transition from *B*_{on} to *A*_{on} on the plane (*r*_{A}, *r*_{B}), the change in the location of the separatrix over time, and the absence of transition without noise.

Returning to Fig. 4, we next examine it to gain further understanding of the dependence of dominance durations on stimulation strength. During dominance, the activity level and the total input (Fig. 4, *A* and *B*, respectively) are both slightly higher in the weak stimulation condition (*gray traces*) than in the strong stimulation condition (*black traces*). This, in turn, results from a higher inhibition when the competing stimulus is stronger (Fig. 4*C*). As a result, the total input during dominance is closer to the threshold for stronger competing stimuli, so that transitions tend to occur sooner, in accordance with Levelt's propositions II and IV. The dotted lines in Fig. 2*B* confirm that the rate-based model indeed obeys these propositions over a wide range of stimulus strengths. For weak enough inputs, alternation behavior gives way to quiescence: the bistable attractor states disappear and a single (resting) low-activity attractor state for both populations is the only available firing pattern. The transition regime between the alternation mode (*A*_{on} or *B*_{on}) and quiescence is sharp, and it is characterized by the presence of random sequencing between three states: *A*_{on} alone, *B*_{on} alone, and the resting state. For weak inputs, the resting state dominates most of the time, whereas for less weak inputs the *A*_{on} and *B*_{on} states alternate. Although sharp, the transition regime is continuous, with the resting state occupying an increasing fraction of time as the stimulus strength decreases. One may interpret this continuous transition as corresponding to the stimulus detection threshold. For very large input strengths, the system can oscillate even without noise because of the presence of weak adaptation; for yet stronger stimuli, steady coactivity of the two populations occurs (the latter is also observed in adaptation-based models; Shpiro et al. 2007). However, such large inputs are not likely to be experienced in reality because of gain control mechanisms that operate at multiple levels of sensory processing. The existence of a large range in which a winner-take-all regime is present between the low- and high-input strength regimes is controlled by the strength of the recurrent connections of the excitatory populations. We have established a set of conditions for the network connectivity parameters that approximately determine when the attractor states exist (appendix B, third section).

To further examine the effect of noise on dominance transitions we calculated averages of the time courses of input noise synchronized to specific transition events [“switch-triggered-averages” (STAs)]. The solid curve in Fig. 6*A* shows the STA for transitions from suppressed to dominant states of one population (arbitrarily chosen) and the dashed curve shows the STA for the reverse transitions of the same population. The curves indicate that transitions tend to occur when there are simultaneous increases in input noise to the population switching to dominance and decreases in input noise to the population becoming suppressed. (Note that the transitions occur with a short delay after the coincidental noise fluctuations in the two populations, reflecting the neuronal integration timescale.) Figure 6*B* shows that this tendency holds for individual transitions, not just for averages. Each point in the figure represents the values of the input noise to population A against that of population B at moments of transition. In spite of the variability in the noise values at individual transition events, there is a clear and almost complete separation between the two clouds, with the dot symbols, indicating transitions of population A from suppressed to dominant, clustering in the *bottom right quadrant* (i.e., when *n*_{A} > 0 and *n*_{B} < 0), and the cross symbols, indicating the opposite transitions, clustering in the opposite quadrant. Furthermore, the clouds of points are elongated with slope near one, suggesting that a stronger-than-average positive fluctuation in the input to A can push it to dominance even if B receives a weaker-than-average negative fluctuation, and vice versa.

Recently, Lankheet (2006) conducted an experiment to test the effect of modulations in the external (stimulus) noise on perceptual transitions. Two random-dot kinematograms with different directions of motion were used as competing stimuli in a binocular rivalry paradigm. The coherence levels were modulated in a pseudorandom fashion as observers continually indicated their percept. Lankheet then calculated the STAs of the coherence in the two stimuli. STAs associated with transitions from suppression to dominance of an eye showed a peak just before the transition and those of the other eye showed a negative (if weaker) peak. The experimental STAs resemble our simulated STAs in Fig. 6*A*. At the same time, there are a few notable differences between Lankheet's results and the STAs shown in Fig. 6*A*. First, only one of Lankheet's subjects showed a negative peak in the STAs of the transitions from dominance to suppression, whereas the STAs produced by our model show positive and negative peaks of approximately the same height. Dissimilar heights can be obtained in a slightly modified version of our model, too, by injecting the noise directly into the inputs of all populations that receive external stimulation, rather than as a perturbation to the excitatory populations only (not shown), which is more similar to the Lankheet (2006) paradigm of perturbing the external stimuli. Second, some of Lankheet's subjects showed wide and shallow peaks in their STAs several seconds before the narrow peaks immediately preceding the transition, which are not observed in our simulations. To mimic the experimental observations, we ran a simulation with a noise timescale of 500 ms, as used in the experiment (instead of the much shorter 100-ms timescale used to compute the STAs in Fig. 6*A*). The new STAs resemble those found experimentally, including the presence of wide and shallow peaks preceding the sharp peaks right before transitions (Fig. 6*C*, solid lines).
Using simulation results of a competition model, Lankheet (2006) interpreted the shallow peaks as the consequence of firing rate adaptation. However, even after we removed adaptation from our model altogether (and increased the noise level so that the alternation rate is kept constant, around 0.25 Hz), the shallow peaks did not disappear (Fig. 6*C*, dashed lines). This suggests that adaptation is not necessary to produce those peaks.
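
The switch-triggered averaging used in this analysis can be sketched as follows. This is a minimal illustration and not the analysis code behind Fig. 6: the function name `switch_triggered_average`, the window lengths, and the toy trace are assumptions introduced here.

```python
from typing import List, Sequence


def switch_triggered_average(
    noise: Sequence[float],
    switch_indices: Sequence[int],
    pre: int,
    post: int,
) -> List[float]:
    """Average noise[t - pre : t + post + 1] over all switch times t.

    Switches whose window would run off either end of the trace are skipped.
    """
    window = pre + post + 1
    total = [0.0] * window
    count = 0
    for t in switch_indices:
        if t - pre < 0 or t + post >= len(noise):
            continue  # incomplete window: skip this event
        for k in range(window):
            total[k] += noise[t - pre + k]
        count += 1
    if count == 0:
        raise ValueError("no switch has a complete window")
    return [s / count for s in total]


# Toy trace (hypothetical data): a positive fluctuation one sample
# before each of the switches at samples 10 and 20.
trace = [0.0] * 30
for t in (10, 20):
    trace[t - 1] = 1.0
# The switch at sample 29 is skipped because its window is incomplete.
sta = switch_triggered_average(trace, [10, 20, 29], pre=2, post=1)
```

Applied separately to the noise of each population and to each transition direction, the same routine would give curves of the kind described for Fig. 6*A*.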

### The interplay between noise and adaptation levels and its effect on the distribution of dominance durations

With appropriate choice of the amplitudes of noise and adaptation current, the rate-based *Eq. 2* produces noise-driven switches whose distribution of dominance durations (Fig. 7*A*) agrees with those typically observed during rivalry, being well fit by gamma or log-normal functions (Lehky 1988; Levelt 1968). The timescale and amplitude of both noise and adaptation affect the shape of the distributions. Figure 7*B* presents the distributions obtained for two other conditions, stronger and weaker adaptation (dashed and dotted lines, respectively; conditions were compared with the mean dominance durations kept approximately constant, which means that as adaptation strength was increased, the noise amplitude was reduced accordingly.) When adaptation is strong, alternations are dominated by the dynamics of this outward current, making the durations less variable. The distribution becomes narrower and symmetrical around the mean dominance duration (Fig. 7*B*, thin solid curve). The limiting case of strengthening adaptation (relative to noise amplitude) produces a noise-free oscillatory system; i.e., the distribution of dominance durations becomes a delta function (not shown). At the other extreme, when adaptation is removed altogether, the distribution becomes severely skewed with its peak shifted down to a value closer to the timescale of the noise (100 ms in our case). These results indicate that to obtain realistic distributions of dominance durations adaptation should be present, but weak. Importantly, the values of adaptation and noise in our model that produce realistic distributions are such that adaptation cannot generate switches by itself, i.e., when noise is removed from the system. In the presence of adaptation there should be correlations between the durations of consecutive percepts, but because in our model the adaptation is weak, correlations are small (not shown), in accordance with experimental evidence (Fox and Herrmann 1967; Rubin and Hupé 2004). 
Nevertheless, the role of adaptation is important and twofold. It produces a time-dependent probability of transitions that gives realistic distributions of dominance durations. Also, adaptation's slow timescale, together with the noise amplitude, sets the timescale (seconds) of alternations.
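
As a rough illustration of how such distributions can be summarized, the sketch below extracts dominance durations from a trace of dominance labels and estimates gamma parameters by the method of moments. The function names and the label convention are assumptions made here, and the method of moments is a simple stand-in: the text does not state how the gamma or log-normal fits were obtained.

```python
from typing import List, Sequence, Tuple


def dominance_durations(dominant: Sequence[int], dt: float) -> List[float]:
    """Durations of fully observed dominance epochs in a label trace.

    dominant[i] is the label of the dominant population at sample i
    (e.g. 0 for A, 1 for B). The first and last epochs are dropped
    because their true start or end is not observed.
    """
    durations = []
    run = 1
    for prev, cur in zip(dominant, dominant[1:]):
        if cur == prev:
            run += 1
        else:
            durations.append(run * dt)
            run = 1
    # The final (unfinished) run is never appended; drop the first run too.
    return durations[1:]


def gamma_moment_fit(durations: Sequence[float]) -> Tuple[float, float]:
    """Method-of-moments gamma estimate: returns (shape, scale)."""
    n = len(durations)
    mean = sum(durations) / n
    var = sum((d - mean) ** 2 for d in durations) / (n - 1)
    return mean * mean / var, var / mean


labels = [0, 0, 1, 1, 1, 0, 0, 0, 0, 1]
epochs = dominance_durations(labels, dt=0.5)
shape, scale = gamma_moment_fit([1.0, 2.0, 3.0])
```

For a gamma distribution the coefficient of variation is 1/sqrt(shape), so a narrow, nearly symmetric distribution (strong adaptation) corresponds to a large shape value and a strongly skewed one (no adaptation) to a small shape value.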

### Bistability and alternations in a spiking neural attractor network

In this section we present results from simulations of a cell-based network with spiking neurons based on the rate model presented above (see appendix C). The architecture was like that in Fig. 3*B* (without population C), with 100 neurons per population. We used leaky integrate-and-fire neurons with weak adaptation currents (to obtain dominance durations consistent with experimental observations; see above). The connectivity between neurons in each stimulus-selective excitatory population was all to all. In addition, each neuron projected to all other neurons in a target population. Background synaptic conductance input was modeled using fast kinetics like those of α-amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid (AMPA) and γ-aminobutyric acid type A (GABA_{A}) receptors receiving white noise, uncorrelated between the neurons. Excitatory recurrent connections were mediated by slow synaptic conductances, like those used to model *N*-methyl-d-aspartate (NMDA) receptors, which ensures that states with low firing rates are stable (Wang 1999). Indeed, the simulations show that even states with firing rates as low as 10 Hz can engage in rivalry alternations (Fig. 8*A*). The mean dominance duration as a function of the conductance (inputs) reproduces Levelt's propositions (Fig. 8*B*; see following text) and the distribution of dominance durations follows a skewed Gaussian (Fig. 8*C*).

As found for our rate model, alternations in the spiking neuronal network are not mediated by adaptation; rather, they are noise driven. There are two sources of activity fluctuations in the spiking network model: noise in the synaptic input arriving from background sources (external to the network), and variability from nonsynchronous, individual, synaptic inputs generated within the network. In the network described earlier, transitions were driven by the latter source of noise. We verified this by increasing the size of the network while proportionally scaling down the unitary synaptic conductances, such that the mean total conductance to each neuron was kept constant, although the size of its input fluctuations was reduced (external noise was kept fixed here). As the network size increased, there was a point at which alternations ceased (at 10,000 neurons; not shown), revealing that the cause of the transitions was internally generated noise in the form of spiking neuronal variability. In addition to identifying the cause of alternations, this result also shows that adaptation alone cannot produce alternations in our model because the adaptation modulation was not affected by this manipulation. This last result also indicates that larger networks would require a mechanism that maintains the internal spiking variability to ensure that switching would be maintained. This could be obtained by small amounts of externally correlated noise (Moreno et al. 2002; Moreno-Bote and Parga 2006; Renart et al. 2007; Zohary et al. 1994). Indeed, simulations of a large-scale network (10,000 neurons per population) showed alternations when a small fraction of the external noise (∼1%) was the same for all neurons in a population (not shown). Note that the firing rate models with added noise presented in the previous sections are thought to represent such large networks with correlated fluctuations that cannot be averaged out.
Because it is not feasible to perform long enough simulations of very large spiking networks to obtain reliable statistics, we simulated mainly small networks and therefore used completely uncorrelated external noise.
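
The scaling argument behind this manipulation can be illustrated with a toy calculation. The sketch below is not the spiking network of appendix C: it assumes independent Bernoulli inputs, and the function name, spike probability, and bin count are hypothetical choices. Each of `n_inputs` sources carries a unitary weight of `total_weight / n_inputs`, so the mean summed input is fixed while its fluctuations shrink roughly as one over the square root of `n_inputs`.

```python
import random
import statistics
from typing import Tuple


def summed_input_stats(
    n_inputs: int,
    total_weight: float = 1.0,
    p_spike: float = 0.1,
    n_bins: int = 500,
    seed: int = 0,
) -> Tuple[float, float]:
    """Mean and SD of the summed input from n_inputs independent sources.

    Each source spikes independently with probability p_spike per time bin
    and contributes a unitary weight of total_weight / n_inputs.
    """
    rng = random.Random(seed)
    unit = total_weight / n_inputs
    samples = []
    for _bin in range(n_bins):
        spikes = sum(1 for _src in range(n_inputs) if rng.random() < p_spike)
        samples.append(unit * spikes)
    return statistics.fmean(samples), statistics.stdev(samples)


mean_small, sd_small = summed_input_stats(100)
mean_large, sd_large = summed_input_stats(1600)
# Under independence, theory gives a mean of total_weight * p_spike for both
# sizes and an SD ratio of sqrt(1600 / 100) = 4 between them.
```

By the same argument, going from 100 to 10,000 inputs would reduce the internally generated fluctuations about tenfold, unless a shared (correlated) component is present that does not average out.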

An interesting property of the model is presented in Fig. 9, which shows the mean firing rate as a function of the stimulus strength for the dominant and suppressed populations during rivalry, as well as the mean firing rate of a single population under nonrivaling conditions (e.g., when the competing monocular stimulus is turned off in binocular rivalry). During nonrivaling conditions, the mean firing rate of the stimulated population increases with input strength. In contrast, during rivalry the mean firing rate of the dominant population shows little dependence on stimulation strength. Furthermore, the activity during rivalry is lower than that during nonrivaling conditions throughout the range of stimulation strength. A similar reduction in activity was also observed in the rate-based and energy models (data not shown). Therefore a lower activity during rivalry is a robust feature in all versions of our model. This prediction can be tested experimentally using functional magnetic resonance imaging (fMRI) or electrophysiology (see discussion).

### Dependence of mean dominance durations at large input strengths

Our energy-based, population-rate, and spiking network models reproduce Levelt's Proposition II over a wide range of input strengths (Figs. 2*B* and 8*B*). This behavior is robust in our models as long as the input strength to the affected eye is smaller than or similar to the input strength to the other, unaffected eye (the eye whose input strength, e.g., image contrast, is kept fixed). However, consideration of the situation when the input strength of the affected eye is made much larger than that of the other eye suggests that, at some point, a different behavior than that stated in Levelt's Proposition II must emerge. This is readily apparent when one considers the limiting case, when the image contrast to the affected (say, the right) eye has been raised well above that of the other eye. Clearly, at this point perception will be dominated by the image given to the right eye, and presumably its mean dominance duration must be much higher than that of the other (left) eye. Therefore at a point around or soon after the contrast of the right eye is increased above the (fixed) contrast of the left eye, the mean duration of the right eye must start increasing, in violation of Levelt's Proposition II. This is precisely the behavior produced in our models, as shown in Fig. 10*A*. As the input strength to population *B* is made larger than that to population *A*, the mean dominance duration of *B* increases much more than the mean dominance duration of *A* is reduced. Figure 10*B* provides an intuitive explanation of this result in terms of the effect of input strength on the energy function. The energy function in *Eq. 1* is plotted for several values of *g*_{B}, all of them larger than *g*_{A}. As *g*_{B} increases, state *A*_{on} starts to lose stability because its energy well becomes shallower. Crucially, at the same time the energy well of state *B*_{on} becomes deeper, thus increasing its mean dominance duration. Recently, Brascamp et al. (2006) showed that the behavior described earlier is indeed observed experimentally in binocular rivalry, i.e., that there is a significant violation of Levelt's Proposition II so that as the contrast of the affected eye is increased well above that of the unaffected eye, the mean dominance durations of the former rise rapidly. The same authors also showed that this behavior is found in purely oscillator models, although more analysis of its robustness is required.

## DISCUSSION

The mechanisms by which perceptual alternations occur during binocular rivalry are not well understood, nor is it known whether there are commonalities (e.g., similar architectures) with the mechanisms that cause perceptual switching in other bistable perceptual phenomena. The work presented here shows that attractor networks, as a class of models, provide a plausible framework to describe the dynamics of perceptual bistability. Our approach differs from most existing models of bistability, which assume the alternations are driven by some form of slow adaptation acting on the dominant population (Kalarickal and Marshall 2000; Lago-Fernandez and Deco 2002; Laing and Chow 2002; Lehky 1988; Matsuoka 1984; Stollenwerk and Bode 2003; Wilson 2003). In those models, the adaptation precludes the persistence of the dominant state over time. The threshold for switching and the activity state slowly drift toward each other and autonomously coalesce, leading to a switch. The oscillation between the two competing populations is the only stable state in the system. In contrast, in our model the competing states remain stable fixed-points at all times, and it is noise (e.g., the spiking variability observed commonly in vivo) that causes alternations in dominance. Thus alternations cease if noise is removed (Figs. 1 and 6), although in its presence the interplay between adaptation and noise sets the timescale of alternations. Finally, the same sources of noise in our model also cause the variability in dominance durations observed experimentally; i.e., there is no need to invoke an ad hoc assumption about the presence of noise to explain this variability.

Our model goes beyond previous energy-based conceptualizations of perceptual bistability (e.g., Haken 1994; Kanai et al. 2005; Kim et al. 2006; Riani and Simonotto 1994), by presenting a neurally plausible attractor model that behaves consistently with experimental findings, most notably the increase in alternation rate with stimulation strength observed in binocular rivalry (Levelt 1968; Fig. 2*B*). This behavior would not arise automatically in an attractor-based model but rather depends on the network architecture. In particular, if the effect of increasing stimulus strength in an attractor model was to deepen the energy well of the corresponding attractor, this would have the opposite outcome of lengthening of the durations the network spends in that attractor. Furthermore, comprehensive analysis of oscillator models revealed that they, too, produce dominance durations that increase with stimulus strength in large parts of parameter space (Shpiro et al. 2006). The different behavior in our model (a shortening of the time spent in the competing attractors with increasing stimulus strength) arises from introducing in the energy function terms coupling the attractors' energy barriers with the input strength (Fig. 2*A*). In the network architecture of our model this was realized by feeding into each local inhibitory population a signal equal to the total external input [either directly (Fig. 3*A*) or by a global excitatory pool (Fig. 3*B*)], which is gated (multiplied) by the activity level of the corresponding excitatory population. Our network thus illustrates a class of models in which the wiring can be dynamically modified, as opposed to being hard-wired. As for the question what brain region may act as the excitatory pool in Fig. 3*B* (i.e., compute the total strength of all sensory inputs), note that this need not be a cortical region. 
Broad tuning like that expected from this hypothesized region is more characteristic of subcortical structures, which receive projections from a multitude of sensory cortical areas, and therefore could compute the global signal our model requires and send it back to the cortical local inhibitory subpopulations as schematized in Fig. 3*B*.

### Noise versus adaptation as possible causes of perceptual alternations

The success of our model in reproducing salient dynamical behaviors of perceptual bistability suggests that noise may be the primary cause of perceptual alternations in bistability. This contrasts with the prevalent view that perceptual alternations are caused by some form of adaptation or fatigue (e.g., Kalarickal and Marshall 2000; Lago-Fernandez and Deco 2002; Laing and Chow 2002; Lehky 1988; Matsuoka 1984; Stollenwerk and Bode 2003; Wilson 2003). It is therefore important to note that, although it is known that there are multiple forms and mechanisms of adaptation in the brain, in the specific context of bistability a direct link has not been established to point to adaptation as the primary cause of alternations. An important observation in this context is that there is no evidence for dependence between the durations of successive dominance periods (e.g., a tendency for shorter periods to follow particularly long periods or vice versa; Fox and Herrmann 1967; Lehky 1995; Necker 1832; Rubin and Hupé 2004), as may be expected if adaptation played a major role in causing alternations. Thus if adaptation plays any role in causing the alternations it would have to involve mechanisms with a very fast reset, so that all trace of it is essentially gone soon after the system has switched to the competing percept. Also at timescales longer than individual dominance durations, data from long trials (5–10 min) of several binocular rivalry and plaid stimuli reveal that both the durations' means and their variances remain remarkably stable over time (Rubin and Hupé 2004), again showing no evidence for buildup of adaptation over time.

The distinction between oscillator models (where adaptation is the cause of alternations and noise is inessential) and noise-driven attractor models (where adaptation is inessential for alternations) is conceptually useful. However, given the ubiquity of adaptation mechanisms in the nervous system, in reality bistable networks most likely contain some form(s) of adaptation and this, in turn, may affect some aspects of the alternations. Indeed, we included adaptation in both our rate-based and spiking neuronal networks. Note, however, that in isolation (i.e., without noise) adaptation could not instigate alternations in our model and its function was rather to produce distributions of durations that resemble the skewed Gaussians observed experimentally. Specifically, the weak adaptation provided a natural and theoretically tractable way to implement a form of short-term persistence that disfavored the system leaving the attractor state right after it has settled into it, compared with the probability of leaving it later in time. However, other ways to implement such a tendency may be equally valid, such as synaptic facilitation of local inhibition by the selective excitatory population.

A few experimental studies examined the role of adaptation in perceptual bistability. Blake et al. (1990) modified the standard binocular rivalry paradigm by “forcing” one eye to dominate for long periods of time (30 s); they found that, on removal of the forcing, this eye's dominance durations were shorter by about a factor of two. Although this implicates adaptation in the dynamics, the crucial question is not whether adaptation is present, but whether it is responsible for the alternations. If that were the case, then the very long forcing should have led to very fast or even instantaneous transitions, with narrowly distributed durations. Instead, Blake et al. (1990) observed durations with a mean of about 2 s and large variability, suggesting that even such saturated adaptation does not instigate immediate transitions. In another experiment Leopold et al. (2002) showed, for a host of bistable stimuli, that alternations can be slowed down dramatically if the stimulus is periodically removed from view, again suggesting that if adaptation is involved in bistability, its influence does not carry over from one dominance epoch to another. Moreover, as these authors noted, their results suggest the involvement of a short-term implicit perceptual memory that, as discussed earlier, could produce distributions of dominance duration consistent with experiments without the need to invoke adaptation.

Two recent studies provide experimental evidence for an important role for noise in causing perceptual alternations. Brascamp et al. (2006) studied the role of noise in causing alternations by focusing on the prevalence of “return transitions,” cases when the dominant percept gives way to a mixed percept but then the system returns to the same percept that was dominant before (rather than the competing one). Brascamp et al. (2006) found a high prevalence of such transitions that, as they noted, is more consistent with noise than with adaptation as driving the alternations. Kim et al. (2006) studied the effect of weak contrast oscillations on the alternation rate of two rivaling images. They found the presence of stochastic resonance, that is, a maximum effect of the frequency of the oscillatory signal when it matches that of the alternations, an effect that can be explained only if a large amount of noise is present in the system.

### The nature and sources of noise

In all three levels of description, the noise was fast compared with the timescale of alternations [*O*(100 ms) vs. *O*(1 s), respectively]; i.e., the transitions between states were not a trivial consequence of noise at the same scale. The noise timescale we posited is plausible biologically. In the spiking neuronal network, recurrent connections are dominated by NMDA-like synaptic receptors, and therefore the current fluctuations inherit the timescale of those synapses, of the order of 100 ms (Moreno-Bote and Parga 2005b; Titz and Keller 1997; Umemiya et al. 1999). Such receptors have been invoked in other models, e.g., to stabilize sustained activity in prefrontal cortex during a delayed-match-to-sample task (Wang 1999), and to account for the slow ramping behavior of neurons in posterior parietal cortex during a discrimination task (Wang 2002). Although in our simulations fast AMPA noise is present as an external source, internally generated noise is dominated by slower NMDA-generated fluctuations so as not to lead to fast population activity fluctuations that could destabilize the attractor dynamics.

There are other conceivable sources of noise in the synaptic input to cortical neurons. Modulations in ongoing cortical activity patterns, measured with optical imaging and local field potential recordings, are known to affect spiking responses in single neurons (Arieli et al. 1996). Moreover, although the underlying mechanisms are not well understood, changes in the level of coherent cortical activity at those timescales have been tied to modulations in attention and perceptual performance (Fries et al. 2001; Salinas and Sejnowski 2001; Womelsdorf et al. 2006). Thus variability appearing as mere noise in the context of perceptual bistability may arise from changes in internal network states that have functional roles in other situations. Finally, it is reasonable to assume that other sources of internal noise, including some that act at slower timescales (e.g., variations in global neurotransmitter levels, endogenous attention modulations, blinks) could also play a role in producing some of the switches.

### Model predictions and experimental tests

An important feature of our model that has arisen from the energy formulation is the presence of inhibition from the input layer that is targeted at the competing population(s) (directly, as in Fig. 3*A*, or by an excitatory pool, as in Fig. 3*B*). This leads to a new prediction that can be tested with electrophysiological and neuroimaging studies. The model predicts that activity during rivalry should be lower compared with when the neural population receives the same input under nonrivaling conditions (Fig. 8). (Note that we use the term “rivalry” here in the general sense of two competing interpretations of a stimulus and, correspondingly, two rivaling neural populations. Therefore the prediction is not restricted to binocular rivalry but also holds for other bistable perceptual phenomena.) The reason is that during rivaling stimulation, local inhibition is enhanced due to the higher signal from the external input, leading to a reduction in the activity of the dominant excitatory population. Furthermore, the difference in activity between the two conditions grows as the stimulation strength increases (Fig. 9). In contrast, this prediction does not arise for models in which the dynamics is governed by adaptation currents (oscillator models). There, when a population becomes dominant, it receives no inhibitory inputs (because the only possible source is the suppressed population), and therefore no reduction of activity is expected compared with when the competing stimulus is turned off. The predictions that oscillator and our attractor model make in this regard are clearly different and could be used to determine which model better describes the neuronal dynamics during rivalry.

Interestingly, recent human fMRI studies of binocular rivalry provide some evidence to support the prediction of our attractor model. A reduced blood oxygenation level–dependent (BOLD) signal during rivalry compared with nonrivaling vision has been shown in the lateral geniculate nucleus (Haynes et al. 2005; Wunderlich et al. 2005) and visual cortical areas V1 through V4 (Lee and Blake 2002; Polonsky et al. 2000). In higher visual areas, an fMRI study found no differences in activity between rivalry and nonrivalry conditions in the fusiform face and parahippocampal place areas (Tong et al. 1998), whereas electrophysiological recordings in monkeys have shown reduced activity during rivalry in inferotemporal cortex and the superior temporal sulcus (Sheinberg and Logothetis 1997). Further investigation is therefore needed to examine this issue across the brain and for different bistability phenomena.

In conclusion, we have proposed a novel framework to model the bistable perceptual alternations observed during exposure to ambiguous or rivaling sensory stimuli. Our approach is based on the assumption that each of the competing percepts corresponds to a neuronal stable state, and the transitions between them are caused by noise. This differs from the prevalent view that the transitions are caused by an adaptation or fatigue process, which implies that the alternations reflect a limit cycle (oscillations) in neuronal state space. Starting from an energy-based model chosen to meet specific experimentally observed characteristics, we derived neurally plausible rate-based and spiking attractor neuronal networks, which are the first implementations of this broad class of models showing a dynamical behavior consistent with salient properties of perceptual bistable phenomena. Our results suggest that the hypothesis that competing percepts may correspond to the activation of different attractor states of neural activity, and that alternations between them may be driven by noise, is sustainable from a theoretical point of view, and should be examined experimentally with more care.

## APPENDIX A: ENERGY MODEL

This model is defined by a two-well energy function (*Eq. 1*). The variable Δ*r* evolves according to *Eq. 2* with time constant τ = 10 ms. The noise *n*(*t*) is an Ornstein–Uhlenbeck process (Risken 1989) with zero mean and standard deviation σ = 0.7 (*Eq. A1*), where τ_{s} = 100 ms is the noise timescale and ξ(*t*) is a white noise process with zero mean and 〈ξ(*t*)ξ(*t*′)〉 = δ(*t* − *t*′). See numerical procedures in appendix E.
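
The displayed form of *Eq. A1* is not reproduced in this text. A form consistent with the stated definitions is the standard Ornstein–Uhlenbeck Langevin equation d*n*/d*t* = −*n*/τ_{s} + σ√(2/τ_{s}) ξ(*t*), whose stationary standard deviation is σ. Assuming that form, the sketch below generates such a noise trace with the exact one-step update of an Ornstein–Uhlenbeck process. The function name, step size, duration, and seed are choices made here and are not taken from appendix E.

```python
import math
import random
from typing import List


def simulate_ou(
    sigma: float = 0.7,      # stationary SD quoted in the text
    tau_s: float = 0.1,      # 100 ms, expressed in seconds
    dt: float = 0.001,       # 1-ms step (an assumption)
    duration: float = 200.0, # seconds of simulated noise (an assumption)
    seed: int = 1,
) -> List[float]:
    """Sample an Ornstein-Uhlenbeck trace with zero mean and SD sigma.

    Uses the exact discretization
        n(t + dt) = n(t) * exp(-dt / tau_s)
                    + sigma * sqrt(1 - exp(-2 * dt / tau_s)) * N(0, 1).
    """
    rng = random.Random(seed)
    decay = math.exp(-dt / tau_s)
    kick = sigma * math.sqrt(1.0 - decay * decay)
    n = rng.gauss(0.0, sigma)  # start from the stationary distribution
    trace = []
    for _ in range(int(round(duration / dt))):
        n = decay * n + kick * rng.gauss(0.0, 1.0)
        trace.append(n)
    return trace


noise = simulate_ou()
noise_mean = sum(noise) / len(noise)
noise_sd = math.sqrt(sum((x - noise_mean) ** 2 for x in noise) / len(noise))
```

The exact update is used instead of a plain Euler step so that the stationary SD does not depend on the step size.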

## APPENDIX B: RATE-BASED MODELS

### Model with direct cross-inhibition

Based on the single-variable energy function of *Eq. 1*, we formulate an energy function for a network with two populations, *A* and *B*, that are firing at rates *r*_{A} and *r*_{B}, respectively. This energy function has quadratic potentials placed at the states (*r*_{A}, *r*_{B}) = (1, 0) and (0, 1). For simplicity, the firing rates are dimensionless here, measured in relation to a maximum firing rate so that 0 ≤ *r*_{A}, *r*_{B} ≤ 1. Here, *f*^{(−1)} denotes the inverse function of a neuronal population's input–output relation [i.e., firing rate = *f*(input)]. For weak stimuli and with *f* idealized as a step function [i.e., *f*(*u*) = 0 for *u* < 0, and *f*(*u*) = 1 elsewhere], the energy function is simply *E*(*r*_{A}, *r*_{B}) = −(α*r*_{A}^{2} + α*r*_{B}^{2} − 2β*r*_{A}*r*_{B})/2; if β > α, it has two minima, placed at (*r*_{A}, *r*_{B}) = (1, 0) and (0, 1), within the plane [0, 1] × [0, 1], where the firing rates are defined. Thus in this parameter range the inhibition each population exerts on the other is strong enough to preclude the two populations from being active at the same time.

The following network dynamics (B2) minimizes the energy function *E*(*r*_{A}, *r*_{B}). Strictly, *E*(*r*_{A}, *r*_{B}) is a Lyapunov function for the dynamics defined in *Eq. B2*. That is, *E* is nonincreasing along trajectories (B3) assuming that *f* is a nondecreasing function and τ > 0 (Hertz et al. 1991).

Once stochastic input terms *n*_{A} and *n*_{B} are added, the dynamics produces transitions between the two local minima of the potential function, according to the equation set (B4). The two noise terms are taken to be independent, continuous random processes as in the single-variable energy-based model (*Eq. A1*).
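
The claim about the minima of the simplified energy function can be checked numerically. The sketch below evaluates *E*(*r*_{A}, *r*_{B}) = −(α*r*_{A}^{2} + α*r*_{B}^{2} − 2β*r*_{A}*r*_{B})/2 on a grid over [0, 1] × [0, 1] and lists the grid points that lie strictly below all of their neighbors. The grid resolution, the function names, and the example values of α and β are illustrative assumptions.

```python
from typing import List, Tuple


def energy(r_a: float, r_b: float, alpha: float, beta: float) -> float:
    """Simplified energy E = -(alpha*rA^2 + alpha*rB^2 - 2*beta*rA*rB) / 2."""
    return -(alpha * r_a**2 + alpha * r_b**2 - 2.0 * beta * r_a * r_b) / 2.0


def grid_local_minima(
    alpha: float, beta: float, n: int = 20
) -> List[Tuple[float, float]]:
    """Grid points of [0, 1] x [0, 1] whose energy is strictly below that of
    every existing neighbor (up to 8 per point)."""
    e = [
        [energy(i / n, j / n, alpha, beta) for j in range(n + 1)]
        for i in range(n + 1)
    ]
    minima = []
    for i in range(n + 1):
        for j in range(n + 1):
            neighbours = [
                e[i + di][j + dj]
                for di in (-1, 0, 1)
                for dj in (-1, 0, 1)
                if (di, dj) != (0, 0)
                and 0 <= i + di <= n
                and 0 <= j + dj <= n
            ]
            if all(e[i][j] < v for v in neighbours):
                minima.append((i / n, j / n))
    return minima


strong_inhibition = grid_local_minima(alpha=0.75, beta=1.0)  # beta > alpha
weak_inhibition = grid_local_minima(alpha=0.75, beta=0.5)    # beta < alpha
```

With β > α only the two exclusive states appear as minima. With β < α the coactive corner (1, 1) also becomes a local minimum on the square, which is the situation the condition β > α rules out.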

### Model with inhibition driven indirectly by an excitatory pool and with weak adaptation

Here we derive the rate-based model for the architecture in Fig. 3*B*, with an excitatory pool projecting to all local inhibitory populations to mediate the mutual exclusivity, instead of direct cross-inhibition between the percept-specific populations. Despite the differences in architecture between this and the previous model, they can be mapped one onto the other for particular parameter sets, as shown in the next section. Parameters below were chosen to allow this mapping.

The activity of the excitatory population A, *r*_{A}, is described by the equation set (B5) with α = 0.75, β = 0.5, γ = 0.1; *f* is the input–output curve, modeled as a sigmoid function (B6) with threshold θ = 0.1 and *k* = 0.05. The inputs to the neuron consist of: recurrent excitation (with strength or efficacy α); local inhibition (with strength β) that grades linearly with the inhibitory firing rate *r*_{A,inh}; a hyperpolarizing current, *a*_{A}(*t*), with a maximum amplitude γ and time constant τ_{a} = 2 s that produces weak adaptation; and the noise variable *n*_{A}(*t*), with SD σ = 0.03.
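To give a feel for how steep the input–output curve is with θ = 0.1 and *k* = 0.05, the sketch below tabulates a logistic sigmoid. The logistic form is an assumption of this sketch (the text states only that *f* is a sigmoid with these two parameters), and the function name is hypothetical.

```python
import math

THETA = 0.1   # threshold quoted in the text
K = 0.05      # steepness parameter quoted in the text


def sigmoid_rate(x, theta=THETA, k=K):
    """Logistic input-output curve; the logistic form is an assumption."""
    return 1.0 / (1.0 + math.exp(-(x - theta) / k))


for x in (-0.1, 0.0, 0.1, 0.2, 0.3):
    print(f"input {x:+.2f} -> rate {sigmoid_rate(x):.3f}")
```

Under this assumed form the rate is 0.5 at threshold and is already close to saturation for inputs a few multiples of *k* above or below θ.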

The local inhibitory population A is assumed to respond instantaneously to its inputs (i.e., fast recruitment) with a quadratic input–output relation. The quadratic form allows the system to be easily mapped onto the previous architecture (see following text), although other steep nonlinear functions (e.g., cubic) also produce similar network behavior. Its firing rate is given by (B7), where η = 0.5 is the ratio between the strength of the excitatory feedback (see Fig. 3) and the input from the excitatory pool, *r*_{pool}.

The excitatory pool receives inputs from the network (weighted by φ = 0.5) and from the external stimulation, and we assume that it responds with a short recruitment timescale and linearly in response to its inputs. Its firing rate is therefore given by where [·]^{+} denotes linear thresholding (note that the rate of the pool is nonnegative even when inputs are negative, allowing one to define the system also in that input regime). Similar equations define the dynamics of the population selective to stimulus B.
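One set of equations consistent with the verbal description of this architecture is sketched below. Every form here is an assumption inferred from the prose and from the parameter constraints discussed in this appendix, not a quotation of Eqs. B5 to B7; in particular the rate time constant τ, the logistic shape of *f*, and the exact placement of the stimulus term *g*_{A} are hypothetical.

```latex
% Assumed forms for population A (population B follows by exchanging A and B)
\begin{align*}
\tau \frac{dr_A}{dt} &= -r_A
   + f\!\left(\alpha r_A - \beta r_{A,\mathrm{inh}} + g_A - a_A + n_A\right), \\
\tau_a \frac{da_A}{dt} &= -a_A + \gamma r_A, \\
f(x) &= \frac{1}{1 + \exp\left[-(x - \theta)/k\right]}, \\
r_{A,\mathrm{inh}} &= \left(r_{\mathrm{pool}} + \eta r_A\right)^2, \\
r_{\mathrm{pool}} &= \left[\phi\,(r_A + r_B) + g_A + g_B\right]^{+}.
\end{align*}
```

With these forms the adaptation variable is bounded by γ because *r*_{A} ≤ 1, the inhibitory relation is quadratic, and the pool responds linearly with thresholding, in line with the description above.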

### Relation between the models with and without direct cross-inhibition

Despite the large differences in the architecture between the two rate-based models we have presented, it is possible to map one approximately onto the other for particular sets of parameters. In fact, parameters in the model without cross-inhibition have been chosen to allow this mapping, and therefore to have dynamics consistent with Levelt's propositions. We next explain the mapping.

Let us start from the case with no direct inhibition. Because the response properties of interest are found with small stimulation strengths (*g*_{A,B} ≪ 1), we may approximate the activity of the local inhibitory population (*Eq. B7*) by

During rivalry alternations values of *r*_{A,B} are close to either 0 or 1 because the input–output relation for the excitatory population is rather steep and saturates (*Eq. B6*). Suppose that population A is dominant, so that B is inactive (*r*_{B} ∼ 0); then Therefore the dynamics of *r*_{A} when B is suppressed are approximately governed by (B8) Now compare this equation with the corresponding *Eq. B4* from the case with direct inhibition. If we set 2β(φ + η) = 1 (or approximately so) both equations depend identically on the stimulation strengths. Although the term β(φ + η)^{2}*r*_{A}^{2} differs between the two models, this difference does not affect the qualitative behavior as the stimulus strengths are varied. However, this term imposes a number of conditions that should hold to allow alternations and to produce mutual exclusivity between the possible stationary states. Because the dominance state should be stable, we have to impose the condition that the total synaptic input to the population is on average above the firing threshold, that is, α − β(φ + η)^{2} > θ. To guarantee mutual exclusivity we demand that if both populations attempt to become active simultaneously, the net input should be below threshold. This condition imposes α − β(2φ + η)^{2} < θ. The preceding three conditions should hold to produce dynamic properties that are consistent with experimental observations. In simulations we have chosen α = 0.75 and β = φ = η = 0.5, although other values are also valid.
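The steps of this mapping can be sketched explicitly under two assumed forms, neither of which is quoted from the original equations: a pool rate *r*_{pool} = [φ(*r*_{A} + *r*_{B}) + *g*_{A} + *g*_{B}]^{+} and a quadratic inhibitory relation *r*_{A,inh} = (*r*_{pool} + η*r*_{A})^{2}. Expanding to first order in the stimulus strengths gives:

```latex
% Hypothetical reconstruction under the two assumed forms stated in the lead-in
\begin{align*}
r_{A,\mathrm{inh}}
  &= \bigl[\phi\,(r_A + r_B) + \eta r_A + (g_A + g_B)\bigr]^2 \\
  &\approx \bigl[\phi\,(r_A + r_B) + \eta r_A\bigr]^2
     + 2\bigl[\phi\,(r_A + r_B) + \eta r_A\bigr](g_A + g_B)
     && (g_{A,B} \ll 1) \\
  &\approx (\phi + \eta)^2 r_A^2 + 2(\phi + \eta)\, r_A\,(g_A + g_B)
     && (r_B \approx 0).
\end{align*}
% Net input to a saturated dominant population (r_A = 1, r_B = 0, no stimulus):
%   \alpha - \beta(\phi + \eta)^2
% Net input if both populations were saturated (r_A = r_B = 1, no stimulus):
%   \alpha - \beta(2\phi + \eta)^2
```

Under these assumptions the stimulus-dependent part of the inhibition on A is weighted by 2β(φ + η), and the two net inputs in the comments are the quantities compared with θ in the stability and mutual-exclusivity conditions.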

Besides the architecture, the main difference between the two models is the presence of adaptation for the model with indirect inhibition. Adaptation shapes the distribution of dominance durations but its influence is limited; we choose parameter values such that adaptation by itself (without noise) does not generate transitions between percepts. This means that the activity of a fully adapted dominant population cannot drop below the threshold of the input–output relation. Because the maximum amplitude of adaptation is γ (see *Eq. B5*), this condition translates into the parameter constraint (from *Eq. B8*) α − β(φ + η)^{2} − γ > θ. We have used γ = 0.1 and, to generate alternations with a duration of a few seconds, we have chosen σ = 0.03, unless noted otherwise.
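The parameter constraints can be checked by direct arithmetic, and the claim that adaptation alone does not produce switches can be illustrated with a noise-free integration. The sketch below is not the authors' code: the rate equations, the logistic form of *f*, the rate time constant `TAU`, the step `DT`, and the stimulus strengths `G_A`, `G_B` are all assumptions introduced for illustration.

```python
import math

# Parameters quoted in the text
ALPHA, BETA, GAMMA = 0.75, 0.5, 0.1
PHI, ETA = 0.5, 0.5
THETA, K = 0.1, 0.05
TAU_A = 2.0              # adaptation time constant (s)

# Assumptions of this sketch, not taken from the text
TAU = 0.01               # hypothetical rate time constant (s)
DT = 0.001               # hypothetical Euler step (s)
G_A = G_B = 0.05         # hypothetical weak, equal stimulus strengths


def f(x):
    """Assumed logistic input-output curve with threshold THETA and slope K."""
    return 1.0 / (1.0 + math.exp(-(x - THETA) / K))


def parameter_constraints():
    """Evaluate the conditions stated in the text for the chosen parameters."""
    dominant_input = ALPHA - BETA * (PHI + ETA) ** 2
    both_active_input = ALPHA - BETA * (2 * PHI + ETA) ** 2
    return {
        "2*beta*(phi+eta) == 1": math.isclose(2 * BETA * (PHI + ETA), 1.0),
        "dominance is stable": dominant_input > THETA,
        "mutual exclusivity": both_active_input < THETA,
        "adaptation alone cannot switch": dominant_input - GAMMA > THETA,
    }


def simulate_without_noise(t_end=20.0):
    """Euler-integrate the assumed rate equations with the noise switched off."""
    r_a, r_b, a_a, a_b = 1.0, 0.0, 0.0, 0.0   # start with A dominant
    a_leads, switches = True, 0
    for _ in range(int(round(t_end / DT))):
        r_pool = max(PHI * (r_a + r_b) + G_A + G_B, 0.0)   # assumed pool rate
        inh_a = (r_pool + ETA * r_a) ** 2                  # assumed quadratic inhibition
        inh_b = (r_pool + ETA * r_b) ** 2
        dr_a = (-r_a + f(ALPHA * r_a - BETA * inh_a + G_A - a_a)) / TAU
        dr_b = (-r_b + f(ALPHA * r_b - BETA * inh_b + G_B - a_b)) / TAU
        da_a = (-a_a + GAMMA * r_a) / TAU_A
        da_b = (-a_b + GAMMA * r_b) / TAU_A
        r_a, r_b = r_a + DT * dr_a, r_b + DT * dr_b
        a_a, a_b = a_a + DT * da_a, a_b + DT * da_b
        if (r_a > r_b) != a_leads:                         # dominance reversed
            a_leads, switches = not a_leads, switches + 1
    return switches, r_a, r_b


print(parameter_constraints())
print(simulate_without_noise())
```

For the quoted parameters the four conditions evaluate to 1 = 1, 0.25 > 0.1, −0.375 < 0.1, and 0.15 > 0.1. In the noise-free run of this assumed model, the initially dominant population stays dominant for the full 20 s even after adaptation has built up.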

## APPENDIX C: SPIKING NEURONAL NETWORK

We have simulated a cell-based network with the connectivity described in Fig. 3*B*. Each population contains *N* = 100 leaky integrate-and-fire neuron models. Coupling is with conductance-based synapses and all-to-all connectivity (each neuron receives connections from *all* neurons in a presynaptic population). Model equations and parameters follow (Brunel and Wang 2001; Moreno-Bote and Parga 2005a,b; Wang 2002). The voltage below the spiking threshold for the excitatory neurons in the competing populations obeys with membrane capacitance *C*_{m} = 0.5 nF, leak conductance *g*_{L} = 25 nS, producing a membrane time constant τ_{m} = *C*_{m}/*g*_{L} = 20 ms, and resting potential *V*_{L} = −65 mV. The neuron emits a spike when the voltage reaches the threshold *V*_{th} = −54 mV, after which the voltage is reset to *V*_{reset} = −60 mV. *I*_{syn}(*t*) is the total synaptic current delivered to a neuron. *I*_{adap}(*t*) is a slow conductance-driven adaptation current: *I*_{adap}(*t*) = *g*_{adap}(*t*)(*V* − *V*_{adap}); *g*_{adap} is increased by Δ*g* = 0.075 nS with each spike and decays to zero exponentially with time constant τ_{adap} = 2 s; *V*_{adap} = −80 mV. Voltage equations for the inhibitory populations and the pool are the same, but without adaptation current.

The synaptic currents to the excitatory (E), inhibitory (I) populations, and the pool (P) are *I*_{syn,E}(*t*) = *I*_{NMDA,rec}(*t*) + *I*_{GABA}(*t*) + *I*_{ext,E}(*t*) + *I*_{back}(*t*), *I*_{syn,I}(*t*) = *I*_{AMPA}(*t*) + *I*_{ext,I}(*t*) + *I*_{back}(*t*), and *I*_{syn,P}(*t*) = *I*_{AMPA}(*t*) + *I*_{ext,P}(*t*) + *I*_{back}(*t*), respectively. The NMDA recurrent synaptic current is modeled with a linearized driving force (valid below threshold) as *I*_{NMDA,rec} = ∑_{i}^{N} *g*_{NMDA,i}(*t*)(*V* − *V*_{E}) (Brunel and Wang 2001; Moreno-Bote and Parga 2005b), where *V*_{E} = 0 mV and *g*_{NMDA,i}(*t*) is the individual conductance generated by the presynaptic neuron *i*, defined as Here, τ_{NMDA} = 100 ms, *g*_{NMDA,max} = 0.15 nS is the individual synaptic maximum conductance, and the sum represents the spikes emitted by neuron *i* at previous times *t*_{j}^{i}. Equations for the AMPA and GABA currents are (*k* = AMPA, GABA) given by *I*_{k} = *g*_{k}(*V* − *V*_{k}), where the conductance is with τ_{AMPA(GABA)} = 10 ms (20 ms), *V*_{E(I)} = 0 mV (−80 mV), and the sum of spikes now extends to all presynaptic neurons *i*. The unitary conductance for E to P connections is *g*_{E→P,unit} = 0.075 nS, *g*_{P→I,unit} = 0.23 nS for the P to inhibitory population (I) connections, *g*_{I→E,unit} = 0.175 nS for the I to E connections, and *g*_{E→I,unit} = 0.1 nS for the E to I synapses.
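As an illustration of the single-neuron dynamics, the sketch below integrates one leaky integrate-and-fire neuron with the spike-triggered adaptation conductance described above, using the quoted membrane, threshold, reset, and adaptation parameters. It is not the network model: the full synaptic input is replaced by a constant excitatory conductance `g_ext` (a hypothetical value), and the standard membrane equation *C*_{m} d*V*/d*t* = −*g*_{L}(*V* − *V*_{L}) − *I*_{adap} − *I*_{ext} is an assumption of this sketch.

```python
# Parameters quoted in the text, converted to SI units
C_M = 0.5e-9         # membrane capacitance (F)
G_L = 25e-9          # leak conductance (S)
V_L = -65e-3         # resting potential (V)
V_TH = -54e-3        # spike threshold (V)
V_RESET = -60e-3     # reset potential (V)
V_ADAP = -80e-3      # reversal potential of the adaptation current (V)
V_E = 0.0            # excitatory reversal potential (V)
TAU_ADAP = 2.0       # adaptation time constant (s)
DG_ADAP = 0.075e-9   # adaptation conductance increment per spike (S)
DT = 1e-4            # Euler step of 0.1 ms (s)


def simulate_lif(g_ext=8e-9, t_end=3.0):
    """Euler integration of one neuron; g_ext is a hypothetical constant drive."""
    v, g_adap, spike_times = V_L, 0.0, []
    for step in range(int(round(t_end / DT))):
        i_leak = G_L * (v - V_L)
        i_adap = g_adap * (v - V_ADAP)      # conductance-driven adaptation
        i_ext = g_ext * (v - V_E)           # constant excitatory conductance
        v -= DT * (i_leak + i_adap + i_ext) / C_M
        g_adap -= DT * g_adap / TAU_ADAP    # exponential decay to zero
        if v >= V_TH:                       # threshold crossing: spike and reset
            spike_times.append((step + 1) * DT)
            v = V_RESET
            g_adap += DG_ADAP
    return spike_times


spikes = simulate_lif()
isis = [t1 - t0 for t0, t1 in zip(spikes, spikes[1:])]
print(f"tau_m = {C_M / G_L * 1e3:.1f} ms, {len(spikes)} spikes")
print(f"first ISI = {isis[0] * 1e3:.1f} ms, last ISI = {isis[-1] * 1e3:.1f} ms")
```

Because each spike adds to the slowly decaying adaptation conductance, the interspike intervals lengthen over the run, which is the single-cell signature of the weak adaptation used in the network.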

External inputs are modeled as constant excitatory conductances to produce a current *I*_{ext,E,A(B)} = *g*_{A(B)}(*V* − *V*_{E}) for the E populations A(B), *I*_{ext,I,A(B)} = *f*_{I}*g*_{A(B)}(*V* − *V*_{E}) for the I populations A(B), and *I*_{ext,P} = *f*_{P}(*g*_{A} + *g*_{B})(*V* − *V*_{E}) for P, following the architecture of Fig. 3*B*. The factors *f*_{I} = *f*_{P} = 0.1 measure the effect of the external inputs on I and P populations in relation to E, and they control the slope in Fig. 9.

Each neuron receives an independent source of noisy conductance with AMPA and GABA contributions mimicking spontaneous external activity (Destexhe et al. 2003; Moreno-Bote and Parga 2005a), defined by (*k* = AMPA, GABA) *I*_{back}(*t*) = [*g*_{k} + *n*_{k}(*t*)][*V*(*t*) − *V*_{k}], where *n*_{k}(*t*) is a colored noise as in *Eq. 3* with timescale τ_{AMPA(GABA)} = 10 ms (20 ms). The means (*g*_{k}) and dispersions (σ_{k}) for the background conductances are *g*_{AMPA(GABA)} = 5 nS (7.5 nS), σ_{AMPA(GABA)} = 3.53 nS (3.53 nS), equal for all neurons.
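A fluctuating background conductance with a given mean, dispersion, and correlation time can be generated as sketched below. The sketch assumes that the colored noise is an Ornstein-Uhlenbeck process and uses its exact one-step update; that choice, the function names, and the seed are assumptions of this example. The sketch does not clip negative conductance values, and the text does not say whether the original code does.

```python
import math
import random


def background_conductance(mean_ns, sd_ns, tau_ms, dt_ms=0.1,
                           n_steps=200_000, seed=0):
    """Return a conductance trace (nS) with Ornstein-Uhlenbeck fluctuations."""
    rng = random.Random(seed)
    decay = math.exp(-dt_ms / tau_ms)
    kick = sd_ns * math.sqrt(1.0 - decay**2)   # keeps the stationary SD at sd_ns
    noise, trace = 0.0, []
    for _ in range(n_steps):
        noise = noise * decay + kick * rng.gauss(0.0, 1.0)
        trace.append(mean_ns + noise)
    return trace


def mean_and_sd(values):
    """Sample mean and (population) standard deviation of a sequence."""
    m = sum(values) / len(values)
    var = sum((v - m) ** 2 for v in values) / len(values)
    return m, math.sqrt(var)


# AMPA background: mean 5 nS, SD 3.53 nS, correlation time 10 ms (from the text)
g_ampa = background_conductance(5.0, 3.53, 10.0)
print(mean_and_sd(g_ampa))
```

Over a 20 s trace the sample mean and SD should fall close to the target values, with residual scatter set by the ratio of the correlation time to the trace length.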

The parameters are chosen as follows: Background conductances alone should produce low firing rates in all populations. Connections between E and P, and P to I should be strong enough to produce winner-take-all behavior. Recurrent NMDA connections should be tuned to support attractor states and also allow transitions between them. Connections between E and I also need to be strong.

## APPENDIX D: LOG-NORMAL AND GAMMA FITS TO THE DISTRIBUTION OF DOMINANCE DURATIONS

The distributions of dominance durations in Fig. 7*A* have been fitted with log-normal and gamma distributions. The log-normal distribution is defined as and the gamma distribution as where *C* is the normalizing factor. Maximum likelihood fits of the simulated distributions with a log-normal distribution give the values μ = 1.24 and σ = 0.35, and with the gamma distribution give α = 8.66 and β = 0.41. The quality of both fits is very similar.
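The two fits can be compared through the mean and coefficient of variation that their parameters imply. The sketch below assumes the standard log-normal parameterization (μ and σ are the mean and SD of the logarithm of the duration) and a gamma density proportional to *T*^{α−1} exp(−*T*/β), that is, β taken as a scale parameter; both parameterizations are assumptions of this example.

```python
import math

# Maximum likelihood fit values quoted in the text
MU, SIGMA = 1.24, 0.35      # log-normal parameters
ALPHA, BETA = 8.66, 0.41    # gamma parameters (BETA assumed to be a scale)

lognormal_mean = math.exp(MU + SIGMA**2 / 2.0)
lognormal_cv = math.sqrt(math.exp(SIGMA**2) - 1.0)

gamma_mean = ALPHA * BETA
gamma_cv = 1.0 / math.sqrt(ALPHA)

print(f"log-normal: mean = {lognormal_mean:.2f}, CV = {lognormal_cv:.2f}")
print(f"gamma:      mean = {gamma_mean:.2f}, CV = {gamma_cv:.2f}")
```

Under these assumptions the two fitted distributions imply similar means and coefficients of variation, in line with the statement that the quality of both fits is very similar.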

## APPENDIX E: NUMERICAL PROCEDURES

The dynamical equations for energy, rate-based, and spiking network simulations are integrated using Euler's method with time step δ*t* = 0.1 ms. Recomputing with a shorter integration time step did not produce appreciable differences in any of the reported results. The dominance durations for each percept in the energy model are defined by the amount of time in which the variable Δ*r* is below (or above) Δ*r* = 0. For the rate-based model, a transition occurs when the firing rate of one population becomes larger (or smaller) than the firing rate of the other population. For the spiking network, a transition occurs when the averaged population firing rate (number of spikes emitted by the excitatory population over a time window Δ*t* = 100 ms divided by the number of excitatory neurons) reverses order with the firing rate of the competing population. In this case, because of the large activity fluctuations, we impose the additional constraint in defining a transition that the firing rate of the population that becomes dominant must exceed 5 Hz. Energy and rate-based models typically run for 10^{5} s (model time), generating around 10^{4} durations for each percept. Means in all the plots are computed from the time series generated with these long simulations and error bars correspond to SDs of the means. The distributions and switch-triggered averages obtained from these time series are also smooth and robust. For the spiking network simulations, shorter runs were used because of the large number of neurons per population being simulated. These spiking network simulations typically run for 10^{4} s, producing on the order of 10^{3} alternations. We used custom Fortran 90 code to simulate the models and Matlab to analyze and plot the data, along with a random number generator for white noise that generated long nonrepetitive series.
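The rate-based transition criterion can be sketched as a short routine that splits two rate time series into dominance epochs. The function name, the tie-breaking rule (equal rates count as B dominant), and the decision to drop the first and last epochs because they are cut off by the ends of the run are choices of this sketch, not details taken from the text.

```python
def dominance_durations(rate_a, rate_b, dt):
    """Return (label, duration) pairs for complete dominance epochs.

    A transition is registered whenever the ordering of the two rates
    reverses. The first and last epochs are discarded because the run
    boundaries truncate them.
    """
    if len(rate_a) != len(rate_b):
        raise ValueError("rate series must have the same length")
    epochs, current, start = [], None, 0
    for i, (ra, rb) in enumerate(zip(rate_a, rate_b)):
        label = "A" if ra > rb else "B"
        if label != current:
            if current is not None:
                epochs.append((current, (i - start) * dt))
            current, start = label, i
    if current is not None:
        epochs.append((current, (len(rate_a) - start) * dt))
    return epochs[1:-1]


# Toy series sampled every 0.1 s: A leads for 2 samples, B for 3, A for 2, B for 1
toy_a = [1, 1, 0, 0, 0, 1, 1, 0]
toy_b = [0.5] * 8
print(dominance_durations(toy_a, toy_b, dt=0.1))
```

For the spiking network, the same routine could be applied to windowed population rates, with the additional 5 Hz requirement on the newly dominant population added as an extra condition before accepting a transition.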

## GRANTS

This work was supported by National Eye Institute Grant EY-14030 to N. Rubin and by a Swartz Foundation grant to N. Rubin and J. Rinzel.

## Acknowledgments

We thank S. Seung, A. Shpiro, and H. Sompolinsky for useful comments and discussions.

## Footnotes

The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked “*advertisement*” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

- Copyright © 2007 by the American Physiological Society