## Abstract

Neuronal recordings and lesion studies indicate that key aspects of economic decisions take place in the orbitofrontal cortex (OFC). Previous work identified in this area three groups of neurons encoding the offer value, the chosen value, and the identity of the chosen good. An important and open question is whether and how decisions could emerge from a neural circuit formed by these three populations. Here we adapted a biophysically realistic neural network previously proposed for perceptual decisions (Wang XJ. *Neuron* 36: 955–968, 2002; Wong KF, Wang XJ. *J Neurosci* 26: 1314–1328, 2006). The domain of economic decisions is significantly broader than that for which the model was originally designed, yet the model performed remarkably well. The input and output nodes of the network were naturally mapped onto two groups of cells in OFC. Surprisingly, the activity of interneurons in the network closely resembled that of the third group of cells, namely, chosen value cells. The model reproduced several phenomena related to the neuronal origins of choice variability. It also generated testable predictions on the excitatory/inhibitory nature of different neuronal populations and on their connectivity. Some aspects of the empirical data were not reproduced, but simple extensions of the model could overcome these limitations. These results render a biologically credible model for the neuronal mechanisms of economic decisions. They demonstrate that choices could emerge from the activity of cells in the OFC, suggesting that chosen value cells directly participate in the decision process. Importantly, Wang's model provides a platform to investigate the implications of neuroscience results for economic theory.

- dynamic system
- good-based decisions
- neural network
- neuroeconomics
- orbitofrontal cortex

Economic choices are thought to entail two mental stages: subjective values are first assigned to the available options, and decisions are made by comparing values. Evidence from lesion studies (Fellows 2011; Rudebeck and Murray 2014), functional imaging (Bartra et al. 2013; Clithero and Rangel 2014), and neurophysiology (Mainen and Kepecs 2009; Padoa-Schioppa 2011; Wallis 2011) indicates that choices, in particular choices between goods, involve the orbitofrontal cortex (OFC). Neuronal recordings in primates choosing between different juices identified three groups of neurons in this area: offer value cells encoding the value of individual juices, chosen juice cells encoding the binary decision outcome, and chosen value cells encoding the value of the chosen juice (Padoa-Schioppa 2013; Padoa-Schioppa and Assad 2006). Prima facie, these groups of neurons appear sufficient to characterize, and possibly generate, a decision. Indeed, offer value cells capture the input to the decision process, while chosen juice and chosen value cells capture the identity and value of the chosen good, and thus the decision outcome. A primary goal for decision neuroscience is to formalize this intuition by building a biologically realistic model in which the groups of cells found in OFC form a circuit that generates decisions. Ideally, such a model would encompass all that is known about these neurons and, concurrently, make new and testable predictions.

Our thinking about the mechanisms of economic decisions is influenced by work on motion perception (perceptual decisions). In a simplified scheme, two brain regions play a primary role: the middle temporal area (MT), where neurons encode the momentary evidence, and the lateral intraparietal area (LIP), where cells represent the decision outcome in the form of a planned saccade (Gold and Shadlen 2007; Parker and Newsome 1998). Notably, there is a natural analogy between neurons in MT and offer value cells and between neurons in LIP and chosen juice cells. In contrast, chosen value cells do not have a known correspondent in perceptual decisions.

Several models have been proposed to describe the neuronal mechanisms of perceptual decisions (Bogacz et al. 2006; Drugowitsch and Pouget 2012; Gold and Shadlen 2007). At the biophysical level, a leading proposal is Wang's model, in which decisions emerge from a balance of recurrent excitation and pooled inhibition. Different variants of the model account for perceptual decisions (Wang 2002; Wong and Wang 2006), similarity judgments (Engel and Wang 2011), probabilistic inference (Soltani and Wang 2010), behavior in a competitive game (Soltani et al. 2006), and flexible sensorimotor mapping (Fusi et al. 2007). More recently, the model has been adapted to describe the activity in LIP during foraging tasks (Soltani and Wang 2006) and to fit aggregate neural activity during value-based decisions (Hunt et al. 2012; Jocham et al. 2012). However, a precise mapping between Wang's model and the activity of neurons in OFC (or any brain area) during economic decisions has not yet been attempted. In this study, we examined the extent to which Wang's model can reproduce neuronal activity in the OFC. From a modeling perspective, this is a challenging test because—as we argue in detail below—the domain of economic decisions is significantly broader than that for which the model was originally designed and because neuronal activity in the OFC during economic decisions only partly resembles that in area LIP during perceptual decisions. Our deliberate goal was to test the model without changing its structure. Importantly, all the parameters in the model represent biophysical quantities such as synaptic efficacies and time constants and are derived, at least approximately, from empirical measures (Wang 2002). In our investigation, we used the same parameters previously used to model perceptual decisions (Wong and Wang 2006).

In most respects, Wang's model reproduced the activity of different groups of cells in OFC remarkably well. The two layers corresponding to the input and the categorical output of the network were identified with offer value cells and chosen juice cells, respectively. Most surprisingly, the activity of inhibitory interneurons in the network closely resembled that of the third group of neurons found in OFC, namely, chosen value cells. The model also reproduced several phenomena related to the neuronal origins of choice variability, namely, choice hysteresis, the “predictive activity” of chosen juice cells, and the “activity overshooting” of chosen value cells (Padoa-Schioppa 2013). Two aspects of the empirical data were not reproduced. First, the model did not include neurons with negative encoding. Second, a significant baseline in the activity of offer value cells introduced distortions in the behavior of the network. However, simple extensions of the model could overcome these limitations. The results of this study render a biologically credible model for the neuronal mechanisms of economic decisions.

## MATERIALS AND METHODS

#### Structure of the model.

In its extended form (Brunel and Wang 2001; Wang 2002), the model is a recurrent network of 2,000 spiking neurons, of which 80% (*N*_{E}) are excitatory pyramidal cells and 20% (*N*_{I}) are inhibitory interneurons. All neurons in the network are leaky integrate-and-fire cells endowed with biophysically realistic parameters. Two external stimuli provide the primary input, and each stimulus activates a fraction *f* = 0.15 of pyramidal cells. The remaining (1 − 2*f*) *N*_{E} pyramidal cells are “nonresponsive.” The synaptic input to each neuron is both excitatory and inhibitory. Excitatory inputs are through AMPA- and NMDA-mediated synapses, while inhibitory inputs are through GABA_{A}-mediated synapses. For each neuron, the input has an external component and a recurrent component. For the two groups of selective pyramidal cells, the external component includes the external stimulus. In addition, each neuron in the network receives an external background noise distributed as a Poisson process. Stimulus current and external background are through AMPA-mediated synapses. The recurrent component is provided by other neurons in the network. In analogy to the Hebbian rule, synapses between neurons in a given group (which fire together) are potentiated by a factor *w*_{+} > 1, whereas synapses between neurons in different groups are depressed by a factor *w*_{−} < 1. The condition *w*_{−} = 1 − *f*(*w*_{+} − 1)/(1 − *f*) ensures that all excitatory cells have the same spontaneous firing rate.
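The population sizes and the weight normalization described above can be checked with a few lines of arithmetic. The following is a minimal sketch, not the authors' MATLAB code; the function names are hypothetical, and the value *w*_{+} = 1.75 is the one reported later for the initial simulations.

```python
def network_composition(n_total=2000, frac_exc=0.8, f=0.15):
    """Split the spiking network into the populations described in the text."""
    n_e = round(n_total * frac_exc)   # excitatory pyramidal cells (N_E)
    n_i = n_total - n_e               # inhibitory interneurons (N_I)
    n_sel = round(f * n_e)            # cells activated by each stimulus
    n_ns = n_e - 2 * n_sel            # "nonresponsive" cells, (1 - 2f) * N_E
    return {"N_E": n_e, "N_I": n_i, "selective": n_sel, "nonselective": n_ns}


def w_minus(w_plus, f=0.15):
    """Depression factor from the condition w- = 1 - f (w+ - 1) / (1 - f)."""
    return 1.0 - f * (w_plus - 1.0) / (1.0 - f)


def mean_weight(w_plus, f=0.15):
    """Weight averaged over a fraction f of potentiated and (1 - f) of depressed synapses."""
    return f * w_plus + (1.0 - f) * w_minus(w_plus, f)


composition = network_composition()
w_m = w_minus(1.75)
```

Under this condition the average weight `mean_weight(w_plus)` equals 1 for any *w*_{+}, which is one way to read why potentiation within a group does not change the spontaneous firing rate.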

With a mean-field approach (Renart et al. 2003; Wong and Wang 2006), the network can be reduced to a dynamic system of 11 variables (see below). In the following, we refer to this version of the model as W11. Under several additional assumptions and approximations, the model can be further reduced to a dynamic system of two variables (henceforth W2). The advantage of the two-variable formulation is that it is easily tractable—for example, it is possible to examine the dynamics of the system in a phase plane. However, two important reasons motivated us to examine W11. First, W11 is expressed in terms of neuronal firing rates, which allows a direct comparison between the activity of units in the model and that of cells recorded experimentally. In contrast, W2 is expressed in terms of gating variables, which represent the fraction of open channels and are not directly accessible in our experiments. Second, W11 expresses explicitly the activity of each group of cells including inhibitory interneurons. In contrast, interneurons in W2 exist only implicitly, and their dynamics is not easily recovered. In summary, using W11 allowed us to contrast directly the predictions of Wang's model with the activity of neurons in the OFC.

Many previous studies examined the extended, spiking version of the model or W2. In contrast, W11 was discussed only briefly by Wong and Wang (2006). We thus recapitulate it here for convenience. A schematic representation of the model is provided in Fig. 2. In preliminary tests, we observed that the activity of inhibitory interneurons in the model examined as a function of the offer type resembles that of chosen value cells. Throughout this report, we refer to neurons in OFC as offer value, chosen juice, and chosen value cells and to nodes in the model as OV, CJ, and CV cells, respectively. The three groups of pyramidal cells correspond to CJA cells (*group 1*), CJB cells (*group 2*), and nonselective cells (*group 3*). Unless otherwise indicated, all the parameters used in our simulations were set identical to those used in the original W11 (Wong and Wang 2006). Their values are indicated in Table 1.

The dynamic system is defined by the following 11 equations: (1) (2) (3) (4) (5)

In *Eq. 1* and below, *i* = 1, 2, 3 refers to the three groups of pyramidal cells and *r*_{i} indicates the firing rate. In *Eq. 2* and below, I indicates interneurons. For each group of cells *i* and for each receptor type *R*, *S*_{R,i} is the corresponding gating variable, defined as the fraction of open ionic channels. The parameters τ are time constants, and γ in *Eq. 4* is a constant (see Table 1). Note that firing rates and gating variables are all time dependent. The input-output relation for a leaky integrate-and-fire cell is given by the simplified formula of Abbott and Chance (2005):
(6)

Note that *Eq. 6* is written separately for excitatory pyramidal cells (E) and inhibitory interneurons (I). In this equation, *I*_{syn} is the total synaptic input to the cell, and the parameters *I*_{E,I}, *c*_{E,I}, and *g*_{E,I} are, respectively, the threshold current, the gain factor, and the noise factor (see Table 1).
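As an illustration of the input-output relation, the sketch below implements the form commonly used for this formula in the mean-field literature, φ(*I*_{syn}) = *x* / [1 − exp(−*g x*)] with *x* = *c I*_{syn} − *I*_{thr}. This functional form is an assumption here, since *Eq. 6* itself is not reproduced in this text, and the numerical values in the example are illustrative placeholders rather than the entries of Table 1.

```python
import math


def phi(i_syn, c, i_thr, g):
    """Firing rate for a total synaptic input i_syn (assumed form of Eq. 6).

    c     : gain factor
    i_thr : threshold current (in the same units as c * i_syn)
    g     : noise factor, which smooths the threshold nonlinearity
    """
    x = c * i_syn - i_thr
    if abs(g * x) < 1e-9:
        return 1.0 / g            # analytic limit of x / (1 - exp(-g x)) as x -> 0
    if g * x < -700.0:
        return 0.0                # far below threshold; avoids overflow in exp
    return x / (-math.expm1(-g * x))


# Illustrative (assumed) parameter values for a pyramidal cell.
C_E, I_E, G_E = 310.0, 125.0, 0.16
rates = [phi(i, C_E, I_E, G_E) for i in (0.30, 0.40, 0.50, 0.60)]
```

Well above threshold the function approaches the linear relation *x*, and well below threshold it decays smoothly toward zero, so the rate is positive and monotonic throughout.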

#### Currents and parameters.

For each group of cells, the input current *I*_{syn} includes several components:
(7)

In *Eq. 7* and below, indices ext and rec refer to external and recurrent currents. Currents depend on the gating variables through the following equations:
(8)
(9)
(10)
(11)
(12)
(13)

The corresponding equations written for interneurons are (14) (15) (16) (17)

In these equations, *C*_{ext} is the total number of external synapses per cell and the parameters *J* are synaptic efficacies, whose values are derived from empirical measures (Wang 2002; Wong and Wang 2006). The two parameters δ*J*_{NMDA,i} and δ*J*_{GABA,i}, which were not present in the original W11, were introduced here in certain simulations to examine the effects of synaptic imbalance (see *Imposing nontrivial relative values*). In the initial simulations, δ*J*_{NMDA,i} and δ*J*_{GABA,i} were set equal to 1.

Finally, *I*_{η,i} in *Eq. 8* and *I*_{η,I} in *Eq. 14* are noise terms. Neurons in the extended, spiking network (Wang 2002) are noisy Poisson processes. In the derivation of W11, Wong and Wang (2006) first removed the time-dependent noise during the mean-field approximation and then reintroduced white noise in the form of an Ornstein-Uhlenbeck process. Thus *I*_{η} evolves according to the following equation:
(18)

where η (*t*) is a white noise with unit variance (Renart et al. 2003). The parameter σ_{η} represents the amount of noise (see Table 1).
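A noise term of this kind can be simulated as follows. This is a minimal sketch that assumes the standard Ornstein-Uhlenbeck form τ d*I*_{η}/d*t* = −*I*_{η} + η(*t*)·sqrt(τ σ_{η}²), which may differ in detail from *Eq. 18*. The time constant and noise amplitude below are placeholders (Table 1 is authoritative), and the exact discretization used here is a convenience choice, not necessarily the integration scheme of the original simulations.

```python
import math
import random


def simulate_ou_noise(n_steps, dt, tau, sigma, seed=0):
    """Return a sample path of an Ornstein-Uhlenbeck noise current.

    Uses the exact one-step update, so the stationary standard deviation
    is sigma / sqrt(2) regardless of the step size dt.
    """
    rng = random.Random(seed)
    decay = math.exp(-dt / tau)                        # deterministic relaxation
    kick = sigma * math.sqrt((1.0 - decay ** 2) / 2.0)  # random increment scale
    i_eta = 0.0
    trace = []
    for _ in range(n_steps):
        i_eta = i_eta * decay + kick * rng.gauss(0.0, 1.0)
        trace.append(i_eta)
    return trace


# dt = 0.5 ms as in the simulations; tau and sigma are assumed values.
noise = simulate_ou_noise(n_steps=200_000, dt=0.5e-3, tau=2e-3, sigma=0.02, seed=1)
```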

#### Modeling offer value cells.

The last term in *Eq. 7*, namely *I*_{stim}, is the primary input, which equals zero for nonselective pyramidal cells and for interneurons. For CJA and CJB cells, we set *I*_{stim} as follows:
(19)

where *r*_{OV} is the firing rate of OV cells. The synaptic efficacy *J*_{AMPA,input} reflects the number of connections between OV cells and CJ cells. In the original W11, *J*_{AMPA,input} was set equal to *J*_{AMPA,ext,pyr}. Unless otherwise indicated, in all our simulations we set *J*_{AMPA,input} = 30*J*_{AMPA,ext,pyr}. This adjustment provided a sizable dynamic range for CV cells and reflected the fact that offer value cells and chosen juice cells are found in close proximity in the same brain region, whereas cells in MT and LIP are connected only through long-distance projections.

In *Eq. 19*, the parameter δ*J*_{stim,i} was introduced to impose nontrivial relative values (see *Imposing nontrivial relative values*). In the initial simulations, we set δ*J*_{stim} = (2, 1). Conversely, the parameter δ*J*_{HL,i} accounted for the Hebbian learning taking place after range adaptation (see section *Range adaptation, context-dependent preferences, and Hebbian learning*). We generally set δ*J*_{HL} = (Δ*A*/Δ*B*, 1). Unless otherwise indicated, the two value ranges used in the simulations were equal and δ*J*_{HL} = (1, 1).

For the sake of simplicity, the value encoded by offer value cells is treated here as equivalent to the corresponding juice quantity. [In reality, these two variables are distinct (Raghuraman and Padoa-Schioppa 2014)]. For any session and for any *juice X*, #*X* indicates the quantity of *X* offered in the current trial, #*X*_{min} and #*X*_{max} indicate the minimum and maximum quantities of *X* offered in the session, and Δ*X* = [#*X*_{min}, #*X*_{max}] is the value range. The encoding of values in OFC undergoes range adaptation (Kobayashi et al. 2010; Padoa-Schioppa 2009). In other words, the activity of offer value cells on any given trial is a linear function of the value rank *x*:
(20)
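Range adaptation can be illustrated with a short sketch. It assumes that the value rank is the offered quantity normalized to the session range, *x* = (#*X* − #*X*_{min})/(#*X*_{max} − #*X*_{min}), and that the firing rate is *r*_{0} + Δ*r*·*x*; both expressions are assumptions consistent with the text rather than a transcription of *Eq. 20*. The function names and the default values of `r0` and `delta_r` are illustrative (see Table 1 for the values actually used).

```python
def value_rank(quantity, q_min, q_max):
    """Map an offered quantity onto the interval [0, 1] for the session range."""
    if q_max <= q_min:
        raise ValueError("the value range must have positive width")
    return (quantity - q_min) / (q_max - q_min)


def ov_rate(quantity, q_min, q_max, r0=0.0, delta_r=8.0):
    """Firing rate (sp/s) of an offer value cell as a linear function of the rank."""
    return r0 + delta_r * value_rank(quantity, q_min, q_max)


# The same rank, and hence the same rate, in two sessions with different ranges.
rate_narrow = ov_rate(5, 0, 10)
rate_wide = ov_rate(10, 0, 20)
```

In this sketch the cell's activity range is the same in both sessions even though the range of offered quantities doubles, which is the defining property of range adaptation.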

The activity profile of OV cells was modeled as follows: (21) (22) (23)

where *n* is the trial number, *t* is time within a trial, and *x*_{i} is the rank for *juice X*_{i}. *Juices A* and *B* correspond to *X*_{1} and *X*_{2}, respectively.

The baseline activity (*r*_{0}), dynamic range (Δ*r*), and time constants (*a*, *b*, *c*, *d*) used in the simulations are indicated in Table 1. Unless otherwise indicated, in all the simulations we set the same value range for the two juices, namely Δ*A* = Δ*B* = [0, 20]. This large range allowed us to generate choice patterns and neuronal activity at high resolution.

#### Simulations.

Simulations were run in MATLAB (MathWorks). Unless otherwise specified, each session included 4,000 trials with both offers randomly selected on each trial from the range [0, 20]. Offers 0A:0B were excluded. The network dynamics was generated with a resolution d*t* = 0.5 ms and then examined by averaging over 5-ms time bins. For each trial, the choice outcome was identified by comparing the activity of CJA and CJB cells in the interval 400–600 ms after the offer (similar results were obtained with different time windows). A trial type was defined by two offers and a choice (e.g., [2A:5B, B]). For the study of activity profiles (e.g., Fig. 4*A*), trials were divided into groups according to the relevant criterion and the activity was averaged across trials for each group. For the study of tuning functions (e.g., Fig. 4, *B–D*), we focused on specific time windows, namely, 0–500 ms after the offer for OV cells and CV cells and 500–1,000 ms after the offer for CJ cells. However, qualitatively similar results were obtained with different time windows. Firing rates were averaged across trials for each trial type. Choice patterns (e.g., Fig. 5*A*) were analyzed with logistic regressions (see RESULTS). The code is available upon request.
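The session design and the choice readout can be sketched as follows. This is not the authors' MATLAB code. It assumes that offers are integer quantities drawn uniformly, that each trace starts at offer onset, and that ties are assigned to *juice B*; all function names are hypothetical.

```python
import random


def sample_offers(n_trials=4000, q_max=20, seed=0):
    """Draw (quantity of A, quantity of B) pairs, excluding the offer 0A:0B."""
    rng = random.Random(seed)
    offers = []
    while len(offers) < n_trials:
        a, b = rng.randint(0, q_max), rng.randint(0, q_max)
        if a == 0 and b == 0:
            continue
        offers.append((a, b))
    return offers


def bin_average(trace, dt_ms=0.5, bin_ms=5.0):
    """Average a trace sampled every dt_ms over consecutive bins of bin_ms."""
    step = int(round(bin_ms / dt_ms))
    return [sum(trace[k:k + step]) / step
            for k in range(0, len(trace) - step + 1, step)]


def read_choice(r_cja, r_cjb, dt_ms=0.5, window_ms=(400.0, 600.0)):
    """Compare the mean CJA and CJB rates in the readout window after the offer."""
    lo, hi = (int(round(t / dt_ms)) for t in window_ms)
    mean_a = sum(r_cja[lo:hi]) / (hi - lo)
    mean_b = sum(r_cjb[lo:hi]) / (hi - lo)
    return "A" if mean_a > mean_b else "B"


offers = sample_offers()
```

A trial type such as [2A:5B, B] then corresponds to the tuple `(2, 5)` paired with the string returned by `read_choice`.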

## RESULTS

#### Summary of experimental observations.

Figure 1 summarizes the primary experimental results that we sought to reproduce with Wang's model. In the experiments (Padoa-Schioppa and Assad 2006, 2008), monkeys chose between two juices labeled *A* and *B*, with *A* preferred. On any given trial, the offers appeared on a computer monitor on the two sides of a center fixation point. After the offer, the animals maintained center fixation for a randomly variable delay that lasted 1–2 s, after which the center fixation was dimmed (go signal). The animal indicated its choice with a saccade. Juice quantities varied from trial to trial, and choice patterns typically presented a quality-quantity trade-off. For example, in the session shown in Fig. 1*A*, the monkey was roughly indifferent between 1A and 3B. Recordings were performed in central OFC. In the initial analysis, we defined several time windows aligned with different behavioral events and a large number of variables that neurons could conceivably encode. A variable selection analysis indicated that *1*) neurons in OFC encoded one of three variables, namely, *offer value*, *chosen juice*, and *chosen value*. Further analyses showed that *2*) the encoding of these variables was roughly linear and that *3*) each cell encoded one variable or another variable, but not a mixture of variables. We also found that *4*) any given cell generally encoded the same variable in different time windows. Thus neurons encoding the three variables were conceptualized as forming different groups of cells (Padoa-Schioppa 2013). For each variable, *5*) the encoding could be either positive (higher activity for higher values) or negative (lower activity for higher values). Importantly, experiments in which animals chose between three juices offered pairwise showed that *6*) the activity of cells encoding the offer value of one particular juice did not depend on the juice offered as the alternative (menu invariance). Last but not least, *7*) the activity of both offer value and chosen value cells adapted to the range of values available in any given session (Padoa-Schioppa 2009).

Three examples of neurons encoding the *offer value* *B*, the *chosen juice A*, and the *chosen value* are illustrated in Fig. 1, *A*, *C*, and *E*. Figure 1, *B*, *D*, and *F*, depict the population activity profile recorded for each group of cells (only the populations of cells with positive encoding are shown). Figure 1*D* and behavioral measures (Padoa-Schioppa, unpublished observations) indicate that decisions were made within a few hundred milliseconds after the offer.

#### Wang's model for economic decisions.

Figure 2 illustrates the structure of the model. We identified the input node (OV cells, corresponding to MT in perceptual decisions) with offer value cells. In the original W11, the input current has the form (24)

where μ_{0} = 40 sp/s, coh ∈ [−1, 1], and the ± signs refer to *pools A* and *B*, respectively. The fact that the range of possible inputs is bounded to the interval [0, 80] sp/s is essential for the network to operate properly. As we moved from perceptual to economic decisions, we had to consider several factors.

First, unlike motion coherence, values are not bounded to a finite range. In principle, this fact could pose a challenge for the network. In reality, however, this challenge does not arise thanks to the phenomenon of range adaptation (Kobayashi et al. 2010; Padoa-Schioppa 2009). After neurons have adapted to the range of values available in the behavioral context (current session), their activity on any given trial is a linear function of the value rank, which varies in the interval [0, 1] (see *Eq. 20*). We discuss below ways in which range adaptation presents a challenge for Wang's model. However, at this stage range adaptation makes it easy to identify offer value cells with the input node of the model.

Second, in the random dot task (Newsome 1997), the two inputs are perfectly anticorrelated. In this sense, the stimulus is intrinsically one-dimensional (and indeed it is parameterized by the unidimensional parameter coh). Thus if we consider the plane formed by *I*_{stim,1} and *I*_{stim,2}, the inputs for the random dot task lie on the diagonal with slope −1, and the data point corresponding to coh = 0 is in the center of the diagram (Fig. 3*A*). In contrast, the two offers in economic choice tasks (Padoa-Schioppa 2011) can vary independently of one another and assume any value within the range spanned in the behavioral session. Thus the input to the model can lie anywhere on the plane formed by *I*_{stim,1} and *I*_{stim,2} (Fig. 3*B*). As a consequence, there are many sets of offers that induce behavioral indifference. In practice, offers used in the experimental sessions did not cover the plane densely, because in most trials one of the two juices was offered in quantity 1 (Padoa-Schioppa and Assad 2006, 2008). However, in the present study, we simulated offers covering the full plane (Fig. 3*B*). In other words, we tested the neuronal network well beyond its original domain of definition.
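The geometric contrast between the two tasks can be made concrete with a small sketch. It assumes that the perceptual input rates take the form μ_{0}(1 ± coh), which is inferred from the stated μ_{0} = 40 sp/s and the [0, 80] sp/s bound rather than copied from *Eq. 24*, and it omits any synaptic efficacy prefactor. The economic input uses an illustrative dynamic range of 8 sp/s and the offer range [0, 20].

```python
def perceptual_inputs(coh, mu0=40.0):
    """Input rates (sp/s) to the two pools in the random dot task."""
    if not -1.0 <= coh <= 1.0:
        raise ValueError("coherence must lie in [-1, 1]")
    return mu0 * (1.0 + coh), mu0 * (1.0 - coh)


def economic_inputs(qty_a, qty_b, q_max=20.0, delta_r=8.0):
    """Input rates (sp/s) for two offers that vary independently."""
    return delta_r * qty_a / q_max, delta_r * qty_b / q_max


# Perceptual inputs always sum to 2 * mu0: they lie on a line with slope -1.
perceptual_sums = [sum(perceptual_inputs(c)) for c in (-1.0, -0.3, 0.0, 0.3, 1.0)]

# Economic inputs have no such constraint: they can fall anywhere in the plane.
economic_sums = [sum(economic_inputs(a, b)) for a, b in ((1, 1), (10, 2), (20, 20))]
```

The constant sum in the first case is what makes the perceptual stimulus one-dimensional, whereas the varying sum in the second case reflects the two independent offers.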

Third, in the original W11 the time profile of the input currents is a boxcar, which mimics the fact that neurons in MT respond with good time fidelity to the momentary motion of the visual stimulus. In contrast, the time profile of offer value cells in OFC is more complex (Fig. 1*B*). In particular, we note several salient aspects. *1*) There is a baseline of ∼6 sp/s and a dynamic range of ∼8 sp/s. Importantly, the ratio between dynamic range and baseline is modest compared with that typically reported for MT and modeled in W11, where the baseline is negligible and the dynamic range is roughly 80 sp/s. *2*) Compared with baseline, the modulation due to value is all in the direction of increased firing rates. In other words, focusing on the 500 ms following the offer, the baseline-subtracted mean firing rate of offer value cells ranges roughly between 0 and 8 sp/s, depending on the value offered in the trial. Thus in our simulations, we used a semirealistic time profile to model the activity of offer value cells. In the initial simulations we set the baseline to zero. The effects of introducing a nonzero baseline are examined in *The baseline activity of offer value cells*.

Fourth, the input from MT to LIP is via long-distance connections. In contrast, offer value cells and chosen juice cells are in the same anatomical region, and thus they presumably enjoy the density typical of local connections. We thus increased the connectivity between OV cells and CJ cells (*Eq. 19*). In the simulations, we set *J*_{AMPA,input} = 30*J*_{AMPA,ext,pyr}. This value was chosen to obtain a sizable dynamic range for CV cells. Notably, even considering the fact that the dynamic range of offer value cells is much lower than that of neurons in MT, this value made the input current higher than that used in the original W11.

Fifth, the activation of chosen juice cells in OFC is largely phasic (Fig. 1*D*). Their activity profile increases shortly after the offer, peaks ∼350 ms after the offer, and decays rapidly in the following 300–400 ms, even though the offers are still on the monitor and the animal has not yet revealed its choice. The traces corresponding to the two choice outcomes remain separated, but that signal is relatively small until after the animal has performed its saccade. Furthermore, if one compares the traces for easy and split decisions (see below), it appears clear that the activity of chosen juice cells does not resemble a race to threshold. This largely phasic activity profile differs from that reported for LIP in the random dot task, and this difference is relevant to the model. Indeed, the precursor to Wang's model discussed here was a model designed to describe the persistent delay activity observed in lateral prefrontal cortex and other association areas (Brunel and Wang 2001; Compte et al. 2000). Reverberation mediated by NMDA receptors, which have slow dynamics, is a characteristic trait of this model. Working memory activity generally increases with *w*_{+}, and the original W11 set *w*_{+} = 1.80 (Wong and Wang 2006). However, the authors noted that, depending on *w*_{+}, the model could perform decisions with or without working memory. In our initial simulations we set *w*_{+} = 1.75, which provided a largely phasic (as opposed to a working memory-like) time profile for chosen juice cells. The dependence of the results on *w*_{+} is described later in *Dependence on the strength of recurring synapses*.

#### Initial simulations.

Figure 4 shows the results of one simulation. Figure 4, *A–D*, illustrate the activity of OVB cells. Figure 4*A* depicts the activity profile, splitting trials depending on whether the encoded value was low, medium, or high. Figure 4*B* depicts the full two-dimensional tuning curve. The *x*- and *y*-axes represent the quantities of *juice A* and *juice B* offered, and the *z*-axis represents the firing rate. Each data point represents one trial type, and trial types are coded depending on whether the model chose *juice A* or *juice B*. To visualize the tuning of OV cells in a one-dimensional format similar to that of Fig. 1*A*, we downsampled the tuning curve, selecting a subset of offer types analogous to those typically employed in the experiments (Fig. 4*C*). Finally, Fig. 4*D* illustrates the activity of OVB cells plotted against the encoded variable (*offer value B*).

Figure 4, *E–H*, illustrate the activity of CJB cells in a similar format. In Fig. 4*E*, trials were divided depending on whether the model chose *juice A* or *juice B*. Figure 4*F* depicts the full tuning curve, with colors indicating the chosen juice. Figure 4*G* shows a downsampled version of the tuning curve, and Fig. 4*H* displays the activity of CJB cells as a function of the variable *chosen juice*.

Figure 4, *I–L*, illustrate the activity of CV cells. To generate these plots, we analyzed the network's choices and derived the relative value of the two juices from the indifference point. The details of this computation are described in the next section. In essence, the relative value reconstructed from choices (ρ = 2.03) was very close to the ratio δ*J*_{stim,1}/δ*J*_{stim,2} = 2. Following the approach used in the analysis of empirical data (Padoa-Schioppa and Assad 2006), we could thus express quantities of either juice on a single value scale, and we conventionally used units of *juice B* (uB). For each trial type, we could thus calculate the chosen value. In Fig. 4*I*, trials were divided depending on whether the chosen value was low, medium, or high. Figure 4*J* depicts the full tuning curve, with colors indicating the chosen juice. Figure 4*K* shows a downsampled version of the tuning curve, and Fig. 4*L* displays the activity of CV cells as a function of the variable *chosen value*.
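The conversion to a common value scale can be sketched in a few lines. The function name is hypothetical, and the default ρ = 2.03 is the relative value reconstructed from the simulated choices described above; the sketch assumes linear value functions, as in the text.

```python
def chosen_value_uB(qty_a, qty_b, choice, rho=2.03):
    """Value of the chosen offer in units of juice B (uB).

    One unit of juice A is worth rho units of juice B, so a quantity of A
    is multiplied by rho while a quantity of B is taken as is.
    """
    if choice == "A":
        return rho * qty_a
    if choice == "B":
        return float(qty_b)
    raise ValueError("choice must be 'A' or 'B'")


# The trial type [2A:5B, B] has a chosen value of 5 uB;
# had the network chosen A, the chosen value would be 2 * rho uB.
cv_b = chosen_value_uB(2, 5, "B")
cv_a = chosen_value_uB(2, 5, "A")
```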

Several points are noteworthy. First, for both CJ cells and CV cells, the activity profile (Fig. 4, *E* and *I*) reproduces fairly well the corresponding activity profile recorded experimentally (Fig. 1, *D* and *F*). Second, for CJ cells, the tuning curve clearly separates between the two choice outcomes: the activity of CJB is higher (lower) when the network chooses *juice B* (*juice A*) (Fig. 4*F*). At the same time, the tuning is not quite binary. For a given choice, the activity of CJB cells increases (decreases) with the quantity of *juice B* (*juice A*) offered. This characteristic was already present in the extended version of the model describing perceptual decisions (see Fig. 4 in Wang 2002). Third, the tuning curve obtained for CV cells (Fig. 4*K*) closely resembles those obtained experimentally for chosen value cells. Inspection of Fig. 4*L* reveals that, for a given chosen value (*x*-axis), the activity of CV cells is essentially identical when the network chooses *juice A* and when it chooses *juice B*. This fact means that CV cells indeed capture the relative value of the two juices (ρ).

The resemblance between CV cells and chosen value cells was unexpected and quite remarkable if one considers the fact that interneurons were included in the model for biological realism and stability (Amit and Brunel 1997), not to reproduce empirical observations analogous to chosen value cells. An intuition for this result can be gained by noting that CV cells receive an input proportional to the activity sum of CJA and CJB cells (*Eqs. 15* and *17*). By virtue of the decision, this sum eventually equals the activity of the winning pool. In turn, CJA and CJB cells receive an input proportional to the value of the corresponding juice weighted by the relative value ρ. Consequently, the input to CV cells, and thus their activity, is linearly related to the value of the chosen juice, namely, the chosen value. Important in this respect is the fact, rarely noted but clearly apparent in Wang (2002), that the network attractors depend on the input currents. Figure 4*L* reveals that the activity of CV cells as a function of the chosen value is slightly sublinear. To some extent, this is also true in the empirical population data (Padoa-Schioppa, unpublished observations), although the departure from linearity at the level of individual cells rarely reaches significance. As described below, the magnitude of these effects depends on the parameter *w*_{+}.

Wang's network also includes a population of nonselective (NS) pyramidal cells, which are interconnected with both groups of CJ cells and CV cells (Fig. 2). For the reasons discussed above, this pattern of connectivity suggests that the activity of NS cells should resemble that of chosen value cells in OFC, and this is indeed what we found (not shown). Apart from the excitatory/inhibitory nature of the neurons, one difference between CV cells and NS cells is that the latter group presented a lower dynamic range (∼1 sp/s for the simulation of Fig. 4) and a higher trial-by-trial variability (relative to the dynamic range). Importantly, while inhibitory interneurons (CV cells) are critical to the decision mechanism, NS cells do not play a significant role in the decision (Amit and Brunel 1997; Wong and Wang 2006). Thus we will not discuss them further.

#### Imposing nontrivial relative values.

Consider a subject choosing between quantities of *goods A* and *B*. If value functions are linear (i.e., if the subjective value assigned to each good increases as a linear function of the good's quantity), the relative value between the two goods, namely ρ, is defined as the quantity that makes the subject indifferent between 1A and ρB. A hallmark of economic decisions is the fact that relative values are subjective and even variable over time. In fact, relative values capture the quintessence of economic decisions, namely, the fact that subjective value provides a common unit of measure to compare qualitatively different goods. In general, *goods A* and *B* have different physical dimensions, and ρ has the physical dimensions necessary to convert one unit of *good A* into one unit of *good B*. A fundamental but often underappreciated issue concerns the neural origins of relative values. In essence, the question is: How is ρ determined in the brain?

To appreciate this issue, consider in our experiments choices between two juices offered in equal ranges. In this case, ρ is a number. In general, ρ can assume any value and we can conventionally set ρ ≥ 1 (i.e., *A* is preferred to *B*). In principle, the relative value between two goods could be induced by differences in the activity of different groups of offer value cells. For example, if the animal choosing between *juices A* and *B* is indifferent between 1A and 3B, one might expect to observe that the firing rate of offer value A cells measured when the animal is offered 1A is three times as large as that of offer value B cells measured when the animal is offered 1B. However, experimental results indicate that this is not the case—in fact, the activity range of any given cell was found to be independent of the relative value of two juices (Padoa-Schioppa 2009). In other words, relative values do not simply reflect differences in the activity of offer value cells. This fact highlights an important point, namely, that economic decisions cannot be conceptualized as simple comparisons of neuronal firing rates. So how are relative values determined in the brain? We examined this issue in the framework of Wang's model.

The original W11 is symmetric in *A* and *B*. If the activity of OV cells does not depend on the intrinsic preference for the corresponding juice, and if the two value ranges are equal, the symmetry of the network implies ρ = 1. However, nontrivial relative values (i.e., ρ > 1) can be imposed by introducing an imbalance in the synaptic efficacies linking the various pools of neurons associated with the two juices. We experimented with introducing such imbalance at different stages of the network. Figure 5*A* illustrates the choice patterns obtained in the same simulation illustrated in Fig. 4. The *x*- and *y*-axes represent the quantities of *juices A* and *B* offered, and the *z*-axis represents the proportion of trials in which *juice B* was chosen. Each gray dot represents one trial type, and the color surface shows the result of a two-dimensional logistic regression. Specifically, to examine departures from linearity, we used a logistic model including all second-order terms:
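*choice B* = 1/(1 + exp(−*X*)), with *X* = *a*_{0} + *a*_{1} #*A* + *a*_{2} #*B* + *a*_{3} (#*A*)^{2} + *a*_{4} (#*B*)^{2} + *a*_{5} #*A* #*B* (25)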

In *Eq. 25*, the variable *choice B* is the proportion of B choices; #*A* and #*B* are the quantities of *juice A* and *juice B* offered, respectively. Figure 5*B* shows the same surface shown in Fig. 5*A* seen from the *z*-axis. In this simulation, the synaptic imbalance was introduced at the level of the input current: referring to *Eq. 19*, we set δ*J*_{stim} = (2, 1). The ensemble of offers for which the model was indifferent between the two juices is termed the indifference function. Notably, the indifference function in this simulation was a straight line through the origin (Fig. 5*B*). Furthermore, the slope of the indifference function was essentially equal to the synaptic weight ratio = 2. A simplified logistic model including only first-order terms provided the measure ρ ≡ *a*_{1}/*a*_{2} = 2.03. (This value was used to generate Fig. 4, *I–L*.)
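A logistic analysis of this kind can be sketched in a few lines of NumPy. The example below is a minimal illustration, not the authors' analysis code: the choices are synthetic (generated from a toy rule with a true relative value of 2, not from W11), and the offer grid, the noise parameter `beta`, the number of trials per offer, the damped Newton fitting routine, and all function names are assumptions made here. In the sign convention of this sketch, the first-order coefficient of #*A* is negative, so the slope of the indifference line is −*a*_{1}/*a*_{2}.

```python
import numpy as np


def design_matrix(q_a, q_b, second_order=True):
    """Columns: 1, #A, #B and, optionally, #A^2, #B^2, #A*#B."""
    cols = [np.ones_like(q_a), q_a, q_b]
    if second_order:
        cols += [q_a ** 2, q_b ** 2, q_a * q_b]
    return np.column_stack(cols)


def log_likelihood(X, y, a):
    """Bernoulli log-likelihood for logits X @ a, in a numerically stable form."""
    z = X @ a
    return np.sum(y * z - np.logaddexp(0.0, z))


def fit_logistic(X, y, n_iter=100, tol=1e-10):
    """Maximum-likelihood fit of P(y = 1) = 1 / (1 + exp(-X @ a)) by damped Newton steps."""
    a = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = np.exp(-np.logaddexp(0.0, -(X @ a)))  # stable sigmoid
        grad = X.T @ (y - p)
        hess = (X * (p * (1.0 - p))[:, None]).T @ X
        step = np.linalg.solve(hess + 1e-9 * np.eye(X.shape[1]), grad)
        # Halve the step until the log-likelihood no longer decreases.
        t, ll = 1.0, log_likelihood(X, y, a)
        while log_likelihood(X, y, a + t * step) < ll and t > 1e-6:
            t *= 0.5
        a = a + t * step
        if np.max(np.abs(t * step)) < tol:
            break
    return a


def simulate_choices(rho=2.0, beta=0.5, n_per_offer=200, seed=0):
    """Synthetic trials with P(choose B) = sigmoid(beta * (#B - rho * #A)). Toy data only."""
    rng = np.random.default_rng(seed)
    grid_a, grid_b = np.meshgrid(np.arange(0, 11), np.arange(0, 21))
    q_a = np.repeat(grid_a.ravel().astype(float), n_per_offer)
    q_b = np.repeat(grid_b.ravel().astype(float), n_per_offer)
    p_b = 1.0 / (1.0 + np.exp(-beta * (q_b - rho * q_a)))
    chose_b = (rng.random(q_a.size) < p_b).astype(float)
    return q_a, q_b, chose_b


q_a, q_b, chose_b = simulate_choices()
a_first = fit_logistic(design_matrix(q_a, q_b, second_order=False), chose_b)
a_full = fit_logistic(design_matrix(q_a, q_b, second_order=True), chose_b)

rho_hat = -a_first[1] / a_first[2]        # slope of the indifference line
intercept_hat = -a_first[0] / a_first[2]  # quantity of B indifferent to zero A
print("estimated relative value:", rho_hat)
print("indifference-line intercept:", intercept_hat)
print("second-order coefficients:", a_full[3:])
```

For a linear and homogeneous indifference function, the second-order coefficients and the intercept should be close to zero, and the first-order ratio recovers the relative value. A clearly nonzero intercept is the signature of a nonhomogeneous indifference function.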

In another simulation, we introduced the synaptic imbalance in the recurrent, NMDA-mediated self-excitation of CJ cells (Fig. 5*C*). In this case, referring to *Eq. 11*, we set δ*J*_{NMDA} = (1.05, 1). (These values were chosen such that the indifference function would cross offers [10A:20B].) Notably, the indifference function was no longer a straight line and, more importantly, it was no longer homogeneous (it did not cross the origin). This behavior is not realistic because it amounts to stating that the model consistently chooses no juice over small quantities of *juice B*. We obtained similar results when we imposed the imbalance in the inhibition from interneurons to CJ cells (Fig. 5*D*). In this case, referring to *Eq. 13*, we set δ*J*_{GABA} = (1, 1.02). Again, the indifference function was nonlinear and, most importantly, nonhomogeneous. We concluded that within Wang's model nontrivial relative values emerge from an imbalance in the input synaptic ratio.

#### Dependence on the strength of recurring synapses.

Wang's model is biophysically realistic in the sense that all the parameters represent biophysical quantities (synaptic efficacies, time constants, etc.) and their values are derived from or constrained by experimental measures (with some tuning). In this study, we used the same parameters previously set for perceptual decisions (Wong and Wang 2006). The only (partially) free parameter was the relative strength of recurring synapses *w*_{+}. This parameter is also thought to characterize different brain regions (Murray et al. 2014). We thus examined how the results described above depended on *w*_{+}.

In the simulations described so far, we set *w*_{+} = 1.75. Figure 6 summarizes the results of three additional simulations, in which we set *w*_{+} = 1.55 (Fig. 6, *A–D*), *w*_{+} = 1.70 (Fig. 6, *E–H*), and *w*_{+} = 1.85 (Fig. 6, *I–L*). For each simulation in Fig. 6, the activity of CJB cells is illustrated on the left and the activity of CV cells is illustrated on the right. The results obtained for the three simulations are qualitatively similar, but a few quantitative trends can be observed. First, as *w*_{+} increased the sustained, working memory-like delay activity of CJB cells increased (Fig. 6, *A*, *E*, and *I*). Concurrently, the steady-state activity of these cells became more binary (Fig. 6, *B*, *F*, and *J*). As described in the next section, the “predictive activity” of CJ cells increased as a function of *w*_{+}. Finally, the relation between the activity of CV cells and the chosen value was closer to linear for lower values of *w*_{+} (Fig. 6, *D*, *H*, and *L*). These trends notwithstanding, the primary observation of these analyses is that the results presented in previous sections held essentially true in a fairly wide range of *w*_{+}.

#### Choice hysteresis and the predictive activity of chosen juice cells.

We next examined whether Wang's model could reproduce a series of empirical phenomena related to the origins of choice variability. Relevant to all these phenomena is the distinction between easy and split decisions. Consider the behavior shown in Fig. 1*E*. For many offer types, away from the indifference point, the animal chose consistently the same juice (*A* or *B*). These decisions are referred to as “easy.” For other offer types, closer to the indifference point, the animal split its choices between the two juices. These decisions are referred to as “split.”

All other things equal, monkeys in our experiments had the tendency to choose on any given trial the same juice chosen (and received) in the previous trial (Fig. 7*A*). This phenomenon is termed “choice hysteresis” (Padoa-Schioppa 2013). Other analyses showed that the activity of chosen juice cells prior to the offer correlated with the eventual decision of the animal—an effect termed “predictive activity” (Padoa-Schioppa 2013). The predictive activity can be observed in Fig. 7*B*. Trials were divided into four groups depending on whether the animal chose *juice A* or *juice B* and on whether decisions were easy or split. For easy trials, the activity of chosen juice cells recorded before the offer did not correlate with the eventual decision. In contrast, the preoffer activity recorded in split trials correlated with the eventual decision of the animal. Importantly, a large component of the predictive activity observed in Fig. 7*B* was tail activity from the previous trial. In a conservative interpretation, this tail activity was correlated with the current decision because the decision on any given trial was mildly correlated with that in the previous trial. In other words, a large component of the predictive activity was closely related to choice hysteresis. However, the predictive activity had also a smaller, “residual” component that did not depend on the outcome of the previous trial (Padoa-Schioppa 2013).

We now examined whether W11 reproduced these phenomena. First, we considered the same simulation depicted in Fig. 4 and conducted on the activity of CJ cells the same analysis as that conducted on neuronal data. As illustrated in Fig. 7*C*, we found a consistent predictive activity. Interestingly, this predictive activity increased as a function of *w*_{+} (Fig. 6, *A*, *E*, and *I*). This trend reflects the fact that baseline fluctuations increase with *w*_{+}. Also, for higher *w*_{+} the decision is made more rapidly (the network is more “impulsive”) and thus the initial bias becomes more relevant (see below).

The predictive activity in Fig. 7*C* (W11) is noticeably modest compared with that in Fig. 7*A* (experimental data). Importantly, in all the simulations discussed so far, the initial conditions of the network were reset at the beginning of each trial. This reset effectively prevented any choice hysteresis (Fig. 7*D*). As a consequence, the predictive activity observed in Fig. 7*C* corresponds exclusively to the residual component of the predictive activity observed empirically. To further investigate choice hysteresis and the predictive activity within W11, we ran a series of simulations in which the initial conditions of the network were set, in each trial, equal to the final conditions at the end of the preceding trial. For any value of *w*_{+} this policy induced some choice hysteresis and enhanced the predictive activity. However, these effects were more pronounced for higher values of *w*_{+} (because of the larger tail activity). Figure 7, *E* and *F*, illustrate a simulation in which we set *w*_{+} = 1.82. It can be noted that there is a robust choice hysteresis and that the predictive activity is clearly enhanced (note the different *y*-axis scales in the insets of Fig. 7, *E* and *C*).

In conclusion, W11 reproduces both components of the predictive activity, as well as choice hysteresis. These results suggest the following interpretation. The preoffer activity represents the state of the neural assembly prior to the offer, which varies to some extent from trial to trial because of history (tail activity from the previous trial) or stochastic fluctuations. On any given trial, the initial state introduces a small bias. When one of the two offered values dominates (easy trials), the initial bias is irrelevant and the higher value is always chosen. When the two offer values are close (split trials), the initial bias contributes significantly to the decision. This contribution determines a correlation between the preoffer activity and the decision (predictive activity). Within W11, the tail component of the predictive activity causes choice hysteresis. While consistent with current data, this causal link remains to be tested empirically.

#### The overshooting of chosen value cells.

Another phenomenon observed in the empirical data pertains to chosen value cells. All other things equal, the activity of these neurons presented a transient but robust “overshooting” when the decision was more difficult (Padoa-Schioppa 2013). For the analysis of neuronal data, we focused on trials in which the animal chose 1A and divided them into easy and split (Fig. 8*A*). In first approximation, the activity recorded for the two groups of trials was the same (same chosen value). However, the activity recorded in split trials presented a transient overshooting in the time window extending 150–400 ms after the offer.

We now repeated this analysis on the activity of CV cells in W11. We examined the same simulation as in Fig. 4, and we focused on trials in which 1A was offered. We divided trials into easy and split, and we computed the two activity profiles. We then repeated this calculation for each quantity of *A* that induced some split decisions (*offer value A* = 1…10), and we averaged the activity profiles obtained for different values of *A* offered. As illustrated in Fig. 8*B*, the model reproduced the activity overshooting. Notably, this result provides a novel interpretation for the overshooting. We previously showed that the overshooting of chosen value cells could be explained if the relative value of the two juices (ρ) fluctuated from trial to trial (Padoa-Schioppa 2013). However, the activity overshooting for CV cells was obtained here without introducing any variability in the synaptic efficacies. Of course, this observation does not exclude that synaptic efficacies do fluctuate from trial to trial, which would provide an additional contribution to the overshooting.

#### Range adaptation, context-dependent preferences, and Hebbian learning.

The issue of range adaptation and Hebbian learning was addressed elsewhere (Padoa-Schioppa and Rustichini 2014), and we discuss it here briefly for the sake of completeness.

As described above, the activity of offer value cells adapts to the range of values available in any behavioral context (Padoa-Schioppa 2009). In principle, range adaptation ensures an efficient representation. At the same time, range adaptation also poses a computational challenge, in the following sense. First, note that in W11—and in other decision models (Bogacz et al. 2006; Krajbich et al. 2010)—decisions are ultimately comparisons of firing rates. Now consider a session in which offers of *juice A* and *juice B* vary in ranges Δ*A* = [0, 3] and Δ*B* = [0, 6], respectively, and assume that the relative value is such that 1A = 2B. If decisions are made by comparing firing rates, the animal chooses *juice A* whenever the activity of offer value A cells is higher than that of offer value B cells. Now imagine that we run a second session in which the range of *juice A* remains unchanged (Δ*A* = [0, 3]) while the range of *juice B* is doubled (Δ*B* = [0, 12]). The activity of offer value B cells will adapt to the new value range. If decisions are made by comparing firing rates, range adaptation will inevitably induce a change of preferences in the new session such that 1A = 4B. Interestingly, framing effects described in behavioral economics show that preferences can depend on the behavioral context in ways qualitatively similar to that described here (Ariely et al. 2003; Savage 1972; Tversky and Kahneman 1981). At the same time, it would seem puzzling if preferences could be manipulated so easily and so arbitrarily by modifying the range of options. Indeed, an experiment in which monkeys chose between two juices in two subsequent blocks showed that relative values were fairly stable even when the ratio of offer value ranges (Δ*A*/Δ*B*) was varied by a factor of 2 (Conen K, Cai X, Padoa-Schioppa C, unpublished observations). These considerations raise one question: How can the network achieve (reasonably) stable preferences under varying ranges of offer values?
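The arithmetic of this thought experiment can be checked with a small sketch. It assumes, purely for illustration, that offer value cells encode quantity linearly over the full adapted range, that baseline activity is ignored, and that the decision is a bare comparison of the two firing rates; the function names and the 8 sp/s span are choices made here, not quantities taken from the model.

```python
def adapted_rate(quantity, range_max, rate_span=8.0):
    """Baseline-subtracted rate of a range-adapting offer value cell.

    The cell uses its full span (rate_span, in sp/s; an illustrative value)
    over the offered range [0, range_max], whatever range_max is.
    """
    return rate_span * quantity / range_max


def indifference_quantity_b(range_max_a, range_max_b, quantity_a=1.0, rate_span=8.0):
    """Quantity of B whose adapted rate equals the rate evoked by quantity_a of A."""
    rate_a = adapted_rate(quantity_a, range_max_a, rate_span)
    # Invert adapted_rate for juice B: rate = rate_span * q_b / range_max_b.
    return rate_a * range_max_b / rate_span


# First session: ranges [0, 3] for A and [0, 6] for B.
session_1 = indifference_quantity_b(3.0, 6.0)
# Second session: the range of B is doubled to [0, 12].
session_2 = indifference_quantity_b(3.0, 12.0)
print("quantity of B indifferent to 1A, session 1:", session_1)
print("quantity of B indifferent to 1A, session 2:", session_2)
```

Under these assumptions the indifference quantity is simply the ratio of the two range widths, so doubling the range of *juice B* doubles the apparent relative value, from about 2 to about 4.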

The issue of preference stability in the presence of range adaptation is closely related to that of nontrivial relative values (ρ > 1) discussed above. In essence, we propose that the network responds to the changes in value ranges by altering the synaptic efficacies between OV cells and CJ cells (*Eq. 19*). As previously shown, this synaptic plasticity can be achieved with a mechanism of Hebbian learning (Padoa-Schioppa and Rustichini 2014). Framing effects may be reproduced if this Hebbian learning lags range adaptation of OV cells.
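The compensation required of the synapses can be made explicit with a simplified calculation. The sketch assumes that the decision reduces to a comparison of the weighted inputs $\delta J \cdot r$, that offer value rates are baseline-subtracted and fully range adapted, and that $\Delta A$ and $\Delta B$ denote the widths of the two offer ranges; the symbols $r_A$, $r_B$, $r_{\max}$, $q_A$, and $q_B$ are notation introduced here, and none of this replaces the full network dynamics.

```latex
\begin{align*}
r_A &= r_{\max}\,\frac{q_A}{\Delta A}, \qquad
r_B = r_{\max}\,\frac{q_B}{\Delta B}
  && \text{(range-adapted, baseline-subtracted rates)} \\
\delta J_A\, r_A &= \delta J_B\, r_B
  && \text{(indifference: equal weighted inputs)} \\
\Rightarrow\; \rho \equiv \frac{q_B}{q_A}
  &= \frac{\delta J_A}{\delta J_B}\cdot\frac{\Delta B}{\Delta A}
  \quad\Longleftrightarrow\quad
  \frac{\delta J_A}{\delta J_B} = \rho\,\frac{\Delta A}{\Delta B}
\end{align*}
```

With $\Delta A = 3$, $\Delta B = 6$, and $\rho = 2$, the required ratio $\delta J_A / \delta J_B$ is 1. When $\Delta B$ doubles to 12, holding $\rho = 2$ requires the ratio to fall to 0.5. In this simplified picture, that rescaling is what the plasticity has to accomplish, and preferences shift for as long as it has not caught up with the adapted rates.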

#### Stable points of the dynamical system.

One important question concerns the number of steady states of the model. Wong and Wang (2006) provided a bifurcation diagram, in which the number of attractors and the working-memory regime were examined as a function of the input firing rate (μ_{0}) and the parameter *w*_{+}. Their analysis was done for the reduced version of the model (W2) and for coh = 0. Generating a bifurcation diagram for W11 is more difficult because the system is time dependent and high dimensional. As a first step, we conducted a series of simulations to examine the end state of the network for different sets of offer values (Fig. 9). For these simulations we set Δ*A* = Δ*B* = [0, 15] and δ*J*_{stim} = (1, 1), such that ρ = 1. For comparison, although *w*_{+} defined in W11 is not identical to *w*_{+} defined in W2, the value of *w*_{+} used here is comparable to values examined by Wong and Wang (2006). In contrast, stimulus currents are substantially higher in our case. Specifically, in our simulations we normally used Δ*J* = *J*_{AMPA,input}/*J*_{AMPA,ext,pyr} = 30 (see Table 1). In such conditions and for the data point at the center of Fig. 3*B*, the product *J*_{AMPA,input}*r*_{OV} is roughly equal to 100*J*_{AMPA,ext,pyr} (see *Eq. 19*). In contrast, in the original W11, the equivalent product was set to 40*J*_{AMPA,ext,pyr} when coh = 0.

Consider Fig. 9*B1*. In this scatterplot, *x*- and *y*-axes represent, respectively, the “final” activity of CJA and CJB cells as measured in the time window 400–600 ms after stimulus onset. Each symbol + represents one trial, and symbols are color coded according to the offer type; the color legend is indicated in Fig. 9*A1*. Figure 9, *B2* and *B3*, illustrate the same data focusing on a subset of trials. Several points can be noted. First, the network clearly separated between the two end points when the value difference was sufficiently high (purple, blue, and green symbols in Fig. 9, *B1* and *B2*). It did not clearly separate between the end points when offer values were close (Fig. 9*B3*). In separate simulations, we observed that extending the stimulus in time for 1 s or increasing the levels of noise in the system did not affect these results (not shown). The simulation in Fig. 9, *B1–B3*, was based on our normal parameters *w*_{+} = 1.75 and Δ*J* = 30. We then repeated the simulations using parameters *w*_{+} = 1.85 and Δ*J* = 30 (Fig. 9, *C1–C3*), and *w*_{+} = 1.75 and Δ*J* = 15 (Fig. 9, *D1–D3*). The results were qualitatively similar, although the separation for difficult decisions tended to increase when *w*_{+} was raised (Fig. 9*C2*) and to decrease when Δ*J* was reduced (Fig. 9*D2*).

The formulation of W2 for economic decisions and a more formal analysis of the steady states will be presented elsewhere.

#### The baseline activity of offer value cells.

As illustrated in Fig. 1*A*, offer value cells presented a baseline activity of ∼6 sp/s, which is smaller than but comparable to the value-related dynamic range (∼8 sp/s). Experimental work showed that this baseline activity does not depend on the value range, on the juice preference, or, for a given juice pair, on the relative value of the juices (Padoa-Schioppa 2009). As we examine the effects of introducing a baseline in the activity of OV cells, two premises are in order. First, the original W11 set the baseline equal to zero, a reasonable approximation since the baseline activity of MT cells is indeed low. Second, the presence of a substantial baseline dramatically changes the input to the network.

In a series of simulations we tested W11, introducing a realistic baseline. If we consider the situation in which all synaptic efficacies are balanced, the network is robust to the introduction of a baseline. In this condition, even using the same parameters as in the initial simulations, we obtained for the indifference function a straight line through the origin (not shown). Small adjustments to the parameters provided realistic profiles for CJ cells and for CV cells (Fig. 10, *A–F*). However, problems arise when one considers the fact that the relative value ρ between two goods should be free to assume any value (see *Imposing nontrivial relative values*) and/or the fact that (only) the value-dependent component of offer value cell activity adapts to the range of offer values (see *Range adaptation, context-dependent preferences, and Hebbian learning*). The approach used in the baseline-subtracted case, namely, to introduce an imbalance in the synaptic efficacies, no longer works. In other words, the indifference function becomes nonhomogeneous even if the synaptic imbalance is limited to the input (Fig. 10, *G–L*). This is because the synaptic efficacies δ*J*_{stim} multiply the entire activity of OV cells, including the baseline. As a consequence, when δ*J*_{stim,1} > δ*J*_{stim,2}, the network chooses zero quantities of *juice A* over small quantities of *juice B*. In conclusion, the baseline activity of offer value cells poses a challenge for W11. Possible ways to address this issue are discussed below.
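Why the baseline breaks homogeneity can be seen in a simplified calculation that ignores the network dynamics. The sketch below assumes that the decision reduces to comparing the two weighted inputs (synaptic factor times OV rate) and that the OV rate is a fixed baseline plus a linear, range-adapted term. The baseline (6 sp/s), dynamic range (8 sp/s), and synaptic factors (2, 1) echo values quoted in the text; the offer range [0, 10] and all function names are hypothetical.

```python
def ov_rate(quantity, range_max, baseline, rate_span):
    """Offer value cell rate: fixed baseline plus a range-adapted, value-dependent part."""
    return baseline + rate_span * quantity / range_max


def indifference_b_with_baseline(quantity_a, dj_stim=(2.0, 1.0), baseline=6.0,
                                 rate_span=8.0, range_max=10.0):
    """Quantity of B at which the weighted inputs dj_A * r_A and dj_B * r_B are equal."""
    dj_a, dj_b = dj_stim
    input_a = dj_a * ov_rate(quantity_a, range_max, baseline, rate_span)
    # Solve dj_b * (baseline + rate_span * q_b / range_max) = input_a for q_b.
    return (input_a / dj_b - baseline) * range_max / rate_span


# Without a baseline, the indifference line passes through the origin.
no_baseline_at_zero = indifference_b_with_baseline(0.0, baseline=0.0)
# With a baseline, zero A is "worth" a positive quantity of B.
with_baseline_at_zero = indifference_b_with_baseline(0.0)
# The slope is unchanged; only the intercept moves.
slope = indifference_b_with_baseline(1.0) - indifference_b_with_baseline(0.0)
print("intercept without baseline:", no_baseline_at_zero)
print("intercept with baseline:", with_baseline_at_zero)
print("slope with baseline:", slope)
```

In this toy calculation the intercept equals (δ*J*_{stim,1}/δ*J*_{stim,2} − 1) times the baseline divided by the rate gain per unit quantity, so it vanishes only when the baseline is zero or the synaptic factors are balanced.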

## DISCUSSION

We examined a biophysically realistic model previously proposed to describe the activity of area LIP during perceptual decisions, namely W11, and we tested the extent to which it could reproduce the activity recorded in the OFC during economic decisions. Our analysis represented a challenging test for the model for at least four reasons. *1*) Stimuli (i.e., offers) in economic decisions are intrinsically two-dimensional, whereas stimuli in perceptual decisions are one-dimensional. *2*) The relative value between two given goods is arbitrary. Thus to accommodate subjective preferences, the network must be flexible. *3*) Offer value cells undergo range adaptation but decisions should, in first approximation, not depend on the range of offer values. In other words, the network must be adaptable. *4*) The activity of neurons in the OFC only partially resembles that of neurons in area LIP, for which the model was designed. Specifically, in addition to offer value cells (analogous to MT) and chosen juice cells (analogous to LIP), there are chosen value cells, which have no known correspondent in perceptual decisions. Moreover, even the resemblance between chosen juice cells and neurons in LIP is weak, because the representation of chosen juice cells is categorical while that of LIP is continuous (spatial), and because chosen juice cells do not present the race to threshold and the working memory activity the model was originally designed to reproduce. Despite all these challenges, the spirit of this study was not to design a new model. Rather, we tested Wang's model as much as possible off the shelf, without modifying its structure or even adjusting its parameters. We did, however, adapt the network in two important ways. First, we strengthened the connections between OV cells and CJ cells (reflecting the fact that offer value cells and chosen juice cells are found in the same area). Second, we introduced mechanisms of synaptic plasticity to account for the required flexibility and adaptability of the model.

Several traits make W11 a particularly attractive model. First, the model is biophysically realistic. Elements in the extended, spiking network are neurons endowed with realistic synaptic connections and time constants, and the mean-field approach preserves the realism of the model. Second, the model is nearly without free parameters. In other words, the model is very constrained, especially when tested outside its original domain. Third, the model makes testable predictions on the excitatory/inhibitory nature of different neuronal populations and on their connectivity. These traits set W11 apart from other computational models of economic decisions including the drift-diffusion models, the mutual inhibition models, and the leaky competing accumulator model (Bogacz et al. 2006; Hare et al. 2011; Krajbich et al. 2010; Usher and McClelland 2001). Furthermore, the explicit inclusion of inhibitory interneurons in W11 provides an account for chosen value cells, which are not explained in more schematic models including the drift-diffusion model and the leaky competing accumulator model. Other studies examined Wang's model in the context of value-based decisions. In particular, Behrens and colleagues used the simplified W2 version to generate aggregate regressors for the analysis of MEG data (Hunt et al. 2012; Jocham et al. 2012). In this respect, we note that the tests run here were significantly more stringent because we matched each node in the network with a specific group of neurons, because the original analysis of neuronal data (Padoa-Schioppa and Assad 2006) tested a large number of variables and not only variables generated from the model, and because we reproduced a variety of empirical findings.

We found that W11 provides a remarkably accurate account for the activity of neurons in the OFC. The input node of the model (OV cells) can be identified with offer value cells. The model naturally generates binary decisions, and the activity profile of the output node (CJ cells) is fairly similar to that of chosen juice cells. Perhaps most surprisingly, the activity of inhibitory interneurons (CV cells) is very similar to that of chosen value cells (more on this below). In addition, W11 reproduces several phenomena related to the neuronal origins of choice variability, namely choice hysteresis, the predictive activity of chosen juice cells, and the activity overshooting of chosen value cells. We examined what changes in the network are necessary to ensure that relative values are arbitrary and to accommodate the fact that offer value cells undergo range adaptation. We found that these two requirements are addressed only when synaptic plasticity is introduced in the connections between OV cells and CJ cells. We also found that W11 falls short of the experimental data in at least two ways. First, the model includes only neurons with positive encoding (i.e., higher firing rate for higher values). In contrast, for each of the three variables, a substantial fraction of the neurons in OFC presents negative encoding (i.e., higher firing rate for lower values) (Padoa-Schioppa 2009, 2013). Second, the model can provide the flexibility necessary to accommodate any relative value, or it can handle a significant baseline activity in the input node (OV cells), but it cannot do both of these things at the same time. As discussed below, these limitations appear conceptually surmountable, although more theoretical work is necessary in this respect.

#### Chosen value signals and interneurons in the OFC.

One of the main results of this study is that interneurons of W11 were found to encode the chosen value. This observation was unforeseen and somewhat extraordinary. Interneurons were included in the progenitor models of W11 for biological realism and network stability (Amit and Brunel 1997)—not to reproduce empirical observations analogous to chosen value cells. Furthermore, chosen value cells have no known correspondent in perceptual decision tasks. In contrast, chosen value signals have been observed in numerous studies of value-based decisions and in multiple brain regions (Amemori and Graybiel 2012; Cai et al. 2011; Cai and Padoa-Schioppa 2012, 2014; Grabenhorst et al. 2012; Lau and Glimcher 2008; Lee et al. 2012; Padoa-Schioppa and Assad 2006; Roesch et al. 2009; Strait et al. 2014; Sul et al. 2010; Wunderlich et al. 2010). While it is clear that chosen value signals could inform a variety of mental functions including associative learning, visual attention, emotion, and others, the possible contributions of chosen value signals to economic decisions have remained mysterious. The present results suggest that chosen value cells in the OFC may be directly involved in the decision. This suggestion is bolstered by the fact that CV cells in the model reproduce the activity overshooting seen in chosen value cells. Importantly, the hypothesis that chosen value cells in the OFC participate in the decision does not imply that all chosen value signals in the brain are the signature of a decision process. Indeed, once computed and explicitly represented by a neuronal population, the variable chosen value could be transmitted broadly to other brain regions and thus facilitate various mental functions.

Considering the similarity between CV cells and chosen value cells from a different angle, W11 makes the strong prediction that all chosen value cells in the decision circuit are interneurons and that all chosen juice cells are pyramidal cells. Future experiments will test these predictions. Importantly, the present model provides an account for the decision circuit. However, OFC likely comprises additional neuronal populations that do not directly intervene in the decision but receive input from the decision circuit. Thus testing the predictions of the model will require distinguishing between neurons that contribute to the decisions and other neurons within OFC. Notably, the hypothesis that the variable chosen value is first computed in OFC and then transmitted to other brain regions implies that this variable is passed from the decision circuit to pyramidal cells in layers 5 and 6 of OFC, where cortico-cortical projections originate. In principle, NS pyramidal cells in W11 (whose activity resembles that of chosen value cells with low dynamic range) could serve that purpose, although alternate schemes are also possible.

In its current form, W11 presents two clear limitations—the fact that the model only includes neurons with positive encoding and the challenge posed by the baseline activity of offer value cells. Negative encoding is an intriguing phenomenon, partly because it is not observed in sensory systems (to our knowledge). More detailed neurophysiology work is necessary to establish the excitatory/inhibitory nature of neurons with negative encoding, whether these neurons are preferentially located in specific cortical layers, and whether they present specific patterns of connectivity. From a modeling perspective, the presence of cells with negative encoding can be viewed as a degree of freedom present in the empirical data but not used by the model. In principle, this degree of freedom could help resolve the challenge posed by the baseline activity in offer value cells, because two inputs encoding the same variable with opposite sign (excitatory and inhibitory) could be combined in a way that reduces or even eliminates any baseline. In the light of these considerations, we view W11 as a benchmark and a starting point for biophysically realistic models of economic decisions. The next steps are to examine empirically the excitatory/inhibitory nature of different groups of cells in OFC, to assess possible differences between cortical layers, and to establish the actual connectivity between the different groups of neurons (a set of challenging tasks!). The results of these enquiries should then inform new and more accurate neuro-computational models.
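One way to picture how opposite-sign inputs could cancel a baseline is the following back-of-the-envelope calculation. It is a hypothetical scheme, not something tested in the simulations: $r_+$ and $r_-$ denote the rates of positive- and negative-encoding offer value cells for a quantity $q$, $b_\pm$ and $k_\pm$ are their baselines and gains, and $J_+$ (excitatory) and $J_-$ (inhibitory) are the efficacies with which they drive a CJ pool. All of these symbols are introduced here for illustration.

```latex
\begin{align*}
r_+ &= b_+ + k_+\, q, \qquad r_- = b_- - k_-\, q \\
I &= J_+\, r_+ - J_-\, r_-
   = \underbrace{\left(J_+ b_+ - J_- b_-\right)}_{\text{net baseline}}
   + \left(J_+ k_+ + J_- k_-\right) q
\end{align*}
```

The net baseline vanishes when $J_+ b_+ = J_- b_-$, whereas the value-dependent terms add. Under that condition, rescaling the pair of efficacies by a common factor would change the relative value without introducing an offset in the indifference function.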

To conclude, Wang's model provides a biologically credible account for the neuronal mechanisms of economic decisions. In the model, decisions are generated by three groups of cells whose activity closely resembles the activity of the three groups of neurons previously found in the OFC. This close resemblance does not demonstrate that economic decisions take place in the OFC, or that the decision mechanisms are based on recurrent excitation and pooled inhibition. However, the model does provide an important proof of concept that economic decisions could emerge from the activity of neurons identified in this area. In this framework, W11 suggests that chosen value cells are a key component of the decision process. The model makes several nontrivial predictions that require further empirical testing. Last but not least, Wang's model represents an important bridge between neuroscience and economics, providing a platform to investigate the implications of neuronal data for economic theory. We will examine some of these implications in future work.

## GRANTS

This research was supported by the National Institute on Drug Abuse (Grant R01-DA-032758 to C. Padoa-Schioppa) and the National Science Foundation (Grant SES-1357877 to A. Rustichini).

## DISCLOSURES

No conflicts of interest, financial or otherwise, are declared by the author(s).

## AUTHOR CONTRIBUTIONS

Author contributions: A.R. and C.P.-S. conception and design of research; A.R. and C.P.-S. performed experiments; A.R. and C.P.-S. analyzed data; A.R. and C.P.-S. interpreted results of experiments; A.R. and C.P.-S. edited and revised manuscript; A.R. and C.P.-S. approved final version of manuscript; C.P.-S. prepared figures; C.P.-S. drafted manuscript.

## ACKNOWLEDGMENTS

We thank Nicolas Brunel, Xinying Cai, David Levine, John Murray, Xiao-Jing Wang, and KongFatt Wong-Lin for helpful discussions, and Alberto Bernacchia, John Murray, and Xiao-Jing Wang for comments on the manuscript. We also thank Xiao-Jing Wang and KongFatt Wong-Lin for sharing their MATLAB code.

- Copyright © 2015 the American Physiological Society