JN Fuel your research with LabChart
HOME HELP FEEDBACK SUBSCRIPTIONS ARCHIVE SEARCH TABLE OF CONTENTS
 QUICK SEARCH:   [advanced]


     


J Neurophysiol 96: 872-890, 2006. First published April 19, 2006; doi:10.1152/jn.00079.2006
0022-3077/06 $8.00
This Article
Right arrow Abstract Freely available
Right arrow Full Text (PDF)
Right arrow Corrigendum and corrected article
Right arrow All Versions of this Article:
96/2/872    most recent
00079.2006v2
00079.2006v1
Right arrow Alert me when this article is cited
Right arrow Alert me if a correction is posted
Right arrow Citation Map
Services
Right arrow Email this article to a friend
Right arrow Similar articles in this journal
Right arrow Similar articles in PubMed
Right arrow Alert me to new issues of the journal
Right arrow Download to citation manager
Citing Articles
Right arrow Citing Articles via HighWire
Right arrow Citing Articles via ISI Web of Science (6)
Right arrow Citing Articles via Google Scholar
Google Scholar
Right arrow Articles by Huys, Q. J. M.
Right arrow Articles by Paninski, L.
Right arrow Search for Related Content
PubMed
Right arrow PubMed Citation
Right arrow Articles by Huys, Q. J. M.
Right arrow Articles by Paninski, L.

INNOVATIVE METHODOLOGY

Efficient Estimation of Detailed Single-Neuron Models

Quentin J. M. Huys1,*, Misha B. Ahrens1,* and Liam Paninski2

1Gatsby Computational Neuroscience Unit, University College London, London, United Kingdom; and 2Department of Statistics and Center for Theoretical Neuroscience, Columbia University, New York, New York

Submitted 24 January 2006; accepted in final form 14 April 2006


    ABSTRACT
 TOP
 ABSTRACT
 INTRODUCTION
 METHODS
 RESULTS: APPLICATIONS TO MODEL...
 DISCUSSION
 APPENDIX
 GRANTS
 ACKNOWLEDGMENTS
 REFERENCES
 
Biophysically accurate multicompartmental models of individual neurons have significantly advanced our understanding of the input–output function of single cells. These models depend on a large number of parameters that are difficult to estimate. In practice, they are often hand-tuned to match measured physiological behaviors, thus raising questions of identifiability and interpretability. We propose a statistical approach to the automatic estimation of various biologically relevant parameters, including 1) the distribution of channel densities, 2) the spatiotemporal pattern of synaptic input, and 3) axial resistances across extended dendrites. Recent experimental advances, notably in voltage-sensitive imaging, motivate us to assume access to: i) the spatiotemporal voltage signal in the dendrite and ii) an approximate description of the channel kinetics of interest. We show here that, given i and ii, parameters 1–3 can be inferred simultaneously by nonnegative linear regression; that this optimization problem possesses a unique solution and is guaranteed to converge despite the large number of parameters and their complex nonlinear interaction; and that standard optimization algorithms efficiently reach this optimum with modest computational and data requirements. We demonstrate that the method leads to accurate estimations on a wide variety of challenging model data sets that include up to about 104 parameters (roughly two orders of magnitude more than previously feasible) and describe how the method gives insights into the functional interaction of groups of channels.


    INTRODUCTION
 TOP
 ABSTRACT
 INTRODUCTION
 METHODS
 RESULTS: APPLICATIONS TO MODEL...
 DISCUSSION
 APPENDIX
 GRANTS
 ACKNOWLEDGMENTS
 REFERENCES
 
Recent electrophysiological investigations have yielded an ever deeper insight into dendritic computation and dynamics, shedding light on fundamental issues such as the interaction of synaptic inputs in active dendrites (Roth and Häusser 2001Go), the role of shunting inhibition (Borg-Graham et al. 1998Go), backpropagating action potentials (Larkum et al. 1999Go; Roth and Häusser 2001Go; Stuart and Sakmann 1994Go), and active dendritic channels (Frick et al. 2001Go; Magee 1998Go; Magee and Johnston 1995Go; Reyes 2001Go). These experimental results have been supplemented by accurate, detailed biophysical models of single neurons that have allowed us to probe the computational purposes of such features and to address specific questions about the neural input–output (IO) function (Fellous et al. 2003Go; Koch 1999Go; Koch and Segev 2000Go; London and Häusser 2005Go; Poirazi et al. 2003bGo; Wolfart et al. 2005Go). Here, we attempt to capitalize on the advances in voltage-dye imaging (Baker et al. 2005Go; Djurisic and Zecevic 2005Go) and propose a simple statistical method for automatically building large, detailed compartmental models of single neurons, circumventing some of the traditional complexities in building these models "by hand."

Our knowledge about the input to individual cells and about the relevant part of the output of the cell is at present still insufficient to allow direct inference of the neural IO function. However, the knowledge that passive properties of neurons are accurately described by the cable equation (Baldi et al. 1998Go; Bell and Craciun 2003Go; Bhalla and Bower 1993Go; Dayan and Abbott 2001Go; Jack et al. 1975Go) represents very rich information that can significantly constrain our search for the true neural IO function. Our approach to estimating neural properties from data will therefore consist in postulating a model class (a large, detailed compartmental model) that approximates the cable equations to arbitrary precision and then seeking to constrain the parameters in the model by data.

Typically, there is a major trade-off between realism and tractability when constructing large compartmental models: the more biophysically accurate and interpretable the model, the harder the computational task of setting the model’s parameters becomes, as the number of (nonlinearly interacting) parameters increases [into the thousands in biophysically accurate compartmental models, although this number is dramatically reduced using experimentally supported heuristics (Poirazi et al. 2003aGo; Schaefer et al. 2003bGo)]. The challenging nature of this high-dimensional, simultaneous parameter estimation problem is well known (e.g., Prinz et al. 2003Go) and to a large extent arises from highly nonlinear objective functions [e.g., the percentage of correctly predicted spike times (Bhalla and Bower 1993Go; Jolivet et al. 2004Go; Keren et al. 2005Go)] and the abundance of nonglobal optima in the large parameter space (Goldman et al. 2001Go; Vanier and Bower 1999Go).

Here we present a simple approach to estimating single neuron properties that is both computationally tractable and biophysically detailed. Our goal is to simultaneously infer, for each compartment of a large multicompartmental model, 1) the concentration of membrane channels; 2) intercompartmental conductances; 3) the time-varying synaptic conductances; and 4) other parameters such as the channel reversal potential, the membrane capacitance, and the noise level. To achieve this demanding goal we must make several assumptions; in particular, we assume 1) observability of the spatiotemporal voltage signal at several (many) positions on the dendritic tree [e.g., by voltage-sensitive imaging methods (Antic et al. 1999Go; Baker et al. 2005Go; Djurisic and Zecevic 2005Go)] and 2) a good understanding of the kinetics of the channels for which we seek to determine the densities.

The key insight of the proposed method is the linear relationship between dynamic functions of the observed voltage (such as the channel open probabilities) and the transmembrane currents, both of which are readily computed once the transmembrane voltage is known (see also Morse et al. 2001Go; Wood et al. 2004Go). The estimation of the parameters of proportionality between these can therefore be recast into a simple nonnegative linear regression problem. The linear relationship is of great value. First, it implies a unique, global optimum. Second, finding this optimum is an extremely well studied problem for which powerful computational tools are readily available (Boyd and Vanderberghe 2004Go; Press et al. 1992Go). In this paper we will give examples of the method’s performance on an extensive set of model data and plan to apply this method to in vitro recordings in the future.

Several subtleties, both of a computational and physiological nature, are worth noting. First, because of the large numbers of parameters, the optimization problem, although convex, is nevertheless very high dimensional (we will here infer about 104 parameters simultaneously, which is roughly two orders of magnitude more than previously feasible); fortunately, certain decomposition methods apply that allow us to break the problem into many smaller, tractable subproblems (Platt 1998Go). In addition, it turns out that the problem of estimating the time-varying synaptic input to a given compartment is underconstrained: because we will be inferring several temporal series of conductance values (one for each type of synapse impinging on any particular compartment), given a voltage trace of the same length, the ratio of data to unknown variables will be <1. Similar issues arise when attempting to determine the relative densities of channels with very similar kinetics and reversal potentials. We discuss regularization strategies that have proven effective for controlling these problems (including methods for providing confidence intervals around our estimates). We will also show how the present method naturally reveals information not only about individual, but also about groups of channels and their joint effect on a neuron’s behavior.

The aim of the present extended version is to 1) report the methods in intuitive detail to allow efficient implementation; 2) extend the scope and scale of both the methods and the simulations; and 3) provide an in-depth analysis of their performance. We previously reported the main ideas in abbreviated form (Ahrens et al. 2006Go).


    METHODS
 TOP
 ABSTRACT
 INTRODUCTION
 METHODS
 RESULTS: APPLICATIONS TO MODEL...
 DISCUSSION
 APPENDIX
 GRANTS
 ACKNOWLEDGMENTS
 REFERENCES
 
Basic setup

Biophysically accurate models of individual neurons are typically formulated as compartmental models—a set of first-order coupled differential equations that form an arbitrarily accurate spatially discrete approximation to the standard cable equations (Dayan and Abbott 2001Go). Modeling the cell under investigation in this discretized manner, the voltage Vx(t) in compartment x can be described by

Formula 1(1)
Here the products ai,xJi,x(t) represent the time-varying membrane and axial currents, whereas Ix(t) is an externally injected (electrode) current and Nx,t is current noise scaled by {sigma}x. Dropping the subscript x for notational clarity when possible, the terms aiJi(t) will come to represent three types of currents in each compartment (see also Fig. 1)

where Es and Ec are the synapse and channel reversal potentials, respectively1 ; gs(t, t') is the conductance time course that synapse s would have contributed if it had been activated at time t'; and similarly gc(t) is the time course of channel c’s conductance. The main insight in this paper is that the total transmembrane current C(dV/dt) is linear in the "current shapes" Ji(t) and that the proportionality constants ai can be inferred by linear regression if we have access to the Ji(t). For each of the three current types mentioned above, the proportionality constants represent biophysically highly relevant variables whose estimation has been the object of extensive research. Among others, the parameters we will infer will include


Figure 1
View larger version (43K):
[in this window]
[in a new window]
 
FIG. 1. Setup. The neuron is represented, as usual, as a set of compartments (boxes), each of which has its own voltage Vx and its own set of channel densities contributing Ichannels. They are linked by intercompartmental conductances fx,y (contributing Iintercompartmental) and receive synaptic input (contributing Isynaptic).

 
The second key point in this paper is that knowledge of the voltage trace [V(t)] and the channel kinetics [gs(t, t') and gc(t)] together also grant us access to both the current shapes Ji(t) and the total transmembrane currents C(dV/dt), and thus permit us to accurately estimate the parameters we are interested in.

Clearly, any recording that gives us access to Vx(t) gives us direct access to dVx/dt by a differentiation. Such scenarios include voltage-dye recordings for large, electrotonically extensive cells and whole cell patch-clamp recordings for electrotonically compact cells. In electrotonically compact cells, voltage-clamp recordings grant us direct access to the transmembrane current, relieving us from the need to differentiate the data V(t). Let us now show that voltage recordings also give us access to the Ji(t) (assuming knowledge of the kinetics) and allow us to efficiently estimate the parameters ai.

Special case: single-compartment, passive neuron

We illustrate the procedure for a single compartment with only a leak channel2 , which is described by

Formula 2(2)

Assuming also that we know V(t) (and momentarily also EL and C, although this is readily relaxed; cf. APPENDIX A), we can evaluate both the terms JL(t) = [ELV(t)], which here is just the driving force, and the total transmembrane current C(dV/dt). gL is now the unknown scaling factor that relates JL(t) to the total transmembrane current. We seek to set it such that the difference between the total transmembrane current observed and the sum of the currents on the right-hand side (RHS) of Eq. 2 match as closely as possible

Formula 2
where arg minz h(z) stands for the value of z that achieves the minimum of some function h(z). In the present case, this is unique. We note that this choice of gL corresponds to the maximum-likelihood (ML) estimate under Gaussian white noise N(t). For notational clarity, without loss of generality, we are neglecting the capacitance in vectorized formulations (cf. APPENDIX A).

Definition of current shapes; vector parameterization

More generally, knowledge of kinetics gives us access to J(t) when we condition on (i.e., assume knowledge of) the voltage trace. We will illustrate this for active conductances, in which the kinetics are voltage dependent, and for synapses, in which the kinetics are voltage independent but where the parameterization is somewhat more complex.

VOLTAGE-GATED CONDUCTANCES. For the active conductances, knowledge of the membrane voltage and channel c’s kinetics gives direct access to the current shape Jc(t) of that channel. The shape of a channel’s current contribution, the "current shape," is given by the product of that channel’s open fraction gc(t) (which is a function of past voltage and time only) and the driving force Ec V(t)

Formula 2
The open fraction gc(t) is computed straightforwardly once the voltage and the kinetics are known. Assume for example a Hodgkin–Huxley (HH) Na+ channel whose open fraction is gc(t) = m(t)3h(t) (Hodgkin and Huxley 1952Go). Both m(t) and h(t) are given by equations of the type

Formula 3(3)
where V(t) enters as a forcing term through m{infty}(V), and through {tau}m (V), which together completely define the kinetics of the channel. Given V(t) and knowledge of m{infty} and {tau}m (the channel kinetics), we can directly compute m(t) and h(t), and thus gc(t) and therefore the current shape Jc(t) that channel may contribute. [Note that in typical modeling studies V(t) and the channel variables m(t) and h(t) are evolved together as a set of coupled differential equations, but that knowledge of V(t) here decouples these equations.] The total discretized channel current is given by summing over the Nc distinct channels present in the membrane

Formula 4(4)
and we can again identify each column of the channel shape matrix Jc with the current shape Jc(t) that a particular channel can contribute. Each component of Formula 4 is the corresponding proportionality constant Formula 4c, the membrane density of channel c.

Figure 2 illustrates this setting. The top panel in gray shows the voltage trace V(t), the only observed data here. From it we derive, in a deterministic fashion, the current shapes on the left and the derivative of the voltage on the right (in gray boxes). dV/dt now has to be matched by a weighted sum of the current shapes. The weights correspond to the parameters being inferred (in dashed black box) and are constrained to be positive. The case for synaptic inputs w(t) will be elucidated shortly.


Figure 2
View larger version (17K):
[in this window]
[in a new window]
 
FIG. 2. Illustration of the procedure when only access to the voltage is granted. On gray background: data. Deterministic functions of the data are in solid gray boxes. Inferred parameters are in dashed box. Top voltage trace: only observed data. Gray boxes: current shapes J(t) for channels and synapses (in mV, given that the proportionality constant is in mS/cm2) to be weighted and summed to match dV/dt. Each of the current shapes is a deterministic function of the voltage V(t) and the (known) kinetics. Top 3 panels in the left gray box show the channel current shapes Jc(t) for Hodgkin–Huxley (HH) Na+, K+, and leak channels. Bottom 2 panels: current shapes arising from input spikes at various times, each drawn in a different shade of gray. At discretization level T, there are T such possible contributions for each synapse, one for each time at which a spike could have occurred. Here one excitatory (e.g., glutamatergic) and one inhibitory (e.g., GABAergic) synapse is shown.

 
Figure 3, on the other hand, illustrates the case when a voltage is imposed by voltage clamp. We then have experimental access to both the voltage and the transmembrane current. The voltage is still used to derive the current shapes deterministically. These are then weighted and summed to reproduce the negative of the current injected through the electrode, rather than the derivative of the voltage.


Figure 3
View larger version (29K):
[in this window]
[in a new window]
 
FIG. 3. Illustration of the procedure applied to voltage-clamp scenarios, when access to both voltage and current is granted. On gray background: data. Deterministic functions of the data are in solid gray box. Inferred parameters are in dashed box. Top: voltage trace. Gray box: current shapes (in mV, given that the proportionality constant is in mS/cm2) to be weighted and summed to match the derivative of the voltage plus the injected current. Three panels in the gray box (left) show, respectively, the HH Na+, K+, and leak current shapes.

 
LIGAND-GATED CONDUCTANCES. For the synaptic conductances, assuming e.g., {alpha}-amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid (AMPA)–like synaptic kinetics, i.e., instantaneous rise and fast, exponential decay, turns the time-varying synaptic conductance gs(t) into a convolution of the synaptic input ws(t)

Formula 5(5)
More generally, for synapses with slower rise kinetics [e.g., N-methyl-D-aspartate (NMDA)], we would convolve the input signal ws(t) with an elementary conductance waveform possessing a correspondingly slower rise time, such as an alpha function (cf. VOLTAGE-DEPENDENT SYNAPTIC INPUT: NMDA).

By discretizing the time series, the convolution in Eq. 5 can be rewritten as a multiplication with a convolution matrix

Formula 5
which makes explicit the parameterization we use: every point in time t' at which a synapse could be active is ascribed its own, independent parameter ws(t'). The conductance time course (for all times t) corresponding to activating a synapse at time t' is in this case given by gs(t, t') = {Theta}(t t')e–(tt')/{tau}s, where {Theta}(tt') is the Heaviside function and zeroes out anything before time t'. The respective current shape is then given by

Formula 5

Synaptic current shapes for inputs at various times are illustrated in Fig. 2. The total current shape from one synapse is the weighted sum of the current shapes arising from activations of the synapse at different times t'

Formula 6(6)
where diag (·) stands for a diagonal matrix with the vector EsV on the diagonal and implements an elementwise product of the vectors (EsV) and Kws. We can identify the ith column of synaptic shape matrix Jsyn with the current shape Ji(t) as corresponding to the ith component of the vector ws. A synaptic input at time t' and with weight ws(t') thus contributes a current of the shape of column t' of J to the RHS of Eq. 1. To sum up, once again, we may write Isyn as a weighted sum of known terms, with the weights ws(t) exactly the parameters we seek to estimate. For each spike time, the current shape is, as before, a function only of the data but not of the parameters we seek to estimate.

At a discretization level that leads to a voltage vector of length T, there are T parameters to estimate for each type of synaptic kinetics included. Thus if there is one synapse per compartment, there are as many parameters to infer as data points are available. For more synapses per compartment, the data/parameter ratio is <1.

The gray box in Fig. 2 also illustrates how the total synaptic current is made up of the individual current shapes in a single compartment. In the bottom two plots, current shapes for an excitatory and an inhibitory synapse are shown. Curves in shades of gray correspond to different times t' of synaptic activity. Thus the first curve simply looks like an exponential because the driving force at the times of nonzero conductance right after such an early synaptic input is constant. However, for an input spike right before the action potential, the current shape bears the effect of the change in driving force during the action potential. Although here we have considered only voltage-independent synapses, a combination of the channel and synaptic cases can be used for voltage-dependent (e.g., NMDA) synapses (see also VOLTAGE-DEPENDENT SYNAPTIC INPUT: NMDA).

Inference

Given Eqs. 6 and 4 and a similar equation for the intercompartmental conductances, we can infer a = {Formula 5c, ws, f} by linear regression. To see this, we concatenate all the shape matrices and the parameter vectors and write

Formula 7(7)
where Formula 7 is a vector of length T with elements Formula 7t = [V(t + dt) – V(t)]/dt, a is a vector containing all the parameters {Formula 7c, ws, f} we want to infer, and Nt = Formula 7{varepsilon}, where {varepsilon} is unit variance, independent Gaussian noise. A solution to this linear equation can be written as a constrained optimization

Formula 8(8)
where Formula 8 the inequality constraints stem from the nonnegative nature of the parameters (note importantly that all the parameters we infer are directly biophysically interpretable). The Hessian H = JTJ, whereas f = 2JTFormula 8.

The optimization in Eq. 8 is jointly quadratic in all the parameters except {sigma}, to which it is indifferent. There are no nonglobal optima (the Hessian is positive semidefinite) and we can use well-analyzed quadratic programming methods to find the optimum under the nonnegativity constraints, which act as linear constraints on a here. Furthermore, performing the quadratic minimization in Eq. 8 is equivalent to assuming that the noise N(t) is Gaussian and white and maximizing the likelihood of the data given a setting of the parameters, i.e., âML is the maximum likelihood estimate (MLE) of the parameters under a Gaussian noise model.

Given {Formula 8c, gs, f}, the usual MLE for {sigma} in turn is readily (and analytically) computed: Formula 8ML is the root-mean-square error in the predicted current, Formula 8ML2 = (1/T) {sum}t [Formula 8(t) {sum}i âiJi(t)]2, where the numerator gives the sum-squared error of our estimate of Formula 8(t) and the denominator normalizes by the length T of the observed data trace.

Simulations

All simulations were carried out using MATLAB. Sample code for some of the simulations is available at http://www.gatsby.ucl.ac.uk/~qhuys/code.html.


    RESULTS: APPLICATIONS TO MODEL DATA
 TOP
 ABSTRACT
 INTRODUCTION
 METHODS
 RESULTS: APPLICATIONS TO MODEL...
 DISCUSSION
 APPENDIX
 GRANTS
 ACKNOWLEDGMENTS
 REFERENCES
 
In the following we explore inference of several combinations of these parameters to test the validity, efficiency, and accuracy of the proposed method. Initially, we will infer maximal channel conductances from a single compartment or multiple compartments. Then, we will analyze inference of synaptic inputs in a passive membrane and finally combine inference of synaptic and channel conductances.

Channel densities Formula 8c

For illustrative purposes, let us first analyze the inference of channel densities in a single compartment when we know the exact channel descriptions. We will then relax this assumption to the case in which we do not know the identity of the channels present. Finally, we will infer channel densities along a multicompartmental structure.

EXACT KNOWLEDGE OF CHANNEL IDENTITIES AND KINETICS. Suppose we are given the noiseless voltage trace V(t) from a single compartment’s response to a current input I(t) and know both the dynamics of all ion channels present in that compartment and the input I(t) applied to it. We now seek to infer the channel densities Formula 8c of these different channels from thevoltage trace. The equation describing this case is

Formula 9(9)
where Ec is a channel’s (known) reversal potential and C = 1 µF/cm2. Given V(t), we know the open fractions gc(V, t) because these are only a function of t and the past voltage history V(t') {forall}t' ≤ t; e.g., gc(V, t) = m(V, t)h3(V, t) with both gates m and h given by solutions to equations of the form of Eq. 3. The only unknowns in Eq. 9 are the Formula 9c and the capacitance C. J is now a matrix that has entries Jt,c = gc(V, t)[V(t) – Ec] and Jt,I = I(t) and gc is a vector with entries Formula 9c/C and 1/C. Dropping the subscript c, the regression is then formulated as follows

Formula 10(10)
In this noiseless case the regression provides highly accurate estimates and correctly recovers the true channel concentrations in the spiking model compartment (C = 1 µF/cm2, gNa = 120 mS/cm2, gK = 36 mS/cm2, and gleak = 3 mS/cm2; data not shown).

UNCERTAINTY IN CHANNEL IDENTITIES AND KINETICS. We now relax the requirement of exact knowledge of both channel kinetics and identities because we will not usually have knowledge of the exact kinetics of all channels present in each compartment. Rather than using only the true kinetics of the compartment’s channels as in the preceding section, we fit a more powerful model containing many different channel kinetics (including the true) in the hope that only those channels truly present in the membrane will be assigned nonzero conductances and that the data will constrain even this more powerful model. The true channel kinetics included were as before of Hodgkin–Huxley type. To test the selectivity of the estimation procedure, we fitted additional candidate channels from Poirazi et al. (2003a)Go, Mainen and Sejnowski (1996)Go, and Safronov et al. (2000)Go to the same data as used in the previous section. Figure 4A shows the data and Fig. 4B the voltage derivative to be matched by the summed, weighted current shapes. The inferred densities are shown to match the true ones in Fig. 4C. The actual nonzero weights by which the current shapes were multiplied were ainput current = 1/C = 1 cm2/µF; aNa = gNa/C = 120 mS/µF; aK = gK/C = 36 mS/µF; and aleak = gleak/C = 3 mS/µF, from which the densities are easily derived. Zero or nonzero densities were inferred for the channels not present during the generation of the voltage trace (Fig. 4).


Figure 4
View larger version (27K):
[in this window]
[in a new window]
 
FIG. 4. Inferring channel densities in a single compartment with uncertainty about channel identity. A: voltage trace V(t) (data). B: voltage derivative dV/dt in black and weighted sum of currents {sum}i aiJi(t) in gray. C: true and inferred channel densities (gray); Monte Carlo error bars (black; see APPENDIX B). D: current shapes. From top to bottom: input current shape; HH Na+; HH K+; leak; pyramidal Na+ and K+ (Poirazi et al. 2003aGo); pyramidal Na+ and K+ (Safronov et al. 2000Go).

 
The solution to our problem formulation is given by the weighted sum of current shapes that most closely reproduces the observed current. Because we assume that channel densities are constant over time, we can expect to collect enough data to constrain even the more powerful model: in the example just mentioned, the same data that constrained the inference with exact knowledge of the channels sufficed to constrain the more powerful model with an extended channel library. However, this is only the case when the channel kinetics lead to sufficiently different current shapes. Had the channels we were trying to distinguish between been much more similar, we would not have expected such a simple answer. In general, the closer the kinetic schemes we try to distinguish, the more data will be needed.

More data, however, will not distinguish between channels whose linear combination jointly well accounts for the data. To see this more clearly, consider an artificial case where we are trying to simultaneously determine the densities of two channels g1 and g2, which happen to have identical kinetics. Clearly, only the sum of the conductances assigned to these two instances is of importance, whereas the difference |Formula 101Formula 102| is irrelevant. Figure 5 illustrates this scenario. Whenever there is such functional equivalence between kinetic schemes, there is irreducible uncertainty with respect to different combinations of channels and the data cannot discriminate between them. Looking back at the formulation of the problem in Eq. 8, this type of uncertainty corresponds to zero (or more generally, near-zero) eigenvalues of the Hessian H. If we have two very similar channels, two columns of the matrix J will approximately be equal (or proportional) and H will be near singular. The smallest eigenvalue of H will correspond to the longest axis of the now very elongated quadratic bowl aTHa, i.e., to the direction in parameter space along which the data least constrain the parameters.


Figure 5
View larger version (11K):
[in this window]
[in a new window]
 
FIG. 5. Uncertainty resulting from equal or similar channel formulations. If two equal channels with densities Formula 41 and Formula 42, respectively are fitted, only the sum of the two densities Formula 41 + Formula 42 is relevant, and there is irreducible uncertainty along the line. More generally, the quadratic bowl aTHa will be elongated along this direction. Largest eigenvalues of H thus correspond to the most well constrained directions in the space of channel densities.

 
Because the eigenspectrum of the Hessian H will be determined both by the similarity of the kinetic schemes and by the appropriateness of the data in differentiating between them, it will be a good indicator from which to derive error bars as measures of certainty. Nota bene, however, that the eigenvalues will be informative about certainty along eigenvectors of H, and that these eigenvectors will not generally line up with the axes along which only a single channel density changes. Eigenvalues will measure how well combinations of channels are constrained by the data, rather than individual channel densities. Figure 6 illustrates this point. Changes along certain relative channel densities have a considerable effect on the behavior of the neuron and are well constrained by the data. For example, in Fig. 6 we recover the intuitive result that changing the balance of Na+ to K+ channels has a substantial effect on the neuron’s behavior, whereas changing the relative concentrations of very similar channels has very little impact on either the objective function aTHa or on the behavior of the cell. Thus this method allows us to ascertain the channel combinations that are most and least constrained by the data, and therefore relevant or irrelevant to the measured neural behavior, respectively.


Figure 6
View larger version (13K):
[in this window]
[in a new window]
 
FIG. 6. Eigenvectors of the matrix H for a single compartment with 23 different channels. A: most-constrained (amax). B: least-constrained (amin) direction in parameter space. From left to right these relate to the densities of different Na+ and K+ channels. Rightmost bar relates to the leak channel (L). Whereas the eigenvector in A broadly exchanges Na+ channels for K+ channels, the eigenvector in B exchanges some K+ channels for (similar) other K+ channels. Voltage trace of the model neuron after perturbation by amax (C) and that by amin (D). Solid line: perturbed model; dotted line: original model. For the perturbed models, the parameters were set to a + {varepsilon}amax or a + {varepsilon}amin, respectively, with equal {varepsilon}. Sum of all absolute conductance changes was small (roughly 3%). This analysis recovers the intuitive result that perturbing the balance of Na+/K+ channels affects the neuron more strongly than does changing the relative densities of 2 highly similar Na+ or K+ channels; more specifically, the least-constrained eigenvector measures the balance of very similar K+ or Na+ channels, whereas the most-constrained eigenvector measures the balance between Na+ and K+ channels in the compartment.

 
Finally, we seek a measure of confidence in the estimates of densities of individual channels. Assuming as we have so far, that the noise is Gaussian and white, we can use Monte Carlo methods such as importance sampling (Press et al. 1992Go) (or for low-dimensional problems as in Fig. 4 even straightforward rejection sampling) to obtain estimates of the error bars for individual channels (see APPENDIX B for details).

INFERRING CHANNEL CONDUCTANCES IN A MULTICOMPARTMENTAL MODEL. Next, we extend the method to a multicompartmental model, described by

Formula 11(11)
which is the same as before, but now including current flow between connected compartments. The intercompartmental conductances are fx,y and we minimize a similar vectorized equation as above (see Eq. 8). Based on biophysical constraints, we require fx,y = fy,x because the resistance to current flowing in one direction should be equal to the resistance to the current flowing in the opposite direction if the compartments are of equal size. Instead of constraining the optimization (which would entail an additional set of linear constraints on a), we define a new set of parameters, consisting only of the nonzero fx,y for which x > y, and optimize over them instead.

Because of the small mutual influence between compartments that are spatially well separated, the inference is amenable to a computational decomposition speed-up method described in APPENDIX C, and is thus quite easily extensible to even larger spatial structures, including thousands of compartments.3 Thus this method could potentially permit efficient inference of channel distributions across dendritic trees.

To illustrate the performance of the method, we randomly generated cells of 50 or 1,000 compartments, as shown in Fig. 7. A known squared sinusoidal current [I(t) {propto} sin2 ({gamma}t)] was injected in the soma and the voltage from all compartments was recorded. When the true transmembrane current is known, as would be the case if each compartment’s voltage had been observed at an arbitrarily high sampling rate, all parameters are accurately recovered with only 10 ms of data, even in a nonspiking model (Fig. 8 shows this for the 1,000-compartment neuron in Fig. 7). Figure 8E shows the entire trace from 100 compartments.


Figure 7
View larger version (13K):
[in this window]
[in a new window]
 
FIG. 7. Examples of the randomly generated cell morphologies used in this subsection and in the next 2 subsections. Large cell on the left has 1,000 compartments; the small cells on the right each have 50. In all cases, compartment n was attached to compartment n – 1 with probability 1/2, and with probability 1/2 to a random (uniformly chosen) other compartment.

 

Figure 8
View larger version (14K):
[in this window]
[in a new window]
 
FIG. 8. Inference of varying channel densities (all in mS/cm2) in the large, electrotonically extended neuron model of Fig. 7 with 1,000 compartments (each of length L = 16 µm, corresponding to intercompartmental conductances fxy = 200 mS/cm2) when access to both voltage and current are given. A (known) squared sinusoidal driving current was injected in the first compartment (soma, thick trace in E) only, and negligible current noise ({sigma} = 100 µA/cm2) was injected into each compartment independently. All parameters are recovered with high accuracy. Each compartment contained Na+, K+, and leak channels at different densities. AC: Na+, K+, and leak channel densities, respectively, each dot representing one compartment. D: intercompartmental conductances. E: voltage traces from 100 randomly selected compartments. Note that only 10 ms of data are used here, that spikes were observed in only a minority (16%) of compartments, and that many compartments show very little voltage modulation.

 
SENSITIVITY TO CURRENT NOISE. Figure 8 lays out the performance of the inference in the presence of negligible noise. However, here we are mainly interested in experimental situations in which only the noisy voltage is recorded, and the transmembrane current has to be recovered from the voltage trace. Taking the derivative of a noisy quantity increases the noise and may hurt the performance of the inference. Figures 9 and 10 show aggregate statistics for the effect of noise and temporal discretization, respectively. The data are drawn from 11 x 200 simulations of randomly generated 50-compartment, electrotonically extensive neurons (L = 16 µm) with 2,000 time points each (including spikes) for each compartment (e.g., 100 ms at 20 kHz). The current was inferred from the voltage in the simplest possible manner by taking the finite difference between adjacent data points Îtotal(t + 0.5{Delta}) = [V(t + {Delta}) – V(t)]/2{Delta} [more sophisticated methods, such as derivative Gaussian processes (Solak et al. 2003Go) or intermittent Kalman filters (Doucet et al. 2001Go; Voss et al. 2004Go; Huys and Paninski, unpublished observations) are available; also see DISCUSSION].


Figure 9
View larger version (16K):
[in this window]
[in a new window]
 
FIG. 9. Effect of current noise variance {sigma}2 ({sigma} is in mA/cm2) on parameter inference in multicompartment models. 6 x 200 60-compartmental neurons were generated randomly. Squared sinusoidal current was injected into the soma of each neuron, and noise current was injected into each compartment. Transmembrane current was inferred by taking the numerical derivative of the (noisy) voltage trace; 10 ms of data were used at a sampling frequency of 20 kHz. Each plot shows true vs. inferred parameters. Top row: Na+ (light gray dots), K+ (medium gray dots), and leak (black dots) channel concentrations. Bottom row: intercompartmental conductance fxy (in mS/cm2). To show the effect of physiological electrotonic distances, intercompartmental conductances around 200 mS/cm2 were chosen, corresponding to individual compartments of moderate length (L = 16 µm). Noise has a more pronounced effect on the intercompartmental conductances than on the active conductances. Plot on the right shows a voltage trace with {sigma} = 160 mA/cm2.

 

Figure 10
View larger version (17K):
[in this window]
[in a new window]
 
FIG. 10. Effect of sampling interval {Delta}t (in ms) on parameter inference in multicompartment models. 5 x 200 60-compartmental neurons were generated randomly. Squared sinusoidal current was injected into the soma of each neuron, and small noise current was injected into each compartment ({sigma} = 100 µA/cm2). Transmembrane current was inferred by taking the numerical derivative of the (noisy) voltage trace; 2,000 data points were used (e.g., 10 ms at 20 kHz). Each plot shows true vs. inferred parameters. Top row: Na+ (light gray dots), K+ (medium gray dots), and leak (black dots) channel concentrations. Bottom row: intercompartmental conductance fxy (in mS/cm2). Intercompartmental conductances around 200 mS/cm2 were chosen, corresponding to large individual compartments of length of L = 16 µm. As the sampling interval increases, the parameter inference becomes noisier for all parameters, but a significant bias appears mostly in the estimates for the active conductances because the peak currents during the action potentials are underestimated.

 
Figure 9 shows that the inference of channel densities is less sensitive to noise than is inference of the intercompartmental conductances. Indeed, for very extended cells (fxy < 5 mS/cm2 or L > 150 µm; data not shown), two sets of parameters emerge as the noise variance {sigma}2 is increased. For one cluster, inference deteriorates. These are compartments in which there was no adequate voltage deflection to inform parameter choice at that particular level of noise. The currents in those compartments are well accounted for by noise and there is thus a trade-off between the noise {sigma}2 and the required electrotonical compactness of the neuron.4

When decreasing the temporal resolution of the recording, a negative bias is mostly observed in the active conductances. This was to be expected because inferring the current by a finite difference results in a systematic underestimation of the peak current during the action potential, this underestimate being directly dependent on {Delta}t and vanishing as {Delta}t -> 0.

SPATIAL SAMPLING—THE LINEAR DENDRITE. A likely current experimental situation is a recording from a contiguous subsection of the entire dendrite, for example along just one branch. Although in that situation we cannot estimate the parameters in the outermost compartments (because current from their unobserved neighbor compartments cannot be accounted for), estimation for the parameters of all the other compartments is unchanged if we treat these compartments as current sources.

A linear piece of dendrite also provides a good test of the robustness of our method to spatial subsampling: In the previous section we generated data from a multicompartmental model and used the same compartmentalization to infer the parameters. In general, however, the compartmentalization will be given by the spatial sampling locations without any knowledge of an accurate underlying compartmental model. Once there is subsampling, there is no longer a "true" set of parameters to which we can compare inference. A compartmental model with compartments at each sampling location will be just a spatial approximation to the true cell, and the parameters inferred are the best parameters for the particular compartmental model assumed by the discretization.

To demonstrate the robustness properties of our method to spatial subsampling, we generated a linear dendrite of 200 compartments and subsampled by factors k = {2, 3, 4, 5}, i.e., generated the data from the full dendrite but during parameter inference assumed access only to every kth compartment. The results are displayed in Fig. 11. The distribution of channel densities (Fig. 11, AC) is centered around the true value in the subsampled case. The variance of the distribution shrinks rapidly as the cells become more electrotonically compact. For realistic initial compartmental lengths of L = 7 µm (corresponding to fxy = 2,000 mS/cm2) the error bars nearly vanish. The procedure is thus seen to be very robust to subsampling. Note that the inferred intercompartmental conductance decreases with increasing subsampling (Fig. 11D); this effect is expected, and exactly predicted by linear cable theory for which fxy = a/(2rLL2), where rL is the axial resistivity (in k{Omega} · mm2) and a is the dendrite’s radius (in µm) (Dayan and Abbott 2001Go). The solid line shows a fit of this relationship to the data.


Figure 11
View larger version (13K):
[in this window]
[in a new window]
 
FIG. 11. Parameter estimation is robust to subsampling. Ten different 200-compartmental electrotonically extended linear dendrites (compartment length L = 31.2 µm) with constant channel concentrations were subsampled by a factor k = {2, 3, 4, 5}, i.e., access only to each kth compartment assumed. For clarity, all compartments in a particular (original, i.e., not subsampled) dendrite had the same channel densities. To illustrate purely the effect of subsampling, current noise was small ({sigma} = 1 mA/cm2). For Na+, K+, and leak channels respectively. AC: gray bars: true [and correctly recovered (see Fig. 8)] channel densities. Lines with error bars: inferred densities for the subsampled dendrites (mean ± 1 SD of estimated channel density distribution across the linear subsampled dendrite), darker lines for larger subsampling factors. D: in each dendrite, the estimated intercompartmental conductances fxy (in mS/cm2) vary with the increasing compartmental length resulting from the sparser sampling. Line is a fit of the predicted relationship E[fxy] = {alpha}/L2 (Dayan and Abbott 2001Go). Averages are taken over all the dendrites. All error bars become negligible for electrotonically compact cells with compartment lengths of L = 5 µm.

 
Synaptic inputs ws(t) to a passive membrane

Finally, we approach inference of synaptic conductance, i.e., presynaptic input. For clarity, let us first analyze a passive membrane patch that receives voltage-independent synaptic inputs (e.g., AMPA and GABAA)

Formula 12(12)

Formula 13(13)
where s indexes the S synapse types impinging on this membrane patch, gs(t) is the synaptic conductance time course, ws(t) is the strength of the synaptic input at time t, and {tau}s is the synaptic time constant. The response of a passive membrane to one such synaptic input is shown in Fig. 12. Assuming knowledge of the synaptic time constants, our aim is to infer the ws(t), the temporal pattern of presynaptic input. [Note that the synaptic input strength ws(t) is constrained to be positive only, not to be binary.] We represent all the time series [V(t), dV/dt] as vectors V, Formula 13, and so forth and write the maximum likelihood (ML) w as

Formula 14(14)
where all variables indexed by s have been concatenated, i.e., w is a concatenation of all discretized ws(t) and for a recording of length T has length TS. If K is the convolution matrix corresponding to Eq. 13, i.e., g = Kw, then Jsyn = diag (EsV)K as in Eq. 6. As before, the positive nature of the synaptic entries turns this minimization into a standard nonnegative quadratic minimization problem. However, the complexity of solving quadratic programs typically scales as O(N3), where N is the number of parameters, which in the synaptic case is T · S. Except for small problems it is not feasible to solve this directly, but because the minimization is jointly convex in all the parameters, a number of decomposition techniques apply, e.g., we can simply iteratively optimize over a subset of parameters. APPENDIX C (SYNAPTIC INPUTS ONLY) details the procedure, which is quite efficient for this simple passive membrane. Figure 12D shows that this simple approach gives rather good results, but points to one issue that arises directly from the large number of free parameters: the inferred presynaptic activity ws(t) is not sparse. That is, our synapse is kept active all the time (albeit at a low level) to explain the small random perturbations caused by the noise term in Eq. 12—a classical example of overfitting. Overfitting was to be expected as we have, for a single synapse on one compartment, as many parameters as data points (N = TS = T).


Figure 12
View larger version (11K):
[in this window]
[in a new window]
 
FIG. 12. Inferring presynaptic input. A: synaptic input time course ws(t) (a set of simulated presynaptic spikes with a given (fixed) weight). B: convolution of ws(t) with an exponential kernel to produce the time-varying input conductance gs(t). C: response of a passive membrane to the synaptic conductance (Es = 0 mV). D and E: gray circles are true inputs ws(t) and the black lines show the inferred inputs ws(t). D: maximum likelihood (ML) inference. E: maximum a posteriori (MAP) inference with a sparse prior. Note that the noisy, roughly constant activity in D is effectively suppressed by the sparsening prior.

 
REGULARIZATION. Figure 13A shows the performance of the ML estimator when there are three synapses, two excitatory (glutamatergic) and one inhibitory (GABAergic). We see that the inferred presynaptic activity is even less sparse than in the case with a single synapse: constant small synaptic inputs are used to explain the noise. Two factors explain why this case is worse than the case with one synapse: First, we now use each data (time) point to infer several (three) parameters. Second, because we have both excitatory and inhibitory synapses, an overshoot resulting from overactivity in a synapse can be explained away with too much activity in a synapse of the opposite sign with similar dynamics, which would not be possible if all synapses were of the same sign.


Figure 13
View larger version (12K):
[in this window]
[in a new window]
 
FIG. 13. Passive membrane with 3 synapses. Effect of shrinkage prior. There were 2 excitatory synapses of different strengths (6 and 12 mS/cm2) and one inhibitory synapse (12 mS/cm2). A: ML inference. B: MAP inference (with sparse prior). Top traces are the excitatory inputs; the middle shows the voltage trace, and at the bottom we plot the inhibitory inputs. C = 1 µF/cm2. True synaptic inputs are plotted as gray symbols (circles and x symbols), the inferred synaptic time course as black lines. Note the overfitting when doing maximum likelihood and the high accuracy of the estimated synaptic inputs when maximizing the posterior.

 
Because the data do not provide enough information to constrain the parameters, we have to use regularization techniques. We take a Bayesian approach: given that we chose to infer ws(t)—a real nonnegative number—we can readily impose a prior distribution on ws(t). Because we expect that for a single synapse, ws(t) is really a sparse set of discrete events in time, an appropriate prior is one that 1) does not enforce smoothness across time and 2) penalizes nonzero values of ws(t) and thereby enforces sparseness. A simple shrinkage prior that factorizes over time and penalizes large values of ws(t) satisfies both these requirements. In vectorized form, we write the maximum a posteriori (MAP) estimate for ws(t) as

Formula 15(15)
where n is a vector of ones and {lambda} determines the relative importance we give to matching the data (the first term in the equation above) and to sparsification (the second term). The larger the value of {lambda}, the sparser the value of the estimated ws(t). In probabilistic terms, {lambda} parameterizes an independent exponential prior distribution over ws(t), that is

Formula 15
Figures 12E and 13B show the effect of including the prior. Most of the small, constant synaptic activity is effectively suppressed without greatly affecting the larger spikes. The effect is pronounced in both the single- and three-synapse cases, but of more importance in the latter, for the reasons discussed above.

The value of {lambda} also needs to be inferred. Its ML value is the a priori unknown presynaptic firing rate. There are several ways to proceed: The one adopted here is based on an empirical prior, choosing {lambda} that maximizes the posterior probability, which corresponds to maximizing the posterior with respect to {lambda} after discretization. Alternatively, Formula 15 can be chosen to maximize the marginal likelihood

Formula 16(16)
an integral that can be evaluated by Markov chain Monte Carlo methods (MacKay 2003Go).

During relatively input-free times regularization as introduced here will suppress the inferred synaptic activity used (mistakenly) to explain the voltage evolution noise. However, during periods of high-input firing rates, this regularization might also suppress legitimate spikes. Two more flexible schemes could be applied if we possessed information about some variable that explained much of the fluctuations in the input to our cell. First, it is reasonable on morphological grounds to allow the regularization parameter to vary as a function of location on the dendritic tree; for example, close to the soma of a pyramidal cell, we might decrease the regularization parameter {lambda}inh corresponding to inhibitory synapses, but increase the regularization parameter corresponding to excitatory synapses, thus reflecting our prior knowledge of the distribution of synaptic input as a function of distance from the soma. Second, it may be sensible to let {lambda} vary as a function of time {lambda}(t), where {lambda}(t) may be modeled as some function of the stimulus condition at time t. A simplified version of this idea is explored further in STIMULUS-DEPENDENT SYNAPTIC INPUT TO AN ACTIVE MEMBRANE. To estimate the error bars on the synaptic input parameters, it will be necessary to adopt methods from the sequential sampling literature (Doucet et al. 2001Go; Huys and Paninski, unpublished observations) because the inference space here is too large for the simple importance sampler in APPENDIX B.

TIME-VARYING SYNAPTIC STRENGTH. Synaptic strength changes over time, at both fast (because of synaptic dynamics such as facilitation and depression) and slow (synaptic plasticity such as long-term potentiation and depression) timescales. Treating the weight of every spike by any individual synapse as an independent variable has the strong advantage that it not only allows inference of precise input spike times, but also naturally allows inferring the time-varying strength of the synaptic input. As an example, consider Fig. 14. Here three potentiating synapses5 impinged on one passive compartment, two of which were excitatory (Fig. 14A) and one inhibitory (Fig. 14C). Figure 14D plots the true synaptic strength for each spike versus the inferred strength. Because there were both excitatory and inhibitory synapses, regularization was necessary, but it introduced a minor negative bias only in the estimates of synaptic strengths.


Figure 14
View larger version (11K):
[in this window]
[in a new window]
 
FIG. 14. Simultaneously inferring the time-varying strength of 3 potentiating synapses in a passive compartment. A: 2 potentiating excitatory synapses with different resting weights. B: voltage trace data. C: one inhibitory potentiating synapse. True strengths are shown by gray markers, a different one for each synapse (*, +, and x, respectively). Inferred synaptic strength in black. D: plot of the inferred vs. true synaptic strengths for all synapses at input spike times. Note that the method tracks the changing synaptic weight quite well, despite the mild negative bias resulting from regularization.

 
VOLTAGE-DEPENDENT SYNAPTIC INPUT: NMDA. The synapses discussed so far had simple, voltage-independent kinetics. NMDA synapses have a more complex voltage-dependent form, contributing a conductance gN

Formula 17(17)
where b(V) is some suitable (known) sigmoidal function of the voltage, representing magnesium block, whereas o(t) is the time-dependent component similar to the AMPA synapse gs. As in the discussion of active channels, given the voltage V(t), we can compute both o(t) and b[V(t)] and thus again infer the channel density Formula 17. The variation of the synaptic strength with time should present no problems as we infer the full time course ws(t). Thus this fits directly into the present framework.

Synaptic inputs to an active membrane

As mentioned above, there is a jointly global optimum for both channel densities and presynaptic input (the optimization is jointly convex in both). To illustrate this, we infer channel density and presynaptic input simultaneously in a single-compartment cell, writing

Formula 18(18)
where gc(t) is the conductance state of a channel c and Formula 18c is that channel’s maximal conductance. We fit the same channel kinetics gc(t) as in the multicompartmental case and infer the C parameters Formula 18c plus the T · S parameters ws(t).

Proceeding as before, we discretize Eq. 18 and rewrite it in vector form

Formula 19(19)
Jch stems from the channel conductances (Eq. 9) and Jsyn from the vectorized synaptic input (Eq. 12). Jj is thus T by C + TS, which as in the previous section is typically very large. In the N-compartment case this becomes T by NC + NTS + N (the last addition being for the intercompartmental conductances). g is the concatenation of all the parameters we try to estimate (gc and ws) and we obtain our e