A point process framework for modeling electrical stimulation of the auditory nerve

Joshua H. Goldwyn, Jay T. Rubinstein, Eric Shea-Brown


Model-based studies of responses of auditory nerve fibers to electrical stimulation can provide insight into the functioning of cochlear implants. Ideally, these studies can identify limitations in sound processing strategies and lead to improved methods for providing sound information to cochlear implant users. To accomplish this, models must accurately describe spiking activity while avoiding excessive complexity that would preclude large-scale simulations of populations of auditory nerve fibers and obscure insight into the mechanisms that influence neural encoding of sound information. In this spirit, we develop a point process model of individual auditory nerve fibers that provides a compact and accurate description of neural responses to electric stimulation. Inspired by the framework of generalized linear models, the proposed model consists of a cascade of linear and nonlinear stages. We show how each of these stages can be associated with biophysical mechanisms and related to models of neuronal dynamics. Moreover, we derive a semianalytical procedure that uniquely determines each parameter in the model on the basis of fundamental statistics from recordings of single fiber responses to electric stimulation, including threshold, relative spread, jitter, and chronaxie. The model also accounts for refractory and summation effects that influence the responses of auditory nerve fibers to high pulse rate stimulation. Throughout, we compare model predictions to published physiological data of response to high and low pulse rate stimulation. We find that the model, although constructed to fit data from single and paired pulse experiments, can accurately predict responses to unmodulated and modulated pulse train stimuli. We close by performing an ideal observer analysis of simulated spike trains in response to sinusoidally amplitude modulated stimuli and find that carrier pulse rate does not affect modulation detection thresholds.

  • cochlear implant
  • computer simulation
  • neural prosthetic
  • modulation detection

cochlear implants restore a sense of hearing to individuals with severe to profound hearing loss by electrically stimulating the primary sensory neurons in the auditory pathway. A complete understanding of the transformation from electrical stimulation to neural responses would aid the design of improved cochlear implant sound processing strategies. Computational and mathematical models, especially those that are carefully constrained by available neurophysiological data, can play an essential role in exploring this transformation. In addition to providing an efficient platform for testing and evaluating stimulation paradigms, models provide quantitative tools for studying how neural mechanisms influence the transmission of information in the auditory system.

In this study, we develop a point process modeling framework that can be used to simulate the responses of individual fibers of the auditory nerve (AN) to cochlear implant stimulation. The dynamical and stochastic features of the model are matched to statistics that characterize neural responses to single and paired pulses of electrical current, stimuli that are commonly used to characterize the response properties of AN fibers in animal models of electric hearing (Hartmann et al. 1984; van den Honert and Stypulkowski 1984; Dynes 1996; Bruce et al. 1999c; Miller et al. 1999; Shepherd and Javel 1999; Cartee et al. 2000, 2006; Miller et al. 2001a,b). We go on to show that this model can provide insight into the responses of neurons to extended pulse trains of electrical stimulation. Point process models also have mathematical properties that facilitate analyses of auditory signal detection (Heinz et al. 2001a,b), and we close by relating results from our model to the psychophysical test of amplitude modulation detection.

Our approach connects two prior modeling frameworks. The first is the stochastic threshold crossing model (Bruce et al. 1999a,c). This model is a useful phenomenological representation of AN activity. A number of cochlear implant psychophysics experiments have been studied with this model (Bruce et al. 1999b; Xu and Collins 2007; Goldwyn et al. 2010b). Unfortunately, as stated by its creators, it lacks sufficient dynamical details to provide valid predictions of responses of AN fibers to high rate stimulation (Bruce et al. 1999a). Studying responses to high pulse rate stimulation is necessary to characterize contemporary cochlear implant stimulation strategies and, for reasons that we discuss in more detail below, is a primary focus of this work. We therefore extend the stochastic threshold crossing model by turning to a more general class of point process models.

Point processes describe spiking via an instantaneous firing rate that varies over time (Perkel et al. 1967; Johnson 1996; Truccolo et al. 2005). They have been frequently applied to model firing of AN fibers (Miller and Mark 1992; Litvak et al. 2003b; Trevino et al. 2010; Plourde et al. 2011). Our approach is largely motivated by a specific family of point processes: generalized linear models (GLMs; Paninski 2004; Truccolo et al. 2005). GLMs have emerged as an essential tool for modeling physiological data and investigating the coding and computational properties of neurons (Paninski 2004; Paninski et al. 2007), including auditory neurons (Trevino et al. 2010; Plourde et al. 2011). Moreover, they hold particular promise for applications to sensory neural prostheses because they have useful mathematical properties that permit efficient parameter fitting to spike train data (Paninski 2004), they can be used to optimally decode spike trains (Paninski et al. 2007), and they can be used in connection with real-time optimization methods to identify stimulation patterns that control the timing of evoked spikes (Ahmadi et al. 2011).

We apply our point process model to investigate differences between responses of AN fibers to low and high pulse rate electric stimulation. Interest in high pulse rate stimuli arises from the intuitive notion that higher stimulation rates should provide greater temporal information to cochlear implant listeners and thereby improve speech reception. Indeed, computational modeling and neurophysiological studies have indicated that stimulating neurons with 5,000 pulses per second (pps)-pulse trains can desynchronize neural responses (Rubinstein et al. 1999; Litvak et al. 2001, 2003c). This may return AN activity to a state that more closely resembles normal spontaneous activity in the healthy cochlea (Rubinstein et al. 1999; Litvak et al. 2003c). In related modeling and experimental studies, high pulse rate stimulation has been shown to produce neural firing rates that more faithfully reproduce the envelopes of temporally modulated stimuli (Litvak et al. 2003a,b; Chen and Zhang 2007; Mino 2007). Psychophysical data from tests of amplitude modulation detection, however, have not yielded clear indications that high pulse rate stimulation improves the ability of listeners to detect temporal modulation (Galvin and Fu 2005, 2009; Pfingst et al. 2007). More generally, speech tests have reported mixed results as to whether higher pulse rate stimulation improves speech reception in cochlear implant listeners (Brill et al. 1997; Kiefer et al. 2000; Loizou et al. 2000; Vandali et al. 2000; Holden et al. 2002; Friesen et al. 2005). Computational modeling can help trace the connections between the perceptual benefits, if any, of high pulse rate stimulation and the neural responses that it evokes.


In this section, we introduce the point process framework and discuss how it can be parameterized using commonly reported statistics of responses to single pulse and paired pulse stimuli. These include threshold, relative spread, jitter, chronaxie, summation threshold, and two refractory effects. We begin by introducing the response statistics that we used to construct the model (Response Statistics), followed by a general discussion of how these statistics can be extracted from a point process model (Point Process Theory). We then introduce the specific model framework that we will use throughout the study (Model Formation) and explain the procedure by which each parameter in the model can be uniquely associated with a response statistic (Model Parameterization).

Response Statistics

A first-order description of neuronal excitability in response to electrical stimulation is the firing efficiency curve. This is an input/output function that relates the current level of a single pulse of current to the probability that the stimulus evokes a spike. As shown by Verveen (1960) in his pioneering recordings of green frog axons, the firing efficiency curve can take the form of a sigmoid function and be approximated by the integral of a Gaussian distribution. The median of the underlying Gaussian distribution represents the stimulus level that elicits a spike with probability one-half. This level is referred to as the threshold of the neuron, which we denote by θ. A measure of the variability of spike initiation is the relative spread (RS), defined as the standard deviation of the underlying Gaussian distribution divided by its mean. Firing efficiency curves are frequently used to characterize the response of AN fibers to cochlear implant stimulation in electrophysiological experiments (Dynes 1996; Bruce et al. 1999c; Miller et al. 1999; Shepherd and Javel 1999; Miller et al. 2001b) and to parameterize computational models (Bruce et al. 1999c; Stocks et al. 2002; Macherey et al. 2007; Imennov and Rubinstein 2009; O'Gorman et al. 2009). We will follow this approach and use the firing efficiency curve, in particular the parameters θ and RS, as the starting point for constructing our point process model.

The firing efficiency curve summarizes the excitability of a neuron in response to a single pulse of stimulation. For longer pulse durations, the threshold current level is typically smaller, due to the capacity of the neural membrane to integrate charge over time. This dependence of threshold on pulse duration can be summarized by the chronaxie; the pulse duration at which the threshold level is twice what it would be for a much longer pulse duration. This basic feature of membrane dynamics is absent from earlier stochastic threshold crossing models (Bruce et al. 1999a; Litvak et al. 2003b).

The firing efficiency curve encapsulates variability in the initiation of spikes (or lack thereof), but an additional source of randomness is in the timing of spikes. The standard deviation of spike times evoked by a single pulse is known as jitter. It is another statistic that has been widely studied in physiological and modeling studies of electrical stimulation of AN fibers (van den Honert and Stypulkowski 1984; Miller et al. 1999, 2001b; Javel and Shepherd 2000; Cartee et al. 2006). Jitter can depend on pulse duration and pulse level, but in our model we will only use the value of jitter that is most commonly reported in the literature: the value measured for the same pulse duration and level that defines the threshold.

In order for the model to generate realistic responses to high pulse rate stimulation, it must also include spike history effects. The first, and most essential, is a refractory effect that reduces the excitability of the model neuron in the immediate aftermath of an evoked spike. This effect will be implemented as a transient increase in the threshold value θ following a spike, similar to the method of Bruce et al. (1999a). The point process will also include a second spike history effect, a transient increase in RS immediately after a spike (Miller et al. 2001a). Based on simulation results from a biophysically detailed computational model of an AN fiber, it has been hypothesized that this phenomenon is a signature of the random activity of ion channels in a cell membrane known as channel noise (Matsuoka et al. 2001; Mino and Rubinstein 2006). Modeling channel noise in AN fibers in the context of cochlear implant stimulation is important for a number of reasons. First, neurons in the deafened cochlea do not receive the typical synaptic input from inner hair cells, and therefore ion channels may generate a dominant noise source in this stage of auditory processing. Second, channel noise may be a mechanism by which high pulse rate stimulation can desynchronize neural responses, thereby leading to improved encoding of electrical stimuli (Rubinstein et al. 1999; Litvak et al. 2003a,b; Mino 2007). Finally, we incorporate an additional feature into the model that is relevant for high carrier pulse rates, the capacity of a neuron to integrate consecutive subthreshold pulses. In other words, the model will account for the phenomenon by which multiple pulses are more likely to evoke a spike than would be expected if each pulse acted independently on the neuron. This effect, known as summation or facilitation, has been observed in responses to pairs of pulses with short interpulse intervals (Cartee et al. 2000, 2006) and to high rate pulse train stimuli (Heffer et al. 2010). As we will see, this form of interpulse interaction creates fundamental differences between responses to low and high rate pulse train stimulation.

Point Process Theory

In this section, we introduce a number of basic ideas from the theory of point processes and illustrate their connections to the response statistics discussed above. A point process model is completely defined by its conditional intensity function (Daley and Vere-Jones 2003), which can be interpreted as the instantaneous probability that a neuron will spike (Truccolo et al. 2005). We denote the conditional intensity function as λ(t I, H), where I is the stimulus applied to the neuron and H represents the past history of the neuron, both of which should be viewed as functions of time. A related and important quantity is the integrated intensity function: Λ(t1,t2|I,H)=t1t2λ(s|I,H)ds. From Λ(t1, t2|I, H), one can define the probability that there will be at least one spike in a time interval [t1, t2]. This function is known as the lifetime distribution function and is given by Daley and Vere-Jones (2003): L(t1,t2|I,H)=1eΛ(t1,t2|I,H).(1)

The lifetime distribution function is central to our analysis because it forms the mathematical link between the point process model and the statistics that describe the responses of neurons to single and paired pulses. Consider, for instance, responses to a single pulse of current level I and some duration D. If we let t2 in Eq. 1 become sufficiently large, then the lifetime distribution function defines the probability with which a single pulse of a certain current level I will ever evoke a spike. In other words, when viewed as function of I, it is equivalent to the firing efficiency curve. If we now change our perspective, and view Eq. 1 as a function of t2, for a fixed value of I, then this function gives the probability that a spike has been evoked before time t2. In this context, the lifetime distribution function describes the temporal dispersion of spike times and can thus be used to calculate jitter. We now present how the single pulse and paired pulse response statistics discussed in Response Statistics can formally be obtained using point process theory.

We begin with a set of measures that are obtained from responses to single pulses, and we therefore assume that they are not influenced by spike history effects. This allows us to omit the term H for now and let I refer to the current level of a single stimulating pulse. As introduced above, the threshold value θ is the current level at which the firing efficiency curve has a value of ½. From the definition of the lifetime distribution, θ must satisfy 12=L(0,|θ). Applying Eq. 1, this can be rewritten in terms of the integrated intensity function: Λ(0,|θ)=log2(2)

To define RS, we generalize the traditional definition that is based on assuming the firing efficiency curve is shaped like the integral of a Gaussian distribution (Verveen 1960). From Eq. 1, it is apparent that the firing efficiency curve for a point process model is not necessarily an integrated Gaussian function. It is, however, the cumulative distribution of some probability density function. To be concrete, we define an associated probability density function as the derivative of the lifetime distribution function with respect to the current level I: lI(0,|I)=ddI[Λ(0,|I)]eΛ(0,|I) By analogy with the traditional definition, we let RS be the standard deviation of this associated density function normalized by its mean.

To define chronaxie in terms of a point process model, we use Eq. 2 and compare threshold current levels for varying pulse durations. Let θ(D) denote the threshold for a monophasic pulse of a long pulse of duration D. Then, the chronaxie is the phase duration Dc for which the single pulse threshold θ(Dc) is twice the value of θ(D). Specifically, Dc must satisfy the following two conditions: θ(Dc)=2θ(D)Λ[0,|θ(Dc)]=Λ[0,|θ(D)]=log(2)

The final single pulse statistic we consider is jitter. As mentioned above, jitter can be obtained from the lifetime distribution function when it is viewed a function of time. Following the standard practice, we will consider jitter in spike times evoked by a single pulse presented at the threshold level θ. Then the probability of at least one spike occurring in the interval [0, t], conditioned on the event that the stimulus produces a spike at any time, is twice the lifetime distribution function: 2L(0, t|θ). We define the associated probability density function by taking a derivative with respect to time of Eq. 1 and see that jitter is the standard deviation of the following density function for the probability that a spike occurs at time t: lt(0,t|θ)=2λ(0,t|θ)eΛ(0,t|θ).(3)

We now turn to responses evoked by pairs of pulses. Using a summation pulse paradigm (Cartee et al. 2000, 2006), we can quantify how threshold current levels change for a stimulus consisting of two pulses of equal current levels, separated by an interpulse interval of varying duration. Note that in the experimental studies of Cartee et al., the question asked was whether the pulse pair evoked a spike; the timing of the spike was not relevant. Assuming that the response to the pulse pair is not influenced by any prior spiking activity, the summation threshold satisfies the same threshold condition given by Eq. 2, the only changes are that the stimulus I is a pair of pulses and θ is the current level at which this pair of pulses evokes at least one spike on half of all trials, on average.

Lastly, we observe how refractory effects can be incorporated by allowing the single pulse threshold θ and Fig. 1: RS to depend on the time since the last pulse. In the refractory pulse paradigm (Cartee et al. 2000, 2006; Miller et al. 2001a), a strong first pulse forces the neuron to spike and then a firing efficiency curve is measured from the responses to a second pulse presented some time later. The mathematical relationships between that firing efficiency curve and the lifetime distribution function remain unchanged, but they must be formally modified by adding the spike history term H.

Fig. 1.

A: schematic diagram of the point process model. Current input to the model, shown here as a train of biphasic pulses, is passed through a cascade of linear filters and a nonlinear function. This produces a value, the conditional intensity, that defines the instantaneous probability of a spike, which is then used to generate random sequences of spike times. Previous spikes provide feedback that modulates the stimulus filters and the nonlinearity. B: illustration of the response of each stage of the model to a sequence of three biphasic pulses presented at a rate of 1,000 pulses per second (pps). Black lines illustrate model dynamics in the absence of a refractory effect. Gray lines illustrate model dynamics that include a refractory effect and a spike at time 0.6 ms.

Model Formulation

We have presented the connection between the point process model and response statistics in a general manner. In this section, we make these connections more explicit by proposing a specific structure for the conditional intensity function. The model consists of a cascade of linear and nonlinear stages followed by a probabilistic spike generator. This structure is inspired by the popular GLM class of point process neuron models. Figure 1 illustrates the model features. The model differs from standard GLMs in several ways. These include a nonlinear stage that depends on the time since the last spike, an asymmetry in how the positive and negative phases of charge-balanced pulses are filtered, and a secondary filter that adds variability to simulated spike times.

The action of the model, depicted schematically in Fig. 1, can be summarized as follows: an incoming pulse train I(t) is passed through stimulus filters K+(t) and K(t), the outputs of the filters are recombined and used as input to a nonlinear function f(·). To incorporate refractory effects, the stimulus filters and the nonlinearity are all modified depending on the time since the last spike. An additional filter J(t) is included to add variability to simulated spike times. The result of this chain of events is an instantaneous, history-dependent value for the conditional intensity function that defines the point process model: λ(t|I,H)=[J*f(K+*I++K*I)](t),(4) where I+ and I represent the positive and negative portions of the input and * represents the convolution operator.

A variety of methods exist for fitting GLMs to neural data including reverse correlation to white noise stimuli (Simoncelli et al. 2004) and maximum likelihood methods (Paninski 2004). We pursue a different route and take advantage of the special structure of the proposed model and the mathematical relationships in Point Process Theory to develop a semianalytical procedure that uniquely identifies parameters in the point process model with specific response statistics. For reasons of mathematical tractability and biological relevance, which we articulate more fully below, we define the components of the model as follows: f(x)={xαifx00else(5a) K+(x)={κτκetτκift00else(5b) K(t)={βκτκetτκift00else(5c) J(t)={1τJetτJift00else.(5d) This model has five parameters: α, κ, τK, β, and τJ. We next show how they are uniquely determined by the five response statistics discussed in Response Statistics: relative spread, threshold, chronaxie, jitter, and summation pulse threshold. We will also illustrate how the model can account for refractory effects by allowing the parameters α and κ to depend on the time since the previous spike.

Model Parameterization

We begin by discussing the response measures that do not depend on spike history and thus neglect H for now. To simplify our presentation, we introduce some additional notation. Let w(t) be the waveform of the stimulus; for instance, the waveform of a monophasic pulse is a rectangular step that reaches a maximum value of 1. Alternatively, the waveform function for a biphasic pulse consists of two rectangular phases, one reaches a value of +1 and the other −1. We next define the filtered waveform function W(t), which represents the action of K+ and K on w(t): W(t)=0t[K+(s)ω+(ts)+K(s)ω(ts)]ds, where w+ and w are the positive and negative parts of the waveform function.

For the case that all pulses have identical current level I, for instance in single pulse or summation pulse experiments, the intensity function for the point process model has the compact form: λ(t|I)=(κI)α[J*Wα](t). The parameter κ and the pulse current level I both factor out due to our choice of a power law nonlinearity. In addition, the jitter filter J(t) can be ignored when considering the integrated intensity function over all time. This follows from Fubini's Theorem (Jones 2001) and the fact that ∫0 J(t)dt = 1: 0[J*Wα](t)dt=0W(t)αdt. If we denote this integral as 0W(t)αdtWα,(6) then the lifetime distribution function for the time interval [0, ∞) can be written as: L(0,|I)=1exp(καIαWα).(7) As noted above, this lifetime distribution function is, in the language of neurophysiology, the firing efficiency curve or input/output function of the neuron in response to a single pulse of applied current. Next, we use this equation to parameterize the model.

Relative spread.

The choice of a power law nonlinearity significantly simplifies the parameterization process. The lifetime distribution function in Eq. 7 is a Weibull cumulative distribution function with shape parameter α and scale parameter [κwαα]1. We define the relative spread as the associated standard deviation divided by the mean. Using known expressions for these values we find (Rinne 2009): RS=Γ(1+2α)Γ2(1+2α)1,(8) where Γ(·) is the gamma function. Remarkably, the relationship between RS and α does not depend on any other model or stimulus parameters. There is therefore a one-to-one relationship between the response statistic RS and the model parameter α. To fit the model, we invert this relationship to find α as a function of RS. In practice this must be done numerically.


The threshold current level θ is obtained by solving Eq. 2, which is equivalent to finding the median of the Weibull distribution: θ=1καlog2Wα.(9) We invert this relationship to have an expression for the model parameter κ as a function of the response statistic θ: κ=1θαlog2Wα.(10) Note that κ depends on the model parameters α as well as β, and τκ (through the definition of Wα in Eq. 6). As we will see, the values of these parameters are independent of θ and can therefore be viewed as (known) constants in this equation.


To fit the model parameter τκ, observe that the waveform function w(t) and thus the associated function Wα both depend on the pulse duration. We denote this dependence as Wα(D). By definition, the chronaxie value is the pulse duration Dc for which the threshold for a monophasic pulse is twice the value of the threshold in response to a monophasic pulse of a long duration D (the exact value of D varies across experimental studies). We can use Eq. 9 to define the threshold current level as a function of pulse duration, and obtain the relationship: 1καlog2Wα(Dc)=2καlog2Wα(D). The factors of κ cancel one another and the equation can be further simplified to: Wα(D)Wα(Dc)=2α.(11) Conveniently, the solution to this equation depends only on Wα and α. Now, α can be assumed to have a known value since it is uniquely determined by RS (see Eq. 8). Furthermore, Wα does not depend on β since chronaxie is defined with respect to monophasic pulses. As a consequence, the only undetermined model parameter in this equation is τκ, which enters in the definition of Wα. There is no analytical way to identify τκ as a function of chronaxie, but we can numerically evaluate Wα(D) and Wα(Dc) and then use numerical root finding methods to determine the value of τκ that satisfies the relationship in Eq. 11, for given values of chronaxie Dc and reference pulse duration D.

Summation threshold.

To parameterize the temporal summation properties of the model, we replicate the paired pulse experiments of Cartee et al. (2000) using pseudo-monophasic pulses (Cartee et al. 2000; Cartee 2000). In these experiments, two charge-balanced pulses are presented at varying interpulse intervals (tIPI). The threshold current level for the combined response to both pulses, which we denote θ2, is compared with the threshold current level for a single pulse presented in isolation, which we denote θ1. Cartee et al. (2000) measured thresholds at three interpulse intervals (100, 200, and 300 μs) and summarized the relationship between θ1 and θ2 using the function: θ2θ1=112exp(tIPI/τsum).(12) The parameter τsum controls the length of time over which the neuron can effectively sum consecutive subthreshold pulses.

To translate this into the language of the point process model, we use Eq. 2 to obtain a relationship between the single and paired pulse thresholds. Let Wα,1 denote the filtered waveform for a single pulse and Wα,2(tIPI) be the filtered waveform for a pair of pulses separated by an interpulse interval of length tIPI. Then the ratio of the thresholds can be written as: θ2θ1=αWα,1Wα,2(tIPI)(13)

The model parameters that enter into this equation are α, τκ, and β (since the pulses in this paradigm have negative phases). We treat α and τκ as constants in Eq. 13 since they do not depend on β and can be determined at prior steps in the fitting procedure. We determine β by minimizing the mean square error between the two representations of θ2 ÷ θ1 on the righthand sides of Eqs. 12 and 13. This must be done using numerical methods for integration and minimization. The parameter β will take a value between zero and one, with smaller values leading to greater subthreshold integration of consecutive pulses. We are not aware of experimental studies that indicate how the temporal integration properties of AN fibers change immediately after a spike, so β is left as a constant that does not depend on spike activity.


The remaining undetermined parameter is τJ. To determine its value, we recall that jitter in the point process model is defined as the standard deviation of the probability density function in Eq. 3. Once again, we cannot obtain a simple analytical relationship between τJ and jitter, but we can use numerical integration and root finding methods to find the value of τJ for which the standard deviation of the density function in Eq. 3 is equal to a measured value of jitter. In other words, we solve the minimization problem: τJargmin|0t2lt(0,t|θ)dt[0tlt(0,t|θ)dt]2γ|, where γ is the experimentally observed jitter value. Although the density function lt(0, t θ) depends on all of the other parameters in the model, by this step in our parameterization method all other parameter values will have been specified and we can therefore treat them as (known) constants when determining the value of τJ. Available data from cat AN fibers did not provide strong evidence that jitter changes significantly due to spike history effects (Miller et al. 2001a). Thus the parameter τJ does not depend on spike activity and is held constant throughout our simulations.

Refractory effects.

Finally, we describe how spike history effects can be incorporated in the model by allowing the parameters κ and α to depend on the time since the last spike. The relationship between κ and θ in Eq. 10 shows that decreasing κ immediately following a spike allows the model to exhibit a common refractory property, whereby neurons are less excitable in the aftermath of a spike. Additionally, the relationship between α and RS in Eq. 8 shows that decreasing α after a spike allows the model to exhibit an increase in RS during the refractory period. We choose the dynamics of these parameters to match the experimentally measured values reported in Miller et al. (2001a). This study used a paired pulse stimulation paradigm to probe the changes in threshold and RS due to spike time. Following an initial “masker” pulse that is sufficiently strong to always evoke a spike, there is an interpulse interval before a second pulse is presented. The current level of the second pulse is varied to measure the firing efficiency curve of the neuron within the refractory period.

Miller et al. (2001a) defined the dependence of θ and RS on the interpulse interval, which we denote as Δt, with the following equations: θ(Δt)=θ01e(Δttθ)/τθ.(14) and RS(Δt)=RS01e(ΔttRS)/τRS.(15) Here, θ0 and RS0 are baseline values obtained from single pulse responses. The parameter tθ represents the absolute refractory period during which no spikes can be produced. We implement the absolute refractory period by holding the intensity function λ(t|I, H) at zero until the elapsed time since the last spike exceeds tθ. The time constant τθ quantifies how quickly the effects of a previous spike fade and is therefore a measure of the relative refractory period. The parameters tRS and τRS play similar roles in defining the evolution of RS, as a function of the time since the last spike.

To simulate the model with refractory effects, we let Δt in the above equations represent the time since the last spike. We then update the values of κ and α at the onset of each pulse, based on the time since the last spike. These parameters are then kept at a constant value from the onset of the pulse until the onset of the next pulse in the stimulus, at which time their values are updated again to reflect the changed time since the last spike. This simplification, as opposed to allowing the parameters to vary continuously, allows a straightforward parameterization of κ and α using the same single pulse relationships with θ and RS, respectively, which were derived above. The value of α must be obtained using numerical methods, but a formula for the value of κ can be obtained by combining Eqs. 10 and 14: κ(Δt)=κ0(1e(Δttθ)/τθ),(16) where κ0 is the baseline value that is determined from the single pulse threshold θ0.

Summary of Parameterization Method

The fitting sequence we have described above provides a semianalytical method by which model parameters in Eq. 5 can be uniquely determined based on single and paired pulse response statistics. This sequence can be summarized as follows:

  1. Use RS value to determine nonlinearity parameter α in Eq. 5a.

  2. Use α and chronaxie value to determine the stimulus filter time constant τκ in Eqs. 5b and 5c.

  3. Use α, τκ, and the summation time constant to determine the value β in Eq. 5c for scaling negative phases of the stimuli.

  4. Use α, τκ, β, and threshold value to determine the scale factor κ in Eqs. 5b and 5c.

  5. Use α, τκ, β, κ, and jitter value to determine the jitter time constant τJ in Eq. 5d.

  6. Use the refractory function for RS in Eq. 15 and the relationship between RS and α to determine the dynamics of α after a spike.

  7. Use the refractory function for threshold in Eq. 14 and the relationship between threshold and κ to determine the dynamics of κ after a spike.

Several comments are in order regarding the effect of pulse shape and duration on these parameters. Threshold should decrease with increasing pulse duration (van den Honert and Stypulkowski 1984); this property is captured for monophasic pulses by the chronaxie statistics and τκ. However, thresholds also depend on pulse shape. In particular, thresholds measured with biphasic pulses are higher than those measured with monophasic pulse (Miller et al. 2001b). The model exhibits this property, but it is not quantitatively matched to experimental data. In contrast, the unique relationship between RS and α in Eq. 8 shows that, for this model, RS does not depend on pulse shape or pulse duration. This feature is consistent with data reported in (Miller et al. 1999), although others have suggested that RS may increase with pulse duration (Bruce et al. 1999b). The parameters β and τJ will both depend on the pulse shape and durations in ways that we do not attempt to fit to physiological data. Ideally one would define the model with a set of data obtained using a self-consistent set of stimulation parameters, and these same stimulation parameters would then be used to obtain additional physiological recordings or perform cochlear implant psychophysics experiments. We fit model parameters and compared subsequent simulation results to data from a variety of published sources; we will comment on variations in stimulation protocols and how they may impact our modeling results where appropriate.

Numerical Methods

All simulations were performed using original computer code written in the Fortran programming language. Numerical quadrature to evaluate the stimulus and jitter filters was performed using the trapezoid method and a time step of 1 μs. Random numbers were generated using the Mersenne twister algorithm (Woloshyn 1999). The original Fortran code is available from the authors upon request, and a more user-friendly version of the code written for Matlab (Mathworks) can be downloaded from the ModelDB website (accession number 143760) or the website of one of the authors (http://www.cns.nyu.edu/∼goldwyn/).


Connections to Biophysical Mechanisms and Models

To gain intuition into the structure and behavior of the point process model, we take a brief detour to explore its key features and parameters. To provide this intuition, we connect aspects of the point process model to biophysical mechanisms and more familiar mathematical models of neurons.

Subthreshold dynamics.

The result of applying the exponential filters K+(t) and K(t) to the incoming stimulus I(t) can be equated with a dynamic state variable that we denote by v(t). This notation is meant to suggest, in a loose sense, that this state variable represents the membrane potential of the model neuron. We can reformulate the action of the stimulus filters K+ and K in terms of a differential equation and find that the state variable v(t) evolves according to: τκdvdt=v+g[I(t)],(17) where the function g(I) is pictured in Fig. 2A and defined as: g(I)={κ(Δt)IifI0βκ(Δt)Ielse. The function κ(Δt), which depends on the time Δt since the last spike, is defined in Eq. 16.

Fig. 2.

Relationship between the point process model and a soft-threshold integrate and fire model. A: illustration of the function g[I(t)] in Eq. 17. The parameter β controls the summation time constant of the model by decreasing the slope of g[I(t)] for negative currents. Refractory effects reduce the excitability of the neuron by decreasing the slope for all current values. B: illustration of the probability density function for the noisy variable η in the stochastic threshold analogy (see text). Smaller values of α generate broader distributions or equivalently a more variable spike initiation mechanism.

The form of the differential equation for v(t) in Eq. 17 reveals that its dynamics are similar to the widely used leaky integrate and fire model (Burkitt 2006) and related spike response models (Gerstner and Kistler 2002), a fact noted in previous presentations of GLMs (Paninski 2004; Paninski et al. 2010). The effect of spike history on g(I), which is determined by the dynamics of κ(Δt) in Eq. 16, is to reduce the slope of this piecewise linear function, as shown in the transition from the blue to green lines in Fig. 2A. This change in g(I) makes the model neuron less excitable immediately after a previous spike. This feature has also been included in spike response models (Gerstner and Kistler 2002) and integrate and fire models (Badel et al. 2008). The asymmetry in g(I) enables the model to exhibit facilitation or summation of consecutive charge-balanced pulses. A modeling study using Hodgkin-Huxley type models has shown that summation of pseudomonophasic pulses, and in particular the time constant τsum in Eq. 12 is determined by the dynamics of the m-gating variable (Cartee 2000). The asymmetry in g(I), therefore, can be viewed as a phenomenolgical approximation of the nonlinear process, mediated by Na+ activation, by which the excitability of a neuron in response to consecutive charge-balanced pulses is increased beyond what would be expected if both pulses were presented in isolation from one another.

The parameter τκ appears in Eq. 17 as the time scale of the dynamics of v(t). In the context of an integrate and fire model of a point neuron, this time scale is often interpreted as the membrane time constant and related to the passive integration properties of the neural membrane. In the context of cochlear implants, such a description overlooks the fact that intracochlear electric fields can interact with AN fibers at multiple, anatomically diverse regions of the neural membrane. For instance, membrane time constants of excitable nodes of Ranvier, somata, and unmyelinated portions of neurons may differ significantly (Rattay and Felix 2001). The parameter τκ, therefore, must be interpreted as a single time scale that summarizes the combined integrative properties of a spatially extended AN fibers in an external electric field.

Spike generation.

In the second stage of the model, the power law nonlinearity in Eq. 5a is applied to the state variable v(t) to generate the desired probability of spiking. In the GLM and related linear-nonlinear frameworks, this nonlinearity is typically static and often interpreted as a reduced description of the spike generating mechanisms of a neuron. We have introduced the innovation that the nonlinearity depends on the time since the last spike to account for the observation that RS increases during the refractory period (Miller et al. 2001a). Here, we explain the relationship between the nonlinearity and spike initiation and show why a dynamic nonlinearity produces the desired change in RS.

We begin by viewing the state variable v(t) along with the nonlinear function f(v) = vα as a definition of a point process conditional intensity function. The probability of observing a spike in a small time window δt can then be obtained by approximating the lifetime distribution function in Eq. 1. We find that: P(spike;δt)1evαδt.(18)

We now consider an alternate point of view in which v(t) represents the subthreshold state of a neuron. We suppose that the probability that a spike occurs in the δt time window is related to the distance between v(t) and some voltage threshold, which we denote Θ. This idea is known as an escape noise model (Plesser and Gerstner 2000) and is closely related to stochastic threshold crossing models (Bruce et al. 1999b; Litvak et al. 2003b). From this perspective, we can imagine that the noise free trajectory v(t) is modified by adding a random number η that represents the noise in the spike generation process. The probability that a spike occurs in a δt window can be written as: P(spike;δt)P(v+η>Θ).(19) If we define the cumulative distribution function Fη for the noise variable η, then we can equate Eqs. 18 and 19 and get: Fη(η)=e(Θη)αδt By taking the derivative of this distribution function, we derive a probability density function for η: fη(η)=αδt(Θn)α1e(Θη)αδt.

In Fig. 2B, we illustrate the relationship between α and the probability density function of η by plotting fη(η) for several values of α. We computed these distributions using δt = 1 μs and Θ=log2δtα. The probability density function for η becomes broad for small values of α. This shows how the parameter α characterizes variability in the spike initiation process. Moreover, this illustrates that allowing α to decrease immediately after a spike creates the desired behavior in the model, a spike generation mechanism that is more variable within the refractory period. To provide a biophysical interpretation for this parameter, we note that the random open and closing of ion channels in the cell membrane (known as channel noise) has been shown to determine the value of RS in simulation studies (Rubinstein 1995). Thus we view η as an abstract representation of channel noise.

Spike time variability.

The final stage of the model is to apply the jitter filter J(t) to the output of the nonlinearity. As we will see below, this secondary filter is necessary in order for this point process model to have realistic amounts of spike time variability. J(t) is applied after the nonlinearity, which suggests that it represents a source of timing variability subsequent to the spike generator in the cell. In the context of extracellular electrical stimulation of the AN, one possible source of this variability is spatial interaction among multiple possible spatial sites of spike initiation. For instance, in a simulation study of a spatially extended neuron with multiple nodes of Ranvier, Mino et al. (2004) showed that stimulating electrodes that are more distant from a model AN fiber produce more variable spike times. The more distant electric fields broaden the distribution of nodes of Ranvier at which spike initiation can occur; thus spatial interactions across the axon may represent one postspike mechanism that adds jitter to spike times.

Model Parameterization

With this understanding of the model features in hand, we proceed to parameterize the model following the steps outlined in Summary of Parameterization Method. We defined parameters in the model using published data from single pulse and paired pulse recordings in cat. Table 1 summarizes the data values we used, their sources in the literature, and the parameter values that we obtained from the fitting procedure. Whenever possible, we used mean values obtained from experiments that used biphasic pulses: these include the mean threshold, RS, and jitter values reported in Miller et al. (2001b). For chronaxie, we used the mean value reported by van den Honert and Stypulkowski (1984) for intracochlear electrical stimulation of cat AN fibers. The longest phase duration tested was 2,000 μs. For the summation time constant (τsum in Eq. 12), we did not find a mean value reported in the published literature, so we assumed a value that fell within the range reported in Cartee et al. (2006) for responses of AN fibers to an electrode placed in the scala tympani. The parameter values that describe the refractory effects on threshold and RS were obtained from Miller et al. (2001a). Specifically, we used tθ = 332 μs and τθ = 411 μs, which were the mean values reported in this study. Mean values were not reported for tRS and τRS, so we estimated that tRS = 199 μs and τRS = 423 μs based on plotted data (Fig. 8 in Miller et al. 2001a). These experiments used monophasic stimuli, but it is straightforward to modify our refractory model should data obtained using biphasic pulses become available.

View this table:
Table 1.

Neural response statistics and corresponding model parameter values

Figure 3 depicts the relationships between the response measures shown on the x-axes and the model parameters shown on the y-axes. Each panel illustrates the unique relationships that arose by following the parameterization sequence summarized in Summary of Parameterization Method. For instance, in Fig. 3A we show the dependence of the nonlinearity parameter α on RS. Once we had fixed the value of α based on the desired RS value, we then computed how the stimulus filter time constant τκ depended on chronaxie. This relationship is shown in Fig. 3B. One-by-one, we obtained values for each parameter, the values of which are marked by the X symbol in Fig. 3 and reported in Table 1.

Fig. 3.

Relationships between response statistics and model parameters. Following the procedure summarized in Summary of Parameterization Method, we obtain unique relationships between response statistics (x-axis) and model parameter values (y-axis). At each step, a parameter value is chosen to fit the response statistic (black X, see Table 1 for exact values), and then this parameter value can be used to obtain a unique relationship between the next response statistic and model parameter value in the fitting hierarchy [A: relative spread (RS); B: chronaxie; C: summation time constant; D: threshold; E: jitter]. Approximation used to determine α (Eq. 20) is shown as a dashed red line in A.

To introduce spike history effects on RS, we must recompute α at the onset of every pulse. In principle, we could use a numerical root finding method to invert Eq. 8, but to avoid this and thereby increase the computational speed of our simulations, we observed that the relationship between α and RS could be approximated by the power law: αRS1.0587.(20) The combination of this equation for α and Eq. 15 provided a simple rule for evolving α according to the time since the last spike. The approximation, which is shown with the red dashed line in Fig. 3A, is suitably accurate.

Model Predictions: Single Pulse Stimuli

Firing efficiency curve.

We first simulated responses of the model to a single pulse of current. To be consistent with the experimental data from which we obtained values for threshold, RS, and jitter, we used a 40 μs per phase biphasic pulse stimulus as the input to the model. By varying the current level of the pulse, we could measure the complete firing efficiency curve, as shown in Fig. 4A. The probability of a spike is defined as the proportion of trials, out of a total of 5,000 for each current level, on which the model generated a spike. Simulation results are shown with black X-marks. As expected, they line up with the analytically derived firing efficiency curve obtained from Eq. 1 and plotted with a blue line. To compare this input/output function to the more standard functional form that is used in the stochastic threshold crossing model of Bruce et al. (1999c), we show in red the integral of a Gaussian distribution with mean and standard deviation consistent with the threshold and relative spread values that we used to fit the point process model. At the highest and lowest current levels, the firing efficiency curve for the point process model is slightly greater, but overall both methods produce similar spiking probabilities in response to a single pulse.

Fig. 4.

Responses of the point process model to single pulse stimulation. A: firing efficiency curves for the point process model (blue), an integrated Gaussian function with the same mean and RS (red), and simulation results from the model (black X). B: distributions of spike times obtained from 5,000 simulations of the model with a jitter filter (blue) and without a jitter filter (red). cdf, Cumulative distribution function.


Figure 4B shows the distribution of 10,000 spike times obtained from the point process model in response to a single biphasic pulse of phase duration 40 μs presented at the threshold stimulus level. The blue line shows results for the model with a jitter filter time constant of τJ = 94.3 μs, as determined by the parameterization method. This produces a distribution of spike times with standard deviation of 86 μs, consistent with the mean value in (Miller et al. 2001b) that was used to parameterize the model. In the absence of a second filtering stage, the model produces an extremely narrow distribution of spike times, as shown by the red distribution. The secondary filter is necessary, therefore, for the model to have realistic amounts of spike time jitter.

Model Predictions: Paired Pulse Stimuli

Summation pulse paradigm.

To test how the neuron model responds to pairs of pulses, we first simulated the summation pulse procedure in (Cartee et al. 2006). As inputs, we used two biphasic pulses of equal current level separated by an interpulse interval of either 200 μs, 500 μs, or 1 ms. The probabilities of observing at least one spike in response to both pulses, for varying interpulse intervals and current levels, are shown as X-marks in Fig. 5A. These were computed from 5,000 repeated simulations of the model. We also plotted the analytically obtained firing probabilities for pulse pairs, obtained from Eq. 1, as curves in Fig. 5A. The effect of summation is visible as a leftward shift of these curves at shorter interpulse intervals. As a reference, the black line reproduces the single pulse firing efficiency curve from Fig. 4A.

Fig. 5.

Responses of the point process model to the summation pulse paradigm. A: simulated spiking probabilities (symbols) for three interpulse intervals: 200 μs (green), 500 μs (red), and 1,000 μs (blue). Solid lines are analytically obtained from the lifetime distribution function in Eq. 1. Single pulse firing efficiency curve is shown in black for reference. B: thresholds estimated from simulated firing efficiency curves and normalized by the threshold for two independent pulses are shown as X-marks for the three interpulse intervals. Facilitation occurs if this threshold ratio is less than one. Black curve is the analytically predicted result in Eq. 13.

To summarize these results, we estimated summation thresholds at each interpulse interval by fitting an integrated Gaussian function. We then normalized the estimated threshold values relative to the single pulse threshold and plotted the values in Fig. 5B. The black curve represents the analytical prediction of the point process model, given by Eq. 13, for the threshold of two independent pulses. Facilitation occurs if the threshold ratio is less than one. In these simulations, we see facilitation for interpulse intervals shorter than ∼1,000 μs. The point process model, therefore, will exhibit facilitation for carrier pulse rates of roughly 1,000 pps and above.

Refractory pulse paradigm.

To probe the effects of spike history in the model, we followed the experimental procedure in (Miller et al. 2001a) and simulated responses to masker-probe pulse pairs. The current level of the first pulse was set to a very high value so that it always elicited a spike. The level of the second pulse was then varied to measure firing efficiency curves. We used pairs of biphasic pulses to be consistent with our previous simulations. Spike probabilities were defined as the proportion of trials (out of 5,000 total) in which the second pulse produced a spike. Firing efficiency curves obtained from responses to the second pulse of current are shown in Fig. 6A for three different interpulse intervals (667, 1,000, and 1,500 μs). As the interpulse interval (and equivalently the time since the last spike) decreases, the spike history effect has a greater influence on the probability that the second pulse will evoke a spike.

Fig. 6.

Responses of the point process model to the refractory pulse paradigm. A: simulated spiking probabilities (symbols) for three interpulse intervals: 667 μs (green), 1,000 μs (red), and 1,500 μs (blue). Solid lines are analytically obtained firing efficiency curves obtained from lifetime distribution function in Eq. 1. Single pulse firing efficiency curve is shown in black for reference. B: thresholds estimated from the simulated firing efficiency curves and normalized by the single pulse threshold are shown as symbols for the three interpulse intervals. Black curve is Eq. 14. C: relative spread estimated from the simulated firing efficiency curves and normalized by the single pulse relative spread are shown as symbols for the three interpulse intervals. Black curve is Eq. 15.

As discussed in Refractory effects, there are two types of refractory effects in the model. The first is the standard refractory phenomenon whereby the model neuron is less excitable immediately after a spike. This is apparent in the rightward shift in the firing efficiency curves for shorter interpulse intervals in Fig. 6A. The second effect is that RS increases at shorter interpulse intervals. This leads to shallower slopes in the firing efficiency curves at shorter interpulse intervals, but the decibel scale in Fig. 6A obscures the change. There are some discrepancies between the simulation results (X-marks) and the predicted values from point process theory at the smallest interpulse intervals (green line). The differences arise from the fact that the simulated spike times in response to the first pulse have a small amount of random variability, whereas the analytical result is computed using the assumption that the time of the first spike is locked to the time of the onset of the first pulse.

To summarize these effects, we estimated threshold and RS from the simulated firing efficiency curves by fitting integrated Gaussian functions. These values are normalized with respect to single pulse threshold and RS and shown in Fig. 6, B and C, respectively. By construction of the model, the values computed from simulations agree with the refractory functions Eqs. 14 and 15, which we plot as black curves. We emphasize that a model with a static nonlinearity, namely one in which α does not depend on spike history, would produce RS values that decrease at shorter interpulse intervals. Alternatively, the stochastic threshold crossing model of Bruce et al. (1999a) assumes a constant RS and would produce a flat line (Fig. 6C).

Model Predictions: Constant Pulse Train Stimuli

The results to this point have validated the fitting method and demonstrated that the model reproduces a range of measures that characterize responses to single pulses and pairs of pulses. Of greater relevance to cochlear implant speech processing strategies are the responses of neurons to extended trains of pulsatile stimuli. We first simulated responses to trains of biphasic pulses with phase duration 40 μs and constant current levels and sought to characterize how carrier pulse rate affects the sequence of evoked spikes. We relate our simulation results to relevant physiological data throughout.

Firing rate and interspike interval distributions.

Figure 7A shows how firing rates in the model increase with current level for three different pulse rates. The stimuli were 1 s in duration. Mean and standard deviations (shown with error bars) were estimated from 100 repeated simulations. At the lowest pulse rate (250 pps, blue line), the interpulse interval is longer than the summation time scale as well as the relative refractory periods for threshold and RS. Thus the firing rate curve follows directly from the single pulse firing efficiency curve in Fig. 4A. The firing rate saturates at one spike per pulse, in this case 250 spikes/s. The 1,000- and 5,000-pps pulse trains have interpulse intervals of 1 ms and 200 μs, respectively, which are short enough for refractory and summation effects begin to impact the behavior of the model. Due to summation and the higher pulse rate, the same firing rate can be evoked by higher pulse rate stimuli using less current per pulse. Thus the firing rate curves are shifted leftward as the pulse rate increases.

Fig. 7.

Firing rate and interspike interval distributions for three stimulus pulse rates. A: firing rate as a function of current level per pulse. Stimuli were one second long trains of biphasic pulses with constant current level per pulse, the level of which is shown on the x-axis. Error bars indicate the standard deviation of the mean firing rate, estimated from 100 repeated simulations of the model. B–D: interspike interval distributions estimated from 10,000 interspike intervals with the pulse rates as indicated. Current level per pulse was chosen so that the stimuli evoked ∼100 spikes/s. [Insets: single fiber recordings from cat auditory nerve (AN) fibers at the corresponding pulse rates reproduced from Fig. 1 in Miller et al. (2008) with kind permission from Springer Science & Business Media: J Assoc Res Otolaryngol.]

Figure 7, B–D, shows interspike interval distributions at the three pulse rates, where the current levels were set to evoke ∼100 spikes/s. The distributions were obtained from histograms of 10,000 spike times in time bins of 100 μs. The distributions obtained from responses to the 250- and 1,000-pps pulse trains, but not the high rate 5,000-pps pulse train, show peaks that are clearly aligned to the interpulse intervals in the stimulus. This feature of the simulated interspike interval distributions is qualitatively similar to distributions recorded from cat AN fibers using the same pulse rates (Miller et al. 2008). For ease of comparison, we include examples of interspike interval data reported by Miller et al. in Fig.7, insets. The model is capable of reproducing some characteristics of the interspike interval distribution for this single neuron, although an important caveat is that the experimentally measured interspike intervals were obtained under different stimulus conditions and likely also different firing rates. We do not intend to claim that the model completely reproduces all of the behavior of this or other cells.

The results of these simulations suggest that spike time jitter, which was incorporated into the model on the basis of responses to single pulse stimuli, can account for distinctive features of the interspike interval distributions and how they vary with the stimulus pulse rate. On the one hand, temporal variability in spike times is small relative to the interpulse intervals for the lower pulse rate stimuli, which leads to the periodic nature of these distributions in Fig. 7, C and D. On the other hand, this small temporal variability is sufficient to spread the simulated spike times across the interpulse interval for the 5,000 pps, creating a more smoothly varying interspike interval distribution in Fig. 7B, consistent with the appearance of the distribution shown in Fig. 7B, inset.

Synchronization of spike times.

Simulation and physiological studies have generated interest in the possibility that high rate pulse trains may desynchronize neural responses to cochlear implant stimulation (Rubinstein et al. 1999; Litvak et al. 2003c). We tested whether the model exhibited similar signatures of desynchronization by computing a synchronization index, known as vector strength (Goldberg and Brown 1969), and then compared our results to physiological data reported in Miller et al. (2008). For a sequence of N spike times {ti}1N, vector strength (VS) is defined with respect to a period T as: VS=1N[i=1Ncos(2πti/T)]2+[i=1Nsin(2πti/T)]2.(21) VS takes values between 0 and 1, with higher values being interpreted as a more synchronized spike train. In these simulations, we use VS to measure the strength of phase locking to the period of the stimulus pulse train, and thus T represents the interpulse interval. In all simulations, as well as the experimental data to which we compare our results, the stimulus is a train of biphasic pulses with a constant current level and a 40 μs phase duration. Simulation results for three pulse rates are shown Fig. 8, left, where each circle represents the firing rate and VS values computed from the response to a 10-s long pulse train. Our results can be compared with VS values obtained from recordings of cat AN fibers responding to biphasic pulses trains presented at the same pulse rates and pulse shape. These data, which we reproduce from Miller et al. (2008), are shown in Fig. 8, right.

Fig. 8.

Vector strength (VS) measured from responses to three pulse rates. Left: vector strength computed from simulated responses to 10 second long pulse trains of biphasic pulses with equal current levels. Current levels were varied to explore a range of firing rates. Note the change of scale on the y-axis; VS values decrease as pulse rate increases. Right: VS values reported by Miller et al. (2008) based on responses of 37 cat AN fibers stimulated with intracochlear electrodes using monopolar, biphasic pulse trains. Black diamonds do not represent median values, see Miller et al. (2008) for details. [Reproduced from Fig. 8 in (Miller et al. 2008) with kind permission from Springer Science & Business Media: J Assoc Res Otolaryngol.]

For the lowest pulse rate, VS values obtained from simulations exceeded 0.98 for all firing rates. This represents near perfect phase locking to the 250-pps pulse train. VS systematically decreases with pulse rate: note the change of scale on the y-axes. Comparisons with the data of Miller et al. (2008) show good agreement between simulated and measured VS values. The primary discrepancy between the two sets of VS values is the large variability present in the data values, seen as scatter in the vertical direction of these plots. A likely cause of this difference is the fact that we simulated a single model neuron with a fixed set of model parameters. In contrast, the data were obtained from a sampling of 37 AN fibers. Presumably each neuron had different intrinsic properties that may even change further over the course of the experiment. Additionally, Miller et al. suggested that some scatter in the VS values measured from responses to the 5,000-pps stimulus may have been due to limitations in their ability to resolve the phase of spike times with respect to the 200-μs interpulse interval.

On the basis of the lower VS values at higher pulse rates, one is tempted to conclude that these results reveal the desynchronizing effects of high pulse rate stimulation. As noted in Miller et al. (2008), however, higher VS values at higher carrier pulse rates may not indicate desynchronization since VS is computed with a different reference period for each stimulus, depending on the interpulse interval. In particular, at higher carrier pulse rates the period T in Eq. 21 is smaller, and this can lead to lower VS values for spike trains with equivalent amounts of temporal dispersion. To test whether the low VS values computed in simulations using 5,000-pps pulse trains do in fact indicate desynchronizing effects of high pulse rate stimulation, we examined the distribution of spike times relative to the onset time of the pulse immediately preceding the spike. These spike time distributions are shown in Fig. 9 and were estimated from 10,000 spike times using a 10-μs bin width. Figure 9A shows distributions for the 250 (blue)- and 5,000-pps (green)-pps pulse trains for a current level set to evoke ∼100 spikes/s. In Fig. 9B, we show distributions of spike times obtained from stimuli that caused the model to spike at 225 spikes/s. The spike time distributions are nearly identical at the lower firing rate indicating that the lower VS values obtained with the 5,000-pps stimulus do not reveal any desynchronization in this case. At the higher firing rate, however, the distribution of spike times in response to the 250-pps stimulus is considerably narrower than the distribution of spike times measured from the 5,000-pps pulse train. One could interpret this as desynchronization, although it may be more accurate to say that the 5,000-pps stimulus is maintaining a degree of spike time variability that is lost when the current level of the 250-pps stimulus is increased and the evoked firing rate approaches the maximal value of 250 spikes/s.

Fig. 9.

Further comparison of spike timing variability in response to low and high rate pulse trains. A: distribution of times between simulated spikes and the onset of the previous pulse. Stimuli are trains of biphasic pulses of either 250 pps (blue) or 5,000 pps (green), with a constant current level per pulse. Current level was chosen so that firing rates of the model were ∼100 spikes/s for both stimuli. 10,000 simulated spike times were used to estimate the distributions, which were plotted using 10-μs bin window. B: similar to A, with current levels increased so that firing rates in simulations were ∼250 spikes/s. C: similar relationship between jitter and firing efficiency in the point process model. Black curve represents analytical calculation of jitter from lifetime distribution function, and blue circles mark the firing efficiency values, interpreted as average probability of a spike per pulse, corresponding to the pulse rates and firing rates in B and C. D: relationship between and jitter and firing efficiency in measurements of responses of cat AN fibers to monopolar, monophasic (cathodic) stimulation by an intracochlear electrode. [Reprinted from Hear Res (Miller et al. 1999) with permission from Elsevier.]

We can explain the narrowing of the spike time distribution by considering the main source of spike time variability in the model. In Fig. 9C, we show the amount of jitter in simulated spike times, as a function of firing efficiency for single pulse stimulation. We obtained this curve directly from the density function for spike times in Eq. 3. For the 250-pps stimulus, firing rates of 100 and 225 spike per second translate to firing efficiency values per pulse of 40 and 90%, respectively. The blue circles mark the location of these firing efficiency values. Jitter in the model decreases substantially at the higher firing efficiency value, which leads to the narrower spike time distribution in Fig. 9B. In contrast, increasing the firing rate from 100 to 225 spikes/s when using a 5,000-pps stimulus translates to a small change in the firing efficiency, here interpreted as the average probability of a spike per pulse, from 2 to 4.5% (green circles). The model predicts, therefore, that high rate pulse trains can maintain temporal variability in spike times even at relatively high firing rates because the probability of a spike on any single pulse remains low. The key feature of the model that explains these results, the fact that jitter decreases with firing efficiency, has also been observed in the responses of cat AN fibers to electric stimulation using monophasic pulses (Miller et al. 1999). These data are reproduced in Fig. 9D. They show a qualitative match with our point process predictions and underline the conclusion that the desynchronization observed at high pulse rates largely follows from the use of weaker current impulses.

Irregularity of firing responses.

An additional response statistic that has been used to characterize responses to high rate pulse trains is the Fano factor. This quantity measures irregularity in the number of observed spikes. It is defined as the variance in the number of spikes observed in a time window normalized by the mean value. A Fano factor of one, which is the value generated by any Poisson process model of spiking, is considered to signify a highly irregular spike generator and lower values indicate more regularity. To estimate Fano factor from the model, we simulated responses to a 1,000-s long train of biphasic pulses (40-μs pulse duration), using 250- and 5,000-pps pulse trains. Ten estimates of Fano factor were obtained by subdividing these spike trains. Results are shown in Fig. 10A, with error bars that represent standard error of the 10 estimates of Fano factor for each stimulus condition. In Fig. 10B, we reproduce a figure from (Miller et al. 2008) to compare the behavior of the point process model to AN fibers in cat, also stimulated by biphasic pulse trains. These are median values obtained from recordings of 37 AN fibers in 8 deafened cats, as such they do not reveal the considerable variation around the medians present in the experimental data.

Fig. 10.

Fano factor measured from responses to low (250 pps) and high (5,000 pps) pulse rates. A: simulation results using the point process model. Results for the low pulse rate stimulation (blue line) are predicted by the Fano factor for a binomial distribution (black line, see text for details). Gray line shows the corresponding prediction of a binomial distribution for the 5,000-pps pulse train. Green line illustrates that simulated responses to the 5,000-pps pulse train can have larger Fano factors than responses to the 250-pps pulse train if the refractory period is shortened (τθ = 200 μs). Fano factor values are estimated from 100 intervals of simulated spike trains, where each interval is 100-s long. Error bars represent SE of these estimates. B: medians of Fano factors recorded from cat AN fibers. Numbers express the ratios of the median Fano factors for the 5,000-pps stimulus in proportion to the median Fano factors for the 250-pps stimulus. [Reproduced from Fig. 9 in Miller et al. (2008) with kind permission from Springer Science & Business Media: J Assoc Res Otolaryngol.]

Fano factors obtained from simulations in response to 250-pps pulse trains are shown by the blue curve in Fig. 10A, and results for the high rate pulse trains are in red. The clear trend is a decrease in Fano factor as firing rate increases, consistent with the physiological data in Fig. 10B. As firing rates approach 250 spikes/s, the responses to the low rate stimulus saturate at one spike for every pulse, leading to a sharp decrease in the Fano factor, a trend also seen in the data. One substantial discrepancy between simulation results and the median Fano factors reported in Miller et al. (2008) is that, in the data, Fano factors obtained from responses to high pulse rate stimuli were larger than those obtained from response to the low pulse rate stimuli for most firing rates and exceeded one in some cases. The numbers above data points in Fig. 10B indicate the multiplicative factor by which the high rate Fano factors exceed the low rate values. These experimental results supported the notion that high pulse rate stimulation can generate more irregular spiking activity.

Can our model predict this phenomenon? As we show in greater detail in the appendix, the effects of refractoriness and interpulse interactions tend to decrease Fano factor (see also Berry and Meister 1998). When the response of a neuron to any pulse is independent of all previous stimuli and spike history, the activity can be described as a sequence of Bernoulli trials, where the probability p that any one pulse evokes a spikes is equal to the average firing rate divided by the number of pulses. The total number of spikes, in this case, would be distributed binomially with mean Np and variance Np(1 − p), where N is the total number of pulses. If we denote the average firing rate by r and the pulse rate by ρ, then the Fano factor would have a simple dependence on p, and, consequently, the average firing rate r: Fbinomial=1p=1rρ.(22)

Our simulation results for the 250-pps pulse train are completely explained by this analogy to a binomial distribution. Due to the relatively long interpulse interval of 4 ms, past spike and stimulus histories have no effect on the model's response to subsequent pulses. The curve representing Fbinomial in Eq. 22 for the 250-pps pulse train is shown in black in Fig. 10 and follows the simulation results. At 5,000 pps, there are strong effects of stimulus and spike history, so the binomial analogy is not expected to approximate the behavior of the model. However, it does provide an upper bound for the possible Fano factors that the model can achieve, which is shown by the gray line. This illustrates that there is room for the model to generate higher Fano factor values (although no larger than one), in response to high pulse rate stimulation but different parameter values must be chosen. To test the capacity for the model to produce higher Fano factors, we weakened the refractory effect by decreasing the time scale of the relative refractory period, τθ in Eq. 14, to 200 μs. This value is still within the range observed in physiological recordings (Miller et al. 2001b). All other model parameters were kept the same. The results from simulations of this model, shown in green, illustrate that it can achieve higher Fano factor values, although the difference between Fano factors at the low and high pulse rates remains larger in the data.

We were able to explore the relationship between refractory dynamics and Fano factor in the model more fully by developing a discrete-time Markov chain approximation to the point process model. The Markov chain framework extends the Bernoulli process analogy (introduced above for low pulse rate stimulation) for stimuli in which spike history and interpulse interaction effects are present. We used the Markov chain approximation to obtain estimates of firing rate and Fano factor and explore the full range of possible behaviors in the model. See the appendix for further details.

Model Predictions: Amplitude-Modulated Pulse Train Stimuli

Modern cochlear implant speech processors provide temporal information to cochlear implant listeners by modulating the current levels of pulses over time. It is important, therefore, to explore how the model responds to nonconstant pulse trains. To relate model results to available physiological and psychoacoustic data, we simulated responses to sinusoidally amplitude-modulated pulse trains. Specifically, as inputs to the model we used trains of biphasic pulses with the current level of the nth pulses defined by the equation: In=I¯[1+msin(2πtnfm)]. In is the current level of the pulse with onset at time tn, and the parameters m and fm parameterize the modulation depth and modulation frequency, respectively. As in previous sections, these pulses were charge balanced and had phase duration of 40 μs. To compare the model with available physiological data (Litvak et al. 2003b) and a recent modeling study (O'Gorman et al. 2010), we used a 5,000-pps carrier pulse rate modulated at a frequency of 417 Hz. We simulated two mean current levels, one that evoked ∼50 spikes/s when the pulse train was unmodulated and a second that evoked ∼100 spikes/s. The duration of all pulse trains used in simulations was 10 s, and the results from 10 repeated simulations for each stimulus condition were used to compute standard errors in estimated response statistics.

In Fig. 11A, we show simulated firing rates, as a function of modulation depth. Firing rates increase with modulation depth for m greater than ∼5%. We compared these results to cat single fiber data measured by Litvak et al. (2003b), which we have reproduced in Fig. 11B. The simulation results qualitatively match the experimental data, especially when compared with the two fibers that have lower firing rates (circle and filled square in B). The main discrepancy between the simulated and recorded firing rates is that the model does not exhibit increased spiking until the modulation depth increases beyond 5%, whereas the recorded firing rates increase at modulation depths as low as 1%.

Fig. 11.

Responses to sinusoidally amplitude-modulated pulse trains with 417-Hz modulation frequency. A and C: simulation results obtained from responses to 10-s long biphasic pulse trains with 40-μs pulse durations and modulation frequency given on the x-axis. Two mean current levels were used, one that produced a mean firing rate of 50 spikes/s for unmodulated input (blue) and a higher value that produced a mean firing rate of 100 spikes/s (red). Error bars represent SE of 10 repeated simulations. VS is computed with reference to the period of modulation. B and D: firing rate and VS measures obtained from five cat AN fibers stimulated by intracochlear monopolar cochlear implant stimulation using biphasic pulse trains with duration of 25 μs/phase. [Reprinted with permission from Litvak et al. (2003b), copyright Acoustical Society of America.]

To quantify the sensitivity of spike timing to the period of the modulated waveform, Litvak et al. computed VS as in Eq. 21. The relevant period in this case is the period of modulation, so T = 2.4 ms in Eq. 21. Simulation results using the point process model are shown in Fig. 11C, and the experimental measurements of Litvak et al. (2003b) are reproduced in Fig. 11D. The model again qualitatively captures the increase in VS with increasing modulation depth, although with important quantitative differences. In particular, the experimental data show that AN fibers have exquisite sensitivity to weak modulations, producing VS values of ∼0.5 for modulation depths of only 0.5%. As shown by O'Gorman et al. (2010), this sensitivity to weak modulations may be a consequence of nonlinear mechanisms in spike generators that are accounted for in Fitzhugh-Nagumo dynamical models but appear to be lacking in this point process description.

Application to Amplitude Modulation Detection

In this final set of simulations, we explore how the temporal information in spike patterns can be quantified in a way that enables simulation results to be interpreted in the context of psychoacoustic experiments studying the ability of cochlear implant listeners to detect modulation at varying carrier pulse rates (Galvin and Fu 2005; Pfingst et al. 2007; Galvin and Fu 2009). We begin with a simple observation: a 250-pps carrier, due to the tight phase-locking of spikes relative to the timing of pulses (see Fig. 8A), does not appear to evoke spiking activity that represents a modulated envelope. This point is illustrated in Fig. 12, left. Here, we show histograms of spike times obtained from one cycle of a pulse train modulated at 75 Hz. From top to bottom, the modulation depth is increased from 1 to 10%. The current level is fixed so that 50 spikes/s are evoked, on average, in response to unmodulated stimuli. Our intuition tells us that this sparse and punctate pattern of spikes will not effectively transmit information about a slowly varying envelope. Nonetheless, these responses still carry some information about modulation depth, as evidenced, for instance, by the increasing height of the second peak as modulation depth increases.

Fig. 12.

Histograms of response to 250 (left)- and 5,000 (right)-pps pulse trains modulated at 75 Hz. Modulation depth increases from 1% (top) to 10% (bottom). Histograms were estimated from 10,000 repeated simulations of responses to one cycle of trains of biphasic pulses, and plotted with 50-μs time bins. Current levels were set separately for each pulse rate so that unmodulated pulse trains presented would evoke ∼50 spikes/s.

In Fig. 12, right, we show responses to a cycle of a 5,000-pps pulse train modulated at 75 Hz. These histograms indicate that the high rate carrier appears to provide a more complete representation of the envelope of the modulated stimulus. In contrast to the low rate carrier, then, we may expect that this carrier would provide greater temporal information regarding the presence of a sinusoidal modulation. In other words, we may expect improved amplitude modulation detection if an observer (for instance higher processing centers in the auditory pathway) had access to this pattern of spikes as opposed to those in Fig. 12, left.

To test these statements quantitatively, we used an ideal observer analysis to simulate modulation detection based on spike trains generated by the point process model. A general result of point process theory allows us to use the conditional intensity function λ[t|I(t), H(t)] in Eq. 4 to compute the log-likelihood of observing a sequence of spike times {ti}iN, conditioned on the input stimulus I(t) (Daley and Vere-Jones 2003; Paninski 2004; Truccolo et al. 2005): L({ti}iN|I)=i=1Nlog{λ[ti|I(ti),H(ti)]}0sλ[t|I(t),H(t)]dt,(23) where s is the total duration of the stimulus. To analyze how temporal modulations are encoded in spike trains, we simulated two spike trains: one in response to an unmodulated pulse train and one in response to a modulated pulse train with 75-Hz modulation frequency and 1% modulation depth. Both stimuli were 1-s long and consisted of biphasic pulses that were 40 μs per phase. After computing the likelihood function (Eq. 23) for all possible pairings of spike trains and stimuli, we used a likelihood ratio test to discriminate between the two spike trains (Green and Swets 1966; Pillow et al. 2005; Goldwyn et al. 2010b). If the true pairings of stimulus and response produce the highest likelihood values, then the ideal observer is said to correctly detect the modulated stimulus. By repeating this procedure, we estimated percent correct detection values. We also varied the carrier pulse rate from a low carrier pulse rate (250 pps) to a high carrier pulse rate (5,000 pps) to investigate our previous observation that low pulse rate stimulation produces an (apparently) incomplete representation of the modulated waveform.

The results of this analysis are shown by the red line in Fig. 13, where error bars indicate the standard error in the mean of 10 computed values of percent correct. The x-axis shows the carrier pulse rate used as the input to the point process model. At all pulse rates tested, for this controlled spike rate of 50 spikes/s, this maximum likelihood discrimination procedure produced correct discrimination on ∼80% of trials. These simulations did not reveal any strong changes due to the carrier pulse rate. Despite the visible differences in Fig. 12, the spike patterns in response to modulated stimuli are equally detectable regardless of the carrier pulse rate.

Fig. 13.

Percent discrimination of a sinusoidally amplitude-modulated pulse train. Three discrimination measures are were used to discriminate unmodulated pulse train from a modulated pulse train on the basis of simulated spike trains (see text for details). Both stimuli were one second long trains of biphasic pulses, 40 μs per phase and current level was varied for each carrier pulse rate so that the unmodulated pulse train always evoked ∼50 spikes/s, on average. Modulated pulse trains had 75-Hz modulation frequency and 1% modulation depth. Percent correct values were computed from 1,000 repeated simulations of the model, error bars represent SE in the mean of 10 repeated estimates of percent correct. Chance level is 50% correct.

To further characterize the information in these spike patterns, we also computed discrimination measures using decision rules that selected the modulated pulse train on the basis of which spike train response had a higher spike count (blue line) or a higher vector strength with reference to the 75-Hz modulation (green line). From Fig. 11, we can see that spike count and vector strength increase with modulation depth, so simulated spike trains can encode the presence of modulation information using either of these cues. For the small modulation depth used in these simulations, the spike count discrimination rule had near chance performance at just over 50% correct. The vector strength discrimination measure produced a higher percent correct, although it was still substantially below the value obtained using the spike train likelihood technique discussed above, indicating that vector strength as a measure of phase locking does not completely describe how temporal modulations are expressed in the precise sequence of spike times. Similar to the likelihood function discrimination measure, neither spike count nor vector strength discrimination predicted percent correct values that depended strongly on the carrier pulse rate.

To summarize these simulations of modulation detection: we found that modulation detection based on three different decision rules did not vary across a range of carrier pulse rates. This ideal observer analysis suggests that modulation detection does not require peripheral spike patterns that fully represent the sinusoidal envelope of these pulse trains.


Review of Main Findings

In this study, we have used point process theory to derive mathematical expressions for statistical measures of neural excitability that are commonly used to characterize the response properties of AN fibers to electrical stimulation. Furthermore, we proposed an explicit model with features that could be related to biophysical mechanisms in a phenomenological sense. These included temporal filtering that represented subthreshold dynamics of the membrane potential, a nonlinearity associated with spike generating processes, and a secondary filter that accounts for variability in spike timing.

An essential feature of the proposed modeling framework is that all parameters can be determined from responses to single and pairs of electric pulses. The model is minimal in the sense that each model parameter is uniquely identified with a single statistic reported in physiological experiments, as illustrated in Fig. 3. Moreover, the point process model incorporated dynamical and stochastic properties that are relevant to high pulse rate stimulation. Specifically, the relative spread of the model depends on the time since the last spike in a manner consistent with data reported in (Miller et al. 2001a), and subthreshold pulses presented in rapid succession could combine to increase the excitability of the model neuron, a facilitation phenomenon observed in electrical stimulation of AN fibers (Cartee et al. 2000, 2006; Heffer et al. 2010).

The construction of the model ensured that it would reproduce the measures of threshold, relative spread, jitter, chronaxie, as well as changes in threshold and relative spread in the context of response to pairs of pulses separated by an interpulse interval. To further test whether our model generalizes, we simulated responses to extended stimuli of greater relevance to cochlear implant stimulation: pulse trains with constant current per pulse and current levels that were sinusoidally amplitude-modulated. The model produced interspike interval distributions that qualitatively agreed with data reported in Miller et al. (2008) (see Fig. 7) and also provided important insight into how temporal jitter in spike times could account for synchronization to pulse trains presented at multiple pulse rates (see Fig. 8). Overall, it appears that the proposed point process framework provides a satisfactory description of AN fiber spike trains, although there are important shortcomings and potential extensions to consider, which we discuss below.

Limitations and Possible Extensions

Although the model accurately predicted responses to extended pulse trains in several ways, it was unable to quantitatively predict the precise dependence of Fano factor on pulse rate and firing rate (Fig. 10) and the exquisite sensitivity of AN fibers to modulation as small as 0.5% (Fig. 8). A possible explanation for these results can be found in recent work by O'Gorman et al. (O'Gorman et al. 2009, 2010), which showed that the Fitzhugh-Nagumo model of neural dynamics exhibits an instability when driven with 5,000-pps stimuli and that this mechanism can account for experimentally measured values of Fano factor, together with the strong phase locking to weak modulations reported by Litvak et al. (2003b). Future work could seek to synthesize these developments by developing models that capture this dynamical mechanism, can be parameterized to additional physiological data, and can be used for likelihood-based discrimination studies. For example, one possible approach for improving the sensitivity of the model to small modulations is to include an additional filter to the stimulus that acts as a differentiator on the sinusoidal envelope.

A related shortcoming of the model is that it cannot generate a Fano factor greater than one. Fano factors larger than one have been measured in studies of AN fiber responses to cochlear implant stimulation (Miller et al. 2008), so future extensions of the model should seek to remedy this limitation. Large Fano factors have also been observed in numerous studies of AN fibers in nondeafened animals (Lowen and Teich 1996, and references therein), which suggest that irregular firing may be a generic feature of AN fiber activity. Doubly stochastic point processes and point processes with fractal noise have successfully modeled the irregular firing patterns of AN spike trains and generated physiologically realistic Fano factors by inducing noise correlations over long time scales (Teich et al. 1990; Kumar and Johnson 1993; Zilany et al. 2009). Incorporating these types of noise processes in the point process framework could be an important step toward improving predicted responses to pulse train stimuli.

An additional mechanism that could be included in the model to induce changes in firing probabilities over long time scales is firing rate adaptation, also referred to as accommodation in the cochlear implant literature (Sly et al. 2007). A dynamic interplay between accommodation and facilitation could increase the irregularity of firing and therefore potentially increase Fano factor values. Furthermore, recordings from AN fibers in deafened cats have shown that adaptation affects neural responses to amplitude-modulated cochlear implant stimuli (Hu et al. 2010) and differentially affects responses to low and high rate pulse trains (Zhang et al. 2007). Adaptation appears to be especially strong at high pulse rates; Zhang et al. (2007) recorded from some neurons that responded to 10,000-pps stimuli with only brief onset bursts of spikes. Including firing rate adaptation in future point process models, therefore, will be a significant advance in modeling AN fiber responses to cochlear implant stimulation. Our initial attempts to add a phenomenological representation of adaptation in the point process model incremented the spike threshold (model parameter κ) after each spike and allowed the threshold to relax back to its baseline level on a time scale longer than the refractory effect (Goldwyn and Shea-Brown 2010). This spike-history dependent adaptation mechanism is phenomenologically similar to a mechanism proposed by Woo et al. (2009). In their biophysically based model, extracellular concentrations of potassium ions transiently increased after a spike leading to slow changes in neural excitability. An alternate mechanism based on ion channels that are active at subthreshold membrane potentials (Ih and IKLT) has been proposed by Negm and Bruce (2008). At present there is no method for mapping voltage-dependent ion currents to parameters or filters in the point process framework, but as we discuss in Relation to Previous Models, it is possible to approximate the essential contributions of these channels to subthreshold dynamics using simplified models that could be incorporated into the point process model.

Relation to Previous Models

The point process model extends previous simplified models of AN fiber spiking that used a stochastic threshold crossing formulation (Bruce et al. 1999a,c; Litvak et al. 2003b) by including temporal integration, facilitation, spike history-dependent relative spread, and a realistic amount of spike time jitter. An alternate class of simplified models that have been used to describe responses of AN fibers to electric stimulation is the integrate and fire model (Stocks et al. 2002; Chen and Zhang 2007). As explained in Connections to Biophysical Mechanisms and Models, the point process model can be interpreted as a modified integrate and fire model with escape noise. One significant difference between our approach and standard integrate and fire models is that we have introduced an asymmetry in the otherwise linear subthreshold dynamics, enabling the model to exhibit facilitation in response to charge balanced biphasic pulses. An advantage of the point process model over standard integrate and fire models is that spiking probabilities can be computed relatively simply using the conditional intensity function and the corresponding lifetime distribution function in Eq. 1. One does not need to estimate hazard rates (Plesser and Gerstner 2000) or solve difficult first passage time problems. We have used this mathematical tractability to derive a direct parameterization method based on standard physiological data.

Point process models can be an alternative to biophysically detailed models that have been developed to study cochlear implant stimulation of AN fibers (Rattay and Felix 2001; Briaire and Frijns 2006; Negm and Bruce 2008; Imennov and Rubinstein 2009; Woo et al. 2010). These detailed models, based on the Hodgkin- Huxley formalism, incorporate explicit representations of how action potentials arise from the coordinated dynamics of voltage-gated ion channels. They can also include spatial structure such as multiple nodes of Ranvier, myelinated axons, and degeneration of peripheral processes. Hodgkin-Huxley type models are appealing because of their relatively realistic representation of the biophysics of AN fiber activity, but they can require extensive computational resources to simulate and have a large number of model parameters. The point process framework is more phenomenological in nature, but its small number of parameters can be uniquely determined from relatively common experimental data and can be simulated with a lower computational burden. An intriguing direction for future investigations would be to seek to relate parameters in the point process model to features of Hodgkin-Huxley type models. We have alluded to two connections that could be explored in more detail: summation of consecutive subthreshold pulses (represented by the model parameter β) may be related to the kinetics of sodium activation, as discussed by Cartee (2000), and the filter J(t) that operates after the nonlinearity may be related to spatial interactions among multiple excitable nodes of Ranvier, as investigated in Mino et al. (2004).

One possible route that could connect the point process framework to biophysically detailed models is via the analogy to integrate and fire models with escape noise, discussed in Connections to Biophysical Mechanisms and Models. Methods that have been developed to approximate Hodgkin-Huxley type models with integrate and fire or spike-response models (Gerstner and Kistler 2002; Paninski et al. 2004; Badel et al. 2008) may suggest how the cascade of filters and nonlinear functions that make up the point process framework could be modified to accurately approximate biophysically detailed models of AN fibers. Our choice of the exponential filter K(t) produced a model with linear subthreshold dynamics, but more complicated subthreshold dynamics can also be included in the integrate and fire framework. The recent model of Negm and Bruce (2008) noted the importance of two types of voltage-dependent ion channels that are active subthreshold and thought to be present in AN fibers: hyperpolarization-activated cation current Ih and low threshold potassium current IKLT. An essential dynamical effect of Ih is that it leads to subthreshold oscillations in membrane potential (Izhikevich 2007). So-called resonate and fire models are an extension of the integrate and fire that produce subthreshold oscillations (Izhikevich 2001) and have been used to model the response of AN fibers to cochlear implant stimulation (Macherey et al. 2007). Overall, such resonating subthreshold dynamics would lead to differentiating-type filters applied to the stimulus that could account for the sensitivity to small stimulus modulations referred to above (Agüera y Arcas et al. 2003). Extensions of the integrate and fire model have also been derived to account for the effects of IKLT (Svirskis and Rinzel 2003). We note that altering the subthreshold dynamics of the model would introduce additional parameters that could potentially interfere with the unique relationships between model parameters and response statistics that we have derived; as in any modeling study there is an unavoidable tension between biological detail and model parsimony.

Future Directions

Our simulations have focused on the activity of individual AN fibers, but cochlear implants convey sound information by simultaneously stimulating thousands of AN fibers. Complete models of cochlear implant stimulation must, therefore, efficiently simulate the activity of a population of AN fibers distributed throughout the cochlea. This, together with the computational demands of exploring and eventually optimizing stimulus parameters, demands efficient models. Furthermore, since response statistics such as threshold, relative spread, and jitter can vary widely across a population of AN fibers (Miller et al. 1999, 2001b), it is necessary to construct a highly heterogeneous population of AN fibers with a representative distribution of response properties. This can be accomplished numerically with biophysically detailed models (Imennov and Rubinstein 2009); however, the point process framework greatly simplifies the procedure because each model parameter is uniquely identified with a response statistic. Thus the point process framework provides an explicit method for manipulating the distributions of response properties of a population of model AN fibers. In addition to facilitating future simulations of populations of AN fibers, this parametric control can also be used to investigate which response properties of a population of neurons may impact neural encoding of cochlear implant stimulation in a significant way. For instance, physiological data from rat AN fibers suggest that long-term deafness can increase absolute refractory periods and excitability thresholds (Shepherd et al. 2004). The effects of these changes can be tested with the model via straightforward modifications of parameter values.

A complete model of cochlear implant stimulation of the AN also requires a description of the spatial spread of electric fields in the cochlea (Bruce et al. 1999b; Briaire and Frijns 2000; Rattay et al. 2001; Bonham and Litvak 2008; Cohen 2009a; Goldwyn et al. 2010a). We have focused on temporal response properties of individual AN fibers, but the point process framework could be incorporated into existing models that simulate populations of AN fibers. For instance, the stochastic threshold crossing model of Bruce et al. (1999c) and variants thereof have been coupled to models of spatial spread of electric fields to simulate the responses of a population of AN fibers (Bruce et al. 1999b; Bonham and Litvak 2008; Cohen 2009b; Goldwyn et al. 2010a). Since the point process model can be viewed as an extension of the stochastic threshold crossing model, it would be straightforward to model spatial fields as described in these studies and simulate AN fiber activity using the point process model in place of a stochastic threshold crossing model. In practice, this would require generating a population of AN fibers and then stimulating each neuron with a current level that may depend on the putative distance of the model fiber from the stimulating electrode, as defined by a three-dimensional model of intracochlear electric fields. Neural response properties may also depend on electrode-to-neuron distance (Mino et al. 2004), so the unique relationship between model parameters and response statistics could be used to generate a heterogeneous population of model AN fibers that incorporates the effects of their spatial distribution.

Our approach to constructing the point process model was to carefully select model features that could be plausibly related to known properties of neural dynamics and be uniquely identified with single pulse and paired pulse response statistics. An alternative approach that would still be grounded in the point process framework would be to fit a model directly to a richer set of stimuli that are more relevant to the study of cochlear implant speech processing strategies. We have pointed out that the form of the model presented here, a cascade of linear and nonlinear stages, is similar to standard GLMs (Paninski 2004). GLMs have been shown to capture the activity of sensory neurons in retina with a high degree of temporal precision and can be parameterized from sets of spike trains recorded in response to arbitrary time-varying stimuli (Pillow et al. 2005). They have also been recently applied to AN recordings for acoustic stimulation (Trevino et al. 2010; Plourde et al. 2011). An important direction for future work, therefore, would be to fit GLMs to single unit recordings of AN fibers using sets of stimuli that are clinically relevant to cochlear implant speech processing. One could also pursue a model-based approach and fit GLMs to biophysically detailed models of AN fiber responses to cochlear implant stimulation (Imennov and Rubinstein 2009; Woo et al. 2010). The resulting point process descriptions would not necessarily be constrained by data from single pulse and paired pulse stimuli, but the mathematical relationships in Point Process Theory could still be used to analyze the resulting models. In particular, in Application to Amplitude Modulation Detection we illustrated how the point process model, specifically the conditional intensity function, can be used to simulate a discrimination task. GLMs and related point process models can be used, therefore, to quantitatively evaluate cochlear implant stimulation strategies by simulating psychophysics experiments. The conditional intensity function of a GLM can also be used to design stimuli that evoke desired temporal patterns of spikes (Ahmadi et al. 2011). Thus future applications of point process models of AN fibers could seek to develop principles of optimal stimulus design that could translate to improved cochlear implant stimulation strategies.

Implications for High Pulse Rate Stimulation Strategies

A motivation for this work was to develop a model that could accurately predict responses to high carrier pulse rate stimulation. We have done this by including dynamical and stochastic features that reflect how past stimulation and past spiking activity can influence the excitability of AN fibers. Our simulation results in Fig. 12, past modeling studies (Rubinstein et al. 1999; Mino et al. 2002; Chen and Zhang 2007), and physiological evidence (Litvak et al. 2003a,b) suggest that responses to high pulse rate stimulation may represent the envelope of temporally modulated stimuli with greater fidelity than responses evoked by low pulse rate stimulation. Moreover, one would intuitively expect high pulse rate stimulation to provide greater temporal information to cochlear implant listeners and thereby improve speech perception and psychophysical performance on tests of temporal resolution. The fact that, to date, no psychoacoustic studies have shown clear evidence that high pulse rate stimulation improves speech reception or amplitude modulation detection poses an intriguing challenge to our understanding of the connection between AN spiking activity and auditory perception.

To connect AN spiking activity to psychoacoustic experiments, one can leverage the mathematical theory of point processes, which provides access to tools from signal detection and information theory (Heinz et al. 2001a,b; Goldwyn et al. 2010b). In Application to Amplitude Modulation Detection, we used the likelihood function of the point process model (Eq. 23) to simulate modulation detection for a 75-Hz sinusoidally amplitude-modulated pulse train. In human listeners, modulation detection at this frequency is correlated with speech perception (Won et al. 2011), so it is of interest to understand how neural responses transmit the temporal information in these stimuli.

The simulated spike trains evoked by low pulse rate stimulation were equally discriminable to those evoked by high pulse rate stimulation in our simulation of amplitude modulation detection (Fig. 13). These results suggest the possibility that cochlear implant listeners could successfully discriminate modulated and unmodulated stimuli in the context of psychoacoustic experiments, even when patterns of evoked neural activity in the auditory periphery provide distorted or incomplete representation of the stimulus envelope. This observation is consistent with psychoacoustic evidence that modulation detection in human listeners does not improve with high carrier pulse rate stimulation (Galvin and Fu 2005, 2009; Pfingst et al. 2007; Galvin III and Fu 2009). Moreover, it highlights an essential challenge for improving listening outcomes for cochlear implant users; to identify the relative benefits (or shortcomings) of high pulse rate stimulation, research should seek to identify psychoacoustic measures that reveal the perceptual consequences of the distinct patterns of neural activity evoked by low and high rate pulsatile stimulation.


This work has been supported by National Institute on Deafness and Other Communication Disorders Grants F31-DC-010306 (to J. H. Goldwyn), R01-DC-007525 (to J. T. Rubinstein), and the Burroughs Wellcome Fund (to E. Shea-Brown).


Dr. J. T. Rubinstein has received grant funding from and has served as a paid consultant to Cochlear, Ltd and Advanced Bionics Corp, two manufacturers of cochlear implant devices. Neither company played any role in data collection, analysis, or authoring this manuscript.


Author contributions: J.H.G., J.T.R., and E.S.B. conceived and designed the study. J.H.G. performed simulations, performed data analysis, and wrote manuscript. J.H.G., J.T.R., and E.S.B. edited, revised, and approved manuscript.


Definition of the Markov Chain for Constant Pulse Train Stimulation

In Synchronization of spike times, we discussed how the firing rate and Fano factor produced by low pulse rate, constant current level stimulation can be estimated for the point process model by making an analogy to a Bernoulli process. In this appendix, we generalize this type of approximation for the case of high pulse rate, constant current level stimulation by using a Markov chain model to account for the effects of past spikes and stimulus history. Results from Markov chain theory allow us to characterize the range of firing rates and firing irregularity (Fano factor) that the model can produce for this class of stimuli.

As in the Bernoulli process analogy, we simplify the problem by associating to each pulse a probability that the neuron spikes in response. This probability is given by the lifetime distribution function in Eq. 1, where the integral is evaluated over a duration of one interpulse interval. This probability value depends on the history of past pulses and past spikes. If we neglect precise spike timing, and presume that all spike times coincide with the onset of a pulse, then we can approximate the probabilities at hand via a sequence of values {pn(I)}, where the subscript n represents the number of pulses that have elapsed since the last occurrence of a spike. In the absence of history effects, for instance for low pulse rate stimulation, pn(I) is identical for all pulses since the interpulse interval is long relative to summation and refractory effects.

To account for history effects, we define a discrete time Markov chain. The states of this chain, which we denote sn, represent the number of pulses that have elapsed since the previous spike. There are two possible transitions away from each state. If a spike occurs, then the chain returns to s1. The transition probability of this event is pn(I). If no spike occurs, then the state of the chain advances by one to sn + 1. The probability of this event is 1 − pn(I). In practice, there is some number of pulses beyond which the refractory and summation effects no longer alter the probability of a spike, so we can limit the number of states in the chain to some finite number N. A schematic illustration of the resulting Markov chain is shown in Fig. A1A. An example of the sequence of transition probabilities {pn(I)} is shown in Fig. A1B for a 5,000-pps stimulus that produces a firing rate of 100 spikes/s. There are 25 total states in this chain (the x-axis), this indicates that history effects in the model do not persist beyond ∼5 ms. We therefore used a 5-state Markov chain for 1,000-pps stimuli and a 2-state Markov chain for 250-pps stimulation.

Fig. A1.

Markov chain approximation to the point process model. A: schematic of the Markov chain, with pn representing the probability that the neuron spikes n pulses after the previous spike. B: example of pn as a function of sn for 5,000-pps stimulation set to evoke 100 spikes/s (blue). Corresponding piecewise linear approximation (red), see text for details. C: firing rates of the point process model predicted by the Markov chain model. D: Fano factor of the point process model predicted by the Markov chain model. E: dependence of the Fano factor of the piecewise linear approximation to the Markov chain model on the number of initial and recovery stages. Contours show Fano factor values with firing rate set to be 100 spikes/s and pulse rate of 5,000 pps. Gray area indicates regions where Fano factor is higher at 5,000 pps than the value obtained at 250 pps, and X marks the position of the point process model with parameter values given in Table 1.

Calculation of the Firing Rate and Fano Factor

Results from the theory of discrete-time, discrete-space Markov chains allow us to analytically compute the firing rate and Fano factor of the Markov chain approximation, for arbitrary pulse rates and current levels. To do this, we use the fact that the Markov chain has a stationary distribution π, which gives the long time probability that the Markov chain will be in each state. The stationary distribution can be obtained by computing the eigenvector associated with the (unique) eigenvalue that takes the value one (Kemeny and Snell 1960). In other words, the stationary distribution π solves πM=π, where M is the transition matrix for the Markov chain. The first element of π, which we denote by π1, represents the long-term proportion of time steps in which the chain is in state s1. Since the chain only returns to state s1 if there has been a spike in response to the previous pulse, π1 can be used to compute the firing rate. In particular, if we denote the pulse rate of the stimulus by ρ, then the firing rate (in spikes/s) is given by: r=ρπ1.(A1) Figure A1C compares this Markov chain approximation for firing rate (black lines) to firing rates obtained from simulations of the point process model (colored error bars). The simulated firing rates are the same as shown in Fig. 7A. The fact that the Markov chain approximation accurately predicts firing rates for the point process model at low pulse rates is expected because the spiking can be described by a Bernoulli process in this case. Importantly, however, the Markov chain also accurately approximates firing rates for the high pulse rate stimuli. This framework, therefore, adequately captures the influence of spike history and interpulse effects in the point process model and can be used to estimate the firing rate of the point process model via Eq. A1.

Next, we show how the Markov chain approximation to the point process model can be used to compute the Fano factor of the point process model for pulse train stimuli with constant current strength. If we let N(T) be the number of spikes produced over a period of time of length T, then the Fano factor is defined as: F=Var[N(T)]E[N(T)]. The denominator is given via the stationary distribution, as discussed above. In particular, E[N(T)] = π1np, where np is the total number of pulses in the stimulus. To estimate the variance of N(T), we first define the characteristic matrix (Kemeny and Snell 1960) Z=[IdM+M]1 where Id is the identity matrix, M is the transition matrix for the Markov chain, and M is a matrix whose rows are the vector π. We then appeal to the law of large numbers for Markov chains, which states that the limiting variance of N(T)/np is π1(2Z11 − π1 − 1) (Kemeny and Snell 1960), where Z11 is the first entry of the characteristic matrix.

To compute the Fano factor, we then take the ratio of the variance of N(T)/np to the expected value of N(T)/np and find that the Fano factor is: F=2Z11π11.(A2) Figure A1D shows values of the Fano factor for 250- and 5,000-pps stimulation computed using the equation for a range of firing rates (black lines). Fano factor values computed for the point process model are shown with colored error bars and are the same as those in Fig. 7B. Overall, there is close agreement between the Markov chain approximation given by Eq. A2 and the Fano factor values computed from simulations of the point process model. In sum, the Markov chain approximation accurately captures both the mean firing rate and the normalized variance of the spike count for a range of stimulus levels and pulse rates.

Piecewise-Linear Approximation to the Markov Chain

As discussed in the text, physiological data suggest that Fano factors measured from responses to 5,000-pps pulse trains may be higher than those measured from responses to 250-pps pulse trains (Miller et al. 2008). We were interested in identifying whether the point process model could exhibit the same behavior, so we used the Markov chain to systematically explore the space of possible Fano factors that the model could produce. To do this, we first observed that for the 5,000-pps pulse train, the pn(I) curves can be roughly caricatured as having three stages: an initial stage where the probability of spiking is near zero, a recovery stage where pn(I) increases in a relatively linear manner with n, and a final stage where the pn(I) is nearly constant with n. This piecewise linear approximation is shown in red in Fig. A1B. We therefore characterized the functions pn(I) by two parameters, one that describes the number of initial stages and one that describes the number of recovery stages. We then swept across this space of two parameters to explore the range of possible Fano factor values for a fixed firing rate.

An example of this procedure is shown in Fig. A1E, where firing rate is set to be 100 spikes/s Fig. A1E: and the pulse rate is 5,000 pps. We show Fano factor values computed using the Markov chain model with the piecewise linear pn(I), and indicate in gray the regions of parameter space in which Fano factors are higher when stimulating the neuron with 5,000-pps than with 250-pps pulse trains. The black X indicates the location of the model with parameter values given in Table 1. To generate Fano factors that are larger for responses to 5,000-pps stimuli than 250-pps stimuli, either the number of initial stages or the number of recovery stages must be decreased. One mechanism in the model that controls the number of initial and recovery stages is the time scale of the threshold refractory effect, τθ in Eq. 14. Shortening the refractory period reduces the number of initial and recovery stages, and thus our analysis of the Markov chain model provides a theoretical basis for the observation in Fig. 10 that the point process model with a smaller τθ will have higher Fano factor for 5,000-pps stimulation than 250-pps stimulation. The number of initial and recovery stages is also influenced by the time scale of stimulus integration, τθ in the stimulus filters K+ and K in Eqs. 5b and 5c. The Markov chain approximation is therefore a useful tool for connecting between spike train statistics of the point process model to the biophysical properties (e.g., refractory effects, membrane time constant) that are represented by parameters in the model.


View Abstract