- Research
- Open Access
- Published:

# Fokker–Planck and Fortet Equation-Based Parameter Estimation for a Leaky Integrate-and-Fire Model with Sinusoidal and Stochastic Forcing

*The Journal of Mathematical Neuroscience*
**volume 4**, Article number: 4 (2014)

## Abstract

Analysis of sinusoidal noisy leaky integrate-and-fire models and comparison with experimental data are important to understand the neural code and neural synchronization and rhythms. In this paper, we propose two methods to estimate input parameters using interspike interval data only. One is based on numerical solutions of the Fokker–Planck equation, and the other is based on an integral equation, which is fulfilled by the interspike interval probability density. This generalizes previous methods tailored to stationary data to the case of time-dependent input. The main contribution is a binning method to circumvent the problems of nonstationarity, and an easy-to-implement initializer for the numerical procedures. The methods are compared on simulated data.

### List of Abbreviations

LIF: Leaky integrate-and-fire

ISI: Interspike interval

SDE: Stochastic differential equation

PDE: Partial differential equation

## 1 Introduction

Information processing in the nervous system is carried out by spike timings in neurons. To study the neural code in such a complicated system, a first step is to understand signal processing and transmission in single neurons. Stochastic leaky integrate-and-fire (LIF) neuronal models are a good compromise between biophysical realism and mathematical tractability, and are commonly applied as theoretical tools to study properties of real neuronal systems. A central issue is then to perform statistical inference from experimental data and estimate model parameters. Many electrophysiological experiments on neurons, namely extra-cellular recordings, are only capable of detecting the time of the spike and not the detailed voltage trajectory leading up to the spike. Estimating the parameters of the LIF model from this type of data is equivalent to estimating the parameters of a stochastic model from the statistics of the first-passage times only. A common assumption is that the data are well described by a renewal process, thus basing the statistical inference on the interspike intervals (ISIs), assuming these are realizations of independent and identically distributed random variables. Since only partial information about the process is available, the statistical problem becomes more difficult, and no explicit expression for the likelihood is available.

Different methods have been proposed. In the seminal paper [1], a point process approach is proposed. The spike trains of a collection of neurons are represented as counting processes. Time is discretized and the point processes approximated by 0–1 time series. Then the probability of firing in the next time interval is modeled as a function of the spike history. In this way, maximum likelihood estimation is feasible. External stimuli are not considered. In [2], a numerically involved moment method is developed. It uses the first two moments of the first-passage times of the Ornstein–Uhlenbeck process to a constant threshold, which are given as series expressions, and equates them to their empirical counterparts. In [3, 4], certain explicit moment relations derived from the Laplace transform of the first-passage time distribution are applied, but these are only valid under stimulation (supra-threshold regime). In [5], inference is based on numerical inversion of the Laplace transform. In [6], a functional of a three-dimensional Bessel bridge is applied to obtain a maximum likelihood estimator. None of these methods are feasible to extend to the time-inhomogeneous case, which is of our interest. In [7, 8], an integral equation is used to derive an estimator in the time-homogeneous setting. This approach is readily extended to time varying input, which we will explore in this paper. Some of the above methods are compared in [9]. Finally, a review of estimation methods is provided in [10].

Many sensory stimuli, like sound, contain an oscillatory component [11, 12]. Such inputs will cause oscillating membrane potentials in the neuron, generating rhythmic spiking patterns. The oscillation frequency determines the basic rhythm of spiking, and is considered to be significant for neuronal information processing. The dynamics of periodically forced neuron models have been extensively studied; see [13–18] and references therein. Even so, attempts to solve the estimation problem in these nonstationary settings have been rare. One problem is that the ISIs are no longer independent nor identically distributed. In [19], a more complicated model with linear filters is considered, allowing also for the spike history to influence the membrane potential dynamics. The estimation problem is solved through numerical solutions to the Fokker–Planck equation, and it is shown that the log-likelihood is concave, thus ensuring a global maximum; see also [20, 21]. Because their model is more involved, some approximations to the solution of the Fokker–Planck equation are applied, to ensure acceptable computing times. We will apply the full Fokker–Planck equation to solve our estimation problem, since the computing time is always lower than 2 seconds for a sample size of 1000 spikes.

In this paper, we thus describe and discuss two methods to estimate parameters of LIF models with the added complexity of a time-varying input current. We assume that the time-varying current is a sinusoidal wave, but we believe that the approaches generalize to an arbitrary periodic forcing with known frequency. One approach relies on the Fortet integral equation, which is readily extended to the time-inhomogeneous case. An advantage of this approach is that if the transition density of the diffusion in the LIF model is known, as is the case for the Ornstein–Uhlenbeck and the Feller model, the computational burden is limited. A second approach involves numerical solution of the Fokker–Planck equation, where the time-dependence is explicitly accounted for. After a numerical differentiation, the likelihood function can be calculated providing the maximum likelihood estimator. Nevertheless, we chose an alternative loss function which seems marginally more robust, directly comparing the survival function provided by the solution of the Fokker–Planck equation with its empirical counterpart. The two approaches give similar results and they are more carefully compared in the supplementary online material. Both methods need sensible starting values for the optimization algorithms, and we provide an easy-to-implement initializer. The estimation procedures are compared on simulated data and we find that both algorithms are able to find estimates close to the true values for several different dynamical regimes. We find that for small sample sizes the Fokker–Planck algorithm can be considered marginally preferable, whereas for larger sample sizes the Fortet algorithm becomes marginally superior. Moreover, at high frequencies of the sinusoidal forcing, the Fortet is better at identifying the parameters, though in general there is less information in the data to distinguish between a constant input and the amplitude of the periodic forcing.

## 2 Model

The time evolution of the voltage of a spiking neuron is modelled by a stochastic process, *V*, given as solution to the following stochastic differential equation (SDE):

Here, *μ* is a bias current acting on the cell, *τ* is the decay time, *A* and *ω* are the amplitude and (angular) frequency of the sinusoidal current acting on the cell, *σ* is the strength of the stochastic fluctuations, $W={\{{W}_{t}\}}_{t\ge 0}$ is a standard Wiener process, and ${t}_{n}^{+}$ denotes the right limit taken at ${t}_{n}$. A spike occurs when the membrane voltage $V(t)$ crosses a voltage threshold, ${v}_{\mathrm{th}}$, and then $V(t)$ is instantaneously reset to the resting potential ${v}_{0}$. The difference between subsequent spike times, ${J}_{n}={t}_{n}-{t}_{n-1}$, is called the interspike interval (ISI).

We will assume that *τ* is known (but see Sect. 6 for a discussion of the alternative) and nondimensionalize Eq. (1) as follows:

to obtain

where we have defined ${I}_{n}={J}_{n}/\tau $. We can also write the dynamics between two spike times ${s}_{n}$ and ${s}_{n+1}$ in terms of elapsed time since the last spike, ${s}^{\prime}=s-{s}_{n}$, ${s}^{\prime}<{I}_{n+1}$,

This form of the dynamics highlights that this is not a renewal process since different trajectories between spikes have different phase shifts ${\varphi}_{n}={s}_{n}$ modulo $2\pi /\Omega $. This will be important in the following discussion. The shape of the ISI distribution depends on the model parameters, and it is natural to divide the parameter space in different regimes characterized by their qualitative behavior. Four distinct parameter regimes will be considered; supra-threshold, critical, subthreshold, and supersinusoidal. To understand the reasoning behind the regime names, observe that in the absence of noise, $\beta =0$, the deterministic model will produce spikes if and only if

see the discussion in [14], which can be directly inferred from the solution in Eq. (12) below. In both the supra-threshold and supersinusoidal regimes, $\alpha +\gamma /\sqrt{1+{\Omega}^{2}}>1$. The difference between the two is that in the supra-threshold regime the constant bias current alone is sufficient for spikes to occur, also in absence of noise, that is, $\alpha >1$. In the supersinusoidal regime, the sinusoidal current is necessary for spikes to occur in absence of noise, that is, $\alpha +\gamma /\sqrt{1+{\Omega}^{2}}>1$ and $\alpha \le 1$. In the critical regime, the sum of the two terms is just barely enough to guarantee deterministic spiking, that is $\alpha +\gamma /\sqrt{1+{\Omega}^{2}}\approx 1$. Finally, in the subthreshold regime, there would be no spikes without the noise, $\alpha +\gamma /\sqrt{1+{\Omega}^{2}}<1$.

Table 1 tabulates examples of corresponding parameter values for each regime, while Fig. 1 shows examples of individual voltage trajectories and their associated spike trains. Figures 2 and 3 illustrate how each regime behaves for selected *ϕ*’s by plotting the survivor distribution, ${\overline{G}}_{\varphi}(t)$, and the probability density, ${g}_{\varphi}(t)$, both defined in Eq. (4) below.

With regards to Figs. 2 and 3, it is worth noting explicitly that combinations of noise and sinusoidal forcing can cause firing patterns in which spikes are phase locked, but skip a certain number of cycles. This leads to multimodal ISI densities. There are many different dynamical mechanisms that can yield such patterns, and the particular correlations between the ISIs will depend on the underlying voltage dynamics (which, in our case, we assume to be given by Eq. (1)); in particular, it may be difficult to distinguish whether the dynamics are subthreshold or supra-threshold, since both can show similar ISI densities; see [22].

### 2.1 Basic ISI Probability Density Functions

Here, we introduce the notation for the probability density, distribution and survival functions of ${I}_{n}$, an ISI arising from a trajectory produced by Eq. (3),

The subscript *ϕ* is to stress that *g*, *G*, and $\overline{G}$ depend on the value of ${\varphi}_{n}$ in Eq. (3). This is the formal statement that in a sinusoidally-driven neuron, the interspike intervals are not identically distributed, and are only independent conditioned on the sinusoidal phase at an interval’s onset. Knowing these distributions would provide the likelihood function, offering estimation by the preferred method of choice, the maximum likelihood estimator. Unfortunately, explicit expressions for the ISI distribution are not available except for the special case of $\gamma =0$ and $\alpha =1$; see [3]. Different representations of the likelihood function are available though, see [23], one of which we will use below.

### 2.2 Fokker–Planck Equation with Absorbing Boundaries

The Fokker–Planck equation is a partial differential equation (PDE) describing the evolution of the probability density, $f(x,t)$, of ${X}_{t}$. For the sinusoidally-forced Ornstein–Uhlenbeck process, Eq. (3), with the threshold ${x}_{\mathrm{th}}=1$, the PDE is

Due to the reset, we have that at time $t=0$, ${X}_{t}=0$ and so for the initial conditions we can write

where $\delta (\cdot )$ is the Dirac delta function. The spike is represented as a zero boundary condition for *f* at $x=1$

The natural way of using the Fokker–Planck equation in first-hitting-times problems is as follows. Denote the integral of ${f}^{(\varphi )}$ by ${F}^{(\varphi )}(x,t)={\int}_{\xi \le x}{f}^{(\varphi )}(\xi ,t)\mathrm{d}\xi $. ${F}^{(\varphi )}(x,t)$ can be related to the ISI’s survivor distribution function, ${\overline{G}}_{\varphi}(t)$, by

Equation (7) forms the basis of one of the methods below for estimating the structural parameters from the observed data.

Since Eq. (5) has to be solved numerically, we will need to truncate its domain from below. The most natural way to do this, given the dynamics, is to impose reflecting boundary conditions at some $x={x}_{-}\ll (\alpha -\gamma /\sqrt{1+{\Omega}^{2}})$ where the probability mass is very small. For the left (lower) limit of the computational domain, we use the formula

This choice requires some explanation. In the $t\to \mathrm{\infty}$ limit, the distribution of ${X}_{t}$ in Eq. (3) *without* thresholding is Gaussian with mean given by Eq. (12) (below) and variance equal to ${\beta}^{2}/2$. Thus, to truncate the computational domain for the thresholded process from below, we take the lowest value of the asymptotic mean, $\alpha -\gamma /\sqrt{1+{\Omega}^{2}}$, then from this we subtract two standard deviations, $2\beta /\sqrt{2}$ and set the result to be the lower bound, ${x}_{-}$. Finally, if this value for ${x}_{-}$ happens to be larger than −0.25, we enforce that ${x}_{-}\le -0.25$.

Numerical considerations lead us to solve for *F*, instead of *f*, since delta functions are difficult to represent in floating point, while the initial conditions for *F*, the Heaviside step function, $H(x)$, faces no such difficulties [24]. The Heaviside step function is defined to be equal to 0 for $x<0$ and to be equal to 1 for $x\ge 0$. At this point, we need to derive the PDE for the distribution *F*, starting from the PDE for the density, *f*, Eq. (5).

First, at the lower boundary, it is intuitive that the distribution should be zero, $F({x}_{-},t)=0$, while $f(1,t)=0$ implies that at the upper boundary ${\partial}_{x}F(1,t)=0$. Inside the domain, the PDE itself reformulates as

so that

Integrating with respect to *x* then gives

where $C(t)$ is a constant of integration depending on *t*. Now consider the lower boundary condition, $x={x}_{-}$. Here, $F({x}_{-},t)=0$ implies that ${\partial}_{t}F=0$ and so

The right-hand side in Eq. (8) is precisely the reflecting boundary condition on *f* once we recall that ${\partial}_{x}F=f$. Therefore, $C(t)\equiv 0$.

Thus, the fully specified PDE for *F*, which we will be solving frequently in what follows, is

Numerical solutions for Eq. (9) are shown in Fig. 4. We have used the standard Crank–Nicholson finite-difference algorithm (central-differences in space with equally weighted implicit-explicit terms in time, see [25]).

### 2.3 Fortet Equation

Consider a general form of Eq. (3),

Let $\Phi (y,t|{y}_{0},{t}_{0}):=\mathbb{P}[{Y}_{t}\le y|{Y}_{{t}_{0}}={y}_{0}]$ be the transition cumulative distribution of *Y*. Note that this is the distribution of ${Y}_{t}$ in absence of a threshold, different from the distribution given in Eq. (7), which is the distribution of the process constrained to be below the threshold. Now consider an arbitrary time-dependent threshold ${v}_{\mathrm{th}}(t)$. The Fortet equation (see [26]) convolves the first-hitting time probabilities, $g(t)$, with the transition density of the process. Integrating over $(-\mathrm{\infty},{v}_{\mathrm{th}}(t))$, we obtain

The left-hand side is simply the probability of exceeding ${v}_{\mathrm{th}}$ at time *t* starting at ${v}_{0}$ at time 0. This can also be written as the probability of hitting ${v}_{\mathrm{th}}$ for the first time at time $\tau <t$ and then exceeding ${v}_{\mathrm{th}}$ at time *t* starting at ${v}_{\mathrm{th}}$ at time *τ*, integrated over all *τ*.

The Fortet equation is particularly appealing to use when we have an analytical expression for *Φ*. For the problem at hand, *Φ* is complicated due to the time-dependent forcing. However, the following transformation yields a time-homogeneous *Y* for which *Φ* will be tractable, along with an associated moving threshold, ${v}_{\mathrm{th}}(t)$. This makes feasible the use of the Fortet equation. To obtain this transformation, cf. [27], consider the deterministic version of the SDE in Eq. (3)

with solution

Now take ${X}_{t}$, the solution to Eq. (3) and $v(t)$, Eq. (12), and let ${Y}_{t}={X}_{t}-v(t)$. Then

which has the time and parameter dependent threshold

That is, ${X}_{t}$ hits the constant threshold ${v}_{\mathrm{th}}$ if and only if ${Y}_{t}$ hits the moving threshold ${{v}_{\mathrm{th}}}_{\{\alpha ,\gamma ;\varphi \}}(t)$, where the subindex indicates the dependence on *α*, *γ* and *ϕ*. Therefore, the ISIs produced by *X* and *Y* are the same and so are their distributions. Thus, ${g}_{\varphi}(\tau )$ satisfies

where

is the conditional cumulative distribution function of ${Y}_{t}$ defined in Eq. (13).

## 3 Parameter Estimation Algorithms

The unknown parameters in Eq. (3) are *α*, *β*, and *γ*, while we assume *Ω* known. The reason why the amplitude, *γ*, is often unknown while the frequency, *Ω*, is known is that one can usually observe the sinusoidal input and thus its frequency. Further, the encoding of the input into neuronal firing patterns often involves phase locking to the sinusoidal component. However, the actual forcing amplitude at the level of the neuron is usually modified by various synaptic and other filtering processes, unless the cell receives direct sinusoidal current injection.

Our goal is to estimate the structural parameters (*α*, *β*, *γ*) from a sample of spike time data, $\{{i}_{1},\dots ,{i}_{N}\}$. There are several algorithms for estimating the parameters for the simpler and more common case of $\gamma =0$. One such algorithm relies on the Fortet equation (see [7, 8]), which we extend to the presence of a time-varying current. A more basic approach is to directly solve the Fokker–Planck equation for the probability density of ${X}_{t}$, [19–21], from which one can derive the survival distribution of ${I}_{n}$, and use this to compare against the empirical survival distribution of ${I}_{n}$ obtained from data. An approximate maximum likelihood approach is also possible by numerical differentiation. The relation between Fokker–Planck equations and the first-passage time problem is discussed in most introductory books on stochastic analysis; see, for example, [28]. A recent review of this approach for the simple $\gamma =0$ case in neuronal modeling can be found in [21], wherein the first passage problem is discussed at great lengths in the context of spiking neurons. We will use this in Sect. 2.2. A more elaborate approach using the Fokker–Planck equation to approximate the hitting time distribution is given in [29]. The techniques in [29] avoid the need to compute the Fokker–Planck PDE numerically, instead approximating it with analytically known solutions. This approach might offer significant computational savings, but since this would at most amount to a computational speed-up of our algorithm, we have left this unexplored for now.

The immediate problem in generalizing the aforementioned approaches to the case of $\gamma \ne 0$ is that the ${I}_{n}$’s are no longer identically distributed since the phase ${\varphi}_{n-1}$ of the *n* th interval ${I}_{n}$ depends on ${t}_{n-1}$, the time the previous spike occurred. The ${I}_{n}$’s are also dependent, but conditionally independent given ${\varphi}_{n-1}$. So the trajectories in each interval are parameterized by the value of ${\varphi}_{n-1}$ at the time of the last spike/reset. We overcome this obstacle by splitting the ${I}_{n}$’s in groups, and approximating the ${I}_{n}$’s within groups as coming from identically distributed trajectories in a sense to be specified below. This approximation which solves the challenge of dependent and non-identically distributed ISIs is the primary contribution of this paper.

### 3.1 *ϕ*-Binning

Before we can use Eq. (9) or (15), we need to deal with the fact that *ϕ* is not fixed, but instead each ${I}_{n}$ starts with a distinct ${\varphi}_{n}$. Our approach is to partition the interval $[0,2\pi /\Omega ]$ into *M* bins, where $M\ll N$, and represent each bin by the midpoint of the bin, ${\varphi}_{m}$. Then we approximate the *N* observed ${\varphi}_{n}$’s by the closest ${\varphi}_{m}$ and pretend that any observed ${I}_{n}$ was not produced by a trajectory of the form in Eq. (3) with $\varphi ={\varphi}_{n}$, but with $\varphi ={\varphi}_{m}$. Our hope is that for a judicious choice of *M*, we can balance the error of ${\varphi}_{n}\ne {\varphi}_{m}$ with having enough data points in each bin in order to obtain a useful estimate from Eq. (9) or (15).

There is clearly much freedom in how one sets up these bins, but we will do the simplest thing and make them all of equal width, $\delta \varphi =2\pi /(\Omega M)$. Each ${\varphi}_{n}$ will belong to one and only one of the bins ${[{\varphi}_{m}-\delta \varphi /2,{\varphi}_{m}+\delta \varphi /2)}_{m=1}^{M}$, with center points ${\varphi}_{m}=\delta \varphi /2+(m-1)\delta \varphi $, for $m=1,\dots ,M$. Thus, given an empirically observed ${I}_{n}$ with associated ${\varphi}_{n}$, we will pretend that it was produced by the process

where

This binning is illustrated in Fig. 5.

While we have no rigorous approach to determine the value of *M*, our limited experience suggests that given $N=1000$ ISIs, $M=10$, or $M=20$ gives satisfactory results for very different parameter regimes. In general, choosing *M* is a balancing act. For *M* too high, the resulting bins will have too few data points to approximate $\overline{G}(I)$ accurately. Therefore, *M* is forced to be small when sample size is not large. For *M* too low, the approximation of the phase shifts will be poor, leading to a biased estimate of $\overline{G}(I)$. We illustrate the effect of increasing *M* in Fig. 6. Generally, as long as there are sufficient data points, as *M* increases, the approximation of using the survival distribution with ${\varphi}_{m}$ instead of ${\varphi}_{n}$ improves since ${\varphi}_{m}(n)\to {\varphi}_{n}$ as $M\to \mathrm{\infty}$. In the sequel, we will use $M=20$ for sample sizes of $N=1000$ and $M=8$ for sample sizes of $N=100$.

### 3.2 Fokker–Planck Algorithm

Within each bin it is clear how to apply Eq. (7). In the *m* th bin, for a given ${\varphi}_{m}$, we approximate ${\overline{G}}_{\varphi}(t)$ by

where ${N}_{m}$ is the number of ISIs in bin *m*. Using Eq. (7), we define the loss function

The weight ${N}_{m}$ is included so that bins with larger sample sizes have a larger influence on the estimates.

To evaluate the supremum in Eq. (17), we spline interpolate the empirically discrete $\stackrel{\u02c6}{\overline{G}}$ for each ${\varphi}_{m}$, sample at the time nodes of the PDE discretization and finally take the maximum amongst the sampled values. We then minimize *L* using an optimization algorithm (see below, Sect. 4) and take our estimates $\stackrel{\u02c6}{\alpha}$, $\stackrel{\u02c6}{\beta}$, $\stackrel{\u02c6}{\gamma}$ to be

Note that the relation between the spike time survival density, ${\overline{G}}_{\varphi}$ and the transition distribution, ${F}_{\varphi}$, in Eq. (7) could also allow for an approximate maximum likelihood estimator (MLE), based on maximizing

where the derivative has to be approximated by finite differences. We can then again use binning to avoid having to compute the PDE separately for each $({i}_{n},{\varphi}_{n-1})$. Our experience with the MLE approach has been that the quality of the estimates provided are similar to those obtained by minimizing Eq. (17) and that the associated computing times are on the same order. Due to this similarity and in order to keep the paper concise, we include details of the MLE estimates only in the supplementary online material.

### 3.3 Fortet Algorithm

An alternative approach is to form a loss function from Eq. (15). This is similar to what is done in [7, 8] for the simpler case of a constant threshold. Noting that ${\int}_{0}^{t}g(\tau )[1-\Phi ]\mathrm{d}\tau =\mathbb{E}[(1-\Phi ){\mathbb{1}}_{I\le t}]$ where the expectation is taken with respect to the distribution of the random variable *I*, we can use the fact that the ISIs are approximately independent and invoke the law of large numbers to estimate the integral as

We then define the loss function

We divide each inner term by $\omega ({\varphi}_{m};\alpha ,\beta ,\gamma )={sup}_{s>0}|1-{\Phi}_{\alpha ,\beta ,\gamma}^{({\varphi}_{m})}({v}_{\mathrm{th}}(s),s|{v}_{0})|$, following the suggestion in [8]. This scaling ensures that Eq. (15) divided by $\omega (\alpha ,\beta ,\gamma )$ will vary between 0 and 1 for all parameter values thus giving sense to the measure defined by the loss function. Since we can solve in closed form for *Φ*, we have all we need given an observed spike train of ${i}_{n}$’s. We evaluate the sup by sampling at $K=500$ uniformly spaced points in $(0,{I}_{max}+\u03f5]$ and taking the maximum of the sampled values.

### 3.4 Initialization of the Algorithms

The parameter search can be initialized in a simple way using the fact that the Fokker–Planck PDE is almost an “advection-diffusion” equation whose solution is almost a Gaussian. Then $\overline{G}(t)$ can be approximated by the amount of probability mass of a Gaussian to the left of the threshold at time *t*. The idea is as follows. Suppose we are solving the following PDE:

Its solution given an initial condition $\rho (x,0)=\delta (x)$ will be a Gaussian bell moving to the right with speed *U* and standard deviation $\sigma =\beta \sqrt{t}$.

The survivor function $\overline{G}(t)$ can be thought of as the amount of area that has passed the threshold (from the left moving to the right). We can then invert the information about $\overline{G}$ to estimate *U* and *β*. In particular, a Gaussian bell has ≈0.158 of its mass more than one standard deviation to the right of its mean. Thus, at time ${t}_{1}$ such that $\overline{G}({t}_{1})=0.842$, the right tail of more than one standard deviation of the Gaussian bell has crossed the threshold. The threshold is at $x=1$ and we obtain the following equation:

Similarly, at time ${t}_{2}$ such that $\overline{G}({t}_{2})=1-0.842$, the Gaussian bell has crossed the threshold except for the left tail and we have

If *U* and *β* were constant, then Eqs. (20) and (21) provide two equations in two unknowns. However, $U=U(x,t)=(\alpha -x+\gamma sin(\Omega (t+\varphi )))$ is not constant and we approximate *U* as

i.e., we approximate the space-dependent term, *x*, with the mid-point between the reset value, ${v}_{0}=0$, and the threshold, ${v}_{\mathrm{th}}=1$, and we approximate the time-dependent term, $sin(\Omega (\tau +\varphi ))$, by its time-average value between 0 and *t*. If we use the 0th, 1st, and 2nd standard deviation points, we can form 5 equations in 3 unknowns as follows:

with the time-average weighting function $s(t)=(cos(\Omega \varphi )-cos(\Omega (t+\varphi )))/\Omega $. However, the approximation is best for earlier times, when the solution is closer to a Gaussian bell that is approaching the threshold, but less correct for later times, since it neglects the loss of probability mass and thus overestimates the backward probability current. Indeed, we have found it to be best to use only ${t}_{1}$ and ${t}_{2}$. In the following we use only these equations:

for the initializer. We can form these equations separately for each ${\varphi}_{m}$ bin, thus resulting in $M\times 2$ equations for the unknowns *α*, *β*, and *γ*. Since we have more equations than unknowns, we use least-squares estimates in a regression to pick out unique *α*, *β*, and *γ* estimates.

The proposed initialization procedure has two advantages. First, it is automatic, i.e., it requires only the data and no input or guidance from the user. Second, it is extremely fast. While the precise effect of the initializer is shown in Sect. 4, it is intuitively clear that it will work best in the supra-threshold parameter regime when the bell curve is truly moving past the threshold as a whole and less so for subthreshold regimes, when only the diffusive force serves to propel the process to reach ${v}_{\mathrm{th}}$. The behavior of the initializer in the different regimes is illustrated in Fig. 7. What we show in Fig. 7 is the following: First, we show the survival distribution for a given regime and ${\varphi}_{m}$ fixed. Then using data generated from such a regime and with ${\varphi}_{n}$ in the *m* th bin, the initializer tries to find the best approximation by the motion of a Gaussian bell which will fit these data, in the sense of solving for *α*, *β*, *γ* as previously described. Once this is done, we then show in red the amount of area under this Gaussian bell to the left of the threshold. Of course, the interpretation of the survival distribution for an ISI as a fraction of the area under a moving bell with conserved total area is wrong, but the assumption is useful in automatically generating initial values for the more appropriate approximations to start their work.

## 4 Method Comparison on Simulated Data

We will now use our algorithms on spike trains simulated from the four different regimes: the supra-threshold, the critical, the supersinusoidal and the subthreshold. We have used 100 sample spike trains per regime, with $N=100$ as well as $N=1000$ spikes per train. In order to perform the numerical minimization of Eqs. (17) and (18), we have used an implementation of the Nelder–Mead algorithm from the SciPy library [30]. The Nelder–Mead algorithm is a non-linear minimization routine which uses a bounding-polygon method to zero-in on the minimum and thus avoids the need to provide the gradient of the loss function. It is the standard non-gradient minimization algorithm.

The estimation results are shown in Figs. 8, 9, 10, 11, where we plot box plots for the estimates, $\stackrel{\u02c6}{\alpha}$, $\stackrel{\u02c6}{\beta}$, $\stackrel{\u02c6}{\gamma}$ in the four regimes. We also tabulate the average and the empirical 95 % confidence intervals of the estimates in Tables 2 and 3. Conclusions that can be drawn from these results are as follows. The initializer method is effective for the supra-threshold regime and gives reasonable ballpark estimates for all regimes, though the error can be substantial for the supersinusoidal regime. In general, both the Fortet and Fokker–Planck algorithm estimate the parameters well in the supra-threshold, critical and supersinusoidal regimes. The estimators’ variance is especially low in the supra-threshold regime, while it is higher for the critical and supersinusoidal regimes. In the supersinusoidal regime, the two algorithms give accurate estimates even though the initializer can be quite off. On the other hand, in the subthreshold regime, the initializer has a performance comparable to that of the two more involved methods. It seems that distinguishing between the constant bias and the sinusoidal current is difficult if their sum is not sufficient to generate spikes without noise.

The Fokker–Planck method has a larger bias but a smaller spread than the Fortet method for $N=100$, Table 2. However for $N=1000$, the two methods have comparable spreads, while the Fortet method retains a smaller bias, see Table 3. More precisely, for $N=1000$, the Fokker–Planck method has a smaller spread in the subthreshold regime, while the Fortet method has a smaller spread in the supersinusoidal regime. As such, at least in the supersinusoidal regime, the Fortet method seems superior. A detailed comparison between the Fortet and the Fokker–Planck estimators for each parameter in each regime can be seen in Fig. 12 for $N=100$ and in Fig. 13 for $N=1000$.

The two algorithms are numerically intensive. For $N=100$ and $N=1000$ spikes, we show the times taken for the estimation in Table 4. While we have done most of our numerical work in Python/SciPy [30], we have implemented the critical components of both algorithms in C. That is, we solve the inner part of Eq. (18) and the Fokker–Planck PDE, Eq. (9), in C using the GSL libraries [31]. From Table 4, we can verify that the computing time for the Fortet algorithm scales proportionally with the number of spikes. This is to be expected, since the Fortet equation has a term of the form ${\sum}_{{i}_{n}}$ which in turn has *N* terms and this forms the bulk of the computing time for the Fortet equation. The Fokker–Planck algorithm, on the other hand, scales less-than-linearly with *N*, since the dependency on *N* is in forming the approximation, $\stackrel{\u02c6}{\overline{G}}$ to the survivor function and that is not computationally intensive (solving the PDE is).

## 5 The Effect of *Ω*

So far, we have held *Ω* constant and equal to 1. We now investigate the effect of varying *Ω* on the quality of estimates. To narrow the scope, we focus on increasing *Ω* while keeping the parameters in the critical regime such that $\alpha +\gamma /\sqrt{1+{\Omega}^{2}}=1$ and $\alpha =0.5$. This amounts to increasing *γ* with *Ω*. We do the estimations for four values of $\Omega =[1,5,10,20]$. Similarly to the previous section, we use 100 sample spike trains per parameter set, with each spike train consisting of $N=1000$ ISIs.

We show box plots of the estimates for each *Ω* in Fig. 14. We then directly compare the two algorithms, Fortet vs. Fokker–Planck, in Fig. 15. The immediate observation is that the Fokker–Planck algorithm fails to keep up at the higher frequencies and consistently underestimates *γ*. The Fortet algorithm does better, but still underestimates *γ*. In general, this underestimation of *γ* is accompanied by an overestimation of *α*. This is exacerbated at higher *Ω*. We illustrate the relation between estimates for *α* vs. *γ* in Fig. 16, where it is quite clear that an underestimation of *γ* is proportional to the overestimation of *α*.

For completeness, we also include the estimates’ average and empirical 95 % confidence intervals in Table 5.

## 6 Discussion and Outlook

We have shown two methods to estimate parameters in Eq. (2) from ISI data. Our methods are based on binning the spikes into bins with representative phase shifts. We have devised a constructive procedure to automatically initialize the methods from the data.

Our computational results suggest that for low frequencies the Fortet algorithm is superior for large sample sizes, especially in the supersinusoidal regime, while the Fokker–Planck algorithm has a comparable accuracy and a lower variance for small sample sizes. Both algorithms find sensible estimates most of the time, although they seem less effective in the subthreshold regime. Their performance can be partially attributed to the ability of the initializer algorithm to supply good guesses for starting the optimization iterations.

The Fokker–Planck equation allows for approximate maximum likelihood estimation. We chose an alternative loss function, though, because it marginally appeared more robust, possibly because a numerical derivation step is avoided. This is further investigated by simulations in the supplementary online material. The simulations suggest that the distribution of the maximum likelihood estimates in the supersinusoidal regime appears bimodal, which is not the case for the alternative loss function, Eq. (17).

We have also made a preliminary exploration of the effect of *Ω* on the quality of the estimates. Our results show that an increase in *Ω* makes the parameters *α* and *γ* more difficult to estimate accurately and at high *Ω*, *γ* is underestimated, while *α* is over-estimated. We find that in this scenario, the Fortet algorithm does a markedly more accurate job then the Fokker–Planck algorithm.

We have assumed the time-constant *τ* of the leak term to be known. In most experiments that is not realistic, and it would be preferable to estimate *τ* alongside the other parameters. However, it is difficult to estimate [32]. When we tried to estimate it together with the other parameters, we usually obtained results which were not accurate. The obtained estimates resulted in ISIs that very well matched the data, no worse than the ISIs obtained from the true parameters. This leads us to believe that the simultaneous estimation of *τ* along with *α*, *β*, *γ* using only ISI data suffers from identifiability problems. In [5], they were able to estimate *τ* in the simpler nonsinusoidally-driven model, but concluded that adding *τ* as an unknown dramatically reduced the accuracy in the estimation of the other unknown parameters. The reason is that if *τ* is also estimated from a single dataset alongside the other parameters, then a reasonable fit can be found to the data for various combinations of *α*, *β*, *γ*, and *τ*, but the so-obtained parameter values can be far from the true values.

Our model is relatively simple and ignores neurophysiological realism, such as the fact that the spiking threshold is likely nonconstant, with a time-dependent functional form that would involve further unknown parameters. A recent paper attempting the parameter estimation in such a model, but without sinusoidal forcing, is [20]. Furthermore, intracellular recordings suggest that a hard threshold is a rough approximation and an exponential voltage-dependent spiking intensity is more realistic [33].

While our work has used a very specific form of the periodic forcing term, namely $\gamma sin(\Omega t)$, it is clear how to apply the approach to an arbitrary periodic function. This can be done as long as one knows where in the period of oscillation a spike has occurred. If that is the case, then the binning procedure can be applied and the estimation methods proposed can be attempted.

## References

- 1.
Brillinger DR: Maximum likelihood analysis of spike trains of interacting nerve cells.

*Biol Cybern*1988, 59: 189–200. 10.1007/BF00318010 - 2.
Inoue J, Sato S, Ricciardi L: On the parameter estimation for diffusion models of single neuron’s activities.

*Biol Cybern*1995, 73: 209–221. 10.1007/BF00201423 - 3.
Ditlevsen S, Lánský P: Estimation of the input parameters in the Ornstein–Uhlenbeck neuronal model.

*Phys Rev E*2005., 71: Article ID 011907 Article ID 011907 - 4.
Ditlevsen S, Lánský P: Estimation of the input parameters in the Feller neuronal model.

*Phys Rev E*2006., 73: Article ID 061910 Article ID 061910 - 5.
Mullowney P, Iyengar S: Parameter estimation for a leaky integrate-and-fire neuronal model from ISI data.

*J Comput Neurosci*2008, 24(2):179–194. 10.1007/s10827-007-0047-5 - 6.
Zhang X, You G, Chen T, Feng J: Maximum likelihood decoding of neuronal inputs from an interspike interval distribution.

*Neural Comput*2009, 21(11):3079–3105. 10.1162/neco.2009.06-08-807 - 7.
Ditlevsen S, Ditlevsen O: Parameter estimation from observations of first-passage times of the Ornstein–Uhlenbeck process and the Feller process.

*Probab Eng Mech*2008, 23: 170–179. 10.1016/j.probengmech.2007.12.024 - 8.
Ditlevsen S, Lánský P: Parameters of stochastic diffusion processes estimated from observations of first-hitting times: application to the leaky integrate-and-fire neuronal model.

*Phys Rev E*2007., 76: Article ID 041906 Article ID 041906 - 9.
Ditlevsen S, Lánský P: Comparison of statistical methods for estimation of the input parameters in the Ornstein–Uhlenbeck neuronal model from first-passage times data. American Institute of Physics Proceedings Series CP1028. In

*Collective Dynamics: Topics on Competition and Cooperation in the Biosciences*. Edited by: Ricciardi LM, Buonocore A, Pirozzi E. Am Inst of Phys, Melville; 2008:171–185. - 10.
Lánský P, Ditlevsen S: A review of the methods for signal estimation in stochastic diffusion leaky integrate-and-fire neuronal models.

*Biol Cybern*2008, 99(4–5):253–262. 10.1007/s00422-008-0237-x - 11.
Braun HA, Wissing H, Schafer K, Hirsch MC: Oscillation and noise determine signal-transduction in shark multimodal sensory cells.

*Nature*1994, 367(6460):270–273. 10.1038/367270a0 - 12.
Chacron MJ, Longtin A, St-Hilaire M, Maler L: Suprathreshold stochastic firing dynamics with memory in P-type electroreceptors.

*Phys Rev Lett*2000, 85(7):1576–1579. 10.1103/PhysRevLett.85.1576 - 13.
Bulsara AR, Elston TC, Doering CR, Lowen SB, Lindenberg K: Cooperative behavior in periodically driven noisy integrate-fire models of neuronal dynamics.

*Phys Rev E*1996, 53(4):3958–3969. 10.1103/PhysRevE.53.3958 - 14.
Burkitt AN: A review of the integrate-and-fire neuron model: II. Inhomogeneous synaptic input and network properties.

*Biol Cybern*2006, 95(2):97–112. 10.1007/s00422-006-0082-8 - 15.
Lánský P: Sources of periodical force in noisy integrate-and-fire models of neuronal dynamics.

*Phys Rev E*1997, 55(2):2040–2043. 10.1103/PhysRevE.55.2040 - 16.
Longtin A, Bulsara A, Pierson D, Moss F: Bistability and the dynamics of periodically forced sensory neurons.

*Biol Cybern*1994, 70(6):569–578. 10.1007/BF00198810 - 17.
Sacerdote L, Giraudo MT: Stochastic integrate and fire models: a review on mathematical methods and their applications. Lecture Notes in Mathematics 2058. In

*Stochastic Biomathematical Models with Applications to Neuronal Modeling*. Edited by: Bachar M, Batzel J, Ditlevsen S. Springer, Berlin; 2013. - 18.
Shimokawa T, Pakdaman K, Takahata T, Tanabe S, Sato S: A first-passage-time analysis of the periodically forced noisy leaky integrate-and-fire model.

*Biol Cybern*2000, 83(4):327–340. 10.1007/s004220000156 - 19.
Paninski L, Pillow J, Simoncelli E: Maximum likelihood estimation of a stochastic integrate-and-fire neural encoding model.

*Neural Comput*2004, 16(12):2533–2561. 10.1162/0899766042321797 - 20.
Dong Y, Mihalas S, Russell A, Etienne-Cummings R, Niebur E: Estimating parameters of generalized integrate-and-fire neurons from the maximum likelihood of spike trains.

*Neural Comput*2011, 23(11):2833–2867. 10.1162/NECO_a_00196 - 21.
Sirovich L, Knight B: Spiking neurons and the first passage problem.

*Neural Comput*2011, 23(7):1675–1703. 10.1162/NECO_a_00139 - 22.
Longtin A: Mechanisms of stochastic phase locking.

*Chaos, Interdiscip J Nonlinear Sci*1995, 1(5):209–215. - 23.
Alili L, Patie P, Pedersen J: Representations of the first hitting time density of an Ornstein–Uhlenbeck process.

*Stoch Models*2005, 21: 967–980. 10.1080/15326340500294702 - 24.
Hurn A, Jeisman J, Lindsay K: ML estimation of the parameters of SDE’s by numerical solution of the Fokker–Planck equation.

*MODSIM 2005: International Congress on Modelling and Simulation: Advances and Applications for Management and Decision Making*2005, 849–855. - 25.
Karniadakis GE, Kirby RM:

*Parallel Scientific Computing in C++ and MPI*. Cambridge University Press, Cambridge; 2003. - 26.
Fortet R: Les fonctions aléatoires du type de Markoff associées à certaines équations linéaires aux dérivées partielles du type parabolique.

*J Math Pures Appl (9)*1943, 22: 177–243. - 27.
Shimokawa T, Pakdaman K, Sato S: Time-scale matching in the response of a leaky integrate-and-fire neuron model to periodic stimulus with additive noise.

*Phys Rev E*1999, 59(3):3427–3443. 10.1103/PhysRevE.59.3427 - 28.
Jacobs K:

*Stochastic Processes for Physicists: Understanding Noisy Systems*. Cambridge University Press, Cambridge; 2010. - 29.
Lo C, Chung TK: First passage time problem for the Ornstein–Uhlenbeck neuronal model.

*13th International Conference on Neural Information Processing*2006, 324–333. - 30.
Jones E, Oliphant T, Peterson P, et al: SciPy: open source scientific tools for Python; 2001.

- 31.
Galasi M:

*GNU Scientific Library Reference Manual*. 3rd edition. Network Theory Ltd, Godalming; 2009. - 32.
Ditlevsen S, Lánský P: Only through perturbation can relaxation times be estimated.

*Phys Rev E*2012., 86(5): Article ID 050102 Article ID 050102 - 33.
Jahn P, Berg RW, Hounsgaard J, Ditlevsen S: Motoneuron membrane potentials follow a time inhomogeneous jump diffusion process.

*J Comput Neurosci*2011, 31: 563–579. 10.1007/s10827-011-0326-z

## Acknowledgements

SD is supported by the Danish Council for Independent Research|Natural Sciences. AL acknowledges support from NSERC Canada. AI is supported by a PGS-D scholarship from NSERC Canada. The work is part of the Dynamical Systems Interdisciplinary Network, University of Copenhagen.

## Author information

## Additional information

### Competing Interests

The authors declare that they have no competing interests.

### Authors’ Contributions

All authors contributed equally to the writing of this paper. All authors read and approved the final manuscript.

## Authors’ original submitted files for images

Below are the links to the authors’ original submitted files for images.

## Rights and permissions

**Open Access** This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

## About this article

#### Received

#### Accepted

#### Published

#### DOI

### Keywords

- First-passage times
- Stochastic neuron models
- Parameter estimation from stopping times
- Fortet integral equation
- Fokker–Planck equation