Stochastic Hybrid Systems in Cellular Neuroscience

We review recent work on the theory and applications of stochastic hybrid systems in cellular neuroscience. A stochastic hybrid system or piecewise deterministic Markov process involves the coupling between a piecewise deterministic differential equation and a time-homogeneous Markov chain on some discrete space. The latter typically represents some random switching process. We begin by summarizing the basic theory of stochastic hybrid systems, including various approximation schemes in the fast switching (weak noise) limit. In subsequent sections, we consider various applications of stochastic hybrid systems, including stochastic ion channels and membrane voltage fluctuations, stochastic gap junctions and diffusion in randomly switching environments, and intracellular transport in axons and dendrites. Finally, we describe recent work on phase reduction methods for stochastic hybrid limit cycle oscillators.


Introduction
There are a growing number of problems in cell biology that involve the coupling between a piecewise deterministic differential equation and a time-homogeneous Markov chain on some discrete space Γ, resulting in a stochastic hybrid system, also known as a piecewise deterministic Markov process (PDMP) [37]. Typically, the phase space of the dynamical system is taken to be R^d for finite d. One important example at the single-cell level is the occurrence of membrane voltage fluctuations in neurons due to the stochastic opening and closing of ion channels [2,25,30,32,54,64,80,109,114,117]. Here the discrete states of the ion channels evolve according to a continuous-time Markov chain. A complementary review that mainly considers gene regulatory networks can be found elsewhere [14]. In Sect. 2, we summarize the basic theory of stochastic hybrid systems. In subsequent sections, we consider various applications of stochastic hybrid systems, including stochastic ion channels and membrane voltage fluctuations (Sect. 3), stochastic gap junctions and diffusion in randomly switching environments (Sect. 4), and intracellular transport in axons and dendrites (Sect. 5). Finally, in Sect. 6, we present recent work on phase reduction methods for stochastic hybrid limit cycle oscillators.

Stochastic Hybrid Systems
In this section, we review the basic theory of stochastic hybrid systems. We start with the notion of a piecewise deterministic differential equation, which can be used to generate sample paths of the stochastic process. We then describe how the probability distribution of sample paths can be determined by solving a differential Chapman-Kolmogorov (CK) equation (Sect. 2.1). In many applications, including the stochastic ion channel models of Sect. 3, there is a separation of time-scales between a fast O(1/ε) switching process and a slow O(1) continuous dynamics. In the fast switching limit ε → 0, we obtain a deterministic dynamical system. In Sect. 2.2, we use an asymptotic expansion in ε to show how the CK equation can be approximated by a Fokker-Planck (FP) equation with an O(ε) diffusion term. Finally, in Sect. 2.3, we consider methods for analyzing escape problems in stochastic hybrid systems. We assume that the deterministic system is bistable so that, in the absence of noise, the long-time stable state of the system depends on the initial conditions. On the other hand, for finite switching rates, the resulting fluctuations can induce transitions between the metastable states. In the case of weak noise (fast switching, 0 < ε ≪ 1), transitions are rare events involving large fluctuations that are in the tails of the underlying probability density function. This means that estimates of mean first passage times (MFPTs) and other statistical quantities can develop exponentially large errors under the diffusion approximation. We describe a more accurate method for calculating MFPTs based on a WKB analysis.
We begin with the definition of a stochastic hybrid system and, in particular, a piecewise deterministic Markov process (PDMP) [37,53,84]. For illustration, consider a system whose states are described by a pair (x, n) ∈ Σ × {0, . . . , N_0 − 1}, where x is a continuous variable in a connected bounded domain Σ ⊂ R^d with regular boundary ∂Σ, and n is a discrete stochastic variable taking values in the finite set Γ ≡ {0, . . . , N_0 − 1}. (It is possible to have a set of discrete variables, although we can always relabel the internal states so that they are effectively indexed by a single integer. We can also consider generalizations of the continuous process, in which the ODE (2.1) is replaced by a stochastic differential equation (SDE) or even a partial differential equation (PDE). To allow for such possibilities, we will refer to all of these processes as examples of a stochastic hybrid system.) When the internal state is n, the system evolves according to the ordinary differential equation (ODE)

ẋ(t) = F_n(x(t)), (2.1)

where each vector field F_n : R^d → R^d is continuous and locally Lipschitz. That is, given a compact subset K of Σ, there exists a positive constant K_n such that

|F_n(x) − F_n(y)| ≤ K_n |x − y|, ∀x, y ∈ K. (2.2)

We assume that the dynamics of x is confined to the domain Σ so that existence and uniqueness of a trajectory holds for each n. For fixed x, the discrete stochastic variable evolves according to a homogeneous continuous-time Markov chain with transition matrix W(x) and corresponding generator A(x), which are related according to

A_nm(x) = W_nm(x) − δ_{n,m} Σ_{k∈Γ} W_km(x). (2.3)

The matrix A(x) is also taken to be Lipschitz. We make the further assumption that the chain is irreducible for all x ∈ Σ, that is, for fixed x, there is a nonzero probability of transitioning, possibly in more than one step, from any state to any other state of the Markov chain.
This implies the existence of a unique invariant probability distribution on Γ for fixed x ∈ Σ, denoted by ρ(x), such that

Σ_{m∈Γ} A_nm(x)ρ_m(x) = 0, ∀n ∈ Γ. (2.4)

Let us decompose the transition matrix of the Markov chain as W_nm(x) = λ_m(x)P_nm(x), with Σ_{n≠m} P_nm(x) = 1 for all x. Hence λ_m(x) determines the jump times from the state m, whereas P_nm(x) determines the probability distribution that when it jumps, the new state is n for n ≠ m. The hybrid evolution of the system with respect to x(t) and n(t) can then be described as follows; see Fig. 1. Suppose the system starts at time zero in the state (x_0, n_0). Call x_0(t) the solution of (2.1) with n = n_0 such that x_0(0) = x_0. Let t_1 be the random variable (stopping time) such that

P(t_1 > t) = exp(−∫_0^t λ_{n_0}(x_0(s)) ds).

Then in the random time interval s ∈ [0, t_1) the state of the system is (x_0(s), n_0). Now draw a value of t_1 from P(t_1 < t), choose an internal state n_1 ∈ Γ with probability P_{n_1 n_0}(x_0(t_1)), and call x_1(t) the solution of the following Cauchy problem on [t_1, ∞):

ẋ_1(t) = F_{n_1}(x_1(t)), t ≥ t_1, x_1(t_1) = x_0(t_1).

Iterating this procedure, we can construct a sequence of increasing jumping times (t_k)_{k≥0} (setting t_0 = 0) and a corresponding sequence of internal states (n_k)_{k≥0}. The evolution (x(t), n(t)) is then defined as

(x(t), n(t)) = (x_k(t), n_k) if t_k ≤ t < t_{k+1}. (2.5)

Note that the path x(t) is continuous and piecewise C^1. To have a well-defined dynamics on [0, T], it is necessary that almost surely the system makes a finite number of jumps in the time interval [0, T]. This is guaranteed in our case. This formulation is the basis of a simulation algorithm for PDMPs [2,150].
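The iterative construction above can be turned into a simple simulation scheme. The following sketch is our own illustration (not code from the cited literature): it simulates a two-state PDMP with hypothetical vector fields F_0(x) = −x, F_1(x) = 1 − x and constant jump rates, combining forward-Euler integration of the ODE with exponentially distributed jump times.

```python
import random

# Illustrative two-state PDMP: the vector fields and constant jump rates
# below are hypothetical choices, not taken from the text.
F = [lambda x: -x, lambda x: 1.0 - x]   # F_n(x) for n = 0, 1
lam = [1.0, 2.0]                        # jump rates lambda_n

def simulate_pdmp(x0, n0, T, dt=1e-3, seed=1):
    """Generate a sample path (t, x(t), n(t)) on [0, T]."""
    rng = random.Random(seed)
    t, x, n = 0.0, x0, n0
    path = [(t, x, n)]
    t_jump = t + rng.expovariate(lam[n])    # first jump time
    while t < T:
        t_next = min(t + dt, t_jump, T)
        x += (t_next - t) * F[n](x)         # forward-Euler step of dx/dt = F_n(x)
        t = t_next
        if t == t_jump:                     # switch the discrete state
            n = 1 - n                       # only one target state when N_0 = 2
            t_jump = t + rng.expovariate(lam[n])
        path.append((t, x, n))
    return path

path = simulate_pdmp(x0=0.5, n0=0, T=10.0)
```

Since x_0 ∈ [0, 1] and both vector fields point into this interval, the sample path remains confined there, consistent with the assumption that the dynamics is confined to Σ.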
Let p_n(x, t) denote the probability density of the stochastic process, with p_n(x, t)dx = P[x(t) ∈ (x, x + dx), n(t) = n] for a given initial condition. It follows that p evolves according to the forward differential Chapman-Kolmogorov (CK) equation [10,61]

∂p_n/∂t = −∇ · [F_n(x)p_n(x, t)] + (1/ε) Σ_{m∈Γ} A_nm(x)p_m(x, t). (2.6)

For notational convenience, we have dropped the explicit dependence on initial conditions. The first term on the right-hand side represents the probability flow associated with the piecewise deterministic dynamics for a given n, whereas the second term represents jumps in the discrete state n. Note that we have rescaled the matrix A by introducing the dimensionless parameter ε > 0. This is motivated by the observation that one often finds a separation of time-scales between the relaxation time for the dynamics of the continuous variable x and the rate of switching between the different discrete states n. The fast switching limit then corresponds to the case ε → 0. Let us now define the averaged vector field F̄ : R^d → R^d by

F̄(x) = Σ_{n∈Γ} ρ_n(x)F_n(x). (2.7)

Intuitively speaking, we would expect the stochastic hybrid system (2.1) to reduce to the deterministic dynamical system

ẋ(t) = F̄(x(t)), x(0) = x_0, (2.8)

in the fast switching limit ε → 0. That is, for sufficiently small ε, the Markov chain undergoes many jumps over a small time interval Δt during which Δx ≈ 0, and thus the relative frequency of each discrete state n is approximately ρ_n(x). This can be made precise in terms of a law of large numbers for stochastic hybrid systems [51,84].
It remains to specify boundary conditions for the CK equation. Suppose, for concreteness, that d = 1 with Σ = [0, L], and that the discrete states are ordered so that F_n(x) > 0 for m ≤ n ≤ N_0 − 1 and F_n(x) < 0 for 0 ≤ n ≤ m − 1. If no probability can enter the domain through either end, then

p_n(0, t) = 0 for m ≤ n ≤ N_0 − 1 and p_n(L, t) = 0 for 0 ≤ n ≤ m − 1. (2.9)

Alternatively, one can impose reflecting (no-flux) boundary conditions, in which the total probability flux J(x, t) = Σ_{n∈Γ} F_n(x)p_n(x, t) vanishes at the ends x = 0, L.
In the analysis of metastability (Sect. 2.3), it will be necessary to impose an absorbing boundary condition at some interior point x * of the domain Σ , that is, p n (x * , t) = 0, 0 ≤ n ≤ m − 1.
In contrast to the no-flux conditions, there are nonzero fluxes through x^*. In general, it is difficult to obtain an analytical steady-state solution of (2.6), assuming that it exists, unless d = 1 and N_0 = 2 [46,79]. The one-dimensional CK equation takes the form

∂p_n/∂t = −∂/∂x [F_n(x)p_n(x, t)] + (1/ε) Σ_{m∈Γ} A_nm(x)p_m(x, t). (2.10)

In the two-state case (N_0 = 2), the generator takes the form

A(x) = ( −α(x)   β(x)
          α(x)  −β(x) ) (2.11)

for a pair of transition rates α(x), β(x), so that the steady-state version of (2.10) reduces to the pair of equations

d/dx [F_0(x)p_0(x)] = (1/ε)[β(x)p_1(x) − α(x)p_0(x)],
d/dx [F_1(x)p_1(x)] = (1/ε)[α(x)p_0(x) − β(x)p_1(x)]. (2.12)

Adding the pair of equations yields

d/dx [F_0(x)p_0(x) + F_1(x)p_1(x)] = 0, (2.13)

that is, F_0(x)p_0(x) + F_1(x)p_1(x) = c for some constant c. The reflecting boundary conditions imply that c = 0. Since F_n(x) is nonzero for all x ∈ Σ, we can express p_1(x) in terms of p_0(x):

p_1(x) = −F_0(x)p_0(x)/F_1(x). (2.14)

Substituting into equation (2.12) gives

d/dx [F_0(x)p_0(x)] = −(1/ε)[α(x)/F_0(x) + β(x)/F_1(x)] F_0(x)p_0(x). (2.15)

This yields the solutions

p_n(x) = (1/Z)(1/|F_n(x)|) exp( −(1/ε) ∫_{x^*}^x [α(y)/F_0(y) + β(y)/F_1(y)] dy ), n = 0, 1, (2.16)

where x^* ∈ Σ is arbitrary and assuming that the normalization factor Z exists.
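As a numerical check on the two-state steady-state construction above, the following sketch evaluates the zero-flux solution for an illustrative model (F_0(x) = −x, F_1(x) = 1 − x with constant rates α = β = 1 on a subinterval; these choices are ours, not from the text) and verifies that it satisfies the steady-state equations to discretization accuracy.

```python
import math

# Illustrative two-state model on [0.1, 0.9] (hypothetical choices)
alpha, beta, eps = 1.0, 1.0, 0.5
F0 = lambda x: -x            # F_0 < 0
F1 = lambda x: 1.0 - x       # F_1 > 0

def trapezoid(f, a, b, n=200):
    h = (b - a) / n
    return h * (0.5 * f(a) + 0.5 * f(b) + sum(f(a + k * h) for k in range(1, n)))

def steady_state(xs, x_star=0.5):
    """Unnormalized steady-state densities p_0, p_1 with zero total flux."""
    g = lambda y: alpha / F0(y) + beta / F1(y)
    p0, p1 = [], []
    for x in xs:
        # u = F_0 p_0 satisfies u' = -(1/eps) g(x) u; the minus sign makes p_0 > 0
        u = -math.exp(-trapezoid(g, x_star, x) / eps)
        p0.append(u / F0(x))
        p1.append(-u / F1(x))   # zero-flux condition F_0 p_0 + F_1 p_1 = 0
    return p0, p1

xs = [0.1 + 0.8 * k / 400 for k in range(401)]
p0, p1 = steady_state(xs)
```

A central-difference check confirms that d/dx(F_0 p_0) = (β p_1 − α p_0)/ε holds at interior grid points.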

Quasi-Steady-State (QSS) Diffusion Approximation
For small but nonzero ε, we can use perturbation theory to derive lowest-order corrections to the deterministic mean-field equation, which leads to a Langevin equation with noise amplitude O(√ε). More specifically, perturbations of the mean-field equation (2.8) can be analyzed using a quasi-steady-state (QSS) diffusion or adiabatic approximation, in which the CK equation (2.6) is approximated by a Fokker-Planck (FP) equation for the total density C(x, t) = Σ_n p_n(x, t). The QSS approximation was first developed from a probabilistic perspective by Papanicolaou [119]. It has subsequently been applied to a wide range of problems in biology, including models of intracellular transport in axons [57,123] and dendrites [111-113] and bacterial chemotaxis [73,74,116]. There have also been more recent probabilistic treatments of the adiabatic limit, which have been applied to various stochastic neuron models [118]. Finally, note that it is also possible to obtain a diffusion limit by taking the number of discrete states N_0 to be large [30,117].
The basic steps of the QSS reduction are as follows:

(a) Decompose the probability density as

p_n(x, t) = C(x, t)ρ_n(x) + εw_n(x, t),

where Σ_n p_n(x, t) = C(x, t) is the marginal probability density for the continuous variables x, and Σ_n w_n(x, t) = 0. Substituting into equation (2.6) and summing both sides with respect to n then gives

∂C/∂t = −∇ · [F̄(x)C(x, t)] − ε∇ · Σ_{n∈Γ} F_n(x)w_n(x, t), (2.17)

where F̄(x) is the mean vector field of equation (2.7).

(b) Using the equation for C and the fact that A(x)ρ(x) = 0, we have

Σ_{m∈Γ} A_nm(x)w_m(x, t) = ∇ · [F_n(x)ρ_n(x)C(x, t)] − ρ_n(x)∇ · [F̄(x)C(x, t)] + O(ε). (2.18)

(c) Introduce the asymptotic expansion w_n ∼ w_n^(0) + εw_n^(1) + ε^2 w_n^(2) + · · · and collect O(1) terms. The Fredholm alternative theorem (see the end of Sect. 2.3) shows that the resulting equation has a solution, which is unique on imposing the condition Σ_n w_n^(0)(x, t) = 0:

w_n^(0)(x, t) = Σ_{m∈Γ} A†_nm(x){ ∇ · [F_m(x)ρ_m(x)C(x, t)] − ρ_m(x)∇ · [F̄(x)C(x, t)] }, (2.19)

where A† is the pseudoinverse of the generator A. We typically have to determine the pseudoinverse of A numerically.

(d) Combining equations (2.19) and (2.17) shows that C evolves according to the Itô Fokker-Planck (FP) equation

∂C/∂t = −∇ · [(F̄(x) + εV(x))C] + ε Σ_{i,j} ∂^2/∂x_i∂x_j [D_ij(x)C], (2.20)

where the O(ε) correction to the drift, V(x), involves spatial derivatives of ρ_m(x) and F_m(x), and the diffusion matrix D(x) is given by

D_ij(x) = Σ_{n,m∈Γ} F_{n,i}(x) A†_nm(x) ρ_m(x)[F̄_j(x) − F_{m,j}(x)]. (2.21)

Since Σ_m A†_mn = 0, we can rewrite the diffusion matrix as

D_ij(x) = Σ_{n,m∈Γ} [F_{n,i}(x) − F̄_i(x)] A†_nm(x) ρ_m(x)[F̄_j(x) − F_{m,j}(x)]. (2.22)

In the one-dimensional case, the CK equation (2.10) reduces to the one-dimensional Itô FP equation

∂C/∂t = −∂/∂x [(F̄(x) + εV(x))C] + ε ∂^2/∂x^2 [D(x)C]. (2.23)

Here the pseudoinverse A† can be defined by the conditions

Σ_{k∈Γ} A_nk(x)A†_km(x) = δ_{n,m} − ρ_n(x), Σ_{n∈Γ} A†_nm(x) = 0, (2.24)

and the diffusion coefficient D(x) is given by

D(x) = Σ_{n,m∈Γ} F_n(x) A†_nm(x) ρ_m(x)[F̄(x) − F_m(x)]. (2.25)

For N_0 > 2, we typically have to solve equation (2.24) numerically in order to find the pseudoinverse of A. However, in the special case of a two-state discrete process (n = 0, 1), we have the explicit solution

w_0^(0) = −w_1^(0) = (1/(α(x) + β(x))) { ρ_0(x)∇ · [F̄(x)C] − ∇ · [F_0(x)ρ_0(x)C] }.

This gives the reduced expression

D(x) = α(x)β(x)[F_1(x) − F_0(x)]^2 / [α(x) + β(x)]^3. (2.26)

One subtle point is the nature of boundary conditions under the QSS reduction, since the FP equation is a second-order parabolic PDE, whereas the original CK equation is an N_0th-order hyperbolic PDE. It follows that, for N_0 > 2, there is a mismatch in the number of boundary conditions between the CK and FP equations. This implies that the QSS reduction may break down in a small neighborhood of the boundary, as reflected by the existence of boundary layers [152].
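Steps (b)-(d) can be implemented numerically for small N_0. The sketch below is an illustration under our own conventions for the generator and pseudoinverse: it solves the singular linear system for w^(0) (per unit gradient of C) subject to Σ_n w_n = 0, and evaluates the resulting diffusion coefficient for a two-state process, recovering the known two-state expression D = αβ(F_1 − F_0)^2/(α + β)^3.

```python
def solve(M, b):
    """Gauss-Jordan elimination with partial pivoting for a small dense system."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(M)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(n):
            if r != c:
                f = aug[r][c] / aug[c][c]
                aug[r] = [a - f * v for a, v in zip(aug[r], aug[c])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def qss_diffusion(F, A, rho):
    """Solve sum_m A_nm w_m = rho_n (F_n - Fbar) with sum_n w_n = 0,
    then return D = -sum_n F_n w_n (per unit gradient of C)."""
    N = len(F)
    Fbar = sum(rho[k] * F[k] for k in range(N))
    b = [rho[k] * (F[k] - Fbar) for k in range(N)]
    M = [row[:] for row in A]
    M[-1] = [1.0] * N          # replace one redundant row by the constraint
    b[-1] = 0.0
    w = solve(M, b)
    return -sum(F[k] * w[k] for k in range(N))

# illustrative two-state example (constants of our own choosing)
alpha, beta = 2.0, 3.0
F = [-1.0, 2.0]
A = [[-alpha, beta], [alpha, -beta]]   # generator: columns sum to zero
rho = [beta / (alpha + beta), alpha / (alpha + beta)]
D = qss_diffusion(F, A, rho)
```

Replacing one row of the singular generator by the constraint is legitimate because the rows of A sum to the zero row, so one equation is redundant whenever the right-hand side lies in the range of A.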
One way to eliminate the existence of boundary layers is to ensure that the boundary conditions of the CK equation are compatible with the QSS reduction.

Metastability in Stochastic Hybrid Systems
Several examples of stochastic hybrid systems are known to exhibit multistability in the fast-switching limit ε → 0 [14]. That is, the deterministic equation (2.8) supports more than one stable equilibrium. In the absence of noise, the particular state of the system is determined by the initial conditions. On the other hand, when noise is included by taking into account the stochastic switching, fluctuations can induce transitions between the metastable states. If the noise is weak (fast switching, 0 < ε ≪ 1), then transitions are rare events involving large fluctuations that are in the tails of the underlying probability density function. This means that estimates of mean transition times and other statistical quantities can be sensitive to any approximations, including the Gaussian approximation underlying the QSS reduction of Sect. 2.2, which can lead to exponentially large errors.
The analysis of metastability has a long history [70], particularly within the context of SDEs with weak noise. The underlying idea is that the mean rate of escape from a metastable state in the weak noise limit can be identified with the principal eigenvalue of the generator of the underlying stochastic process, which is a second-order differential operator in the case of a Fokker-Planck equation. Calculating the eigenvalue typically involves obtaining a Wentzel-Kramers-Brillouin (WKB) approximation of a quasistationary solution and then using singular perturbation theory to match the solution to an absorbing boundary condition [69,97,99,103,130]. The latter is defined on the boundary that marks the region beyond which the system rapidly relaxes to another metastable state, becomes extinct, or escapes to infinity. In one-dimensional systems (d = 1), this boundary is simply an unstable fixed point, whereas in higher dimensions (d > 1), it is generically a (d − 1)-dimensional submanifold. In the weak noise limit, the most likely paths of escape through an absorbing boundary are rare events, occurring in the tails of the associated functional probability distribution. From a mathematical perspective, the rigorous analysis of the tails of a distribution is known as large deviation theory [39,53,55,138], which provides a rigorous probabilistic framework for interpreting the WKB solution in terms of optimal fluctuational paths. The analysis of metastability in chemical master equations has been developed along analogous lines to SDEs, combining WKB methods and large deviation principles [43,45,49,53,69,75,85,124] with path-integral or operator methods [40,41,121,128,143]. The study of metastability in stochastic hybrid systems is more recent, and much of the theory has been developed in a series of papers on stochastic ion channels [25,109,114,115], gene networks [108,110], and stochastic neural networks [24].
Again there is a strong connection between WKB methods, large deviation principles [15,51,84], and formal path-integral methods [11,26], although the connection is now more subtle.
For illustration, we will focus on a one-dimensional stochastic hybrid system and develop the theory using WKB methods. First, suppose that the deterministic equation (2.8) is bistable, with the effective potential U(x), defined by F̄(x) = −U'(x), having two minima at x_± (stable equilibria) separated by a single maximum at x_0 (unstable equilibrium), as illustrated in Fig. 2. To calculate the mean escape rate from the metastable state x_−, say, the CK equation (2.6) is supplemented by an absorbing boundary condition at x = x_0. The initial condition is taken to be p_n(x, 0|y, 0) = δ(x − y)ρ_n(y), where y is in a neighborhood of x_−, and ρ_n(y) is the stationary distribution of the switching process. Let T(y) denote the (stochastic) first passage time for which the system first reaches x_0, given that it started at y. The distribution of first passage times f(y, t) is related to the survival probability that the system has not yet reached x_0:

S(y, t) = ∫_{−∞}^{x_0} Σ_{n∈Γ} p_n(x, t|y, 0) dx. (2.29)

That is, Prob{t > T|X(0) = y} = S(y, t), and the first passage time density is f(y, t) = −∂S/∂t. Substituting for ∂p_n/∂t using the CK equation (2.10) shows that

f(y, t) = Σ_{n∈Γ} F_n(x_0)p_n(x_0, t|y, 0),

with Γ = {0, 1} for the two-state model. We have used Σ_{n∈Γ} A_nm(x) = 0 and the asymptotic limit F_n(x)p_n(x, t|y, 0) → 0 as x → −∞. The mean first passage time (MFPT) τ(y) is then given by

τ(y) = E[T(y)] = ∫_0^∞ f(y, t)t dt = ∫_0^∞ S(y, t) dt.

It turns out that for small ε, the MFPT has an Arrhenius-like form analogous to SDEs [69]:

τ ∼ Γ^{−1} exp([Φ(x_0) − Φ(x_−)]/ε),

where Φ(x) is known as the quasipotential or stochastic potential, and Γ is a prefactor. One important observation is that the escape time is exponentially sensitive to the precise form of Φ. If we were first to carry out the QSS reduction of Sect. 2.2 and then use a standard analysis of the one-dimensional FP equation in order to estimate the MFPT [61], then we would find that Γ = 1 and, to O(1),

Φ(x) = Φ_QSS(x) ≡ −∫^x [F̄(y)/D(y)] dy,

with D(x) given by equation (2.25).
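The MFPT can also be estimated directly by Monte Carlo simulation, which provides a useful check on asymptotic formulas. The sketch below uses an illustrative model of our own choosing: a two-state process with F_0(x) = −x, F_1(x) = 1 − x and unit switching rates rescaled by 1/ε, started at the mean-field fixed point x = 1/2, with an absorbing threshold at x = 0.7. Decreasing ε makes escape a rarer event and lengthens the MFPT.

```python
import random

# Monte Carlo MFPT for an illustrative two-state hybrid system (our choice).
# Escape requires a sufficiently long sojourn in the state n = 1, whose flow
# dx/dt = 1 - x carries x toward the absorbing threshold.

def first_passage_time(eps, rng, x_abs=0.7, x0=0.5, dt=1e-3):
    x, n, t = x0, 0, 0.0
    t_jump = eps * rng.expovariate(1.0)     # switching rates are 1/eps
    while x < x_abs:
        t_next = min(t + dt, t_jump)
        x += (t_next - t) * (-x if n == 0 else 1.0 - x)
        t = t_next
        if t == t_jump:
            n = 1 - n
            t_jump = t + eps * rng.expovariate(1.0)
    return t

def mfpt(eps, trials=200, seed=7):
    rng = random.Random(seed)
    return sum(first_passage_time(eps, rng) for _ in range(trials)) / trials

tau = {eps: mfpt(eps) for eps in (0.5, 0.25)}   # faster switching = weaker noise
```

The ordering tau[0.25] > tau[0.5] illustrates the exponential sensitivity of escape times to the noise strength.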
The escape time then depends on the barrier height ΔE shown in Fig. 2. As we have already commented, the Gaussian approximation may not accurately capture the statistics of rare events that dominate noise-induced escape. This is reflected by the observation that Φ QSS (x) can differ significantly from the true quasipotential. A much better estimate can be obtained using WKB.
To apply the WKB method, we can exploit the fact that, in the weak noise limit (ε ≪ 1), the flux through the absorbing boundary is exponentially small. This has major implications for the spectral decomposition of the solution to the CK equation with an absorbing boundary at x = x_0. More specifically, consider the eigenfunction expansion

p_n(x, t|y, 0) = Σ_{r≥0} c_r(y) e^{−λ_ε^(r) t} φ_ε^(r)(x, n),

where (λ_ε^(r), φ_ε^(r)) is an eigenpair of the matrix-valued linear operator appearing on the right-hand side of (2.6), that is,

−d/dx [F_n(x)φ_ε^(r)(x, n)] + (1/ε) Σ_{m∈Γ} A_nm(x)φ_ε^(r)(x, m) = −λ_ε^(r) φ_ε^(r)(x, n),

together with the absorbing boundary conditions φ_ε^(r)(x_0, n) = 0 for all n. We also assume that the eigenvalues λ_ε^(r) all have positive definite real parts and the smallest eigenvalue λ_ε^(0) is real and simple, so that we can introduce the ordering 0 < λ_ε^(0) < Re[λ_ε^(1)] ≤ Re[λ_ε^(2)] ≤ · · · . The exponentially slow rate of escape through x_0 in the weak-noise limit means that λ_ε^(0) is exponentially small, whereas Re[λ_ε^(r)] is only weakly dependent on ε for r ≥ 1. Under these assumptions, we have the quasistationary approximation for large t:

p_n(x, t|y, 0) ≈ c_0(y) e^{−λ_ε^(0) t} φ_ε^(0)(x, n).

Substituting such an approximation into equation (2.29) and suppressing the initial conditions give S(t) ∝ e^{−λ_ε^(0) t} and thus τ ≈ 1/λ_ε^(0). Since λ_ε^(0) is exponentially small, we can take the quasistationary solution φ_ε^(0) to satisfy the time-independent CK equation. We then seek a WKB approximation of the quasistationary solution by making the ansatz

φ_ε^(0)(x, n) = Z_n(x) exp(−Φ(x)/ε), (2.36)

where Φ(x) is the WKB quasipotential. Substituting into the time-independent version of equation (2.10) yields

Σ_{m∈Γ} A_nm(x)Z_m(x) + Φ'(x)F_n(x)Z_n(x) = ε d/dx [F_n(x)Z_n(x)], (2.37)

where Φ' = dΦ/dx. Introducing the asymptotic expansions Φ ∼ Φ_0 + εΦ_1 and Z ∼ Z^(0) + εZ^(1), the leading order equation is

Σ_{m∈Γ} [A_nm(x) + Φ_0'(x)F_n(x)δ_{n,m}] Z_m^(0)(x) = 0. (2.38)

Positivity of the quasistationary density φ_ε^(0) requires positivity of the corresponding solution Z^(0). One positive solution is the trivial solution Z^(0)(x) = ρ(x) for all x ∈ Σ, where ρ is the unique right null vector of A, for which Φ_0' = 0.
Establishing the existence of a nontrivial positive solution requires more work and is related to the fact that the connection of the WKB solution to optimal fluctuational paths and large deviation principles is less direct in the case of stochastic hybrid systems.
It turns out that we have to consider the eigenvalue problem [11,15,25]

Σ_{m∈Γ} [A_nm(x) + qF_n(x)δ_{n,m}] R_m(x, q) = Λ(x, q)R_n(x, q), (2.39)

whose principal (Perron) eigenvalue Λ_0(x, q) has a positive right eigenvector R^(0)_n(x, q). The optimal fluctuational paths are obtained by identifying the Perron eigenvalue Λ_0(x, q) as a Hamiltonian and finding zero energy solutions to Hamilton's equations

ẋ = ∂Λ_0/∂q, q̇ = −∂Λ_0/∂x.

This can be established using large deviation theory or path-integrals. In the latter case, we can show that a path-integral representation of the density p(x, τ) is

p(x, τ) = ∫ exp( −(1/ε) ∫_0^τ [q ẋ − Λ_0(x, q)] dt ) D[q, x]

for some appropriate measure D[q, x]. Applying steepest descents to the path integral then yields a variational principle in which optimal paths minimize the action

A[x, q] = ∫_0^τ [q ẋ − Λ_0(x, q)] dt.

Comparison with the leading-order WKB equation shows that Z_n^(0)(x) = R_n^(0)(x, q) with q = Φ_0'(x), and Φ_0 satisfies the corresponding Hamilton-Jacobi equation Λ_0(x, Φ_0'(x)) = 0. Deterministic mean field equations and optimal paths of escape from a metastable state both correspond to zero energy solutions. Along zero-energy paths, the action reduces to A = ∫ q dx, so that the quasipotential is obtained as the line integral of q along the optimal path.
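For the two-state case, the Perron eigenvalue can be written in closed form, and the two zero-energy solutions of Λ_0(x, q) = 0 can be checked directly. The rate and vector-field functions below are illustrative placeholders of our own choosing.

```python
import math

# Closed-form Perron eigenvalue Lambda_0(x, q) of A(x) + q diag(F_n(x)) for
# a two-state model, treated as a Hamiltonian.
alpha = lambda x: 1.0        # rate 0 -> 1 (illustrative constant)
beta = lambda x: 1.0         # rate 1 -> 0
F0 = lambda x: -x
F1 = lambda x: 1.0 - x

def perron(x, q):
    a = q * F0(x) - alpha(x)
    b = q * F1(x) - beta(x)
    disc = (a - b) ** 2 + 4.0 * alpha(x) * beta(x)   # >= 0, so Lambda_0 is real
    return 0.5 * ((a + b) + math.sqrt(disc))

def q_star(x):
    """Nontrivial zero-energy solution, identified with Phi_0'(x)."""
    return alpha(x) / F0(x) + beta(x) / F1(x)
```

The deterministic solution q = 0 and the nontrivial root q = q_star(x) both give Λ_0 = 0, while Λ_0 is negative between the two roots and positive outside them.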

Calculation of Principal Eigenvalue
To calculate the principal eigenvalue, it is necessary to determine the first-order correction Φ_1 to the quasipotential of the WKB solution (2.36). Proceeding to the next order in the asymptotic expansion of equation (2.37), we have

Σ_{m∈Γ} [A_nm(x) + Φ_0'(x)F_n(x)δ_{n,m}] Z_m^(1)(x) = d/dx [F_n(x)Z_n^(0)(x)] − Φ_1'(x)F_n(x)Z_n^(0)(x). (2.42)

For fixed x and WKB potential Φ_0, the matrix operator on the left-hand side of this equation has a one-dimensional null space spanned by the positive WKB solution Z^(0)(x). The Fredholm alternative theorem (see Sect. 2.2) then implies that the right-hand side of (2.42) is orthogonal to the left null vector S of the matrix Ā(x) ≡ A(x) + Φ_0'(x) diag(F_0(x), . . . , F_{N_0−1}(x)). That is, we have the solvability condition

Σ_{n∈Γ} S_n(x){ d/dx [F_n(x)Z_n^(0)(x)] − Φ_1'(x)F_n(x)Z_n^(0)(x) } = 0. (2.43)

Given Z^(0), S, and Φ_0, the solvability condition yields the following equation for Φ_1:

Φ_1'(x) = [Σ_{n∈Γ} S_n(x) d/dx (F_n(x)Z_n^(0)(x))] / [Σ_{n∈Γ} S_n(x)F_n(x)Z_n^(0)(x)]. (2.44)

Combining the various results and defining k(x) = exp(−Φ_1(x)) give, to leading order in ε,

φ_ε^(0)(x, n) ≈ N^{−1} k(x)Z_n^(0)(x) exp(−Φ_0(x)/ε),

where we choose Σ_n Z_n^(0)(x) = 1 for all x, and N is the normalization factor,

N = ∫_Σ k(x) exp(−Φ_0(x)/ε) dx.

The latter can be approximated using Laplace's method about x_− to give

N ≈ k(x_−) sqrt(2πε/Φ_0''(x_−)) exp(−Φ_0(x_−)/ε).

The final step is to use singular perturbation theory to match the outer quasistationary solution to the absorbing boundary condition at x_0. The analysis is quite involved [80,108], so here we simply quote the result for the 1D model:

λ_ε^(0) ∼ Γ_0 exp(−[Φ_0(x_0) − Φ_0(x_−)]/ε), (2.48)

where the O(1) prefactor Γ_0 is expressed in terms of k(x_±), the curvatures Φ_0''(x_−) and |Φ_0''(x_0)|, and D(x_0), with D(x) the effective diffusion coefficient obtained using a QSS reduction.

Two-State Model
We now illustrate the above theory for the simple two-state model of equation (2.10). The specific version of the linear equation (2.39) can be written as the two-dimensional system

[qF_0(x) − α(x)]R_0 + β(x)R_1 = ΛR_0,
α(x)R_0 + [qF_1(x) − β(x)]R_1 = ΛR_1. (2.49)

The corresponding characteristic equation is

[qF_0(x) − α(x) − Λ][qF_1(x) − β(x) − Λ] − α(x)β(x) = 0. (2.50)

It follows that the Perron eigenvalue is given by

Λ_0(x, q) = (1/2){ qF_0(x) − α(x) + qF_1(x) − β(x) + sqrt(Σ(x, q)) }, (2.51)

where Σ(x, q) = [q(F_0(x) − F_1(x)) − α(x) + β(x)]^2 + 4α(x)β(x). A little algebra shows that Σ(x, q) > 0 so that, as expected, Λ_0 is real. The quasipotential Φ_0(x) satisfies the HJ equation Λ_0(x, Φ_0'(x)) = 0. This has two solutions: the classical deterministic solution q = 0 with Φ_0(x) = 0 and a nontrivial solution whose quasipotential satisfies

Φ_0'(x) = α(x)/F_0(x) + β(x)/F_1(x). (2.52)

(Note that F_n(x) does not vanish anywhere and F_0(x)F_1(x) < 0.) The quasipotential can be determined by numerically integrating with respect to x. The resulting quasipotential differs significantly from the one obtained by carrying out a QSS diffusion approximation of the stochastic hybrid system along the lines outlined in Sect. 2.2. For this simple model, it is also straightforward to determine the various prefactors in equation (2.48). For example, the normalized positive eigenvector Z^(0) has the components

Z_0^(0)(x) = F_1(x)/[F_1(x) − F_0(x)], Z_1^(0)(x) = −F_0(x)/[F_1(x) − F_0(x)]. (2.53)

Since F_0(x) < 0 and F_1(x) > 0 for x ∈ Σ, it follows that Z^(0) is positive. The components of the adjoint eigenvector S satisfy

[Φ_0'(x)F_0(x) − α(x)]S_0 + α(x)S_1 = 0,

so that we can take S_0 = 1 and S_1 = −β(x)F_0(x)/[α(x)F_1(x)]. It then follows from equation (2.44) that the first correction to the quasipotential satisfies

Φ_1'(x) = [S_0 d/dx(F_0 Z_0^(0)) + S_1 d/dx(F_1 Z_1^(0))] / [S_0 F_0(x)Z_0^(0)(x) + S_1 F_1(x)Z_1^(0)(x)]. (2.54)

Finally, D(x_0) is given by equation (2.26).
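To make the difference from the QSS approximation concrete, the sketch below integrates both quasipotential gradients for an illustrative choice of our own, F_0(x) = −x, F_1(x) = 1 − x, α = β = 1: then Φ_0'(x) = 1/(1 − x) − 1/x, while the QSS reduction gives F̄(x) = 1/2 − x, D = 1/8 and Φ_QSS'(x) = −F̄(x)/D. Both potentials vanish at the fixed point x = 1/2 but differ away from it.

```python
# WKB versus QSS quasipotential for an illustrative two-state model.

def dPhi0(x):
    return 1.0 / (1.0 - x) - 1.0 / x     # alpha/F_0 + beta/F_1

def dPhiQSS(x):
    return -(0.5 - x) / 0.125            # -Fbar(x)/D with D = 1/8

def integrate(f, a, b, n=1000):
    """Trapezoid rule on [a, b]."""
    h = (b - a) / n
    return h * (0.5 * f(a) + 0.5 * f(b) + sum(f(a + k * h) for k in range(1, n)))

Phi0 = integrate(dPhi0, 0.5, 0.8)       # WKB barrier contribution on [0.5, 0.8]
PhiQSS = integrate(dPhiQSS, 0.5, 0.8)   # QSS estimate of the same quantity
```

Since the escape time depends exponentially on the quasipotential barrier divided by ε, even this modest discrepancy translates into exponentially large errors in the MFPT.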

Fredholm Alternative Theorem
Consider an M-dimensional linear inhomogeneous equation Az = b, z ∈ R^M. Suppose that the M × M matrix A has a nontrivial null-space, and let u be a null vector of the adjoint matrix A†, that is, A†u = 0. The Fredholm alternative theorem for finite-dimensional vector spaces states that the inhomogeneous equation has a (nonunique) solution for z if and only if u · b = 0 for all null vectors u. Let us apply this theorem to equation (2.18) for fixed x, t. The one-dimensional null-space is spanned by the vector with components u_n = 1, since Σ_n u_n A_nm = Σ_n A†_mn u_n = 0. Hence equation (2.18) has a solution, provided that

Σ_{n∈Γ} { ∇ · [F_n(x)ρ_n(x)C(x, t)] − ρ_n(x)∇ · [F̄(x)C(x, t)] } = 0.

This immediately follows since Σ_n ρ_n(x) = 1 and Σ_n ρ_n(x)F_n(x) = F̄(x) for all x.
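A minimal finite-dimensional illustration (the matrix and right-hand sides are our own choices):

```python
# The 2 x 2 generator below is singular (its columns sum to zero), so
# A z = b is solvable exactly when b is orthogonal to the left null
# vector u = (1, 1).
A = [[-2.0, 1.0], [2.0, -1.0]]

def solvable(b, tol=1e-12):
    """Fredholm condition: u . b = 0 with u = (1, 1)."""
    return abs(b[0] + b[1]) < tol

# explicit solution z for a compatible right-hand side b = (1, -1)
z = [-0.5, 0.0]
Az = [A[0][0] * z[0] + A[0][1] * z[1],
      A[1][0] * z[0] + A[1][1] * z[1]]
```

For b = (1, 1) no solution exists, since the components of Az always sum to zero.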

Perron-Frobenius Theorem
If T is an irreducible positive finite matrix, then 1. there is a simple eigenvalue λ 0 of T that is real and positive, with positive left and right eigenvectors; 2. the remaining eigenvalues λ satisfy |λ| < λ 0 .
If T_nm = W_nm / Σ_k W_km, where W is an irreducible transition matrix, then λ_0 = 1, the left positive eigenvector is ψ = (1, . . . , 1), and the right positive eigenvector is the stationary distribution ρ. In the case of the matrix operator L(x) with components L_nm(x) := A_nm(x) + qF_n(x)δ_{n,m}, which appears in the eigenvalue equation (2.39), it is clear that not all components of the matrix are positive for a given x ∈ Σ. However, taking ζ > sup_{x∈Σ} ‖L(x)‖_∞, the matrix L(x) + ζI satisfies the conditions of the Perron-Frobenius theorem for all x ∈ Σ.
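The shift trick can be checked explicitly in the two-state case, where the eigenvalues are available in closed form; the rates, vector fields, and the point (x, q) below are illustrative choices of our own.

```python
import math

# Two-state check: for L(x) = A(x) + q diag(F_n(x)) with unit rates,
# adding zeta*I with zeta > ||L(x)||_inf makes every entry positive,
# so Perron-Frobenius applies to the shifted matrix.

def eigs2(M):
    """Eigenvalues of a 2 x 2 matrix (real here, since the off-diagonal
    entries have the same sign)."""
    tr = M[0][0] + M[1][1]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    s = math.sqrt(tr * tr - 4.0 * det)
    return 0.5 * (tr + s), 0.5 * (tr - s)

x, q = 0.3, 0.8
L = [[-1.0 + q * (-x), 1.0],
     [1.0, -1.0 + q * (1.0 - x)]]
zeta = 1.0 + max(abs(L[0][0]) + abs(L[0][1]), abs(L[1][0]) + abs(L[1][1]))
Ls = [[L[0][0] + zeta, L[0][1]], [L[1][0], L[1][1] + zeta]]
lam0, lam1 = eigs2(Ls)   # lam0 is the Perron eigenvalue of the shifted matrix
```

Shifting by ζI moves every eigenvalue by ζ without changing the eigenvectors, so the Perron eigenvector of L(x) + ζI is also the principal eigenvector of L(x).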

Stochastic Ion Channels and Membrane Voltage Fluctuations
The generation and propagation of a neuronal action potential arises from nonlinearities associated with active membrane conductances. Ions can diffuse in and out of the cell through ion-specific channels embedded in the cell membrane; see Fig. 3. Ion pumps within the cell membrane maintain concentration gradients such that there is a higher concentration of Na+ and Ca2+ outside the cell and a higher concentration of K+ inside the cell. The membrane current through a specific channel varies approximately linearly with changes in the voltage v relative to some equilibrium or reversal potential, which is the potential at which there is a balance between the opposing effects of diffusion and electrical forces. (We will focus on a space-clamped model of a neuron whose cell body is taken to be isopotential.) Summing over all channel types, the total membrane current (flow of positive ions) leaving the cell through the cell membrane is

I = Σ_s g_s(v − V_s),

where g_s is the conductance due to channels of type s, and V_s is the corresponding reversal potential.
Recordings of the current flowing through single channels indicate that channels fluctuate rapidly between open and closed states in a stochastic fashion. Nevertheless, most models of a neuron use deterministic descriptions of conductance changes, under the assumption that there are a large number of approximately independent channels of each type. It then follows from the law of large numbers that the fraction of channels open at any given time is approximately equal to the probability that any one channel is in an open state. The conductance g_s for ion channels of type s is thus taken to be the product g_s = ḡ_s P_s, where ḡ_s is equal to the density of channels in the membrane multiplied by the conductance of a single channel, and P_s is the fraction of open channels. The voltage dependence of the probabilities P_s in the case of a delayed-rectifier K+ current and a fast Na+ current was originally obtained by Hodgkin and Huxley [76] as part of their Nobel prize winning work on the generation of action potentials in the squid giant axon. The delayed-rectifier K+ current is responsible for terminating an action potential by repolarizing a neuron. Opening of the K+ channel requires structural changes in four identical and independent subunits, so that P_K = n^4, where n is the probability that any one gating subunit has opened. In the case of the fast Na+ current, which is responsible for the rapid depolarization of a cell leading to action potential generation, the probability of an open channel takes the form P_Na = m^3h, where m^3 is the probability that an activating gate is open and h is the probability that an inactivating gate is open. Depolarization causes m to increase and h to decrease, whereas hyperpolarization has the opposite effect.
The dynamics of the gating variables m, n, h are usually formulated in terms of a simple kinetic scheme that describes voltage-dependent transitions of each gating subunit between open and closed states. More specifically, for each Y ∈ {m, n, h},

dY/dt = α_Y(v)(1 − Y) − β_Y(v)Y,

where α_Y(v) and β_Y(v) are the opening and closing rates of the corresponding gate. Hodgkin and Huxley originally fitted exponential-like functions for these rates to the experimental data obtained from the squid axon. The corresponding conductance-based model (in the absence of synaptic inputs) can then be written in the form

C_m dv/dt = −ḡ_K n^4(v − V_K) − ḡ_Na m^3h(v − V_Na) − g_L(v − V_L) + I,

where I is an applied current. Here I_L = g_L(v − V_L) is called a leak current, which represents the passive flow of ions through nongated channels.
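The kinetic scheme for each gating variable relaxes exponentially to Y_∞(v) = α_Y(v)/(α_Y(v) + β_Y(v)) with time constant 1/(α_Y(v) + β_Y(v)). The sketch below checks this with illustrative voltage-independent rates (not the Hodgkin-Huxley fits).

```python
# Relaxation of a single gating variable, dY/dt = alpha*(1 - Y) - beta*Y,
# toward Y_inf = alpha/(alpha + beta).  Constant illustrative rates are used
# in place of the voltage-dependent Hodgkin-Huxley functions.

def gate_relax(alpha, beta, y0=0.0, T=50.0, dt=1e-3):
    y = y0
    for _ in range(int(T / dt)):
        y += dt * (alpha * (1.0 - y) - beta * y)   # forward-Euler step
    return y

alpha, beta = 0.2, 0.8
n_gate = gate_relax(alpha, beta)   # steady-state open probability of one subunit
P_K = n_gate ** 4                  # all four independent subunits must be open
```

The fourth-power dependence makes the whole-channel open probability much smaller than that of a single subunit at rest.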

Morris-Lecar Model
It is often convenient to consider a simplified planar model of a neuron, which tracks the membrane voltage v and a recovery variable w that represents the fraction of open potassium channels. The advantage of a two-dimensional model is that we can use phase-plane analysis to develop a geometric picture of neuronal spiking. One well-known example is the Morris-Lecar (ML) model [100]. Although this model was originally developed to describe Ca2+ spikes in molluscs, it has been widely used to study neural excitability for Na+ spikes [48], since it exhibits many of the same bifurcation scenarios as more complex models. The ML model has also been used to investigate subthreshold membrane potential oscillations (STOs) due to persistent Na+ currents [27,145]. Another advantage of the ML model is that it is straightforward to incorporate intrinsic channel noise [80,109,114,132]. To capture the fluctuations in membrane potential arising from stochastic switching in voltage-gated ion channels, we will consider a stochastic version of the ML model that couples a discrete jump process (representing the opening and closing of Ca2+ or Na+ ion channels) to a two-dimensional piecewise deterministic process (representing the membrane potential v and recovery variable w). We thus have an explicit example of a two-dimensional PDMP. (We can also consider fluctuations in the opening and closing of the K+ ion channels, in which case w is replaced by an additional discrete stochastic variable representing the fraction of open K+ channels [114,132]. This would yield a one-dimensional PDMP for the voltage alone.)

Deterministic Model
First, consider a deterministic version of the ML model [100] consisting of a fast inward calcium current (Ca2+), a slow outward potassium current (K+), a leak current (L), and an applied current (I_app). (In [80,114] the inward current is interpreted as a Na+ current, but the same parameter values as the original ML model are used.) For simplicity, each ion channel is treated as a two-state system that switches between an open and a closed state; the more detailed subunit structure of ion channels is neglected [64]. The membrane voltage v evolves as

C_m dv/dt = I_app − g_Ca a_∞(v)(v − V_Ca) − g_K w(v − V_K) − g_L(v − V_L),

where w is the K+ gating variable. It is assumed that the Ca2+ channels are in the quasi-steady state a_∞(v), thus eliminating the fraction of open Ca2+ channels as a variable.
Here the g_i are ion conductances, and the V_i are reversal potentials. The opening and closing rates of the ion channels depend only on the membrane potential v and are represented by α and β, respectively, so that

dw/dt = α(v)(1 − w) − β(v)w.
For the ML model, the rates can be expressed as α(v) = φ w_∞(v)/τ_w(v) and β(v) = φ[1 − w_∞(v)]/τ_w(v), where w_∞(v) is the steady-state K+ activation function, τ_w(v) sets the voltage-dependent time scale, and φ is a rate constant. The dynamics of this system can be explored using phase-plane analysis, as illustrated in Fig. 4 for an excitable regime. Exploiting the fact that the K+ dynamics is much slower than the voltage and Ca2+ dynamics, we can use a slow/fast analysis to investigate the initiation of an action potential following a perturbing stimulus [81]. The ML model can also support oscillatory solutions; see also Sect. 6.
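A minimal forward-Euler integration of the deterministic ML equations is sketched below, using a commonly quoted Hopf-regime parameter set and the usual tanh/cosh forms for a_∞, w_∞, and τ_w (all illustrative assumptions on our part, not necessarily the values used in the papers cited here). With I_app = 0 the system relaxes to a stable rest state, consistent with the excitable regime of Fig. 4.

```python
import math

# Illustrative ML parameter set (assumed, not taken from the text)
p = dict(C=20.0, gCa=4.4, VCa=120.0, gK=8.0, VK=-84.0, gL=2.0, VL=-60.0,
         v1=-1.2, v2=18.0, v3=2.0, v4=30.0, phi=0.04, Iapp=0.0)

def a_inf(v):   # quasi-steady-state Ca2+ activation
    return 0.5 * (1.0 + math.tanh((v - p['v1']) / p['v2']))

def w_inf(v):   # steady-state K+ activation
    return 0.5 * (1.0 + math.tanh((v - p['v3']) / p['v4']))

def tau_w(v):   # voltage-dependent K+ time scale
    return 1.0 / math.cosh((v - p['v3']) / (2.0 * p['v4']))

def rhs(v, w):
    I_ion = (p['gCa'] * a_inf(v) * (v - p['VCa'])
             + p['gK'] * w * (v - p['VK'])
             + p['gL'] * (v - p['VL']))
    dv = (p['Iapp'] - I_ion) / p['C']
    dw = p['phi'] * (w_inf(v) - w) / tau_w(v)
    return dv, dw

v, w = -30.0, 0.1
dt, T = 0.01, 1000.0
for _ in range(int(T / dt)):    # forward-Euler integration to the rest state
    dv, dw = rhs(v, w)
    v += dt * dv
    w += dt * dw
```

Raising I_app sufficiently in this regime destabilizes the rest state via a Hopf bifurcation, producing the oscillatory solutions mentioned above.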

Stochastic Model
The deterministic ML model holds under the assumption that the number of ion channels is very large, so that the ion channel activation can be approximated by the average ionic currents. However, it is known that channel noise does affect membrane potential fluctuations and thus neural function [146].

Fig. 4 Deterministic phase-plane dynamics (adapted from [114]). Thick curves show the nullclines: v̇ = 0 (grey) and ẇ = 0 (black). Black stream lines represent deterministic trajectories; green/blue curves represent an action potential trajectory in the limit of slow w. Parameter values are C_m = 20 mF, ...

To account for ion channel fluctuations, we consider a stochastic version of the ML model [80,114,132], in which the number N of Ca2+ channels is taken to be relatively small. (For simplicity, we ignore fluctuations in the K+ channels by taking the number of the latter to be very large.) Let n(t) be the number of open Ca2+ channels at time t, which means that there are N − n(t) closed channels. For n(t) = n, the voltage and recovery variables evolve according to the following PDMP:

C_m dv/dt = (n/N) g_Ca (V_Ca − v) + g_K w (V_K − v) + g_L (V_L − v) + I_app,   (3.9)
dw/dt = φ (w_∞(v) − w)/τ_w(v).

Suppose that individual channels switch between open (O) and closed (C) states via a two-state Markov chain with v-dependent opening rate α(v)/ε and closing rate β(v)/ε. It follows that, at the population level, the number of open ion channels evolves according to a birth-death process with transition rates

ω_+(n) = α(v)(N − n)/ε,   ω_−(n) = β(v) n/ε.

Note that we have introduced the small parameter ε to reflect the fact that Ca2+ channels open and close much faster than the relaxation dynamics of the system (v, w). This is consistent with the parameter values of the ML model, where the slowness of the K+ channels is reflected by the parameter φ = 0.04 ms−1, and the membrane rate constant is of order 0.05 ms−1, whereas the transition rates of Ca2+ or Na+ channels are of order 1 ms−1.
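As an illustration of the hybrid dynamics, the following minimal simulation sketch couples an Euler discretization of the (v, w) flow to per-channel binomial switching. The normalization α = a_∞/ε, β = (1 − a_∞)/ε and the specific parameter values are illustrative assumptions for this sketch, not the precise choices of [80,114]:

```python
import numpy as np

# Illustrative parameters in the spirit of the ML model [100]; the channel
# rate functions alpha = a_inf/eps, beta = (1 - a_inf)/eps are assumptions.
C_m, g_Ca, g_K, g_L = 20.0, 4.4, 8.0, 2.0        # capacitance and conductances
V_Ca, V_K, V_L = 120.0, -84.0, -60.0             # reversal potentials (mV)
v1, v2, v3, v4, phi = -1.2, 18.0, 2.0, 30.0, 0.04
I_app, N_ch, eps = 0.0, 50, 0.1                  # stimulus, channel number, fast-switching parameter

def a_inf(v):                                    # quasi-steady Ca2+ open fraction
    return 0.5 * (1.0 + np.tanh((v - v1) / v2))

def w_inf(v):
    return 0.5 * (1.0 + np.tanh((v - v3) / v4))

def tau_w(v):
    return 1.0 / np.cosh((v - v3) / (2.0 * v4))

def simulate(T=200.0, dt=0.01, seed=0):
    """Hybrid Euler scheme: (v, w) follows the piecewise deterministic flow,
    while the number n of open Ca2+ channels is a birth-death process."""
    rng = np.random.default_rng(seed)
    v, w, n = -60.0, w_inf(-60.0), 0
    steps = int(T / dt)
    vs, ns = np.empty(steps), np.empty(steps, dtype=int)
    for k in range(steps):
        # per-channel switching probabilities over one time step
        p_open = 1.0 - np.exp(-a_inf(v) * dt / eps)
        p_close = 1.0 - np.exp(-(1.0 - a_inf(v)) * dt / eps)
        n += rng.binomial(N_ch - n, p_open) - rng.binomial(n, p_close)
        # piecewise deterministic flow at fixed n
        v += dt * ((n / N_ch) * g_Ca * (V_Ca - v) + g_K * w * (V_K - v)
                   + g_L * (V_L - v) + I_app) / C_m
        w += dt * phi * (w_inf(v) - w) / tau_w(v)
        vs[k], ns[k] = v, n
    return vs, ns

vs, ns = simulate()
```

For small ε the trajectory hugs the deterministic flow, with occasional channel-noise-driven excursions; decreasing N_ch or increasing ε amplifies the voltage fluctuations.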
For fixed v, the stationary density of the birth-death process is the binomial distribution

ρ_n(v) = N!/(n!(N − n)!) a_∞(v)^n (1 − a_∞(v))^{N−n},   a_∞(v) = α(v)/(α(v) + β(v)).

The corresponding CK equation for the densities p_n(v, w, t) has the general form (2.6), with the drift given by the right-hand sides of the PDMP (3.9) at fixed n, and with A the tridiagonal generator matrix of the birth-death process. Carrying out the QSS diffusion approximation of Sect. 2.2 then yields an Itô FP equation for C(v, w, t) = Σ_{n=0}^{N} p_n(v, w, t) (see also [27]), in which the drift is obtained by replacing n/N by a_∞(v), and the O(ε) diffusion term follows from a calculation in [80].
Almost all previous studies of ion channel fluctuations are based on some form of diffusion approximation, thus reducing the continuous dynamics to an effective Langevin equation [32,54,64,146]. However, these various approximations can lead to exponentially large errors in estimates of quantities such as the rate at which noise-driven action potentials are generated in the excitable regime. This has motivated recent work that deals directly with the CK equation (3.13). For example, Keener and Newby [80,115] consider the simplified problem of how ion channel fluctuations affect the initiation of an action potential due to the opening of a finite number of Ca2+ or Na+ channels. The slow K+ channels are assumed to be frozen, so that they effectively act as a leak current, and each channel is treated as a single activating subunit. The recovery variable w is thus fixed, and the potassium current can be absorbed into the function g(v), giving

dv/dt = (n/N) f(v) − g(v),   f(v) = g_Ca (V_Ca − v),   (3.16)

where C_m has been absorbed into the time scale, and the CK equation (3.13) reduces to a CK equation (3.17) for the marginal densities p_n(v, t). Since the right-hand side of equation (3.16) is negative (positive) for large (small) v, it follows that there exists an invariant interval for the voltage dynamics. In particular, let v_0 denote the voltage for which v̇ = 0 when n = 0, and let v_N be the corresponding voltage when n = N, that is, g(v_0) = 0 and f(v_N) = g(v_N). In the fast switching limit ε → 0, we obtain the first-order deterministic rate equation

dv/dt = a_∞(v) f(v) − g(v) ≡ −dΨ/dv.   (3.18)

We have introduced the effective potential Ψ(v), whose minima and maxima correspond to stable and unstable fixed points of the mean-field equation. By plotting the potential Ψ, it is straightforward to show that equation (3.18) exhibits bistability for a range of stimuli I_app, that is, there exist two stable fixed points v_± separated by an unstable fixed point v_0; see Fig. 5. The problem of the spontaneous initiation of an action potential for small but finite ε thus reduces to an escape problem for a stochastic hybrid system, as outlined in Sect. 2.3.
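To illustrate the bistable structure of the reduced rate equation, one can locate its fixed points numerically as the sign changes of the right-hand side. In the sketch below, freezing the recovery variable at w = 0.1 is our own illustrative assumption; the remaining parameters follow the standard ML model:

```python
import numpy as np

# Reduced mean-field dynamics with w frozen at 0.1 (an illustrative
# assumption); remaining parameters as in the standard ML model [100].
g_Ca, g_K, g_L = 4.4, 8.0, 2.0
V_Ca, V_K, V_L = 120.0, -84.0, -60.0
v1, v2 = -1.2, 18.0
w_frozen, I_app = 0.1, 0.0

def a_inf(v):
    return 0.5 * (1.0 + np.tanh((v - v1) / v2))

def F(v):
    """RHS of the eps -> 0 rate equation dv/dt = F(v) = -Psi'(v)."""
    return (a_inf(v) * g_Ca * (V_Ca - v) + g_K * w_frozen * (V_K - v)
            + g_L * (V_L - v) + I_app)

v_grid = np.linspace(-80.0, 80.0, 4000)
Fv = F(v_grid)

# effective potential Psi(v) = -int F dv (trapezoidal rule); minima/maxima
# of Psi sit at the stable/unstable fixed points of the rate equation
Psi = -np.concatenate(([0.0],
                       np.cumsum(0.5 * (Fv[1:] + Fv[:-1]) * np.diff(v_grid))))

# fixed points = sign changes of F: two stable states separated by one unstable
fixed_points = v_grid[np.flatnonzero(np.diff(np.sign(Fv)) != 0)]
```

With these choices the scan finds three fixed points (two stable, one unstable), confirming bistability at I_app = 0; sweeping I_app shifts Ψ and eventually destroys one of the minima at the critical current.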

Metastability in the Stochastic Ion Channel Model
To calculate the mean escape rate from the resting state v_− using the Arrhenius formula (2.48), we take v → x and calculate the functions Φ_0(x), k(x), and D(x).

Fig. 5 At a critical current I*, the deterministic system switches from a bistable to a monostable regime; that is, I* is the threshold current for action potential generation.

In the case of the stochastic ion channel model, equation (2.39) takes an explicit form that can be solved with an exponential trial solution parameterized by a pair of functions Γ(x, q) and Λ_0(x, q). Collecting terms independent of n and terms linear in n yields a pair of equations for Γ and Λ_0; eliminating Γ from these equations gives a quadratic equation for Λ_0. Along the zero-energy surface Λ_0(x, q) = 0, we have h(x, q) = 0 for the underlying Hamiltonian, which yields a pair of solutions for q (one of which is the trivial solution q = 0), and the normalized eigenfunction for the nontrivial case can be written down explicitly. In Fig. 6, we show solutions to Hamilton's equations in the (x, q)-plane, highlighting the zero-energy maximum-likelihood curve linking x_− and x_0. Note that NΦ(x_0), where Φ(x_0) is the area enclosed by the heteroclinic connection from x_− to x_0, gives the leading-order contribution to log τ, where τ is the mean escape time. The next step is to determine the null eigenfunction S_n(x) of equation (2.43); trying a solution of the same exponential form, the function Γ is determined by canceling terms independent of n (3.27).

Fig. 7 Schematic diagram comparing the MFPT calculated using the diffusion approximation with the MFPT of the full system. (Redrawn from [80].) The scales of the axes are based on numerical results for N = 10; other parameter values as in Fig. 4.

Finally, a QSS analysis of the CK equation determines the effective diffusivity D(x) [80], where the fixed-point condition for g has been used. Keener and Newby [80] calculated the MFPT (τ = 1/λ_0) using equation (2.48) and showed that their results agreed very well with Monte Carlo simulations of the full system, whose probability density evolves according to the CK equation (3.17).
A summary of their findings is shown schematically in Fig. 7, together with the corresponding MFPT obtained using a quasi-steady-state diffusion approximation. The main observation is that although the Gaussian-like diffusion approximation does well in the superthreshold regime (I_app > I*), it deviates significantly from the full model results in the subthreshold regime (I_app < I*), where it overestimates the mean time to spike. This reflects the fact that the effective potential obtained from the steady-state density of the diffusion approximation generates exponentially large errors in the MFPT.
In the above analysis of membrane voltage fluctuations, it was assumed that the potassium channel dynamics could be ignored during initiation of a spontaneous action potential (SAP). This corresponds to keeping the recovery variable w fixed. The resulting stochastic bistable model supported the generation of SAPs due to fluctuations in the opening and closing of the fast Ca2+ or Na+ channels. However, it is also possible to generate a SAP due to fluctuations causing several K+ channels to close simultaneously, effectively decreasing w and thereby causing v to rise. It follows that keeping w fixed in the stochastic model excludes the latter mechanism, so that the resulting MFPT calculation underestimates the spontaneous rate of action potentials. To investigate this phenomenon, it is necessary to consider the full stochastic ML model given by equations (3.9), with a multiplicative noise term added to the dynamics of the recovery variable that takes into account a finite number M of potassium ion channels. An additional complication is that the full model is an excitable rather than a bistable system, so it is not straightforward to relate the generation of SAPs to a noise-induced escape problem. Nevertheless, Newby et al. [110,114] used WKB methods to identify the most probable paths of escape from the resting state and obtained the following results: (i) The most probable paths of escape dip significantly below the resting value for w, indicating a breakdown of the deterministic slow/fast decomposition. (ii) Escape trajectories all pass through a narrow region of state space (a bottleneck, or stochastic saddle node), so that, although there is no well-defined separatrix for an excitable system, it is possible to formulate an escape problem by determining the MFPT to reach the bottleneck from the resting state.

Stochastic Gap Junctions and Randomly Switching Environments
Many neurons in the mammalian central nervous system communicate via gap junctions, also known as electrical synapses [35]. Gap junctions are arrays of transmembrane channels that connect the cytoplasm (aqueous interior) of two neighboring cells and thus provide a direct diffusion pathway for ionic current and small organic molecules to move between cells. In many cases the electrical coupling is strong enough to mediate the synchronization of subthreshold and spiking activity among clusters of neurons. Cells sharing a gap junction channel each provide a hemichannel (also known as a connexon), and the two hemichannels connect head-to-head [50,66,127]; see Fig. 8(a). Each hemichannel is composed of proteins called connexins, which exist as various isoforms named Cx23 through Cx62, with Cx43 being the most common. Just as with the opening and closing of ion channels (see Sect. 3), gap junctions can be gated by both voltage and chemical agents; there appear to be at least two gating mechanisms associated with gap junctions [31], as illustrated in Fig. 8(b). Even when a gap junction is open, it tends to restrict the flow of molecules, and this is typically modeled by assigning the gap junction a certain channel permeability [81]. Given that gap junctions are gated, thermal fluctuations could result in the stochastic opening and closing of gap junctions in an analogous fashion to ion channels. There has been relatively little work on the effects of thermal noise on gap junction diffusive coupling, beyond modeling the voltage characteristics of a single stochastically gated gap junction [120]. Recently, however, there have been several studies analyzing the effective permeability of stochastic gap junctions by formulating the problem as diffusion in a domain with randomly switching internal barriers, modeled as a piecewise deterministic PDE [12,19].

Fig. 9 One-dimensional diffusion in a domain with a randomly switching gate on the right-hand side.
To introduce the basic theory, we begin with the simpler problem of diffusion in a bounded interval with a randomly switching exterior boundary [11,92]. The latter can represent the random opening and closing of a stochastic ion channel in the plasma membrane of a cell or a subcellular compartment [17].

Diffusion on an Interval with a Switching Exterior Boundary
Consider particles diffusing in the finite interval [0, L] with a fixed absorbing boundary at x = 0 and a randomly switching gate at x = L; see Fig. 9. Let N(t) denote the state of the gate, which switches between the two states n = 0, 1 according to a two-state Markov process with rates α, β (4.1). The particle concentration u(x, t) evolves according to the diffusion equation

∂u/∂t = D ∂²u/∂x²,  0 < x < L,   (4.2a)

with u satisfying the boundary conditions

u(0, t) = 0;   u(L, t) = η if the gate is open,  ∂_x u(L, t) = 0 if the gate is closed.   (4.2b)

We are assuming that when the gate is open, the system is in contact with a particle bath of density η. Note that equations (4.2a)-(4.2b) only hold between jumps in the state of the gate, so that this is an example of a piecewise deterministic PDE. Since each realization of the gate will typically generate a different solution u(x, t), it follows that u(x, t) is a random field.

Derivation of Moment Equations
In [18] a method was developed for deriving moment equations of the stochastic density u(x, t) in the case of particles diffusing in a domain with randomly switching boundary conditions. The basic approach is to discretize the piecewise deterministic diffusion equation (4.2a)-(4.2b) with respect to space using a finite-difference scheme and then to construct the differential CK equation for the resulting finite-dimensional stochastic hybrid system. One nice feature of finite differences is that the boundary conditions can be incorporated into the resulting discrete linear operators. Since the CK equation is linear in the dependent variables, we can derive a closed set of moment equations for the discretized density and then retake the continuum limit. (For an alternative, probabilistic approach to deriving moment equations, see [90].) The first step is to introduce the lattice spacing a such that (N + 1)a = L for integer N, and to set u_j = u(aj), j = 0, ..., N + 1. We then obtain the PDMP

du_i/dt = (D/a²) Σ_j Δ^n_{ij} u_j,  i = 1, ..., N,   (4.3)

for n = 0, 1. Away from the boundaries (i ≠ 1, N), Δ^n_{ij} is given by the discrete Laplacian

Δ^n_{ij} = δ_{i,j+1} + δ_{i,j−1} − 2δ_{i,j}.

On the left-hand absorbing boundary we have u_0 = 0, whereas the right-hand boundary condition depends on the state n of the gate (u_{N+1} = η when open, u_{N+1} = u_N when closed); both can be implemented by suitably modifying the rows i = 1, N of Δ^n_{ij}. Let u(t) = (u_1(t), ..., u_N(t)) and introduce the probability density p_n(u, t), where we have dropped the explicit dependence on initial conditions. The probability density evolves according to the differential CK equation for the stochastic hybrid system (4.3) (see Sect. 2.1), in which A is the generator matrix of the two-state Markov process (4.6). Since the drift terms in the CK equation (4.6) are linear in the u_j, it follows that we can obtain a closed set of equations for the moment hierarchy. Let

v_{n,k}(t) = E[u_k(t) 1_{N(t)=n}] = ∫ p_n(u, t) u_k du.   (4.8)
Multiplying both sides of the CK equation (4.6) by u_k(t) and integrating with respect to u gives, after integrating by parts and using that p_n(u, t) → 0 as u → ∞ by the maximum principle, a closed linear system for the first-order moments v_{n,k}(t). We have assumed that the initial discrete state is distributed according to the stationary distribution ρ_n, so that ∫ p_n(u, t) du = ρ_n.
Equations for rth-order moments, r ≥ 2, can be obtained in a similar fashion. Multiplying both sides of the CK equation (4.6) by u_{k_1}(t) ⋯ u_{k_r}(t) and integrating with respect to u gives (after integration by parts) a corresponding closed system for the rth-order moments. Finally, taking the continuum limit a → 0 in equation (4.9) and setting V_n(x, t) = E[u(x, t) 1_{N(t)=n}], we obtain the first-order moment equations (4.13a)-(4.13b) with boundary conditions (4.14). A similar procedure can be used to derive higher-order moment equations [18]. For example, the second-order moments C_n(x, y, t) (4.16) satisfy a pair of diffusion equations in (x, y) that couple to the first-order moments via the boundary conditions (4.18a)-(4.18b). One important point to highlight regarding the stochastic diffusion equation (4.2a)-(4.2b) is that it describes a population of particles diffusing in the same random environment. This means that, although the particles are noninteracting, statistical correlations arise at the population level, that is, E[u(x, t)u(y, t)] ≠ E[u(x, t)]E[u(y, t)]. This follows from the observation that the second-order moment equations are nonseparable, that is, C_n(x, y, t) ≠ V_n(x, t)V_n(y, t).

Analysis of First-Order Moments
The steady-state solution of equations (4.13a) and (4.13b) can be determined explicitly. Since equations (4.13a) and (4.13b) have a globally attracting steady state, we can set V_n(x) ≡ lim_{t→∞} V_n(x, t). Adding equations (4.13a) and (4.13b) and using the boundary conditions in equation (4.14) shows that the total mean V(x) = V_0(x) + V_1(x) satisfies V''(x) = 0 with V(0) = 0, so that V is a linear function of x whose slope is expressed in terms of the constant κ = V_0(L), which has to be determined. The boundary conditions then yield an explicit piecewise solution for V_0 and V_1 in terms of κ, and setting x = L gives a self-consistency condition that can be rearranged to yield the explicit expression for κ derived in [11,92] (equation (4.24)). In the fast switching limit ξ → ∞, the mean concentration reduces to the linear profile V(x) = ηx/L, as if the gate were permanently open.
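The steady-state analysis can be checked numerically by assembling the discretized first-moment equations on the lattice and solving the resulting linear system directly. The sketch below uses our own state labeling ("open"/"closed") and discretization conventions; it verifies that the total mean concentration is linear in x and that the boundary value approaches η in the fast switching limit:

```python
import numpy as np

D, L, eta = 1.0, 1.0, 1.0
M = 100                                  # interior lattice points, a = L/(M+1)

def solve_moments(k_open, k_close):
    """Steady-state first moments V_o(x) = E[u 1_open], V_c(x) = E[u 1_closed]
    for diffusion on [0, L] with an absorbing boundary at x = 0 and a
    switching bath at x = L. State labeling and discretization are ours."""
    a = L / (M + 1)
    lap = D / a ** 2
    rho_o = k_open / (k_open + k_close)  # stationary prob. gate is open
    A = np.zeros((2 * M, 2 * M))
    b = np.zeros(2 * M)
    for j in range(M):
        o, c = j, M + j
        # D V_o'' - k_close V_o + k_open V_c = 0, and vice versa
        A[o, o] += -2.0 * lap - k_close
        A[o, c] += k_open
        A[c, c] += -2.0 * lap - k_open
        A[c, o] += k_close
        if j > 0:                        # x = 0 absorbing: ghost value 0
            A[o, o - 1] += lap
            A[c, c - 1] += lap
        if j < M - 1:
            A[o, o + 1] += lap
            A[c, c + 1] += lap
        else:
            b[o] -= lap * rho_o * eta    # bath when open: V_o(L) = rho_o * eta
            A[c, c] += lap               # reflecting when closed: ghost = last node
    V = np.linalg.solve(A, b)
    return V[:M] + V[M:]                 # total mean at the interior nodes
```

Adding the two discrete equations eliminates the switching terms, so the computed total mean is exactly linear in x; its boundary value is close to η for fast switching and is reduced for slow switching.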

Diffusive Flux Along a One-Dimensional Array of Electrically Coupled Neurons
Let us now consider a simple one-dimensional (1D) model of molecules diffusing along a line of M cells that are connected via gap junctions, see Fig. 10. For the moment, we ignore the effects of stochastic gating. Since gap junctions have relatively high resistance to flow compared to the cytoplasm, we assume that each intercellular membrane junction acts like an effective resistive pore with some permeability μ.
However, at each of the intercellular boundaries x = l_j ≡ jL, j = 1, ..., M − 1, the concentration is discontinuous due to the permeability of the gap junctions. Conservation of diffusive flux across each boundary then implies that

−D ∂_x u(l_j^−, t) = −D ∂_x u(l_j^+, t) = μ[u(l_j^−, t) − u(l_j^+, t)],   (4.28a)

where the superscripts + and − indicate that the function values are evaluated as limits from the right and left, respectively. Finally, it is necessary to specify the exterior boundary conditions at x = 0 and x = ML. We impose Dirichlet boundary conditions with u(0, t) = η and u(ML, t) = 0. In steady state, there is a constant flux J_0 = −DK_0 through the system, and the steady-state concentration is piecewise linear with slope K_0 in each cell. Rearranging equations (4.28a) and iterating across the M − 1 junctions shows that the total concentration drop η is the sum of the drops across the cells and across the junctions,

η = −K_0 [ML + (M − 1)D/μ].

Introducing the effective diffusion coefficient D_e according to J_0 = D_e η/(ML), we see that, for large M,

D_e^{−1} ≈ D^{−1} + (μL)^{−1}.   (4.32)
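The iteration above amounts to adding the diffusive resistance of the cytosol in each cell to the resistance of each junction, as for resistors in series. A short sketch verifying the series-resistance formula against a direct linear solve (variable names and setup are ours):

```python
import numpy as np

# Treating each cell interior and each junction as resistors in series:
#   J0 = eta / (M*L/D + (M-1)/mu),   1/D_e = 1/D + (M-1)/(mu*M*L).
D, L, mu, eta = 1.0, 1.0, 0.5, 1.0

def flux_series(M):
    return eta / (M * L / D + (M - 1) / mu)

def flux_direct(M):
    """Cross-check: solve directly for the junction concentrations u_j^-,
    u_j^+ and the common steady-state flux J."""
    def im(j): return 2 * (j - 1)        # index of u_j^-
    def ip(j): return 2 * (j - 1) + 1    # index of u_j^+
    nU = 2 * (M - 1)                     # junction unknowns; last unknown is J
    A = np.zeros((nU + 1, nU + 1))
    b = np.zeros(nU + 1)
    row = 0
    for j in range(1, M):
        # flux through cell j: D*(u_{j-1}^+ - u_j^-)/L = J   (u_0^+ = eta)
        A[row, im(j)] = -D / L
        A[row, nU] = -1.0
        if j == 1:
            b[row] = -D * eta / L
        else:
            A[row, ip(j - 1)] = D / L
        row += 1
        # flux through junction j: mu*(u_j^- - u_j^+) = J
        A[row, im(j)] = mu
        A[row, ip(j)] = -mu
        A[row, nU] = -1.0
        row += 1
    # flux through the last cell: D*(u_{M-1}^+ - 0)/L = J
    A[row, ip(M - 1)] = D / L
    A[row, nU] = -1.0
    return np.linalg.solve(A, b)[nU]
```

The two computations agree to machine precision, and for large M the effective diffusivity approaches the limit 1/D_e = 1/D + 1/(μL).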

Effective Permeability for Cells Coupled by Stochastically Gated Gap Junctions
This deterministic model has recently been extended to incorporate the effects of stochastically gated gap junctions [12]. The resulting model can be analyzed by extending the theory of diffusion in domains with randomly switching exterior boundaries [18] (see Sect. 4.1) to the case of switching interior boundaries. Solving the resulting first-order moment equations of the stochastic concentration allows us to calculate the mean steady-state concentration and flux, and thus extract the effective single-gate permeability of the gap junctions. We start by looking at a pair of stochastically coupled cells, modeled as the domain Ω = [0, 2L] partitioned into Ω_1 = (0, l) and Ω_2 = (l, 2L) by a gate at x = l. Assume that transitions between the two states n = 0, 1 of the gate are described by the two-state Markov process (4.1). The random opening and closing of the gate means that u evolves according to the diffusion equation in Ω_1 ∪ Ω_2, with u satisfying Dirichlet boundary conditions on the exterior boundaries of Ω,

u(0, t) = η,  u(2L, t) = 0,   (4.34)

and N(t)-dependent boundary conditions on the interior boundary at x = l: when the gate is open, there is continuity of the concentration and the flux across x = l,

u(l^−, t) = u(l^+, t),  ∂_x u(l^−, t) = ∂_x u(l^+, t),

whereas when the gate is closed, the right-hand boundary of Ω_1 and the left-hand boundary of Ω_2 are reflecting, ∂_x u(l^±, t) = 0. Here l^± = lim_{ε→0^+} l ± ε. For simplicity, we assume that the diffusion coefficient is the same in both compartments, so that the piecewise nature of the solution is solely due to the switching gate. For illustration, we take the exterior boundary conditions to be Dirichlet, but the analysis is easily modified, for example, in the case of a Neumann boundary condition at one of the ends.

First-Order Moment Equations and Effective Permeability (M = 2)
To determine the effective permeability of a stochastically gated gap junction, we need to calculate the mean of the concentration u(x, t) defined by equation (4.19).
The corresponding first-order moment equations for V_n can be derived along similar lines to the case of 1D diffusion in a domain with an exterior gate. We thus obtain equations (4.13a) and (4.13b) for x ∈ Ω_1 ∪ Ω_2, with exterior boundary conditions [12] V_n(0, t) = ρ_n η and V_n(2L, t) = 0, together with interior boundary conditions (4.38) expressing continuity of the open-state moment and its flux across x = l, and reflecting conditions for the closed-state moment at x = l^±. As in Sect. 4.1, we analyze the steady-state solution. From the interior boundary conditions (4.38), the piecewise fluxes can be expressed in terms of constants K_0 and K_1, with K_1 to be determined by imposing V_1(l^−) = V_1(l^+). Adding equations (4.13a) and (4.13b) and imposing the boundary conditions then yields a piecewise linear solution (4.41) for the total mean V(x), together with a piecewise solution for the individual moments in which the remaining coefficients are fixed by the exterior and interior boundary conditions in terms of K_1. Finally, the unknown coefficient K_1 is determined by requiring that V_1(x) is continuous across x = l, which can be rearranged to yield an explicit result for the mean flux through the gate, J_0 = −DK_0, and hence an effective permeability μ_e (4.45). It is useful to note some asymptotic properties of the solution given by equations (4.41) and (4.45). First, in the fast switching limit ξ → ∞, we have J_0 → ηD/2L and μ_e → ∞, and equation (4.41) reduces to the continuous steady-state solution without a gate; in particular, the mean flux through the gate is the same as the steady-state flux without a gate. On the other hand, for finite switching rates, the mean flux J_0 is reduced. In the limit α → 0 (gate always closed), J_0 → 0, so that V(x) = η for x ∈ [0, l) and V(x) = 0 for x ∈ (l, 2L]. Finally, in the limit l → 2L, we recover the result for 1D diffusion in a single domain with a switching external boundary [11,92] (see also equation (4.24)). (Interestingly, such a model is formally equivalent to a signaling model analyzed in [94].)
The analysis is considerably more involved if the gap junctions physically switch, because significant statistical correlations arise from the fact that all the particles move in the same random environment, which exists in 2^{M−1} different states if the gates switch independently [12]. Therefore we restrict the analysis to the simpler problem in which individual particles independently switch conformational states: if a particle is in state N(t) = 0, then it cannot pass through a gate, whereas if it is in state N(t) = 1, it passes through unimpeded. The resulting moment equations can be solved along similar lines to the two-cell case [12], which ultimately yields an explicit expression for the flux J_0 as a function of the switching rates and the number of cells.

Volume Neurotransmission
Although many neurons communicate via synapse-specific connections or gap junctions, it is also possible for populations of neurons to make nonspecific connections via volume neurotransmission [33,58]; see Fig. 12. For example, neurons may send projections to some distant nucleus or subnucleus, where they increase the concentration of neurotransmitter within the extracellular space surrounding the nucleus. The resulting increase in concentration modulates the electrophysiological neural activity in the distant region by binding of neurotransmitter to receptors on the target cells.
One important class of volume transmission involves axonal projections transmitting neuromodulators such as dopamine and serotonin from brain stem nuclei to other brain regions such as the striatum and cortex.
Recently, volume transmission has been formulated as another example of diffusion in a randomly switching environment [91]. Here, the environment is the extracellular volume surrounding the target cells, whereas each axonal terminal acts as a source of neurotransmitter when the source neuron fires and as a sink for neurotransmitter otherwise, the latter due to the reuptake of neurotransmitter into the terminals. Lawley et al. [91] consider diffusion on a finite interval [0, L] as in Sect. 4.1 but with modified boundary conditions. One example assumes a reflecting boundary at x = 0 and a switching boundary at x = L due to the presence of a source cell at the right-hand side. The boundary condition thus switches between absorbing when the neuron is not firing (quiescent state N(t) = 0) and constant flux when the neuron is firing (firing state N(t) = 1). This yields a piecewise deterministic diffusion equation for the neurotransmitter concentration, with u satisfying the boundary conditions

∂_x u(0, t) = 0;   u(L, t) = 0 if N(t) = 0,  D ∂_x u(L, t) = η if N(t) = 1.   (4.53)

Analysis of the first-order moment equations for V_n(x) = E[u(x, t)1_{N(t)=n}] establishes that, in steady state, the total mean concentration V = V_0 + V_1 is independent of x. Here α is the switching rate from the quiescent state to the firing state, β is the switching rate of the reverse transition, and μ denotes the ratio α/β. Thus we observe the same mean concentration V throughout the extracellular domain, even though some parts are further away from the source than others. Consistent with intuition, V increases with μ, which reflects the fact that the neuron on the boundary fires more often. Now suppose that both α and β become large (fast switching) with their ratio μ fixed. In this case V → 0, because any neurotransmitter that is released is rapidly reabsorbed at the same terminal. (Note that if the left-hand boundary is taken to be absorbing rather than reflecting, u(0, t) = 0, then the mean concentration is a linear function of x; this could represent a glial cell on the left-hand boundary, which absorbs neurotransmitter but does not fire.) The authors also consider the case where there is a source neuron at each end, so that each boundary switches according to an independent two-state Markov process. In this case the mean concentration is again uniform, provided that the two neurons are identical; otherwise, the mean concentration is a linear function of x.
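The spatial uniformity of the mean concentration can be seen directly from the first-moment equations: adding the two steady-state equations eliminates the switching terms, leaving V'' = 0 with a reflecting boundary at x = 0, so V is constant. The following sketch assembles the discretized moment equations (the discretization conventions and parameter values are our own) and confirms this:

```python
import numpy as np

D, L, eta = 1.0, 1.0, 1.0
alpha, beta = 1.0, 2.0       # quiescent -> firing and firing -> quiescent rates
M = 200                      # interior lattice points

def mean_profile():
    """Steady-state total mean E[u(x)] with a reflecting boundary at x = 0 and
    a source neuron at x = L switching between absorbing (quiescent) and
    constant-flux (firing) conditions; initial gate state is stationary."""
    a = L / (M + 1)
    lap = D / a ** 2
    rho_f = alpha / (alpha + beta)       # stationary prob. of firing
    A = np.zeros((2 * M, 2 * M))
    b = np.zeros(2 * M)
    for j in range(M):
        q, f = j, M + j
        # D V_q'' - alpha V_q + beta V_f = 0, and vice versa
        A[q, q] += -2.0 * lap - alpha
        A[q, f] += beta
        A[f, f] += -2.0 * lap - beta
        A[f, q] += alpha
        if j > 0:
            A[q, q - 1] += lap
            A[f, f - 1] += lap
        else:                            # reflecting at x = 0: ghost = node
            A[q, q] += lap
            A[f, f] += lap
        if j < M - 1:
            A[q, q + 1] += lap
            A[f, f + 1] += lap
        else:
            # x = L: V_q(L) = 0; constant flux D V_f'(L) = eta * rho_f
            b[f] -= lap * a * eta * rho_f / D
            A[f, f] += lap
    V = np.linalg.solve(A, b)
    return V[:M] + V[M:]

V_total = mean_profile()
```

The computed total mean is flat to machine precision, even though the individual moments V_q and V_f vary strongly near the switching boundary.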

Stochastic Vesicular Transport in Axons and Dendrites
The efficient delivery of mRNA, proteins, and other molecular products to their correct location within a cell (intracellular transport) is of fundamental importance to normal cellular function and development [1,23]. The challenges of intracellular transport are particularly acute for neurons, which are amongst the largest and most complex cells in biology, in particular, with regards to the efficient trafficking of newly synthesized proteins from the cell body or soma to distant locations on the axon and dendrites. In healthy cells, the regulation of mRNA and protein trafficking within a neuron provides an important mechanism for modifying the strength of synaptic connections between neurons [9,34,72,139], and synaptic plasticity is generally believed to be the cellular substrate of learning and memory. On the other hand, various types of dysfunction in protein trafficking appear to be a major contributory factor to a number of neurodegenerative diseases associated with memory loss, including Alzheimer's [38]. Broadly speaking, there are two basic mechanisms for intracellular transport: passive diffusion within the cytosol or the surrounding plasma membrane of the cell, and active motor-driven transport along polymerized filaments such as microtubules and F-actin that comprise the cytoskeleton. Newly synthesized products from the nucleus are mainly transported to other intracellular compartments or the cell membrane via a microtubular network that projects radially from organizing centres (centrosomes) and forms parallel fiber bundles within axons and dendrites. The same network is used to transport degraded cell products back to the nucleus. Moreover, various animal viruses including HIV take advantage of microtubule-based transport in order to reach the nucleus from the cell surface and release their genome through nuclear pores [36]. Microtubules are polarized filaments with biophysically distinct plus and minus ends. 
In general, a given molecular motor will move with a bias toward a specific end of the microtubule; for example, kinesin moves toward the (+) end and dynein moves toward the (−) end. Microtubules are arranged throughout an axon or dendrite with a distribution of polarities: in axons and distal dendrites, they are aligned with the (−) ends pointing to the soma (plus-end-out), and in proximal dendrites, they have mixed polarity.
Axons of neurons can extend up to 1 m in large organisms, but synthesis of many of their components occurs in the cell body. Axonal transport is typically divided into three main categories based upon the observed speed [29]: fast transport (1-9 μm/s) of organelles and vesicles, and slow transport (0.004-0.6 μm/s) of soluble proteins and cytoskeletal elements. Slow transport is further divided into two groups: cytoskeletal polymers such as microtubules and neurofilaments are transported in slow component A, whereas actin and actin-bound proteins are transported in slow component B. It had originally been assumed that the differences between fast and slow components were due to differences in transport mechanisms, but direct experimental observations now indicate that they all involve fast motors, differing instead in how the motors are regulated. Membranous organelles, which function primarily to deliver membrane and protein components to sites along the axon and at the axon tip, move rapidly in a unidirectional manner, pausing only briefly; in other words, they have a high duty ratio, the proportion of time a cargo complex is actually moving. On the other hand, cytoskeletal polymers and mitochondria move in an intermittent and bidirectional manner, pausing more often and for longer time intervals, and sometimes reversing direction. Such transport has a low duty ratio.
Another example of a transport process in neurons that exhibits bidirectionality is the trafficking of mRNA-containing granules within dendrites. There is increasing experimental evidence that local protein synthesis in the dendrites of neurons plays a crucial role in mediating persistent changes in synaptic structure and function, which are thought to be the cellular substrates of long-term memory [8,82,133]. This is consistent with the discovery that various mRNA species, as well as important components of the translational machinery such as ribosomes, are distributed in dendrites. Although many of the details concerning mRNA transport and localization are still unclear, a basic model is emerging. First, newly transcribed mRNA within the nucleus binds to proteins that inhibit translation, allowing the mRNA to be sequestered away from the protein-synthetic machinery within the cell body. The repressed mRNAs are then packaged into ribonucleoprotein granules that are subsequently transported into the dendrite via kinesin and dynein motors along microtubules. Finally, the mRNA is localized to an activated synapse by actin-based myosin motor proteins, and local translation is initiated following neutralization of the repressive mRNA-binding protein. Details regarding the motor-driven transport of mRNA granules in dendrites have been obtained by fluorescently labeling either the mRNA or mRNA-binding proteins [125] and using live-cell imaging to track the movement of granules in cultured neurons [44,86,125]. It has been found that, under basal conditions, the majority of granules in dendrites are stationary or exhibit small oscillations around a few synaptic sites. However, other granules exhibit rapid retrograde (toward the cell body) or anterograde (away from the cell body) motion consistent with bidirectional transport along microtubules. These movements can be modified by neuronal activity, as illustrated in Fig. 13.
In particular, there is an enhancement of dendritically localized mRNA due to a combination of newly transcribed granules being transported into the dendrite, and the conversion of stationary or oscillatory granules already present in the dendrite into anterograde-moving granules.

Intracellular Transport as a Velocity Jump Process
In terms of the general theme of this review, intracellular transport models are relevant because they consist of a special type of PDMP known as a velocity jump process [57,112,113,122,123]. In the case of one-dimensional transport along a filament, an individual particle moves according to the piecewise deterministic ODE dx dt = v n(t) , (5.1) where the discrete random variable n(t) ∈ Γ indexes the current velocity state v n(t) .
The simplest example is a particle switching between an anterograde state with velocity v_1 > 0 and a retrograde state with velocity v_0 < 0. In the unbiased case v_1 = −v_0 = v, we can write dx/dt = v ξ(t) with ξ(t) ∈ {−1, 1}; in the physics literature, ξ(t) is called a dichotomous Markov noise process (DMNP); see the review [5]. The corresponding CK equation is

∂p_1/∂t = −v_1 ∂p_1/∂x − β p_1 + α p_0,
∂p_0/∂t = −v_0 ∂p_0/∂x + β p_1 − α p_0,

where α, β are the corresponding switching rates, which can depend on the current position x. In applications, we are typically interested in the marginal density p(x, t) = p_0(x, t) + p_1(x, t), which can be used to calculate moments such as the mean and variance of the position. In the unbiased case, v_1 = v, v_0 = −v, α = β, the marginal probability density p(x, t) satisfies the telegrapher's equation

∂²p/∂t² + 2α ∂p/∂t = v² ∂²p/∂x².

(The individual densities p_{0,1} satisfy the same equation.) The telegrapher's equation can be solved explicitly for a variety of initial conditions. More generally, the short-time behavior (t ≪ 1/α) is characterized by wave-like propagation with ⟨x(t)²⟩ ∼ (vt)², whereas the long-time behavior (t ≫ 1/α) is diffusive with ⟨x(t)²⟩ ∼ 2Dt, D = v²/2α. For example, the explicit solution for the initial conditions p(x, 0) = δ(x) and ∂_t p(x, 0) = 0 consists of two terms representing the ballistic propagation of the initial data along the characteristics x = ±vt, together with terms involving modified Bessel functions I_n and the Heaviside function Θ, which asymptotically approach Gaussians in the large-time limit. The steady-state equation for p(x) is simply p''(x) = 0, which from integrability means that p(x) = 0 pointwise. This is consistent with the observation that the above explicit solution satisfies p(x, t) → 0 as t → ∞.
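The crossover from ballistic to diffusive behavior is easy to verify with an event-driven Monte Carlo simulation of the DMNP (parameter values are illustrative):

```python
import numpy as np

v, alpha = 1.0, 1.0              # speed and switching rate (unbiased case)
D_eff = v ** 2 / (2.0 * alpha)   # long-time diffusivity D = v^2 / (2 alpha)

def simulate_dmnp(T, n_particles=5000, seed=1):
    """Event-driven simulation of dx/dt = v*xi(t), where the dichotomous
    Markov noise xi = +/-1 flips at rate alpha."""
    rng = np.random.default_rng(seed)
    x = np.empty(n_particles)
    for i in range(n_particles):
        t, xi, pos = 0.0, rng.choice((-1.0, 1.0)), 0.0
        while True:
            dt = rng.exponential(1.0 / alpha)     # time to the next flip
            if t + dt >= T:
                pos += v * xi * (T - t)
                break
            pos += v * xi * dt
            t += dt
            xi = -xi
        x[i] = pos
    return x

x_long = simulate_dmnp(T=50.0)   # diffusive regime: var(x) ~ 2*D_eff*T
```

For T ≪ 1/α the sampled mean square displacement is close to (vT)², whereas for T ≫ 1/α the sampled variance approaches 2 D_eff T, in agreement with the telegrapher's equation asymptotics.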
One of the first examples of modeling intracellular transport as a velocity jump process was within the context of the slow axonal transport of neurofilaments [6,57,123]. Neurofilaments are space-filling cytoskeletal polymers that increase the cross-sectional area of axons, which in turn increases the propagation speed of action potentials. Radioisotopic pulse-labeling experiments provide information about the transport of neurofilaments at the population level, which takes the form of a slowly moving Gaussian-like wave that spreads out as it propagates distally. Blum and Reed [6] considered the following system on the semiinfinite domain 0 ≤ x < ∞:

∂p_1/∂t + v ∂p_1/∂x = Σ_{j=1}^n A_{1j} p_j, (5.4a)
∂p_i/∂t = Σ_{j=1}^n A_{ij} p_j, 2 ≤ i ≤ n, (5.4b)

where p_1 represents the concentration of moving neurofilament proteins, and p_i, i > 1, represent the concentrations in n − 1 distinct stationary states. In contrast to the two-state model of bidirectional transport, the system jumps between a single anterograde state and a set of stationary states. Conservation of mass implies that Σ_{i=1}^n A_{ij} = 0 for all j. Moreover, p_1(0, t) = 1 for t > 0. Reed et al. [123] carried out an asymptotic analysis of equations (5.4a)-(5.4b) that is related to the QSS reduction method of Sect. 2.2. Suppose that the solution is written in the form p_i(x, t) = p_i^ss Q_ε(s, t) in terms of a traveling-wave coordinate s centered at x = ut, where u is the effective speed, u = v p_1^ss / Σ_{j=1}^n p_j^ss, and p^ss is the steady-state solution, for which A p^ss = 0. They then showed that Q_ε(s, t) → Q_0(s, t) as ε → 0, where Q_0 is a solution to the diffusion equation with initial data given in terms of H, the Heaviside function. The diffusivity D can be calculated in terms of v and the transition matrix A. Hence the propagating and spreading waves observed in experiments can be interpreted as solutions to an effective advection-diffusion equation. More recently, a more rigorous analysis of spreading waves has been developed [56,57]. Note that the large-time behavior is consistent with the solution of the diffusion equation obtained in the fast switching limit.
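The effective speed u can be computed directly from a transition matrix. The following sketch uses a hypothetical three-state generator A (one moving state, two stationary states); the rates are ours and purely illustrative, not taken from [6,123].

```python
import numpy as np

# Effective speed for a Blum-Reed-type model: one moving state (index 0)
# and two stationary states, with a hypothetical generator A whose columns
# sum to zero. All rates are illustrative.
v = 1.0
k_off, k_on, k_12, k_21 = 2.0, 1.0, 0.5, 0.25
# A[i, j] = rate j -> i (off-diagonal); diagonal entries balance each column
A = np.array([[-k_off,  k_on,           0.0 ],
              [ k_off, -(k_on + k_12),  k_21],
              [ 0.0,    k_12,          -k_21]])

# steady state A @ p_ss = 0 obtained from the null space of A
_, _, vt = np.linalg.svd(A)
p_ss = np.abs(vt[-1])
p_ss /= p_ss.sum()

u_eff = v * p_ss[0]    # u = v * p1_ss / sum_j pj_ss  (densities normalized)
print(u_eff)           # = v/7 for these rates
```

For these rates the balance conditions give p^ss = (1/7, 2/7, 4/7), so the population wave travels at one-seventh of the on-track speed.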
In contrast to these population models, direct observations of neurofilaments in axons of cultured neurons using fluorescence microscopy have demonstrated that individual neurofilaments are actually transported by fast motors but in an intermittent fashion [142]. Hence, it has been proposed that the slow rate of movement of a population is an average of rapid bidirectional movements interrupted by prolonged pauses, the so-called stop-and-go hypothesis [28,77,93]. Computational simulations of an associated system of PDEs show how fast intermittent transport can account for the slowly spreading wave seen at the population level. One version of the model assumes that the neurofilaments can be in one of six states [28,93]: anterograde moving on track (state a), anterograde pausing on track (state a_0), anterograde pausing off track (state a_p), retrograde pausing on track (state r_0), retrograde pausing off track (state r_p), and retrograde moving on track (state r). The state transition diagram is shown in Fig. 14.

Tug-of-War Model of Bidirectional Motor Transport
The observation that many types of motor-driven cargo move bidirectionally along microtubules suggests that cargo is transported by multiple kinesin and dynein motors. In proximal dendrites, it is also possible that one or more identical motors move a cargo bidirectionally by switching between microtubules with different polarities. In either case, it is well established that multiple molecular motors often work together as a motor-complex to pull a single cargo [144]. An open question concerns how the set of molecular motors pulling a vesicular cargo are coordinated. One possibility is that the motors compete against each other in a tug-of-war where an individual motor interacts with other motors through the force it exerts on the cargo. If the cargo places a force on a motor in the opposite direction it prefers to move, then it will be more likely to unbind from the microtubule. A recent biophysical model has shown that a tug-of-war can explain the coordinated behavior observed in certain animal models [101,102].
Suppose that a certain vesicular cargo is transported along a one-dimensional track via N_+ right-moving (anterograde) motors and N_− left-moving (retrograde) motors. At a given time t, the internal state of the cargo-motor complex is fully characterized by the numbers n_+ and n_− of anterograde and retrograde motors that are bound to a microtubule and thus actively pulling on the cargo. Assume that over the time-scales of interest all motors are permanently bound to the cargo, so that 0 ≤ n_± ≤ N_±. The tug-of-war model of Muller et al. [101,102] assumes that the motors act independently, other than exerting a load on motors with the opposite directional preference. (However, some experimental work suggests that this is an oversimplification, that is, there is some direct coupling between motors [42].) Thus the properties of the motor complex can be determined from the corresponding properties of the individual motors together with a specification of the effective load on each motor. There are two distinct mechanisms whereby such bidirectional transport could be implemented [102]. First, the track could consist of a single polarized microtubule filament (or a chain of such filaments) on which up to N_+ kinesin motors and N_− dynein motors can attach; see Fig. 15. Since individual kinesin and dynein motors have different biophysical properties, with the former tending to exert more force on a load, it follows that even when N_+ = N_−, the motion will be biased in the anterograde direction. Hence, this version is referred to as an asymmetric tug-of-war model. Alternatively, the track could consist of two parallel microtubule filaments of opposite polarity such that N_+ kinesin motors can attach to one filament and N_− dynein motors to the other.

Fig. 15 Schematic diagram of an asymmetric tug-of-war model. Two kinesin and two dynein motors transport a cargo in opposite directions along a single polarized microtubule track. Transitions between two possible motor states are shown
In the latter case, if N_+ = N_−, then the resulting bidirectional transport is unbiased, and we have a symmetric tug-of-war model.
When bound to a microtubule, the velocity of a single molecular motor decreases approximately linearly with the force applied against the movement of the motor [141]. Thus, each kinesin is assumed to satisfy the linear force-velocity relation

v(F) = v_f (1 − F/F_s) for F ≤ F_s,  v(F) = v_b (1 − F/F_s) for F ≥ F_s, (5.5)

where F is the applied force in the retrograde direction, F_s is the stall force satisfying v(F_s) = 0, v_f is the forward motor velocity in the absence of an applied force in the preferred direction of the particular motor, and v_b is the backward motor velocity when the applied force exceeds the stall force. Dynein motors will also be taken to satisfy a linear force-velocity relation:

v̄(F) = v̄_f (1 − F/F̄_s) for F ≤ F̄_s,  v̄(F) = v̄_b (1 − F/F̄_s) for F ≥ F̄_s, (5.6)

where now F is the force in the anterograde direction. Since the parameters associated with kinesin and dynein motors are different, we distinguish the latter by taking F_s → F̄_s etc. The original tug-of-war model assumes that the binding rate of kinesin is independent of the applied force, whereas the unbinding rate is taken to be an exponential function of the applied force:

π(F) = π_0,  γ(F) = γ_0 e^{F/F_d}, (5.7)

where F_d is the experimentally measured force scale on which unbinding occurs. The force dependence of the unbinding rate is based on measurements of the walking distance of a single kinesin motor as a function of load [129], in agreement with Kramers' rate theory [70]. Similarly, for dynein, we take

π̄(F) = π̄_0,  γ̄(F) = γ̄_0 e^{F/F̄_d}. (5.8)

Let F_c denote the net load on the set of anterograde motors. Suppose that the molecular motors are not directly coupled to each other, so that they act independently and share the load; however, see [42]. It follows that a single anterograde motor feels the force F_c/n_+. Equation (5.7) implies that the binding and unbinding rates for n_+ kinesin motors take the form

π̃(n_+) = (N_+ − n_+) π_0,  γ̃(n_+) = n_+ γ_0 e^{F_c/(n_+ F_d)}. (5.9)

Similarly, each dynein motor feels the opposing force −F_c/n_−, so that the binding and unbinding rates for n_− dynein motors take the form

π̃(n_−) = (N_− − n_−) π̄_0,  γ̃(n_−) = n_− γ̄_0 e^{F_c/(n_− F̄_d)}. (5.10)

The cargo force F_c is determined by the condition that all the motors move with the same cargo velocity v_c.
Suppose that the net velocity is in the anterograde direction, which implies F_c/(n_− F̄_s) > 1 > F_c/(n_+ F_s). It follows from equations (5.5) and (5.6) that

v_c = v_f (1 − F_c/(n_+ F_s)) = v̄_b (F_c/(n_− F̄_s) − 1).

This generates a unique solution for the load F_c and cargo velocity v_c:

F_c = (v_f + v̄_b) / (v_f/(n_+ F_s) + v̄_b/(n_− F̄_s)),

and

v_c(n_+, n_−) = (n_+ F_s − n_− F̄_s) / (n_+ F_s/v_f + n_− F̄_s/v̄_b). (5.14)

The corresponding expressions when the backward motors are stronger, n_+ F_s < n_− F̄_s, are found by interchanging the roles of the kinesin and dynein parameters. The original study of [101,102] considered the stochastic dynamics associated with transitions between different internal states (n_+, n_−) of the motor complex, without specifying the spatial position of the complex along a 1D track. This defines a Markov process with a corresponding master equation for the time evolution of the probability distribution P(n_+, n_−, t). They determined the steady-state probability distribution of internal states and found that the motor complex exhibited at least three different modes of behavior: (i) the motor complex spends most of its time in states with approximately zero velocity; (ii) the motor complex exhibits fast backward and forward movement interrupted by stationary pauses, which is consistent with experimental studies of bidirectional transport; and (iii) the motor complex alternates between fast backward and forward movements. The transitions between these modes of behavior depend on motor strength, which primarily depends on the stall force. The tug-of-war model can also be formulated as a velocity jump process [112,113]. This version of the tug-of-war model simultaneously keeps track of the internal state of the motor complex and its location along a 1D track. That is, the position along the track evolves according to the piecewise deterministic ODE

dx/dt = v_c(n_+(t), n_−(t)), (5.15)

in between changes in the number of bound kinesin and dynein motors.
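The load and cargo-velocity formulas can be implemented directly. In the sketch below the parameter values are illustrative stand-ins (roughly kinesin-like and dynein-like), not fitted values from [101,102].

```python
import numpy as np

# Illustrative parameters (not fitted to data):
# kinesin: stall force F_s, forward/backward speeds v_f, v_b
# dynein:  stall force Fbar_s, forward/backward speeds vbar_f, vbar_b
F_s, v_f, v_b = 6.0, 1.0, 0.006          # pN, um/s
Fbar_s, vbar_f, vbar_b = 1.1, 0.65, 0.072

def cargo_state(n_plus, n_minus):
    """Load F_c and cargo velocity v_c when the kinesins dominate
    (n_plus*F_s > n_minus*Fbar_s), from the velocity-matching condition
    v_f(1 - F_c/(n_+ F_s)) = vbar_b(F_c/(n_- Fbar_s) - 1)."""
    a, b = n_plus * F_s, n_minus * Fbar_s
    F_c = (v_f + vbar_b) / (v_f / a + vbar_b / b)
    v_c = (a - b) / (a / v_f + b / vbar_b)
    return F_c, v_c

F_c, v_c = cargo_state(2, 1)
print(F_c, v_c)
```

Substituting F_c back into both single-motor force-velocity relations returns the same cargo velocity, confirming that the two displayed formulas are equivalent to the matching condition.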
The various state transitions are determined by the binding and unbinding rates of the individual kinesin and dynein motors given above. As in previous examples, the corresponding CK equation can be reduced to an effective advection-diffusion equation in the limit that the rates of binding and unbinding of molecular motors are sufficiently fast [112,113]. One of the useful features of the tug-of-war model is that it allows various biophysical processes to be incorporated into the model. For example, a convenient experimental method for changing the stall force (and hence the mode of motor behavior) is to vary the level of ATP available to the motor complex. At low [ATP], the motor has little fuel and is weaker, resulting in mode (i) behavior; then, as [ATP] increases and more fuel is available, mode (ii) behavior is seen until the stall force saturates at high values of [ATP], where mode (iii) behavior takes over. Thus, [ATP] provides a single control parameter that tunes the level of intermittent behavior exhibited by a motor complex [112]. Another potentially important signaling mechanism involves microtubule-associated proteins (MAPs). These molecules bind to microtubules and effectively modify the free-energy landscape of motor-microtubule interactions [134]. For example, tau is a MAP found in the axon of neurons and is known to be a key player in Alzheimer's disease [88]. Another important MAP, called MAP2, is similar in structure and function to tau but is present in dendrites; MAP2 has been shown to affect dendritic cargo transport [95]. Experiments have shown that the presence of tau or MAP2 on the microtubule can significantly alter the dynamics of kinesin, specifically, by reducing the rate at which kinesin binds to the microtubule [140]. This could be implemented by taking the binding rate π_0 of kinesin to decrease within the domain of enhanced MAP concentration. This means that in the fast switching limit, we obtain the deterministic equation (2.8) with F(x) corresponding to an x-dependent mean velocity.
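A minimal Gillespie-style sketch of the resulting velocity jump process is given below, assuming load-sharing binding/unbinding rates of the Müller type; all parameter values and the handling of the exactly balanced case are our own simplifications.

```python
import numpy as np

# Velocity jump simulation of the tug-of-war model: the state
# (n_plus, n_minus, x) jumps between motor configurations while x moves at
# the matched cargo velocity. Illustrative load-sharing rates are assumed:
# binding (N - n)*pi0, unbinding n*gamma0*exp(F_c/(n*F_d)).
F_s, v_f, v_b, pi0, g0, F_d = 6.0, 1.0, 0.006, 5.0, 1.0, 3.0           # kinesin
Fb_s, vb_f, vb_b, pib0, gb0, Fb_d = 1.1, 0.65, 0.072, 1.6, 0.27, 0.75  # dynein
Np, Nm = 3, 2

def load_velocity(np_, nm_):
    a, b = np_ * F_s, nm_ * Fb_s
    if a == b:                       # balanced (or empty) configuration: stall
        return a, 0.0
    if a > b:                        # kinesins dominate: anterograde motion
        Fc = 0.0 if b == 0 else (v_f + vb_b) / (v_f / a + vb_b / b)
        return Fc, (a - b) / (a / v_f + b / vb_b)
    Fc = 0.0 if a == 0 else (vb_f + v_b) / (vb_f / b + v_b / a)  # retrograde
    return Fc, -(b - a) / (b / vb_f + a / v_b)

def gillespie(T, seed=1):
    rng = np.random.default_rng(seed)
    n_p, n_m, x, t, path = Np, Nm, 0.0, 0.0, []
    while t < T:
        Fc, vc = load_velocity(n_p, n_m)
        rates = np.array([
            (Np - n_p) * pi0,                                   # kinesin binds
            n_p * g0 * np.exp(Fc / (n_p * F_d)) if n_p else 0.0,  # unbinds
            (Nm - n_m) * pib0,                                  # dynein binds
            n_m * gb0 * np.exp(Fc / (n_m * Fb_d)) if n_m else 0.0])
        dt = rng.exponential(1.0 / rates.sum())
        x += vc * min(dt, T - t)
        t += dt
        event = rng.choice(4, p=rates / rates.sum())
        n_p += (event == 0) - (event == 1)
        n_m += (event == 2) - (event == 3)
        path.append((t, n_p, n_m, x))
    return path

path = gillespie(T=20.0)
```

A trajectory of (t, n_+, n_−, x) tuples is returned; plotting x against t reproduces the characteristic runs and reversals of bidirectional transport.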
Suppose, for example, that F(x) = v̄ > 0 for x ∉ [X − l, X + l] and F(x) is a unimodal function for x ∈ [X − l, X + l] with a negative minimum at x = X. Here we are taking the region of enhanced tau to be an interval of length 2l centered about x = X. Writing F(x) = −Ψ'(x − X), the corresponding deterministic potential Ψ has the form shown in Fig. 16. Since the mean velocity switches sign within the domain [X − l, X + l], it follows that there exist a stable fixed point x_0 and an unstable fixed point x_*. One interesting effect of a local increase in MAPs is that it can generate stochastic oscillations in the motion of the motor-complex [113]. As a kinesin-driven cargo encounters the MAP-coated trapping region, the motors unbind at their usual rate but cannot rebind. Once the dynein motors are strong enough to pull the remaining kinesin motors off the microtubule, the motor-complex quickly transitions to (−)-end-directed transport. After the dynein-driven cargo leaves the MAP-coated region, kinesin motors can then reestablish (+)-end-directed transport until the motor-complex returns to the MAP-coated region. This process repeats until the motor-complex is able to move forward past the MAP-coated region. Interestingly, particle tracking experiments have observed oscillatory behavior during mRNA transport in dendrites [44,125]. In these experiments, motor-driven mRNA granules move rapidly until encountering a fixed location along the dendrite, where they slightly overshoot, stop, move backward, and begin to randomly oscillate back and forth. After a period of time, lasting on the order of minutes, the motor-driven mRNA stops oscillating and resumes fast ballistic motion. Calculating the mean time to escape the trapping region can be formulated as an FPT problem, in which the particle starts at x = x_0 and has to make a rare transition to the unstable fixed point at x = x_*. As in the analogous problem of stochastic action potential generation (Sect.
3), the QSS diffusion approximation breaks down for small ε, and we have to use the asymptotic methods of Sect. 2.3. The details can be found elsewhere [115].
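As an illustration of the deterministic skeleton of this trapping mechanism, the following sketch locates the stable and unstable fixed points x_0, x_* for a hypothetical mean-velocity profile F(x): a constant anterograde drift v̄ minus a Gaussian dip of depth A centered at X. The functional form and parameters are ours, not taken from [113,115].

```python
import numpy as np

# Hypothetical x-dependent mean velocity: constant drift vbar reduced by a
# Gaussian dip of depth A inside the MAP-coated region centered at X.
vbar, A, X, sigma = 0.4, 1.0, 0.0, 0.5

def F(x):
    return vbar - A * np.exp(-(x - X)**2 / (2 * sigma**2))

def bisect(f, a, b, tol=1e-12):
    """Simple bisection; assumes f changes sign on [a, b]."""
    fa = f(a)
    while b - a > tol:
        m = 0.5 * (a + b)
        if fa * f(m) <= 0:
            b = m
        else:
            a, fa = m, f(m)
    return 0.5 * (a + b)

x0 = bisect(F, -3.0, X)   # stable fixed point (F > 0 to its left, < 0 right)
xs = bisect(F, X, 3.0)    # unstable fixed point
print(x0, xs)
```

For this profile the roots can also be obtained in closed form, x = X ± σ√(2 ln(A/v̄)) ≈ ±0.677, which provides a check on the numerics.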
Interestingly, there is recent evidence that the selective transport of cargo into the axon depends on the localized restriction of MAP2 to the proximal axon [67]. It is known that in both mammalian and Drosophila axons, secretory vesicles are trafficked by the cooperative action of two types of kinesin motor, KIF5 and KIF1. Experimental studies of their motility indicate that MAP2 directly inhibits KIF5 motor activity and that axonal cargo entry and distribution depend on the balanced activities of KIF5 and KIF1 bound to the same cargo. That is, cargoes bound to the dominant motor KIF5 are unable to enter the axon, whereas those bound to motors that are not influenced by MAP2 are able to quickly enter the axon and move to the distal terminals. Moreover, cargoes bound to both KIF1 and KIF5 will enter the axon, but their axonal distribution will be affected by the reactivation of KIF5 past the proximal axon as the inhibition by MAP2 wears off, which slows down the transport; see Fig. 17.

Synaptic Democracy
A number of recent experimental studies of intracellular transport in axons of C. elegans and Drosophila have shown that (i) motor-driven vesicular cargo exhibits "stop and go" behavior, in which periods of ballistic anterograde or retrograde transport are interspersed by long pauses at presynaptic sites, and (ii) the capture of vesicles by synapses during the pauses is reversible in the sense that the aggregation of vesicles can be inhibited by signaling molecules resulting in dissociation from the target [96,148]. It has thus been hypothesized that the combination of inefficient capture at presynaptic sites and the back-and-forth motion of motor-cargo complexes between proximal and distal ends of the axon facilitates a more uniform distribution of resources, that is, greater "synaptic democracy" [96].
The idea of synaptic democracy has previously arisen within the context of equalizing synaptic efficacies, that is, ensuring that synapses have the same potential for affecting the postsynaptic response regardless of their locations along the dendritic tree [71,126]. An analogous issue arises within the context of intracellular transport, since vesicles are injected from the soma (anterograde transport), so that one might expect synapses proximal to the soma to be preferentially supplied with resources. In principle, this could be resolved by routing cargo to specific synaptic targets, but there is no known form of molecular address system that could support such a mechanism, particularly in light of the dynamically changing distribution of synapses. From a mathematical perspective, the issue of synaptic democracy reflects a fundamental property shared by the one-dimensional advection-diffusion equation used to model active transport and the cable equation used to model ionic current flow: both generate an exponentially decaying steady-state solution in response to a localized source of active particles or current.
The hypothesized mechanism of synaptic democracy, which combines bidirectional transport with reversible delivery of cargo to synaptic targets, has recently been investigated in a series of modeling studies [13,16,20,78]. Consider a simple three-state transport model of a single motor-complex moving on a semiinfinite 1D track, as shown in Fig. 18. The motor complex is taken to be in one of three motile states labeled by n = 0, ±: stationary or slowly diffusing with diffusivity D_0 (n = 0), moving to the right (anterograde) with speed v_+ (n = +), or moving to the left (retrograde) with speed −v_− (n = −); transitions between the three states are governed by a discrete Markov process. In addition, the motor complex can carry a single vesicle, which is reversibly exchanged with membrane-bound synaptic targets when in the state n = 0. Let p_n(x, t) denote the probability density that at time t the complex is at position x, x ∈ (0, ∞), is in motile state n, and a vesicle is not bound to the complex. Similarly, let p̄_n(x, t) be the corresponding probability density when a vesicle is bound. We allow for the possibility that the velocities and diffusivity are different for the bound state by taking v_± → v̄_± and D_0 → D̄_0. The evolution of the probability densities is described by a corresponding system of partial differential equations in which α, β are the transition rates between the slowly diffusing and ballistic states. We also assume that there is a uniform distribution c of presynaptic targets along the axon, which can exchange vesicles with the motor-complex at the rates k_±.

Fig. 18 Three-state model of the bidirectional transport of a single motor-cargo complex. The particle switches between an anterograde state (n = +) of speed v_+, a stationary or slowly diffusing state (n = 0), and a retrograde state (n = −) of speed v_−. The motor-complex can only deliver a vesicle to a presynaptic target in the state n = 0
Now suppose that the transition rates α, β are fast compared to the exchange rates k_± and the effective displacement rates of the complex on a fundamental microscopic length-scale such as the size of a synaptic target (l ∼ 1 μm). Following Sect. 2.2, we can then use a QSS diffusion approximation to derive an advection-diffusion equation for the total probability densities

p(x, t) = Σ_{n=0,±} p_n(x, t),  p̄(x, t) = Σ_{n=0,±} p̄_n(x, t). (5.17)

That is, we obtain a pair of advection-diffusion equations (5.18a)-(5.18b) for p and p̄, coupled by the vesicular exchange terms, in which the effective drift velocities and diffusivities are determined by the stationary probabilities ρ_0, ρ_± of the three-state Markov process describing transitions between the motile states n = 0 and n = ±. We have also absorbed a factor ρ_0 into k_±. To investigate how the above form of intracellular transport can lead to synaptic democracy, we consider a population of identical, noninteracting motor complexes. Let u(x, t) and ū(x, t) denote the densities of motor-complexes without and with an attached vesicle, respectively. From the reduced equations (5.18a)-(5.18b) we obtain corresponding equations (5.20a)-(5.20b) for u and ū, supplemented by constant fluxes J_0 and J̄_0 at the somatic end x = 0. In the case of a finite axon, we could model recycling by imposing an absorbing boundary condition at the distal end and reinjecting the distal flux into the somatic end. Since most of these complexes would be without a vesicle, this would mainly contribute to J_0. Moreover, if the axon is much longer than the range of vesicular delivery necessary to supply en passant synapses, then the effects of the absorbing boundary can be ignored, and we can treat the axon as semiinfinite. Finally, at the population level, the concentration of vesicles within the presynaptic targets is no longer constant, that is, c = c(x, t) with

∂c/∂t = k_+ ū − k_− c u − γ_c c. (5.21)

We have also allowed for the possibility that synaptic vesicles degrade at a rate γ_c. Let us begin by considering the case k_− > 0 (reversible delivery) and γ_c = 0 (no vesicular degradation); the distribution c of presynaptic vesicles will remain bounded, provided that J_0 > 0.
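For concreteness, the following sketch computes the stationary probabilities ρ_0, ρ_± and the effective drift V = ρ_+ v_+ − ρ_− v_− under an assumed symmetric transition scheme (0 → ± at rate α each, ± → 0 at rate β); the scheme and all parameter values are illustrative, not taken from [13,16,20,78].

```python
import numpy as np

# QSS ingredients for the three-state motor model, assuming the transition
# scheme 0 -> +/- at rate alpha each and +/- -> 0 at rate beta.
alpha, beta = 2.0, 1.0
v_plus, v_minus = 1.0, 0.8

# generator Q[i, j] = rate j -> i, with states ordered (0, +, -)
Q = np.array([[-2 * alpha,  beta,  beta],
              [ alpha,     -beta,  0.0 ],
              [ alpha,      0.0,  -beta]])
_, _, vt = np.linalg.svd(Q)
rho = np.abs(vt[-1])
rho /= rho.sum()                      # stationary distribution, Q @ rho = 0

V = rho[1] * v_plus - rho[2] * v_minus   # effective drift of the QSS equation
print(rho, V)
```

For these rates, ρ = (0.2, 0.4, 0.4), so the motor spends most of its time in ballistic states, yet the net drift is small because the two directions nearly balance.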
Equation (5.21) implies that, at steady state,

c(x) = (k_+/k_−) ū(x)/u(x). (5.22)

Then substituting equation (5.22) into the steady-state versions of equations (5.20a)-(5.20b) gives a pair of closed equations for u and ū, whose solutions are characterized by length constants ξ and ξ̄, respectively. Combining with equation (5.22) then yields the steady-state density of synaptic vesicles in terms of ξ and ξ̄. In particular, if the transport properties of the motor-complex are independent of whether or not a vesicle is bound (v̄_± = v_±, D̄_0 = D_0), then ξ̄ = ξ, and we have a uniform vesicle distribution. To further explore the ability of this model to produce a democratic cargo distribution, equations (5.20a)-(5.20b) can be solved numerically for a range of parameter values. Following [20], suppose that γ_c is small (relative to k_±) but nonzero, and consider how the normalized distribution c(x)/c(0) varies with φ ≡ k_−/γ_c, which determines the proportion of vesicles that are recycled into the system after leaving the targets. Figure 19 displays the normalized concentration profiles for a variety of k_−/γ_c values with either J_0 = J̄_0 or J_0 = 0. (The domain size is taken to be sufficiently large to avoid boundary effects.) It can be seen that when J_0 > 0, the length scale over which nonexponential decay occurs is an increasing function of k_−/γ_c, whereas when J_0 = 0, the model fails to distribute cargo across a substantial region of the axon. Hence an additional component of a delivery mechanism that includes recapture is a source of motors that are able to receive vesicles. It should be emphasized that this does not require additional motors to be synthesized in the soma; instead, motors may return to the beginning of the axon after delivering their cargo.

Fig. 19 Normalized concentration profiles c(x)/c(0) for a range of values of φ = k_−/γ_c [20]. For comparison, the corresponding concentration profile when J_0 = 0 (which is insensitive to φ) is shown by the thick line (red line in color online). We have also set γ = 10^{−2} s^{−1}, J = 1.5
From the perspective of synaptic democracy, it seems desirable to maximize k_−; however, increasing the recapture rate decreases the efficiency of the delivery mechanism and can result in an overall loss of vesicles due to motor degradation.
This mechanism for synaptic democracy appears to be quite robust. For example, it can be extended to the case where each motor carries a vesicular aggregate rather than a single vesicle, assuming that only one vesicle can be exchanged with a target at any one time [13]. The effects of reversible vesicular delivery also persist when exclusion effects between motor-cargo complexes are taken into account [16] and when higher-dimensional cell geometries are considered [78].

Phase Reduction of Stochastic Hybrid Oscillators
In Sects. 2.3 and 3 we assumed that, in the adiabatic limit ε → 0, the resulting deterministic dynamical system exhibited bistability, and we explored how random switching of the associated PDMP for small ε can lead to noise-induced transitions between metastable states. In this section, we assume that the deterministic system supports a stable limit cycle, so that the corresponding PDMP acts as a stochastic limit cycle oscillator, at least in the weak noise regime. There is an enormous literature on the analysis of stochastic limit cycle oscillators for SDEs (for recent surveys, see the reviews [3,47,105]). On the other hand, as far as we are aware, there has been very little numerical or analytical work on limit cycle oscillations in PDMPs. A few notable exceptions are [21,27,52,89,137]. One possible approach would be to carry out a QSS diffusion approximation of the PDMP along the lines of Sect. 2.2 and then use stochastic phase reduction methods developed for SDEs. In this section, we review an alternative, variational method that deals directly with the PDMP [21], thus avoiding additional errors arising from the diffusion approximation. Another major advantage of the variational method is that it allows us to obtain rigorous exponential bounds on the expected time to escape from a neighborhood of the limit cycle [21,22].

Fig. 20 Two possibilities are orthogonal projection with phase θ̂(t) and isochronal projection with phase θ(t). In the latter case, the response to perturbations depends on the phase response curve R(θ), which is normal to the isochron at the point of intersection with the limit cycle
Let us first briefly consider SDEs. Suppose that a deterministic smooth dynamical system ẋ = F(x), x ∈ R^d, supports a limit cycle x(t) = Φ(θ(t)) of period Δ_0, where θ(t) is a uniformly rotating phase, θ̇ = ω_0, with ω_0 = 2π/Δ_0. The phase is neutrally stable with respect to perturbations along the limit cycle; this reflects the invariance of an autonomous dynamical system with respect to time shifts. Now suppose that the dynamical system is perturbed by weak Gaussian noise such that dX = F(X) dt + √(2ε) G(X) dW(t), where W(t) is a d-dimensional vector of independent Wiener processes. If the noise amplitude ε is sufficiently small relative to the rate of attraction to the limit cycle, then deviations transverse to the limit cycle are also small (up to some exponentially large stopping time). This suggests that the definition of a phase variable persists in the stochastic setting, and we can derive a stochastic phase equation by decomposing the solution of the SDE according to

X(t) = Φ(β(t)) + v(t), (6.1)

with β(t) and v(t) corresponding to the phase and amplitude components, respectively. However, there is not a unique way to define the phase β, which reflects the fact that there are different ways of projecting the exact solution onto the limit cycle [7,21,65,87,147]; see Fig. 20. One well-known approach is to use the method of isochrons [47,62,106,135,136,149]. Recently, a variational method for carrying out the amplitude-phase decomposition for SDEs has been developed, which yields exact SDEs for the amplitude and phase [22]. Within the variational framework, different choices of phase correspond to different choices of the inner product on R^d.
By taking an appropriately weighted Euclidean norm, the minimization scheme determined the phase by projecting the full solution onto the limit cycle using Floquet vectors. Hence, in a neighborhood of the limit cycle, the phase variable coincided with the isochronal phase [7]. This had the advantage that the amplitude and phase decoupled to leading order. In addition, the exact amplitude and phase equations could be used to derive strong exponential bounds on the growth of transverse fluctuations. It turns out that an analogous variational method can be applied to PDMPs [21], which will be outlined in the remainder of this section. Suppose that the deterministic dynamical system (2.8), obtained in the adiabatic limit ε → 0, supports a stable periodic solution

x(t) = Φ(ω_0 t), (6.2)

where ω_0 = 2π/Δ_0 is the natural frequency of the oscillator. In the state space of the continuous variable, the solution is an isolated attractive trajectory called a limit cycle. The dynamics on the limit cycle can be described by a uniformly rotating phase θ(t) such that

dθ/dt = ω_0 (6.3)

and x = Φ(θ(t)) with a 2π-periodic function Φ. Note that the phase is neutrally stable with respect to perturbations along the limit cycle; this reflects the invariance of an autonomous dynamical system with respect to time shifts. By definition, Φ must satisfy the equation

ω_0 dΦ(θ)/dθ = F(Φ(θ)). (6.4)

Differentiating both sides with respect to θ gives

ω_0 (d/dθ) Φ'(θ) = J(θ) Φ'(θ),

where J is the 2π-periodic Jacobian matrix

J_{jk}(θ) ≡ ∂F_j/∂x_k |_{x=Φ(θ)}. (6.5)

One concrete example of a PDMP that supports a limit cycle oscillation in the fast switching limit is a version of the stochastic Morris-Lecar model that has been applied to sodium-based subthreshold oscillations [27,145]; the corresponding deterministic model is given by equations (3.5). Numerical solutions of the latter are shown in Fig. 21.
The isochronal phase map has been the most popular means of defining the phase of stochastic oscillators evolving according to an SDE (and also of studying their synchronization) [3,47,105]. Let U be the neighborhood of the limit cycle consisting of all points that eventually converge to the limit cycle under the deterministic dynamics (2.8). The isochronal phase map Θ : U → S^1 is defined to be the phase to which a point converges. That is, Θ(y) is the unique α such that, if x(0) = y, then

lim_{t→∞} |x(t) − Φ(ω_0 t + α)| = 0. (6.6)

Along trajectories of the PDMP, the isochronal phase evolves according to

dΘ(x(t))/dt = ∇Θ(x(t)) · F_{n(t)}(x(t)), (6.7)

with switching events occurring at the same times {t_k} as x(t). The gradient of the isochronal phase evaluated on the limit cycle,

R(θ) = ∇_x Θ(x)|_{x=Φ(θ)}, (6.8)

is known as the infinitesimal phase resetting curve (PRC). It can be shown that R(θ) satisfies the adjoint equation [48]

ω_0 dR(θ)/dθ = −J(θ)^⊤ R(θ) (6.9)

under the normalization condition R(θ) · Φ'(θ) = 1. As it stands, equation (6.7) is not a closed equation for the isochronal phase, since the right-hand side depends on the full set of variables x(t).
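The adjoint characterization of the PRC can be checked numerically on a model where everything is known in closed form. The sketch below (our own illustrative example, not taken from [21,48]) integrates the adjoint equation for the radial isochron clock, ẋ = x(1 − r²) − ωy, ẏ = y(1 − r²) + ωx with r² = x² + y², whose limit cycle is Φ(θ) = (cos θ, sin θ) and whose exact PRC is R(θ) = (−sin θ, cos θ), normalized so that R · Φ' = 1.

```python
import numpy as np

# Integrate the adjoint equation  w * dR/dtheta = -J(theta)^T R  for the
# radial isochron clock and compare with the exact PRC R = (-sin, cos).
w = 1.0

def jacobian(theta):
    c, s = np.cos(theta), np.sin(theta)
    # Jacobian of F evaluated on the limit cycle r = 1
    return np.array([[-2 * c * c,     -2 * c * s - w],
                     [-2 * s * c + w, -2 * s * s    ]])

def rhs(theta, R):
    return -jacobian(theta).T @ R / w

# RK4 over one full period, starting from the exact value R(0) = (0, 1)
R, theta, h = np.array([0.0, 1.0]), 0.0, 2 * np.pi / 20000
for _ in range(20000):
    k1 = rhs(theta, R)
    k2 = rhs(theta + h / 2, R + h / 2 * k1)
    k3 = rhs(theta + h / 2, R + h / 2 * k2)
    k4 = rhs(theta + h, R + h * k3)
    R += (h / 6) * (k1 + 2 * k2 + 2 * k3 + k4)
    theta += h
print(R)   # should return to (0, 1) after one period
```

Returning to the initial value after one period confirms that the normalized periodic solution of the adjoint equation has been found.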

Floquet Decomposition
Suppose that we fix a particular realization σ_T of the Markov chain up to some time T, σ_T = {N(t), 0 ≤ t ≤ T}. Suppose that there is a finite sequence of jump times {t_1, . . . , t_r} within the time interval (0, T), and let n_j be the corresponding discrete state in the interval (t_j, t_{j+1}) with t_0 = 0. Introduce the set T = (0, T) \ {t_1, . . . , t_r}. We wish to decompose the piecewise deterministic solution x_t of the PDMP (2.1) for t ∈ T into two components as in equation (6.1):

x_t = Φ(β_t) + v_t, (6.11)

with β_t and v_t corresponding to the phase and amplitude components, respectively. The phase β_t and amplitude v_t evolve according to a PDMP involving the vector field F_{n_j} in the time intervals (t_j, t_{j+1}), analogous to x_t; see Fig. 1. (It is notationally convenient to switch from x(t) to x_t etc.) However, such a decomposition is not unique unless we impose an additional mathematical constraint. We will adapt a variational principle recently introduced to analyze the dynamics of limit cycles with Gaussian noise [21]. To construct the variational principle, we first introduce an appropriate weighted norm on R^d, based on a Floquet decomposition.
For any t ≥ 0, define Π(t) ∈ R^{d×d} to be the fundamental matrix for the following ODE:

dz/dt = A(t)z, (6.12)

where A(t) = J(ω_0 t). That is, Π(t) := (z_1(t)|z_2(t)| · · · |z_d(t)), where each z_i(t) satisfies (6.12), and {z_i(0)}_{i=1}^d is an orthogonal basis for R^d. Floquet theory states that there exists a diagonal matrix S = diag(ν_1, . . . , ν_d), whose diagonal entries are the Floquet characteristic exponents, such that

Π(t) = P(ω_0 t) e^{St}, (6.13)

with P(θ) a 2π-periodic matrix whose first column is proportional to Φ'(ω_0 t), and ν_1 = 0. That is, P(θ)^{−1} Φ'(θ) = c_0 e with e_j = δ_{1,j} and c_0 an arbitrary constant; we set c_0 = 1 for convenience. To simplify the following notation, we will assume throughout this paper that the Floquet multipliers are real, and hence P(θ) is a real matrix. We can readily generalize these results to the case that S is complex. The limit cycle is taken to be stable, meaning that, for a constant b > 0 and all 2 ≤ i ≤ d, we have ν_i ≤ −b. Furthermore, P^{−1}(θ) exists for all θ, since Π^{−1}(t) exists for all t.
The above Floquet decomposition motivates the following weighted inner product: denoting the standard Euclidean dot product on R^d by ⟨·, ·⟩, for any θ ∈ R we set

⟨x, y⟩_θ = ⟨P^{−1}(θ)x, P^{−1}(θ)y⟩,  ‖x‖_θ = √⟨x, x⟩_θ.

In the case of SDEs, it has been shown that this choice of weighting yields a leading-order separation of the phase from the amplitude and facilitates strong bounds on the growth of v_t [21]. The former is a consequence of the fact that the matrix P^{−1}(θ) generates a coordinate transformation in which the phase in a neighborhood of the limit cycle coincides with the asymptotic phase defined using isochrons (see also [7]). In particular, we can show that the PRC R(θ) is related to the tangent vector Φ'(θ) according to [21]

R(θ) = Q(θ) Φ'(θ), (6.14)

where Q(θ) = (P(θ)^⊤)^{−1} P(θ)^{−1}, so that R(θ) · x = ⟨Φ'(θ), x⟩_θ for all x ∈ R^d.

Defining the Piecewise Deterministic Phase Using a Variational Principle
We can now state the variational principle for the stochastic phase β_t [21]. First, we consider a variational problem for an arbitrary prescribed function θ_t (not to be confused with the phase on the limit cycle), which specifies the weighted Euclidean norm. Given θ_t, we determine β_t for t ∈ T by requiring β_t = ϕ_t(θ_t), where ϕ_t(θ_t) is a local minimum of the following variational problem:

ϕ_t(θ_t) = argmin_{ϕ ∈ N(ϕ_t(θ_t))} ‖x_t − Φ(ϕ)‖²_{θ_t}, (6.16)

with N(ϕ_t(θ_t)) denoting a sufficiently small neighborhood of ϕ_t(θ_t). The minimization scheme is based on the orthogonal projection of the solution onto the limit cycle with respect to the weighted Euclidean norm. We will derive an exact evolution equation for β_t (up to some stopping time) by considering the first-derivative condition at the minimum,

⟨Φ'(ϕ_t(θ_t)), x_t − Φ(ϕ_t(θ_t))⟩_{θ_t} = 0.

We then determine θ_t (and hence β_t) self-consistently by imposing the condition θ_t = ϕ_t(θ_t) = β_t. It follows that the stochastic phase β_t satisfies the implicit equation

⟨Φ'(β_t), x_t − Φ(β_t)⟩_{β_t} = 0.

It will be seen that, up to a stopping time τ, there exists a unique continuous solution of this equation. Define M(z, ϕ) ∈ R as

M(z, ϕ) := −∂/∂ϕ ⟨Φ'(ϕ), z − Φ(ϕ)⟩_ϕ,

which reduces to ‖Φ'(ϕ)‖²_ϕ when z = Φ(ϕ). Assume that initially M(x_0, β_0) > 0. We then seek a PDMP for β_t that holds for all times less than the stopping time

τ = inf{t > 0 : M(x_t, β_t) = 0}.

The implicit function theorem guarantees that a unique continuous β_t exists up until this time.
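The self-consistency condition θ_t = ϕ_t(θ_t) can be implemented numerically as a fixed-point iteration over the minimization. Below is a minimal Python sketch, again using the planar normal-form oscillator as a hypothetical stand-in; for this isochronous example the weight matrix P(θ) is orthogonal, so the weighted projection coincides with the Euclidean one, but the code structure carries over to anisotropic P.

```python
import numpy as np

# Sketch of the self-consistent minimization beta = phi(beta) for a planar
# normal-form oscillator (hypothetical example): Phi(phi) = (cos phi, sin phi),
# with weight matrix P(theta) from the Floquet decomposition.
def P(theta):
    return np.array([[-np.sin(theta), np.cos(theta)],
                     [ np.cos(theta), np.sin(theta)]])

def variational_phase(x, n_iter=5, n_grid=4000):
    # Fixed-point iteration: given the current weight theta, minimize
    # ||x - Phi(phi)||_theta^2 over phi on a grid, then update theta = argmin.
    theta = np.arctan2(x[1], x[0])  # initial guess from the polar angle
    grid = np.linspace(0.0, 2*np.pi, n_grid, endpoint=False)
    for _ in range(n_iter):
        Pinv = np.linalg.inv(P(theta))
        costs = [np.sum((Pinv @ (x - np.array([np.cos(p), np.sin(p)])))**2)
                 for p in grid]
        theta = grid[int(np.argmin(costs))]
    return theta

# A point off the limit cycle at polar angle 1.0; for this model the asymptotic
# (isochron) phase is the polar angle, which the variational phase recovers.
x = 1.3 * np.array([np.cos(1.0), np.sin(1.0)])
beta = variational_phase(x)
print(beta)  # close to 1.0
```

In practice the grid search would be replaced by a local root-finder for the first-derivative condition, but the sketch makes the projection-plus-self-consistency structure explicit.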
To derive the PDMP for β_t, we differentiate the implicit equation ⟨Φ'(β_t), x_t − Φ(β_t)⟩_{β_t} = 0 with respect to t. Rearranging, we find that the phase β_t evolves according to the exact, but implicit, equation

dβ_t/dt = M(x_t, β_t)^{-1} ⟨F_n(x_t), Φ'(β_t)⟩_{β_t}, (6.24)

with n = n_j for t ∈ (t_j, t_{j+1}). Finally, recalling that x_t = Φ(β_t) + √ε v_t, the amplitude term v_t satisfies

√ε dv_t/dt = F_n(x_t) − Φ'(β_t) dβ_t/dt. (6.25)

Weak Noise Limit
Equation (6.24) is a rigorous, exact implicit equation for the phase β_t. We can derive an explicit equation for β_t by carrying out a perturbation analysis in the weak noise limit. Let 0 < ε ≪ 1 and set x_t = Φ(β_t) on the right-hand side of (6.24), that is, v_t = 0. Writing β_t ≈ θ_t, we obtain the piecewise deterministic phase equation

dθ_t/dt = M_0^{-1} ⟨F_n(Φ(θ_t)), Φ'(θ_t)⟩_{θ_t}, (6.26)

after using M(Φ(θ), θ) = M_0 and equation (6.14). Hence, a phase reduction of the PDMP (2.1) yields a PDMP for the phase θ_t. Of course, analogously to the phase reduction of SDEs, there are errors due to the fact that we have ignored O(ε) terms arising from amplitude-phase coupling; see below. This leads to deviations of the phase θ_t from the exact variational phase β_t over O(1/ε) time-scales.
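The structure of the reduced phase PDMP can be illustrated with a minimal event-driven simulation. In the sketch below (all parameter values are illustrative assumptions, not taken from the text), the phase velocity simply switches between the two values 0.9 and 1.1 at rate 1/ε, which is the form the reduced equation takes for an isochronous example; in the fast switching limit the time-averaged frequency approaches the mean of the two.

```python
import numpy as np

rng = np.random.default_rng(1)

# Minimal caricature of the reduced phase PDMP: two discrete states n = 0, 1
# with phase velocities omega[n]; transition rates of order 1/eps.
# (Illustrative parameters, not from the text.)
omega = [0.9, 1.1]
eps = 0.01          # fast switching limit
T = 500.0

t, theta, n = 0.0, 0.0, 0
while t < T:
    dt = rng.exponential(eps)   # exponential waiting time, rate 1/eps
    dt = min(dt, T - t)         # do not overshoot the final time
    theta += omega[n] * dt      # deterministic phase drift in state n
    t += dt
    n = 1 - n                   # jump to the other discrete state

# With symmetric rates, the mean frequency approaches (0.9 + 1.1)/2 = 1,
# with small fluctuations that vanish as eps -> 0.
print(theta / T)  # close to 1.0
```

The same event-driven loop generalizes immediately to state-dependent phase velocities obtained by evaluating the right-hand side of the reduced phase equation in each discrete state.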
In Fig. 22, we show results of numerical simulations of the stochastic ML model for N = 10 and ε = 0.01 with other parameters as in Fig. 21. We compare solutions of the explicit phase equation (6.26) with the exact phase defined using the variational principle; see Eq. (6.24). We also show sample trajectories for (v, w). It can be seen that initially the phases are very close and then very slowly drift apart as noise accumulates. The diffusive nature of the drift in both phases can be clearly seen, with the typical deviation of the phase from ω 0 t increasing in time.

Decay of Amplitude Vector
If we are interested in higher-order contributions to the phase equation, then it is necessary to consider the coupling between the phase and the amplitude in both the continuous dynamics and the discrete switching process. Hence, the phase equation (6.26) will only be a reasonable approximation for small ε if the dynamics remains within some attracting neighborhood of the limit cycle, that is, if the amplitude remains small.

Fig. 22 Simulation of the stochastic Morris-Lecar model for subthreshold Na+ oscillations with N = 10 and ε = 0.01; other parameter values as in Fig. 21. (Adapted from Ref. [21].) (a) Plot of the approximate phase θ_t − ω_0 t in green (with θ_t satisfying equation (6.26)) and the exact variational phase β_t − ω_0 t (satisfying (6.19)) in black. On the scale [−π, π] the two phases are in strong agreement. However, zooming in, we can see that the phases slowly drift apart as noise accumulates. The diffusive nature of the drift in both phases can be clearly seen, with the typical deviation of the phase from ω_0 t increasing in time. (b) Stochastic trajectory around the limit cycle (dashed curve) in the (v, w)-plane. The basin of attraction of the deterministic limit cycle is quite large, which is why the system can tolerate quite substantial stochastic perturbations.

Since the amplitude v_t satisfies √ε v_t = x_t − Φ(β_t), we have

√ε dv_t/dt = dx_t/dt − Φ'(β_t) dβ_t/dt = F_n(x_t) − M(x_t, β_t)^{-1} Φ'(β_t) ⟨F_n(x_t), Φ'(β_t)⟩_{β_t}. (6.27)

Now define w_t = √ε P(β_t)^{-1} v_t. Using the fact that ω_0 Ṗ(θ) = J(θ)P(θ) − P(θ)S, we find that

(1/2) d‖w_t‖²/dt = ⟨w_t, S w_t⟩ dβ_t/dt + ⟨w_t, P(β_t)^{-1} F_n(x_t) − J_n P(β_t) w_t (dβ_t/dt)⟩. (6.28)

In the fast switching limit (ε → 0), it can be shown that ‖w_t‖² decays to leading order [21]. That is, there exists a constant C such that the probability that the system leaves an O(a) neighborhood of the limit cycle by time T scales as T exp(−Ca/ε). An interesting difference between this bound and the corresponding one obtained for SDEs [22] is that in the latter the bound is of the form T exp(−Cba/ε), where b is the rate of decay toward the limit cycle. In other words, in the SDE case, the bound remains useful even for large ε, as long as bε^{-1} ≫ 1, that is, as long as the decay toward the limit cycle dominates the noise. However, this no longer holds in the PDMP case. If ε is large, then the most likely way for the system to escape the limit cycle is that it stays in a particular discrete state for too long without jumping, and the time that it stays in one state is not particularly affected by b (in most cases).

Synchronization of Hybrid Oscillators
As we have outlined previously, it is possible to apply phase reduction techniques to PDMPs that support a limit cycle in the fast switching limit [21]. One important consequence of this reduction is that it provides a framework for studying the synchronization of a population of PDMP oscillators, either through direct coupling or via a common noise source. In the case of SDEs, there has been considerable recent interest in noise-induced phase synchronization [47,62,106,135,136,149]. This concerns the observation that a population of oscillators can be synchronized by a randomly fluctuating external input applied globally to all of the oscillators, even if there are no interactions between the oscillators. Evidence for such an effect has been found in experimental studies of neural oscillations in the olfactory bulb [59] and the synchronization of synthetic genetic oscillators [151]. A related phenomenon is the reproducibility of a dynamical system's response when repetitively driven by the same fluctuating input, even though initial conditions vary across trials. One example is the spike-time reliability of single neurons [60,98].
Most studies of noise-induced synchronization take the oscillators to be driven by common Gaussian noise. Typically, phase synchronization is established by constructing the Lyapunov exponent for the dynamics of the phase difference between a pair of oscillators and averaging with respect to the noise. If the averaged Lyapunov exponent is negative, then the phase difference is expected to decay to zero in the large-time limit, establishing phase synchronization. However, it has also been shown that common Poisson-distributed random impulses, dichotomous or telegrapher noise, and other types of noise generally induce synchronization of limit-cycle oscillators [63,104,107]. Consider, in particular, the case of an additive dichotomous noise signal I(t) driving a population of M identical noninteracting oscillators according to the system of equations ẋ_j = F(x_j) + I(t), where x_j ∈ R^d is the state of the jth oscillator, j = 1, ..., M [104]; see Fig. 23. Here I(t) switches between two values I_0 and I_1 at random times generated by a two-state Markov chain [5]. (In the case of the classical ML model, I(t) could represent a randomly switching external current.) That is, I(t) = I_0(1 − N(t)) + I_1 N(t) for N(t) ∈ {0, 1}, with the times between switching events taken to be exponentially distributed with mean switching time τ. Suppose that each oscillator supports a stable limit cycle for each of the two input values I_0 and I_1. It follows that the internal state of each oscillator randomly jumps between the two limit cycles. Nagai et al. [104] show that in the slow switching limit (large τ), the dynamics can be described by random phase maps. Moreover, if the phase maps are monotonic, then the associated Lyapunov exponent is generally negative, and phase synchronization is stable.
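This common-drive setup is straightforward to explore numerically. The following sketch simulates two identical, noninteracting planar oscillators driven by a common telegraph input along one coordinate; the vector field (a normal-form oscillator with attraction rate κ), the input amplitude, and all parameter values are illustrative assumptions, not taken from [104].

```python
import numpy as np

rng = np.random.default_rng(0)

# Two identical, noninteracting planar oscillators driven by a common
# dichotomous (telegraph) input along the x-direction. The vector field is a
# hypothetical normal-form oscillator; kappa sets the rate of attraction to
# the unit-circle limit cycle. (Illustrative parameters only.)
kappa, w0 = 5.0, 1.0
sigma, tau = 0.3, 0.5    # input takes values +/- sigma; mean switching time tau
dt, T = 0.01, 3000.0

def f(x, y, I):
    r2 = x*x + y*y
    return kappa*x*(1 - r2) - w0*y + I, kappa*y*(1 - r2) + w0*x

# Start the two oscillators on the limit cycle at different phases
state = np.array([[1.0, 0.0], [np.cos(2.0), np.sin(2.0)]])
I = sigma
for _ in range(int(T / dt)):
    if rng.random() < dt / tau:   # telegraph switch with rate 1/tau
        I = -I
    for j in range(2):            # Euler step for each oscillator, common input
        dx, dy = f(state[j, 0], state[j, 1], I)
        state[j, 0] += dt * dx
        state[j, 1] += dt * dy

# With common switching input the two states converge toward each other,
# a signature of noise-induced synchronization.
dist = np.linalg.norm(state[0] - state[1])
print(dist)
```

Because the oscillators share the same input realization, the synchronized state is invariant; the simulation probes whether it is transversely stable for these parameters.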
More generally, let N(t) ∈ Γ ≡ {0, ..., N_0 − 1} denote the state of a randomly switching environment. When the environmental state is N(t) = n, each oscillator x_i(t) evolves according to the piecewise deterministic differential equation ẋ_i = F_n(x_i) for i = 1, ..., M. The additive dichotomous noise case is recovered by taking N_0 = 2 and F_n(x) = F(x) + I_n. In the slow switching limit, we can generalize the approach of Nagai et al. [104] by assuming that each of the vector fields F_n(x_i), n ∈ Γ, supports a stable limit cycle and constructing the associated random phase maps. Here we briefly discuss the fast switching regime, assuming that in the adiabatic limit ε → 0, the resulting averaged deterministic system ẋ_i = F̄(x_i), with F̄ = Σ_{n∈Γ} ρ_n F_n for ρ the stationary distribution of the Markov chain, supports a stable limit cycle. Since there is no coupling or remaining external drive to the oscillators in this limit, their phases are uncorrelated. This then raises the issue as to whether or not phase synchronization occurs when ε > 0.

Fig. 23 Pair of noninteracting limit cycle oscillators with phases θ_j(t), j = 1, 2, driven by a common switching external input I(t)
Again, one approach would be to carry out a QSS analysis along the lines of Sect. 2.2, in which each oscillator is approximated by an SDE with a common Gaussian input. We could then adapt previous work on the phase reduction of stochastic limit cycle oscillators [62,106,135,136] and thus establish that phase synchronization occurs under the diffusion approximation. However, the QSS approximation is only intended to be accurate over time-scales that are longer than O(ε). Hence, it is unclear whether or not the associated Lyapunov exponent is accurate, since it is obtained from averaging the fluctuations in the noise over infinitesimally small timescales. Therefore, it would be interesting to derive a more accurate expression for the Lyapunov exponent by working directly with an exact implicit equation for the phase dynamics such as the population analog of equation (6.24).
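A more direct numerical route to the Lyapunov exponent is to integrate the tangent (linearized) dynamics along a single stochastic trajectory. The sketch below does this for a phase-reduced caricature, dθ/dt = ω − I(t) sin θ with telegraph input I(t) = ±σ; this model and its parameters are assumptions for illustration, not the population analog of (6.24) discussed above. A negative estimate indicates that the phase difference between commonly driven oscillators contracts.

```python
import math
import numpy as np

rng = np.random.default_rng(3)

# Phase-reduced caricature driven by common telegraph noise (hypothetical
# model): d(theta)/dt = w0 - I(t) sin(theta), with I(t) = +/- sigma.
# Tangent dynamics: d(ln|xi|)/dt = -I(t) cos(theta), whose time average
# estimates the Lyapunov exponent of the phase difference.
w0, sigma, tau = 1.0, 0.5, 0.5
dt, T = 0.005, 5000.0

theta, log_xi, I = 0.0, 0.0, sigma
for _ in range(int(T / dt)):
    if rng.random() < dt / tau:          # telegraph switch with rate 1/tau
        I = -I
    log_xi += -I * math.cos(theta) * dt  # accumulate tangent-space growth
    theta += (w0 - I * math.sin(theta)) * dt

lam = log_xi / T
print(lam)  # negative here: common noise stabilizes the synchronized state
```

The same tangent-space bookkeeping, applied to an exact implicit phase equation rather than this caricature, would give the more accurate Lyapunov exponent proposed in the text.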

Conclusion
In recent years, it has become clear that stochastic switching processes are prevalent in a wide range of biological systems. Such processes are typically modeled in terms of stochastic hybrid systems, also known as PDMPs. In this review, we provided a basic introduction to stochastic hybrid systems and illustrated the theory by considering applications to cellular neuroscience. (In a companion review paper, we focus on applications to switching gene regulatory networks [14].) We showed that although the theory of stochastic hybrid systems is underdeveloped compared to SDEs and discrete Markov processes, analogous techniques can be applied, including large deviations and WKB methods, diffusion approximations, and phase reduction methods. We end by listing several outstanding issues that are worthy of further exploration. [...] motor-complexes and synaptic targets; identifying local signaling mechanisms for synaptic targeting; incorporating the contribution of intracellular stores; coupling mRNA transport to long-term synaptic plasticity.

4. Solving the diffusion equation with randomly switching boundary conditions when the switching of a gate depends, for example, on the local particle concentration; solving higher-dimensional boundary value problems; analyzing higher-order moments of the stochastic concentration.

5. Analyzing the synchronization of stochastic hybrid oscillators driven by a common environmental switching process; extending the theory to take into account a partial dependence of the switching process on the continuous dynamics of each oscillator.

6. Modeling synaptically coupled neural networks as a stochastic hybrid system, where the individual spikes of a neural population are treated as the discrete process, and the synaptic currents driving the neurons to fire correspond to the continuous process. So far, stochastic hybrid neural networks are phenomenologically based [11,24].
Can such networks be derived from a more fundamental microscopic theory, and is there a way of distinguishing the output activity of hybrid networks from those driven, for example, by Gaussian noise?

Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.