Large Deviations for Nonlocal Stochastic Neural Fields
 Christian Kuehn^{1} and
 Martin G Riedler^{2}
DOI: 10.1186/2190-8567-4-1
© C. Kuehn, M.G. Riedler; licensee Springer. 2014
Received: 22 February 2013
Accepted: 10 June 2013
Published: 17 April 2014
Abstract
We study the effect of additive noise on integro-differential neural field equations. In particular, we analyze an Amari-type model driven by a Q-Wiener process, and focus on noise-induced transitions and escape. We argue that proving a sharp Kramers’ law for neural fields poses substantial difficulties, but that one may transfer techniques from stochastic partial differential equations to establish a large deviation principle (LDP). Then we demonstrate that an efficient finite-dimensional approximation of the stochastic neural field equation can be achieved using a Galerkin method and that the resulting finite-dimensional rate function for the LDP can have a multiscale structure in certain cases. These results form the starting point for an efficient practical computation of the LDP. Our approach also provides the technical basis for further rigorous study of noise-induced transitions in neural fields based on Galerkin approximations.
Mathematics Subject Classification (2000): 60F10, 60H15, 65M60, 92C20.
Keywords
Stochastic neural field equations; Nonlocal equations; Large deviation principle; Galerkin approximation
1 Introduction
Starting from the classical works of Wilson/Cowan [64] and Amari [1], there has been considerable interest in the analysis of spatiotemporal dynamics of mesoscale models of neural activity. Continuum models for neural fields often take the form of nonlinear integro-differential equations where the integral term can be viewed as a nonlocal interaction term; see [37] for a derivation of neural field models. Stationary states, traveling waves, and pattern formation for neural fields have been studied extensively; see, e.g., [20, 29] or the recent review by Bressloff [14] and references therein.
In this paper, we study a stochastic neural field model. There are several motivations for our approach. In general, it is well known that intra- and interneuron [27] dynamics are subject to fluctuations. Many meso- or macroscale continuum models have stochastic perturbations due to finite-size effects [38, 61]. Therefore, there is certainly a genuine need to develop new techniques to analyze random neural systems [50]. For stochastic neural fields, there is also the direct motivation to understand the relation between noise and short-term working memory [52] as well as noise-induced phenomena [54] in perceptual bistability [62]. Although an eventual goal is to match results from stochastic neural fields to actual cortex data [35], we shall not attempt such a comparison here. However, the techniques we develop could have the potential to make it easier to understand the relation between models and experiments; see Sect. 10 for a more detailed discussion.
There is a relatively small amount of fairly recent work on stochastic neural fields, which we briefly review here. Brackley and Turner [11] study a neural field with a gain function that has a random firing threshold. Fluctuating gain functions are also considered by Coombes et al. [22]. Bressloff and Webber [15] analyze a stochastic neural field equation with multiplicative noise while Bressloff and Wilkinson [16] study the influence of extrinsic noise on neural fields. In all these works, the focus is on the statistics of traveling waves such as front diffusion and the effects of noise on the wave speed. Hutt et al. [41] study the influence of external fluctuations on Turing bifurcation in neural fields. Kilpatrick and Ermentrout [43] are interested in stationary bump solutions. They observe numerically a noise-induced passage to extinction as well as noise-induced switching of bump solutions and conjecture that “a Kramers’ escape rate calculation” [43, p. 16] could be applied to stochastic neural fields, but they do not carry out this calculation. In particular, the question is whether one can give a precise estimate of the mean transition time between metastable states for stochastic neural field equations; for a precise statement of the classical Kramers’ law, see Sect. 5, Eq. (32). However, to the best of our knowledge, there seems to be no general Kramers’ law or large deviation principle (LDP) calculation available for continuum neural field models although large deviations have been of recent interest in neuroscience applications [13, 33]. It is one of the main goals of this paper to provide the basic steps toward a general theory.
Although Kramers’ law [5] and LDPs [26, 34] are well understood for finite-dimensional stochastic differential equations (SDEs), the work for infinite-dimensional evolution equations is much more recent. In particular, it has been shown very recently that one may extend Kramers’ law to certain stochastic partial differential equations (SPDEs) [4, 6, 7] driven by space-time white noise. The work of Berglund and Gentz [7] provides a quite general strategy for “lifting” a finite-dimensional Kramers’ law to the SPDE setting using a Galerkin approximation due to Blömker and Jentzen [8]. Since the transfer of PDE techniques to neural fields has been very successful, either directly [51] or indirectly [14, 21], one may conjecture that the same strategy also works for SPDEs and stochastic neural fields.
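In this paper, we consider the stochastic Amari-type neural field model
$d{U}_{t}(x)=\left[-\alpha {U}_{t}(x)+{\int}_{\mathcal{B}}w(x,y)f({U}_{t}(y))\,dy\right]dt+\epsilon \,d{W}_{t}(x)$ (1)
for a trace-class operator Q, nonlinear gain function f, and an interaction kernel w; the technical details and definitions are provided in Sect. 2. Observe that (1) is a relatively general formulation of a nonlocal neural field. Hence, we expect that the techniques developed in this paper carry over to much wider classes of neural fields beyond (1) such as activity-based models.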
Remark 1.1 To avoid confusion, we alert readers familiar with neural fields that the nonlinear gain function f in (1) is sometimes also called a “rate function.” However, we reserve “rate function” for a functional, to be denoted later by I, arising in the context of an LDP as this convention is standard in the context of LDPs.
Our main goal in the study of (1) is to provide estimates on the mean first passage times between metastable states. In particular, we develop the basic analytical tools to approximate equation (1) as well as its rate function using a finite-dimensional Galerkin approximation. By making the rate function as explicit as possible, we provide not only a starting point for further analytical work, but also a framework for efficient numerical methods to analyze metastable states.
The paper is structured as follows: The motivation for (1) is given in Sect. 3 where a formal calculation shows that a space-time white noise perturbation of the gain function in a deterministic neural field leads to (1). In Sect. 4, we briefly describe important features of the deterministic dynamics for (1) where $\epsilon =0$. In particular, we collect several examples from the literature where the classical Kramers’ stability configuration of bistable stationary states separated by an unstable state occurs for Amari-type neural fields. In Sect. 5, we introduce the notation for Kramers’ law and LDPs and state the main theorem on finite-dimensional rate functions. In Sect. 6, we argue that a direct approach to Kramers’ law via “lifting” for (1) is likely to fail. Although the Amari model has a hidden energy-type structure, we have not been able to generalize the gradient-structure approach for SPDEs to the stochastic Amari model. This raises doubt whether a Kramers’ escape rate calculation can actually be carried out, i.e., whether one may express the prefactor of the mean first-passage time in the bistable case explicitly. Based on these considerations, we restrict ourselves to deriving an LDP. In Sect. 7, the LDP is established by a direct transfer of a result known for SPDEs. The disadvantage of this approach is that the resulting rate function is difficult to calculate, analytically or numerically, in practice. Therefore, we establish in Sect. 8 the convergence of a suitable Galerkin approximation for (1). Using this approximation, one may apply results about the LDP for SDEs, which we carry out in Sect. 9. In this context, we also notice that the trace-class noise can induce a multiscale structure of the rate function in certain cases. The last two observations lead to a tractable finite-dimensional approximation of the LDP and hence also an associated finite-dimensional approximation for first-exit time problems. We conclude the paper in Sect. 10 with implications of our work and remarks about future problems.
2 Amari-Type Models
In this section, we describe the neural field model
$d{U}_{t}(x)=\left[-\alpha {U}_{t}(x)+{\int}_{\mathcal{B}}w(x,y)f({U}_{t}(y))\,dy\right]dt+\epsilon \,d{W}_{t}(x)$ (2)
for $x\in \mathcal{B}\subseteq {\mathbb{R}}^{d}$, a small parameter $\epsilon >0$, and $t\ge 0$, where ℬ is bounded and closed. In (2) the solution U models the averaged electrical potential generated by neurons at location x in an area of the brain ℬ. Neural field equations of the form (2) are called Amari-type equations or rate-based neural field models. The equation is driven by an adapted space-time stochastic process ${W}_{t}(x)$ on a filtered probability space $(\Omega ,\mathcal{F},{({\mathcal{F}}_{t})}_{t\ge 0},\mathbb{P})$. The precise definition of the process W will be given below.
The parameter $\alpha >0$ is the decay rate for the potential, and $w:\mathcal{B}\times \mathcal{B}\to \mathbb{R}$ is a kernel that models the connectivity of neurons at location x to neurons at location y. Positive values of w model excitatory connections and negative values model inhibitory connections. The gain function $f:\mathbb{R}\to {\mathbb{R}}_{+}$ relates the potential of neurons to inputs into other neurons. Typically, the gain functions are chosen sigmoidal, for example, (up to affine transformations of the argument) $f(u)={(1+{\mathrm{e}}^{-u})}^{-1}$ or $f(u)=(\tanh (u)+1)/2$. These examples of gain functions are bounded and infinitely often differentiable with bounded derivatives. However, throughout the paper, we only make the standing assumption that
(H1) the gain function f is globally Lipschitz continuous on ℝ.
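As a small numerical illustration of (H1), and not part of the analysis itself, the following sketch evaluates the two example gain functions and estimates their Lipschitz constants from difference quotients on a grid. The function names, the interval $[-10,10]$, and the grid size are assumptions chosen only for this illustration; the estimate is a lower bound for the true Lipschitz constant, which equals $1/4$ for the logistic gain and $1/2$ for the tanh-based gain.

```python
import math

def logistic_gain(u):
    """Sigmoidal gain f(u) = 1 / (1 + exp(-u))."""
    return 1.0 / (1.0 + math.exp(-u))

def tanh_gain(u):
    """Sigmoidal gain f(u) = (tanh(u) + 1) / 2; this equals logistic_gain(2 * u)."""
    return 0.5 * (math.tanh(u) + 1.0)

def empirical_lipschitz(f, lo=-10.0, hi=10.0, n=4001):
    """Largest difference quotient of f on a uniform grid (a lower bound for the Lipschitz constant)."""
    h = (hi - lo) / (n - 1)
    xs = [lo + k * h for k in range(n)]
    return max(abs(f(xs[k + 1]) - f(xs[k])) / h for k in range(n - 1))

print(empirical_lipschitz(logistic_gain))  # close to 1/4
print(empirical_lipschitz(tanh_gain))      # close to 1/2
```

Both examples are therefore globally Lipschitz, so they fall under the standing assumption, which in turn also admits gain functions that are neither bounded nor smooth.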
Throughout the paper, we assume that
(H2) the kernel w is such that K is a compact, self-adjoint operator on ${L}^{2}(\mathcal{B})$.
We note that an integral operator is self-adjoint if and only if the kernel is symmetric, i.e., $w(x,y)=w(y,x)$ for all $x,y\in \mathcal{B}$. A sufficient condition for the compactness of K is, e.g., ${\parallel w\parallel}_{{L}^{2}(\mathcal{B}\times \mathcal{B})}<\mathrm{\infty}$, in which case the operator is called a Hilbert–Schmidt operator. Since ℬ is bounded, the continuity of the kernel w on $\mathcal{B}\times \mathcal{B}$ implies the compactness of K considered as an integral operator on $C(\mathcal{B})$.
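The two kernel conditions can be checked numerically for a concrete candidate. The sketch below is only an illustration under stated assumptions: the domain $\mathcal{B}=[0,1]$, the kernel $w(x,y)={\mathrm{e}}^{-|x-y|}$, and the helper names are hypothetical choices, and the Hilbert–Schmidt norm is approximated by a midpoint rule.

```python
import math

def kernel(x, y):
    """Hypothetical symmetric connectivity kernel on B = [0, 1] (an assumption for illustration)."""
    return math.exp(-abs(x - y))

def midpoints(n):
    """Midpoints of n equal subintervals of [0, 1]."""
    return [(k + 0.5) / n for k in range(n)]

def is_symmetric(w, n=50, tol=1e-12):
    """Check w(x, y) == w(y, x) on a grid, the condition for a self-adjoint integral operator."""
    xs = midpoints(n)
    return all(abs(w(x, y) - w(y, x)) <= tol for x in xs for y in xs)

def hilbert_schmidt_norm(w, n=200):
    """Midpoint-rule approximation of the L^2(B x B) norm of w for B = [0, 1]."""
    xs = midpoints(n)
    total = sum(w(x, y) ** 2 for x in xs for y in xs)
    return math.sqrt(total / (n * n))

print(is_symmetric(kernel))
print(hilbert_schmidt_norm(kernel))
```

For this particular kernel the squared norm can be computed by hand as $(1+{\mathrm{e}}^{-2})/2$, which gives a convenient check of the quadrature.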
where W is an ${L}^{2}(\mathcal{B})$-valued stochastic process. Interpreting the original equation in this form, we now give a definition of the noise process assuming that
(H3) W is a Q-Wiener process on ${L}^{2}(\mathcal{B})$, where the covariance operator Q is a nonnegative, symmetric trace-class operator on ${L}^{2}(\mathcal{B})$.
where ${\beta}^{i}$ are a sequence of independent scalar Wiener processes (cf. [56, Proposition 2.1.10]). The series (5) converges in the mean-square on $C([0,T],{L}^{2}(\mathcal{B}))$. Furthermore, a straightforward adaptation of the proof of [56, Proposition 2.1.10] shows that convergence in the mean-square also holds in the space $C([0,T],C(\mathcal{B}))$ for every $T>0$ if ${v}_{i}\in C(\mathcal{B})$ for all i (corresponding to nonzero eigenvalues) and ${\sup}_{x\in \mathcal{B}}{\sum}_{i=1}^{\mathrm{\infty}}{\lambda}_{i}^{2}{v}_{i}{(x)}^{2}<\mathrm{\infty}$.
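A truncated version of such a series is straightforward to sample. The following sketch assumes the expansion has the form ${W}_{t}(x)={\sum}_{i}{\lambda}_{i}{\beta}_{t}^{i}{v}_{i}(x)$, which is consistent with the summability condition above but is an assumption here; the eigenvalues ${\lambda}_{i}={i}^{-2}$, the cosine basis on $[0,1]$, and all function names are hypothetical choices for illustration only.

```python
import math
import random

def eigenvalue(i):
    """Hypothetical coefficients lambda_i = i^(-2); Q then has eigenvalues lambda_i^2."""
    return 1.0 / (i * i)

def eigenfunction(i, x):
    """Hypothetical orthonormal functions in L^2([0, 1]): v_i(x) = sqrt(2) cos(i pi x)."""
    return math.sqrt(2.0) * math.cos(i * math.pi * x)

def sample_q_wiener(n_modes, n_steps, dt, xs, rng):
    """Return field[k][j], the truncated series at time k * dt and position xs[j]."""
    beta = [0.0] * n_modes                      # scalar Wiener processes beta^i
    field = [[0.0] * len(xs)]                   # W_0 = 0
    for _ in range(n_steps):
        beta = [b + rng.gauss(0.0, math.sqrt(dt)) for b in beta]
        field.append([
            sum(eigenvalue(i + 1) * beta[i] * eigenfunction(i + 1, x)
                for i in range(n_modes))
            for x in xs
        ])
    return field

def pointwise_variance(t, x, n_modes):
    """E[W_t(x)^2] = t * sum_i lambda_i^2 v_i(x)^2 for the truncated series."""
    return t * sum((eigenvalue(i) * eigenfunction(i, x)) ** 2
                   for i in range(1, n_modes + 1))

W = sample_q_wiener(20, 100, 0.01, [0.0, 0.5, 1.0], random.Random(0))
print(len(W), pointwise_variance(1.0, 0.0, 20))
```

The quantity returned by `pointwise_variance` is exactly the partial sum appearing in the uniform summability condition, so its boundedness in x and in the number of modes can be inspected directly.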
The solution possesses a modification in $C([0,T],{L}^{2}(\mathcal{B}))$ and from now on we always identify the solution (6) with its continuous modification. It is worthwhile to note that for cylindrical Wiener processes, and thus in particular for space-time white noise, there does not exist a solution to (4). This contrasts with other well-studied infinite-dimensional stochastic evolution equations, e.g., the stochastic heat equation. Due to the representation of the solution (6), it follows that a solution can only be as spatially regular as the stochastic convolution ${\int}_{0}^{t}{\mathrm{e}}^{-\alpha (t-s)}\,d{W}_{s}$. In the present case, the semigroup generated by the linear operator is not smoothing in contrast to, e.g., the semigroup generated by the Laplacian in the heat equation. Thus, the stochastic convolution is only as smooth as the noise, which for space-time white noise is not even a well-defined function. To be more specific, for cylindrical Wiener noise, the series representation of the stochastic convolution (cf. Eq. (8) below) does not converge in a suitable probabilistic sense.
We next aim to strengthen the spatial regularity of the solution (6), which will be required later on. According to [23, Theorem 7.10] the solution (6) is a continuous process taking values in the Banach space $C(\mathcal{B})$ if the initial condition satisfies ${u}_{0}\in C(\mathcal{B})$, the linear part in the drift of (4) generates a strongly continuous semigroup on $C(\mathcal{B})$, the nonlinear term KF is globally Lipschitz continuous on $C(\mathcal{B})$, and finally, if the stochastic convolution is a continuous process taking values in $C(\mathcal{B})$. It is easily seen that the first conditions are satisfied and sufficient conditions for the latter property are given in the following lemma.
possesses a modification with γ-Hölder continuous paths in ${\mathbb{R}}_{+}\times \mathcal{B}$ for all $\gamma \in (0,\rho /2)$.
Now, the Kolmogorov–Centsov theorem implies the statement of the lemma. □
We present an example to illustrate the type of noise we are generally interested in. Further motivation is provided in Sect. 3.
3 Gain Function Perturbation
Recall that, by assumption (H2), the integral operator K defined by the kernel w is a self-adjoint compact operator. Thus, the spectral theorem implies that K possesses only real eigenvalues ${\lambda}_{i}$, $i\in \mathbb{N}$, and the corresponding eigenfunctions ${v}_{i}$ form an orthonormal basis of ${L}^{2}(\mathcal{B})$. If additionally we assume that
(H4) K is a Hilbert–Schmidt operator on ${L}^{2}(\mathcal{B})$, that is, ${\parallel w\parallel}_{{L}^{2}(\mathcal{B}\times \mathcal{B})}<\mathrm{\infty}$,
is a trace-class Wiener process on the Hilbert space ${L}^{2}(\mathcal{B})$. Note that, in contrast to (5), the coefficients ${\lambda}_{i}$ here may be negative; however, as $-{\beta}^{i}$ is also a Wiener process, this slight inconsistency can be neglected.
for a $\rho \in (0,1)$ and a $M<\mathrm{\infty}$. The condition (14) and the first part of (15) are easily checked but for the second part of (15) usually theoretical results on the speed of decay of the eigenvalues have to be obtained. We note that (15) is certainly satisfied with $\rho =1/2$ if K is a trace class operator and the eigenfunctions are pointwise bounded independently of i; see, e.g., Example 2.1.
4 Deterministic Dynamics
The linear stability condition $\mu <0$ is equivalent to $\eta <\alpha $ where $\eta \in spec(\mathcal{L})$. The stability analysis can be reduced to the understanding of the operator ℒ. However, this is a highly nontrivial problem as the behavior depends upon ℬ, ${U}^{\ast}(x)$, $w(x,y)$, and $f(u)$.
An LDP and Kramers’ law are of particular interest in the case of bistability. Therefore, we point out that there are many situations where (16) does have three stationary solutions: ${U}_{\pm}^{\ast}(x)$, which are stable and ${U}_{0}^{\ast}(x)$ which is unstable. The following three examples make this claim more precise.
Suppose there are precisely three solutions for $U=F(U)$ given by $U=0,a,1$ with $0<a<1$. If ${F}^{\prime}(0)<1$, ${F}^{\prime}(1)<1$ and ${F}^{\prime}(a)>1$ then (20) has an unstable stationary solution between the two stable stationary solutions.
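The stability criterion of this example is easy to reproduce numerically. The sketch below assumes that the underlying scalar dynamics have the form $dU/dt=-U+F(U)$, so that a stationary state is linearly stable exactly when ${F}^{\prime}<1$ there; this form, the logistic choice of F with steepness 10 and threshold 0.5, and the helper names are assumptions made for illustration. For this hypothetical F the three stationary states lie near 0, at 1/2, and near 1 rather than exactly at 0, a, and 1.

```python
import math

def F(u, steepness=10.0, threshold=0.5):
    """Hypothetical sigmoidal nonlinearity (an assumption for illustration)."""
    return 1.0 / (1.0 + math.exp(-steepness * (u - threshold)))

def dF(u, steepness=10.0, threshold=0.5):
    """Derivative of F, using F' = steepness * F * (1 - F)."""
    s = F(u, steepness, threshold)
    return steepness * s * (1.0 - s)

def bisect(h, a, b, tol=1e-12):
    """Root of h in [a, b] by bisection, assuming h(a) and h(b) have opposite signs."""
    fa = h(a)
    while b - a > tol:
        m = 0.5 * (a + b)
        fm = h(m)
        if fa * fm <= 0.0:
            b = m
        else:
            a, fa = m, fm
    return 0.5 * (a + b)

def stationary_states(n_grid=999):
    """Solve U = F(U) on [0, 1] and classify each solution via the sign of F'(U) - 1."""
    h = lambda u: F(u) - u
    grid = [k / n_grid for k in range(n_grid + 1)]
    roots = [bisect(h, a, b) for a, b in zip(grid, grid[1:]) if h(a) * h(b) < 0.0]
    return [(u, "stable" if dF(u) < 1.0 else "unstable") for u in roots]

for u, label in stationary_states():
    print(round(u, 6), label)
```

The odd grid size avoids placing a grid point exactly on the middle root, where a sign-change test would otherwise miss it.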
where $H(\cdot )$ is the Heaviside function and b, a, A, and ${u}_{b}$ are parameters. Depending on parameter values, one may obtain three constant stationary solutions exhibiting bistability as expected from Example 4.1. However, there are also parameter values so that three stationary pulses exhibiting bistability exist.
Note that the choice $\mathcal{B}=\mathbb{R}$ is not essential to obtain two deterministically stable stationary states ${U}_{\pm}^{\ast}(x)$ and one deterministically unstable stationary state ${U}_{0}^{\ast}(x)$. The important aspect is that certain algebraic equations, such as $U=f(U)$ and $U=F(U)$ in Example 4.1, have the correct number of solutions. Furthermore, one has to make sure that the sign of the nonlinearity f is chosen correctly to obtain the desired deterministic stability results for the stationary solutions. Hence, we expect that a similar situation also holds for bounded domains; see also [63].
Examples 4.1–4.2 are typical for many similar cases with $x\in \mathbb{R}$ or $x\in {\mathbb{R}}^{2}$. Many results on existence and stability of stationary solutions are available; see, e.g., [1, 46, 51, 52], and references therein.
yield three stationary solutions ${U}_{+}^{\ast}$, ${U}_{-}^{\ast}$ and ${U}_{0}^{\ast}$. The solutions ${U}_{\pm}^{\ast}$ are stable and satisfy ${U}_{-}^{\ast}\le 0$ and ${U}_{+}^{\ast}>0$. The solution ${U}_{0}^{\ast}$ is unstable.
Although we only focus on stationary solutions, it is important to remark that the techniques developed here could, in principle, also be applied to traveling waves $U(x,t)=U(x-st)$ for $s>0$. The existence and stability of traveling waves for (16) has been investigated for many different situations; see, e.g., [12, 14, 21, 29], and references therein. However, it seems reasonable to restrict ourselves here to the stationary case as even for this simpler case an LDP and Kramers’ law are not yet well understood.
5 Large Deviations and Kramers’ Law
Furthermore, we are going to assume that the diffusion matrix $\mathfrak{D}(u):=G{(u)}^{T}G(u)\in {\mathbb{R}}^{N\times N}$ is positive definite.
Theorem 5.2 ([34, Theorem 4.2, p. 127], [26, Theorem 5.7.11])
where ${\parallel \cdot \parallel}_{2}$ denotes the usual Euclidean norm in ${\mathbb{R}}^{N}$. The formula (32) is also known as Kramers’ law [5] or Arrhenius–Eyring–Kramers’ law [2, 31, 45]. Note that the key differences with the general LDP (29) for the first-exit problem are that (32) yields a precise prefactor for the exponential transition time and uses the explicit form of the good rate function for gradient systems. It is interesting to note that a rigorous proof of (32) has only been obtained quite recently [9, 10].
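To make the role of the prefactor and of the exponential scaling concrete, the following sketch evaluates the standard one-dimensional form of Kramers’ law for a scalar gradient SDE $dx=-{V}^{\prime}(x)\,dt+\epsilon \,dW$, namely a mean transition time of approximately $2\pi /\sqrt{{V}^{\prime \prime}({x}_{-})|{V}^{\prime \prime}(z)|}\,\mathrm{exp}(2[V(z)-V({x}_{-})]/{\epsilon}^{2})$ for a minimum ${x}_{-}$ and a saddle z. The noise scaling, the double-well potential, and the function name are assumptions for this illustration; the sketch is not a restatement of (32) in N dimensions.

```python
import math

def kramers_time(barrier, curv_min, curv_saddle, eps):
    """Leading-order mean transition time for dx = -V'(x) dt + eps dW in one dimension."""
    prefactor = 2.0 * math.pi / math.sqrt(curv_min * abs(curv_saddle))
    return prefactor * math.exp(2.0 * barrier / eps ** 2)

# Double well V(x) = x^4/4 - x^2/2: minima at x = -1 and x = 1, saddle at x = 0.
barrier = 0.25        # V(0) - V(-1)
curv_min = 2.0        # V''(-1)
curv_saddle = -1.0    # V''(0)
for eps in (0.5, 0.4, 0.3):
    print(eps, kramers_time(barrier, curv_min, curv_saddle, eps))
```

An LDP alone would only determine the exponent $2[V(z)-V({x}_{-})]/{\epsilon}^{2}$; the curvature-dependent prefactor is the additional information supplied by a sharp Kramers’ law.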
6 Gradient Structures in Infinite Dimensions
The Gâteaux derivative is equal to the Fréchet derivative $\mathrm{\nabla}V=DV$ by a standard continuity result [25, p. 47]. Hence, (36) shows that the stationary solutions of (35) are critical points of the gradient functional V. Since the gradient structure of the deterministic PDE (35) is a key structure to obtain a Kramers’-type estimate for the SPDE (33), we would like to check whether there is an analogue available for the deterministic Amari model (16).

There is still a space-time-dependent nonlinear prefactor $1/{g}^{\prime}(P(x,t))$ in (42) for the deterministic system, so the system is not an exact gradient flow for a potential.

Applying the change of variable ${P}_{t}(x):=f({U}_{t}(x))$ for the stochastic Amari model (2) requires an Itô-type formula so that
$\begin{array}{rcl}d{P}_{t}(x)& =& \frac{1}{{g}^{\prime}({P}_{t}(x))}\left[-\alpha g({P}_{t}(x))+{\int}_{\mathcal{B}}w(x,y){P}_{t}(y)\,dy+\mathcal{O}\left({\epsilon}^{2}\right)\right]dt\\ & & {}+\epsilon M({P}_{t}(x))\,d{W}_{t}(x),\end{array}$ (43)

Even if we were to assume, without any immediate physical motivation, that the noise term in (43) is purely additive, $\epsilon \,d{W}_{t}(x)$, there is a problem in applying Kramers’ law since we do not have a structure like in (22) with $G(\cdot )=\mathit{Id}$: ${W}_{t}(x)$ is a Q-Wiener process defined in (5), and driving space-time white noise in (4) is excluded due to the nonexistence of a solution.
Based on these observations, an immediate approach to generalize a sharp Kramers’ formula to neural fields seems unlikely to succeed. Hence we try to understand an LDP for the stochastic Amari-type model (2).
7 Direct Approach to an LDP
A general direct approach for the derivation of an LDP for infinite-dimensional stochastic evolution equations is presented in [23] and further results have been obtained for certain additional classes of SPDEs [17–19, 57]. The results in [23] are valid for semilinear equations with suitable Lipschitz assumptions on the nonlinearity and with solutions taking values in $C(\mathcal{D})$. We state the available results applied to continuous solutions of the Amari equation (4) assuming that the conditions of Lemma 2.1 are satisfied.
where this quasipotential relates to the minimal energy necessary to move the control system (44) started at the equilibrium state ${u}^{\ast}$ to z.
Theorem 7.1 ([23, Theorem 12.18])
Following further the exposition in [23, Sect. 12], explicit formulae for the rate function I are only available in the special case of the drift possessing gradient structure and space-time white noise. As we have argued above, this structure is not satisfied for neural field equations. Hence, the same observations as presented at the end of the last section prevent a further direct analytic approach to the LDP. Therefore, we try to understand the LDP problem for a discretized approximate finite-dimensional version of the neural field equation.
8 Galerkin Approximation
respectively. The following theorem establishes the almost sure convergence of the Galerkin approximations to the solution of (4). Therefore, we may be able to infer properties of the behavior of paths of the solution from the path behavior of the Galerkin approximations. We have deferred the proof of the theorem to the Appendix.
9 Approximating the LDP
Furthermore, for arbitrary functions ${\varphi}_{t}\in {L}^{2}(\mathcal{B})$, which are used in the formulation of the rate function, we use the notation ${\varphi}_{t}^{\cdot ,N}$ to denote the projection onto the first N Galerkin coefficients. Theorem 5.1 immediately implies the following:
where ${g}^{i,N}({\varphi}_{t}^{\cdot ,N})=-\alpha {\varphi}_{t}^{i,N}+{(KF)}^{i,N}({\varphi}_{t}^{1,N},\dots ,{\varphi}_{t}^{N,N})$.
where $g({\varphi}_{t})=-\alpha {\varphi}_{t}+KF({\varphi}_{t})$ and ${\mathfrak{D}}^{1/2}u={\sum}_{i=1}^{\mathrm{\infty}}({\mathfrak{D}}^{1/2}u,{v}_{i}){v}_{i}$. Therefore, the next result just implies that the Galerkin approximation is consistent for the LDP.
Proposition 9.2 For each ${\varphi}_{t}\in {u}_{0}+{H}_{1}^{\mathrm{\infty}}$ we have ${\lim}_{N\to \mathrm{\infty}}\left[I({\varphi}_{t})-{I}^{N}({\varphi}_{t}^{\cdot ,N})\right]=0$.
by orthonormality of the basis in ${L}^{2}(\mathcal{B})$. □
Hence, we may work with the finite-dimensional Galerkin system and its LDP for computational purposes. However, the truncation N may still be very large. We show, using a formal analysis for a certain case, that there is an intrinsic multiscale structure of the rate function. We assume that we are in the special case considered in Sect. 3 where K and Q have the same eigenfunctions and the corresponding eigenvalues are given by ${\lambda}_{i}$ and ${\lambda}_{i}^{2}$, respectively.
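In this special case the finite-dimensional noise is diagonal, and the standard Freidlin–Wentzell action for the Galerkin system takes the form $\frac{1}{2}{\int}_{0}^{T}{\sum}_{i=1}^{N}{\lambda}_{i}^{-2}{\left({({\varphi}_{t}^{i,N})}^{\prime}-{g}^{i,N}({\varphi}_{t}^{\cdot ,N})\right)}^{2}dt$. The sketch below discretizes this expression with a left-endpoint Riemann sum; it assumes that all ${\lambda}_{i}$ are nonzero and that this diagonal form is the relevant one, and the function and variable names are hypothetical.

```python
def galerkin_action(path, dt, drift, lam):
    """Riemann-sum approximation of 0.5 * int sum_i ((phi_i' - g_i(phi)) / lam_i)^2 dt.

    path  : list of coefficient vectors phi(t_k), k = 0, ..., n
    drift : function mapping a coefficient vector to the drift vector g(phi)
    lam   : noise coefficients lambda_i (assumed nonzero)
    """
    action = 0.0
    for prev, nxt in zip(path, path[1:]):
        g = drift(prev)
        for i, l in enumerate(lam):
            velocity = (nxt[i] - prev[i]) / dt
            action += 0.5 * ((velocity - g[i]) / l) ** 2 * dt
    return action

# Two modes with purely linear drift g(phi) = -alpha * phi (nonlinearity dropped for illustration).
alpha, dt, n_steps = 1.0, 0.01, 100
lam = [1.0, 0.25]
linear_drift = lambda phi: [-alpha * p for p in phi]

constant_path = [[1.0, 1.0] for _ in range(n_steps + 1)]
print(galerkin_action(constant_path, dt, linear_drift, lam))
```

Holding both modes fixed at 1 against the decay costs 0.5 per unit time in the first mode but 8 per unit time in the second, because the deviation from the drift is weighted by ${\lambda}_{i}^{-2}$. This weighting is the source of the multiscale structure: deviations from the deterministic flow in modes with small ${\lambda}_{i}$ are heavily penalized.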
and ${(\tilde{KF})}^{i,N}=\frac{1}{{\lambda}_{i}^{2}}{(KF)}^{i,N}$.
and observe that the result is independent of N. □
then $\eta (N)=1$.
Since ${\parallel {v}_{j}\parallel}_{{L}^{2}(\mathcal{B})}=1$ and ${L}^{2}(\mathcal{B})\hookrightarrow {L}^{1}(\mathcal{B})$, the last integral is uniformly bounded over $j\in \mathbb{N}$ by $meas{(\mathcal{B})}^{1/2}$. □

If $\kappa (N)\ll {\lambda}_{N}$ or $\kappa (N)\sim {\lambda}_{N}$ as $N\to \mathrm{\infty}$, then we can conclude that $\kappa (N)\to 0$, i.e.,
${\left({\varphi}_{t}^{N,N}\right)}^{\prime}+\alpha {\varphi}_{t}^{N,N}\to 0\phantom{\rule{1em}{0ex}}\text{as }N\to \mathrm{\infty}$ (59)

If $\kappa (N)\gg {\lambda}_{N}$ as $N\to \mathrm{\infty}$, then ${a}_{1}\gg 2{a}_{2}+{a}_{3}$ and the first term dominates the asymptotics. But ${a}_{1}^{N}\ge 0$ for all N so that the rate function only has a finite infimum if ${a}_{1}^{N}\to 0$ as $N\to \mathrm{\infty}$. This implies again that (59) holds for the case of a finite infimum.
so that for bounded nonlinearity f, which is represented in the terms $[\cdots ]$ in (60), the higher-order modes should really just be governed by $d{u}_{t}^{i,N}=-\alpha {u}_{t}^{i,N}dt$.
Hence, Propositions 9.1–9.2 and the multiscale nature of the problem induced by the trace-class noise suggest a procedure for approximating the rate function and the associated LDP in practice. In particular, we may compute the eigenvalues and eigenfunctions of K and Q up to a sufficiently large given order ${N}^{\ast}$. This yields an explicit representation of the Galerkin system and the associated rate function. Then one may apply any finite-dimensional technique to understand the rate function. One may even find a better truncation order $N<{N}^{\ast}$ based on the knowledge that the minimizer of the rate function must have components that decay (almost) exponentially in time for orders bigger than N.
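The first step of this procedure, setting up the Galerkin system from given eigenpairs, can be sketched as follows. The example assumes the special case in which K and Q share eigenfunctions ${v}_{i}$, so that the modes satisfy $d{u}^{i}=[-\alpha {u}^{i}+{\lambda}_{i}(f(u),{v}_{i})]\,dt+\epsilon {\lambda}_{i}\,d{\beta}^{i}$, where self-adjointness of K was used to write ${(KF)}^{i}={\lambda}_{i}(f(u),{v}_{i})$. The cosine basis on $[0,1]$, the eigenvalues ${\lambda}_{i}={(i+1)}^{-2}$, the logistic gain, the Euler–Maruyama scheme, and all names are assumptions for illustration, not a prescription of the numerical method.

```python
import math
import random

def gain(u):
    """Logistic gain function f(u) = 1 / (1 + exp(-u))."""
    return 1.0 / (1.0 + math.exp(-u))

def cosine_basis(n_modes, n_quad):
    """Values v_i(x_k) of a hypothetical orthonormal cosine basis of L^2([0, 1]) at midpoints x_k."""
    xs = [(k + 0.5) / n_quad for k in range(n_quad)]
    return [[1.0 if i == 0 else math.sqrt(2.0) * math.cos(i * math.pi * x) for x in xs]
            for i in range(n_modes)]

def galerkin_drift(coeffs, lam, alpha, basis):
    """Drift g^i(u) = -alpha u^i + lam_i <f(u), v_i>, inner product by the midpoint rule."""
    n_quad = len(basis[0])
    field = [sum(c * v[k] for c, v in zip(coeffs, basis)) for k in range(n_quad)]
    fired = [gain(u) for u in field]
    return [-alpha * c + l * sum(f * vk for f, vk in zip(fired, v)) / n_quad
            for c, l, v in zip(coeffs, lam, basis)]

def euler_maruyama(coeffs, lam, alpha, eps, dt, n_steps, rng, n_quad=64):
    """Simulate du^i = g^i(u) dt + eps lam_i d(beta^i) for the first len(coeffs) modes."""
    basis = cosine_basis(len(coeffs), n_quad)
    path = [list(coeffs)]
    for _ in range(n_steps):
        drift = galerkin_drift(path[-1], lam, alpha, basis)
        path.append([u + g * dt + eps * l * rng.gauss(0.0, math.sqrt(dt))
                     for u, g, l in zip(path[-1], drift, lam)])
    return path

# Hypothetical eigenvalues lambda_i of K; Q then has eigenvalues lambda_i^2.
lam = [1.0 / (i + 1) ** 2 for i in range(8)]
path = euler_maruyama([0.0] * 8, lam, alpha=1.0, eps=0.1, dt=0.01, n_steps=500,
                      rng=random.Random(1))
print(path[-1])
```

With $\epsilon =0$ and a spatially constant initial condition, only the zeroth mode is excited and it relaxes toward a solution of $u=f(u)$, which gives a simple consistency check before noise is switched on.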
10 Outlook
In this paper, we have discussed several steps toward a better understanding of noise-induced transitions in continuum neural fields. Although we have provided the main basic elements via the LDP and finite-dimensional approximations, there are still several very interesting open problems.
We have demonstrated that a sharp Kramers’ rate calculation for neural fields with trace-class noise is very challenging as the techniques for white-noise gradient-structure SPDEs cannot be applied directly. However, we have seen in Sect. 4 that the deterministic dynamics for neural fields frequently exhibits a classical bistable structure with a saddle state between stable equilibria. This suggests that there should be a Kramers’ law with exponential scaling in the noise intensity as well as a precisely computable prefactor. It is interesting to ask how this prefactor depends on the eigenvalues of the trace-class operator Q defining the Q-Wiener process. We expect that new technical tools are needed to answer this question.
From the viewpoint of experimental data, the exponential scaling for the LDP is relevant as it shows that noise-induced transitions have exponential interarrival times. This leads to the possibility that working memory as well as perceptual bistability could be governed by a Poisson process. However, the same phenomena could also be governed by a slowly varying variable, i.e., by an adaptive neural field [14]; the “fast” activity variable U in the Amari model is augmented by one or more “slow” variables. In this context, the required assumptions on the equilibrium structure in Sect. 4 and the noise in Sect. 3 are not necessary to produce a bistable switch: the fast variable U can, e.g., just have a single deterministically unstable equilibrium, and bistable, nonrandom switching between metastable states may occur. Of course, there is also the possibility that an intermediate regime between noise-induced and deterministic escape is relevant [53].
It is interesting to note that the same problem arises generically across many natural sciences in the study of critical transitions (or “tipping points”) [48, 59]. The question which escape mechanism from a metastable state matches the data is often discussed very controversially and we shall not aim to provide a discussion here. However, our main goal to make the LDP and its associated rate functional as explicit as possible should definitely help to simplify comparison between models and experiment. For example, a parameter study or data assimilation for the finite-dimensional Galerkin system considered in Theorem 8.1 and the associated rate function in Proposition 9.1 is often easier than working directly with the abstract solutions of the stochastic Amari model in $C([0,T],{L}^{2}(\mathcal{B}))$.
Studying the parameter dependence is an interesting open question, which we aim to address in future work. In particular, the next step is to use the Galerkin approximations in Sect. 8 and the associated LDP in Sect. 9 for numerical purposes [49]. Recent work for SPDEs [8] suggests that a spectral method can also be efficient for stochastic neural fields. Results on numerical continuation and jump heights for SDEs [47] can also be immediately transferred to the spectral approximation, which would allow for studies of bifurcations and associated noise-induced phenomena.
One may also ask how far the technical assumptions we make in this paper can be weakened. It is not clear which parts of the global Lipschitz assumptions may be replaced by local assumptions or removed altogether. Similar remarks apply to the multiscale nature of the problem induced by the decay of the eigenvalues of Q. How far this observation can be exploited to derive more efficient analytical as well as numerical techniques remains to be investigated.
On a more abstract level, it would certainly be desirable to extend our basic framework to other topics that have been considered already for deterministic neural fields. A generalization to activity-based models with nonlinearity $f({\int}_{\mathcal{B}}w(x,y)u(y)dy)$ seems possible. Furthermore, it may be highly desirable to go beyond stationary solutions and investigate noise-induced switching and transitions for traveling waves and patterns.
Appendix: Convergence of the Galerkin Approximation
almost surely.

It clearly holds that ${\parallel {U}_{0}-{P}^{N}{U}_{0}\parallel}_{{L}^{2}}\to 0$ and the convergence ${\parallel {U}_{0}-{P}^{N}{U}_{0}\parallel}_{0}\to 0$ holds by assumption.

Next, as argued above $(1+\parallel {U}_{0}\parallel +{\sup}_{t\in [0,T]}\parallel {O}_{t}\parallel )$ is a.s. finite and the compactness of the operator K implies $\vert \vert \vert K-{P}^{N}K\vert \vert \vert \to 0$ for $N\to \mathrm{\infty}$; see [3, Lemma 12.1.4].

Finally, the third error term ${\sup}_{t\in [0,T]}\parallel {O}_{t}-{O}_{t}^{N}\parallel $ vanishes if the Galerkin approximations ${O}^{N}$ of the Ornstein–Uhlenbeck process O converge almost surely in the spaces $C([0,T],{L}^{2}(\mathcal{B}))$ and $C([0,T],C(\mathcal{B}))$, respectively. This convergence is proven in Lemma A.1 below.
The proof is completed. □
The following lemma contains the convergence of the Galerkin approximation of the Ornstein–Uhlenbeck process necessary for proving Theorem 8.1.
almost surely.
Remark A.1 Assumptions on the speed of convergence of the series ${\sum}_{i=1}^{\mathrm{\infty}}{\lambda}_{i}^{2}$ and ${\sum}_{i=1}^{\mathrm{\infty}}{\lambda}_{i}^{2}{v}_{i}^{2}$ and ${\sup}_{x\in \mathcal{B}}{\sum}_{i=1}^{\mathrm{\infty}}{\lambda}_{i}^{2}{L}_{i}^{2\rho}{{v}_{i}(x)}^{2(1-\rho )}$ readily yield a rate of convergence for the Galerkin approximation due to the definition of the constants ${b}_{N}$ in the proof of the lemma.
Proof of Lemma A.1 As in the proof of Theorem 8.1, the unspecified norm $\parallel \cdot \parallel $ denotes either the norm in ${L}^{2}(\mathcal{B})$ or in $C(\mathcal{B})$ and estimates are valid in both cases. We fix $T>0$, $\rho \in (0,1)$ and a $p\in \mathbb{N}$ with $p>2d/\rho $. Throughout the proof $C>0$ denotes a constant that changes from line to line, but depends only on the fixed parameters T, p, ρ, α and the domain $\mathcal{B}\subset {\mathbb{R}}^{d}$.
In order to estimate the pth mean of the process ${Y}^{M,N}$, we proceed separately for the two cases ${L}^{2}(\mathcal{B})$ and $C(\mathcal{B})$.
where the final upper bound decreases to zero for $M\to \mathrm{\infty}$ by assumption.
where the right-hand side decreases to zero for $M\to \mathrm{\infty}$.
The proof is completed. □
We note that the boundedness assumption on the domain ℬ in this study is only necessary when dealing with results in the space $C(\mathcal{B})$, which is the appropriate space for the LDP results. All other results in this paper which only deal with the space ${L}^{2}(\mathcal{B})$, e.g., existence of solutions and convergence of the Galerkin approximation, are also valid for unbounded spatial domains.
Hence, the upper bound, which is finite due to (7), is independent of x and converges to zero for $N\to \mathrm{\infty}$. Moreover, we further find that $g(t)\in {L}^{2}((0,T),{L}^{2}(\mathcal{B}))$ implies ${Q}^{1/2}g(t)\in {L}^{2}((0,T),C(\mathcal{B}))$.
For a centered Gaussian random variable Z, it holds $\mathbb{E}{Z}^{p}\le p!{(\mathbb{E}{Z}^{2})}^{p/2}$ for all $p\in \mathbb{N}$.
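This moment bound can be checked numerically. The following is a minimal sketch, not part of the original argument: it uses the standard closed form $\mathbb{E}{|Z|}^{p}={\sigma}^{p}{2}^{p/2}\mathrm{\Gamma}((p+1)/2)/\sqrt{\pi}$ for a centered Gaussian Z with variance ${\sigma}^{2}$ and compares it with $p!{\sigma}^{p}$. Working with the absolute moment is an assumption made here; it dominates $\mathbb{E}{Z}^{p}$, so the check covers the stated inequality as well. The function names and the value of σ are hypothetical.

```python
import math


def abs_moment_standard_normal(p):
    """E|Z|^p for Z ~ N(0, 1): 2^(p/2) * Gamma((p + 1) / 2) / sqrt(pi)."""
    return 2.0 ** (p / 2.0) * math.gamma((p + 1) / 2.0) / math.sqrt(math.pi)


def moment_bound_holds(p, sigma=1.0):
    """Check E|Z|^p <= p! * (E Z^2)^(p/2) for Z ~ N(0, sigma^2)."""
    lhs = sigma ** p * abs_moment_standard_normal(p)
    rhs = math.factorial(p) * (sigma ** 2) ** (p / 2.0)
    return lhs <= rhs


# Check the bound for p = 1, ..., 20 with an arbitrary standard deviation.
checks = {p: moment_bound_holds(p, sigma=1.7) for p in range(1, 21)}
print(all(checks.values()))
```

The bound is far from sharp for large p, since the even moments equal $(p-1)!!\phantom{\rule{0.2em}{0ex}}{\sigma}^{p}$, but a crude factorial bound is all that is stated here.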
Declarations
Acknowledgements
CK would like to thank the European Commission (EC/REA) for support by a Marie Curie International Reintegration Grant and the Austrian Academy of Sciences (ÖAW) for support via an APART fellowship. We would also like to thank two anonymous referees whose comments helped to improve the manuscript.
Copyright
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.