Mean-field description and propagation of chaos in networks of Hodgkin-Huxley and FitzHugh-Nagumo neurons
 Javier Baladron^{1},
 Diego Fasoli^{1},
 Olivier Faugeras^{1, 2} and
 Jonathan Touboul^{3, 4, 5, 6}
https://doi.org/10.1186/2190-8567-2-10
© Baladron et al.; licensee Springer 2012
Received: 19 October 2011
Accepted: 9 March 2012
Published: 31 May 2012
Abstract
We derive the mean-field equations arising as the limit of a network of interacting spiking neurons, as the number of neurons goes to infinity. The neurons belong to a fixed number of populations and are represented either by the Hodgkin-Huxley model or by one of its simplified versions, the FitzHugh-Nagumo model. The synapses between neurons are either electrical or chemical. The network is assumed to be fully connected. The maximum conductances vary randomly. Under the condition that all neurons’ initial conditions are drawn independently from the same law that depends only on the population they belong to, we prove that a propagation of chaos phenomenon takes place, namely that in the mean-field limit, any finite number of neurons become independent and, within each population, have the same probability distribution. This probability distribution is a solution of a set of implicit equations, either nonlinear stochastic differential equations resembling the McKean-Vlasov equations or non-local partial differential equations resembling the McKean-Vlasov-Fokker-Planck equations. We prove the well-posedness of the McKean-Vlasov equations, i.e. the existence and uniqueness of a solution. We also show the results of some numerical experiments that indicate that the mean-field equations are a good representation of the mean activity of a finite-size network, even for modest sizes. These experiments also indicate that the McKean-Vlasov-Fokker-Planck equations may be a good way to understand the mean-field dynamics through, e.g. a bifurcation analysis.
Mathematics Subject Classification (2000): 60F99, 60B10, 92B20, 82C32, 82C80, 35Q80.
Keywords
mean-field limits; propagation of chaos; stochastic differential equations; McKean-Vlasov equations; Fokker-Planck equations; neural networks; neural assemblies; Hodgkin-Huxley neurons; FitzHugh-Nagumo neurons
1 Introduction
Cortical activity displays highly complex behaviors which are often characterized by the presence of noise. Reliable responses to specific stimuli often arise at the level of population assemblies (cortical areas or cortical columns) featuring a very large number of neuronal cells, each of these presenting a highly nonlinear behavior, that are interconnected in a very intricate fashion. Understanding the global behavior of large-scale neural assemblies has been a great endeavor in the past decades. One of the main interests of large-scale modeling is characterizing brain functions, which most imaging techniques record. Moreover, anatomical data recorded in the cortex reveal the existence of structures, such as the cortical columns, with a diameter of about 50 μm to 1 mm, containing on the order of 100 to 100,000 neurons belonging to a few different types. These columns have specific functions; for example, in the human visual area V1, they respond to preferential orientations of bar-shaped visual stimuli. In this case, information processing does not occur at the scale of individual neurons but rather corresponds to an activity integrating the individual dynamics of many interacting neurons and resulting in a mesoscopic signal that arises through averaging effects and effectively depends on a few control parameters. This vision, inherited from statistical physics, requires that the space scale be large enough to include sufficiently many neurons and small enough so that the region considered is homogeneous. This is, in effect, the case of the cortical columns.
In the field of mathematics, studying the limits of systems of interacting particles has been a long-standing problem and presents many technical difficulties. One of the questions addressed in mathematics was to characterize the limit of the probability distribution of an infinite set of interacting diffusion processes, and the fluctuations around the limit for a finite number of processes. The first breakthroughs on this question are due to Henry McKean (see, e.g. [1, 2]). It was then investigated in various contexts by a large number of authors such as Braun and Hepp [3], Dawson [4] and Dobrushin [5], and most of the theory was achieved by Tanaka and collaborators [6–9] and of course Sznitman [10–12]. When the initial conditions of all particles (in our case, neurons) are drawn independently from the same law, it is mathematically proved using stochastic theory (the Wasserstein distance, large deviation techniques) that in the limit where the number of particles tends to infinity, any finite number of particles behaves independently of the other ones, and they all present the same probability distribution, which satisfies a nonlinear Markov equation. Finite-size fluctuations around the limit are derived in a general case in [10]. Most of these models use a standard hypothesis of global Lipschitz continuity and linear growth condition of the drift and diffusion coefficients of the diffusions, as well as the Lipschitz continuity of the interaction function. Extensions to discontinuous càdlàg processes including singular interactions (through a local time process) were developed in [11]. Problems involving singular interaction variables (e.g. nonsmooth functions) are also widely studied in the field, but are not relevant in our case.
 1. We derive, in a rigorous manner, the mean-field equations resulting from the interaction of infinitely many neurons in the case of widely accepted models of spiking neurons and synapses.
 2. We prove a propagation of chaos property which shows that in the mean-field limit, the neurons become independent, in agreement with some recent experimental work [13] and with the idea that the brain processes information in a somewhat optimal way.
 3. We show, numerically, that the mean-field limit is a good approximation of the mean activity of the network even for fairly small sizes of neuronal populations.
 4. We suggest, numerically, that the changes in the dynamics of the mean-field limit when varying parameters can be understood by studying the mean-field Fokker-Planck equation.
We start by reviewing such models in the ‘Spiking conductance-based models’ section to motivate the present study. It is in the ‘Mean-field equations for conductance-based models’ section that we provide the limit equations describing the behaviors of an infinite number of interacting neurons and state and prove the existence and uniqueness of solutions in the case of conductance-based models. The detailed proof of the second main theorem, that of the convergence of the network equations to the mean-field limit, is given in the Appendix. In the ‘Numerical simulations’ section, we begin to address the difficult problem of the numerical simulation of the mean-field equations and show some results indicating that they may be an efficient way of representing the mean activity of a finite-size network as well as of studying the changes in the dynamics when varying biological parameters. The final ‘Discussion and conclusion’ section focuses on the conclusions of our mathematical and numerical results and raises some important questions for future work.
2 Spiking conductance-based models
This section sets the stage for our results. We review in the ‘Hodgkin-Huxley model’ section the Hodgkin-Huxley model equations in the case where both the membrane potential and the ion channel equations include noise. We then proceed in the ‘The FitzHugh-Nagumo model’ section with the FitzHugh-Nagumo equations in the case where the membrane potential equation includes noise. We next discuss in the ‘Models of synapses and maximum conductances’ section the connectivity models of networks of such neurons, starting with the synapses, electrical and chemical, and finishing with several stochastic models of the synaptic weights. In the ‘Putting everything together’ section, we write the network equations in the various cases considered in the previous section and express them in a general abstract mathematical form that is the one used for stating and proving the results about the mean-field limits in the ‘Mean-field equations for conductance-based models’ section. Before we jump into this, we conclude in the ‘Mean-field methods in computational neuroscience: a quick overview’ section with a brief overview of the mean-field methods popular in computational neuroscience.
From the mathematical point of view, each neuron is a complex system, whose dynamics is often described by a set of stochastic nonlinear differential equations. Such models aim at reproducing the biophysics of ion channels governing the membrane potential and therefore the spike emission. This is the case of the classical model of Hodgkin and Huxley [14] and of its reductions [15–17]. Simpler models use discontinuous processes mimicking the spike emission by modeling the membrane voltage and considering that spikes are emitted when it reaches a given threshold. These are called integrate-and-fire models [18, 19] and will not be addressed here. The models of large networks we deal with here therefore consist of systems of coupled nonlinear diffusion processes.
2.1 Hodgkin-Huxley model
One of the most important models in computational neuroscience is the Hodgkin-Huxley model. Using pioneering experimental techniques of that time, Hodgkin and Huxley [14] determined that the activity of the squid giant axon is controlled by three major currents: voltage-gated persistent ${\mathrm{K}}^{+}$ current with four activation gates, voltage-gated transient ${\mathrm{Na}}^{+}$ current with three activation gates and one inactivation gate, and Ohmic leak current, ${I}_{\mathrm{L}}$, which is carried mostly by chloride ions (${\mathrm{Cl}}^{-}$). In this paper, we only use the space-clamped Hodgkin-Huxley model which we slightly generalize to a stochastic setting in order to better take into account the variability of the parameters. The advantages of this model are numerous, and one of the most prominent aspects in its favor is its correspondence with the most widely accepted formalism to describe the dynamics of the nerve cell membrane. A very extensive literature can also be found about the mathematical properties of this system, and it is now quite well understood.
This is a stochastic version of the Hodgkin-Huxley model. The functions ${\rho}_{x}$ and ${\zeta}_{x}$ are bounded and Lipschitz continuous (see discussion above). The functions n, m and h are bounded between 0 and 1; hence, the functions ${n}^{4}$ and ${m}^{3}h$ are Lipschitz continuous.
with $\mathrm{\Gamma}=0.1$ and $\mathrm{\Lambda}=0.5$ for all the ion channels. The system of SDEs has been integrated using the Euler-Maruyama scheme with $\mathrm{\Delta}t=0.01$.
Because the Hodgkin-Huxley model is rather complicated and high-dimensional, many reductions have been proposed, in particular to two dimensions instead of four. These reduced models include the famous FitzHugh-Nagumo and Morris-Lecar models. These two models are two-dimensional approximations of the original Hodgkin-Huxley model based on quantitative observations of the time scale of the dynamics of each variable and identification of variables. Most reduced models still comply with the Lipschitz and linear growth conditions ensuring the existence and uniqueness of a solution, except for the FitzHugh-Nagumo model which we now introduce.
2.2 The FitzHugh-Nagumo model
Note that because the function $f(V)$ is not globally Lipschitz continuous (only locally), the well-posedness of the stochastic differential equation (Equation 5) does not follow immediately from the standard theorem which assumes the global Lipschitz continuity of the drift and diffusion coefficients. This question is settled below by Proposition 1.
The deterministic model has been solved with a Runge-Kutta method of order 4, while the stochastic model has been integrated with the Euler-Maruyama scheme. In both cases, we have used an integration time step $\mathrm{\Delta}t=0.01$.
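The two integration schemes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the drift is taken to be the usual FitzHugh-Nagumo form with $f(V)=V-{V}^{3}/3$ and an adaptation equation $c(V+a-bw)$, noise enters the voltage equation only, and the parameter values, initial conditions and function names are hypothetical rather than taken from the text. Only the time step $\mathrm{\Delta}t=0.01$ comes from the paragraph above.

```python
import math
import random


def fhn_drift(v, w, a=0.7, b=0.8, c=0.08, i_ext=0.4):
    """Assumed FitzHugh-Nagumo drift with f(V) = V - V**3 / 3 (hypothetical parameters)."""
    dv = v - v ** 3 / 3.0 - w + i_ext
    dw = c * (v + a - b * w)
    return dv, dw


def rk4_step(v, w, dt):
    """One classical fourth-order Runge-Kutta step for the deterministic model."""
    k1v, k1w = fhn_drift(v, w)
    k2v, k2w = fhn_drift(v + 0.5 * dt * k1v, w + 0.5 * dt * k1w)
    k3v, k3w = fhn_drift(v + 0.5 * dt * k2v, w + 0.5 * dt * k2w)
    k4v, k4w = fhn_drift(v + dt * k3v, w + dt * k3w)
    v_new = v + dt * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
    w_new = w + dt * (k1w + 2.0 * k2w + 2.0 * k3w + k4w) / 6.0
    return v_new, w_new


def euler_maruyama_step(v, w, dt, sigma_ext, rng):
    """One Euler-Maruyama step; the Brownian increment has variance dt."""
    dv, dw = fhn_drift(v, w)
    noise = sigma_ext * math.sqrt(dt) * rng.gauss(0.0, 1.0)
    return v + dt * dv + noise, w + dt * dw


def integrate_deterministic(n_steps, dt=0.01, v=0.0, w=0.5):
    """Integrate the noiseless model with RK4 and return the final (V, w)."""
    for _ in range(n_steps):
        v, w = rk4_step(v, w, dt)
    return v, w


def integrate_stochastic(n_steps, dt=0.01, sigma_ext=0.1, v=0.0, w=0.5, seed=0):
    """Integrate the noisy model with Euler-Maruyama and return the final (V, w)."""
    rng = random.Random(seed)
    for _ in range(n_steps):
        v, w = euler_maruyama_step(v, w, dt, sigma_ext, rng)
    return v, w


if __name__ == "__main__":
    print("RK4, t = 50:", integrate_deterministic(5000))
    print("Euler-Maruyama, t = 50:", integrate_stochastic(5000))
```

With `sigma_ext=0.0` the Euler-Maruyama step reduces to the explicit Euler scheme, which gives a quick consistency check against the Runge-Kutta trajectory over short horizons.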
2.3 Partial conclusion
We have reviewed two main models of space-clamped single neurons: the Hodgkin-Huxley and FitzHugh-Nagumo models. These models are stochastic, including various sources of noise: external and internal. The noise sources are supposed to be independent Brownian processes. We have shown that the resulting stochastic differential Equations 2 and 5 are well-posed. As pointed out above, this analysis extends to a large number of reduced versions of the Hodgkin-Huxley model such as those that can be found in the book [17].
2.4 Models of synapses and maximum conductances
We now study the situation in which several of these neurons are connected to one another forming a network, which we will assume to be fully connected. Let N be the total number of neurons. These neurons belong to P populations, e.g. pyramidal cells or interneurons. If the index of a neuron is i, $1\le i\le N$, we denote by $p(i)=\alpha $, $1\le \alpha \le P$, the population it belongs to. We denote by ${N}_{p(i)}$ the number of neurons in population $p(i)$. Since we want to be as close to biology as possible while keeping the possibility of a mathematical analysis of the resulting model, we consider two types of simplified, but realistic, synapses: chemical synapses and electrical synapses (gap junctions). The following material concerning synapses is standard and can be found in textbooks [20]. The new, and we think important, twist is to add noise to our models. To unify notations, in what follows, i is the index of a postsynaptic neuron belonging to population $\alpha =p(i)$, and j is the index of a neuron presynaptic to neuron i and belonging to population $\gamma =p(j)$.
2.4.1 Chemical synapses
Destexhe et al. [23] give some typical values of the parameters: ${T}_{\mathrm{max}}=1\,\text{mM}$, ${V}_{T}=2\,\text{mV}$ and $1/\lambda =5\,\text{mV}$.
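To give a feel for these values, the following sketch evaluates a sigmoidal transmitter-concentration curve. The logistic form ${T}_{\mathrm{max}}/(1+{e}^{-\lambda (V-{V}_{T})})$ is an assumption made here for illustration (it is the form commonly associated with this parameterization), and the function and argument names are hypothetical; only the three parameter values come from the sentence above.

```python
import math


def transmitter_concentration(v_pre_mv, t_max_mm=1.0, v_t_mv=2.0, inv_lambda_mv=5.0):
    """Assumed logistic release curve: concentration (mM) versus presynaptic voltage (mV)."""
    return t_max_mm / (1.0 + math.exp(-(v_pre_mv - v_t_mv) / inv_lambda_mv))


if __name__ == "__main__":
    # Sample the curve from a resting-like voltage to a spike-peak-like voltage.
    for v_pre in (-70.0, -20.0, 2.0, 20.0, 40.0):
        print(f"V_pre = {v_pre:6.1f} mV -> [T] = {transmitter_concentration(v_pre):.6f} mM")
```

Under this assumed form, the concentration is negligible near rest, equals half of ${T}_{\mathrm{max}}$ at $V={V}_{T}$, and saturates near ${T}_{\mathrm{max}}$ at the peak of an action potential.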
Remember that the form of the diffusion term guarantees that the solutions to this equation with appropriate initial conditions stay between 0 and 1. The Brownian motions ${W}^{j,y}$ are assumed to be independent from one neuron to the next.
2.4.2 Electrical synapses
The electrical synapse transmission is rapid and stereotyped and is mainly used to send simple depolarizing signals for systems requiring the fastest possible response. At the location of an electrical synapse, the separation between two neurons is very small (≈3.5 nm). This narrow gap is bridged by the gap junction channels, specialized protein structures that conduct the flow of ionic current from the presynaptic to the postsynaptic cell (see, e.g. [24]).
Electrical synapses thus work by allowing ionic current to flow passively through the gap junction pores from one neuron to another. The usual source of this current is the potential difference generated locally by the action potential. Without the need for receptors to recognize chemical messengers, signaling at electrical synapses is more rapid than that which occurs across chemical synapses, the predominant kind of junctions between neurons. The relative speed of electrical synapses also allows for many neurons to fire synchronously.
where ${J}_{ij}(t)$ is the maximum conductance.
2.4.3 The maximum conductances
As shown in Equations 6, 7 and 10, we model the current going through the synapse connecting neuron j to neuron i as being proportional to the maximum conductance ${J}_{ij}$. Because the synaptic transmission through a synapse is affected by the nature of the environment, the maximum conductances are affected by dynamical random variations (we do not take into account such phenomena as plasticity). What kind of models can we consider for these random variations?
where the ${\xi}^{i,\gamma}(t)$, $i=1,\dots ,N$, $\gamma =1,\dots ,P$, are $NP$ independent zero-mean unit-variance white noise processes derived from $NP$ independent standard Brownian motions ${B}^{i,\gamma}(t)$, i.e. ${\xi}^{i,\gamma}(t)=\frac{d{B}^{i,\gamma}(t)}{dt}$, which we also assume to be independent of all the previously defined Brownian motions. The main advantage of this dynamics is its simplicity. Its main disadvantage is that if we increase the noise level ${\sigma}_{\alpha \gamma}$, the probability that ${J}_{ij}(t)$ becomes negative also increases: this would result in a negative conductance!
This shows that if the initial condition ${J}_{ij}(0)$ is equal to the mean $\frac{{\overline{J}}_{\alpha \gamma}}{{N}_{\gamma}}$, the mean of the process is constant over time and equal to $\frac{{\overline{J}}_{\alpha \gamma}}{{N}_{\gamma}}$. Otherwise, if the initial condition ${J}_{ij}(0)$ is of the same sign as ${\overline{J}}_{\alpha \gamma}$, i.e. positive, then the long term mean is $\frac{{\overline{J}}_{\alpha \gamma}}{{N}_{\gamma}}$ and the process is guaranteed not to touch 0 if the condition $2{N}_{\gamma}{\theta}_{\alpha \gamma}{\overline{J}}_{\alpha \gamma}\ge {({\sigma}_{\alpha \gamma}^{J})}^{2}$ holds [25]. Note that the long term variance is $\frac{{\overline{J}}_{\alpha \gamma}{({\sigma}_{\alpha \gamma}^{J})}^{2}}{2{N}_{\gamma}^{3}{\theta}_{\alpha \gamma}}$.
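The three quantities quoted in this paragraph (long-term mean, long-term variance and the positivity condition) are easy to evaluate for a given parameter set. The sketch below simply transcribes those formulas; the function name, argument names and the numerical values in the example are hypothetical.

```python
def conductance_statistics(j_bar, sigma_j, theta, n_gamma):
    """Evaluate the formulas stated in the text for the maximum conductance J_ij.

    Returns (long_term_mean, long_term_variance, stays_positive), where
    stays_positive is True when 2 * N_gamma * theta * J_bar >= sigma_J**2.
    """
    long_term_mean = j_bar / n_gamma
    long_term_variance = j_bar * sigma_j ** 2 / (2.0 * n_gamma ** 3 * theta)
    stays_positive = 2.0 * n_gamma * theta * j_bar >= sigma_j ** 2
    return long_term_mean, long_term_variance, stays_positive


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only to exercise both branches.
    print(conductance_statistics(j_bar=1.0, sigma_j=0.2, theta=1.0, n_gamma=10))
    print(conductance_statistics(j_bar=1.0, sigma_j=10.0, theta=1.0, n_gamma=10))
```

Note that the positivity condition becomes easier to satisfy as ${N}_{\gamma}$ grows, while the long-term variance decays like ${N}_{\gamma}^{-3}$.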
2.5 Putting everything together
We only cover the case of chemical synapses and leave it to the reader to derive the equations in the simpler case of gap junctions.
2.5.1 Network of FitzHugh-Nagumo neurons
We assume that the parameters ${a}_{i}$, ${b}_{i}$ and ${c}_{i}$ in Equation 5 of the adaptation variable ${w}^{i}$ of neuron i are only functions of the population $\alpha =p(i)$.
which is a set of $N(P+3)$ stochastic differential equations.
2.5.2 Network of Hodgkin-Huxley neurons
We provide a similar description in the case of the Hodgkin-Huxley neurons. We assume that the functions ${\rho}_{x}^{i}$ and ${\zeta}_{x}^{i}$, $x\in \{n,m,h\}$, that appear in Equation 2 only depend upon $\alpha =p(i)$.
2.5.3 Partial conclusion
Equations 14 to 17 have a quite similar structure. They are well-posed, i.e. given any initial condition, and any time $T>0$, they have a unique solution on $[0,T]$ which is square-integrable. Some care has to be taken when choosing these initial conditions for some of the variables, i.e. n, m and h, which take values between 0 and 1, and the maximum conductances when one wants to preserve their signs.
We let the reader apply the same machinery to the network of Hodgkin-Huxley neurons.
Let us denote by d the positive integer equal to the dimension of the state space in Equation 18 ($d=3$) or 19 ($d=3+P$) or in the corresponding cases for the Hodgkin-Huxley model ($d=5$ and $d=5+P$). The reader will easily check that the following four assumptions hold for both models:
These assumptions are central to the proofs of Theorems 2 and 4.
They imply the following proposition stating that the system of stochastic differential equations (Equation 19) is well-posed:
Proposition 1 Let $T>0$ be a fixed time. If ${I}^{\alpha}(t)\le {I}_{m}$ on $[0,T]$, for $\alpha =1,\dots ,P$, Equations 18 and 19 together with an initial condition ${X}_{0}^{i}\in {\mathbb{L}}^{2}({\mathbb{R}}^{d})$, $i=1,\dots ,N$ of square-integrable random variables, have a unique strong solution which belongs to ${\mathrm{L}}^{2}([0,T];{\mathbb{R}}^{dN})$.
Proof The proof uses Theorem 3.5 in chapter 2 in [26] whose conditions are easily shown to follow from hypotheses (H1) to (H2). □
The case $N=1$ implies that Equations 2 and 5, describing the stochastic Hodgkin-Huxley and FitzHugh-Nagumo neurons, are well-posed.
We are interested in the behavior of the solutions of these equations as the number of neurons tends to infinity. This problem has been long-standing in neuroscience, arousing the interest of many researchers in different domains. We discuss the different approaches developed in the field in the next subsection.
2.6 Mean-field methods in computational neuroscience: a quick overview
Obtaining the equations of evolution of the effective mean field from microscopic dynamics is a very complex problem. Many approximate solutions have been provided, mostly based on the statistical physics literature.
Many models describing the emergent behavior arising from the interaction of neurons in large-scale networks have relied on continuum limits ever since the seminal work of Amari, and Wilson and Cowan [27–30]. Such models represent the activity of the network by macroscopic variables, e.g. the population-averaged firing rate, which are generally assumed to be deterministic. When the spatial dimension is not taken into account in the equations, they are referred to as neural masses, otherwise as neural fields. The equations that relate these variables are ordinary differential equations for neural masses and integro-differential equations for neural fields. In the second case, they fall in a category studied in [31] or can be seen as ordinary differential equations defined on specific functional spaces [32]. Many analytical and numerical results have been derived from these equations and related to cortical phenomena, for instance, for the problem of spatiotemporal pattern formation in spatially extended models (see, e.g. [33–36]). The use of bifurcation theory has also proven to be quite powerful [37, 38]. Despite all its qualities, this approach implicitly makes the assumption that the effect of noise vanishes at the mesoscopic and macroscopic scales and hence that the behavior of such populations of neurons is deterministic.
A different approach has been to study regimes where the activity is uncorrelated. A number of computational studies on the integrate-and-fire neuron showed that under certain conditions, neurons in large assemblies end up firing asynchronously, producing null correlations [39–41]. In these regimes, the correlations in the firing activity decrease towards zero in the limit where the number of neurons tends to infinity. The emergent global activity of the population in this limit is deterministic and evolves according to a mean-field firing rate equation. However, according to the theory, these states only exist in the limit where the number of neurons is infinite, thereby raising the question of how the finiteness of the number of neurons impacts the existence and behavior of asynchronous states. The study of finite-size effects for asynchronous states is generally not reduced to the study of mean firing rates and can include higher-order moments of firing activity [42–44]. In order to go beyond asynchronous states and take into account the stochastic nature of the firing and understand how this activity scales as the network size increases, different approaches have been developed, such as the population density method and related approaches [45]. Most of these approaches involve expansions in terms of the moments of the corresponding random variables, and the moment hierarchy needs to be truncated, which is not a simple task and can raise a number of difficult technical issues (see, e.g. [46]).
However, increasingly many researchers now believe that the different intrinsic or extrinsic noise sources are part of the neuronal signal, and rather than being a pure disturbing effect related to the intrinsically noisy biological substrate of the neural system, they suggest that noise conveys information that can be an important principle of brain function [47]. At the level of a single cell, various studies have shown that the firing statistics are highly stochastic with probability distributions close to Poisson distributions [48], and several computational studies confirmed the stochastic nature of single-cell firings [49–51]. How the variability at the single-neuron level affects the dynamics of cortical networks is less well established. Theoretically, the interaction of a large number of neurons that fire stochastic spike trains can naturally produce correlations in the firing activity of the population. For instance, power laws in the scaling of avalanche-size distributions have been studied both via models and experiments [52–55]. In these regimes, the randomness plays a central role.
In order to study the effect of the stochastic nature of the firing in large networks, many authors strived to introduce randomness in a tractable form. Some of the models proposed in the area are based on the definition of a Markov chain governing the firing dynamics of the neurons in the network, where the transition probability satisfies a differential equation, the master equation. Seminal works of the application of such modeling for neuroscience date back to the early 1990s and have been recently developed by several authors [43, 56–59]. Most of these approaches are proved correct in some parameter regions using statistical physics tools such as path integrals and Van Kampen expansions, and their analysis often involves a moment expansion and truncation. Using a different approach, a static mean-field study of multi-population network activity was developed by Treves in [60]. This author did not consider external inputs but incorporated dynamical synaptic currents and adaptation effects. His analysis was completed in [39], where the authors proved, using a Fokker-Planck formalism, the stability of an asynchronous state in the network. Later on, Gerstner in [61] built a new approach to characterize the mean-field dynamics for the spike response model, via the introduction of suitable kernels propagating the collective activity of a neural population in time. Another approach is based on the use of large deviation techniques to study large networks of neurons [62]. This approach is inspired by the work on spin-glass dynamics, e.g. [63]. It takes into account the randomness of the maximum conductances and the noise at various levels. The individual neuron models are rate models, hence already mean-field models. The mean-field equations are not rigorously derived from the network equations in the limit of an infinite number of neurons, but they are shown to have a unique, non-Markov solution, i.e. with infinite memory, for each initial condition.
Brunel and Hakim considered a network of integrate-and-fire neurons connected with constant maximum conductances [41]. In the case of sparse connectivity, stationarity, and in a regime where individual neurons emit spikes at a low rate, they were able to analytically study the dynamics of the network and to show that it exhibits a sharp transition between a stationary regime and a regime of fast, weakly synchronized collective oscillations. Their approach was based on a perturbative analysis of the Fokker-Planck equation. A similar formalism was used in [44] which, when complemented with self-consistency equations, resulted in the dynamical description of the mean-field equations of the network and was extended to a multi-population network. Finally, Chizhov and Graham [64] have recently proposed a new method based on a population density approach that makes it possible to characterize the mesoscopic behavior of neuron populations in conductance-based models.
Let us finish this very short and incomplete survey by mentioning the work of Sompolinsky and colleagues. Assuming a linear intrinsic dynamics for the individual neurons described by a rate model and random centered maximum conductances for the connections, they showed [65, 66] that the system undergoes a phase transition between two different stationary regimes: a ‘trivial’ regime where the system has a unique null and uncorrelated solution, and a ‘chaotic’ regime in which the firing rate converges towards a nonzero value and correlations stabilize on a specific curve which they were able to characterize.
All these approaches have in common that they are not based on the most widely accepted microscopic dynamics (such as the ones represented by the Hodgkin-Huxley equations or some of their simplifications) and/or involve approximations or moment closures. Our approach is distinct in that it aims at deriving rigorously and without approximations the mean-field equations of populations of neurons whose individual neurons are described by biologically plausible, if not exactly correct, representations. The price to pay is the complexity of the resulting mean-field equations. The specific study of their solutions is therefore a crucial step, which will be developed in forthcoming papers.
3 Mean-field equations for conductance-based models
In this section, we give a general formulation of the neural network models introduced in the previous section and use it in a probabilistic framework to address the problem of the asymptotic behavior of the networks, as the number of neurons N goes to infinity. In other words, we derive the limit in law of $N$ interacting neurons, each of which satisfies a nonlinear stochastic differential equation of the type described in the ‘Spiking conductance-based models’ section. In the remainder of this section, we work in a complete probability space $(\mathrm{\Omega},\mathcal{F},\mathbb{P})$ endowed with a filtration ${({\mathcal{F}}_{t})}_{t}$ satisfying the usual conditions.
3.1 Setting of the problem
We recall that the neurons in the network fall into $P$ different populations. The populations differ through the intrinsic properties of their neurons and the input they receive. We assume that the number of neurons in each population $\alpha \in \{1,\dots ,P\}$, denoted by ${N}_{\alpha}$, increases as the network size increases and moreover that the asymptotic proportion of neurons in population α is nontrivial, i.e. ${N}_{\alpha}/N\to {\lambda}_{\alpha}\in (0,1)$ as N goes to infinity.^{2}
We use the notations introduced in the ‘Partial conclusion’ section, and the reader should refer to this section to give a concrete meaning to the rather abstract (but required by the mathematics) setting that we now establish.
Moreover, we assume, as is the case for all the models described in the ‘Spiking conductance-based models’ section, that the solutions of this stochastic differential equation exist for all time.
When included in the network, these processes interact with those of all the other neurons through a set of continuous functions that only depend on the population $\alpha =p(i)$ the neuron i belongs to and on the populations γ of the presynaptic neurons. These functions, ${b}_{\alpha \gamma}(x,y):{\mathbb{R}}^{d}\times {\mathbb{R}}^{d}\to {\mathbb{R}}^{d}$, are scaled by the coefficients $1/{N}_{\gamma}$, so the maximal interaction is independent of the size of the network (in particular, neither diverging nor vanishing as N goes to infinity).
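The effect of the $1/{N}_{\gamma}$ scaling can be illustrated numerically. The sketch below is a toy illustration only: the pairwise function `interaction` is a hypothetical stand-in for ${b}_{\alpha \gamma}$ (linear in the presynaptic state), and the presynaptic states are drawn i.i.d. uniformly on $[0,1]$, which is an assumption made for convenience and not the law of the actual network variables.

```python
import random


def interaction(x_post, y_pre, v_rev=1.0):
    """Hypothetical pairwise interaction b(x, y); not taken from the paper."""
    return -(x_post - v_rev) * y_pre


def averaged_input(x_post, pre_states):
    """Empirical mean (1 / N_gamma) * sum_j b(x_post, y_j) over presynaptic states."""
    return sum(interaction(x_post, y) for y in pre_states) / len(pre_states)


def sample_presynaptic(n_gamma, rng):
    """Draw n_gamma i.i.d. presynaptic states, here uniform on [0, 1] (toy assumption)."""
    return [rng.random() for _ in range(n_gamma)]


if __name__ == "__main__":
    rng = random.Random(0)
    x_post = -0.5
    # b is linear in y, so the expected interaction is b(x, E[y]) with E[y] = 0.5.
    limit = interaction(x_post, 0.5)
    for n_gamma in (10, 1000, 100000):
        value = averaged_input(x_post, sample_presynaptic(n_gamma, rng))
        print(n_gamma, value, abs(value - limit))
```

In this toy setting the averaged input stays of order one for every population size and, by the law of large numbers, approaches the expectation of the interaction with respect to the presynaptic law as ${N}_{\gamma}$ grows.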
As discussed in the ‘Spiking conductance-based models’ section, due to the stochastic nature of ionic currents and the noise effects linked with the discrete nature of charge carriers, the maximum conductances are perturbed dynamically through the $N\times P$ independent Brownian motions ${B}_{t}^{i,\alpha}$ of dimension δ that were previously introduced. The interaction between the neurons and the noise term is represented by the function ${\beta}_{\alpha \gamma}:{\mathbb{R}}^{d}\times {\mathbb{R}}^{d}\to {\mathbb{R}}^{d\times \delta}$.
In order to introduce the stochastic current and stochastic maximum conductances, we define two independent sequences of independent $m$- and $\delta$-dimensional Brownian motions, denoted by ${({W}_{t}^{i})}_{i\in \mathbb{N}}$ and ${({B}_{t}^{i\alpha})}_{i\in \mathbb{N},\alpha \in \{1,\dots ,P\}}$, which are adapted to the filtration ${\mathcal{F}}_{t}$.
Note that this implies that ${X}^{i,N}$ and ${X}^{j,N}$ have the same law whenever $p(i)=p(j)$, given identically distributed initial conditions.
These equations are similar to the equations studied in another context by a number of mathematicians, among which are McKean, Tanaka and Sznitman (see the ‘Introduction’ section), in that they involve a very large number of particles (here, particles are neurons) in interaction. Motivated by the study of the McKean-Vlasov equations, these authors studied special cases of Equation 21. This theory, referred to as the kinetic theory, is chiefly interested in the study of thermodynamic questions. They show the property that in the limit where the number of particles tends to infinity, provided that the initial state of each particle is drawn independently from the same law, each particle behaves independently and has the same law, which is given by an implicit stochastic equation. They also evaluate the fluctuations around this limit under diverse conditions [1, 2, 6, 7, 9–11]. Some extensions to biological problems where the drift term is not globally Lipschitz but satisfies the monotone growth condition (Equation 20) were studied in [67]. This is the approach we undertake here.
3.2 Convergence of the network equations to the mean-field equations and properties of those equations
The P equations (Equation 24) yield the probability densities of the solutions ${\overline{X}}_{t}^{\alpha}$ of the mean-field equations (Equation 22). Because of the propagation of chaos result, the ${\overline{X}}_{t}^{\alpha}$ are statistically independent, but their probability densities are clearly functionally dependent.
Equations 22 and 24 are implicit equations on the law of ${\overline{X}}_{t}$.
We now state the main theoretical results of the paper as two theorems. The first theorem is about the well-posedness of the mean-field equation (Equation 22). The second is about the convergence of the solutions of the network equations to those of the mean-field equations. Since the proof of the second theorem involves ideas similar to those used in the proof of the first, it is given in the Appendix.
Theorem 2 Under assumptions (H1) to (H4), there exists a unique solution to the mean-field equation (Equation 22) on $[0,T]$ for any $T>0$.
In the previous formula, we have introduced the process ${Z}_{t}$, which has the same law as ${X}_{t}$ and is independent of it. There is a trivial identification between the solutions of the mean-field equation (Equation 22) and the fixed points of the map Φ: any fixed point of Φ provides a solution of Equation 22, and conversely, any solution of Equation 22 is a fixed point of Φ.
The following lemma is useful to prove the theorem:
where ${N}_{t}$ is a stochastic integral, hence with a null expectation, $\mathbb{E}[{N}_{t}]=0$.
It also involves the term ${x}^{T}f(t,x)+\frac{1}{2}{\parallel g(t,x)\parallel}^{2}$ which, because of assumption (H4), is upper-bounded by $K(1+{\parallel x\parallel}^{2})$. Finally, assumption (H3) again allows us to upper-bound the term $\frac{1}{2}{\parallel {\mathbb{E}}_{Z}[\beta ({X}_{s},{Z}_{s})]\parallel}^{2}$ by $\frac{\tilde{K}}{2}(1+{\parallel {X}_{s}\parallel}^{2})$.
Using Gronwall’s inequality, we deduce the ${\mathbb{L}}^{2}$ boundedness of the solutions of the mean-field equations. □
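As a sketch of how these bounds combine (the constant $C$ and the function $u$ are our own notation, introduced only for illustration; $C$ stands for a constant built from $K$ and $\tilde{K}$), taking expectations removes ${N}_{t}$ and leaves an integral inequality of the form

$$u(t)\le u(0)+C{\int}_{0}^{t}u(s)\,ds,\qquad u(t):=1+\mathbb{E}[{\parallel {X}_{t}\parallel}^{2}].$$

Provided $u$ is finite and bounded on $[0,T]$ (which is typically ensured by a stopping-time argument), Gronwall’s inequality gives $u(t)\le u(0){e}^{Ct}$, that is,

$$\mathbb{E}[{\parallel {X}_{t}\parallel}^{2}]\le (1+\mathbb{E}[{\parallel {X}_{0}\parallel}^{2}])\,{e}^{CT}-1\quad \text{for all } t\in [0,T].$$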
This lemma puts us in a position to prove the existence and uniqueness theorem:
Proof We start by showing the existence of solutions and then prove the uniqueness property. We recall that, by Lemma 3, all solutions have a bounded second-order moment.
Existence. Let ${X}^{0}=({X}_{t}^{0}=\{{X}_{t}^{0\alpha},\alpha =1,\dots ,P\})\in \mathcal{M}(\mathcal{C})$ be a given stochastic process, and define the sequence of probability distributions ${({X}^{k})}_{k\ge 0}$ on $\mathcal{M}(\mathcal{C})$ by induction by ${X}^{k+1}=\mathrm{\Phi}({X}^{k})$. Define also a sequence of processes ${Z}^{k}$, $k\ge 0$, independent of the sequence of processes ${X}^{k}$ and having the same law. We note this as ‘X and Z i.i.d.’ below. We stop the processes at the time ${\tau}_{U}^{k}$, the first time at which the norm of ${X}^{k}$ hits the constant value U. For convenience, we abuse notation in the proof and write ${X}_{t}^{k}={X}_{t\wedge {\tau}_{U}^{k}}^{k}$. This implies that ${X}_{t}^{k}$ belongs to ${B}_{U}^{d}$, the ball of radius U centered at the origin in ${\mathbb{R}}^{d}$, for all times $t\in [0,T]$.
and treat each term separately. The upper bounds for the first two terms are obtained using the Cauchy-Schwarz inequality, those of the last two terms using the Burkholder-Davis-Gundy martingale moment inequality.
are uniformly (in $t\in [0,T]$) convergent. Denote the limit thus defined by ${\overline{X}}_{t}$. It is clearly continuous and ${\mathcal{F}}_{t}$-adapted. On the other hand, the inequality (Equation 27) shows that for every fixed t, the sequence ${\{{X}_{t}^{n}\}}_{n\ge 1}$ is a Cauchy sequence in ${\mathbb{L}}^{2}$. Lemma 3 shows that $\overline{X}\in {\mathcal{M}}^{2}(\mathcal{C})$.
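The mechanism behind this type of convergence is the classical Picard-iteration estimate. As a generic sketch (we do not reproduce Equation 27 here; the quantity ${M}_{t}^{k}$ and the constant ${K}^{\prime}$, which would depend on T and on the truncation level U, are our own notation), suppose that for all $k\ge 1$ and $t\in [0,T]$,

$${M}_{t}^{k}:=\mathbb{E}\Big[\sup_{s\le t}{\parallel {X}_{s}^{k+1}-{X}_{s}^{k}\parallel}^{2}\Big]\le {K}^{\prime}{\int}_{0}^{t}{M}_{s}^{k-1}\,ds.$$

Since ${M}_{t}^{0}$ is nondecreasing in t, iterating this bound gives

$${M}_{t}^{k}\le \frac{{({K}^{\prime}t)}^{k}}{k!}\,{M}_{T}^{0},\qquad t\in [0,T],$$

so that ${\sum}_{k}{({M}_{T}^{k})}^{1/2}<\mathrm{\infty}$ and the sequence ${({X}^{k})}_{k\ge 0}$ is Cauchy for the norm $\mathbb{E}{[\sup_{t\le T}{\parallel \cdot \parallel}^{2}]}^{1/2}$.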
It is easy to show using routine methods that $\overline{X}$ indeed satisfies Equation 22.
and letting $U\to \mathrm{\infty}$, we have shown the existence of a solution to Equation 22 which, by Lemma 3, is square-integrable.
which ends the proof. □
We have proved the well-posedness of the mean-field equations. It remains to show that the solutions to the network equations converge to the solutions of the mean-field equations. This is what is achieved in the next theorem.
Theorem 4 Under assumptions (H1) to (H4), the following holds true:

Convergence^{3}: For each neuron i of population α, the law of the multidimensional process ${X}^{i,N}$ converges towards the law of the solution of the mean-field equation related to population α, namely ${\overline{X}}^{\alpha}$.

Propagation of chaos: For any $k\in {\mathbb{N}}^{\ast}$, and any k-tuple $({i}_{1},\dots ,{i}_{k})$, the law of the process $({X}_{t}^{{i}_{1},N},\dots ,{X}_{t}^{{i}_{k},N},t\le T)$ converges towards^{4} ${m}_{t}^{p({i}_{1})}\otimes \cdots \otimes {m}_{t}^{p({i}_{k})}$, i.e. the asymptotic processes have the law of the solution of the mean-field equations and are all independent.
This theorem has important implications in neuroscience that we discuss in the ‘Discussion and conclusion’ section. Its proof is given in the Appendix.
4 Numerical simulations
At this point, we have provided a compact description of the activity of the network when the number of neurons tends to infinity. However, the structure of the solutions of these equations is difficult to understand from the implicit mean-field equations (Equation 22) and their variants (such as the McKean-Vlasov-Fokker-Planck equations (Equation 24)). In this section, we present some classical ways to numerically approximate the solutions to these equations and give some indications about the rate of convergence and the accuracy of the simulation. These numerical schemes allow us to compute and visualize the solutions. We then compare the results of the two schemes for a network of FitzHugh-Nagumo neurons belonging to a single population and show their good agreement.
The main difficulty one faces when developing numerical schemes for Equations 22 and 24 is that they are nonlocal. By this, we mean that in the case of the McKean-Vlasov equations, they contain the expectation of a certain function under the law of the solution to the equations (see Equation 22). In the case of the corresponding Fokker-Planck equation, it contains integrals of the probability density function which is itself the solution to the equation (see Equation 24).
4.1 Numerical simulations of the McKean-Vlasov equations
The fact that the McKean-Vlasov equations involve an expectation of a certain function under the law of the solution of the equation makes them particularly hard to simulate directly. One is often reduced to using Monte Carlo simulations to compute this expectation, which amounts to simulating the solution of the network equations themselves (see [68]). This is the method we used. In its simplest form, it consists of a Monte Carlo simulation where one numerically solves the N network equations (Equation 21) with the classical Euler-Maruyama method a number of times with different initial conditions, and averages the trajectories of the solutions over the number of simulations.
where ${\xi}_{n}^{i,r}$ and ${\zeta}_{n}^{i\gamma ,r}$ are independent d- and δ-dimensional standard normal random variables. The initial conditions ${\tilde{X}}_{1}^{i,r}$, $i=1,\dots ,N$, are drawn independently from the same law within each population for each Monte Carlo simulation $r=1,\dots ,M$. One then chooses one neuron ${i}_{\alpha}$ in each population $\alpha =1,\dots ,P$. If the size N of the population is large enough, Theorem 4 states that the law of ${X}^{{i}_{\alpha}}$, denoted by ${p}_{\alpha}(t,X)$, should be close to that of the solution ${\overline{X}}^{\alpha}$ of the mean-field equations for $\alpha =1,\dots ,P$. Hence, in effect, simulating the network is a good approximation (see below) of the simulation of the mean-field or McKean-Vlasov equations [68, 69]. An approximation of ${p}_{\alpha}(t,X)$ can be obtained from the Monte Carlo simulations by quantizing the phase space and incrementing the count of each bin whenever the trajectory of the ${i}_{\alpha}$ neuron at time t falls into that particular bin. The resulting histogram can then be compared to the solution of the McKean-Vlasov-Fokker-Planck equation (Equation 24) corresponding to population α, whose numerical solution is described next.
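The Monte Carlo procedure described above can be sketched as follows. This is a minimal illustration under our own simplifying assumptions, not the recursion of Equation 30: a single population ($P=1$), no synaptic variable y, the standard FitzHugh-Nagumo drift, a deterministic all-to-all coupling `J * (V_i - mean(V))` standing in for the synaptic term, Gaussian initial conditions and additive noise of amplitude `sigma_ext` on V only. The function names `simulate_fhn_network` and `histogram_density` are hypothetical; the default values of a, b, c and I are those of the parameter table.

```python
import numpy as np


def simulate_fhn_network(N=100, M=200, T=1.2, dt=0.01, a=0.7, b=0.8, c=0.08,
                         I=0.4, sigma_ext=0.3, J=1.0, seed=0):
    """Euler-Maruyama integration of M independent runs of an N-neuron network.

    Returns the state (V, w) of neuron 0 at time T for every run, as two
    arrays of shape (M,). The coupling term is an assumption of this sketch.
    """
    rng = np.random.default_rng(seed)
    n_steps = int(round(T / dt))
    # Initial conditions: i.i.d. Gaussian draws, one row per Monte Carlo run.
    V = rng.normal(0.0, 0.4, size=(M, N))
    w = rng.normal(0.5, 0.4, size=(M, N))
    for _ in range(n_steps):
        xi = rng.standard_normal((M, N))            # Brownian increments / sqrt(dt)
        V_mean = V.mean(axis=1, keepdims=True)      # empirical mean within each run
        drift_V = V - V**3 / 3.0 - w + I - J * (V - V_mean)
        drift_w = c * (V + a - b * w)
        # Both right-hand sides use the state at the beginning of the step.
        V, w = V + drift_V * dt + sigma_ext * np.sqrt(dt) * xi, w + drift_w * dt
    return V[:, 0], w[:, 0]


def histogram_density(samples, lo, hi, step):
    """Quantize [lo, hi] into bins of width `step` and return a normalized histogram."""
    n_bins = int(round((hi - lo) / step))
    density, edges = np.histogram(samples, bins=n_bins, range=(lo, hi), density=True)
    return density, edges


# Usage: estimate the marginal density of V for the tagged neuron at time T.
V_T, w_T = simulate_fhn_network()
p_V, V_edges = histogram_density(V_T, -3.0, 3.0, 0.1)
```

Tracking a single tagged neuron across independent runs, rather than pooling all neurons of one run, mirrors the choice of one neuron ${i}_{\alpha}$ per population described in the text.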
The mean square error between the solution of the numerical recursion (Equation 30) ${\tilde{X}}_{n}^{i}$ and the solution of the mean-field equations (Equation 22) ${\overline{X}}_{n\mathrm{\Delta}t}^{i}$ is of order $O(\sqrt{\mathrm{\Delta}t}+1/\sqrt{N})$, the first term being related to the error made by approximating the solution of the network of size N, ${X}_{n\mathrm{\Delta}t}^{i,N}$, by an Euler-Maruyama method, and the second term to the convergence of ${X}_{n\mathrm{\Delta}t}^{i,N}$ towards the solution ${\overline{X}}_{n\mathrm{\Delta}t}^{i}$ of the mean-field equation when considering globally Lipschitz continuous dynamics (see the proof of Theorem 4 in the Appendix). In our case, as shown before, the dynamics is only locally Lipschitz continuous. Finding efficient and provably convergent numerical schemes to approximate the solutions of such stochastic differential equations is an area of active research. There exist proofs that some schemes are divergent [70] or convergent [71] for some types of drift and diffusion coefficients. Since our equations are not included in either case, we conjecture convergence, since we did not observe any divergence, and leave the proof for future work.
4.2 Numerical simulations of the McKean-Vlasov-Fokker-Planck equation
where Δx is the integration step, and $M=({x}_{2}-{x}_{1})/\mathrm{\Delta}x$ is chosen to be an integer multiple of 5.
for the second-order derivatives (see [75]).
4.3 Comparison between the solutions to the network and the mean-field equations
where ${V}_{\mathrm{min}}$, ${V}_{\mathrm{max}}$, ${w}_{\mathrm{min}}$, ${w}_{\mathrm{max}}$, ${y}_{\mathrm{min}}$ and ${y}_{\mathrm{max}}$ define the volume in which we solve the network equations and estimate the histogram defined in the ‘Numerical simulations of the McKean-Vlasov equations’ section, while ΔV, Δw and Δy are the quantization steps in each dimension of the phase space. For the simulation of the McKean-Vlasov-Fokker-Planck equation, instead, we use Dirichlet boundary conditions and assume the probability density and its partial derivatives to be 0 on the boundary and outside the volume.
In general, the total number of coupled ODEs that we have to solve for the McKean-Vlasov-Fokker-Planck equation with the method of lines is the product $P{n}_{V}{n}_{w}{n}_{y}$ (in our case, we chose $P=1$). This can become fairly large if we increase the precision of the phase space discretization. Moreover, increasing the precision of the simulation in the phase space requires, in order to ensure the numerical stability of the method of lines, decreasing the time step Δt used in the RK2 scheme. This can strongly impact the efficiency of the numerical method (see the ‘Numerical simulations with GPUs’ section).
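To give an order of magnitude for the product $P{n}_{V}{n}_{w}{n}_{y}$, the short sketch below evaluates it on a grid. It rests on our own assumptions: that the number of points along each axis is the extent of the axis divided by the quantization step, rounded to the nearest integer, and that the volume is $[-3,3]\times [-2,2]\times [0,1]$ with steps 0.1, 0.1 and 0.06. The helper names `grid_points` and `ode_count` are hypothetical.

```python
def grid_points(lo, hi, step):
    """Number of cells of width `step` covering [lo, hi], rounded to the nearest integer."""
    return round((hi - lo) / step)


def ode_count(P, n_V, n_w, n_y):
    """Total number of coupled ODEs produced by the method of lines."""
    return P * n_V * n_w * n_y


n_V = grid_points(-3.0, 3.0, 0.1)    # 60 points along V
n_w = grid_points(-2.0, 2.0, 0.1)    # 40 points along w
n_y = grid_points(0.0, 1.0, 0.06)    # 17 points along y (16.67 rounded)
print(ode_count(1, n_V, n_w, n_y))   # prints 40800
```

Halving all three steps multiplies this count by roughly eight, which is why the cost grows quickly with the resolution of the phase space.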
Parameters used in the simulations of the neural network and for solving the McKean-Vlasov-Fokker-Planck equation
Initial condition: ${t}_{\mathrm{fin}}=[0.5,1.2,1.5,2.2]$; Δt = 0.01 (mean field), 0.1 (network); ${\overline{V}}_{0}=0.0$; ${\sigma}_{{V}_{0}}=0.4$; ${\overline{w}}_{0}=0.5$; ${\sigma}_{{w}_{0}}=0.4$; ${\overline{y}}_{0}=0.3$; ${\sigma}_{{y}_{0}}=0.05$
Phase space: ${V}_{\mathrm{min}}=-3$; ${V}_{\mathrm{max}}=3$; ΔV = 0.1; ${w}_{\mathrm{min}}=-2$; ${w}_{\mathrm{max}}=2$; Δw = 0.1; ${y}_{\mathrm{min}}=0$; ${y}_{\mathrm{max}}=1$; Δy = 0.06
FitzHugh-Nagumo: a = 0.7; b = 0.8; c = 0.08; I = 0.4; ${\sigma}_{\mathrm{ext}}=0$
Synaptic weights: $\overline{J}=1$; ${\sigma}_{J}=0.2$
Synapse: ${V}_{\mathrm{rev}}=1$; ${a}_{r}=1$; ${a}_{d}=1$; ${T}_{\mathrm{max}}=1$; λ = 0.2; ${V}_{T}=2$; Γ = 0.1; Λ = 0.5
The parameters for the noisy model of maximum conductances of Equation 11 are listed under ‘Synaptic weights’ in the table. For these values of $\overline{J}$ and ${\sigma}_{J}$, the probability that the maximum conductances change sign is very small. Finally, the parameters of the chemical synapses are listed under ‘Synapse’. The parameters Γ and Λ are those of the χ function (Equation 3). The solutions are computed up to the final times ${t}_{\mathrm{fin}}=0.5,1.2,1.5,2.2$ time units, with a time step of $\mathrm{\Delta}t=0.1$ for the network and $\mathrm{\Delta}t=0.01$ for the McKean-Vlasov-Fokker-Planck equation. The rest of the parameters are the typical values for the FitzHugh-Nagumo equations.
The marginals estimated from the trajectories of the network solutions are then compared to those obtained from the numerical solution of the McKean-Vlasov-Fokker-Planck equation (see Figures 4 and 5 right), using the method of lines explained above and starting from the same initial conditions (Equation 31) as the neural network.
Figures 4 and 5 show a qualitative similarity between the marginal probability density functions obtained by simulating the network and those obtained by solving the Fokker-Planck equation corresponding to the mean-field equations. To make this more quantitative, we computed the Kullback-Leibler divergence ${D}_{\mathrm{KL}}({p}_{\mathrm{Network}}\parallel {p}_{\mathrm{MVFP}})$ between the two distributions.
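For two densities sampled on the same grid, a discrete estimate of the Kullback-Leibler divergence can be sketched as below. The function name `kl_divergence` is hypothetical, and two choices are our own assumptions, since the text does not say how they were handled: both inputs are renormalized to sum to one, and empty bins of the second distribution are floored at a small `eps` to keep the logarithm finite.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """Discrete estimate of D_KL(p || q) for two histograms defined on the same bins."""
    if len(p) != len(q):
        raise ValueError("p and q must be defined on the same bins")
    sum_p, sum_q = float(sum(p)), float(sum(q))
    total = 0.0
    for p_i, q_i in zip(p, q):
        p_i, q_i = p_i / sum_p, q_i / sum_q      # normalize to probability masses
        if p_i > 0.0:                            # bins with p_i = 0 contribute 0
            total += p_i * math.log(p_i / max(q_i, eps))
    return total


# Usage: identical histograms give 0, differing histograms give a positive value.
print(kl_divergence([1, 2, 3], [1, 2, 3]))
print(kl_divergence([0.5, 0.5], [0.9, 0.1]))
```

The divergence is not symmetric in its arguments, so the order (network first, McKean-Vlasov-Fokker-Planck second) matters when comparing values.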
4.4 Numerical simulations with GPUs
Unfortunately, the algorithm for solving the McKean-Vlasov-Fokker-Planck equation described in the previous section is computationally very expensive. In fact, when the number of points in the discretized grid of the $(V,w,y)$ phase space is large, i.e. when the discretization steps ΔV, Δw and Δy are small, we also need to keep Δt small enough to guarantee the stability of the algorithm. This implies that the number of equations to be solved is large and, moreover, that they must be solved with a small time step if we want to keep the numerical errors small. This inevitably slows down the simulations. We have dealt with this problem by using more powerful hardware, namely graphical processing units (GPUs).
We have replaced the Runge-Kutta scheme of order 2 used for the simulations shown in the ‘Numerical simulations of the McKean-Vlasov-Fokker-Planck equation’ section with a more accurate Runge-Kutta scheme of order 4. This was done because with the more powerful machine, each computation of the right-hand side of the equation is faster, making it possible to use four calls per time step instead of two in the previous method. Hence, the parallel hardware allowed us to use a more accurate method.
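The difference between two and four right-hand-side calls per time step can be illustrated with generic Runge-Kutta steps. This is a sketch only: the text does not specify which order-2 variant was used, so the midpoint rule is our assumption, and the test equation $dy/dt=-y$ stands in for the much larger system produced by the method of lines. The names `rk2_step` and `rk4_step` are hypothetical.

```python
import math


def rk2_step(f, t, y, dt):
    """One midpoint (order 2) step: two evaluations of the right-hand side f."""
    k1 = f(t, y)
    k2 = f(t + 0.5 * dt, y + 0.5 * dt * k1)
    return y + dt * k2


def rk4_step(f, t, y, dt):
    """One classical order 4 step: four evaluations of the right-hand side f."""
    k1 = f(t, y)
    k2 = f(t + 0.5 * dt, y + 0.5 * dt * k1)
    k3 = f(t + 0.5 * dt, y + 0.5 * dt * k2)
    k4 = f(t + dt, y + dt * k3)
    return y + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0


# Usage: integrate dy/dt = -y from t = 0 to t = 1 and compare with exp(-1).
def decay(t, y):
    return -y

y2 = y4 = 1.0
dt = 0.1
for n in range(10):
    y2 = rk2_step(decay, n * dt, y2, dt)
    y4 = rk4_step(decay, n * dt, y4, dt)
print(abs(y2 - math.exp(-1.0)), abs(y4 - math.exp(-1.0)))
```

Both functions also accept array-valued states, as long as `f` returns an object supporting addition and scalar multiplication.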
Parameters used in the simulations of the McKean-Vlasov-Fokker-Planck equation on GPUs
Initial condition: Δt = 0.0025, 0.0012; ${\overline{V}}_{0}=0.0$; ${\sigma}_{{V}_{0}}=0.2$; ${\overline{w}}_{0}=0.5$; ${\sigma}_{{w}_{0}}=0.2$; ${\overline{y}}_{0}=0.3$; ${\sigma}_{{y}_{0}}=0.05$
Phase space: ${V}_{\mathrm{min}}=-4$; ${V}_{\mathrm{max}}=4$; ΔV = 0.027; ${w}_{\mathrm{min}}=-3$; ${w}_{\mathrm{max}}=3$; Δw = 0.02; ${y}_{\mathrm{min}}=0$; ${y}_{\mathrm{max}}=1$; Δy = 0.003
Stochastic FN neuron: a = 0.7; b = 0.8; c = 0.08; I = 0.4, 0.7; ${\sigma}_{\mathrm{ext}}=0.27,0.45$; ${\sigma}_{w}=0.0007$
Synaptic weights: $\overline{J}=1$; ${\sigma}_{J}=0.01$
The results shown in Figures 8 and 9 and in Additional files 1, 2, 3 and 4 were obtained using two machines, each with seven nVidia Tesla C2050 cards, six 2.66 GHz dual-Xeon X5650 processors and 72 GB of RAM. The communication inside each machine was done using the pthreads library and between machines using MPI calls. The mean execution time per time step using the parameters already described is 0.05 s.
The reader interested in more details on the numerical implementation and on the gains that can be achieved by the use of GPUs can consult [77].
5 Discussion and conclusion
In this article, we addressed the problem of the limit in law of networks of biologically inspired neurons as the number of neurons tends to infinity. We emphasized the necessity of dealing with biologically inspired models and discussed at length the type of models relevant to this study. We chose to address the case of conductance-based network models, which are a relevant description of neuronal activity. Mathematical results on the analysis of these interacting diffusion processes led to the replacement of a set of NP d-dimensional coupled equations (the network equations), in the limit of large N, by P d-dimensional mean-field equations describing the global behavior of the network. However, the price to pay for this reduction was the fact that the resulting mean-field equations are nonstandard stochastic differential equations, similar to the McKean-Vlasov equations. These can be expressed either as implicit equations on the law of the solution or, in terms of the probability density function through the McKean-Vlasov-Fokker-Planck equations, as a nonlinear, nonlocal partial differential equation. These equations are, in general, hard to study theoretically.
Besides the fact that we explicitly model real spiking neurons, the mathematical part of our work differs from that of previous authors such as McKean, Tanaka and Sznitman (see the ‘Introduction’ section) because we consider several populations, which makes the analysis significantly more complicated. Our hypotheses are also more general, e.g. the drift and diffusion functions are nontrivial and satisfy the condition (H4), which is more general than the usual linear growth condition. Also, they are only assumed to be locally (and not globally) Lipschitz continuous in order to deal, for example, with the FitzHugh-Nagumo model. A locally Lipschitz continuous case was recently addressed in a different context for a model of swarming in [67].
Proofs of our results, for somewhat stronger hypotheses than ours and in special cases, are scattered in the literature, as briefly reviewed in the ‘Introduction’ and ‘Setting of the problem’ sections. Our main contribution is that we provide a complete, self-sufficient proof in a fairly general case by gathering all the ingredients that are required for our neuroscience applications. In particular, the case of the FitzHugh-Nagumo model, where the drift function does not satisfy the linear growth condition, involves a generalization of previous works using the more general growth condition (H4).
The simulation of these equations can itself be very costly. We, hence, addressed in the ‘Numerical simulations’ section numerical methods to compute the solutions of these equations, either in the probabilistic framework, using the convergence result of the network equations to the mean-field limit and standard integration methods for differential equations, or in the Fokker-Planck framework. The simulations performed for different values of the external input current parameter and one of the parameters controlling the noise allowed us to show that the spatiotemporal shape of the probability density function describing the solution of the McKean-Vlasov-Fokker-Planck equation was sensitive to the variations of these parameters, as shown in Figures 8 and 9. However, we did not address the full characterization of the dynamics of the solutions in the present article. This appears to be a complex question that will be the subject of future work. It is known that for different McKean-Vlasov equations, stationary solutions of these equations do not necessarily exist and, when they do, are not necessarily unique (see [78]). A very particular case of these equations was treated in [76], where the authors consider that the function ${f}_{\alpha}$ is linear, ${g}_{\alpha}$ is constant and ${b}_{\alpha \beta}(x,y)={S}_{\beta}(y)$. This model, known as the firing-rate model, is shown in that paper to have Gaussian solutions when the initial data is Gaussian, and the dynamics of the solutions can be exactly reduced to a set of 2P coupled ordinary differential equations governing the mean and the standard deviation of the solution. Under these assumptions, a complete study of the solutions is possible, and the dependence upon the parameters can be understood through bifurcation analysis. The authors show that intrinsic noise levels govern the dynamics, creating or destroying fixed points and periodic orbits.
The present study develops theoretical arguments to derive the mean-field equations resulting from the activity of large neuron ensembles. However, the rigorous and formal approach developed here does not allow direct characterization of brain states. The paper, however, opens the way to rigorous analysis of the dynamics of large neuron ensembles through derivations of different quantities that may be relevant. A first approach could be to derive the equations of the successive moments of the solutions. Truncating this expansion would yield systems of ordinary differential equations that can give approximate information on the solution. However, the choice of the number of moments taken into account is still an open question that can raise several deep questions [46].
Appendix 1: Proof of Theorem 4
with initial condition ${\overline{X}}_{0}^{i}={X}_{0}^{i}$, the initial condition of the neuron i in the network; these initial conditions were assumed to be independent and identically distributed. $({W}_{t}^{i})$ and $({B}_{t}^{i})$ are the Brownian motions involved in the network equation (Equation 21). As described previously, $Z=({Z}^{1},\dots ,{Z}^{P})$ is a process independent of $\overline{X}$ that has the same law. Denoting, as described previously, the probability distribution of ${\overline{X}}_{t}^{\alpha}$, solution of the mean-field equation (Equation 22), by ${m}_{t}^{\alpha}$, the law of the collection of processes $({\overline{X}}_{t}^{{i}_{k}})$ for some fixed $k\in {\mathbb{N}}^{\ast}$, namely ${m}^{p({i}_{1})}\otimes \cdots \otimes {m}^{p({i}_{k})}$, is shown to be the limit of the law of the processes $({X}_{t}^{i})$, solutions of the network equations (Equation 21), as N goes to infinity.
We recall, for completeness, Theorem 4:
Theorem 4 Under assumptions (H1) to (H4), the following holds true:

Convergence: For each neuron i of population α, the law of the multidimensional process ${X}^{i,N}$ converges towards the law of the solution of the mean-field equation related to population α, namely ${\overline{X}}^{\alpha}$.

Propagation of chaos: For any $k\in {\mathbb{N}}^{\ast}$, and any k-tuple $({i}_{1},\dots ,{i}_{k})$, the law of the process $({X}_{t}^{{i}_{1},N},\dots ,{X}_{t}^{{i}_{k},N},t\le T)$ converges towards ${m}_{t}^{p({i}_{1})}\otimes \cdots \otimes {m}_{t}^{p({i}_{k})}$, i.e. the asymptotic processes have the law of the solution of the mean-field equations and are all independent.
Proof
which implies, in particular, convergence in law of the process $({X}_{t}^{i,N},t\le T)$ towards $({\overline{X}}_{t}^{\alpha},t\le T)$, solution of the mean-field equations (Equation 22).
It is important to note that the probability distribution of these terms does not depend on the neuron i. We are interested in the limit, as N goes to infinity, of the quantity $\mathbb{E}[\sup_{s\le T}{\parallel {X}_{s}^{i,N}-{\overline{X}}_{s}^{i}\parallel}^{2}]$. We decompose this expression into the sum of the eight terms involved in Equation 35 using Hölder’s inequality and upper-bound each term separately. The terms ${A}_{t}$ and ${B}_{t}$ are treated exactly as in the proof of Theorem 2. We start by assuming that f and g are uniformly globally K-Lipschitz continuous with respect to the second variable. The locally Lipschitz case is treated in the same manner as in the proof of Theorem 2: (1) by stopping the process at time ${\tau}_{U}$, (2) by using the Lipschitz continuity of f and g in the ball of radius U and (3) by a truncation argument that uses the almost sure boundedness of the solutions to extend the convergence to the locally Lipschitz case.
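The elementary inequality behind this decomposition is worth recalling. For any eight vectors ${a}_{1},\dots ,{a}_{8}$ of ${\mathbb{R}}^{d}$ (a generic notation of ours, standing for the eight terms of the decomposition), the Cauchy-Schwarz inequality, i.e. Hölder’s inequality with exponents 2 and 2, gives

$${\Big\Vert \sum_{j=1}^{8}{a}_{j}\Big\Vert}^{2}\le 8\sum_{j=1}^{8}{\parallel {a}_{j}\parallel}^{2},$$

so that taking the supremum over $s\le T$ and then the expectation bounds the quantity of interest by 8 times the sum of the corresponding eight expectations, each of which can then be estimated on its own.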