Efficient calculation of heterogeneous non-equilibrium statistics in coupled firing-rate models

Understanding nervous system function requires careful study of transient (non-equilibrium) neural response to rapidly changing, noisy input from the outside world. Such neural response results from dynamic interactions among multiple, heterogeneous brain regions. Realistic modeling of these large networks requires enormous computational resources, especially when high-dimensional parameter spaces are considered. By assuming quasi-steady-state activity, one can neglect the complex temporal dynamics; however, in many cases the quasi-steady-state assumption fails. Here, we develop a new reduction method for a general heterogeneous firing-rate model receiving background correlated noisy inputs that accurately handles highly non-equilibrium statistics and interactions of heterogeneous cells. Our method involves solving an efficient set of nonlinear ODEs, rather than time-consuming Monte Carlo simulations or high-dimensional PDEs, and it captures the entire set of first and second order statistics while allowing significant heterogeneity in all model parameters.


Introduction
Advances in neural recording technologies have enabled experimentalists to simultaneously measure activity across different regions with cellular resolution [1][2][3][4]. However, it is still a technical challenge to measure the many biophysical parameters that govern this multi-region activity. This challenge is exacerbated by the fact that cortical neurons are heterogeneous (i.e., parameters vary across cells) [5] and have significant trial-to-trial noise [6]. Given these features, computational modeling of neural networks often requires exploration of a high-dimensional parameter space and lengthy, time-consuming Monte Carlo simulations. Thus, efficient methods to simulate [7] or approximate network statistics [8] are needed. Aside from computational benefits, streamlined equations for network activity offer potential benefits for mathematical analysis.
We previously developed a fast approximation method [9] for the complete first and second order statistics of a firing-rate network model based on the Wilson-Cowan model [10], and applied it to the olfactory sensory pathway [11]. However, those methods assumed that the statistics of neural activity are stationary (i.e., in steady state). Many neu-ral systems rely on processing of time-varying, high frequency stimuli. The resulting neural responses are often transient, and a quasi-steady-state (QSS) approximation fails to capture the actual response statistics. For example, in the rodent vibrissa sensory [12], auditory [13][14][15], and electrosensory systems [16], stimuli and responses modulate on the order of a few milliseconds, i.e., much faster than the membrane time constants of neurons. Indeed, there is evidence that coding capabilities strongly depend on the timing of stimuli [17] (e.g., in the olfactory bulb [18][19][20]), further necessitating accurate modeling of time-varying neural activity. Modeling studies show the need to account for timevarying stimuli in calculating spiking statistics [21] and in capturing neural mechanisms such as divisive gain modulation [22]. Mathematical theory to efficiently characterize nonequilibrium heterogeneous spiking statistics is scarce despite the potential to shed light on crucial transient neural responses. Thus, it is clear that accurate modeling of time-varying neural activity would benefit mechanistic investigations of neural processing.
Here we present a method to approximate the non-equilibrium statistics of a general heterogeneous coupled firing-rate model of neural networks receiving background correlated noise, in which we: (i) assume weak coupling; equivalently, that neural activity is pairwise normal, and (ii) account for the entire probability distribution of inputs. The result is a computationally fast method because it requires the user to solve coupled nonlinear ODEs, rather than to simulate and average many realizations of coupled SDEs or numerically solve a high-dimensional PDE. The method performs much better than the related QSS method [9] in several representative examples; our code is freely available (see Availability of data and materials section).

Model equations and method
Each cell is modeled by a single activity variable x j , which may represent membrane voltage, calcium concentration, etc., and which evolves according to the following equation: (see [10]), where F k (·) ≥ 0 is a nonlinear function mapping input activity to firing rate or response (often called the F-I curve). All cells receive background noise η j uncorrelated in time but instantaneously correlated across different cells: 1). The parametersμ j andσ j model background noisy input. The parameter g jk represents coupling strength from the presynaptic kth cell and is a signed quantity; g jk < 0 represents inhibitory coupling ( Fig. 1(A)). We wish to compute all of the first and second order time-varying statistics: Mean activity μ j (t) := x j (t), Covariance of activity Cov j,k (t) := x j x k (t)μ j (t)μ k (t), Variance of firing Var ν j (t) :  Table 1 For convenience, we abbreviate the following quantities. When j = k in the double integrals of M F , the bivariate normal distribution j,k is replaced with the standard normal distribution 1 where the angular brackets · denotes averaging over realizations.

Reduction of the Fokker-Planck equation
The corresponding probability density function p( x, t) of X : , satisfies the Fokker-Planck equation [23]: where D j,k = c jkσ jσk τ j τ k (see Table 1), and the sum with D j,k is taken over all N c × N c pairs of (j, k). For convenience we have defined the probability flux or current, as J l ( x, t) : in the right-most part of Eq. (2). This high-dimensional partial differential equation contains all of the statistics of the system.

Moment closure methods
One way to tackle high-dimensional systems is through "moment closure" methods, in which state variables are integrated or averaged out, and assumptions on moments used to reduce the number of equations. Such approaches have been used in the physical [24,25] and life sciences [26][27][28]; see [29] for another type of reduction method for this kind of equation. Here, we propose a closure based on weak coupling, and therefore pairwise Gaussianity in the activity variables.
Without coupling, i.e. g jk = 0, the steady-state solution of Eq. (2) is simply a multivariate Gaussian distribution with mean μ = [μ 1 , . . . ,μ N c ] and covariance matrix Cov j,k = c jk τ j +τ kσ jσk in the steady state. This motivates a closure of the system in which we assume X is Gaus- where Y j is a standard normal random variable, with parameters μ j and σ j to be determined. We also assume the joint marginal distributions are bivariate Gaussian: where N denotes a bivariate Gaussian distribution, and d x j,k denotes integrating over all N c variables except x j and x k . Note that the integrated quantity ∂p( x,t) ∂t d x = 0, as any probability distribution must integrate to unity. We multiply Eq. (2) by x j and integrate the equation over all N c variables, where Consider the first term on the RHS: when l = j, we have The last equality comes from no flux at ±∞: J l | x l =∞ x l =-∞ = 0. A similar calculation applies to the second term, for all N c × N c values of (l 1 , l 2 ): when l 1 = j and l 2 = j, first integrate in x l 1 and x l 2 , and then use the fact that there is no density at ±∞: p( x, t)| x l 1/2 =∞ x l 1/2 =-∞ = 0; when l 1/2 = j, first integrate in x j , then integrate by parts, using where we have used the approximation Table 1) by assuming the marginal x k PDF is a normal distribution with mean μ k (t) and variance σ 2 k (t). To derive a similar equation for the variance σ 2 j (t), we multiply Eq. (2) by x 2 j and again integrate over all variables: where E j 2 (t) = x 2 j p( x, t) d x, and σ 2 j (t) = E j 2 (t) -(μ j (t)) 2 .
Following the same type of manipulations and again using the no density condition at ±∞: p( x, t)| x l 1/2 =∞ x l 1/2 =-∞ = 0, we get We now employ our approximation, x j = μ j (t) + y j σ j (t) where y j is a standard normal random variable, to close the last term in Eq. (7). We further approximate y j F k (μ k (t) + y k σ k (t))p( x, t) d x by assuming the joint marginal distribution of (x j , x k ) is bivariate normal, and use the definition of M F in Table 1: k). Therefore, the equation for the second moment is To derive the analogous equation for the Cov j,k (t), the procedure is almost exactly the same except that Eq. (2) is multiplied by x j x k , and two terms from the sum (over probability fluxes J l ) contribute, when l = j and l = k. The result is When j = k in Eq. (9), we recover Eq. (8).
The full set of kinetic equations given by Eq. (5), (8), and (9) form a system of nonlinear coupled ODEs with N c + N c (N c + 1)/2 variables. The statistics of the firing rate (i.e. ν j = F j (x j )) are obtained from a standard change of variables.
Ifμ,σ are constant in time, the system (Eq. (5), (8), (9)) settles to a steady state: A common approximation to non-equilibrium statistics is to assume that the system immediately equilibrates to the steady-state solution of Eq. (1) at each time point for the time-dependent parametersμ j (t),σ j (t), which we call the QSS method. We will find that the QSS method fails to capture meaningful features of network activity with relatively fast input.

Monte Carlo simulations
In the Results section, we compare our new method with Monte Carlo simulations. For all Monte Carlo simulations (i.e., both the actual non-equilibrium statistics and QSS), we used 1 million (1 × 10 6 ) realizations at each time point. The shaded error regions in all figures represent 1 standard deviation above and below the mean, which is approximated via the sample standard deviation on 1000 samples of 1000 realizations each: 2 , where X is the average over 1 million realizations and X(j) is an average over 1000 realizations.

Results
We implement our method for networks of various sizes N c , with two time-varying inputs. We choose F to be a sigmoidal: F j (·) = 0.5(1 + tanh((xx rev,j )/x sp,j )) ∈ [0, 1] (arbitrary units, x rev,j and x sp,j are parameters). To include heterogeneity, parameters were chosen randomly from the following distributions: where U ∈ [0, 1] is a uniform random variable, and N is a normal random variable. The input correlation matrix Cr was generated so to have approximately independent off-diagonal entries as follows: (i) create a matrix A with i.i.d. entries A jk ∼ N(0, 0. coupling is all-to-all), with inhibitory, excitatory, and self-coupling cases. Figure 1(B) shows that with relatively fast time-varying μ(t), a network of N c = 3 cells has complex non-equilibrium network statistics that cannot be approximated by the QSS approximation (i.e., assuming the system immediately equilibrates to the steady-state solution for each time point). This is true for the complete set of activity and response statistics, although for brevity only a subset are shown. All parameters are chosen as in Eq. (11) except for μ(t), which is the same for all three cells. Figure 2(A) shows that the time-varying method (Eq. (5), (8), and (9)), when applied to same network as in Fig. 1(B), gives accurate results for the complete set of first/second order statistics. Figure 2  (B) Left: although the average error increases with coupling magnitude, the discrepancies are not noticeable for mean activity and firing (not shown). (B) Middle: the method is visibly worse for the variance of activity as coupling magnitude increases. (B) Right: the method is visibly worse as coupling magnitude increases for variance of firing -note that the weakest coupling (red) has largest variance of firing cell network, but with a time-varying sinusoidal input. Again, the QSS method does not capture the actual network statistics, but our method does very well (colored solid curves). We only show a subset of statistics to illustrate our point; the others are qualitatively similar.
Thus far we have only consider small networks. In Figs. 3 and 4, our methods are applied to a large network of N c = 50 coupled cells where the magnitude of the coupling strengths vary: G jk ∼ N(0, l 2 100 ), for l = 1, 2, 3, 4. Figure 3 shows the results with pulse input (Fig. 1(B) upper-left) applied to all 50 cells, while Fig. 4 shows results from applying the sinusoidal input ( Fig. 2(B) left) to all 50 cells. Figure 3(A) (top row) shows the error between our method and the actual (Monte Carlo) statistics; we plot the absolute error averaged over all cells or pairs: i.e., for mean and variance of activity and firing, M = 50; for covariance of activity and firing, averaging over all M = 50 * 49/2 distinct pairs. All six sets of statistics are shown in Fig. 3(A): the left panel shows the average absolute error for both mean activity (solid) and mean firing (dot-dashed), middle panel shows the variance (thick solid) and covariance of activity (thin solid), the right panel shows the variance (thick solid) and covariance (thin solid) of firing. In all cases, as the coupling magnitude increases (red → green → cyan → purple), the error increases. For reference, the bottom row ( Fig. 3(B)) compares our method with the Monte Carlo simulations for a particular cell (or cell pair); the chosen cell or pair is the one that most closely matches the average absolute error. In Fig. 3(B), we only show three out of the six statistics (left is mean activity, middle is variance of activity, right is variance of firing) because these clearly show the performance of our method in relation to the size of the average absolute error. Figure 4 has exactly the same format as Fig. 3, but with sinusoidal input. Finally, in Fig. 5, to assess the performance of our method, we plot the absolute value of the error averaged over all six statistics and over all cells/pairs (vertical axis) as a function of a measure of coupling strength l (Fig. 5(A) is with pulse input, (B) with sinusoidal input). Each curve shows a different network size, ranging from N c = 3, 5, 10, 25, 50, with a particular instance of randomly chosen parameters for each curve. a The magnitude of Figure 5 Our method implicitly assumes weak coupling, so as the average magnitude of the coupling strength increases, the performance decreases. We demonstrate this with several instances of coupling matrices and network sizes N c = 3, 5, 10, 25, and N c = 50 with the four coupling values in Figs. 3 and 4, using the same pulse (A) and sinusoidal (B) inputs. On vertical axis, we plot the average absolute error over all first and second order statistics, including all cells and pairs, while on the horizontal axis, we plot a measure of average magnitude of the coupling values l. Note that, since G jk ∼ N(0, l 2 100 ), the average of all |G jk | is l 5 √ 2π in the infinite limit N c → ∞. For reference, some of the points on these curves are from prior figures, denoted in gray text and arrows the coupling strength, l, on the horizontal axis is from G jk ∼ N(0, l 2 100 ), so that the average of all N 2 c values of |G jk | is l 5 √ 2π in the infinite limit N c → ∞. Not surprisingly, the average error increases as coupling strength increases for each curve. Assessing how much absolute error is acceptable depends on the purposes of the approximation, but for reference, the instances of networks from prior figures are denoted in gray. Figure 5 indicates that, as long as the average absolute error is below 0.01, our method likely performs very well, independent of network size (cf. with Figs. 2-4). Average absolute errors larger than 0.01 might indicate at least some of the statistics calculated by our method are likely to be inaccurate, although others may be accurate depending on cell or pair (cf. Figs. 3-4).

Conclusion
The role of mathematical theory and computation in addressing neuroscience questions is as vital as ever despite tremendous advances in recording technologies. As detailed in the Introduction, the common assumption of equilibrium neural network responses is inaccurate in many neural systems. Here we derived and implemented a reduction method to calculate the complete set of first and second order non-equilibrium statistics in coupled heterogeneous networks of firing-rate models [10] receiving background correlated noise [30][31][32]. Importantly, our method captures the non-equilibrium statistics when they are vastly different from the quasi-steady-state, and works very well even with significant heterogeneity in all model parameters. As the overall magnitude of the coupling strengths increase, the performance of our method declines because the moment closure method assumes weak coupling.
Mathematical reductions that well approximate the statistics of firing-rate models [33], such as the one described here, are likely to be relevant for future theoretical studies of neural networks for several reasons. Wilson-Cowan type models [10] are commonly used because of their simplicity and history of successful application in neural systems. Analysis of spiking statistics using mean-field methods often results in similar firing-rate equations [34][35][36][37]. Finally, such methods might be useful for mechanistic investigations of neural function across multiple brain regions that commonly rely on larger models with more parameters and complexity [7,11].