Analysis of an Attractor Neural Network’s Response to Conflicting External Inputs

The theory of attractor neural networks has been influential in our understanding of the neural processes underlying spatial, declarative, and episodic memory. Many theoretical studies focus on the inherent properties of an attractor, such as its structure and capacity. Relatively little is known about how an attractor neural network responds to external inputs, which often carry conflicting information about a stimulus. In this paper we analyze the behavior of an attractor neural network driven by two conflicting external inputs. Our focus is on analyzing the emergent properties of the megamap model, a quasi-continuous attractor network in which place cells are flexibly recombined to represent a large spatial environment. In this model, the system shows a sharp transition from the winner-take-all mode, which is characteristic of standard continuous attractor neural networks, to a combinatorial mode in which the equilibrium activity pattern combines embedded attractor states in response to conflicting external inputs. We derive a numerical test for determining the operational mode of the system a priori. We then derive a linear transformation from the full megamap model with thousands of neurons to a reduced 2-unit model that has similar qualitative behavior. Our analysis of the reduced model, together with explicit expressions relating its parameters to those of the megamap, elucidates the conditions under which the combinatorial mode emerges and the dynamics in each mode given the relative strength of the attractor network and the relative strengths of the two conflicting inputs. Although we focus on a particular attractor network model, we describe a set of conditions under which our analysis can be applied to more general attractor neural networks.


Introduction
The theory of attractor neural networks has greatly influenced our understanding of the mechanisms underlying the computations performed by neural networks. This is especially true for hippocampal networks involved in spatial, declarative, and episodic memory. According to this theory, structured recurrent connections among N neurons cause the N -dimensional state vector to converge in time to a stable, low-dimensional space called the attractor [1]. Such a network embeds memories as stationary attractors, which may be a discrete set of point attractors representing a discrete set of objects [2] or a continuum of attractor states representing continuous variables such as heading direction [3,4] or spatial location within an environment [5][6][7][8][9][10]. Numerous theoretical studies have revealed properties of attractor neural networks that make them a desirable neural mechanism for memory storage, such as robustness to damage, pattern completion, and generalization [11,12]. Attractor neural networks should arise naturally in regions of the brain with recurrently connected neurons and Hebbian-type synaptic plasticity, and they provide a theoretical framework for experimental design and data interpretation [13].
Attractor neural networks have been studied extensively through both analysis and computational simulations [1,3,[14][15][16][17]. While some studies do examine the role of external input [16,18,19], most determine the set of stable equilibrium states in the absence of external input, establishing properties such as the structure and capacity of the attractor. Relatively little is known about how an attractor network may respond to conflicting external inputs. This creates a gap between the idealistic predictions of attractor network theory and experimental data, since it is often experimentally difficult if not impossible to isolate putative attractor dynamics from the influence of the strong (often conflicting) external inputs into the neural network. In the current study, we analyze an attractor neural network's response to conflicting external inputs that effectively create a competition between embedded attractor states. Our focus is the interesting behavior observed in our numerical simulations of the megamap model, a quasi-continuous attractor network representing a large spatial environment, driven by external inputs encoding two different locations in the environment [10]. However, the analytical methods and results obtained here can be applied to more general attractor network models.
The megamap model is designed for a network of principal cells in the CA3 subregion of the hippocampus, a region crucial for learning and memory [20][21][22]. These cells are often referred to as place cells due to the strong spatial correlate of their activity. In small, standard recording environments (∼1 m²), a given place cell is primarily active in a single region of space, called its place field.

An outline of the paper is as follows. In Sect. 2, we present the dynamical system of the megamap model and describe two methods used to set the recurrent weights. We then show a numerical example of the operational modes and derive a numerical test for determining the operational mode. In Sect. 3, we present the reduced 2-unit model and describe the conditions under which the reduced model is an accurate approximation of the full attractor network model. In Sect. 4, we characterize the conditions under which the combinatorial mode emerges and derive equations for the bifurcations of the dynamical system. We close in Sect. 5 by comparing our analysis to other analytical treatments of attractor neural networks, describing possible extensions of the reduced model, and discussing the implications of the results for various types of attractor network models.

Operational Modes of the Megamap
We begin by describing the basic equations governing the megamap model and by illustrating the operational modes through a numerical example. For further details, see [10].

Megamap Model
The megamap model is a standard firing rate model [18] consisting of a network of N place cells with recurrent excitation, global feedback inhibition, and external input. The state vector, u ∈ R^N, loosely represents the depolarization of each place cell and is governed by

τ u′(t) = −u(t) + W f(u(t)) − w_I f_I(u(t)) 1 + b,   (1)

where τ = 10 ms for all simulations, and 1 ∈ R^N denotes a vector of all ones. Our interest is in how the activity vector, f(u) ∈ R^N, is tuned to spatial location. For simplicity, we set the activity through the threshold-linear gain function, f(u) = f_pk [[u_1]_+, …, [u_N]_+]^T, where [·]_+ = max(·, 0), and f_pk = 15 Hz is the peak firing rate of the activity bump. All interneurons are modeled as a single inhibitory unit providing global feedback inhibition so that only the external input and recurrent hippocampal input provide a spatial signal. The activity of the inhibitory unit is given by f_I(u) = [1^T f(u) − θ f_net]_+, where θ is the threshold parameter, and f_net is the total activity, 1^T f(x), summed over any embedded activity pattern (Eq. (2)). The embedded activity patterns are set such that f_net is independent of x. The inhibitory activity is scaled by the inhibitory weight parameter, w_I. The external input, b ∈ R^N, carries sensory information about the animal's location or self-motion, modeling idealistic neuronal inputs from the upstream entorhinal cortex.
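The dynamics of Eq. (1) can be sketched numerically with forward Euler integration. In the sketch below, τ = 10 ms, f_pk = 15 Hz, and the threshold-linear gain and global inhibition follow the text, but N, W, b, w_I, θ, and f_net are small illustrative stand-ins (not trained megamap values) chosen so the toy network settles quickly.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 10                 # toy network size (the megamap uses ~10,000 cells)
tau = 10.0             # membrane time constant (ms), as in the text
f_pk = 15.0            # peak firing rate (Hz), as in the text
w_I, theta = 0.5, 0.9  # placeholder inhibitory weight and threshold
f_net = 30.0           # placeholder total activity of an embedded pattern

W = 0.01 * rng.random((N, N))  # placeholder recurrent weights (not trained)
b = rng.random(N)              # placeholder external input

def f(u):
    """Threshold-linear gain: f(u) = f_pk [u]_+ applied elementwise."""
    return f_pk * np.maximum(u, 0.0)

def f_I(u):
    """Global feedback inhibition: f_I(u) = [1^T f(u) - theta * f_net]_+."""
    return max(f(u).sum() - theta * f_net, 0.0)

def step(u, dt=0.1):
    """One forward-Euler step of tau u'(t) = -u + W f(u) - w_I f_I(u) 1 + b."""
    return u + dt * (-u + W @ f(u) - w_I * f_I(u) + b) / tau

u = np.zeros(N)
for _ in range(5000):   # integrate for 500 ms
    u = step(u)
```

With positive external input, excitation grows until total activity crosses θ f_net, at which point the global inhibitory feedback clamps the network at a finite equilibrium.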
The recurrent excitation, Wf(u), provides the internal network drive. The weight matrix, W ∈ R^{N×N}, represents the strength of connections among place cells. Several studies have shown that an attractor network emerges in relatively small environments (∼1 m²) when the weights are set through Hebbian plasticity [9,34]. We constructed a benchmark model for how an attractor network of place cells can represent large spaces by setting the weights to obtain desired activity profiles (place fields) for each cell [10]. The preferred locations (place field centers) for each cell are distributed randomly throughout the environment, and the number of place fields per cell is set according to the Poisson distribution. The average density of place fields for a given cell is set such that 80% of place cells are silent in a 1 m² environment. The weight matrix is then set in one of two ways: (1) The optimal weights are set incrementally through the delta rule [33] so that a set of desired activity patterns, {f(x_j)}, is embedded into the network as stable fixed points of the dynamical system (Eq. (1)) when the external input into each cell is an idealistic sum of Gaussians centered at the preferred locations of each cell (Fig. 1(a)). The desired activity of each cell is the sum of Gaussian-like place fields. Explicitly, for each cell i with M_i place fields centered at {c_im} (m = 1, …, M_i), the training input and desired activity are given by Eq. (2) when the animal is stationary at location x. The training input is set as the idealistic sum of Gaussian bumps whose amplitudes are given by the parameter b_pk. The desired activity is set as the sum of activity bumps of height f_pk over each place field center. The shift parameter, u_0 > 0, is the depolarization at which a cell becomes active. The optimal weights are set using a discrete set of locations {x_j} distributed uniformly over the environment (at least 15 cm from a boundary).
(2) The Hebbian weights are set as a sum of tuning curves, where each cell j has the preferred locations {c_jm} (m = 1, …, M_j), and w_tune is the weight profile determined by computing the optimal weights when each cell has exactly one place field. This tuning curve is approximately Gaussian, and setting weights as the sum of Gaussians is a common method for constructing attractor network models of place cells [5,6,[34][35][36]. The resulting weights approximate the weights expected given the basic Hebbian learning rule [32][33][34].
If each cell had at most one place field, then the two methods would be equivalent. Both methods lead to an attractor network that robustly represents large spaces (∼100 m²). Differences emerge in large environments (>16 m²) in which individual place cells represent multiple, irregularly spaced locations.

Fig. 1 (caption): (a) The training input and activity bump (Eq. (2)) are visualized by plotting b_i and f(u_i)/f_pk for each place cell i redundantly at each of its preferred locations (Fig. 2(a)-(b)). (b) The numerical test for the operational mode (Eq. (3)) predicts that the optimal megamap transitions from the WTA mode to the combinatorial mode at about 25 m², while the Hebbian megamap is always in the WTA mode. The filled circles indicate the values of r(S_1 ∪ S_2, S_I) (Eq. (3)) for the optimal (black) and Hebbian (gray) weights, where S_k is the set of all cells active in the embedded pattern f(x_k) at location x_k, and S_I = {inh} since the inhibitory unit is active. The two square points indicate values for the megamaps simulated in (c) and (d). The open circles and diamonds indicate the values of r(S_k, {inh}), i.e., Eq. (3) evaluated at any activity bump proportional to exactly one embedded activity pattern. All such activity bumps are stable. The "Dominant Eigenvalue" refers to the maximal eigenvalue computed in Eq. (3). (c) When the optimal megamap representing 16 m² is driven by a mixed external input (left), only one prominent activity bump persists in time (right). The external input is formed by choosing two well-separated locations x_1 and x_2, setting b_i = b_i(x_1) for a randomly selected 50% of the cells, and setting b_i = b_i(x_2) for the remaining cells. The activity bump scaled by (1/f_pk) is equivalent to [u]_+. (d) When the optimal megamap representing 36 m² is driven by an external input set in the same manner, activity bumps representing both locations persist in time.

Numerical Example of the Operational Modes of the Megamap
Since the megamap can seamlessly represent much larger environments than was previously possible, the model allows one to explore whether any interesting properties emerge when the attractor network represents a large space. We found that the megamap with optimal weights sharply transitions from a winner-take-all (WTA) mode to a combinatorial mode as the environment becomes sufficiently large [10]. While a megamap in either mode is similarly robust to a noisy or incomplete external input, there are clear differences between the modes when the network is driven by conflicting external input encoding multiple locations in the environment. In this situation, small megamaps operating in the WTA mode effectively suppress the input encoding one location and fully represent the second location, but large megamaps operating in the combinatorial mode robustly represent both locations through two co-stable activity bumps ( Fig. 1(c) and (d)). Moreover, hysteresis is observed only in the WTA mode, and a megamap in the combinatorial mode linearly amplifies the difference in input strengths ( Fig. 3(a) and (c)). In our simulations with N ≈ 10,000 place cells, the transition between modes occurs when the learned region reaches about 25 m 2 [10].
The combinatorial mode is not commonly observed in attractor network models. Standard continuous attractor network models of place cells operate exclusively in the WTA mode unless the dynamical system is modified to make multi-peaked activity bumps more stable [6,37,38]. It is interesting that the optimal megamap operates in either mode without any changes to the parameters or dynamical system, but the megamap with Hebbian weights operates in the WTA mode regardless of the environmental size. The emergence of the combinatorial mode not only depends on the environmental size but also on the manner in which the recurrent connections are updated as the animal explores novel regions of the environment.

Numerical Test for the Operational Mode
We now propose a numerical test for determining the operational mode of the dynamical system (Eq. (1)). We specify that the system is in the combinatorial mode if there exist stable fixed points with multiple activity bumps, and the network is in the WTA mode if any stable fixed point has exactly one activity bump.
We find that the stability of any fixed point depends on the subset of active cells at the fixed point, i.e., the excitatory cells such that f(u_i) > 0 and the inhibitory unit (inh) when f_I(u) > 0. We define S and S_I as the sets of active excitatory and inhibitory cells, respectively, and prove in Appendix A that the fixed point is stable if and only if r(S, S_I) < 1, where

r(S, S_I) ≡ λ_max( f_pk [ W − χ_{S_I}(inh) w_I 1 1^T ] D(S) ).   (3)

Here, λ_max(M) refers to the largest real part of all eigenvalues of the matrix M, χ_{S_I}(inh) is the indicator function for the set S_I (1 if the inhibitory unit is active and 0 otherwise), and D(S) is the diagonal (0-1)-matrix with D_ii(S) = χ_S(i) (1 if i ∈ S and 0 otherwise). Note that the stability depends only on the weights (W and w_I) and on which cells are active. The external input and the magnitude of each state do not affect the stability of a fixed point.
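The stability test of Eq. (3) reduces to a single eigenvalue computation. A minimal sketch follows; the function name and toy matrices are our own, with f_pk and w_I defaulting to values quoted elsewhere in the paper.

```python
import numpy as np

def stability_index(W, S, inh_active, f_pk=15.0, w_I=5.3):
    """r(S, S_I) from Eq. (3): the largest real part among the eigenvalues of
    f_pk (W - chi_{S_I}(inh) w_I 1 1^T) D(S).  The fixed point with active
    excitatory set S is stable iff this value is < 1."""
    N = W.shape[0]
    D = np.zeros((N, N))
    idx = np.asarray(sorted(S))
    D[idx, idx] = 1.0                    # D(S): 1 on the diagonal for active cells
    inh = w_I if inh_active else 0.0     # chi_{S_I}(inh) * w_I
    M = f_pk * (W - inh * np.ones((N, N)))
    return np.linalg.eigvals(M @ D).real.max()

# Toy checks: with W = 0 and inhibition active, r = 0 (stable); a strong
# self-excitatory diagonal with no inhibition gives r = 3 > 1 (unstable).
r_stable = stability_index(np.zeros((4, 4)), {0, 1}, inh_active=True)
r_unstable = stability_index(0.2 * np.eye(4), {0, 1}, inh_active=False)
```

As the text notes, the index depends only on the weights and the active set, so it can be evaluated a priori, without simulating the dynamics.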
To determine the operational mode, we randomly select two well-separated locations in the environment (at least 50 cm apart and at least 15 cm from an environmental boundary). Let x_1 and x_2 denote these two locations, and let S_k denote the set of all active cells in the embedded activity bump over x_k (Eq. (2)), i.e., S_k = {i : f_i(x_k) > 0} (Eq. (4)) for k = 1, 2. Since θ < 1, the inhibitory unit is active given any embedded activity bump. In our numerical simulations, the inhibitory unit is always active at an equilibrium state regardless of the external input. Hence, we set S_I = {inh}. According to our test, the system is in the combinatorial mode if and only if r(S_1 ∪ S_2, {inh}) < 1. This test is accurate when there exists a fixed point with two bumps in which the set of active excitatory cells is the set of excitatory cells that are active in either embedded activity pattern, or S = S_1 ∪ S_2. The activity pattern at such a fixed point is approximated by a linear combination of the two embedded activity bumps, or f(u) ≈ c_1 f(x_1) + c_2 f(x_2) for some positive constants c_1 and c_2 such that c_1 + c_2 > θ.
In all numerical simulations we performed, the test is accurate in distinguishing between the two operational modes. For the example presented in Fig. 1, the recurrent weight matrix W is updated as the animal gradually learns novel subregions of an environment [10]. For the optimal weights, the test predicts the transition from the WTA mode to the combinatorial mode as the area (A) of the learned environment grows. In particular, r(S 1 ∪ S 2 , {inh}) decreases as A becomes larger, dropping below 1 around 25 m 2 ( Fig. 1(b), black closed circles). As predicted, when A < 25 m 2 , exactly one activity bump persists in time given any initial state and any external input ( Fig. 1(c)). When A > 25 m 2 , two activity bumps persist in time given a mixed external input ( Fig. 1(d)). For the Hebbian weights, the test predicts that the system remains in the WTA mode regardless of A since r(S 1 ∪ S 2 , {inh}) gradually increases with A ( Fig. 1(b), gray closed circles). As predicted, we find numerically that two activity bumps are always unstable given Hebbian weights [10].
Equation (3) can also be used to test the stability of single-peaked fixed points. Regardless of A or the method used to set the weights, r(S_k, {inh}) < 1 for any location x_k (Fig. 1(b), open circles and diamonds). This indicates that any single-peaked fixed point proportional to an embedded activity bump is stable. It is important to note that even in the combinatorial mode, the system robustly represents any location through a stable single-peaked activity bump given a single-peaked external input that may be relatively weak, noisy, or incomplete.
The numerical test is a powerful tool for determining the behavior of the network a priori. In addition to determining whether it is possible for multiple activity bumps to persist in time, the test determines whether the network may show hysteresis or amplify the difference in input strengths ( Fig. 3(a) and (c)). However, the numerical test is limited in that it determines the stability but not the existence of a fixed point. Figure 1(b), open circles and diamonds, indicates that single-peaked activity bumps are stable for any size environment. In our numerical simulations, we found that these single-peaked fixed points always exist given the optimal weights, but all cells eventually become active when A = 625 m 2 given Hebbian weights [10]. Some sort of normalization, such as forcing the 1-norm (subtractive normalization) or 2-norm (multiplicative normalization) of the weight vector to be constant, would be required to maintain stability in the Hebbian network [33]. It would be interesting to examine in future work how normalization would affect the operational mode of the Hebbian network.

Reduction of the Megamap Model to the 2-Unit Model
Consider an external input that is some mixture of the two training inputs, b(x 1 ) and b(x 2 ) (Eq. (2)), where x 1 and x 2 are two well-separated locations in the environment. We seek a mapping from the full megamap model to a two-dimensional reduced model with the same form and the same qualitative dynamics given this conflicting external input. The simplest relevant simplification is to model two units, where the place cells in each unit k are given by the set S k (Eq. (4)), and the reduced state u k is the collective state of place cells in unit k. The reduced model does not include cells without a place field near x 1 or x 2 , as these cells should be silent (f (u i ) ≈ 0) if the system is stable.
The reduction is illustrated in Fig. 2(a)-(c). Explicit equations for the reduced 2-unit model are given by Eqs. (5)-(7). The weights of the reduced model, w_0 and q, are directly related to the weights of the full megamap model. For example, consider the three cells whose place fields within the environment are illustrated in Fig. 2(a) by the colors blue (Cell 1), red (Cell 2), and green (Cell 3). Each cell is plotted redundantly on the megamap at each of its preferred locations (Fig. 2(b)). If the external input innervates cells near locations x_1 and x_2 indicated in (a), then the cells enclosed by the blue and red circles in (b) are collectively represented by units 1 and 2, respectively. The reduced weight w_0 determines the degree to which cells within a unit reinforce each other's activity and is related to the weights among cells in a unit on the megamap. The reduced weight q determines the degree to which cells within one unit innervate cells in a different unit and is proportional to the average weight between cells in different units on the megamap. If each cell had only a single place field, then there would be no cross-excitation, or q = 0. Due to the multiplicity of place fields, however, two cells in different units may innervate each other due to overlapping place fields elsewhere in the environment. In the example shown, q > 0 since Cells 1 and 2 are neighbors on the megamap. We thus expect 0 < q < w_0, since only some of the cells in the two units have overlapping place fields. Figure 2(e) and (f) shows w_0 and q (Eq. (7)) for a megamap representing square environments of increasing size (Fig. 2(d)). This megamap was used to generate Figs. 1, 2, and 3, and further details on its construction and behavior can be found in [10]. For the Hebbian megamap, new weights are added as the animal explores new locations. This results in a linear increase in both w_0 and q as the environment grows in size, but a constant difference, w_0 − q (Fig. 2(f)).
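The relationship between the megamap weights and the reduced weights can be illustrated by averaging W over the two units. This is only a sketch: Eq. (7) is not reproduced in this excerpt, so the scaling it applies to these averages is omitted, and the function name and toy matrix below are our own.

```python
import numpy as np

def mean_unit_weights(W, S1, S2):
    """Average within-unit and between-unit recurrent weights.
    w0 is proportional to the within-unit average and q to the
    between-unit average (the exact scaling in Eq. (7) is omitted)."""
    i1, i2 = np.asarray(sorted(S1)), np.asarray(sorted(S2))
    within = 0.5 * (W[np.ix_(i1, i1)].mean() + W[np.ix_(i2, i2)].mean())
    between = 0.5 * (W[np.ix_(i1, i2)].mean() + W[np.ix_(i2, i1)].mean())
    return within, between

# Toy weight matrix: strong connections within each unit, weaker cross-
# connections arising from overlapping place fields elsewhere.
W = np.full((4, 4), 0.2)
W[:2, :2] = 1.0
W[2:, 2:] = 1.0
w_within, w_between = mean_unit_weights(W, {0, 1}, {2, 3})
```

Here the within-unit average (1.0) exceeds the between-unit average (0.2), matching the expectation 0 < q < w_0 from the text.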
For the optimal megamap, weights are both increased and decreased so that each novel subregion is accurately learned. As a result, w 0 is constant for the most recent 1 m 2 subregion learned. While the reduced weight w 0 within a given subregion gradually decays as new subregions are incorporated into the megamap, w 0 changes little compared to the increase in q over the initial 100 m 2 ( Fig. 2(e)). The steady decrease in w 0 − q is correlated to the decrease in the dominant eigenvalue ( Fig. 1(b), closed black circles) and appears to be responsible for the change in operational mode. We prove this is the case in Sect. 4.1.

Reduced Model
We now present explicit equations for the reduced model. As shown in Appendix B, computing the sum over all cells in unit k (S_k) of each term in Eq. (1) and scaling by (f_pk/f_net) leads to the two-dimensional reduced model (Eq. (5)). The reduced model has the same form as the full megamap model, but the network connections are now defined by only two weights (w_0 and q) rather than the weight matrix W ∈ R^{N×N}. For simplicity, the activation function of the megamap, f(u_i) = f_pk [u_i]_+, is scaled in the reduced model to have a peak value of 1.

Fig. 2 (caption): (a)-(c) Illustration of the reduction of the megamap model (Eq. (1)) to the 2-unit model. The reduced state variables and reduced external input, u_k and b_k (Eq. (6)), represent the collective state and collective external input into place cells near location x_k, indicated by the blue and red circles in (b). The reduced weights, w_0 and q (Eq. (7)), are related to the strength of connections within a unit and between units, respectively. For this example, there should be a relatively weak cross-connection q since the blue and red cells are neighbors elsewhere in the environment. The reduced inhibitory weight is proportional to the inhibitory weight of the megamap (Eq. (7)). (d)-(f) We compute the reduced weights for a megamap that models an animal incrementally learning a square environment of increasing size [10]. The first three iterations are illustrated in (d). At each iteration, the recurrent weights are updated to incorporate the novel subregions (red) into the learned environment (gray). Previously learned subregions are not reinforced in later iterations. For the optimal weights (e), the average recurrent excitation (proportional to w_0) within a unit changes little over the first 100 m² compared to the increase in the average weight between units (proportional to q) as the environment grows in size. For the Hebbian weights (f), w_0 and q increase linearly at roughly the same rate. The color in (e) and (f) indicates the region number (the first nine regions are shown in (d)).
The two reduced state variables and corresponding external inputs are given by for each unit k. When there is an activity bump over x k with the same radius as the embedded activity bump, f (u i ) ≈ u pk f i (x k ) for i ∈ S k , where 0 < u pk ≤ 1 is the peak of the state bump. In this case, u k ≈ u pk , and so the embedded activity bump over x k maps to the reduced activity, [ u k ] + = u k ≈ u pk . When there is no activity bump over x k , unit k is silent ([ u k ] + = 0) since u i < 0 for most cells in unit k. The external input is always nonnegative, and it is zero when there is no external input into place cells in unit k.
The reduced weights are given by Eq. (7), where N̄ denotes the average number of active cells in each embedded activity pattern, so N̄ ≈ |S_k| for any k. In our simulations of the megamap, 220 ≤ |S_k| ≤ 225 for all locations x_k. The weight of the self-connection (w_0) is proportional to the average recurrent excitation between two place cells in the same unit k given the embedded activity bump over x_k (Eq. (2)), and the weight of the cross-connection (q) is proportional to the average weight between two place cells in different units. The reduced inhibitory weight is proportional to the inhibitory weight of the megamap. The inhibition into any reduced unit (Ī) and the inhibition into any excitatory cell in the megamap (I = w_I f_I(u)) are related by Ī = (f_pk N̄/f_net) I. Consequently, the inhibitory unit is active in the 2-unit model if and only if the inhibitory unit is active in the full megamap model, and the inhibition drives the state of an inactive unit further below zero for the 2-unit model than for the megamap model since f_pk N̄ > f_net.
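Equation (5) is not reproduced in this excerpt, but its form can be inferred from the surrounding text: each unit receives self-excitation w_0, cross-excitation q, global inhibition through [[u_1]_+ + [u_2]_+ − θ]_+, and external input. The sketch below integrates this inferred system (our reading, not the authors' exact statement); the default parameters mirror those quoted for Fig. 3.

```python
import numpy as np

def simulate_2unit(b1, b2, w0=1.2, q=0.1, wI=5.3, theta=0.9,
                   u_init=(0.0, 0.0), tau=10.0, dt=0.1, steps=20000):
    """Integrate the inferred 2-unit model to equilibrium; returns (u1, u2)."""
    u = np.array(u_init, dtype=float)
    b = np.array([b1, b2], dtype=float)
    for _ in range(steps):
        a = np.maximum(u, 0.0)                 # unit activities [u_k]_+
        fI = max(a.sum() - theta, 0.0)         # reduced inhibitory activity
        rec = np.array([w0 * a[0] + q * a[1],  # self- plus cross-excitation
                        q * a[0] + w0 * a[1]])
        u += dt * (-u + rec - wI * fI + b) / tau
    return u

# WTA regime (w0 - q = 1.1 > 1): the trained pattern [1, 0] is recovered
# under the training input [b_pk, 0], with b_pk = 1 - w0 + wI*(1 - theta).
b_pk = 1.0 - 1.2 + 5.3 * (1.0 - 0.9)   # = 0.33
u1, u2 = simulate_2unit(b_pk, 0.0)

# Combinatorial regime (w0 - q = 0.9 < 1): equally split inputs settle
# into a fixed point with both units active.
v1, v2 = simulate_2unit(b_pk / 2, b_pk / 2, q=0.3)
```

With these values the first run converges to u1 ≈ 1 with u2 < 0, while the second yields two co-active units, matching the qualitative behavior described for Fig. 3.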

Approximations in the Reduction
As detailed in Appendix B, we make four approximations to map the N-dimensional system of Eq. (1) to the two-dimensional system of Eq. (5). First, we neglect cells that are in both units by assuming S_1 ∩ S_2 = ∅. Since place fields are set by the Poisson distribution, a small minority of cells in S_1 may also be in S_2, but these relatively few cells should not have a large impact on the dynamics. Second, we neglect the small minority of cells with multiple place fields near x_k. This permits the assumptions that both units have the same number of cells, or N̄ = |S_k| for any k, and that the average of the recurrent input (proportional to w_0) between two cells in the same unit given the embedded activity bump is the same for all k. Third, we neglect the asymmetries in the optimal weights of the megamap by assuming that the average weight from unit 1 to unit 2 (proportional to q) is the same as the average weight from unit 2 to unit 1. These first three approximations amount to neglecting the variability of the megamap and modeling only the average dynamics. The variability may affect the stability of a state in borderline cases. For example, when r(S_1 ∪ S_2, {inh}) ≈ 1, the stability of two co-active bumps may depend on the locations chosen for x_1 and x_2. The fourth approximation does affect the average dynamics of the megamap. We assume that any activity bump over x_k has the same radius and is always centered over x_k. Explicitly, we define S̃_k(t) as the set of all cells near x_k that are active at time t, i.e., the cells i with f(u_i(t)) > 0 that have a place field center within a distance δ of x_k. (The exact value of δ is not important here. It should be larger than the radius of the embedded activity bump, and small enough to exclude cells that are active due to their proximity to the location of the other unit.) To obtain Eq. (5), we assume S̃_k(t) ∈ {∅, S_k} for all t, where S̃_k(t) ≈ ∅ when there is no activity bump over x_k, and S̃_k(t) ≈ S_k when there is an activity bump over x_k.
In reality, the radius expands continuously from 0 to its equilibrium value as an activity bump emerges. We are perhaps justified in neglecting these transient, narrow activity bumps since we use the 2-unit model to infer the stable fixed points of the megamap. However, in the absence of external input, the equilibrium activity bump drifts over the megamap [10], so it is important to choose x_k to be a location from which activity bumps do not drift. In addition, the equilibrium activity bump is wider for weaker external inputs. The 2-unit model does not capture the effects of a wider activity bump, but rather tracks only the height of the activity bump, since S̃_k(t) ≈ S_k implies u_k ≈ u_pk. Despite this shortcoming, we find that the two models behave in the same way qualitatively (Fig. 3), and the analytical tractability of the 2-unit model permits us to derive explicit equations for the set of parameters leading to each operational mode and the relative strength of external input leading to hysteresis (in the WTA mode) or two co-stable activity bumps (in the combinatorial mode).

Constraints on the Parameters of the 2-Unit Model
In accordance with the construction of the megamap with optimal weights, the parameters of the 2-unit model are set such that when the network is driven by the training inputs, [b_pk 0]^T and [0 b_pk]^T, the respective fixed points of Eq. (5) correspond to the desired activity patterns, [1 0]^T and [0 1]^T. The training input strength, b_pk, is proportional to the parameter b_pk in the megamap model (Eq. (2) and Eq. (6)). These two desired activity patterns are obtained if and only if the training input strength satisfies Eq. (9).

Fig. 3 (caption): (a) The megamap is driven by conflicting external inputs encoding locations x_1 and x_2 with respective strengths b_pk^1 and b_pk^2 (Eq. (2)). For the relatively small megamap operating in the WTA mode (left, Fig. 1(c)), any equilibrium activity bump fully represents one location while effectively ignoring the input for the other location. For the large megamap operating in the combinatorial mode (right, Fig. 1(d)), the equilibrium activity bump fully represents one location when |b_pk^1 − b_pk^2| is sufficiently large. Otherwise, the equilibrium state corresponds to a linear combination of the two embedded activity bumps, amplifying the difference in input strengths. The initial state for all simulations corresponds to f(x_2) (Eq. (2)). The activity ratio is given by act(u, k), where u_i and s_i^k are the equilibrium states of cell i given the conflicting external input, b_i, and the isolated input, (b_pk^k/b_pk) b_i(x_k), respectively. Data points were omitted if f(s^k) was not an activity bump over location x_k, which occurs in this example when b_pk^k ≈ 0. (b) The 2-unit model responds similarly to the conflicting external input. The parameters w_0 = 1.2 and q are comparable to the corresponding reduced weights of the megamap (Eq. (7), Fig. 2(e)). The reduced inhibitory weight, w_I = 5.3, and threshold, θ = 0.9, are the exact values corresponding to the megamap parameters in (a). (c) The initial state (black circles) is varied randomly, and the external input is constant (b_pk^1 = b_pk^2 = b_pk/2). The equilibrium state reached (red squares) depends on the initial state for the small megamap but not for the large megamap. (d) The 2-unit model with the same parameters as used in (b) similarly shows hysteresis only in the WTA mode.
We set w_0, w_I, and θ as the parameters of the 2-unit model, and we analyze its behavior as we vary q, b_1, and b_2. All parameters and variables are nonnegative and must satisfy the following constraints:

1. The inhibitory unit must be active given a desired activity pattern, but inactive if all place cells are inactive. Equivalently, 0 < θ < 1.

2. The strength of the training input must be much weaker than the desired equilibrium state, or 0 < b_pk ≪ 1. By Eq. (9), this condition is equivalent to w_I(1 − θ) < w_0 < 1 + w_I(1 − θ).

3. When q = 0, the attractor of the megamap should consist of single-peaked activity bumps. In the 2-unit model, this means that when q = 0 and b = 0, the system supports fixed points in which exactly one unit is active. Without loss of generality, suppose that the fixed point in the absence of external input is given by u_1 > 0 and u_2 < 0. We show in Appendix C.2 that the inhibitory unit must be active at such a fixed point. By Eq. (5), such a fixed point satisfies u_1 = w_0 u_1 − w_I(u_1 − θ), so u_1 = w_I θ/(w_I + 1 − w_0), and the inhibitory unit is active (u_1 > θ) only if w_0 > 1. Thus, this condition imposes the constraint w_0 > 1.

4. Finally, the cross-excitation must be small enough such that the desired activity pattern is a fixed point of the system given the training input. With b_1 = b_pk and b_2 = 0, the fixed point must satisfy u_1 = 1 and u_2 = q − w_I(1 − θ) < 0. Thus, this condition imposes the constraint q < w_I(1 − θ) < w_0.
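These constraints are straightforward to check programmatically. The helper below is our own sketch; it uses our reading of Eq. (9), b_pk = 1 − w_0 + w_I(1 − θ), and checks the condition b_pk ≪ 1 loosely as b_pk < 1.

```python
def check_2unit_constraints(w0, q, wI, theta):
    """Return a dict mapping each constraint on the 2-unit model to a bool.
    b_pk = 1 - w0 + wI*(1 - theta) is our reading of Eq. (9); the condition
    b_pk << 1 is checked loosely as b_pk < 1."""
    b_pk = 1.0 - w0 + wI * (1.0 - theta)
    return {
        "1: 0 < theta < 1": 0.0 < theta < 1.0,
        "2: 0 < b_pk < 1": 0.0 < b_pk < 1.0,
        "3: w0 > 1": w0 > 1.0,
        "4: q < wI*(1 - theta)": q < wI * (1.0 - theta),
    }

# The parameters quoted for the 2-unit model in Fig. 3 satisfy all four.
ok = check_2unit_constraints(w0=1.2, q=0.1, wI=5.3, theta=0.9)
```

For example, lowering w_0 to 0.9 violates constraint 3, so no single-active-unit fixed point would exist in the absence of external input.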

Analysis of the Operational Modes of the 2-Unit Model
In accordance with the definitions of the operational modes of the megamap, we specify that the 2-unit model is in the combinatorial mode if there exist stable fixed points in which both units are active and in the WTA mode if any stable fixed point has exactly one active unit. We now analyze the 2-unit model to derive an explicit equation for the critical value of w_0 − q at which the system shifts from the WTA mode to the combinatorial mode. We also analyze how the system responds to conflicting inputs in each mode, depending on the attractor network strength (w_0 − q) and the relative strengths of the competing inputs (b_1 − b_2).

Characterization of the Operational Modes
Assume the 2-unit network is driven by an external input b = [b_1 b_2]^T with b_1 ≥ b_2 ≥ 0. We derive all fixed points and analyze their stability in Appendices C and D, respectively. The main results are summarized below:

• At least one unit must be active at any stable fixed point due to the constraint w_0 > 1.

• A fixed point in which only unit 1 is active exists if and only if Eq. (10) is satisfied. Since w_0 − 1 < w_I, this fixed point exists for all inputs such that b_1 ≥ b_2 if and only if w_0 − q > 1. If the fixed point exists, it is always stable and corresponds to the network encoding only the location with the stronger external input (x_1). The network effectively ignores the weaker input over location x_2.

• A fixed point in which only unit 2 is active exists if and only if Eq. (11) is satisfied. This fixed point exists for some input such that b_1 ≥ b_2 if and only if w_0 − q > 1. If the fixed point exists, it is always stable and corresponds to the network encoding only the location with the weaker external input (x_2). The network effectively ignores the stronger input over location x_1.
• A fixed point in which both units are active is stable if and only if w_0 − q < 1. When w_0 − q < 1, such a fixed point exists if and only if Eq. (12) is satisfied, and the fixed point is unique.
Explicit equations for all fixed points are given in Appendix C. Setting b_1 = b_2 in Eq. (12), we conclude that the system is in the WTA mode when w_0 − q > 1 and in the combinatorial mode when w_0 − q < 1. This result is consistent with the hypothesis that the shift in operational mode observed in the megamap is due to the increase in cross-excitation between cells in the two respective activity bumps (Fig. 2(e)). Although the inhibitory weight and threshold (w_I and θ, respectively) were not varied in our simulations of the megamap, the analysis of the 2-unit reduced model implies that the operational mode depends only on the difference between self- and cross-excitation, w_0 − q, and not on w_I or θ. This is somewhat surprising since the competition between two activity bumps, which underlies the WTA mode, is mediated by feedback inhibition.
In the WTA mode of the 2-unit model, any stable fixed point represents exactly one location. This corresponds to the single-peaked activity bumps always observed in equilibrium states of a relatively small megamap (Fig. 1(c), Fig. 3(a) and (c)). Since Eq. (10) is always satisfied, there are two stable fixed points for a given set of inputs (b_1 ≥ b_2) if and only if Eq. (11) is satisfied. In this case, the equilibrium state reached depends on the initial state, consistent with the hysteresis observed in the WTA mode of the megamap model (Fig. 3(c)).
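The hysteresis described above can be reproduced directly. A minimal sketch, assuming threshold-linear 2-unit dynamics with a lumped instantaneous inhibitory unit, f_I(u) = [f(u_1) + f(u_2) − θ]_+, and the parameter values quoted in the figures (w_0 = 1.2, w_I = 5.3, θ = 0.9); the forward-Euler discretization (τ absorbed into dt) and the choice q = 0.05 are ours:

```python
# Sketch: hysteresis in the WTA regime (w0 - q > 1) under equal conflicting
# inputs b1 = b2. Two initial states lead to two different equilibria.
def relu(x):
    return x if x > 0.0 else 0.0

def equilibrium(u1, u2, w0, q, wI, theta, b1, b2, dt=0.01, steps=30000):
    for _ in range(steps):
        fI = relu(relu(u1) + relu(u2) - theta)   # lumped inhibitory unit
        du1 = -u1 + w0 * relu(u1) + q * relu(u2) - wI * fI + b1
        du2 = -u2 + q * relu(u1) + w0 * relu(u2) - wI * fI + b2
        u1, u2 = u1 + dt * du1, u2 + dt * du2
    return u1, u2

w0, q, wI, theta = 1.2, 0.05, 5.3, 0.9
b1 = b2 = 0.165                      # equal inputs, b1 + b2 = b_pk = 0.33
from_unit1 = equilibrium(1.0, 0.0, w0, q, wI, theta, b1, b2)
from_unit2 = equilibrium(0.0, 1.0, w0, q, wI, theta, b1, b2)
print(from_unit1, from_unit2)        # different equilibria: hysteresis
```

Each equilibrium fully activates one unit while hyperpolarizing the other, so the state reached is determined by the initial condition.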
In the combinatorial mode of the 2-unit model, the stable fixed point represents only the stronger input when Eq. (10) is satisfied and both inputs when Eq. (12) is satisfied. This is consistent with the combinatorial mode of the megamap model, for which the equilibrium state always has one activity bump given a sufficiently large difference in input strengths and two activity bumps given two similar inputs (Fig. 3(a)). Since q > w_0 − 1, Eq. (11) is never satisfied, and the system never shows hysteresis. When both units are active in the equilibrium state, the state vector amplifies the difference in inputs according to u_1 − u_2 = (b_1 − b_2)/(1 − (w_0 − q)) (Eq. (13)). The absence of hysteresis and the amplification of the difference in input strengths are both characteristic of the combinatorial mode of the megamap, as seen in the examples in Fig. 3(a) and (c).
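The amplification of the input difference can be checked numerically under the same assumptions (threshold-linear dynamics with a lumped instantaneous inhibitory unit; the forward-Euler discretization and the choice q = 0.4 are ours):

```python
# Sketch: in the combinatorial mode (w0 - q < 1), the equilibrium activity
# difference u1 - u2 amplifies the input difference b1 - b2.
def relu(x):
    return x if x > 0.0 else 0.0

def equilibrium(u1, u2, w0, q, wI, theta, b1, b2, dt=0.01, steps=30000):
    for _ in range(steps):
        fI = relu(relu(u1) + relu(u2) - theta)   # instantaneous inhibitory unit
        du1 = -u1 + w0 * relu(u1) + q * relu(u2) - wI * fI + b1
        du2 = -u2 + q * relu(u1) + w0 * relu(u2) - wI * fI + b2
        u1, u2 = u1 + dt * du1, u2 + dt * du2
    return u1, u2

w0, q, wI, theta = 1.2, 0.4, 5.3, 0.9   # combinatorial mode: w0 - q = 0.8 < 1
b1, b2 = 0.18, 0.15                     # conflicting inputs, b1 + b2 = b_pk
u1, u2 = equilibrium(1.0, 0.0, w0, q, wI, theta, b1, b2)
gain = (u1 - u2) / (b1 - b2)            # predicted gain: 1/(1 - (w0 - q)) = 5
print(round(u1, 3), round(u2, 3), round(gain, 2))
```

Both units remain active at equilibrium, and the 0.03 difference in input strength becomes a fivefold larger difference in activity, consistent with the gain 1/(1 − (w_0 − q)).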

Bifurcations of the Dynamical System
Our analysis of the 2-unit model reveals four types of qualitative dynamics observed in the model:

• Type I: The state vector converges to a unique equilibrium in which only unit 1 is active.

• Type II: The only active unit in the equilibrium state is again the unit receiving the stronger input.

• Type III: The initially active unit remains the only active unit in the equilibrium state.

• Type IV: Both units are active in the equilibrium state.

Fig. 4: Bifurcation diagrams determined by Eqs. (14)-(16). The initial state is set to the desired activity pattern such that the active unit is the unit driven by the weaker input. We classified the dynamics as Type I or Type II when the only active unit in the equilibrium state is the unit receiving the stronger input, as Type III when the initially active unit remains the only active unit in the equilibrium state, and as Type IV when both units are active in the equilibrium state. (a) The parameters of the 2-unit model approximate the reduced parameters from the megamap model (Eq. (7)), as used in Fig. 3.

We have already shown that Types I, II, and III are found in the WTA mode, while Types I, II, and IV are found in the combinatorial mode. We now derive explicit equations for the bifurcations, or parameter sets on the boundary between two different types of qualitative dynamics, in order to better understand the interplay between the inherent strength of the attractor network (w_0 − q) and the relative strength of the external inputs (Δb = b_1 − b_2). To simplify the analysis, we assume the net external input is constant, or b_1 + b_2 = b_pk. As the learned environment grows from 0 to about 100 m², the only parameter in the optimal megamap with large relative changes is q (Fig. 2(e)). Hence, we hold the parameters w_0, w_I, and θ fixed and determine the bifurcations for the parameters 0 ≤ q < w_I(1 − θ) and −b_pk ≤ Δb ≤ b_pk, where b_pk is given by Eq. (9). We define the function g(x) by Eq. (14) over the domain −b_pk ≤ x ≤ b_pk. Since w_0 − 1 < w_I, g is strictly increasing over its domain, and g(0) = 0. By Eq. (11), the system has Type III dynamics (hysteresis) if and only if q < w_0 − 1 + g(−|Δb|) (Eq. (15)), which is only possible in the WTA mode since g(−|Δb|) ≤ 0. As illustrated in Fig. 4, when the external input into one unit becomes sufficiently stronger than the other, only the unit receiving the stronger input remains active in the equilibrium state as the system transitions to Type I or Type II dynamics. As q becomes larger for a fixed Δb ≠ 0, the active unit increasingly depolarizes the silent unit. If this cross-excitation becomes sufficiently strong, it becomes impossible to maintain an activity bump over the unit receiving less input, again pushing the system into Type I or Type II dynamics. By Eq. (12), the system has Type IV dynamics (two co-stable activity bumps) if and only if q > w_0 − 1 + g(|Δb|) (Eq. (16)), which is only possible in the combinatorial mode since g(|Δb|) ≥ 0. As illustrated in Fig. 4, the system again transitions to Type I or Type II dynamics when the external input into one unit becomes sufficiently stronger than the other. However, increasing q now causes a transition from uni-peaked equilibrium states of Type I or Type II to multi-peaked equilibrium states of Type IV. Increased cross-excitation between the units causes the units to better reinforce one another, counteracting the competition between units induced by feedback inhibition. The bifurcations appear roughly linear for a wide range of weights w_0 and w_I when θ = 0.9 (Fig. 4(a)-(c)). To examine this, let d(x) denote the denominator in Eq. (14). Since −b_pk ≤ Δb ≤ b_pk and b_pk < w_I(1 − θ), the variation in d(x) over the domain vanishes as θ approaches 1. Hence, g(x) approaches a linear function with slope (w_I − (w_0 − 1))/(w_I − (w_0 − 1)/2) as θ approaches 1. The nonlinearities in g(x) are more apparent for smaller values of θ. Figure 4(d) shows an example with θ = 0.5.
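The classification protocol of Fig. 4 can be reproduced numerically. A minimal sketch, assuming the threshold-linear 2-unit dynamics with a lumped instantaneous inhibitory unit and the figure parameters (w_0 = 1.2, w_I = 5.3, θ = 0.9); the forward-Euler discretization and the specific q and Δb values are ours:

```python
# Sketch: start from the pattern in which the unit receiving the *weaker*
# input is active, integrate to equilibrium, and label the dynamics by which
# units remain active (the Fig. 4 classification).
def relu(x):
    return x if x > 0.0 else 0.0

def classify(q, db, w0=1.2, wI=5.3, theta=0.9, dt=0.01, steps=30000):
    b_pk = 1.0 + wI * (1.0 - theta) - w0            # net input held constant
    b1, b2 = (b_pk + db) / 2.0, (b_pk - db) / 2.0   # b1 >= b2
    u1, u2 = 0.0, 1.0                               # weaker-input unit active
    for _ in range(steps):
        fI = relu(relu(u1) + relu(u2) - theta)
        du1 = -u1 + w0 * relu(u1) + q * relu(u2) - wI * fI + b1
        du2 = -u2 + q * relu(u1) + w0 * relu(u2) - wI * fI + b2
        u1, u2 = u1 + dt * du1, u2 + dt * du2
    if u1 > 0.0 and u2 > 0.0:
        return "IV"                                 # two co-active units
    if u2 > 0.0:
        return "III"                                # weaker unit kept winning
    return "I/II"                                   # stronger input won

for q in (0.05, 0.4):                               # WTA vs combinatorial
    print(q, classify(q, 0.0), classify(q, 0.3))
```

With these parameters the sweep reproduces the expected pattern: hysteresis (Type III) only for small q and small Δb, co-active units (Type IV) only for large q and small Δb, and Type I/II once the input difference is large.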

Conclusions
We present a mathematical analysis of the properties of the megamap attractor neural network that emerge when the network represents a sufficiently large spatial environment [10]. Through stability analysis of the full megamap model, we derive a numerical test (Eq. (3)) for determining the operational mode of the dynamical system (Eq. (1)). In addition, we derive a linear mapping from the N-dimensional megamap model to a two-dimensional reduced model that has the same qualitative dynamics. Our analysis of the 2-unit model elucidates the role of each parameter of the full megamap model in the context of conflicting external inputs (Fig. 4). In particular, we show that the abrupt shift in operational mode occurs when q ≈ w_0 − 1, where w_0 and q are proportional to the average recurrent excitation between two cells in the same unit and in different units, respectively (Eq. (7)). The inhibitory weight does not affect the operational mode, but increasing w_I increases the range of q, resulting in a larger range of the relative strength of inputs (b_pk^1 − b_pk^2) for which there are two co-stable activity bumps (Type IV dynamics). The inhibitory threshold (θ) also does not affect the operational mode, but the bifurcations described by Eqs. (14)-(16) approach linear functions of b_pk^1 − b_pk^2 as θ approaches 1. This work is similar in nature to numerous theoretical studies of excitatory-inhibitory networks [39,40]. In many of these studies, two populations of neurons are considered, where one population represents excitatory cells and the other inhibitory cells. The recurrent circuitry among inhibitory cells is often neglected, simplifying the analysis. We consider two populations of excitatory neurons, each with extensive recurrent circuitry, and a third population of inhibitory neurons.
We simplify the dynamical system by lumping all inhibitory neurons into a single inhibitory unit under the assumption that all inhibitory cells are statistically identical since interneurons in the hippocampus do not appear to have strong spatial tuning [41,42]. We also assume that the time constant of the inhibitory state is much smaller than that of excitatory cells, allowing us to approximate the inhibitory state as an instantaneous function of the excitatory activity vector. Without this simplification, it is likely that we would observe oscillations between activity bumps under some parameter sets [18].
A common approach to analyzing continuous attractor neural networks is to approximate the N-dimensional system of ordinary differential equations (Eq. (1)) by a partial differential equation in the limit N → ∞. The state vector, u(t) ∈ R^N, then becomes a continuous function, u(x, t) ∈ R, where x is a continuous variable representing the single preferred location of a given place cell. The cleanest results are obtained using a Heaviside activation function for f(u), since one can then solve for the radius of the activity bump at a fixed point [14,43]. Using a similar approach, we derived clean expressions for the set of stable fixed points; however, we found that the combinatorial mode does not exist given the Heaviside activation function in our dynamical system. Other mathematical studies have used Fourier analysis to analyze the PDE given the threshold-linear activation function used for the megamap model [40,44]. Even when we approximate the recurrent weights using only the first two terms of the Fourier series, however, the recurrent circuitry among both populations of neurons renders the solutions too complex to be helpful in understanding how the parameters of the model affect the dynamics. The approaches we present in this study require only a few justified approximations of the full megamap model, and the simplicity of the results makes the analysis useful for understanding the behavior of the megamap. Despite its simplicity, the numerical test accurately determines the operational mode of the full system (Fig. 1), and the reduced model has qualitative behavior similar to that of the full model (Figs. 2 and 3).
While we focus on a particular attractor neural network, the results apply to a broad class of attractor network models. The numerical test for determining the operational mode (Eq. (3)) applies to any attractor network model in which the state vector is governed by Eq. (1), a standard firing rate model derived by averaging neuronal activity over multiple trials [33]. The reduced 2-unit model applies to any attractor neural network in which the four approximations outlined in Sect. 3.1 are valid. This includes not only continuous attractor neural networks but also discrete attractor neural networks such as Hopfield networks with graded neuronal responses [2]. In the latter case, the set S_k used in the reduction of the full model is the set of all cells that are active in embedded activity pattern k. It is not necessary for the embedded activity patterns to have the shape of the Gaussian-like activity bumps considered here.
When considering the reduced model, it is important to understand the impact of the approximations underlying the linear mapping from the full model. For the megamap, the first three approximations neglect the variability in embedded activity patterns and weights due to the Poisson distribution of place fields [10]. This variability includes asymmetries in the full weight matrix, W. We find numerically that, as long as W is a relatively small perturbation of a symmetric matrix, the asymmetries have a negligible effect on the dynamics. For example, we observe only a slight difference between the transition point between operational modes determined by numerical simulations and the stability test (∼25 m²) and that determined by the reduced state variables (w_0 − q ≈ 1.05 at 25 m², as seen in Fig. 2(e)). This result is not surprising, as uncorrelated random perturbations of the weight matrix in a Hopfield network have been shown to have a small effect on the dynamics [45,46]. The fourth approximation underlies the qualitative differences between the full megamap model and the 2-unit model. In particular, the variable radius of the activity bump underlies the nonlinearities observed in the megamap's response to the conflicting input (Fig. 3(a) and (c)). In general, the reduced model captures the peak of the activity pattern, but it does not capture changes in the subset of active cells within each unit.
There are several natural directions in which the reduced model presented here could be extended. For example, one could examine how the attractor network responds to M conflicting external inputs, where M ≥ 2. As long as these inputs are well separated spatially, an M-dimensional reduced model could be derived exactly as shown for M = 2 in Sect. 3.1. Using the same four approximations, the reduced model for M inputs would be

τ du_k/dt = −u_k + w_0 f(u_k) + q Σ_{j≠k} f(u_j) − w_I f_I(u) + b_k, k = 1, . . . , M.

The reduction equations (Eqs. (6)-(7)) and the four constraints would be unchanged. There are several intriguing questions that could be addressed by this model. For example, does the value of q at which there exists a fixed point with m ≥ 2 stable, co-active units depend on m? If so, the definition of the combinatorial mode would need to be reconsidered. Another interesting question is whether hysteresis emerges in the combinatorial mode when M > 2. For example, it is possible that, for a particular parameter set, any stable fixed point has two co-active units, but the subset of co-active units depends on the initial state.
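This M-unit extension can be simulated directly. A minimal sketch for M = 3, assuming each unit excites itself with w_0 and every other unit with q while sharing one instantaneous inhibitory unit; all numerical choices (q = 0.4, equal inputs, initial state, Euler step) are ours:

```python
# Sketch of the M-unit generalization (M = 3) with forward Euler; tau is
# absorbed into the step size dt.
def relu(x):
    return x if x > 0.0 else 0.0

def m_unit_step(u, w0, q, wI, theta, b, dt=0.01):
    f = [relu(ui) for ui in u]
    fI = relu(sum(f) - theta)              # lumped inhibitory unit
    return [ui + dt * (-ui + w0 * fi + q * (sum(f) - fi) - wI * fI + bi)
            for ui, fi, bi in zip(u, f, b)]

w0, q, wI, theta = 1.2, 0.4, 5.3, 0.9      # combinatorial regime (w0 - q < 1)
M = 3
b = [0.11] * M                             # three equal conflicting inputs
u = [0.5, 0.4, 0.3]
for _ in range(30000):
    u = m_unit_step(u, w0, q, wI, theta, b)
print([round(ui, 3) for ui in u])
```

Under these particular parameters the simulation settles into a symmetric state with all three units co-active; whether the critical q for m co-active units depends on m is exactly the open question raised above.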
A second possible extension would be to relax the fourth approximation of the reduced model to examine the spatial effects of the activity bump on the attractor. This could be done by modeling n ≪ N place cells for each unit, setting the reduced weight matrix W_0 ∈ R^{n×n} through a Gaussian tuning curve, and setting the reduced weight matrix Q ∈ R^{n×n} as a random matrix whose entries are much smaller than those of W_0. It would be interesting to compare the operational modes and bifurcations of this 2n-dimensional model to those of the two-dimensional model presented here.
A third possible extension would be to use the reduced model to explore remapping. In the current study, the full weight matrix is set during a learning phase in which the place cell activity is fixed at the desired activity pattern, and the network is driven by strong external inputs. Then the dynamics of the model are examined during a retrieval phase in which the weights are constant, and the recurrent input is stronger than the external input. This separation into a learning phase and retrieval phase is common in attractor neural network models in which the weights are incrementally learned [6,35,47], and there is experimental evidence supporting, at least in part, the use of two separate phases. For example, it has been observed experimentally that the acetylcholine system is more activated during the initial exploration of a novel space than when the animal is moving around in a familiar space, and acetylcholine may increase the strength of afferent input connections relative to feedback recurrent connections [48]. Nonetheless, it would be an interesting and relevant study to address how the dynamics change given plasticity in the recurrent weights during the retrieval phase, as is more biologically realistic. Exploring remapping mathematically would require a more complex reduced model that incorporates differential equations for w 0 (t) and q(t). The basic Hebbian learning rule is unstable, and the manner in which stability is maintained would affect the set of stable fixed points [33]. Another key factor would be the learning rate. In particular, when the two external inputs have equal strength, then two activity bumps initially become co-active in the WTA mode when the weights are constant. In the full model, this co-activity could last for hundreds of ms before one activity bump dominates [10]. 
Given Hebbian learning, the place cells in each unit would begin to reinforce each other's activity, effectively increasing q and possibly driving the system to the combinatorial mode.
There are many contexts in which an attractor neural network must resolve conflicting information from its rich array of neuronal inputs. For example, it is a common experimental paradigm to manipulate different cues in different ways in order to track how information flows through various levels of neural processing [49,50]. The WTA mode is ideal for robust memory retrieval, allowing the attractor network to perform computations such as transforming a noisy external input into a coherent, embedded activity pattern. On the other hand, the combinatorial mode permits a flexible recombination of embedded activity patterns in response to a changed environment. This flexibility could lead to phenomena such as the partial remapping observed in hippocampal place cells [6,10,31]. Perhaps the ideal attractor neural network operates between these two extremes, robustly encoding memories while still having the flexibility to adapt to our ever-changing world. The reduction method presented in this paper is a useful tool for simplifying the mathematical analysis of various behaviors of attractor network models to better understand how these behaviors depend on the network parameters and the learning process.
Thus, v · ∇f_I(v) = χ_{S_I}(inh) f_pk 1^T D(S) v. Substituting these expressions back into Eq. (17) and dropping terms that are second order in the perturbation, we arrive at the stability criterion of Eq. (3). This equation provides a numerical test to determine the stability of any fixed point.

Appendix B: Reduction of the Megamap
In this appendix, we map Eq. (1) to Eq. (5) using the approximations described in Sect. 3.1. Consider the equation governing u_1, obtained from Eq. (1) via Eq. (6), where N ≈ |S_1| ≈ |S_2|. Let S_k(t) be the set of all active cells near x_k at time t. If there is an activity bump over x_1 at time t, then we assume S_1(t) ≈ S_1 and u_1(t) > 0, implying that the recurrent input from cells in S_1 reduces to w_0 f(u_1) by Eq. (7). Similarly, if there is an activity bump over x_2 at time t, then we assume S_2(t) ≈ S_2 and u_2(t) > 0, implying that the recurrent input from cells in S_2 reduces to q f(u_2). It is reasonable to consider w_ij and f_j(x_2) as independent random variables for the following reason. The values of w_ij and f_j depend on the preferred locations of cells i and j, which are Poisson random variables. Since S_1 ∩ S_2 ≈ ∅, when i ∈ S_1 and j ∈ S_2, w_ij depends only on the place fields of cells i and j that are away from both x_1 and x_2, and f_j(x_2) depends only on the place fields of cell j near x_2. Since all preferred locations are set as independent random variables, w_ij and f_j(x_2) are approximately independent, and the averaged cross term reduces to the reduced weight q by Eq. (7).
Putting it all together, we have derived a linear mapping from Eq. (1) to the differential equation governing u_1 in Eq. (5). An analogous argument can be used to derive the equation governing u_2 (Eq. (5)).

Appendix C: Fixed Points of the 2-Unit Model
A fixed point of the 2-unit model is any solution of Eq. (18), obtained by setting du/dt = 0 in Eq. (5). We also assume that all parameters satisfy the constraints outlined in Sect. 3.2.

C.2 One Active Unit
A fixed point with exactly one active unit corresponds to a single activity bump on the megamap. We will consider two cases separately.

Case 1
The unit receiving more input is the active unit, or u_1 > 0 and u_2 < 0. If the inhibitory unit were silent, the second row of Eq. (18) would give u_2 = q u_1 + b_2 ≥ 0, contradicting u_2 < 0. Thus, f_I(u) = u_1 − θ > 0. Substituting this back into Eq. (18), we find

u_1 = (w_I θ + b_1)/(1 + w_I − w_0)

and

u_2 = (q − w_I) u_1 + w_I θ + b_2.

Since u_1 > 0 for any permissible parameters, this fixed point exists if and only if u_2 < 0, or equivalently,

b_2 < (w_I − q) u_1 − w_I θ (Eq. (10)).

Case 2
The unit receiving less input is the active unit, or u_1 < 0 and u_2 > 0. If the inhibitory unit were silent, the first row of Eq. (18) would give u_1 = q u_2 + b_1 ≥ 0, contradicting u_1 < 0. Thus, f_I(u) = u_2 − θ > 0. Substituting this back into Eq. (18), we find

u_2 = (w_I θ + b_2)/(1 + w_I − w_0)

and

u_1 = (q − w_I) u_2 + w_I θ + b_1.

Since u_2 > 0 for any permissible parameters, this fixed point exists if and only if u_1 < 0, or equivalently,

b_1 < (w_I − q) u_2 − w_I θ (Eq. (11)).
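As a consistency check, the closed-form Case 1 fixed point implied by Eq. (18) under the assumed threshold-linear dynamics can be compared with direct integration; the parameter and input values below are ours:

```python
# Sketch: compare the Case 1 closed form (only unit 1 active) with a
# forward-Euler integration of the 2-unit dynamics (tau absorbed into dt).
def relu(x):
    return x if x > 0.0 else 0.0

w0, q, wI, theta = 1.2, 0.05, 5.3, 0.9
b1, b2 = 0.2, 0.1                      # unit 1 receives the stronger input

# Closed-form fixed point with only unit 1 active:
u1_star = (wI * theta + b1) / (1.0 + wI - w0)
u2_star = (q - wI) * u1_star + wI * theta + b2
assert u2_star < 0.0                   # existence condition: unit 2 silent

u1, u2 = 1.0, 0.0
for _ in range(30000):
    fI = relu(relu(u1) + relu(u2) - theta)
    du1 = -u1 + w0 * relu(u1) + q * relu(u2) - wI * fI + b1
    du2 = -u2 + q * relu(u1) + w0 * relu(u2) - wI * fI + b2
    u1, u2 = u1 + 0.01 * du1, u2 + 0.01 * du2

print(round(u1, 4), round(u2, 4))
```

The integrated state lands on the closed-form values, with unit 2 hyperpolarized below zero as required.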

C.3 Two Active Units
A fixed point with two active units, or u_1 > 0 and u_2 > 0, corresponds to two activity bumps on the megamap, each encoding a different location in the environment. Suppose the inhibitory unit is silent, or f_I(u) = 0. Equation (18) becomes

(1 − w_0) u_1 − q u_2 = b_1,
−q u_1 + (1 − w_0) u_2 = b_2.

Summing the two equations gives (1 − w_0 − q)(u_1 + u_2) = b_1 + b_2 ≥ 0, and since w_0 > 1, this implies u_1 + u_2 ≤ 0. The latter statement contradicts the assumption that u_1 > 0 and u_2 > 0. Thus, at any fixed point with two active units, the inhibitory unit must be active, or f_I(u) = u_1 + u_2 − θ > 0. Substituting this back into Eq. (18), the fixed point must satisfy

(1 − w_0 + w_I) u_1 + (w_I − q) u_2 = w_I θ + b_1,
(w_I − q) u_1 + (1 − w_0 + w_I) u_2 = w_I θ + b_2.

The determinant of the coefficient matrix is given by d = (q − (w_0 − 1))(2 w_I − (w_0 − 1) − q). Note that 2 w_I − (w_0 − 1) − q > 0. If q = w_0 − 1, then a fixed point exists if and only if b_1 = b_2. Under these conditions, the set of all fixed points is given by the line segment

u_1 + u_2 = (w_I θ + b_1)/(1 + w_I − w_0) (Eq. (20)),

where u_1 > 0 and u_2 > 0.
The inhibitory unit is active along this segment since u_1 + u_2 ≥ θ/(1 − (q/w_I)) > θ given any set of permissible parameters. If q ≠ w_0 − 1, then the system has the unique fixed point

u_1 = [(1 − w_0 + w_I)(w_I θ + b_1) − (w_I − q)(w_I θ + b_2)]/d,
u_2 = [(1 − w_0 + w_I)(w_I θ + b_2) − (w_I − q)(w_I θ + b_1)]/d (Eq. (21)).

The inhibitory unit is again active at the fixed point given any permissible parameters since u_1 + u_2 ≥ 2 w_I θ/(2 w_I − (w_0 − 1) − q) > θ. All that remains is to determine the parameters for which u_1 > 0 and u_2 > 0. Consider the activity difference, u_1 − u_2 = (b_1 − b_2)/(1 − (w_0 − q)).
If q < w_0 − 1, then u_1 < u_2 when b_1 > b_2, implying that the unit receiving less input has a higher activity level at the unique fixed point. While such a fixed point may exist for certain parameters, we will later show that it is not stable. If q > w_0 − 1, then u_1 ≥ u_2, and by Eq. (21), u_2 > 0 if and only if Eq. (12) is satisfied. Thus, this fixed point exists only for a sufficiently small input difference, b_1 − b_2, or a sufficiently large weight between units, q.

Appendix D: Stability of a Fixed Point of the 2-Unit Model
Since the 2-unit model has the same form as the megamap model, the stability of a fixed point in which u_1 ≠ 0, u_2 ≠ 0, and u_1 + u_2 ≠ θ is also determined by the stability test of Eq. (3), where S is now the set containing the indices of all active units, and f_pk = 1 due to the rescaled activation function in the reduced model. We now evaluate r(S, S_I) to determine the stability of the various fixed points found in Appendix C (Eqs. (19)-(21)).

D.4 No Active Units
The only fixed point with no active unit is u_1 = u_2 = 0, which exists if and only if b_1 = b_2 = 0. Since Eq. (3) does not apply to this fixed point, suppose the state is perturbed from the 0-vector at time t_0 such that u_1(t_0) > 0, u_2(t_0) > 0, and u_1(t_0) + u_2(t_0) < θ. While both states are positive, the state vector is governed by

τ du/dt = −(I − W)u + b,

where I denotes the 2 × 2 identity matrix, and W ≡ [w_0 q; q w_0]. Since w_0 + q > 1, the matrix I − W has a negative eigenvalue along [1 1]^T, so the perturbation grows and this fixed point is unstable.
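The departure from the 0-vector can be illustrated in this linear subthreshold regime (a sketch; b = 0 and all numerical choices are ours):

```python
# Sketch: instability of the origin while u1 + u2 stays below the inhibitory
# threshold. In this regime the dynamics are linear, tau du/dt = -(I - W)u,
# and the sum mode grows at rate w0 + q - 1 > 0. Forward Euler, tau in dt.
w0, q = 1.2, 0.1
u1 = u2 = 1e-3                    # small perturbation of the 0-vector
dt = 0.01
for _ in range(1000):             # t = 10, still far below theta = 0.9
    du1 = -u1 + w0 * u1 + q * u2
    du2 = -u2 + q * u1 + w0 * u2
    u1, u2 = u1 + dt * du1, u2 + dt * du2
print(u1, u2)                     # grown by roughly exp(0.3 * 10), about 20x
```

The symmetric perturbation is amplified roughly twentyfold over ten time constants, confirming that the silent state cannot persist once w_0 + q > 1.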

D.5 One Active Unit
Without loss of generality, assume unit 1 is the active unit, or u_1 > 0 and u_2 < 0. At a fixed point, u_1 > θ, and so by Eq. (3), the fixed point is stable if and only if w_0 − w_I < 1. Since w_0 − w_I < 1 + w_I(1 − θ) − w_I < 1, a fixed point with exactly one active unit is always stable.

D.6 Two Active Units
Finally, suppose u_1 > 0 and u_2 > 0. Since f_I(u) > 0 at the fixed point, the stability test is again given by λ_max(M) < 1, where

M = W − w_I 1 1^T = [w_0 − w_I q − w_I; q − w_I w_0 − w_I].

For any set of permissible parameters, the fixed point is stable to even perturbations (in the direction of the eigenvector v_+ = [1 1]^T) since λ_+ = w_0 + q − 2 w_I < 1. However, the fixed point is stable to odd perturbations (in the direction of the eigenvector v_− = [1 −1]^T) if and only if q > w_0 − 1, since λ_− = w_0 − q. Thus, for a fixed self-excitatory weight w_0, the system may transition from a mode in which this fixed point is unstable (WTA mode) to stable (combinatorial mode) as the cross-excitatory weight q increases.
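The two eigenmodes can be tabulated for the parameter values used earlier (w_0 = 1.2, w_I = 5.3; the q values are ours):

```python
# Sketch: eigenvalues of the symmetric 2x2 matrix M that governs stability
# when both excitatory units and the inhibitory unit are active. The even
# eigenvector [1, 1] gives lam_plus = w0 + q - 2*wI; the odd eigenvector
# [1, -1] gives lam_minus = w0 - q. Stability requires both to be < 1.
def mode_eigenvalues(w0, q, wI):
    lam_plus = w0 + q - 2.0 * wI     # even (sum) perturbations
    lam_minus = w0 - q               # odd (difference) perturbations
    return lam_plus, lam_minus

w0, wI = 1.2, 5.3
for q in (0.05, 0.4):
    lam_plus, lam_minus = mode_eigenvalues(w0, q, wI)
    print(q, lam_plus < 1.0, lam_minus < 1.0)  # the odd mode decides the mode
```

The even eigenvalue is strongly negative for any permissible parameters, so only the odd eigenvalue w_0 − q determines whether the two-bump fixed point is stable, which is exactly the WTA-to-combinatorial criterion.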

Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.