Inhomogeneous Sparseness Leads to Dynamic Instability During Sequence Memory Recall in a Recurrent Neural Network Model
© D. Medina, C. Leibold; licensee Springer 2013
Received: 8 March 2013
Accepted: 24 June 2013
Published: 22 July 2013
Theoretical models of associative memory generally assume most of their parameters to be homogeneous across the network. Biological neural networks, by contrast, exhibit high variability of structural as well as activity parameters. In this paper, we extend Willshaw's classical clipped learning rule to networks with inhomogeneous sparseness, i.e., networks in which the number of active neurons may vary across memory items. We evaluate this learning rule for sequence memory networks with instantaneous feedback inhibition and show that, unsurprisingly, memory capacity degrades with increased variability in sparseness. The loss of capacity, however, is very small for short sequences of fewer than about 10 associations. Most interestingly, we further show that, owing to feedback inhibition, overly large patterns are much less detrimental to memory capacity than overly small patterns.
Many brain areas exhibit extensive recurrent connectivity. Over decades, such neuronal feedback has attracted a large body of theoretical modeling [1–3], and one of the most prominent functions proposed for recurrent synaptic connections is that of associative memory. In most such theories, all memory items are treated as equal, particularly in terms of the sparseness with which they are neurally represented, i.e., in terms of how many neurons are active during recall. In this paper, we extend a particular class of such auto-association networks, viz., sequence memory networks, to include variable sparseness, and thereby add one aspect of the variability that is to be expected in biological neural networks.
Memory sequences have been shown to occur in the rodent brain during hippocampal sharp-wave ripple events [4, 5]. The major hypothesis of the present paper is that these sequences are stored in the recurrent connections of the hippocampal network, which is supported by findings of fast coordinated excitatory synaptic currents during sharp-wave ripple events in slices [6].
Here, we build on a previous model of memory sequences [7], which enhances memory capacity by instantaneous feedback inhibition. Both our mean field analysis and our simulations show that, in this model, inhomogeneity in pattern sizes reduces memory capacity, but it does so in an asymmetric way: whereas overly small patterns significantly compromise the stability of sequence recall, overly large patterns can be compensated for quite robustly.
Here, as in many other approaches, we define Θ as the Heaviside step function, which is equivalent to restricting the neuron states to binary variables, with x_i(t) = 1 if neuron i fires and x_i(t) = 0 if it is silent. The state of the network at time t is thus denoted by the binary vector x(t) ∈ {0,1}^N. The other parameters are the firing threshold θ and the synaptic weights w_ij.
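The binary dynamics described here can be sketched in a few lines of Python. This is an illustrative sketch only; the variable names `W` and `theta` and the strict threshold comparison are our assumptions based on the description above ("exceeds the threshold"):

```python
import numpy as np

def update_step(x, W, theta):
    """One synchronous step of the binary dynamics: a neuron fires at
    t+1 if its postsynaptic potential at time t exceeds the threshold,
    i.e., x(t+1) = Theta(W @ x(t) - theta), applied elementwise."""
    h = W @ x                        # postsynaptic potentials
    return (h > theta).astype(int)   # Heaviside step with strict comparison

# Toy network: 4 neurons, all-to-all unit weights, threshold 1.
x0 = np.array([1, 1, 0, 0])
W = np.ones((4, 4), dtype=int)
x1 = update_step(x0, W, theta=1)     # every neuron receives input 2
```

With two active neurons, every unit receives an input of 2, which exceeds the threshold of 1, so the whole network fires in the next step; a single active neuron delivers an input of exactly 1, which does not exceed the threshold, and activity dies out.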
In the framework of the dynamical system of (1), associative memory is considered to be the (approximate) recall of a network state at time t+1 after the network has been initialized with some appropriate cue at time t. Recall can either occur as convergence to a dynamical attractor, or as a one-step association [8, 9]. The specific choice of the synaptic matrix determines which patterns can be recalled, i.e., are stored in the network.
In this paper, we deal with sequences of one-step associations with binary synapses [8, 10, 11] and instantaneous feedback inhibition [7, 9, 12]. Memory sequences are described as sequences of random activity patterns ξ, which are binary vectors of dimension N, ξ ∈ {0,1}^N. A memory sequence of length Q is an ordered occurrence of activity patterns ξ^1, …, ξ^Q. The number of active neurons in each pattern is called the pattern size and will in general be different for each pattern.
As proposed in [7, 9, 11], we model the weights of the binary synapses as products of two independent binary stochastic processes, w_ij = c_ij s_ij. The first stochastic variable c_ij indicates the presence (c_ij = 1) or absence (c_ij = 0) of a morphological connection, which occurs with probability c_m, called the morphological connectivity. The second process s_ij is called the synaptic state and is used to store memories. In the potentiated state (s_ij = 1), a presynaptic spike increments the postsynaptic potential by 1, whereas in the silent state (s_ij = 0), the postsynaptic potential remains unaffected. According to (1), neuron i fires a spike at cycle t+1 if its postsynaptic potential at time t exceeds the threshold θ.
Willshaw's [10] clipped Hebbian rule is used to set the synaptic states such that the network is able to recall the memory sequences: a synapse is in the potentiated state if and only if it connects two neurons that are activated in sequence at least once.
where f, the coding ratio, denotes the fraction of neurons active in a pattern. The effective connectivity c defines the noise level during recall, i.e., how many spurious inputs a neuron receives that are not part of the memory pattern to be recalled. If c is too large, the network exhibits many spurious activations and the memory can no longer be recalled. Equation (2) thus provides a capacity estimate of the network in that it states how many associations are stored at the maximum tolerable noise level c.
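For homogeneous coding ratios, this capacity estimate can be made concrete with the textbook Willshaw form of the effective connectivity, c = c_m · [1 − (1 − f²)^P]; the closed form used below is this standard approximation and is not necessarily identical to (2):

```python
import numpy as np

def effective_connectivity(P, c_m, f):
    """Textbook Willshaw estimate for homogeneous coding ratio f:
    each association leaves a given synapse silent with probability
    1 - f**2, so after P associations the effective connectivity is
    c = c_m * (1 - (1 - f**2)**P)."""
    return c_m * (1.0 - (1.0 - f**2) ** P)

def max_associations(c_max, c_m, f):
    """Invert the estimate: the number of associations that drives
    the effective connectivity up to the tolerable noise level c_max."""
    return np.log(1.0 - c_max / c_m) / np.log(1.0 - f**2)

# Example: c_m = 0.5, f = 0.1, tolerable noise level c_max = 0.25.
P_max = max_associations(0.25, 0.5, 0.1)
```

For these illustrative parameters, roughly 69 associations can be stored before the noise level reaches c_max.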
The results show that the variability of ς is actually relatively large (about 10 % for the parameters considered here) and even increases with an increasing number P of associations. We therefore decided not to use the expectation value for the further discussion, but to show empirical distributions over many realizations of ϕ whenever possible.
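Willshaw's clipped Hebbian rule for sequences can itself be sketched compactly; the function and variable names below are ours, not from the paper:

```python
import numpy as np

def store_sequence(patterns):
    """Clipped Hebbian storage of a pattern sequence (Willshaw rule):
    the synaptic state s_ij is set to 1 iff presynaptic neuron j is
    active in some pattern and postsynaptic neuron i is active in the
    immediately following pattern; repeated storage never increments
    a synapse beyond 1 (the 'clipping')."""
    N = len(patterns[0])
    S = np.zeros((N, N), dtype=int)
    for pre, post in zip(patterns[:-1], patterns[1:]):
        S |= np.outer(post, pre)   # binary OR clips at 1
    return S

# Three patterns of different sizes over N = 5 neurons.
seq = [np.array([1, 1, 0, 0, 0]),
       np.array([0, 0, 1, 1, 0]),
       np.array([0, 0, 0, 0, 1])]
S = store_sequence(seq)
```

Note that storage is directional: the synapse from neuron 0 (active in the first pattern) to neuron 2 (active in the second) is potentiated, while the reverse synapse is not.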
Willshaw's learning rule yields correlations between the synaptic states; these are captured by additional correction terms (see the Appendix).
where Φ denotes the cumulative distribution function (cdf) of the standard normal distribution.
In the framework of this model, and following [7, 12], inhibition is introduced as an instantaneous negative feedback proportional to the total number of active neurons at time t. Formally, this is achieved by subtracting, in (10) and (11), a term proportional to the population activity from the synaptic input, with the proportionality constant w_I denoting the inhibitory weight. The inhibitory weight is kept fixed throughout the paper (for a discussion, see [7]).
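The effect of instantaneous linear feedback inhibition on the update rule can be sketched as follows; the explicit subtraction form and the names are our assumptions based on the description above:

```python
import numpy as np

def update_with_inhibition(x, W, theta, w_I):
    """Binary update with instantaneous linear feedback inhibition:
    each neuron's potential is reduced by w_I times the total number
    of currently active neurons before thresholding."""
    h = W @ x - w_I * x.sum()
    return (h > theta).astype(int)

# Toy network: 4 neurons, all-to-all unit weights, 2 active neurons.
W = np.ones((4, 4), dtype=int)
x = np.array([1, 1, 0, 0])
weak = update_with_inhibition(x, W, theta=0.5, w_I=0.5)   # h = 2 - 1 = 1
strong = update_with_inhibition(x, W, theta=0.5, w_I=1.0)  # h = 2 - 2 = 0
```

The same cue survives under weak inhibition but is extinguished under strong inhibition, which illustrates how the inhibitory weight trades off activity control against the risk of silencing the network.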
3.1 Inhomogeneous Sparseness Reduces Dynamic Stability
In summary, the range of thresholds under which the network successfully replays the full sequence is severely reduced as the pattern sizes become more and more inhomogeneous.
3.1.1 Replay Success Rate
as the relative difference between the hit ratio and the false alarm ratio, and consider a pattern at time t to be retrieved successfully if this quality measure exceeds a fixed criterion. By running the mean field equations many times with different random realizations of the vector ϕ, we obtain an empirical replay success rate as the fraction of runs with successful retrieval.
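One plausible reading of this quality measure, sketched in Python (the exact normalization in the paper's equation is not reproduced here, so treat this as an assumption):

```python
import numpy as np

def retrieval_quality(x, xi):
    """Relative difference between hit ratio and false-alarm ratio:
    hit ratio        = fraction of the target pattern's neurons that fire,
    false-alarm ratio = fraction of the remaining neurons that fire."""
    hits = (x & xi).sum() / xi.sum()
    false_alarms = (x & (1 - xi)).sum() / (1 - xi).sum()
    return hits - false_alarms
```

Under this reading, perfect recall yields 1, the all-active (epileptic) state yields 0 (every hit is matched by a false alarm), and activating exactly the wrong neurons yields −1.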
3.1.2 Region of Stable Replay
3.2 Storage Capacity
as the maximum number of time steps for which the replay success rate remains above 90 % for a given pattern size vector ϕ and morphological connectivity c_m. Since replay stability strongly depends on the firing threshold, the maximum is taken over all possible firing thresholds θ.
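This capacity measure can be sketched as a search over firing thresholds; `success_rate` below stands in for the empirical replay success rate obtained from the mean field runs and is an assumed callable, not part of the paper's notation:

```python
def capacity(success_rate, thetas, q_max, criterion=0.9):
    """Maximum number of time steps for which the replay success rate
    stays above `criterion`, maximized over the firing threshold."""
    best = 0
    for theta in thetas:
        t = 0
        while t < q_max and success_rate(theta, t) > criterion:
            t += 1
        best = max(best, t)
    return best

# Toy stand-in: replay survives exactly `theta` steps.
toy = lambda theta, t: 1.0 if t < theta else 0.0
T = capacity(toy, thetas=[2, 5, 3], q_max=10)
```

In the toy example the best threshold sustains replay for 5 steps, so the capacity is 5.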
3.3 Asymmetry of the Size Distribution
From our observations of single runs in Fig. 3, we already derived some anecdotal insight into the mechanisms underlying the breakdown of sequence replay: network activity may die out after a too small pattern, whereas large patterns may drive the network into an epileptic, all-active state. However, it is unclear which of these two modes of terminating replay is more problematic, or whether both occur equally often.
A symmetric triangular distribution (Fig. 8b) serves as the reference. To study the effect of an excess of small patterns, we constructed a negatively skewed distribution (Fig. 8a) by cutting away all patterns above the line of symmetry and adding smaller patterns instead. Since this distribution has a lower mean coding ratio, we increased the number P of patterns to obtain the same "noise" connectivity c as in the symmetric case. The region of stable replay is clearly reduced for the negatively skewed distribution as compared to the symmetric one. This reduction could arise either because small pattern sizes are intrinsically bad, or because the asymmetry of the distribution is itself the limiting factor. We therefore also considered the case of an excess of large patterns (Fig. 8c). For such a positively skewed distribution, the asymmetry is the same as for the negatively skewed distribution; interestingly, however, the region of stable replay is larger than for the symmetric distribution. Again, the connectivity was adjusted to the same value, here by reducing the number P of associations to compensate for the higher mean coding ratio.
From these observations, we conclude that the small patterns are indeed much more problematic for replay with inhomogeneous pattern sizes than the large patterns. To understand why, we compared the shapes of the replay regions for the three distributions (a, b, and c) and observe that the slope of the lower side of the wedge is relatively insensitive to skewness, whereas the slope of the upper side of the wedge is very different in each case. Failures owing to activity explosion (the lower side of the wedge) are almost independent of the skewness, due to the instantaneous feedback inhibition in the mean field equations. The upper side of the wedge, on the other hand, is determined by the network's falling into a silent state. Thus, the positively skewed distribution (c) is the more robust one. Note that this was also apparent in Fig. 5, where the reduction of the region of stability with increasing inhomogeneity was much more pronounced on the upper side of the wedges than on the lower side, despite the relatively symmetric Gamma distribution used there.
Again, these results show that the small patterns are more detrimental for sequence replay than the large patterns, since in the latter case fluctuations can be compensated for by feedback inhibition, whereas the former have no compensatory mechanism.
3.4 Nonlinear Inhibition
So far, we have assumed a linear dependence of the instantaneous feedback inhibition on the total network activity, since this was shown to optimize replay quality [7, 12]. In this final section, we investigate how a particular nonlinear form of inhibition can improve the network's resilience to inhomogeneity, because (a) physiological data from cortical inhibitory networks suggest a supralinear dependence on input [13, 14], and (b) supralinear inhibition effectively provides positive feedback (relative to linear inhibition) in cases of too low activity.
Figure 10 shows the mean retrieval quality (averaged over 10² random realizations of the vector ϕ) in the parameter plane, for linear (a) and nonlinear (b) inhibition and two levels of inhomogeneity.
For low inhomogeneity, although the region of stability is wider in the nonlinear case, the retrieval quality in the newly gained region is not as good as in the region shared by both feedback strategies (see the lighter red stripe in the middle panel of Fig. 10b). This finding fits well with previous reports that linear inhibition maximizes replay quality for homogeneous pattern sizes: the gain in robustness is mostly bought with a reduced replay quality. For large inhomogeneity, linear feedback almost completely extinguishes replay, whereas nonlinear inhibition recovers a considerable region of stable replay with high retrieval quality.
Supralinear inhibitory feedback at low activity levels thus significantly widens the replay region, making the network resilient to higher levels of inhomogeneity than would be possible with linear feedback. The underlying mechanism can be explained as follows: smaller-than-average patterns generate only little negative feedback and thereby keep up the activity in the network, whereas bigger-than-average patterns are compensated for optimally by linear negative feedback.
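This mechanism can be illustrated with a quadratic feedback matched to the linear rule at the mean activity level. The quadratic form is our illustrative assumption, not the paper's specific nonlinearity:

```python
def inhibition_linear(A, w_I):
    """Linear feedback: proportional to the population activity A."""
    return w_I * A

def inhibition_supralinear(A, w_I, A_mean):
    """Quadratic feedback normalized to agree with the linear rule at
    A = A_mean: below-average activity is inhibited less (small
    patterns survive), above-average activity is inhibited more."""
    return w_I * A * A / A_mean
```

With A_mean = 100, an activity of 50 receives only a quarter of the linear inhibition, while an activity of 200 receives twice the linear inhibition; at the mean, both rules coincide.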
This paper extends previous models of sequence memory [7, 9, 11, 12] that were based on Willshaw's learning rule [10] to inhomogeneous pattern sizes, i.e., patterns of variable sparseness. Our work reveals that inhomogeneity in the sparseness of stored patterns is detrimental to a recurrent network's dynamic stability during sequence retrieval. Bigger-than-average patterns tend to drive the network into an all-active, epileptic state as a result of an excessively high synaptic drive, whereas smaller-than-average patterns tend to lead to an all-silent state as a result of an insufficient synaptic drive. In either case, sequence retrieval is terminated prematurely by dynamic instability. As expected, the higher the variability in pattern sizes, the higher the probability of premature sequence termination. Our results thus suggest that a plasticity mechanism ensuring a certain degree of homogeneity in the sparseness of hippocampal representations would be useful for the reliable retrieval of long sequences.
Instantaneous linear feedback inhibition can compensate to a certain degree for bigger-than-average patterns, but it does nothing to prevent the network from falling silent, since it does not compensate for an insufficient synaptic drive. This asymmetry is reflected in the relative impact of differently skewed pattern size distributions. Compared to a symmetric distribution, negative skewness leads to a smaller region of stable replay, whereas positive skewness leads to a larger region. Positively skewed pattern size distributions are thus more resilient to premature sequence termination under linear feedback. The higher vulnerability to smaller-than-average patterns can be corrected for by introducing a nonlinear negative feedback that is close to zero for lower-than-average network activity. Such supralinear inhibition can make the network resilient to higher levels of inhomogeneity than linear feedback inhibition.
Memory networks with variable sparseness have been studied by Amit and Huang [15, 16] under a different learning paradigm in which old memories are gradually overwritten by new memories, and for several more involved synaptic (meta-)plasticity rules. There, inhomogeneity in the pattern sizes was shown to decrease the signal-to-noise ratio during recall as well.
In contrast to palimpsest models [17–22], in which old memories are overwritten, our model assumes that all memories are equally well preserved in the synaptic states of the network, which calls for additional plasticity rules that continuously readjust the synaptic matrix to keep old memories fresh. Such a mechanism would necessarily involve ongoing plasticity, which may then naturally be linked to some form of pattern size homeostasis that keeps the sparseness homogeneous. Such persistent network remodeling fits the experimental finding that, at least for a few weeks after memory acquisition, existing memories can be extinguished by blocking protein synthesis during memory reactivation [23], hinting at the presence of plasticity mechanisms during early retrieval.
In this appendix, we give the mathematical details of how the expectation values required by our dynamical model are derived from the underlying stochasticity of the recurrent synaptic matrix. The effect of learning on the synaptic connections is modeled by binary random variables s_ij that take the value s_ij = 1 if the putative synapse from neuron j to neuron i is in a potentiated state (is able to transmit signals), whereas s_ij = 0 if it is silent and cannot contribute to the postsynaptic depolarization. Since not all neurons are considered to be synaptically connected, the real synaptic weight is a product w_ij = c_ij s_ij, where c_ij = 1 or 0 according to a binomial process with probability c_m (the morphological connectivity) that models the existence of a physical synapse. The two random variables c_ij and s_ij are considered to be independent.
The vector ϕ of pattern sizes defines how many neurons fire in each pattern. According to Willshaw's rule, a given sequence of stored patterns, with sizes specified by ϕ, uniquely defines the matrix of synaptic states: only those synapses are potentiated for which the presynaptic neuron is active in some pattern q and the postsynaptic neuron is active in the subsequent pattern q+1, for at least one value of q. To translate this learning rule into formulas, we introduce the theoretical concept of an activation schedule.
A.1 Activation Schedule
where one factor counts the number of patterns in which the neuron is active, under a reordering of the patterns such that those in which the neuron is active have the lowest indices. If the neuron is active in none of the patterns, the first factor equals 1, as indicated by the Kronecker symbol.
where the product on the right-hand side is the fraction of synapses remaining silent after storing the P sequential activations, corresponding to P learning steps.
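Treating pattern memberships as independent across associations, this silent fraction takes a simple product form; the sketch below is an approximation under that independence assumption, with our own variable names:

```python
import numpy as np

def silent_fraction(sizes, N):
    """Approximate fraction of synapses still silent after storing the
    sequence: association q leaves a given synapse untouched with
    probability 1 - f_q * f_{q+1}, where f_q = M_q / N is the coding
    ratio of the q-th pattern; independence across associations is
    assumed."""
    f = np.asarray(sizes, dtype=float) / N
    return float(np.prod(1.0 - f[:-1] * f[1:]))
```

For homogeneous sizes, the product collapses to the familiar Willshaw form (1 − f²)^P; for example, four patterns of size 10 in a network of N = 100 neurons give (1 − 0.01)³ ≈ 0.97.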
A.2 Mean and Variance of Total Synaptic Input
A.3 Mean and Variance of ς over Pattern Size Distribution
This work was funded by the German Federal Ministry for Education and Research (BMBF) under grant numbers 01GQ0981 (Bernstein Fokus on Neuronal Basis of Learning: Plasticity of Neuronal Dynamics) and 01GQ1004A (Bernstein Center for Computational Neuroscience Munich).
The authors are grateful for comments and discussions to Álvaro Tejero Cantero, Axel Kammerer, and Alexander Mathis.
- Little WA: The existence of persistent states in the brain. Math Biosci 1974, 19:101–120. doi:10.1016/0025-5564(74)90031-5
- Hopfield JJ: Neural networks and physical systems with emergent collective computational abilities. Proc Natl Acad Sci USA 1982, 79(8):2554–2558. doi:10.1073/pnas.79.8.2554
- Wennekers T, Palm G: Modelling generic cognitive functions with operational Hebbian cell assemblies. In Neural Network Research Horizons. Edited by Weiss M. Nova Science Publishers, New York; 2007:225–294.
- Lee AK, Wilson MA: Memory of sequential experience in the hippocampus during slow wave sleep. Neuron 2002, 36(6):1183–1194. doi:10.1016/S0896-6273(02)01096-6
- Diba K, Buzsaki G: Forward and reverse hippocampal place-cell sequences during ripples. Nat Neurosci 2007, 10(10):1241–1242. doi:10.1038/nn1961
- Maier N, Tejero-Cantero Á, Dorrn AL, Winterer J, Beed PS, Morris G, Kempter R, Poulet JF, Leibold C, Schmitz D: Coherent phasic excitation during hippocampal ripples. Neuron 2011, 72:137–152. doi:10.1016/j.neuron.2011.08.016
- Kammerer A, Tejero-Cantero Á, Leibold C: Inhibition enhances memory capacity: optimal feedback, transient replay and oscillations. J Comput Neurosci 2013, 34:125–136. doi:10.1007/s10827-012-0410-z
- Nadal JP: Associative memory: on the (puzzling) sparse coding limit. J Phys A 1991, 24:1093–1101. doi:10.1088/0305-4470/24/5/023
- Gibson WG, Robinson J: Statistical analysis of the dynamics of a sparse associative memory. Neural Netw 1992, 5:645–661. doi:10.1016/S0893-6080(05)80042-5
- Willshaw DJ, Buneman OP, Longuet-Higgins HC: Non-holographic associative memory. Nature 1969, 222(5197):960–962. doi:10.1038/222960a0
- Leibold C, Kempter R: Memory capacity for sequences in a recurrent network with biological constraints. Neural Comput 2006, 18(4):904–941. doi:10.1162/neco.2006.18.4.904
- Hirase H, Recce M: A search for the optimal thresholding sequence in an associative memory. Network 1996, 4:741–756.
- Kapfer C, Glickfeld L, Atallah B, Scanziani M: Supralinear increase of recurrent inhibition during sparse activity in the somatosensory cortex. Nat Neurosci 2007, 10:743–753. doi:10.1038/nn1909
- Silberberg G, Markram H: Disynaptic inhibition between neocortical pyramidal cells mediated by Martinotti cells. Neuron 2007, 53:735–746. doi:10.1016/j.neuron.2007.02.012
- Amit Y, Huang Y: Precise capacity analysis in binary networks with multiple coding level inputs. Neural Comput 2010, 22(3):660–688. doi:10.1162/neco.2009.02-09-967
- Huang Y, Amit Y: Capacity analysis in multi-state synaptic models: a retrieval probability perspective. J Comput Neurosci 2011, 30(3):699–720. doi:10.1007/s10827-010-0287-7
- Amit DJ, Fusi S: Learning in neural networks with material synapses. Neural Comput 1994, 6:957–982.
- Fusi S, Drew PJ, Abbott LF: Cascade models of synaptically stored memories. Neuron 2005, 45(4):599–611. doi:10.1016/j.neuron.2005.02.001
- Leibold C, Kempter R: Sparseness constrains the prolongation of memory lifetime via synaptic metaplasticity. Cereb Cortex 2008, 18:67–77. doi:10.1093/cercor/bhm037
- Barrett AB, van Rossum MC: Optimal learning rules for discrete synapses. PLoS Comput Biol 2008, 4(11):e1000230.
- Päpper M, Kempter R, Leibold C: Synaptic tagging, evaluation of memories, and the distal reward problem. Learn Mem 2011, 18:58–70.
- van Rossum MC, Shippi M, Barrett AB: Soft-bound synaptic plasticity increases storage capacity. PLoS Comput Biol 2012, 8(12):e1002836.
- Milekic MH, Alberini CM: Temporally graded requirement for protein synthesis following memory reactivation. Neuron 2002, 36(3):521–525. doi:10.1016/S0896-6273(02)00976-5
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.