Unsupervised Learning with Spike-Timing Dependent Plasticity

Our brain is a source of great inspiration for the development of Artificial General Intelligence. In fact, one of the common views is that any effort in developing human-level AI is almost destined to fail without an intimate understanding of how the brain works. However, we do not understand our brain that well yet. But that is another story for another day. In today’s blog post we are going to talk about a learning method in machine learning that takes its inspiration from a biological process underpinning how humans learn – Spike Timing Dependent Plasticity (STDP).

Biological neurons communicate with each other through synapses, which are tiny connections between neurons in our brains. A presynaptic neuron is the neuron that fires the electrical impulse (the signal, so to speak), and a postsynaptic neuron is the neuron that receives this impulse. The wiring of the neurons makes our brain an extremely complex piece of machinery: a typical neuron receives thousands of inputs and sends its signals to over 10,000 other neurons. Incoming signals to a neuron alter its voltage (potential). When this voltage reaches a threshold value, the neuron produces a sudden increase in voltage for a short time (about 1 ms). We refer to these short bursts of electrical energy as spikes. Computers communicate with bits, while neurons use spikes.

Anatomy of a neuron (image credit: Wikimedia)

Artificial Neural Networks (ANNs) attempt to capture this mechanism of neuronal communication through mathematical models. However, these computational models may be an inadequate representation of the brain. To understand the trend towards STDP and why we think it is a viable path forward, let’s back up a little bit and talk briefly about the current common methods in ANNs.

Gradient Descent: the dominant paradigm

Artificial Neural Networks are based on a collection of connected nodes mimicking the behaviour of biological neurons. A receiving (or postsynaptic) neuron receives multiple inputs, multiplies each by a weight, sums the results, applies a nonlinear transfer function, and then propagates the resulting signal to other neurons. The weights change as learning happens, and this process of tweaking the weights is the most important thing in an artificial neural network. One popular learning algorithm is Stochastic Gradient Descent (SGD). To calculate the gradient of the loss function with respect to the weights, most state-of-the-art ANNs use a procedure called back-propagation. However, the biological plausibility of back-propagation remains highly debatable. For example, there is no evidence of a global error minimisation mechanism in biological neurons. Therefore, a better learning algorithm, one that raises the biological realism of our models, might help us move towards AGI. And this is where the Spiking Neural Network (SNN) comes in.
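To make the weight-tweaking idea concrete, here is a minimal sketch of a single artificial neuron trained with gradient descent. The sigmoid transfer function, the squared-error loss, the learning rate and all the names are assumptions chosen for illustration; real networks stack many such units and use back-propagation to pass the error through the layers.

```python
import math

def sigmoid(z):
    """Nonlinear transfer function squashing z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def forward(weights, bias, inputs):
    """Weighted sum of the inputs followed by the nonlinearity."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)

def sgd_step(weights, bias, inputs, target, learning_rate=0.5):
    """One gradient-descent update for the loss 0.5 * (y - target)**2."""
    y = forward(weights, bias, inputs)
    # Chain rule: dL/dz = (y - target) * sigmoid'(z), with sigmoid'(z) = y * (1 - y).
    delta = (y - target) * y * (1.0 - y)
    new_weights = [w - learning_rate * delta * x for w, x in zip(weights, inputs)]
    new_bias = bias - learning_rate * delta
    return new_weights, new_bias

# Illustrative values only: train one neuron on a single example.
weights, bias = [0.1, -0.2], 0.0
inputs, target = [1.0, 0.5], 1.0
before = forward(weights, bias, inputs)
for _ in range(100):
    weights, bias = sgd_step(weights, bias, inputs, target)
after = forward(weights, bias, inputs)
```

Note that every update needs the error between the output and a known target, which is exactly the global signal whose biological plausibility is in question.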

The incorporation of timing in an SNN

The main difference between a conventional ANN and an SNN is the neuron model that is used. The neuron model used in a conventional ANN does not employ individual spikes in computations. Instead, the output signals from the neurons are treated as normalised firing rates, i.e. the frequency of spikes within a certain time frame [1]. This is an averaging mechanism and is commonly referred to as rate coding. Consequently, input to the network can be real values instead of a binary time series. In contrast, the neuron model of an SNN uses each individual spike. Instead of rate coding, an SNN uses pulse coding. What is important here is that the timing of the firing is incorporated in the computations, as it is in real neurons. The neurons in an SNN do not fire at every propagation cycle. They fire only when signals from incoming neurons cause charge to accumulate until it reaches a certain threshold voltage.
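The accumulate-then-fire behaviour can be sketched with a leaky integrate-and-fire neuron, one common simplified spiking model. The article does not name a specific model, so the discrete time steps, the leak factor, the threshold and the reset value below are all assumptions for illustration.

```python
def simulate_lif(input_current, threshold=1.0, leak=0.9, v_reset=0.0):
    """Leaky integrate-and-fire neuron over discrete time steps.

    Returns the list of time steps at which the neuron spiked.
    """
    v = v_reset
    spike_times = []
    for t, current in enumerate(input_current):
        v = leak * v + current      # leak some charge, then integrate the input
        if v >= threshold:          # threshold reached: emit a spike
            spike_times.append(t)
            v = v_reset             # reset the membrane potential
    return spike_times

weak = simulate_lif([0.05] * 20)    # input too weak to ever reach threshold
strong = simulate_lif([0.3] * 20)   # input strong enough to spike repeatedly
```

With these illustrative values the weak input never produces a spike, while the strong input makes the neuron fire at regular intervals. The output is a list of spike times rather than a single real-valued activation.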

Basic model of a spiking neuron (Image credit: EPFL)

The use of individual spikes in pulse coding is more biologically accurate in two ways. First, it is a more plausible representation for tasks where speed is an important consideration, for example in the human visual system. Studies have shown that humans analyse and classify visual input (e.g. facial recognition) in under 100 ms. Considering that it takes at least 10 synaptic steps from the retina to the temporal lobe [2], this leaves about 10 ms of processing time for each neuron. This is too little time for an averaging mechanism like rate coding to take place. Hence, an implementation that uses pulse coding might be a more suitable model for object recognition tasks, although this is not how such tasks are usually approached today, given the popularity of conventional ANNs. Second, the use of only local information (i.e. the timing of spikes) in learning is a more biologically realistic representation than a global error minimisation mechanism.
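The timing argument is simple arithmetic, sketched below. The 100 ms and 10 steps come from the text; the 100 Hz firing rate is an assumed, illustrative value and not a figure from the article.

```python
def expected_spikes(firing_rate_hz, window_ms):
    """Average number of spikes a neuron firing at a given rate emits in a window."""
    return firing_rate_hz * window_ms / 1000.0

total_time_ms = 100      # time to classify a visual input (from the text)
synaptic_steps = 10      # retina to temporal lobe (from the text)
window_ms = total_time_ms / synaptic_steps   # about 10 ms per processing stage

# ASSUMPTION: 100 Hz is an illustrative firing rate, not a figure from the article.
spikes_per_stage = expected_spikes(100, window_ms)
```

Under that assumption a neuron contributes roughly one spike per stage, which is not enough to estimate a rate by averaging. The moment at which that one spike arrives, however, is still available as information.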

Learning using Spike-Timing Dependent Plasticity

The changing and shaping of neuron connections in our brain is known as synaptic plasticity. Neurons fire, or spike, to signal the presence of the feature that they are tuned for. As summarised in a popular paraphrase of the theory of the Canadian psychologist Donald Hebb, “Neurons that fire together, wire together.” Simply put, when two neurons fire at almost the same time, the connections between them are strengthened and they become more likely to fire together again in the future. When two neurons fire in an uncoordinated manner, the connections between them weaken and they are more likely to act independently in the future. This is known as Hebbian learning. The strengthening of synapses is known as Long Term Potentiation (LTP) and the weakening of synaptic strength is known as Long Term Depression (LTD). What determines whether a synapse will undergo LTP or LTD is the timing between the pre- and postsynaptic firing. If the presynaptic neuron fires up to about 20 ms before the postsynaptic neuron, LTP occurs; if it fires up to about 20 ms after the postsynaptic neuron, LTD occurs. This is known as Spike-Timing Dependent Plasticity (STDP).
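The timing rule can be written as a small function that classifies a single pair of spikes. This is a sketch only: the hard 20 ms cut-off mirrors the figure in the text, while the function name and the treatment of exactly simultaneous spikes as "none" are assumptions.

```python
def plasticity_type(t_pre_ms, t_post_ms, window_ms=20.0):
    """Classify a single pre/post spike pair as 'LTP', 'LTD' or 'none'."""
    dt = t_post_ms - t_pre_ms        # positive when pre fires before post
    if 0 < dt <= window_ms:
        return "LTP"                 # pre shortly before post: strengthen
    if -window_ms <= dt < 0:
        return "LTD"                 # pre shortly after post: weaken
    return "none"                    # simultaneous or too far apart in this sketch

causal = plasticity_type(10.0, 15.0)     # pre at 10 ms, post at 15 ms
acausal = plasticity_type(15.0, 10.0)    # pre at 15 ms, post at 10 ms
unrelated = plasticity_type(0.0, 50.0)   # spikes 50 ms apart
```

Here `causal` is "LTP", `acausal` is "LTD" and `unrelated` is "none", because only the relative timing of the two spikes decides the outcome.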

This biological mechanism can be adopted as a learning rule in machine learning. A general approach is to compute a weight change Δw for each synapse in the network. The weight change is positive (therefore increasing the strength of the synaptic connection) if the postsynaptic neuron fires just after the presynaptic neuron, and negative if the postsynaptic neuron fires just before the presynaptic neuron. In contrast to supervised learning with back-propagation, STDP is an unsupervised learning method. This is another reason STDP-based learning is believed to reflect human learning more accurately, given that much of the most important learning we do is experiential and unsupervised, i.e. there is no “right answer” available for the brain to learn from.
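One widely used form of this rule makes Δw decay exponentially with the time difference between the two spikes. The sketch below assumes that exponential form; the amplitudes, time constants, weight bounds and the all-pairs summation are illustrative choices, not values taken from any of the studies cited in this post.

```python
import math

def stdp_delta_w(dt_ms, a_plus=0.01, a_minus=0.012, tau_plus=20.0, tau_minus=20.0):
    """Weight change for one spike pair, where dt_ms = t_post - t_pre."""
    if dt_ms > 0:                                    # post fires after pre
        return a_plus * math.exp(-dt_ms / tau_plus)
    if dt_ms < 0:                                    # post fires before pre
        return -a_minus * math.exp(dt_ms / tau_minus)
    return 0.0

def update_weight(w, pre_spikes, post_spikes, w_min=0.0, w_max=1.0):
    """Sum the pairwise changes over all spike pairs and clip the result."""
    dw = sum(stdp_delta_w(t_post - t_pre)
             for t_pre in pre_spikes for t_post in post_spikes)
    return min(w_max, max(w_min, w + dw))

# Spike times in ms; post follows pre in the first case and leads it in the second.
w_causal = update_weight(0.5, pre_spikes=[10.0, 50.0], post_spikes=[15.0, 55.0])
w_acausal = update_weight(0.5, pre_spikes=[15.0, 55.0], post_spikes=[10.0, 50.0])
```

The update uses only the spike times seen at one synapse. No target output or global error signal appears anywhere, which is what makes the rule local and unsupervised.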


STDP represents a potential shift in approach when it comes to developing learning procedures in neural networks. So far it has predominantly been applied to pattern recognition tasks. One 2015 study using an exponential STDP learning rule achieved 95% accuracy on the MNIST dataset [3], a large handwritten digit database that is widely used as a training dataset for computer vision. Merely a year later, researchers had made significant progress. For example, Kheradpisheh et al. achieved 98.5% accuracy on MNIST by combining an SNN with features of deep learning [4]. The network they used comprised several convolutional and pooling layers, and STDP learning rules were used in the convolutional layers to learn the features. Another interesting study took its inspiration from Reinforcement Learning and combined it with a hierarchical SNN to perform pattern recognition [5]. Using a network structure that consists of two simple and two complex layers and a novel reward-modulated STDP (R-STDP), their method outperformed classic unsupervised STDP on several image datasets. STDP has also been applied in real-time learning to take advantage of its speedy nature [6]. The SNN and fast unsupervised STDP learning method developed in that study achieved an impressive 21.3 fps in training and 17.9 fps in testing. To put things in perspective, human eyes are able to detect around 24 fps.

Apart from object recognition, STDP has also been applied to speech recognition tasks. One study uses an STDP-trained, nonrecurrent SNN to convert speech signals into a spike train signature for speech recognition [7]. Another study combines a hidden Markov model with an SNN and STDP learning to classify segments of sequential data such as individual spoken words [8]. STDP has also proven to be a useful learning method in modelling pitch perception (i.e. recognising tones). Researchers developed a computational model using a neural network that learns with STDP rules to identify (and strengthen) the neuronal connections that are most effective for the extraction of pitch [9].

Final thoughts

Having learned what we have about STDP, what can we conclude about the state of the art of machine learning? We think that conventional Artificial Neural Networks are probably here to stay. They are simplistic models of neurons, but they do work. However, the extent to which supervised ANNs would be suitable for the development of AGI is debatable. On the other hand, while the Spiking Neural Network is a more authentic model of how the human brain works, its performance thus far still lags behind that of ANNs on some tasks, not least because a lot more research has been done on supervised ANNs than on SNNs.

Despite its intuitive appeal and biological validity, there are also many neuroscientific experiments in which STDP has not matched observations [10]. One major quandary is the observation of LTD in certain hippocampal neurons (CA3 and CA1 regions, to be precise) when low-frequency (1 Hz) presynaptic stimulation drives postsynaptic firing [11]. Conventional STDP wisdom says LTP should happen in this case, since the presynaptic spike precedes the postsynaptic one. The frequency dependence of plasticity does not stop here. At high enough frequencies (i.e. firing rates), the STDP learning rule becomes LTP-only. That is, both pre-before-post and post-before-pre spike orderings produce LTP [12]. Several additional mechanisms also appear to influence STDP learning. For example, LTD can be converted to LTP by altering the firing pattern of the postsynaptic spikes: firing ‘bursts’ or even a pair of spikes in the postsynaptic neuron leads to LTP where single spikes would have led to LTD [13] [14]. Plasticity also appears to accumulate as a nonlinear function of the number of pre- and postsynaptic pairings, with depression accumulating at a lower rate than potentiation, i.e. requiring more pairings [13]. Finally, it seems that neural activity that does not cause any measurable plasticity may have a ‘priming’ effect on subsequent activity. In the CA1 region, for example, LTP could be induced with as few as four stimuli, provided that a single priming stimulus was given 170 ms earlier [15].

The SNN’s inferior performance compared with other ANNs might be due to its poor scalability. Large-scale SNNs are relatively rare because the computational demands of such networks are not yet fully supported by most high-performance computing systems (there are, however, exceptions). Most implementations today use only one or two trainable layers of unsupervised learning, which limits their generalisation capabilities [16]. Moreover, and perhaps most importantly, STDP is vulnerable to the common shortcoming of unsupervised learning algorithms: it works well in sifting out statistically significant features but has problems identifying rare but diagnostic features, which are crucial in important processes such as decision making. My sense is that if STDP is to become the key to unlocking the secrets of AGI, there needs to be more creativity in its implementation, taking advantage of its biological roots and nuances while striving for a general-purpose learning algorithm.

What do you think? Comment and let us know your thoughts!


[1] Vreeken, J. (2003). Spiking neural networks, an introduction.

[2] Thorpe, S., Delorme, A., & Van Rullen, R. (2001). Spike-based strategies for rapid processing. Neural networks, 14(6), 715-725.

[3] Diehl, P. U., & Cook, M. (2015). Unsupervised learning of digit recognition using spike-timing-dependent plasticity. Frontiers in computational neuroscience, 9.

[4] Kheradpisheh, S. R., Ganjtabesh, M., Thorpe, S. J., & Masquelier, T. (2016). STDP-based spiking deep neural networks for object recognition. arXiv preprint arXiv:1611.01421.

[5] Mozafari, M., Kheradpisheh, S. R., Masquelier, T., Nowzari-Dalini, A., & Ganjtabesh, M. (2017). First-spike based visual categorization using reward-modulated STDP. arXiv preprint arXiv:1705.09132.

[6] Liu, D., & Yue, S. (2017). Fast unsupervised learning for visual pattern recognition using spike timing dependent plasticity. Neurocomputing, 249, 212-224.

[7] Tavanaei, A., & Maida, A. S. (2017). A spiking network that learns to extract spike signatures from speech signals. Neurocomputing, 240, 191-199.

[8] Tavanaei, A., & Maida, A. S. (2016). Training a hidden Markov model with a Bayesian spiking neural network. Journal of Signal Processing Systems, 1-10.

[9] Saeedi, N. E., Blamey, P. J., Burkitt, A. N., & Grayden, D. B. (2016). Learning Pitch with STDP: A Computational Model of Place and Temporal Pitch Perception Using Spiking Neural Networks. PLoS computational biology, 12(4), e1004860.

[10] Shouval, H. Z., Wang, S. S. H., & Wittenberg, G. M. (2010). Spike timing dependent plasticity: a consequence of more fundamental learning rules. Frontiers in Computational Neuroscience, 4.

[11] Wittenberg, G. M., & Wang, S. S.-H. (2006). Malleability of spike-timing-dependent plasticity at the CA3-CA1 synapse. Journal of Neuroscience, 26, 6610-6617.

[12] Sjöström, P. J., Turrigiano, G. G., & Nelson, S. B. (2001). Rate, timing, and cooperativity jointly determine cortical synaptic plasticity. Neuron, 32(6), 1149-1164.

[13] Wittenberg, G. M., & Wang, S. S.-H. (2006). Malleability of spike-timing-dependent plasticity at the CA3-CA1 synapse. Journal of Neuroscience, 26, 6610-6617.

[14] Pike, F. G., Meredith, R. M., Olding, A. W., & Paulsen, O. (1999). Postsynaptic bursting is essential for ‘Hebbian’ induction of associative long-term potentiation at excitatory synapses in rat hippocampus. The Journal of Physiology, 518(2), 571-576.

[15] Rose, G. M., & Dunwiddie, T. V. (1986). Induction of hippocampal long-term potentiation using physiologically patterned stimulation. Neuroscience Letters, 69, 244-248.

[16] Almási, A. D., Woźniak, S., Cristea, V., Leblebici, Y., & Engbersen, T. (2016). Review of advances in neural networks: Neural design technology stack. Neurocomputing, 174, 31-41.

Also published on Medium.

Yi-Ling Hwong


Comment ( 1 )

  1. Philipp
    Great post! "For example, LTD can be converted to LTP by altering the firing pattern of the postsynaptic spikes: firing ‘bursts’ or even a pair of spikes in the postsynaptic neuron lead to LTP where single spikes would have led to LTD." This makes some sense to me: If the firing is predicted you'll prune the synapses that contribute too late, but if the firing isn't predicted you'll try to get that predictiveness even if it arrives slightly later. Some thoughts:
    - If you haven't seen Hinton's "Can the brain do back-propagation"-talk on youtube, you should.
    - Beating supervised algorithms with unsupervised ones or global ones with local ones might not be very realistic. After all you are using much less information.
    - Creating a powerful, biologically realistic learning algorithm might have to involve stuff like hippocampal replay, which incidentally is both connected to reward (a form of supervision) and might implement some form of global learning (going in reverse order from high level interpretations to the low level input).
    - I agree that having more layers is key.