Category Archives

5 Articles

Reading List

Reading list – August 2017

Posted by Yi-Ling Hwong on

1. Neuroscience-inspired Artificial Intelligence

Authors: Demis Hassabis, Dharshan Kumaran, Christopher Summerfield, and Matthew Botvinick
Type: Review article in Neuron
Publication date: 19 July 2017

This paper outlined the contribution of neuroscience to the most recent advances in AI and argued that the study of neural computation in humans and other animals could provide useful (albeit subtle) inspiration to AI researchers, stimulating questions about specific aspects of learning and intelligence that could guide algorithm design.

  • Four specific examples of neuroscientific inspirations that are currently used in AI were mentioned: attentional mechanism, episodic memory, working memory and continual learning
  • Four areas where neuroscience could be relevant for future AI research were also mentioned: intuitive understanding of the physical world, efficient (or rapid) learning, transfer learning, imagination and planning

2. Deep Predictive Coding Networks for Video Prediction and Unsupervised Learning

Authors: William Lotter, Gabriel Kreiman, David Cox
Type: arXiv preprint (accompanying codebase available here)
Publication date: 25 May 2016

The PredNet architecture (image credit: PredNet)

This paper introduced ‘PredNet’, a predictive neural network architecture that is able to predict future frames in a video sequence using a deep, recurrent convolutional network with both bottom-up and top-down connections.

  • The study demonstrated the potential for video to be used in unsupervised learning, where prediction of future frames can serve as a powerful learning signal, given that an agent must have an implicit model of the objects that constitute the environment and how they are allowed to move.
  • By training using car-mounted camera videos, results showed that the network was able to learn to predict both the movement of the camera and the movement of the objects in the camera’s view.

3. Distral: Robust Multitask Reinforcement Learning

Authors: Yee Whye Teh, Victor Bapst, Wojciech Marian Czarnecki, John Quan, James Kirkpatrick, Raia Hadsell, Nicolas Heess, Razvan Pascanu
Type: arXiv preprint
Publication date: 13 July 2017

This paper proposed a method to overcome a common problem in Deep Reinforcement Learning, whereby training on multiple related tasks negatively affect performance on the individual tasks, when intuition tells us solutions to related tasks should improve learning since the tasks share common structures.

  • The authors developed Distral (Distill & transfer learning), based on the idea of a shared ‘policy’ that distills common behaviours or representations from task-specific policies.
  • Knowledge obtained in an individual task is distilled into the shared policy and then transferred to other tasks.

4. How Prior Probability Influences Decision Making: A Unifying Probabilistic Model

Authors: Yanping Huang, Timothy Hanks, Mike Shadlen, Abram L. Friesen, Rajesh P. Rao
Type: Conference proceeding published at the Neural Information Processing Systems Conference
Publication year: 2012

This paper tackled the problem of how the brain combines sensory input and prior knowledge when making decision in the natural world.

  • The authors derived a model based on the framework of a partially observable Markov decision processes (POMDPs) and computed the optimal behaviour for sequential decision making tasks.
  • Their results suggest that decision making in our brain may be controlled by the dual principles of Bayesian inference and reward maximisation.
  • The proposed model offered a unifying explanation for experimental data previously accounted for by two competing models for incorporating prior knowledge, the additive offset model that assumes static influence of the prior, and the dynamic weighing model that assumes a time-varying effect.

5. First-spike based visual categorization using reward-modulated STDP

Authors: Milad Mozafari, Saeed Reza Kheradpisheh, Timothée Masquelier, Abbas Nowzari-Dalini, Mohammad Ganjtabesh
Type: arXiv preprint
Publication date: 25 May 2017

This paper proposed a hierarchical Spiking Neural Network (SNN) equipped with a novel Reward-modulated STDP (R-STDP) learning algorithm to solve object recognition tasks without using an external classifier.

  • The learning algorithm combined the principles of Reinforcement Learning and STDP
  • The network is structured as a feedforward convolutional SNN with four layers, however training took place in only one layer.
  • Results from R-STDP outperformed STDP on several datasets

6. A Distributional Perspective on Reinforcement Learning

Authors: Marc G. Bellemare, Will Dabney, Rémi Munos
Type: arXiv preprint
Publication date: 21 July 2017

This paper sought to provide a more complete picture of reinforcement learning (RL) by incorporating the concept of value distribution, understood as the distribution of the random return received by a learning agent.

  • The main object of the study is a random return Z that is characterised by the interaction of three random variables: the reward R, the next state-action, and its random return.
  • The authors designed a new algorithm using this distributional perspective to learn approximate value distribution and obtained state of the art results, at the same time demonstrating the importance of the value distribution in approximate RL.
AGI/Artificial General Intelligence/deep belief networks/Reading List/Reinforcement Learning/sparse coding/Sparse Distributed Representations/stationary problem/unsupervised learning

Reading list – May 2016

Posted by ProjectAGI on
Reading list – May 2016
Digit classification error over time in our experiments. The image isn’t very helpful but it’s a hint as to why we’re excited 🙂

Project AGI

A few weeks ago we paused the “How to build a General Intelligence” series (part 1, part 2, part 3, part 4). We paused it because the next article in the series requires us to specify everything in detail, and we need working code to do that.

We have been testing our algorithm on a variety of MNIST-derived handwritten digit datasets, to better understand how well it generalizes its representation of digit-images and how it behaves when exposed to varying degrees of predictability. Initial results look promising: We will post everything here once we’ve verified them and completed the first batch of proper experiments. The series will continue soon!

Deep Unsupervised Learning

Our algorithm is a type of Online Deep Unsupervised Learning, so naturally we’re looking carefully at similar algorithms.

We recommend this video of a talk by Andrew Ng. It starts with a good introduction to the methods and importance of feature representation and touches on types of automatic feature discovery. He looks at some of the important feature detectors in computer vision, such as SIFT and HoG and shows how feature detectors – such as edge detectors – can emerge from more general pattern recognition algorithms such as sparse coding. For more on sparse coding see Shakir’s excellent machine learning blog.

For anyone struggling to intuit deep feature discovery, I also loved this post on yCombinator which nicely illustrates how and why deep networks discover useful features, and why the depth helps.

The latter part of the video covers Ng’s latest work on deep hierarchical sparse coding using Deep Belief Networks, in turn based on AutoEncoders. He reports benchmark-beating results on video activity and phoneme recognition with this framework. You can find details of his deep unsupervised algorithm here:

Finally he presents a plot suggesting that training dataset size is a more important determiner of eventual supervised network performance than algorithm choice! This is a fundamental limitation of supervised learning where the necessary training data is much more limited than in unsupervised learning (in the latter case, the real world provides a handy training set!)

Effect of algorithm and training set size on accuracy. Training set size more significant. This is a fundamental limitation of supervised learning.

Online K-sparse autoencoders (with some deep-ness)

We’ve also been reading this paper by Makhzani and Frey about deep online learning with auto-encoders (a type of supervised learning neural network that is used in an unsupervised way to reconstruct its input, often known as semi-supervised learning). Actually we’ve struggled to find any comparison of autoencoders to earlier methods of unsupervised learning both in terms of computational efficiency and ability to cover the search space effectively. Let us know if you find a paper that covers this.

The Makhzani paper has some interesting characteristics – the algorithm is online, which means it receives data as a stream rather than in batches. It is also sparse, which we believe is desirable from a representational perspective.
One limitation is that the solution is most likely unable to handle changes in input data statistics (i.e. non-stationary problems). The reason this is an important quality is that in any arbitrarily deep network the typical position of a vertex is between higher and lower vertices. If all vertices are continually learning, the problem being modelled by any single vertex is constantly changing. Therefore, intermediate vertices must be capable of online learning of non stationary problems otherwise the network will not be able to function effectively. In Makhzani and Frey, they instead use the greedy layerwise training approach from Deep Belief Networks. The authors describe this approach:
“4.6. Deep Supervised Learning Results The k-sparse autoencoder can be used as a building block of a deep neural network, using greedy layerwise pre-training (Bengio et al., 2007). We first train a shallow k-sparse autoencoder and obtain the hidden codes. We then fix the features and train another ksparse autoencoder on top of them to obtain another set of hidden codes. Then we use the parameters of these autoencoders to initialize a discriminative neural network with two hidden layers.”

The limitation introduced can be thought of as an inability to escape from local minima that result from prior training. This paper by Choromanska et al tries to explain why this happens.
Greedy layerwise training is an attempt to work around the fact that deep belief networks of Autoencoders cannot effectively handle nonstationary problems.

For more information here’s some papers on deep sparse networks built from autoencoders:

Variations on Supervised Learning – a Taxonomy

Back to supervised learning, and the limitation of training dataset size. Thanks to a discussion with Jay Chakravarty we have this brief taxonomy of supervised learning workarounds for insufficient training datasets:

Weakly supervised learning: [For poorly labelled training data] where you want to learn models for object recognition under weak supervision – you have say object labels for images, but no localization (e.g. bounding box) for the object in the image (there might be other objects in the image as well). You would use a Latent SVM to solve the problem of localizing the objects in the images, and at the same time learning a classifier for it.
Another example of weakly supervised learning is that you have a bag of positive samples mixed up with negative training samples, but also have a bag of purely negative samples – you would use Multiple Instance Learning for this.

Cross-modal adaptation: where one mode of data supervises another – e.g. audio supervises video or vice-versa.

Domain adaptation: model learnt on one set of data is adapted, in unsupervised fashion, to new datasets with slightly different data distributions.

Transfer learning: using the knowledge gained in learning one problem on a different, but related problem. Here’s a good example of transfer learning, a finalist in the NVIDIA 2016 Global Impact Award. The system learns to predict poverty from day and night satellite images, with very few labelled samples.

Full paper:

Interactive Brain Concept Map

We enjoyed this interactive map of the distribution of concepts within the cortex captured using fMRI and produced by the Gallant Lab (source papers here).

Using the map you can find the voxels corresponding to various concepts, which although maybe not generalizable due to the small sample size (7) gives you a good idea of the hierarchical structure the brain has produced, and what the intermediate concepts represent.

Thanks to David Ray @ for the link.

Interactive brain concept map

OpenAI Gym – Reinforcement Learning platform

We also follow the OpenAI project with interest. OpenAI have just released their “Gym” – a platform for training and testing reinforcement learning algorithms. Have a play with it here:

According to Wired magazine, OpenAI will continue to release free and open source software (FOSS) for the wider impact this will have on uptake. There are many companies now competing to win market share in this space.

The Talking Machines Blog

We’re regular readers of this blog and have been meaning to mention it for months. Worth reading.

How the brain generates actions

A big gap in our knowledge is how the brain generates actions from its internal representation. This new paper by Vicente et al challenges the established (rather vague) dogma on how the brain generates actions.
“We found that contrary to common belief, the indirect pathway does not always prevent actions from being performed, it can actually reinforce the performance of actions. However, the indirect pathway promotes a different type of actions, habits.”

This is probably quite informative for reverse-engineering purposes. Full paper here.

Hierarchical Temporal Memory

HTM is an online method for feature discovery and representation and now we have a baseline result for HTM on the famous MNIST numerical digit classification problem. Since HTM works with time-series data, the paper compares HTM to LSTM (Long-Short-Term Memory), the leading supervised-learning approach to this problem domain.

It is also interesting that the paper deals with adaptation to sudden changes in the input data statistics, the very problem that frustrates the deep belief networks described above.

Full paper by Cui et al here.

For a detailed mathematical description of HTM see this paper by Mnatzaganian and Kudithipudi.

AGI/AlphaGo/Artificial General Intelligence/deep convolutional networks/HQSOM/machine learning/Reading List/unsupervised learning

Reading list: Assorted AGI links. March 2016

Posted by ProjectAGI on
Reading list: Assorted AGI links. March 2016
A Minecraft API is now available to train your AGIs

Our News

We are working hard on experiments, and software to run experiments. So this week there is no normal blog post. Instead, here’s an eclectic mix of links we’ve noticed recently.

First, AlphaGo continues to make headlines. Of interest to Project AGI is Yann LeCun agreeing with us that unsupervised hierarchical modelling is an essential step in building intelligence with humanlike qualities [1]. We also note this IEEE Spectrum post by Jean-Christophe Baillie [2] which argues, as we did [3], that we need to start creating embodied agents.


Speaking of which, the BBC reports that the Minecraft team are preparing an API for machine learning researchers to test their algorithms in the famous game [4]. The Minecraft team also stress the value of embodied agents and the depth of gameplay and graphics. It sounds like Minecraft could be a crucial testbed for an AGI. We’re always on the lookout for test problems like these.

Of course, to play Minecraft well you need to balance local activities – building, mining etc. – with exploration. Another frontier, beyond AlphaGo, is exploration. Monte-Carlo Tree Search (as used in AlphaGo) explores in more limited ways than humans do, argues John Langford [5].

Sharing places with robots 

If robots are going to be embodied, we need to make some changes. Wired magazine says that a few small changes to the urban environment and driver behaviour will make the rollout of autonomous vehicles easier [6]. It’s important to meet the machines halfway, for the benefit of all.

This excellent paper on robotic grasping also caught our attention [7]. A key challenge in this area is adaptability to slightly varying circumstances, such as variations in the objects being grasped and their pose relative the the arm. General solutions to these problems will suddenly make robots far more flexible and applicable to a greater range of tasks.

Hierarchical Quilted Self-Organizing Maps & Distributed Representations

Last week I also rediscovered this older paper on Hierarchical-Quilted Self-Organizing Maps (HQSOMs) [8].This is close to our hearts because we originally believed this type of representation was the right approach for AGI. With the success of Deep Convolutional Networks (DCNs) it’s worth looking back and noticing the similarities between the two. While HQSOM is purely unsupervised learning, (a plus, see comment from Yann LeCun above) DCNs are trained by supervised techniques. However, both methods use small, overlapping, independent units – analogous to biological cortical columns – to classify different patches of the input. The overlapping and independent classifiers lead to robust and distributed representations, which is probably the reason these methods work so well.

Distributed representation is one of the key features of Hawkins’ Hierarchical Temporal Memory (HTM). Fergal Byrne has recently published an updated description of the HTM algorithm [9] for those interested.

We at Project AGI believe that a grid-like “region” of columns employing a “Winner-Take-All” policy [10], with overlapping input receptive fields, can produce a distributed representation. Different regions are then connected together into a tree-like structure (acyclic). The result is a hierarchy. Not only does this resemble the state-of-the-art methods of DCNs, but there’s a lot of biological evidence for this type of representation too. This paper by Rinkus [11] describes columnar features arranged into a hierarchy, with winner-take-all behaviour implemented via local inhibition.

Rinkus says: “Saying only that a group of L2/3 units forms a WTA CM places no a priori constraints on what their tuning functions or receptive fields should look like. This is what gives that functionality a chance of being truly generic, i.e., of applying across all areas and species, regardless of the observed tuning profiles of closely neighboring units.”

Reinforcement Learning 

But unsupervised learning can’t be the only form of learning. We also need to consider consequences, and so we need reinforcement learning to take account of these. As Yann said, the “cherry on the cake” (this is probably understating the difficulty of the RL component, but right now it seems easier than creating representations).

Shakir’s Machine Learning blog has a great post exploring the biology of reinforcement learning [12] within the brain. This is a good overview of the topic and useful for ML researchers wanting to access this area.

But regular readers of this blog will remember that we’re obsessed with unfolding or inverting abstract plans into concrete actions. We found a great paper by Manita et al [13] that shows biological evidence for the translation and propagation of an abstract concept into sensory and motor areas, where it can assist with perception. This is the hierarchy in action.

Long-Short-Term Memory (LSTM)

One more tack before we finish. Thanks to Jay for this link to NVIDIA’s description of LSTMs [14], an architecture for recurrent neural networks (i.e. the state can depend on the previous state of the cells). It’s a good introduction, but we’re still fans of Monner’s Generalized LSTM [15].

Fun thoughts

Now let’s end with something fun. Wired magazine again, describing watching AlphaGo as our first taste of a superhuman intelligence [16]. Although this is a “narrow” intelligence, not a general one, it has qualities beyond anything we’ve experienced in this domain before. What’s more, watching these machines can make us humans better, without any nasty bio-engineering:

“But as hard as it was for Fan Hui to lose back in October and have the loss reported across the globe—and as hard as it has been to watch Lee Sedol’s struggles—his primary emotion isn’t sadness. As he played match after match with AlphaGo over the past five months, he watched the machine improve. But he also watched himself improve. The experience has, quite literally, changed the way he views the game. When he first played the Google machine, he was ranked 633rd in the world. Now, he is up into the 300s. In the months since October, AlphaGo has taught him, a human, to be a better player. He sees things he didn’t see before. And that makes him happy. “So beautiful,” he says. “So beautiful.”


















Deep Learning/Friston/Generalized LSTM/Graves/ICML/Long Short Term Memory/LSTM/Monner/Predictive Coding/Reading List

Reading list – July 2015

Posted by ProjectAGI on
This month’s reading list continues with a subtheme on recurrent neural networks, and in particular Long Short Term Memory (LSTM).

First here’s an interesting report on a panel discussion about the future of Deep Learning at the International Conference on Machine Learning (ICML), 2015:

Participants included Yoshua Bengio (University of Montreal), Neil Lawrence (University of Sheffield), Juergen Schmidhuber (IDSIA), Demis Hassabis (Google DeepMind), Yann LeCun (Facebook, NYU) and Kevin Murphy (Google).

It was great to hear the panel express an interest in some of our favourite topics, notably hierarchical representation, planning and action selection (reported as sequential decision making) and unsupervised learning. From the Deep Learning community this is a new focus – most DL is based on supervised learning.
In the Q&A session, it was suggested that reinforcement learning be used to motivate the exploration of search-spaces to train unsupervised algorithms. In robotics, robustly trading off the potential reward of exploration vs using existing knowledge has been a hot topic for several years (example).
The theory of Predictive Coding suggests that the brain strives to eliminate unpredictability. This presents difficulties for motivating exploration – critics have asked why we don’t seek out quiet, dark solitude! Friston suggests that prior expectations balance the need for immediate predictability with improved understanding in the longer term. For a good discussion, see here.
Our in-depth reading this month has continued on the theme of LSTM. The most thorough introduction we have found is Alex Graves’ “Supervised Sequence Labelling with Recurrent Neural Networks”:
However, a critical limitation of LSTM as presented in Graves’ work is that online training is not possible – so you can’t use this variant of LSTM in an embodied agent.
The best and online variant of LSTM seems to be Derek Monner’s Generalized LSTM algorithm, introduced in D. Monner and J. A. Reggia (2012). “A generalized LSTM-like training algorithm for second-order recurrent neural networks”. You download the paper from Monner’s website here:
We’ll be back with some actual code soon, including our implementation of Generalized LSTM. And don’t worry, we’ll be back to unsupervised learning soon with a focus on Growing Neural Gas.
Deep Learning/LSTM/Reading List/Recurrent Neural Networks

Reading List – May 2015

Posted by ProjectAGI on
John Lisman, “The Challenge of Understanding the Brain: Where We Stand in 2015“, Neuron, 2015

For many in ML and AI,  biological knowledge is focussed on cortex. This paper gives an excellent broad overview of current biological understanding of intelligence.

Sebastian Billaudelle and Subutai Ahmad, “Porting HTM Models to the Heidelberg Neuromorphic Computing Platform“,, 2015

Progress in Numenta’s HTM and particularly interesting to see collaboration with the Human Brain Project. Does this signify growing collaboration with computational biologists?

Yann LeCun, Yoshua Bengio and Geoffrey Hinton, “Deep learning“, Nature, 2015

Review of Deep learning by the guys that are recognised as THE fathers of the field – they have been working on NN for years without due acclaim, until recently.

LSTM Tutorial, Department of Computer Science, University of Kaiserslautem, 2015

Nitish Srivastava, Elman Mansimov and Ruslan Salakhutdinov, Unsupervised Learning of Video Representations using LSTMs“,, 2015

Andrej Karpathy, “The Unreasonable Effectiveness of Recurrent Neural Networks“, Andrej Karpathy blog, 2015

We have a renewed interest in RNN’s and LSTM’s and their great potential. Here are a few papers and tutorials that are well worthwhile for learning about this area.