Reading list – August 2017
1. Neuroscience-inspired Artificial Intelligence
Authors: Demis Hassabis, Dharshan Kumaran, Christopher Summerfield, and Matthew Botvinick
Type: Review article in Neuron
Publication date: 19 July 2017
This paper outlined the contribution of neuroscience to the most recent advances in AI and argued that the study of neural computation in humans and other animals could provide useful (albeit subtle) inspiration to AI researchers, stimulating questions about specific aspects of learning and intelligence that could guide algorithm design.
- Four specific examples of neuroscientific inspiration currently used in AI were mentioned: attention mechanisms, episodic memory, working memory and continual learning
- Four areas where neuroscience could be relevant for future AI research were also mentioned: intuitive understanding of the physical world, efficient (or rapid) learning, transfer learning, imagination and planning
2. Deep Predictive Coding Networks for Video Prediction and Unsupervised Learning
Authors: William Lotter, Gabriel Kreiman, David Cox
Type: arXiv preprint (with an accompanying codebase)
Publication date: 25 May 2016
This paper introduced ‘PredNet’, a predictive neural network architecture that is able to predict future frames in a video sequence using a deep, recurrent convolutional network with both bottom-up and top-down connections.
- The study demonstrated the potential of video for unsupervised learning: prediction of future frames can serve as a powerful learning signal, since an agent must hold an implicit model of the objects that constitute the environment and how they are allowed to move.
- When trained on car-mounted camera videos, the network learned to predict both the movement of the camera and the movement of the objects in the camera’s view.
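The core idea, that the error between a predicted and an observed next frame can itself drive learning, can be sketched in a toy example. This is not the PredNet architecture (which stacks recurrent convolutional layers with top-down connections); here the "model" is just a single scalar velocity estimate, and `make_frames` and `train_velocity` are hypothetical helpers for illustration:

```python
# Toy sketch: next-frame prediction error as an unsupervised learning signal.
# Frames are flat lists of pixel intensities; the "model" is one scalar
# velocity estimate, updated from the prediction error alone (no labels).

def make_frames(n_frames, n_pixels, velocity):
    """Synthetic sequence: every pixel drifts by `velocity` per frame."""
    return [[p + t * velocity for p in range(n_pixels)]
            for t in range(n_frames)]

def train_velocity(frames, lr=0.1, epochs=50):
    """Fit a velocity estimate by minimising next-frame prediction error."""
    v = 0.0  # initial guess
    for _ in range(epochs):
        for prev, nxt in zip(frames, frames[1:]):
            pred = [p + v for p in prev]              # predicted next frame
            err = [n - q for n, q in zip(nxt, pred)]  # prediction error
            grad = sum(err) / len(err)                # mean error drives update
            v += lr * grad                            # shrink the error
    return v

frames = make_frames(10, 5, velocity=2.0)
v = train_velocity(frames)  # converges close to the true velocity, 2.0
```

No ground-truth labels appear anywhere: the future frame itself is the supervision, which is the sense in which video prediction is "unsupervised".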
3. Distral: Robust Multitask Reinforcement Learning
Authors: Yee Whye Teh, Victor Bapst, Wojciech Marian Czarnecki, John Quan, James Kirkpatrick, Raia Hadsell, Nicolas Heess, Razvan Pascanu
Type: arXiv preprint
Publication date: 13 July 2017
This paper proposed a method to overcome a common problem in Deep Reinforcement Learning, whereby training on multiple related tasks negatively affects performance on the individual tasks, even though intuition suggests that solutions to related tasks should improve learning, since the tasks share common structure.
- The authors developed Distral (Distill & transfer learning), based on the idea of a shared ‘policy’ that distills common behaviours or representations from task-specific policies.
- Knowledge obtained in an individual task is distilled into the shared policy and then transferred to other tasks.
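The distillation step can be illustrated with a small sketch. Under the simplifying assumption that we distil by minimising the sum of KL divergences KL(π_i ‖ π_0) from each task policy to the shared policy over a single discrete action set, the optimal shared policy is simply the average of the task policies; the full Distral algorithm instead alternates this distillation with KL-regularised task training. The names below (`distill`, `kl`) are illustrative, not from the paper:

```python
import math

def distill(policies):
    """Shared policy minimising sum_i KL(pi_i || pi_0): the per-action average."""
    n_actions = len(policies[0])
    return [sum(p[a] for p in policies) / len(policies)
            for a in range(n_actions)]

def kl(p, q):
    """KL divergence between two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Two task-specific policies over three actions.
pi_task1 = [0.7, 0.2, 0.1]
pi_task2 = [0.6, 0.3, 0.1]
pi_shared = distill([pi_task1, pi_task2])  # ≈ [0.65, 0.25, 0.1]

# Each task policy stays close (in KL) to the distilled common behaviour,
# which is what the regulariser in the task objectives then exploits.
closeness = kl(pi_task1, pi_shared) + kl(pi_task2, pi_shared)
```

The shared policy acts as the conduit: behaviour distilled from one task regularises the others toward it, transferring what the tasks have in common.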
4. How Prior Probability Influences Decision Making: A Unifying Probabilistic Model
Authors: Yanping Huang, Timothy Hanks, Mike Shadlen, Abram L. Friesen, Rajesh P. Rao
Type: Conference proceeding published at the Neural Information Processing Systems Conference
Publication year: 2012
This paper tackled the problem of how the brain combines sensory input and prior knowledge when making decisions in the natural world.
- The authors derived a model based on the framework of partially observable Markov decision processes (POMDPs) and computed the optimal behaviour for sequential decision-making tasks.
- Their results suggest that decision making in our brain may be controlled by the dual principles of Bayesian inference and reward maximisation.
- The proposed model offered a unifying explanation for experimental data previously accounted for by two competing models of how prior knowledge is incorporated: the additive offset model, which assumes a static influence of the prior, and the dynamic weighting model, which assumes a time-varying effect.
5. First-spike based visual categorization using reward-modulated STDP
Authors: Milad Mozafari, Saeed Reza Kheradpisheh, Timothée Masquelier, Abbas Nowzari-Dalini, Mohammad Ganjtabesh
Type: arXiv preprint
Publication date: 25 May 2017
This paper proposed a hierarchical Spiking Neural Network (SNN) equipped with a novel Reward-modulated STDP (R-STDP) learning algorithm to solve object recognition tasks without using an external classifier.
- The learning algorithm combined the principles of Reinforcement Learning and STDP.
- The network is structured as a feedforward convolutional SNN with four layers, though training took place in only one of them.
- R-STDP outperformed standard STDP on several datasets.
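The combination of the two principles can be sketched as a single weight update in which an STDP-style eligibility (sign set by spike timing) is gated by a reward signal. This is a toy illustration, not the paper's exact rule, and `r_stdp_update` and its parameters are hypothetical:

```python
def r_stdp_update(w, pre_spike_t, post_spike_t, reward,
                  lr=0.05, a_plus=1.0, a_minus=1.0):
    """One toy reward-modulated STDP step.

    Pre-before-post spiking yields a positive eligibility (potentiation),
    post-before-pre a negative one (depression); the sign of the reward
    then decides whether that change is applied or reversed.
    """
    dt = post_spike_t - pre_spike_t
    eligibility = a_plus if dt > 0 else -a_minus
    return w + lr * reward * eligibility

w = 0.5
# Causal pre->post pair during a correct decision: weight reinforced.
w = r_stdp_update(w, pre_spike_t=10, post_spike_t=12, reward=+1)  # 0.55
# Same pair during a wrong decision: the change is punished instead.
w = r_stdp_update(w, pre_spike_t=10, post_spike_t=12, reward=-1)  # back to 0.5
```

Because reward enters the rule directly, the decision (here, which neuron fires first) and the learning signal live in the same network, which is what lets the model classify without an external classifier.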
6. A Distributional Perspective on Reinforcement Learning
Authors: Marc G. Bellemare, Will Dabney, Rémi Munos
Type: arXiv preprint
Publication date: 21 July 2017
This paper sought to provide a more complete picture of reinforcement learning (RL) by incorporating the concept of value distribution, understood as the distribution of the random return received by a learning agent.
- The main object of the study is the random return Z, characterised by the interaction of three random variables: the reward R, the next state–action pair, and its random return.
- The authors designed a new algorithm using this distributional perspective to learn approximate value distributions and obtained state-of-the-art results, at the same time demonstrating the importance of the value distribution in approximate RL.
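The distributional Bellman relation, Z(s, a) =D= R + γ Z(S′, A′), can be sketched on a tiny discrete distribution. This omits the projection onto a fixed support that the paper's C51-style algorithm needs for function approximation; `distributional_backup` and the atom representation are illustrative assumptions:

```python
def distributional_backup(reward, gamma, next_returns):
    """Distributional Bellman backup: Z(s, a) =D= R + gamma * Z(S', A').

    next_returns: list of (value, probability) atoms for Z(S', A').
    Returns the atoms of the backed-up distribution (no projection step).
    """
    return [(reward + gamma * v, p) for v, p in next_returns]

# Next-state return is 10 or 0 with equal probability.
z_next = [(10.0, 0.5), (0.0, 0.5)]
z = distributional_backup(reward=1.0, gamma=0.9, next_returns=z_next)
# z == [(10.0, 0.5), (1.0, 0.5)]

# Taking the expectation recovers the ordinary scalar value: 5.5.
mean_return = sum(v * p for v, p in z)
```

Classical RL keeps only `mean_return`; the distributional view keeps the whole set of atoms, preserving the multimodality (here, a good and a bad outcome) that averaging throws away.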