AITopics | Reinforcement Learning

Reinforcement Learning

"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction. (1.1). MIT Press, Cambridge, MA, 1998.

Artificial Intelligence's Next Big Step: Reinforcement Learning - The New Stack

#artificialintelligence | Feb-2-2017, 00:40:13 GMT

Almost every machine learning breakthrough you hear about (and most of what's currently called "artificial intelligence") is supervised learning, where you start with a curated and labeled data set. But another technique, reinforcement learning, is just starting to make its way out of the research lab. Reinforcement learning is where an agent learns by interacting with its environment. It isn't told by a trainer what to do; instead, it learns by trial and error which actions earn the highest reward in a given situation, even when the reward isn't obvious or immediate. It learns how to solve problems rather than being taught what solutions look like. Reinforcement learning is how DeepMind created the AlphaGo system that beat a high-ranking Go player (and has recently been winning online Go matches anonymously). It's how the University of California, Berkeley's BRETT robot learns how to move its hands and arms to perform physical tasks like stacking blocks or screwing the lid onto a bottle, in just three hours (or ten minutes if it's told where the objects it's going to work with are, and where they need to end up).

artificial intelligence, machine learning, reinforcement learning, (14 more...)

Country:

North America > United States > California > Alameda County > Berkeley (0.24)
Europe > Sweden > Skåne County > Malmö (0.05)

Industry:

Information Technology (1.00)
Leisure & Entertainment > Games > Go (0.89)
Leisure & Entertainment > Games > Computer Games (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Learning Policies For Learning Policies -- Meta Reinforcement Learning (RL²) in Tensorflow

#artificialintelligence | Feb-1-2017, 16:15:24 GMT

Reinforcement Learning provides a framework for training agents to solve problems in the world. One of the limitations of these agents, however, is their inflexibility once trained. They are able to learn a policy to solve a specific problem (formalized as an MDP), but that learned policy is often useless in new problems, even relatively similar ones. Imagine the simplest possible agent: one trained to solve a two-armed bandit task in which one arm always provides a positive reward, and the other arm always provides no reward. Using any RL algorithm, such as Q-Learning or Policy Gradient, the agent can quickly learn to always choose the arm with the positive reward.
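
The two-armed bandit described in this excerpt can be sketched in a few lines of Python. This is a minimal illustration rather than the linked article's TensorFlow code: the function name `train_two_armed_bandit`, the payoffs (1.0 and 0.0), the learning rate, and the exploration rate are all assumptions chosen for the example.

```python
import random

def train_two_armed_bandit(episodes=500, alpha=0.1, epsilon=0.1, seed=0):
    """Tabular value learning on a deterministic two-armed bandit (illustrative)."""
    rng = random.Random(seed)
    rewards = [1.0, 0.0]  # assumed payoffs: arm 0 always pays, arm 1 never pays
    q = [0.0, 0.0]        # estimated value of each arm
    for _ in range(episodes):
        # Epsilon-greedy: explore with probability epsilon, otherwise exploit.
        if rng.random() < epsilon:
            arm = rng.randrange(2)
        else:
            arm = max(range(2), key=lambda a: q[a])
        # A bandit has no next state, so the Q-learning target is just the reward.
        q[arm] += alpha * (rewards[arm] - q[arm])
    return q

q_values = train_two_armed_bandit()
best_arm = max(range(2), key=lambda a: q_values[a])
print(q_values, best_arm)
```

The learned values are tied to this one payoff assignment. If the rewarding arm were swapped, the table would have to be relearned from scratch, which is the inflexibility the excerpt points to.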

artificial intelligence, machine learning, reinforcement learning, (18 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Artificial Intelligence: Reinforcement Learning in Python

#artificialintelligence | Jan-28-2017, 07:16:11 GMT

When people talk about artificial intelligence, they usually don't mean supervised and unsupervised machine learning. These tasks are pretty trivial compared to what we think of AIs doing: playing chess and Go, driving cars, and beating video games at a superhuman level. Reinforcement learning has recently become popular for doing all of that and more. Much like deep learning, a lot of the theory was discovered in the 70s and 80s, but it hasn't been until recently that we've been able to observe firsthand the amazing results that are possible. In 2016 we saw Google's AlphaGo beat the world champion in Go. We saw AIs playing video games like Doom and Super Mario.

artificial intelligence, machine learning, reinforcement learning, (5 more...)

Genre:

Instructional Material > Course Syllabus & Notes (0.52)
Instructional Material > Online (0.40)

Industry:

Leisure & Entertainment > Games > Computer Games (0.82)
Education > Educational Technology > Educational Software > Computer Based Training (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.68)

jiweil/Neural-Dialogue-Generation

@machinelearnbot | Jan-27-2017, 21:40:08 GMT

This project is maintained by Jiwei Li. This repo will continue to be updated. After training, the trained models will be saved in save_t_given_s/model*. Decoding given a pre-trained generative model. The pre-trained model doesn't have to be a vanilla Seq2Seq model (for example, it can be a trained model from adversarial learning).

artificial intelligence, machine learning, reinforcement learning, (12 more...)

Country: North America > United States > California > Santa Clara County > Palo Alto (0.05)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.39)

VIME: Variational Information Maximizing Exploration

Houthooft, Rein, Chen, Xi, Duan, Yan, Schulman, John, De Turck, Filip, Abbeel, Pieter

arXiv.org Artificial Intelligence | Jan-27-2017

Scalable and effective exploration remains a key challenge in reinforcement learning (RL). While there are methods with optimality guarantees in the setting of discrete state and action spaces, these methods cannot be applied in high-dimensional deep RL scenarios. As such, most contemporary RL relies on simple heuristics such as epsilon-greedy exploration or adding Gaussian noise to the controls. This paper introduces Variational Information Maximizing Exploration (VIME), an exploration strategy based on maximization of information gain about the agent's belief of environment dynamics. We propose a practical implementation, using variational inference in Bayesian neural networks which efficiently handles continuous state and action spaces. VIME modifies the MDP reward function, and can be applied with several different underlying RL algorithms. We demonstrate that VIME achieves significantly better performance compared to heuristic exploration methods across a variety of continuous control tasks and algorithms, including tasks with very sparse rewards.
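
For reference, the two baseline heuristics named in the abstract, epsilon-greedy exploration and Gaussian noise added to the controls, can be sketched as follows. This does not implement VIME itself; the function names, argument names, and sample values are assumptions made for illustration.

```python
import random

def epsilon_greedy(q_values, epsilon, rng):
    """Pick a random discrete action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])

def gaussian_noise_control(mean_action, sigma, rng):
    """Perturb each dimension of a continuous control with zero-mean Gaussian noise."""
    return [u + rng.gauss(0.0, sigma) for u in mean_action]

rng = random.Random(0)
# Discrete case: hypothetical action-value estimates for three actions.
discrete_action = epsilon_greedy([0.2, 0.7, 0.1], epsilon=0.1, rng=rng)
# Continuous case: hypothetical two-dimensional control output of a policy.
noisy_control = gaussian_noise_control([0.5, -0.3], sigma=0.05, rng=rng)
print(discrete_action, noisy_control)
```

Both heuristics inject randomness that is independent of what the agent has already learned about the environment. The abstract contrasts this with an exploration signal derived from information gain about the environment dynamics.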

exploration, neural network, upstream oil & gas, (17 more...)

1605.09674

Genre: Research Report (0.64)

Industry: Energy > Oil & Gas > Upstream (0.36)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

Continuing To Learn the Structure of Learning

#artificialintelligence | Jan-26-2017, 01:07:28 GMT

Learning to reinforcement learn, by Jane X Wang, Zeb Kurth-Nelson, Dhruva Tirumala, Hubert Soyer, Joel Z Leibo, Remi Munos, Charles Blundell, Dharshan Kumaran, Matt Botvinick. In recent years deep reinforcement learning (RL) systems have attained superhuman performance in a number of challenging task domains. However, a major limitation of such applications is their demand for massive amounts of training data. A critical present objective is thus to develop deep RL methods that can adapt rapidly to new tasks. In the present work we introduce a novel approach to this challenge, which we refer to as deep meta-reinforcement learning. Previous work has shown that recurrent networks can support meta-learning in a fully supervised context.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

Country: North America > United States > Arizona (0.05)

Genre:

Overview (0.35)
Research Report (0.35)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.74)

Learning to reinforcement learn

Wang, Jane X, Kurth-Nelson, Zeb, Tirumala, Dhruva, Soyer, Hubert, Leibo, Joel Z, Munos, Remi, Blundell, Charles, Kumaran, Dharshan, Botvinick, Matt

arXiv.org Machine Learning | Jan-23-2017

In recent years deep reinforcement learning (RL) systems have attained superhuman performance in a number of challenging task domains. However, a major limitation of such applications is their demand for massive amounts of training data. A critical present objective is thus to develop deep RL methods that can adapt rapidly to new tasks. In the present work we introduce a novel approach to this challenge, which we refer to as deep meta-reinforcement learning. Previous work has shown that recurrent networks can support meta-learning in a fully supervised context. We extend this approach to the RL setting. What emerges is a system that is trained using one RL algorithm, but whose recurrent dynamics implement a second, quite separate RL procedure. This second, learned RL algorithm can differ from the original one in arbitrary ways. Importantly, because it is learned, it is configured to exploit structure in the training domain. We unpack these points in a series of seven proof-of-concept experiments, each of which examines a key aspect of deep meta-RL. We consider prospects for extending and scaling up the approach, and also point out some potentially important implications for neuroscience.

experiment, machine learning, reinforcement learning, (16 more...)

1611.05763

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.88)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

A Review of Reinforcement Learning Thrun

AITopics Original Links | Jan-19-2017, 10:58:41 GMT

artificial intelligence, machine learning, reinforcement learning thrun, (5 more...)

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.33)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.69)

[cs/9605103] Reinforcement Learning: A Survey

AITopics Original Links | Jan-19-2017, 10:46:06 GMT

artificial intelligence, machine learning, reinforcement learning, (3 more...)

Genre: Research Report (0.92)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.62)

Pioneering AI researcher to advise RBC's machine learning lab

#artificialintelligence | Jan-18-2017, 14:10:14 GMT

A pioneer in machine learning from the University of Alberta is teaming up with the Royal Bank of Canada on artificial intelligence research. Richard Sutton, a professor at the school's department of computer science and a graduate of the University of Massachusetts, will advise the bank's machine learning research division and collaborate with RBC's second AI research lab, to be located in Edmonton. Sutton specializes in the same branch of machine learning that Google's AlphaGo computer program used, in part, to beat one of the highest-ranking professional players of the board game Go -- until recently, a notoriously difficult game for computers to play. The announcement is the latest in a string of AI-related partnerships, acquisitions and investments that have been struck in Canada in recent months -- the most high-profile of which have involved Facebook and Google, which have been in a fierce competition for access to talent. For over three decades, Sutton has specialized in reinforcement learning. In this branch of machine learning, an algorithm is designed to receive either a reward or penalty based on its behaviour, and learns to make choices that will result in the most reward -- and, hopefully, most desired behaviour -- over time.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Country:

North America > Canada > Alberta (1.00)
North America > United States > Massachusetts (0.25)
North America > Canada > Ontario > Toronto (0.22)
Oceania > Australia > New South Wales > Sydney (0.05)

Industry:

Banking & Finance (0.94)
Information Technology (0.72)
Leisure & Entertainment > Games > Go (0.56)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.32)
