AITopics | Reinforcement Learning


Reinforcement Learning

"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction. (1.1). MIT Press, Cambridge, MA, 1998.


Deep Reinforcement Learning: Pong from Pixels

#artificialintelligence Aug-29-2016, 19:20:42 GMT

This is a long overdue blog post on Reinforcement Learning (RL). You may have noticed that computers can now automatically learn to play ATARI games (from raw game pixels!), they are beating world champions at Go, simulated quadrupeds are learning to run and leap, and robots are learning how to perform complex manipulation tasks that defy explicit programming. It turns out that all of these advances fall under the umbrella of RL research. I also became interested in RL myself over the last year: I worked through Richard Sutton's book, read through David Silver's course, watched John Schulman's lectures, wrote an RL library in JavaScript, over the summer interned at DeepMind working in the DeepRL group, and most recently pitched in a little with the design/development of OpenAI Gym, a new RL benchmarking toolkit. So I've certainly been on this funwagon for at least a year but until now I haven't gotten around to writing up a short post on why RL is a big deal, what it's about, how it all developed and where it might be going. It's interesting to reflect on the nature of recent progress in RL. Similar to what happened in Computer Vision, the progress in RL is not driven as much as you might reasonably assume by new amazing ideas. In Computer Vision, the 2012 AlexNet was mostly a scaled up (deeper and wider) version of 1990s ConvNets. Similarly, the ATARI Deep Q Learning paper from 2013 is an implementation of a standard algorithm (Q Learning with function approximation, which you can find in the standard RL book of Sutton 1998), where the function approximator happened to be a ConvNet. AlphaGo uses policy gradients with Monte Carlo Tree Search (MCTS); these are also standard components.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

#artificialintelligence

Industry:

Leisure & Entertainment > Games > Computer Games (0.55)
Leisure & Entertainment > Games > Go (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)


Reinforcement Learning and DQN, learning to play from pixels - Ruben Fiszel's website

#artificialintelligence Aug-29-2016, 10:41:09 GMT

My 2 month summer internship at Skymind (the company behind the open source deep learning library DL4J) comes to an end and this is a post to summarize what I have been working on: Building a deep reinforcement learning library for DL4J: … (drum roll) … RL4J! This post begins with an introduction to reinforcement learning, follows with a detailed explanation of DQN (Deep Q-Network) for pixel inputs, and concludes with an RL4J example. I will assume some familiarity with neural networks on the reader's part. But first, let's talk about the core concepts of reinforcement learning. A "simple aspect of science" may be defined as one which, through good fortune, I happen to understand. Reinforcement Learning is an exciting area of machine learning. It is basically the learning of an efficient strategy in a given environment. Informally, this is very similar to Pavlovian conditioning: you assign a reward for a given behavior and over time, the agents learn to reproduce that behavior in order to receive more rewards. It is an iterative trial and error process. Formally, an environment is defined as a Markov Decision Process (MDP). Note: It is usually more convenient to use the set of actions \(A_s\), which is the set of moves available from a given state \(s\), than the complete set \(A\). \(A_s\) is simply the set of elements \(a\) in \(A\) such that \(P(s' \mid s, a) > 0\).
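The definition of \(A_s\) in the excerpt above can be illustrated with a minimal sketch. It assumes the transition function is stored as a dictionary keyed by (state, action) pairs, and reads the condition as "positive probability for at least one next state"; the names `available_actions` and `toy_mdp` are hypothetical and are not taken from RL4J or from the post.

```python
from typing import Dict, Hashable, Set, Tuple

State = Hashable
Action = Hashable
# Assumed layout: transitions[(s, a)] maps each next state s' to P(s' | s, a).
Transitions = Dict[Tuple[State, Action], Dict[State, float]]


def available_actions(transitions: Transitions, state: State) -> Set[Action]:
    """Return A_s: every action a with P(s' | s, a) > 0 for at least one s'."""
    return {
        a
        for (s, a), next_state_probs in transitions.items()
        if s == state and any(p > 0 for p in next_state_probs.values())
    }


# Hypothetical three-action toy MDP, used only for illustration.
toy_mdp: Transitions = {
    ("start", "left"): {"start": 1.0},
    ("start", "right"): {"goal": 0.8, "start": 0.2},
    ("start", "jump"): {"goal": 0.0},  # zero probability, so not in A_start
    ("goal", "left"): {"start": 1.0},
}

print(sorted(available_actions(toy_mdp, "start")))  # ['left', 'right']
```

Filtering actions this way keeps an agent from evaluating moves that can never lead anywhere from the current state, which is the convenience the note describes.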

machine learning, reinforcement learning, transition, (15 more...)

#artificialintelligence

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.48)


deeplearning4j/rl4j

#artificialintelligence Aug-28-2016, 19:00:52 GMT

Reinforcement learning framework integrated with deeplearning4j. This is a tech preview and distributed as is.

artificial intelligence, deep learning, reinforcement learning, (1 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.49)


Deep Deterministic Policy Gradients in TensorFlow

#artificialintelligence Aug-22-2016, 01:25:38 GMT

Deep Reinforcement Learning has recently gained a lot of traction in the machine learning community due to the significant amount of progress that has been made in the past few years. Traditionally, reinforcement learning algorithms were constrained to tiny, discretized grid worlds, which seriously inhibited them from gaining credibility as being viable machine learning tools. Here's a classic example from Richard Sutton's book, which I will be referencing a lot. After Deep Q-Networks [4] became a hit, people realized that deep learning methods could be used to solve high-dimensional problems. One of the subsequent challenges that the reinforcement learning community faced was figuring out how to deal with continuous action spaces.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


Single-shot Adaptive Measurement for Quantum-enhanced Metrology

Palittapongarnpim, Pantita, Wittek, Peter, Sanders, Barry C.

arXiv.org Machine Learning Aug-22-2016

Quantum-enhanced metrology aims to estimate an unknown parameter such that the precision scales better than the shot-noise bound. Single-shot adaptive quantum-enhanced metrology (AQEM) is a promising approach that uses feedback to tweak the quantum process according to previous measurement outcomes. Techniques and formalism for the adaptive case are quite different from the usual non-adaptive quantum metrology approach due to the causal relationship between measurements and outcomes. We construct a formal framework for AQEM by modeling the procedure as a decision-making process, and we derive the imprecision and the Cramér-Rao lower bound with explicit dependence on the feedback policy. We also explain the reinforcement learning approach for generating quantum control policies, which is adopted because the optimal policy is non-trivial to devise. Applying a learning algorithm based on differential evolution enables us to attain imprecision for adaptive interferometric phase estimation, which turns out to be the standard quantum limit (SQL) when non-entangled particles are used in the scheme.

evolutionary algorithm, machine learning, reinforcement learning, (18 more...)

arXiv.org Machine Learning

doi: 10.1117/12.2237355

1608.06238

Country:

North America > Canada (0.68)
North America > United States (0.68)
Europe (0.68)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (0.69)


Teaching Machines to Direct Traffic through Deep Reinforcement Learning

#artificialintelligence Aug-18-2016, 15:40:30 GMT

Rush hour--the dreaded time of day when traffic conditions seem bent on making you late. As your car slowly creeps in line behind countless others stuck at a stop light, you think to yourself, "Why aren't these lights changing faster?" Traffic control scientists have long tried to solve this signaling problem. Unfortunately, the complexity of traffic situations has made the job extremely hard. A recent study suggests that machines can learn how to plan traffic signals just right to reduce wait times and make traffic queues shorter.

algorithm, artificial intelligence, machine learning, (6 more...)

#artificialintelligence

Genre: Research Report (0.94)

Industry:

Transportation > Ground > Road (0.94)
Transportation > Infrastructure & Services (0.74)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)


Physical Review Letters - Accepted Paper: Quantum-enhanced machine learning

#artificialintelligence Aug-18-2016, 05:10:40 GMT

The emerging field of quantum machine learning has the potential to substantially aid in the problems and scope of artificial intelligence. This is only enhanced by recent successes in the field of classical machine learning. In this work we propose an approach for the systematic treatment of machine learning, from the perspective of quantum information. Our approach is general and covers all three main branches of machine learning: supervised, unsupervised and reinforcement learning. While quantum improvements in supervised and unsupervised learning have been reported, reinforcement learning has received much less attention.

artificial intelligence, machine learning, reinforcement learning, (3 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.65)


aikorea/awesome-rl

#artificialintelligence Aug-17-2016, 17:05:03 GMT

A curated list of resources dedicated to reinforcement learning. We are looking for more contributors and maintainers!

aikorea awesome-rl, artificial intelligence, reinforcement learning, (1 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.55)


Teaching machines to direct traffic through deep reinforcement learning

#artificialintelligence Aug-17-2016, 10:06:17 GMT

Rush hour--the dreaded time of day when traffic conditions seem bent on making you late. As your car slowly creeps in line behind countless others stuck at a stop light, you think to yourself, "Why aren't these lights changing faster?" Traffic control scientists have long tried to solve this signaling problem. Unfortunately, the complexity of traffic situations makes the job extremely hard. A recent study suggests that machines can learn how to plan traffic signals just right to reduce wait times and make traffic queues shorter.

machine learning, reinforcement, reinforcement learning, (10 more...)

#artificialintelligence

Genre: Research Report (0.94)

Industry:

Transportation > Ground > Road (1.00)
Transportation > Infrastructure & Services (0.74)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)


How deep reinforcement learning can help chatbots

#artificialintelligence Aug-16-2016, 11:40:20 GMT

In March this year, Microsoft CEO Satya Nadella talked about the industry trend of using human language more pervasively for interaction with computing devices, a trend he called "conversation as a platform." He also announced several bot initiatives, including the company's bot framework. In April, Facebook launched its Messenger platform with bots. Then, in May, Google announced its attempt to develop AI-powered bots, called Google Assistant. Since then, bots have been widely regarded as a new user interface (UI) to fundamentally change how computing will be experienced by people.

machine learning, natural language, reinforcement learning, (19 more...)

#artificialintelligence

Industry: Information Technology (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.91)
