AITopics | Reinforcement Learning

Reinforcement Learning

"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction. (1.1). MIT Press, Cambridge, MA, 1998.
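
As a rough illustration of the trial-and-error idea in the quotation above, the sketch below runs an epsilon-greedy agent on a multi-armed bandit. The function name `run_bandit`, the Gaussian reward model, and the parameter values are assumptions made for this example and do not come from the book. The agent is never told which action is best and must find out from the rewards alone.

```python
import random

def run_bandit(true_means, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent on a Gaussian multi-armed bandit (illustrative setup)."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    estimates = [0.0] * n_arms  # running estimate of each action's mean reward
    counts = [0] * n_arms       # how many times each action has been tried
    total_reward = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try a random action.
            action = rng.randrange(n_arms)
        else:
            # Exploit: pick the action that currently looks best.
            action = max(range(n_arms), key=lambda a: estimates[a])
        # The numerical reward is the only feedback the learner receives.
        reward = rng.gauss(true_means[action], 1.0)
        counts[action] += 1
        # Incremental sample-average update of the chosen action's estimate.
        estimates[action] += (reward - estimates[action]) / counts[action]
        total_reward += reward
    return estimates, counts, total_reward
```

With well-separated means such as `run_bandit([0.0, 1.0, 3.0])`, the agent should end up choosing the third action most of the time, even though it starts with no knowledge of the reward means.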

Deep Reinforcement Learning framework for Autonomous Driving

Sallab, Ahmad El, Abdou, Mohammed, Perot, Etienne, Yogamani, Senthil

arXiv.org Machine Learning | Apr-8-2017

Reinforcement learning is considered to be a strong AI paradigm which can be used to teach machines through interaction with the environment and learning from their mistakes. Despite its perceived utility, it has not yet been successfully applied in automotive applications. Motivated by the successful demonstrations of learning of Atari games and Go by Google DeepMind, we propose a framework for autonomous driving using deep reinforcement learning. This is of particular relevance as it is difficult to pose autonomous driving as a supervised learning problem due to strong interactions with the environment including other vehicles, pedestrians and roadworks. As it is a relatively new area of research for autonomous driving, we provide a short overview of deep reinforcement learning and then describe our proposed framework. It incorporates Recurrent Neural Networks for information integration, enabling the car to handle partially observable scenarios. It also integrates the recent work on attention models to focus on relevant information, thereby reducing the computational complexity for deployment on embedded hardware. The framework was tested in an open source 3D car racing simulator called TORCS. Our simulation results demonstrate learning of autonomous maneuvering in a scenario of complex road curvatures and simple interaction of other vehicles.

artificial intelligence, machine learning, reinforcement learning, (13 more...)

arXiv.org Machine Learning

doi: 10.2352/ISSN.2470-1173.2017.19.AVM-023

arXiv: 1704.02532

Country: Europe > United Kingdom > England (0.28)

Genre: Research Report > New Finding (0.34)

Industry:

Transportation > Ground > Road (1.00)
Information Technology > Robotics & Automation (1.00)
Automobiles & Trucks (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)

Elon Musk's OpenAI has unveiled an unusual approach to building smarter machines

#artificialintelligence | Apr-7-2017, 20:05:27 GMT

In 2013 a British artificial-intelligence startup called DeepMind surprised computer scientists by showing off software that could learn to play classic Atari games better than an expert human player. DeepMind was soon acquired by Google, and the technique that beat the Atari games, reinforcement learning, has become a hot topic in the field of AI and robotics. Google used reinforcement learning to create software that beat a champion Go player last year. Now OpenAI, a nonprofit research institute cofounded and funded by Elon Musk, says it has discovered that an easier-to-use alternative to reinforcement learning can get rival results when it plays games and performs other tasks. At MIT Technology Review's EmTech Digital conference in San Francisco on Monday, OpenAI's research director, Ilya Sutskever, said that could allow researchers to make progress in machine learning faster.

artificial intelligence, computer game, Sutskever, (14 more...)

#artificialintelligence

Industry: Leisure & Entertainment > Games > Computer Games (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.86)

Combining policy gradient and Q-learning

O'Donoghue, Brendan, Munos, Remi, Kavukcuoglu, Koray, Mnih, Volodymyr

arXiv.org Artificial Intelligence | Apr-7-2017

Policy gradient is an efficient technique for improving a policy in a reinforcement learning setting. However, vanilla online variants are on-policy only and not able to take advantage of off-policy data. In this paper we describe a new technique that combines policy gradient with off-policy Q-learning, drawing experience from a replay buffer. This is motivated by making a connection between the fixed points of the regularized policy gradient algorithm and the Q-values. This connection allows us to estimate the Q-values from the action preferences of the policy, to which we apply Q-learning updates. We refer to the new technique as 'PGQL', for policy gradient and Q-learning. We also establish an equivalency between action-value fitting techniques and actor-critic algorithms, showing that regularized policy gradient techniques can be interpreted as advantage function learning algorithms. We conclude with some numerical examples that demonstrate improved data efficiency and stability of PGQL. In particular, we tested PGQL on the full suite of Atari games and achieved performance exceeding that of both asynchronous advantage actor-critic (A3C) and Q-learning.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

arXiv.org Artificial Intelligence

arXiv: 1611.01626

Country: Europe > United Kingdom > England (0.28)

Genre: Research Report (0.40)

Industry: Leisure & Entertainment > Games > Computer Games (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
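
The PGQL abstract above describes estimating Q-values from the policy's action preferences and then applying Q-learning updates to them. The sketch below shows one way this could look for a softmax policy. It assumes the estimate takes the form Q~(s, a) = alpha * (log pi(a|s) + H(pi(.|s))) + V(s), with alpha the entropy-regularization weight. That form, the tabular setup, and the names `q_from_policy` and `bellman_residual` are assumptions for this example, not the authors' implementation.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of action preferences."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]

def q_from_policy(logits, value, alpha):
    """Q-value estimates implied by a softmax policy and a state value.

    Assumed form: Q~(s, a) = alpha * (log pi(a|s) + H(pi(.|s))) + V(s).
    """
    pi = softmax(logits)
    log_pi = [math.log(p) for p in pi]
    entropy = -sum(p * lp for p, lp in zip(pi, log_pi))
    return [alpha * (lp + entropy) + value for lp in log_pi]

def bellman_residual(transition, logits, values, alpha, gamma):
    """Q-learning TD error for one replayed transition (s, a, r, s_next, done)."""
    s, a, r, s_next, done = transition
    q_s = q_from_policy(logits[s], values[s], alpha)
    target = r
    if not done:
        # Off-policy target: bootstrap from the greedy action in the next state.
        q_next = q_from_policy(logits[s_next], values[s_next], alpha)
        target += gamma * max(q_next)
    return target - q_s[a]
```

Under this assumed form the policy-weighted average of the estimates equals V(s), so the policy term behaves like an advantage estimate. A Q-learning step on a transition drawn from the replay buffer would adjust the preferences and values to shrink the residual returned by `bellman_residual`.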

Microsoft Maluuba teaches management 101 to machines in its first paper since being acquired

#artificialintelligence | Apr-6-2017, 20:02:58 GMT

In mid-January, the ongoing race for AI put Montreal-based Maluuba on our radar. Microsoft acquired the startup and its team of researchers to build better machine intelligence tools for analyzing unstructured text to enable more natural human-computer interaction -- think bots that can actually respond with reasonable intelligence to a text you send. The team dropped its first paper since being acquired, and it sheds light on what the group's priorities are. The paper outlines a method for multi-advisor reinforcement learning that breaks problems down to be simpler and more easily computable. In oversimplified terms, Maluuba is effectively trying to teach leadership to groups of machines working to solve problems.

machine learning, reinforcement, reinforcement learning, (8 more...)

#artificialintelligence

Country: North America > Canada > Quebec > Montreal (0.25)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.52)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.52)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.51)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.39)

Risk-Constrained Reinforcement Learning with Percentile Risk Criteria

Chow, Yinlam, Ghavamzadeh, Mohammad, Janson, Lucas, Pavone, Marco

arXiv.org Artificial Intelligence | Apr-6-2017

In many sequential decision-making problems one is interested in minimizing an expected cumulative cost while taking into account risk, i.e., increased awareness of events of small probability and high consequences. Accordingly, the objective of this paper is to present efficient reinforcement learning algorithms for risk-constrained Markov decision processes (MDPs), where risk is represented via a chance constraint or a constraint on the conditional value-at-risk (CVaR) of the cumulative cost. We collectively refer to such problems as percentile risk-constrained MDPs. Specifically, we first derive a formula for computing the gradient of the Lagrangian function for percentile risk-constrained MDPs. Then, we devise policy gradient and actor-critic algorithms that (1) estimate such gradient, (2) update the policy in the descent direction, and (3) update the Lagrange multiplier in the ascent direction. For these algorithms we prove convergence to locally optimal policies. Finally, we demonstrate the effectiveness of our algorithms in an optimal stopping problem and an online marketing application.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

arXiv.org Artificial Intelligence

arXiv: 1512.01629

Country: North America > United States > California > Santa Clara County (0.28)

Genre:

Research Report (0.49)
Overview (0.34)

Industry: Marketing (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.48)
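
The risk-constrained abstract above represents risk as a constraint on the CVaR of the cumulative cost and updates a Lagrange multiplier in the ascent direction. The sketch below shows two of those ingredients on sampled costs: an empirical CVaR estimate using the standard formula CVaR_alpha(Z) = min over nu of nu + E[(Z - nu)^+] / (1 - alpha), and a projected ascent step on the multiplier. The function names, the risk tolerance `beta`, and the step size are hypothetical choices for this example. It is not the paper's algorithm, which also updates the policy parameters.

```python
import math

def var_cvar(costs, alpha):
    """Empirical VaR and CVaR of sampled cumulative costs at confidence level alpha."""
    if not costs:
        raise ValueError("costs must be non-empty")
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    ordered = sorted(costs)
    n = len(ordered)
    # VaR: smallest sampled cost whose empirical CDF is at least alpha.
    k = max(math.ceil(alpha * n), 1)
    var = ordered[k - 1]
    # CVaR: VaR plus the mean excess over VaR, rescaled by the tail mass.
    excess = sum(max(c - var, 0.0) for c in ordered) / n
    return var, var + excess / (1.0 - alpha)

def dual_ascent_step(lmbda, cvar_estimate, beta, step_size):
    """Projected ascent on the multiplier for the constraint CVaR <= beta."""
    # The Lagrangian's derivative with respect to the multiplier is the
    # constraint violation, so the multiplier grows while CVaR exceeds beta.
    return max(0.0, lmbda + step_size * (cvar_estimate - beta))
```

For the costs 1 through 10 and alpha = 0.8, the CVaR estimate is the mean of the worst 20 percent of outcomes, i.e. the average of 9 and 10. The multiplier rises while that estimate exceeds the tolerance and is clipped at zero otherwise.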

Peter Stone: Robot Skill Learning: From the Real World to Simulation and Back | CMU RI Seminar

Robohub | Apr-2-2017, 03:05:21 GMT

Abstract: "For autonomous robots to operate in the open, dynamically changing world, they will need to be able to learn a robust set of interacting skills. This talk begins by introducing "Overlapping Layered Learning" as a novel hierarchical machine learning paradigm for learning such interacting skills in simulation. While learning in simulation is appealing because it avoids the prohibitive sample cost of learning in the real world, unfortunately policies learned in simulation often fail when applied on physical robots. This talk then introduces "Grounded Simulation Learning" to address this problem by algorithmically altering the simulator to better match the real world, and connects this new algorithm to a theoretical analysis of off-policy evaluation in reinforcement learning. Overlapping Layered Learning was the key deciding factor in UT Austin Villa's RoboCup robot soccer 3D simulation league championship, and Grounded Simulation Learning has led to the fastest known stable walk on a widely used humanoid robot."

artificial intelligence, machine learning, reinforcement learning, (7 more...)

Robohub

Industry: Leisure & Entertainment > Sports > Soccer (0.66)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.30)

The Next Challenges for Reinforcement Learning

#artificialintelligence | Apr-1-2017, 23:46:48 GMT

Recent years have seen great progress for AI. In particular, artificial agents have learned to classify images and recognize speech at near-human level. However, for artificial agents to reach their full potential, they should not only observe, but also act and learn from the consequences of their actions. Learning how to behave is especially important when an agent interacts with humans through natural language, because of the complexity of language and because each person has a different communication style. Reinforcement learning (RL) is the area of research that is concerned with learning effective behavior in a data-driven way.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

#artificialintelligence

Country: Europe > Sweden > Skåne County > Malmö (0.05)

Industry: Leisure & Entertainment > Games (0.71)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.32)

Apple's Artificial Intelligence Guru Talks About a Sci-Fi Future

#artificialintelligence | Mar-29-2017, 13:07:21 GMT

Artificial intelligence has made great progress in helping computers recognize images in photos and recommending products online that you're more likely to buy. But the technology still faces many challenges, especially when it comes to computers remembering things like humans do. On Tuesday, Apple's director of AI research, Ruslan Salakhutdinov, discussed some of those limitations. However, during his talk at an MIT Technology Review conference, he steered clear of how his secretive company incorporates AI into products like Siri. Salakhutdinov, who joined Apple in October, said he is particularly interested in a type of AI known as reinforcement learning, which researchers use to teach computers to repeatedly take different actions to figure out the best possible result.

artificial intelligence, machine learning, reinforcement learning, (9 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.30)

Deep learning boosted AI. Now the next big thing in machine intelligence is coming

#artificialintelligence | Mar-28-2017, 21:18:59 GMT

Inside a simple computer simulation, a group of self-driving cars are performing a crazy-looking maneuver on a four-lane virtual highway. Half are trying to move from the right-hand lanes just as the other half try to merge from the left. It seems like just the sort of tricky thing that might flummox a robot vehicle, but they manage it with precision. I'm watching the driving simulation at the biggest artificial-intelligence conference of the year, held in Barcelona this past December. What's most amazing is that the software governing the cars' behavior wasn't programmed in the conventional sense at all.

deep learning, ground transportation, neural network, (19 more...)

#artificialintelligence

Industry:

Transportation > Ground > Road (1.00)
Leisure & Entertainment > Games (1.00)
Information Technology (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.76)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.42)
