AITopics | Reinforcement Learning

Reinforcement Learning

"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction. (1.1). MIT Press, Cambridge, MA, 1998.

Better Exploration with Optimistic Actor-Critic

Ciosek, Kamil, Vuong, Quan, Loftin, Robert, Hofmann, Katja

arXiv.org Machine Learning Oct-28-2019

Actor-critic methods, a type of model-free Reinforcement Learning, have been successfully applied to challenging tasks in continuous control, often achieving state-of-the-art performance. However, wide-scale adoption of these methods in real-world domains is made difficult by their poor sample efficiency. We address this problem both theoretically and empirically. On the theoretical side, we identify two phenomena preventing efficient exploration in existing state-of-the-art algorithms such as Soft Actor Critic. First, combining a greedy actor update with a pessimistic estimate of the critic leads to the avoidance of actions that the agent does not know about, a phenomenon we call pessimistic underexploration. Second, current algorithms are directionally uninformed, sampling actions with equal probability in opposite directions from the current mean. This is wasteful, since we typically need actions taken along certain directions much more than others. To address both of these phenomena, we introduce a new algorithm, Optimistic Actor Critic (OAC), which approximates a lower and upper confidence bound on the state-action value function. This allows us to apply the principle of optimism in the face of uncertainty to perform directed exploration using the upper bound while still using the lower bound to avoid overestimation. We evaluate OAC in several challenging continuous control tasks, achieving state-of-the-art sample efficiency.
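The two-bound idea in the abstract above can be illustrated with a small sketch. It assumes, purely for illustration, that the bounds are built from two critic estimates as their mean plus or minus a scaled half-difference; the abstract does not state how the bounds are constructed, and the names `confidence_bounds`, `beta_ub`, `beta_lb` and the toy critics below are hypothetical.

```python
def confidence_bounds(q1, q2, beta_ub=1.0, beta_lb=1.0):
    """Return (lower, upper) bounds from two critic estimates.

    Assumption: the bounds are mean -/+ beta * half-difference.
    With beta_lb = 1 the lower bound equals min(q1, q2).
    """
    mean = 0.5 * (q1 + q2)
    spread = 0.5 * abs(q1 - q2)
    return mean - beta_lb * spread, mean + beta_ub * spread


def pick_action(candidates, critic_1, critic_2, optimistic, beta_ub=2.0):
    """Pick the candidate with the highest upper (or lower) bound."""
    def score(action):
        lower, upper = confidence_bounds(
            critic_1(action), critic_2(action), beta_ub=beta_ub
        )
        return upper if optimistic else lower

    return max(candidates, key=score)


# Toy critics (hypothetical): they agree at a = 0 and disagree elsewhere.
critic_1 = lambda a: -a * a
critic_2 = lambda a: -a * a + a

candidates = [-1.0, 0.0, 1.0]
explore = pick_action(candidates, critic_1, critic_2, optimistic=True)
exploit = pick_action(candidates, critic_1, critic_2, optimistic=False)
```

In this toy setup the pessimistic choice stays at the action where the critics agree, while the optimistic choice moves toward the action whose value is uncertain, which is the behaviour the abstract calls directed exploration.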

algorithm, exploration, exploration policy, (15 more...)

arXiv.org Machine Learning

1910.12807

Country:

North America > United States > Colorado > Denver County > Denver (0.14)
North America > Canada > Quebec > Montreal (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(14 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.91)

A framework for deep energy-based reinforcement learning with quantum speed-up

Jerbi, Sofiene, Nautrup, Hendrik Poulsen, Trenkwalder, Lea M., Briegel, Hans J., Dunjko, Vedran

arXiv.org Artificial Intelligence Oct-28-2019

In the past decade, deep learning methods have seen tremendous success in various supervised and unsupervised learning tasks such as classification and generative modeling. More recently, deep neural networks have emerged in the domain of reinforcement learning as a tool to solve decision-making problems of unprecedented complexity, e.g., navigation problems or game-playing AI. Despite the successful combinations of ideas from quantum computing with machine learning methods, there have been relatively few attempts to design quantum algorithms that would enhance deep reinforcement learning. This is partly due to the fact that quantum enhancements of deep neural networks, in general, have not been as extensively investigated as other quantum machine learning methods. In contrast, projective simulation is a reinforcement learning model inspired by the stochastic evolution of physical systems that enables a quantum speed-up in decision making. In this paper, we develop a unifying framework that connects deep learning and projective simulation, opening the route to quantum improvements in deep reinforcement learning. Our approach is based on so-called generative energy-based models to design reinforcement learning methods with a computational advantage in solving complex and large-scale decision-making problems.

agent, learning, neural network, (15 more...)

arXiv.org Artificial Intelligence

1910.1276

Country:

Europe > Netherlands > South Holland > Leiden (0.04)
Europe > Austria > Tyrol > Innsbruck (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Genre: Research Report (0.81)

Industry:

Leisure & Entertainment > Games (0.92)
Education (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Asynchronous Methods for Model-Based Reinforcement Learning

Zhang, Yunzhi, Clavera, Ignasi, Tsai, Boren, Abbeel, Pieter

arXiv.org Artificial Intelligence Oct-28-2019

Significant progress has been made in the area of model-based reinforcement learning. State-of-the-art algorithms are now able to match the asymptotic performance of model-free methods while being significantly more data efficient. However, this success has come at a price: state-of-the-art model-based methods require significant computation interleaved with data collection, resulting in run times that take days, even if the amount of agent interaction might be just hours or even minutes. When considering the goal of learning in real-time on real robots, this means these state-of-the-art model-based algorithms still remain impractical. In this work, we propose an asynchronous framework for model-based reinforcement learning methods that brings down the run time of these algorithms to be just the data collection time. We evaluate our asynchronous framework on a range of standard MuJoCo benchmarks. We also evaluate our asynchronous framework on three real-world robotic manipulation tasks. We show how asynchronous learning not only speeds up learning with respect to wall-clock time through parallelization, but also further reduces the sample complexity of model-based approaches by improving exploration and by effectively preventing the policy from overfitting to the deficiencies of learned dynamics models.
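The contrast between interleaved and asynchronous execution described in the abstract above can be sketched with two threads sharing a queue. This is only a toy illustration of the general pattern: the abstract does not describe the framework's components, and `run_async`, the worker functions and the placeholder "transitions" are hypothetical.

```python
import queue
import threading


def run_async(num_steps=5):
    """Run a data collector and a learner concurrently.

    The collector never waits for learning to finish; the learner
    consumes transitions as soon as they become available.
    """
    buffer = queue.Queue()   # thread-safe FIFO shared by both workers
    trained_on = []          # steps the learner has processed

    def collect():
        for step in range(num_steps):
            buffer.put(("transition", step))  # stand-in for real data
        buffer.put(None)                      # sentinel: collection done

    def learn():
        while True:
            item = buffer.get()               # blocks until data arrives
            if item is None:
                break
            trained_on.append(item[1])        # stand-in for an update

    workers = [threading.Thread(target=collect),
               threading.Thread(target=learn)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return trained_on


processed = run_async(5)
```

Because neither worker blocks the other except when the queue is empty, total run time in such a design is bounded by the slower of the two loops rather than by their sum.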

algorithm, asynchronous framework, learning, (14 more...)

arXiv.org Artificial Intelligence

1910.12453

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Asia > Middle East > Jordan (0.04)
Asia > Japan > Honshū > Kansai > Osaka Prefecture > Osaka (0.04)

Genre: Research Report > Promising Solution (0.54)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Deep Reinforcement Learning: Frontiers of Artificial Intelligence

#artificialintelligence Oct-27-2019, 21:22:01 GMT

Deep Reinforcement Learning: Frontiers of Artificial Intelligence. Books by Mohit Sewak.

Book Description: This book starts by presenting the basics of reinforcement learning using highly intuitive and easy-to-understand examples and applications, and then introduces the cutting-edge research advances that make reinforcement learning capable of outperforming most state-of-the-art systems, and even humans, in a number of applications. The book not only equips readers with an understanding of multiple advanced and innovative algorithms, but also prepares them to implement systems such as those created by Google DeepMind in actual code. This book is intended for readers who want to both understand and apply advanced concepts in a field that combines the best of two worlds – deep learning and reinforcement learning – to tap the potential of 'advanced artificial intelligence' for creating real-world applications and game-winning algorithms.

deep reinforcement learning, frontier, reinforcement learning, (8 more...)

#artificialintelligence

Genre: Summary/Review (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.61)

Reinforcement Learning Algorithms with Python

#artificialintelligence Oct-27-2019, 19:38:11 GMT

Reinforcement Learning Algorithms with Python: Learn, understand, and develop smart algorithms for addressing AI challenges. Develop self-learning algorithms and agents using TensorFlow and other Python tools, frameworks, and libraries.

Key Features: Learn, develop, and deploy advanced reinforcement learning algorithms to solve a variety of tasks. Understand and develop model-free and model-based algorithms for building self-learning agents. Work with advanced Reinforcement Learning concepts and algorithms such as imitation learning and evolution strategies.

Book Description: Reinforcement Learning (RL) is a popular and promising branch of AI that involves making smarter models and agents that can automatically determine ideal behavior based on changing requirements. This book will help you master RL algorithms and understand their implementation as you build self-learning agents. Starting with an introduction to the tools, libraries, and setup needed to work in the RL environment, this book covers the building blocks of RL and delves into value-based methods, such as the application of Q-learning and SARSA algorithms. You'll learn how to use a combination of Q-learning and neural networks to solve complex problems. Furthermore, you'll study the policy gradient methods, TRPO, and PPO, to improve performance and stability, before moving on to the DDPG and TD3 deterministic algorithms. This book also covers how imitation learning techniques work and how DAgger can teach an agent to drive.

algorithm, reinforcement learning algorithm, reinforcement learning implementing rl cycle, (12 more...)

#artificialintelligence

Genre:

Summary/Review (0.56)
Instructional Material > Course Syllabus & Notes (0.36)
Collection (0.36)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

A robot hand taught itself to solve a Rubik's Cube after creating its own training regime

#artificialintelligence Oct-27-2019, 16:07:54 GMT

Over a year ago, OpenAI, the San Francisco–based for-profit AI research lab, announced that it had trained a robotic hand to manipulate a cube with remarkable dexterity. That might not sound earth-shattering. But in the AI world, it was impressive for two reasons. First, the hand had taught itself how to fidget with the cube using a reinforcement-learning algorithm, a technique modeled on the way animals learn. Second, all the training had been done in simulation, but it managed to successfully translate to the real world.

algorithm, robot, rubik, (14 more...)

#artificialintelligence

Country:

North America > United States > California > San Francisco County > San Francisco (0.25)
North America > United States > Michigan (0.05)

Industry: Leisure & Entertainment > Games > Rubik's Cube (0.89)

Technology:

Information Technology > Artificial Intelligence > Robots > Manipulation (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.61)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.44)
(2 more...)

Don't Ever Ignore Reinforcement Learning Again - WebSystemer.no

#artificialintelligence Oct-27-2019, 02:02:23 GMT

Do you want to automate stunt-flying manoeuvres for helicopters? Or are you managing an investment portfolio? Do you want to take over the control of a power station? Or are you aiming to control the locomotion dynamics of a humanoid robot? Do you want to defeat a World Champion in Chess, Backgammon or Go?

agent, current position, electric shock, (12 more...)

#artificialintelligence

Industry: Leisure & Entertainment > Games (0.55)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.89)

Task-Oriented Language Grounding for Language Input with Multiple Sub-Goals of Non-Linear Order

Kurenkov, Vladislav, Maksudov, Bulat, Khan, Adil

arXiv.org Artificial Intelligence Oct-27-2019

In this work, we analyze the performance of general deep reinforcement learning algorithms for a task-oriented language grounding problem, where language input contains multiple sub-goals and their order of execution is non-linear. We generate a simple instructional language for the GridWorld environment, which is built around three language elements (order connectors) defining the order of execution: one linear ("comma") and two non-linear ("but first" and "but before"). We apply one of the deep reinforcement learning baselines, Double DQN with frame stacking, and ablate several extensions such as Prioritized Experience Replay and the Gated-Attention architecture. Our results show that the introduction of non-linear order connectors improves the success rate on instructions with a higher number of sub-goals by a factor of 2-3, but it still does not exceed 20%. Also, we observe that the usage of Gated-Attention provides no competitive advantage against concatenation in this setting. Source code and experimental results are available at https://github.com/vkurenkov/language-grounding-multigoal
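For readers unfamiliar with the baseline named in the abstract above, the standard Double DQN bootstrap target selects the next action with the online network and evaluates it with the target network. The sketch below shows that rule on plain lists of Q-values; the function name, arguments and numbers are hypothetical and are not taken from the authors' repository.

```python
def double_dqn_target(reward, done, next_q_online, next_q_target, gamma=0.99):
    """Compute the Double DQN target for a single transition.

    next_q_online: Q-values of the next state from the online network.
    next_q_target: Q-values of the next state from the target network.
    """
    if done:
        # No bootstrapping past the end of an episode.
        return reward
    # Action selection uses the online network ...
    best_action = max(range(len(next_q_online)),
                      key=lambda a: next_q_online[a])
    # ... while action evaluation uses the target network.
    return reward + gamma * next_q_target[best_action]


# Illustrative values only: the online network prefers action 1,
# so the target network's estimate for action 1 (0.3) is used.
target = double_dqn_target(
    reward=1.0, done=False,
    next_q_online=[0.2, 0.9], next_q_target=[0.5, 0.3], gamma=0.5,
)
```

Decoupling selection from evaluation is what distinguishes this target from the plain DQN target, which would take the maximum of the target network's own values (0.5 in this example).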

connector, instruction, order connector, (14 more...)

arXiv.org Artificial Intelligence

1910.12354

Country: North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report > New Finding (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Long-term Joint Scheduling for Urban Traffic

Liang, Xianfeng, Wu, Likang, Chen, Joya, Liu, Yang, Yu, Runlong, Hou, Min, Wu, Han, Ye, Yuyang, Liu, Qi, Chen, Enhong

arXiv.org Artificial Intelligence Oct-27-2019

Recently, traffic congestion in modern cities has become a growing worry for residents. As presented in the Baidu traffic report, the commuting stress index has reached a surprising 1.973 in Beijing during rush hours, which results in longer trip times and increased vehicular queueing. Previous work has demonstrated that with reasonable scheduling, e.g., rebalancing bike-sharing systems and optimizing bus transportation, traffic efficiency can be significantly improved with little resource consumption. However, two disadvantages still restrict the performance of these approaches: (1) they only consider a single scheduling step over a short time, ignoring the layout after the first reposition, and (2) they only focus on a single mode of transport. As a result, the multi-modal characteristics of urban public transportation are largely under-exploited. In this paper, we propose an efficient and economical multi-modal traffic scheduling scheme named JLRLS, based on spatio-temporal prediction, which adopts reinforcement learning to obtain an optimal long-term and joint schedule. In JLRLS, we combine multiple modes of transportation and schedule each according to its own characteristics, which potentially helps the system reach optimal performance. Our implementation of an example in PaddlePaddle is available at https://github.com/bigdata-ustc/Long-term-Joint-Scheduling, with an explanatory video at https://youtu.be/t5M2wVPhTyk.

bike, bus system, scheduling, (14 more...)

arXiv.org Artificial Intelligence

1910.12283

Country:

Asia > China > Beijing > Beijing (0.25)
Asia > China > Anhui Province (0.04)

Genre: Research Report (0.50)

Industry:

Transportation > Infrastructure & Services (1.00)
Transportation > Ground > Road (1.00)

Technology:

Information Technology > Data Science (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)
