AITopics | Reinforcement Learning

Reinforcement Learning

"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction. (1.1). MIT Press, Cambridge, MA, 1998.
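
The trial-and-error idea in the quotation can be illustrated with a small sketch. It is not taken from the cited book's text: the function name `run_bandit`, the toy reward means, and the exploration rate `epsilon` are all assumptions chosen for illustration. The learner is never told which action is best; it estimates each action's reward from the rewards it observes.

```python
import random

def run_bandit(true_means, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy action selection on a toy multi-armed bandit (illustrative)."""
    rng = random.Random(seed)
    n_actions = len(true_means)
    estimates = [0.0] * n_actions  # running estimate of each action's mean reward
    counts = [0] * n_actions       # number of times each action was tried

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try a random action.
            action = rng.randrange(n_actions)
        else:
            # Exploit: pick the action with the highest current estimate.
            action = max(range(n_actions), key=lambda a: estimates[a])
        # The environment returns a noisy numerical reward (hypothetical model).
        reward = rng.gauss(true_means[action], 1.0)
        counts[action] += 1
        # Incremental update of the sample mean for the chosen action.
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates, counts

estimates, counts = run_bandit([0.0, 0.5, 2.0])
print(counts)  # the action with the highest true mean should dominate
```

With a clear gap between the reward means, the action with the highest mean ends up being chosen most often, even though the learner starts with no knowledge of the rewards.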

Generative Adversarial Imagination for Sample Efficient Deep Reinforcement Learning

Kielak, Kacper

arXiv.org Artificial Intelligence, Apr-30-2019

Reinforcement learning has seen great advancements in the past five years. The successful introduction of deep learning in place of more traditional methods allowed reinforcement learning to scale to very complex domains, achieving super-human performance in environments like the game of Go or numerous video games. Despite great successes in multiple domains, these new methods suffer from their own issues that often make them inapplicable to real-world problems. Extreme lack of data efficiency, together with huge variance and difficulty in enforcing safety constraints, is one of the three most prominent issues in the field. Usually, millions of data points sampled from the environment are necessary for these algorithms to converge to acceptable policies. This thesis proposes a novel Generative Adversarial Imaginative Reinforcement Learning algorithm. It takes advantage of the recent introduction of highly effective generative adversarial models, and of the Markov property that underpins the reinforcement learning setting, to model the dynamics of the real environment within an internal imagination module. Rollouts from the imagination are then used to simulate the real environment in a standard reinforcement learning process, avoiding the often expensive and dangerous trial and error in the real environment. Experimental results show that the proposed algorithm uses experience from the real environment more economically than the current state-of-the-art Rainbow DQN algorithm, and thus makes an important step towards sample-efficient deep reinforcement learning.

algorithm, computer game, upstream oil & gas, (23 more...)

arXiv.org Artificial Intelligence

1904.13255

Country: North America > United States (0.14)

Genre: Research Report > New Finding (1.00)

Industry:

Leisure & Entertainment > Games > Computer Games (0.66)
Energy > Oil & Gas > Upstream (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.67)

Routing Networks and the Challenges of Modular and Compositional Computation

Rosenbaum, Clemens, Cases, Ignacio, Riemer, Matthew, Klinger, Tim

arXiv.org Machine Learning, Apr-29-2019

Compositionality is a key strategy for addressing combinatorial complexity and the curse of dimensionality. Recent work has shown that compositional solutions can be learned and offer substantial gains across a variety of domains, including multi-task learning, language modeling, visual question answering, machine comprehension, and others. However, such models present unique challenges during training when both the module parameters and their composition must be learned jointly. In this paper, we identify several of these issues and analyze their underlying causes. Our discussion focuses on routing networks, a general approach to this problem, and examines empirically the interplay of these challenges and a variety of design decisions. In particular, we consider the effect of how the algorithm decides on module composition, how the algorithm updates the modules, and if the algorithm uses regularization.

architecture, neural network, survey article, (21 more...)

arXiv.org Machine Learning

1904.12774

Country: North America > United States > Massachusetts (0.28)

Genre: Research Report (0.82)

Industry:

Health & Medicine > Therapeutic Area > Neurology (0.93)
Energy > Oil & Gas (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.95)
(2 more...)

A Deep Q-Learning Method for Downlink Power Allocation in Multi-Cell Networks

Ahmed, Kazi Ishfaq, Hossain, Ekram

arXiv.org Machine Learning, Apr-29-2019

Optimal resource allocation is a fundamental challenge for dense and heterogeneous wireless networks with massive wireless connections. Because of the non-convex nature of the optimization problem, it is computationally demanding to obtain the optimal resource allocation. Recently, deep reinforcement learning (DRL) has emerged as a promising technique for solving non-convex optimization problems. Unlike deep learning (DL), DRL does not require an optimal or near-optimal training dataset, which is either unavailable or computationally expensive to generate synthetically. In this paper, we propose a novel centralized DRL-based downlink power allocation scheme for a multi-cell system, intended to maximize the total network throughput. Specifically, we apply a deep Q-learning (DQL) approach to achieve a near-optimal power allocation policy. For benchmarking the proposed approach, we use a Genetic Algorithm (GA) to obtain a near-optimal power allocation solution. Simulation results show that the proposed DRL-based power allocation scheme performs better than the conventional power allocation schemes in a multi-cell scenario.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

arXiv.org Machine Learning

1904.13032

Genre: Research Report (1.00)

Industry:

Telecommunications (0.84)
Information Technology > Networks (0.51)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.35)
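
For readers unfamiliar with the Q-learning rule that deep Q-learning builds on, here is a minimal tabular sketch. It is not the scheme from the paper above, which approximates the Q-function with a deep network. The function name `q_update`, the table layout, and the values of `alpha` and `gamma` are illustrative assumptions.

```python
def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the new value.

    q is a list of lists: q[state][action] is the current value estimate.
    alpha is the learning rate and gamma the discount factor (both assumed).
    """
    # Bootstrapped target: immediate reward plus discounted best next value.
    td_target = reward + gamma * max(q[next_state])
    # Move the current estimate a fraction alpha towards the target.
    q[state][action] += alpha * (td_target - q[state][action])
    return q[state][action]

# Hypothetical two-state, two-action table.
q_table = [[0.0, 0.0], [1.0, 2.0]]
new_value = q_update(q_table, state=0, action=1, reward=1.0, next_state=1)
print(new_value)
```

In a power allocation setting, an action could index a discrete transmit power level and the reward could be the resulting throughput, but that mapping is an assumption made here for illustration.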

Challenges of Real-World Reinforcement Learning

Dulac-Arnold, Gabriel, Mankowitz, Daniel, Hester, Todd

arXiv.org Artificial Intelligence, Apr-29-2019

Reinforcement learning (RL) has proven its worth in a series of artificial domains, and is beginning to show some successes in real-world scenarios. However, many of the research advances in RL are hard to leverage in real-world systems due to a series of assumptions that are rarely satisfied in practice. We present a set of nine unique challenges that must be addressed to productionize RL for real-world problems. For each of these challenges, we specify the exact meaning of the challenge, present some approaches from the literature, and specify some metrics for evaluating that challenge. An approach that addresses all nine challenges would be applicable to a large number of real-world problems. We also present an example domain that has been modified to present these challenges as a testbed for practical RL research.

artificial intelligence, machine learning, reinforcement learning, (13 more...)

arXiv.org Artificial Intelligence

1904.12901

Country: North America > United States > California (0.46)

Genre: Research Report (0.41)

Industry:

Leisure & Entertainment > Games (0.46)
Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Deep Neuroevolution of Recurrent and Discrete World Models

Risi, Sebastian, Stanley, Kenneth O.

arXiv.org Artificial Intelligence, Apr-28-2019

Neural architectures inspired by our own human cognitive system, such as the recently introduced world models, have been shown to outperform traditional deep reinforcement learning (RL) methods in a variety of different domains. Instead of the relatively simple architectures employed in most RL experiments, world models rely on multiple different neural components that are responsible for visual information processing, memory, and decision-making. However, so far the components of these models have to be trained separately and through a variety of specialized training methods. This paper demonstrates the surprising finding that models with the same precise parts can be instead efficiently trained end-to-end through a genetic algorithm (GA), reaching a comparable performance to the original world model by solving a challenging car racing task. An analysis of the evolved visual and memory system indicates that they include a similar effective representation to the system trained through gradient descent. Additionally, in contrast to gradient descent methods that struggle with discrete variables, GAs also work directly with such representations, opening up opportunities for classical planning in latent space. This paper adds additional evidence on the effectiveness of deep neuroevolution for tasks that require the intricate orchestration of multiple components in complex heterogeneous architectures.

evolutionary algorithm, machine learning, reinforcement learning, (17 more...)

arXiv.org Artificial Intelligence

1906.08857

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Czechia > Prague (0.05)
North America > United States > New York > New York County > New York City (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry: Leisure & Entertainment (0.95)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
(2 more...)

Argus: Smartphone-enabled Human Cooperation via Multi-Agent Reinforcement Learning for Disaster Situational Awareness

Sadhu, Vidyasagar, Salles-Loustau, Gabriel, Pompili, Dario, Zonouz, Saman, Sritapan, Vincent

arXiv.org Artificial Intelligence, Apr-28-2019

Argus exploits a Multi-Agent Reinforcement Learning (MARL) framework to create a 3D mapping of the disaster scene using agents present around the incident zone to facilitate the rescue operations. The agents can be human bystanders at the disaster scene as well as drones or robots that assist the humans. The agents capture images of the scene using their smartphones (or on-board cameras in the case of drones) as directed by the MARL algorithm. These images are used to build a 3D map of the disaster scene in real time. Via both simulations and real experiments, an evaluation of the framework in terms of its effectiveness in tracking the random dynamicity of the environment is presented.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/ICAC.2016.43

1906.03037

Country: North America > United States > California (0.28)

Genre: Research Report (0.40)

Industry:

Law Enforcement & Public Safety (0.94)
Media > Photography (0.54)
Government > Military (0.41)
Media > Film (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.47)

RL-GAN-Net: A Reinforcement Learning Agent Controlled GAN Network for Real-Time Point Cloud Shape Completion

Sarmad, Muhammad, Lee, Hyunjoo Jenny, Kim, Young Min

arXiv.org Artificial Intelligence, Apr-28-2019

We present RL-GAN-Net, where a reinforcement learning (RL) agent provides fast and robust control of a generative adversarial network (GAN). Our framework is applied to point cloud shape completion that converts noisy, partial point cloud data into a high-fidelity completed shape by controlling the GAN. While a GAN is unstable and hard to train, we circumvent the problem by (1) training the GAN on the latent space representation whose dimension is reduced compared to the raw point cloud input and (2) using an RL agent to find the correct input to the GAN to generate the latent space representation of the shape that best fits the current incomplete point cloud input. The suggested pipeline robustly completes point clouds with large missing regions. To the best of our knowledge, this is the first attempt to train an RL agent to control the GAN, which effectively learns the highly nonlinear mapping from the input noise of the GAN to the latent space of the point cloud. The RL agent replaces the need for complex optimization and consequently makes our technique real time. Additionally, we demonstrate that our pipeline can be used to enhance the classification accuracy of point clouds with missing data.

artificial intelligence, machine learning, reinforcement learning, (19 more...)

arXiv.org Artificial Intelligence

1904.12304

Country: North America > United States (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Arbitrage of Energy Storage in Electricity Markets with Deep Reinforcement Learning

Xu, Hanchen, Li, Xiao, Zhang, Xiangyu, Zhang, Junbo

arXiv.org Machine Learning, Apr-27-2019

In this letter, we address the problem of controlling energy storage systems (ESSs) for arbitrage in real-time electricity markets under price uncertainty. We first formulate this problem as a Markov decision process, and then develop a deep reinforcement learning based algorithm to learn a stochastic control policy that maps a set of available information processed by a recurrent neural network to ESSs' charging/discharging actions. Finally, we verify the effectiveness of our algorithm using real-time electricity prices from PJM.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

arXiv.org Machine Learning

1904.12232

Country:

North America > United States > Virginia > Arlington County > Arlington (0.04)
North America > United States > Illinois > Champaign County > Urbana (0.04)
Asia > China > Guangdong Province > Guangzhou (0.04)

Genre: Research Report (0.40)

Industry: Energy > Power Industry (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
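
To make the Markov decision process framing of storage arbitrage concrete, here is a minimal sketch of one state transition. It is not the formulation from the paper above. The function name `ess_step`, the three-action encoding, and the capacity, rate, and efficiency values are all assumptions for illustration. The state is the state of charge, the action is charge, idle, or discharge, and the reward is the cash flow at the current price.

```python
def ess_step(soc, action, price, capacity=1.0, max_rate=0.25, efficiency=0.9):
    """One transition of a toy energy-storage MDP (all parameters hypothetical).

    soc:    state of charge, between 0 and capacity
    action: +1 charge, 0 idle, -1 discharge
    price:  electricity price for this interval
    Returns (new_soc, reward).
    """
    if action > 0:
        # Buy energy from the grid; only a fraction `efficiency` is stored.
        bought = min(max_rate, (capacity - soc) / efficiency)
        return soc + bought * efficiency, -price * bought
    if action < 0:
        # Withdraw stored energy; only a fraction `efficiency` reaches the grid.
        withdrawn = min(max_rate, soc)
        return soc - withdrawn, price * withdrawn * efficiency
    # Idle: the state is unchanged and no money moves.
    return soc, 0.0

# Buy at a low price, then sell at a high price.
soc, cost = ess_step(0.5, +1, price=20.0)
soc, revenue = ess_step(soc, -1, price=40.0)
print(soc, cost + revenue)
```

A learned policy would choose the action from observed information such as recent prices. Here the actions are fixed by hand only to show the transition and reward.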

Safe Reinforcement Learning with Scene Decomposition for Navigating Complex Urban Environments

Bouton, Maxime, Nakhaei, Alireza, Fujimura, Kikuo, Kochenderfer, Mykel J.

arXiv.org Artificial Intelligence, Apr-25-2019

Navigating urban environments represents a complex task for automated vehicles. They must reach their goal safely and efficiently while considering a multitude of traffic participants. We propose a modular decision making algorithm to autonomously navigate intersections, addressing challenges of existing rule-based and reinforcement learning (RL) approaches. We first present a safe RL algorithm relying on a model-checker to ensure safety guarantees. To make the decision strategy robust to perception errors and occlusions, we introduce a belief update technique using a learning based approach. Finally, we use a scene decomposition approach to scale our algorithm to environments with multiple traffic participants. We empirically demonstrate that our algorithm outperforms rule-based methods and reinforcement learning techniques on a complex intersection scenario.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

arXiv.org Artificial Intelligence

1904.11483

Country: North America > United States > California > Santa Clara County (0.28)

Genre: Research Report (0.50)

Industry: Transportation > Ground > Road (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)

Ray Interference: a Source of Plateaus in Deep Reinforcement Learning

Schaul, Tom, Borsa, Diana, Modayil, Joseph, Pascanu, Razvan

arXiv.org Artificial Intelligence, Apr-25-2019

Rather than proposing a new method, this paper investigates an issue present in existing learning algorithms. We study the learning dynamics of reinforcement learning (RL), specifically a characteristic coupling between learning and data generation that arises because RL agents control their future data distribution. In the presence of function approximation, this coupling can lead to a problematic type of 'ray interference', characterized by learning dynamics that sequentially traverse a number of performance plateaus, effectively constraining the agent to learn one thing at a time even when learning in parallel is better. We establish the conditions under which ray interference occurs, show its relation to saddle points and obtain the exact learning dynamics in a restricted setting. We characterize a number of its properties and discuss possible remedies.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

arXiv.org Artificial Intelligence

1904.11455

Country: North America > United States > Texas (0.28)

Genre: Research Report (0.90)

Industry: Leisure & Entertainment > Games > Computer Games (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
