AITopics | Reinforcement Learning

Collaborating Authors

Reinforcement Learning

"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction. (1.1). MIT Press, Cambridge, MA, 1998.

News Overviews Instructional Materials AI-Alerts Classics

Using State Predictions for Value Regularization in Curiosity Driven Deep Reinforcement Learning

Brunner, Gino, Fritsche, Manuel, Richter, Oliver, Wattenhofer, Roger

arXiv.org Artificial IntelligenceSep-30-2018

Learning in sparse reward settings remains a challenge in Reinforcement Learning, which is often addressed by using intrinsic rewards. One promising strategy is inspired by human curiosity, requiring the agent to learn to predict the future. In this paper a curiosity-driven agent is extended to use these predictions directly for training. To achieve this, the agent predicts the value function of the next state at any point in time. Subsequently, the consistency of this prediction with the current value function is measured, which is then used as a regularization term in the loss function of the algorithm. Experiments were made on grid-world environments as well as on a 3D navigation task, both with sparse rewards. In the first case the extended agent is able to learn significantly faster than the baselines.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

arXiv.org Artificial Intelligence

1810.00361

Country:

Europe (0.93)
North America > United States > New York (0.15)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.97)

Add feedback

Reinforcement Learning in R

Pröllochs, Nicolas, Feuerriegel, Stefan

arXiv.org Machine LearningSep-29-2018

Reinforcement learning refers to a group of methods from artificial intelligence where an agent performs learning through trial and error. It differs from supervised learning, since reinforcement learning requires no explicit labels; instead, the agent interacts continuously with its environment. That is, the agent starts in a specific state and then performs an action, based on which it transitions to a new state and, depending on the outcome, receives a reward. Different strategies (e.g. Q-learning) have been proposed to maximize the overall reward, resulting in a so-called policy, which defines the best possible action in each state. Mathematically, this process can be formalized by a Markov decision process and it has been implemented by packages in R; however, there is currently no package available for reinforcement learning. As a remedy, this paper demonstrates how to perform reinforcement learning in R and, for this purpose, introduces the ReinforcementLearning package. The package provides a remarkably flexible framework and is easily applied to a wide range of different problems. We demonstrate its use by drawing upon common examples from the literature (e.g. finding optimal game strategies).

machine learning, reinforcement, reinforcement learning, (15 more...)

arXiv.org Machine Learning

1810.0024

Country:

Europe (0.93)
North America > United States (0.68)

Genre: Research Report (0.40)

Industry: Leisure & Entertainment > Games (0.66)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

M^3RL: Mind-aware Multi-agent Management Reinforcement Learning

Shu, Tianmin, Tian, Yuandong

arXiv.org Artificial IntelligenceSep-29-2018

Most of the prior work on multi-agent reinforcement learning (MARL) achieves optimal collaboration by directly controlling the agents to maximize a common reward. In this paper, we aim to address this from a different angle. In particular, we consider scenarios where there are self-interested agents (i.e., worker agents) which have their own minds (preferences, intentions, skills, etc.) and can not be dictated to perform tasks they do not wish to do. For achieving optimal coordination among these agents, we train a super agent (i.e., the manager) to manage them by first inferring their minds based on both current and past observations and then initiating contracts to assign suitable tasks to workers and promise to reward them with corresponding bonuses so that they will agree to work together. The objective of the manager is maximizing the overall productivity as well as minimizing payments made to the workers for ad-hoc worker teaming. RL), which consists of agent modeling and policy learning. We have evaluated our approach in two environments, Resource Collection and Crafting, to simulate multi-agent management problems with various task settings and multiple designs for the worker agents. The experimental results have validated the effectiveness of our approach in modeling worker agents' minds online, and in achieving optimal ad-hoc teaming with good generalization and fast adaptation. As the main assumption and building block in economy, self-interested agents play a central roles in our daily life. Selfish agents, with their private beliefs, preferences, intentions, and skills, could collaborate effectively to make great achievement with proper incentives and contracts, an amazing phenomenon that happens every day in every corner of the world.

artificial intelligence, machine learning, reinforcement learning, (12 more...)

arXiv.org Artificial Intelligence

1810.00147

Country: North America > United States > California (0.28)

Genre: Research Report (0.40)

Industry: Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

The Basics of Machine Learning

#artificialintelligenceSep-28-2018, 20:34:45 GMT

This video will help you understand what is machine learning, what are the types of machine learning whether it's supervised, unsupervised and reinforcement learning, how the technology works with simple examples, and will also explain how it is currently being used in various industries.

artificial intelligence, machine learning, reinforcement learning

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.41)

Add feedback

Artificial Intelligence: What Is Reinforcement Learning - A Simple Explanation & Practical Examples

#artificialintelligenceSep-28-2018, 12:46:04 GMT

Reinforcement learning is one of the most discussed, followed and contemplated topics in artificial intelligence (AI) as it has the potential to transform most businesses. In this article, I want to provide a simple guide that explains reinforcement learning and give you some practical examples of how it is used today. At the core of reinforcement learning is the concept that the optimal behavior or action is reinforced by a positive reward. Similar to toddlers learning how to walk who adjust actions based on the outcomes they experience such as taking a smaller step if the previous broad step made them fall, machines and software agents use reinforcement learning algorithms to determine the ideal behavior based upon feedback from the environment. Depending on the complexity of the problem, reinforcement learning algorithms can keep adapting to the environment over time if necessary in order to maximize the reward in the long-term.

machine learning, reinforcement, reinforcement learning, (7 more...)

#artificialintelligence

Country: Europe > United Kingdom (0.05)

Industry:

Information Technology (0.52)
Health & Medicine (0.52)
Leisure & Entertainment > Games (0.31)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

How do some of the best AI algorithms perform on real robots? Not well, it turns out

#artificialintelligenceSep-28-2018, 03:56:54 GMT

All the biggest labs leading AI research will have you believe that their fancy game-playing software bots will one day be applicable to the real world. The skills from playing Go, Poker or Dota 2 will be transferable to algorithms designing new drugs, controlling robots, teaching computers how to negotiate – you name it. One startup, Kindred.AI, decided to put some of those claims to the test, particularly with respect to robotics and machine-learning software. "We wanted to know the readiness of the state-of-the-art RL algorithms on real robotic applications," Rupam Mahmood, lead of AI at Kindred, told The Register. Reinforcement learning (RL) is a popular AI method that teaches agents how to perform a specific task by rewarding them every time they get closer to the stated goal.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

#artificialintelligence

Country: North America > United States > California (0.16)

Industry:

Education (0.56)
Transportation > Ground > Road (0.32)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.74)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)

Add feedback

Generalization and Regularization in DQN

Farebrother, Jesse, Machado, Marlos C., Bowling, Michael

arXiv.org Artificial IntelligenceSep-28-2018

Deep reinforcement learning (RL) algorithms have shown an impressive ability to learn complex control policies in high-dimensional environments. However, despite the ever-increasing performance on popular benchmarks like the Arcade Learning Environment (ALE), policies learned by deep RL algorithms can struggle to generalize when evaluated in remarkably similar environments. These results are unexpected given the fact that, in supervised learning, deep neural networks often learn robust features that generalize across tasks. In this paper, we study the generalization capabilities of DQN in order to aid in understanding this mismatch between generalization in deep RL and supervised learning methods. We provide evidence suggesting that DQN overspecializes to the domain it is trained on. We then comprehensively evaluate the impact of traditional methods of regularization from supervised learning, $\ell_2$ and dropout, and of reusing learned representations to improve the generalization capabilities of DQN. We perform this study using different game modes of Atari 2600 games, a recently introduced modification for the ALE which supports slight variations of the Atari 2600 games used for benchmarking in the field. Despite regularization being largely underutilized in deep RL, we show that it can, in fact, help DQN learn more general features. These features can then be reused and fine-tuned on similar tasks, considerably improving the sample efficiency of DQN.

artificial intelligence, machine learning, reinforcement learning, (21 more...)

arXiv.org Artificial Intelligence

1810.00123

Country: North America > Canada > Alberta (0.28)

Genre: Research Report > New Finding (0.93)

Industry: Leisure & Entertainment > Games > Computer Games (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Add feedback

Switching Isotropic and Directional Exploration with Parameter Space Noise in Deep Reinforcement Learning

Karino, Izumi, Tanaka, Kazutoshi, Niiyama, Ryuma, Kuniyoshi, Yasuo

arXiv.org Machine LearningSep-27-2018

This paper proposes an exploration method for deep reinforcement learning based on parameter space noise. Recent studies have experimentally shown that parameter space noise results in better exploration than the commonly used action space noise. Previous methods devised a way to update the diagonal covariance matrix of a noise distribution and did not consider the direction of the noise vector and its correlation. In addition, fast updates of the noise distribution are required to facilitate policy learning. We propose a method that deforms the noise distribution according to the accumulated returns and the noises that have led to the returns. Moreover, this method switches isotropic exploration and directional exploration in parameter space with regard to obtained rewards. We validate our exploration strategy in the OpenAI Gym continuous environments and modified environments with sparse rewards. The proposed method achieves results that are competitive with a previous method at baseline tasks. Moreover, our approach exhibits better performance in sparse reward environments by exploration with the switching strategy.

artificial intelligence, exploration, upstream oil & gas, (18 more...)

arXiv.org Machine Learning

1809.0657

Country: Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)

Genre: Research Report (1.00)

Industry: Energy > Oil & Gas > Upstream (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.35)

Add feedback

Learning to Coordinate Multiple Reinforcement Learning Agents for Diverse Query Reformulation

Nogueira, Rodrigo, Bulian, Jannis, Ciaramita, Massimiliano

arXiv.org Machine LearningSep-27-2018

We propose a method to efficiently learn diverse strategies in reinforcement learning for query reformulation in the tasks of document retrieval and question answering. In the proposed framework an agent consists of multiple specialized sub-agents and a meta-agent that learns to aggregate the answers from sub-agents to produce a final answer. Sub-agents are trained on disjoint partitions of the training data, while the meta-agent is trained on the full training set. Our method makes learning faster, because it is highly parallelizable, and has better generalization performance than strong baselines, such as an ensemble of agents trained on the full data. We show that the improved performance is due to the increased diversity of reformulation strategies.

arxiv preprint arxiv, machine learning, reinforcement learning, (15 more...)

arXiv.org Machine Learning

1809.10658

Country: Europe > Romania (0.15)

Genre: Research Report (0.64)

Industry: Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Learning and Planning with a Semantic Model

Wu, Yi, Wu, Yuxin, Tamar, Aviv, Russell, Stuart, Gkioxari, Georgia, Tian, Yuandong

arXiv.org Artificial IntelligenceSep-27-2018

Building deep reinforcement learning agents that can generalize and adapt to unseen environments remains a fundamental challenge for AI. This paper describes progresses on this challenge in the context of man-made environments, which are visually diverse but contain intrinsic semantic regularities. We propose a hybrid model-based and model-free approach, LEArning and Planning with Semantics (LEAPS), consisting of a multi-target sub-policy that acts on visual inputs, and a Bayesian model over semantic structures. When placed in an unseen environment, the agent plans with the semantic model to make high-level decisions, proposes the next sub-target for the sub-policy to execute, and updates the semantic model based on new observations. We perform experiments in visual navigation tasks using House3D, a 3D environment that contains diverse human-designed indoor scenes with real-world objects. LEAPS outperforms strong baselines that do not explicitly plan using the semantic content. Deep reinforcement learning (DRL) has undoubtedly witnessed strong achievements in recent years (Silver et al., 2016; Mnih et al., 2015; Levine et al., 2016).

machine learning, natural language, reinforcement learning, (18 more...)

arXiv.org Artificial Intelligence

1809.10842

Genre: Research Report (0.40)

Industry: Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)
(2 more...)

Add feedback