AITopics | Reinforcement Learning

Reinforcement Learning

"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction. (1.1). MIT Press, Cambridge, MA, 1998.

Expected Policy Gradients for Reinforcement Learning

Ciosek, Kamil, Whiteson, Shimon

arXiv.org Machine Learning, Jan-10-2018

We propose expected policy gradients (EPG), which unify stochastic policy gradients (SPG) and deterministic policy gradients (DPG) for reinforcement learning. Inspired by expected sarsa, EPG integrates (or sums) across actions when estimating the gradient, instead of relying only on the action in the sampled trajectory. For continuous action spaces, we first derive a practical result for Gaussian policies and quadric critics and then extend it to an analytical method for the universal case, covering a broad class of actors and critics, including Gaussian, exponential families, and reparameterised policies with bounded support. For Gaussian policies, we show that it is optimal to explore using covariance proportional to the matrix exponential of the scaled Hessian of the critic with respect to the actions. EPG also provides a general framework for reasoning about policy gradient methods, which we use to establish a new general policy gradient theorem, of which the stochastic and deterministic policy gradient theorems are special cases. Furthermore, we prove that EPG reduces the variance of the gradient estimates without requiring deterministic policies and with little computational overhead. Finally, we show that EPG outperforms existing approaches on six challenging domains involving the simulated control of physical systems.
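The idea of summing across actions instead of relying on one sampled action can be illustrated with a toy case. The sketch below assumes a single state, a softmax policy over a handful of discrete actions, and a fixed table of critic values; the function names are hypothetical, and this is not the paper's method for continuous actions.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sampled_gradient(logits, q_values, action):
    """Stochastic estimate from ONE sampled action:
    grad_theta log pi(action) * Q(action), for a softmax policy."""
    pi = softmax(logits)
    return [((1.0 if k == action else 0.0) - pi[k]) * q_values[action]
            for k in range(len(logits))]

def expected_gradient(logits, q_values):
    """Sum over ALL actions of grad_theta pi(a) * Q(a).
    For a softmax policy this simplifies to pi(k) * (Q(k) - E_pi[Q])."""
    pi = softmax(logits)
    mean_q = sum(p * q for p, q in zip(pi, q_values))
    return [pi[k] * (q_values[k] - mean_q) for k in range(len(logits))]
```

The summed gradient equals the policy-weighted average of the single-action estimates, so it has the same expectation but none of the variance that comes from sampling the action.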

artificial intelligence, machine learning, reinforcement learning, (17 more...)

arXiv.org Machine Learning

1801.03326

Country: Europe > United Kingdom (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Machine Learning at Udacity Goes Deeper Udacity

#artificialintelligence, Jan-9-2018, 22:14:25 GMT

We just unlocked a Free Preview of our Machine Learning Engineer Nanodegree Program! Discover amazing new content, and explore your future in Machine Learning, today! The Machine Learning Engineer Nanodegree program has been one of Udacity's benchmark programs for over 2 years. Thousands of students have graduated the program, and many have gone on to great careers at companies like Google, Amazon, and more. As technology evolves, so does our curriculum, and we think much of the program's success can be attributed to keeping the content up-to-the-minute current.

artificial intelligence, machine learning, reinforcement learning, (11 more...)

#artificialintelligence

Genre:

Instructional Material > Online (0.40)
Instructional Material > Course Syllabus & Notes (0.40)

Industry:

Education > Educational Technology > Educational Software > Computer Based Training (0.97)
Education > Educational Setting > Online (0.97)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.97)

zuoxingdong/gym-maze

#artificialintelligence, Jan-8-2018, 17:00:54 GMT

This repository contains a customizable gym environment for all kinds of mazes or gridworlds. Mazes and gridworlds are used very often in the reinforcement learning community, yet a standardized framework is still lacking, which motivates this repository. The repo will be actively maintained; comments, feedback, and improvements are highly welcome. We have provided a Jupyter Notebook that illustrates how to make various maze environments and how to generate an animation of the agent's trajectory following the optimal actions found by an A* planner.
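For readers unfamiliar with A* planning on a gridworld, here is a minimal standalone sketch. It does not use the repository's API; the grid encoding (0 for free cells, 1 for walls), the 4-connected moves, and the function name `astar` are assumptions made for illustration.

```python
import heapq

def astar(grid, start, goal):
    """Shortest 4-connected path on a grid of 0 (free) and 1 (wall) cells.
    Returns a list of (row, col) cells from start to goal, or None."""
    def heuristic(cell):
        # Manhattan distance never overestimates on a 4-connected grid.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    rows, cols = len(grid), len(grid[0])
    frontier = [(heuristic(start), 0, start)]  # (f = g + h, g, cell)
    came_from = {start: None}
    cost = {start: 0}
    while frontier:
        _, g, current = heapq.heappop(frontier)
        if current == goal:
            path = []
            while current is not None:
                path.append(current)
                current = came_from[current]
            return path[::-1]
        if g > cost[current]:
            continue  # stale queue entry; a cheaper route was found already
        r, c = current
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                new_cost = g + 1
                if new_cost < cost.get((nr, nc), float("inf")):
                    cost[(nr, nc)] = new_cost
                    came_from[(nr, nc)] = current
                    heapq.heappush(
                        frontier,
                        (new_cost + heuristic((nr, nc)), new_cost, (nr, nc)),
                    )
    return None
```

Calling `astar(maze, (0, 0), (2, 0))` on a small list-of-lists maze returns the cells to visit in order, which could then be turned into a sequence of moves for an agent.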

artificial intelligence, machine learning, zuoxingdong gym-maze, (4 more...)

#artificialintelligence

Technology:

Information Technology > Knowledge Management (0.35)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.35)

Gated-Attention Architectures for Task-Oriented Language Grounding

Chaplot, Devendra Singh, Sathyendra, Kanthashree Mysore, Pasumarthi, Rama Kumar, Rajagopal, Dheeraj, Salakhutdinov, Ruslan

arXiv.org Artificial Intelligence, Jan-8-2018

To perform tasks specified by natural language instructions, autonomous agents need to extract semantically meaningful representations of language and map them to visual elements and actions in the environment. This problem is called task-oriented language grounding. We propose an end-to-end trainable neural architecture for task-oriented language grounding in 3D environments which assumes no prior linguistic or perceptual knowledge and requires only raw pixels from the environment and the natural language instruction as input. The proposed model combines the image and text representations using a Gated-Attention mechanism and learns a policy to execute the natural language instruction using standard reinforcement and imitation learning methods. We show the effectiveness of the proposed model on unseen instructions as well as unseen maps, both quantitatively and qualitatively. We also introduce a novel environment based on a 3D game engine to simulate the challenges of task-oriented language grounding over a rich set of instructions and environment states.
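The abstract does not spell out how the Gated-Attention mechanism combines image and text. As a rough illustration only, the sketch below assumes one plausible form of gating: a linear layer followed by a sigmoid turns the text embedding into one gate per image channel, and each feature map is scaled by its gate. The function names, shapes, and weights are hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gate_vector(text_embedding, weights, biases):
    """One gate in (0, 1) per image channel: sigmoid(W @ text + b).
    `weights` has one row per channel (assumed layout)."""
    return [
        sigmoid(sum(w * t for w, t in zip(row, text_embedding)) + b)
        for row, b in zip(weights, biases)
    ]

def gated_attention(feature_maps, gates):
    """Scale every value in channel c of the image features by gates[c].
    `feature_maps` is a nested list indexed [channel][row][col]."""
    return [
        [[gates[c] * value for value in row] for row in fmap]
        for c, fmap in enumerate(feature_maps)
    ]
```

Under this assumption, a gate near 1 lets a channel pass through unchanged and a gate near 0 suppresses it, so the instruction decides which visual features the policy sees.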

machine learning, natural language, reinforcement learning, (16 more...)

arXiv.org Artificial Intelligence

1706.0723

Genre:

Research Report (0.50)
Instructional Material (0.47)

Industry: Information Technology (0.34)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

Expressivity, Trainability, and Generalization in Machine Learning

#artificialintelligence, Jan-7-2018, 01:02:55 GMT

Update 11/29: I'm looking for translators to help translate this post into different languages, particularly Chinese (中文), Spanish (Español), Korean (한국어), Russian (русский язык), and Japanese (日本語). When I read Machine Learning papers, I ask myself whether the contributions of the paper fall under improvements to 1) Expressivity 2) Trainability, and/or 3) Generalization. I learned this categorization from my colleague Jascha Sohl-Dickstein at Google Brain, and the terminology is also introduced in this paper. I have found this categorization effective in thinking about how individual research papers (especially on the theoretical side) tie subfields of AI research (e.g. In this blog post, I discuss how these concepts tie into current (Nov 2017) machine learning research on Supervised Learning, Unsupervised Learning, and Reinforcement Learning. I consider Generalization to be comprised of two categories -- "weak" and "strong" generalization -- and I will discuss them separately.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

#artificialintelligence

Industry: Information Technology (0.89)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.68)

Dissecting Reinforcement Learning-Part.3

#artificialintelligence, Jan-7-2018, 01:02:07 GMT

The update rule is based on the tuple State-Reward-State. Remember that now we are in the control case. Here we use the Q-function (see second post) to estimate the best policy. The Q-function requires as input a state-action pair.
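To make the contrast concrete, here is a minimal tabular sketch. It assumes the State-Reward-State rule is the standard TD(0) value update and uses a SARSA-style update as one example of a Q-function update keyed on state-action pairs; the function names, the step size `alpha`, and the discount `gamma` are illustrative choices, not taken from the post.

```python
from collections import defaultdict

def td0_update(V, state, reward, next_state, alpha=0.1, gamma=0.9):
    """Prediction case: update a state-value table from (state, reward, next_state)."""
    td_target = reward + gamma * V[next_state]
    V[state] += alpha * (td_target - V[state])

def sarsa_update(Q, state, action, reward, next_state, next_action,
                 alpha=0.1, gamma=0.9):
    """Control case: the table is keyed by (state, action) pairs instead of states."""
    td_target = reward + gamma * Q[(next_state, next_action)]
    Q[(state, action)] += alpha * (td_target - Q[(state, action)])

# Tables default to 0.0 for unseen states or state-action pairs.
V = defaultdict(float)
Q = defaultdict(float)
```

The only structural change between the two updates is the key of the table, which is why the control case needs the action as an extra input.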

artificial intelligence, machine learning, reinforcement learning, (17 more...)

#artificialintelligence

Industry: Transportation (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Applications of Deep Learning and Reinforcement Learning to Biological Data

Mahmud, Mufti, Kaiser, M. Shamim, Hussain, Amir, Vassanelli, Stefano

arXiv.org Machine Learning, Jan-7-2018

Rapid advances of hardware-based technologies during the past decades have opened up new possibilities for Life scientists to gather multimodal data in various application domains (e.g., Omics, Bioimaging, Medical Imaging, and [Brain/Body]-Machine Interfaces), thus generating novel opportunities for development of dedicated data intensive machine learning techniques. Overall, recent research in Deep learning (DL), Reinforcement learning (RL), and their combination (Deep RL) promises to revolutionize Artificial Intelligence. The growth in computational power accompanied by faster and increased data storage and declining computing costs has already allowed scientists in various fields to apply these techniques to datasets that were previously intractable because of their size and complexity. This review article provides a comprehensive survey on the application of DL, RL, and Deep RL techniques in mining Biological data. In addition, we compare performances of DL techniques when applied to different datasets across various application domains. Finally, we outline open issues in this challenging research area and discuss future development perspectives.

classification, machine learning, reinforcement learning, (18 more...)

arXiv.org Machine Learning

doi: 10.1109/TNNLS.2018.2790388

1711.03985

Country:

North America > United States (0.68)
Europe (0.67)

Genre: Overview (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Competitive Multi-agent Inverse Reinforcement Learning with Sub-optimal Demonstrations

Wang, Xingyu, Klabjan, Diego

arXiv.org Machine Learning, Jan-6-2018

This paper considers the problem of inverse reinforcement learning in zero-sum stochastic games when expert demonstrations are known to be not optimal. Compared to previous works that decouple agents in the game by assuming optimality in expert strategies, we introduce a new objective function that directly pits experts against Nash Equilibrium strategies, and we design an algorithm to solve for the reward function in the context of inverse reinforcement learning with deep neural networks as model approximations. In our setting the model and algorithm do not decouple by agent. In order to find Nash Equilibrium in large-scale games, we also propose an adversarial training algorithm for zero-sum stochastic games, and show the theoretical appeal of non-existence of local optima in its objective function. In our numerical experiments, we demonstrate that our Nash Equilibrium and inverse reinforcement learning algorithms address games that are not amenable to previous approaches using tabular representations. Moreover, with sub-optimal expert demonstrations our algorithms recover both reward functions and strategies with good quality.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

arXiv.org Machine Learning

1801.02124

Country: North America > United States (0.28)

Genre: Research Report (0.82)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Beginner's guide to Reinforcement Learning & its implementation in Python

#artificialintelligence, Jan-5-2018, 20:07:48 GMT

One of the most fundamental questions for scientists across the globe has been – "How to learn a new skill?". The reason for wanting an answer is obvious – if we can understand this, we can enable the human species to do things we might not have thought of before. Alternatively, we can train machines to do more "human" tasks and create true artificial intelligence. While we don't have a complete answer to the above question yet, there are a few things which are clear. Irrespective of the skill, we first learn by interacting with the environment.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

#artificialintelligence

Industry: Leisure & Entertainment > Games (0.95)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.31)

Guest Post (Part I): Demystifying Deep Reinforcement Learning - Intel AI

#artificialintelligence, Jan-5-2018, 09:32:38 GMT

Two years ago, a small company in London called DeepMind uploaded their pioneering paper "Playing Atari with Deep Reinforcement Learning" to arXiv. In this paper they demonstrated how a computer learned to play Atari 2600 video games by observing just the screen pixels and receiving a reward when the game score increased. The result was remarkable, because the games and the goals in every game were very different and designed to be challenging for humans. The same model architecture, without any change, was used to learn seven different games, and in three of them the algorithm performed even better than a human! It has been hailed since then as the first step towards general artificial intelligence – an AI that can survive in a variety of environments, instead of being confined to strict realms such as playing chess. No wonder DeepMind was immediately bought by Google and has been at the forefront of deep learning research ever since.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

#artificialintelligence

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
