AITopics | Reinforcement Learning

Reinforcement Learning

"Reinforcement learning is learning what to do – how to map situations to actions – so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them."
– Sutton, Richard S. and Andrew G. Barto. Reinforcement Learning: An Introduction. (1.1). MIT Press, Cambridge, MA, 1998.

Exploration for Multi-task Reinforcement Learning with Deep Generative Models

Bangaru, Sai Praveen, Suhas, JS, Ravindran, Balaraman

arXiv.org Machine Learning, Nov-29-2016

Exploration in multi-task reinforcement learning is critical in training agents to deduce the underlying MDP. Many of the existing exploration frameworks, such as $E^3$, $R_{max}$, and Thompson sampling, assume a single stationary MDP and are not suitable for system identification in the multi-task setting. We present a novel method to facilitate exploration in multi-task reinforcement learning using deep generative models. We supplement our method with a low dimensional energy model to learn the underlying MDP distribution and provide a resilient and adaptive exploration signal to the agent. We evaluate our method on a new set of environments and provide intuitive interpretation of our results.
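
For context on the single-task baselines named in the abstract, here is a minimal sketch of Thompson sampling on a stationary Bernoulli bandit. It is not the authors' method. The arm probabilities, step count, and function name are illustrative assumptions. The point to notice is the stationarity assumption: one fixed set of reward probabilities holds for the whole run, which is exactly what a multi-task setting breaks.

```python
import random

def thompson_bernoulli(true_probs, steps=2000, seed=0):
    """Thompson sampling with Beta(1, 1) priors on a stationary Bernoulli bandit.

    true_probs is a hypothetical list of fixed per-arm success probabilities.
    Returns how many times each arm was pulled.
    """
    rng = random.Random(seed)
    n_arms = len(true_probs)
    successes = [0] * n_arms
    failures = [0] * n_arms
    pulls = [0] * n_arms
    for _ in range(steps):
        # Draw one plausible success rate per arm from its Beta posterior.
        samples = [rng.betavariate(successes[a] + 1, failures[a] + 1)
                   for a in range(n_arms)]
        # Act greedily with respect to the sampled rates.
        arm = max(range(n_arms), key=lambda a: samples[a])
        reward = 1 if rng.random() < true_probs[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1
    return pulls

# Illustrative arm probabilities (assumed, not from the paper).
pulls = thompson_bernoulli([0.2, 0.5, 0.8])
```

Because the posterior only ever accumulates evidence, a change in the underlying reward probabilities midway through the run would be absorbed slowly, which is one way to read the abstract's objection.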

artificial intelligence, machine learning, reinforcement learning, (15 more...)

arXiv.org Machine Learning

1611.09894

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.62)

Reinforcement Learning and AI

@machinelearnbot, Nov-28-2016, 10:50:03 GMT

If you polled a group of data scientists just a few years back about how many machine learning problem types there are, you would almost certainly have gotten a binary response: problem types were clearly divided into supervised and unsupervised. Supervised: You've got labeled data (clearly defined examples). Unsupervised: You've got data but it's not labeled. See if there's a structure in there.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

@machinelearnbot

Industry: Leisure & Entertainment > Games (0.71)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Deep Reinforcement Learning for Multi-Domain Dialogue Systems

Cuayáhuitl, Heriberto, Yu, Seunghak, Williamson, Ashley, Carse, Jacob

arXiv.org Artificial Intelligence, Nov-26-2016

Standard deep reinforcement learning methods such as Deep Q-Networks (DQN) for multiple tasks (domains) face scalability problems. We propose a method for multi-domain dialogue policy learning, termed NDQN, and apply it to an information-seeking spoken dialogue system in the domains of restaurants and hotels. Experimental results comparing DQN (baseline) versus NDQN (proposed) using simulations report that our proposed method exhibits better scalability and is promising for optimising the behaviour of multi-domain dialogue systems.

dialogue system, interaction, multi-domain dialogue system, (10 more...)

arXiv.org Artificial Intelligence

1611.08675

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > United Kingdom (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
Asia > South Korea > Seoul > Seoul (0.04)

Genre: Research Report (0.64)

Industry:

Consumer Products & Services > Hotels (0.49)
Consumer Products & Services > Restaurants (0.35)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Google's DeepMind AI grasps basic laws of physics

#artificialintelligence, Nov-25-2016, 16:00:47 GMT

Google DeepMind's artificial intelligence team, alongside researchers at the University of California, Berkeley, has trained AI machines to interact with objects in order to evaluate their properties without any prior awareness of physical laws. The research project drew inspiration from child development and sought to train AI to mirror human capacity to interact with physical objects and infer properties such as mass, friction, and malleability. The study, entitled Learning to perform physics experiments via deep reinforcement learning, explained that while recent advances in AI have achieved 'superhuman performance' in complex control problems and other processing tasks, the machines still lack a common sense understanding of our physical world – 'it is not clear that these systems can rival the scientific intuition of even a young child.' Lead researcher Misha Denil and his team set about various trials in different virtual environments in which the AI was faced with a series of blocks and tasked with assessing their properties. In the first simulation, called Which is Heavier, the AI was given a set of four blocks which were the same size but varied in mass.

deepmind ai grasp basic law, large language model, machine learning, (7 more...)

#artificialintelligence

Country: North America > United States > California > Alameda County > Berkeley (0.27)

Genre: Research Report (0.59)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.63)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.62)

Jetson Developer Meetup

#artificialintelligence, Nov-24-2016, 19:25:08 GMT

Get to know some intelligent machines and the developers who built them. Join us for a night of cocktails, appetizers, and tech talks, and learn how our partners, developers, and start-ups are using the Jetson TX1 AI supercomputer to create intelligent devices to solve tomorrow's problems today. Meet Jetson partners and hear first-hand how they took their projects from idea to reality. Get to know folks from Horus, Parrot, and many more companies that are using Jetson every day! And, of course, we'll have swag.

jetson developer meetup, machine learning, reinforcement learning, (5 more...)

#artificialintelligence

Country: North America > United States > California > San Francisco County > San Francisco (0.23)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.32)

Messing around with OpenAI Gym

#artificialintelligence, Nov-23-2016, 13:10:04 GMT

First of all, it might be useful to explain what OpenAI Gym actually does: OpenAI Gym aims to provide an easy environment to develop and test reinforcement learning algorithms. To be clear, OpenAI Gym doesn't power any algorithms itself; it leaves that to more specialised packages like TensorFlow or Theano. So what makes this the ultimate geek toy for AI researchers? Well, this is because of the many environments OpenAI Gym provides, one of them being the 'atari' environment. That's right, you can test the performance of your reinforcement learning algorithms on a variety of different Atari games and, what's more, you can automatically upload the performance of your algorithms and compare them to other people's approaches.
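
To make the "environment" idea concrete, here is a minimal sketch of the reset/step interaction loop that Gym-style environments expose. It deliberately does not import Gym: `CoinGuessEnv`, `run_episode`, and the reward rule are hypothetical stand-ins, and the `(observation, reward, done, info)` return shape is assumed to follow the classic Gym convention.

```python
import random

class CoinGuessEnv:
    """Hypothetical toy environment with a Gym-style reset/step interface."""

    def __init__(self, episode_length=10, seed=0):
        self.episode_length = episode_length
        self.rng = random.Random(seed)
        self.t = 0
        self.coin = 0

    def reset(self):
        """Start a new episode and return the first observation."""
        self.t = 0
        self.coin = self.rng.randint(0, 1)
        return self.coin  # observation: the current coin face

    def step(self, action):
        """Apply an action; return (observation, reward, done, info)."""
        reward = 1.0 if action == self.coin else 0.0
        self.t += 1
        self.coin = self.rng.randint(0, 1)  # flip a new coin for the next step
        done = self.t >= self.episode_length
        return self.coin, reward, done, {}

def run_episode(env, policy):
    """Run one episode with the given policy and return the total reward."""
    obs = env.reset()
    total, done = 0.0, False
    while not done:
        obs, reward, done, _ = env.step(policy(obs))
        total += reward
    return total

env = CoinGuessEnv()
# A policy that simply copies the observed face earns the reward every step.
score = run_episode(env, policy=lambda obs: obs)
```

The learning algorithm lives entirely in `policy`; the environment only reports observations and rewards, which matches the division of labour the post describes.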

large language model, machine learning, reinforcement learning, (7 more...)

#artificialintelligence

Industry: Leisure & Entertainment > Games > Computer Games (0.70)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)

kidzik/osim-rl

#artificialintelligence, Nov-21-2016, 20:55:09 GMT

OpenSim is a biomechanical physics environment for musculoskeletal simulations. The biomechanical community has designed a range of musculoskeletal models compatible with this environment. These models can be, for example, fit to clinical data to understand underlying causes of injuries using inverse kinematics and inverse dynamics. For many of these models there are controllers designed for forward simulations of movement; however, they are often finely tuned for the model and data. Advancements in reinforcement learning may allow building more robust controllers, which can in turn provide another tool for validating the models.

kidzik osim-rl, machine learning, reinforcement learning, (6 more...)

#artificialintelligence

Industry: Health & Medicine (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.34)

Google's DeepMind AI -- "Grasps Basic Laws of Physics"

#artificialintelligence, Nov-21-2016, 14:15:04 GMT

When encountering novel objects, humans and other animals are able to infer a wide range of physical properties such as mass, friction and deformability by interacting with them in a goal-driven way. This process of active interaction is in the same spirit as a scientist performing an experiment to discover hidden facts. The study, entitled Learning to perform physics experiments via deep reinforcement learning, explained that while recent advances in AI have achieved 'superhuman performance' in complex control problems and other processing tasks, the machines still lack a common sense understanding of our physical world – 'it is not clear that these systems can rival the scientific intuition of even a young child.' "We found," the team concluded, "that state of art deep reinforcement learning methods can learn to perform the experiments necessary to discover these hidden properties of the physical world. By systematically manipulating the problem difficulty and the cost incurred by the AI agent for performing experiments, we found that agents learn different strategies that balance the cost of gathering information against the cost of making mistakes in different situations."

artificial intelligence, machine learning, reinforcement learning, (6 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.85)

Memory Lens: How Much Memory Does an Agent Use?

Dann, Christoph, Hofmann, Katja, Nowozin, Sebastian

arXiv.org Machine Learning, Nov-21-2016

We propose a new method to study the internal memory used by reinforcement learning policies. We estimate the amount of relevant past information by estimating mutual information between behavior histories and the current action of an agent. We perform this estimation in the passive setting, that is, we do not intervene but merely observe the natural behavior of the agent. Moreover, we provide a theoretical justification for our approach by showing that it yields an implementation-independent lower bound on the minimal memory capacity of any agent that implements the observed policy. We demonstrate our approach by estimating the use of memory of DQN policies on concatenated Atari frames, demonstrating sharply different use of memory across 49 games. The study of memory as information that flows from the past to the current action opens avenues to understand and improve successful reinforcement learning algorithms.
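
As a rough illustration of the quantity involved, the sketch below computes a plug-in (empirical frequency) estimate of the mutual information between a discrete behavior history and the current action, from passively observed samples. This is an assumed, simplified estimator and not the one used in the paper; the function name and the toy data are hypothetical, and plug-in estimates are known to be biased on small samples.

```python
import math
from collections import Counter

def mutual_information(pairs):
    """Plug-in estimate of I(H; A) in bits from (history, action) samples."""
    n = len(pairs)
    joint = Counter(pairs)
    hist = Counter(h for h, _ in pairs)
    act = Counter(a for _, a in pairs)
    mi = 0.0
    for (h, a), c in joint.items():
        # p(h, a) * log2( p(h, a) / (p(h) * p(a)) ), written with raw counts.
        mi += (c / n) * math.log2(c * n / (hist[h] * act[a]))
    return mi

# Toy policy that repeats its last observation: the action is fully
# determined by the one-step history, so the estimate is 1 bit.
dependent = [((0,), 0), ((1,), 1)] * 50
# Toy policy that ignores the history: the estimate is 0 bits.
independent = [((0,), 0), ((0,), 1), ((1,), 0), ((1,), 1)] * 25

mi_dependent = mutual_information(dependent)
mi_independent = mutual_information(independent)
```

Only observed (history, action) pairs are needed, which mirrors the passive setting the abstract describes: nothing in the estimate requires intervening on the agent.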

artificial intelligence, machine learning, reinforcement learning, (18 more...)

arXiv.org Machine Learning

1611.06928

Genre: Research Report (0.50)

Industry: Leisure & Entertainment (0.94)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.69)

Probabilistic Verification for Cognitive Models

Junges, Sebastian (RWTH Aachen University) | Jansen, Nils (University of Texas at Austin) | Katoen, Joost-Pieter (RWTH Aachen University) | Topcu, Ufuk (University of Texas at Austin)

AAAI Conferences, Nov-19-2016

Many robotics applications and scenarios that involve interaction with humans are safety or performance critical. A natural path to assessing such notions is to include a cognitive model describing typical human behaviors into a larger modeling context. In this work, we set out to investigate a combination of such a model with formal verification. We present a general and flexible framework utilizing methods from probabilistic model checking and discuss current pitfalls. We start from information about typical behavior, obtained from generalizing specific scenarios by the usage of inverse reinforcement learning. We translate this information in order to define a formal model exhibiting stochastic behavior (whenever significant data is present) or nondeterminism (if the model is underspecified or no significant data is present) that can be analyzed. This model for a human can be combined with a robot model by using standard parallel composition. The benefit is manyfold: First, safe or optimal strategies for involved robots regarding a human can be synthesized depending on the given model. In general, verification can determine if such benign strategies are even possible. Furthermore, the cognitive model itself can be analyzed with respect to possible unnatural behaviors; thereby feedback to developers of such models is provided. We evaluate and describe our approaches by means of a well-known model for visiomotor tasks and provide a framework that can readily incorporate other models.
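
One basic query in probabilistic model checking is the probability of eventually reaching an unsafe state. The sketch below computes it for a small discrete-time Markov chain by fixed-point iteration. It is an illustrative assumption rather than the framework from the paper: the chain, its state names, and its transition probabilities are all hypothetical, and nondeterminism (which the abstract also mentions) is left out.

```python
def reach_probability(transitions, targets, iterations=1000):
    """Probability of eventually reaching a target state in a Markov chain.

    transitions maps state -> {successor: probability}. All states and
    numbers used below are hypothetical.
    """
    prob = {s: (1.0 if s in targets else 0.0) for s in transitions}
    for _ in range(iterations):
        # Synchronous update; target states stay fixed at probability 1.
        prob = {s: 1.0 if s in targets
                else sum(p * prob[t] for t, p in succ.items())
                for s, succ in transitions.items()}
    return prob

# Hypothetical human-robot interaction abstracted to four states.
chain = {
    "start": {"near": 0.5, "safe": 0.5},
    "near": {"collision": 0.1, "start": 0.9},
    "collision": {"collision": 1.0},
    "safe": {"safe": 1.0},
}
probs = reach_probability(chain, targets={"collision"})
```

Solving the two linear equations by hand gives 1/11 for `start` and 2/11 for `near`, which the iteration approaches from below. A safety requirement would then be a bound on such a probability.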

machine learning, probabilistic verification, reinforcement learning, (3 more...)

AAAI Conferences

2016 AAAI Fall Symposium Series

Technology:

Information Technology > Artificial Intelligence > Cognitive Science > Simulation of Human Behavior (0.80)
Information Technology > Artificial Intelligence > Robots (0.73)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.53)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.53)
