Kim, Kee-eung
Cost-Sensitive Exploration in Bayesian Reinforcement Learning
Kim, Dongho, Kim, Kee-eung, Poupart, Pascal
In this paper, we consider Bayesian reinforcement learning (BRL) where actions incur costs in addition to rewards, and thus exploration has to be constrained in terms of the expected total cost while learning to maximize the expected long-term total reward. In order to formalize cost-sensitive exploration, we use the constrained Markov decision process (CMDP) as the model of the environment, in which we can naturally encode exploration requirements using the cost function. We extend BEETLE, a model-based BRL method, to learn in environments with cost constraints. We demonstrate the cost-sensitive exploration behaviour in a number of simulated problems.
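For reference, the constrained objective described in the abstract matches the standard discounted CMDP formulation below; the notation (cost function C, threshold \hat{c}) is generic rather than taken from the paper, and in the Bayesian setting the expectation is additionally taken over the posterior on the unknown transition model.

    \[
    \max_{\pi} \; \mathbb{E}_{\pi}\Big[ \sum_{t=0}^{\infty} \gamma^{t} R(s_t, a_t) \Big]
    \quad \text{subject to} \quad
    \mathbb{E}_{\pi}\Big[ \sum_{t=0}^{\infty} \gamma^{t} C(s_t, a_t) \Big] \le \hat{c}
    \]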
Nonparametric Bayesian Inverse Reinforcement Learning for Multiple Reward Functions
Choi, Jaedeug, Kim, Kee-eung
We present a nonparametric Bayesian approach to inverse reinforcement learning (IRL) for multiple reward functions. Most previous IRL algorithms assume that the behaviour data is obtained from an agent optimizing a single reward function, but this assumption rarely holds in practice. Our approach is based on integrating the Dirichlet process mixture model into Bayesian IRL. We provide an efficient Metropolis-Hastings sampling algorithm that utilizes the gradient of the posterior to estimate the underlying reward functions, and demonstrate through experiments on a number of problem domains that our approach outperforms previous methods.
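As a rough illustration of the gradient-guided sampling ingredient (a sketch only, not the paper's full procedure), one Metropolis-Hastings update with a Langevin-style proposal on a reward vector could look as follows; log_post and grad_log_post are hypothetical placeholders for an unnormalized reward posterior and its gradient.

    import numpy as np

    def mh_gradient_step(reward, log_post, grad_log_post, step=1e-2, rng=None):
        # One Metropolis-Hastings update whose proposal is nudged by the
        # gradient of the (unnormalized) log posterior (Langevin-style).
        rng = np.random.default_rng() if rng is None else rng
        noise = rng.normal(size=reward.shape)
        proposal = reward + 0.5 * step * grad_log_post(reward) + np.sqrt(step) * noise

        def log_q(x_to, x_from):
            # Log density (up to a constant) of the asymmetric Gaussian proposal x_from -> x_to.
            diff = x_to - x_from - 0.5 * step * grad_log_post(x_from)
            return -np.sum(diff ** 2) / (2.0 * step)

        log_alpha = (log_post(proposal) - log_post(reward)
                     + log_q(reward, proposal) - log_q(proposal, reward))
        return proposal if np.log(rng.uniform()) < log_alpha else reward

    # Toy usage with a standard-normal stand-in for the reward posterior:
    # r = mh_gradient_step(np.zeros(3), lambda r: -0.5 * np.sum(r ** 2), lambda r: -r)

In the mixture setting, a step of this kind would be applied to each cluster's reward function, interleaved with Dirichlet process cluster-assignment updates for the demonstrated trajectories.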
MAP Inference for Bayesian Inverse Reinforcement Learning
Choi, Jaedeug, Kim, Kee-eung
The difficulty in inverse reinforcement learning (IRL) arises in choosing the best reward function, since there are typically infinitely many reward functions under which the given behaviour data is optimal. Using a Bayesian framework, we address this challenge through maximum a posteriori (MAP) estimation of the reward function, and show that most previous IRL algorithms can be cast within our framework. We also present a gradient method for the MAP estimation based on the (sub)differentiability of the posterior distribution. We show the effectiveness of our approach by comparing the performance of the proposed method with that of previous algorithms.
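Concretely, the MAP estimate maximizes the (sub)differentiable log posterior, i.e. the log likelihood of the behaviour data plus the log prior over rewards. A minimal gradient-ascent sketch, with grad_log_posterior as a hypothetical placeholder supplied by the chosen likelihood and prior, might be:

    import numpy as np

    def map_reward(grad_log_posterior, r_init, lr=0.05, iters=200):
        # Plain (sub)gradient ascent on log P(data | r) + log P(r);
        # grad_log_posterior(r) is assumed to return that combined gradient.
        r = np.array(r_init, dtype=float)
        for _ in range(iters):
            r = r + lr * grad_log_posterior(r)
        return r

A fixed step size is used only to keep the sketch short; any standard ascent scheme (line search, adaptive steps) applies equally.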