Kawaguchi, Kenji
Prior-Free Exploration Bonus for and beyond Near Bayes-Optimal Behavior
Kawaguchi, Kenji (BWBP Artificial Intelligence Laboratory) | Sato, Hiroshi (National Defense Academy of Japan)
We study Bayesian reinforcement learning (RL) as a solution to the exploration-exploitation dilemma. Because full Bayesian planning is intractable except in special cases, previous work has proposed several approximation methods; however, these are often computationally expensive or limited to Dirichlet priors. In this paper, we propose a new algorithm that computes a near Bayes-optimal policy quickly, in polynomial time, for any prior distribution that is not greatly misspecified. Perhaps even more interestingly, the proposed algorithm naturally avoids being misled by incorrect beliefs while effectively exploiting the useful parts of the prior information. It can work well even when an utterly misspecified prior is assigned: in that case, the algorithm instead follows PAC-MDP behavior whenever an existing PAC-MDP algorithm would. On a standard benchmark problem, the proposed algorithm naturally outperformed the other algorithms it was compared against.
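To make the general idea concrete, below is a minimal Python/NumPy sketch of bonus-based exploration on a tabular MDP. It is not the paper's prior-free bonus: the agent simply plans with its empirical model plus a count-based reward bonus, so rarely tried state-action pairs look artificially attractive and get explored. The bonus form beta / sqrt(n(s, a)) and all parameter names here are illustrative assumptions.

import numpy as np

def plan_with_bonus(counts, rewards, gamma=0.95, beta=1.0, n_iter=200):
    # counts[s, a, s'] : observed transition counts
    # rewards[s, a]    : empirical mean rewards
    # beta             : bonus scale (assumed, illustrative parameter)
    n_states, n_actions, _ = counts.shape
    n_sa = counts.sum(axis=2)                        # visits to each (s, a)
    # Empirical transition model; unvisited pairs fall back to a uniform guess.
    p_hat = np.where(n_sa[..., None] > 0,
                     counts / np.maximum(n_sa[..., None], 1),
                     1.0 / n_states)
    bonus = beta / np.sqrt(np.maximum(n_sa, 1))      # bonus decays with experience
    q = np.zeros((n_states, n_actions))
    for _ in range(n_iter):
        v = q.max(axis=1)
        q = rewards + bonus + gamma * (p_hat @ v)    # value iteration step
    return q

Acting greedily with respect to the returned Q-values yields the bonus-driven exploration behavior described above; the paper's contribution concerns how such a bonus can be constructed without relying on the prior while still yielding near Bayes-optimal behavior.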
A Greedy Approximation of Bayesian Reinforcement Learning with Probably Optimistic Transition Model
Kawaguchi, Kenji | Araya, Mauricio
Bayesian Reinforcement Learning (RL) is capable not only of incorporating domain knowledge, but also of solving the exploration-exploitation dilemma in a natural way. As Bayesian RL is intractable except in special cases, previous work has proposed several approximation methods. However, these methods are usually too sensitive to parameter values, and finding an acceptable parameter setting is practically impossible in many applications. In this paper, we propose a new algorithm that greedily approximates Bayesian RL to achieve robustness in parameter space. We show that, for a desired learning behavior, our proposed algorithm has a polynomial sample complexity that is lower than those of existing algorithms. We also demonstrate that the proposed algorithm naturally outperforms other existing algorithms when the prior distributions are not significantly misleading. On the other hand, the proposed algorithm cannot handle greatly misspecified priors as well as the other algorithms can. This is a natural consequence of the fact that the proposed algorithm is greedier than the other algorithms. Accordingly, we discuss how to select an appropriate algorithm for a given task based on the algorithms' greediness. We also introduce a new way of simplifying Bayesian planning, from which future work may derive new algorithms.
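As a rough illustration of what a greedy Bayesian approximation with an optimistic transition model can look like, here is a short Python/NumPy sketch using Dirichlet posterior counts; it is again not the authors' exact construction. Planning uses the posterior-mean transitions, with a small, count-dependent amount of probability mass shifted toward the currently most valuable successor state. The `optimism` parameter and the 1/alpha0 decay are assumptions made for the example.

import numpy as np

def optimistic_greedy_q(alpha, rewards, gamma=0.95, optimism=1.0, n_iter=200):
    # alpha[s, a, s'] : Dirichlet posterior parameters (all > 0) over next states
    # rewards[s, a]   : known or estimated mean rewards
    n_states, n_actions, _ = alpha.shape
    alpha0 = alpha.sum(axis=2)                       # posterior pseudo-counts
    p_mean = alpha / alpha0[..., None]               # posterior-mean transitions
    q = np.zeros((n_states, n_actions))
    for _ in range(n_iter):
        v = q.max(axis=1)
        # Optimism: move a shrinking fraction of probability mass to the best successor.
        eps = np.minimum(optimism / alpha0, 1.0)
        q = rewards + gamma * ((1.0 - eps) * (p_mean @ v) + eps * v.max())
    return q

The agent would act greedily on these Q-values and replan as the Dirichlet counts are updated; because the optimism fades as pseudo-counts grow, the policy becomes increasingly faithful to the posterior mean, which mirrors the "greedy but probably optimistic" behavior the abstract describes.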