Bayesian Policy Gradient Algorithms

Mohammad Ghavamzadeh, Yaakov Engel

Neural Information Processing Systems 

Policy gradient methods are reinforcement learning algorithms that adapt a parameterized policy by following a performance gradient estimate. Conventional policy gradient methods use Monte-Carlo techniques to estimate this gradient. Because Monte-Carlo estimates tend to have high variance, a large number of samples is required, resulting in slow convergence. In this paper, we propose a Bayesian framework that models the policy gradient as a Gaussian process. This reduces the number of samples needed to obtain accurate gradient estimates. Moreover, estimates of the natural gradient, as well as a measure of the uncertainty in the gradient estimates, are provided at little extra cost.
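
The sketch below illustrates only the underlying Bayesian-quadrature idea the abstract alludes to: place a Gaussian-process prior on an integrand and compute a posterior mean and variance for its expectation from a small number of samples, instead of a plain Monte-Carlo average. The toy 1-D integrand, the RBF kernel, the lengthscale, and the Gaussian sampling measure are illustrative assumptions; the paper's actual estimator involves score functions and returns of sampled trajectories (and Fisher-kernel machinery for the natural gradient), which is not reproduced here.

# Minimal Bayesian-quadrature sketch (toy setting, not the paper's estimator):
# estimate E_{x ~ N(0, sigma2)}[f(x)] with a GP prior on f, and compare the
# posterior mean/variance against the plain Monte-Carlo average.
import numpy as np

def bayesian_quadrature(f, x, lengthscale=1.0, sigma2=1.0, noise=1e-8):
    """Posterior mean and variance of E_{x~N(0, sigma2)}[f(x)] given samples x.

    Uses an RBF kernel, for which the kernel-measure integrals have a
    closed form (a standard Bayesian Monte-Carlo / quadrature result).
    """
    y = f(x)
    l2 = lengthscale ** 2
    # Gram matrix of the RBF kernel on the sample locations.
    K = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2 / l2)
    K += noise * np.eye(len(x))
    # z_i = integral of k(x, x_i) against N(x; 0, sigma2): closed form for RBF.
    z = np.sqrt(l2 / (l2 + sigma2)) * np.exp(-0.5 * x ** 2 / (l2 + sigma2))
    # z0 = double integral of k(x, x') against the measure in both arguments.
    z0 = np.sqrt(l2 / (l2 + 2.0 * sigma2))
    Kinv_y = np.linalg.solve(K, y)
    mean = z @ Kinv_y                              # posterior mean of the integral
    var = z0 - z @ np.linalg.solve(K, z)           # posterior variance (uncertainty)
    return mean, var

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    f = lambda x: np.sin(x) + 0.5 * x ** 2         # toy integrand; true mean is 0.5
    x = rng.normal(size=10)                        # only a handful of samples
    bq_mean, bq_var = bayesian_quadrature(f, x)
    mc_mean = f(x).mean()                          # conventional Monte-Carlo estimate
    print(f"Bayesian quadrature: {bq_mean:.4f} +/- {np.sqrt(max(bq_var, 0.0)):.4f}")
    print(f"Monte-Carlo average: {mc_mean:.4f}   (true value 0.5000)")

The posterior variance returned above is the analogue of the "measure of uncertainty in the gradient estimates" mentioned in the abstract: it quantifies how much the estimate could still be wrong given the samples seen so far.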
