Nonparametric Bayesian Policy Priors for Reinforcement Learning

Finale Doshi-Velez, David Wingate, Nicholas Roy, Joshua B. Tenenbaum

Dec-31-2010, Neural Information Processing Systems

We consider reinforcement learning in partially observable domains where the agent can query an expert for demonstrations. Our nonparametric Bayesian approach combines model knowledge, inferred from expert information and independent exploration, with policy knowledge inferred from expert trajectories. We introduce priors that bias the agent towards models with both simple representations and simple policies, resulting in improved policy and model learning.

machine learning, reinforcement learning, world model

Country:
- North America > United States > Massachusetts (0.28)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning > Uncertainty
    - Bayesian Inference (0.88)
  - Machine Learning
    - Reinforcement Learning (1.00)
    - Learning Graphical Models
      - Undirected Networks > Markov Models (1.00)
      - Directed Networks > Bayesian Learning (1.00)
