A Bayesian Approach for Policy Learning from Trajectory Preference Queries

Aaron Wilson, Alan Fern, Prasad Tadepalli

Dec-31-2012–Neural Information Processing Systems

We consider the problem of learning control policies via trajectory preference queries to an expert. In particular, the learning agent can present an expert with short runs of a pair of policies originating from the same state, and the expert then indicates the preferred trajectory. The agent's goal is to elicit a latent target policy from the expert with as few queries as possible. To tackle this problem, we propose a novel Bayesian model of the querying process and introduce two methods that exploit this model to actively select expert queries. Experimental results on four benchmark problems indicate that our model can effectively learn policies from trajectory preference queries and that active query selection can be substantially more efficient than random selection.
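As a rough illustration of the query protocol described above, the following toy sketch rolls out two policies from the same start state, asks a simulated expert which short run it prefers, and applies one Bayes update over a finite set of candidate policies. Everything specific here is an assumption made for illustration and is not taken from the paper: the chain environment, the `threshold_policy` family, the disagreement-count score, the logistic response model with parameter `beta`, and the noiseless simulated expert. The paper's actual model and its two active query selection methods are not reproduced.

```python
import math

N_STATES = 6   # toy chain with states 0..5 (assumption, not from the paper)
HORIZON = 3    # length of each short run shown to the expert

def threshold_policy(k):
    """Hypothetical policy family: move right (+1) below k, else left (-1)."""
    return lambda s: 1 if s < k else -1

def rollout(policy, start, horizon=HORIZON):
    """Return a short trajectory of (state, action) pairs from `start`."""
    traj, s = [], start
    for _ in range(horizon):
        a = policy(s)
        traj.append((s, a))
        s = min(max(s + a, 0), N_STATES - 1)  # stay inside the chain
    return traj

def disagreement(traj, policy):
    """Count steps where the trajectory's action differs from `policy`."""
    return sum(1 for s, a in traj if policy(s) != a)

def pref_likelihood(traj_a, traj_b, policy, beta=2.0):
    """Assumed logistic model of P(expert prefers traj_a | target = policy)."""
    diff = disagreement(traj_b, policy) - disagreement(traj_a, policy)
    return 1.0 / (1.0 + math.exp(-beta * diff))

def update(posterior, traj_a, traj_b, a_preferred):
    """One Bayes update over candidate thresholds given the expert's answer."""
    new = {}
    for k, p in posterior.items():
        like = pref_likelihood(traj_a, traj_b, threshold_policy(k))
        new[k] = p * (like if a_preferred else 1.0 - like)
    z = sum(new.values())
    return {k: v / z for k, v in new.items()}

# One query: two policies run from the same start state.
candidates = range(N_STATES + 1)
prior = {k: 1.0 / len(candidates) for k in candidates}
target = threshold_policy(4)  # latent target, known only to the simulated expert
traj_a = rollout(threshold_policy(5), start=2)
traj_b = rollout(threshold_policy(0), start=2)
# Noiseless simulated expert: prefers the run that disagrees less with the target.
a_preferred = disagreement(traj_a, target) <= disagreement(traj_b, target)
posterior = update(prior, traj_a, traj_b, a_preferred)
```

A single answer cannot separate candidates that behave identically on the states visited, which is one way to see why the choice of query pair and start state matters.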

artificial intelligence, machine learning, trajectory

Country:
- North America > United States (0.29)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning > Uncertainty
    - Bayesian Inference (0.70)
  - Machine Learning
    - Statistical Learning (0.95)
    - Learning Graphical Models > Directed Networks
      - Bayesian Learning (0.70)
