Active Learning of Hierarchical Policies from State-Action Trajectories

Hamidi, Mandana (Oregon State University) | Tadepalli, Prasad (School of Electrical Engineering and Computer Science) | Goetschalckx, Robby (Oregon State University) | Fern, Alan (Oregon State University)

Mar-1-2015–AAAI Conferences

While most work on trajectory mining is applied to predict movements of mobile users, in this paper we consider a more general problem of building behavior models of users from their state-action trajectories. We assume that the user behavior can be compactly modeled as a Probabilistic State-Dependent Grammar (PSDG) which represents a hierarchical policy. The key problem is that while the states and actions of the user are directly observed, the user's intentional structure is not. We propose to learn the user’s policy from a set of selected trajectories and intention queries at selected states in the trajectory. Our main contributions are an algorithm for learning hierarchical policies from state-action trajectories, and principled heuristics for selecting suitable trajectories and intention queries. Experiments in multiple domains show that our approach is effective and more sample-efficient than learning non-hierarchical policies.
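To make the idea of a grammar whose productions depend on the current state more concrete, the following is a minimal Python sketch. It is not the paper's algorithm or domain: the one-dimensional world, the task names `Root` and `Approach`, the probabilities, and the helper names `productions` and `sample_trajectory` are all assumptions chosen for illustration. It only shows how a hierarchical policy of this kind could generate a state-action trajectory in which the task behind each action is hidden from an observer who sees only states and actions.

```python
import random

# Hypothetical toy domain: the state is an integer position, the primitive
# actions move one step, and the top-level task is to reach position 3.
PRIMITIVES = {"left": -1, "right": 1}
GOAL = 3


def productions(symbol, state):
    """Return (probability, right-hand side) pairs for `symbol` in `state`."""
    if symbol == "Root":
        # Stop once the goal is reached; otherwise approach it and recurse.
        if state == GOAL:
            return [(1.0, [])]
        return [(1.0, ["Approach", "Root"])]
    if symbol == "Approach":
        toward = "right" if state < GOAL else "left"
        away = "left" if toward == "right" else "right"
        return [(0.9, [toward]), (0.1, [away])]
    raise KeyError(symbol)


def sample_trajectory(start, rng, max_steps=1000):
    """Expand the grammar top-down; return ([(state, action, task)], final state)."""
    state = start
    stack = [("Root", None)]  # entries are (symbol, parent task)
    steps = []
    while stack and len(steps) < max_steps:
        symbol, parent = stack.pop()
        if symbol in PRIMITIVES:
            # `parent` is the unobserved intention behind this action.
            steps.append((state, symbol, parent))
            state += PRIMITIVES[symbol]
            continue
        options = productions(symbol, state)
        weights = [p for p, _ in options]
        _, rhs = rng.choices(options, weights=weights)[0]
        # Push in reverse so the first symbol of the right-hand side runs first.
        for child in reversed(rhs):
            stack.append((child, symbol))
    return steps, state


if __name__ == "__main__":
    steps, final_state = sample_trajectory(0, random.Random(0))
    for state, action, task in steps:
        print(state, action, task)
    print("final state:", final_state)
```

In this sketch an observer would see only the `(state, action)` part of each step; the third field plays the role of the answer to an intention query at that state.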

algorithm, query, trajectory

Country:
- North America > United States
  - Oregon > Benton County
    - Corvallis (0.04)
  - New York > New York County
    - New York City (0.04)
  - California > San Francisco County
    - San Francisco (0.14)

Industry:
- Transportation (0.31)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning (0.94)
  - Natural Language > Grammars & Parsing (0.53)
  - Machine Learning
    - Inductive Learning (0.68)
    - Supervised Learning (0.68)
