Active Imitation Learning of Hierarchical Policies
Hamidi, Mandana (Oregon State University) | Tadepalli, Prasad (Oregon State University) | Goetschalckx, Robby (Oregon State University) | Fern, Alan (Oregon State University)
... structure of the policy, which is often critical for understanding the demonstration, is unobserved. We formulate this problem as active learning of Probabilistic State-Dependent Grammars (PSDGs) from demonstrations. Given a set of expert demonstrations, our approach learns a hierarchical policy by actively selecting demonstrations and using queries to explicate their intentional structure at selected points. Our contributions include a new algorithm for imitation learning of hierarchical policies and principled heuristics for the selection of demonstrations and queries.
However, by being autonomous, these approaches have the problem of discovering unnatural hierarchies, which may be difficult to interpret and communicate to people. In this paper, we study the problem of learning policies with hierarchical structure from demonstrations of a teacher whose policy is structured hierarchically, with natural applications to problems such as tutoring arithmetic, cooking, and furniture assembly. A key challenge in this problem is that the demonstrations do not reveal the hierarchical task structure of the teacher. Rather, only ground states and teacher actions are directly observable. This can lead to significant ambiguity in ...
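To make the PSDG idea concrete, the following is a minimal sketch, not the authors' algorithm. It assumes a toy grammar in which each task has a few productions whose probabilities depend on the current state, and it uses entropy over productions as one plausible stand-in for a query-selection heuristic. The task names, the `water_temp` feature, the weights, and the functions `production_probs`, `entropy`, and `most_uncertain_point` are all hypothetical and chosen only for illustration.

```python
import math
from typing import Callable, Dict, List, Tuple

State = Dict[str, int]
Production = Tuple[str, ...]  # right-hand side: subtasks or primitive actions
# For each task: a list of (production, state-dependent weight function).
Grammar = Dict[str, List[Tuple[Production, Callable[[State], float]]]]

# Hypothetical weights: how likely the teacher is to boil water first,
# indexed by water_temp (0 = cold, 1 = warm, 2 = hot).
BOIL_WEIGHT = [0.9, 0.5, 0.1]

# Hypothetical toy grammar with a single task and two productions.
TOY_GRAMMAR: Grammar = {
    "MakeTea": [
        (("BoilWater", "Steep"), lambda s: BOIL_WEIGHT[s["water_temp"]]),
        (("Steep",), lambda s: 1.0 - BOIL_WEIGHT[s["water_temp"]]),
    ],
}


def production_probs(grammar: Grammar, task: str, state: State) -> Dict[Production, float]:
    """Normalize the state-dependent weights of all productions of `task`."""
    weights = [(rhs, w(state)) for rhs, w in grammar[task]]
    total = sum(w for _, w in weights)
    return {rhs: w / total for rhs, w in weights}


def entropy(dist: Dict[Production, float]) -> float:
    """Shannon entropy (in bits) of a distribution over productions."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def most_uncertain_point(grammar: Grammar, task: str, states: List[State]) -> int:
    """Index of the state where the choice of production is most ambiguous.

    Assumption: the learner would query the teacher's intention at this point.
    """
    return max(
        range(len(states)),
        key=lambda i: entropy(production_probs(grammar, task, states[i])),
    )


if __name__ == "__main__":
    trajectory = [{"water_temp": 0}, {"water_temp": 1}, {"water_temp": 2}]
    print(production_probs(TOY_GRAMMAR, "MakeTea", trajectory[0]))
    print(most_uncertain_point(TOY_GRAMMAR, "MakeTea", trajectory))
```

In this toy setting the warm-water state is where the two productions are equally likely, so it is the point at which asking the teacher about their intention would be most informative under the assumed entropy criterion.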
Jul-15-2015