Neural Modular Control for Embodied Question Answering
Abhishek Das, Georgia Gkioxari, Stefan Lee, Devi Parikh, Dhruv Batra
arXiv.org Artificial Intelligence
We present a modular approach for learning navigation policies over long planning horizons from language input. Our hierarchical policy operates at multiple timescales: a higher-level master policy proposes subgoals, which are executed by specialized sub-policies. Our choice of subgoals is compositional and semantic, i.e., they can be sequentially combined in arbitrary orderings and have human-interpretable descriptions (e.g., 'exit room', 'find kitchen', 'find refrigerator'). We use imitation learning to warm-start policies at each level of the hierarchy, dramatically increasing sample efficiency, followed by reinforcement learning. Independent reinforcement learning at each level of the hierarchy enables sub-policies to adapt to the consequences of their actions and recover from errors. Subsequent joint hierarchical training enables the master policy to adapt to the sub-policies.
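The two-level control flow described above can be illustrated with a minimal sketch. This is not the authors' implementation: the 1-D corridor world, the landmark positions, and the names `master_policy`, `make_sub_policy`, and `run_episode` are all hypothetical, and both levels are hand-scripted here rather than learned. It shows only how a master policy might hand out subgoals that sub-policies execute over several primitive steps before returning control.

```python
from typing import Callable, Dict, List, Tuple

State = int   # toy state: agent position along a 1-D corridor (assumption)
Action = str  # primitive actions: "forward", "back", "stop"

# Hypothetical landmark positions in the toy corridor.
LANDMARKS: Dict[str, int] = {"door": 2, "kitchen": 5, "refrigerator": 7}


def make_sub_policy(target: int) -> Callable[[State], Action]:
    """Build a scripted sub-policy that walks toward `target`.

    It returns "stop" on arrival, which hands control back to the master.
    """
    def policy(state: State) -> Action:
        if state < target:
            return "forward"
        if state > target:
            return "back"
        return "stop"
    return policy


# One specialized sub-policy per human-interpretable subgoal.
SUB_POLICIES: Dict[str, Callable[[State], Action]] = {
    "exit room": make_sub_policy(LANDMARKS["door"]),
    "find kitchen": make_sub_policy(LANDMARKS["kitchen"]),
    "find refrigerator": make_sub_policy(LANDMARKS["refrigerator"]),
}


def master_policy(state: State, completed: List[str]) -> str:
    """Propose the next subgoal, or "done" when the plan is finished.

    Scripted stand-in: a fixed plan is followed in order. The state
    argument is unused here; a learned master would condition on it.
    """
    plan = ["exit room", "find kitchen", "find refrigerator"]
    for subgoal in plan:
        if subgoal not in completed:
            return subgoal
    return "done"


def step(state: State, action: Action) -> State:
    """Toy environment dynamics: move one cell forward or back."""
    return state + {"forward": 1, "back": -1}.get(action, 0)


def run_episode(start: State = 0,
                max_steps_per_subgoal: int = 20) -> Tuple[State, List[Tuple[str, int]]]:
    """Run the hierarchy; return the final state and (subgoal, steps) trace."""
    state, completed, trace = start, [], []
    while True:
        subgoal = master_policy(state, completed)  # slow timescale
        if subgoal == "done":
            break
        sub_policy = SUB_POLICIES[subgoal]
        steps = 0
        for _ in range(max_steps_per_subgoal):     # fast timescale
            action = sub_policy(state)
            if action == "stop":
                break
            state = step(state, action)
            steps += 1
        completed.append(subgoal)
        trace.append((subgoal, steps))
    return state, trace


if __name__ == "__main__":
    final_state, trace = run_episode()
    print(final_state, trace)
```

In this sketch the master acts once per subgoal while each sub-policy acts at every primitive step, which is the sense in which the hierarchy operates at multiple timescales. The imitation-learning warm start and the reinforcement-learning stages from the abstract are deliberately omitted.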
Oct-25-2018