Reviews: State Aware Imitation Learning
Neural Information Processing Systems
This paper proposes a framework for imitation learning based on MAP estimation, which can be seen as adding to the standard supervised learning loss a cost for deviating from the observed state distribution. The key to optimizing such a loss is a gradient expression for the stationary distribution of a given policy, which can be estimated online from policy rollouts. This idea extends a previous result by Morimura (2010). I find the approach original and potentially interesting to the NIPS community. The consequences of deviating from the demonstrated states in imitation learning have been recognized before, but this paper proposes a novel approach to the problem.
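The objective described above can be sketched as follows. The notation is assumed here for illustration and is not taken from the paper: $\pi_\theta$ is the policy with parameters $\theta$, $d^{\pi_\theta}$ its stationary state distribution, $(s_i, a_i)$ the $N$ demonstrated state-action pairs, and $p(\theta)$ a prior over parameters.

```latex
\begin{align}
J(\theta) &= \underbrace{\sum_{i=1}^{N} \log \pi_\theta(a_i \mid s_i)}_{\text{supervised (behavior cloning) term}}
           + \underbrace{\sum_{i=1}^{N} \log d^{\pi_\theta}(s_i)}_{\text{state-distribution term}}
           + \log p(\theta), \\
\hat{\theta} &= \arg\max_{\theta} J(\theta), \\
\nabla_\theta J(\theta) &= \sum_{i=1}^{N} \nabla_\theta \log \pi_\theta(a_i \mid s_i)
           + \sum_{i=1}^{N} \nabla_\theta \log d^{\pi_\theta}(s_i)
           + \nabla_\theta \log p(\theta).
\end{align}
```

In this reading, the first sum is the usual supervised loss, and the second sum is the part that requires the gradient of the log stationary distribution, $\nabla_\theta \log d^{\pi_\theta}(s)$, which is the quantity the review says is estimated online from rollouts.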
Oct-7-2024, 13:21:44 GMT