EMI: Exploration with Mutual Information
Reinforcement learning can be hard when the reward signal is sparse. In such settings, the exploration strategy becomes especially important: a good one not only helps the agent understand the world faster and better but also makes it more robust to changes in the environment. In this article, we discuss a novel exploration method, Exploration with Mutual Information (EMI), proposed by Kim et al. at ICML 2019. In a nutshell, EMI learns representations for both observations (states) and actions, with the expectation that a linear dynamics model can be fitted on these representations. EMI then computes the intrinsic reward as the prediction error under this linear dynamics model.
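To make the last step concrete, here is a minimal sketch of an intrinsic reward computed as a prediction error in embedding space. It is an illustration under stated assumptions, not the authors' implementation: the linear model is assumed to take the additive form phi(s') ≈ phi(s) + psi(a), the encoders are fixed random linear maps standing in for the learned representations, and every name and dimension (embed_state, embed_action, intrinsic_reward, OBS_DIM, and so on) is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
OBS_DIM, ACT_DIM, EMB_DIM = 8, 3, 4

# Stand-ins for the learned encoders. In EMI these would be trained networks;
# here they are fixed random linear maps so the sketch stays self-contained.
W_state = rng.normal(size=(EMB_DIM, OBS_DIM))
W_action = rng.normal(size=(EMB_DIM, ACT_DIM))


def embed_state(obs):
    """Map an observation to its embedding phi(s)."""
    return W_state @ obs


def embed_action(action):
    """Map an action to its embedding psi(a)."""
    return W_action @ action


def intrinsic_reward(obs, action, next_obs):
    """Squared prediction error under the assumed additive linear model.

    Assumption: phi(s') is predicted as phi(s) + psi(a). The reward is the
    squared L2 norm of the residual, so poorly predicted transitions score
    higher and are therefore more attractive to explore.
    """
    predicted_next = embed_state(obs) + embed_action(action)
    residual = embed_state(next_obs) - predicted_next
    return float(np.sum(residual ** 2))


# Example transition with random data.
obs = rng.normal(size=OBS_DIM)
action = rng.normal(size=ACT_DIM)
surprising_next = rng.normal(size=OBS_DIM)

# A next observation constructed to satisfy the linear model exactly.
predictable_next = obs + np.linalg.pinv(W_state) @ (W_action @ action)

r_surprising = intrinsic_reward(obs, action, surprising_next)
r_predictable = intrinsic_reward(obs, action, predictable_next)
```

In this sketch, a transition that the linear model predicts well receives an intrinsic reward close to zero, while one it predicts poorly receives a larger reward. In practice the intrinsic reward would typically be added to the environment reward with some weighting, which is left out here.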
Aug-28-2019, 19:01:30 GMT