COBRA: Data-Efficient Model-Based RL through Unsupervised Object Discovery and Curiosity-Driven Exploration
Watters, Nicholas, Matthey, Loic, Bosnjak, Matko, Burgess, Christopher P., Lerchner, Alexander
arXiv.org Artificial Intelligence
Recent advances in deep reinforcement learning (RL) have shown remarkable success on challenging tasks (Andrychowicz et al., 2018; Mnih et al., 2015; Silver et al., 2016). However, data efficiency and robustness to new contexts remain persistent challenges for deep RL algorithms, especially when the goal is for agents to learn practical tasks with limited supervision. Drawing inspiration from self-supervised "play" in human development (Gopnik et al., 1999; Settles, 2011), we introduce an agent that learns object-centric representations of its environment without supervision and subsequently harnesses these to learn policies efficiently and robustly. Our agent, which we call Curious Object-Based seaRch Agent (COBRA), brings together three key ingredients: (i) learning representations of the world in terms of objects, (ii) curiosity-driven exploration, and (iii) model-based RL. The benefits of this synthesis are data efficiency and policy robustness. To put this into practice, we introduce the following technical contributions:
- A method for learning action-conditioned dynamics over slot-structured object-centric representations that requires no supervision and is trained from raw pixels.
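The core idea of action-conditioned dynamics over slot-structured representations can be illustrated with a minimal sketch. Below, a scene is encoded as K object slots (each a D-dimensional latent vector), and a small transition function with weights shared across slots predicts each slot's next latent from that slot and the action. All names, dimensions, and the residual-MLP form here are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions (not from the paper): K object slots,
# each a D-dim latent; actions are A-dim vectors.
K, D, A = 4, 8, 2

# Tiny shared transition MLP: the same weights are applied to every slot,
# which is what "slot-structured" dynamics means in this sketch.
W1 = rng.normal(scale=0.1, size=(D + A, 32))
W2 = rng.normal(scale=0.1, size=(32, D))

def transition(slots, action):
    """Predict next slot latents from current slots (K, D) and an action (A,)."""
    # Broadcast the action to every slot, then apply the shared MLP per slot.
    inp = np.concatenate([slots, np.tile(action, (slots.shape[0], 1))], axis=1)
    h = np.tanh(inp @ W1)
    return slots + h @ W2  # residual update: next state = current + predicted delta

slots = rng.normal(size=(K, D))   # stand-in for unsupervised object latents
action = np.array([1.0, 0.0])
next_slots = transition(slots, action)
print(next_slots.shape)
```

In the paper's setting the slot latents would come from an unsupervised object-discovery model trained on raw pixels, and the transition model would be trained to match the encoder's latents at the next timestep; this sketch only shows the per-slot, weight-shared structure of such a model.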
May-22-2019