Visual Interaction Networks: Learning a Physics Simulator from Video

Nicholas Watters, Daniel Zoran, Theophane Weber, Peter Battaglia, Razvan Pascanu, Andrea Tacchetti

Neural Information Processing Systems 

From just a glance, humans can make rich predictions about the future of a wide range of physical systems. On the other hand, modern approaches from engineering, robotics, and graphics are often restricted to narrow domains or require information about the underlying state. We introduce the Visual Interaction Network, a general-purpose model for learning the dynamics of a physical system from raw visual observations. Our model consists of a perceptual front-end based on convolutional neural networks and a dynamics predictor based on interaction networks. Through joint training, the perceptual front-end learns to parse a dynamic visual scene into a set of factored latent object representations.
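
To make the two-stage structure described above concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a fixed number of object slots and a residual state update. A linear map stands in for the convolutional front-end. The dynamics step follows the general interaction-network pattern: a shared pairwise function, summed over the other objects, feeding a shared per-object update. All names (`encode`, `interaction_step`, `rollout`, `init_mlp`) and all layer sizes are hypothetical choices made for illustration.

```python
import numpy as np


def init_mlp(rng, n_in, n_hidden, n_out):
    """Random parameters for a two-layer perceptron (illustrative sizes only)."""
    return (rng.normal(0.0, 0.1, (n_in, n_hidden)), np.zeros(n_hidden),
            rng.normal(0.0, 0.1, (n_hidden, n_out)), np.zeros(n_out))


def mlp(params, x):
    """Two-layer perceptron with a ReLU hidden layer."""
    w1, b1, w2, b2 = params
    return np.maximum(x @ w1 + b1, 0.0) @ w2 + b2


def encode(frame, w_enc, n_objects, state_dim):
    """Stand-in for the perceptual front-end.

    Assumption: a single linear map from pixels to object slots. The model in
    the text uses convolutional networks here instead.
    """
    return (frame.reshape(-1) @ w_enc).reshape(n_objects, state_dim)


def interaction_step(states, relation_params, object_params):
    """One dynamics step over factored object states of shape (K, D)."""
    k = states.shape[0]
    receivers = np.repeat(states, k, axis=0)  # row i*K + j holds states[i]
    senders = np.tile(states, (k, 1))         # row i*K + j holds states[j]
    pairs = np.concatenate([receivers, senders], axis=1)
    # effects[i, j] is the effect of object j on object i.
    effects = mlp(relation_params, pairs).reshape(k, k, -1)
    mask = 1.0 - np.eye(k)                    # drop self-pairs (i == j)
    aggregated = (effects * mask[:, :, None]).sum(axis=1)
    delta = mlp(object_params, np.concatenate([states, aggregated], axis=1))
    return states + delta                     # residual update (an assumption)


def rollout(states, relation_params, object_params, n_steps):
    """Apply the dynamics step repeatedly; returns an array of shape (n_steps, K, D)."""
    trajectory = []
    for _ in range(n_steps):
        states = interaction_step(states, relation_params, object_params)
        trajectory.append(states)
    return np.stack(trajectory)
```

Because the pairwise and per-object functions are shared across objects and the effects are summed, the step is equivariant to the ordering of the object slots: permuting the input states permutes the predicted states in the same way. In joint training, a prediction loss on the rolled-out states would be backpropagated through both `rollout` and `encode`; that training loop is omitted here.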