Visual Dynamics: Stochastic Future Generation via Layered Cross Convolutional Networks