Learning Graphical Models of Images, Videos and Their Spatial Transformations