MaskViT: Masked Visual Pre-Training for Video Prediction