MAGVLT: Masked Generative Vision-and-Language Transformer