2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining
