Multilingual Diversity Improves Vision-Language Representations Thao Nguyen 1 Matthew Wallingford

Neural Information Processing Systems 

Massive web-crawled image-text datasets lay the foundation for recent progress in multimodal learning.

Similar Docs  Excel Report  more

TitleSimilaritySource
None found