MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens

Anas Awadalla¹,², Le Xue², Oscar Lo¹
