OBELICS: An Open Web-Scale Filtered Dataset of Interleaved Image-Text Documents
Hugo Laurençon*,1,2 Lucile Saulnier*,1 Léo Tronchon

Neural Information Processing Systems 

We describe the dataset creation process, present comprehensive filtering rules, and provide an analysis of the dataset's content.
