Wukong: A 100 Million Large-scale Chinese Cross-modal Pre-training Benchmark

Neural Information Processing Systems 

The success of vision-language pre-training (VLP) models relies heavily on the scale of the cross-modal datasets used for pre-training. However, the lack of large-scale datasets and benchmarks in Chinese hinders the development of Chinese VLP models and of broader multilingual applications. In this work, we release a large-scale Chinese cross-modal dataset named Wukong, which contains 100 million Chinese image-text pairs collected from the web.
