TaiSu: A 166M Large-scale High-Quality Dataset for Chinese Vision-Language Pre-training

Neural Information Processing Systems 

It has achieved great success on different vision-language downstream tasks such as Image-Text Retrieval, Image Captioning, and Visual Question Answering.
