WikiDO: A New Benchmark Evaluating Cross-Modal Retrieval for Vision-Language Models
