Supplementary Material - WikiDO: A New Benchmark Evaluating Cross-Modal Retrieval for Vision-Language Models

A Datasheet for WikiDO dataset

A.1 Motivation