




Supplemental: A Benchmark for Compositional Text-to-image Retrieval

Neural Information Processing Systems

GQA: GQA has annotations of objects and attributes in images. We use these to construct queries such as "square white plate". We train on the GQA train split, with the unseen test queries and their corresponding images removed. This leaves around 67K training images and 27K queries.

CLEVR: On CLEVR, we test on 96 classes over 22,500 images.
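The query construction and train-split filtering described for GQA can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' pipeline: the annotation layout (a dict mapping image ids to objects with `name` and `attributes`), the alphabetical ordering of attributes, the cap of two attributes per query, and the helper names `build_queries` and `filter_train` are all hypothetical.

```python
from itertools import combinations


def build_queries(objects, max_attrs=2):
    """Return attribute-object query strings for one image.

    Assumption: attributes are ordered alphabetically and at most
    `max_attrs` of them are combined with the object name.
    """
    queries = set()
    for obj in objects:
        attrs = sorted(set(obj["attributes"]))
        for k in range(1, max_attrs + 1):
            for combo in combinations(attrs, k):
                queries.add(" ".join(combo + (obj["name"],)))
    return queries


def filter_train(images, unseen_test_queries):
    """Keep only images that match none of the held-out test queries."""
    kept = {}
    for image_id, objects in images.items():
        if build_queries(objects).isdisjoint(unseen_test_queries):
            kept[image_id] = objects
    return kept


# Toy annotations in the assumed (hypothetical) layout.
images = {
    "img_1": [{"name": "plate", "attributes": ["white", "square"]}],
    "img_2": [{"name": "dog", "attributes": ["brown"]}],
}
train = filter_train(images, unseen_test_queries={"square white plate"})
```

In this toy example, `img_1` is dropped from training because it matches the held-out query "square white plate", while `img_2` is kept.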



7428e6db752171d6b832c53b2ed297ab-Paper-Conference.pdf

Neural Information Processing Systems

First, we formalize the problem definition. We introduce the concept of "I don't know (idk) responses" and, in this context, honesty necessitates that an aligned LLM provides idk responses for unknown questions and correct responses for known questions.
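The honesty criterion stated above can be read as a simple decision rule. The sketch below is one possible reading, not the paper's evaluation code: the `IDK` marker, the record fields, the exact-string comparison of answers, and the function names `is_honest` and `honesty_rate` are all assumptions made for illustration.

```python
IDK = "idk"  # hypothetical marker for an "I don't know" response


def is_honest(response, known, correct_answer=None):
    """Apply the rule: correct answer if known, idk response if unknown.

    Assumption: answers are compared by exact string match, which is a
    simplification of how correctness would be judged in practice.
    """
    if known:
        return response == correct_answer
    return response == IDK


def honesty_rate(records):
    """Fraction of records whose response satisfies the rule."""
    if not records:
        return 0.0
    honest = sum(
        is_honest(r["response"], r["known"], r.get("correct_answer"))
        for r in records
    )
    return honest / len(records)


records = [
    {"response": "Paris", "known": True, "correct_answer": "Paris"},
    {"response": IDK, "known": False},
    {"response": "Rome", "known": False},
    {"response": IDK, "known": True, "correct_answer": "Berlin"},
]
rate = honesty_rate(records)
```

Under this reading, an idk response to a known question counts as a failure just as a guessed answer to an unknown question does, so two of the four toy records satisfy the rule.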