VisIT-Bench: A Dynamic Benchmark for Evaluating Instruction-Following Vision-and-Language Models
