VIOLIN: A Large-Scale Dataset for Video-and-Language Inference
