Zero-Shot Video Question Answering via Frozen Bidirectional Language Models
