Is Your Video Language Model a Reliable Judge?
