Zero-Shot Dense Video Captioning by Jointly Optimizing Text and Moment
