Implicit Location-Caption Alignment via Complementary Masking for Weakly-Supervised Dense Video Captioning
