HaVTR: Improving Video-Text Retrieval Through Augmentation Using Large Foundation Models
