HiT: Hierarchical Transformer with Momentum Contrast for Video-Text Retrieval
