Expectation-Maximization Contrastive Learning for Compact Video-and-Language Representations

Neural Information Processing Systems 

The core idea of contrastive learning is to pull the textual and visual representations of matched text-video pairs together and push the representations of unmatched text-video pairs apart.
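To make the pull-together and push-apart idea concrete, here is a minimal sketch of one common form of contrastive objective: a symmetric InfoNCE-style loss over a batch of text-video pairs. The source does not specify this formulation. The function name `symmetric_contrastive_loss`, the cosine-similarity scoring, and the `temperature` value of 0.07 are assumptions made for illustration, and the sketch does not include the expectation-maximization component named in the title.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (assumes the vector is non-zero)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def log_softmax(row):
    """Numerically stable log-softmax over a list of scores."""
    m = max(row)
    log_z = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - log_z for x in row]


def symmetric_contrastive_loss(text_emb, video_emb, temperature=0.07):
    """Illustrative symmetric InfoNCE-style loss (hypothetical helper).

    text_emb[i] and video_emb[i] are assumed to be a matched pair;
    every other combination in the batch is treated as unmatched.
    """
    texts = [l2_normalize(t) for t in text_emb]
    videos = [l2_normalize(v) for v in video_emb]
    n = len(texts)

    # Temperature-scaled cosine similarity between every text and every video.
    sim = [
        [sum(a * b for a, b in zip(t, v)) / temperature for v in videos]
        for t in texts
    ]

    # Text-to-video: each text should score its own video highest.
    t2v = -sum(log_softmax(sim[i])[i] for i in range(n)) / n

    # Video-to-text: the same objective on the transposed matrix.
    sim_t = [list(col) for col in zip(*sim)]
    v2t = -sum(log_softmax(sim_t[j])[j] for j in range(n)) / n

    return 0.5 * (t2v + v2t)


# Matched pairs point the same way; the shuffled batch breaks the pairing.
texts = [[1.0, 0.0], [0.0, 1.0]]
videos_matched = [[1.0, 0.0], [0.0, 1.0]]
videos_shuffled = [[0.0, 1.0], [1.0, 0.0]]

loss_matched = symmetric_contrastive_loss(texts, videos_matched)
loss_shuffled = symmetric_contrastive_loss(texts, videos_shuffled)
```

Under these assumptions, `loss_matched` comes out far smaller than `loss_shuffled`: minimizing the loss raises the similarity of matched pairs relative to the unmatched pairs in the same batch.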
