Supplementary Material for "Expectation-Maximization Contrastive Learning for Compact Video-and-Language Representations"

Neural Information Processing Systems 

It may pose challenges for protecting information security.

Implementation Details. The EMCL module is trained together with the neural network. Moreover, we update the initial value M using a moving-average method. "↑" denotes that higher is better. We use CLIP (ViT-B/32) [14] as the pre-trained bi-encoder.
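The moving-average update of the initial value M can be illustrated with a minimal sketch. The text above does not give the exact rule, so this assumes an exponential moving average; the function name `update_initial_value`, the `momentum` parameter, and the toy values are hypothetical and chosen only for illustration.

```python
def update_initial_value(m_init, m_estimate, momentum=0.9):
    """Blend the stored initial value with a newer estimate.

    Assumption: an exponential moving average,
        M <- momentum * M + (1 - momentum) * M_estimate,
    applied element-wise. The source only says "moving average".
    """
    return [
        [momentum * old + (1.0 - momentum) * new
         for old, new in zip(row_old, row_new)]
        for row_old, row_new in zip(m_init, m_estimate)
    ]


# Toy example: M holds two 2-dimensional vectors (hypothetical values).
m_init = [[0.0, 0.0], [1.0, 1.0]]
m_batch = [[1.0, 2.0], [1.0, 3.0]]  # estimate obtained from one training batch

m_init = update_initial_value(m_init, m_batch, momentum=0.9)
print(m_init)
```

With a momentum close to 1, the stored value changes slowly, so a single noisy batch has limited influence on the initialization used in later iterations.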
