Supplementary Material for "Expectation-Maximization Contrastive Learning for Compact Video-and-Language Representations"
–Neural Information Processing Systems
Potential negative societal impacts. Although our work improves the performance of text-video retrieval, it may also make cross-modal retrieval of sensitive information on the network easier, which may raise challenges for protecting information security.

Limitations of our work. Iterative approaches are sensitive to initialization and to parameters such as the dimension and the number of subspaces. In our work, although we use L2 normalization to limit the value range of the parameters, the EM algorithm [3] may still converge to poor results. The choice of the number of subspaces also has a relatively significant impact on model performance.