emcl
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- Asia > China > Guangdong Province > Shenzhen (0.05)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- Asia > China > Guangdong Province > Shenzhen (0.04)
- Europe > Spain (0.04)
- Asia > China > Hong Kong (0.04)
Expectation-Maximization Contrastive Learning for Compact Video-and-Language Representations
Most video-and-language representation learning approaches employ contrastive learning, e.g., CLIP, to project the video and text features into a common latent space according to the semantic similarities of text-video pairs. However, such learned shared latent spaces are not often optimal, and the modality gap between visual and textual representation can not be fully eliminated. In this paper, we propose Expectation-Maximization Contrastive Learning (EMCL) to learn compact video-and-language representations. Specifically, we use the Expectation-Maximization algorithm to find a compact set of bases for the latent space, where the features could be concisely represented as the linear combinations of these bases. Such feature decomposition of video-and-language representations reduces the rank of the latent space, resulting in increased representing power for the semantics. Extensive experiments on three benchmark text-video retrieval datasets prove that our EMCL can learn more discriminative video-and-language representations than previous methods, and significantly outperform previous state-of-the-art methods across all metrics. More encouragingly, the proposed method can be applied to boost the performance of existing approaches either as a jointly training layer or an out-of-the-box inference module with no extra training, making it easy to be incorporated into any existing methods.
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- Asia > China > Guangdong Province > Shenzhen (0.05)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- Asia > China > Guangdong Province > Shenzhen (0.04)
- Europe > Spain (0.04)
- Asia > China > Hong Kong (0.04)
Expectation-Maximization Contrastive Learning for Compact Video-and-Language Representations
Most video-and-language representation learning approaches employ contrastive learning, e.g., CLIP, to project the video and text features into a common latent space according to the semantic similarities of text-video pairs. However, such learned shared latent spaces are not often optimal, and the modality gap between visual and textual representation can not be fully eliminated. In this paper, we propose Expectation-Maximization Contrastive Learning (EMCL) to learn compact video-and-language representations. Specifically, we use the Expectation-Maximization algorithm to find a compact set of bases for the latent space, where the features could be concisely represented as the linear combinations of these bases. Such feature decomposition of video-and-language representations reduces the rank of the latent space, resulting in increased representing power for the semantics.
Adapting to Dynamic LEO-B5G Systems: Meta-Critic Learning Based Efficient Resource Scheduling
Yuan, Yaxiong, lei, Lei, Vu, Thang X., Chang, Zheng, Chatzinotas, Symeon, Sun, Sumei
Low earth orbit (LEO) satellite-assisted communications have been considered as one of key elements in beyond 5G systems to provide wide coverage and cost-efficient data services. Such dynamic space-terrestrial topologies impose exponential increase in the degrees of freedom in network management. In this paper, we address two practical issues for an over-loaded LEO-terrestrial system. The first challenge is how to efficiently schedule resources to serve the massive number of connected users, such that more data and users can be delivered/served. The second challenge is how to make the algorithmic solution more resilient in adapting to dynamic wireless environments.To address them, we first propose an iterative suboptimal algorithm to provide an offline benchmark. To adapt to unforeseen variations, we propose an enhanced meta-critic learning algorithm (EMCL), where a hybrid neural network for parameterization and the Wolpertinger policy for action mapping are designed in EMCL. The results demonstrate EMCL's effectiveness and fast-response capabilities in over-loaded systems and in adapting to dynamic environments compare to previous actor-critic and meta-learning methods.
- Europe > Finland > Central Finland > Jyväskylä (0.04)
- North America > United States > New York > Nassau County > Mineola (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- (2 more...)
- Telecommunications (0.93)
- Information Technology > Services (0.34)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)