Modality-Independent Teachers Meet Weakly-Supervised Audio-Visual Event Parser

Neural Information Processing Systems 

Audio-visual learning has been a major pillar of multi-modal machine learning, where the community mostly focused on its modality-aligned setting, i.e., the
