Distilling Audio-Visual Knowledge by Compositional Contrastive Learning
