Cross-modal Active Complementary Learning with Self-refining Correspondence

Yang Qin

Neural Information Processing Systems 

Image-text matching, which is fundamental to understanding the latent correspondence across visual and textual modalities, has recently attracted increasing attention from academia and industry.