Group Contrastive Learning for Weakly Paired Multimodal Data