Hierarchical Contrastive Learning for Multimodal Data