Multimodal Learning Without Labeled Multimodal Data: Guarantees and Applications