Large Multilingual Models Pivot Zero-Shot Multimodal Learning across Languages