Scaling Multimodal Pre-Training via Cross-Modality Gradient Harmonization
