MMPareto: Boosting Multimodal Learning with Innocent Unimodal Assistance

Yake Wei, Di Hu

arXiv.org Artificial Intelligence 

Multimodal learning methods with targeted unimodal learning objectives have exhibited superior efficacy in alleviating the imbalanced multimodal learning problem. However, in this paper, we identify a previously ignored gradient conflict between the multimodal and unimodal learning objectives, which can mislead the unimodal encoder optimization. To diminish these conflicts, we observe the discrepancy between the multimodal loss and the unimodal loss: both the gradient magnitude and the gradient covariance of the easier-to-learn multimodal loss are smaller than those of the unimodal one. With this property, we analyze Pareto integration under our multimodal scenario and propose the MMPareto algorithm, which ensures a final gradient whose direction is common to all learning objectives and whose magnitude is enhanced to improve generalization, providing innocent unimodal assistance.

This problem has recently raised wide attention (Zhang et al., 2024). Several methods have been proposed to improve the training of the worse-learnt modality, either with an additional module (Wang et al., 2020) or with a modality-specific training strategy (Peng et al., 2022; Wu et al., 2022; Wei et al., 2024). These methods share one common idea: targetedly improving unimodal training. Among them, multitask-like methods, which directly add unimodal learning objectives besides the multimodal joint learning objective, exhibit superior effectiveness for alleviating this imbalanced multimodal learning problem (Wang et al., 2020; Du et al., 2023; Fan et al., 2023). However, every coin has two sides: behind the effective performance, we observe a previously ignored risk in model optimization under this widely used multitask-like scenario, potentially limiting model ability.
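The Pareto integration described above can be sketched for the two-gradient case. The following is a minimal illustrative sketch, not the paper's implementation: it uses the standard MGDA-style min-norm convex combination of the multimodal gradient and the unimodal gradient, which yields a direction of non-negative inner product with both objectives when a common descent direction exists. The final magnitude rescaling (to the average of the two input norms) is an assumption made here for illustration; the paper's exact rescaling may differ.

```python
import numpy as np

def pareto_combine(g_mm, g_uni):
    """Combine a multimodal gradient and a unimodal gradient (sketch).

    Min-norm (MGDA-style) convex combination for two objectives:
        gamma* = clip(<g_uni - g_mm, g_uni> / ||g_mm - g_uni||^2, 0, 1)
    The resulting direction has non-negative inner product with both
    input gradients whenever a common descent direction exists.
    """
    diff = g_mm - g_uni
    denom = float(diff @ diff)
    if denom == 0.0:
        # Identical gradients: any convex weight gives the same direction.
        gamma = 0.5
    else:
        gamma = float(np.clip((g_uni - g_mm) @ g_uni / denom, 0.0, 1.0))
    g = gamma * g_mm + (1.0 - gamma) * g_uni

    # Hypothetical magnitude enhancement: rescale the common direction to
    # the average norm of the two inputs, so the combined step is not
    # shrunk by the min-norm solution.
    norm_g = np.linalg.norm(g)
    if norm_g > 0.0:
        target = 0.5 * (np.linalg.norm(g_mm) + np.linalg.norm(g_uni))
        g = g * (target / norm_g)
    return g
```

For orthogonal unit gradients, for example, the min-norm weight is 0.5 and the rescaled result keeps unit norm while descending both objectives.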
