Balancing Multimodal Training Through Game-Theoretic Regularization

Jun-23-2026, 03:02:32 GMT–Neural Information Processing Systems

Multimodal learning holds promise for richer information extraction by capturing dependencies across data sources. Yet, current training methods often underperform due to modality competition, a phenomenon where modalities contend for training resources leaving some underoptimized. This raises a pivotal question: how can we address training imbalances, ensure adequate optimization across all modalities, and achieve consistent performance improvements as we transition from unimodal to multimodal data? This paper proposes the Multimodal Competition Regularizer (MCR), inspired by a mutual information (MI) decomposition designed to prevent the adverse effects of competition in multimodal training. Our key contributions are: 1) A game-theoretic framework that adaptively balances modality contributions by encouraging each to maximize its informative role in the final prediction 2) Refining lower and upper bounds for each MI term to enhance the extraction of both taskrelevant unique and shared information across modalities.

artificial intelligence, machine learning, natural language, (17 more...)

Neural Information Processing Systems

Jun-23-2026, 03:02:32 GMT

Conferences PDF

Add feedback

Genre:
- Overview (0.93)
- Research Report
  - Experimental Study (1.00)
  - New Finding (0.67)

Industry:
- Health & Medicine (0.48)

Technology:
- Information Technology
  - Data Science (1.00)
  - Artificial Intelligence
    - Vision (1.00)
    - Representation & Reasoning (1.00)
    - Natural Language (1.00)
    - Machine Learning > Neural Networks (1.00)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found