Efficient Quantification of Multimodal Interaction at Sample Level

Jun-24-2025–arXiv.org Machine Learning

Interactions between modalities -- redundancy, uniqueness, and synergy -- collectively determine the composition of multimodal information. Understanding these interactions is crucial for analyzing information dynamics in multimodal systems, yet their accurate sample-level quantification presents significant theoretical and computational challenges. To address this, we introduce the Lightweight Sample-wise Multimodal Interaction (LSMI) estimator, rigorously grounded in pointwise information theory. We first develop a redundancy estimation framework, employing an appropriate pointwise information measure to quantify this most decomposable and measurable interaction. Building upon this, we propose a general interaction estimation method that employs efficient entropy estimation, specifically tailored for sample-wise estimation in continuous distributions. Extensive experiments on synthetic and real-world datasets validate LSMI's precision and efficiency. Crucially, our sample-wise approach reveals fine-grained sample- and category-level dynamics within multimodal data, enabling practical applications such as redundancy-informed sample partitioning, targeted knowledge distillation, and interaction-aware model ensembling. The code is available at https://github.com/GeWu-Lab/LSMI_Estimator.
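The abstract does not give formulas, so the following is only an illustrative sketch of how a sample-level split into redundancy, uniqueness, and synergy can be tracked for a discrete toy distribution. It assumes the usual decomposition bookkeeping applied pointwise: i(x1;y) = r + u1, i(x2;y) = r + u2, and i(x1,x2;y) = r + u1 + u2 + s. The redundancy rule used here (the minimum of the two single-modality pointwise informations) is a naive placeholder and not the LSMI measure, and the function name `pointwise_decomposition` is hypothetical. LSMI itself targets continuous distributions through entropy estimation, which this sketch does not attempt.

```python
import math
from collections import defaultdict


def pointwise_decomposition(joint, x1, x2, y):
    """Split the pointwise information of one sample (x1, x2, y).

    joint maps (x1, x2, y) -> probability for a discrete toy distribution.
    """
    p_y = defaultdict(float)
    p_x1 = defaultdict(float)
    p_x2 = defaultdict(float)
    p_x1y = defaultdict(float)
    p_x2y = defaultdict(float)
    p_x12 = defaultdict(float)
    # Accumulate the marginals needed for the pointwise terms.
    for (a, b, c), p in joint.items():
        p_y[c] += p
        p_x1[a] += p
        p_x2[b] += p
        p_x1y[(a, c)] += p
        p_x2y[(b, c)] += p
        p_x12[(a, b)] += p

    # Pointwise mutual information terms, in bits.
    i1 = math.log2(p_x1y[(x1, y)] / (p_x1[x1] * p_y[y]))
    i2 = math.log2(p_x2y[(x2, y)] / (p_x2[x2] * p_y[y]))
    i12 = math.log2(joint[(x1, x2, y)] / (p_x12[(x1, x2)] * p_y[y]))

    # ASSUMPTION: placeholder redundancy rule, not the paper's measure.
    r = min(i1, i2)
    u1 = i1 - r
    u2 = i2 - r
    s = i12 - i1 - i2 + r
    return {"redundancy": r, "unique1": u1, "unique2": u2,
            "synergy": s, "total": i12}


# XOR target: neither modality alone is informative, both together are.
xor_joint = {(0, 0, 0): 0.25, (0, 1, 1): 0.25,
             (1, 0, 1): 0.25, (1, 1, 0): 0.25}
xor_parts = pointwise_decomposition(xor_joint, 0, 1, 1)

# Copied bit: both modalities carry the same information about y.
copy_joint = {(0, 0, 0): 0.5, (1, 1, 1): 0.5}
copy_parts = pointwise_decomposition(copy_joint, 1, 1, 1)
```

Under the placeholder rule, the XOR sample is attributed entirely to synergy and the copied-bit sample entirely to redundancy. Pointwise terms can be negative for individual samples, which is one reason the choice of redundancy measure matters at sample level.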

artificial intelligence, machine learning, natural language

Country:
- North America > Canada (0.04)
- Europe > Switzerland
  - Zürich > Zürich (0.14)
- Asia > China
  - Beijing > Beijing (0.04)

Genre:
- Research Report > New Finding (1.00)

Technology:
- Information Technology
  - Information Management (0.87)
  - Artificial Intelligence
    - Natural Language (1.00)
    - Machine Learning (1.00)
    - Vision (0.68)
    - Representation & Reasoning (0.66)
