Seek Commonality but Preserve Differences: Dissected Dynamics Modeling for Multi-modal Visual RL
–Neural Information Processing Systems
Accurate environment dynamics modeling is crucial for obtaining effective state representations in visual reinforcement learning (RL) applications. However, when facing multiple input modalities, existing dynamics modeling methods (e.g., Deep-MDP) usually stumble in addressing the complex and volatile relationship between different modalities. In this paper, we study the problem of efficient dynamics modeling for multi-modal visual RL. We find that under the existence of modality heterogeneity, modality-correlated and distinct features are equally important but play different roles in reflecting the evolution of environmental dynamics. Motivated by this fact, we propose Dissected Dynamics Modeling (DDM), a novel multi-modal dynamics modeling method for visual RL.
Neural Information Processing Systems
May-28-2025, 07:09:35 GMT
- Genre:
- Research Report
- Experimental Study (1.00)
- New Finding (1.00)
- Research Report
- Industry:
- Information Technology (0.68)
- Transportation > Ground
- Road (0.46)
- Technology:
- Information Technology > Artificial Intelligence
- Machine Learning
- Neural Networks (1.00)
- Reinforcement Learning (0.90)
- Natural Language (0.93)
- Representation & Reasoning (1.00)
- Robots (1.00)
- Vision (1.00)
- Machine Learning
- Information Technology > Artificial Intelligence