Multi-modal contrastive learning adapts to intrinsic dimensions of shared latent variables

Jun-13-2026, 21:47:20 GMT–Neural Information Processing Systems

In this paper, we study the theoretical properties of representations learned by multi-modal contrastive learning, going beyond linear representations and specific data distributions. Our analysis reveals that, enabled by temperature optimization, multi-modal contrastive learning not only maximizes the mutual information between modalities but also adapts to the intrinsic dimensions of the data, which can be much lower than the user-specified dimensions of the representation vectors. Experiments on both synthetic and real-world datasets demonstrate that contrastive learning can learn low-dimensional, informative representations, bridging theoretical insights and practical performance.
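For readers unfamiliar with the objective the abstract refers to, the following is a minimal sketch of a symmetric InfoNCE-style contrastive loss with an explicit temperature, written in NumPy. The abstract does not state the exact loss used in the paper, so this CLIP-style form is an assumption, and the names `symmetric_info_nce` and `log_softmax` are hypothetical helpers introduced only for illustration. The temperature is a plain argument here; the small sweep at the end merely stands in for the temperature optimization the abstract mentions.

```python
import numpy as np


def log_softmax(logits, axis):
    # Numerically stable log-softmax along the given axis.
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def symmetric_info_nce(x, y, temperature):
    """Symmetric InfoNCE loss for n paired representations.

    x, y: arrays of shape (n, d); row i of x is paired with row i of y.
    temperature: positive scalar that scales the cosine similarities.
    (Assumed CLIP-style form, not necessarily the paper's exact objective.)
    """
    # L2-normalize so the similarity is a cosine similarity.
    x = x / np.linalg.norm(x, axis=1, keepdims=True)
    y = y / np.linalg.norm(y, axis=1, keepdims=True)
    logits = x @ y.T / temperature
    idx = np.arange(x.shape[0])
    # Match each x_i against all y_j, and each y_i against all x_j.
    loss_xy = -log_softmax(logits, axis=1)[idx, idx].mean()
    loss_yx = -log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (loss_xy + loss_yx)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy data: two 16-dimensional "modalities" driven by a shared
    # 2-dimensional latent variable (all sizes are illustrative).
    z = rng.normal(size=(128, 2))
    x = z @ rng.normal(size=(2, 16)) + 0.1 * rng.normal(size=(128, 16))
    y = z @ rng.normal(size=(2, 16)) + 0.1 * rng.normal(size=(128, 16))
    for tau in (1.0, 0.1, 0.01):
        print(tau, symmetric_info_nce(x, y, tau))
```

In practice the temperature is usually a trainable parameter optimized jointly with the encoders; the fixed-value sweep above only shows how the loss depends on it.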



