Global Minimizers of Sigmoid Contrastive Loss

Jun-13-2026, 01:06:00 GMT–Neural Information Processing Systems

The meta-task of obtaining and aligning representations through contrastive pretraining has been steadily gaining importance since its introduction in CLIP and ALIGN. In this paper, we theoretically explain the advantages of synchronizing with trainable inverse temperature and bias under the sigmoid loss, as implemented in the recent SigLIP and SigLIP2 models of Google DeepMind. Temperature and bias can drive the loss function to zero for a rich class of configurations that we call $(\mathsf{m}, \mathsf{br})$-Constellations.
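To illustrate how temperature and bias can drive the sigmoid loss toward zero, here is a minimal sketch in plain Python. It assumes the pairwise form used in the published SigLIP formulation, $-\frac{1}{N}\sum_{i,j}\log\sigma\big(z_{ij}(t\langle x_i, y_j\rangle + b)\big)$ with $z_{ij}=+1$ for matching pairs and $-1$ otherwise. The function names, the two-pair toy configuration, and the choice $b = -t/2$ are illustrative assumptions, not taken from the paper.

```python
import math


def softplus(x):
    # Numerically stable log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def sigmoid_loss(xs, ys, t, b):
    """Pairwise sigmoid contrastive loss (assumed SigLIP-style form).

    xs, ys: lists of equal-length embedding vectors; pair (i, i) is a match.
    t: inverse temperature, b: bias (both trainable in practice).
    """
    n = len(xs)
    total = 0.0
    for i, x in enumerate(xs):
        for j, y in enumerate(ys):
            dot = sum(a * c for a, c in zip(x, y))
            z = 1.0 if i == j else -1.0
            # -log(sigmoid(u)) equals softplus(-u).
            total += softplus(-z * (t * dot + b))
    return total / n


# Toy configuration (hypothetical): matching pairs have inner product 1,
# non-matching pairs have inner product 0, so a margin separates them.
xs = [(1.0, 0.0), (0.0, 1.0)]
ys = [(1.0, 0.0), (0.0, 1.0)]

# Place the bias midway between the two inner-product levels and grow t.
losses = [sigmoid_loss(xs, ys, t, -t / 2.0) for t in (1.0, 10.0, 100.0)]
```

In this toy setting every pairwise logit equals $t/2$ after the sign $z_{ij}$ is applied, so the loss is $2\log(1 + e^{-t/2})$ and decreases toward zero as $t$ grows. With a fixed temperature and no bias, the same embeddings would leave the loss bounded away from zero.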

artificial intelligence, machine learning, proceedings

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.60)