Aligning by Misaligning: Boundary-aware Curriculum Learning for Multimodal Alignment
Neural Information Processing Systems
Most multimodal models treat every negative pair alike, ignoring the ambiguous negatives that differ from the positive by only a small detail. We propose Boundary-Aware Curriculum with Local Attention (BACL), a lightweight add-on that turns these borderline cases into a curriculum signal. A Boundary-aware Negative Sampler gradually raises difficulty, while a Contrastive Local Attention loss highlights where the mismatch occurs. The two modules are fully differentiable and work with any off-the-shelf dual encoder. Theory predicts a fast O(1/n) error rate; in practice, BACL achieves up to +32% R@1 over CLIP and sets a new state of the art on four large-scale benchmarks, all without extra labels.
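The abstract does not spell out how the sampler is implemented, so the following is only a minimal sketch of the general idea of raising negative difficulty over training, not the authors' method. It assumes cosine similarities between one anchor and a pool of candidates are already computed. The function names (`curriculum_hardness`, `sample_negatives`, `info_nce`), the linear schedule, the sharpening factor of 10, and the temperature of 0.07 are all hypothetical choices made for illustration. The Contrastive Local Attention loss is not modeled here; a plain InfoNCE loss stands in for it.

```python
import numpy as np


def curriculum_hardness(step, total_steps, start=0.0, end=1.0):
    """Linearly ramp a hardness level from start to end (assumed schedule)."""
    frac = min(max(step / max(total_steps, 1), 0.0), 1.0)
    return start + frac * (end - start)


def sample_negatives(sim_row, pos_idx, hardness, k, rng):
    """Sample k distinct negatives for one anchor.

    hardness = 0 gives uniform sampling; larger values bias sampling toward
    negatives that are most similar to the anchor (the borderline cases).
    """
    neg_idx = np.array([j for j in range(sim_row.shape[0]) if j != pos_idx])
    # The factor 10.0 is an arbitrary sharpening constant (assumption).
    logits = hardness * 10.0 * sim_row[neg_idx]
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    return rng.choice(neg_idx, size=k, replace=False, p=probs)


def info_nce(sim_row, pos_idx, neg_idx, temperature=0.07):
    """InfoNCE loss for one anchor over its positive and sampled negatives."""
    logits = np.concatenate(([sim_row[pos_idx]], sim_row[neg_idx])) / temperature
    logits = logits - logits.max()  # numerical stability
    return float(-(logits[0] - np.log(np.exp(logits).sum())))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy embeddings: 32 image/text pairs, unit-normalised.
    img = rng.normal(size=(32, 16))
    txt = img + 0.5 * rng.normal(size=(32, 16))
    img /= np.linalg.norm(img, axis=1, keepdims=True)
    txt /= np.linalg.norm(txt, axis=1, keepdims=True)
    sim = img @ txt.T  # sim[i, j] = cosine similarity of image i and text j

    for step in (0, 500, 1000):
        h = curriculum_hardness(step, total_steps=1000)
        losses = [
            info_nce(sim[i], i, sample_negatives(sim[i], i, h, k=8, rng=rng))
            for i in range(sim.shape[0])
        ]
        print(f"step={step} hardness={h:.2f} mean_loss={np.mean(losses):.4f}")
```

In this sketch the loss typically grows as `hardness` increases, because the sampled negatives sit closer to the positive; a real training loop would backpropagate through the encoder rather than work on fixed similarities.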
Jun-19-2026, 15:24:26 GMT
- Country:
  - North America > United States (0.28)
  - Europe > Austria (0.28)
- Genre:
  - Research Report > Experimental Study (1.00)
- Industry:
  - Education (0.68)
- Technology:
  - Information Technology
    - Data Science (1.00)
    - Artificial Intelligence
      - Vision (1.00)
      - Representation & Reasoning (1.00)
      - Natural Language (1.00)
      - Machine Learning
        - Neural Networks (0.93)
        - Performance Analysis > Accuracy (0.88)