FALCON: False-Negative Aware Learning of Contrastive Negatives in Vision-Language Alignment

Kim, Myunsoo, Shim, Seong-Woong, Lee, Byung-Jun

Nov-12-2025–arXiv.org Artificial Intelligence

False negatives pose a critical challenge in vision-language pretraining (VLP) due to the many-to-many correspondence between images and texts in large-scale datasets. These false negatives introduce conflicting supervision signals that degrade the learned embedding space and diminish the effectiveness of hard negative sampling. In this paper, we propose FALCON (False-negative Aware Learning of COntrastive Negatives), a learning-based mini-batch construction strategy that adaptively balances the trade-off between hard and false negatives during VLP. Rather than relying on fixed heuristics, FALCON employs a negative mining scheduler that dynamically selects negative samples of appropriate hardness for each anchor instance during mini-batch construction, guided by a proxy for cross-modal alignment improvement. Experimental results demonstrate that FALCON significantly improves performance across three vision-language learning frameworks (ALBEF, BLIP-2, SigLIP-2) and a broad range of downstream tasks and evaluation settings, underscoring its effectiveness and robustness in mitigating the impact of false negatives.

artificial intelligence, false negative, machine learning, (14 more...)

arXiv.org Artificial Intelligence

Nov-12-2025

arXiv.org PDF

Add feedback

Country:
- Europe > Switzerland > Zürich > Zürich (0.14)

Genre:
- Research Report > New Finding (0.88)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)