Vision-Language Models are Strong Noisy Label Detectors
–Neural Information Processing Systems
Recent research on fine-tuning vision-language models has demonstrated impressive performance in various downstream tasks. However, the difficulty of obtaining accurately labeled data in real-world applications poses a significant obstacle to fine-tuning. To address this challenge, this paper presents a Denoising Fine-Tuning framework, called DeFT, for adapting vision-language models.
Dec-26-2025, 06:38:33 GMT