A Unified Debiasing Approach for Vision-Language Models across Modalities and Tasks

Neural Information Processing Systems 

Recent advancements in Vision-Language Models (VLMs) have enabled complex multimodal tasks by processing text and image data simultaneously, significantly enhancing the field of artificial intelligence.