Dual-View Alignment Learning with Hierarchical-Prompt for Class-Imbalance Multi-Label Classification
Huang, Sheng; Yan, Jiexuan; Liu, Beiyan; Liu, Bo; Hong, Richang
arXiv.org Artificial Intelligence
This is especially challenging in Class-Imbalanced Multi-Label Image Classification (CI-MLIC) tasks, where data imbalance and multi-object recognition present significant obstacles. To address these challenges, we propose a novel method termed Dual-View Alignment Learning with Hierarchical Prompt (HP-DVAL), which leverages multi-modal knowledge from vision-language pretrained (VLP) models to mitigate the class-imbalance problem in multi-label settings. Specifically, HP-DVAL employs dual-view alignment learning to transfer the powerful feature representation capabilities from VLP models by extracting complementary features for accurate image-text alignment. To better adapt VLP models for CI-MLIC tasks, we introduce a hierarchical prompt-tuning strategy that utilizes global and local prompts to learn task-specific and context-related prior knowledge. Additionally, we design a semantic consistency loss during prompt tuning to prevent learned prompts from deviating from general knowledge embedded in VLP models. The effectiveness of our approach is validated on two CI-MLIC benchmarks: MS-COCO and VOC2007. Extensive experimental results demonstrate the superiority of our method over SOTA approaches, achieving mAP improvements of 10.0% and 5.2% on the long-tailed multi-label image classification task, and 6.8% and 2.9% on the multi-label few-shot image classification task.
Sep-23-2025
- Country:
- Asia > China
- Anhui Province > Hefei (0.05)
- Chongqing Province > Chongqing (0.05)
- North America > United States
- New Jersey > Middlesex County > New Brunswick (0.04)
- Genre:
- Research Report
- New Finding (0.48)
- Promising Solution (0.48)
- Industry:
- Education (1.00)
- Technology: