Adaptive Cross-Modal Few-shot Learning
Chen Xing, Negar Rostamzadeh, Boris Oreshkin, Pedro O. O. Pinheiro
Neural Information Processing Systems
Metric-based meta-learning techniques have been applied successfully to few-shot classification problems. In this paper, we propose to leverage cross-modal information to enhance metric-based few-shot learning methods. Visual and semantic feature spaces have different structures by definition. For certain concepts, visual features might be richer and more discriminative than text ones, while for others the inverse might be true. Moreover, when the support from visual information is limited in image classification, semantic representations (learned from unsupervised text corpora) can provide strong prior knowledge and context to help learning.
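As a rough illustration of how semantic information could enter a metric-based classifier, the sketch below builds one prototype per class by mixing the mean of the visual support embeddings with a semantic embedding of the class label, then assigns a query to the nearest prototype. Everything here is an assumption made for illustration, not the method described in the paper: the function names, the toy 2-D vectors, the premise that semantic embeddings already live in the same space as the visual ones, and in particular the fixed mixing weight `lam` (the abstract does not say how the two modalities are weighted).

```python
from typing import Dict, List

Vector = List[float]


def mean(vectors: List[Vector]) -> Vector:
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def mix(visual: Vector, semantic: Vector, lam: float) -> Vector:
    """Convex combination: lam * visual + (1 - lam) * semantic (assumed form)."""
    return [lam * v + (1.0 - lam) * s for v, s in zip(visual, semantic)]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def classify(
    query: Vector,
    support: Dict[str, List[Vector]],
    semantic: Dict[str, Vector],
    lam: float,
) -> str:
    """Return the label whose mixed prototype is closest to the query."""
    prototypes = {
        label: mix(mean(support[label]), semantic[label], lam)
        for label in support
    }
    return min(prototypes, key=lambda label: sq_dist(query, prototypes[label]))


# Toy one-shot episode: the single visual example per class is uninformative,
# while the hypothetical semantic embeddings separate the classes cleanly.
support = {"cat": [[0.5, 0.5]], "dog": [[0.6, 0.4]]}
semantic = {"cat": [1.0, 0.0], "dog": [0.0, 1.0]}
query = [0.9, 0.1]

visual_only = classify(query, support, semantic, lam=1.0)
mixed = classify(query, support, semantic, lam=0.5)
```

In this toy episode the visual-only prototypes (`lam=1.0`) send the query to the wrong class, while giving the semantic embeddings equal weight (`lam=0.5`) recovers the intended one. That mirrors the abstract's point that semantic representations can act as a prior when visual support is scarce.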
Jan-27-2025, 09:31:17 GMT