Attention Temperature Matters in ViT-Based Cross-Domain Few-Shot Learning
Yixiong Zou, Ran Ma, Yuhua Li, Ruixuan Li
–Neural Information Processing Systems
Cross-domain few-shot learning (CDFSL) aims to transfer knowledge from large-scale source-domain datasets to downstream target-domain datasets that provide only a few training samples. However, the Vision Transformer (ViT), although a strong backbone that achieves top performance on many tasks, remains under-explored in CDFSL with respect to its transferability across large domain gaps. In this paper, we find an interesting phenomenon of ViT in the CDFSL task: simply multiplying the attention in ViT blocks by a temperature (even one as small as 0) consistently increases target-domain performance, even though the attention map degrades to a uniform map.
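The effect of the temperature on the attention map can be illustrated with a minimal sketch. It assumes the temperature scales the pre-softmax attention scores of a single head, which the abstract does not state explicitly. The function name `attention_with_temperature`, the toy shapes, and the random inputs are hypothetical and are not taken from the paper.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention_with_temperature(q, k, v, temperature=1.0):
    """Single-head scaled dot-product attention with an extra temperature.

    Assumption: the temperature multiplies the pre-softmax scores.
    """
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)            # (n_tokens, n_tokens)
    weights = softmax(temperature * scores)  # each row sums to 1
    return weights @ v, weights

# Toy inputs: 4 tokens with 8-dimensional queries, keys, and values.
rng = np.random.default_rng(0)
n_tokens, dim = 4, 8
q, k, v = (rng.standard_normal((n_tokens, dim)) for _ in range(3))

out_default, w_default = attention_with_temperature(q, k, v, temperature=1.0)
out_zero, w_zero = attention_with_temperature(q, k, v, temperature=0.0)
```

Under this assumption, a temperature of 0 sets every score to zero, so each row of `w_zero` equals `1 / n_tokens` and every output token becomes the mean of the value vectors. This matches the uniform attention map the abstract describes.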
Mar-27-2025, 09:18:56 GMT
- Country:
- Asia > China (0.14)
- Europe > Belgium (0.14)
- North America > United States (0.14)
- Genre:
- Research Report > Experimental Study (0.93)
- Industry:
- Health & Medicine (0.93)
- Technology: