Attention Temperature Matters in ViT-Based Cross-Domain Few-Shot Learning
Yixiong Zou, Ran Ma, Yuhua Li, Ruixuan Li

Mar-27-2025, 09:18:56 GMT–Neural Information Processing Systems

Cross-domain few-shot learning (CDFSL) aims to transfer knowledge from large-scale source-domain datasets to downstream target-domain datasets that have only a few training samples. However, Vision Transformer (ViT), a strong backbone network behind many top results, remains under-explored in the CDFSL task with respect to its transferability across large domain gaps. In this paper, we find an interesting phenomenon of ViT in the CDFSL task: simply multiplying the attention in ViT blocks by a temperature (even one as small as 0) consistently increases target-domain performance, even though the attention map is degraded to a uniform map.
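To make the temperature-0 case concrete, the following is a minimal sketch in plain Python. It is not the authors' implementation. It assumes the temperature multiplies the scaled dot-product logits before the softmax, which the abstract does not spell out; the function names and toy token values are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_map(queries, keys, temperature=1.0):
    """Row-wise softmax of temperature * (Q K^T / sqrt(d)).

    Assumption: the temperature scales the pre-softmax logits.
    """
    d = len(queries[0])
    scale = temperature / math.sqrt(d)
    return [
        softmax([scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys])
        for q in queries
    ]


# Three toy 2-dimensional tokens (hypothetical values).
tokens = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]

sharp = attention_map(tokens, tokens, temperature=1.0)    # ordinary attention
uniform = attention_map(tokens, tokens, temperature=0.0)  # every logit becomes 0
```

With `temperature=0.0`, every logit is zero, so each row of `uniform` assigns weight 1/3 to each of the three tokens: every token simply averages all value vectors, regardless of query-key similarity.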

Country:
- Asia > China (0.14)
- Europe > Belgium (0.14)
- North America > United States (0.14)

Genre:
- Research Report > Experimental Study (0.93)

Industry:
- Health & Medicine (0.93)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning > Neural Networks
    - Deep Learning (0.68)
  - Natural Language (1.00)
  - Vision (1.00)