Intra-Modal Proxy Learning for Zero-Shot Visual Categorization with CLIP
Neural Information Processing Systems
Vision-language pre-training methods such as CLIP demonstrate impressive zero-shot performance on visual categorization, using the text embedding of each class name as the class proxy.
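The class-proxy idea in the sentence above can be illustrated with a minimal sketch. It assumes the image and text embeddings have already been produced by some CLIP-style encoder; the function names and toy vectors are hypothetical, and no real CLIP model or API is used. Each image is assigned to the class whose text proxy has the highest cosine similarity.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so dot products equal cosine similarity."""
    norms = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norms, eps)

def zero_shot_predict(image_embeddings, text_proxies):
    """Assign each image to the class with the most similar text proxy.

    image_embeddings: (n_images, dim) array, assumed precomputed.
    text_proxies:     (n_classes, dim) array, one text embedding per class name.
    """
    img = l2_normalize(np.asarray(image_embeddings, dtype=np.float64))
    txt = l2_normalize(np.asarray(text_proxies, dtype=np.float64))
    similarity = img @ txt.T  # (n_images, n_classes) cosine similarities
    return similarity.argmax(axis=1)

# Toy stand-ins for encoder outputs (hypothetical values, not real CLIP features).
text_proxies = np.array([
    [1.0, 0.0, 0.0],  # proxy for class 0
    [0.0, 1.0, 0.0],  # proxy for class 1
])
image_embeddings = np.array([
    [0.9, 0.1, 0.0],
    [0.2, 0.8, 0.1],
    [2.0, 0.5, 0.0],
])

predictions = zero_shot_predict(image_embeddings, text_proxies)
print(predictions)
```

Normalizing both sides means the prediction depends only on the direction of each embedding, not its magnitude, which is why the third toy image is still matched to class 0 despite its larger norm.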
Feb-11-2026, 18:07:01 GMT
- Country:
- Asia > China
- Zhejiang Province > Hangzhou (0.04)
- Europe > United Kingdom
- England > Cambridgeshire > Cambridge (0.04)
- North America > United States
- Washington > Pierce County > Tacoma (0.04)
- Technology:
- Information Technology > Artificial Intelligence
- Machine Learning > Neural Networks (0.68)
- Natural Language > Large Language Model (1.00)
- Vision (1.00)