Self-Supervised Visual Representation Learning with Semantic Grouping
Neural Information Processing Systems
In this paper, we tackle the problem of learning visual representations from unlabeled scene-centric data. Existing works have demonstrated the potential of utilizing the underlying complex structure within scene-centric data; still, they commonly rely on hand-crafted objectness priors or specialized pretext tasks to build a learning framework, which may harm generalizability. Instead, we propose contrastive learning from data-driven semantic slots, namely SlotCon, for joint semantic grouping and representation learning.
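The abstract names the idea but not its mechanics, so the following is only a minimal sketch of what contrastive learning over data-driven slots could look like, not the authors' implementation. It assumes that pixel features are softly assigned to a small set of prototype vectors, that each slot is the assignment-weighted average of the pixel features, and that matching slots from two views are pulled together with an InfoNCE-style loss. The function names, the temperatures, and the use of fixed prototypes are all hypothetical choices made for illustration.

```python
import math

def softmax(xs, temperature=1.0):
    # Numerically stable softmax over a list of scores.
    scaled = [x / temperature for x in xs]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    # L2-normalize; a zero vector is returned unchanged.
    norm = math.sqrt(dot(v, v)) or 1.0
    return [x / norm for x in v]

def group_into_slots(pixels, prototypes, temperature=0.1):
    """Assumed grouping step: soft-assign each pixel feature to a prototype
    by cosine similarity, then pool pixels into one slot per prototype."""
    protos = [normalize(c) for c in prototypes]
    assign = [softmax([dot(normalize(p), c) for c in protos], temperature)
              for p in pixels]
    dim = len(pixels[0])
    slots = []
    for k in range(len(prototypes)):
        weight = sum(a[k] for a in assign)
        slots.append([
            sum(a[k] * p[d] for a, p in zip(assign, pixels)) / weight
            for d in range(dim)
        ])
    return slots

def slot_contrastive_loss(slots_a, slots_b, temperature=0.2):
    """Assumed InfoNCE-style objective: slot k of view A should match
    slot k of view B more closely than any other slot of view B."""
    a = [normalize(s) for s in slots_a]
    b = [normalize(s) for s in slots_b]
    loss = 0.0
    for k, s in enumerate(a):
        probs = softmax([dot(s, t) for t in b], temperature)
        loss -= math.log(probs[k])
    return loss / len(a)

# Toy usage: two "views" of four pixels with 2-D features (made-up values).
pixels_a = [[1.0, 0.1], [0.9, 0.0], [0.0, 1.0], [0.1, 0.9]]
pixels_b = [[0.95, 0.05], [1.0, 0.1], [0.05, 1.0], [0.0, 0.8]]
prototypes = [[1.0, 0.0], [0.0, 1.0]]

slots_a = group_into_slots(pixels_a, prototypes)
slots_b = group_into_slots(pixels_b, prototypes)
aligned = slot_contrastive_loss(slots_a, slots_b)
swapped = slot_contrastive_loss(slots_a, slots_b[::-1])
```

In this toy setup the loss is lower when slots are compared in matching order than when the second view's slots are reversed, which is the behavior a slot-level contrastive objective is meant to reward. In a trainable system the prototypes and the feature extractor would presumably be learned jointly rather than fixed as they are here.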
Nov-14-2025, 18:21:10 GMT