In-Context Symmetries: Self-Supervised Learning through Contextual World Models
Neural Information Processing Systems
At the core of self-supervised learning for vision is the idea of learning invariant or equivariant representations with respect to a set of data transformations. This approach, however, introduces strong inductive biases, which can render the representations fragile in downstream tasks that do not conform to these symmetries. In this work, drawing insights from world models, we propose to instead learn a general representation that can adapt to be invariant or equivariant to different transformations by paying attention to context -- a memory module that tracks task-specific states, actions, and future states. Here, the action is the transformation, while the current and future states respectively represent the input's representation before and after the transformation.
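The abstract describes the method only at a high level; the sketch below is one illustrative way such a contextual world model could be realized, not the authors' implementation. It treats the context as a token sequence of past (state, action, next-state) triples that a small transformer attends over to predict the post-transformation representation of a query input. All module names, dimensions, and the action-embedding scheme are assumptions.

```python
# Minimal sketch (assumed design, not the paper's code) of a contextual world model:
# a transformer attends over a context of (state, action, next-state) triples and
# predicts the representation of the current state after applying the action.
import torch
import torch.nn as nn

class ContextualWorldModel(nn.Module):
    def __init__(self, dim=256, n_heads=8, n_layers=4):
        super().__init__()
        self.action_proj = nn.Linear(dim, dim)  # embeds the transformation ("action") parameters
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=n_heads, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.predictor = nn.Linear(dim, dim)

    def forward(self, z, a, ctx):
        # z:   (B, dim)        current state: representation before the transformation
        # a:   (B, dim)        action: embedding of the transformation (e.g., a rotation)
        # ctx: (B, T, 3, dim)  context memory: T past (state, action, next-state) triples
        B, T = ctx.shape[:2]
        ctx_tokens = ctx.reshape(B, T * 3, -1)               # flatten triples into a token sequence
        query = torch.stack([z, self.action_proj(a)], dim=1) # (B, 2, dim) query tokens
        tokens = torch.cat([ctx_tokens, query], dim=1)
        h = self.transformer(tokens)
        return self.predictor(h[:, -1])                      # predicted future state z'

# Example usage: a batch of 4 queries, each with a context of 6 past triples.
model = ContextualWorldModel()
z, a = torch.randn(4, 256), torch.randn(4, 256)
ctx = torch.randn(4, 6, 3, 256)
z_next = model(z, a, ctx)  # (4, 256): predicted representation after the transformation
```

Under these assumptions, training would regress the predicted future state against the encoder's representation of the transformed input; supplying different task-specific contexts is then what lets one general representation behave invariantly or equivariantly to a given transformation.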