Learning Representations from Audio-Visual Spatial Alignment
Pedro Morgado, Yi Li
Neural Information Processing Systems
While these approaches learn high-quality representations for downstream tasks such as action recognition, their training objectives disregard spatial cues naturally occurring in audio and visual signals.
Oct-2-2025, 15:04:22 GMT
- Country:
  - North America
    - Canada (0.04)
    - United States > California > San Diego County > San Diego (0.04)
- Industry:
  - Media (0.68)
- Technology: