AVLEN: Audio-Visual-Language Embodied Navigation in 3D Environments
–Neural Information Processing Systems
Similar to audio-visual navigation tasks, the goal of our embodied agent is to localize an audio event via navigating the 3D visual world; however, the agent may also seek help from a human (oracle), where the assistance is provided in free-form natural language.
Feb-8-2026, 00:48:32 GMT
- Country:
- Asia > China
- Hong Kong (0.04)
- North America > United States
- Massachusetts > Middlesex County > Cambridge (0.04)
- Technology: