DeepASA: An Object-Oriented Multi-Purpose Network for Auditory Scene Analysis
Neural Information Processing Systems
We propose DeepASA, a multi-purpose model for auditory scene analysis that performs multi-input multi-output (MIMO) source separation, dereverberation, sound event detection (SED), audio classification, and direction-of-arrival estimation (DoAE) within a unified framework. DeepASA is designed for complex auditory scenes where multiple, often similar, sound sources overlap in time and move dynamically in space. To achieve robust and consistent inference across tasks, we introduce an object-oriented processing (OOP) strategy.
Jun-14-2026, 07:51:47 GMT