Systematic Evaluation of Time-Frequency Features for Binaural Sound Source Localization
Panah, Davoud Shariat, Ragano, Alessandro, Barry, Dan, Skoglund, Jan, Hines, Andrew
arXiv.org Artificial Intelligence
Abstract: This study presents a systematic evaluation of time-frequency feature design for binaural sound source localization (SSL), focusing on how feature selection influences model performance across diverse conditions. We investigate the performance of a convolutional neural network (CNN) model using various combinations of amplitude-based features (magnitude spectrogram; interaural level difference, ILD) and phase-based features (phase spectrogram; interaural phase difference, IPD). Evaluations on in-domain and out-of-domain data with mismatched head-related transfer functions (HRTFs) reveal that carefully chosen feature combinations often outperform increases in model complexity. While two-feature sets such as ILD + IPD are sufficient for in-domain SSL, generalization to diverse content requires richer inputs combining channel spectrograms with both ILD and IPD. Using the optimal feature sets, our low-complexity CNN model achieves competitive performance. Our findings underscore the importance of feature design in binaural SSL and provide practical guidance for both domain-specific and general-purpose localization.
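As an illustration of the four feature types named in the abstract, the sketch below computes per-channel magnitude and phase spectrograms together with ILD and IPD from a two-channel signal. It is a minimal sketch under stated assumptions, not the authors' pipeline: the function names `stft` and `binaural_features`, the FFT size, the hop length, the Hann window, and the `eps` floor are all hypothetical choices, since the abstract does not specify them. ILD is taken here as the left/right magnitude ratio in dB and IPD as the wrapped left-minus-right phase difference, which are common definitions but may differ from the paper's.

```python
import numpy as np


def stft(x, n_fft=512, hop=256):
    """Naive Hann-windowed STFT (assumed parameters).

    Returns a complex array of shape (n_frames, n_fft // 2 + 1).
    """
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack(
        [x[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1)


def binaural_features(left, right, eps=1e-8):
    """Compute amplitude- and phase-based features for one binaural clip."""
    spec_l, spec_r = stft(left), stft(right)
    mag_l, mag_r = np.abs(spec_l), np.abs(spec_r)
    # ILD in dB; eps (an assumption) guards against division by zero.
    ild = 20.0 * np.log10((mag_l + eps) / (mag_r + eps))
    # IPD as the phase of the cross-spectrum, wrapped to (-pi, pi].
    ipd = np.angle(spec_l * np.conj(spec_r))
    return {
        "mag_l": mag_l,
        "mag_r": mag_r,
        "phase_l": np.angle(spec_l),
        "phase_r": np.angle(spec_r),
        "ild": ild,
        "ipd": ipd,
    }


# Usage: a synthetic clip whose right channel is the left channel at half level.
rng = np.random.default_rng(0)
left = rng.standard_normal(4096)
right = 0.5 * left
feats = binaural_features(left, right)
```

A feature set such as ILD + IPD could then be formed by stacking the chosen arrays along a channel axis (for example with `np.stack`) before passing them to a CNN; how the paper arranges its inputs is not stated in the abstract.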
Nov-19-2025
- Country:
  - Europe > Ireland
    - Leinster > County Dublin > Dublin (0.04)
  - North America
    - Canada > Quebec
      - Montreal (0.04)
    - United States > California
      - San Francisco County > San Francisco (0.14)
- Genre:
  - Research Report > New Finding (1.00)
- Technology: