Broadband DOA estimation using convolutional neural networks trained with noise signals
Chakrabarty, Soumitro; Habets, Emanuël A. P.
ABSTRACT

A convolutional neural network (CNN) based classification method for broadband DOA estimation is proposed, where the phase component of the short-time Fourier transform coefficients of the received microphone signals is fed directly into the CNN and the features required for DOA estimation are learned during training. Since only the phase component of the input is used, the CNN can be trained with synthesized noise signals, which makes the training data set easier to prepare than one based on speech signals. Through experimental evaluation, the ability of the proposed noise-trained CNN framework to generalize to speech sources is demonstrated. In addition, the robustness of the system to noise and to small perturbations in microphone positions, as well as its ability to adapt to different acoustic conditions, is investigated using experiments with simulated and real data.

Index Terms: source localization, convolutional neural networks, supervised learning, DOA estimation

1. INTRODUCTION

Many applications such as hands-free communication, teleconferencing, and distant speech recognition require information on the location of a sound source in the acoustic environment.
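The input representation described in the abstract, the phase of the STFT coefficients of each microphone signal computed from synthesized noise, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the helper names `stft_phase` and `phase_maps`, the Hann window, the frame length and hop size, the four-microphone setup, the integer sample delay standing in for sound propagation, and the choice to arrange the phases as one microphones-by-frequency-bins map per time frame. The DFT is written out naively to keep the sketch free of dependencies; an FFT routine would normally be used instead.

```python
import cmath
import math
import random


def stft_phase(signal, frame_len=64, hop=32):
    """Return the STFT phase in radians: one list of bin phases per time frame."""
    # Hann window (an assumed, common choice) to reduce spectral leakage.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    n_bins = frame_len // 2 + 1  # keep the non-negative frequencies only
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        seg = [signal[start + n] * window[n] for n in range(frame_len)]
        phases = []
        for k in range(n_bins):
            # Naive DFT of bin k; the magnitude is discarded, the phase is kept.
            coeff = sum(seg[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                        for n in range(frame_len))
            phases.append(cmath.phase(coeff))
        frames.append(phases)
    return frames


def phase_maps(mic_signals, frame_len=64, hop=32):
    """Build one (microphones x frequency bins) phase map per time frame."""
    per_mic = [stft_phase(s, frame_len, hop) for s in mic_signals]
    n_frames = len(per_mic[0])
    return [[mic[t] for mic in per_mic] for t in range(n_frames)]


# Toy training-style input: white Gaussian noise observed by four microphones,
# each offset by a hypothetical one-sample delay relative to its neighbor.
random.seed(0)
n_mics, n_samples, delay = 4, 256, 1
source = [random.gauss(0.0, 1.0) for _ in range(n_samples + n_mics * delay)]
mics = [source[m * delay: m * delay + n_samples] for m in range(n_mics)]

maps = phase_maps(mics)
print(len(maps), len(maps[0]), len(maps[0][0]))  # frames, microphones, bins
```

Each entry of `maps` is a small two-dimensional array of phase values that a classifier could take as input. Because the magnitude is discarded, the spectral shape of the source signal does not enter the representation, which is the property the abstract relies on when training with noise instead of speech.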
Dec-12-2017