Goto

Collaborating Authors

 interference source


Binaural Angular Separation Network

arXiv.org Artificial Intelligence

We propose a neural network model that can separate target speech sources from interfering sources at different angular regions using two microphones. The model is trained with simulated room impulse responses (RIRs) using omni-directional microphones without needing to collect real RIRs. By relying on specific angular regions and multiple room simulations, the model utilizes consistent time difference of arrival (TDOA) cues, or what we call delay contrast, to separate target and interference sources while remaining robust in various reverberation environments. We demonstrate the model is not only generalizable to a commercially available device with a slightly different microphone geometry, but also outperforms our previous work which uses one additional microphone on the same device. The model runs in real-time on-device and is suitable for low-latency streaming applications such as telephony and video conferencing.


Stimulus Evoked Independent Factor Analysis of MEG Data with Large Background Activity

Neural Information Processing Systems

This paper presents a novel technique for analyzing electromagnetic imaging data obtained using the stimulus evoked experimental paradigm. The technique is based on a probabilistic graphical model, which describes the data in terms of underlying evoked and interference sources, and explicitly models the stimulus evoked paradigm. The new algorithm outperforms existing techniques on two real datasets, as well as on simulated data.


A DNN based Normalized Time-frequency Weighted Criterion for Robust Wideband DoA Estimation

arXiv.org Artificial Intelligence

Deep neural networks (DNNs) have greatly benefited direction of arrival (DoA) estimation methods for speech source localization in noisy environments. However, their localization accuracy is still far from satisfactory due to the vulnerability to nonspeech interference. To improve the robustness against interference, we propose a DNN based normalized time-frequency (T-F) weighted criterion which minimizes the distance between the candidate steering vectors and the filtered snapshots in the T-F domain. Our method requires no eigendecomposition and uses a simple normalization to prevent the optimization objective from being misled by noisy filtered snapshots. We also study different designs of T-F weights guided by a DNN. We find that duplicating the Hadamard product of speech ratio masks is highly effective and better than other techniques such as direct masking and taking the mean in the proposed approach. However, the best-performing design of T-F weights is criterion-dependent in general. Experiments show that the proposed method outperforms popular DNN based DoA estimation methods including widely used subspace methods in noisy and reverberant environments.


Stimulus Evoked Independent Factor Analysis of MEG Data with Large Background Activity

Neural Information Processing Systems

This paper presents a novel technique for analyzing electromagnetic imaging data obtained using the stimulus evoked experimental paradigm. The technique is based on a probabilistic graphical model, which describes the data in terms of underlying evoked and interference sources, and explicitly models the stimulus evoked paradigm.


Stimulus Evoked Independent Factor Analysis of MEG Data with Large Background Activity

Neural Information Processing Systems

This paper presents a novel technique for analyzing electromagnetic imaging data obtained using the stimulus evoked experimental paradigm. The technique is based on a probabilistic graphical model, which describes thedata in terms of underlying evoked and interference sources, and explicitly models the stimulus evoked paradigm.