Acero, Alex
DEXTER: Deep Encoding of External Knowledge for Named Entity Recognition in Virtual Assistants
Muralidharan, Deepak, Moniz, Joel Ruben Antony, Zhang, Weicheng, Pulman, Stephen, Li, Lin, Barnes, Megan, Pan, Jingjing, Williams, Jason, Acero, Alex
Named entity recognition (NER) is usually developed and tested on text from well-written sources. However, in intelligent voice assistants, where NER is an important component, input to NER may be noisy because of user or speech recognition error. In applications, entity labels may change frequently, and non-textual properties like topicality or popularity may be needed to choose among alternatives. We describe a NER system intended to address these problems. We test and train this system on a proprietary user-derived dataset. We compare with a baseline text-only NER system; the baseline enhanced with external gazetteers; and the baseline enhanced with the search and indirect labelling techniques we describe below. The final configuration gives around 6% reduction in NER error rate. We also show that this technique improves related tasks, such as semantic parsing, with an improvement of up to 5% in error rate.
ALGONQUIN - Learning Dynamic Noise Models From Noisy Speech for Robust Speech Recognition
Frey, Brendan J., Kristjansson, Trausti T., Deng, Li, Acero, Alex
A challenging, unsolved problem in the speech recognition community is recognizing speech signals that are corrupted by loud, highly nonstationary noise. One approach to noisy speech recognition is to automatically remove the noise from the cepstrum sequence before feeding it into a clean speech recognizer. In previous work published in Eurospeech, we showed how a probability model trained on clean speech and a separate probability model trained on noise could be combined for the purpose of estimating the noise-free speech from the noisy speech. We showed how an iterative second-order vector Taylor series approximation could be used for probabilistic inference in this model. In many circumstances, it is not possible to obtain examples of noise without speech.
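The abstract does not state the model that the Taylor series is applied to. As a rough illustration only, the sketch below assumes the log-spectral mixing approximation commonly used in this line of work, y ≈ x + log(1 + exp(n − x)), where x, n, and y are the clean-speech, noise, and noisy-speech log spectra in one frequency bin. It shows a first-order expansion around a point (x0, n0), which is simpler than the iterative second-order scheme the abstract mentions. The function names `mix`, `linearize`, and `approx` are hypothetical and are not taken from the paper.

```python
import math


def mix(x, n):
    """Assumed mixing model for one frequency bin: y = x + log(1 + exp(n - x)).

    Written as max(x, n) + log1p(exp(-|x - n|)), which is algebraically
    the same but avoids overflow when x and n are far apart.
    """
    return max(x, n) + math.log1p(math.exp(-abs(x - n)))


def linearize(x0, n0):
    """First-order Taylor expansion of mix() around (x0, n0).

    Returns (y0, dy_dx, dy_dn). The partial derivatives are
    dy/dn = sigmoid(n0 - x0) and dy/dx = 1 - dy/dn.
    """
    d = n0 - x0
    if d >= 0:
        dy_dn = 1.0 / (1.0 + math.exp(-d))
    else:
        e = math.exp(d)
        dy_dn = e / (1.0 + e)
    return mix(x0, n0), 1.0 - dy_dn, dy_dn


def approx(x, n, x0, n0):
    """Evaluate the linearized model at (x, n), expanded around (x0, n0)."""
    y0, dy_dx, dy_dn = linearize(x0, n0)
    return y0 + dy_dx * (x - x0) + dy_dn * (n - n0)
```

Because the linearized model is affine in x and n, Gaussian priors on speech and noise give a Gaussian approximation to the posterior. An iterative scheme would then re-expand around the updated estimate and repeat.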
Speech Denoising and Dereverberation Using Probabilistic Models
Attias, Hagai, Platt, John C., Acero, Alex, Deng, Li
This paper presents a unified probabilistic framework for denoising and dereverberation of speech signals. The framework transforms the denoising and dereverberation problems into Bayes-optimal signal estimation. The key idea is to use a strong speech model that is pre-trained on a large data set of clean speech. Computational efficiency is achieved by using variational EM, working in the frequency domain, and employing conjugate priors. The framework covers both single and multiple microphones. We apply this approach to noisy reverberant speech signals and get results substantially better than standard methods.