An Asynchronous Hidden Markov Model for Audio-Visual Speech Recognition

Bengio, Samy

Neural Information Processing Systems 

This paper presents a novel Hidden Markov Model architecture to model the joint probability of pairs of asynchronous sequences describing the same event. It is based on two other Markovian models, namely Asynchronous Input/Output Hidden Markov Models and Pair Hidden Markov Models. An EM algorithm to train the model is presented, as well as a Viterbi decoder that can be used to obtain the optimal state sequence and the alignment between the two sequences. The model has been tested on an audio-visual speech recognition task using the M2VTS database and yielded robust performance under various noise conditions.

1 Introduction

Hidden Markov Models (HMMs) are statistical tools that have been used successfully over the last 30 years to model difficult tasks such as speech recognition [6] or biological sequence analysis [4]. They are very well suited to handling discrete or continuous sequences of varying sizes.
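
As background for the Viterbi decoder mentioned above, the following is a minimal sketch of Viterbi decoding for a standard, single-stream, discrete-observation HMM. It is not the asynchronous model of this paper, which additionally aligns two sequences. The function name `viterbi` and the toy parameters `start_p`, `trans_p`, and `emit_p` are assumptions chosen for illustration only.

```python
import math


def _log(x):
    # Log that maps zero probability to -inf instead of raising.
    return math.log(x) if x > 0 else float("-inf")


def viterbi(obs, start_p, trans_p, emit_p):
    """Most likely state path of a standard discrete HMM (illustrative sketch).

    obs      -- non-empty list of observation symbol indices
    start_p  -- start_p[s]: probability of starting in state s
    trans_p  -- trans_p[r][s]: probability of moving from state r to state s
    emit_p   -- emit_p[s][o]: probability of emitting symbol o in state s
    Returns (log-probability of the best path, list of state indices).
    """
    n_states = len(start_p)
    # delta[s]: best log-probability of any path ending in state s so far.
    delta = [_log(start_p[s]) + _log(emit_p[s][obs[0]]) for s in range(n_states)]
    backptrs = []
    for o in obs[1:]:
        new_delta, ptr = [], []
        for s in range(n_states):
            # Best predecessor of state s at this time step.
            best_prev = max(range(n_states),
                            key=lambda r: delta[r] + _log(trans_p[r][s]))
            ptr.append(best_prev)
            new_delta.append(delta[best_prev]
                             + _log(trans_p[best_prev][s])
                             + _log(emit_p[s][o]))
        delta = new_delta
        backptrs.append(ptr)
    # Backtrack from the best final state.
    last = max(range(n_states), key=lambda s: delta[s])
    path = [last]
    for ptr in reversed(backptrs):
        path.append(ptr[path[-1]])
    path.reverse()
    return delta[last], path


# Hypothetical two-state, two-symbol model (values are made up for the example).
start_p = [0.6, 0.4]
trans_p = [[0.7, 0.3], [0.4, 0.6]]
emit_p = [[0.9, 0.1], [0.2, 0.8]]
log_prob, path = viterbi([0, 0, 1], start_p, trans_p, emit_p)
```

Because the recursion only loops over the observations it is given, the same function decodes sequences of any length, which is the property of HMMs the introduction points to.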
