Visual Speech Recognition with Stochastic Networks
–Neural Information Processing Systems
This paper presents ongoing work on a speaker independent visual speech recognition system. The work presented here builds on previous research efforts in this area and explores the potential use of simple hidden Markov models for limited vocabulary, speaker independent visual speech recognition. The task at hand is recognition of the first four English digits, a task with possible applications in car-phone dialing. The images were modeled as mixtures of independent Gaussian distributions, and the temporal dependencies were captured with standard left-to-right hidden Markov models. The results indicate that simple hidden Markov models may be used to successfully recognize relatively unprocessed image sequences.
Neural Information Processing Systems
Dec-31-1995