A Hybrid Neural Net System for State-of-the-Art Continuous Speech Recognition
Zavaliagkos, G., Zhao, Y., Schwartz, R., Makhoul, J.
–Neural Information Processing Systems
Untill recently, state-of-the-art, large-vocabulary, continuous speech recognition (CSR) has employed Hidden Markov Modeling (HMM) to model speech sounds. In an attempt to improve over HMM we developed a hybrid system that integrates HMM technology with neural networks. We present the concept of a "Segmental Neural Net" (SNN) for phonetic modeling in CSR. By taking into account all the frames of a phonetic segment simultaneously, the SNN overcomes the well-known conditional-independence limitation of HMMs. In several speaker-independent experiments with the DARPA Resource Management corpus, the hybrid system showed a consistent improvement in performance over the baseline HMM system. 1 INTRODUCTION The current state of the art in continuous speech recognition (CSR) is based on the use of hidden Markov models (HMM) to model phonemes in context.
Neural Information Processing Systems
Dec-31-1993