Speaker Independent Speech Recognition with Neural Networks and Speech Knowledge

Yoshua Bengio, Renato De Mori, Régis Cardin

Neural Information Processing Systems 

Yoshua Bengio, Renato De Mori, Régis Cardin
Dept Computer Science, McGill University, Montreal, Canada H3A2A7

ABSTRACT

We attempt to combine neural networks with knowledge from speech science to build a speaker independent speech recognition system. This knowledge is utilized in designing the preprocessing, input coding, output coding, output supervision and architectural constraints. To handle the temporal aspect of speech we combine delays, copies of activations of hidden and output units at the input level, and Back-Propagation for Sequences (BPS), a learning algorithm for networks with local self-loops. This strategy is demonstrated in several experiments, in particular a nasal discrimination task for which the application of a speech theory hypothesis dramatically improved generalization.

1 INTRODUCTION

The strategy put forward in this research effort is to combine the flexibility and learning abilities of neural networks with as much knowledge from speech science as possible in order to build a speaker independent automatic speech recognition system. This knowledge is utilized in each of the steps in the construction of an automated speech recognition system: preprocessing, input coding, output coding, output supervision, architectural design.

[...] Fast Fourier Transform (FFT), or compressing the frame sequence in such a way as to conserve an approximately constant rate of change.
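To make the temporal mechanisms named in the abstract more concrete, the following is a minimal sketch of a forward pass through a layer whose units see delayed input frames and also feed their own previous activation back through a local self-loop. It is an illustration under stated assumptions, not the authors' BPS algorithm: the function name `forward_self_loop`, the sigmoid activation, the weight layout and the choice of delays are all hypothetical, and no learning rule is shown.

```python
import math


def forward_self_loop(frames, w_in, w_self, bias, delays=(0, 1)):
    """Forward pass for units with delayed inputs and local self-loops.

    Assumed (hypothetical) layout, not taken from the paper:
      frames[t][j]  : feature j of the input frame at time t
      w_in[d][i][j] : weight from feature j at delay delays[d] to unit i
      w_self[i]     : weight from unit i's previous activation to itself
      bias[i]       : bias of unit i
    Returns a list with one activation vector per time step.
    """
    n_units = len(w_self)
    prev = [0.0] * n_units  # activations at t-1, zero before the sequence starts
    outputs = []
    for t in range(len(frames)):
        current = []
        for i in range(n_units):
            # Local self-loop: each unit only sees its own past activation.
            total = bias[i] + w_self[i] * prev[i]
            # Delayed connections: add contributions from earlier frames.
            for d, delay in enumerate(delays):
                if t - delay < 0:
                    continue  # no frame available that far back
                frame = frames[t - delay]
                total += sum(w * x for w, x in zip(w_in[d][i], frame))
            current.append(1.0 / (1.0 + math.exp(-total)))  # sigmoid (assumed)
        outputs.append(current)
        prev = current
    return outputs


# Toy usage: three 2-feature frames, one unit, delays of 0 and 1 frames.
frames = [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5]]
w_in = [[[0.5, -0.5]], [[0.25, 0.25]]]
activations = forward_self_loop(frames, w_in, w_self=[0.8], bias=[0.0])
```

Because each unit's recurrence touches only its own previous value, the state carried across time is one scalar per unit, which is what keeps a self-loop "local" in this sketch; a fully recurrent layer would instead mix all previous activations.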
