Lipreading by neural networks: Visual preprocessing, learning, and sensory integration

Wolff, Gregory J.; Prasad, K. Venkatesh; Stork, David G.; Hennecke, Marcus

Neural Information Processing Systems 

Automated speech recognition is notoriously hard, so any predictive source of information or constraints that could be incorporated into a computer speech recognition system would be desirable. Humans, especially the hearing impaired, can use visual information ("speech reading") to improve accuracy (Dodd & Campbell, 1987; Sanders & Goodrich, 1971). Speech reading can provide direct information about segments, phonemes, rate, and speaker gender and identity, as well as subtle information for segmenting speech from background noise or from multiple speakers (De Filippo & Sims, 1988; Green & Miller, 1985). Fundamental support for the use of visual information comes from the complementary nature of the visual and acoustic speech signals: utterances that are difficult to distinguish acoustically are the easiest to distinguish visually.
