
Collaborating Authors

 Roginski, Krist


English Alphabet Recognition with Telephone Speech

Neural Information Processing Systems

The English alphabet is difficult to recognize automatically because many letters sound alike; e.g., B/D, P/T, V/Z and F/S. When spoken over the telephone, the information needed to discriminate among several of these pairs, such as F/S, P/T, B/D and V/Z, is further reduced due to the limited bandwidth of the channel. Speaker-independent recognition of spelled names over the telephone is difficult due to variability caused by channel distortions, different handsets, and a variety of background noises. Finally, when dealing with a large population of speakers, dialect and foreign accents alter letter pronunciations. An R from a Boston speaker may not contain an [r]. Human classification performance on telephone speech underscores the difficulty of the problem.



Mark Fanty, Ronald A. Cole and Krist Roginski
Center for Spoken Language Understanding, Oregon Graduate Institute of Science and Technology, 19600 N.W. Von Neumann Dr., Beaverton, OR 97006

Abstract

A recognition system is reported which recognizes names spelled over the telephone with brief pauses between letters. The system uses separate neural networks to locate segment boundaries and classify letters. The letter scores are then used to search a database of names to find the best scoring name. The speaker-independent classification rate for spoken letters is 89%. The system retrieves the correct name, spelled with pauses between letters, 91% of the time from a database of 50,000 names.

1 INTRODUCTION

The English alphabet is difficult to recognize automatically because many letters sound alike; e.g., B/D, P/T, V/Z and F/S.
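The retrieval step mentioned in the abstract, where letter scores are used to find the best scoring name, can be illustrated with a small sketch. The text here does not specify the scoring rule or search procedure, so everything below is an assumption made for illustration: the names `name_score`, `best_name`, and `FLOOR` are hypothetical, each spoken letter is assumed to come with a table of per-letter scores in (0, 1], and a name's score is taken to be the sum of log scores over its letters.

```python
import math

# Assumed floor: a letter missing from a score table is heavily penalized
# instead of making the logarithm undefined.
FLOOR = 1e-6


def name_score(name, letter_scores):
    """Sum of log scores for spelling `name`, one score table per spoken letter.

    Names whose length differs from the number of spoken letters are ruled out.
    """
    if len(name) != len(letter_scores):
        return -math.inf
    return sum(
        math.log(max(scores.get(ch, 0.0), FLOOR))
        for ch, scores in zip(name, letter_scores)
    )


def best_name(names, letter_scores):
    """Return the database name with the highest total score."""
    return max(names, key=lambda name: name_score(name, letter_scores))


# Toy data: three spoken letters with confusable B/D scores.
letter_scores = [
    {"B": 0.5, "D": 0.4, "P": 0.1},
    {"O": 0.9, "A": 0.1},
    {"B": 0.45, "D": 0.55},
]
names = ["BOB", "DOD", "PAT", "BOBBY"]

print(best_name(names, letter_scores))
```

In this toy case the top-scoring letter at each position spells "BOD", which is not in the list, yet the search still returns a valid name. This is one plausible reason that retrieval from a name database could be more accurate than letter classification alone, although the sketch makes no claim about how the reported system achieves its figures.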