Audio-Visual Sound Separation Via Hidden Markov Models

Hershey, John R., Casey, Michael

Neural Information Processing Systems 

It is well known that under noisy conditions we can hear speech much more clearly when we read the speaker's lips. This suggests the utility of audiovisual information for the task of speech enhancement. We propose a method to exploit audiovisual cues to enable speech separation under non-stationary noise and with a single microphone. We revise and extend HMM-based speech enhancement techniques, in which signal and noise models are factorially combined, to incorporate visual lip information, and employ novel signal HMMs in which the dynamics of narrow-band and wide-band components are factorial. We avoid the combinatorial explosion in the factorial model by using a simple approximate inference technique to quickly estimate the clean signals in a mixture. We present a preliminary evaluation of this approach using a small-vocabulary audiovisual database, showing promising improvements in machine intelligibility for speech enhanced using audio and visual information.
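To make the factorial combination concrete, the following is a minimal sketch of the core idea behind HMM-based enhancement: a speech HMM and a noise HMM evolve independently, and the observed mixture is explained by their joint state. All state counts, transition probabilities, means, and the max approximation to the log-spectral sum are illustrative assumptions, not values or choices taken from the paper; the sketch also uses exact Viterbi over the joint state space, whereas the paper's contribution is an approximate inference scheme that avoids enumerating that product space.

```python
import itertools
import math

# Toy factorial HMM for single-channel separation: a 2-state "speech"
# HMM and a 2-state "noise" HMM. All numbers here are illustrative.
A_speech = [[0.9, 0.1], [0.2, 0.8]]   # speech-state transition matrix
A_noise  = [[0.7, 0.3], [0.3, 0.7]]   # noise-state transition matrix
mu_speech = [1.0, 4.0]                # per-state log-power of speech
mu_noise  = [0.5, 1.5]                # per-state log-power of noise

STATES = list(itertools.product(range(2), range(2)))  # joint (s, n) states

def obs_loglik(y, s, n, var=0.5):
    """Max approximation: model the mixture's log-spectrum as the louder
    source's log-spectrum (a common simplification of the log-sum)."""
    mu = max(mu_speech[s], mu_noise[n])
    return -0.5 * (y - mu) ** 2 / var

def viterbi(ys):
    """Exact Viterbi over the joint state space; feasible here only
    because the product space is tiny (2 x 2 states)."""
    delta = {q: math.log(1.0 / len(STATES)) + obs_loglik(ys[0], *q)
             for q in STATES}
    back = []
    for y in ys[1:]:
        new_delta, ptr = {}, {}
        for s2, n2 in STATES:
            def score(q):
                # Factorial transition: speech and noise move independently.
                return (delta[q]
                        + math.log(A_speech[q[0]][s2])
                        + math.log(A_noise[q[1]][n2]))
            best = max(STATES, key=score)
            ptr[(s2, n2)] = best
            new_delta[(s2, n2)] = score(best) + obs_loglik(y, s2, n2)
        delta = new_delta
        back.append(ptr)
    # Backtrack the best joint path, then read off the speech states.
    q = max(STATES, key=delta.get)
    path = [q]
    for ptr in reversed(back):
        q = ptr[q]
        path.append(q)
    path.reverse()
    return [s for s, _ in path]

# Two quiet frames followed by two loud frames: the inferred speech
# state sequence tracks the change even though only the mixture is seen.
speech_states = viterbi([1.0, 1.0, 4.0, 4.0])
print(speech_states)  # → [0, 0, 1, 1]
```

Once the speech state sequence is inferred, a clean-speech estimate can be read off from the per-state speech model, which is the sense in which inference over the factorial model yields separation.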
