Lumi\`ereNet: Lecture Video Synthesis from Audio

Kim, Byung-Hak, Ganapathi, Varun

arXiv.org Machine Learning 

We present Lumi\`ereNet, a simple, modular, and completely deep-learning based architecture that synthesizes, high quality, full-pose headshot lecture videos from instructor's new audio narration of any length. Unlike prior works, Lumi\`ereNet is entirely composed of trainable neural network modules to learn mapping functions from the audio to video through (intermediate) estimated pose-based compact and abstract latent codes. Our video demos are available at [22] and [23].

Duplicate Docs Excel Report

Title
None found

Similar Docs  Excel Report  more

TitleSimilaritySource
None found