Reviews: Video-to-Video Synthesis

Neural Information Processing Systems 

This paper focuses on video-to-video synthesis, i.e. given a real video, the goal is to learn a model that outputs a new photorealistic and temporally consistent video with (ideally) the same data distribution, preserving the content and style of the source video. Existing image-to-image methods produce photorealistic images, but they do not account for the temporal dimension, which results in high-frequency artifacts across time. This work builds on existing image-to-image methods and mainly extends them along the temporal dimension to ensure temporal coherence. By employing conditional GANs, the method provides high-level control over the output. Although the theoretical background and components are taken from past work, a significant amount of effort has gone into putting them together and adding the temporal extension.