Unsupervised Learning of Disentangled Representations from Video

Emily Denton, Vighnesh Birodkar

arXiv.org Machine Learning 

We present a new model, DrNet, that learns disentangled image representations from video. Our approach leverages the temporal coherence of video and a novel adversarial loss to learn a representation that factorizes each frame into a stationary part and a temporally varying component. The disentangled representation can be used for a range of tasks. For example, applying a standard LSTM to the time-varying components enables prediction of future frames. We evaluate our approach on a range of synthetic and real videos, demonstrating the ability to coherently generate hundreds of steps into the future.
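To make the factorization concrete, below is a minimal PyTorch sketch of the scheme the abstract describes: each frame is encoded into a stationary "content" vector and a time-varying "pose" vector, a decoder reconstructs a frame from the two, and an LSTM run over past pose vectors predicts future frames. All module names, dimensions, and losses here are illustrative assumptions, not the authors' implementation; in particular, the paper's adversarial loss on the pose features is omitted.

```python
# Sketch of DrNet-style disentanglement (hypothetical architecture/sizes).
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Conv encoder mapping a 64x64 RGB frame to a latent vector."""
    def __init__(self, latent_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 4, 2, 1), nn.ReLU(),    # 64 -> 32
            nn.Conv2d(32, 64, 4, 2, 1), nn.ReLU(),   # 32 -> 16
            nn.Conv2d(64, 128, 4, 2, 1), nn.ReLU(),  # 16 -> 8
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(128, latent_dim),
        )

    def forward(self, x):
        return self.net(x)

class Decoder(nn.Module):
    """Reconstructs a frame from concatenated content and pose vectors."""
    def __init__(self, content_dim, pose_dim):
        super().__init__()
        self.fc = nn.Linear(content_dim + pose_dim, 128 * 8 * 8)
        self.net = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.ReLU(),   # 8 -> 16
            nn.ConvTranspose2d(64, 32, 4, 2, 1), nn.ReLU(),    # 16 -> 32
            nn.ConvTranspose2d(32, 3, 4, 2, 1), nn.Sigmoid(),  # 32 -> 64
        )

    def forward(self, content, pose):
        h = self.fc(torch.cat([content, pose], dim=1))
        return self.net(h.view(-1, 128, 8, 8))

content_enc, pose_enc = Encoder(128), Encoder(16)
decoder = Decoder(content_dim=128, pose_dim=16)

# Two frames from the same clip: content is shared, pose varies over time.
frame_t = torch.rand(8, 3, 64, 64)
frame_tk = torch.rand(8, 3, 64, 64)

# Reconstruct frame t+k from the content of frame t and the pose of
# frame t+k, so the content vector cannot encode the varying factors.
recon = decoder(content_enc(frame_t), pose_enc(frame_tk))
rec_loss = nn.functional.mse_loss(recon, frame_tk)

# Future prediction: run a standard LSTM over past pose vectors and
# decode the predicted pose with a fixed content vector.
pose_lstm = nn.LSTM(input_size=16, hidden_size=64, batch_first=True)
to_pose = nn.Linear(64, 16)
past_poses = torch.stack([pose_enc(frame_t), pose_enc(frame_tk)], dim=1)
out, _ = pose_lstm(past_poses)          # (batch, time, hidden)
next_pose = to_pose(out[:, -1])         # predicted pose at t+k+1
next_frame = decoder(content_enc(frame_t), next_pose)
```

Rolling the LSTM forward on its own predictions, rather than on encoded poses, is what allows generation many steps into the future as claimed in the abstract.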
