MONet: Unsupervised Scene Decomposition and Representation

Burgess, Christopher P., Matthey, Loic, Watters, Nicholas, Kabra, Rishabh, Higgins, Irina, Botvinick, Matt, Lerchner, Alexander

Jan-22-2019–arXiv.org Machine Learning

Realistic visual scenes contain rich structure, which humans effortlessly exploit to reason effectively and intelligently. In particular, object perception, the ability to perceive and represent individual objects, is considered a fundamental cognitive ability that allows us to understand, and efficiently interact with, the world as perceived through our senses [Johnson, 2018; Green and Quilty-Dunn, 2017]. However, despite recent breakthroughs in computer vision fuelled by advances in deep learning, learning to represent realistic visual scenes in terms of objects remains an open challenge for artificial systems. The impact and application of robust visual object decomposition would be far-reaching. Models such as graph-structured networks that rely on handcrafted object representations have recently achieved remarkable results in a wide range of research areas, including reinforcement learning, physical modeling, and multi-agent control [Battaglia et al., 2018; Wang et al., 2018; Hamrick et al., 2017; Hoshen, 2017]. The prospect of acquiring visual object representations through unsupervised learning could be invaluable for extending the generality and applicability of such models.
