Learning Object-Centric Representations of Multi-Object Scenes from Multiple Views
Learning object-centric representations of multi-object scenes is a promising approach towards machine intelligence, facilitating high-level reasoning and control from visual sensory data. However, current approaches for \textit{unsupervised object-centric scene representation} are incapable of aggregating information from multiple observations of a scene. As a result, these ``single-view'' methods form their representations of a 3D scene based only on a single 2D observation (view). Naturally, this leads to several inaccuracies, with these methods falling victim to single-view spatial ambiguities. To address this, we propose \textit{The Multi-View and Multi-Object Network (MulMON)}---a method for learning accurate, object-centric representations of multi-object scenes by leveraging multiple views.
Review for NeurIPS paper: Learning Object-Centric Representations of Multi-Object Scenes from Multiple Views
They also draw inspiration from prior work on iterative inference for VAEs and propose an inference mechanism that allows the model to efficiently learn an object-centric scene representation from multiple views of a scene containing multiple objects. During training, the model learns to infer up to K objects (d-dimensional Gaussian latents) per scene, where K upper-bounds the number of objects the model can recognize and is set to a sufficiently large value. During training, five views of a scene are presented, and the model is expected to reconstruct both the rendering and the object segmentations for a randomly queried novel viewpoint; a minimal sketch of this observe-then-query procedure is given below. They evaluate their model on GQN-Jaco and two variants of the CLEVR dataset. They compare their model to IODINE and GQN on object segmentation, novel-queried-viewpoint prediction, and disentanglement analysis; the results show that their method performs better both quantitatively and qualitatively. They also demonstrate that their model learns well-disentangled feature-level representations.
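To make the described inference procedure concrete, the following is a minimal, hypothetical sketch of the observe-then-query loop: K Gaussian slot latents are iteratively refined from each observed view and its viewpoint encoding, and are then decoded for a novel query viewpoint. All names here (refine_slots, decode_view, W_enc, W_dec) and the random-projection stand-ins for the learned refinement and decoder networks are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
K, D, VIEW_DIM, IMG_DIM = 4, 16, 7, 64   # object slots, latent dim, viewpoint dim, flattened image dim

# Fixed random projections standing in for learned networks (assumption for illustration only).
W_enc = rng.normal(scale=0.1, size=(IMG_DIM + VIEW_DIM, D))   # plays the role of the refinement network
W_dec = rng.normal(scale=0.1, size=(D + VIEW_DIM, IMG_DIM))   # plays the role of the decoder network

def refine_slots(mu, logvar, image, viewpoint):
    """One iterative-inference step: update each slot's Gaussian parameters
    from the current view and its viewpoint (stand-in for a learned refiner)."""
    inp = np.concatenate([image, viewpoint])        # observation plus camera pose encoding
    delta = np.tanh(inp @ W_enc)                    # (D,) update signal
    mu = 0.9 * mu + 0.1 * delta[None, :]            # nudge every slot toward the update
    logvar = logvar - 0.1                           # uncertainty shrinks as views accumulate
    return mu, logvar

def decode_view(mu, viewpoint):
    """Render per-slot images for a (possibly novel) query viewpoint and
    combine them with a softmax over slots (mixture-style decoding)."""
    per_slot = np.stack([np.concatenate([m, viewpoint]) @ W_dec for m in mu])   # (K, IMG_DIM)
    masks = np.exp(per_slot) / np.exp(per_slot).sum(axis=0, keepdims=True)      # per-slot segmentation
    return (masks * per_slot).sum(axis=0), masks

# Observation phase: aggregate information across several views of one scene.
mu, logvar = np.zeros((K, D)), np.zeros((K, D))
for _ in range(5):                                  # five observed views, as in the review
    image = rng.normal(size=IMG_DIM)                # placeholder for a real rendered view
    viewpoint = rng.normal(size=VIEW_DIM)           # placeholder camera pose encoding
    mu, logvar = refine_slots(mu, logvar, image, viewpoint)

# Query phase: predict appearance and segmentation at an unseen viewpoint.
query_viewpoint = rng.normal(size=VIEW_DIM)
prediction, segmentation = decode_view(mu, query_viewpoint)
print(prediction.shape, segmentation.shape)         # (64,) (4, 64)
```

The key design point this sketch tries to convey is that the slot posteriors persist across views, so each additional observation refines the same K object latents rather than producing a new, single-view representation.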