Sequential Attend, Infer, Repeat: Generative Modelling of Moving Objects
It can reliably discover and track objects through the sequence; it can also conditionally generate future frames, thereby simulating the expected motion of objects. This is achieved by explicitly encoding object numbers, locations and appearances in the latent variables of the model. SQAIR retains all strengths of its predecessor, Attend, Infer, Repeat (AIR, Eslami et al., 2016).
In particular, we clarify some potential misunderstandings from R#3 and provide extra experiments as suggested by R#3.
We thank all reviewers for their valuable and constructive comments. Below, we address the detailed comments. It is shown that PR can be extended to "selectively" incorporate uncertain ... We will make this clearer in the final version. The odd columns are real data and the even ones are the reconstruction results; it was an oversight to omit the 8th column (i.e., the reconstruction ...). We will fix these issues for better presentation.
Reviews: Sequential Attend, Infer, Repeat: Generative Modelling of Moving Objects
I have read the other reviews and the author rebuttal. I am still very much in favor of accepting this paper, but I have revised my score down from a 9 to an 8; some of the issues pointed out by the other reviewers, while well addressed in the rebuttal, made me realize that my initial view of the paper was a bit too rosy. The model starts with the basic Attend, Infer, Repeat (AIR) framework and extends it to handle image sequences (SQAIR). This extension requires taking into account the fact that objects may enter or leave the frame over the course of a motion sequence. To support this behavior, SQAIR's generative and inference networks for each frame have two phases.
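As a reading aid, the two-phase structure the review describes can be sketched in a few lines of Python. This is a minimal sketch, not the paper's method: the names `infer_frame`, `propagate`, and `discover` are hypothetical stand-ins for the propagation and discovery modules.

```python
# Hypothetical sketch of two-phase per-frame inference: phase 1
# (propagation) updates latents of objects tracked from the previous
# frame; phase 2 (discovery) detects objects not yet explained.
# All function and variable names are illustrative, not the paper's.

def infer_frame(frame, prev_objects, propagate, discover):
    """Run one frame of two-phase inference.

    propagate(frame, obj) -> updated obj, or None if the object left.
    discover(frame, objects) -> list of newly appeared objects.
    """
    # Phase 1: propagate each previously tracked object; drop those
    # whose presence variable says they have left the frame.
    propagated = []
    for obj in prev_objects:
        updated = propagate(frame, obj)
        if updated is not None:
            propagated.append(updated)
    # Phase 2: discover objects in regions not explained by propagation.
    new_objects = discover(frame, propagated)
    return propagated + new_objects
```

Propagation runs first so that discovery only has to explain objects not already accounted for by the previous frame; this ordering is what lets the model tell a tracked object apart from a newly appeared one.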
Sequential Attend, Infer, Repeat: Generative Modelling of Moving Objects
Kosiorek, Adam, Kim, Hyunjik, Teh, Yee Whye, Posner, Ingmar
It can reliably discover and track objects through the sequence; it can also conditionally generate future frames, thereby simulating expected motion of objects. This is achieved by explicitly encoding object numbers, locations and appearances in the latent variables of the model. SQAIR retains all strengths of its predecessor, Attend, Infer, Repeat (AIR, Eslami et. We use a moving multi-\textsc{mnist} dataset to show limitations of AIR in detecting overlapping or partially occluded objects, and show how \textsc{sqair} overcomes them by leveraging temporal consistency of objects. Finally, we also apply SQAIR to real-world pedestrian CCTV data, where it learns to reliably detect, track and generate walking pedestrians with no supervision.
Exploiting Spatial Invariance for Scalable Unsupervised Object Tracking
Crawford, Eric, Pineau, Joelle
The ability to detect and track objects in the visual world is a crucial skill for any intelligent agent, as it is a necessary precursor to any object-level reasoning process. Moreover, it is important that agents learn to track objects without supervision (i.e. without access to annotated training videos) since this will allow agents to begin operating in new environments with minimal human assistance. The task of learning to discover and track objects in videos, which we call unsupervised object tracking, has grown in prominence in recent years; however, most architectures that address it still struggle to deal with large scenes containing many objects. In the current work, we propose an architecture that scales well to the large-scene, many-object setting by employing spatially invariant computations (convolutions and spatial attention) and representations (a spatially local object specification scheme). In a series of experiments, we demonstrate a number of attractive features of our architecture; most notably, that it outperforms competing methods at tracking objects in cluttered scenes with many objects, and that it can generalize well to videos that are larger and/or contain more objects than videos encountered during training.
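The "spatially local object specification scheme" mentioned above can be illustrated with a small sketch, assuming (as in grid-based detectors generally) that each object's position is stored as an offset within its grid cell rather than as a global coordinate; the function and its parameters are hypothetical, not the paper's actual interface.

```python
# Illustrative sketch: a cell-local position representation. Because
# each position is expressed relative to its own grid cell, the same
# representation (and the convolutional network producing it) applies
# unchanged to frames of any size, which is what enables generalization
# to larger scenes. Names and the cell_size parameter are assumptions.

def to_global(cell_row, cell_col, local_offset, cell_size):
    """Convert a cell-local (y, x) offset in [0, 1) to global pixels."""
    y = (cell_row + local_offset[0]) * cell_size
    x = (cell_col + local_offset[1]) * cell_size
    return (y, x)
```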
Variational Tracking and Prediction with Generative Disentangled State-Space Models
Akhundov, Adnan, Soelch, Maximilian, Bayer, Justin, van der Smagt, Patrick
We address tracking and prediction of multiple moving objects in visual data streams as inference and sampling in a disentangled latent state-space model. By encoding objects separately and including explicit position information in the latent state space, we perform tracking via amortized variational Bayesian inference of the respective latent positions. Inference is implemented in a modular neural framework tailored towards our disentangled latent space. The generative and inference models are jointly learned from observations only. Compared to related prior work, we empirically show that our Markovian state-space assumption enables faithful and much improved long-term prediction well beyond the training horizon. Further, our inference model correctly decomposes frames into objects, even in the presence of occlusions. Tracking performance is increased significantly over prior art.
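The tracking loop the abstract describes can be sketched roughly as follows; `transition` (the Markovian prior over latent positions) and `encoder` (the amortized inference network) are hypothetical placeholders standing in for learned neural modules.

```python
# Rough sketch of one tracking step in a Markovian latent state-space
# model: each object's latent state holds an explicit position, the
# transition model predicts the next state, and an amortized encoder
# refines that prediction from the new frame. All names are
# illustrative, not the paper's notation.

def track_step(positions, frame, transition, encoder):
    """Advance each object's latent position by one time step."""
    predicted = [transition(p) for p in positions]    # Markovian prior
    refined = [encoder(frame, p) for p in predicted]  # amortized posterior
    return refined
```

For prediction beyond the training horizon, the same loop can be run with the encoder step dropped, sampling forward from the transition model alone; the abstract's long-term prediction claim rests on exactly that Markovian rollout.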
Scalable Object-Oriented Sequential Generative Models
Jiang, Jindong, Janghorbani, Sepehr, de Melo, Gerard, Ahn, Sungjin
In SCALOR, we achieve scalability with respect to the object density by parallelizing both the propagation and discovery processes, reducing the parallel time complexity per scene image from O(N) to O(1), with N the number of objects in an image. We also observe that the serial object processing in SQAIR based on an RNN not only increases the computation time but also deteriorates discovery performance. To this end, we propose a parallel discovery model with much better discovery capacity and performance. Temporally predicting and detecting trajectories of objects, SCALOR can also be regarded as a generative tracking model. In our experiments, we show that SCALOR can model videos with nearly one hundred moving objects along with complex backgrounds on synthetic datasets. Furthermore, we evaluate and demonstrate SCALOR on natural videos with tens of objects and complex backgrounds. The contributions of this work are: (i) we propose the SCALOR model that significantly improves (by two orders of magnitude) the scalability with regard to object density. It is applicable to nearly a hundred objects with computation time comparable to SQAIR, which scales only to a few objects.
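A rough illustration of the parallel-discovery idea, assuming discovery proposals are made independently per spatial grid cell rather than by an RNN emitting objects one at a time; the grid decomposition and the `propose` function are assumptions for illustration, not SCALOR's actual modules.

```python
# Hypothetical sketch: serial discovery with an RNN takes O(N) steps
# in the number of objects, because each emission conditions on the
# previous ones. If each grid cell instead proposes at most one object
# independently, all proposals can be evaluated in a single parallel
# pass, giving O(1) parallel time in the number of objects.

def parallel_discover(frame_cells, propose):
    """Run discovery independently per grid cell; no sequential state."""
    # Each call is independent of the others, so on parallel hardware
    # these evaluate simultaneously rather than in an RNN chain.
    proposals = [propose(cell) for cell in frame_cells]
    return [p for p in proposals if p is not None]
```

The independence across cells is the design choice doing the work here: it removes the sequential dependency that makes RNN-based discovery both slow and, per the abstract, worse at discovery itself.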