moog



Moving Off-the-Grid: Scene-Grounded Video Representations

Neural Information Processing Systems

Current vision models typically maintain a fixed correspondence between their representation structure and image space. Each layer comprises a set of tokens arranged "on-the-grid," which biases patches or tokens to encode information at a specific spatio(-temporal) location. In this work we present Moving Off-the-Grid (MooG), a self-supervised video representation model that offers an alternative approach, allowing tokens to move "off-the-grid" to better enable them to represent scene elements consistently, even as they move across the image plane through time. We find that a simple self-supervised objective--next frame prediction--trained on video data, results in a set of latent tokens which bind to specific scene structures and track them as they move. We demonstrate the usefulness of MooG's learned representation both qualitatively and quantitatively by training readouts on top of the learned representation on a variety of downstream tasks. We show that MooG can provide a strong foundation for different vision tasks when compared to "on-the-grid" baselines.


Moving Off-the-Grid: Scene-Grounded Video Representations

van Steenkiste, Sjoerd, Zoran, Daniel, Yang, Yi, Rubanova, Yulia, Kabra, Rishabh, Doersch, Carl, Gokay, Dilara, Heyward, Joseph, Pot, Etienne, Greff, Klaus, Hudson, Drew A., Keck, Thomas Albert, Carreira, Joao, Dosovitskiy, Alexey, Sajjadi, Mehdi S. M., Kipf, Thomas

arXiv.org Artificial Intelligence

Current vision models typically maintain a fixed correspondence between their representation structure and image space. Each layer comprises a set of tokens arranged "on-the-grid," which biases patches or tokens to encode information at a specific spatio(-temporal) location. In this work we present Moving Off-the-Grid (MooG), a self-supervised video representation model that offers an alternative approach, allowing tokens to move "off-the-grid" to better enable them to represent scene elements consistently, even as they move across the image plane through time. By using a combination of cross-attention and positional embeddings we disentangle the representation structure and image structure. We find that a simple self-supervised objective--next frame prediction--trained on video data, results in a set of latent tokens which bind to specific scene structures and track them as they move. We demonstrate the usefulness of MooG's learned representation both qualitatively and quantitatively by training readouts on top of the learned representation on a variety of downstream tasks. We show that MooG can provide a strong foundation for different vision tasks when compared to "on-the-grid" baselines.
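The abstract's key mechanism is using cross-attention with positional embeddings to disentangle representation structure from image structure: grid positions act as queries that read out from a set of content-carrying latent tokens that are not tied to any fixed location. A minimal sketch of that idea in numpy (all shapes, names, and weights here are illustrative assumptions, not the paper's actual architecture):

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention_readout(pos_emb, tokens, Wq, Wk, Wv):
    """Grid positional embeddings (queries) attend to off-grid latent
    tokens (keys/values), so token identity is decoupled from pixel location."""
    Q = pos_emb @ Wq                                 # (num_positions, d)
    K = tokens @ Wk                                  # (num_tokens, d)
    V = tokens @ Wv                                  # (num_tokens, d)
    attn = softmax(Q @ K.T / np.sqrt(Q.shape[-1]))   # (num_positions, num_tokens)
    return attn @ V                                  # per-position features

rng = np.random.default_rng(0)
d = 16
pos_emb = rng.normal(size=(64, d))   # one embedding per output grid location
tokens = rng.normal(size=(8, d))     # off-grid latent tokens
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
out = cross_attention_readout(pos_emb, tokens, Wq, Wk, Wv)
print(out.shape)  # (64, 16)
```

Because only the queries carry spatial position, the same token can contribute to different grid locations from frame to frame, which is what lets tokens track moving scene elements.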


Modular Object-Oriented Games: A Task Framework for Reinforcement Learning, Psychology, and Neuroscience

Watters, Nicholas, Tenenbaum, Joshua, Jazayeri, Mehrdad

arXiv.org Artificial Intelligence

In recent years, trends towards studying object-based games have gained momentum in the fields of artificial intelligence, cognitive science, psychology, and neuroscience. In artificial intelligence, interactive physical games are now a common testbed for reinforcement learning (François-Lavet et al., 2018; Leike et al., 2017; Mnih et al., 2013; Sutton and Barto, 2018) and object representations are of particular interest for sample efficient and generalizable AI (Battaglia et al., 2018; Greff et al., 2020; van Steenkiste et al., 2019). In cognitive science and psychology, object-based games are used to study a variety of cognitive capacities, such as planning, intuitive physics, and intuitive psychology (Chabris, 2017; Ullman et al., 2017). Developmental psychologists also use object-based visual stimuli to probe questions about object-oriented reasoning in infants and young animals (Spelke and Kinzler, 2007; Wood et al., 2020). In neuroscience, object-based computer games have recently been used to study decision-making and physical reasoning in both human and non-human primates (Fischer et al., 2016; McDonald et al., 2019; Rajalingham et al., 2021; Yoo et al., 2020). Furthermore, a growing number of researchers are studying tasks using a combination of approaches from these fields.


That review you wrote on Amazon? Priceless

USATODAY - Tech Top Stories

Research shows that a 5-star rating isn't actually the best, as if customers don't see at least a few negative reviews they think the system's been gamed. SAN FRANCISCO -- Chances are, a week or two after you buy something online you'll get an email asking, "How'd we do?" and a link to review the product. Your response and those of other customers are worth a lot: $400 billion, according to one analyst. "The more buys, the more reviews. The more buys, the higher your rank in search and the more sales you get," said Alice Kim, owner of online cosmetic brand Elizabeth Mott. Even a single comment can make a huge difference. Just going from zero reviews to one increases the rate at which online window-shoppers actually click the "buy" button by 65%, said Matt Moog, CEO of Power Reviews, a company that makes ratings and review software. He estimates 20% of sales are driven by reviews and one-third of online shoppers say straight out they won't buy a product that hasn't been positively reviewed. Increasingly, online reviews matter for all buyers even though online sales made up just 8.3% of U.S. retail sales in the fourth quarter, according to the Department of Commerce -- and Amazon's reviews matter most of all. The online retailer lives and dies by its reviews, said Kim. Fifty-five percent of shoppers start their buying research on Amazon, a survey by marketing firm BloomReach found, and half of all shoppers say they rely primarily on Amazon for reviews, according to Market Track, an e-commerce analysis firm. "They can be in Best Buy or Home Depot, but they go on their phones to check Amazon reviews," said Greg Perry of One Click Retail, an e-commerce data company.
Amazon's reviews rank so highly in part because they're considered the most trustworthy, even though -- like other sites -- it's not immune from people using the reviews for ancillary purposes: take last November's flood of one-star reviews of anchor Megyn Kelly's book hours after it went on sale, which the LA Times said was orchestrated by a pro-Trump forum on Reddit. The Seattle retailer has gone to great lengths to root out fake reviews, launching over 1,000 lawsuits against those who post them, according to the company. It also marks and gives more weight to reviews by people who actually bought the product and has introduced a machine learning algorithm that gives more weight to newer, more helpful reviews. In October Amazon began requiring that any review of a product given to the reviewer for free or at reduced cost be marked as such. "Our focus is to make sure our reviews are authentic and helpful," said spokeswoman Angie Newman. Whether consumers realize it or not, the notes they hastily type out, whether glowing or scathing, wield tremendous power. "Before, you might have told ten people about a product.


Boston Dynamics' Marc Raibert on Next-Gen ATLAS: "A Huge Amount of Work"

IEEE Spectrum Robotics

Boston Dynamics unveiled yesterday a massively upgraded version of its ATLAS humanoid that is smaller, lighter, and more agile. In a video, the new robot is seen walking untethered in snow-covered woods, lifting and placing boxes on shelves, and even face-planting and immediately getting up unscathed after being pushed by an engineer. As one observer commented, "We expected [ATLAS] to turn around and blast that guy with a laser beam." What is perhaps most impressive about the "next generation" ATLAS is just how big a technological leap it represents over its predecessor, which was already a pretty incredible robot. The new ATLAS can do things we've never seen other robots doing before, making it one of the most advanced humanoids in existence.