moog



Moving Off-the-Grid: Scene-Grounded Video Representations

Neural Information Processing Systems

Current vision models typically maintain a fixed correspondence between their representation structure and image space. Each layer comprises a set of tokens arranged "on-the-grid," which biases patches or tokens to encode information at a specific spatio(-temporal) location. In this work we present Moving Off-the-Grid (MooG), a self-supervised video representation model that offers an alternative approach, allowing tokens to move "off-the-grid" to better enable them to represent scene elements consistently, even as they move across the image plane through time. We find that a simple self-supervised objective--next frame prediction--trained on video data, results in a set of latent tokens which bind to specific scene structures and track them as they move. We demonstrate the usefulness of MooG's learned representation both qualitatively and quantitatively by training readouts on top of the learned representation on a variety of downstream tasks. We show that MooG can provide a strong foundation for different vision tasks when compared to "on-the-grid" baselines.


Moving Off-the-Grid: Scene-Grounded Video Representations

van Steenkiste, Sjoerd, Zoran, Daniel, Yang, Yi, Rubanova, Yulia, Kabra, Rishabh, Doersch, Carl, Gokay, Dilara, Heyward, Joseph, Pot, Etienne, Greff, Klaus, Hudson, Drew A., Keck, Thomas Albert, Carreira, Joao, Dosovitskiy, Alexey, Sajjadi, Mehdi S. M., Kipf, Thomas

arXiv.org Artificial Intelligence

Current vision models typically maintain a fixed correspondence between their representation structure and image space. Each layer comprises a set of tokens arranged "on-the-grid," which biases patches or tokens to encode information at a specific spatio(-temporal) location. In this work we present Moving Off-the-Grid (MooG), a self-supervised video representation model that offers an alternative approach, allowing tokens to move "off-the-grid" to better enable them to represent scene elements consistently, even as they move across the image plane through time. By using a combination of cross-attention and positional embeddings we disentangle the representation structure and image structure. We find that a simple self-supervised objective--next frame prediction--trained on video data, results in a set of latent tokens which bind to specific scene structures and track them as they move. We demonstrate the usefulness of MooG's learned representation both qualitatively and quantitatively by training readouts on top of the learned representation on a variety of downstream tasks. We show that MooG can provide a strong foundation for different vision tasks when compared to "on-the-grid" baselines.
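The abstract's key mechanism is using cross-attention with positional embeddings to disentangle representation structure from image structure: grid positions act as queries that read out from a set of content-carrying latent tokens that are not tied to any fixed location. A minimal sketch of that idea in numpy (all shapes, names, and weights here are illustrative assumptions, not the paper's actual architecture):

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention_readout(pos_emb, tokens, Wq, Wk, Wv):
    """Grid positional embeddings (queries) attend to off-grid latent
    tokens (keys/values), so token identity is decoupled from pixel location."""
    Q = pos_emb @ Wq                                 # (num_positions, d)
    K = tokens @ Wk                                  # (num_tokens, d)
    V = tokens @ Wv                                  # (num_tokens, d)
    attn = softmax(Q @ K.T / np.sqrt(Q.shape[-1]))   # (num_positions, num_tokens)
    return attn @ V                                  # per-position features

rng = np.random.default_rng(0)
d = 16
pos_emb = rng.normal(size=(64, d))   # one embedding per output grid location
tokens = rng.normal(size=(8, d))     # off-grid latent tokens
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
out = cross_attention_readout(pos_emb, tokens, Wq, Wk, Wv)
print(out.shape)  # (64, 16)
```

Because only the queries carry spatial position, the same token can contribute to different grid locations from frame to frame, which is what lets tokens track moving scene elements.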


Modular Object-Oriented Games: A Task Framework for Reinforcement Learning, Psychology, and Neuroscience

Watters, Nicholas, Tenenbaum, Joshua, Jazayeri, Mehrdad

arXiv.org Artificial Intelligence

In recent years, trends towards studying object-based games have gained momentum in the fields of artificial intelligence, cognitive science, psychology, and neuroscience. In artificial intelligence, interactive physical games are now a common testbed for reinforcement learning (François-Lavet et al., 2018; Leike et al., 2017; Mnih et al., 2013; Sutton and Barto, 2018) and object representations are of particular interest for sample efficient and generalizable AI (Battaglia et al., 2018; Greff et al., 2020; van Steenkiste et al., 2019). In cognitive science and psychology, object-based games are used to study a variety of cognitive capacities, such as planning, intuitive physics, and intuitive psychology (Chabris, 2017; Ullman et al., 2017). Developmental psychologists also use object-based visual stimuli to probe questions about object-oriented reasoning in infants and young animals (Spelke and Kinzler, 2007; Wood et al., 2020). In neuroscience, object-based computer games have recently been used to study decision-making and physical reasoning in both human and non-human primates (Fischer et al., 2016; McDonald et al., 2019; Rajalingham et al., 2021; Yoo et al., 2020). Furthermore, a growing number of researchers are studying tasks using a combination of approaches from these fields.


That review you wrote on Amazon? Priceless

USATODAY - Tech Top Stories

Research shows that a 5-star rating isn't actually the best, as if customers don't see at least a few negative reviews they think the system's been gamed. SAN FRANCISCO -- Chances are, a week or two after you buy something online you'll get an email asking, "How'd we do?" and a link to review the product. Your response and those of other customers are worth a lot: $400 billion, according to one analyst. "The more buys, the more reviews. The more buys, the higher your rank in search and the more sales you get," said Alice Kim, owner of online cosmetic brand Elizabeth Mott. Even a single comment can make a huge difference. Just going from zero reviews to one increases the rate at which online window-shoppers actually click the "buy" button by 65%, said Matt Moog, CEO of Power Reviews, a company that makes ratings and review software. He estimates 20% of sales are driven by reviews and one-third of online shoppers say straight out they won't buy a product that hasn't been positively reviewed. Increasingly, online reviews matter for all buyers even though online sales made up just 8.3% of U.S. retail sales in the fourth quarter, according to the Department of Commerce -- and Amazon's reviews matter most of all. The online retailer lives and dies by its reviews, said Kim. Fifty-five percent of shoppers start their buying research on Amazon, a survey by marketing firm BloomReach found, and half of all shoppers say they rely primarily on Amazon for reviews, according to Market Track, an e-commerce analysis firm. "They can be in Best Buy or Home Depot, but they go on their phones to check Amazon reviews," said Greg Perry of One Click Retail, an e-commerce data company.
Amazon's reviews rank so highly in part because they're considered the most trustworthy, even though -- like other sites -- it's not immune from people using the reviews for ancillary purposes: take last November's flood of one-star reviews of anchor Megyn Kelly's book hours after it went on sale, which the LA Times said was orchestrated by a pro-Trump forum on Reddit. The Seattle retailer has gone to great lengths to root out fake reviews, launching over 1,000 lawsuits against those who post them, according to the company. It also marks and gives more weight to reviews by people who actually bought the product and has introduced a machine learning algorithm that gives more weight to newer, more helpful reviews. In October Amazon began requiring that any review of a product given to the reviewer for free or at reduced cost be marked as such. "Our focus is to make sure our reviews are authentic and helpful," said spokeswoman Angie Newman. Whether consumers realize it or not, the notes they hastily type out, whether glowing or scathing, wield tremendous power. "Before, you might have told ten people about a product.


Boston Dynamics' Marc Raibert on Next-Gen ATLAS: "A Huge Amount of Work"

IEEE Spectrum Robotics

Boston Dynamics unveiled yesterday a massively upgraded version of its ATLAS humanoid that is smaller, lighter, and more agile. In a video, the new robot is seen walking untethered in snow-covered woods, lifting and placing boxes on shelves, and even face-planting and immediately getting up unscathed after being pushed by an engineer. As one observer commented, "We expected [ATLAS] to turn around and blast that guy with a laser beam." What is perhaps most impressive about the "next generation" ATLAS is just how big a technological leap it represents over its predecessor, which was already a pretty incredible robot. The new ATLAS can do things we've never seen other robots doing before, making it one of the most advanced humanoids in existence.