
 gwm


GWM: Towards Scalable Gaussian World Models for Robotic Manipulation

arXiv.org Artificial Intelligence

Training robot policies within a learned world model is gaining traction because real-world interaction is inefficient. Established image-based world models and policies have shown prior success but lack the robust geometric information needed for a consistent spatial and physical understanding of the three-dimensional world, even when pre-trained on internet-scale video sources. To this end, we propose a novel branch of world model named Gaussian World Model (GWM) for robotic manipulation, which reconstructs the future state by inferring the propagation of Gaussian primitives under the effect of robot actions. At its core is a latent Diffusion Transformer (DiT) combined with a 3D variational autoencoder, enabling fine-grained scene-level future state reconstruction with Gaussian Splatting. GWM can not only enhance the visual representation of imitation learning agents through self-supervised future-prediction training, but can also serve as a neural simulator that supports model-based reinforcement learning. Both simulated and real-world experiments show that GWM can precisely predict future scenes conditioned on diverse robot actions, and can be further used to train policies that outperform the state of the art by impressive margins, showcasing the initial data-scaling potential of 3D world models.


Multi-level Collaborative Distillation Meets Global Workspace Model: A Unified Framework for OCIL

arXiv.org Artificial Intelligence

Online Class-Incremental Learning (OCIL) enables models to learn continuously from non-i.i.d. data streams. However, OCIL faces two key challenges: maintaining model stability under strict memory constraints and ensuring adaptability to new tasks. Under stricter memory constraints, current replay-based methods are less effective. While ensemble methods improve adaptability (plasticity), they often struggle with stability. To overcome these challenges, we propose a novel approach that enhances ensemble learning through a Global Workspace Model (GWM): a shared, implicit memory that guides the learning of multiple student models. The GWM is formed by fusing the parameters of all students within each training batch, capturing the historical learning trajectory and serving as a dynamic anchor for knowledge consolidation. This fused model is then redistributed periodically to the students to stabilize learning and promote cross-task consistency. In addition, we introduce a multi-level collaborative distillation mechanism. This approach enforces peer-to-peer consistency among students and preserves historical knowledge by aligning each student with the GWM. As a result, student models remain adaptable to new tasks while maintaining previously learned knowledge, striking a better balance between stability and plasticity. Extensive experiments on three standard OCIL benchmarks show that our method delivers significant performance improvement for several OCIL models across various memory budgets. Class-Incremental Learning is designed to integrate the knowledge of classes from a stream of data with an evolved distribution [1].
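The fuse-then-redistribute cycle described in this abstract can be sketched roughly as follows. The abstract does not say which fusion operator is used, so the element-wise mean below is an assumption, as are the flat `Params` layout, the function names `fuse_students` and `redistribute`, and the `mix` parameter; none of them come from the paper.

```python
from typing import Dict, List

# Hypothetical layout: each student is a dict of named, flattened parameter lists.
Params = Dict[str, List[float]]


def fuse_students(students: List[Params]) -> Params:
    """Fuse student parameters into one shared model.

    Assumption: fusion is an element-wise mean; the paper may use another rule.
    """
    if not students:
        raise ValueError("need at least one student")
    fused: Params = {}
    for name in students[0]:
        columns = zip(*(s[name] for s in students))
        fused[name] = [sum(col) / len(students) for col in columns]
    return fused


def redistribute(fused: Params, students: List[Params], mix: float = 1.0) -> None:
    """Pull each student toward the fused model in place.

    mix=1.0 overwrites the student with the fused parameters;
    smaller values interpolate between the two.
    """
    for student in students:
        for name, values in fused.items():
            student[name] = [
                (1.0 - mix) * old + mix * new
                for old, new in zip(student[name], values)
            ]


students = [
    {"w": [1.0, 2.0], "b": [0.0]},
    {"w": [3.0, 4.0], "b": [1.0]},
]
gwm = fuse_students(students)            # shared anchor built from both students
redistribute(gwm, students, mix=0.5)     # move each student halfway to the anchor
```

In a training loop, `fuse_students` would presumably run on every batch and `redistribute` only every few batches, matching the "periodically" in the abstract.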


Graph World Model

arXiv.org Artificial Intelligence

World models (WMs) demonstrate strong capabilities in prediction, generation, and planning tasks. Existing WMs primarily focus on unstructured data and cannot leverage the ubiquitous structured data, often represented as graphs, in the digital world. While multiple graph foundation models have been proposed, they focus on graph learning tasks and cannot extend to diverse multi-modal data and interdisciplinary tasks. To address these challenges, we propose the Graph World Model (GWM), a world model that supports both unstructured and graph-structured states with multi-modal information and represents diverse tasks as actions. The core of a GWM is a generic message-passing algorithm to aggregate structured information, either over a unified multi-modal token space by converting multi-modal data into text (GWM-T) or a unified multi-modal embedding space by modality-specific encoders (GWM-E). Notably, GWM introduces action nodes to support diverse tasks, where action nodes are linked to other nodes via direct reference or similarity computation. Extensive experiments on six tasks from diverse domains, including multi-modal generation and matching, recommendation, graph prediction, multi-agent, retrieval-augmented generation, and planning and optimization, show that the same GWM outperforms or matches domain-specific baselines' performance, benefits from multi-hop structures, and demonstrates strong zero-shot/few-shot capabilities on unseen new tasks. Our code for GWM is released at https://github.com/ulab-uiuc/GWM.
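As a rough illustration of the action-node idea in the embedding-space variant (GWM-E), the toy sketch below links an action node to its most similar state nodes by cosine similarity and then runs one round of mean aggregation. The helper names, the top-`k` linking rule, and the mean aggregator are all assumptions made for illustration; the released code at the URL above is the authority on the actual algorithm.

```python
import math
from typing import Dict, List

Vector = List[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def link_action_node(action: Vector, nodes: Dict[str, Vector], k: int = 2) -> List[str]:
    """Connect the action node to the k most similar state nodes (assumed rule)."""
    ranked = sorted(nodes, key=lambda name: cosine(action, nodes[name]), reverse=True)
    return ranked[:k]


def aggregate(action: Vector, neighbours: List[str], nodes: Dict[str, Vector]) -> Vector:
    """One message-passing step: mean of the action embedding and its neighbours."""
    vectors = [action] + [nodes[name] for name in neighbours]
    return [sum(col) / len(vectors) for col in zip(*vectors)]


# Toy state graph with 2-dimensional embeddings (made-up values).
state_nodes = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
action_node = [1.0, 0.0]

neighbours = link_action_node(action_node, state_nodes, k=2)
updated_action = aggregate(action_node, neighbours, state_nodes)
```

Multi-hop structure would correspond to repeating the aggregation step so that information from nodes further away reaches the action node.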


Ibeo's LiDAR systems to provide higher autonomy to autonomous vehicles - Geospatial World

#artificialintelligence

Germany's Ibeo Automotive Systems, which specializes in lidar systems for autonomous driving, has signed a contract to provide China's Great Wall Motor Company (GWM) with its latest solid-state design. Ibeo said that it has commissioned key partner ZF Friedrichshafen, which in 2016 acquired a major stake in Ibeo, to produce the sensors and control unit for the "Level 3" system, which will provide partial autonomy. GWM has contracted one of its own subsidiaries to develop the system, which will be based around vertical cavity surface-emitting lasers (VCSELs) produced by Austria's AMS. Ibeo points out that, after signing a letter of intent in 2019, it has already been in pre-development with GWM for a year. Officially, the project started with the signing of an additional contract by the two parties last month.


Amid Political Tensions, Chinese Automaker Invests $1 Billion In Indian Plant

International Business Times

Despite worsening political ties between China and India, a Chinese automaker is making a large investment in the subcontinent. Great Wall Motors, or GWM, China's largest maker of sport utility vehicles and pickup trucks, has signed a memorandum of understanding to invest $1 billion to upgrade an auto plant in India's western state of Maharashtra. The plant, a former General Motors (GM) facility located near the city of Pune, is expected to generate jobs for 3,000 people. "This would be a highly automated plant in Talegaon [a town near Pune] with advanced robotics technology integrated in many of the production processes," said Parker Shi, managing director of GWM India. GWM becomes the second large Chinese automaker to enter the Indian market after MG Motor, a unit of SAIC Motor Corp., did so last year.


Graph Warp Module: an Auxiliary Module for Boosting the Power of Graph Neural Networks

arXiv.org Machine Learning

Graph Neural Networks (GNNs) have recently gained popularity in the machine learning community as a family of architectures that specializes in capturing the features of graph-related datasets, such as those pertaining to social networks and chemical structures. Unlike other families of networks, GNNs still have much room for improvement in representation power, and many graph networks to date suffer from underfitting. In this paper we introduce the Graph Warp Module (GWM), a supernode-based auxiliary network module that can be attached to a wide variety of existing GNNs to improve the representation power of the original networks. Through extensive experiments on molecular graph datasets, we show that our GWM indeed alleviates the underfitting problem for various existing networks, and that it can even help create a network with state-of-the-art generalization performance.
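The supernode idea can be pictured with a deliberately simplified sketch: a virtual node pools features from every real node and then sends a summary back, giving distant nodes a shortcut for exchanging information. The mean pooling, the fixed scalar `gate`, and the function names below are assumptions for illustration only; the abstract does not describe the module's internals, and the actual design is likely to be learned rather than fixed.

```python
from typing import List

Vector = List[float]


def pool_supernode(node_features: List[Vector]) -> Vector:
    """Supernode state as the mean of all node features (assumed pooling rule)."""
    count = len(node_features)
    return [sum(col) / count for col in zip(*node_features)]


def broadcast(node_features: List[Vector], supernode: Vector, gate: float) -> List[Vector]:
    """Blend the supernode state back into every node with a fixed scalar gate."""
    return [
        [(1.0 - gate) * h + gate * s for h, s in zip(features, supernode)]
        for features in node_features
    ]


# Two nodes with 2-dimensional features (made-up values).
features = [[0.0, 2.0], [2.0, 0.0]]
supernode = pool_supernode(features)
warped = broadcast(features, supernode, gate=0.5)
```

In practice a step like this would sit between the layers of an existing GNN, which is what makes the module attachable to many architectures.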


Learning Graph Weighted Models on Pictures

arXiv.org Machine Learning

Graph Weighted Models (GWMs) have recently been proposed as a natural generalization of weighted automata over strings and trees to arbitrary families of labeled graphs (and hypergraphs). A GWM generically associates a labeled graph with a tensor network and computes a value by successive contractions directed by its edges. In this paper, we consider the problem of learning GWMs defined over the graph family of pictures (or 2-dimensional words). As a proof of concept, we consider regression and classification tasks over the simple Bars & Stripes and Shifting Bits picture languages and provide an experimental study investigating whether these languages can be learned in the form of a GWM from positive and negative examples using gradient-based methods. Our results suggest that this is indeed possible and that investigating the use of gradient-based methods to learn picture series and functions computed by GWMs over other families of graphs could be a fruitful direction.
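Since GWMs generalize weighted automata over strings, the string case gives a small picture of "computing a value by successive contractions": a chain of transition matrices contracted between an initial and a final weight vector. The sketch below covers only that one-dimensional special case, not the picture (2-dimensional) models studied in the paper, and the example automaton that counts occurrences of the letter `a` is made up for illustration.

```python
from typing import Dict, List

Vector = List[float]
Matrix = List[List[float]]


def vec_mat(v: Vector, m: Matrix) -> Vector:
    """Contract a row vector with a matrix: (v M)_j = sum_i v_i * M[i][j]."""
    return [sum(v[i] * m[i][j] for i in range(len(v))) for j in range(len(m[0]))]


def automaton_value(word: str, alpha: Vector, transitions: Dict[str, Matrix], omega: Vector) -> float:
    """Compute alpha^T A_{x1} ... A_{xn} omega by contracting left to right."""
    state = list(alpha)
    for symbol in word:
        state = vec_mat(state, transitions[symbol])
    return sum(s * o for s, o in zip(state, omega))


# Hypothetical 2-state weighted automaton whose value is the number of 'a's.
alpha = [1.0, 0.0]
omega = [0.0, 1.0]
transitions = {
    "a": [[1.0, 1.0], [0.0, 1.0]],  # reading 'a' adds one to the running count
    "b": [[1.0, 0.0], [0.0, 1.0]],  # reading 'b' leaves the state unchanged
}

count = automaton_value("abaa", alpha, transitions, omega)
```

For pictures, each cell would carry a higher-order tensor contracted with its horizontal and vertical neighbours instead of a matrix in a chain.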