Interview with Sherry Yang: Learning interactive real-world simulators
Sherry Yang, Yilun Du, Kamyar Ghasemipour, Jonathan Tompson, Leslie Kaelbling, Dale Schuurmans and Pieter Abbeel won an outstanding paper award at ICLR 2024 for their work Learning Interactive Real-World Simulators. In the paper, they introduce a universal simulator (called UniSim): a generative model, trained on diverse image, robotics and navigation data, that simulates the visual outcome of actions and can be used to train robot policies. We spoke to Sherry about this work, some of the challenges, and potential applications.

Sherry began by explaining the name: "There are two components – the universal component and the simulator component. Looking at the simulator component first: typically, when people build a simulator, they do so based on an understanding of the real world, using physics equations. Researchers build a simulator to study how things work – how cars move, for example."
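To make the contrast with UniSim's learned approach concrete, here is a minimal Python sketch of the kind of hand-written, physics-equation simulator Sherry describes; the dynamics and constants are purely illustrative, not from the interview or the paper.

```python
# A classical simulator encodes the world in explicit equations of motion.
# Here: simple longitudinal car dynamics, advanced with forward Euler.
# All names and constants are illustrative.

def simulate_car(x=0.0, v=0.0, throttle=1.0, drag=0.1, dt=0.05, steps=100):
    """Integrate position x and velocity v over `steps` timesteps of size dt."""
    for _ in range(steps):
        a = throttle - drag * v * v  # net acceleration: engine force minus quadratic drag
        v += a * dt                  # update velocity
        x += v * dt                  # update position
    return x, v

print(simulate_car())  # position and velocity after 5 simulated seconds
```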
Congratulations to the #ICLR2024 test of time and outstanding paper award winners
The Twelfth International Conference on Learning Representations (ICLR) is taking place this week in Vienna, Austria. During the opening of the conference, the outstanding paper award winners, and honourable mentions, were announced. The conference organisers also introduced a new award for this year: the test of time award. This award honours a paper from 2013/2014 that the programme chairs judge to have had a lasting impact; it went to Auto-Encoding Variational Bayes by Diederik P. Kingma and Max Welling.

Abstract: How can we perform efficient inference and learning in directed probabilistic models, in the presence of continuous latent variables with intractable posterior distributions, and large datasets?
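The paper's answer to that opening question is the reparameterization trick: a sample from the approximate posterior is rewritten as a deterministic function of its parameters plus independent noise, so the variational lower bound can be optimised with ordinary stochastic gradients. A minimal numpy sketch of the trick (shapes and values are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
mu, log_var = np.zeros(8), np.zeros(8)  # encoder outputs for one datapoint (illustrative shapes)
eps = rng.standard_normal(8)            # noise drawn independently of the parameters
z = mu + np.exp(0.5 * log_var) * eps    # a sample from N(mu, sigma^2), differentiable in mu and log_var
print(z)
```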
Learning Interactive Real-World Simulators
Mengjiao Yang, Yilun Du, Kamyar Ghasemipour, Jonathan Tompson, Leslie Kaelbling, Dale Schuurmans, Pieter Abbeel
Generative models trained on internet data have revolutionized how text, image, and video content can be created. Perhaps the next milestone for generative models is to simulate realistic experience in response to actions taken by humans, robots, and other interactive agents. Applications of a real-world simulator range from controllable content creation in games and movies, to training embodied agents purely in simulation that can be directly deployed in the real world. We explore the possibility of learning a universal simulator of real-world interaction through generative modeling. We first make the important observation that natural datasets available for learning a real-world simulator are often rich along different dimensions (e.g., abundant objects in image data, densely sampled actions in robotics data, and diverse movements in navigation data). With careful orchestration of diverse datasets, each providing a different aspect of the overall experience, we can simulate the visual outcome of both high-level instructions such as "open the drawer" and low-level controls such as "move by x, y" from otherwise static scenes and objects. We use the simulator to train both high-level vision-language policies and low-level reinforcement learning policies, each of which can be deployed in the real world in zero shot after training purely in simulation. We also show that other types of intelligence such as video captioning models can benefit from training with simulated experience, opening up even wider applications. Video demos can be found at universal-simulator.github.io.
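The closed loop the abstract describes can be sketched as follows, with a learned video model standing in for the environment; `VideoModel`, `Policy` and their methods are hypothetical placeholders, not the paper's actual interfaces.

```python
# Sketch of training a policy purely against a learned simulator, under the
# assumption of these hypothetical interfaces (none are from the paper).

class VideoModel:
    """Stand-in for a generative video model acting as the simulator."""
    def predict_next_frames(self, observation, action):
        ...  # sample the visual outcome of `action` from the current frame(s)

class Policy:
    """Stand-in for a high-level vision-language or low-level RL policy."""
    def act(self, observation):
        ...
    def update(self, observation, action, next_observation):
        ...

def train_in_simulation(simulator, policy, initial_frame, horizon=64):
    """Roll the policy out against the learned simulator instead of the world."""
    obs = initial_frame
    for _ in range(horizon):
        action = policy.act(obs)                               # e.g. "open the drawer" or "move by x, y"
        next_obs = simulator.predict_next_frames(obs, action)  # simulated visual outcome
        policy.update(obs, action, next_obs)                   # learn from simulated experience only
        obs = next_obs
    return policy
```

A policy trained this way never touches the real world during training, which is what makes the zero-shot deployment claim in the abstract notable.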
UniSim: A Neural Closed-Loop Sensor Simulator
Ze Yang, Yun Chen, Jingkang Wang, Sivabalan Manivasagam, Wei-Chiu Ma, Anqi Joyce Yang, Raquel Urtasun
Rigorously testing autonomy systems is essential for making safe self-driving vehicles (SDV) a reality. It requires one to generate safety critical scenarios beyond what can be collected safely in the world, as many scenarios happen rarely on public roads. To accurately evaluate performance, we need to test the SDV on these scenarios in closed-loop, where the SDV and other actors interact with each other at each timestep. Previously recorded driving logs provide a rich resource to build these new scenarios from, but for closed loop evaluation, we need to modify the sensor data based on the new scene configuration and the SDV's decisions, as actors might be added or removed and the trajectories of existing actors and the SDV will differ from the original log. In this paper, we present UniSim, a neural sensor simulator that takes a single recorded log captured by a sensor-equipped vehicle and converts it into a realistic closed-loop multi-sensor simulation. UniSim builds neural feature grids to reconstruct both the static background and dynamic actors in the scene, and composites them together to simulate LiDAR and camera data at new viewpoints, with actors added or removed and at new placements. To better handle extrapolated views, we incorporate learnable priors for dynamic objects, and leverage a convolutional network to complete unseen regions. Our experiments show UniSim can simulate realistic sensor data with small domain gap on downstream tasks. With UniSim, we demonstrate closed-loop evaluation of an autonomy system on safety-critical scenarios as if it were in the real world.
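The closed-loop evaluation the abstract describes follows a render-react-advance cycle at every timestep. A minimal sketch of that loop, assuming hypothetical `simulator`, `autonomy_stack` and `scenario` interfaces (none of these names are from the paper):

```python
# Sketch of closed-loop evaluation: the SDV sees simulated sensor data,
# reacts, and the scene state advances accordingly at each timestep.

def closed_loop_eval(simulator, autonomy_stack, scenario, steps=200):
    """Run one safety-critical scenario and return its evaluation metrics."""
    state = scenario.initial_state()  # SDV pose plus (possibly edited) actor placements
    for _ in range(steps):
        lidar, cameras = simulator.render(state)    # composite neural feature grids into sensor data
        plan = autonomy_stack.step(lidar, cameras)  # the SDV reacts to what it "sees"
        state = scenario.advance(state, plan)       # actors and SDV move; trajectories diverge from the log
    return scenario.metrics()
```

The key point the loop illustrates is that the sensor data must be re-rendered every step, because the SDV's decisions change the scene away from what was originally recorded.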