Empirical Analysis of Sim-and-Real Cotraining Of Diffusion Policies For Planar Pushing from Pixels
Wei, Adam, Agarwal, Abhinav, Chen, Boyuan, Bosworth, Rohan, Pfaff, Nicholas, Tedrake, Russ
In imitation learning for robotics, cotraining with demonstration data generated both in simulation and on real hardware has emerged as a powerful recipe to overcome the "sim2real gap". This work seeks to elucidate basic principles of this sim-and-real cotraining to help inform simulation design, sim-and-real dataset creation, and policy training. Focusing narrowly on the canonical task of planar pushing from camera inputs enabled us to be thorough in our study. These experiments confirm that cotraining with simulated data can dramatically improve performance in real, especially when real data is limited. The results also suggest that reducing the domain gap in physics may be more important than visual fidelity for nonprehensile manipulation tasks. Perhaps surprisingly, having some visual domain gap actually helps the cotrained policy: binary probes reveal that high-performing policies learn to distinguish simulated domains from real. We conclude by investigating this nuance and the mechanisms that facilitate positive transfer between sim and real. In total, our experiments span over 40 real-world policies (evaluated on 800+ trials) and 200 simulated policies (evaluated on 40,000+ trials).

Foundation models trained on large datasets have transformed natural language processing [2, 3] and computer vision [4]. However, this data-driven recipe has been challenging to replicate in robotics since real-world data for imitation learning can be expensive and time-consuming to collect [5]. Fortunately, alternative data sources, such as simulation and video, contain useful information for robotics. In particular, simulation is promising since it can automate robot-specific data collection.
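To make the cotraining recipe concrete, here is a minimal sketch of mixed-batch training; `sim_batches`, `real_batches`, and `policy_update` are hypothetical placeholders, not the authors' actual pipeline. Each gradient step draws a batch from simulation with probability `sim_ratio` and from the (smaller) real dataset otherwise.

```python
import random

# Minimal sketch of sim-and-real cotraining (illustrative only): mix
# simulated and real demonstration batches at a fixed ratio during
# policy training.

def cotrain(policy_update, sim_batches, real_batches,
            sim_ratio=0.9, num_steps=10_000, seed=0):
    rng = random.Random(seed)
    for step in range(num_steps):
        source = sim_batches if rng.random() < sim_ratio else real_batches
        batch = rng.choice(source)
        loss = policy_update(batch)   # one gradient step on this batch
        if step % 1000 == 0:
            print(f"step {step}: loss {loss:.4f}")

# Usage (with toy stand-ins):
# cotrain(policy_update=lambda b: 0.0, sim_batches=[...], real_batches=[...])
```

The mixing ratio is the main knob such a recipe exposes; sweeping it against the amount of available real data is exactly the kind of study the abstract describes.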
Sampling-Based Motion Planning with Discrete Configuration-Space Symmetries
Cohn, Thomas, Tedrake, Russ
When planning motions in a configuration space that has underlying symmetries (e.g., when manipulating one or more symmetric objects), the ideal planning algorithm should take advantage of those symmetries to produce shorter trajectories. However, finite symmetries lead to complicated changes to the underlying topology of configuration space, preventing the use of standard algorithms. We demonstrate how the key primitives used for sampling-based planning can be efficiently implemented in spaces with finite symmetries. A rigorous theoretical analysis, building upon a study of the geometry of the configuration space, shows improvements in the sample complexity of several standard algorithms. Furthermore, a comprehensive slate of experiments demonstrates the practical improvements in both path length and runtime.
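As one concrete example of a planning primitive that changes under finite symmetries, the sketch below implements a symmetry-aware orientation distance for a planar object with k-fold rotational symmetry. It is an illustrative stand-in, not the paper's implementation.

```python
import math

# Distance between two planar orientations of an object with k-fold
# rotational symmetry: rotations by multiples of 2*pi/k map the object
# onto itself, so the distance is computed modulo that period.
def symmetric_orientation_distance(theta1, theta2, k):
    period = 2.0 * math.pi / k
    d = (theta1 - theta2) % period
    return min(d, period - d)

# A square (k=4): orientations 0 and pi/2 describe the same configuration,
# so a planner should treat them as zero distance apart.
assert symmetric_orientation_distance(0.0, math.pi / 2, k=4) < 1e-9
print(symmetric_orientation_distance(0.0, math.pi / 3, k=4))  # ~0.52 rad
```

Nearest-neighbor queries and local steering in a sampling-based planner would minimize over the symmetry group in the same way.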
Scalable Real2Sim: Physics-Aware Asset Generation Via Robotic Pick-and-Place Setups
Pfaff, Nicholas, Fu, Evelyn, Binagia, Jeremy, Isola, Phillip, Tedrake, Russ
Simulating object dynamics from real-world perception shows great promise for digital twins and robotic manipulation but often demands labor-intensive measurements and expertise. We present a fully automated Real2Sim pipeline that generates simulation-ready assets for real-world objects through robotic interaction. Using only a robot's joint torque sensors and an external camera, the pipeline identifies visual geometry, collision geometry, and physical properties such as inertial parameters. Our approach introduces a general method for extracting high-quality, object-centric meshes from photometric reconstruction techniques (e.g., NeRF, Gaussian Splatting) by employing alpha-transparent training while explicitly distinguishing foreground occlusions from background subtraction. We validate the full pipeline through extensive experiments, demonstrating its effectiveness across diverse objects. By eliminating the need for manual intervention or environment modifications, our pipeline can be integrated directly into existing pick-and-place setups, enabling scalable and efficient dataset creation.
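As a simplified illustration of the kind of physical-property identification such a pipeline performs (using a wrist force/torque sensor as a stand-in for the paper's joint-torque formulation), the sketch below recovers an object's mass and center of mass from static wrench measurements taken at several gripper orientations via least squares.

```python
import numpy as np

# Static gravity-based identification (illustrative stand-in, not the
# paper's joint-torque pipeline). With the grasped object held still,
# a wrench measured in the sensor frame satisfies
#   f_i   = m * R_i^T @ g_world        (gravity expressed in sensor frame)
#   tau_i = c x f_i = -[f_i]_x @ c     (moment of gravity about the sensor)
# so the mass follows from the force magnitudes and the center of mass c
# from a linear least-squares fit over all measurements.

def skew(v):
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def identify_mass_and_com(forces, torques, g=9.81):
    forces, torques = np.asarray(forces), np.asarray(torques)
    mass = np.mean(np.linalg.norm(forces, axis=1)) / g
    A = np.vstack([-skew(f) for f in forces])   # stacks of -[f_i]_x
    b = torques.reshape(-1)
    com, *_ = np.linalg.lstsq(A, b, rcond=None)
    return mass, com
```

In practice the measured wrenches would first be compensated for the gripper's own weight and averaged over time to reduce noise.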
Physics-Driven Data Generation for Contact-Rich Manipulation via Trajectory Optimization
Yang, Lujie, Suh, H. J. Terry, Zhao, Tong, Graesdal, Bernhard Paus, Kelestemur, Tarik, Wang, Jiuguang, Pang, Tao, Tedrake, Russ
We present a low-cost data generation pipeline that integrates physics-based simulation, human demonstrations, and model-based planning to efficiently generate large-scale, high-quality datasets for contact-rich robotic manipulation tasks. Starting with a small number of embodiment-flexible human demonstrations collected in a virtual reality simulation environment, the pipeline refines these demonstrations using optimization-based kinematic retargeting and trajectory optimization to adapt them across various robot embodiments and physical parameters. This process yields a diverse, physically consistent, contact-rich dataset that enables cross-embodiment data transfer, and offers the potential to reuse legacy datasets collected under different hardware configurations or physical parameters. We validate the pipeline's effectiveness by training diffusion policies from the generated datasets for challenging long-horizon, contact-rich manipulation tasks across multiple robot embodiments, including a floating Allegro hand and bimanual robot arms. The trained policies are deployed zero-shot on hardware for bimanual iiwa arms, achieving high success rates with minimal human input.

The emergence of foundation models has transformed fields such as natural language processing and computer vision, where models trained on massive, internet-scale datasets demonstrate remarkable generalization across diverse reasoning tasks [1, 2, 3, 4, 5]. Motivated by this success, the robotics community is pursuing foundation models for generalist robot policies capable of flexible and robust decision-making across a wide range of tasks [6, 7, 8], leading to significant industrial investment in large-scale robot learning [9]. However, the pursuit of generalist robot policies remains constrained by the limited availability of high-quality datasets, especially for contact-rich robotic manipulation. Existing datasets [7, 10, 11, 12] are orders of magnitude smaller than those used to train foundation models in other domains, such as Large Language Models (LLMs). To address data scarcity, robot learning researchers often rely on a spectrum of data sources varying in cost, quality, and transferability.
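The kinematic-retargeting step can be pictured as a small nonlinear program per demonstration frame. The sketch below is a hedged illustration, not the paper's formulation; `forward_keypoints` is a hypothetical placeholder for the robot's forward kinematics, and the smoothness term regularizes against the previous frame's solution.

```python
import numpy as np
from scipy.optimize import minimize

# Sketch of optimization-based kinematic retargeting: find joint angles q
# whose keypoints (e.g., fingertips) match demonstrated human keypoints,
# subject to joint limits and temporal smoothness.

def retarget_frame(forward_keypoints, human_keypoints, q_init,
                   q_lower, q_upper, q_prev=None, smooth_weight=1e-2):
    def cost(q):
        err = forward_keypoints(q) - human_keypoints   # (K, 3) residual
        c = np.sum(err ** 2)
        if q_prev is not None:                         # temporal smoothness
            c += smooth_weight * np.sum((q - q_prev) ** 2)
        return c

    bounds = list(zip(q_lower, q_upper))               # joint limits
    res = minimize(cost, q_init, bounds=bounds, method="L-BFGS-B")
    return res.x
```

A downstream trajectory optimizer would then enforce dynamics and contact constraints on the retargeted sequence, which is where the physical consistency described above comes from.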
History-Guided Video Diffusion
Song, Kiwhan, Chen, Boyuan, Simchowitz, Max, Du, Yilun, Tedrake, Russ, Sitzmann, Vincent
Classifier-free guidance (CFG) is a key technique for improving conditional generation in diffusion models, enabling more accurate control while enhancing sample quality. It is natural to extend this technique to video diffusion, which generates video conditioned on a variable number of context frames, collectively referred to as history. However, we find two key challenges to guiding with variable-length history: architectures that only support fixed-size conditioning, and the empirical observation that CFG-style history dropout performs poorly. To address this, we propose the Diffusion Forcing Transformer (DFoT), a video diffusion architecture and theoretically grounded training objective that jointly enable conditioning on a flexible number of history frames. We then introduce History Guidance, a family of guidance methods uniquely enabled by DFoT. We show that its simplest form, vanilla history guidance, already significantly improves video generation quality and temporal consistency. A more advanced method, history guidance across time and frequency, further enhances motion dynamics, enables compositional generalization to out-of-distribution history, and can stably roll out extremely long videos. Website: https://boyuan.space/history-guidance
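A rough sketch of the CFG-style combination behind vanilla history guidance is below, assuming a hypothetical `denoise(frames, noise_levels)` model with per-frame noise levels in the spirit of Diffusion Forcing. The "history-free" branch replaces the history frames with noise and labels them at the maximum noise level, and the two predictions are blended with a guidance weight `w`. This is an illustration of the idea, not the DFoT implementation.

```python
import torch

def history_guided_prediction(denoise, history, noisy_future,
                              t_future, w, t_max=1.0):
    """Blend history-conditioned and history-free denoiser outputs."""
    H, F = history.shape[1], noisy_future.shape[1]

    # Conditional branch: history frames are clean (noise level 0).
    frames_cond = torch.cat([history, noisy_future], dim=1)
    levels_cond = torch.cat([torch.zeros(H), torch.full((F,), t_future)])

    # "Unconditional" branch: history is replaced by pure noise and
    # marked at the maximum noise level, i.e. it carries no information.
    frames_uncond = torch.cat([torch.randn_like(history), noisy_future], dim=1)
    levels_uncond = torch.cat([torch.full((H,), t_max),
                               torch.full((F,), t_future)])

    cond = denoise(frames_cond, levels_cond)[:, H:]
    uncond = denoise(frames_uncond, levels_uncond)[:, H:]
    return uncond + w * (cond - uncond)   # CFG-style extrapolation
```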
Planning Shorter Paths in Graphs of Convex Sets by Undistorting Parametrized Configuration Spaces
Garg, Shruti, Cohn, Thomas, Tedrake, Russ
Optimization-based motion planning provides a useful modeling framework through various costs and constraints. Using Graphs of Convex Sets (GCS) for trajectory optimization gives guarantees of feasibility and optimality by representing configuration space as the finite union of convex sets. Nonlinear parametrizations can be used to extend this technique to handle cases such as kinematic loops, but this distorts distances, such that solving with convex objectives will yield paths that are suboptimal in the original space. We present a method to extend GCS to nonconvex objectives, allowing us to "undistort" the optimization landscape while maintaining feasibility guarantees. We demonstrate our method's efficacy on three different robotic planning domains: a bimanual robot moving an object with both arms, the set of 3D rotations using Euler angles, and a rational parametrization of kinematics that enables certifying regions as collision free. Across the board, our method significantly improves path length and trajectory duration with only a minimal increase in runtime.
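To see the distortion concretely: if the planner works in parameters q but the robot lives in x = f(q), a convex (Euclidean) objective measures length in q-space, while the quantity we care about is length after mapping through f. The sketch below contrasts the two for a toy nonlinear parametrization; it illustrates the motivation only, not the paper's GCS formulation.

```python
import numpy as np

def distorted_length(path_q):
    """Euclidean path length measured directly in parameter space."""
    path_q = np.asarray(path_q, dtype=float)
    return float(np.sum(np.linalg.norm(np.diff(path_q, axis=0), axis=1)))

def undistorted_length(path_q, f):
    """Path length after mapping each waypoint through x = f(q)."""
    path_x = np.asarray([f(q) for q in path_q], dtype=float)
    return float(np.sum(np.linalg.norm(np.diff(path_x, axis=0), axis=1)))

# Toy parametrization: a spiral x(q) = (1 + q) * (cos q, sin q).
# Equal steps in q are far from equal steps in x, so minimizing the
# convex q-space length does not minimize the true path length.
f = lambda q: (1.0 + q[0]) * np.array([np.cos(q[0]), np.sin(q[0])])
path = [[0.0], [1.0], [2.0]]
print(distorted_length(path), undistorted_length(path, f))
```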
Faster Algorithms for Growing Collision-Free Convex Polytopes in Robot Configuration Space
Werner, Peter, Cohn, Thomas, Jiang, Rebecca H., Seyde, Tim, Simchowitz, Max, Tedrake, Russ, Rus, Daniela
We propose two novel algorithms for constructing convex collision-free polytopes in robot configuration space. Finding these polytopes enables the application of stronger motion-planning frameworks such as trajectory optimization with Graphs of Convex Sets [1] and is currently a major roadblock in the adoption of these approaches. In this paper, we build upon IRIS-NP (Iterative Regional Inflation by Semidefinite & Nonlinear Programming) [2] to significantly improve tunability, runtimes, and scaling to complex environments. IRIS-NP uses nonlinear programming paired with uniform random initialization to find configurations on the boundary of the free configuration space. Our key insight is that finding nearby configuration-space obstacles using sampling is inexpensive and greatly accelerates region generation. We propose two algorithms using such samples to either employ nonlinear programming more efficiently (IRIS-NP2) or circumvent it altogether using a massively parallel zero-order optimization strategy (IRIS-ZO). We also propose a termination condition that controls the probability of exceeding a user-specified permissible fraction-in-collision, eliminating a significant source of tuning difficulty in IRIS-NP. We compare performance across eight robot environments, showing that IRIS-ZO achieves an order-of-magnitude speed advantage over IRIS-NP. IRIS-NP2, also significantly faster than IRIS-NP, builds larger polytopes using fewer hyperplanes, enabling faster downstream computation.
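A probabilistic termination condition of this flavor can be illustrated with a one-sided binomial test: sample configurations uniformly inside the candidate polytope, count collisions, and accept the region only if that count would be improbable were the true fraction-in-collision at the permitted level. The snippet below is a sketch in that spirit, not the exact test used in the paper.

```python
from scipy.stats import binom

def region_acceptable(num_colliding, num_samples, epsilon, delta):
    """Accept if the observed collision count would be unlikely (<= delta)
    were the true fraction-in-collision actually epsilon or larger."""
    return binom.cdf(num_colliding, num_samples, epsilon) <= delta

# e.g., uniform samples drawn from the candidate polytope:
print(region_acceptable(0, 500, epsilon=0.01, delta=0.05))  # True
print(region_acceptable(4, 500, epsilon=0.01, delta=0.05))  # False
```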
Multi-Query Shortest-Path Problem in Graphs of Convex Sets
Morozov, Savva, Marcucci, Tobia, Amice, Alexandre, Graesdal, Bernhard Paus, Bosworth, Rohan, Parrilo, Pablo A., Tedrake, Russ
The Shortest-Path Problem in Graphs of Convex Sets (SPP in GCS) is a recently developed optimization framework that blends discrete and continuous decision making. Many relevant problems in robotics, such as collision-free motion planning, can be cast and solved as an SPP in GCS, yielding lower-cost solutions and faster runtimes than state-of-the-art algorithms. In this paper, we are motivated by motion planning of robot arms that must operate swiftly in static environments. We consider a multi-query extension of the SPP in GCS, where the goal is to efficiently precompute optimal paths between given sets of initial and target conditions. Our solution consists of two stages. Offline, we use semidefinite programming to compute a coarse lower bound on the problem's cost-to-go function. Then, online, this lower bound is used to incrementally generate feasible paths by solving short-horizon convex programs.
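Schematically, the online phase uses the precomputed cost-to-go lower bound the way A* uses a heuristic: at each step, pick the successor minimizing the immediate cost plus the bound at that successor. The sketch below abstracts the short-horizon convex programs into a hypothetical `edge_cost` callable; it is an illustration of the idea, not the paper's algorithm.

```python
# Greedy online path generation guided by a precomputed lower bound.
# `neighbors`, `edge_cost`, and `cost_to_go_lb` are hypothetical placeholders.

def greedy_path(start, goal, neighbors, edge_cost, cost_to_go_lb,
                max_steps=1000):
    path = [start]
    v = start
    for _ in range(max_steps):
        if v == goal:
            return path
        v = min(neighbors(v),
                key=lambda u: edge_cost(v, u) + cost_to_go_lb(u))
        path.append(v)
    return None   # failed to reach the goal within max_steps
```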
GCS*: Forward Heuristic Search on Implicit Graphs of Convex Sets
Chia, Shao Yuan Chew, Jiang, Rebecca H., Graesdal, Bernhard Paus, Kaelbling, Leslie Pack, Tedrake, Russ
We consider large-scale, implicit-search-based solutions to the Shortest Path Problem on Graphs of Convex Sets (GCS). We propose GCS*, a forward heuristic search algorithm that generalizes A* search to the GCS setting, where a continuous-valued decision is made at each graph vertex, and constraints across graph edges couple these decisions, influencing costs and feasibility. Such mixed discrete-continuous planning is needed in many domains, including motion planning around obstacles and planning through contact. This setting provides a unique challenge for best-first search algorithms: the cost and feasibility of a path depend on continuous-valued points chosen along the entire path. We show that by pruning paths that are cost-dominated over their entire terminal vertex, GCS* can search efficiently while still guaranteeing cost optimality and completeness. To find satisficing solutions quickly, we also present a complete but suboptimal variation that instead prunes reachability-dominated paths. We implement these checks using polyhedral-containment or sampling-based methods. The sampling-based implementation is probabilistically complete and asymptotically cost optimal, and performs effectively even with minimal samples in practice. We demonstrate GCS* on planar pushing tasks where the combinatorial explosion of contact modes renders prior methods intractable and show it performs favorably compared to the state-of-the-art. Project website: https://shaoyuan.cc/research/gcs-star/
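The sampling-based pruning check can be sketched as follows, with a hypothetical `cost_to_reach(path, x)` that evaluates the optimal cost of traversing a candidate path and ending at a particular point x of its terminal convex set. A path is pruned only if some already-explored path to the same vertex is at least as cheap at every sampled point; this is an illustration of the domination idea, not the paper's implementation.

```python
def is_cost_dominated(candidate, existing_paths, samples, cost_to_reach):
    """Sampling-based domination check over points of the terminal set."""
    for x in samples:                       # samples from the terminal set
        c_new = cost_to_reach(candidate, x)
        if not any(cost_to_reach(p, x) <= c_new for p in existing_paths):
            return False                    # candidate is best at some point
    return True                             # dominated at every sample
```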
Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion
Chen, Boyuan, Monso, Diego Marti, Du, Yilun, Simchowitz, Max, Tedrake, Russ, Sitzmann, Vincent
This paper presents Diffusion Forcing, a new training paradigm where a diffusion model is trained to denoise a set of tokens with independent per-token noise levels. We apply Diffusion Forcing to sequence generative modeling by training a causal next-token prediction model to generate one or several future tokens without fully diffusing past ones. Our approach is shown to combine the strengths of next-token prediction models, such as variable-length generation, with the strengths of full-sequence diffusion models, such as the ability to guide sampling to desirable trajectories. Our method offers a range of additional capabilities, such as (1) rolling-out sequences of continuous tokens, such as video, with lengths past the training horizon, where baselines diverge and (2) new sampling and guiding schemes that uniquely profit from Diffusion Forcing's variable-horizon and causal architecture, and which lead to marked performance gains in decision-making and planning tasks. In addition to its empirical success, our method is proven to optimize a variational lower bound on the likelihoods of all subsequences of tokens drawn from the true joint distribution. Project website: https://boyuan.space/diffusion-forcing
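A minimal DDPM-style sketch of the training step, assuming an epsilon-prediction model and a standard `alphas_cumprod` noise schedule (placeholders, not the paper's code): each token receives its own independently sampled noise level, and the model is asked to denoise every token of the partially noised sequence.

```python
import torch

def diffusion_forcing_step(model, x, alphas_cumprod, optimizer):
    """One training step with independent per-token noise levels.
    x: (B, T, ...) sequence of tokens; alphas_cumprod: (num_steps,) tensor."""
    B, T = x.shape[:2]
    t = torch.randint(0, len(alphas_cumprod), (B, T))      # per-token noise level
    a = alphas_cumprod[t].view(B, T, *([1] * (x.dim() - 2)))
    eps = torch.randn_like(x)
    x_noisy = a.sqrt() * x + (1 - a).sqrt() * eps           # noise each token
    pred = model(x_noisy, t)                                 # causal denoiser
    loss = torch.nn.functional.mse_loss(pred, eps)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Because the noise levels are sampled independently per token, the same model can later be rolled out causally (keep past tokens nearly clean, denoise the next ones), which is what enables the variable-horizon sampling and guidance schemes described above.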