SpeedAug: Policy Acceleration via Tempo-Enriched Policy and RL Fine-Tuning
Recent advances in robotic policy learning have enabled complex manipulation in real-world environments, yet the execution speed of these policies often lags behind hardware capabilities due to the cost of collecting faster demonstrations. Existing works on policy acceleration reinterpret action sequences for unseen execution speeds and therefore encounter distributional shifts from the original demonstrations. Reinforcement learning is a promising approach that adapts policies for faster execution without additional demonstrations, but its unguided exploration is sample-inefficient. We propose SpeedAug, an RL-based policy acceleration framework that efficiently adapts pre-trained policies for faster task execution. SpeedAug constructs a behavior prior that encompasses diverse tempos of task execution by pre-training a policy on speed-augmented demonstrations. Empirical results on robotic manipulation benchmarks show that RL fine-tuning initialized from this tempo-enriched policy significantly improves the sample efficiency of existing RL and policy acceleration methods while maintaining high success rates.
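The abstract's speed augmentation can be pictured as temporally resampling recorded demonstrations. The sketch below is a minimal, hypothetical version of that idea (the function name, timestep, and tempo values are illustrative, not from the paper): linear interpolation rescales an action sequence so the same motion is replayed at a different tempo.

```python
import numpy as np

def speed_augment(actions, speed, dt=0.1):
    """Resample a demonstrated action sequence to a new execution tempo.

    actions: (T, D) array of actions recorded at timestep dt.
    speed:   tempo multiplier (>1 replays the demo faster, <1 slower).
    Returns the resampled (T', D) sequence via linear interpolation.
    """
    T = len(actions)
    t_orig = np.arange(T) * dt
    # A faster tempo covers the same motion with fewer control steps.
    T_new = max(2, int(round(T / speed)))
    t_new = np.linspace(0.0, t_orig[-1], T_new)
    return np.stack(
        [np.interp(t_new, t_orig, actions[:, d]) for d in range(actions.shape[1])],
        axis=1,
    )

# Augment one demo at several tempos to build a tempo-enriched dataset.
demo = np.random.randn(50, 7)  # 50 steps of 7-DoF actions
augmented = [speed_augment(demo, s) for s in (0.5, 1.0, 1.5, 2.0)]
```

Pre-training on such a mixture exposes the policy to a range of execution speeds before RL fine-tuning pushes it toward the fast end.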
A Novel Task-Driven Diffusion-Based Policy with Affordance Learning for Generalizable Manipulation of Articulated Objects
Zhang, Hao, Kan, Zhen, Shang, Weiwei, Song, Yongduan
Abstract--Despite recent advances in dexterous manipulation, the manipulation of articulated objects and generalization across different categories remain significant challenges. To address these issues, we introduce DART, a novel framework that enhances a diffusion-based policy with affordance learning and linear temporal logic (LTL) representations to improve the learning efficiency and generalizability of articulated dexterous manipulation. Specifically, DART leverages LTL to understand task semantics and affordance learning to identify optimal interaction points. Additionally, we exploit an optimization method based on interaction data to refine actions, overcoming the limitations of traditional diffusion policies that typically rely on offline reinforcement learning or learning from demonstrations. Experimental results demonstrate that DART outperforms most existing methods in manipulation ability, generalization performance, transfer reasoning, and robustness. The manipulation of articulated objects has been an interesting and important topic in robotic learning. Although prior research has demonstrated promising results in the manipulation of rigid bodies, significant challenges persist when it comes to handling articulated objects [1]. Generalizing to various types of articulated objects [2] is particularly difficult for dexterous manipulation. For example, if a dexterous hand can open the lid of a toilet, it should also be capable of opening the lid of a garbage can, despite their cosmetic differences. While many recent efforts have focused on improving robotic generalization performance [3] or reducing the exploration burden [4], enhancing the learning efficiency or improving the generalization ability for high-degrees-of-freedom (DOF) skills, such as dexterous manipulation, remains a challenging problem, not to mention achieving both simultaneously.
Diffusion Policy Attacker: Crafting Adversarial Attacks for Diffusion-based Policies
Diffusion models have emerged as a promising approach for behavior cloning (BC), leveraging their exceptional ability to model multi-modal distributions. Diffusion policies (DP) have elevated BC performance to new heights, demonstrating robust efficacy across diverse tasks, coupled with their inherent flexibility and ease of implementation. Despite the increasing adoption of DP as a foundation for policy generation, the critical issue of safety remains largely unexplored. While previous attempts have targeted deep policy networks, DP uses a diffusion model as the policy network, and its chained denoising structure and injected randomness render those earlier attack methods ineffective. In this paper, we undertake a comprehensive examination of DP safety concerns by introducing adversarial scenarios encompassing offline and online attacks, and global and patch-based attacks.
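The attack setting described above can be illustrated with a standard projected-gradient sketch. Everything here is a toy assumption, not the paper's method: the "policy" is a linear map standing in for a network, and the objective simply pushes the policy output away from its clean action while the observation perturbation stays inside an L-infinity ball.

```python
import numpy as np

def pgd_attack(grad_fn, obs, eps=0.05, alpha=0.01, steps=20, rng=None):
    """Projected gradient ascent on an observation (L-infinity ball).

    grad_fn returns the gradient of the attack objective w.r.t. the
    observation; the perturbation is re-projected into the eps-ball
    after every step. A generic sketch, not the paper's attack.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    adv = obs + rng.uniform(-eps, eps, size=obs.shape)  # random start
    for _ in range(steps):
        adv = adv + alpha * np.sign(grad_fn(adv))
        adv = obs + np.clip(adv - obs, -eps, eps)       # project back
    return adv

# Toy differentiable "policy": action = W @ obs.
W = np.array([[1.0, -2.0], [0.5, 1.0]])
policy = lambda o: W @ o
obs = np.array([0.3, -0.2])
clean_action = policy(obs)
# Gradient of ||policy(o) - clean_action||^2 / 2 w.r.t. o.
grad_fn = lambda o: W.T @ (policy(o) - clean_action)
adv_obs = pgd_attack(grad_fn, obs)
```

Against a diffusion policy the gradient would have to flow through the whole denoising chain under injected noise, which is exactly why the paper argues single-step attacks on deterministic policy networks do not transfer.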
Diffusion Stabilizer Policy for Automated Surgical Robot Manipulations
Ho, Chonlam, Hu, Jianshu, Wang, Hesheng, Dou, Qi, Ban, Yutong
Abstract-- Intelligent surgical robots have the potential to revolutionize clinical practice by enabling more precise and automated surgical procedures. However, the automation of such robots for surgical tasks remains under-explored compared to recent advancements in solving household manipulation tasks. These successes have been largely driven by (1) advanced models, such as transformers and diffusion models, and (2) large-scale data utilization. Aiming to extend these successes to the domain of surgical robotics, we propose a diffusion-based policy learning framework, called Diffusion Stabilizer Policy (DSP), which enables training with imperfect or even failed trajectories. Our approach consists of two stages: first, we train the diffusion stabilizer policy using only clean data. Then, the policy is continuously updated using a mixture of clean and perturbed data, with filtering based on the prediction error on actions. Comprehensive experiments conducted in various surgical environments demonstrate the superior performance of our method in perturbation-free settings and its robustness when handling perturbed demonstrations.
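The filtering step described above can be sketched in a few lines. This is a rough stand-in under stated assumptions (the function name, the identity "policy", and the threshold are all illustrative): a policy trained on clean data scores each demonstration pair, and pairs whose demonstrated action deviates too far from the prediction are treated as perturbed and dropped.

```python
import numpy as np

def filter_pairs(policy, pairs, threshold):
    """Keep (obs, action) pairs whose action prediction error is small.

    Hypothetical sketch of error-based filtering: `policy` stands in
    for a stabilizer policy trained on clean data only.
    """
    return [
        (obs, act)
        for obs, act in pairs
        if np.linalg.norm(policy(obs) - act) <= threshold
    ]

# Identity "policy" on a toy 1-D task: clean pairs echo the observation,
# perturbed pairs carry a large offset the filter should reject.
policy = lambda obs: obs
clean = [(np.array([x]), np.array([x])) for x in (0.1, 0.2, 0.3)]
perturbed = [(np.array([x]), np.array([x + 5.0])) for x in (0.4, 0.5)]
kept = filter_pairs(policy, clean + perturbed, threshold=0.5)
```

The surviving mixture of clean and lightly perturbed pairs is what the second training stage would consume.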
CARP: Visuomotor Policy Learning via Coarse-to-Fine Autoregressive Prediction
Gong, Zhefei, Ding, Pengxiang, Lyu, Shangke, Huang, Siteng, Sun, Mingyang, Zhao, Wei, Fan, Zhaoxin, Wang, Donglin
In robotic visuomotor policy learning, diffusion-based models have achieved significant success in improving the accuracy of action trajectory generation compared to traditional autoregressive models. However, they suffer from inefficiency due to multiple denoising steps and limited flexibility from complex constraints. In this paper, we introduce Coarse-to-Fine AutoRegressive Policy (CARP), a novel paradigm for visuomotor policy learning that redefines the autoregressive action generation process as a coarse-to-fine, next-scale approach. CARP decouples action generation into two stages: first, an action autoencoder learns multi-scale representations of the entire action sequence; then, a GPT-style transformer refines the sequence prediction through a coarse-to-fine autoregressive process. This straightforward and intuitive approach produces highly accurate and smooth actions, matching or even surpassing the performance of diffusion-based policies while maintaining efficiency on par with autoregressive policies. We conduct extensive evaluations across diverse settings, including single-task and multi-task scenarios on state-based and image-based simulation benchmarks, as well as real-world tasks. CARP achieves competitive success rates, with up to a 10% improvement, and delivers 10x faster inference compared to state-of-the-art policies, establishing a high-performance, efficient, and flexible paradigm for action generation in robotic tasks.
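The coarse-to-fine decomposition at the heart of CARP can be pictured with a simple multi-resolution pyramid. This is only a sketch of the idea, not the paper's autoencoder (the scale sizes and interpolation scheme are assumptions): each level is the trajectory resampled to a coarser length, and a next-scale predictor would emit the levels in order, conditioning each refinement on all coarser ones.

```python
import numpy as np

def multiscale_actions(actions, scales=(4, 16, 64)):
    """Decompose an action sequence into coarse-to-fine resolutions.

    actions: (T, D) trajectory; each scale s yields an (s, D) level
    obtained by linear resampling. Illustrative stand-in for learned
    multi-scale representations.
    """
    T, D = actions.shape
    levels = []
    for s in scales:
        idx = np.linspace(0, T - 1, s)
        levels.append(np.stack(
            [np.interp(idx, np.arange(T), actions[:, d]) for d in range(D)],
            axis=1,
        ))
    return levels

traj = np.cumsum(np.random.randn(64, 2), axis=0)  # smooth-ish 2-D trajectory
pyramid = multiscale_actions(traj)
```

Predicting a handful of coarse tokens first, then refining, is what lets an autoregressive model cover a whole action chunk in far fewer sequential steps than per-timestep generation.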
TEDi Policy: Temporally Entangled Diffusion for Robotic Control
Høeg, Sigmund H., Tingelstad, Lars
Recently, diffusion models have proven powerful for robotic imitation learning, mainly due to their ability to express complex and multimodal distributions [1, 2]. Chi et al. [1], with Diffusion Policy, show that diffusion models excel at imitation learning, surpassing the previous state-of-the-art imitation learning methods by a large margin. A limitation of diffusion models is that multiple iterations are needed to obtain a clean prediction, and each iteration requires evaluating a neural network that is typically large. This limits the application of diffusion-based policies in environments with fast dynamics that demand high control frequencies, restricting them to more static tasks, such as pick-and-place operations. Furthermore, the scarcity of computational resources onboard mobile robots further motivates minimizing the computation required to predict actions with diffusion-based policies. Several techniques have been proposed to reduce the required steps while preserving the performance of diffusion-based imitation learning policies [2, 3], mainly inspired by techniques developed for speeding up image-generation diffusion models [4, 5, 6]. Still, there are few improvements specific to sequence-generating diffusion models.
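The iteration cost described above is visible in a bare-bones DDPM sampling loop. This is a generic, simplified sketch (the stub network, schedule, and names are illustrative, not from TEDi Policy): each of the N diffusion steps pays one forward pass of the denoising network, which is the per-action-chunk latency that acceleration work targets.

```python
import numpy as np

def ddpm_sample(eps_net, shape, betas, rng):
    """Naive DDPM sampling loop: one network evaluation per diffusion step.

    eps_net(x, t) predicts the injected noise at step t; betas is the
    noise schedule. With N steps, predicting one action chunk costs N
    forward passes of a typically large network.
    """
    alphas = 1.0 - betas
    alphas_bar = np.cumprod(alphas)
    x = rng.standard_normal(shape)               # start from pure noise
    for t in reversed(range(len(betas))):
        eps = eps_net(x, t)                      # one network evaluation
        coef = betas[t] / np.sqrt(1.0 - alphas_bar[t])
        x = (x - coef * eps) / np.sqrt(alphas[t])
        if t > 0:                                # add noise except at t = 0
            x = x + np.sqrt(betas[t]) * rng.standard_normal(shape)
    return x

# Count forward passes with a stub network that predicts zero noise.
calls = []
stub_net = lambda x, t: (calls.append(t), np.zeros_like(x))[1]
sample = ddpm_sample(stub_net, (8, 2), np.full(10, 0.01), np.random.default_rng(0))
```

Step-reduction techniques shrink `len(betas)` (or replace the loop entirely), which is why they transfer so directly from image generation to policy inference.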