Real-Time Execution of Action Chunking Flow Policies
Black, Kevin, Galliker, Manuel Y., Levine, Sergey
arXiv.org Artificial Intelligence
Modern AI systems, especially those interacting with the physical world, increasingly require real-time performance. However, the high latency of state-of-the-art generalist models, including recent vision-language-action models (VLAs), poses a significant challenge. While action chunking has enabled temporal consistency in high-frequency control tasks, it does not fully address the latency problem, leading to pauses or out-of-distribution jerky movements at chunk boundaries. This paper presents a novel inference-time algorithm that enables smooth asynchronous execution of action chunking policies. Our method, real-time chunking (RTC), is applicable to any diffusion- or flow-based VLA out of the box with no re-training. It generates the next action chunk while executing the current one, "freezing" actions guaranteed to execute and "inpainting" the rest. To test RTC, we introduce a new benchmark of 12 highly dynamic tasks in the Kinetix simulator and evaluate it on 6 challenging real-world bimanual manipulation tasks. Results demonstrate that RTC is fast, performant, and uniquely robust to inference delay, significantly improving task throughput and enabling high success rates in precise tasks, such as lighting a match, even in the presence of significant latency. See https://pi.website/research/real_time_chunking for videos.
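The "freeze and inpaint" idea in the abstract can be illustrated with a toy sketch. This is not the paper's algorithm: it assumes a hard mask that clamps the frozen actions after every refinement step, one scalar action per timestep, and a hypothetical `denoise_step` callable standing in for one flow or diffusion refinement step of a policy. The names `inpaint_next_chunk`, `steps_executed`, and `delay` are illustrative assumptions as well.

```python
import random


def inpaint_next_chunk(prev_chunk, steps_executed, delay, denoise_step,
                       num_steps=5, seed=0):
    """Toy hard-mask inpainting of the next action chunk (illustrative only).

    prev_chunk:     actions of the chunk currently being executed.
    steps_executed: actions of prev_chunk already run when inference starts.
    delay:          control steps that inference is assumed to take; the
                    actions run during that time cannot change, so they
                    are frozen in the new chunk.
    denoise_step:   hypothetical callable (chunk, t) -> chunk standing in
                    for one refinement step of a flow/diffusion policy.
    """
    remaining = prev_chunk[steps_executed:]
    if delay > len(remaining):
        raise ValueError("delay exceeds the actions left in the current chunk")

    # Actions guaranteed to execute before the new chunk is ready.
    frozen = remaining[:delay]

    # Start from noise, as a sampler for a generative policy would.
    rng = random.Random(seed)
    chunk = [rng.gauss(0.0, 1.0) for _ in range(len(prev_chunk))]

    for k in range(num_steps):
        t = k / num_steps
        chunk = list(denoise_step(chunk, t))
        # Clamp the frozen prefix; only the remaining entries are "inpainted".
        chunk[:delay] = frozen
    return chunk


def toy_denoise_step(chunk, t):
    # Stand-in for a learned model: move each action halfway toward 1.0.
    return [x + 0.5 * (1.0 - x) for x in chunk]


next_chunk = inpaint_next_chunk(
    prev_chunk=[0.0, 0.1, 0.2, 0.3, 0.4, 0.5],
    steps_executed=2,
    delay=3,
    denoise_step=toy_denoise_step,
)
```

In this sketch the first `delay` entries of `next_chunk` match the actions the robot would execute while inference runs, so switching to the new chunk introduces no discontinuity at those steps; the remaining entries are free to be regenerated.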
Dec-8-2025
- Country:
  - Europe
    - Poland (0.04)
    - United Kingdom > North Sea
      - Southern North Sea (0.04)
  - North America > Montserrat (0.04)
- Genre:
  - Research Report
    - Experimental Study (1.00)
    - New Finding (0.66)
- Industry:
  - Automobiles & Trucks (0.46)
  - Energy (0.68)
  - Information Technology (0.46)
  - Transportation > Ground
    - Road (0.46)
- Technology:
  - Information Technology
    - Architecture > Real Time Systems (1.00)
    - Artificial Intelligence
      - Machine Learning
        - Neural Networks > Deep Learning (0.67)
        - Reinforcement Learning (0.68)
      - Natural Language > Large Language Model (0.68)
      - Representation & Reasoning (1.00)
      - Robots (1.00)
      - Vision (1.00)