Learning an Actionable Discrete Diffusion Policy via Large-Scale Actionless Video Pre-Training

Dec-24-2025, 23:12:21 GMT–Neural Information Processing Systems

Learning a generalist embodied agent capable of completing multiple tasks is challenging, primarily because action-labeled robotic datasets are scarce. In contrast, a vast number of human videos exist, capturing intricate tasks and interactions with the physical world. This makes it promising to pre-train on actionless human videos and transfer the learned knowledge to robot policy learning from a limited number of robot demonstrations. However, this remains challenging because of the domain gap between humans and robots. Moreover, it is difficult to extract useful information about the dynamics of the world from human videos, because the data are noisy and multimodal.

actionable discrete diffusion policy, artificial intelligence, machine learning

Technology:
- Information Technology > Artificial Intelligence
  - Robots (1.00)
  - Machine Learning (1.00)