Language Control Diffusion: Efficiently Scaling through Space, Time, and Tasks
Edwin Zhang, Yujie Lu, William Wang, Amy Zhang
–arXiv.org Artificial Intelligence
Training generalist agents is difficult across several axes, requiring us to deal with high-dimensional inputs (space), long horizons (time), and multiple and novel tasks. Recent architectural advances have allowed for improved scaling along one or two of these dimensions, but are still computationally prohibitive. In this paper, we propose to address all three axes by leveraging Language to Control Diffusion models as a hierarchical planner conditioned on language (LCD). We effectively and efficiently scale diffusion models for planning across extended temporal, state, and task dimensions in order to tackle long-horizon control problems conditioned on natural language instructions. We compare LCD with other state-of-the-art models on the CALVIN language robotics benchmark and find that LCD outperforms other SOTA methods in multi-task success rate while dramatically improving computational efficiency, achieving a single-task success rate (SR) of 88.7% against the previous best of 82.6%. We show that LCD can successfully leverage the unique strength of diffusion models to produce coherent long-range plans while addressing their weakness at generating low-level details and control. We release our code and models at https://github.com/ezhang7423/language-control-diffusion.
Apr-10-2023
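
At a high level, the hierarchy described in the abstract pairs a language-conditioned diffusion model, which generates a coarse sequence of latent subgoal states, with a separate low-level policy that fills in fine-grained actions. The sketch below is a minimal illustration of that split, assuming a simple DDPM-style sampler and small MLP networks; the module names, dimensions, and sampler are placeholders for illustration, not the authors' released implementation.

```python
# Minimal, illustrative sketch of language-conditioned hierarchical planning with a
# diffusion model, in the spirit of LCD. All names, sizes, and the simple DDPM sampler
# below are assumptions for illustration only.
import torch
import torch.nn as nn

STATE_DIM, ACT_DIM, LANG_DIM, HORIZON, T = 32, 7, 384, 16, 50


class NoisePredictor(nn.Module):
    """Predicts the noise added to a latent plan, conditioned on language and timestep."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(HORIZON * STATE_DIM + LANG_DIM + 1, 512),
            nn.ReLU(),
            nn.Linear(512, HORIZON * STATE_DIM),
        )

    def forward(self, plan, lang, t):
        # plan: (B, HORIZON * STATE_DIM), lang: (B, LANG_DIM), t: (B, 1)
        return self.net(torch.cat([plan, lang, t], dim=-1))


class HighLevelDiffusionPlanner(nn.Module):
    """Generates a coarse sequence of latent subgoal states via reverse diffusion."""

    def __init__(self):
        super().__init__()
        self.eps_model = NoisePredictor()
        betas = torch.linspace(1e-4, 0.02, T)
        alphas = 1.0 - betas
        self.register_buffer("betas", betas)
        self.register_buffer("alphas", alphas)
        self.register_buffer("alpha_bars", torch.cumprod(alphas, dim=0))

    @torch.no_grad()
    def sample_plan(self, lang):
        # Standard DDPM ancestral sampling over the entire plan at once.
        x = torch.randn(lang.shape[0], HORIZON * STATE_DIM)
        for t in reversed(range(T)):
            t_in = torch.full((lang.shape[0], 1), t / T)
            eps = self.eps_model(x, lang, t_in)
            a, ab, b = self.alphas[t], self.alpha_bars[t], self.betas[t]
            x = (x - b / torch.sqrt(1.0 - ab) * eps) / torch.sqrt(a)
            if t > 0:
                x = x + torch.sqrt(b) * torch.randn_like(x)
        return x.view(-1, HORIZON, STATE_DIM)  # sequence of latent subgoals


class LowLevelPolicy(nn.Module):
    """Maps the current state and the next subgoal to a fine-grained action."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * STATE_DIM, 256), nn.ReLU(), nn.Linear(256, ACT_DIM)
        )

    def forward(self, state, subgoal):
        return self.net(torch.cat([state, subgoal], dim=-1))


if __name__ == "__main__":
    planner, policy = HighLevelDiffusionPlanner(), LowLevelPolicy()
    lang_emb = torch.randn(1, LANG_DIM)   # stand-in for an encoded instruction
    state = torch.randn(1, STATE_DIM)     # stand-in for the current latent state
    plan = planner.sample_plan(lang_emb)  # (1, HORIZON, STATE_DIM) subgoal sequence
    action = policy(state, plan[:, 0])    # low-level controller tracks the first subgoal
    print(plan.shape, action.shape)
```

In this toy setup the low-level controller simply tracks the first latent subgoal; in practice the planner's coherent long-range plan is followed step by step while the low-level policy supplies the control details the diffusion model is weaker at generating.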