ALOHA

Mar-31-2023, 21:41:37 GMT–Stanford Engineering

We introduce Action Chunking with Transformers (ACT). The key design choice is to predict a sequence of actions (an "action chunk") rather than a single action, as standard Behavior Cloning does. The ACT policy (figure: right) is trained as the decoder of a Conditional VAE (CVAE), i.e. a generative model. It fuses images from multiple viewpoints, joint positions, and a style variable \(z\) with a transformer encoder, and predicts a sequence of actions with a transformer decoder. The CVAE encoder (figure: left) compresses the action sequence and joint observations into \(z\), the "style" of the action sequence.
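To make the chunking idea concrete, here is a minimal sketch of a control loop that asks a policy for several actions at once. It is not the ACT implementation: `dummy_chunk_policy`, `rollout`, the toy dynamics, and the choice to execute each chunk fully before querying again are all assumptions made for illustration. A real ACT policy would be a transformer conditioned on camera images, joint positions, and \(z\).

```python
from typing import Callable, List, Tuple

Observation = List[float]  # e.g. joint positions (assumed toy representation)
Action = List[float]       # e.g. joint position deltas (assumed)
ChunkPolicy = Callable[[Observation, int], List[Action]]


def dummy_chunk_policy(obs: Observation, chunk_size: int) -> List[Action]:
    """Hypothetical stand-in for a trained policy.

    Returns `chunk_size` actions computed from a single observation.
    Each action nudges every joint slightly toward zero.
    """
    return [[-0.1 * q for q in obs] for _ in range(chunk_size)]


def rollout(
    policy: ChunkPolicy,
    obs: Observation,
    horizon: int,
    chunk_size: int,
) -> Tuple[List[Action], int]:
    """Run `horizon` control steps, querying the policy once per chunk.

    Assumption: each predicted chunk is executed open-loop before the
    policy is queried again.
    """
    executed: List[Action] = []
    policy_calls = 0
    t = 0
    while t < horizon:
        chunk = policy(obs, chunk_size)
        policy_calls += 1
        # Truncate the last chunk so we never exceed the horizon.
        for action in chunk[: horizon - t]:
            obs = [q + a for q, a in zip(obs, action)]  # toy dynamics
            executed.append(action)
            t += 1
    return executed, policy_calls
```

With `horizon=10` and `chunk_size=4`, this loop queries the policy three times instead of ten. Setting `chunk_size=1` recovers the single-action behavior of standard Behavior Cloning.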



News Web Page

