Provable Guarantees for Generative Behavior Cloning: Bridging Low-Level Stability and High-Level Behavior
Adam Block, Ali Jadbabaie, Daniel Pfrommer, Max Simchowitz, Russ Tedrake
Training dynamic agents from datasets of expert examples, known as imitation learning, promises to take advantage of the plentiful demonstrations available in the modern data environment, analogously to the recent successes of language models trained with unsupervised learning on enormous corpora of text [68, 71]. Imitation learning is especially exciting in robotics, where vast stores of pre-recorded demonstrations on YouTube [1] or cheaply collected simulated trajectories [43, 20] can be converted into learned robotic policies. For imitation learning to be a viable path toward generalist robotic behavior, it must be able to both represent and execute the complex behaviors exhibited in the demonstration data. An approach that has shown tremendous promise is generative behavior cloning: fitting generative models, such as diffusion models [2, 19, 34], to expert demonstrations with pure supervised learning.

In this paper, we ask: under what conditions can generative behavior cloning imitate arbitrarily complex expert behavior? In particular, we are interested in how algorithmic choices interface with the dynamics of the agent's environment to render imitation possible. The key challenge separating imitation learning from vanilla supervised learning is compounding error: when the learner executes the trained behavior in its environment, small mistakes can accumulate into larger ones; these in turn may bring the agent to regions of state space not seen during training, leading to still larger deviations from the intended trajectories.
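To make the compounding-error mechanism concrete, here is a minimal sketch, not taken from the paper, using a hypothetical one-dimensional linear system x_{t+1} = a*x_t + u_t. If the imitator replays the expert's action sequence with a small per-step error of size eps, the deviation e_t from the expert trajectory obeys e_{t+1} = a*e_t + noise, so it grows geometrically when the dynamics are unstable (|a| > 1) and stays on the order of eps when they are stable (|a| < 1).

```python
import numpy as np

def deviation_growth(a, eps, horizon, seed=0):
    """Track |x_imitator - x_expert| for x_{t+1} = a*x_t + u_t.

    Hypothetical illustration: the imitator replays the expert's
    action sequence with a small per-step error eps, so the deviation
    obeys e_{t+1} = a*e_t + (per-step error). It grows geometrically
    when |a| > 1 and stays bounded when |a| < 1.
    """
    rng = np.random.default_rng(seed)
    e = 0.0
    history = []
    for _ in range(horizon):
        e = a * e + eps * rng.standard_normal()
        history.append(abs(e))
    return history

# Unstable dynamics: tiny per-step errors blow up over a 50-step horizon,
# on the rough scale of eps * a**50.
print(deviation_growth(a=1.2, eps=1e-3, horizon=50)[-1])
# Stable dynamics: the same per-step errors stay on the order of eps.
print(deviation_growth(a=0.8, eps=1e-3, horizon=50)[-1])
```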
Oct-24-2023