Action Inference by Maximising Evidence: Zero-Shot Imitation from Observation with World Models
Philip Becker-Ehmck, Patrick van der Smagt, Maximilian Karl
Neural Information Processing Systems
Unlike most reinforcement learning agents, which require an unrealistic amount of environment interaction to learn a new behaviour, humans excel at learning quickly by merely observing and imitating others. This ability depends heavily on the fact that humans have a model of their own embodiment, which allows them to infer the most likely actions that led to the observed behaviour. In this paper, we propose Action Inference by Maximising Evidence (AIME) to replicate this behaviour using world models. AIME consists of two distinct phases. In the first phase, the agent learns a world model from its past experience to understand its own body by maximising the evidence lower bound (ELBO).
Mar-27-2025, 12:27:48 GMT