Outcome-Driven Reinforcement Learning via Variational Inference
Tim G. J. Rudner (University of Oxford), Vitchyr H. Pong
Neural Information Processing Systems
Illustration of the shaping effect of the reward function derived from the goal-directed variational inference objective.
Aug-15-2025, 01:03:21 GMT
- Country:
- North America > United States
- Louisiana > Orleans Parish
- New Orleans (0.04)
- California > Alameda County
- Berkeley (0.04)
- Europe > United Kingdom
- England > Oxfordshire > Oxford (0.40)
- Technology: