Will it Blend? Composing Value Functions in Reinforcement Learning
van Niekerk, Benjamin; James, Steven; Earle, Adam; Rosman, Benjamin
An important property for lifelong-learning agents is the ability to combine existing skills to solve unseen tasks. In general, however, it is unclear how to compose skills in a principled way. We provide a "recipe" for optimal value function composition in entropy-regularised reinforcement learning (RL) and then extend this to the standard RL setting. Composition is demonstrated in a video game environment, where an agent with an existing library of policies is able to solve new tasks without the need for further learning.
Jul-12-2018
- Country:
- Africa > South Africa
- Gauteng
- Johannesburg (0.04)
- Pretoria (0.04)
- Europe > Sweden
- Genre:
- Instructional Material (0.49)
- Research Report (0.64)
- Industry:
- Leisure & Entertainment > Games > Computer Games (0.54)
- Technology: