OpenAI Baselines: ACKTR & A2C
ACKTR can learn continuous control tasks, like moving a robotic arm to a target location, purely from low-resolution pixel inputs. ACKTR (pronounced "actor"), short for Actor Critic using Kronecker-factored Trust Region, was developed by researchers at the University of Toronto and New York University, and we at OpenAI have collaborated with them to release a Baselines implementation. The authors use ACKTR to learn control policies for simulated robots (with pixels as input and continuous action spaces) and Atari agents (with pixels as input and discrete action spaces). ACKTR combines three distinct techniques: actor-critic methods, trust region optimization for more consistent improvement, and distributed Kronecker factorization to improve sample efficiency and scalability. For machine learning algorithms, two costs are important to consider: sample complexity and computational complexity.
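To make the computational side of Kronecker factorization concrete, here is a minimal NumPy sketch. It is not the Baselines implementation. It assumes a single layer whose curvature matrix is approximated as the Kronecker product of two small symmetric positive-definite factors, here called `A` (input side) and `G` (output side). Those names, the layer sizes, and the random matrices are illustrative assumptions. The sketch shows only that a linear system involving the full product can be solved using the two small factors alone.

```python
import numpy as np


def random_spd(n, rng):
    """Return a random, well-conditioned symmetric positive-definite matrix."""
    m = rng.standard_normal((n, n))
    return m @ m.T + n * np.eye(n)


def solve_dense(A, G, grad):
    """Reference: build the full Kronecker product and solve against it."""
    # With NumPy's row-major flattening of an (n_out, n_in) gradient,
    # the matching curvature matrix is kron(G, A).
    F = np.kron(G, A)  # shape (n_out * n_in, n_out * n_in)
    return np.linalg.solve(F, grad.ravel()).reshape(grad.shape)


def solve_factored(A, G, grad):
    """Same result from the two small factors: G^{-1} @ grad @ A^{-1}."""
    left = np.linalg.solve(G, grad)  # G^{-1} @ grad
    # A is symmetric, so solving against A on the transposed matrix
    # and transposing back is a right-multiplication by A^{-1}.
    return np.linalg.solve(A, left.T).T


# Hypothetical layer sizes, chosen only for illustration.
rng = np.random.default_rng(0)
n_in, n_out = 4, 3
A = random_spd(n_in, rng)
G = random_spd(n_out, rng)
grad = rng.standard_normal((n_out, n_in))

dense = solve_dense(A, G, grad)
factored = solve_factored(A, G, grad)
print(np.allclose(dense, factored))
```

The dense route solves against a matrix with (n_in * n_out)^2 entries, while the factored route only touches an n_in x n_in and an n_out x n_out matrix. For realistic layer sizes, that gap is what makes this kind of curvature information affordable.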
Aug-22-2017, 15:40:41 GMT