Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation
