Total stochastic gradient algorithms and applications in reinforcement learning

Paavo Parmas

Neural Information Processing Systems 

Neural Information Processing Systems http://nips.cc/