Parametrized Quantum Policies for Reinforcement Learning

Neural Information Processing Systems 

All these results, however, only consider supervised and generative learning settings.