A new convergent variant of Q-learning with linear function approximation

Neural Information Processing Systems

In this paper, we investigate the convergence of reinforcement learning with linear function approximation in control settings.