AUnifying View of Linear Function Approximation in Off-Policy Reinforcement Learning through Matrix Splitting and Preconditioning

Open in new window