Improved Variance-Aware Confidence Sets for Linear Bandits and Linear Mixture MDP

Neural Information Processing Systems 

One of the most fundamental and widely used methods is linear function approximation.