PG-TS: Improved Thompson Sampling for Logistic Contextual Bandits

Bianca Dumitrascu, Karen Feng, Barbara Engelhardt

Neural Information Processing Systems 

Its explicit estimation of the posterior distribution of the context feature covariance leads to substantial empirical gains over approximate approaches.