Towards minimax policies for online linear optimization with bandit feedback
