
Akshay Krishnamurthy

Neural Information Processing Systems 

In this section we provide a detailed proof of the main theorem. First, we state some facts about the learning rate and the algorithm. This bound contains three parts. The first is an upper bound for the first step, when there is no data. The third part is an "average" of the estimated future regret.