Less_Hints (6)

Manish Purohit

Neural Information Processing Systems 

In this work, we study the question of whether an algorithm really requires a hint at every time step. Somewhat surprisingly, we show that an algorithm can obtain O(log T) regret with just O(p T) hints under a natural query model; in contrast, we also show that o(p T) hints cannot guarantee better than (p T) regret. We give two applications of our result, to the well-studied setting of optimistic regret bounds and to the problem of online learning with abstention.