Bandit Smooth Convex Optimization: Improving the Bias-Variance Tradeoff
Dekel, Ofer, Eldan, Ronen, Koren, Tomer
Neural Information Processing Systems
Bandit convex optimization is one of the fundamental problems in the field of online learning. The best algorithm for the general bandit convex optimization problem guarantees a regret of $\widetilde{O}(T^{5/6})$, while the best known lower bound is $\Omega(T^{1/2})$. Many attempts have been made to bridge the huge gap between these bounds. A particularly interesting special case of this problem assumes that the loss functions are smooth. In this case, the best known algorithm guarantees a regret of $\widetilde{O}(T^{2/3})$. We present an efficient algorithm for the bandit smooth convex optimization problem that guarantees a regret of $\widetilde{O}(T^{5/8})$. Our result rules out an $\Omega(T^{2/3})$ lower bound and takes a significant step towards the resolution of this open problem.
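To get a feel for how far apart the exponents quoted above are, the following minimal sketch evaluates $T^{\alpha}$ for each of them at a few horizons. It is only an illustration of the growth rates: it ignores the logarithmic factors hidden by $\widetilde{O}$ and all constants, and the names `REGRET_EXPONENTS`, `regret_scale`, and `compare` are hypothetical helpers, not anything from the paper.

```python
# Exponents of T in the regret bounds quoted in the abstract.
# Assumption: log factors and constants are ignored, so these are
# growth rates only, not actual regret values.
REGRET_EXPONENTS = {
    "general upper bound": 5 / 6,
    "previous smooth upper bound": 2 / 3,
    "smooth upper bound (this paper)": 5 / 8,
    "lower bound": 1 / 2,
}


def regret_scale(T, exponent):
    """Return T**exponent, the polynomial part of a regret bound."""
    return T ** exponent


def compare(T):
    """Map each bound's label to its polynomial growth at horizon T."""
    return {name: regret_scale(T, e) for name, e in REGRET_EXPONENTS.items()}


if __name__ == "__main__":
    for T in (10**4, 10**6, 10**8):
        print(f"T = {T:,}")
        for name, value in compare(T).items():
            print(f"  {name:32s} T^a = {value:,.0f}")
```

For any $T > 1$ the values are ordered as $T^{1/2} < T^{5/8} < T^{2/3} < T^{5/6}$, which is why a $\widetilde{O}(T^{5/8})$ upper bound rules out an $\Omega(T^{2/3})$ lower bound for the smooth case.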
Dec-31-2015