Bandit Smooth Convex Optimization: Improving the Bias-Variance Tradeoff
–Neural Information Processing Systems
Bandit convex optimization is one of the fundamental problems in the field of online learning. The best algorithm for the general bandit convex optimization problem guarantees a regret of \widetilde{O}(T^{5/6}), while the best known lower bound is \Omega(T^{1/2}). Many attempts have been made to bridge the huge gap between these bounds. A particularly interesting special case of this problem assumes that the loss functions are smooth. In this case, the best known algorithm guarantees a regret of \widetilde{O}(T^{2/3}).
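To give a rough sense of how far apart these rates are, the sketch below evaluates the leading-order terms T^{1/2}, T^{2/3}, and T^{5/6} for a few horizons T. It is illustrative only: constants and the logarithmic factors hidden by the tilde notation are ignored, and the names `RATES`, `regret_scale`, and `compare` are hypothetical helpers, not part of the paper.

```python
# Illustrative only: compare the growth of the regret rates quoted in the
# abstract, ignoring constants and the log factors hidden by the tilde.
RATES = {
    "lower bound, T^(1/2)": 1 / 2,
    "smooth losses, T^(2/3)": 2 / 3,
    "general convex, T^(5/6)": 5 / 6,
}


def regret_scale(T: int, exponent: float) -> float:
    """Return T**exponent, the leading-order term of a regret bound."""
    return float(T) ** exponent


def compare(T: int) -> dict:
    """Map each named rate to its leading-order value at horizon T."""
    return {name: regret_scale(T, e) for name, e in RATES.items()}


if __name__ == "__main__":
    for T in (10**3, 10**6, 10**9):
        print(f"T = {T:.0e}")
        for name, value in compare(T).items():
            # Average regret per round is value / T; it vanishes for all
            # three rates, but much more slowly for larger exponents.
            print(f"  {name:<26} {value:14.1f}   per round: {value / T:.2e}")
```

At T = 10^6, for instance, the three terms are roughly 10^3, 10^4, and 10^5, so each step from one rate to the next costs about a factor of ten at that horizon.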
Oct-11-2024, 13:06:45 GMT