Reviews: Fast Rates for Bandit Optimization with Upper-Confidence Frank-Wolfe
Neural Information Processing Systems
This is a very interesting paper at the intersection of bandit problems and stochastic optimization. The authors consider a setting in which, at each time step, a decision maker takes an action and receives as feedback a noisy estimate of the gradient at the chosen point; the assumption is that the decision maker has some control over how noisy this estimate is. The challenge is to choose a sequence of actions so that the average of all iterates has low error relative to the optimum. The goal is thus similar to that of stochastic optimization, but, unlike in stochastic optimization (and as in bandit problems), only limited information about the loss function is revealed.
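To make the setting concrete, here is a minimal sketch (not the authors' exact algorithm or constants): minimizing an illustrative smooth quadratic over the probability simplex, where each round reveals only a noisy gradient coordinate for the chosen vertex, the next vertex is selected via an optimistic confidence bound on the averaged gradient estimates, and the iterate is the running average of the vertices played, Frank-Wolfe style. The objective, step sizes, and confidence bonus below are all assumptions for illustration.

```python
import numpy as np

# Illustrative sketch only: not the paper's algorithm verbatim.
# Assumed objective: f(p) = 0.5 * ||p - p_star||^2 over the simplex.
rng = np.random.default_rng(0)
K, T, sigma = 3, 5000, 0.1
p_star = np.array([0.7, 0.2, 0.1])

def grad(p):
    """Gradient of the illustrative quadratic objective."""
    return p - p_star

counts = np.ones(K)                     # pretend each vertex was tried once
p = np.full(K, 1.0 / K)                 # running average of chosen vertices
g_hat = grad(p) + sigma * rng.standard_normal(K)  # noisy initial estimates

for t in range(K, T):
    # Optimistic (lower-confidence) bound on each gradient coordinate,
    # since we are minimizing: under-sampled vertices get a larger bonus.
    bonus = sigma * np.sqrt(2.0 * np.log(t + 1) / counts)
    arm = int(np.argmin(g_hat - bonus))
    # Feedback: one noisy gradient coordinate at the current average iterate.
    obs = grad(p)[arm] + sigma * rng.standard_normal()
    counts[arm] += 1
    g_hat[arm] += (obs - g_hat[arm]) / counts[arm]   # running-mean update
    # Frank-Wolfe step toward the chosen vertex with step size 1/(t+1);
    # p remains the average of all vertices played so far.
    e = np.zeros(K)
    e[arm] = 1.0
    p += (e - p) / (t + 1)

f_gap = 0.5 * np.sum((p - p_star) ** 2)
print(round(float(f_gap), 4))
```

The point of the averaging step is exactly the feature the review highlights: the quantity evaluated against the optimum is the average of the actions taken, so exploration (pulling a suboptimal vertex to refine its gradient estimate) directly degrades the final iterate, unlike in standard stochastic optimization.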
Oct-8-2024, 10:42:39 GMT