Exploiting easy data in online optimization
Sani, Amir, Neu, Gergely, Lazaric, Alessandro
Neural Information Processing Systems
We consider the problem of online optimization, where a learner chooses a decision from a given decision set and suffers a loss determined by that decision and the state of the environment. The learner's objective is to minimize its cumulative regret against the best fixed decision in hindsight. Over the past few decades, numerous variants of this problem have been considered, with many algorithms designed to achieve sub-linear regret in the worst case. However, this level of robustness comes at a cost: the proposed algorithms are often over-conservative and fail to adapt to the actual complexity of the loss sequence, which is often far from the worst case.
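To make the notion of cumulative regret concrete, the following is a minimal sketch, not the paper's method. It assumes a finite decision set indexed 0..K-1 and a fully known table of losses; the function name `cumulative_regret` and the toy numbers are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def cumulative_regret(losses: Sequence[Sequence[float]],
                      choices: Sequence[int]) -> float:
    """Regret of the learner against the best fixed decision in hindsight.

    losses[t][k] is the loss of decision k in round t (assumed finite
    decision set); choices[t] is the decision the learner played in round t.
    """
    if len(losses) != len(choices):
        raise ValueError("losses and choices must cover the same rounds")
    if not losses:
        return 0.0

    # Total loss actually suffered by the learner.
    learner_loss = sum(row[k] for row, k in zip(losses, choices))

    # Total loss of each fixed decision, had it been played in every round.
    num_decisions = len(losses[0])
    fixed_losses = [sum(row[k] for row in losses) for k in range(num_decisions)]

    # Regret is measured against the best fixed decision in hindsight.
    return learner_loss - min(fixed_losses)


# Toy example with two decisions over three rounds (made-up numbers).
toy_losses = [[0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
toy_choices = [1, 0, 1]  # the learner picks the worse decision every round
toy_regret = cumulative_regret(toy_losses, toy_choices)
```

In this toy run the learner suffers a total loss of 3, while always playing decision 0 would have cost 1, so the regret is 2. Sub-linear regret means this gap grows more slowly than the number of rounds.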
Mar-19-2020, 05:32:12 GMT