Reviews: Online Learning with a Hint
–Neural Information Processing Systems
The paper concerns online linear optimization where at each trial, the player, prior to prediction, receives a hint about the loss function. The hint has a form of a unit vector which is weakly correlated with the loss vector (its angle's cosine with loss vector is at least alpha). The paper shows that: - When the set of feasible actions is strongly convex, there exists an algorithm which gets logarithmic regret (in T). The algorithm is obtained by a reduction to the online learning problem with exp-concave losses. The bound is unimprovable in general, as shown in the Lower Bounds section.
Neural Information Processing Systems
Oct-7-2024, 15:23:38 GMT