Online Learning with Sublinear Best-Action Queries
Russo, Matteo, Celli, Andrea, Baldeschi, Riccardo Colini, Fusco, Federico, Haimovich, Daniel, Karamshuk, Dima, Leonardi, Stefano, Tax, Niek
–arXiv.org Artificial Intelligence
Online learning is a foundational problem in machine learni ng. In its simplest version, a decision maker repeatedly interacts with a fixed set of n actions over a time horizon T . At each time, the decision maker needs to choose one of a set of actions; sub sequently, it receives an action-dependent loss and observes some feedback. These loss funct ions are generated by an omniscient (but oblivious) adversary and are only revealed on-the-go. The goal of the decision maker is to design a learning algorithm that achieves small regret with respect to the best fixed action in hindsight, i.e., the difference between the decision maker' s loss and that of the fixed action. Several online learning algorithms have been developed, character ized by optimal instance-independent regret bound, depending on the feedback model [ 13, 28 ]. Following the recent literature on algorithms with machine learning-based predictions (see, e.g., the survey by Mitzenmacher and Vassilvitskii [ 24 ]), we study the case where the learner is allowed to issue a limited number of best-action queries to an oracle that reveals the identity of the best action for that step, so that the learner can choose it. This s etting is motivated by scenarios in which obtaining accurate predictions on the optimal choice among numerous actions is possible but comes with significant costs and time constraints. For in stance, consider an online platform that continuously moderates posted content (e.g., Meta [ 22, 23 ] or Google [ 17 ]), and the online learning problem it faces: posts are generated one after the other, and the platform's task consists
arXiv.org Artificial Intelligence
Jul-23-2024
- Country:
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Genre:
- Research Report (0.50)
- Industry:
- Education > Educational Setting > Online (1.00)
- Technology: