Adversarial Online Learning with Temporal Feedback Graphs

Khashayar Gatmiry, Jon Schneider

arXiv.org Artificial Intelligence 

Prediction with expert advice is one of the most fundamental problems in online learning. In its simplest form, a learner must choose one of K actions (possibly a randomized mixture of actions) in each of T rounds. An adversary then reveals a loss vector containing the loss for each action, the learner incurs the corresponding loss, and play proceeds to the next round. The goal of the learner in such settings is usually to minimize their regret: the gap between their total utility at the end of the game and the maximum utility they could have received had they played the best fixed action in hindsight. Notably, it is possible to construct algorithms for the learner that achieve regret sublinear in T against any adversarially chosen sequence of losses. Traditionally, a learner may use the entire history of losses up until round t to decide their action at round t. In this paper, we investigate what happens if we restrict the learner's action at time t to depend only on the losses in some subset of these rounds. Formally, we require the learner's (randomized) action at time t to be a function of the losses in some subset of rounds S
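As a point of reference for the setting the abstract describes, below is a minimal sketch of the classical full-information experts problem solved with the standard Hedge (multiplicative weights) algorithm, which attains O(sqrt(T log K)) regret. The function name `hedge`, the learning rate `eta`, and the synthetic losses are illustrative choices, not from the paper; the paper itself studies the restricted-feedback variant in which the round-t action may depend only on losses from an allowed subset of earlier rounds.

```python
import numpy as np

def hedge(losses, eta):
    """Hedge (multiplicative weights) sketch for prediction with expert advice.

    losses: (T, K) array; losses[t, k] in [0, 1] is the loss of action k at round t.
    eta:    learning rate, e.g. sqrt(2 * log(K) / T).
    Returns the learner's total expected loss over T rounds.
    """
    T, K = losses.shape
    weights = np.ones(K)
    total_loss = 0.0
    for t in range(T):
        p = weights / weights.sum()          # randomized mixture over the K actions
        total_loss += p @ losses[t]          # expected loss incurred this round
        weights *= np.exp(-eta * losses[t])  # full-information update: uses the whole loss vector
    return total_loss

# Regret = learner's loss minus the loss of the best fixed action in hindsight;
# with this eta it is O(sqrt(T log K)), i.e. sublinear in T.
rng = np.random.default_rng(0)
T, K = 10_000, 5
losses = rng.random((T, K))
eta = np.sqrt(2 * np.log(K) / T)
regret = hedge(losses, eta) - losses.sum(axis=0).min()
print(f"regret ~ {regret:.1f} (vs. T = {T})")
```

Under the paper's restriction, the update at round t would be allowed to consult only the losses from the permitted subset of earlier rounds, rather than the full history used above.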
