Online learning with dynamics: A minimax perspective

Oct-11-2024, 02:10:20 GMT–Neural Information Processing Systems

We consider the problem of online learning with dynamics, where a learner interacts with a stateful environment over multiple rounds. In each round of the interaction, the learner selects a policy to deploy and incurs a cost that depends on both the chosen policy and the current state of the world. The state-evolution dynamics and the costs are allowed to be time-varying, possibly in an adversarial way. In this setting, we study the problem of minimizing policy regret and provide non-constructive upper bounds on the minimax rate for the problem. Our main results provide sufficient conditions for online learnability in this setup, together with the corresponding rates.
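
To make the notion of policy regret concrete, the following is a minimal sketch of the interaction protocol described above, not the paper's formal definition. It assumes a finite policy class, scalar states and actions, deterministic per-round dynamics, and costs that depend on the state and on the action the policy takes in that state; all names (`rollout_cost`, `policy_regret`, `constant`) are hypothetical and chosen for illustration. The comparator is the best fixed policy evaluated on the state trajectory it would itself have induced, which is what distinguishes policy regret from ordinary regret.

```python
from typing import Callable, Sequence

# Hypothetical types for illustration only.
Policy = Callable[[float], float]           # state -> action
Dynamics = Callable[[float, float], float]  # (state, action) -> next state
Cost = Callable[[float, float], float]      # (state, action) -> cost


def rollout_cost(policies: Sequence[Policy], dynamics: Sequence[Dynamics],
                 costs: Sequence[Cost], x0: float) -> float:
    """Total cost of deploying policies[t] in round t, starting from state x0."""
    x, total = x0, 0.0
    for pi, f, c in zip(policies, dynamics, costs):
        a = pi(x)         # action chosen by this round's policy
        total += c(x, a)  # cost depends on the current state and the action
        x = f(x, a)       # state evolves under this round's dynamics
    return total


def policy_regret(played: Sequence[Policy], policy_class: Sequence[Policy],
                  dynamics: Sequence[Dynamics], costs: Sequence[Cost],
                  x0: float) -> float:
    """Learner's cost minus the cost of the best fixed policy, where each
    fixed policy is replayed from x0 on its own counterfactual trajectory."""
    horizon = len(played)
    learner = rollout_cost(played, dynamics, costs, x0)
    best_fixed = min(rollout_cost([pi] * horizon, dynamics, costs, x0)
                     for pi in policy_class)
    return learner - best_fixed


def constant(a: float) -> Policy:
    """Policy that ignores the state and always plays action a."""
    return lambda x: a


# Toy instance: the state accumulates actions, and the cost is the state itself.
T = 3
dyn = [lambda x, a: x + a] * T
cst = [lambda x, a: x] * T
pi0, pi1 = constant(0.0), constant(1.0)

# One early "bad" action keeps costing the learner in later rounds.
regret = policy_regret([pi1, pi0, pi0], [pi0, pi1], dyn, cst, x0=0.0)
print(regret)  # 2.0
```

In this toy instance the learner deviates from the best fixed policy only in the first round, yet it pays in the two following rounds because the state remembers that choice. This carry-over through the state is the effect that policy regret is meant to account for.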

complexity term, minimax perspective, online
