A note on continuous-time online learning
In online learning, data arrive sequentially, and the learner's goal is to make decisions online so as to minimize its overall regret. This note is concerned with continuous-time models and algorithms for several online learning problems: online linear optimization, adversarial bandits, and adversarial linear bandits. For each problem, we extend the discrete-time algorithm to the continuous-time setting and provide a concise proof of the optimal regret bound.
May-16-2024
- Country:
  - North America > United States > California > Santa Clara County (0.14)
- Genre:
  - Research Report (0.50)
- Industry:
  - Education > Educational Setting > Online (0.87)
- Technology: