Equipping Experts/Bandits with Long-term Memory
Kai Zheng, Haipeng Luo, Ilias Diakonikolas, Liwei Wang
–Neural Information Processing Systems
We propose the first reduction-based approach to obtaining long-term memory guarantees for online learning in the sense of Bousquet and Warmuth [8], by reducing the problem to achieving typical switching regret. Specifically, for the classical expert problem with K actions and T rounds, using our framework we develop various algorithms with a regret bound of order O( T (S ln T + n ln K)) compared to any sequence of experts with S 1 switches among n min{S, K} distinct experts. In addition, by plugging specific adaptive algorithms into our framework we also achieve the best of both stochastic and adversarial environments simultaneously.
Neural Information Processing Systems
Jan-22-2025, 23:42:42 GMT
- Country:
- North America > United States > California (0.28)
- Industry:
- Education > Educational Setting > Online (0.34)
- Technology: