Provably Efficient Q-Learning with Low Switching Cost

Yu Bai, Tengyang Xie, Nan Jiang, Yu-Xiang Wang

Jan-23-2025, 09:57:24 GMT–Neural Information Processing Systems

We take initial steps in studying PAC-MDP algorithms with limited adaptivity, that is, algorithms that change their exploration policy as infrequently as possible during regret minimization. This is motivated by the difficulty of running fully adaptive algorithms in real-world applications (such as medical domains), and we propose to quantify adaptivity using the notion of local switching cost.
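
The abstract names local switching cost without defining it. As a rough illustration only, the sketch below assumes the common reading for episodic tabular MDPs: each policy is a table indexed by step h and state x, and the cost counts the (h, x) entries that differ between consecutive episodes' policies, summed over all episodes. The function names and the list-of-lists policy representation are hypothetical and are not taken from this page.

```python
from typing import List, Sequence

# Assumed representation: policy[h][x] is the action chosen at step h in state x.
Policy = Sequence[Sequence[int]]


def pairwise_switches(pi: Policy, pi_next: Policy) -> int:
    """Count the (step, state) pairs where two policies pick different actions."""
    return sum(
        a != b
        for row, row_next in zip(pi, pi_next)
        for a, b in zip(row, row_next)
    )


def local_switching_cost(policies: List[Policy]) -> int:
    """Sum the pairwise switch counts over consecutive episodes."""
    return sum(pairwise_switches(p, q) for p, q in zip(policies, policies[1:]))


# Four episodes, horizon 2, two states, actions encoded as 0 or 1.
policies = [
    [[0, 0], [0, 0]],
    [[0, 1], [0, 0]],  # one entry changed
    [[0, 1], [0, 0]],  # unchanged
    [[1, 1], [0, 1]],  # two entries changed
]
print(local_switching_cost(policies))
```

Under this reading, an algorithm that keeps the same policy across many episodes pays nothing for those episodes, so a low total reflects infrequent, localized policy updates.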

artificial intelligence, machine learning, reinforcement learning

Country:
- North America
  - Canada (0.14)
  - United States (0.14)

Industry:
- Health & Medicine (0.95)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)