Finite-Sample Analysis for SARSA with Linear Function Approximation
Shaofeng Zou, Tengyu Xu, Yingbin Liang
Neural Information Processing Systems
SARSA is an on-policy algorithm for learning a policy for a Markov decision process in reinforcement learning. We investigate the SARSA algorithm with linear function approximation under the non-i.i.d.
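To make the setting concrete, here is a minimal sketch of on-policy SARSA with a linear action-value estimate, updated along one continuing trajectory. Everything specific in it is an assumption for illustration and is not taken from the paper: the five-state chain environment, the one-hot features, the epsilon-greedy policy, and the values of `alpha`, `gamma`, and `eps`.

```python
import random


def features(state, action, n_actions, dim):
    """One-hot feature vector phi(s, a); an illustrative choice of features."""
    phi = [0.0] * dim
    phi[state * n_actions + action] = 1.0
    return phi


def q_value(theta, phi):
    """Linear action-value estimate: inner product of theta and phi."""
    return sum(t * p for t, p in zip(theta, phi))


def epsilon_greedy(theta, state, n_actions, dim, eps, rng):
    """Random action with probability eps, else a greedy one (ties broken at random)."""
    if rng.random() < eps:
        return rng.randrange(n_actions)
    qs = [q_value(theta, features(state, a, n_actions, dim)) for a in range(n_actions)]
    best = max(qs)
    return rng.choice([a for a in range(n_actions) if qs[a] == best])


def step(state, action, n_states):
    """Hypothetical chain: action 1 moves right, action 0 moves left.

    The reward is 1 whenever the next state is the last state, else 0.
    """
    nxt = min(state + 1, n_states - 1) if action == 1 else max(state - 1, 0)
    return nxt, (1.0 if nxt == n_states - 1 else 0.0)


def sarsa_linear(n_states=5, n_actions=2, steps=20000,
                 alpha=0.1, gamma=0.9, eps=0.1, seed=0):
    """Run SARSA along a single trajectory and return the weight vector theta."""
    rng = random.Random(seed)
    dim = n_states * n_actions
    theta = [0.0] * dim
    s = 0
    a = epsilon_greedy(theta, s, n_actions, dim, eps, rng)
    for _ in range(steps):
        s2, r = step(s, a, n_states)
        # On-policy: the next action comes from the same policy being learned.
        a2 = epsilon_greedy(theta, s2, n_actions, dim, eps, rng)
        phi = features(s, a, n_actions, dim)
        phi2 = features(s2, a2, n_actions, dim)
        # Temporal-difference error using the action actually taken next.
        td_error = r + gamma * q_value(theta, phi2) - q_value(theta, phi)
        # Semi-gradient step: the gradient of a linear estimate is phi.
        theta = [t + alpha * td_error * p for t, p in zip(theta, phi)]
        s, a = s2, a2
    return theta
```

Calling `sarsa_linear()` returns the learned weights. With one-hot features, entry `state * n_actions + action` is the estimate for that state-action pair. Because consecutive samples come from the same trajectory, they are correlated rather than independent.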
Jan-26-2025, 03:01:54 GMT