A Regularized Approach to Sparse Optimal Policy in Reinforcement Learning
Wenhao Yang, Xiang Li, Zhihua Zhang
Neural Information Processing Systems
We propose and study a general framework for regularized Markov decision processes (MDPs) where the goal is to find an optimal policy that maximizes the expected discounted total reward plus a policy regularization term.
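The objective described above can be written out as a rough sketch. The notation here is assumed for illustration and is not taken from the abstract: $\gamma \in [0,1)$ is the discount factor, $r$ is the reward function, $\Omega$ is a generic policy regularizer, and $\lambda \ge 0$ is a weight on the regularization term.

```latex
% Sketch of a regularized MDP objective (all symbols are assumed notation).
\[
  \max_{\pi} \;
  \mathbb{E}_{\pi}\!\left[
    \sum_{t=0}^{\infty} \gamma^{t}
    \Bigl( r(s_t, a_t) + \lambda \, \Omega\bigl(\pi(\cdot \mid s_t)\bigr) \Bigr)
  \right]
\]
% One familiar instance is the Shannon entropy regularizer:
\[
  \Omega\bigl(\pi(\cdot \mid s)\bigr)
  = -\sum_{a} \pi(a \mid s) \log \pi(a \mid s)
\]
```

Setting $\lambda = 0$ recovers the standard unregularized discounted-reward objective.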
Oct-2-2025, 14:46:58 GMT