Structure Matters: Dynamic Policy Gradient
–Neural Information Processing Systems
In this work, we study $\gamma$-discounted infinite-horizon tabular Markov decision processes (MDPs) and introduce a framework called dynamic policy gradient (DynPG).
Neural Information Processing Systems
Jun-10-2026, 17:24:54 GMT
- Technology: