Structure Matters: Dynamic Policy Gradient
Neural Information Processing Systems
In this work, we study γ-discounted infinite-horizon tabular Markov decision processes (MDPs) and introduce a framework called dynamic policy gradient (DynPG).
Jun-15-2026, 04:24:40 GMT
- Country:
  - Europe (0.45)
  - North America > United States
    - Illinois (0.28)
- Genre:
  - Research Report > Experimental Study (1.00)