On the Use of Non-Stationary Policies for Stationary Infinite-Horizon Markov Decision Processes
Neural Information Processing Systems
We consider infinite-horizon stationary γ-discounted Markov Decision Processes, for which it is known that there exists a stationary optimal policy.
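As a minimal illustration of the stationary-optimal-policy fact stated above, the sketch below runs value iteration on a toy MDP and reads off a greedy policy that depends only on the current state. The two-state MDP, its rewards, the discount GAMMA = 0.9, and all function names are hypothetical choices made for this example; none of them come from the paper.

```python
GAMMA = 0.9  # assumed discount factor, chosen only for illustration

# Hypothetical 2-state, 2-action MDP (not from the paper).
# P[s][a] is a list of (probability, next_state) pairs; R[s][a] is the reward.
P = {
    0: {0: [(1.0, 0)], 1: [(1.0, 1)]},
    1: {0: [(1.0, 0)], 1: [(1.0, 1)]},
}
R = {
    0: {0: 0.0, 1: 1.0},
    1: {0: 0.0, 1: 2.0},
}


def q_value(v, s, a):
    """One-step lookahead: immediate reward plus discounted next-state value."""
    return R[s][a] + GAMMA * sum(p * v[s2] for p, s2 in P[s][a])


def value_iteration(tol=1e-10):
    """Apply the Bellman optimality operator until successive iterates agree."""
    v = {s: 0.0 for s in P}
    while True:
        new_v = {s: max(q_value(v, s, a) for a in P[s]) for s in P}
        if max(abs(new_v[s] - v[s]) for s in P) < tol:
            return new_v
        v = new_v


def greedy_policy(v):
    """Stationary policy: one action per state, independent of the time step."""
    return {s: max(P[s], key=lambda a: q_value(v, s, a)) for s in P}


v_star = value_iteration()
pi_star = greedy_policy(v_star)
print(pi_star, v_star)
```

In this toy problem the greedy policy picks action 1 in both states, and the same state-to-action map is used at every time step, which is what "stationary" means here.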
Mar-14-2024, 11:54:56 GMT