Reviews: Policy-Conditioned Uncertainty Sets for Robust Markov Decision Processes

Oct-7-2024, 15:21:05 GMT–Neural Information Processing Systems

The authors consider distributionally robust finite MDPs over a finite horizon. The transition probabilities conditional on a state-action pair must remain within a bounded L1 distance of a base measure, which is feasible because it is generated using a given reference policy. This is a nice idea. A few comments follow. A related question: why would the requirement of staying "close" to this policy be beneficial?
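
To make the L1 constraint concrete, here is a minimal sketch of the inner worst-case problem for a single state-action pair. It uses the standard greedy solution for an L1 ball around a nominal distribution and is not taken from the paper under review. The function name `worst_case_l1`, the nominal distribution `p0`, the value vector `v`, and the radius `eps` are all illustrative assumptions, and the policy-conditioned construction of the base measure is not modeled.

```python
import numpy as np

def worst_case_l1(p0, v, eps):
    """Minimize p @ v over distributions p with ||p - p0||_1 <= eps.

    Hypothetical helper for illustration only: p0 is an assumed nominal
    next-state distribution, v an assumed next-state value vector.
    """
    p = np.asarray(p0, dtype=float).copy()
    v = np.asarray(v, dtype=float)
    i_min = int(np.argmin(v))
    # Moving mass m changes the L1 distance by 2m, so at most eps/2 can move.
    budget = min(eps / 2.0, 1.0 - p[i_min])
    p[i_min] += budget
    # Take the moved mass away from the highest-value states first.
    for j in np.argsort(v)[::-1]:
        if budget <= 0:
            break
        if j == i_min:
            continue
        take = min(p[j], budget)
        p[j] -= take
        budget -= take
    return p, float(p @ v)

p_worst, value = worst_case_l1([0.5, 0.3, 0.2], [1.0, 2.0, 3.0], eps=0.4)
```

In a robust Bellman backup, a routine of this kind would be called once per state-action pair at every stage of the finite horizon, with `v` set to the next-stage value function.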

policy-conditioned uncertainty set, reference policy, robust markov decision process, (4 more...)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.72)