Deterministic Policies for Constrained Reinforcement Learning in Polynomial Time
Neural Information Processing Systems
Our approach combines three key ideas: (1) value-demand augmentation, (2) action-space approximate dynamic programming, and (3) time-space rounding.
Oct-10-2025, 12:54:14 GMT
- Country:
  - Europe
    - Germany (0.04)
    - United Kingdom > England
      - Cambridgeshire > Cambridge (0.04)
  - North America
    - Canada > Alberta
      - Census Division No. 13 > Woodlands County (0.04)
    - United States > Wisconsin
      - Dane County > Madison (0.04)
- Genre:
  - Research Report > Experimental Study (0.93)
- Industry:
  - Health & Medicine (0.45)
  - Information Technology (0.46)
  - Transportation (0.46)
- Technology: