Deterministic Policies for Constrained Reinforcement Learning in Polynomial Time
Neural Information Processing Systems
Our approach combines three key ideas: (1) value-demand augmentation, (2) action-space approximate dynamic programming, and (3) time-space rounding.
Feb-17-2026, 08:36:28 GMT
- Country:
- Europe
- Germany (0.04)
- United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > Wisconsin > Dane County > Madison (0.04)
- Genre:
- Research Report > Experimental Study (0.93)
- Industry:
- Health & Medicine (0.45)
- Information Technology (0.46)
- Transportation (0.46)
- Technology: