Multi-Objective SPIBB: Seldonian Offline Policy Improvement with Safety Constraints in Finite MDPs
Neural Information Processing Systems
We present a Safe Policy Improvement (SPI) formulation for this RL setting that takes into account the user's preferences for handling the trade-offs among different reward signals.
Oct-2-2025, 09:16:59 GMT
- Country:
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- North America > United States > California > Los Angeles County > Long Beach (0.04)
- North America > United States > Massachusetts (0.04)
- Genre:
- Research Report (0.93)
- Industry:
- Health & Medicine (1.00)