Beyond Black-Box Advice: Learning-Augmented Algorithms for MDPs with Q-Value Predictions
Neural Information Processing Systems
A notable example that motivates our work is the problem of minimizing costs (or maximizing rewards) in a single-trajectory Markov Decision Process (MDP).
Oct-9-2025, 01:13:14 GMT
- Country:
- Asia > China > Guangdong Province > Shenzhen (0.04)
- Asia > Middle East > Jordan (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > California (0.04)
- Genre:
- Research Report (0.46)
- Industry:
- Transportation > Air (0.45)