Beyond Black-Box Advice: Learning-Augmented Algorithms for MDPs with Q-Value Predictions
Neural Information Processing Systems
A notable example that motivates our work is the problem of minimizing costs (or maximizing rewards) in a single-trajectory Markov Decision Process (MDP).
Feb-15-2026, 20:09:47 GMT
- Country:
  - Asia
    - China > Guangdong Province
      - Shenzhen (0.04)
    - Middle East > Jordan (0.04)
  - Europe > United Kingdom
    - England > Cambridgeshire > Cambridge (0.04)
  - North America > United States
    - California (0.04)
- Genre:
  - Research Report > New Finding (0.67)
- Industry:
  - Transportation > Air (0.43)