Massively Scalable Inverse Reinforcement Learning in Google Maps
Barnes, Matt, Abueg, Matthew, Lange, Oliver F., Deeds, Matt, Trader, Jason, Molitor, Denali, Wulfmeier, Markus, O'Banion, Shawn
–arXiv.org Artificial Intelligence
Optimizing for humans' latent preferences remains a grand challenge in route recommendation. Prior research has provided increasingly general techniques based on inverse reinforcement learning (IRL), yet no approach has been successfully scaled to world-sized routing problems with hundreds of millions of states and demonstration trajectories. In this paper, we provide methods for scaling IRL using graph compression, spatial parallelization, and problem initialization based on dominant eigenvectors. We revisit classic algorithms and study them in a large-scale setting, and make the key observation that there exists a trade-off between the use of cheap, deterministic planners and expensive yet robust stochastic policies. We leverage this insight in Receding Horizon Inverse Planning (RHIP), a new generalization of classic IRL algorithms that provides fine-grained control over performance trade-offs via its planning horizon. Our contributions culminate in a policy that achieves a 16-24% improvement in global route quality, and to the best of our knowledge, represents the largest instance of IRL in a real-world setting to date. Benchmark results show critical benefits to more sustainable modes of transportation, where factors beyond journey time play a substantial role. We conclude by conducting an ablation study of key components, presenting negative results from alternative eigenvalue solvers, and identifying opportunities to further improve scalability via IRL-specific batching strategies.
arXiv.org Artificial Intelligence
Sep-10-2023
- Country:
- Africa (0.67)
- Asia (0.93)
- Europe (0.93)
- North America > United States (0.14)
- Genre:
- Research Report (1.00)
- Industry:
- Information Technology > Services (0.41)
- Leisure & Entertainment > Games
- Computer Games (0.67)
- Transportation > Ground
- Road (0.46)