Massively Scalable Inverse Reinforcement Learning in Google Maps

Barnes, Matt, Abueg, Matthew, Lange, Oliver F., Deeds, Matt, Trader, Jason, Molitor, Denali, Wulfmeier, Markus, O'Banion, Shawn

Sep-10-2023–arXiv.org Artificial Intelligence

Optimizing for humans' latent preferences remains a grand challenge in route recommendation. Prior research has provided increasingly general techniques based on inverse reinforcement learning (IRL), yet no approach has been successfully scaled to world-sized routing problems with hundreds of millions of states and demonstration trajectories. In this paper, we provide methods for scaling IRL using graph compression, spatial parallelization, and problem initialization based on dominant eigenvectors. We revisit classic algorithms and study them in a large-scale setting, and make the key observation that there exists a trade-off between the use of cheap, deterministic planners and expensive yet robust stochastic policies. We leverage this insight in Receding Horizon Inverse Planning (RHIP), a new generalization of classic IRL algorithms that provides fine-grained control over performance trade-offs via its planning horizon. Our contributions culminate in a policy that achieves a 16-24% improvement in global route quality, and to the best of our knowledge, represents the largest instance of IRL in a real-world setting to date. Benchmark results show critical benefits to more sustainable modes of transportation, where factors beyond journey time play a substantial role. We conclude by conducting an ablation study of key components, presenting negative results from alternative eigenvalue solvers, and identifying opportunities to further improve scalability via IRL-specific batching strategies.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

arXiv.org Artificial Intelligence

Sep-10-2023

arXiv.org PDF

Add feedback

Country:
- Africa (0.67)
- Asia (0.93)
- Europe (0.93)
- North America > United States (0.14)

Genre:
- Research Report (1.00)

Industry:
- Information Technology > Services (0.41)
- Leisure & Entertainment > Games
  - Computer Games (0.67)
- Transportation > Ground
  - Road (0.46)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning
  - Learning Graphical Models > Undirected Networks
    - Markov Models (0.46)
  - Reinforcement Learning (1.00)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found