PoliFormer: Scaling On-Policy RL with Transformers Results in Masterful Navigators
Zeng, Kuo-Hao, Zhang, Zichen, Ehsani, Kiana, Hendrix, Rose, Salvador, Jordi, Herrasti, Alvaro, Girshick, Ross, Kembhavi, Aniruddha, Weihs, Luca
arXiv.org Artificial Intelligence
We present PoliFormer (Policy Transformer), an RGB-only indoor navigation agent trained end-to-end with reinforcement learning at scale that generalizes to the real world without adaptation despite being trained purely in simulation. PoliFormer uses a foundational vision transformer encoder with a causal transformer decoder enabling long-term memory and reasoning. It is trained for hundreds of millions of interactions across diverse environments, leveraging parallelized, multi-machine rollouts for efficient training with high throughput. PoliFormer is a masterful navigator, producing state-of-the-art results across two distinct embodiments, the LoCoBot and Stretch RE-1 robots, and four navigation benchmarks. It breaks through the plateaus of previous work, achieving an unprecedented 85.5% success rate in object goal navigation on the CHORES-S benchmark, a 28.5% absolute improvement. PoliFormer can also be trivially extended to a variety of downstream applications such as object tracking, multi-object navigation, and open-vocabulary navigation with no finetuning.
Jun-28-2024
- Country:
  - Oceania > New Zealand
    - North Island > Auckland Region > Auckland (0.04)
  - North America > United States
    - Utah > Salt Lake County
      - Salt Lake City (0.04)
    - Pennsylvania > Philadelphia County
      - Philadelphia (0.04)
    - Louisiana > Orleans Parish
      - New Orleans (0.04)
    - California > Los Angeles County
      - Long Beach (0.04)
  - Europe
    - Czechia > Prague (0.04)
    - United Kingdom > England
      - Greater London > London (0.04)
    - France > Île-de-France
  - Asia
    - South Korea > Daegu
      - Daegu (0.04)
    - Middle East
  - Africa
    - Rwanda > Kigali
      - Kigali (0.04)
    - Ethiopia > Addis Ababa
      - Addis Ababa (0.04)
- Genre:
  - Research Report (0.50)
- Industry:
  - Leisure & Entertainment (0.93)
- Technology:
  - Information Technology > Artificial Intelligence
    - Vision (1.00)
    - Robots (1.00)
    - Representation & Reasoning (1.00)
    - Natural Language > Large Language Model (1.00)
    - Machine Learning
      - Neural Networks > Deep Learning (1.00)
      - Reinforcement Learning (0.67)