PoliFormer: Scaling On-Policy RL with Transformers Results in Masterful Navigators

Zeng, Kuo-Hao, Zhang, Zichen, Ehsani, Kiana, Hendrix, Rose, Salvador, Jordi, Herrasti, Alvaro, Girshick, Ross, Kembhavi, Aniruddha, Weihs, Luca

Jun-28-2024–arXiv.org Artificial Intelligence

We present PoliFormer (Policy Transformer), an RGB-only indoor navigation agent trained end-to-end with reinforcement learning at scale that generalizes to the real-world without adaptation despite being trained purely in simulation. PoliFormer uses a foundational vision transformer encoder with a causal transformer decoder enabling long-term memory and reasoning. It is trained for hundreds of millions of interactions across diverse environments, leveraging parallelized, multi-machine rollouts for efficient training with high throughput. PoliFormer is a masterful navigator, producing state-of-the-art results across two distinct embodiments, the LoCoBot and Stretch RE-1 robots, and four navigation benchmarks. It breaks through the plateaus of previous work, achieving an unprecedented 85.5% success rate in object goal navigation on the CHORES-S benchmark, a 28.5% absolute improvement. PoliFormer can also be trivially extended to a variety of downstream applications such as object tracking, multi-object navigation, and open-vocabulary navigation with no finetuning.

large language model, machine learning, reinforcement learning, (18 more...)

arXiv.org Artificial Intelligence

Jun-28-2024

arXiv.org PDF

Add feedback

Country:
- Africa (0.46)
- Asia > Middle East
  - Israel (0.14)
  - Qatar (0.14)
- Europe (0.46)
- North America > United States (0.68)

Genre:
- Research Report (0.50)

Industry:
- Leisure & Entertainment (0.93)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning
    - Neural Networks > Deep Learning (1.00)
    - Reinforcement Learning (0.67)
  - Natural Language > Large Language Model (1.00)
  - Representation & Reasoning (1.00)
  - Robots (1.00)
  - Vision (1.00)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found