On Pre-Training for Visuo-Motor Control: Revisiting a Learning-from-Scratch Baseline
Nicklas Hansen, Zhecheng Yuan, Yanjie Ze, Tongzhou Mu, Aravind Rajeswaran, Hao Su, Huazhe Xu, Xiaolong Wang
arXiv.org Artificial Intelligence
In this paper, we examine the effectiveness of pre-training for visuo-motor control tasks. We revisit a simple Learning-from-Scratch (LfS) baseline that incorporates data augmentation and a shallow ConvNet, and find that this baseline is surprisingly competitive with recent approaches (PVR, MVP, R3M) that leverage frozen visual representations trained on large-scale vision datasets. This holds across a variety of algorithms, task domains, and metrics, both in simulation and on a real robot. Our results demonstrate that these pre-trained representations are hindered by a significant domain gap between the pre-training datasets and current benchmarks for visuo-motor control, and that finetuning alleviates this gap. Based on our findings, we provide recommendations for future research in pre-training for control and hope that our simple yet strong baseline will aid in accurately benchmarking progress in this area.
Jun-15-2023
- Country:
  - Asia > China
  - North America > United States
    - California > San Diego County
      - San Diego (0.04)
    - Hawaii > Honolulu County
      - Honolulu (0.04)
- Genre:
  - Research Report > New Finding (1.00)
- Technology: