Box Pose and Shape Estimation and Domain Adaptation for Large-Scale Warehouse Automation
Yu, Xihang, Talak, Rajat, Shi, Jingnan, Viereck, Ulrich, Gilitschenski, Igor, Carlone, Luca
arXiv.org Artificial Intelligence
Modern warehouse automation systems rely on fleets of intelligent robots that generate vast amounts of data, most of which remains unannotated. This paper develops a self-supervised domain adaptation pipeline that leverages real-world, unlabeled data to improve perception models without requiring manual annotations. Our work focuses specifically on estimating the pose and shape of boxes and presents a correct-and-certify pipeline for self-supervised box pose and shape estimation. We extensively evaluate our approach across a range of simulated and real industrial settings, including adaptation to a large-scale real-world dataset of 50,000 images. The self-supervised model significantly outperforms models trained solely in simulation and shows substantial improvements over a zero-shot 3D bounding box estimation baseline.
Keywords: Certifiable models, computer vision, 3D robot vision, object pose estimation, safe perception, self-supervised learning.
Jul-2-2025
- Country:
  - North America
    - Canada
      - British Columbia > Metro Vancouver Regional District
        - Vancouver (0.04)
      - Ontario > Toronto (0.14)
    - United States > Massachusetts
      - Middlesex County
        - Cambridge (0.04)
        - Wilmington (0.04)
  - South America > Brazil (0.04)
- Genre:
  - Research Report (0.64)
- Technology: