USCILab3D: A Large-scale, Long-term, Semantically Annotated Outdoor Dataset

Neural Information Processing Systems 

In this paper, we introduce the \textbf{USCILab3D dataset}, a large-scale, annotated outdoor dataset designed for versatile applications across multiple domains, including computer vision, robotics, and machine learning. The dataset was acquired using a mobile robot equipped with 5 cameras and a 32-beam, $360^\circ$ scanning LIDAR. The robot was teleoperated over the course of a year, under varied weather and lighting conditions, along a rich variety of paths within the USC campus (229 acres, $\sim$92.7 hectares). The raw data was annotated using state-of-the-art large foundation models and processed to provide multi-view imagery, 3D reconstructions, semantically annotated images and point clouds (267 semantic categories), and text descriptions of images and the objects within them. The dataset's pose stamps and trajectory data also support a diverse array of complex analyses.