EnvoDat: A Large-Scale Multisensory Dataset for Robotic Spatial Awareness and Semantic Reasoning in Heterogeneous Environments

Nwankwo, Linus; Ellensohn, Bjoern; Dave, Vedant; Hofer, Peter; Forstner, Jan; Villneuve, Marlene; Galler, Robert; Rueckert, Elmar

arXiv.org Artificial Intelligence 

Abstract-- To ensure the efficiency of robot autonomy under diverse real-world conditions, a high-quality heterogeneous dataset is essential to benchmark the operating algorithms' performance and robustness. Current benchmarks predominantly focus on urban terrains, specifically for on-road autonomous driving, leaving multi-degraded, densely vegetated, dynamic, and feature-sparse environments, such as underground tunnels, natural fields, and modern indoor spaces, underrepresented. To fill this gap, we introduce EnvoDat, a large-scale, multi-modal dataset collected in diverse environments and conditions, including high illumination, fog, rain, and zero visibility at different times of the day. Overall, EnvoDat contains 26 sequences from 13 scenes, 10 sensing modalities, over 1.9 TB of data, and over 89K fine-grained polygon-based annotations for more than 82 object and terrain classes. EnvoDat includes time-synchronized multimodal sensor data (e.g., RGB, LiDAR, depth) and [...]

I. INTRODUCTION

[...] (whether known or unknown), describe their location, [...] However, adapting autonomous agents to perform such innate abilities and operate reliably [...]

Furthermore, real-world environments are often in a state [...] This viability poses challenges for accurate perception and SLAM in autonomous agents. [...] for contemporary perception and SLAM algorithms can potentially lead to inaccuracies.