Arboretum: A Large Multimodal Dataset Enabling AI for Biodiversity (Supplemental Material)
Neural Information Processing Systems
Arboretum is a dataset of 134.6M samples designed to advance AI for biodiversity applications. It provides large-scale, accurately annotated multimodal data: images paired with corresponding textual descriptions for a diverse set of species. Arboretum aims to facilitate the development of AI models for species identification, ecological monitoring, and agricultural research. Additionally, we introduce three new benchmark datasets: Arboretum-Unseen, Arboretum-LifeStages, and Arboretum-Balanced. As the authors of this submission, we affirm that we bear all responsibility in case of any rights violations or ethical issues associated with this work. We confirm that the submitted work is original and that any third-party content it includes is used with proper permissions and attributions.
Mar-27-2025, 03:38:57 GMT