The iNaturalist Sounds Dataset
Chasmai, Mustafa, Shepard, Alexander, Maji, Subhransu, Van Horn, Grant
arXiv.org Artificial Intelligence
We present the iNaturalist Sounds Dataset (iNatSounds), a collection of 230,000 audio files capturing sounds from over 5,500 species, contributed by more than 27,000 recordists worldwide. The dataset encompasses sounds from birds, mammals, insects, reptiles, and amphibians, with audio and species labels derived from observations submitted to iNaturalist, a global citizen science platform. Each recording in the dataset varies in length and includes a single species annotation. We benchmark multiple backbone architectures, comparing multiclass classification objectives with multilabel objectives. Despite weak labeling, we demonstrate that iNatSounds serves as a useful pretraining resource by benchmarking it on strongly labeled downstream evaluation datasets. The dataset is available as a single, freely accessible archive, promoting accessibility and research in this important domain. We envision models trained on this data powering next-generation public engagement applications, and assisting biologists, ecologists, and land use managers in processing large audio collections, thereby contributing to the understanding of species compositions in diverse soundscapes.
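The abstract contrasts multiclass and multilabel objectives without defining them. The sketch below shows one common reading of the two: softmax cross-entropy over all species versus an independent sigmoid binary cross-entropy per species. It is not the authors' implementation. The function names, the four-species example, and the logit values are all hypothetical.

```python
import math

def softmax_cross_entropy(logits, target_index):
    """Multiclass objective: one softmax over all species, exactly one correct class."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target_index]

def sigmoid_binary_cross_entropy(logits, targets):
    """Multilabel objective: an independent present/absent decision per species."""
    total = 0.0
    for z, t in zip(logits, targets):
        # Stable form of -[t*log(sigmoid(z)) + (1 - t)*log(1 - sigmoid(z))]
        total += max(z, 0.0) - z * t + math.log1p(math.exp(-abs(z)))
    return total / len(logits)

# Hypothetical scores for 4 species; the recording is annotated with species 2 only.
logits = [0.5, -1.0, 2.0, 0.1]
one_hot = [0.0, 0.0, 1.0, 0.0]

multiclass_loss = softmax_cross_entropy(logits, target_index=2)
multilabel_loss = sigmoid_binary_cross_entropy(logits, one_hot)
```

With a single species annotation per recording, the multilabel target is a one-hot vector, so every unannotated species is treated as absent. That assumption may not hold when other species are audible in the background.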
Jun-3-2025
- Country:
  - Asia (0.04)
  - Europe (0.04)
  - North America
    - Costa Rica (0.04)
    - Mexico > Baja California (0.04)
    - United States
      - California (0.04)
      - Hawaii (0.04)
      - Louisiana > Orleans Parish
        - New Orleans (0.04)
      - Massachusetts > Hampshire County
        - Amherst (0.04)
      - Nevada (0.04)
      - Pennsylvania (0.04)
  - South America
    - Chile > Santiago Metropolitan Region
      - Santiago Province > Santiago (0.04)
    - Colombia (0.04)
- Genre:
  - Research Report > New Finding (0.46)
- Industry:
  - Law (0.34)
  - Leisure & Entertainment (0.46)
- Technology:
  - Information Technology
    - Artificial Intelligence
      - Machine Learning
        - Neural Networks > Deep Learning (1.00)
        - Performance Analysis > Accuracy (0.93)
      - Natural Language (1.00)
      - Vision (1.00)
    - Communications (1.00)
    - Data Science (0.93)