Learning to rumble: Automated elephant call classification, detection and endpointing using deep architectures
Geldenhuys, Christiaan M., Niesler, Thomas R.
arXiv.org Artificial Intelligence
We consider the problem of detecting, isolating and classifying elephant calls in continuously recorded audio. Such automatic call characterisation can assist conservation efforts and inform environmental management strategies. In contrast to previous work, in which call detection was performed at a segment level, we perform call detection at a frame level, which implicitly also allows call endpointing, that is, the isolation of a call within a longer recording. For experimentation, we employ two annotated datasets, one containing Asian and the other African elephant vocalisations. We evaluate several shallow and deep classifier models, and show that the current best performance can be improved by using an audio spectrogram transformer (AST), a neural architecture which has not been used for this purpose before, and which we have configured in a novel sequence-to-sequence manner. We also show that transfer learning through pre-training leads to further improvements, both in terms of computational complexity and performance. Finally, we consider sub-call classification using an accepted taxonomy of call types, a task which has not previously been considered. We show that in this case, too, the transformer architectures provide the best performance. Our best classifiers achieve an average precision (AP) of 0.962 for framewise binary call classification, and an area under the receiver operating characteristic curve (AUC) of 0.957 for call classification with 5 classes and 0.979 for sub-call classification with 7 classes. All of these represent either new benchmarks (sub-call classification) or improvements on the previously best systems. We conclude that a fully automated elephant call detection and sub-call classification system is within reach. Such a system would provide valuable information on the behaviour and state of elephant herds for the purposes of conservation and management.
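To illustrate why frame-level detection implicitly provides endpointing, the following minimal sketch converts a sequence of per-frame call probabilities into call segments with start and end times. It is not the authors' implementation: the function name `endpoint_calls`, the decision threshold of 0.5 and the frame hop `frame_hop_s` are all assumptions chosen for illustration, and a practical system would likely add smoothing or a minimum-duration rule.

```python
from typing import List, Sequence, Tuple


def endpoint_calls(
    frame_probs: Sequence[float],
    threshold: float = 0.5,     # assumed decision threshold, not from the paper
    frame_hop_s: float = 0.01,  # assumed time step between frames, in seconds
) -> List[Tuple[float, float]]:
    """Merge runs of consecutive above-threshold frames into (start_s, end_s) segments."""
    segments: List[Tuple[float, float]] = []
    start = None  # index of the first frame of the current run, if any
    for i, p in enumerate(frame_probs):
        active = p >= threshold
        if active and start is None:
            start = i  # a call begins at this frame
        elif not active and start is not None:
            # the call ended at the previous frame; close the segment
            segments.append((start * frame_hop_s, i * frame_hop_s))
            start = None
    if start is not None:
        # the recording ended while a call was still active
        segments.append((start * frame_hop_s, len(frame_probs) * frame_hop_s))
    return segments


if __name__ == "__main__":
    # Hypothetical framewise classifier outputs, one probability per frame.
    probs = [0.1, 0.9, 0.8, 0.2, 0.7]
    print(endpoint_calls(probs, threshold=0.5, frame_hop_s=1.0))
```

A segment-level detector would only report whether a fixed window contains a call, whereas the framewise output above locates the call boundaries to within one frame hop.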
Oct-15-2024
- Country:
- North America
- United States
- District of Columbia > Washington (0.04)
- Minnesota > Hennepin County
- Minneapolis (0.04)
- Nevada > Clark County
- Las Vegas (0.04)
- Colorado > Denver County
- Denver (0.04)
- Hawaii > Honolulu County
- Honolulu (0.04)
- Louisiana > Orleans Parish
- New Orleans (0.04)
- Utah > Salt Lake County
- Salt Lake City (0.04)
- Illinois > Cook County
- Chicago (0.04)
- Washington > King County
- Seattle (0.04)
- California
- San Francisco County > San Francisco (0.04)
- Los Angeles County > Long Beach (0.04)
- Canada
- Europe
- France (0.04)
- Spain > Catalonia
- Barcelona Province > Barcelona (0.04)
- Middle East > Republic of Türkiye
- Istanbul Province > Istanbul (0.04)
- Germany > North Rhine-Westphalia
- Cologne Region > Bonn (0.04)
- Czechia > South Moravian Region
- Brno (0.04)
- Asia
- Sri Lanka (0.04)
- Southeast Asia (0.04)
- China > Hong Kong (0.04)
- Middle East > Republic of Türkiye
- Istanbul Province > Istanbul (0.04)
- Malaysia > Kuala Lumpur
- Kuala Lumpur (0.04)
- Japan > Honshū
- Tōhoku > Fukushima Prefecture > Fukushima (0.04)
- Africa
- West Africa (0.04)
- Sub-Saharan Africa (0.04)
- South Africa (0.04)
- Mozambique (0.04)
- Kenya (0.04)
- Genre:
- Research Report > New Finding (1.00)
- Technology: