Knowledge-Augmented Vision Language Models for Underwater Bioacoustic Spectrogram Analysis

Nihal, Ragib Amin; Yen, Benjamin; Ashizawa, Takeshi; Nakadai, Kazuhiro

arXiv.org Artificial Intelligence 

Marine mammals depend on acoustic communication for navigation, social interaction, and foraging across vast ocean environments. As climate change and human activities threaten many species with extinction, understanding these vocalizations has become essential for conservation efforts under Sustainable Development Goal 14 (Life Below Water). Automatically classifying marine mammal sounds from recordings remains difficult: underwater soundscapes are complex, each species has distinct vocal patterns, and interpreting acoustic features requires biological expertise. Current approaches face a three-way trade-off among performance, cost, and interpretability.