Model Merging Improves Zero-Shot Generalization in Bioacoustic Foundation Models
Davide Marincione, Donato Crisostomi, Roberto Dessi, Emanuele Rodolà, Emanuele Rossi
arXiv.org Artificial Intelligence
Foundation models capable of generalizing across species and tasks represent a promising new frontier in bioacoustics, with NatureLM being one of the most prominent examples. While its domain-specific fine-tuning yields strong performance on bioacoustic benchmarks, we observe that it also introduces trade-offs in instruction-following flexibility. For instance, NatureLM achieves high accuracy when prompted for either the common name or the scientific name alone, but its accuracy drops significantly when both are requested in a single prompt. We address this by applying a simple model merging strategy that interpolates the weights of NatureLM with those of its base language model, recovering instruction-following capabilities with minimal loss of domain expertise. Finally, we show that the merged model exhibits markedly stronger zero-shot generalization, achieving a relative improvement of over 200% and setting a new state of the art in closed-set zero-shot classification of unseen species.
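The merging strategy mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula, so this assumes plain linear interpolation of parameters, merged = (1 - alpha) * base + alpha * fine-tuned. The function name `interpolate_weights`, the coefficient `alpha`, and the toy checkpoints are hypothetical and are not taken from the paper.

```python
def interpolate_weights(base, finetuned, alpha):
    """Linearly interpolate two checkpoints with identical parameter names.

    Assumption (not from the paper): each checkpoint is a dict mapping a
    parameter name to a flat list of floats, and merging is elementwise
    (1 - alpha) * base + alpha * finetuned.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if base.keys() != finetuned.keys():
        raise ValueError("both checkpoints must have the same parameter names")

    merged = {}
    for name, base_values in base.items():
        tuned_values = finetuned[name]
        if len(base_values) != len(tuned_values):
            raise ValueError(f"size mismatch for parameter {name!r}")
        # alpha = 0 returns the base model, alpha = 1 the fine-tuned model.
        merged[name] = [
            (1.0 - alpha) * b + alpha * t
            for b, t in zip(base_values, tuned_values)
        ]
    return merged


# Toy stand-ins for the base language model and the fine-tuned model.
base = {"layer.weight": [0.0, 2.0], "layer.bias": [1.0]}
finetuned = {"layer.weight": [4.0, 2.0], "layer.bias": [3.0]}

merged = interpolate_weights(base, finetuned, alpha=0.5)
print(merged)
```

Under this assumption, a smaller `alpha` keeps the merged model closer to the base language model, while a larger `alpha` keeps it closer to the domain-specific model, so the coefficient sets the balance between instruction following and domain expertise.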
Nov-20-2025
- Country:
- Africa > Rwanda
- Asia > Middle East
- Jordan (0.04)
- Europe > Austria (0.04)
- North America
- Canada
- British Columbia > Vancouver (0.04)
- Quebec > Montreal (0.04)
- United States
- California > San Francisco County
- San Francisco (0.04)
- Louisiana > Orleans Parish
- New Orleans (0.05)
- Maryland > Baltimore (0.04)
- Genre:
- Research Report > New Finding (0.46)
- Technology: