WhAM: Towards A Translative Model of Sperm Whale Vocalization
Paradise, Orr, Muralikrishnan, Pranav, Chen, Liangyuan, García, Hugo Flores, Pardo, Bryan, Diamant, Roee, Gruber, David F., Gero, Shane, Goldwasser, Shafi
arXiv.org Artificial Intelligence
Sperm whales communicate in short sequences of clicks known as codas. We present WhAM (Whale Acoustics Model), the first transformer-based model capable of generating synthetic sperm whale codas from any audio prompt. WhAM is built by finetuning VampNet, a masked acoustic token model pretrained on musical audio, using 10k coda recordings collected over the past two decades. Through iterative masked token prediction, WhAM generates high-fidelity synthetic codas that preserve key acoustic features of the source recordings. We evaluate WhAM's synthetic codas using Fréchet Audio Distance and through perceptual studies with expert marine biologists. On downstream classification tasks including rhythm, social unit, and vowel classification, WhAM's learned representations achieve strong performance, despite being trained for generation rather than classification. Our code is available at https://github.com/Project-CETI/wham
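The abstract names iterative masked token prediction as the generation mechanism without giving details. The following is a minimal sketch of one common form of that idea, assuming a confidence-ranked, cosine-scheduled unmasking loop. It is not taken from the WhAM or VampNet code: `toy_predictor`, the `MASK` sentinel, the schedule, and all parameter values are hypothetical stand-ins for illustration.

```python
import math
import random

MASK = -1  # hypothetical sentinel marking a masked token position


def toy_predictor(tokens, vocab_size, rng):
    """Hypothetical stand-in for a trained masked token model.

    Returns {position: (predicted_token, confidence)} for each masked slot.
    A real model would condition on the unmasked context; this one is random.
    """
    return {
        i: (rng.randrange(vocab_size), rng.random())
        for i, t in enumerate(tokens)
        if t == MASK
    }


def iterative_decode(prompt, vocab_size=1024, steps=8, seed=0):
    """Fill masked positions over several passes, most confident first."""
    rng = random.Random(seed)
    tokens = list(prompt)
    total_masked = sum(t == MASK for t in tokens)
    for step in range(1, steps + 1):
        guesses = toy_predictor(tokens, vocab_size, rng)
        if not guesses:
            break
        # Cosine schedule (an assumption): how many of the originally
        # masked positions should remain masked after this pass.
        keep_masked = int(total_masked * math.cos(math.pi / 2 * step / steps))
        n_commit = max(1, len(guesses) - keep_masked)
        # Commit the most confident predictions; the rest stay masked
        # and are predicted again on the next pass.
        ranked = sorted(guesses.items(), key=lambda kv: kv[1][1], reverse=True)
        for pos, (tok, _conf) in ranked[:n_commit]:
            tokens[pos] = tok
    return tokens


# Unmasked tokens play the role of the audio prompt; masked ones are generated.
prompt = [5, MASK, MASK, 7, MASK, MASK, MASK, MASK]
print(iterative_decode(prompt, vocab_size=16, steps=4))
```

In this sketch the unmasked prompt tokens are never overwritten, which is one way to read how a model of this kind could keep features of a source recording while generating the rest.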
Dec-3-2025
- Country:
- Asia
- China > Guangdong Province
- Shenzhen (0.04)
- India (0.04)
- Middle East > Israel
- Haifa District > Haifa (0.04)
- Singapore (0.04)
- South Korea
- Atlantic Ocean (0.04)
- Europe
- North America
- Canada
- British Columbia > Vancouver (0.04)
- Ontario > National Capital Region
- Ottawa (0.04)
- Dominica (0.04)
- United States
- Arizona (0.04)
- California
- Los Angeles County > Long Beach (0.04)
- Santa Clara County > Sunnyvale (0.04)
- Illinois > Cook County
- Chicago (0.04)
- Louisiana > Orleans Parish
- New Orleans (0.04)
- Massachusetts > Middlesex County
- Cambridge (0.04)
- Minnesota > Hennepin County
- Minneapolis (0.14)
- Rhode Island (0.04)
- Oceania > New Zealand (0.04)
- Genre:
- Research Report
- Experimental Study (1.00)
- New Finding (1.00)
- Industry:
- Education (0.67)
- Health & Medicine (0.67)
- Leisure & Entertainment (1.00)
- Media > Music (0.93)
- Technology: