CoLLA T: On Adding Fine-grained Audio Understanding to Language Models using Token-Level Locked-Language Tuning

Neural Information Processing Systems 

Humans can easily understand various audio concepts, but conventional audio classification models fail due to their inability to predict unseen classes during training.

Similar Docs  Excel Report  more

TitleSimilaritySource
None found