Llama-Mimi: Speech Language Models with Interleaved Semantic and Acoustic Tokens
