GLM-4-Voice: Towards Intelligent and Human-Like End-to-End Spoken Chatbot
Zeng, Aohan; Du, Zhengxiao; Liu, Mingdao; Wang, Kedong; Jiang, Shengmin; Zhao, Lei; Dong, Yuxiao; Tang, Jie
–arXiv.org Artificial Intelligence
We introduce GLM-4-Voice, an intelligent and human-like end-to-end spoken chatbot. It supports both Chinese and English, engages in real-time voice conversations, and varies vocal nuances such as emotion, intonation, speech rate, and dialect according to user instructions. GLM-4-Voice uses an ultra-low bitrate (175bps), single-codebook speech tokenizer with 12.5Hz frame rate derived from an automatic speech recognition (ASR) model by incorporating a vector-quantized bottleneck into the encoder. To efficiently transfer knowledge from text to speech modalities, we synthesize speech-text interleaved data from existing text pre-training corpora using a text-to-token model. We continue pre-training from the pre-trained text language model GLM-4-9B with a combination of unsupervised speech data, interleaved speech-text data, and supervised speech-text data, scaling up to 1 trillion tokens, achieving state-of-the-art performance in both speech language modeling and spoken question answering. We then fine-tune the pre-trained model with high-quality conversational speech data, achieving superior performance compared to existing baselines in both conversational ability and speech quality.
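The tokenizer figures quoted above can be checked against each other with a little arithmetic. For a single-codebook tokenizer, bitrate is roughly frame rate times bits per token, so 175 bps at 12.5 Hz implies 14 bits per token. The sketch below works this out; the function names are illustrative, and the resulting codebook size is inferred from the quoted numbers rather than stated in the abstract.

```python
import math


def bits_per_token(bitrate_bps: float, frame_rate_hz: float) -> float:
    """Bits carried by each token when one token is emitted per frame."""
    return bitrate_bps / frame_rate_hz


def implied_codebook_size(bitrate_bps: float, frame_rate_hz: float) -> int:
    """Codebook size implied by the bitrate, assuming a single codebook
    and fixed-length (uncompressed) token indices."""
    return round(2 ** bits_per_token(bitrate_bps, frame_rate_hz))


def tokens_for_duration(seconds: float, frame_rate_hz: float) -> int:
    """Number of speech tokens needed to cover a clip of the given length."""
    return math.ceil(seconds * frame_rate_hz)


if __name__ == "__main__":
    bitrate = 175.0    # bps, from the abstract
    frame_rate = 12.5  # Hz, from the abstract

    print("bits per token:", bits_per_token(bitrate, frame_rate))
    print("implied codebook size:", implied_codebook_size(bitrate, frame_rate))
    print("tokens for 60 s of speech:", tokens_for_duration(60, frame_rate))
```

Under these assumptions, one minute of speech maps to a few hundred tokens, which is why a low frame rate keeps spoken sequences short enough to model alongside text.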
Dec-3-2024
- Country:
- South America > Colombia > Bolivar Department > Cartagena (0.04)
- North America > United States > Washington > King County > Seattle (0.04)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
- North America > United States > California > Santa Clara County > Sunnyvale (0.04)
- North America > United States > California > Los Angeles County > Long Beach (0.04)
- North America > Canada > British Columbia
- Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
- Europe > France > Provence-Alpes-Côte d'Azur > Bouches-du-Rhône > Marseille (0.04)
- Europe > Czechia > South Moravian Region > Brno (0.04)
- Europe > Austria
- Asia
- Genre:
- Research Report (0.65)
- Technology: