When Large Language Models Meet Speech: A Survey on Integration Approaches
Zhengdong Yang, Shuichiro Shimizu, Yahan Yu, Chenhui Chu
arXiv.org Artificial Intelligence
Recent advancements in large language models (LLMs) have spurred interest in expanding their application beyond text-based tasks. Many studies have explored integrating other modalities with LLMs, notably the speech modality, which is naturally related to text. This paper surveys the integration of speech with LLMs, categorizing the methodologies into three primary approaches: text-based, latent-representation-based, and audio-token-based integration.
Feb-26-2025
- Country:
- Asia > Japan
- Honshū (0.28)
- Europe (0.68)
- North America > United States (1.00)
- Genre:
- Overview (1.00)
- Research Report (1.00)
- Technology: