SpeechPrune: Context-aware Token Pruning for Speech Information Retrieval
Lin, Yueqian; Fu, Yuzhe; Zhang, Jingyang; Liu, Yudong; Zhang, Jianyi; Sun, Jingwei; Li, Hai "Helen"; Chen, Yiran
arXiv.org Artificial Intelligence
Speech Large Language Models (Speech LLMs) represent a significant advancement in speech language understanding and processing, as they leverage contextual reasoning capabilities of large language models to process audio inputs [1]. Unlike traditional cascaded pipelines, where automatic speech recognition (ASR) and language modeling are handled by separate modules, Speech LLMs unify audio processing, cross-modal fusion, and language modeling in a single architecture [2]. These unified models can perform multiple tasks like speech recognition, speech translation, speaker identification and emotion recognition, [...]

[...] These benchmarks contain only short audio clips and thus do not reflect the complexity of achieving long-context understanding and extracting precise information from lengthy audio sequences. To systematically assess the unique challenges posed by SIR, we present SPIRAL (Speech Informational Retrieval and Lookup), a 1,012-sample benchmark specifically crafted to evaluate Speech LLM performance on long-form audio sequences (around 90 seconds in duration). At a high level, SPIRAL constructs SIR questions by embedding a critical piece of information within lengthy and potentially distracting dialogues, thereby assessing the model's ability to pinpoint and retrieve essential content from long-form inputs.
Dec-16-2024
- Country:
  - North America > United States > California > Los Angeles County > Los Angeles (0.04)
  - Europe
    - Norway (0.04)
    - Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.04)
  - Asia
    - Indonesia > Bali (0.04)
    - Middle East
      - UAE > Dubai Emirate > Dubai (0.04)
      - Republic of Türkiye > Istanbul Province > Istanbul (0.04)
    - Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
    - China > Beijing > Beijing (0.04)
- Genre:
  - Research Report (1.00)
- Industry:
  - Health & Medicine (0.46)
- Technology: