UniverSLU: Universal Spoken Language Understanding for Diverse Classification and Sequence Generation Tasks with a Single Network
Siddhant Arora, Hayato Futami, Jee-weon Jung, Yifan Peng, Roshan Sharma, Yosuke Kashiwagi, Emiru Tsunoo, Shinji Watanabe
arXiv.org Artificial Intelligence
Recent studies have demonstrated promising outcomes by employing large language models with multi-tasking capabilities. They utilize prompts to guide the model's behavior and surpass the performance of task-specific models. Motivated by this, we ask: can we build a single model that jointly performs various spoken language understanding (SLU) tasks? To address this, we utilize pre-trained automatic speech recognition (ASR) models and employ various task and dataset specifiers as discrete prompts. We demonstrate the efficacy of our single multi-task learning (MTL) model "UniverSLU" for 12 different speech classification and sequence generation tasks across 17 datasets and 9 languages. Results show that UniverSLU achieves competitive performance and even surpasses task-specific models. We also conduct preliminary investigations into using human-interpretable natural phrases instead of task specifiers as discrete prompts and test the model's ability to generalize to new paraphrases.
Oct-4-2023
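As a rough illustration of the discrete-prompt idea summarized in the abstract, the minimal sketch below builds a decoder prefix from task, dataset, and language specifier tokens. The token names, the `build_prompt` helper, and the chosen tasks/datasets are assumptions for illustration only, not the authors' actual UniverSLU implementation or token inventory.

```python
# Illustrative sketch only (not the authors' code): conditioning a single
# multi-task SLU model on discrete task/dataset specifier tokens by
# prepending them to the decoder input, in the spirit described above.

# Hypothetical special tokens; the real UniverSLU inventory may differ.
TASK_TOKENS = {
    "intent_classification": "<task:ic>",
    "named_entity_recognition": "<task:ner>",
    "emotion_recognition": "<task:er>",
}
DATASET_TOKENS = {
    "slurp": "<dataset:slurp>",
    "fsc": "<dataset:fsc>",
}
LANG_TOKENS = {"en": "<lang:en>", "de": "<lang:de>"}


def build_prompt(task: str, dataset: str, lang: str) -> list[str]:
    """Return the discrete prompt tokens prepended to the decoder input."""
    return [LANG_TOKENS[lang], TASK_TOKENS[task], DATASET_TOKENS[dataset]]


if __name__ == "__main__":
    # A pre-trained ASR encoder-decoder would then generate the task-specific
    # output (a class label or a token sequence) after this prefix.
    print(build_prompt("intent_classification", "slurp", "en"))
    # ['<lang:en>', '<task:ic>', '<dataset:slurp>']
```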