On the Use of Semantically-Aligned Speech Representations for Spoken Language Understanding

Laperrière, Gaëlle, Pelloin, Valentin, Rouvier, Mickaël, Stafylakis, Themos, Estève, Yannick

Oct-11-2022–arXiv.org Artificial Intelligence

In this paper we examine the use of semantically-aligned speech representations for end-to-end spoken language understanding (SLU). We employ the recently introduced SAMU-XLSR model, which is designed to generate a single utterance-level embedding that captures the semantics of the utterance and is semantically aligned across languages. This model combines the acoustic frame-level speech representation learning model (XLS-R) with the Language Agnostic BERT Sentence Embedding (LaBSE) model. We show that using the SAMU-XLSR model instead of the initial XLS-R model significantly improves performance in end-to-end SLU. Finally, we present the benefits of using this model for language portability in SLU.
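The idea of turning frame-level speech features into one utterance-level embedding that lives in the same space as a text sentence embedding can be illustrated with a small sketch. It is not the authors' implementation: mean pooling, the cosine-based loss, the toy vectors, and the function names `mean_pool`, `cosine_similarity`, and `alignment_loss` are all assumptions made for illustration, and no real XLS-R or LaBSE model is loaded.

```python
import math


def mean_pool(frames):
    """Average equal-length frame vectors into one utterance-level vector.

    Assumption: plain mean pooling stands in for whatever pooling
    the real model uses.
    """
    if not frames:
        raise ValueError("need at least one frame")
    dim = len(frames[0])
    return [sum(frame[i] for frame in frames) / len(frames) for i in range(dim)]


def cosine_similarity(a, b):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def alignment_loss(frames, text_embedding):
    """Hypothetical alignment objective: 1 - cosine(pooled speech, text).

    The loss is near 0 when the pooled speech vector points in the same
    direction as the text sentence embedding.
    """
    return 1.0 - cosine_similarity(mean_pool(frames), text_embedding)


# Toy data, invented for illustration only.
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # stand-in for frame-level speech features
text_embedding = [1.0, 1.0]                    # stand-in for a sentence embedding
loss = alignment_loss(frames, text_embedding)
```

In this toy case the pooled speech vector is parallel to the text vector, so the loss is close to zero; an orthogonal pair would give a loss close to one.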

artificial intelligence, machine learning, natural language

Country:
- Asia > Russia (0.04)
- North America > United States
  - Minnesota > Hennepin County > Minneapolis (0.04)
- Europe
  - France (0.04)
  - Greece (0.04)
  - Russia > Northwestern Federal District
    - Leningrad Oblast > Saint Petersburg (0.04)
  - Italy > Liguria
    - Genoa (0.04)

Genre:
- Research Report (0.64)

Technology:
- Information Technology > Artificial Intelligence
  - Speech > Speech Recognition (1.00)
  - Natural Language > Text Processing (1.00)
  - Machine Learning > Performance Analysis
    - Accuracy (0.47)
