Enhancing Multilingual Speech Recognition through Language Prompt Tuning and Frame-Level Language Adapter

Li, Song, You, Yongbin, Wang, Xuezhi, Ding, Ke, Wan, Guanglu

Sep-19-2023–arXiv.org Artificial Intelligence

Multilingual intelligent assistants, such as ChatGPT, have recently gained popularity. To further expand the applications of multilingual artificial intelligence (AI) assistants and facilitate international communication, it is essential to enhance the performance of multilingual speech recognition, which is a crucial component of speech interaction. In this paper, we propose two simple and parameter-efficient methods: language prompt tuning and frame-level language adapter, to respectively enhance language-configurable and ...

Ref. [6, 7] introduced an additional language identification (LID) module to predict language information, while Ref. [2] treated language information as a special textual token and concatenated it to the input of the decoder of the autoregressive speech recognition model, achieving joint modeling of speech recognition and language identification. Ref. [3] provided language information directly as prior information to speech recognition models; this can be achieved by encoding language information as a one-hot vector or embedding and concatenating it with acoustic features.
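The two ways of supplying language information mentioned above (a one-hot vector concatenated with the acoustic features, and a special language token added to the decoder input) can be illustrated with a minimal sketch. This is not the authors' implementation. The language inventory `LANGUAGES`, the `<|xx|>` token format, the choice to place the token at the start of the decoder input, and all function names are assumptions made for illustration only.

```python
from typing import List

# Hypothetical language inventory; the source does not list the languages used.
LANGUAGES = ["en", "zh", "de"]


def one_hot_language(lang: str) -> List[float]:
    """Return a one-hot vector marking the position of `lang` in LANGUAGES."""
    if lang not in LANGUAGES:
        raise ValueError(f"unknown language: {lang}")
    return [1.0 if candidate == lang else 0.0 for candidate in LANGUAGES]


def add_language_to_frames(frames: List[List[float]], lang: str) -> List[List[float]]:
    """Concatenate the same one-hot language vector onto every acoustic frame."""
    lang_vec = one_hot_language(lang)
    return [frame + lang_vec for frame in frames]


def add_language_token(decoder_tokens: List[str], lang: str) -> List[str]:
    """Place a special language token before the decoder input tokens.

    The `<|xx|>` spelling and the leading position are assumptions.
    """
    if lang not in LANGUAGES:
        raise ValueError(f"unknown language: {lang}")
    return [f"<|{lang}|>"] + decoder_tokens


if __name__ == "__main__":
    # Two toy frames with two feature dimensions each.
    frames = [[0.1, 0.2], [0.3, 0.4]]
    print(add_language_to_frames(frames, "zh"))
    print(add_language_token(["hello", "world"], "en"))
```

In the first variant every frame grows by the number of known languages, so the model's input layer must be sized accordingly. In the second variant the acoustic input is unchanged and only the decoder sees the language label.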
